Spring AI Agentic Patterns (Part 7): Session API — Event-Sourced Short-Term Memory with Context Compaction

Spring Blog, April 15, 2026

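TL;DR (AI summary)

This post introduces Spring AI's new Session API, which manages short-term conversation memory with an event-sourced architecture. It treats the turn as an atomic unit so that tool-call sequences stay intact, and it provides composable context-compaction triggers and strategies. Together these address the broken context caused by the blunt truncation of the traditional ChatMemory and give multi-agent collaboration a structured memory foundation.

Key points

An event-sourced log replaces the flat message list, and context is managed with the turn as the atomic unit, which avoids the model hallucinations caused by truncated tool-call sequences.
A pluggable context-compaction mechanism supports triggering by turn count, token threshold, or a combination of conditions, and preserves the key structure of the conversation.
The new API is planned to replace the older ChatMemory as a core component of Spring AI 2.1, with native support for multi-agent branch isolation and metadata retrieval.

#Spring AI #Agent architecture #Context management #Event sourcing #Java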

_A New Session API for Spring AI — Structured, Compactable, Multi-Agent-Ready_

Part 7 of the [Spring AI Agentic Patterns](https://spring.io/blog/2026/04/07/spring-ai-agentic-patterns-6-memory-tools) series completes the memory picture. After covering Agent Skills, AskUserQuestionTool, TodoWriteTool, Subagent Orchestration, A2A Integration, and AutoMemoryTools for long-term cross-session memory, we now add the complementary short-term layer: Spring AI Session. Storing conversation history as a flat message list works for short exchanges but breaks down as sessions grow — naive truncation silently discards tool-call sequences mid-exchange, leaving the model with orphaned results and broken turn structure. Spring AI Session solves this by automatically recording every message, tool call, and result for the active exchange and managing the context window intelligently, while AutoMemoryTools retains curated facts that must survive beyond the session. A complete agent memory stack needs both; neither replaces the other.

Roadmap: Incubating in spring-ai-community; targets Spring AI 2.1 (November 2026), when ChatMemory will be deprecated in its favour.

ChatMemory evicts the oldest messages with no turn safety, no event identity, no multi-agent support, and no record of what was discarded. Spring AI Session replaces it with an event-sourced log, pluggable compaction strategies, branch isolation, and keyword-searchable recall storage.

🚀 Want to jump right in? Skip to the Getting Started section.

* *

Session API Architecture

Session and SessionEvent

Session is an immutable metadata-only value object — it holds the session ID, user ID, TTL, and arbitrary metadata. The event log lives separately in the repository, fetched on demand.

SessionEvent wraps a Spring AI Message and adds what Message intentionally omits: a UUID, sessionId, timestamp, an optional branch label for multi-agent hierarchies, and framework flags like METADATA_SYNTHETIC.

```java
SessionService service = new DefaultSessionService(InMemorySessionRepository.builder().build());

Session session = service.create(
    CreateSessionRequest.builder().userId("alice").build()
);

service.appendMessage(session.id(), new UserMessage("What is Spring AI?"));
service.appendMessage(session.id(), new AssistantMessage("Spring AI is..."));

List<Message> history = service.getMessages(session.id()); // ready to pass to an LLM
```
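To make the envelope idea concrete, here is a minimal plain-Java sketch of how an event might wrap a message with identity and provenance. `ChatLine`, `EventEnvelope`, and the `"synthetic"` metadata key are hypothetical stand-ins chosen for illustration; they are not the library's `Message`, `SessionEvent`, or `METADATA_SYNTHETIC` definitions.

```java
import java.time.Instant;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for a chat message: a role plus its text. */
record ChatLine(String role, String text) {}

/** Hypothetical envelope; the fields mirror the prose description, not the library source. */
record EventEnvelope(String id, String sessionId, Instant timestamp,
                     String branch, Map<String, Object> metadata, ChatLine message) {

    /** Wraps a message, assigning a fresh UUID and the current time. */
    static EventEnvelope of(String sessionId, String branch, ChatLine message) {
        return new EventEnvelope(UUID.randomUUID().toString(), sessionId,
                Instant.now(), branch, Map.of(), message);
    }

    /** True when the (assumed) "synthetic" flag is present in the metadata. */
    boolean isSynthetic() {
        return Boolean.TRUE.equals(metadata.get("synthetic"));
    }
}

final class EventEnvelopeDemo {
    public static void main(String[] args) {
        EventEnvelope e = EventEnvelope.of("session-abc", null,
                new ChatLine("USER", "What is Spring AI?"));
        System.out.println(e.id() + " " + e.message().text());
    }
}
```

Keeping the message payload separate from the envelope is what allows the same message type to be passed to a model unchanged while the log retains identity, ordering, and branch information.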

Turns

A turn is the atomic unit of conversation: one UserMessage plus all following events — assistant replies, tool calls, tool results — up to the next UserMessage. All compaction strategies operate at turn granularity, so the kept window always starts on a USER message. The model never sees an orphaned tool result or a split exchange.

```text
Turn 1: [USER "What is Spring AI?"]  [ASSISTANT "Spring AI is..."]
Turn 2: [USER "Can it use tools?"]   [ASSISTANT (tool call)]  [TOOL result]  [ASSISTANT "Yes,..."]
```
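The turn rule can be sketched in a few lines of plain Java. This is an illustrative grouping routine, not the library's implementation; the `Event` record and its string roles are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class TurnGrouper {

    /** Hypothetical minimal event: a role ("USER", "ASSISTANT", "TOOL") and its text. */
    record Event(String role, String text) {}

    /** Splits a chronological event list into turns; each USER event opens a new turn. */
    static List<List<Event>> group(List<Event> events) {
        List<List<Event>> turns = new ArrayList<>();
        for (Event e : events) {
            // Events that precede the first USER message form a leading turn of their own.
            if (e.role().equals("USER") || turns.isEmpty()) {
                turns.add(new ArrayList<>());
            }
            turns.get(turns.size() - 1).add(e);
        }
        return turns;
    }

    public static void main(String[] args) {
        List<Event> log = List.of(
                new Event("USER", "What is Spring AI?"),
                new Event("ASSISTANT", "Spring AI is..."),
                new Event("USER", "Can it use tools?"),
                new Event("ASSISTANT", "(tool call)"),
                new Event("TOOL", "result"),
                new Event("ASSISTANT", "Yes,..."));
        group(log).forEach(turn -> System.out.println(turn.size() + " events"));
    }
}
```

For the two-turn example above, the first turn holds two events and the second holds four, so the tool call and its result always travel together.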

* *

Context Compaction

Compaction reduces the event history to fit the context window while preserving coherence. It is driven by two composable abstractions: triggers and strategies.

Triggers

```java
new TurnCountTrigger(20);                                   // fires when > 20 turns
TokenCountTrigger.builder().threshold(4000).build();        // fires at 4000 estimated tokens

// OR-composite: fires if either condition is met
CompositeCompactionTrigger.anyOf(
    new TurnCountTrigger(20),
    TokenCountTrigger.builder().threshold(4000).build()
);
```
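How an OR-composite can be built from simple predicates is sketched below. The `Trigger` interface, the `Stats` record, and the factory methods are hypothetical names for illustration only; the thresholds follow the comments in the snippet above (more than 20 turns, or at least 4000 estimated tokens).

```java
import java.util.List;

final class TriggerSketch {

    /** Hypothetical snapshot of what a trigger inspects. */
    record Stats(int turnCount, int estimatedTokens) {}

    /** Hypothetical trigger contract: decide whether compaction should run. */
    interface Trigger {
        boolean shouldCompact(Stats stats);

        /** OR-composition: fires as soon as any delegate fires. */
        static Trigger anyOf(Trigger... triggers) {
            List<Trigger> all = List.of(triggers);
            return stats -> all.stream().anyMatch(t -> t.shouldCompact(stats));
        }
    }

    /** Fires when the session has more than maxTurns turns. */
    static Trigger turnCount(int maxTurns) {
        return s -> s.turnCount() > maxTurns;
    }

    /** Fires once the estimated token count reaches the threshold. */
    static Trigger tokenCount(int threshold) {
        return s -> s.estimatedTokens() >= threshold;
    }

    public static void main(String[] args) {
        Trigger either = Trigger.anyOf(turnCount(20), tokenCount(4000));
        System.out.println(either.shouldCompact(new Stats(21, 100)));
        System.out.println(either.shouldCompact(new Stats(5, 1000)));
    }
}
```

Because each trigger is a single-method interface, new conditions can be added as lambdas without touching the composite.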

Strategies

| Strategy | LLM call? | Best for |
| --- | --- | --- |
| SlidingWindowCompactionStrategy | No | Cost-sensitive, short-term context |
| TurnWindowCompactionStrategy | No | Turn-structured dialogues |
| TokenCountCompactionStrategy | No | Hard context-window limits |
| RecursiveSummarizationCompactionStrategy | Yes | Long-running, context-rich sessions |

The first three keep a verbatim suffix of events (by message count, turn count, or token budget). All three snap the cut point to the nearest turn boundary — no partial turns are ever kept.
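The snapping behaviour can be illustrated with a small sliding-window routine. This is a sketch under stated assumptions, not the library's algorithm: the `Event` record is hypothetical, and the cut is moved backward to the start of the enclosing turn, so the newest turn is always kept whole at the cost of sometimes exceeding `maxEvents`. Snapping forward (dropping the partial turn) would be the other reasonable choice.

```java
import java.util.List;

final class SlidingWindowSketch {

    /** Hypothetical minimal event: a role and its text. */
    record Event(String role, String text) {}

    /**
     * Keeps roughly the last maxEvents events, then moves the cut point back to the
     * USER event that opens the enclosing turn so the window never begins mid-turn.
     */
    static List<Event> compact(List<Event> events, int maxEvents) {
        if (maxEvents <= 0) {
            throw new IllegalArgumentException("maxEvents must be positive");
        }
        int cut = Math.max(0, events.size() - maxEvents);
        // Assumption: snap backward, so a turn is kept whole rather than dropped.
        while (cut > 0 && !events.get(cut).role().equals("USER")) {
            cut--;
        }
        return List.copyOf(events.subList(cut, events.size()));
    }

    public static void main(String[] args) {
        List<Event> log = List.of(
                new Event("USER", "What is Spring AI?"),
                new Event("ASSISTANT", "Spring AI is..."),
                new Event("USER", "Can it use tools?"),
                new Event("ASSISTANT", "(tool call)"),
                new Event("TOOL", "result"),
                new Event("ASSISTANT", "Yes,..."));
        compact(log, 3).forEach(e -> System.out.println(e.role()));
    }
}
```

With `maxEvents = 3` a naive suffix would start at the tool call; the snapped window instead starts at the USER message of the second turn and keeps all four of its events.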

Recursive Summarization is the most powerful: it uses an LLM to summarize the events being archived and stores the result as a synthetic user + assistant turn. Each subsequent compaction pass builds on prior summaries — creating a rolling compressed history that never starts from scratch:

```java
RecursiveSummarizationCompactionStrategy.builder(chatClient)
    .maxEventsToKeep(10)
    .overlapSize(2)   // feed 2 events from the active window into the summary prompt
    .build();
```
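The rolling effect is easier to see in a stripped-down model. The sketch below is an assumption-laden illustration rather than the library's code: the summarizer is a plain `Function<String, String>` standing in for an LLM call, the `Event` record and the wording of the synthetic USER message are invented, and turn-boundary snapping is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class RollingSummarySketch {

    /** Hypothetical event with a flag marking framework-generated summaries. */
    record Event(String role, String text, boolean synthetic) {}

    /**
     * Archives everything except the newest maxEventsToKeep events and replaces the
     * archived part with a synthetic USER + ASSISTANT pair that holds the summary.
     */
    static List<Event> compact(List<Event> events, int maxEventsToKeep, int overlapSize,
                               Function<String, String> summarizer) {
        if (events.size() <= maxEventsToKeep) {
            return events;
        }
        int cut = events.size() - maxEventsToKeep;
        // The prompt covers the archived events (including any earlier summary pair,
        // which sits at the head of the list) plus a few events of overlap.
        int promptEnd = Math.min(events.size(), cut + overlapSize);
        StringBuilder prompt = new StringBuilder();
        for (Event e : events.subList(0, promptEnd)) {
            prompt.append(e.role()).append(": ").append(e.text()).append('\n');
        }
        String summary = summarizer.apply(prompt.toString());

        List<Event> result = new ArrayList<>();
        result.add(new Event("USER", "Summarize the conversation so far.", true));
        result.add(new Event("ASSISTANT", summary, true));
        result.addAll(events.subList(cut, events.size()));
        return result;
    }

    public static void main(String[] args) {
        List<Event> log = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            log.add(new Event(i % 2 == 0 ? "USER" : "ASSISTANT", "message " + i, false));
        }
        List<Event> compacted = compact(log, 2, 1, p -> "summary of " + p.lines().count() + " lines");
        compacted.forEach(e -> System.out.println(e.role() + ": " + e.text()));
    }
}
```

Because the previous summary pair is part of the input to the next pass, each new summary is built on the last one instead of being recomputed from the full history.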

Note: Trigger and strategy must always be configured together — setting one without the other throws IllegalArgumentException at build time. Either set both via .compactionTrigger(...) and .compactionStrategy(...), or omit both to disable compaction entirely.
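The both-or-neither rule amounts to a single check at build time. The builder below is a hypothetical sketch of that validation using plain functional types; it is not the `SessionMemoryAdvisor` builder itself, and the exception message is invented.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

final class PairedConfigSketch {

    /** Hypothetical resulting configuration; both fields are null when compaction is off. */
    record Config(Predicate<List<String>> trigger, UnaryOperator<List<String>> strategy) {
        boolean compactionEnabled() {
            return trigger != null;
        }
    }

    static final class Builder {
        private Predicate<List<String>> trigger;
        private UnaryOperator<List<String>> strategy;

        Builder compactionTrigger(Predicate<List<String>> trigger) {
            this.trigger = trigger;
            return this;
        }

        Builder compactionStrategy(UnaryOperator<List<String>> strategy) {
            this.strategy = strategy;
            return this;
        }

        Config build() {
            // Exactly one of the two being set is a configuration error.
            if ((trigger == null) != (strategy == null)) {
                throw new IllegalArgumentException(
                        "compactionTrigger and compactionStrategy must be set together");
            }
            return new Config(trigger, strategy);
        }
    }

    public static void main(String[] args) {
        System.out.println(new Builder().build().compactionEnabled());
    }
}
```

Failing at build time keeps a half-configured advisor from silently running without compaction.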

* *

ChatClient Integration

SessionMemoryAdvisor wires session management into the ChatClient pipeline transparently. On every request it loads history, prepends it to the prompt, appends the new user and assistant messages, and runs compaction if a trigger fires — all without any manual code in the application.

```java
@Bean
SessionMemoryAdvisor sessionMemoryAdvisor(SessionService sessionService,
        ChatClient.Builder chatClientBuilder) {

    return SessionMemoryAdvisor.builder(sessionService)
        .defaultUserId("alice")
        .compactionTrigger(new TurnCountTrigger(20))
        .compactionStrategy(
            RecursiveSummarizationCompactionStrategy.builder(chatClientBuilder.build())
                .maxEventsToKeep(10)
                .build()
        )
        .build();
}

@Bean
ChatClient chatClient(ChatClient.Builder chatClientBuilder, SessionMemoryAdvisor advisor) {
    return chatClientBuilder.defaultAdvisors(advisor).build();
}
```

Pass a session ID at call time via the advisor context:

```java
String response = chatClient.prompt()
    .user("Hello!")
    .advisors(a -> a.param(SessionMemoryAdvisor.SESSION_ID_CONTEXT_KEY, "session-abc"))
    .call()
    .content();
```

If no session exists for the given ID, the advisor creates one automatically.
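The request cycle described in this section (load history, prepend it, call the model, append the new messages, compact if a trigger fires) can be modelled without any framework. The class below is a single-threaded sketch with invented names: messages are plain strings, the model is a `Function`, and the trigger and strategy are a `Predicate` and a `UnaryOperator`. It is meant only to show the order of the steps, not how `SessionMemoryAdvisor` is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

final class AdvisorFlowSketch {

    private final Map<String, List<String>> store = new HashMap<>();
    private final Predicate<List<String>> trigger;
    private final UnaryOperator<List<String>> strategy;

    AdvisorFlowSketch(Predicate<List<String>> trigger, UnaryOperator<List<String>> strategy) {
        this.trigger = trigger;
        this.strategy = strategy;
    }

    String call(String sessionId, String userText, Function<List<String>, String> model) {
        // 1. Load the history, creating the session on first use.
        List<String> history = store.computeIfAbsent(sessionId, id -> new ArrayList<>());
        // 2. Prepend the history to the new user message.
        List<String> prompt = new ArrayList<>(history);
        prompt.add("USER: " + userText);
        // 3. Call the model with the full prompt.
        String reply = model.apply(prompt);
        // 4. Append the new user and assistant messages to the log.
        history.add("USER: " + userText);
        history.add("ASSISTANT: " + reply);
        // 5. Compact when the trigger fires.
        if (trigger.test(history)) {
            store.put(sessionId, new ArrayList<>(strategy.apply(history)));
        }
        return reply;
    }

    /** Read-only copy of the active history for a session (empty if unknown). */
    List<String> history(String sessionId) {
        return List.copyOf(store.getOrDefault(sessionId, List.of()));
    }

    public static void main(String[] args) {
        AdvisorFlowSketch advisor = new AdvisorFlowSketch(
                h -> h.size() > 4, h -> h.subList(h.size() - 2, h.size()));
        Function<List<String>, String> echoModel = p -> "saw " + p.size() + " messages";
        for (String q : List.of("one", "two", "three", "four")) {
            System.out.println(advisor.call("session-abc", q, echoModel));
        }
    }
}
```

Application code only ever calls `call(...)`; history loading, persistence, and compaction happen as side effects of the request.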

* *

Multi-Agent Branch Isolation

When an orchestrator fans out to parallel sub-agents, all agents can share the same Session — but each must see only its own events plus its ancestors'. SessionEvent.branch is a dot-separated path that records the producing agent's position in the hierarchy:

```text
orchestrator        branch = "orch"
├── researcher      branch = "orch.researcher"
└── writer          branch = "orch.writer"
```

Events with branch = null are root-level — visible to every agent. Pass EventFilter.forBranch() to apply isolation automatically inside the advisor:

```java
// Researcher sees: null-branch + "orch" + own "orch.researcher" events
// Hidden: "orch.writer" (sibling)
SessionMemoryAdvisor researcherAdvisor = SessionMemoryAdvisor.builder(sessionService)
    .defaultSessionId(sharedSessionId)
    .eventFilter(EventFilter.forBranch("orch.researcher"))
    .build();
```

Synthetic summary events from RecursiveSummarizationCompactionStrategy always carry branch = null, so compaction summaries remain visible to every agent in the session.
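The visibility rule (root-level events, own events, and ancestors' events) reduces to a prefix test on dot-separated paths. The helper below is a hypothetical sketch of that rule, not the source of `EventFilter.forBranch`; treating a viewer with a `null` branch as seeing only root-level events is an assumption.

```java
final class BranchVisibility {

    /**
     * An event is visible when it is root-level (null branch), or when its branch is
     * the viewer's own branch or one of the viewer's ancestors.
     */
    static boolean isVisible(String eventBranch, String viewerBranch) {
        if (eventBranch == null) {
            return true; // root-level events are visible to every agent
        }
        if (viewerBranch == null) {
            return false; // assumption: a root viewer sees only root-level events
        }
        // Appending "." ensures "orch.res" is not treated as an ancestor of "orch.researcher".
        return viewerBranch.equals(eventBranch) || viewerBranch.startsWith(eventBranch + ".");
    }

    public static void main(String[] args) {
        String viewer = "orch.researcher";
        for (String branch : new String[] {null, "orch", "orch.researcher", "orch.writer"}) {
            System.out.println(branch + " -> " + isVisible(branch, viewer));
        }
    }
}
```

Matching on whole path segments rather than raw string prefixes is the detail that keeps similarly named siblings isolated from each other.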

* *

Recall Storage

Compaction improves prompt efficiency, but older events are removed from the active context window. SessionEventTools implements the MemGPT _Recall Storage_ pattern: the full verbatim event log is always retained and searchable by keyword, even after compaction has pruned it from the prompt.

```java
ChatClient client = ChatClient.builder(chatModel)
    .defaultTools(SessionEventTools.builder(sessionService).build())
    .defaultAdvisors(advisor)
    .build();
```

The conversation_search tool is auto-discovered by Spring AI. When the model needs to recall a prior exchange it calls the tool with a keyword and an optional page index; results come back as chronologically ordered JSON. Synthetic summary events are searchable too — their text is indexed in the recall store.
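A keyword search with paging over the full log might look roughly like the sketch below. The `Event` record, the case-insensitive substring match, the zero-based page index, and the `pageSize` parameter are all assumptions for illustration; the actual matching and JSON output of `conversation_search` are not reproduced here.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

final class RecallSearchSketch {

    /** Hypothetical stored event with an epoch-style timestamp. */
    record Event(long timestamp, String role, String text) {}

    /** Returns one page of events whose text contains the keyword, oldest first. */
    static List<Event> search(List<Event> fullLog, String keyword, int page, int pageSize) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        return fullLog.stream()
                .filter(e -> e.text().toLowerCase(Locale.ROOT).contains(needle))
                .sorted(Comparator.comparingLong(Event::timestamp))
                .skip((long) page * pageSize) // page is zero-based
                .limit(pageSize)
                .toList();
    }

    public static void main(String[] args) {
        List<Event> log = List.of(
                new Event(1, "USER", "What is Spring AI?"),
                new Event(2, "ASSISTANT", "Spring AI is..."),
                new Event(3, "USER", "Can it use tools?"));
        search(log, "spring", 0, 10).forEach(e -> System.out.println(e.timestamp() + " " + e.text()));
    }
}
```

The search runs against the complete log rather than the compacted window, which is what lets the model recover details that compaction has already removed from the prompt.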

* *

JDBC Persistence

spring-ai-session-jdbc stores session data in two tables (AI_SESSION and AI_SESSION_EVENT, an append-only event log) with support for PostgreSQL, MySQL, MariaDB, and H2. The Spring Boot starter auto-configures everything:

```xml
<dependency>
    <groupId>org.springaicommunity</groupId>
    <artifactId>spring-ai-starter-session-jdbc</artifactId>
</dependency>
```

For PostgreSQL or MySQL, enable schema initialisation:

```yaml
spring:
  ai:
    session:
      repository:
        jdbc:
          initialize-schema: always
```

No additional bean declarations are required.

* *

Getting Started

Requirements: Java 17+, Spring AI 2.0.0-M4+, Spring Boot 4.0.2+

1. Import the BOM:

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springaicommunity</groupId>
            <artifactId>spring-ai-session-bom</artifactId>
            <version>0.2.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```

2. Add a starter — JDBC for production, or spring-ai-session-management alone for in-memory development:

```xml
<dependency>
    <groupId>org.springaicommunity</groupId>
    <artifactId>spring-ai-starter-session-jdbc</artifactId>
</dependency>
```

3. Wire the advisor and use it:

```java
@Bean
SessionMemoryAdvisor sessionMemoryAdvisor(SessionService sessionService) {
    return SessionMemoryAdvisor.builder(sessionService)
        .defaultUserId("alice")
        .compactionTrigger(new TurnCountTrigger(20))
        .compactionStrategy(SlidingWindowCompactionStrategy.builder().maxEvents(10).build())
        .build();
}

@Bean
ChatClient chatClient(ChatModel chatModel, SessionMemoryAdvisor advisor) {
    return ChatClient.builder(chatModel).defaultAdvisors(advisor).build();
}
```

```java
Session session = sessionService.create(
    CreateSessionRequest.builder().userId("alice").build()
);

String response = chatClient.prompt()
    .user("What is Spring AI?")
    .advisors(a -> a.param(SessionMemoryAdvisor.SESSION_ID_CONTEXT_KEY, session.id()))
    .call()
    .content();
```

* *

From ChatMemory to Session API

The Session API is designed to replace ChatMemory as Spring AI's primary conversation persistence abstraction:

| | ChatMemory | Spring AI Session |
| --- | --- | --- |
| Storage unit | Message (flat list) | SessionEvent (immutable, timestamped, identified) |
| Compaction | Evict oldest messages | Four pluggable strategies incl. LLM summarization |
| Turn safety | Not enforced | All strategies snap to turn boundaries |
| Multi-agent | Not supported | Branch isolation with dot-separated labels |
| Recall search | Not available | conversation_search tool via SessionEventTools |
| Concurrency | Implementation-dependent | Optimistic CAS write in all implementations |

The equivalent of MessageWindowChatMemory.builder().maxMessages(20).build() is:

```java
SessionMemoryAdvisor.builder(sessionService)
    .compactionTrigger(new TurnCountTrigger(20))
    .compactionStrategy(SlidingWindowCompactionStrategy.builder().maxEvents(20).build())
    .build();
```
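The "Optimistic CAS write" entry in the Concurrency row of the comparison refers to a general technique: a write succeeds only if the log has not changed since it was read, and the writer retries otherwise. The in-memory sketch below illustrates that idea with `AtomicReference.compareAndSet`; the `CasLogSketch` class and its versioned `Snapshot` are hypothetical and say nothing about how the library's repositories implement it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

final class CasLogSketch {

    /** Immutable view of the log at a given version. */
    record Snapshot(long version, List<String> events) {}

    private final AtomicReference<Snapshot> state =
            new AtomicReference<>(new Snapshot(0, List.of()));

    Snapshot read() {
        return state.get();
    }

    /** Appends only if nobody else has written since `expected` was read. */
    boolean tryAppend(Snapshot expected, String event) {
        List<String> next = new ArrayList<>(expected.events());
        next.add(event);
        Snapshot updated = new Snapshot(expected.version() + 1, List.copyOf(next));
        // compareAndSet compares references, so `expected` must come from read().
        return state.compareAndSet(expected, updated);
    }

    /** Retries until the append wins the race. */
    void append(String event) {
        while (!tryAppend(read(), event)) {
            // Lost the race: re-read the latest snapshot and try again.
        }
    }

    public static void main(String[] args) {
        CasLogSketch log = new CasLogSketch();
        Snapshot stale = log.read();
        System.out.println(log.tryAppend(stale, "first"));
        System.out.println(log.tryAppend(stale, "second"));
        log.append("second");
        System.out.println(log.read());
    }
}
```

A stale writer is rejected instead of overwriting a concurrent append, which matters when several agents share one session.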

* *

Conclusion

Spring AI Session brings a structured, event-sourced short-term memory layer to the Spring AI ecosystem — with turn-safe compaction, LLM-powered summarization, multi-agent branch isolation, and keyword-searchable recall storage. Paired with AutoMemoryTools from Part 6, you now have both halves of a complete agent memory stack: a durable long-term layer for facts that outlive the session, and a coherent short-term layer for the active conversation. The library is available from the spring-ai-community organization.

* *

Resources

GitHub: spring-ai-community/spring-ai-session
Documentation: spring-ai-community.github.io/spring-ai-session
Spring AI Reference: docs.spring.io/spring-ai/reference
ChatMemory API (current): Chat Memory Reference

Agentic Patterns Series

Part 1: Agent Skills — modular, reusable agent capabilities
Part 2: AskUserQuestionTool — interactive agent workflows
Part 3: TodoWriteTool — structured task management
Part 4: Subagent Orchestration — hierarchical multi-agent architectures
Part 5: A2A Integration — building interoperable agents
Part 6: AutoMemoryTools — persistent long-term memory across sessions
Part 7: Spring AI Session (this post) — structured short-term memory with turn-safe compaction