    Intermediate · ~20 minutes

    Adding Memory to Chatbots

    Transform your stateless chatbot into a context-aware assistant that remembers user preferences, conversation history, and key information across sessions.

    Topics: Conversation Memory · Persistent Storage · Session Management

    In the previous tutorial, you built a basic chatbot that processes each message independently. But real conversations have context—users expect the AI to remember what was said earlier. When a customer says "I'd like to return the shoes I mentioned," your chatbot needs to know which shoes they're talking about.

    Spring AI solves this with the ChatMemory abstraction and Advisors—components that intercept requests and responses to automatically inject conversation history into each prompt. This means the LLM receives not just the current message, but the full context needed to generate coherent, personalized responses.
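
    At its core, ChatMemory is a small interface over a message store, keyed by conversation ID. The sketch below shows its essential shape as used throughout this tutorial (paraphrased from Spring AI's milestone releases; exact signatures can vary between versions):

    ChatMemory (simplified)
    // Simplified view of Spring AI's ChatMemory abstraction
    public interface ChatMemory {

        // Append new messages to a conversation's history
        void add(String conversationId, List<Message> messages);

        // Fetch up to the last N messages of a conversation
        List<Message> get(String conversationId, int lastN);

        // Drop a conversation's history entirely
        void clear(String conversationId);
    }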

    Why Conversation Memory Matters

    ❌ Without Memory

    User: Hi, I'm looking for running shoes
    Bot: Great! What size and color do you prefer?
    User: Size 10, black please
    Bot: I'm sorry, I don't understand. What are you looking for?

    ✓ With Memory

    User: Hi, I'm looking for running shoes
    Bot: Great! What size and color do you prefer?
    User: Size 10, black please
    Bot: Here are our black running shoes in size 10...

    Choose Your Memory Strategy

    Spring AI supports different memory backends. Choose based on your deployment requirements.

    In-Memory (Default)

    Fast, simple storage that lives in application memory. Perfect for development and testing.

    Pros:
    • Zero configuration
    • Ultra-fast access
    • No external dependencies
    Cons:
    • Lost on restart
    • Not shared across instances
    • Limited by JVM heap

    Redis-Backed

    Persistent, distributed storage using Redis. Ideal for production with multiple app instances.

    Pros:
    • Survives restarts
    • Shared across instances
    • TTL support
    Cons:
    • Requires Redis setup
    • Network latency
    • Additional infrastructure

    Summarized Memory

    Intelligent memory that summarizes old conversations to stay within token limits.

    Pros:
    • Unlimited conversation length
    • Cost-effective
    • Preserves key context
    Cons:
    • May lose details
    • Requires extra LLM calls
    • More complex implementation

    Step 1: Configure In-Memory Storage

    Let's start with the simplest approach—in-memory storage. This is perfect for development and single-instance deployments. Spring AI's InMemoryChatMemory stores conversations in a HashMap, keyed by session ID.

    ChatConfig.java
    @Configuration
    public class ChatConfig {

        @Bean
        public ChatMemory chatMemory() {
            // In-memory storage - lost on restart
            return new InMemoryChatMemory();
        }
    }

    Step 2: Create the Memory-Aware Controller

    The key to memory is the MessageChatMemoryAdvisor. This advisor automatically:
    • Retrieves past messages from storage
    • Appends them to the prompt
    • Saves the new exchange after the response
    • Manages conversation context seamlessly

    ChatController.java with Memory
    @RestController
    @RequestMapping("/api/chat")
    public class ChatController {

        private final ChatClient chatClient;

        public ChatController(ChatClient.Builder builder, ChatMemory chatMemory) {
            this.chatClient = builder
                .defaultSystem("""
                    You are a helpful customer support assistant for TechCorp.
                    Be friendly, professional, and remember user preferences.
                    If the user mentions their name, remember it for future messages.
                    """)
                .defaultAdvisors(new MessageChatMemoryAdvisor(chatMemory))
                .build();
        }

        @PostMapping
        public String chat(@RequestBody ChatRequest request,
                           @RequestHeader("X-Session-Id") String sessionId) {
            // CHAT_MEMORY_* keys are statically imported from AbstractChatMemoryAdvisor
            return chatClient.prompt()
                .user(request.getMessage())
                .advisors(advisor -> advisor
                    .param(CHAT_MEMORY_CONVERSATION_ID_KEY, sessionId)
                    .param(CHAT_MEMORY_RETRIEVE_SIZE_KEY, 20))
                .call()
                .content();
        }
    }
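
    The controller binds the request body to a ChatRequest with a getMessage() accessor. That class isn't shown in the original tutorial; a minimal version could look like this:

    ChatRequest.java (assumed DTO)
    // Minimal request body assumed by ChatController above
    public class ChatRequest {

        private String message;

        public String getMessage() {
            return message;
        }

        public void setMessage(String message) {
            this.message = message;
        }
    }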

    Key Parameters Explained

    CHAT_MEMORY_CONVERSATION_ID_KEY

    Unique identifier for each conversation. Use session IDs, user IDs, or any string that groups related messages together.

    CHAT_MEMORY_RETRIEVE_SIZE_KEY

    How many previous messages to include. More context = better understanding, but also higher token costs and latency.

    Pro tip: The session ID should come from your authentication layer. For anonymous users, generate a UUID on first visit and store it in a cookie or local storage.
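
    One way to implement that tip is a servlet filter that mints a UUID cookie on the first visit. The sketch below assumes Spring Boot 3 (jakarta.servlet); the cookie name and filter class are illustrative, not part of Spring AI:

    SessionIdFilter.java (sketch)
    import jakarta.servlet.FilterChain;
    import jakarta.servlet.ServletException;
    import jakarta.servlet.http.Cookie;
    import jakarta.servlet.http.HttpServletRequest;
    import jakarta.servlet.http.HttpServletResponse;
    import org.springframework.stereotype.Component;
    import org.springframework.web.filter.OncePerRequestFilter;

    import java.io.IOException;
    import java.util.Arrays;
    import java.util.UUID;

    @Component
    public class SessionIdFilter extends OncePerRequestFilter {

        private static final String COOKIE_NAME = "chat-session-id";

        @Override
        protected void doFilterInternal(HttpServletRequest request,
                                        HttpServletResponse response,
                                        FilterChain chain)
                throws ServletException, IOException {
            boolean hasCookie = request.getCookies() != null
                    && Arrays.stream(request.getCookies())
                             .anyMatch(c -> COOKIE_NAME.equals(c.getName()));
            if (!hasCookie) {
                // First visit: mint a UUID the client can echo back as X-Session-Id
                Cookie cookie = new Cookie(COOKIE_NAME, UUID.randomUUID().toString());
                cookie.setHttpOnly(true);
                cookie.setPath("/");
                response.addCookie(cookie);
            }
            chain.doFilter(request, response);
        }
    }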

    Step 3: Test Your Memory-Enabled Chatbot

    Now let's verify that memory is working. Make multiple requests with the same session ID and watch the chatbot maintain context across the conversation.

    Testing Memory with cURL
    # Test conversation with memory

    # First message
    curl -X POST http://localhost:8080/api/chat \
      -H "Content-Type: application/json" \
      -H "X-Session-Id: user-123" \
      -d '{"message": "Hi! My name is Sarah and I love hiking."}'

    # Response: "Hello Sarah! It is great to meet a fellow hiking enthusiast..."

    # Second message - same session
    curl -X POST http://localhost:8080/api/chat \
      -H "Content-Type: application/json" \
      -H "X-Session-Id: user-123" \
      -d '{"message": "What outdoor activities would you recommend for me?"}'

    # Response: "Since you mentioned loving hiking, Sarah, here are some ideas..."
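
    For repeatable tests, it also helps to reset a conversation between runs. Because ChatMemory exposes clear(conversationId), a small endpoint can wipe a session. This is a hypothetical addition that assumes the ChatMemory bean is also kept as a field on ChatController:

    // Hypothetical reset endpoint; assumes 'private final ChatMemory chatMemory'
    // was saved in ChatController's constructor alongside the ChatClient
    @DeleteMapping
    public void resetConversation(@RequestHeader("X-Session-Id") String sessionId) {
        chatMemory.clear(sessionId);
    }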

    Advanced: Persistent Memory with Redis

    For production deployments, you'll want memory that survives restarts and scales across multiple instances. Here's how to implement a Redis-backed memory store with automatic expiration.

    pom.xml - Redis Dependencies
    <!-- pom.xml - Add these dependencies -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>

    <!-- Optional: For persistent memory with Redis -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-redis</artifactId>
    </dependency>
    RedisChatMemory.java
    @Configuration
    public class RedisChatMemoryConfig {

        @Bean
        public ChatMemory chatMemory(RedisTemplate<String, Object> redisTemplate) {
            return new RedisChatMemory(redisTemplate, Duration.ofHours(24));
        }
    }

    // Custom Redis implementation
    public class RedisChatMemory implements ChatMemory {

        private static final String KEY_PREFIX = "chat:memory:";

        private final RedisTemplate<String, Object> redisTemplate;
        private final Duration ttl;

        public RedisChatMemory(RedisTemplate<String, Object> redisTemplate, Duration ttl) {
            this.redisTemplate = redisTemplate;
            this.ttl = ttl;
        }

        @Override
        public void add(String conversationId, List<Message> messages) {
            String key = KEY_PREFIX + conversationId;
            List<Message> existing = get(conversationId, Integer.MAX_VALUE);
            existing.addAll(messages);
            redisTemplate.opsForValue().set(key, existing, ttl);
        }

        @Override
        public List<Message> get(String conversationId, int lastN) {
            String key = KEY_PREFIX + conversationId;
            @SuppressWarnings("unchecked")
            List<Message> messages = (List<Message>) redisTemplate.opsForValue().get(key);
            if (messages == null) return new ArrayList<>();
            int start = Math.max(0, messages.size() - lastN);
            return new ArrayList<>(messages.subList(start, messages.size()));
        }

        @Override
        public void clear(String conversationId) {
            redisTemplate.delete(KEY_PREFIX + conversationId);
        }
    }
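
    Note that the store above assumes a RedisTemplate<String, Object> whose value serializer can round-trip lists of Message objects. One plausible wiring uses Spring Data Redis's JSON serializer (a sketch; Spring AI's concrete Message types may need extra Jackson configuration to deserialize cleanly):

    RedisTemplateConfig.java (sketch)
    @Configuration
    public class RedisTemplateConfig {

        @Bean
        public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory factory) {
            RedisTemplate<String, Object> template = new RedisTemplate<>();
            template.setConnectionFactory(factory);
            // Keys stored as plain strings, values serialized as JSON
            template.setKeySerializer(new StringRedisSerializer());
            template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
            return template;
        }
    }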

    TTL (Time-To-Live): Set an appropriate expiration for conversations. 24 hours is common for customer support, while longer periods may be needed for ongoing projects or assistant-style applications.

    Managing Token Limits with Windowed Memory

    Long conversations can exceed the model's context window (e.g., 128K tokens for GPT-4o). Windowed memory keeps only the N most recent messages, automatically trimming older ones.

    Windowed Memory Configuration
    @Configuration
    public class WindowedMemoryConfig {

        @Bean
        public ChatClient chatClient(ChatClient.Builder builder, ChatMemory chatMemory) {
            return builder
                .defaultSystem("You are a helpful assistant.")
                .defaultAdvisors(
                    // Windowed memory: args are (memory, default conversation ID, window size),
                    // so only the last 10 messages are retrieved per request
                    new MessageChatMemoryAdvisor(chatMemory, "", 10))
                .build();
        }
    }

    Smart Memory with Conversation Summarization

    For the best of both worlds—unlimited conversation length while preserving key context—implement summarization. When conversations get long, older messages are condensed into a summary.

    SummarizedMemoryService.java
    @Service
    public class SummarizedMemoryService {

        private static final int SUMMARIZE_THRESHOLD = 20;

        private final ChatClient chatClient;
        private final ChatMemory chatMemory;

        public SummarizedMemoryService(ChatClient chatClient, ChatMemory chatMemory) {
            this.chatClient = chatClient;
            this.chatMemory = chatMemory;
        }

        public String chat(String sessionId, String userMessage) {
            // Check if conversation is getting long
            List<Message> history = chatMemory.get(sessionId, Integer.MAX_VALUE);
            if (history.size() > SUMMARIZE_THRESHOLD) {
                summarizeOldMessages(sessionId, history);
            }
            return chatClient.prompt()
                .user(userMessage)
                .advisors(a -> a.param(CHAT_MEMORY_CONVERSATION_ID_KEY, sessionId))
                .call()
                .content();
        }

        private void summarizeOldMessages(String sessionId, List<Message> messages) {
            // Keep last 5 messages, summarize the rest
            List<Message> toSummarize = messages.subList(0, messages.size() - 5);
            String summary = chatClient.prompt()
                .system("Summarize this conversation in 2-3 sentences, preserving key facts:")
                .user(formatMessages(toSummarize))
                .call()
                .content();

            // Clear and start fresh with summary + recent messages
            chatMemory.clear(sessionId);
            chatMemory.add(sessionId, List.of(
                new SystemMessage("Previous conversation summary: " + summary)));
            chatMemory.add(sessionId, messages.subList(messages.size() - 5, messages.size()));
        }

        private String formatMessages(List<Message> messages) {
            // Render history as plain text for the summarization prompt
            return messages.stream()
                .map(m -> m.getMessageType() + ": " + m.getContent())
                .collect(Collectors.joining("\n"));
        }
    }

    Cost consideration: Summarization requires an extra LLM call. Only trigger it when necessary (e.g., every 20 messages) to balance cost and quality.

    Memory Best Practices

    Use meaningful session IDs

    Combine user ID + context (e.g., 'user-123-order-support') for better organization and debugging.

    Set appropriate TTLs

    Expire old conversations to prevent unbounded storage growth and comply with data retention policies.

    Handle memory failures gracefully

    If Redis is unavailable, fall back to in-memory or stateless mode rather than failing completely (see the sketch at the end of this list).

    Monitor memory usage

    Track conversation lengths and storage size. Alert if conversations grow unexpectedly large.
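
    As a concrete example of the graceful-degradation practice above, a ChatMemory decorator can route around a failing primary store. The class below is a sketch; the name and wiring are illustrative:

    FallbackChatMemory.java (sketch)
    // Sketch: decorator that falls back to an in-memory store when the
    // primary (e.g. Redis-backed) store throws a DataAccessException
    public class FallbackChatMemory implements ChatMemory {

        private final ChatMemory primary;
        private final ChatMemory fallback = new InMemoryChatMemory();

        public FallbackChatMemory(ChatMemory primary) {
            this.primary = primary;
        }

        @Override
        public void add(String conversationId, List<Message> messages) {
            try {
                primary.add(conversationId, messages);
            } catch (DataAccessException e) {
                fallback.add(conversationId, messages);
            }
        }

        @Override
        public List<Message> get(String conversationId, int lastN) {
            try {
                return primary.get(conversationId, lastN);
            } catch (DataAccessException e) {
                return fallback.get(conversationId, lastN);
            }
        }

        @Override
        public void clear(String conversationId) {
            try {
                primary.clear(conversationId);
            } catch (DataAccessException e) {
                fallback.clear(conversationId);
            }
        }
    }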

    Your Chatbot Now Remembers!

    With conversation memory in place, your chatbot can maintain context, remember user preferences, and provide truly personalized experiences. Ready to go further?