Adding Memory to Chatbots
Transform your stateless chatbot into a context-aware assistant that remembers user preferences, conversation history, and key information across sessions.
In the previous tutorial, you built a basic chatbot that processes each message independently. But real conversations have context—users expect the AI to remember what was said earlier. When a customer says "I'd like to return the shoes I mentioned," your chatbot needs to know which shoes they're talking about.
Spring AI solves this with the ChatMemory abstraction and Advisors—components that intercept requests and responses to automatically inject conversation history into each prompt. This means the LLM receives not just the current message, but the full context needed to generate coherent, personalized responses.
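The ChatMemory contract itself is small. A store only needs to support three operations, sketched here (omitting the interface's default convenience overloads); the custom Redis implementation later in this tutorial implements exactly these methods:

import java.util.List;
import org.springframework.ai.chat.messages.Message;

// The three operations a ChatMemory store must support.
public interface ChatMemory {
    void add(String conversationId, List<Message> messages);  // append new messages
    List<Message> get(String conversationId, int lastN);      // fetch recent history
    void clear(String conversationId);                        // drop a conversation
}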
Why Conversation Memory Matters
❌ Without Memory: each message is processed in isolation. Asked "I'd like to return the shoes I mentioned," the bot has no idea which shoes, and the user must repeat themselves every turn.

✓ With Memory: the bot sees earlier messages alongside the current one, so it can resolve the reference, recall the user's name and preferences, and keep the conversation coherent across turns.
Choose Your Memory Strategy
Spring AI supports different memory backends. Choose based on your deployment requirements.
In-Memory (Default)
Fast, simple storage that lives in application memory. Perfect for development and testing.
Pros:
- Zero configuration
- Ultra-fast access
- No external dependencies

Cons:
- Lost on restart
- Not shared across instances
- Limited by JVM heap
Redis-Backed
Persistent, distributed storage using Redis. Ideal for production with multiple app instances.
Pros:
- Survives restarts
- Shared across instances
- TTL support

Cons:
- Requires Redis setup
- Network latency
- Additional infrastructure
Summarized Memory
Intelligent memory that summarizes old conversations to stay within token limits.
Pros:
- Unlimited conversation length
- Cost-effective
- Preserves key context

Cons:
- May lose details
- Requires extra LLM calls
- More complex implementation
Configure In-Memory Storage
Let's start with the simplest approach—in-memory storage. This is perfect for development and single-instance deployments. Spring AI's InMemoryChatMemory keeps conversations in an in-memory map, keyed by conversation ID (here, the session ID).
@Configuration
public class ChatConfig {

    @Bean
    public ChatMemory chatMemory() {
        // In-memory storage - lost on restart
        return new InMemoryChatMemory();
    }
}

Create the Memory-Aware Controller
The key to memory is the MessageChatMemoryAdvisor. On every call, this advisor automatically:
- retrieves past messages from storage
- appends them to the prompt
- saves the new exchange after the response

Together, these steps manage conversation context seamlessly.
// Static imports for the advisor parameter keys (from AbstractChatMemoryAdvisor)
import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY;
import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_RETRIEVE_SIZE_KEY;

@RestController
@RequestMapping("/api/chat")
public class ChatController {

    private final ChatClient chatClient;

    public ChatController(ChatClient.Builder builder, ChatMemory chatMemory) {
        this.chatClient = builder
            .defaultSystem("""
                You are a helpful customer support assistant for TechCorp.
                Be friendly, professional, and remember user preferences.
                If the user mentions their name, remember it for future messages.
                """)
            .defaultAdvisors(new MessageChatMemoryAdvisor(chatMemory))
            .build();
    }

    @PostMapping
    public String chat(@RequestBody ChatRequest request,
                       @RequestHeader("X-Session-Id") String sessionId) {
        return chatClient.prompt()
            .user(request.getMessage())
            .advisors(advisor -> advisor
                .param(CHAT_MEMORY_CONVERSATION_ID_KEY, sessionId)
                .param(CHAT_MEMORY_RETRIEVE_SIZE_KEY, 20))
            .call()
            .content();
    }
}

Key Parameters Explained
CHAT_MEMORY_CONVERSATION_ID_KEY: Unique identifier for each conversation. Use session IDs, user IDs, or any string that groups related messages together.
CHAT_MEMORY_RETRIEVE_SIZE_KEY: How many previous messages to include. More context = better understanding, but also higher token costs and latency.
Pro tip: The session ID should come from your authentication layer. For anonymous users, generate a UUID on first visit and store it in a cookie or local storage.
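For the anonymous-user case, a minimal sketch of that cookie flow might look like the following (SessionIdResolver and the cookie name are hypothetical, not part of Spring AI):

import jakarta.servlet.http.Cookie;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import java.util.UUID;
import org.springframework.stereotype.Component;

// Hypothetical helper: reuse the session cookie if present,
// otherwise mint a UUID and set it for future requests.
@Component
public class SessionIdResolver {

    private static final String COOKIE_NAME = "chat-session-id";

    public String resolve(HttpServletRequest request, HttpServletResponse response) {
        if (request.getCookies() != null) {
            for (Cookie cookie : request.getCookies()) {
                if (COOKIE_NAME.equals(cookie.getName())) {
                    return cookie.getValue();
                }
            }
        }
        String sessionId = UUID.randomUUID().toString();
        Cookie cookie = new Cookie(COOKIE_NAME, sessionId);
        cookie.setHttpOnly(true);
        cookie.setPath("/");
        cookie.setMaxAge(60 * 60 * 24 * 30); // keep for 30 days
        response.addCookie(cookie);
        return sessionId;
    }
}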
Test Your Memory-Enabled Chatbot
Now let's verify that memory is working. Make multiple requests with the same session ID and watch the chatbot maintain context across the conversation.
# Test conversation with memory

# First message
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -H "X-Session-Id: user-123" \
  -d '{"message": "Hi! My name is Sarah and I love hiking."}'
# Response: "Hello Sarah! It is great to meet a fellow hiking enthusiast..."

# Second message - same session
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -H "X-Session-Id: user-123" \
  -d '{"message": "What outdoor activities would you recommend for me?"}'
# Response: "Since you mentioned loving hiking, Sarah, here are some ideas..."

Persistent Memory with Redis
For production deployments, you'll want memory that survives restarts and scales across multiple instances. Here's how to implement a Redis-backed memory store with automatic expiration.
<!-- pom.xml - Add these dependencies -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<!-- Optional: For persistent memory with Redis -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

@Configuration
public class RedisChatMemoryConfig {

    @Bean
    public ChatMemory chatMemory(RedisTemplate<String, Object> redisTemplate) {
        return new RedisChatMemory(redisTemplate, Duration.ofHours(24));
    }
}

// Custom Redis implementation
public class RedisChatMemory implements ChatMemory {

    private final RedisTemplate<String, Object> redisTemplate;
    private final Duration ttl;
    private static final String KEY_PREFIX = "chat:memory:";

    public RedisChatMemory(RedisTemplate<String, Object> redisTemplate, Duration ttl) {
        this.redisTemplate = redisTemplate;
        this.ttl = ttl;
    }

    @Override
    public void add(String conversationId, List<Message> messages) {
        String key = KEY_PREFIX + conversationId;
        // Read-modify-write: wrap in a transaction or Lua script if
        // concurrent writers to the same conversation are possible
        List<Message> existing = get(conversationId, Integer.MAX_VALUE);
        existing.addAll(messages);
        redisTemplate.opsForValue().set(key, existing, ttl);
    }

    @Override
    public List<Message> get(String conversationId, int lastN) {
        String key = KEY_PREFIX + conversationId;
        @SuppressWarnings("unchecked")
        List<Message> messages = (List<Message>) redisTemplate.opsForValue().get(key);
        if (messages == null) {
            return new ArrayList<>();
        }
        int start = Math.max(0, messages.size() - lastN);
        return new ArrayList<>(messages.subList(start, messages.size()));
    }

    @Override
    public void clear(String conversationId) {
        redisTemplate.delete(KEY_PREFIX + conversationId);
    }
}

TTL (Time-To-Live): Set an appropriate expiration for conversations. 24 hours is common for customer support, while longer periods may be needed for ongoing projects or assistant-style applications.
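One detail the custom store glosses over: the RedisTemplate must be able to serialize Spring AI Message objects. A minimal sketch of such a bean, assuming JSON serialization via Jackson (verify that your Spring AI version's Message classes round-trip cleanly before relying on this):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisTemplateConfig {

    @Bean
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        // Plain string keys, e.g. "chat:memory:user-123"
        template.setKeySerializer(new StringRedisSerializer());
        // JSON values with embedded type info so the stored List<Message>
        // can be read back; assumes the Message implementations are
        // Jackson-serializable
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        return template;
    }
}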
Managing Token Limits with Windowed Memory
Long conversations can exceed the model's context window (e.g., 128K tokens for GPT-4o). Windowed memory keeps only the N most recent messages, automatically trimming older ones.
@Configuration
public class WindowedMemoryConfig {

    @Bean
    public ChatClient chatClient(ChatClient.Builder builder, ChatMemory chatMemory) {
        return builder
            .defaultSystem("You are a helpful assistant.")
            .defaultAdvisors(
                // Windowed memory - only keep the last 10 messages
                // (args: memory store, default conversation ID, window size)
                new MessageChatMemoryAdvisor(chatMemory, "", 10))
            .build();
    }
}

Smart Memory with Conversation Summarization
For the best of both worlds—unlimited conversation length while preserving key context—implement summarization. When conversations get long, older messages are condensed into a summary.
@Service
public class SummarizedMemoryService {

    private final ChatClient chatClient;
    private final ChatMemory chatMemory;
    private static final int SUMMARIZE_THRESHOLD = 20;

    public SummarizedMemoryService(ChatClient chatClient, ChatMemory chatMemory) {
        this.chatClient = chatClient;
        this.chatMemory = chatMemory;
    }

    public String chat(String sessionId, String userMessage) {
        // Check if conversation is getting long
        List<Message> history = chatMemory.get(sessionId, Integer.MAX_VALUE);
        if (history.size() > SUMMARIZE_THRESHOLD) {
            summarizeOldMessages(sessionId, history);
        }
        return chatClient.prompt()
            .user(userMessage)
            .advisors(a -> a.param(CHAT_MEMORY_CONVERSATION_ID_KEY, sessionId))
            .call()
            .content();
    }

    private void summarizeOldMessages(String sessionId, List<Message> messages) {
        // Keep last 5 messages, summarize the rest
        List<Message> toSummarize = messages.subList(0, messages.size() - 5);
        String summary = chatClient.prompt()
            .system("Summarize this conversation in 2-3 sentences, preserving key facts:")
            .user(formatMessages(toSummarize))
            .call()
            .content();
        // Clear and start fresh with summary + recent messages
        chatMemory.clear(sessionId);
        chatMemory.add(sessionId, List.of(
            new SystemMessage("Previous conversation summary: " + summary)));
        chatMemory.add(sessionId, messages.subList(messages.size() - 5, messages.size()));
    }

    private String formatMessages(List<Message> messages) {
        // Render each message as "TYPE: text" for the summarization prompt
        return messages.stream()
            .map(m -> m.getMessageType() + ": " + m.getContent())
            .collect(Collectors.joining("\n"));
    }
}

Cost consideration: Summarization requires an extra LLM call. Only trigger it when necessary (e.g., every 20 messages) to balance cost and quality.
Memory Best Practices
Use meaningful session IDs
Combine user ID + context (e.g., 'user-123-order-support') for better organization and debugging.
Set appropriate TTLs
Expire old conversations to prevent unbounded storage growth and comply with data retention policies.
Handle memory failures gracefully
If Redis is unavailable, fall back to in-memory or stateless mode rather than failing completely.
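A sketch of that fallback, as a wrapper around the Redis store (FallbackChatMemory is a hypothetical name; losing some context beats losing the conversation entirely):

import java.util.List;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.InMemoryChatMemory;
import org.springframework.ai.chat.messages.Message;
import org.springframework.dao.DataAccessException;

// Hypothetical wrapper: try Redis first, degrade to process-local
// memory when Redis is unreachable (Spring Data surfaces connection
// problems as DataAccessException subclasses).
public class FallbackChatMemory implements ChatMemory {

    private final ChatMemory primary;                  // e.g. RedisChatMemory
    private final ChatMemory fallback = new InMemoryChatMemory();

    public FallbackChatMemory(ChatMemory primary) {
        this.primary = primary;
    }

    @Override
    public void add(String conversationId, List<Message> messages) {
        try {
            primary.add(conversationId, messages);
        } catch (DataAccessException e) {
            fallback.add(conversationId, messages);    // degraded mode
        }
    }

    @Override
    public List<Message> get(String conversationId, int lastN) {
        try {
            return primary.get(conversationId, lastN);
        } catch (DataAccessException e) {
            return fallback.get(conversationId, lastN);
        }
    }

    @Override
    public void clear(String conversationId) {
        try {
            primary.clear(conversationId);
        } catch (DataAccessException e) {
            fallback.clear(conversationId);
        }
    }
}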
Monitor memory usage
Track conversation lengths and storage size. Alert if conversations grow unexpectedly large.
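One way to get that visibility, sketched with Micrometer (assumes a MeterRegistry bean, e.g. from spring-boot-starter-actuator; the metric name is made up for illustration):

import io.micrometer.core.instrument.DistributionSummary;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.stereotype.Component;

// Hypothetical metrics helper: record how long conversations grow
// so dashboards and alerts can flag runaway sessions.
@Component
public class ChatMemoryMetrics {

    private final DistributionSummary conversationLength;

    public ChatMemoryMetrics(MeterRegistry registry) {
        this.conversationLength = DistributionSummary.builder("chat.memory.conversation.length")
            .description("Stored messages per conversation")
            .register(registry);
    }

    // Call after each exchange, e.g. from the chat controller
    public void record(ChatMemory memory, String conversationId) {
        conversationLength.record(memory.get(conversationId, Integer.MAX_VALUE).size());
    }
}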