Configuring Multiple LLMs in Spring AI
Use OpenAI, Anthropic, Ollama, and other providers together in a single application for cost optimization and specialized tasks
1. Why Use Multiple LLMs?
Different AI models excel at different tasks. By configuring multiple LLMs, you can optimize for cost, performance, and capabilities:
Cost Optimization
Use cheaper models for simple tasks, expensive models for complex ones
Latency Control
Fast local models for quick responses, cloud models for quality
Specialization
Code models for programming, vision models for images
Fallback & Redundancy
Switch providers if one is down or rate-limited
Real-World Example
Use GPT-4o for complex reasoning, Claude for long documents, Llama via Ollama for privacy-sensitive local processing, and GPT-3.5-turbo for simple classification tasks.
2. Configuration Setup
Step 1: Add Dependencies for Each Provider
<!-- OpenAI -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

<!-- Anthropic Claude -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-anthropic-spring-boot-starter</artifactId>
</dependency>

<!-- Ollama (local models) -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
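The starters above omit explicit versions; in a typical setup these are managed by importing the Spring AI BOM. A minimal sketch, assuming the spring-ai.version property points at your chosen release:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>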
Step 2: Configure API Keys

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o
          temperature: 0.7
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: claude-3-5-sonnet-20241022
          max-tokens: 4096
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: llama3.2

Auto-Configuration Note
When multiple starters are present, Spring AI auto-configures a chat model for each provider. Because several ChatModel beans then exist in the context, you must either qualify which one to inject or define named ChatClient beans, as shown in the next section.
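If the auto-configured default ChatClient.Builder complains about multiple ChatModel beans, you can disable it and build every client yourself (as this tutorial does). A minimal sketch using the property Spring AI provides for this:

spring:
  ai:
    chat:
      client:
        enabled: false  # disable the default ChatClient.Builder auto-configuration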
3. Creating Named ChatClients
Define Multiple ChatClient Beans
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.ai.anthropic.AnthropicChatModel;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class ChatClientConfig {

    @Bean
    @Primary // Default when no qualifier specified
    public ChatClient openAiChatClient(OpenAiChatModel openAiModel) {
        return ChatClient.builder(openAiModel)
                .defaultSystem("You are a helpful assistant powered by GPT-4o.")
                .build();
    }

    @Bean
    public ChatClient claudeChatClient(AnthropicChatModel claudeModel) {
        return ChatClient.builder(claudeModel)
                .defaultSystem("You are Claude, an AI assistant by Anthropic.")
                .build();
    }

    @Bean
    public ChatClient ollamaChatClient(OllamaChatModel ollamaModel) {
        return ChatClient.builder(ollamaModel)
                .defaultSystem("You are a local AI assistant running on Ollama.")
                .build();
    }
}
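Because openAiChatClient is marked @Primary, any unqualified ChatClient injection point receives it. A minimal sketch; DefaultChatService is an illustrative name, not part of the tutorial's code:

@Service
public class DefaultChatService {

    // No @Qualifier, so Spring injects the @Primary bean (openAiChatClient)
    private final ChatClient chatClient;

    public DefaultChatService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String ask(String question) {
        return chatClient.prompt().user(question).call().content();
    }
}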
Inject Specific ChatClients

One caveat when combining Lombok's @RequiredArgsConstructor with field-level @Qualifier: Lombok does not copy @Qualifier onto the generated constructor parameters, so without extra configuration every field would silently receive the @Primary bean. Opt in via lombok.config as noted in the comment below, or use an explicit constructor as the router in the next section does.

// lombok.config at the project root must contain:
// lombok.copyableAnnotations += org.springframework.beans.factory.annotation.Qualifier

@Service
@RequiredArgsConstructor
public class MultiModelService {

    // Inject by bean name using @Qualifier
    @Qualifier("openAiChatClient")
    private final ChatClient openAiClient;

    @Qualifier("claudeChatClient")
    private final ChatClient claudeClient;

    @Qualifier("ollamaChatClient")
    private final ChatClient ollamaClient;

    public String askOpenAI(String question) {
        return openAiClient.prompt().user(question).call().content();
    }

    public String askClaude(String question) {
        return claudeClient.prompt().user(question).call().content();
    }

    public String askLocal(String question) {
        return ollamaClient.prompt().user(question).call().content();
    }
}
4. Smart Router Pattern
Create a router service that automatically selects the best model based on the task:
@Service
@Slf4j
public class LLMRouter {

    private final ChatClient openAiClient;
    private final ChatClient claudeClient;
    private final ChatClient ollamaClient;

    public LLMRouter(@Qualifier("openAiChatClient") ChatClient openAiClient,
                     @Qualifier("claudeChatClient") ChatClient claudeClient,
                     @Qualifier("ollamaChatClient") ChatClient ollamaClient) {
        this.openAiClient = openAiClient;
        this.claudeClient = claudeClient;
        this.ollamaClient = ollamaClient;
    }

    public String route(String query, TaskType taskType) {
        ChatClient selectedClient = switch (taskType) {
            case CODE_GENERATION -> {
                log.info("Using GPT-4o for code generation");
                yield openAiClient;
            }
            case LONG_DOCUMENT -> {
                log.info("Using Claude for long document (200K context)");
                yield claudeClient;
            }
            case SIMPLE_CLASSIFICATION -> {
                log.info("Using local Ollama for simple task");
                yield ollamaClient;
            }
            case SENSITIVE_DATA -> {
                log.info("Using local Ollama for privacy");
                yield ollamaClient;
            }
            default -> openAiClient;
        };
        return selectedClient.prompt().user(query).call().content();
    }

    public enum TaskType {
        CODE_GENERATION, LONG_DOCUMENT, SIMPLE_CLASSIFICATION, SENSITIVE_DATA, GENERAL
    }
}

Usage Example
@RestController
@RequestMapping("/api/ai")
@RequiredArgsConstructor
public class AIController {

    private final LLMRouter router;

    @PostMapping("/ask")
    public ResponseEntity<String> ask(@RequestParam String query,
                                      @RequestParam(defaultValue = "GENERAL") LLMRouter.TaskType taskType) {
        String response = router.route(query, taskType);
        return ResponseEntity.ok(response);
    }
}

Cost Savings
Routing simple tasks to cheaper or local models can cut API costs dramatically; savings in the range of 60-80% are plausible, though the actual figure depends on your traffic mix and provider pricing.
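If callers cannot be trusted to pick a TaskType, you can layer a cheap heuristic classifier in front of the router. A minimal sketch; AutoRoutingService and its keyword rules are illustrative assumptions, not a prescribed design:

@Service
@RequiredArgsConstructor
public class AutoRoutingService {

    private final LLMRouter router;

    public String classifyAndRoute(String query) {
        return router.route(query, guessTaskType(query));
    }

    // Naive keyword/length heuristics -- tune or replace with a real classifier
    private LLMRouter.TaskType guessTaskType(String query) {
        String q = query.toLowerCase();
        if (q.contains("code") || q.contains("function") || q.contains("refactor")) {
            return LLMRouter.TaskType.CODE_GENERATION;
        }
        if (query.length() > 10_000) {
            return LLMRouter.TaskType.LONG_DOCUMENT;
        }
        return LLMRouter.TaskType.GENERAL;
    }
}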
5. Fallback & Retry Pattern
Implement resilient AI calls with automatic fallback to alternative providers:
@Service
@Slf4j
public class ResilientAIService {

    private final List<ChatClient> clientPriorityList;

    public ResilientAIService(@Qualifier("openAiChatClient") ChatClient openAi,
                              @Qualifier("claudeChatClient") ChatClient claude,
                              @Qualifier("ollamaChatClient") ChatClient ollama) {
        // Priority order: OpenAI -> Claude -> Ollama (local fallback)
        this.clientPriorityList = List.of(openAi, claude, ollama);
    }

    public String callWithFallback(String prompt) {
        for (int i = 0; i < clientPriorityList.size(); i++) {
            ChatClient client = clientPriorityList.get(i);
            try {
                log.info("Attempting provider {} of {}", i + 1, clientPriorityList.size());
                return client.prompt().user(prompt).call().content();
            } catch (Exception e) {
                log.warn("Provider {} failed: {}. Trying next...", i + 1, e.getMessage());
                if (i == clientPriorityList.size() - 1) {
                    throw new RuntimeException("All providers failed", e);
                }
            }
        }
        throw new RuntimeException("No providers available");
    }
}

Production Tip
Consider using Spring Retry or Resilience4j for more sophisticated retry policies with exponential backoff and circuit breakers.
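For example, with resilience4j-spring-boot3 on the classpath, annotations can add a retry and circuit breaker around the primary provider and fall back to the local model. A minimal sketch; the "openai" instance name and its backoff settings are assumptions you would define in application.yml:

import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.retry.annotation.Retry;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

@Service
public class GuardedAIService {

    private final ChatClient openAiClient;
    private final ChatClient ollamaClient;

    public GuardedAIService(@Qualifier("openAiChatClient") ChatClient openAiClient,
                            @Qualifier("ollamaChatClient") ChatClient ollamaClient) {
        this.openAiClient = openAiClient;
        this.ollamaClient = ollamaClient;
    }

    // Retries per the "openai" config, then the breaker opens;
    // on failure, fallbackToLocal answers from the Ollama client instead.
    @Retry(name = "openai")
    @CircuitBreaker(name = "openai", fallbackMethod = "fallbackToLocal")
    public String ask(String prompt) {
        return openAiClient.prompt().user(prompt).call().content();
    }

    private String fallbackToLocal(String prompt, Throwable t) {
        return ollamaClient.prompt().user(prompt).call().content();
    }
}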
What You've Learned
Multi-LLM Benefits
Cost, latency, and specialization
Configuration
Multiple provider setup in YAML
Named Beans
@Qualifier for specific clients
Router Pattern
Task-based model selection
Fallback Pattern
Resilient multi-provider calls
Cost Optimization
Substantial savings (potentially 60-80%) with smart routing