OpenAI Integration
Integrate OpenAI's GPT models into your Spring Boot applications with a unified API, streaming support, and rigorous type safety.
OpenAI has set the standard for modern AI capabilities with models like GPT-4, GPT-4 Turbo, and GPT-4o. Spring AI provides first-class support for OpenAI, giving Java developers access to chat completions, embeddings, image generation with DALL-E, speech synthesis, and audio transcription—all through a consistent, type-safe API.
The OpenAI integration handles connection management, automatic retries, rate limiting backoff, and streaming. You write business logic; Spring AI handles the infrastructure. And when you need to switch to Azure OpenAI for enterprise deployment, your code stays the same—only configuration changes.
Available OpenAI Models
Chat Models
- GPT-4o — Latest
- GPT-4 Turbo — 128k context
- GPT-4 — 8k/32k context
- GPT-3.5 Turbo — Budget
Other Capabilities
- DALL-E 3 — Image generation
- Whisper — Speech to text
- TTS — Text to speech
- Vision — Image understanding
Core Concepts
Tokens & Context
LLMs process text in chunks called tokens (one token ≈ 0.75 English words). Each model has a strict context window: GPT-4 Turbo supports 128k tokens, while GPT-4 supports 8k/32k. The limit covers both input and output tokens.
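As a back-of-envelope check (not a Spring AI feature), you can estimate whether a prompt will fit a model's window before sending it. The sketch below uses the rough ~4-characters-per-token heuristic; a real tokenizer library gives exact counts.

```java
// Rough token estimation - a heuristic only, not a real tokenizer.
public final class TokenBudget {

    private static final int GPT_4_TURBO_CONTEXT = 128_000;

    // ~4 characters per token is a common approximation for English text
    public static int estimateTokens(String text) {
        return Math.max(1, text.length() / 4);
    }

    // Does prompt + requested completion fit in the context window?
    public static boolean fitsContext(String prompt, int maxCompletionTokens) {
        return estimateTokens(prompt) + maxCompletionTokens <= GPT_4_TURBO_CONTEXT;
    }
}
```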
Chat Roles
System: Sets behavioral guidelines and persona.
User: The actual query or input from humans.
Assistant: The model's response (use for conversation history).
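In Spring AI, each role maps to a dedicated message type that you can assemble into a conversation; the message strings below are illustrative.

```java
import java.util.List;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;

// System sets the persona, User carries the query, Assistant holds prior replies.
List<Message> conversation = List.of(
        new SystemMessage("You are a concise technical assistant."),       // behavioral guidelines
        new UserMessage("What is a context window?"),                      // human input
        new AssistantMessage("It's the model's per-request token limit."), // earlier model reply
        new UserMessage("How large is it for GPT-4 Turbo?"));              // follow-up question

// Pass the list straight to chatClient.prompt().messages(conversation)
```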
Temperature
Controls randomness (0.0 to 2.0). Use 0.0-0.3 for factual/code tasks, 0.7-1.0 for creative writing. Higher values increase variety but may reduce coherence.
Understanding these concepts is essential for cost control and output quality. Token limits determine how much context you can provide—for RAG applications, you might use 80% of the context for retrieved documents and 20% for the actual conversation. Temperature dramatically affects output: customer support bots should use low temperature for consistent answers, while brainstorming assistants benefit from higher values.
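One way to encode that split is to define one client per use case. This is a sketch assuming the auto-configured ChatClient.Builder; the bean names are made up, and the builder methods follow the style used in the examples later in this article.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatClientConfig {

    @Bean
    ChatClient supportClient(ChatClient.Builder builder) {    // hypothetical bean name
        return builder
                .defaultOptions(OpenAiChatOptions.builder()
                        .temperature(0.2)   // low: consistent support answers
                        .build())
                .build();
    }

    @Bean
    ChatClient brainstormClient(ChatClient.Builder builder) { // hypothetical bean name
        return builder
                .defaultOptions(OpenAiChatOptions.builder()
                        .temperature(0.9)   // high: varied, creative output
                        .build())
                .build();
    }
}
```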
Getting Started
Configuration
Add the Spring AI OpenAI starter dependency and configure your API key.
```xml
<!-- Maven Dependency -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

<!-- Add Spring AI BOM for version management -->
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0-M4</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```
Application Properties
Configure your OpenAI connection settings
```properties
# Required: Your OpenAI API key (use environment variable in production)
spring.ai.openai.api-key=${OPENAI_API_KEY}

# Model selection (default: gpt-4o)
spring.ai.openai.chat.options.model=gpt-4o

# Generation parameters
spring.ai.openai.chat.options.temperature=0.7
spring.ai.openai.chat.options.max-tokens=2000
spring.ai.openai.chat.options.top-p=1.0

# Optional: Organization ID (for teams)
spring.ai.openai.organization-id=org-xxxx

# Optional: Override base URL (for proxies or Azure)
# spring.ai.openai.base-url=https://your-proxy.com/v1
```
Never commit API keys! Use environment variables (`export OPENAI_API_KEY=sk-...`) or Spring's `@Value("${OPENAI_API_KEY}")` injection.
Basic Usage
Chat Service Implementation
```java
import java.util.List;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.Message;
import org.springframework.stereotype.Service;

import reactor.core.publisher.Flux;

@Service
public class OpenAIChatService {

    private final ChatClient chatClient;

    public OpenAIChatService(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultSystem("""
                        You are a helpful customer service assistant for TechStore.
                        Be friendly, concise, and helpful. If you don't know something,
                        say so honestly rather than making up information.
                        """)
                .build();
    }

    // Simple chat - returns the complete response
    public String chat(String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                .call()
                .content();
    }

    // Streaming - returns tokens as they're generated
    public Flux<String> streamChat(String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                .stream()
                .content();
    }

    // With conversation history
    public String chatWithHistory(String userMessage, List<Message> history) {
        return chatClient.prompt()
                .messages(history)
                .user(userMessage)
                .call()
                .content();
    }
}
```
The ChatClient.Builder is injected automatically by Spring AI. Use .defaultSystem() to set a persona that applies to all requests—this is where you define your assistant's behavior, personality, and constraints. The system prompt is crucial for consistent, on-brand responses.
Advanced Features
Streaming with SSE
Stream responses to your frontend in real-time using Server-Sent Events.
@GetMapping(value ="/chat/stream",
produces =MediaType.TEXT_EVENT_STREAM_VALUE)publicFlux<String>streamChat(@RequestParamString message){return chatClient.prompt().user(message).stream().content().map(chunk ->"data: "+ chunk +"\n\n");}Structured Output (JSON Mode)
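If the frontend needs named events or ids, each chunk can be wrapped in Spring's ServerSentEvent instead of a raw string; the endpoint path and event name here are illustrative.

```java
import org.springframework.http.codec.ServerSentEvent;

// Same controller as above; emits named 'token' events (name is illustrative).
@GetMapping(value = "/chat/stream-events", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<ServerSentEvent<String>> streamChatEvents(@RequestParam String message) {
    return chatClient.prompt()
            .user(message)
            .stream()
            .content()
            .map(chunk -> ServerSentEvent.builder(chunk)
                    .event("token")
                    .build());
}
```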
Structured Output (JSON Mode)
Parse AI responses directly into Java objects—no regex needed.
```java
import java.math.BigDecimal;
import java.util.List;

public record ProductInfo(
        String name,
        String category,
        BigDecimal price,
        List<String> features) {
}

public ProductInfo extractProduct(String description) {
    return chatClient.prompt()
            .user("Extract product info: " + description)
            .call()
            .entity(ProductInfo.class);
}
```
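A hypothetical call, with made-up product text, to show what comes back:

```java
// Illustrative input and the expected shape of the extracted record
ProductInfo info = extractProduct(
        "The AcmeBook Pro is a 14-inch electronics-category laptop priced at "
        + "$1,299, with 16GB RAM and a 512GB SSD.");

System.out.println(info.name());     // e.g. "AcmeBook Pro"
System.out.println(info.price());    // e.g. 1299.00
System.out.println(info.features()); // e.g. [16GB RAM, 512GB SSD]
```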
Custom Model Parameters
Override default settings per request for fine-grained control over model behavior.
```java
import org.springframework.ai.openai.OpenAiChatOptions;

public String generateCreativeContent(String prompt) {
    return chatClient.prompt()
            .user(prompt)
            .options(OpenAiChatOptions.builder()
                    .model("gpt-4o")
                    .temperature(0.9)       // More creative
                    .maxTokens(2000)        // Longer responses
                    .topP(0.95)             // Nucleus sampling
                    .presencePenalty(0.6)   // Encourage new topics
                    .frequencyPenalty(0.3)  // Reduce repetition
                    .build())
            .call()
            .content();
}

public String generateCode(String spec) {
    return chatClient.prompt()
            .user(spec)
            .options(OpenAiChatOptions.builder()
                    .model("gpt-4-turbo")
                    .temperature(0.0)   // Deterministic
                    .maxTokens(4000)    // Room for code
                    .build())
            .call()
            .content();
}
```
Vision: Image Understanding
GPT-4o and GPT-4 Vision can analyze images. Pass images as URLs or base64-encoded data.
```java
public String analyzeImage(String imageUrl, String question) {
    var userMessage = new UserMessage(question,
            List.of(new Media(MimeTypeUtils.IMAGE_PNG, imageUrl)));
    return chatClient.prompt()
            .messages(userMessage)
            .call()
            .content();
}

// Example: Analyze a product image
String analysis = analyzeImage(
        "https://example.com/product.jpg",
        "Describe this product. What are its key features?");
```
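For local images, a Spring Resource can stand in for the URL; the classpath file below is illustrative.

```java
import org.springframework.core.io.ClassPathResource;

// Same service as above; sends an image bundled with the application.
public String analyzeLocalImage(String question) {
    var image = new ClassPathResource("images/product.png"); // illustrative path
    var userMessage = new UserMessage(question,
            List.of(new Media(MimeTypeUtils.IMAGE_PNG, image)));
    return chatClient.prompt()
            .messages(userMessage)
            .call()
            .content();
}
```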
Pricing & Cost Optimization

| Model | Positioning | Price (per 1M input tokens) | Best for |
| --- | --- | --- | --- |
| GPT-3.5 Turbo | Budget-friendly | $0.50 | Simple Q&A, classification, summarization |
| GPT-4o | Balanced | $5.00 | Complex tasks, code, multimodal |
| GPT-4 Turbo | Maximum context | $10.00 | Long documents, complex reasoning |
- Use GPT-3.5 Turbo for simple tasks—it's 10-20x cheaper
- Set maxTokens to prevent unexpectedly long responses
- Cache responses for repeated queries with Spring Cache (see the sketch after this list)
- Use smaller models for initial filtering, GPT-4 for final answers
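A minimal caching sketch for the third tip, assuming @EnableCaching is active and a cache named "ai-responses" exists (both names are illustrative). Caching pays off mainly for deterministic, low-temperature prompts, since high-temperature responses are intentionally varied.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class CachedChatService {

    private final ChatClient chatClient;

    public CachedChatService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // Identical messages return the cached answer without calling OpenAI again
    @Cacheable("ai-responses")   // illustrative cache name
    public String chat(String message) {
        return chatClient.prompt().user(message).call().content();
    }
}
```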
Error Handling & Rate Limits
```java
@Service
public class ResilientOpenAIService {

    private final ChatClient chatClient;

    @Retryable(
            value = {OpenAiApiException.class},
            maxAttempts = 3,
            backoff = @Backoff(delay = 1000, multiplier = 2))
    public String chat(String message) {
        try {
            return chatClient.prompt().user(message).call().content();
        } catch (OpenAiApiException e) {
            if (e.getStatusCode() == 429) {
                log.warn("Rate limit hit, retry scheduled...");
                throw e; // Will be retried
            }
            if (e.getStatusCode() == 503) {
                log.error("OpenAI service unavailable");
                throw new ServiceUnavailableException("AI service temporarily down");
            }
            throw e;
        }
    }

    @Recover
    public String fallback(OpenAiApiException e, String message) {
        log.error("All retries exhausted for: {}", message);
        return "I'm sorry, our AI service is currently unavailable. Please try again later.";
    }
}
```
OpenAI enforces rate limits based on your tier—typically requests per minute (RPM) and tokens per minute (TPM). When you hit these limits, the API returns a 429 Too Many Requests error. Spring Retry with exponential backoff handles this gracefully, automatically waiting before retrying.
Best Practices
💰 Cost Management
- Set maxTokens to prevent runaway costs
- Use GPT-3.5 Turbo for simple tasks (10x cheaper)
- Cache common queries with Spring Cache
- Monitor usage via OpenAI dashboard
🛡️ Security
- Never expose API keys in frontend code
- Sanitize user input to prevent prompt injection
- Implement rate limiting per user
- Use content moderation for user inputs
⚡ Performance
- Use streaming for better perceived latency
- Batch similar requests when possible
- Set appropriate timeouts (30-60s typical)
- Monitor response times and token usage
🎯 Quality
- Write detailed, specific system prompts
- Include examples in prompts for consistency
- Use temperature 0 for factual tasks
- Test prompts with varied inputs