Vector Databases
The backbone of semantic search and RAG systems. Vector databases store high-dimensional embeddings and enable lightning-fast similarity searches across millions of documents.
Traditional databases excel at exact matching—finding records where status = 'active'. But AI applications need semantic search: finding content that means the same thing, even if it uses different words. "How do I cancel my subscription?" should match a document titled "Ending your membership"—that's what vector databases enable.
Under the hood, text is converted into high-dimensional embedding vectors (typically 768-3072 dimensions), and similarity is computed using distance metrics like cosine similarity. Specialized indexing algorithms like HNSW make these searches blazingly fast, even across billions of vectors.
How Vector Search Works
The Problem with Traditional Search
Keyword search (like SQL LIKE or Elasticsearch BM25) only matches exact terms. Searching for "automobile repair" won't find documents about "car maintenance" unless you manually build synonym lists. This breaks down in real-world applications where users phrase things differently.
The Vector Solution
Embeddings capture meaning, not just words. Text is converted to a dense vector where semantically similar content clusters together in the vector space. Searching becomes a nearest-neighbor problem: find the K vectors closest to the query vector.
Distance Metrics
Different distance metrics measure "similarity" in different ways. Your choice affects both accuracy and performance.
Cosine Similarity
Measures angle between vectors (ignores magnitude)
Best for: Text embeddings, normalized vectors
Euclidean (L2)
Straight-line distance in vector space
Best for: Image embeddings, spatial data
Dot Product
Raw similarity score (not normalized)
Best for: Pre-normalized embeddings, ranking
Rule of Thumb: Use Cosine for text (OpenAI, Cohere embeddings). Check your embedding model's documentation—some are pre-normalized for dot product.
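The three metrics above reduce to a few lines of arithmetic. A minimal, self-contained sketch in plain Java (no vector-store dependency; the toy 3-dimensional vectors are illustrative only):

```java
import java.util.stream.IntStream;

public class DistanceMetrics {

    // Dot product: raw, unnormalized similarity
    static double dot(double[] a, double[] b) {
        return IntStream.range(0, a.length).mapToDouble(i -> a[i] * b[i]).sum();
    }

    // Cosine similarity: dot product divided by both magnitudes (ignores scale)
    static double cosine(double[] a, double[] b) {
        return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }

    // Euclidean (L2): straight-line distance between the two points
    static double euclidean(double[] a, double[] b) {
        double sum = IntStream.range(0, a.length)
                .mapToDouble(i -> (a[i] - b[i]) * (a[i] - b[i])).sum();
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {2.0, 4.0, 6.0}; // same direction, twice the magnitude

        System.out.println(cosine(a, b));    // 1.0  -- direction identical
        System.out.println(dot(a, b));       // 28.0 -- magnitude matters here
        System.out.println(euclidean(a, b)); // ~3.74 -- distance differs too
    }
}
```

Note how the scaled copy of `a` scores a perfect 1.0 under cosine but not under the other two metrics; this is why cosine is the usual choice for text embeddings, where vector length carries little meaning.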
Vector Store Comparison
| Store | Type | Best For | Max Vectors | Pricing |
|---|---|---|---|---|
| PGVector | Self-Hosted | Existing Postgres users, small-medium scale | ~10M | Free (Postgres) |
| Chroma | Self-Hosted | Local dev, prototyping, simple deployments | ~1M | Free (Open Source) |
| Pinecone | Managed | Production at scale, zero ops | Billions | $0.025/hr + storage |
| Weaviate | Both | Hybrid search, GraphQL, multi-modal | Billions | Free tier + paid |
| Milvus | Self-Hosted | High-performance, Kubernetes deployments | Billions | Free (Open Source) |
| Qdrant | Both | Rust performance, advanced filtering | Billions | Free tier + $25/mo |
Spring AI Integration
Maven Dependencies
Add the starter for your chosen vector store
```xml
<!-- PGVector (PostgreSQL) -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>

<!-- Chroma -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-chroma-store-spring-boot-starter</artifactId>
</dependency>

<!-- Pinecone -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pinecone-store-spring-boot-starter</artifactId>
</dependency>

<!-- Weaviate -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-weaviate-store-spring-boot-starter</artifactId>
</dependency>
```

PGVector Configuration
Use your existing PostgreSQL database
```properties
# Database connection
spring.datasource.url=jdbc:postgresql://localhost:5432/vectordb
spring.datasource.username=postgres
spring.datasource.password=postgres

# PGVector settings
spring.ai.vectorstore.pgvector.index-type=HNSW
spring.ai.vectorstore.pgvector.distance-type=COSINE_DISTANCE
spring.ai.vectorstore.pgvector.dimensions=1536

# Initialize schema automatically
spring.ai.vectorstore.pgvector.initialize-schema=true
```

VectorStore Service
Complete CRUD operations
```java
@Service
public class DocumentIndexService {

    private final VectorStore vectorStore;

    public DocumentIndexService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    // Add documents with metadata
    public void index(String content, Map<String, Object> metadata) {
        Document doc = new Document(content, metadata);
        vectorStore.add(List.of(doc));
    }

    // Semantic search
    public List<Document> search(String query, int topK) {
        return vectorStore.similaritySearch(SearchRequest.query(query)
                .withTopK(topK)
                .withSimilarityThreshold(0.7));
    }

    // Search with metadata filter
    public List<Document> searchByCategory(String query, String category) {
        FilterExpressionBuilder b = new FilterExpressionBuilder();
        return vectorStore.similaritySearch(SearchRequest.query(query)
                .withTopK(5)
                .withFilterExpression(b.eq("category", category).build()));
    }

    // Delete by IDs
    public void delete(List<String> documentIds) {
        vectorStore.delete(documentIds);
    }
}
```

Indexing Algorithms
Brute-force search (comparing every vector) is O(n)—too slow for millions of vectors. Approximate Nearest Neighbor (ANN) algorithms trade perfect accuracy for massive speed gains.
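For reference, the O(n) brute-force baseline that ANN indexes exist to avoid fits in a few lines. A sketch in plain Java (class and method names are illustrative, not from any library); it scans every vector and keeps the top K by cosine similarity in a min-heap:

```java
import java.util.*;

public class BruteForceKnn {

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Exhaustive scan: compare the query against every stored vector,
    // keeping the k most similar in a min-heap. O(n log k) per query.
    static List<Integer> topK(List<double[]> index, double[] query, int k) {
        PriorityQueue<Map.Entry<Integer, Double>> heap =
                new PriorityQueue<>(Map.Entry.comparingByValue());
        for (int i = 0; i < index.size(); i++) {
            heap.offer(Map.entry(i, cosine(index.get(i), query)));
            if (heap.size() > k) heap.poll(); // evict the least similar
        }
        List<Integer> result = new ArrayList<>();
        while (!heap.isEmpty()) result.add(heap.poll().getKey());
        Collections.reverse(result); // most similar first
        return result;
    }

    public static void main(String[] args) {
        List<double[]> index = List.of(
                new double[]{1, 0},    // id 0
                new double[]{0, 1},    // id 1
                new double[]{0.9, 0.1} // id 2
        );
        System.out.println(topK(index, new double[]{1, 0}, 2)); // [0, 2]
    }
}
```

The inner loop touches every vector, which is exactly what makes it infeasible at scale; the algorithms below replace that full scan with graph traversal or cluster probing.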
HNSW (Hierarchical Navigable Small World)
The gold standard for most use cases. Builds a multi-layer graph for navigating to nearest neighbors. Excellent query speed with high recall (typically 95-99%).
- ✓ Best query performance
- ✓ Good recall
- ✗ Higher memory usage
- ✗ Slower index builds
IVF (Inverted File Index)
Clusters vectors and searches only relevant clusters. Lower memory than HNSW but requires tuning the number of clusters (nlist) and probes (nprobe).
- ✓ Lower memory
- ✓ Faster index builds
- ✗ Needs tuning
- ✗ Lower recall if misconfigured
Recommendation: Use HNSW unless you have memory constraints. PGVector, Pinecone, and Weaviate all default to HNSW.
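If you manage the pgvector schema yourself instead of letting Spring AI initialize it, the HNSW index is created with the pgvector extension's DDL. A sketch (the table and column names are illustrative; `m` and `ef_construction` trade index build time and memory against recall, and the values shown are pgvector's documented defaults, not tuned settings):

```sql
-- pgvector HNSW index using cosine distance on an embedding column
CREATE INDEX ON vector_store
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```

Raising `m` and `ef_construction` improves recall at the cost of memory and slower builds, which is the HNSW trade-off listed above.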
Performance Optimization
✓ Do
- Batch inserts—add 100+ documents per call
- Use metadata filters—narrow search space
- Set appropriate topK—don't retrieve more than needed
- Cache frequent queries—embeddings are deterministic
- Monitor recall—test with known-good results
✗ Avoid
- Single-document inserts in loops
- Very low similarity thresholds (returns noise)
- Storing raw text in vector DB (use metadata reference)
- Mixing embedding models (index vs query)
- Ignoring index warm-up time after restarts
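The first "Do" and the first "Avoid" above are two sides of the same fix: chunk documents before calling the store rather than inserting one at a time. A minimal partitioning sketch in plain Java (the `partition` helper and batch size of 100 are illustrative; `vectorStore.add` and `Document` are the Spring AI types used earlier):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchIndexer {

    // Split a list into fixed-size chunks so each vector-store call
    // carries many documents instead of one.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 250; i++) ids.add(i);

        // 250 items in batches of 100 -> 3 store calls instead of 250
        System.out.println(partition(ids, 100).size()); // 3

        // With Spring AI this becomes (sketch):
        // partition(documents, 100).forEach(vectorStore::add);
    }
}
```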
Which Vector Store Should You Use?
Use PGVector if...
- You already use PostgreSQL
- You need ACID transactions with vectors
- You have <10 million vectors
- You want to keep infrastructure simple
Use Pinecone if...
- You want zero infrastructure management
- You need to scale to billions of vectors
- Low latency is critical (<50ms P99)
- You have budget for managed services
Use Chroma if...
- You're prototyping or learning
- You need something running in 5 minutes
- You want embedded (in-process) mode
- Open source is a hard requirement
Use Weaviate if...
- You need hybrid search (vector + keyword)
- You want built-in ML model inference
- GraphQL API is preferred
- Multi-tenancy is required