Vector Databases

    The backbone of semantic search and RAG systems. Vector databases store high-dimensional embeddings and enable lightning-fast similarity searches across millions of documents.

    Traditional databases excel at exact matching—finding records where status = 'active'. But AI applications need semantic search: finding content that means the same thing, even if it uses different words. "How do I cancel my subscription?" should match a document titled "Ending your membership"—that's what vector databases enable.

    Under the hood, text is converted to high-dimensional embedding vectors (typically 768-3072 dimensions), and similarity is computed using distance metrics like cosine similarity. Specialized indexing algorithms like HNSW make these searches blazingly fast—even across billions of vectors.

    How Vector Search Works

    The Problem with Traditional Search

    Keyword search (like SQL LIKE or Elasticsearch BM25) only matches exact terms. Searching for "automobile repair" won't find documents about "car maintenance" unless you manually build synonym lists. This breaks down in real-world applications where users phrase things differently.

    Keyword Search: "Java programming" ❌ misses "JDK development", "Spring Framework"

    The Vector Solution

    Embeddings capture meaning, not just words. Text is converted to a dense vector where semantically similar content clusters together in the vector space. Searching becomes a nearest-neighbor problem: find the K vectors closest to the query vector.

    Vector Search: "Java programming" ✓ finds "JDK", "Spring", "OOP code"

    Distance Metrics

    Different distance metrics measure "similarity" in different ways. Your choice affects both accuracy and performance.

    Cosine Similarity [Most Common]

    Measures the angle between vectors, ignoring magnitude.

    Best for: Text embeddings, normalized vectors

    Euclidean (L2) [Good Default]

    Straight-line distance in vector space.

    Best for: Image embeddings, spatial data

    Dot Product [Fastest]

    Raw similarity score (not normalized).

    Best for: Pre-normalized embeddings, ranking

    Rule of Thumb: Use Cosine for text (OpenAI, Cohere embeddings). Check your embedding model's documentation—some are pre-normalized for dot product.
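    The three metrics above reduce to a few lines of arithmetic. A minimal sketch over plain double[] vectors (the DistanceMetrics class and method names are illustrative, not from any library):

```java
// Minimal sketch of the three common distance metrics over raw double[] vectors.
public class DistanceMetrics {

    // Dot product: raw, unnormalized similarity score.
    public static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
        return sum;
    }

    // Euclidean (L2): straight-line distance in the vector space.
    public static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: the dot product divided by both magnitudes,
    // i.e. the cosine of the angle between the vectors. For unit-length
    // (pre-normalized) vectors this equals the plain dot product.
    public static double cosine(double[] a, double[] b) {
        return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }
}
```

    Note how cosine ignores magnitude: a vector and a scaled copy of it have similarity 1.0, which is why it suits text embeddings whose lengths carry little meaning.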

    Vector Store Comparison

    Store    | Type        | Best For                                    | Max Vectors | Pricing
    PGVector | Self-Hosted | Existing Postgres users, small-medium scale | ~10M        | Free (Postgres)
    Chroma   | Self-Hosted | Local dev, prototyping, simple deployments  | ~1M         | Free (Open Source)
    Pinecone | Managed     | Production at scale, zero ops               | Billions    | $0.025/hr + storage
    Weaviate | Both        | Hybrid search, GraphQL, multi-modal         | Billions    | Free tier + paid
    Milvus   | Self-Hosted | High-performance, Kubernetes deployments    | Billions    | Free (Open Source)
    Qdrant   | Both        | Rust performance, advanced filtering        | Billions    | Free tier + $25/mo

    Spring AI Integration

    Maven Dependencies

    Add the starter for your chosen vector store

    pom.xml
    <!-- PGVector (PostgreSQL) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
    </dependency>

    <!-- Chroma -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-chroma-store-spring-boot-starter</artifactId>
    </dependency>

    <!-- Pinecone -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pinecone-store-spring-boot-starter</artifactId>
    </dependency>

    <!-- Weaviate -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-weaviate-store-spring-boot-starter</artifactId>
    </dependency>

    PGVector Configuration

    Use your existing PostgreSQL database

    application.properties
    # Database connection
    spring.datasource.url=jdbc:postgresql://localhost:5432/vectordb
    spring.datasource.username=postgres
    spring.datasource.password=postgres

    # PGVector settings
    spring.ai.vectorstore.pgvector.index-type=HNSW
    spring.ai.vectorstore.pgvector.distance-type=COSINE_DISTANCE
    spring.ai.vectorstore.pgvector.dimensions=1536

    # Initialize schema automatically
    spring.ai.vectorstore.pgvector.initialize-schema=true

    VectorStore Service

    Complete CRUD operations

    DocumentIndexService.java
    @Service
    public class DocumentIndexService {

        private final VectorStore vectorStore;

        public DocumentIndexService(VectorStore vectorStore) {
            this.vectorStore = vectorStore;
        }

        // Add documents with metadata
        public void index(String content, Map<String, Object> metadata) {
            Document doc = new Document(content, metadata);
            vectorStore.add(List.of(doc));
        }

        // Semantic search
        public List<Document> search(String query, int topK) {
            return vectorStore.similaritySearch(
                SearchRequest.query(query)
                    .withTopK(topK)
                    .withSimilarityThreshold(0.7));
        }

        // Search with a metadata filter
        public List<Document> searchByCategory(String query, String category) {
            FilterExpressionBuilder b = new FilterExpressionBuilder();
            return vectorStore.similaritySearch(
                SearchRequest.query(query)
                    .withTopK(5)
                    .withFilterExpression(b.eq("category", category).build()));
        }

        // Delete by IDs
        public void delete(List<String> documentIds) {
            vectorStore.delete(documentIds);
        }
    }

    Indexing Algorithms

    Brute-force search (comparing every vector) is O(n)—too slow for millions of vectors. Approximate Nearest Neighbor (ANN) algorithms trade perfect accuracy for massive speed gains.
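    For intuition, the O(n) baseline is easy to write down: score every stored vector against the query, sort by similarity, and keep the top K. A self-contained sketch (class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Brute-force K-nearest-neighbor search: one cosine computation per stored
// vector, so cost grows linearly with the collection. This is the baseline
// that ANN indexes like HNSW exist to avoid.
public class BruteForceKnn {

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Returns the indices of the K stored vectors most similar to the query.
    public static List<Integer> topK(double[] query, double[][] vectors, int k) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < vectors.length; i++) indices.add(i);
        indices.sort(Comparator
            .comparingDouble((Integer i) -> cosine(query, vectors[i]))
            .reversed());
        return indices.subList(0, Math.min(k, indices.size()));
    }
}
```

    At millions of vectors and ~1536 dimensions, that inner loop alone makes per-query latency unacceptable, which is the motivation for the index structures below.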

    HNSW (Hierarchical Navigable Small World)

    The gold standard for most use cases. Builds a multi-layer graph for navigating to nearest neighbors. Excellent query speed with high recall (typically 95-99%).

    • ✓ Best query performance
    • ✓ Good recall
    • ✗ Higher memory usage
    • ✗ Slower index builds

    IVF (Inverted File Index)

    Clusters vectors and searches only relevant clusters. Lower memory than HNSW but requires tuning the number of clusters (nlist) and probes (nprobe).

    • ✓ Lower memory
    • ✓ Faster index builds
    • ✗ Needs tuning
    • ✗ Lower recall if misconfigured

    Recommendation: Use HNSW unless you have memory constraints. PGVector, Pinecone, and Weaviate all default to HNSW.
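    With PGVector specifically, the HNSW index itself is plain SQL. A sketch of roughly what initialize-schema sets up—the vector_store table and embedding column names are assumptions from Spring AI's default schema, so verify against your database; m and ef_construction are pgvector's index-build parameters, shown here at their defaults:

```sql
-- HNSW index over the embedding column using cosine distance.
-- Larger m / ef_construction raise recall at the cost of build time and memory.
CREATE INDEX ON vector_store
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
```

    The operator class must match your distance metric (vector_cosine_ops for cosine, vector_l2_ops for Euclidean), or the index will not be used for your queries.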

    Performance Optimization

    ✓ Do

    • Batch inserts—add 100+ documents per call
    • Use metadata filters—narrow search space
    • Set appropriate topK—don't retrieve more than needed
    • Cache frequent queries—embeddings are deterministic
    • Monitor recall—test with known-good results
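    The batch-insert advice is framework-independent; one way to apply it is a small partitioning helper, sketched below (the Batcher name and the batch size of 100 are assumptions, not Spring AI API):

```java
import java.util.ArrayList;
import java.util.List;

// Splits a large list into fixed-size chunks so each vector-store call
// carries many documents instead of one.
public class Batcher {

    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

    In an indexing service you would then call vectorStore.add(batch) once per partition rather than once per document, turning thousands of round-trips into a handful.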

    ✗ Avoid

    • Single-document inserts in loops
    • Very low similarity thresholds (returns noise)
    • Storing raw text in vector DB (use metadata reference)
    • Mixing embedding models (index vs query)
    • Ignoring index warm-up time after restarts

    Which Vector Store Should You Use?

    Use PGVector if...

    • You already use PostgreSQL
    • You need ACID transactions with vectors
    • You have <10 million vectors
    • You want to keep infrastructure simple

    Use Pinecone if...

    • You want zero infrastructure management
    • You need to scale to billions of vectors
    • Low latency is critical (<50ms P99)
    • You have budget for managed services

    Use Chroma if...

    • You're prototyping or learning
    • You need something running in 5 minutes
    • You want embedded (in-process) mode
    • Open source is a hard requirement

    Use Weaviate if...

    • You need hybrid search (vector + keyword)
    • You want built-in ML model inference
    • GraphQL API is preferred
    • Multi-tenancy is required

    Start Building with Vectors

    Vector databases unlock semantic search, RAG, and recommendation systems. Start with PGVector for simplicity or Pinecone for scale.