Vector Databases
What They Are
Vector databases store and search high-dimensional vectors, which are arrays of numbers that represent data in embedding space. They find vectors “similar” to a query vector, enabling semantic search, recommendation, and AI-powered applications.
The rise of machine learning made vectors ubiquitous. When you pass text through a language model, an image through a vision model, or user behavior through a recommendation model, you get a vector, typically with hundreds or thousands of dimensions. These vectors capture semantic meaning: similar items have similar vectors. “How do I reset my password?” and “I forgot my login credentials” share almost no words but mean similar things, producing vectors that are geometrically close.
Traditional databases can’t search vectors efficiently. Standard indexes offer no shortcut for nearest-neighbor queries, so finding the vectors closest to a query would require comparing it against every vector in the database. Vector databases use specialized index structures that make approximate nearest-neighbor search tractable at scale.
Data Structure
┌────────────────────────────────────────────────────────────────────────────┐
│ VECTOR COLLECTION: documents                                               │
├──────────┬────────────────────────────────────────┬────────────────────────┤
│ ID       │ VECTOR (768 dimensions)                │ METADATA               │
├──────────┼────────────────────────────────────────┼────────────────────────┤
│ doc_1    │ [0.12, -0.34, 0.56, ..., 0.89]         │ {category: "support",  │
│          │  ↑                                     │  date: "2024-01-15"}   │
│          │  Embedding from ML model               │                        │
├──────────┼────────────────────────────────────────┼────────────────────────┤
│ doc_2    │ [0.11, -0.33, 0.58, ..., 0.87]         │ {category: "support"}  │
│          │  ↑ Similar vector = similar meaning    │                        │
├──────────┼────────────────────────────────────────┼────────────────────────┤
│ doc_3    │ [-0.45, 0.78, -0.12, ..., 0.23]        │ {category: "billing"}  │
│          │  ↑ Different vector = different topic  │                        │
└──────────┴────────────────────────────────────────┴────────────────────────┘

Query:   Find vectors nearest to [0.13, -0.35, 0.55, ..., 0.88]
         with category = "support"

Results: doc_1 (similarity: 0.98)   ← Very similar
         doc_2 (similarity: 0.96)   ← Similar
         doc_3 not returned         ← Filtered by metadata
Each record stores a high-dimensional vector (the embedding) plus optional metadata. Queries find the K nearest neighbors to a query vector, optionally filtered by metadata.
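The K-nearest-neighbor query can be sketched as a brute-force cosine search. This is a toy with made-up 4-dimensional vectors standing in for 768-dimensional embeddings, so the similarity scores won’t match the illustration above:

```python
import numpy as np

# Toy collection: 4-dim vectors standing in for real embeddings
collection = {
    "doc_1": np.array([0.12, -0.34, 0.56, 0.89]),
    "doc_2": np.array([0.11, -0.33, 0.58, 0.87]),
    "doc_3": np.array([-0.45, 0.78, -0.12, 0.23]),
}

def top_k(query, k=2):
    # Cosine similarity: dot product of the normalized vectors
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    # Score every vector (brute force), then keep the k best
    scored = sorted(
        ((cos(query, vec), doc_id) for doc_id, vec in collection.items()),
        reverse=True,
    )
    return [(doc_id, round(score, 3)) for score, doc_id in scored[:k]]

results = top_k(np.array([0.13, -0.35, 0.55, 0.88]))
# doc_1 and doc_2 rank highest; doc_3's vector points elsewhere
```

Real vector databases replace the linear scan inside `top_k` with an index, but the contract is the same: a query vector in, the K most similar records out.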
How They Work
Vector Embeddings
Data enters as vectors, typically generated by ML models. A sentence goes through a transformer model; out comes a 768-dimensional vector. An image goes through a CNN; out comes a 512-dimensional vector. The database stores these vectors alongside metadata.
Distance Metrics
“Similarity” is measured geometrically:
- Cosine similarity: Measures the angle between vectors (common for text)
- Euclidean distance: Measures straight-line distance
- Dot product: Measures how much vectors point in the same direction
The right metric depends on how embeddings were trained.
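The three metrics can be computed directly with NumPy. The vector values below are illustrative, not from any real model:

```python
import numpy as np

a = np.array([0.12, -0.34, 0.56, 0.89])
b = np.array([0.11, -0.33, 0.58, 0.87])

# Cosine similarity: angle between vectors, in [-1, 1]; 1 = same direction
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Euclidean distance: straight-line distance; 0 = identical vectors
euclidean = np.linalg.norm(a - b)

# Dot product: large when vectors are long AND point the same way
dot = np.dot(a, b)

print(f"cosine={cosine:.4f}  euclidean={euclidean:.4f}  dot={dot:.4f}")
```

Note that for vectors normalized to unit length, cosine similarity, Euclidean distance, and dot product all produce the same ranking, which is why many databases normalize on ingest.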
Approximate Nearest Neighbor (ANN) Indexes
The key innovation. Rather than comparing a query against every stored vector, ANN indexes organize vectors into structures that narrow the search space:
HNSW (Hierarchical Navigable Small World) builds a multi-layer graph where you can navigate from any point to similar points through short hops.
IVF (Inverted File Index) clusters vectors into buckets and only searches relevant buckets.
These trade perfect accuracy for massive speed improvements. Instead of checking a million vectors, you check a few thousand.
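The IVF idea can be illustrated with a pure-NumPy toy on synthetic data. Real IVF implementations train centroids with k-means; here random dataset points stand in for trained centroids:

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(10_000, 64)).astype(np.float32)

# "Train": pick centroids (a stand-in for k-means) and assign each
# vector to the bucket of its nearest centroid.
n_buckets = 100
centroids = vectors[rng.choice(len(vectors), n_buckets, replace=False)]

def nearest_centroid(X, C):
    # Squared-distance trick: ||x||^2 - 2 x.c + ||c||^2, avoids a huge
    # intermediate difference tensor
    d2 = (X**2).sum(axis=1)[:, None] - 2 * X @ C.T + (C**2).sum(axis=1)[None, :]
    return np.argmin(d2, axis=1)

assignments = nearest_centroid(vectors, centroids)
buckets = {b: np.where(assignments == b)[0] for b in range(n_buckets)}

def ivf_search(query, k=5, n_probe=5):
    # Probe only the n_probe buckets nearest the query (~500 candidates
    # here instead of 10,000), then search those exactly.
    probe = np.argsort(np.linalg.norm(centroids - query, axis=1))[:n_probe]
    candidates = np.concatenate([buckets[b] for b in probe])
    dists = np.linalg.norm(vectors[candidates] - query, axis=1)
    return candidates[np.argsort(dists)[:k]]

# A query very close to vector 42; compare approximate vs. exact search
query = vectors[42] + 0.01 * rng.normal(size=64)
approx = ivf_search(query)
exact = np.argsort(np.linalg.norm(vectors - query, axis=1))[:5]
```

The approximate result usually matches the exact one, but can miss neighbors whose buckets weren’t probed; raising `n_probe` trades speed back for recall, which is the knob real IVF indexes expose.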
Hybrid Search
Most vector databases support combining vector similarity with traditional filters: “Find documents similar to this query AND where category = 'support' AND created_date > last_week.” This requires indexing both vectors and metadata.
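A hybrid query of this shape can be sketched as a metadata pre-filter followed by a vector search. The field names and dates are illustrative, and production databases interleave filtering with the ANN index rather than scanning linearly:

```python
import numpy as np
from datetime import date

records = [
    {"id": "doc_1", "vec": np.array([0.12, -0.34, 0.56, 0.89]),
     "category": "support", "created": date(2024, 1, 15)},
    {"id": "doc_2", "vec": np.array([0.11, -0.33, 0.58, 0.87]),
     "category": "support", "created": date(2023, 11, 2)},
    {"id": "doc_3", "vec": np.array([-0.45, 0.78, -0.12, 0.23]),
     "category": "billing", "created": date(2024, 1, 20)},
]

def hybrid_search(query_vec, category, created_after, k=5):
    # 1. Metadata filter (a real database uses metadata indexes here)
    candidates = [r for r in records
                  if r["category"] == category and r["created"] > created_after]
    # 2. Rank the survivors by cosine similarity to the query vector
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    candidates.sort(key=lambda r: cos(query_vec, r["vec"]), reverse=True)
    return [r["id"] for r in candidates[:k]]

hits = hybrid_search(np.array([0.13, -0.35, 0.55, 0.88]),
                     category="support", created_after=date(2024, 1, 1))
# Only doc_1 survives both filters, despite doc_2's very similar vector
```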
Why They Excel
Semantic Understanding
Traditional search matches keywords. Vector search matches meaning. This enables “find similar” functionality that keyword search cannot achieve.
ML Integration
Vector databases are the storage layer for ML-powered features. They’re designed to work with the embeddings that modern models produce.
Scale
Specialized indexes search billions of vectors in milliseconds.
Why They Struggle
No Exact Match
Vector search returns approximate results based on similarity. If you need exact retrieval, it’s the wrong tool.
Embedding Quality Dependency
The database is only as good as the embeddings. Garbage vectors in, garbage results out.
Operational Maturity
Vector databases are newer than other categories. Tooling, best practices, and operational knowledge are still developing.
When to Use Them
Vector databases power:
- Semantic search: Find documents by meaning, not keywords
- RAG (Retrieval-Augmented Generation): Retrieve relevant context from your data to feed to GPT or Claude
- Recommendation systems: Find similar products, content, or users
- Image/audio search: Find visually or acoustically similar media
- Anomaly detection: Identify vectors far from normal clusters
When to Look Elsewhere
If you don’t have embeddings or aren’t using ML models, vector databases don’t apply. For exact matching, use traditional databases. For datasets under roughly 100,000 items, exact brute-force search is often fast enough that the cost of building and tuning an ANN index isn’t worth it.
Examples
Pinecone is a managed service focused on simplicity and production reliability, with serverless scaling.
Weaviate is open source with a GraphQL API and hybrid search combining vectors with BM25 keyword matching.
Milvus scales to billions of vectors and offers multiple index types and deployment options.
Qdrant is written in Rust for performance, with strong filtering capabilities alongside vector search.
pgvector is a PostgreSQL extension that adds vector operations to an existing relational database, often the simplest path for applications already using PostgreSQL.
Chroma focuses on developer experience for prototyping and smaller-scale applications.