Vector Search Algorithms: Finding Needles in Billion-Point Haystacks

How does Spotify find similar songs from 100 million tracks in 3 milliseconds?

Answer: Really clever math called ANN (Approximate Nearest Neighbor).

The Naive Approach (Too Slow)

Brute force:

def find_similar(query_vector, all_vectors, k=10):
    scores = []
    for i, vector in enumerate(all_vectors):  # Check every single one!
        score = cosine_similarity(query_vector, vector)  # defined in "The Math" section below
        scores.append((score, i))
    scores.sort(reverse=True)  # Highest similarity first
    return [i for _, i in scores[:k]]  # Indices of the top k matches

Problem: For 1 million vectors, this takes about 2 seconds.

For 1 billion? A thousand times more work, so roughly 33 minutes.

Unacceptable.

HNSW: The Graph Approach

Idea: Build a graph where similar vectors are connected.

Think "6 degrees of Kevin Bacon" but for vectors.

How it works:

  1. Build hierarchical layers (like a pyramid)
  2. Start at top (few nodes)
  3. Jump to closest node
  4. Move down a layer
  5. Repeat until bottom
  6. Return nearest neighbors
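
To make steps 2-6 concrete, here is a minimal sketch of the greedy descent, assuming the layer graphs are already built. The names are illustrative; real HNSW keeps a beam of candidates per layer (the efSearch parameter) rather than a single best node.

import numpy as np

def hnsw_style_search(layers, vectors, query, entry_point):
    """Greedy descent through HNSW-style layers (toy version).

    layers:  list of adjacency dicts, top (sparse) layer first;
             each maps node id -> list of neighbor node ids
    vectors: dict of node id -> np.ndarray embedding
    """
    def dist(node):
        return np.linalg.norm(vectors[node] - query)

    current = entry_point
    for graph in layers:               # coarse layers first, dense bottom layer last
        improved = True
        while improved:                # hill-climb toward the query on this layer
            improved = False
            for neighbor in graph.get(current, []):
                if dist(neighbor) < dist(current):
                    current, improved = neighbor, True
    return current                     # closest node found on the bottom layer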

Result: Find answer in 5-10 hops instead of checking all 1M vectors.

Speed: 3ms for 1M vectors (1000x faster than brute force)

Trade-off: Uses more memory (stores the graph).

IVF: The Clustering Approach

Idea: Group similar vectors into clusters. Search clusters, not individual vectors.

How it works:

  1. Divide 1M vectors into 1000 clusters (K-means)
  2. Query comes in
  3. Find the closest cluster (compare against 1000 centroids, not 1M vectors)
  4. Search only that cluster (~1000 vectors per cluster, not 1M)
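
A minimal sketch of that idea for small data (the function names and cluster counts are illustrative; FAISS's IndexIVFFlat does this properly and at scale):

import numpy as np

def build_ivf(vectors, n_clusters=100, n_iter=5, seed=0):
    """Toy IVF index: a few K-means rounds, then an inverted list per cluster.
    (Use ~1,000 clusters for the 1M-vector example above.)"""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), n_clusters, replace=False)]
    for _ in range(n_iter):
        # Assign every vector to its nearest centroid, then move each centroid
        assign = np.argmin(((vectors[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
        for c in range(n_clusters):
            if (assign == c).any():
                centroids[c] = vectors[assign == c].mean(axis=0)
    inverted_lists = {c: np.where(assign == c)[0] for c in range(n_clusters)}
    return centroids, inverted_lists

def ivf_search(query, vectors, centroids, inverted_lists, k=10, nprobe=1):
    # Steps 3-4: find the nprobe closest centroids, then scan only their members
    nearest = np.argsort(np.linalg.norm(centroids - query, axis=1))[:nprobe]
    candidates = np.concatenate([inverted_lists[c] for c in nearest])
    dists = np.linalg.norm(vectors[candidates] - query, axis=1)
    return candidates[np.argsort(dists)[:k]]

Raising nprobe scans more clusters per query, which is the usual way to recover vectors sitting near cluster boundaries, at some cost in speed.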

Total checks: 2000 instead of 1M

Result: 500x faster than brute force.

Trade-off: Slightly less accurate (might miss vectors near cluster boundaries).

Which One to Use?

Algorithm             Speed    Accuracy  Memory  Best For
Brute Force           Slow     100%      Low     <10K vectors
HNSW                  Fastest  95-99%    High    Real-time apps
IVF                   Fast     90-95%    Medium  Large datasets
Product Quantization  Fast     85-90%    Lowest  Billions of vectors
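
If you use FAISS (see the snippet further down), each row of the table maps to an index class. The parameters here (32 neighbors, 1000 clusters, 48 sub-quantizers) are illustrative choices, not recommendations:

import faiss

d = 384  # vector dimensionality

brute = faiss.IndexFlatL2(d)                               # exact search, fine under ~10K vectors
hnsw  = faiss.IndexHNSWFlat(d, 32)                         # graph index, fastest queries, most RAM
ivf   = faiss.IndexIVFFlat(faiss.IndexFlatL2(d), d, 1000)  # clustered index, call .train() before .add()
pq    = faiss.IndexPQ(d, 48, 8)                            # compressed vectors, also needs .train()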

Real-World Example

Pinecone (vector database) uses HNSW + Product Quantization:

  • HNSW for fast search
  • Product Quantization to compress vectors
  • Hybrid approach balances speed/memory/accuracy

Result: Search billions of vectors in <50ms.
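
Pinecone's internals aren't public code, but FAISS lets you sketch the same kind of hybrid: a coarse index for routing plus Product Quantization for compression. This sketch uses an IVF coarse index rather than HNSW, and the cluster count and code size are illustrative:

import faiss
import numpy as np

d = 384
vectors = np.random.random((100_000, d)).astype('float32')

# Coarse quantizer for routing + PQ for storage: each 384-dim float32 vector
# (1,536 bytes) is stored as 48 one-byte codes (~32x smaller).
coarse = faiss.IndexFlatL2(d)
index = faiss.IndexIVFPQ(coarse, d, 1024, 48, 8)  # 1024 clusters, 48 sub-vectors, 8 bits each

index.train(vectors)   # learn the cluster centroids and PQ codebooks
index.add(vectors)
index.nprobe = 16      # clusters scanned per query: higher = slower but more accurate

query = np.random.random((1, d)).astype('float32')
distances, ids = index.search(query, 10)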

Try It (Python)

import faiss
import numpy as np
import time

# Create 100K random vectors (384 dimensions)
vectors = np.random.random((100000, 384)).astype('float32')

# Build HNSW index
index = faiss.IndexHNSWFlat(384, 32)  # 32 = connections per node
index.add(vectors)

# Search (and time it)
query = np.random.random((1, 384)).astype('float32')
start = time.perf_counter()
distances, indices = index.search(query, k=10)
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"Found 10 nearest neighbors in: {elapsed_ms:.1f}ms")

Speed: 2-3ms for 100K vectors.

Why This Matters

Vector search powers:

  • AI memory (RAG systems)
  • Recommendation engines (Netflix, Spotify)
  • Image search (Google, Pinterest)
  • Fraud detection (find similar transactions)
  • Drug discovery (find similar molecules)

Market size: $2B in 2024 → $10B+ by 2030

The Math (Optional)

Cosine similarity = dot product of a and b / (magnitude of a × magnitude of b)

import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

Values from -1 to 1:

  • 1.0 = identical
  • 0.0 = unrelated
  • -1.0 = opposite
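
A quick sanity check with the function above (the vectors are made up):

import numpy as np

a = np.array([1.0, 0.0])
b = np.array([0.7, 0.7])

print(cosine_similarity(a, b))   # ~0.707 (vectors 45 degrees apart)
print(cosine_similarity(a, a))   # 1.0  (identical direction)
print(cosine_similarity(a, -a))  # -1.0 (opposite direction)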

Performance Benchmarks

Dataset: 1 million 384-dim vectors

Method              Query Time  Memory  Accuracy
Linear scan         2000ms      1.5GB   100%
IVF (100 clusters)  8ms         1.8GB   94%
HNSW (M=16)         3ms         4GB     98%
HNSW (M=32)         2ms         6GB     99%

Takeaway: HNSW is fastest but uses more RAM.


Try: FAISS library from Facebook
Related: RAG tutorial (Part 1) uses vector search