6 min deep-dive · search · information-retrieval · systems

Approximate Nearest Neighbor Search — Trading 1% Accuracy for 1000× Speed

A visual deep dive into ANN search. Why brute-force nearest neighbor fails at scale, how approximate methods achieve 99% recall with logarithmic query time, and the fundamental accuracy-speed tradeoff behind every vector search system.

Introduction
1 billion vectors.
Find the closest in 1ms.

Brute-force nearest neighbor search compares your query to every single vector in the database. At 1 billion vectors with 768 dimensions, that’s 1.5 trillion FLOPs per query — 15 seconds on CPU. ANN search does it in 1 millisecond with 99% recall. How? By accepting a tiny accuracy loss for a massive speed gain.
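The brute-force baseline is easy to state in code. Here is a minimal NumPy sketch of exact cosine kNN (illustrative sizes, not a billion vectors): one dot product per database vector, so O(N × d) work per query.

```python
import numpy as np

# Minimal sketch of exact (brute-force) cosine kNN.
# Sizes here are tiny; the point is the O(N * d) cost per query.
rng = np.random.default_rng(0)
db = rng.normal(size=(10_000, 768)).astype(np.float32)  # N x d database
db /= np.linalg.norm(db, axis=1, keepdims=True)         # pre-normalize once

query = rng.normal(size=768).astype(np.float32)
query /= np.linalg.norm(query)

scores = db @ query                 # N dot products: the O(N * d) step
top10 = np.argsort(-scores)[:10]    # exact 10 nearest neighbors
```

With normalized vectors the dot product equals cosine similarity, so the per-query cost is exactly the N × d multiply-adds discussed below.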


ANN Search

The Fundamental Problem: Nearest Neighbor at Scale

[Figure: brute-force kNN vs. ANN search on 8 points. Brute-force checks ALL 8 points, O(N); ANN checks only 3 nearby points, ≈O(log N).]
Exact kNN checks all N points. ANN uses index structures to check only a small subset.
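The "check only a small subset" idea can be sketched with a toy inverted-file (IVF-style) index. Everything here is illustrative, not any library's API: real systems such as FAISS train centroids with k-means, while this sketch just picks random vectors as centroids.

```python
import numpy as np

# Toy IVF-style index: bucket vectors by nearest centroid, then probe
# only a few buckets at query time instead of scanning all N vectors.
rng = np.random.default_rng(1)
db = rng.normal(size=(1_000, 8)).astype(np.float32)

# "Train": pick 16 random vectors as centroids, then assign every
# vector to its nearest centroid (squared Euclidean distance).
centroids = db[rng.choice(len(db), size=16, replace=False)]
dists = ((db[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
assign = np.argmin(dists, axis=1)        # bucket id for each vector

def ann_search(query, nprobe=2):
    """Scan only the nprobe buckets whose centroids are nearest the query."""
    d2 = ((centroids - query) ** 2).sum(axis=1)
    buckets = np.argsort(d2)[:nprobe]
    cand = np.where(np.isin(assign, buckets))[0]   # small subset of N
    best = cand[np.argmin(((db[cand] - query) ** 2).sum(axis=1))]
    return best, len(cand)

best, scanned = ann_search(db[0])        # query with a vector we know is in db
```

Raising `nprobe` scans more buckets: higher recall, slower queries. That knob is the accuracy-speed tradeoff this article is about.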
🟢 Quick Check

Why can't you just use a faster CPU/GPU to solve the brute-force nearest neighbor problem at billion scale?

ANN Math

Why brute force fails at scale

1. N = 1 billion vectors, d = 768 dimensions. Typical production setup: the embedding model outputs 768-dim vectors.
2. Cost per comparison: 768 multiplications + 767 additions ≈ 1,535 FLOPs. Cosine similarity requires a dot product plus norms (the norms can be precomputed).
3. Total: 1,000,000,000 × 1,535 = 1.535 × 10¹² FLOPs per query. That is 1.5 trillion floating-point operations.
4. At 10 TFLOPS (GPU): ~150 ms per query. Barely acceptable, and you'd need an entire GPU per query.
5. At 100 GFLOPS (CPU): ~15 seconds per query. Completely unusable. ANN reduces this to ~1 ms with 99% recall.
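The arithmetic in the steps above is easy to reproduce (the GPU and CPU throughput figures are the same assumed rates used in the walkthrough):

```python
# Back-of-envelope check of the brute-force cost per query.
N, d = 1_000_000_000, 768
flops_per_cmp = d + (d - 1)      # 768 multiplies + 767 adds = 1,535 FLOPs
total = N * flops_per_cmp        # 1.535e12 FLOPs per query

gpu = 10e12                      # assumed GPU throughput: 10 TFLOPS
cpu = 100e9                      # assumed CPU throughput: 100 GFLOPS

gpu_seconds = total / gpu        # ~0.15 s  (150 ms per query)
cpu_seconds = total / cpu        # ~15 s per query
```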
🟢 Quick Check

An ANN index reports '99% recall@10'. What does this mean?

🟡 Checkpoint

You have a production system with 100M vectors and recall@10 = 97%. A stakeholder asks: 'Can you get recall to 100% without brute force?' What's your answer?

🎓 What You Now Know

Brute-force kNN is O(N×d) — at 1B vectors it takes 15 seconds on CPU. Completely unusable for real-time search.

ANN trades tiny accuracy for massive speed — 99% recall at 1000× speedup. The accuracy-speed curve is highly nonlinear.

Recall@K is the standard metric — it measures what fraction of true nearest neighbors appear in the approximate result set.

Every ANN algorithm (IVF, HNSW, LSH) sits on this tradeoff curve — they differ in HOW they partition or organize vectors to enable fast approximate search.
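Recall@K is simple to compute once you have ground truth from an exact search. The helper below is an illustrative sketch, not a library function:

```python
def recall_at_k(true_ids, approx_ids, k):
    """Fraction of the true top-k neighbors present in the approximate top-k."""
    true_set = set(true_ids[:k])
    return sum(1 for i in approx_ids[:k] if i in true_set) / k

# Example: the index recovers 9 of the 10 true neighbors -> recall@10 = 0.9
true_top10   = [3, 7, 12, 19, 25, 31, 40, 44, 52, 60]
approx_top10 = [3, 7, 12, 19, 25, 31, 40, 44, 52, 99]
print(recall_at_k(true_top10, approx_top10, 10))  # prints 0.9
```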

Next up: explore exactly how IVF and HNSW organize vectors to achieve logarithmic query times. ⚡
