Approximate Nearest Neighbor Search — Trading 1% Accuracy for 1000× Speed
A visual deep dive into ANN search. Why brute-force nearest neighbor fails at scale, how approximate methods achieve 99% recall with logarithmic query time, and the fundamental accuracy-speed tradeoff behind every vector search system.
🎯
1 billion vectors.
Find the closest in 1ms.
Brute-force nearest neighbor search compares your query to every single vector in the database. At 1 billion vectors with 768 dimensions, that’s 1.5 trillion FLOPs per query — 15 seconds on CPU. ANN search does it in 1 millisecond with 99% recall. How? By accepting a tiny accuracy loss for a massive speed gain.
↓ Scroll to understand the fundamental tradeoff behind vector search
The Fundamental Problem: Nearest Neighbor at Scale
Why can't you just use a faster CPU/GPU to solve the brute-force nearest neighbor problem at billion scale?
💡 Think about algorithm complexity classes: O(N) vs O(log N). Can hardware close that gap?
This is a complexity argument, not a hardware argument. Brute force is O(N×d) — cost grows linearly with database size. Doubling your vectors doubles your query time. GPUs help the constant factor but not the scaling curve. ANN algorithms achieve O(log N) or O(√N) query time. At N=1B: brute force scans 1B vectors, but HNSW traverses ~30 hops (log₂ of 1B). That's a 33-million× reduction in comparisons. No GPU speedup can match an algorithmic complexity improvement this large. This is why every production vector search system uses ANN.
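To make the O(N×d) scan concrete, here is a minimal NumPy sketch of brute-force kNN on toy data (the sizes, array names, and random vectors are illustrative, not from a real system):

```python
import numpy as np

# Brute-force kNN: one full O(N x d) scan of the database per query.
# Toy sizes here; at N = 1 billion this same scan is the 1.5-trillion-FLOP query.
N, d, k = 100_000, 768, 10
rng = np.random.default_rng(0)
database = rng.standard_normal((N, d)).astype(np.float32)  # illustrative random vectors
query = rng.standard_normal(d).astype(np.float32)

# Squared L2 distance to every vector: this is the cost that doubles when N doubles.
dists = np.sum((database - query) ** 2, axis=1)
exact_top_k = np.argsort(dists)[:k]  # guaranteed 100% recall, paid for with a full scan

# Contrast the work per query: full scan vs. roughly log2(N) graph hops.
print("brute force at N=1B: 1,000,000,000 distance computations per query")
print(f"graph traversal:     ~{int(np.ceil(np.log2(1e9)))} hops per query")
```

A GPU shrinks the constant in front of that scan, but the scan still touches every vector; the graph traversal simply does far less work per query.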
Why brute force is impossible at scale
N = 1 billion vectors, d = 768 dimensions
Cost per comparison: 768 multiplications + 767 additions ≈ 1535 FLOPs
Total: 1,000,000,000 × 1535 = 1.535 × 10¹² FLOPs per query
At 10 TFLOPS (GPU): ~150 ms per query
At 100 GFLOPS (CPU): ~15 seconds per query
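If you want to rerun this arithmetic for your own N, d, or hardware, here is the same back-of-envelope estimate as a tiny script (the throughput figures are the rough numbers above, not measured benchmarks):

```python
# Back-of-envelope query cost for brute-force search (rough estimates, not benchmarks).
N = 1_000_000_000        # vectors in the database
d = 768                  # dimensions
flops_per_comparison = d + (d - 1)      # 768 multiplies + 767 adds = 1535 FLOPs
total_flops = N * flops_per_comparison  # 1.535e12 FLOPs per query

gpu_flops_per_sec = 10e12    # 10 TFLOPS
cpu_flops_per_sec = 100e9    # 100 GFLOPS

print(f"total: {total_flops:.3e} FLOPs per query")
print(f"GPU:   {total_flops / gpu_flops_per_sec * 1000:.0f} ms per query")  # ~150 ms
print(f"CPU:   {total_flops / cpu_flops_per_sec:.1f} s per query")          # ~15 s
```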
An ANN index reports '99% recall@10'. What does this mean?
💡 What are the two things ANN trades off against each other?
Recall@K is the standard metric for ANN quality. If recall@10 = 99%, it means that on average 9.9 of the true 10 nearest neighbors appear in the ANN result set. The occasional miss is replaced by a point that's very close but not truly in the top 10. This is the fundamental ANN tradeoff: you sacrifice a tiny bit of accuracy (1%) for a massive speedup (1000×+). In practice, recall@10 ≥ 95% is acceptable for most search applications.
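Measuring recall@K only requires exact ground truth for a sample of queries, usually computed once by brute force. A minimal sketch (the neighbor IDs here are made up for illustration):

```python
def recall_at_k(true_neighbors, ann_neighbors, k=10):
    """Fraction of the true top-k neighbors that the ANN results recovered."""
    hits = 0
    for truth, approx in zip(true_neighbors, ann_neighbors):
        hits += len(set(truth[:k]) & set(approx[:k]))
    return hits / (k * len(true_neighbors))

# Toy example: 2 queries, k = 10. The second ANN result misses one true neighbor (id 19).
true_neighbors = [list(range(10)), list(range(10, 20))]
ann_neighbors  = [list(range(10)), list(range(10, 19)) + [99]]
print(recall_at_k(true_neighbors, ann_neighbors))  # 0.95: 19 of 20 true neighbors found
```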
You have a production system with 100M vectors and recall@10 = 97%. A stakeholder asks: 'Can you get recall to 100% without brute force?' What's your answer?
💡 Think about what the 'approximate' in ANN fundamentally means.
ANN recall is asymptotic — it approaches 100% but reaching it exactly requires exhaustive search. The recall-compute curve is highly nonlinear: going from 95% to 99% recall might cost 2× more compute, but going from 99% to 99.9% costs 10× more, and 99.9% to 100% costs 100× more (effectively brute force). For most search applications, recall@10 ≥ 98% is sufficient. The practical question is: 'Is the 0.1% of missed results worth 100× more compute?' Almost always, the answer is no.
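One way to see the nonlinear recall-compute curve for yourself is to sweep a search-time parameter on a real index and measure recall against brute-force ground truth. A sketch using FAISS's IVF index and its nprobe knob (assumes faiss-cpu is installed; the data is random and the exact numbers are not from this article):

```python
import numpy as np
import faiss

d, N, n_queries, k = 64, 100_000, 100, 10
rng = np.random.default_rng(0)
xb = rng.standard_normal((N, d)).astype(np.float32)
xq = rng.standard_normal((n_queries, d)).astype(np.float32)

# Exact ground truth from a brute-force index.
flat = faiss.IndexFlatL2(d)
flat.add(xb)
_, gt = flat.search(xq, k)

# IVF index: k-means partitions the space into nlist cells; nprobe cells are scanned per query.
nlist = 1024
quantizer = faiss.IndexFlatL2(d)
ivf = faiss.IndexIVFFlat(quantizer, d, nlist)
ivf.train(xb)
ivf.add(xb)

for nprobe in (1, 4, 16, 64, 256):
    ivf.nprobe = nprobe   # more cells scanned = more compute per query
    _, ann = ivf.search(xq, k)
    recall = np.mean([len(set(a) & set(g)) / k for a, g in zip(ann, gt)])
    print(f"nprobe={nprobe:>3}  recall@{k} ≈ {recall:.3f}")
```

Each step up in nprobe scans roughly proportionally more vectors, while recall creeps toward 100% with clearly diminishing returns: that is the curve described above.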
🎓 What You Now Know
✓ Brute-force kNN is O(N×d) — at 1B vectors it takes 15 seconds on CPU. Completely unusable for real-time search.
✓ ANN trades tiny accuracy for massive speed — 99% recall at 1000× speedup. The accuracy-speed curve is highly nonlinear.
✓ Recall@K is the standard metric — it measures what fraction of true nearest neighbors appear in the approximate result set.
✓ Every ANN algorithm (IVF, HNSW, LSH) sits on this tradeoff curve — they differ in HOW they partition or organize vectors to enable fast approximate search.
Next up: explore exactly how IVF and HNSW organize vectors to achieve logarithmic query times. ⚡
↗ Keep Learning
IVF Index — Partitioning Vector Space with K-Means for Fast Search
A visual deep dive into the Inverted File Index (IVF). How k-means clustering partitions billion-scale vector collections into searchable regions, the nprobe recall-speed knob, and IVF complexity analysis.
HNSW — Hierarchical Navigable Small World Graphs for Vector Search
A visual deep dive into HNSW, the dominant graph-based index for vector search. Understand the multi-layer graph structure, the M and ef parameters, complexity analysis, and how it compares to IVF, LSH, and brute force.
Vector Databases — Search by Meaning, Not Keywords
A visual deep dive into vector databases. From embeddings to ANN search to HNSW — understand how AI-powered search finds what you actually mean, not just what you typed.
K-Nearest Neighbors — The Algorithm with No Training Step
A scroll-driven visual deep dive into KNN. Learn how the laziest algorithm in ML works, why distance metrics matter, and how the curse of dimensionality kills it.