9 min deep dive · search · information-retrieval · systems

HNSW — Hierarchical Navigable Small World Graphs for Vector Search

A visual deep dive into HNSW, the dominant graph-based index for vector search. Understand the multi-layer graph structure, the M and ef parameters, complexity analysis, and how it compares to IVF, LSH, and brute force.

Introduction

🕸️

A graph where every hop
gets you closer.

HNSW (Hierarchical Navigable Small World) builds a multi-layer graph where each vector is a node and edges connect nearby vectors. Searching means “hopping” through the graph — each hop brings you closer to the query. The result: O(log N) query time with recall among the highest of any ANN algorithm. The tradeoff? It’s memory-hungry.
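The hopping idea can be sketched in a few lines of Python. This toy builds a plain k-nearest-neighbor graph (not HNSW's layered construction or edge-selection heuristic; `greedy_search` is an illustrative name) and greedily hops until no neighbor is closer to the query:

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(200, 8))          # toy dataset: 200 vectors in 8-D

# Crude proximity graph: link each node to its 5 nearest neighbors.
# Real HNSW builds edges incrementally with pruning heuristics; this is a toy.
dists = np.linalg.norm(vectors[:, None] - vectors[None, :], axis=-1)
neighbors = np.argsort(dists, axis=1)[:, 1:6]   # column 0 is the node itself

def greedy_search(query, start=0):
    """Hop to whichever neighbor is closest to the query; stop at a local minimum."""
    current = start
    current_dist = float(np.linalg.norm(vectors[current] - query))
    while True:
        best, best_dist = current, current_dist
        for n in neighbors[current]:
            d = float(np.linalg.norm(vectors[n] - query))
            if d < best_dist:
                best, best_dist = n, d
        if best == current:                  # no neighbor improves: done
            return current, current_dist
        current, current_dist = best, best_dist

query = rng.normal(size=8)
node, dist = greedy_search(query)
```

Each iteration can only move to a strictly closer node, so the walk always terminates, though on a flat graph it can stop at a local optimum. The hierarchy (covered below) is HNSW's fix for that.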


HNSW Deep Dive

HNSW: The Graph-Based Index

M (Max Connections) — edges per node at each layer.

M = 4 (low):
• Small graph, fast build
• Less memory (4 edges/node)
• Lower recall — fewer paths

M = 48 (high):
• Dense graph, slow build
• More memory (48 edges/node)
• Higher recall — many paths

Sweet spot: M = 16-64

ef (Search Beam Width) — candidates to track during search.

ef = 10 (low):
• Greedily follows best path
• Very fast (~0.1ms)
• May get stuck in local optima

ef = 200 (high):
• Explores many paths
• Slower (~5ms)
• Near-perfect recall (99%+)

Must be ≥ K (top-K results)
HNSW's two critical parameters: M (connections per node) and ef (search beam width)
🟢 Quick Check

You're tuning HNSW for a latency-sensitive application. You set ef=5 (very low beam width). What problem will you likely encounter?

HNSW Math

HNSW complexity

🏗️ Build Time

Each of N vectors is inserted into the graph, connecting to M neighbors at each layer. The hierarchical structure means insertion traverses O(log N) layers.

O(N × log(N) × M)
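The logarithmic layer count comes from how HNSW assigns each new node a top layer: it samples from an exponentially decaying distribution with normalization factor mL = 1/ln(M), so a node reaches layer l with probability roughly M^(-l). A small sketch of that sampling (`sample_level` is an illustrative name):

```python
import math
import random

random.seed(0)
M = 16
mL = 1 / math.log(M)   # level-normalization factor from the HNSW paper

def sample_level():
    # A node's top layer: P(level >= l) = M**(-l).
    # Every node is on layer 0; higher layers thin out geometrically.
    return int(-math.log(random.random()) * mL)

levels = [sample_level() for _ in range(100_000)]
top = max(levels)
frac_layer0_only = sum(1 for l in levels if l == 0) / len(levels)
```

With M = 16, roughly 94% of nodes live only on layer 0, so the upper layers stay tiny: that geometric decay is what keeps both the layer count and the insertion traversal logarithmic in N.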
💾 Space

Every node stores M edges at each layer, and the number of layers grows logarithmically with N. This makes HNSW memory-hungry — every edge must be explicitly stored.

O(N × M × number_of_layers)
🔍 Query Time

Search traverses log(N) layers, tracking ef candidates at each step, and checking M neighbors per candidate. The logarithmic layer traversal is what gives HNSW its speed.

O(log(N) × ef × M)
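The role of ef can be made concrete with a toy best-first search over a flat proximity graph. This is a simplified version of the SEARCH-LAYER routine from the HNSW paper, not the real multi-layer implementation; the graph construction and names are illustrative:

```python
import heapq
import numpy as np

rng = np.random.default_rng(1)
vectors = rng.normal(size=(500, 16))
dists = np.linalg.norm(vectors[:, None] - vectors[None, :], axis=-1)
neighbors = np.argsort(dists, axis=1)[:, 1:9]   # toy graph: 8 edges/node (stands in for M)

def search_layer(query, entry, ef):
    """Best-first search keeping up to ef results, as in HNSW's SEARCH-LAYER."""
    d0 = float(np.linalg.norm(vectors[entry] - query))
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    results = [(-d0, entry)]     # max-heap (negated): worst kept result on top
    visited = {entry}
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:   # closest candidate worse than worst result: stop
            break
        for n in neighbors[node]:
            if n in visited:
                continue
            visited.add(n)
            dn = float(np.linalg.norm(vectors[n] - query))
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, n))
                heapq.heappush(results, (-dn, n))
                if len(results) > ef:
                    heapq.heappop(results)   # evict the worst kept result
    return sorted((-d, n) for d, n in results)

query = rng.normal(size=16)
narrow = search_layer(query, entry=0, ef=4)
wide = search_layer(query, entry=0, ef=64)
```

A wider beam keeps more partial results alive, so it can escape local optima that a narrow beam gets stuck in: exactly the failure mode of a very low ef.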
📦 Memory at Scale

At 1 billion vectors with M=16, the graph index alone requires roughly 2 TB of RAM. This is HNSW's main cost — every edge is an explicit pointer in memory.

N × M × 4 bytes × log₂(N) = 10⁹ × 16 × 4 B × ~30 ≈ 2 TB
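The back-of-envelope number can be reproduced directly. Note the formula treats every node as carrying M edges on all log₂(N) layers, which overstates real HNSW, where most nodes exist only on layer 0, so read it as an upper bound:

```python
import math

def hnsw_graph_bytes(n, M, bytes_per_edge=4):
    """Article's estimate: N edges-per-node * 4-byte ids * log2(N) layers.
    Upper bound: real HNSW keeps most nodes only on layer 0."""
    return n * M * bytes_per_edge * math.log2(n)

tb = hnsw_graph_bytes(1_000_000_000, M=16) / 1e12   # ≈ 1.9 TB
```

Even the tighter layer-0-only figure (10⁹ × 16 × 4 B = 64 GB per layer) already exceeds most single machines once vectors themselves are added, which is why billion-scale deployments shard or switch to compressed indexes.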
Practical Latency

For 100M vectors, typical query latency is 1-10ms — much faster than IVF at equivalent recall, but at the cost of roughly 4× more memory.

🟡 Checkpoint

HNSW query time is O(log N × ef × M). If you double the database size from 500M to 1B vectors, how does query latency change?

Complexity Table

The Big Complexity Comparison

Index Type         | Build Time  | Query Time   | Space       | Recall
Brute Force        | O(1)        | O(N×d)       | O(N×d)      | 100%
IVF                | O(N×K×I)    | O(√N×d)      | O(N×d)      | ~95%
HNSW               | O(N log N)  | O(log N)     | O(N×M×L)    | ~99%
LSH                | O(N×L×K)    | O(N^(1/c))   | O(N×L)      | ~90%
IVF-PQ             | O(N×K×I)    | O(√N×m)      | O(N×m)      | ~90%
Inv. Index (text)  | O(N×L_avg)  | O(1) lookup  | O(N×L_avg)  | 100%

N = vectors, d = dimensions, M = HNSW edges, L = layers/hashes, m = PQ subvectors, K = clusters, I = k-means iterations
Time and space complexity of major index types — pick the right tool for your scale
🟡 Checkpoint

You have 500M 768-dim vectors. HNSW with M=32 achieves 99% recall but requires ~500GB RAM for the graph alone. You only have 64GB RAM. What do you do?

You’ve seen the memory–accuracy tradeoff at scale. Now let’s zoom into why HNSW’s hierarchy matters — the layered structure is the key insight that separates HNSW from a flat graph and gives it O(log N) search.

🔴 Challenge

Why does HNSW use MULTIPLE layers instead of a single flat graph?

🎓 What You Now Know

HNSW builds a navigable graph — each vector is a node, edges connect nearby vectors, search means hopping through the graph.

M controls the graph density — more edges = higher recall but more memory. Sweet spot: M = 16-64.

ef controls the search beam width — higher ef explores more paths for better recall at the cost of latency.

O(log N) query time, but memory-hungry — HNSW gives the highest recall but requires storing the entire graph in RAM.

When RAM is tight, use IVF-PQ instead — 32× compression with ~92% recall fits billion-scale datasets on a single machine.

HNSW dominates vector search, but production systems combine it with BM25 keyword search and cross-encoder reranking for best results. ⚡
