15 min deep-dive · systems · performance

Caching — The Art of Remembering What's Expensive to Compute

A visual deep dive into caching. From CPU caches to CDNs — understand cache strategies, eviction policies, and the hardest problem in computer science: cache invalidation.

Introduction

The fastest computation is the one you never do.

Caching is the most important performance technique in all of computing.
Your CPU uses it. Your browser uses it. Netflix, Google, and every app you’ve ever used —
they all live and die by how well they cache.


The Problem

Why Caching Matters: The Latency Gap

Different storage systems have wildly different speeds. The gap between “fast” and “slow” is enormous — and growing.

CPU register: ~0.3 ns
L1 cache: ~1 ns
RAM: ~100 ns (100× slower than L1)
SSD: ~100 μs (1,000× slower than RAM)
Database: ~1–10 ms
Network: ~10 ms

Latency comparison — each level is orders of magnitude slower than the one above
What Is a Cache

Cache Hit vs Cache Miss

A cache is simple: a fast layer that stores copies of frequently accessed data. When you look for something, two things can happen:

Cache hit ⚡ (fast path): app request → cache → ✓ found! → return. Total: ~1 ms. Done!

Cache miss 🐢 (slow path): app request → cache → ✗ not found → database → fetch data → store in cache for next time. Total: ~50 ms, 50× slower!
Cache hit = fast path. Cache miss = slow path + store for next time.
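
To make the two paths concrete, here is a minimal sketch of the lookup logic in Python. An in-process dict stands in for a real cache like Redis, and fetch_from_db is a hypothetical placeholder that simulates a slow backend query:

```python
import time

cache = {}  # in-process dict standing in for Redis/Memcached

def fetch_from_db(key):
    time.sleep(0.05)            # simulate a ~50 ms database round trip
    return f"value-for-{key}"

def get(key):
    if key in cache:
        return cache[key]       # cache hit: fast path
    value = fetch_from_db(key)  # cache miss: slow path
    cache[key] = value          # store for next time
    return value

get("user:42")  # first call: miss, pays the full ~50 ms
get("user:42")  # second call: hit, served straight from memory
```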

Cache Hit Rate — the metric that matters

🎯 Hit Rate

The percentage of requests served directly from cache without touching the slower backend. This single number tells you how effective your cache is — aim for 95%+ in production.

Hit Rate = hits / (hits + misses)
Production Target

A 99% hit rate means only 1 in 100 requests reaches the database. This is a common target for high-traffic production systems — the remaining 1% of misses adds little to average latency.

⏱️ Average Latency

Your real-world latency is a weighted average of cache speed and database speed. With 99% hit rate, 1ms cache, and 50ms database: 0.99×1ms + 0.01×50ms = 1.49ms — a 33× improvement over hitting the database every time.

Avg Latency = hit_rate × cache_latency + miss_rate × db_latency
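
Both formulas are one-liners. As a quick sanity check, this snippet (plain Python, no dependencies) reproduces the 1.49 ms example above:

```python
def hit_rate(hits, misses):
    """Fraction of requests served directly from cache."""
    return hits / (hits + misses)

def avg_latency(rate, cache_ms, db_ms):
    """Weighted average of the fast (hit) and slow (miss) paths, in ms."""
    return rate * cache_ms + (1 - rate) * db_ms

print(avg_latency(0.99, 1, 50))  # 1.49 -- vs 50 ms with no cache, a ~33x improvement
```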
🟢 Quick Check

Your cache has a 95% hit rate. Cache latency is 2ms, database latency is 100ms. What's the average request latency?

Strategies

Cache Strategies: Read vs Write

How do you keep the cache and database in sync? Different strategies have different trade-offs.

Cache-Aside (app manages the cache): 1. check the cache; 2. if miss → read the DB; 3. store the result in the cache. ✓ Most common: Redis + most apps.

Write-Through (write to both): 1. write to the cache; 2. the cache writes to the DB; 3. both are always in sync. ✓ Strong consistency ✗ Slower writes.

Write-Back (write to cache only): 1. write to the cache; 2. async → batch to the DB; 3. risk: data loss if crash. ✓ Fastest writes ✗ Data-loss risk.

When to use which?
Cache-Aside: default choice. Works for 90% of use cases.
Write-Through: when consistency is critical (banking, inventory).
Write-Back: when write speed is critical (metrics, analytics, logging).
The three main caching patterns
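
The write-path difference between the last two patterns fits in a few lines. Below is a minimal sketch, assuming in-process dicts stand in for the cache and database, and a simple queue stands in for the async flush machinery:

```python
import queue

db = {}                  # stand-in for the source-of-truth database
cache = {}               # stand-in for Redis/Memcached
pending = queue.Queue()  # writes awaiting an async flush

def write_through(key, value):
    """Write-through: the write completes only once both layers have it."""
    cache[key] = value
    db[key] = value              # slower, but cache and DB are always in sync

def write_back(key, value):
    """Write-back: acknowledge after the cache write; flush to the DB later."""
    cache[key] = value           # fast path: caller returns immediately
    pending.put((key, value))    # a crash before flushing means lost writes

def flush_pending():
    """Batch the queued writes to the DB (run periodically in the background)."""
    while not pending.empty():
        key, value = pending.get()
        db[key] = value
```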
🟡 Checkpoint

Your e-commerce site needs to cache product inventory counts. Which strategy should you use?

Eviction

Cache Eviction: When Space Runs Out

Caches have limited space. When they’re full and a new item needs to be stored, which old item gets removed? This is the eviction policy.

LRU (Least Recently Used): evict the item that hasn't been used for the longest time. ✓ Simple & effective; used in Redis and Memcached.

LFU (Least Frequently Used): evict the item with the lowest total access count. ✓ Keeps hot items ✗ Slow to adapt.

TTL (Time-To-Live): each item expires after a set time period (5 min, 1 hour, etc.). ✓ Simple and predictable; often combined with LRU.

LRU example (cache size = 3): access A → [A]. Access B, C → [A B C], the cache is full. Access D → [B C D], and A is evicted (least recently used).
The three most common eviction policies
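
LRU is simple enough to implement directly. Here is a minimal sketch in Python, using OrderedDict so that insertion order doubles as recency order:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently used key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # ordered oldest -> most recently used

    def get(self, key):
        if key not in self.items:
            return None                    # miss
        self.items.move_to_end(key)        # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used

# The example from the diagram: A, B, C fill the cache; D evicts A.
cache = LRUCache(3)
for k in "ABCD":
    cache.put(k, k)
assert cache.get("A") is None  # A was evicted
assert cache.get("B") == "B"   # B survived and is now most recently used
```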
🟡 Checkpoint

Your cache holds 4 items and uses LRU eviction. Access pattern: A, B, C, D, A, E. Which item gets evicted when E is accessed?

Invalidation

The Hardest Problem: Cache Invalidation

When the source data changes, the cache becomes stale — it holds an old version. Serving stale data can mean showing the wrong price, the wrong balance, or the wrong inventory. Getting invalidation right is critical.

TTL-Based: "expire after 5 min." ✓ Simple, no coordination ✗ Stale for up to the TTL. Best for: profiles, feeds.

Event-Based: "DB changed → clear cache." ✓ Always fresh ✗ Complex infrastructure. Best for: inventory, prices.

Version-Based: "product_v3" vs "product_v4". ✓ No invalidation needed ✗ More storage used. Best for: CSS, JS bundles.

Most production systems combine TTL + event-based: TTL as a safety net (bounding maximum staleness), events for immediate invalidation of critical data.
Three approaches to keeping caches fresh
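
A minimal sketch of that combined approach, with in-process dicts standing in for a real database and cache; the user-profile keys and the 5-minute TTL are illustrative:

```python
import time

db = {}     # stand-in for the source of truth
cache = {}  # maps key -> (value, expires_at)
TTL = 300   # 5-minute safety net: the maximum possible staleness

def get_user_profile(user_id):
    """Cache-aside read with a TTL safety net."""
    key = f"user:{user_id}"
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                      # fresh hit
    value = db[user_id]                      # miss or expired: read through
    cache[key] = (value, time.time() + TTL)
    return value

def update_user_profile(user_id, profile):
    """Write path with event-based invalidation: update the DB, drop the entry."""
    db[user_id] = profile
    cache.pop(f"user:{user_id}", None)       # next read repopulates with fresh data
```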
🟡 Checkpoint

A user updates their profile photo. Your app uses 5-minute TTL caching. What happens for the next 5 minutes?

Real World

Caching in the Real World

👤 User (Mumbai, India) → CDN edge (Mumbai, ~5 ms) → origin server (Oregon, US, ~200 ms), reached only on a miss.

Real-world CDN stats: Netflix: 95%+ hit rate • Cloudflare: 300+ edge locations • YouTube: 70%+ served from CDN cache.
CDNs: the caching layer that powers the internet
🔴 Challenge

Netflix serves video to 200+ million users globally. Where does most of the video content come from?

🎓 What You Now Know

Caches trade space for speed — Store computed results in a fast layer to avoid slow re-computation.

Cache-Aside is the default pattern — App checks cache first, falls back to DB on miss, stores result.

LRU is the most common eviction policy — When space runs out, evict whichever item was used least recently.

Invalidation is the hard part — TTL for simplicity, event-based for freshness. Usually both.

CDNs are caching at global scale — Edge servers near users power Netflix, YouTube, and every fast website.

Caching is the foundation of every fast system. Whether you’re building a startup or operating at Netflix scale, the concepts are the same: put frequently accessed data closer to the user, evict smartly, and invalidate carefully. 🚀
