Caching — The Art of Remembering What's Expensive to Compute
A visual deep dive into caching. From CPU caches to CDNs — understand cache strategies, eviction policies, and the hardest problem in computer science: cache invalidation.
⚡
The fastest computation is
the one you never do.
Caching is the most important performance technique in all of computing.
Your CPU uses it. Your browser uses it. Netflix, Google, and every app you’ve ever used —
they all live and die by how well they cache.
↓ Scroll to understand the technique behind all of those performance gains
Why Caching Matters: The Latency Gap
Different storage systems have wildly different speeds. The gap between “fast” and “slow” is enormous — and growing.
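To make the gap concrete, here are rough order-of-magnitude access times in nanoseconds. These are approximate, hardware-dependent assumptions rather than measurements, but the ratios are what matter, and a few lines of Python make them easy to compare:

# Approximate access latencies in nanoseconds (order-of-magnitude assumptions).
LATENCY_NS = {
    "L1 CPU cache": 1,
    "Main memory (RAM)": 100,
    "SSD random read": 100_000,                 # ~0.1 ms
    "Datacenter network round trip": 500_000,   # ~0.5 ms
    "Spinning disk seek": 10_000_000,           # ~10 ms
    "Cross-continent round trip": 150_000_000,  # ~150 ms
}

for tier, ns in LATENCY_NS.items():
    ratio = ns / LATENCY_NS["L1 CPU cache"]
    print(f"{tier:<30} {ns:>12,} ns   ({ratio:,.0f}x slower than L1)")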
Cache Hit vs Cache Miss
A cache is simple: a fast layer that stores copies of frequently accessed data. When you look for something, two things can happen:
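On a hit, the data is already in the fast layer and comes back immediately; on a miss, you pay the full backend cost and then store a copy so the next lookup hits. Here is a minimal sketch of that decision, using a plain dict as the cache and a simulated database as the slow backend (the user record and the 50 ms delay are illustrative assumptions):

import time

DATABASE = {42: {"name": "Ada", "plan": "pro"}}   # pretend backing store
cache = {}

def fetch_from_database(user_id):
    time.sleep(0.05)                       # simulate a ~50 ms database round trip
    return DATABASE[user_id]

def get_user(user_id):
    if user_id in cache:                   # cache hit: served from the fast layer
        return cache[user_id]
    value = fetch_from_database(user_id)   # cache miss: pay the slow path once
    cache[user_id] = value                 # store a copy so the next lookup hits
    return value

get_user(42)   # miss: ~50 ms
get_user(42)   # hit: microseconds

This check-then-fetch-then-store flow is the Cache-Aside pattern mentioned in the recap at the end.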
Cache Hit Rate — the metric that matters
The percentage of requests served directly from cache without touching the slower backend. This single number tells you how effective your cache is — aim for 95%+ in production.
Hit Rate = hits / (hits + misses)
A 99% hit rate means only 1 in 100 requests reaches the database. Well-tuned production systems often reach this level, and at that point the rare misses add surprisingly little to the average latency, as the calculation below shows.
Your real-world latency is a weighted average of cache speed and database speed. With 99% hit rate, 1ms cache, and 50ms database: 0.99×1ms + 0.01×50ms = 1.49ms — a 33× improvement over hitting the database every time.
Avg Latency = hit_rate × cache_latency + miss_rate × db_latency
Your cache has a 95% hit rate. Cache latency is 2ms, database latency is 100ms. What's the average request latency?
💡 Use the formula: hit_rate × cache_latency + miss_rate × db_latency...
Average = 0.95 × 2ms + 0.05 × 100ms = 1.9ms + 5ms = 6.9ms ≈ 7ms. Without the cache, every request would take 100ms. That's a 14x speedup from caching, even with a 5% miss rate!
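The same weighted average takes only a few lines of Python, and it reproduces both the 99% example above and this quiz's answer:

def avg_latency_ms(hit_rate, cache_ms, db_ms):
    # Weighted average of the fast path (hits) and the slow path (misses).
    return hit_rate * cache_ms + (1 - hit_rate) * db_ms

print(round(avg_latency_ms(0.99, 1, 50), 2))    # 1.49 ms, the example above
print(round(avg_latency_ms(0.95, 2, 100), 2))   # 6.9 ms, this quiz's answer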
Cache Strategies: Read vs Write
How do you keep the cache and database in sync? Different strategies have different trade-offs.
Your e-commerce site needs to cache product inventory counts. Which strategy should you use?
💡 What happens if the cache says 'in stock' but the database says 'sold out'?
Inventory is a consistency-critical field — you can't show '5 in stock' when there are actually 0 (overselling). Write-Through ensures every inventory update is written to both the cache and database synchronously, so they're always consistent. The slower write speed is worth it to avoid selling items you don't have.
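Here is a sketch of the two write paths, using a tiny in-memory Store class as a stand-in for both the cache and the database (the class and the SKU are made up for illustration):

class Store(dict):
    # Tiny in-memory stand-in for a cache or database client.
    def set(self, key, value):
        self[key] = value
    def delete(self, key):
        self.pop(key, None)

db, cache = Store(), Store()

def update_inventory_write_through(sku, count):
    # Write-through: write both layers synchronously, so a reader can never
    # see a cached count that disagrees with the database.
    db.set(sku, count)
    cache.set(sku, count)

def update_inventory_invalidate(sku, count):
    # Alternative: write the database, then drop the cached copy so the next
    # read repopulates it. Faster writes, but a forgotten or failed delete
    # leaves a stale count behind.
    db.set(sku, count)
    cache.delete(sku)

update_inventory_write_through("sku-123", 5)   # cache and DB now agree: 5 in stock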
Cache Eviction: When Space Runs Out
Caches have limited space. When they’re full and a new item needs to be stored, which old item gets removed? This is the eviction policy.
Your cache holds 4 items and uses LRU eviction. Access pattern: A, B, C, D, A, E. Which item gets evicted when E is accessed?
💡 Trace through the access pattern. Which item has the oldest 'last used' timestamp after A is re-accessed?
After accessing A, B, C, D, the cache is full: [A, B, C, D]. Then A is accessed again, making it 'recently used'. The order from most to least recent is now: A, D, C, B. When E needs space, B is evicted because it's the least recently used. Note: A was saved because it was re-accessed!
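You can replay that exact trace with a minimal LRU cache built on OrderedDict (a sketch, not a production implementation):

from collections import OrderedDict

class LRUCache:
    # Minimal LRU cache: evict the least recently used key when full.
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()           # key order doubles as recency order

    def access(self, key):
        if key in self.items:
            self.items.move_to_end(key)      # hit: mark as most recently used
            return
        if len(self.items) >= self.capacity:
            evicted, _ = self.items.popitem(last=False)   # drop the LRU entry
            print(f"evicting {evicted} to make room for {key}")
        self.items[key] = True

cache = LRUCache(4)
for key in "ABCDAE":                         # the access pattern from the quiz
    cache.access(key)                        # prints: evicting B to make room for E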
The Hardest Problem: Cache Invalidation
When the source data changes, the cache becomes stale — it holds an old version. Serving stale data can mean showing the wrong price, the wrong balance, or the wrong inventory. Getting invalidation right is critical.
A user updates their profile photo. Your app uses 5-minute TTL caching. What happens for the next 5 minutes?
💡 If the cache has a 5-minute TTL and the user updates at minute 1, what does the cache serve for the next 4 minutes?
With TTL-only caching, the old photo sits in the cache as stale data until it expires (up to 5 minutes). During that window, other users see the old photo. Solutions: (1) Invalidate the cache key immediately on update. (2) Use a shorter TTL. (3) For user-facing changes, bypass the cache on the 'my profile' page.
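Here is a minimal sketch that combines two of those fixes: a TTL on reads plus immediate invalidation on writes. The fetch and save callables are hypothetical placeholders for whatever backend the app actually uses:

import time

CACHE = {}                      # key -> (value, expires_at)
TTL_SECONDS = 300               # the 5-minute TTL from the example

def get_profile_photo(user_id, fetch):
    entry = CACHE.get(user_id)
    if entry and entry[1] > time.time():     # fresh entry: serve from the cache
        return entry[0]
    value = fetch(user_id)                   # missing or expired: refetch
    CACHE[user_id] = (value, time.time() + TTL_SECONDS)
    return value

def update_profile_photo(user_id, new_photo, save):
    save(user_id, new_photo)                 # write to the source of truth
    CACHE.pop(user_id, None)                 # invalidate immediately so the stale
                                             # photo never outlives the update

With event-based invalidation in place, the TTL becomes a safety net for keys you forgot to invalidate rather than the only freshness mechanism.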
Caching in the Real World
Netflix serves video to 200+ million users globally. Where does most of the video content come from?
💡 If 200 million users all hit one server in the US, what would happen?
Netflix's Open Connect network places custom cache servers directly inside ISPs worldwide. When you watch a popular show, it's likely served from a cache server in your city or even your ISP's building — not from Netflix's US data center. This is caching at global scale: 95%+ of Netflix traffic is served from these edge caches.
🎓 What You Now Know
✓ Caches trade space for speed — Store computed results in a fast layer to avoid slow re-computation.
✓ Cache-Aside is the default pattern — App checks cache first, falls back to DB on miss, stores result.
✓ LRU is the most common eviction policy — When space runs out, evict whoever was used longest ago.
✓ Invalidation is the hard part — TTL for simplicity, event-based for freshness. Usually both.
✓ CDNs are caching at global scale — Edge servers near users power Netflix, YouTube, and every fast website.
Caching is the foundation of every fast system. Whether you’re building a startup or operating at Netflix scale, the concepts are the same: put frequently accessed data closer to the user, evict smartly, and invalidate carefully. 🚀
↗ Keep Learning
Database Sharding — Scaling Beyond One Machine
A visual deep dive into database sharding. From single-server bottlenecks to consistent hashing — understand how companies scale their databases to billions of rows.
Vector Databases — Search by Meaning, Not Keywords
A visual deep dive into vector databases. From embeddings to ANN search to HNSW — understand how AI-powered search finds what you actually mean, not just what you typed.