There are only two hard things in computer science: cache invalidation and naming things. — Phil Karlton
Caching makes slow things fast and expensive things cheap. Done well, it’s the difference between an app that works at 10 users and an app that works at 10 million. Done badly, it’s the source of the worst kinds of bugs — the ones where the data is technically correct but stale, or worse, intermittently wrong.
This post is a practical caching guide. We’ll cover the patterns that work, the failure modes to avoid, and how to get the most out of Redis specifically.
Why Redis?
You can cache in many places — application memory, Memcached, Varnish, the database itself. Why is Redis the default?
- Rich data types: strings, hashes, lists, sets, sorted sets, streams, and JSON (through the RedisJSON module). You’re not limited to “key → blob”.
- Atomic operations: INCR, LPUSH, SETNX, transactions. Lots of synchronization primitives baked in.
- Persistence options: RDB snapshots, AOF (append-only file), or both. You can choose how much you can afford to lose.
- Pub/Sub and Streams: message-passing primitives if you need them.
- Battle-tested: it’s the default cache layer for half the internet.
Memcached is faster for pure key-value workloads, but Redis’s flexibility is worth the small overhead almost every time.
The four caching patterns
1. Cache-aside (lazy loading)
The most common pattern. The application reads from the cache; on miss, it fetches from the database, writes to the cache, and returns.
def get_user(user_id):
    cache_key = f"user:{user_id}"
    cached = redis.get(cache_key)
    if cached:
        return json.loads(cached)
    user = db.fetch_user(user_id)
    if user:
        redis.set(cache_key, json.dumps(user), ex=3600)  # 1h TTL
    return user
Pros: simple; the cache is just an optimization and the DB is always the source of truth. Cons: every miss pays for a cache lookup plus a DB hit. First load after a deploy is slow.
This is the right default for most cases.
2. Read-through
Same shape as cache-aside, but the cache layer does the DB fetch itself, hidden behind a function or library. Conceptually identical from the application’s view.
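One way to hide the fetch behind a function is a small decorator. This is a minimal sketch, not a library API: `read_through`, `key_fn`, and the `client` argument are hypothetical names, and `client` is assumed to be a redis-py style client exposing `get` and `set(..., ex=...)`.

```python
import functools
import json


def read_through(client, key_fn, ttl=3600):
    """Wrap a loader so callers never talk to the cache directly."""
    def decorator(loader):
        @functools.wraps(loader)
        def wrapper(*args):
            key = key_fn(*args)
            cached = client.get(key)
            if cached is not None:
                return json.loads(cached)
            value = loader(*args)  # cache miss: the wrapper performs the DB fetch
            if value is not None:
                client.set(key, json.dumps(value), ex=ttl)
            return value
        return wrapper
    return decorator
```

Decorating a loader with `@read_through(client, key_fn=lambda user_id: f"user:{user_id}")` would give callers a plain function call, with the cache logic living in one place.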
3. Write-through
Every write goes to both the DB and the cache, synchronously:
def update_user(user_id, data):
    db.update_user(user_id, data)
    redis.set(f"user:{user_id}", json.dumps(data), ex=3600)
Pros: cache is always fresh. Cons: every write is now slower (two round-trips). And if the cache write fails after the DB write succeeds, the cache keeps serving the old value until it expires or is overwritten.
Use it when reads vastly outnumber writes and freshness matters.
4. Write-behind (write-back)
The application writes to the cache; a background worker eventually flushes to the DB.
Pros: writes are very fast. Cons: complex, fragile (what if the worker crashes before flushing?), and you may lose recent writes.
Rarely worth it. Most teams should not use write-behind.
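To make the trade-off concrete, here is a rough sketch of write-behind using a Redis list as the pending-write queue. The queue name, both function names, and the `write_to_db` callback are assumptions for illustration; `client` is assumed to be a redis-py style client with `set`, `rpush`, and `lpop`.

```python
import json

QUEUE_KEY = "writebehind:users"  # hypothetical queue name


def update_user_write_behind(client, user_id, data):
    """Write to the cache and enqueue the change; the DB is updated later."""
    # No TTL here: until the flush runs, the cache holds the only copy.
    client.set(f"user:{user_id}", json.dumps(data))
    client.rpush(QUEUE_KEY, json.dumps({"user_id": user_id, "data": data}))


def flush_pending(client, write_to_db, max_items=100):
    """Drain up to max_items queued writes into the DB; return how many were flushed."""
    flushed = 0
    while flushed < max_items:
        raw = client.lpop(QUEUE_KEY)
        if raw is None:
            break
        item = json.loads(raw)
        # If the worker dies between lpop and this call, the write is lost.
        write_to_db(item["user_id"], item["data"])
        flushed += 1
    return flushed
```

The comment inside `flush_pending` marks exactly the fragility described above: the window between dequeuing a write and persisting it.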
Cache invalidation: the hard problem
A cached value is wrong the moment the underlying data changes. You have three strategies:
TTL (time-to-live)
Just expire entries after some time. Pick a TTL that balances freshness with hit rate:
redis.set(key, value, ex=600) # 10 min
This is the simplest and most robust strategy. Pair it with a sensible TTL (usually 1 minute to 1 hour for application data) and call it done in most cases.
Explicit invalidation on write
When the underlying data changes, delete the cache entry:
def update_user(user_id, data):
    db.update_user(user_id, data)
    redis.delete(f"user:{user_id}")
Pros: users see fresh data immediately. Cons: every code path that mutates the data has to invalidate. Easy to miss one.
Versioning / generation keys
Keep a “version number” for a tenant or user, and include it in cache keys:
v = redis.get(f"user:{user_id}:v") or 0  # default 0, so the first INCR (0 -> 1) actually changes the key
cached = redis.get(f"user:{user_id}:v{v}:profile")
When data changes, bump the version (INCR) — old cache entries become unreachable and naturally expire. No need to enumerate keys.
This trick is gold for invalidating groups of related cache entries.
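A small sketch of the generation-key idea, assuming a redis-py style `client` with `get` and `incr`. The helper names are hypothetical. A missing counter is treated as generation 0 so that the very first INCR (which yields 1) changes the key.

```python
def profile_key(client, user_id):
    """Build a cache key that embeds the user's current generation."""
    raw = client.get(f"user:{user_id}:v")
    version = int(raw) if raw is not None else 0  # missing counter means generation 0
    return f"user:{user_id}:v{version}:profile"


def invalidate_user(client, user_id):
    """Bump the generation; every key built from the old one becomes unreachable."""
    return client.incr(f"user:{user_id}:v")
```

Entries written under an old generation are never read again, so they should still carry a TTL; otherwise they sit in memory until eviction.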
Key design
Bad key design will destroy you faster than any other caching mistake.
Use namespaced keys
user:1234
user:1234:profile
session:abc-def-123
posts:user:1234:page:1
Use : as the separator (Redis convention; tools like RedisInsight understand it).
Don’t put unbounded data in the key name
search:"some user-supplied text here" # NO. Hash it.
search:5d41402abc4b2a76b9719d911017c592 # YES. md5(query)
User input shouldn’t directly become key names — it can blow up your key count and break tooling.
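Hashing the query into a key could look like the sketch below. The function name is hypothetical, and the normalization step (lowercasing, collapsing whitespace) is an assumption that only makes sense if your search is case-insensitive. MD5 is used purely to get a short fixed-length key, matching the example above, not for security.

```python
import hashlib


def search_cache_key(query: str) -> str:
    """Map arbitrary user text to a fixed-length, tooling-safe cache key."""
    normalized = " ".join(query.lower().split())  # collapse case and whitespace
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return f"search:{digest}"
```

With this helper, `search_cache_key("hello")` produces the `search:5d41402abc4b2a76b9719d911017c592` key shown above.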
Pick the right Redis data type
| Data | Redis type |
|---|---|
| Single object (JSON blob) | STRING |
| Multiple fields of one entity | HASH (more memory-efficient than JSON in a string) |
| Recent items, FIFO/LIFO | LIST |
| Set membership / dedup | SET |
| Leaderboard, sorted by score | SORTED SET |
| Sliding-window rate limit | SORTED SET (timestamps) |
| Stream of events | STREAM |
Storing user objects as a HASH:
redis.hset(f"user:{user_id}", mapping={"name": "Alzy", "email": "[email protected]"})
redis.hget(f"user:{user_id}", "email")
A HASH lets you update or read individual fields without round-tripping the whole object.
TTL strategy
A few rules of thumb:
- Application data (rarely changes): 5–60 minutes.
- Per-user / session data: 15–60 minutes (or session length).
- Hot read paths (lots of writes too): 30s–5 min.
- Static-ish data (categories, configs): hours or days.
- Idempotency tokens, request dedup: match the natural request lifetime (e.g. 1 day).
Add jitter to TTLs so cache entries don’t all expire at once and cause a thundering herd:
ttl = 3600 + random.randint(0, 600)  # 1h plus up to 10 min of jitter
redis.set(key, value, ex=ttl)
The thundering herd
When a popular cache key expires, every concurrent request misses the cache and hits the DB simultaneously. The DB falls over. This is the classic cache failure mode at scale.
Three defenses:
1. Probabilistic early refresh
Before the TTL expires, occasionally let one request “early-refresh” the cache:
def get_with_early_refresh(key, fetcher, ttl):
    val_with_meta = redis.get(key)
    if not val_with_meta:
        return _refresh(key, fetcher, ttl)
    val, expires_at = parse(val_with_meta)
    remaining = expires_at - now()
    # Probability of refresh increases as we approach expiry
    if random.random() < (1.0 - remaining / ttl) ** 4:
        return _refresh(key, fetcher, ttl)
    return val
Spreads the load over time instead of one big spike.
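The snippet above leaves `parse`, `_refresh`, and `now` undefined. One possible way to fill them in is to store the value together with its expiry timestamp as a JSON envelope. This is a sketch under assumptions: the envelope layout, the explicit `client` argument (a redis-py style client), and the injectable `rng` parameter are all hypothetical choices, not part of the original code.

```python
import json
import random
import time


def _refresh(client, key, fetcher, ttl):
    """Recompute the value and store it together with its expiry timestamp."""
    val = fetcher()
    envelope = {"val": val, "expires_at": time.time() + ttl}
    client.set(key, json.dumps(envelope), ex=ttl)
    return val


def get_with_early_refresh(client, key, fetcher, ttl, rng=random.random):
    raw = client.get(key)
    if raw is None:
        return _refresh(client, key, fetcher, ttl)
    envelope = json.loads(raw)
    # Clamp to [0, ttl] so clock skew cannot produce a nonsensical probability.
    remaining = min(ttl, max(0.0, envelope["expires_at"] - time.time()))
    # Refresh probability rises from 0 (just written) towards 1 (about to expire).
    if rng() < (1.0 - remaining / ttl) ** 4:
        return _refresh(client, key, fetcher, ttl)
    return envelope["val"]
```

The fourth power keeps the refresh probability close to zero for most of the entry's life and lets it climb steeply only near expiry; a different exponent would shift where that climb starts.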
2. Distributed locks
Only one process refreshes; others wait or serve stale.
lock_key = f"lock:{cache_key}"
got_lock = redis.set(lock_key, "1", nx=True, ex=10)
if got_lock:
    try:
        value = expensive_fetch()
        redis.set(cache_key, value, ex=3600)
    finally:
        redis.delete(lock_key)
Use a real distributed lock library (Redlock, python-redis-lock) for production.
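If you do roll your own, one common safeguard is to store a unique token in the lock and delete it only if the token still matches, so a slow worker cannot release a lock that has already expired and been taken by someone else. The sketch below assumes a redis-py style `client` (its `eval(script, numkeys, *keys_and_args)` call runs the Lua script atomically on the server); `acquire_lock` and `release_lock` are hypothetical helper names, and this is not a substitute for a maintained lock library.

```python
import uuid

# Delete the lock only if it still holds our token (compare-and-delete).
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


def acquire_lock(client, lock_key, ttl=10):
    """Return a unique token if the lock was acquired, else None."""
    token = uuid.uuid4().hex
    if client.set(lock_key, token, nx=True, ex=ttl):
        return token
    return None


def release_lock(client, lock_key, token):
    """Release the lock only if we still own it; return True on success."""
    return client.eval(RELEASE_SCRIPT, 1, lock_key, token) == 1
```

A caller would keep the token returned by `acquire_lock` and pass it to `release_lock` in a `finally` block.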
3. Stale-while-revalidate
Serve the stale value while refreshing in the background. Users get a fast (slightly stale) response; only one worker does the slow refresh.
This is the same idea as HTTP’s stale-while-revalidate — and it’s almost always the right answer for high-traffic caches.
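A rough sketch of stale-while-revalidate on top of Redis: each entry stores a soft `fresh_until` timestamp, while the Redis TTL is set longer so the stale copy stays available during the refresh. All names here (`get_swr`, `fresh_for`, `stale_for`, `spawn`, the `:refreshing` key suffix) are hypothetical, and `client` is assumed to be a redis-py style client.

```python
import json
import threading
import time


def _spawn_thread(fn):
    """Default runner: fire-and-forget daemon thread."""
    threading.Thread(target=fn, daemon=True).start()


def _store(client, key, fetcher, fresh_for, stale_for):
    value = fetcher()
    envelope = {"val": value, "fresh_until": time.time() + fresh_for}
    # Hard TTL = fresh window plus the extra time we accept serving stale data.
    client.set(key, json.dumps(envelope), ex=fresh_for + stale_for)
    return value


def get_swr(client, key, fetcher, fresh_for=60, stale_for=300, spawn=_spawn_thread):
    raw = client.get(key)
    if raw is None:
        # Cold miss: nothing stale to serve, so this caller has to wait.
        return _store(client, key, fetcher, fresh_for, stale_for)
    envelope = json.loads(raw)
    if time.time() >= envelope["fresh_until"]:
        lock_key = f"{key}:refreshing"
        # Stale: SET NX lets exactly one caller start the background refresh.
        if client.set(lock_key, "1", nx=True, ex=30):
            def refresh():
                try:
                    _store(client, key, fetcher, fresh_for, stale_for)
                finally:
                    client.delete(lock_key)
            spawn(refresh)
    return envelope["val"]
```

The `spawn` hook exists only so the background step can be swapped out (for a task queue, or a synchronous call in tests); a thread per refresh is the simplest default, not a recommendation.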
Rate limiting with Redis
Cache and rate limit often share the same Redis. The fixed-window pattern:
def is_allowed(user_id: int, limit: int = 100, window: int = 60) -> bool:
    key = f"ratelimit:{user_id}:{int(time.time()) // window}"
    count = redis.incr(key)
    if count == 1:
        redis.expire(key, window)
    return count <= limit
Sliding-window with sorted sets gives you smoother behavior; see Rate Limiting Strategies for APIs.
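For comparison, a sliding-window limiter on a sorted set might look like the sketch below: one member per request, scored by its timestamp. The function name, key prefix, and the injectable `now` parameter are assumptions; `client` is assumed to be a redis-py style client whose `pipeline()` queues commands and returns their results from `execute()`.

```python
import time
import uuid


def is_allowed_sliding(client, user_id, limit=100, window=60, now=None):
    """Sliding-window limiter: one sorted-set member per request, scored by timestamp."""
    now = time.time() if now is None else now
    key = f"ratelimit:sliding:{user_id}"
    pipe = client.pipeline()
    pipe.zremrangebyscore(key, 0, now - window)  # drop requests older than the window
    pipe.zadd(key, {uuid.uuid4().hex: now})      # record this request
    pipe.zcard(key)                              # count requests still in the window
    pipe.expire(key, window)                     # idle keys clean themselves up
    _, _, count, _ = pipe.execute()
    return count <= limit
```

Note that in this version rejected requests are still recorded, so a client that keeps hammering the API stays blocked until it backs off. Whether that is desirable depends on the API.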
Cache stampede prevention with SETNX
For “compute this expensive thing once” patterns:
def get_expensive_thing(key):
    while True:  # loop instead of recursing, so long waits can't exhaust the stack
        cached = redis.get(key)
        if cached:
            return cached
        # Try to claim the right to compute
        if redis.set(f"{key}:lock", "1", nx=True, ex=30):
            try:
                result = expensive_computation()
                redis.set(key, result, ex=300)
                return result
            finally:
                # Release even if the computation raised
                redis.delete(f"{key}:lock")
        # Lost the race; wait briefly, then check the cache again
        time.sleep(0.05)
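In a busy system, prefer a real distributed lock library: there are subtle bugs in DIY locking (e.g., the lock owner crashing so that everyone waits out the lock TTL, or the lock expiring mid-computation so that the owner later deletes a lock that now belongs to someone else).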
Production gotchas
- KEYS * will lock your Redis. It scans the whole keyspace synchronously. Use SCAN for any iteration.
- Memory is finite. Set maxmemory and a sensible maxmemory-policy (allkeys-lru or allkeys-lfu for caches).
- Persistence vs cache. If Redis is purely a cache, disable AOF and RDB. If it stores the only copy of some data (sessions, queues), enable persistence with thought.
- Connection pooling. Don’t open a connection per request. Use the connection pool that ships with your client.
- Pipeline batches. When sending many commands, pipelining (optionally wrapped in MULTI/EXEC when you need atomicity) drops round-trips dramatically.
- Network failures happen. Wrap cache calls so a Redis outage doesn’t 500 your app: fall through to the DB and serve uncached.
from redis.exceptions import RedisError

def get_user(user_id):
    try:
        cached = redis.get(f"user:{user_id}")
        if cached:
            return json.loads(cached)
    except RedisError:
        pass  # cache failure is non-fatal
    return db.fetch_user(user_id)
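The SCAN and pipelining advice from the list above combine naturally when you need to clear a group of keys. A minimal sketch, assuming a redis-py style `client` (`scan_iter` iterates with SCAN under the hood; `pipeline(transaction=False)` batches commands without MULTI/EXEC); the function names and batch size are hypothetical:

```python
def _delete_batch(client, keys):
    """Send one DEL per key in a single round-trip; return how many keys existed."""
    pipe = client.pipeline(transaction=False)
    for key in keys:
        pipe.delete(key)
    return sum(pipe.execute())


def delete_by_prefix(client, prefix, batch_size=500):
    """Delete all keys starting with prefix without blocking Redis the way KEYS would."""
    deleted = 0
    batch = []
    for key in client.scan_iter(match=f"{prefix}*", count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += _delete_batch(client, batch)
            batch = []
    if batch:
        deleted += _delete_batch(client, batch)
    return deleted
```

The prefix is passed to SCAN as a glob pattern, so prefixes containing characters such as `*` or `[` would need escaping first.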
What not to cache
- Per-user data with extreme freshness needs (financial balances, inventory).
- Data that’s already fast. A primary-key lookup on an indexed table is probably <1ms — caching adds complexity for no gain.
- Things you only read once. Caches pay off on repeated reads.
Cache where it pays. Don’t cache where it doesn’t. Measure both — your guess is wrong as often as it’s right.
Conclusion
Caching is one of those skills that pays back forever. Pick the right pattern (cache-aside is the default), set sane TTLs with jitter, design keys carefully, and plan for the thundering herd. Most caches stay simple; the bugs come from invalidation, so be honest about what your app actually requires.
For more on production architecture, see Rate Limiting Strategies for APIs and Designing REST APIs That Don’t Suck.
Happy caching!
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .