Should I build a cache or use Redis / Memcached?

Use Redis or Memcached. Building one is a fun design exercise, not a useful project. The hard parts (consistent hashing, replication, eviction) are solved problems.

Memcached for pure cache: simpler, multi-threaded, lower memory overhead. Redis when you need data structures (sorted sets for leaderboards, streams, pub/sub) or persistence.

Design a Distributed Cache — Memcached Successor Patterns in 2026

A distributed cache is the system design interview classic. Even if you’ll never build one, designing it teaches you consistent hashing, replication, and the cache-coherence tradeoffs that show up everywhere.

Requirements

Get / set / delete by key.
Sub-millisecond latency.
Survive node failure.
Scale horizontally.
Eviction when full.

Architecture

Client SDK (consistent-hashes key → server)
    ↓
Cache nodes (in-memory store)
    ↓
(optional) Replica per shard for HA

Stateless clients. Stateful servers. No central coordinator.

Consistent hashing

Naive: shard = hash(key) % N. Adding a node moves nearly all keys.

Consistent hash: keys mapped to a virtual ring; each node owns an arc. Adding a node moves only its arc’s worth of keys.

class ConsistentHash:
    def __init__(self, nodes, virtual=150):
        self.ring = {}
        for node in nodes:
            for v in range(virtual):
                key = hash(f"{node}:{v}")
                self.ring[key] = node
        self.sorted_keys = sorted(self.ring)

    def get(self, key):
        h = hash(key)
        # find first ring position >= h
        for k in self.sorted_keys:
            if k >= h:
                return self.ring[k]
        return self.ring[self.sorted_keys[0]]   # wrap

Virtual nodes (150 per real node) smooth out distribution.

Replication

For HA: each shard has a primary + 1–2 replicas. On primary failure, a replica is promoted.

For Memcached-style ephemeral cache: often no replication. A node failure → those keys are lost; clients refetch from origin. Cheap and simple.

For Redis-style: Sentinel or Cluster handles failover.

Eviction policies

When memory fills:

LRU (least recently used): evict the oldest-touched key. Simple; good for typical caches.
LFU (least frequently used): evict the least-used key. Better for skewed access.
FIFO: oldest first. Rare.
Random: pick a random key. Simple; surprisingly effective.

Memcached uses slab-based LRU. Redis exposes policies via maxmemory-policy .

TTL handling

Each entry has an expiry timestamp. Active expiration (sweep periodically) + lazy expiration (check on access). Both Memcached and Redis use a hybrid.

Hot keys

A single key receives 90% of traffic; one node bears the load. Solutions:

Client-side cache (small in-process LRU) catches most reads before the server.
Replicate hot keys to multiple nodes; client picks randomly.
Sub-shard a hot key: key:{key}:{0..N}; aggregate.

Same idea as Design a Leaderboard at Scale .

Cache stampede

Hot key expires; thousands of concurrent misses hit origin. Mitigations:

Single-flight: only one fetch per key in flight; others wait.
Probabilistic early refresh: refresh near (not at) TTL.
Stale-while-revalidate: serve stale; refresh in background.

For the patterns see Caching Strategies in 2026 .

Capacity

For a typical cache:

Per-node: 64–256 GB RAM, 32–64 vCPUs.
Items: ~10–100M per node.
Throughput: 100k–1M ops/sec/node.

A 16-node cluster → 1B items, 16M ops/sec.

What I’d actually use

For a real product:

Memcached for ephemeral caching (sessions, computed values).
Redis / Valkey when data structures matter (queues, sorted sets, streams).
Application-side LRU in-process for the hottest 1k items.

Don’t build a custom cache. The tradeoff space is solved.

Read this next

If you want a Python consistent-hashing client + Redis cache library, it’s at rajpoot.dev .

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .

Requirements#

Architecture#

Consistent hashing#

Replication#

Eviction policies#

TTL handling#

Hot keys#

Cache stampede#

Capacity#

What I’d actually use#

Read this next#