A URL shortener is the canonical system design interview question because it touches every interesting topic: ID generation, caching, schema design, capacity, analytics, abuse. Here’s how I’d actually design it; whether it’s an interview answer or a production service, the structure is the same.
Requirements (this is the trick)
The first job is gathering requirements. Fail here and the rest is wasted.
Functional
- Given a long URL, return a short URL.
- Given a short URL, redirect to the long URL.
- Optional: custom alias, expiration, click analytics.
Non-functional
- Read-heavy. Roughly 100:1 reads to writes for a typical shortener.
- Low latency redirect — sub-100ms p99.
- High availability: a failed redirect makes every page that embeds the link look broken.
- Durable — losing a mapping is unacceptable.
Out of scope
- User accounts, OAuth, billing UI. Pretend these exist via a sibling service.
Back-of-envelope capacity
| Metric | Estimate |
|---|---|
| New URLs per day | 100M |
| Reads per day | 10B (100:1 ratio) |
| New URLs / sec (avg) | ~1,200 |
| Reads / sec (avg) | ~120k |
| Reads / sec (peak, 3×) | ~360k |
| URL row size | ~500 bytes (URL + metadata) |
| Storage / year | 100M × 365 × 500B ≈ 18 TB/year |
Conclusion: writes are easy. Reads are not. The whole design pivots around making reads cheap.
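The table's arithmetic can be reproduced in a few lines. This is a minimal sketch using the same inputs as above (100M new URLs/day, 100:1 reads, 500-byte rows, 3× peak); the variable names are mine. The exact averages come out slightly below the rounded figures in the table.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Inputs taken from the capacity table above.
new_urls_per_day = 100_000_000
read_write_ratio = 100
row_bytes = 500
peak_factor = 3

reads_per_day = new_urls_per_day * read_write_ratio
writes_per_sec = new_urls_per_day / SECONDS_PER_DAY
reads_per_sec = reads_per_day / SECONDS_PER_DAY
peak_reads_per_sec = reads_per_sec * peak_factor
storage_tb_per_year = new_urls_per_day * 365 * row_bytes / 1e12  # decimal TB

print(f"writes/sec (avg):  {writes_per_sec:,.0f}")
print(f"reads/sec (avg):   {reads_per_sec:,.0f}")
print(f"reads/sec (peak):  {peak_reads_per_sec:,.0f}")
print(f"storage per year:  {storage_tb_per_year:.2f} TB")
```

The table rounds these up (about 1,157 writes/sec becomes ~1,200), which is the right direction to round when sizing capacity.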
API
POST /api/shorten
body: {"url": "https://example.com/...", "ttl_days": 365}
→ 201 {"short": "https://r.pt/aB3xQ"}
GET /{code}
→ 301 Location: <long_url> (or 404 / 410 expired)
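Why 301 and not 302? A 301 is permanent and cacheable, so browsers and the CDN can answer repeat requests without reaching the app. For a service handling 10B redirects/day, that’s an enormous saving. The cost is that cached redirects never reach your servers, so clicks are undercounted and an expired or deactivated link can keep redirecting from a client cache. Use 301 by default; switch to 302 if you need real-time analytics on every click or need to retire links immediately.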
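The status-code choice for GET /{code} can be sketched as a small pure function. This is an illustration under assumptions: `redirect_response`, the `permanent` flag, and the specific Cache-Control values are hypothetical, not part of the API above.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple


def redirect_response(
    long_url: Optional[str],
    expires_at: Optional[datetime],
    now: datetime,
    permanent: bool = True,
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a redirect lookup (hypothetical helper)."""
    if long_url is None:
        return 404, {}  # unknown code
    if expires_at is not None and expires_at <= now:
        return 410, {}  # the link existed but has expired
    if permanent:
        # Cacheable redirect; the one-hour max-age is an assumed value.
        return 301, {"Location": long_url, "Cache-Control": "public, max-age=3600"}
    # Uncached redirect, so every click reaches the service.
    return 302, {"Location": long_url, "Cache-Control": "no-store"}


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(redirect_response("https://example.com/a", None, now))
print(redirect_response("https://example.com/a", now - timedelta(days=1), now))
```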
ID generation — pick wisely
The short code is just an ID encoded into a short string. Three patterns:
1. Random hash, retry on collision
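import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols, so codes are true base62

def shorten(url):
    for _ in range(5):  # bounded retries on collision
        code = "".join(secrets.choice(ALPHABET) for _ in range(7))
        if try_insert(code, url):  # assumed helper: returns False if the code is taken
            return code
    raise RuntimeError("could not allocate a unique code after 5 attempts")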
- Pros: Stateless. Trivial to scale.
- Cons: Collisions cost an extra DB call. With 100M existing URLs in a 7-char base62 space (62⁷ ≈ 3.5T), each new code collides with probability of roughly 100M / 3.5T ≈ 0.003%: low but nonzero, and it grows as the table fills.
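To put a number on "low but nonzero", here is a rough sketch of the per-insert collision probability for uniformly random 7-character base62 codes. The function names are mine, and the model assumes every existing code is distinct and drawn uniformly.

```python
SPACE = 62 ** 7  # 3,521,614,606,208 possible 7-char base62 codes


def collision_probability(existing_codes: int, space: int = SPACE) -> float:
    """Chance that one fresh random code hits a code that is already stored."""
    return existing_codes / space


def all_attempts_collide(existing_codes: int, attempts: int = 5) -> float:
    """Chance that every one of `attempts` independent tries collides."""
    return collision_probability(existing_codes) ** attempts


one_day = 100_000_000          # one day of writes at 100M/day
one_year = one_day * 365       # a full year of writes

print(f"after one day:  {collision_probability(one_day):.6%}")
print(f"after one year: {collision_probability(one_year):.4%}")
print(f"5 retries all fail after one year: {all_attempts_collide(one_year):.2e}")
```

After a year at this write rate, about 1% of first attempts need a retry, which is still cheap, but it shows why the code length has to grow with the table.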
2. Hash of URL (deterministic)
import hashlib
code = base62(int.from_bytes(hashlib.sha256((url + salt).encode()).digest(), "big"))[:7]
- Pros: Same URL → same short. Caches well.
- Cons: Same URL → same short, even if user wanted a fresh one. Salt-and-rehash on collision still requires DB check.
3. Counter + base62
A monotonic counter (or batched range) per shard, encoded base62 to make it short.
n = next_counter() # 1, 2, 3, ...
code = base62(n) # "1", "2", ..., "z", "10", ..., "aB3xQ"
- Pros: No collisions. Sequential. Compresses well.
- Cons: Counter is global state. Snowflake-style (timestamp + machine ID + sequence) sidesteps that.
My pick: counter-based with a Snowflake-like generator. Each app server gets a machine ID, generates (timestamp << 22) | (machine_id << 12) | sequence, encodes base62. No coordination on the hot path.
def base62(n: int) -> str:
    A = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    if n == 0:
        return A[0]  # the loop below would otherwise return ""
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(A[r])  # least significant digit first
    return "".join(reversed(out))
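7 base62 chars = 62⁷ ≈ 3.5T codes, a little under 42 bits of ID space. That is plenty for a plain counter for a long time. Note that a full 63-bit Snowflake ID encodes to as many as 11 base62 characters, so the Snowflake route trades a longer code for coordination-free generation.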
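The Snowflake-like layout described above, (timestamp << 22) | (machine_id << 12) | sequence, can be sketched as follows. This is an illustrative generator, not a hardened one: `SnowflakeGenerator`, `EPOCH_MS`, and the logical-clock handling of sequence overflow and clock rollback are my assumptions.

```python
import threading
import time

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
EPOCH_MS = 1_700_000_000_000  # hypothetical custom epoch (mid-November 2023)


def base62(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))


class SnowflakeGenerator:
    """Timestamp (ms) | 10-bit machine ID | 12-bit sequence."""

    def __init__(self, machine_id: int) -> None:
        if not 0 <= machine_id < 1024:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.last_ms = 0
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now_ms = int(time.time() * 1000) - EPOCH_MS
            # Never move backwards: if the wall clock regresses (or a burst
            # pushed us ahead), keep using the last logical timestamp.
            now_ms = max(now_ms, self.last_ms)
            if now_ms == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    # 4096 IDs used in this millisecond: borrow the next one.
                    now_ms += 1
            else:
                self.sequence = 0
            self.last_ms = now_ms
            return (now_ms << 22) | (self.machine_id << 12) | self.sequence


gen = SnowflakeGenerator(machine_id=1)
new_id = gen.next_id()
print(new_id, base62(new_id), len(base62(new_id)))
```

Running this shows the trade-off directly: the encoded ID is noticeably longer than 7 characters, because the timestamp alone occupies the high bits.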
Schema
CREATE TABLE urls (
code TEXT PRIMARY KEY,
long_url TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
expires_at TIMESTAMPTZ,
user_id BIGINT,
is_active BOOLEAN NOT NULL DEFAULT true
);
CREATE INDEX urls_user ON urls (user_id);
CREATE INDEX urls_expires ON urls (expires_at) WHERE is_active = true;
Making code the primary key gives us a unique constraint for free, and the most common query (lookup by code) becomes a primary-key index hit.
Read path (the hot path)
client → CDN → app server → cache → database
Layered caching:
- CDN cache (1 hour) — most popular URLs served entirely from edge. Free.
- Redis (24 hour) — hot working set in memory.
- Database — the source of truth.
async def lookup(code: str) -> str | None:
    key = f"u:{code}"
    cached = await redis.get(key)  # bytes, or None when the key is absent
    if cached is not None:
        return cached.decode() or None  # empty value = negative-cache entry for not-found
    row = await db.fetchrow(
        "SELECT long_url FROM urls WHERE code = $1 AND is_active"
        " AND (expires_at IS NULL OR expires_at > now())",
        code,
    )
    if row is None:
        await redis.set(key, "", ex=60)  # negative-cache the miss for 1 min
        return None
    await redis.set(key, row["long_url"], ex=86400)  # cache the hit for 24 h
    return row["long_url"]
Two production patterns most tutorials skip:
- Negative caching for 404s prevents brute-force scanning from hammering the DB.
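- Cache TTL ≠ link expiry. A 24-hour cache entry keeps redirecting after the link expires or is deactivated unless you delete the key on change or cap the TTL at the time remaining until expires_at. Long TTLs are fine when you have invalidation logic; they’re risky when you don’t.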
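One way to keep a cache entry from outliving its link is to cap the cache TTL at the time remaining before expiry. A minimal sketch, assuming a hypothetical helper `cache_ttl_seconds` and the 24-hour Redis TTL used in the lookup code:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_CACHE_TTL = 86_400  # 24 hours, matching the Redis TTL above


def cache_ttl_seconds(
    expires_at: Optional[datetime],
    now: datetime,
    max_ttl: int = MAX_CACHE_TTL,
) -> int:
    """Seconds to cache a mapping; 0 means the link is expired, so skip caching."""
    if expires_at is None:
        return max_ttl  # link never expires
    remaining = int((expires_at - now).total_seconds())
    return max(0, min(max_ttl, remaining))


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(cache_ttl_seconds(None, now))                        # full 24 h
print(cache_ttl_seconds(now + timedelta(hours=1), now))    # capped at 1 h
print(cache_ttl_seconds(now - timedelta(seconds=5), now))  # already expired
```

This only covers scheduled expiry. Deactivating a link early still needs an explicit delete of the cache key.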
Write path
from datetime import datetime, timedelta, timezone

async def shorten(url: str, ttl_days: int = 365) -> str:
    code = base62(snowflake_id())
    expires_at = datetime.now(timezone.utc) + timedelta(days=ttl_days)
    await db.execute(
        "INSERT INTO urls (code, long_url, expires_at) VALUES ($1, $2, $3)",
        code, url, expires_at,
    )
    await redis.set(f"u:{code}", url, ex=86400)  # warm cache on insert
    return code
Warming the cache on insert pays off when users immediately share the link.
Capacity again, sharper
Reads at 360k/s peak. With a 95% cache hit ratio:
- 5% miss → 18k/s to DB.
- A single well-provisioned Postgres instance can typically sustain that for primary-key lookups. Add read replicas if needed.
Storage: 18TB/year. Postgres can hold that with partitioning, but at multi-year scales, consider:
- Cold-data archive: URLs not hit in N days → S3 + parquet, served via a fallback path.
- Sharded Postgres or a managed KV store (DynamoDB, Spanner, FoundationDB).
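The database load is very sensitive to the cache hit ratio, which a quick calculation makes concrete. A small sketch using the 360k/s peak from above; the function name and the extra hit ratios are illustrative.

```python
def db_reads_per_sec(peak_reads_per_sec: float, cache_hit_ratio: float) -> float:
    """Reads that fall through the cache and reach the database."""
    return peak_reads_per_sec * (1 - cache_hit_ratio)


PEAK_READS = 360_000  # peak reads/sec from the capacity table

for ratio in (0.90, 0.95, 0.99):
    print(f"hit ratio {ratio:.0%}: {db_reads_per_sec(PEAK_READS, ratio):,.0f} DB reads/sec")
```

Dropping from 95% to 90% doubles the database load, so the hit ratio is worth monitoring as a first-class metric.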
Analytics
Don’t write analytics on the hot path. The redirect should be one DB read, no write.
Pattern:
client → app → emit click event to Kafka → analytics consumer → write to ClickHouse / BigQuery
The redirect path stays fast because it only enqueues an event and never waits on the analytics store. Analytics can lag a few seconds; nobody clicks “show me my clicks” expecting microsecond freshness.
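The fire-and-forget shape of that pipeline can be sketched in-process. Here an `asyncio.Queue` stands in for Kafka and a plain list stands in for ClickHouse; `record_click` and `analytics_consumer` are hypothetical names, and dropping events when the buffer is full is one possible policy, not the only one.

```python
import asyncio
import time
from typing import Dict, List


def record_click(queue: asyncio.Queue, code: str) -> bool:
    """Called on the redirect path: enqueue without awaiting, drop if full."""
    try:
        queue.put_nowait({"code": code, "ts": time.time()})
        return True
    except asyncio.QueueFull:
        return False  # losing a click is preferable to slowing the redirect


async def analytics_consumer(queue: asyncio.Queue, sink: List[Dict]) -> None:
    """Drains events off the hot path and writes them to the analytics store."""
    while True:
        event = await queue.get()
        sink.append(event)  # stand-in for a batched ClickHouse/BigQuery insert
        queue.task_done()


async def main():
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)  # tiny buffer to show drops
    sink: List[Dict] = []
    consumer = asyncio.create_task(analytics_consumer(queue, sink))
    accepted = [record_click(queue, c) for c in ("aB3xQ", "aB3xQ", "Zk91p")]
    await queue.join()  # wait until every accepted event has been consumed
    consumer.cancel()
    return accepted, sink


if __name__ == "__main__":
    accepted, sink = asyncio.run(main())
    print(accepted, [e["code"] for e in sink])
```

The key property is that `record_click` never awaits, so a slow or unavailable analytics backend cannot add latency to redirects.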
Abuse and security
A URL shortener is a phishing/spam vector. Defenses:
- URL validation at creation: reject non-http(s), malformed, and very long URLs.
- Block lists for known-bad domains (Google Safe Browsing API).
- Rate limit on the create endpoint per IP and per user.
- CAPTCHA for anonymous creates above a threshold.
- Click-time scanning — at redirect, check if domain is on a recent block list; if so, show an interstitial warning rather than blind redirect.
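The creation-time checks in the first two bullets can be sketched with the standard library. This is a minimal illustration: `validate_url`, the 2048-character limit, and the in-memory `BLOCKED_HOSTS` set are assumptions, and a real deployment would query a maintained block list instead.

```python
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2048                # assumed limit; pick one that fits your storage
BLOCKED_HOSTS = {"malware.example"}  # stand-in for a real block-list lookup


def validate_url(url: str) -> tuple[bool, str]:
    """Return (ok, reason) for a URL submitted to the create endpoint."""
    if len(url) > MAX_URL_LENGTH:
        return False, "too long"
    try:
        parts = urlsplit(url)
    except ValueError:  # e.g. an unbalanced IPv6 bracket in the host
        return False, "malformed"
    if parts.scheme not in ("http", "https"):
        return False, "scheme not allowed"
    host = parts.hostname  # already lowercased by urlsplit
    if not host:
        return False, "missing host"
    if host in BLOCKED_HOSTS:
        return False, "blocked host"
    return True, "ok"


for candidate in ("https://example.com/a", "javascript:alert(1)", "https://malware.example/x"):
    print(candidate, validate_url(candidate))
```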
Custom aliases
POST /api/shorten
body: {"url": "...", "alias": "blog-post-42"}
Same flow, but the user picks the code. Reserve a separate namespace so custom aliases can’t collide with auto-generated codes:
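import re
def is_valid_alias(s: str) -> bool:
    # fullmatch returns a Match or None, so compare explicitly to return a real bool
    return 4 <= len(s) <= 32 and "-" in s and re.fullmatch(r"[a-zA-Z0-9-]+", s) is not None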
The “must contain a hyphen” rule is a cheap separator: aB3xQ is auto, my-link is custom. They can’t collide.
Multi-region
For a global service:
- Read replicas in every region (Postgres logical replication or a managed multi-region DB).
- Writes to a primary region; reads from local replica.
- CDN at the edge — serves most redirects without hitting any backend.
- Asynchronous propagation for analytics.
The interesting tradeoff: write linearizability across regions costs you write availability during partitions. For a URL shortener, eventual consistency on creation is fine — a fresh link being briefly invisible in another region is acceptable.
What interviewers love to dig into
- “How do you handle a celebrity click storm?” → CDN, request coalescing, pre-warm cache when traffic spikes detected.
- “How do you migrate the schema?” → Add the column as nullable; backfill in batches; switch reads to the new column; then tighten constraints (add NOT NULL, drop any temporary default).
- “What if Redis goes down?” → DB has the answer, slightly slower; circuit-break Redis to fail fast; consider a second cache layer.
- “How do you reach 1M shortens/sec?” → Snowflake IDs (no coord), partition the urls table on code, queue inserts into the analytics path.
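Request coalescing, mentioned for the click-storm question, means that concurrent cache misses for the same code share a single database lookup. A minimal asyncio sketch, with `SingleFlight` and `slow_db_lookup` as hypothetical names:

```python
import asyncio
from typing import Awaitable, Callable, Dict


class SingleFlight:
    """Coalesce concurrent calls for the same key into one underlying fetch."""

    def __init__(self) -> None:
        self._inflight: Dict[str, asyncio.Task] = {}

    async def do(self, key: str, fetch: Callable[[str], Awaitable[str]]) -> str:
        task = self._inflight.get(key)
        if task is None:
            # First caller for this key starts the fetch; later callers join it.
            task = asyncio.ensure_future(fetch(key))
            self._inflight[key] = task
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        # Simplification: cancelling one waiter cancels the shared task.
        return await task


async def main():
    calls = 0

    async def slow_db_lookup(code: str) -> str:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulated database latency
        return f"https://example.com/{code}"

    flights = SingleFlight()
    results = await asyncio.gather(
        *(flights.do("aB3xQ", slow_db_lookup) for _ in range(100))
    )
    return calls, results


if __name__ == "__main__":
    calls, results = asyncio.run(main())
    print(f"{len(results)} requests served by {calls} database call(s)")
```

This works per process. Across many app servers, each one still makes its own lookup, which is usually an acceptable bound.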
What I’d actually do day one
For a production URL shortener built today:
- Postgres (with urls partitioned by code hash) for source of truth.
- Redis/Valkey as the cache.
- A small Go or Rust service for the redirect (latency budget is tight).
- Cloudflare in front (CDN + DDoS).
- Kafka → ClickHouse for click analytics.
- CSP and Safe Browsing checks for security.
That’s a system that scales from 0 to 1B redirects/day with the same architecture.
If you want a worked-out URL shortener repo with everything above, it’s at rajpoot.dev.