URL shorteners look trivial: take URL, return short. Then the link goes viral, your DB is on fire, analytics fan out to ten teams, and you discover the requirements were never simple. This post is the working design.
Requirements
- Generate short URL
https://r.dev/aBcDeFfrom any long URL. - Redirect (301 or 302) on click.
- Track clicks (count, geo, referer).
- Custom slugs:
https://r.dev/launch. - 99.99% availability.
- Tail latency p99 <50ms.
Encoding
Base62 ([0-9A-Za-z]):
ALPHA = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
def encode(n: int) -> str:
s = ""
while n:
n, r = divmod(n, 62)
s = ALPHA[r] + s
return s or "0"
def decode(s: str) -> int:
n = 0
for c in s:
n = n * 62 + ALPHA.index(c)
return n
6 chars = 56B unique keys. 7 chars = 3.5T. 7 is plenty.
Key generation
Three approaches:
A. Counter (sequential)
n = await db.execute("SELECT nextval('url_id_seq')")
key = encode(n)
Pros: dense; sequential; smallest possible. Cons: predictable; reveals usage; hot row contention.
B. Counter blocks per shard
# At startup, each shard reserves a block of 10000 IDs
my_block = await reserve_block()
# Use IDs locally; write to DB
Pros: dense; no per-write contention. Cons: needs reservation logic.
C. Random
import secrets
key = "".join(secrets.choice(ALPHA) for _ in range(7))
# Check unique; retry if collision
Pros: unguessable; horizontally trivial. Cons: collision check; slightly less dense.
For most apps: B for default; C for vanity / public shorteners.
Schema
CREATE TABLE links (
key text PRIMARY KEY,
url text NOT NULL,
user_id bigint,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE clicks (
key text NOT NULL,
clicked_at timestamptz NOT NULL DEFAULT now(),
referer text,
country text,
ua_hash text
) PARTITION BY RANGE (clicked_at);
links is small (KV-style). clicks grows; partition by date for retention.
Read path
async def redirect(key: str):
if cached := await redis.get(f"link:{key}"):
url = cached.decode()
else:
row = await db.fetch_one("SELECT url FROM links WHERE key = $1", key)
if not row: raise NotFound
url = row["url"]
await redis.set(f"link:{key}", url, ex=86400)
asyncio.create_task(record_click(key, request))
return Response(status=302, headers={"Location": url})
Cache hit: fast. Cache miss: DB lookup, then cache. Click recording is async (fire-and-forget); never blocks the redirect.
Edge caching
Redirects served from CDN:
GET /aBcDeF
Cache-Control: public, max-age=300
Most clicks served from a POP; origin sees a fraction. For a viral link: 1M clicks → maybe 1k origin requests.
Click analytics
Don’t write each click to Postgres synchronously. Pipeline:
Click → Kafka (or similar) → consumer → ClickHouse / batched Postgres insert
ClickHouse handles billions of clicks; aggregations are fast. Postgres holds metadata.
For low-volume: batch + COPY to Postgres every minute.
Custom slugs
async def create_custom(slug: str, url: str, user: User):
if await db.exists("SELECT 1 FROM links WHERE key = $1", slug):
raise Conflict("slug taken")
if forbidden(slug): raise Forbidden # admin/login/etc.
if not user.has_premium: raise PaymentRequired
await db.execute("INSERT INTO links (key, url, user_id) VALUES ($1, $2, $3)", slug, url, user.id)
Reserve names for system routes. Maybe gate to paid users.
Abuse / safety
URLs to malware sites; phishing; spam. Need:
- URL scanning (Google Safe Browsing API, internal classifier).
- Reputation signals (new account, many links/day, suspicious targets).
- Reporting (community reports, abuse@).
- Takedown (admin panel; fast revoke).
For a public shortener, this is half the engineering effort.
Vanity short domain
r.dev is shorter than myapp.com/r/.... Worth the cost of a domain.
Capacity sketch
- 100M new links/day = ~1200 writes/sec average; ~12k peak. One Postgres handles fine.
- 1B clicks/day = 12k reads/sec average; ~120k peak. Cache + edge handles fine.
- Storage: 100M × 200B = 20GB/day links (negligible); clicks: GB/day, partition + archive.
For most shorteners: a single big Postgres + Redis + edge cache scales further than you’d expect.
Common mistakes
1. Synchronous click logging
Each click writes to Postgres. Postgres becomes the bottleneck under viral load. Async / batch.
2. No cache
Every redirect hits DB. Cache redirect mappings; they’re stable.
3. Sequential counter without sharding
One row’s nextval becomes the bottleneck. Reserve blocks per shard.
4. No takedown path
Malicious link goes viral. No way to revoke quickly. Build the admin tool day one.
5. 302 vs 301 confusion
301 is permanent (cached forever by browsers). 302 is temporary. For most shorteners: 302 so you can change/disable.
Read this next
If you want my URL shortener starter (FastAPI + Postgres + Redis + edge cache), it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .