Redis rate limiting.
Fixed window
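import time

# Assumes redis is a connected redis-py client, e.g. redis.Redis().
def allow(user, limit=100, window=60):
    # One counter per user per window; the window index is part of the key.
    key = f"rate:{user}:{int(time.time()) // window}"
    count = redis.incr(key)
    if count == 1:
        # First hit in this window: set a TTL so old counters get cleaned up.
        redis.expire(key, window)
    return count <= limit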
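Simple, but it allows bursts at window boundaries: a client can spend the full limit at the end of one window and again at the start of the next, so up to twice the limit can pass in a short span.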
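The boundary burst is easy to reproduce without Redis. The sketch below is an in-memory stand-in for the counter above, with explicit timestamps instead of time.time(). The names make_fixed_window and allow_sim are illustrative, not from the original code.

```python
from collections import defaultdict

def make_fixed_window(limit: int, window: int):
    """In-memory stand-in for the INCR-per-window counter (no Redis, no TTL)."""
    counts = defaultdict(int)

    def allow_sim(user: str, now: float) -> bool:
        key = (user, int(now) // window)  # same bucketing as the Redis key
        counts[key] += 1
        return counts[key] <= limit

    return allow_sim

allow_sim = make_fixed_window(limit=100, window=60)

# 100 requests at t=59s (end of window 0), then 100 more at t=60s (start of window 1).
accepted = sum(allow_sim("alice", 59.0) for _ in range(100))
accepted += sum(allow_sim("alice", 60.0) for _ in range(100))
print(accepted)  # 200: twice the limit within about one second
```

Every request lands in a window where the counter is still at or below the limit, so none is rejected even though 200 arrive within a second.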
Sliding window log
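import time
from uuid import uuid4

def allow_sliding(user, limit=100, window=60):
    key = f"rate:{user}"
    now = time.time()
    pipe = redis.pipeline()  # MULTI/EXEC transaction by default in redis-py
    pipe.zremrangebyscore(key, 0, now - window)  # drop entries older than the window
    pipe.zadd(key, {f"{now}-{uuid4()}": now})    # unique member, scored by timestamp
    pipe.zcard(key)                              # entries left in the window
    pipe.expire(key, window)
    _, _, count, _ = pipe.execute()
    # Rejected requests are logged too, so a client that keeps retrying stays blocked.
    return count <= limit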
Precise, but more expensive: it stores one sorted-set entry per request, so memory and work per call grow with the limit.
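To make the trade-off concrete, here is a minimal in-memory model of the log, assuming timestamps arrive in non-decreasing order. The SlidingLog class is hypothetical. Unlike the Redis version above, it records only admitted requests, which is a deliberate simplification.

```python
from collections import deque

class SlidingLog:
    """In-memory model of the sorted-set log: one timestamp per admitted request."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.log = deque()  # timestamps, oldest first

    def allow(self, now: float) -> bool:
        # Drop entries that fell out of the window (the ZREMRANGEBYSCORE step).
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) >= self.limit:
            return False
        self.log.append(now)
        return True

limiter = SlidingLog(limit=100, window=60)
first = sum(limiter.allow(59.0) for _ in range(100))   # end of one fixed window
second = sum(limiter.allow(60.0) for _ in range(100))  # start of the next
print(first, second, len(limiter.log))  # 100 0 100
```

The second burst is rejected in full because the log still holds 100 entries from one second earlier. The cost is also visible: the structure holds up to limit timestamps per user.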
Sliding window counter (approx)
import time

def allow_swc(user, limit=100, window=60):
    now = time.time()
    cur_key = f"rate:{user}:{int(now // window)}"
    prev_key = f"rate:{user}:{int(now // window) - 1}"
    cur = int(redis.get(cur_key) or 0)
    prev = int(redis.get(prev_key) or 0)
    elapsed = now % window
    # Weight the previous window by the share of it still inside the sliding window.
    weighted = prev * ((window - elapsed) / window) + cur
    if weighted >= limit:
        return False
    # The reads above and the increment below are not atomic: concurrent
    # requests can slip slightly past the limit.
    pipe = redis.pipeline()
    pipe.incr(cur_key)
    pipe.expire(cur_key, 2 * window)
    pipe.execute()
    return True
A good approximation at low cost: two counters per user. It assumes requests in the previous window were evenly spread.
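A worked example of the weighting may help. The helper below only repeats the arithmetic from allow_swc. The function name and the sample numbers are made up for illustration.

```python
def weighted_count(prev: int, cur: int, now: float, window: int) -> float:
    """Estimate requests in the last `window` seconds from two fixed-window counters."""
    elapsed = now % window  # seconds into the current window
    return prev * ((window - elapsed) / window) + cur

# 15 s into the current window, 75% of the previous window still overlaps.
estimate = weighted_count(prev=80, cur=30, now=75.0, window=60)
print(estimate)  # 90.0 = 80 * 0.75 + 30
```

With a limit of 100 this request would be admitted. Right at the window boundary (elapsed = 0) the same counters give 110 and it would be rejected.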
Token bucket (Lua)
-- KEYS[1] = bucket key
-- ARGV[1] = capacity
-- ARGV[2] = refill_rate per second
-- ARGV[3] = now (timestamp)
-- ARGV[4] = cost
local capacity = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local cost = tonumber(ARGV[4])
local data = redis.call("HMGET", KEYS[1], "tokens", "ts")
local tokens = tonumber(data[1]) or capacity
local last = tonumber(data[2]) or now
local elapsed = math.max(0, now - last)
tokens = math.min(capacity, tokens + elapsed * rate)
if tokens < cost then
    redis.call("HSET", KEYS[1], "tokens", tokens, "ts", now)
    redis.call("EXPIRE", KEYS[1], 60)
    return 0
end
tokens = tokens - cost
redis.call("HSET", KEYS[1], "tokens", tokens, "ts", now)
redis.call("EXPIRE", KEYS[1], 60)
return 1
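# LUA holds the script above as a Python string.
script = redis.register_script(LUA)
# capacity=100, refill 10 tokens/s, cost 1 token per request; returns 1 or 0
allowed = script(keys=[f"bucket:{user}"], args=[100, 10, time.time(), 1])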
Allows bursts up to capacity, then throttles to the sustained refill rate.
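The burst-then-refill behaviour can be checked with a pure-Python mirror of the script's arithmetic. This is a sketch for reasoning about the numbers, not a replacement for the Lua version. The TokenBucket class is hypothetical and keeps its state in memory.

```python
class TokenBucket:
    """Same refill arithmetic as the Lua script, with state held in memory."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate        # tokens added per second
        self.tokens = capacity  # a new bucket starts full, as in the script
        self.last = None        # timestamp of the previous call

    def allow(self, now: float, cost: float = 1.0) -> bool:
        last = now if self.last is None else self.last
        elapsed = max(0.0, now - last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens < cost:
            return False
        self.tokens -= cost
        return True

bucket = TokenBucket(capacity=100, rate=10)
burst = sum(bucket.allow(now=0.0) for _ in range(150))  # full bucket drains
later = sum(bucket.allow(now=1.0) for _ in range(150))  # one second of refill
print(burst, later)  # 100 10
```

The first 100 requests pass immediately. One second later only 10 more pass, which is the refill rate.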
Distributed (multi-shard)
For very high QPS, shard keys:
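import zlib
# Built-in hash() is randomized per process for strings, so use a stable hash.
shard = zlib.crc32(str(user).encode()) % 16
key = f"rate:{shard}:{user}"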
Per-IP + per-user
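# Illustrative wrapper; the function name is not from a library.
def allow_request(request, user):
    # Both counters are incremented on every call, even when the IP check fails.
    allow_ip = allow(f"ip:{request.ip}", 1000, 60)
    allow_user = allow(f"user:{user.id}", 100, 60) if user else True
    return allow_ip and allow_user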
upstash/ratelimit (TS)
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
const rl = new Ratelimit({
redis: Redis.fromEnv(),
limiter: Ratelimit.slidingWindow(10, "10 s"),
});
const { success, limit, remaining, reset } = await rl.limit(userId);
Production-ready, with several algorithms built in (fixed window, sliding window, token bucket).
Common mistakes
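- Race conditions when the check and the increment are not atomic; use a Lua script or a single atomic command.
- Bursts at window boundaries with fixed windows.
- Missing TTL → memory grows without bound.
- Rate limiting inside each app pod → inconsistent limits without a shared store.
- Counting failed requests against the limit.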
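The first and third items can be handled together by moving INCR and EXPIRE into one script. The sketch below assumes a redis-py style client that exposes register_script. The make_allow helper and the now parameter are illustrative additions.

```python
import time

# INCR and EXPIRE run inside one script, so the key is never left without a TTL.
FIXED_WINDOW_LUA = """
local count = redis.call("INCR", KEYS[1])
if count == 1 then
    redis.call("EXPIRE", KEYS[1], ARGV[1])
end
return count
"""

def make_allow(client, limit=100, window=60):
    """Build an allow(user) function bound to a redis-py style client."""
    script = client.register_script(FIXED_WINDOW_LUA)

    def allow(user, now=None):
        # `now` can be passed explicitly, which makes the function easy to test.
        now = time.time() if now is None else now
        key = f"rate:{user}:{int(now) // window}"
        count = script(keys=[key], args=[window])
        return int(count) <= limit

    return allow
```

With a live server this would be wired up as allow = make_allow(redis.Redis()). Redis runs a script atomically, so no other client can observe the counter between the increment and the expiry.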
Read this next
If you want my Lua rate limiter, it's at rajpoot.dev.