Redis rate limiting.

Fixed window

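import time

# `redis` is assumed to be a connected redis-py client (an instance of redis.Redis).
def allow(user, limit=100, window=60):
    # One counter per user per window; the window index is part of the key.
    key = f"rate:{user}:{int(time.time()) // window}"
    count = redis.incr(key)
    if count == 1:
        # First hit in this window: set a TTL so the key cleans itself up.
        # INCR and EXPIRE are separate commands; a crash between them leaves a key with no TTL.
        redis.expire(key, window)
    return count <= limit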

Simple, but a client can send up to 2× the limit in a short burst that straddles a window boundary.
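To make the boundary burst concrete, here is a minimal in-memory sketch of the same fixed-window logic. The `FixedWindow` class, the `burst_at_boundary` helper, and the explicit `now` argument are hypothetical stand-ins for the Redis version so that it runs without a server.

```python
from collections import defaultdict


class FixedWindow:
    """In-memory stand-in for the Redis fixed-window counter (illustration only)."""

    def __init__(self, limit=100, window=60):
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)  # window index -> request count

    def allow(self, now):
        bucket = int(now) // self.window  # same window index as in the Redis key
        self.counts[bucket] += 1
        return self.counts[bucket] <= self.limit


def burst_at_boundary(limit=100, window=60):
    """Send `limit` requests just before a boundary and `limit` just after it."""
    fw = FixedWindow(limit, window)
    before = sum(fw.allow(window - 0.5) for _ in range(limit))
    after = sum(fw.allow(window + 0.5) for _ in range(limit))
    return before + after  # requests accepted within about one second
```

With the defaults, `burst_at_boundary()` accepts 200 requests within about one second, twice the nominal limit of 100 per minute.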

Sliding window log

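import time
from uuid import uuid4

def allow_sliding(user, limit=100, window=60):
    key = f"rate:{user}"
    now = time.time()
    pipe = redis.pipeline()  # redis-py pipelines run as a MULTI/EXEC transaction by default
    pipe.zremrangebyscore(key, 0, now - window)  # drop entries older than the window
    pipe.zadd(key, {f"{now}-{uuid4()}": now})    # unique member so same-timestamp hits don't collide
    pipe.zcard(key)                              # entries in the window, including this one
    pipe.expire(key, window)                     # logs of idle users expire on their own
    _, _, count, _ = pipe.execute()
    # Rejected requests are logged too, so a client that keeps retrying stays blocked.
    return count <= limit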

Precise, but more expensive: it stores one sorted-set entry per request in the window.
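Both halves of that trade-off can be seen in a small in-memory sketch. `SlidingLog` and `accepted_across_boundary` are hypothetical names, and the deque stands in for the Redis sorted set; like the Redis function, it logs every request, including rejected ones.

```python
from collections import deque


class SlidingLog:
    """In-memory stand-in for the sorted-set log (illustration only)."""

    def __init__(self, limit=100, window=60):
        self.limit = limit
        self.window = window
        self.log = deque()  # timestamps of logged requests, oldest first

    def allow(self, now):
        # Drop entries at or before now - window (ZREMRANGEBYSCORE bounds are inclusive).
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        self.log.append(now)  # rejected requests are logged as well
        return len(self.log) <= self.limit


def accepted_across_boundary(limit=100, window=60):
    """Send `limit` requests just before t = window and `limit` just after."""
    sl = SlidingLog(limit, window)
    before = sum(sl.allow(window - 0.5) for _ in range(limit))
    after = sum(sl.allow(window + 0.5) for _ in range(limit))
    return before + after, len(sl.log)  # (accepted requests, entries stored)
```

With the defaults this returns `(100, 200)`: only 100 requests get through across the boundary, but the log holds 200 entries to achieve that.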

Sliding window counter (approx)

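import time

def allow_swc(user, limit=100, window=60):
    now = time.time()
    cur_key = f"rate:{user}:{int(now // window)}"
    prev_key = f"rate:{user}:{int(now // window) - 1}"

    cur = int(redis.get(cur_key) or 0)
    prev = int(redis.get(prev_key) or 0)

    # Weight the previous window by how much of it still overlaps the trailing window.
    elapsed = now % window
    weighted = prev * ((window - elapsed) / window) + cur

    if weighted >= limit:
        return False

    # The reads above and the write below are not atomic, so concurrent
    # requests can slip slightly past the limit.
    pipe = redis.pipeline()
    pipe.incr(cur_key)
    pipe.expire(cur_key, 2 * window)  # keep the key long enough to serve as prev_key
    pipe.execute()
    return True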

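A good approximation, and cheap: two counters per user regardless of traffic. It assumes requests in the previous window were spread evenly.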
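The weighting is easier to see with numbers. The helper below is a hypothetical extraction of just the arithmetic from `allow_swc`, with `now` passed in explicitly so the result is reproducible.

```python
def weighted_count(prev, cur, now, window=60):
    """Estimate requests in the trailing window from two fixed-window counters."""
    elapsed = now % window                 # seconds into the current window
    overlap = (window - elapsed) / window  # share of the previous window still in range
    return prev * overlap + cur
```

For example, 15 seconds into a window (`now=75`), with 80 requests in the previous window and 30 so far in the current one, three quarters of the previous window still counts: `weighted_count(80, 30, 75)` gives 80 × 0.75 + 30 = 90.0, which is under a limit of 100.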

Token bucket (Lua)

-- KEYS[1] = bucket key
-- ARGV[1] = capacity
-- ARGV[2] = refill_rate per second
-- ARGV[3] = now (timestamp)
-- ARGV[4] = cost

local capacity = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local cost = tonumber(ARGV[4])

local data = redis.call("HMGET", KEYS[1], "tokens", "ts")
local tokens = tonumber(data[1]) or capacity
local last = tonumber(data[2]) or now

local elapsed = math.max(0, now - last)
tokens = math.min(capacity, tokens + elapsed * rate)

if tokens < cost then
    redis.call("HSET", KEYS[1], "tokens", tokens, "ts", now)
    redis.call("EXPIRE", KEYS[1], 60)
    return 0
end

tokens = tokens - cost
redis.call("HSET", KEYS[1], "tokens", tokens, "ts", now)
redis.call("EXPIRE", KEYS[1], 60)
return 1

# Python caller; LUA is a string holding the script above.
script = redis.register_script(LUA)
allowed = script(keys=[f"bucket:{user}"], args=[100, 10, time.time(), 1])  # 1 if allowed, 0 if not

Allows bursts up to capacity, then throttles to the sustained refill rate (here, 100 tokens refilled at 10 per second).
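The burst-then-refill behaviour can be checked without Redis. This is a rough Python mirror of the Lua logic above, not a replacement for it: `TokenBucket` and `demo` are hypothetical names, and state lives in the object rather than in a hash.

```python
class TokenBucket:
    """In-memory mirror of the Lua token bucket (illustration only)."""

    def __init__(self, capacity, rate):
        self.capacity = capacity
        self.rate = rate      # tokens added per second
        self.tokens = None    # None plays the role of a missing hash field
        self.last = None

    def allow(self, now, cost=1):
        tokens = self.capacity if self.tokens is None else self.tokens
        last = now if self.last is None else self.last
        elapsed = max(0, now - last)
        tokens = min(self.capacity, tokens + elapsed * self.rate)  # refill, capped
        allowed = tokens >= cost
        if allowed:
            tokens -= cost
        self.tokens, self.last = tokens, now  # state is saved either way, as in the script
        return allowed


def demo():
    """Fire 150 requests at t=0 and another 150 at t=1."""
    bucket = TokenBucket(capacity=100, rate=10)
    burst = sum(bucket.allow(0) for _ in range(150))
    later = sum(bucket.allow(1) for _ in range(150))
    return burst, later
```

`demo()` returns `(100, 10)`: the full capacity is spent in the initial burst, and one second later only the 10 refilled tokens are available.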

Distributed (multi-shard)

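For very high QPS, shard keys. Use a hash that is stable across processes; Python's built-in hash() is salted per process for strings, so two pods would disagree on the shard:

import zlib

shard = zlib.crc32(user.encode()) % 16
key = f"rate:{shard}:{user}"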
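The shard number only helps if it decides where the key lives. A minimal sketch, assuming the shards are separate Redis instances held in a list called `clients` (a hypothetical name; the post does not say how shards map to servers). It uses `zlib.crc32` so that every process computes the same shard for a given user.

```python
import zlib


def pick_client(user, clients):
    """Return (shard index, client) for `user`, stable across processes."""
    shard = zlib.crc32(user.encode("utf-8")) % len(clients)
    return shard, clients[shard]


def shard_key(user, shard):
    """Build the key in the same shape as the snippet above."""
    return f"rate:{shard}:{user}"
```

A caller would then run the rate-limit check against the returned client, for example `shard, client = pick_client(user, clients)` followed by a counter update on `shard_key(user, shard)`.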

Per-IP + per-user

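def allow_request(request, user=None):
    # Both counters are incremented on every call, even when the IP check has already failed.
    allow_ip = allow(f"ip:{request.ip}", 1000, 60)
    allow_user = allow(f"user:{user.id}", 100, 60) if user else True
    return allow_ip and allow_user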

upstash/ratelimit (TS)

import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const rl = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, "10 s"),
});

const { success, limit, remaining, reset } = await rl.limit(userId);

Production-ready, with several algorithms built in (fixed window, sliding window, token bucket).

Common mistakes

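  • Race conditions: a read, a check, and a write issued as separate commands are not atomic; use a single atomic command or a Lua script.
  • Window boundary bursts: a fixed window can admit up to 2× the limit around a boundary.
  • Missing TTL → keys never expire and memory grows.
  • Rate limiting inside each app pod → every pod enforces its own count; use a shared store such as Redis.
  • Counting failed requests against the limit.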
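For the first point, one way to close the gap between INCR and EXPIRE in a fixed-window limiter is to run both inside one Lua script. This is a minimal sketch, assuming a redis-py client is passed in as `client`; `make_fixed_window`, `FIXED_WINDOW_LUA`, and the explicit `now` parameter are hypothetical names.

```python
FIXED_WINDOW_LUA = """
local count = redis.call("INCR", KEYS[1])
if count == 1 then
    redis.call("EXPIRE", KEYS[1], ARGV[1])
end
return count
"""


def make_fixed_window(client, limit=100, window=60):
    """Return an allow(user, now) function backed by one atomic script call."""
    script = client.register_script(FIXED_WINDOW_LUA)  # redis-py Script object

    def allow(user, now):
        key = f"rate:{user}:{int(now) // window}"
        # INCR and EXPIRE run together on the server, so a key can never
        # be created without its TTL.
        count = script(keys=[key], args=[window])
        return int(count) <= limit

    return allow
```

Typical use would be `allow = make_fixed_window(redis.Redis())` once at startup, then `allow(user_id, time.time())` per request.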

Read this next

If you want my Lua rate limiter, it’s at rajpoot.dev.

