Why use Postgres as a queue when Redis / SQS exist?

One less moving part. Atomic with your business data (jobs and updates in same tx). Durable. Visible (SQL). Good for hundreds-to-thousands of jobs/sec. Graduate when you outgrow it.

When do I outgrow Postgres queues?

Sustained > 10k jobs/sec, multi-region, or you need Kafka-style replay/streaming. Below that, Postgres handles it fine and saves you ops.

Postgres as a Queue in 2026 — Boring, Reliable, Underrated

Postgres makes a great job queue up to surprising scale. No Redis, no SQS, no Kafka — just a table and SKIP LOCKED. This post is the working playbook.

Schema

CREATE TABLE jobs (
    id          bigserial PRIMARY KEY,
    queue       text NOT NULL DEFAULT 'default',
    payload     jsonb NOT NULL,
    priority    int NOT NULL DEFAULT 0,
    run_at      timestamptz NOT NULL DEFAULT now(),
    attempts    int NOT NULL DEFAULT 0,
    locked_at   timestamptz,
    locked_by   text,
    created_at  timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX jobs_pickup ON jobs (queue, run_at, priority DESC)
    WHERE locked_at IS NULL;

Partial index = small + fast for the “available jobs” query.

Enqueue

async def enqueue(queue, payload, run_at=None, priority=0):
    await db.execute(
        "INSERT INTO jobs (queue, payload, run_at, priority) VALUES ($1, $2, $3, $4)",
        queue, payload, run_at or "now()", priority
    )

Atomic with business writes:

async with db.transaction():
    await db.execute("INSERT INTO orders ...")
    await enqueue("send_confirmation", {"order_id": order_id})

Either both happen or neither. The outbox pattern, free.

Dequeue (SKIP LOCKED)

WITH next_job AS (
    SELECT id FROM jobs
    WHERE queue = $1 AND run_at <= now() AND locked_at IS NULL
    ORDER BY priority DESC, run_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
UPDATE jobs SET locked_at = now(), locked_by = $2, attempts = attempts + 1
WHERE id = (SELECT id FROM next_job)
RETURNING id, payload;

Multi-worker safe: each worker grabs a different row. Other workers skip locked rows.

async def claim_job(queue, worker_id):
    return await db.fetchrow(SQL, queue, worker_id)

Process and complete

async def worker_loop(worker_id):
    while True:
        job = await claim_job("default", worker_id)
        if not job:
            await asyncio.sleep(1)
            continue
        try:
            await process(job["payload"])
            await db.execute("DELETE FROM jobs WHERE id = $1", job["id"])
        except Exception as e:
            await handle_failure(job, e)

Delete on success. On failure: requeue with backoff or move to dead-letter.

Retries with backoff

async def handle_failure(job, error):
    if job["attempts"] < 5:
        delay = 2 ** job["attempts"] + random.uniform(0, 1)
        await db.execute(
            "UPDATE jobs SET locked_at = NULL, run_at = now() + interval '1 second' * $1 WHERE id = $2",
            delay, job["id"]
        )
    else:
        await db.execute(
            "INSERT INTO dead_jobs SELECT * FROM jobs WHERE id = $1", job["id"]
        )
        await db.execute("DELETE FROM jobs WHERE id = $1", job["id"])
        log.error(f"job {job['id']} dead-lettered: {error}")

Exponential + jitter. Cap retries; dead-letter the rest. See Idempotency .

Visibility timeout

A worker dies mid-job; lock stuck forever. Reaper:

-- Periodic cleanup
UPDATE jobs SET locked_at = NULL, locked_by = NULL
WHERE locked_at < now() - interval '5 minutes';

If a job takes longer than the timeout, the worker should heartbeat:

UPDATE jobs SET locked_at = now() WHERE id = $1 AND locked_by = $2;

LISTEN / NOTIFY for low latency

Polling has 1s+ latency. For instant wake-up:

-- On enqueue
NOTIFY jobs_default, '';

# Worker
async with db.acquire() as conn:
    await conn.execute("LISTEN jobs_default")
    while True:
        try:
            job = await claim_job(...)
            if job:
                await process(...)
            else:
                await conn.notifies.get(timeout=5)  # wakes on NOTIFY or after 5s
        except asyncio.TimeoutError:
            pass

Sub-millisecond pickup latency. Pair with periodic polling as a safety net.

Concurrency

Postgres handles SKIP LOCKED well. With ~100k jobs/sec sustained you’ll hit:

WAL bandwidth (writes).
Index bloat on the partial index.
Connection limits.

Mitigations:

PgBouncer transaction pooling.
Partition by queue or date.
Vacuum aggressively — high-churn tables need it.

Multi-queue

Use the queue column for logical separation:

async def claim_any(worker_id):
    return await db.fetchrow(SQL_WITH_ANY_QUEUE, worker_id)

Workers can specialize: high-priority workers on urgent queue, generalists on default.

Scheduled jobs

await enqueue("cleanup", {}, run_at=tomorrow_3am)

run_at <= now() filter handles scheduling. For complex schedules: pg_cron extension or external scheduler that calls enqueue.

At-least-once semantics

Jobs may run twice (worker dies after work but before delete). Make handlers idempotent:

async def send_email(payload):
    if await already_sent(payload["email_id"]):
        return
    await sendgrid.send(...)
    await mark_sent(payload["email_id"])

Comparison

	Postgres queue	Redis queue	SQS	Kafka
Setup	None (already there)	Add Redis	Add SQS	Add cluster
Throughput	~10k/sec	~100k/sec	High	Very high
Durability	High (DB)	Configurable	High	High
Atomic with biz data	Yes	No	No	No
Visibility	SQL	RedisInsight	Console	Less
Streaming / replay	No	Limited	No	Yes

For most apps: Postgres queue handles the load and saves you a service.

Tools

River (Go) — Postgres queue with great DX.
Procrastinate (Python) — uses LISTEN/NOTIFY.
GoodJob (Ruby/Rails) — same pattern.
graphile-worker (TypeScript) — robust.
Hand-rolled — what this post showed.

For Python: Procrastinate is excellent.

Common mistakes

1. No partial index

Sequential scan of “available jobs” on million-row table. Add the partial index.

2. No vacuum

Bloat accumulates; queries slow. autovacuum on, or vacuum manually.

3. Single worker

One worker = bottleneck. Multiple workers + SKIP LOCKED scale near linearly.

4. Forever-locked stuck jobs

Worker dies; lock stays. Add reaper.

5. Non-idempotent handlers

At-least-once delivery duplicates work. Make handlers idempotent.

What I’d ship today

For ≤10k jobs/sec:

Postgres-backed queue with the schema above.
SKIP LOCKED for concurrency.
LISTEN/NOTIFY for low latency.
Exponential backoff retries.
Dead-letter table for poison messages.
Reaper for stuck locks.
Procrastinate / River / graphile-worker if you want it off-the-shelf.

When you outgrow it: graduate to Kafka or NATS, but most apps never do.

Read this next

If you want my Postgres-as-queue Python implementation, it’s at rajpoot.dev .

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .

Schema#

Enqueue#

Dequeue (SKIP LOCKED)#

Process and complete#

Retries with backoff#

Visibility timeout#

LISTEN / NOTIFY for low latency#

Concurrency#

Multi-queue#

Scheduled jobs#

At-least-once semantics#

Comparison#

Tools#

Common mistakes#

1. No partial index#

2. No vacuum#

3. Single worker#

4. Forever-locked stuck jobs#

5. Non-idempotent handlers#

What I’d ship today#

Read this next#