Why can't I just rename the column?

During deploy, old code runs against the new schema. Old code reads the old column; the column doesn't exist; errors. You need a multi-step migration where old and new code can both work.

How long does a zero-downtime migration take?

Days to weeks for big tables. Each step is its own deploy. The migration is slower than a direct change but never blocks. For small tables, the steps are cheap; you can run several in a day.

Zero-Downtime Database Migrations in 2026 — The Expand-and-Contract Playbook

A database migration that takes the site down for 30 minutes is unacceptable in 2026. Worse, schema changes during a deploy can fail subtly when old and new code race. The expand-and-contract pattern eliminates both. This post is the full playbook.

The principle

Old code and new code coexist during deploy. Migrations must be safe with both versions running simultaneously. The pattern:

Expand: add the new shape (nullable, optional).
Backfill: make old data fit the new shape.
Switch reads: code reads the new shape.
Switch writes: code writes the new shape only.
Contract: remove the old shape.

Each step is its own deploy. None blocks.

Adding a column

-- Step 1 (deploy 1): add nullable, with default
ALTER TABLE users ADD COLUMN avatar_url TEXT;

Old code ignores it. New code starts populating.

-- Step 2 (background job, optional):
UPDATE users SET avatar_url = generate_default_avatar(id)
WHERE avatar_url IS NULL;

-- Step 3 (deploy 2): make NOT NULL once backfill complete
ALTER TABLE users ALTER COLUMN avatar_url SET NOT NULL;

Easy case. The backfill avoids row-locks if done in batches.

Renaming a column

The classic painful case. Don’t rename in place. Multi-step:

Step 1 (deploy 1): add new column

ALTER TABLE users ADD COLUMN full_name TEXT;

Step 2 (deploy 2): write to both

# code
db.execute(
    "UPDATE users SET name = $1, full_name = $1 WHERE id = $2",
    name, user_id,
)

Old reads still see name. New reads see full_name.

Step 3: backfill

UPDATE users SET full_name = name WHERE full_name IS NULL;

In batches for big tables.

Step 4 (deploy 3): switch reads

return db.fetchval("SELECT full_name FROM users WHERE id = $1", user_id)

Both columns still written. Reads from new.

Step 5 (deploy 4): stop writing old

db.execute("UPDATE users SET full_name = $1 WHERE id = $2", name, user_id)

Step 6 (deploy 5): drop old

ALTER TABLE users DROP COLUMN name;

Five deploys. Always safe; never blocks. For small low-traffic tables you can collapse some steps; for high-traffic, follow the full sequence.

Dropping a column

Reverse of adding:

Stop reading the column (deploy 1).
Stop writing the column (deploy 2).
Drop after a release cycle of stability (deploy 3).

The gap between steps 2 and 3 is your safety net for rollback.

Changing a column type

Treat as add-rename-drop:

Add new column with new type.
Write to both during transition.
Backfill.
Switch reads.
Drop old.

For type changes that fit existing data (e.g., VARCHAR(50) → VARCHAR(100)), Postgres handles in place without rewrite. For changes that need rewrite (TEXT → INTEGER), use the full pattern.

Adding an index

CREATE INDEX CONCURRENTLY idx_users_email ON users (email);

CONCURRENTLY doesn’t take an exclusive lock. Builds in the background. Does NOT abort if a transaction conflicts; partially-built indexes need cleanup.

Adding a NOT NULL column

ALTER TABLE users ADD COLUMN email TEXT NOT NULL DEFAULT '';   -- Postgres 11+ instant

Postgres 11+: ADD COLUMN ... NOT NULL DEFAULT '...' is instant; doesn’t rewrite the table. Older Postgres requires the multi-step.

For deeper Postgres specifics see PostgreSQL 18 Features and Postgres VACUUM and Bloat .

Foreign keys

-- Step 1: add as NOT VALID (no full-table check)
ALTER TABLE orders ADD CONSTRAINT fk_user
  FOREIGN KEY (user_id) REFERENCES users(id) NOT VALID;

-- Step 2 (later): validate without exclusive lock
ALTER TABLE orders VALIDATE CONSTRAINT fk_user;

NOT VALID enforces for new rows; doesn’t lock to validate old rows. VALIDATE does the check without an exclusive lock.

Tools

Alembic (Python): write migrations as code; auto-generate from SQLAlchemy.
dbmate: language-agnostic; SQL files.
Atlas (Ariga): declarative schema; computes diff.
Drizzle Kit (TypeScript): same idea for Drizzle.

For dev: use the tool’s auto-generation. For prod: review every migration; sometimes auto-gen produces unsafe operations (e.g., locking).

Safety checklist

Before running any migration on prod:

Step is online-safe (no exclusive lock).
Backfill is batched; doesn’t lock huge swaths.
Old code and new code can both run during transition.
Rollback plan exists.
Tested on staging with realistic data volume.
Monitoring during the run.

What I’d ship today

For a new app:

Alembic / Drizzle Kit / Atlas generating migrations.
Code review every migration; reject any with locking risk.
Multi-step pattern for renames, drops, type changes.
Backfill jobs that batch + can resume.
Always CONCURRENTLY for indexes.

Boring. Safe. Ships forever.

Read this next

If you want my migration runbook + Alembic templates, it’s at rajpoot.dev .

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .

The principle#

Adding a column#

Renaming a column#

Step 1 (deploy 1): add new column#

Step 2 (deploy 2): write to both#

Step 3: backfill#

Step 4 (deploy 3): switch reads#

Step 5 (deploy 4): stop writing old#

Step 6 (deploy 5): drop old#

Dropping a column#

Changing a column type#

Adding an index#

Adding a NOT NULL column#

Foreign keys#

Tools#

Safety checklist#

What I’d ship today#

Read this next#