Cheatsheet for keeping Postgres tables healthy under SQLAlchemy workloads.
Why it matters
Postgres MVCC keeps dead row versions (tuples) until vacuum runs. Bloat = wasted disk + slower scans + index bloat.
Hot tables (high churn) need autovacuum tuned per-table.
Show table stats
SELECT
schemaname, relname,
n_live_tup, n_dead_tup,
last_vacuum, last_autovacuum,
last_analyze, last_autoanalyze,
autovacuum_count, vacuum_count
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC LIMIT 20;
n_dead_tup rising → vacuum can’t keep up.
Manual VACUUM
VACUUM; -- all tables, light
VACUUM ANALYZE users; -- specific table + stats
VACUUM (VERBOSE, ANALYZE) users;
VACUUM FULL users; -- rewrites table; AccessExclusive lock!
Rarely run VACUUM FULL in production — locks table. Use pg_repack instead.
ANALYZE
ANALYZE;
ANALYZE users;
ANALYZE users (email, created_at);
Updates planner statistics. Run after big data changes (large COPY, backfill).
Tune autovacuum per table
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.05, -- vacuum after 5% dead tuples
autovacuum_analyze_scale_factor = 0.025,
autovacuum_vacuum_cost_limit = 2000,
autovacuum_vacuum_cost_delay = 10,
autovacuum_naptime = 30
);
Defaults (0.2 / 0.1) are too lax for hot tables.
Find bloat (pgstattuple)
CREATE EXTENSION pgstattuple;
SELECT * FROM pgstattuple('users');
SELECT * FROM pgstatindex('ix_users_email');
Or use Postgres-bloat-query (publicly available SQL):
-- Bloat estimate per table (Google "postgres bloat query")
REINDEX
REINDEX INDEX CONCURRENTLY ix_users_email;
REINDEX TABLE CONCURRENTLY users;
REINDEX DATABASE mydb; -- avoid in prod; locks
CONCURRENTLY is online. For very large indexes: significant time + temp space.
pg_repack (online table rewrite)
pg_repack -h prod -U app -d mydb -t huge_table
Rebuilds table + indexes online. Reclaims bloat. Needs ~2× disk temporarily.
For changing column types / adding partitioning to existing tables: pg_repack helps.
VACUUM FREEZE (transaction ID wraparound)
VACUUM FREEZE;
Required to prevent transaction ID wraparound. Autovacuum does this; for huge tables it can be slow → tune.
Wraparound stops writes; emergency. Watch:
SELECT datname, age(datfrozenxid) FROM pg_database;
If age approaches 2 billion → emergency.
Set per-table for hot tables
Hot tables (lots of UPDATE / DELETE) need more aggressive vacuum:
__table_args__ = (
Index(...),
{
"postgresql_with": {
"autovacuum_vacuum_scale_factor": 0.05,
"autovacuum_analyze_scale_factor": 0.025,
"fillfactor": 80, # leave space for HOT updates
},
},
)
fillfactor
For tables with frequent UPDATEs on indexed columns: leave space in page for HOT updates.
ALTER TABLE users SET (fillfactor = 80);
80% = 20% room. Reduces page splits.
TOAST
Large values (>2KB) stored out-of-line (TOAST). Doesn’t usually need tuning.
Index size monitoring
SELECT
schemaname, tablename, indexname,
pg_size_pretty(pg_relation_size(indexrelid)) AS size,
idx_scan
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
Bloated unused indexes → drop.
Table size
SELECT
schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS total_size,
pg_size_pretty(pg_relation_size(schemaname||'.'||tablename)) AS table_size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
Total includes indexes + TOAST.
Connection / activity
SELECT pid, usename, application_name, state, query, age(now(), query_start) AS dur
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY query_start LIMIT 20;
Hot queries; long-running transactions.
Kill query
SELECT pg_cancel_backend(pid); -- graceful
SELECT pg_terminate_backend(pid); -- force
For runaway queries.
Common mistakes
- Default autovacuum on hot tables — bloat accumulates.
VACUUM FULLin production — table-wide AccessExclusive lock.- No fillfactor tuning on update-heavy tables.
- Ignoring
n_dead_tupgrowth — bloat invisible until perf tanks. - No transaction ID wraparound monitoring.
Read this next
If you want my autovacuum tuning + bloat-detection scripts, it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .