A backup that’s never been restored is a hope. Postgres backup is one of those things you do right, ignore for years, then thank yourself for during the actual incident. This post is the working playbook.
What you need
- Daily full backup to durable storage.
- Continuous WAL archiving for point-in-time recovery (PITR).
- Cross-region copy for disaster recovery.
- Encryption at rest.
- Monthly restore tests.
Tools
| Tool | Strengths |
|---|---|
| pgBackRest | Most mature; parallel; encryption; cross-cloud |
| Barman | Mature; common in Italian / European deployments |
| WAL-G | S3-native; Go-based; simple |
| pg_basebackup | Built-in; basic; not for production at scale |
For 2026 self-hosted: pgBackRest is the default.
pgBackRest setup
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-type=s3
repo1-s3-endpoint=s3.amazonaws.com
repo1-s3-bucket=my-pg-backups
repo1-s3-key=...
repo1-s3-key-secret=...
repo1-s3-region=us-east-1
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=...
repo1-retention-full=14
repo1-retention-diff=7
start-fast=y
log-level-console=info
log-level-file=detail
process-max=4
compress-type=zst
compress-level=3
[main]
pg1-path=/var/lib/postgresql/18/main
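# One-time: initialize the stanza in the repository
pgbackrest --stanza=main stanza-create
# Full and differential backups, then list what the repository holds
pgbackrest --stanza=main backup --type=full
pgbackrest --stanza=main backup --type=diff
pgbackrest --stanza=main info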
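Daily full + hourly diff + continuous WAL, provided a scheduler (cron or a systemd timer) runs the backup commands at those intervals; the config file does not schedule anything itself. repo1-retention-full=14 keeps 14 full backups, which is 14 days at one full per day. The repository is AES-256 encrypted.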
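Something still has to fire those commands on a timer. Here is a minimal Python sketch of a wrapper you could call once an hour from cron or a systemd timer. The 02:00 cut-over for the full backup and the function names are my assumptions, not anything pgBackRest prescribes.

```python
import subprocess
from datetime import datetime

FULL_BACKUP_HOUR = 2  # assumption: take the daily full at 02:00 local time


def backup_type_for(now: datetime) -> str:
    """Return 'full' during the nightly window and 'diff' for every other hour."""
    return "full" if now.hour == FULL_BACKUP_HOUR else "diff"


def backup_command(stanza: str, now: datetime) -> list[str]:
    """Build the pgbackrest invocation for the given hour."""
    return [
        "pgbackrest",
        f"--stanza={stanza}",
        "backup",
        f"--type={backup_type_for(now)}",
    ]


def run_backup(stanza: str = "main") -> None:
    """Run the backup; raises CalledProcessError on a non-zero exit so the scheduler can alert."""
    subprocess.run(backup_command(stanza, datetime.now()), check=True)
```

Call `run_backup()` from the scheduled job. Letting the non-zero exit propagate matters more than the schedule itself: a backup job that fails silently is the same as no backup job.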
WAL archiving
In postgresql.conf:
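archive_mode = on
archive_command = 'pgbackrest --stanza=main archive-push %p'
archive_timeout = 60    # force a segment switch at least once a minute
Changing archive_mode needs a restart. Each WAL segment is archived once it fills (16 MB by default), or sooner when archive_timeout forces a switch. Without archiving, PITR doesn’t work.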
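Archiving tends to fail quietly: if archive_command starts erroring, Postgres keeps the WAL in pg_wal and carries on until the disk fills. The sketch below evaluates two columns from the pg_stat_archiver view (last_archived_time and last_failed_time) and reports a stalled archiver. The 5-minute threshold and the function name are assumptions; fetching the row is left to whatever driver you use.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def archiver_problem(
    last_archived_time: Optional[datetime],
    last_failed_time: Optional[datetime],
    now: datetime,
    max_lag: timedelta = timedelta(minutes=5),  # assumption: tune to your RPO
) -> Optional[str]:
    """Return a description of the problem, or None if archiving looks healthy."""
    if last_archived_time is None:
        return "no WAL segment has ever been archived"
    if last_failed_time is not None and last_failed_time > last_archived_time:
        return "the most recent archive attempt failed"
    lag = now - last_archived_time
    if lag > max_lag:
        return f"last successful archive was {lag} ago"
    return None
```

On an idle database with no archive_timeout set, nothing gets archived because nothing is written, so the lag check can fire falsely there.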
PITR
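# Restore to a point in time. Stop Postgres first; the data directory must be
# empty unless you add --delta. Use a numeric UTC offset (+00) in the target.
pgbackrest --stanza=main \
--type=time \
--target="2026-04-30 14:30:00+00" \
restore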
Restores the most recent backup taken before the target time, then replays archived WAL up to the target. Useful when:
- A bad migration ran at 14:32; restore to 14:30.
- A user deleted production data.
- Corruption discovered after-the-fact.
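The target timestamp is the easiest thing to get wrong mid-incident, usually by mixing local time and UTC. A small sketch that turns a timezone-aware datetime into a string with an explicit numeric offset and builds the restore command around it. The helper names are hypothetical; the flags are the ones used in the command above.

```python
from datetime import datetime, timedelta, timezone


def pitr_target(ts: datetime) -> str:
    """Format an aware datetime as 'YYYY-MM-DD HH:MM:SS+00', always in UTC."""
    if ts.tzinfo is None:
        raise ValueError("refusing a naive datetime: the time zone must be explicit")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S") + "+00"


def pitr_restore_command(stanza: str, ts: datetime) -> list[str]:
    """Build a time-targeted pgbackrest restore command."""
    return [
        "pgbackrest",
        f"--stanza={stanza}",
        "--type=time",
        f"--target={pitr_target(ts)}",
        "restore",
    ]
```

For the bad-migration case, pass the incident time minus a small margin, e.g. `pitr_restore_command("main", incident_time - timedelta(minutes=2))`.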
Cross-region copy
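Don’t keep backups only in the primary’s region. Add a second repository in another region:
repo2-type=s3
repo2-s3-endpoint=s3.eu-west-1.amazonaws.com
repo2-s3-bucket=my-pg-backups-eu
repo2-s3-region=eu-west-1
# repo2-s3-key, repo2-s3-key-secret and repo2-cipher-* are set the same way as for repo1
A regional outage that takes out both the primary and its only backups leaves nothing to restore from. Always keep a cross-region copy.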
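One detail to check against your pgBackRest version: as I understand the multi-repository behaviour, archive-push writes WAL to every configured repository, but a backup run targets a single repository selected with --repo. So the second region only gets base backups if you run one for it. A sketch that builds one backup command per repository (the function name is hypothetical):

```python
VALID_TYPES = {"full", "diff", "incr"}


def per_repo_backup_commands(
    stanza: str, repos: list[int], backup_type: str
) -> list[list[str]]:
    """Return one pgbackrest backup command per repository index."""
    if backup_type not in VALID_TYPES:
        raise ValueError(f"unknown backup type: {backup_type}")
    return [
        [
            "pgbackrest",
            f"--stanza={stanza}",
            f"--repo={repo}",
            "backup",
            f"--type={backup_type}",
        ]
        for repo in repos
    ]
```

Run the commands one after another and alert on either failure. Afterwards `pgbackrest --stanza=main info` should list backups under both repositories.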
Encryption
Backups contain ALL your data. Encrypt at rest:
- pgBackRest’s repo1-cipher-type=aes-256-cbc + a strong passphrase.
- Passphrase stored separately from the backups (KMS / Vault).
- Rotated annually.
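“Strong” here should mean random, not memorable. A minimal sketch that generates a passphrase from the OS CSPRNG, roughly what `openssl rand -base64 48` gives you. The function name and the 48-byte default are my choices.

```python
import base64
import secrets


def new_cipher_passphrase(num_bytes: int = 48) -> str:
    """Return num_bytes of CSPRNG output, base64-encoded (64 characters for 48 bytes)."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

Write the result straight into KMS / Vault rather than a shell history or a ticket, and have the deploy inject it as repo1-cipher-pass.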
Restore testing
# Standalone test cluster; --archive-mode=off stops it pushing WAL to the real repo
pgbackrest --stanza=main --pg1-path=/test/pg --archive-mode=off restore
pg_ctl -D /test/pg start -o "-p 5433"
# Verify
psql -p 5433 -c "SELECT count(*) FROM users;"
Schedule monthly. Compare counts to expected. Don’t just verify “the file came back” — verify the data is queryable.
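“Compare counts to expected” is worth automating so the drill produces a pass/fail rather than a number someone eyeballs. A sketch under these assumptions: the restored cluster listens on port 5433 as above, table names come from a trusted list you maintain, and the expected minimums are values you record yourself. Both function names are hypothetical.

```python
import subprocess


def fetch_count(table: str, port: int = 5433) -> int:
    """Run SELECT count(*) through psql against the restored test cluster.

    The table name is interpolated directly, so only pass names you control.
    """
    result = subprocess.run(
        ["psql", "-p", str(port), "-At", "-c", f"SELECT count(*) FROM {table};"],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def count_mismatches(expected_min: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Return one message per table that is missing or below its expected minimum."""
    problems = []
    for table, minimum in expected_min.items():
        got = actual.get(table)
        if got is None:
            problems.append(f"{table}: missing from restore")
        elif got < minimum:
            problems.append(f"{table}: {got} rows, expected at least {minimum}")
    return problems
```

A drill then becomes: restore, collect `{t: fetch_count(t) for t in expected_min}`, and fail the job if `count_mismatches` returns anything.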
For chaos engineering, include “kill the primary; restore from backup” as a quarterly drill.
RPO / RTO
- RPO (Recovery Point Objective): how much data can you lose? With WAL archiving and archive_timeout = 60: ~1 minute.
- RTO (Recovery Time Objective): how fast must you restore? Depends on data size and restore throughput. Rough figures:
  - 100 GB: ~30 min.
  - 1 TB: ~2 hours.
  - 10 TB: ~12+ hours.
For tighter RTO: use streaming replication + auto-failover. Backups are for the case the primary AND replica are gone.
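The RTO figures above are really throughput assumptions in disguise: worked backwards, they imply roughly 56, 139, and 231 MB/s of sustained restore speed. A sketch of the arithmetic, using decimal units (1 GB = 1000 MB), so you can plug in the throughput you measured in your own restore drill. The function names and the WAL-replay allowance are assumptions.

```python
def implied_throughput_mb_s(size_gb: float, minutes: float) -> float:
    """Sustained MB/s needed to restore size_gb in the given number of minutes."""
    return size_gb * 1000 / (minutes * 60)


def estimate_restore_minutes(
    size_gb: float,
    throughput_mb_s: float,
    wal_replay_minutes: float = 0.0,  # assumption: measured separately in a drill
) -> float:
    """Estimate wall-clock restore time: copy time plus WAL replay time."""
    return size_gb * 1000 / throughput_mb_s / 60 + wal_replay_minutes
```

For example, `estimate_restore_minutes(1000, 200, 15)` comes to a bit over 98 minutes: about 83 to copy 1 TB at 200 MB/s plus 15 of WAL replay. The measured number from a drill beats any estimate.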
Managed services
RDS, Cloud SQL, Crunchy Bridge — all do automated backups. They’re good. Verify:
- Retention period (default may be too short).
- Cross-region copies.
- PITR is enabled.
- Restore actually works (test it).
For most teams: managed is enough. Self-hosted backups when you self-host Postgres.
Logical backups
pg_dump produces a logical backup. Useful for:
- Per-table or per-schema export.
- Cross-version migration.
- Sending data to a customer.
Not for primary disaster recovery — restore is slow at scale.
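For the per-table and per-schema cases, a sketch that builds a pg_dump command in custom format. Custom-format dumps can be restored selectively, and in parallel with `pg_restore --jobs`. The function name and argument layout are hypothetical; the flags are standard pg_dump options.

```python
from typing import Sequence


def pg_dump_command(
    dbname: str,
    outfile: str,
    tables: Sequence[str] = (),
    schemas: Sequence[str] = (),
) -> list[str]:
    """Build a pg_dump invocation limited to the given tables and/or schemas."""
    cmd = ["pg_dump", "--format=custom", f"--file={outfile}"]
    cmd += [f"--table={t}" for t in tables]
    cmd += [f"--schema={s}" for s in schemas]
    cmd.append(dbname)  # database name goes last as the positional argument
    return cmd
```

Run the result with `subprocess.run(cmd, check=True)`. With no tables or schemas given it dumps the whole database.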
Common mistakes
1. No WAL archiving
PITR doesn’t work. Best you can do is restore yesterday’s full backup. Lose a day of data.
2. Backups on same disk as primary
Disk fails; both gone. Always remote.
3. No encryption
Backup S3 bucket leaks; data leak.
4. Untested restores
The backup ran for 3 years; nobody verified. Day of incident: corruption in the backup itself. Test.
5. No cross-region
Region down; primary down; backup unreachable. Cross-region everywhere.
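Several of these mistakes are visible in pgbackrest.conf itself, so they can be caught by a lint step before an incident does it for you. A minimal sketch that parses the file as INI and flags missing encryption, a local-only repository, a missing or same-region second repository, and unset retention. The checks and messages are my own. WAL archiving and restore testing live outside this file, so they are not covered.

```python
import configparser


def audit_pgbackrest_conf(text: str) -> list[str]:
    """Return a list of findings for the [global] section of a pgbackrest.conf."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    g = parser["global"] if parser.has_section("global") else {}

    findings = []
    if g.get("repo1-cipher-type", "none") == "none":
        findings.append("repo1 is not encrypted")
    # pgBackRest's default repo type is posix, i.e. a filesystem path or mount
    if g.get("repo1-type", "posix") == "posix":
        findings.append("repo1 is a local path, not remote storage")
    if "repo2-type" not in g:
        findings.append("no second repository configured")
    elif g.get("repo2-s3-region") == g.get("repo1-s3-region"):
        findings.append("repo2 is in the same region as repo1")
    if "repo1-retention-full" not in g:
        findings.append("no full-backup retention set")
    return findings
```

Feed it the file contents, e.g. `audit_pgbackrest_conf(open("/etc/pgbackrest/pgbackrest.conf").read())`, and fail CI or a config-management run when the list is non-empty.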
What I’d ship today
For self-hosted Postgres in 2026:
- pgBackRest to S3.
- Continuous WAL archiving, with archive_timeout = 60 so a segment ships at least every minute.
- Daily full + hourly diff, 14-day retention.
- Cross-region copy.
- AES-256 encryption.
- Monthly restore drill as a calendar event.
For managed: trust it; verify quarterly.
If you want my pgBackRest config + restore drill runbook, it’s at rajpoot.dev.