A backup that’s never been restored is a hope. Postgres backup is one of those things you do right, ignore for years, then thank yourself for during the actual incident. This post is the working playbook.

What you need

  • Daily full backup to durable storage.
  • Continuous WAL archiving for point-in-time recovery (PITR).
  • Cross-region copy for disaster recovery.
  • Encryption at rest.
  • Monthly restore tests.

Tools

Tool            Strengths
pgBackRest      Most mature; parallel; encryption; cross-cloud
Barman          Mature; common in Italian / European deployments
WAL-G           S3-native; Go-based; simple
pg_basebackup   Built-in; basic; not for production at scale

For 2026 self-hosted: pgBackRest is the default.

pgBackRest setup

# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-type=s3
repo1-s3-endpoint=s3.amazonaws.com
repo1-s3-bucket=my-pg-backups
repo1-s3-key=...
repo1-s3-key-secret=...
repo1-s3-region=us-east-1
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=...
repo1-retention-full=14
repo1-retention-diff=7

start-fast=y
log-level-console=info
log-level-file=detail
process-max=4
compress-type=zst
compress-level=3

[main]
pg1-path=/var/lib/postgresql/18/main

Then run the backups from the shell:

pgbackrest --stanza=main backup --type=full
pgbackrest --stanza=main backup --type=diff
pgbackrest --stanza=main info

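Run it as a daily full + hourly diff + continuous WAL, all AES-encrypted. repo1-retention-full=14 counts full backups, so one full per day gives about 14 days of retention. repo1-retention-diff=7 counts diffs the same way: with hourly diffs only the latest seven are kept, so raise it if you want more diff restore points.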
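One way to drive that schedule is a single hourly cron job that picks the backup type itself. This is a minimal sketch, assuming the daily full runs at 02:00; FULL_HOUR and both function names are made up for illustration and are not part of pgBackRest.

```shell
# Sketch: choose the backup type from the current hour.
# Assumption (not from the post): the daily full runs at 02:00.
FULL_HOUR="${FULL_HOUR:-02}"

backup_type_for_hour() {
    # $1 is a two-digit hour such as "02" or "14"
    if [ "$1" = "$FULL_HOUR" ]; then
        echo full
    else
        echo diff
    fi
}

run_scheduled_backup() {
    local type
    type="$(backup_type_for_hour "$(date +%H)")"
    pgbackrest --stanza=main backup --type="$type"
}
```

Call run_scheduled_backup at the end of a script that cron runs once an hour as the postgres user.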

WAL archiving

In postgresql.conf:

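archive_mode = on          # changing this needs a restart
archive_command = 'pgbackrest --stanza=main archive-push %p'
archive_timeout = 60       # switch segments at least once a minute when there is new WAL

Every WAL segment is archived as soon as it fills (16 MB by default) or archive_timeout forces a switch. Without this, PITR doesn’t work.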
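Archiving is worth verifying before you rely on it. The sketch below runs pgBackRest's stanza-create and check commands once, then polls the standard pg_stat_archiver view. The function names and the MAX_AGE_SECONDS default are hypothetical, and it assumes psql can connect locally without extra options.

```shell
# Sketch: confirm WAL reaches the repository, then watch for a stalled archiver.

verify_archiving() {
    # One-time: create the stanza, then push a test segment through
    # archive_command and confirm it arrives in the repository.
    pgbackrest --stanza=main stanza-create &&
    pgbackrest --stanza=main check
}

archive_age_ok() {
    # $1 = seconds since the last successful archive, $2 = allowed maximum
    [ "$1" -le "$2" ]
}

check_archiver() {
    local age
    # last_archived_time is NULL until the first segment is archived,
    # so fall back to a large number in that case.
    age="$(psql -XAtc "SELECT COALESCE(floor(extract(epoch FROM now() - last_archived_time)), 999999)::bigint FROM pg_stat_archiver")" || return 2
    archive_age_ok "$age" "${MAX_AGE_SECONDS:-300}"
}
```

On an idle database with no archive_timeout set, last_archived_time can be legitimately old, so pick the threshold to match your own write pattern.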

PITR

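# Restore to a point in time (stop Postgres first)
pgbackrest --stanza=main \
    --type=time \
    --target="2026-04-30 14:30:00+00" \
    restore

Restores the most recent backup taken before the target time, then replays WAL up to it. Give the target a numeric offset (+00 for UTC) rather than a zone name, so pgBackRest can parse it when choosing the backup. Useful when: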

  • A bad migration ran at 14:32; restore to 14:30.
  • A user deleted production data.
  • Corruption discovered after-the-fact.
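End to end, a PITR run also involves stopping the cluster and starting it again afterwards. This is a rough sketch, assuming the cluster is managed with pg_ctl and that PGDATA matches pg1-path; the function name is illustrative. --delta and --target-action=promote are pgBackRest restore options.

```shell
# Sketch of a full PITR run, not a definitive runbook.
PGDATA="${PGDATA:-/var/lib/postgresql/18/main}"

pitr_restore() {
    local target="$1"   # e.g. "2026-04-30 14:30:00+00"

    # Postgres must not be running while the data directory is rewritten.
    if pg_ctl -D "$PGDATA" status >/dev/null 2>&1; then
        pg_ctl -D "$PGDATA" stop -m fast || return 1
    fi

    # --delta reuses unchanged files instead of requiring an empty directory;
    # --target-action=promote ends recovery at the target instead of pausing.
    pgbackrest --stanza=main --delta \
        --type=time --target="$target" --target-action=promote \
        restore || return 1

    pg_ctl -D "$PGDATA" start
}
```

Usage: pitr_restore "2026-04-30 14:30:00+00". On Debian-style installs you would likely swap pg_ctl for pg_ctlcluster or systemctl.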

Cross-region copy

Don’t keep backups in only the region of the primary:

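repo2-type=s3
repo2-s3-bucket=my-pg-backups-eu
repo2-s3-region=eu-west-1
# plus repo2-s3-endpoint, repo2-s3-key, repo2-s3-key-secret and repo2-cipher-* as for repo1

A regional outage that takes down the primary and its only backups at the same time leaves nothing to restore from. Always keep a cross-region copy.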
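With two repositories configured, archive-push sends WAL to both, but as far as I know each backup run targets a single repository (repo1 unless you pass --repo). A sketch that takes the same backup into both, assuming repo1 and repo2 are defined in pgbackrest.conf; the function name is illustrative.

```shell
# Sketch: run one backup per configured repository.
backup_all_repos() {
    local type="$1" repo rc=0
    for repo in 1 2; do
        # Keep going if one region is unreachable, but report the failure.
        pgbackrest --stanza=main --repo="$repo" backup --type="$type" || rc=1
    done
    return "$rc"
}
```

Usage: backup_all_repos full. Check the result per repository with pgbackrest --stanza=main --repo=2 info.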

Encryption

Backups contain ALL your data. Encrypt at rest:

  • Set repo1-cipher-type=aes-256-cbc with a strong passphrase in repo1-cipher-pass.
  • Store the passphrase separately from the backups (KMS / Vault).
  • Rotate it annually.
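To keep the passphrase out of pgbackrest.conf, one option is to pass it through the environment: pgBackRest reads options from PGBACKREST_-prefixed variables, so repo1-cipher-pass becomes PGBACKREST_REPO1_CIPHER_PASS. In this sketch fetch_secret is a placeholder for whatever KMS or Vault client you use, and the secret path is made up.

```shell
# Sketch: generate a passphrase once, then inject it at run time.

generate_passphrase() {
    # 48 random bytes, base64-encoded to a 64-character string
    head -c 48 /dev/urandom | base64 | tr -d '\n'
}

with_repo_passphrase() {
    local pass
    # fetch_secret is hypothetical: replace it with your secret store's CLI.
    pass="$(fetch_secret pgbackrest/repo1-cipher-pass)" || return 1
    PGBACKREST_REPO1_CIPHER_PASS="$pass" "$@"
}
```

Usage: with_repo_passphrase pgbackrest --stanza=main backup --type=diff. The same wrapper has to surround archive_command and restores, since they need the passphrase too.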

Restore testing

# Standalone test cluster (archiving off so it never pushes WAL into the production repo)
pgbackrest --stanza=main --pg1-path=/test/pg --archive-mode=off restore
pg_ctl -D /test/pg start -o "-p 5433"

# Verify
psql -p 5433 -c "SELECT count(*) FROM users;"

Schedule monthly. Compare counts to expected. Don’t just verify “the file came back” — verify the data is queryable.
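The drill is easy to script so that it fails loudly. This is a sketch under a few assumptions: the test host can read the repository, psql connects with default credentials, and the minimum row count is a number you supply. The variable and function names are illustrative.

```shell
# Sketch: restore into a scratch directory, query it, and compare the count.
TEST_PGDATA="${TEST_PGDATA:-/test/pg}"
TEST_PORT="${TEST_PORT:-5433}"

rows_at_least() {
    # $1 = actual row count, $2 = minimum you expect
    [ "$1" -ge "$2" ]
}

restore_drill() {
    local min_rows="$1" n rc
    pgbackrest --stanza=main --pg1-path="$TEST_PGDATA" --archive-mode=off restore || return 1
    # -w waits until the server accepts connections
    pg_ctl -D "$TEST_PGDATA" -o "-p $TEST_PORT" -w start || return 1
    n="$(psql -XAt -p "$TEST_PORT" -c "SELECT count(*) FROM users")"
    rc=$?
    pg_ctl -D "$TEST_PGDATA" -m fast stop
    [ "$rc" -eq 0 ] && rows_at_least "$n" "$min_rows"
}
```

Usage: restore_drill 100000 || echo "restore drill FAILED". Run it from cron or CI and alert on a non-zero exit.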

For chaos engineering, include “kill the primary; restore from backup” as a quarterly drill.

RPO / RTO

  • RPO (Recovery Point Objective): how much data can you lose? With WAL archiving and archive_timeout = 60 (a segment switch at least once a minute): ~1 minute.
  • RTO (Recovery Time Objective): how fast must you restore? Depends on data size:
    • 100 GB: ~30 min.
    • 1 TB: ~2 hours.
    • 10 TB: ~12+ hours.
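Those figures imply a sustained restore throughput of roughly 57, 146 and 243 MB/s respectively (taking 1 GB as 1024 MB), so treat them as ballpark numbers and measure your own. A back-of-the-envelope helper for the file-copy part of a restore; the function name is illustrative, and it ignores WAL replay time.

```shell
# Sketch: minutes to copy a database of a given size at a given throughput.
estimate_restore_minutes() {
    # $1 = database size in GB, $2 = sustained restore throughput in MB/s
    local seconds=$(( $1 * 1024 / $2 ))
    echo $(( (seconds + 59) / 60 ))   # round up to whole minutes
}
```

For example, estimate_restore_minutes 100 60 prints 29, and estimate_restore_minutes 1024 150 prints 117.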

For tighter RTO: use streaming replication + auto-failover. Backups are for the case where the primary AND the replica are gone, or where a bad change has already replicated.

Managed services

RDS, Cloud SQL, Crunchy Bridge — all do automated backups. They’re good. Verify:

  • Retention period (default may be too short).
  • Cross-region copies.
  • PITR is enabled.
  • Restore actually works (test it).
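For RDS specifically, the retention check can be scripted with the AWS CLI. This is a sketch, assuming the CLI is installed and has credentials; the function names and the instance identifier are hypothetical. A BackupRetentionPeriod of 0 means automated backups, and with them PITR, are switched off. Cloud SQL and Crunchy Bridge need their own tooling.

```shell
# Sketch: read an RDS instance's automated-backup retention in days.
rds_retention_days() {
    aws rds describe-db-instances \
        --db-instance-identifier "$1" \
        --query 'DBInstances[0].BackupRetentionPeriod' \
        --output text
}

rds_retention_ok() {
    # $1 = instance identifier, $2 = minimum acceptable retention in days
    local days
    days="$(rds_retention_days "$1")" || return 2
    [ "$days" -ge "$2" ]
}
```

Usage: rds_retention_ok my-prod-db 14 || echo "retention too short".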

For most teams, managed is enough. Run your own backups only when you self-host Postgres.

Logical backups

pg_dump produces a logical backup. Useful for:

  • Per-table or per-schema export.
  • Cross-version migration.
  • Sending data to a customer.

Not for primary disaster recovery — restore is slow at scale.
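A per-table export and reload might look like the sketch below. The wrapper names, database names and file names are illustrative; the pg_dump and pg_restore options are standard.

```shell
# Sketch: dump one table in custom format, then restore it in parallel.
dump_table() {
    # $1 = source database, $2 = schema-qualified table, $3 = output file
    pg_dump --format=custom --dbname="$1" --table="$2" --file="$3"
}

restore_dump() {
    # $1 = target database, $2 = dump file; --jobs restores with 4 workers
    pg_restore --dbname="$1" --jobs=4 "$2"
}
```

Usage: dump_table appdb public.users users.dump, then restore_dump scratchdb users.dump. The custom format is what lets pg_restore work in parallel and restore selectively.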

Common mistakes

1. No WAL archiving

PITR doesn’t work. Best you can do is restore yesterday’s full backup. Lose a day of data.

2. Backups on same disk as primary

Disk fails; both gone. Always remote.

3. No encryption

Backup S3 bucket leaks; data leak.

4. Untested restores

The backup ran for 3 years; nobody verified. Day of incident: corruption in the backup itself. Test.

5. No cross-region

Region down; primary down; backup unreachable. Cross-region everywhere.

What I’d ship today

For self-hosted Postgres in 2026:

  1. pgBackRest to S3.
  2. WAL archiving with archive_timeout = 60.
  3. Daily full + hourly diff, 14-day retention.
  4. Cross-region copy.
  5. AES-256 encryption.
  6. Monthly restore drill as a calendar event.

For managed: trust it; verify quarterly.

Read this next

If you want my pgBackRest config + restore drill runbook, it’s at rajpoot.dev.

