A Postgres primary that goes down without automated failover is a 30-minute outage. With proper HA, it’s 30 seconds. This post is the working playbook for Postgres HA in 2026.
Replication shapes
Streaming (physical) replication
Block-level binary replay. Replicas are byte-identical to the primary.
- Pros: Fast, low overhead, exact copy.
- Cons: Same Postgres major version. All-or-nothing (can’t partial-replicate).
- Best for: HA failover, read replicas.
Logical replication
Decodes WAL into row-level events; subscribers apply them.
- Pros: Cross-version. Partial table sets. Feed non-Postgres consumers (Postgres CDC).
- Cons: Higher overhead. Slot management is operational work.
- Best for: Migrations, CDC pipelines.
Use both: streaming replication for HA, logical replication for CDC and data pipelines.
Auto-failover tools
Patroni
The mature standard. Uses etcd / ZooKeeper / Consul for consensus; manages primary election, replica promotion, fencing.
# patroni.yml
scope: prod-pg
namespace: /patroni
name: pg-1
restapi:
  listen: 0.0.0.0:8008
etcd:
  hosts: etcd-1:2379,etcd-2:2379,etcd-3:2379
postgresql:
  listen: 0.0.0.0:5432
  data_dir: /var/lib/postgresql/18/main
Battle-tested. Most production self-hosted Postgres clusters in 2026 run Patroni.
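Patroni's REST API (port 8008 in the config above) is what load balancers and scripts usually probe to find the current leader: `GET /primary` answers 200 only on the node that holds the leader lock. Here is a minimal sketch of that probe in Python. The node names other than `pg-1` and the helper names `http_status` and `find_primary` are made up for illustration, and a real setup would more likely let HAProxy or a similar proxy do this check.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for a GET, or 0 if the node is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx answers (e.g. 503 from a node that is not the leader) land here.
        return exc.code
    except OSError:
        # Connection refused, DNS failure, timeout: treat the node as down.
        return 0


def find_primary(
    nodes: Iterable[str],
    probe: Callable[[str], int] = http_status,
    port: int = 8008,
) -> Optional[str]:
    """Return the first node whose Patroni /primary endpoint answers 200."""
    for node in nodes:
        if probe(f"http://{node}:{port}/primary") == 200:
            return node
    return None
```

The `probe` argument is injectable so the logic can be exercised without a live cluster; in production you would call `find_primary(["pg-1", "pg-2", "pg-3"])` and let it hit the real endpoints.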
pg_auto_failover
Citus’s tool. Lighter than Patroni, simpler config. Only does failover.
Stolon
Sorint.lab’s tool. Kubernetes-friendly. Less popular in 2026.
Cloud-managed
RDS Multi-AZ, Cloud SQL HA, Crunchy Bridge — handle failover automatically. The right choice for most teams.
RPO and RTO targets
- RPO (Recovery Point Objective): how much data you can lose. Synchronous replication = 0; async streaming = seconds.
- RTO (Recovery Time Objective): how long until service resumes. Auto-failover = 30s; manual = minutes.
Synchronous replication is expensive (every write waits for the replica to ack) but RPO=0 is required for some workloads. Pick consciously:
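-- cluster-wide, in postgresql.conf (needs a reload; cannot be SET per session):
--   synchronous_standby_names = 'replica1'

-- per-transaction durability level
BEGIN;
SET LOCAL synchronous_commit = on;
-- ... writes that must survive a failover ...
COMMIT;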
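To put a rough number on "every write waits for the replica to ack", here is a back-of-envelope model in Python. It assumes the local WAL flush, the network round trip, and the replica flush happen one after another, which makes it a pessimistic bound rather than a measurement. The function names and the sample timings are illustrative assumptions, not benchmarks.

```python
def sync_commit_latency_ms(
    local_flush_ms: float, network_rtt_ms: float, replica_flush_ms: float
) -> float:
    """Pessimistic per-commit latency: every step is assumed to run serially."""
    return local_flush_ms + network_rtt_ms + replica_flush_ms


def max_serial_commits_per_sec(commit_latency_ms: float) -> float:
    """Commits per second one connection can do if it commits back to back."""
    return 1000.0 / commit_latency_ms


# Same-AZ replica: sub-millisecond flushes and about 1 ms round trip (assumed).
same_az = sync_commit_latency_ms(0.5, 1.0, 0.5)
# Cross-region replica: the round trip dominates everything else (assumed 60 ms).
cross_region = sync_commit_latency_ms(0.5, 60.0, 0.5)
```

Under these assumptions a single connection drops from hundreds of commits per second to well under twenty once the sync replica is in another region, which is why the sync replica usually stays close to the primary.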
Topology in 2026
Common production topology:
Primary (write + sync replica ack)
│
├──── Sync replica (same AZ) ── for RPO=0
│
├──── Async replica (different AZ) ── for HA
│
└──── Async replica (different region) ── for DR
Plus a logical replication slot to Kafka / search index.
Backups (don’t skip)
HA is not backup. Both can fail. Backups:
- pg_basebackup for full snapshots.
- WAL archiving for point-in-time recovery (PITR).
- Tools: pgBackRest, Barman, WAL-G — pick one.
- Test restores monthly. An untested backup is a hope.
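The base backup plus WAL archive combination above determines which points in time you can actually restore to. A small Python sketch of that check, assuming you already know when each base backup finished and how far the WAL archive extends, and that the archive has no gaps in between. The function name `pick_base_backup` is hypothetical, and tools like pgBackRest or Barman do this bookkeeping for you.

```python
from datetime import datetime
from typing import Iterable, Optional


def pick_base_backup(
    backups: Iterable[datetime],
    wal_archived_through: datetime,
    target: datetime,
) -> Optional[datetime]:
    """Return the newest base backup usable for restoring to `target`.

    A target is recoverable only if a base backup finished at or before it
    and the archived WAL reaches at least as far as the target.
    """
    if target > wal_archived_through:
        return None  # the WAL needed to roll forward has not been archived
    usable = [finished for finished in backups if finished <= target]
    return max(usable, default=None)
```

If this returns `None` for a timestamp you care about, that point in time is outside your recovery window, which is worth finding out before an incident rather than during one.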
Connection pooling
A single Postgres handles ~hundreds of concurrent connections well; thousands poorly. Use a pooler:
- PgBouncer — classic, transaction-pooling mode.
- Pgpool — heavier, more features.
- Supavisor / Hyperdrive — Postgres-flavored cloud poolers.
# pgbouncer.ini
[pgbouncer]
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 25
Apps connect to PgBouncer; PgBouncer multiplexes to the small DB pool.
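Watch out: transaction mode breaks session-scoped features: LISTEN/NOTIFY, session-level SET, session advisory locks, and prepared statements (unless you run PgBouncer 1.21+ with max_prepared_statements enabled). SET LOCAL inside a transaction still works. Plan around it.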
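One thing the config above hides is that `default_pool_size` applies per database/user pair, so the number of server connections PgBouncer may open grows with the number of pairs. A quick Python sanity check, assuming Postgres defaults of `max_connections = 100` and `superuser_reserved_connections = 3`. The helper names are invented for illustration.

```python
def server_connections(pools: int, default_pool_size: int) -> int:
    """Upper bound on server connections: one pool per (database, user) pair."""
    return pools * default_pool_size


def fits_postgres(
    pools: int,
    default_pool_size: int,
    max_connections: int = 100,
    superuser_reserved: int = 3,
) -> bool:
    """True if PgBouncer's worst case stays within what Postgres will accept."""
    available = max_connections - superuser_reserved
    return server_connections(pools, default_pool_size) <= available


# 3 (database, user) pairs at default_pool_size = 25 -> 75 server connections.
three_pools_ok = fits_postgres(pools=3, default_pool_size=25)
# A 4th pair would need 100, more than the 97 left after reserved slots.
four_pools_ok = fits_postgres(pools=4, default_pool_size=25)
```

The 10000 in `max_client_conn` never reaches Postgres; only the pool sizes do, so those are the numbers to reconcile with `max_connections`.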
Read scaling
Async replicas serve reads:
- Application reads from a connection pool that points to replicas.
- Writes always to primary.
- Reads tolerate small lag.
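For read-after-write consistency, route those reads to primary. In SQLAlchemy, a custom Session.get_bind() can do the routing; on the Postgres side, synchronous_commit = remote_apply makes a commit wait until the synchronous standbys have applied the change.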
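One common way to get read-after-write behaviour without sending every read to the primary is to pin a session to the primary for a short window after it writes. A minimal sketch of that idea in Python follows. The class name `ReadWriteRouter`, the 2-second window, and the string targets are assumptions for illustration; a real router would return connections or engines and pick the window from observed replication lag.

```python
import time
from typing import Callable, Dict, Hashable


class ReadWriteRouter:
    """Route writes to the primary and recent writers' reads there too."""

    def __init__(
        self,
        pin_seconds: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.pin_seconds = pin_seconds
        self._clock = clock
        self._last_write: Dict[Hashable, float] = {}

    def route(self, session_id: Hashable, is_write: bool) -> str:
        now = self._clock()
        if is_write:
            # Remember when this session last wrote, then go to the primary.
            self._last_write[session_id] = now
            return "primary"
        last = self._last_write.get(session_id)
        if last is not None and now - last < self.pin_seconds:
            # The replica may not have replayed this session's write yet.
            return "primary"
        return "replica"
```

Sessions that have not written recently keep reading from replicas, so the primary only absorbs the reads that actually need fresh data.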
Operational realities
- Replication lag is a metric. Alert on it.
- WAL piles up on the primary if a slot is stuck. Monitor pg_replication_slots.
- Failover testing: schedule it. A failover that’s never been tested doesn’t work.
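- Vacuum and replicas interact through long transactions: with hot_standby_feedback = on, a long-running query on a replica holds back vacuum on the primary; with it off, cleanup replayed from the primary can cancel that replica query. See PostgreSQL MVCC, Isolation, Locking.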
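For the lag and stuck-slot alerts above, the raw inputs are LSNs such as `16/B374D848`: two hex numbers, the high and low 32 bits of a WAL byte position. Postgres can subtract them server-side with `pg_wal_lsn_diff()`, but if your monitoring collects LSNs as text (for example `pg_current_wal_lsn()` on the primary and `replay_lsn` or a slot's `restart_lsn`), a small Python helper is enough. The function names and the sample LSNs here are illustrative.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN like '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica (or slot) is behind the primary, never negative."""
    return max(0, lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn))


def should_alert(primary_lsn: str, replica_lsn: str, threshold_bytes: int) -> bool:
    """True when the lag exceeds the threshold you chose for paging."""
    return lag_bytes(primary_lsn, replica_lsn) > threshold_bytes
```

Byte lag is a better alerting signal than time lag on a quiet primary, where a replica can look seconds behind simply because nothing has been written.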
Read this next
- PostgreSQL 18 Features
- PostgreSQL MVCC, Isolation, Locking
- Postgres CDC, Logical Replication, Debezium
- Distributed SQL — CockroachDB / Spanner / Yugabyte
If you want a Patroni + PgBouncer + pgBackRest reference setup, it’s at rajpoot.dev.