Scenario
A 30-second network partition between the primary and standby causes the automatic failover system (Patroni/repmgr) to promote the standby. When the network recovers, both nodes are now accepting writes — the original primary continued accepting writes during the partition, and the newly promoted standby also accepted writes. This is a split-brain scenario. Data written to each node is now divergent.
How to Identify
Conditions:
- Both servers return pg_is_in_recovery() = false
- pg_current_wal_lsn() is different on both nodes after the partition
- WAL timeline on the new primary is higher than the old primary’s timeline
- Application writes went to different nodes simultaneously
- Fencing (STONITH) was not in place or failed
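The first conditions above can be checked mechanically once the status of each node has been collected. The following is a minimal sketch, assuming the values have already been read from each server; `NodeStatus` and `detect_split_brain` are hypothetical names introduced here for illustration, not part of PostgreSQL, Patroni, or repmgr.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    name: str
    in_recovery: bool  # result of pg_is_in_recovery() on that node
    timeline: int      # timeline_id reported by pg_control_checkpoint()
    lsn: str           # pg_current_wal_lsn() as text, e.g. "0/5000060"


def detect_split_brain(nodes):
    """Return the names of all writable nodes if more than one exists, else [].

    The result is ordered by timeline, highest first, so the promoted
    standby (the node on the newer timeline) comes first.
    """
    writable = [n for n in nodes if not n.in_recovery]
    if len(writable) < 2:
        return []
    writable.sort(key=lambda n: n.timeline, reverse=True)
    return [n.name for n in writable]


if __name__ == "__main__":
    # Illustrative values only.
    old_primary = NodeStatus("node1", False, 1, "0/6000000")
    promoted = NodeStatus("node2", False, 2, "0/5000100")
    print(detect_split_brain([old_primary, promoted]))
```

An empty list means at most one node is accepting writes; a non-empty list names every node that currently believes it is the primary.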
Analysis Steps
-- On NODE 1 (original primary): check status
SELECT pg_is_in_recovery() AS in_recovery,
pg_current_wal_lsn() AS current_lsn,
timeline_id AS timeline
FROM pg_control_checkpoint();
-- On NODE 2 (promoted standby): check status
SELECT pg_is_in_recovery() AS in_recovery,
pg_current_wal_lsn() AS current_lsn,
timeline_id AS timeline
FROM pg_control_checkpoint();
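-- If both return in_recovery = false: split-brain confirmed
-- (pg_current_wal_lsn() raises an error on a node that is still in recovery; use pg_last_wal_replay_lsn() there)
-- The node with the higher timeline_id is the promoted standby, which is normally kept as the new primary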
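-- On original primary: check if any writes happened during the partition
-- (look for transactions with XID > the last replicated XID)
-- Note: this reports WAL bytes written since the last checkpoint, not since the partition began
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), checkpoint_lsn) AS wal_bytes_since_checkpoint
FROM pg_control_checkpoint();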
-- On new primary: check timeline history
-- ls $PGDATA/pg_wal/*.history ← each file lists the parent timeline, the LSN where the new timeline forked, and the reason for the switch
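The fork LSN in a timeline history file can be compared with the old primary's current LSN to estimate how much WAL diverged. Below is a minimal sketch, assuming the usual history file layout of one entry per line with the parent timeline, the switch LSN, and a free-text reason separated by whitespace, and with `#` starting a comment line. The function names and the sample values are hypothetical.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/5000060' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def parse_history(text: str):
    """Return a list of (parent_timeline, fork_lsn, reason) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # not a well-formed entry
        reason = parts[2] if len(parts) > 2 else ""
        entries.append((int(parts[0]), parts[1], reason))
    return entries


def diverged_bytes(old_primary_lsn: str, fork_lsn: str) -> int:
    """WAL bytes the old primary wrote past the point where the timelines forked."""
    return max(0, parse_lsn(old_primary_lsn) - parse_lsn(fork_lsn))


if __name__ == "__main__":
    # Sample content of a 00000002.history file (illustrative values only).
    sample = "1\t0/5000060\tno recovery target specified\n"
    parent_tli, fork_lsn, reason = parse_history(sample)[-1]
    print(parent_tli, fork_lsn, diverged_bytes("0/5000160", fork_lsn))
```

A non-zero result indicates that the old primary kept generating WAL after the standby was promoted. The byte count includes non-data records, so it is an upper bound on diverged user writes rather than an exact figure.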
Pitfalls
- Split-brain is caused by lack of fencing: the old primary should be immediately and forcibly shut down (STONITH — “Shoot The Other Node In The Head”) when a failover is triggered.
- Automatic failover without fencing is dangerous in any HA system — not just PostgreSQL.
- synchronous_commit = on with a synchronous standby prevents data loss during failover, but doesn’t prevent split-brain if fencing fails.
- Patroni relies on a distributed configuration store (DCS: etcd, Consul, or ZooKeeper) to hold a leader lock with a TTL. A primary that cannot renew the lock demotes itself, which is why a DCS is required.
- After a split-brain, data reconciliation is a manual, complex process. Some writes from the old primary will be lost.
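The leader-lock idea can be illustrated with a toy lease model. This is a simplified sketch of the general technique, not Patroni's actual implementation: `LeaderLease` and `role_after_heartbeat` are hypothetical names, time is passed in explicitly, and the DCS is reduced to a single in-memory object.

```python
class LeaderLease:
    """A single lock with a time-to-live, standing in for the DCS leader key."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.holder = None
        self.expires_at = 0.0

    def try_acquire_or_renew(self, node: str, now: float) -> bool:
        # The lock can be taken when it is free or expired, or renewed by its holder.
        if self.holder is None or now >= self.expires_at or self.holder == node:
            self.holder = node
            self.expires_at = now + self.ttl
            return True
        return False


def role_after_heartbeat(node: str, lease: LeaderLease, now: float, dcs_reachable: bool) -> str:
    """Decide the role a node may take after one heartbeat cycle."""
    if not dcs_reachable:
        # The node cannot prove it still holds the lock, so it must not accept writes.
        return "standby"
    return "primary" if lease.try_acquire_or_renew(node, now) else "standby"


if __name__ == "__main__":
    lease = LeaderLease(ttl=30)
    print(role_after_heartbeat("node1", lease, 0, True))    # node1 takes the lock
    print(role_after_heartbeat("node1", lease, 20, False))  # partitioned from the DCS
    print(role_after_heartbeat("node2", lease, 31, True))   # lease expired, node2 takes over
```

The model only shows why a node cut off from the DCS has to give up the primary role before another node can acquire the lock. It does not replace fencing: if the demotion itself fails, the old primary can still accept writes.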
Resolution Approach
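- Immediately shut down the old primary (forcibly if necessary) so that it stops accepting writes.
- Identify the divergence point using WAL comparison: the fork LSN recorded in the new primary’s timeline history file versus the old primary’s final LSN.
- Accept that writes made to the old primary after the divergence point are not on the new primary and are lost unless they are reconciled by hand.
- Optionally, inspect the diverged records in the old primary’s WAL with pg_waldump to guide manual reconciliation.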
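The shutdown and WAL inspection steps map onto two standard command-line tools, `pg_ctl` and `pg_waldump`. The sketch below only builds the command lines and does not run anything; the data directory path, fork LSN, and timeline number are placeholder assumptions that would be replaced with values from the affected cluster, and `resolution_commands` is a hypothetical helper name.

```python
import shlex


def resolution_commands(old_pgdata: str, fork_lsn: str, old_timeline: int, mode: str = "fast"):
    """Build the commands to stop the old primary and dump its diverged WAL.

    mode is passed to 'pg_ctl stop -m'; 'immediate' forces the shutdown
    without a clean checkpoint.
    """
    stop = ["pg_ctl", "stop", "-D", old_pgdata, "-m", mode]
    dump = [
        "pg_waldump",
        "-p", f"{old_pgdata}/pg_wal",  # directory holding the old primary's WAL segments
        "-t", str(old_timeline),       # timeline the old primary kept writing on
        "-s", fork_lsn,                # start at the divergence point
    ]
    return [stop, dump]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for cmd in resolution_commands("/var/lib/postgresql/data", "0/5000060", 1, mode="immediate"):
        print(shlex.join(cmd))
```

Run the stop command first. `pg_waldump` reads the segment files directly, so it works against the stopped old primary. Its output is record-level (resource manager, relation, block), not SQL, so turning it back into application-level changes remains manual work.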