Scenario
A streaming replication standby was running fine. After a maintenance window on the primary server, the standby stops replicating. The standby log shows: FATAL: could not connect to the primary server: connection to server at "10.0.1.10", port 5432 failed: Connection refused. The DBA confirms that the primary is running, but pg_stat_replication on the primary shows no connected standbys.
How to Identify
Conditions:
- Standby log shows "could not connect to the primary server" or "Connection refused"
- pg_stat_replication on primary shows zero rows (no connected standbys)
- primary_conninfo in standby’s postgresql.auto.conf may be stale (wrong IP after failover/reconfiguration)
- pg_hba.conf on primary may not include the standby’s new IP
- max_wal_senders may be set too low on the primary
Analysis Steps
-- On STANDBY (OS shell): check what primary_conninfo is configured
-- grep primary_conninfo $PGDATA/postgresql.auto.conf $PGDATA/postgresql.conf
-- cat $PGDATA/recovery.conf   (PostgreSQL 11 and earlier)
-- On PRIMARY: check max_wal_senders
SHOW max_wal_senders;
-- If 0: replication is completely disabled
-- If small number (e.g., 2): may be exhausted by other standbys
-- On PRIMARY: check how many walsenders are active
SELECT count(*) FROM pg_stat_replication;
SELECT count(*) FROM pg_stat_activity WHERE backend_type = 'walsender';
-- On PRIMARY: check pg_hba.conf allows the standby to connect
SELECT type, database, user_name, address, auth_method
FROM pg_hba_file_rules
WHERE 'replication' = ANY (database);
-- Must have a row allowing the standby host/CIDR for the replication pseudo-database
-- ('all' in the database column does not cover physical replication connections)
-- On PRIMARY: check if wal_level is sufficient
SHOW wal_level;
-- 'minimal' = streaming replication disabled (needs 'replica' or 'logical')
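-- Check network: from standby OS shell:
-- pg_isready -h primary_host -p 5432
-- psql "host=primary_host port=5432 user=replicator replication=true" -c "IDENTIFY_SYSTEM;"
-- (replication=true requests a physical replication connection, which is what the
--  'replication' entries in pg_hba.conf match; -d replication would instead look for a database with that name)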
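The standby-side checks above can be sketched as a pair of shell functions. This is a minimal sketch, not a definitive tool: the function names conninfo_field and check_primary_reachable are hypothetical, and the parser assumes a simple primary_conninfo line with space-separated keyword=value pairs and no spaces or escaped quotes inside values.

```shell
# Print the value of one keyword (e.g. host) from a primary_conninfo line.
# Assumption: values contain no spaces and no embedded quotes.
conninfo_field() {
    # $1 = full config line, $2 = keyword to extract
    printf '%s\n' "$1" | tr -d "'" | tr ' ' '\n' | sed -n "s/^$2=//p" | head -n 1
}

# Read primary_conninfo from a config file and probe the primary with pg_isready.
# $1 = path to the config file, e.g. "$PGDATA/postgresql.auto.conf"
check_primary_reachable() {
    # The last matching line wins, as it does when PostgreSQL reads the file
    conf_line=$(grep -E '^[[:space:]]*primary_conninfo' "$1" | tail -n 1)
    host=$(conninfo_field "$conf_line" host)
    port=$(conninfo_field "$conf_line" port)
    if [ -z "$host" ]; then
        echo "no host found in primary_conninfo in $1" >&2
        return 1
    fi
    echo "primary_conninfo points at $host:${port:-5432}"
    # pg_isready only tests that the server accepts connections; it does not authenticate
    pg_isready -h "$host" -p "${port:-5432}"
}
```

On the standby this would be run as check_primary_reachable "$PGDATA/postgresql.auto.conf". If the printed address is not the primary's current address, the configuration is stale.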
Pitfalls
- max_wal_senders = 0 completely disables streaming replication. Changing it requires a restart.
- wal_level = minimal prevents all replication. Changing it requires a restart.
- A reload (SELECT pg_reload_conf()) is enough to apply pg_hba.conf changes; no restart is needed.
- After IP changes (cloud instance stop/start, migration), primary_conninfo in postgresql.auto.conf must be updated manually; PostgreSQL does not update it automatically. In PostgreSQL 13 and later the new value takes effect on reload, while PostgreSQL 12 and earlier need a standby restart.
- A password change on the replication user without updating .pgpass or the connection string causes authentication failure.
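- max_wal_senders should be at least the number of concurrent replication connections: one per standby, one per other streaming client (for example pg_receivewal), up to two per running pg_basebackup, plus 2 spare. Replication slots do not consume WAL senders by themselves; they are limited separately by max_replication_slots.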
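As a worked example of the max_wal_senders sizing, the arithmetic can be sketched in shell. The function name required_wal_senders and the input counts are illustrative assumptions. The sketch counts connections only: one per standby, one per other streaming client, and two per pg_basebackup run (its WAL-streaming mode opens a second connection). Replication slots are not counted, since they are capped by max_replication_slots rather than max_wal_senders.

```shell
# Estimate a lower bound for max_wal_senders from expected concurrent
# replication connections. All inputs are illustrative assumptions.
required_wal_senders() {
    standbys=$1        # streaming standbys, one connection each
    other_streams=$2   # other streaming clients, e.g. pg_receivewal
    basebackups=$3     # concurrent pg_basebackup runs
    spare=${4:-2}      # headroom; defaults to 2 spare
    # pg_basebackup with WAL streaming uses two replication connections
    echo $(( $standbys + $other_streams + 2 * $basebackups + $spare ))
}
```

For two standbys, one pg_receivewal client, and one base backup at a time, required_wal_senders 2 1 1 gives 2 + 1 + 2 + 2 = 7.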
Resolution Approach
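Check in this order: (1) wal_level and max_wal_senders on primary, (2) pg_hba.conf for the standby’s IP, (3) network connectivity, (4) primary_conninfo on the standby for stale IP/credentials. Fix each layer, then reload (pg_hba.conf) or restart (wal_level, max_wal_senders) as needed. Note that "Connection refused" is raised at the TCP level, before authentication, so for that exact message the network and primary_conninfo checks are the most likely to find the cause.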
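Step (1) of this order can be sketched as a small shell classifier. This is a hedged sketch under stated assumptions: the function name diagnose_primary is hypothetical, and its three arguments are expected to come from the primary, for example via psql -At -c "SHOW wal_level", psql -At -c "SHOW max_wal_senders", and the walsender count query from the analysis steps.

```shell
# Classify primary-side replication settings.
# $1 = wal_level, $2 = max_wal_senders, $3 = currently active WAL senders
diagnose_primary() {
    wal_level=$1
    max_senders=$2
    active_senders=$3
    if [ "$wal_level" = "minimal" ]; then
        echo "wal_level=minimal: set replica or logical, then restart"
    elif [ "$max_senders" -eq 0 ]; then
        echo "max_wal_senders=0: raise it, then restart"
    elif [ "$active_senders" -ge "$max_senders" ]; then
        echo "WAL senders exhausted ($active_senders of $max_senders): raise max_wal_senders, then restart"
    else
        echo "primary settings OK: continue with pg_hba.conf, network, primary_conninfo"
    fi
}
```

For instance, diagnose_primary replica 10 0 reports that the primary settings are fine, which moves the investigation on to steps (2) through (4).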