Scenario
A DBA performs point-in-time recovery (PITR) with a target of 15:00 UTC. After the restore, pg_last_xact_replay_timestamp() returns 12:45 UTC: recovery stopped more than two hours short of the target. No error was logged and the restore command exited cleanly. The database promoted itself and appears healthy, but all changes made between 12:45 and 15:00 are missing.
How to Identify
Conditions:
- Recovery stopped early with no error: the WAL archive has a gap at that point
- restore_command returned a non-zero exit code (for example 1) without logging anything because the requested WAL file was not found
- WAL archiving failed partway through the day (pg_stat_archiver shows failed segments)
- recovery_target_time is later than the last transaction contained in the most recent archived WAL
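- recovery_target_action = 'promote' was set and the server promoted automatically when WAL ran out (on PostgreSQL 12 and earlier, promotion at end of WAL happens regardless of this setting)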
Analysis Steps
-- After restore: check what timestamp recovery reached
SELECT pg_last_xact_replay_timestamp() AT TIME ZONE 'UTC' AS recovery_stopped_at;
-- If this is earlier than your target: WAL was missing for that period
-- Check recovery target settings (visible in pg_settings on PostgreSQL 12+; earlier versions keep them in recovery.conf)
SELECT name, setting FROM pg_settings WHERE name LIKE 'recovery%';
-- Check pg_stat_archiver on SOURCE (before incident) for gaps:
SELECT archived_count, last_archived_wal, last_archived_time,
failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;
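-- failed_count > 0 with last_failed_time near the incident points to a possible WAL gap
-- (the archiver retries failed segments, so confirm against the archive listing)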
-- List the WAL archive directory in order to find the gap:
-- ls /wal_archive/ | sort
-- Look for breaks in the file names (consecutive segments differ by exactly 1 in the hexadecimal segment number)
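-- Check that restore_command is set correctly (postgresql.conf on PostgreSQL 12+, recovery.conf on 11 and earlier):
-- grep restore_command $PGDATA/postgresql.conf $PGDATA/postgresql.auto.conf $PGDATA/recovery.conf 2>/dev/null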
-- Test restore_command manually:
-- bash -c "cp /wal_archive/000000010000000100000050 /tmp/test_wal && echo OK"
-- Check postgresql log for "restored log file" messages during recovery:
-- grep "restored log file\|requested WAL segment\|could not restore" $PGDATA/log/postgresql*.log
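The manual scan for sequence breaks can be automated. The sketch below is a hypothetical helper (find_wal_gaps is not a PostgreSQL tool) that reads archive file names on standard input and reports jumps in the segment number. It assumes the default 16 MB wal_segment_size, where each 8-digit log number holds 256 segments, and it recognizes only a handful of compression suffixes.

```shell
# find_wal_gaps: read WAL archive file names on stdin, print one line per gap.
# Assumption: default 16 MB wal_segment_size (256 segments per log number).
find_wal_gaps() {
    # Keep 24-hex-digit segment names (optionally compressed); this skips
    # .history, .backup and .partial files. Strip the suffix and sort.
    grep -E '^[0-9A-F]{24}(\.(gz|bz2|xz|zst|lz4))?$' | cut -c1-24 | LC_ALL=C sort -u | {
        prev_tli=""; prev_n=0; prev_seg=""
        while read -r seg; do
            tli=$(printf '%s' "$seg" | cut -c1-8)    # timeline ID
            log=$(printf '%s' "$seg" | cut -c9-16)   # log number
            off=$(printf '%s' "$seg" | cut -c17-24)  # segment within the log
            n=$(( 0x$log * 256 + 0x$off ))           # linear segment number
            # Only compare segments on the same timeline.
            if [ "$tli" = "$prev_tli" ] && [ "$n" -gt $(( prev_n + 1 )) ]; then
                echo "gap: $(( n - prev_n - 1 )) segment(s) missing between $prev_seg and $seg"
            fi
            prev_tli=$tli; prev_n=$n; prev_seg=$seg
        done
    }
}

# Example (the archive path is the one used in the steps above):
# ls /wal_archive/ | find_wal_gaps
```

No output means no break was found within any single timeline. A timeline switch is not treated as a gap, so segments around a switch still need a manual look.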
Pitfalls
- When restore_command returns a non-zero exit code (file not found), PostgreSQL interprets it as “no more WAL” and ends recovery. On PostgreSQL 12 and earlier this happens without an error, even if the recovery target was not reached.
- A gap in the WAL archive (even a single missing segment) stops PITR at the segment before the gap. Data after the gap is permanently unrecoverable from that archive.
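- recovery_target_action applies only once the recovery target is reached. On PostgreSQL 12 and earlier, running out of WAL before the target promotes the server regardless of this setting; PostgreSQL 13 and later abort instead with the FATAL error "recovery ended before configured recovery target was reached". Use 'pause' so that a target that is reached can be inspected before promotion.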
- Some archive tools (Barman, pgBackRest) compress WAL, so the restore_command must decompress it. If decompression fails, the command returns non-zero and recovery treats the segment as missing.
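Because a missing segment is reported to PostgreSQL only through the exit code, one way to make the gap visible is to have the restore step write its own log. The following is a minimal sketch, not a replacement for a backup tool's restore command: restore_wal and the default log path are hypothetical names, and the archive is assumed to hold uncompressed segments in a plain directory.

```shell
# restore_wal ARCHIVE_DIR FILE DEST [LOG]
# Copy one WAL file out of the archive. If it is absent, record the name in
# LOG and return 1, which PostgreSQL treats as "not available in the archive".
restore_wal() {
    archive_dir=$1
    file=$2
    dest=$3
    log=${4:-/tmp/restore_wal_missing.log}   # hypothetical default location
    if [ -f "$archive_dir/$file" ]; then
        cp "$archive_dir/$file" "$dest"
    else
        printf '%s missing from %s\n' "$file" "$archive_dir" >> "$log"
        return 1
    fi
}

# If saved as a script ending in:  restore_wal "$@"
# it could be wired in as (script path is hypothetical):
#   restore_command = '/usr/local/bin/restore_wal.sh /wal_archive %f %p'
```

Recovery also asks for timeline history files (for example 00000002.history) that normally do not exist, so some entries in the log are expected. The entry that matters is the first missing 24-character segment name.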
Resolution Approach
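Use recovery_target_action = 'pause' in production PITR so you can inspect the recovery position before promotion, then resume with pg_wal_replay_resume(). If hot_standby is not enabled, 'pause' acts like 'shutdown'. Because the setting takes effect only when the target is reached, always compare pg_last_xact_replay_timestamp() to the target time after the restore. If it falls well short, the WAL archive most likely has a gap: check pg_stat_archiver on the source server and list the archive for missing segments.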
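The comparison between the replay timestamp and the target can be scripted as a post-restore check. This is a sketch under stated assumptions: recovery_shortfall is a hypothetical helper, both times are passed as Unix epoch seconds, and the 60-second default tolerance is an arbitrary allowance for the fact that pg_last_xact_replay_timestamp() reports the last replayed transaction's commit time, which is normally a little earlier than the target.

```shell
# recovery_shortfall REPLAY_EPOCH TARGET_EPOCH [TOLERANCE_SECONDS]
# Returns 0 if recovery got within TOLERANCE of the target, 1 if it stopped
# short, 2 if there is no replay timestamp at all (NULL from the server).
recovery_shortfall() {
    replay=$1
    target=$2
    tolerance=${3:-60}
    if [ -z "$replay" ]; then
        echo "no replay timestamp: server was not started in recovery"
        return 2
    fi
    short=$(( target - replay ))
    if [ "$short" -gt "$tolerance" ]; then
        echo "recovery stopped ${short}s before the target"
        return 1
    fi
    echo "recovery reached the target (within ${tolerance}s)"
}

# Example with psql (the target timestamp is a placeholder):
# replay=$(psql -Atc "SELECT extract(epoch FROM pg_last_xact_replay_timestamp())::bigint")
# target=$(psql -Atc "SELECT extract(epoch FROM timestamptz '2024-01-01 15:00:00+00')::bigint")
# recovery_shortfall "$replay" "$target" || echo "do not promote yet"
```

For the scenario above (12:45 reached, 15:00 targeted) the shortfall is 8100 seconds, far beyond any reasonable tolerance. On a quiet system the last commit may legitimately be much older than the target, so the tolerance should reflect the expected write rate.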