Scenario
A DBA notices WAL generation is 800 MB/min despite an average write rate of only 150 MB/min of actual data. The WAL is growing 5x faster than expected. Archiving is keeping up but the storage costs for WAL archiving have tripled. The DBA checks pg_stat_wal and finds wal_fpi (full page images) is very high compared to wal_records.
How to Identify
Conditions:
pg_stat_wal.wal_fpi >> wal_records — many full page images being written
full_page_writes = on (default) — necessary but can be tuned
checkpoint_completion_target is low — checkpoints complete quickly, then the next period restarts FPI writes
- Tables have high
FILLFACTOR = 100 and many updates — every updated page must be written as FPI after each checkpoint
- Using PostgreSQL on ZFS or ext4 with 4 KB block size and 8 KB PostgreSQL
block_size: partial page writes happen during OS crash
Analysis Steps
-- Check WAL statistics
SELECT
wal_records,
wal_fpi,
pg_size_pretty(wal_bytes) AS wal_bytes,
round(100.0 * wal_fpi / NULLIF(wal_records, 0), 1) AS fpi_pct_of_records,
round(wal_bytes::numeric / NULLIF(wal_records + wal_fpi, 0), 0) AS avg_bytes_per_wal_entry
FROM pg_stat_wal;
-- fpi_pct_of_records > 50% = full page writes dominating WAL
-- Check full_page_writes setting
SHOW full_page_writes;
-- 'on' = required for crash safety (do NOT disable)
-- Check if data checksums are enabled (alternative to full_page_writes)
SELECT name, setting FROM pg_settings WHERE name = 'data_checksums';
-- Check checkpoint frequency (frequent checkpoints restart FPI cycle often)
SELECT checkpoints_timed, checkpoints_req FROM pg_stat_bgwriter;
-- Check block_size and OS page size mismatch
SHOW block_size; -- typically 8192 (8 KB)
-- OS page size: getconf PAGE_SIZE (typically 4 KB on Linux)
-- Mismatch = partial page write risk = full_page_writes critical
-- Check wal_compression setting
SHOW wal_compression;
-- 'off' = FPIs stored uncompressed (default)
-- 'on' = FPIs compressed with pglz (reduces WAL size, small CPU cost)
Pitfalls
- NEVER disable
full_page_writes in production. Without it, a crash after a partial page write can corrupt data in ways that are undetectable and unrecoverable. The WAL size increase is the cost of crash safety.
wal_compression = on reduces FPI WAL size by 30-60% with negligible CPU cost. This is almost always worth enabling.
- Reducing checkpoint frequency (larger
max_wal_size) does NOT reduce total FPI writes — each checkpoint starts a new FPI cycle. The total FPI WAL is proportional to the number of dirty pages at each checkpoint, not the checkpoint frequency.
- Data checksums +
full_page_writes = off is NOT a valid safety combination. Data checksums detect corruption but cannot repair it.
Resolution Approach
Enable wal_compression = on for immediate WAL size reduction. Tune FILLFACTOR on heavily-updated tables to reduce the number of pages that become dirty between checkpoints. If using ZFS: enable recordsize=8k to align with PostgreSQL’s block size, which may reduce partial page write risk.