Scenario
A production OLTP database on an HDD-based storage array shows a perfectly periodic pattern: every 5 minutes, query latency spikes from 3ms to 400-500ms for exactly 20-30 seconds. The DBA correlates the spikes with PostgreSQL checkpoint activity. During the spike, pg_stat_activity shows backends waiting in wait_event = 'BufferIO' and wait_event = 'DataFileWrite'. The checkpoint is writing 8 GB of dirty data in 30 seconds, saturating the storage IOPS.
How to Identify
Conditions:
- Periodic latency spikes correlating with checkpoint timing
pg_stat_bgwriter.buffers_checkpoint very high (large checkpoint writes)
pg_stat_bgwriter.buffers_clean very low (bgwriter not writing ahead of checkpoints)
- Backends waiting in
BufferIO or DataFileWrite wait events during spikes
checkpoint_completion_target low (checkpoints complete too quickly)
- Storage IOPS maxed out during checkpoint (visible in
iostat)
Analysis Steps
-- Check checkpoint statistics
SELECT
checkpoints_timed,
checkpoints_req,
checkpoint_write_time / 1000.0 AS write_time_sec,
checkpoint_sync_time / 1000.0 AS sync_time_sec,
buffers_checkpoint,
buffers_clean,
buffers_backend,
maxwritten_clean
FROM pg_stat_bgwriter;
-- Check checkpoint timing settings
SHOW checkpoint_timeout; -- how often checkpoints happen
SHOW checkpoint_completion_target; -- how fast checkpoint writes run
SHOW max_wal_size;
-- Calculate checkpoint write rate:
-- If checkpoint_timeout=5min and checkpoint_completion_target=0.5 (50%):
-- Checkpoint must write all dirty pages in 2.5 minutes → I/O burst
-- With checkpoint_completion_target=0.9:
-- Checkpoint has 4.5 minutes to write → I/O spread out
-- Check bgwriter efficiency
SELECT
CASE WHEN buffers_checkpoint + buffers_clean > 0
THEN round(100.0 * buffers_clean / (buffers_checkpoint + buffers_clean), 1)
ELSE 0
END AS pct_written_by_bgwriter
FROM pg_stat_bgwriter;
-- Low % = bgwriter is not writing pages ahead of checkpoint → checkpoint does all the work
Pitfalls
checkpoint_completion_target = 0.5 (old PostgreSQL default) causes checkpoints to complete in 50% of checkpoint_timeout — creating 30-second I/O bursts every 5 minutes.
- Increasing
shared_buffers without increasing checkpoint_completion_target or bgwriter_lru_maxpages makes the problem worse: more dirty pages accumulate and are written in a shorter burst.
- Moving from HDD to SSD completely changes this problem — SSDs handle checkpoint bursts without latency spikes. The fix on HDD is to spread I/O; the fix on SSD is usually to increase
shared_buffers and accept checkpoint bursts.
effective_io_concurrency affects how aggressively PostgreSQL does read-ahead I/O, but checkpoints are about write throughput, not read throughput.
Resolution Approach
Set checkpoint_completion_target = 0.9 to spread checkpoint writes over 90% of the checkpoint interval. Tune bgwriter to write dirty pages proactively before the checkpoint, reducing the burst. On HDD: also consider increasing checkpoint_timeout to allow more time for I/O spreading.