Cookbook recipe

WAL Write/fsync Degradation

Applies to PostgreSQL 13–17 Last reviewed May 2026 Grounded in source
Estimated investigation4 min

Scenario

Scenario After migrating to a new SAN storage system, average transaction latency increased from 2ms to 45ms. The application load and query plans are unchanged. The DBA checks pg_stat_activity and finds backends waiting in wait_event =…

Investigation Path

Scenario

After migrating to a new SAN storage system, average transaction latency increased from 2ms to 45ms. The application load and query plans are unchanged. The DBA checks pg_stat_activity and finds backends waiting in wait_event = 'WALSync' and wait_event = 'WALWrite'. The new SAN has a write cache that’s disabled for data integrity reasons — every WAL write now does a full fsync to disk.

How to Identify

Conditions:

  • Backends stuck in wait_event = 'WALSync' or wait_event = 'WALWrite'
  • Transaction latency increased after storage change
  • pg_stat_wal shows high wal_sync_time per WAL write
  • synchronous_commit = on (default) — every COMMIT waits for WAL fsync
  • Storage subsystem has write-through caching disabled or is NFS without async mount

Analysis Steps

-- Check current WAL wait events
SELECT wait_event, count(*) AS backends_waiting
FROM pg_stat_activity
WHERE wait_event IN ('WALWrite', 'WALSync', 'WALLock', 'WALBuffersFull')
GROUP BY wait_event
ORDER BY backends_waiting DESC;

-- Check WAL statistics (PG 14+)
SELECT
    wal_records,
    wal_fpi,
    wal_bytes,
    wal_buffers_full,
    wal_write,
    wal_sync,
    wal_write_time,
    wal_sync_time
FROM pg_stat_wal;
-- High wal_sync_time / wal_sync = slow fsync per WAL write

-- Check synchronous_commit setting
SHOW synchronous_commit;
-- 'on' = full durability: COMMIT waits for WAL write + fsync to disk
-- 'remote_write' = waits for WAL write to standby (no local fsync wait)
-- 'off' = no fsync wait (risk: recent committed txns lost on crash)

-- Check wal_sync_method
SHOW wal_sync_method;
-- 'fsync' = fsync() call per WAL file
-- 'open_sync' = O_SYNC flag (per-write sync, faster on some platforms)
-- 'fdatasync' = fdatasync() (skips metadata, faster than fsync)

-- Check current storage I/O wait (OS level):
-- iostat -x 1 | grep -E "Device|sdb|nvme"
-- Look for: %util near 100%, await > 5ms

Pitfalls

  • synchronous_commit = off trades durability for speed: the last few milliseconds of committed transactions can be lost on a crash. This is acceptable for non-critical data but never for financial/audit records.
  • fsync = off is catastrophically dangerous — data files can be corrupted on crash. Never set this in production. The old fsync = off “performance tip” has caused many data loss incidents.
  • NFS storage with sync mount option causes every write to traverse the network. Use async mount option for NFS-backed PostgreSQL (with appropriate UPS/HA).
  • Changing wal_sync_method requires a restart and may perform differently depending on OS and storage controller.

Resolution Approach

First: check if the storage truly requires every write to be fsynced (write cache configuration). Second: tune wal_sync_method for the storage type. Third: use synchronous_commit = local on the standby or off for non-critical transactions. Fourth: use write-back cache with a battery-backed unit (BBU) on the storage controller.

This is a Pro lesson

Get every Learning Pathway and cookbook recipe — grounded in PostgreSQL source code, with diagnostics, fixes, and prevention for each topic.

Continue this lesson to learn:

  • Mitigation Actions
  • All 36 Learning Pathway lessons
  • 170+ cookbook recipes
  • Source-grounded diagnostics & fixes

Secure checkout Cancel anytime Source-grounded

Career Impact

This scenario builds production judgment and operational confidence under pressure.

Open Career Dashboard →

Keep going

Related & next steps

Was this helpful?

← All cookbook recipes