Cookbook recipe

Serialization Failure (ERROR: could not serialize access)

Applies to PostgreSQL 13–17 Last reviewed May 2026 Grounded in source
Estimated investigation4 min

Scenario

Scenario A financial transaction processing service runs all operations in SERIALIZABLE isolation to guarantee correctness. During peak hours, application logs show a 30% failure rate with SQLSTATE 40001 (serialization_failure). Retry logic exists but the retries themselves…

Investigation Path

Scenario

A financial transaction processing service runs all operations in SERIALIZABLE isolation to guarantee correctness. During peak hours, application logs show a 30% failure rate with SQLSTATE 40001 (serialization_failure). Retry logic exists but the retries themselves also fail, creating a thundering herd: all retried transactions restart simultaneously, conflict with each other again, and retry again — multiplying the load.

How to Identify

Conditions:

  • Application logs contain ERROR: could not serialize access due to concurrent update or ERROR: could not serialize access due to read/write dependencies among transactions
  • SQLSTATE: 40001 errors in application error tracking
  • pg_stat_database.conflicts counter rising (note: this tracks replica conflicts, but application-side retries indicate serialization pressure)
  • Error rate correlates directly with concurrent transaction count during peak hours
  • Retry storms visible in application metrics — retry spikes coincide with original failure spikes

Analysis Steps

-- 1. Check serialization failure rate in the database
SELECT
    datname,
    xact_commit,
    xact_rollback,
    (xact_rollback::float / NULLIF(xact_commit + xact_rollback, 0) * 100)
        AS rollback_rate_pct
FROM   pg_stat_database
WHERE  datname = current_database();

-- 2. Check current isolation level for the session
SHOW transaction_isolation;
SHOW default_transaction_isolation;

-- 3. Look for transactions currently running in SERIALIZABLE mode
SELECT
    pid,
    usename,
    state,
    left(query, 80)         AS query,
    now() - xact_start      AS xact_duration
FROM   pg_stat_activity
WHERE  state = 'active'
  AND  now() - xact_start > interval '100ms'
ORDER  BY xact_duration DESC;

-- 4. Examine pg_stat_database for conflict indicators
SELECT
    datname,
    conflicts,
    deadlocks,
    xact_rollback
FROM   pg_stat_database
WHERE  datname = current_database();

-- 5. Check if serializable_failure_on_conflict is a known pattern
-- Look at the rate of rollbacks as a proxy for serialization failures:
SELECT datname,
       xact_rollback,
       deadlocks,
       blk_read_time,
       blk_write_time
FROM   pg_stat_database
WHERE  datname = current_database();

Pitfalls

  • Serialization failures are normal and expected in SERIALIZABLE isolation — they are a correctness mechanism, not a database bug. The application MUST always retry transactions that fail with SQLSTATE 40001.
  • Do not use REPEATABLE READ assuming it prevents all anomalies — it prevents non-repeatable reads but allows write skew anomalies that SERIALIZABLE catches and rejects.
  • Not all 40001 errors need DBA investigation — a small, steady failure rate with successful retries is acceptable. Only high rates or thundering herds need architectural intervention.
  • Naive retry logic without exponential backoff causes thundering herd — all retrying transactions restart simultaneously and conflict again. Exponential backoff with jitter breaks the synchronization.
  • Minimizing transaction duration is the most effective mitigation — shorter transactions have shorter conflict windows and lower probability of overlapping with concurrent transactions.

Resolution Approach

Serialization failures require application-layer handling: implement exponential backoff with random jitter so retries do not thunder together. Minimize transaction duration by moving non-database work outside the transaction boundary. For operations where exact serializable isolation is not strictly required, consider READ COMMITTED with explicit SELECT FOR UPDATE locking to avoid the retry cost while maintaining correctness for specific rows.

This is a Pro lesson

Get every Learning Pathway and cookbook recipe — grounded in PostgreSQL source code, with diagnostics, fixes, and prevention for each topic.

Continue this lesson to learn:

  • Mitigation Actions
  • All 36 Learning Pathway lessons
  • 170+ cookbook recipes
  • Source-grounded diagnostics & fixes

Secure checkout Cancel anytime Source-grounded

Career Impact

This scenario builds production judgment and operational confidence under pressure.

Open Career Dashboard →

Keep going

Related & next steps

Was this helpful?

← All cookbook recipes