Cookbook recipe

Stale Table Statistics Causing Bad Plans

Applies to PostgreSQL 13–17 Last reviewed May 2026 Grounded in source
Estimated investigation4 min

Scenario

Scenario An ETL job loads 8 million rows into an orders table overnight. The next morning, queries that were fast the day before now take 60+ seconds. No schema changes occurred. The DBA checks EXPLAIN ANALYZE…

Investigation Path

Scenario

An ETL job loads 8 million rows into an orders table overnight. The next morning, queries that were fast the day before now take 60+ seconds. No schema changes occurred. The DBA checks EXPLAIN ANALYZE and sees the planner estimated 1,200 rows but actually processed 4 million.

How to Identify

Conditions:

  • Queries became slow after a bulk INSERT, DELETE, or UPDATE operation
  • EXPLAIN ANALYZE shows wildly wrong row estimates (estimate << actual)
  • pg_stat_user_tables.last_autoanalyze is old or predates the bulk load
  • n_mod_since_analyze is very high relative to n_live_tup
  • Autovacuum was disabled on the table or scale factor was not met

Analysis Steps

-- 1. Check statistics freshness and modification count
SELECT
    schemaname,
    relname,
    last_analyze,
    last_autoanalyze,
    n_live_tup,
    n_dead_tup,
    n_mod_since_analyze,
    n_mod_since_analyze::float / NULLIF(n_live_tup, 0) AS mod_ratio
FROM pg_stat_user_tables
WHERE relname = 'orders'
ORDER BY mod_ratio DESC NULLS LAST;
-- mod_ratio > 0.10 = autovacuum threshold not yet met, stats may be stale

-- 2. Confirm bad estimates in the plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE status = 'pending';
-- rows=X (estimate) vs actual rows=Y — large gap = stale stats

-- 3. Check autovacuum configuration for this table
SELECT reloptions
FROM   pg_class
WHERE  relname = 'orders';
-- Look for autovacuum_analyze_scale_factor or autovacuum_enabled=false

Pitfalls

  • Do not confuse VACUUM with ANALYZE. VACUUM removes dead tuples; ANALYZE updates statistics. They are separate operations. VACUUM ANALYZE does both.
  • ANALYZE on very large tables adds I/O load. Run during off-peak hours or use targeted per-column analyze.
  • Autovacuum uses a threshold: n_mod_since_analyze > autovacuum_analyze_scale_factor * n_live_tup + autovacuum_analyze_threshold. After a bulk load that doubles table size, this threshold may not trigger until the next autovacuum cycle.
  • Statistics freshness is table-wide — even if only one column was used in the filter, all column statistics are updated together.

Resolution Approach

Run ANALYZE immediately after bulk data loads. For recurring ETL jobs, add ANALYZE table_name; as the final step of every load script. Tune autovacuum_analyze_scale_factor down for tables with frequent large loads so autovacuum triggers faster.

This is a Pro lesson

Get every Learning Pathway and cookbook recipe — grounded in PostgreSQL source code, with diagnostics, fixes, and prevention for each topic.

Continue this lesson to learn:

  • Analysis Steps
  • Pitfalls
  • Mitigation Actions
  • All 36 Learning Pathway lessons
  • 170+ cookbook recipes
  • Source-grounded diagnostics & fixes

Secure checkout Cancel anytime Source-grounded

Career Impact

This scenario builds production judgment and operational confidence under pressure.

Open Career Dashboard →

Keep going

Related & next steps

Was this helpful?

← All cookbook recipes