Scenario
A DBA receives alerts that multiple tables are showing over 20% dead tuple ratios. When checking pg_stat_user_tables, last_autovacuum is either NULL or shows timestamps from several hours ago. The database has hundreds of tables and four autovacuum workers. One large reporting table has been running a long autovacuum job for three hours, starving the other tables of worker capacity.
How to Identify
Conditions:
pg_stat_user_tables.last_autovacuum is NULL or stale across multiple tables
- Dead tuples accumulating (
n_dead_tup growing) despite autovacuum being enabled
autovacuum_max_workers exceeded — all workers are busy with other tables
- One or more tables have an active lock that prevents autovacuum from running
pg_stat_bgwriter.maxwritten_clean rising — bgwriter saturated, slowing autovacuum I/O
Analysis Steps
-- 1. Find tables with stale or missing autovacuum runs
SELECT relname,
n_dead_tup,
n_live_tup,
last_autovacuum,
last_autoanalyze,
now() - last_autovacuum AS since_last_vacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
OR last_autovacuum IS NULL
ORDER BY n_dead_tup DESC
LIMIT 30;
-- 2. Check currently running autovacuum workers
SELECT pid, query, state, now() - query_start AS duration, wait_event
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker'
ORDER BY query_start;
-- 3. Check configured max workers vs how many are running
SHOW autovacuum_max_workers;
-- 4. Find tables that should have triggered autovacuum but haven't
SELECT relname,
n_dead_tup,
n_live_tup * current_setting('autovacuum_vacuum_scale_factor')::numeric
+ current_setting('autovacuum_vacuum_threshold')::numeric AS trigger_threshold
FROM pg_stat_user_tables
WHERE n_dead_tup >
n_live_tup * current_setting('autovacuum_vacuum_scale_factor')::numeric
+ current_setting('autovacuum_vacuum_threshold')::numeric
ORDER BY n_dead_tup DESC;
-- 5. Check bgwriter saturation
SELECT buffers_clean, maxwritten_clean, buffers_backend,
buffers_alloc, checkpoints_req
FROM pg_stat_bgwriter;
-- 6. Check autovacuum_freeze_max_age risk — tables nearing the freeze threshold
SELECT relname, age(relfrozenxid) AS xid_age,
current_setting('autovacuum_freeze_max_age')::INT AS freeze_max_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;
Pitfalls
- Never disable autovacuum globally — transaction ID wraparound (a database-stopping emergency) requires autovacuum to periodically freeze old rows. Even setting
autovacuum = off at the instance level is dangerous.
- Autovacuum skips tables with conflicting locks — a long-running transaction holding a lock on a table will cause autovacuum to repeatedly skip that table. Identify and terminate blocking queries.
autovacuum_vacuum_cost_limit throttles autovacuum I/O. On I/O-constrained systems it may be too low, causing autovacuum to sleep more than it works.
autovacuum_max_workers (default 3) may be insufficient for databases with hundreds of active tables. Increase it, but each worker uses memory (maintenance_work_mem).
- Increasing
autovacuum_max_workers without also increasing autovacuum_vacuum_cost_limit may not help if autovacuum is I/O-throttled rather than worker-count-limited.
Resolution Approach
When autovacuum is not running or skipping tables, the fix depends on the root cause: