The one thing to understand first
Because MVCC leaves dead tuples behind (see the MVCC article), something must reclaim that space and, just as importantly, advance the cluster’s frozen-XID horizon to prevent transaction-ID wraparound. Autovacuum is the background subsystem that does both. Its code spans src/backend/postmaster/autovacuum.c (scheduling) and src/backend/access/heap/vacuumlazy.c (the actual work).
Autovacuum is not optional cleanup — it is the process that keeps your database able to accept writes. It does two jobs that look unrelated but share one engine: reclaiming dead-tuple space, and freezing old rows before the 32-bit XID counter wraps. Neglect either and you get runaway bloat or a cluster that refuses writes.
Launcher and workers
Autovacuum has two process types:
- The launcher wakes every autovacuum_naptime seconds and decides which databases need attention, prioritising those closest to wraparound.
- A worker is forked per database (up to autovacuum_max_workers) to process the tables that exceed their thresholds.
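To see the two process types on a running cluster, a minimal sketch like the following reads the scheduling settings and lists any autovacuum processes. It assumes PostgreSQL 10 or later, where pg_stat_activity exposes the backend_type column.

```sql
-- Scheduling knobs for the launcher and workers
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('autovacuum', 'autovacuum_naptime', 'autovacuum_max_workers');

-- The launcher and any workers currently running
-- (assumes PostgreSQL 10+, where backend_type exists)
SELECT pid, backend_type, datname, xact_start, query
FROM pg_stat_activity
WHERE backend_type LIKE 'autovacuum%';
```

For worker rows, the query column names the table being processed, so repeated runs of the second statement show which tables the workers are cycling through.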
The threshold formula
A table becomes eligible for autovacuum when its estimated dead tuples exceed:
threshold = autovacuum_vacuum_threshold
+ autovacuum_vacuum_scale_factor * reltuples
With defaults (threshold = 50, scale_factor = 0.2), a 1,000,000-row table is vacuumed after ~200,050 dead tuples. The dead-tuple estimate comes from the cumulative statistics system, updated by backends as they modify rows. The same shape of formula with analyze_* parameters governs autoanalyze.
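The formula can be evaluated directly against the catalogs to see how close each table is to its trigger point. The query below is a sketch under one loud assumption: it applies only the global settings and ignores any per-table overrides stored in pg_class.reloptions. The column alias vacuum_trigger is illustrative.

```sql
-- Estimated dead-tuple trigger per table, using GLOBAL settings only.
-- Assumption: per-table overrides in pg_class.reloptions are ignored.
SELECT s.schemaname,
       s.relname,
       s.n_dead_tup,
       round(current_setting('autovacuum_vacuum_threshold')::numeric
             + current_setting('autovacuum_vacuum_scale_factor')::numeric
               * greatest(c.reltuples::numeric, 0)) AS vacuum_trigger
FROM pg_stat_user_tables AS s
JOIN pg_class AS c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC
LIMIT 10;
```

The greatest(..., 0) guard is there because reltuples can be negative on tables that have never been vacuumed or analyzed in recent PostgreSQL versions.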
The classic mistake on large tables: the 20% scale factor means a billion-row table waits for 200 million dead tuples before vacuuming — far too late. The fix is per-table tuning:
ALTER TABLE big_table SET (
autovacuum_vacuum_scale_factor = 0.02,
autovacuum_vacuum_threshold = 10000);
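A short follow-up sketch, reusing the big_table name from the statement above, confirms that the override was stored and shows how to undo it. With these values, a billion-row table becomes eligible at roughly 20 million dead tuples instead of 200 million.

```sql
-- Confirm the per-table override is stored
SELECT relname, reloptions
FROM pg_class
WHERE oid = 'big_table'::regclass;

-- New trigger point for a 1,000,000,000-row table with these values
SELECT 10000 + 0.02 * 1000000000 AS dead_tuples_before_vacuum;

-- Revert to the global defaults if needed
ALTER TABLE big_table RESET (autovacuum_vacuum_scale_factor,
                             autovacuum_vacuum_threshold);
```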
What a vacuum pass actually does
lazy_scan_heap() in vacuumlazy.c runs in phases:
- Scan heap. Walk pages (skipping all-visible pages via the visibility map), collecting the line pointers of dead tuples whose xmax is older than the vacuum’s cutoff XID.
- Vacuum indexes. For each index, remove entries pointing at the dead tuples. This is why many indexes make vacuum slower.
- Vacuum heap. Turn the dead line pointers into reusable space and defragment the page.
- Update the free space map and visibility map.
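Crucially, ordinary (lazy) vacuum does not, in general, return space to the operating system: it makes space reusable within the table. The one exception is that it can truncate completely empty pages at the very end of the file. Only VACUUM FULL (which rewrites the table under an ACCESS EXCLUSIVE lock) reliably shrinks the file.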
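One way to observe this is to compare the heap file size before and after a manual vacuum. This is a sketch that reuses the big_table name from the tuning example; on a table with dead tuples scattered through it, the two sizes are typically the same.

```sql
-- Heap size before: main fork only, excluding indexes and TOAST
SELECT pg_size_pretty(pg_relation_size('big_table')) AS heap_size_before;

-- Plain (lazy) vacuum; VERBOSE reports what each phase did
VACUUM (VERBOSE) big_table;

-- Heap size after: usually unchanged, because freed space stays in the file
SELECT pg_size_pretty(pg_relation_size('big_table')) AS heap_size_after;
```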
The visibility map
Each table has a visibility map (visibilitymap.c): two bits per heap page. The all-visible bit lets index-only scans skip heap fetches and lets the next vacuum skip the page entirely. The all-frozen bit lets aggressive freeze passes skip pages. Keeping these bits set is a major reason vacuum gets cheaper over time on append-mostly tables.
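How much of a table is covered by the all-visible bit can be estimated from pg_class. The sketch below relies on relpages and relallvisible, which are estimates refreshed by VACUUM and ANALYZE rather than live counts. The second statement assumes the pg_visibility contrib extension is installed and that the caller has permission to use it; big_table is the example table name used earlier.

```sql
-- Estimated share of pages marked all-visible, largest tables first
SELECT relname,
       relpages,
       relallvisible,
       round(100.0 * relallvisible / nullif(relpages, 0), 1) AS pct_all_visible
FROM pg_class
WHERE relkind = 'r'
  AND relpages > 0
ORDER BY relpages DESC
LIMIT 10;

-- Exact bit counts; assumes the pg_visibility extension is available
-- CREATE EXTENSION IF NOT EXISTS pg_visibility;
SELECT all_visible, all_frozen
FROM pg_visibility_map_summary('big_table');
```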
Freezing and wraparound
XIDs are 32-bit and wrap around after ~4 billion. To stay safe, vacuum freezes old tuples — marking them as unconditionally visible — before their age reaches autovacuum_freeze_max_age. When any table crosses that age, an anti-wraparound autovacuum is forced even if the table is otherwise quiet, and it cannot be skipped. Ignoring the warnings leads PostgreSQL to eventually refuse new writes to protect data — a production-stopping event.
-- Tables closest to wraparound trouble
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC LIMIT 10;
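The same check can be run one level up, per database, since the launcher prioritises databases by how close they are to wraparound. This sketch expresses each database's age as a percentage of autovacuum_freeze_max_age; the column alias pct_of_freeze_max_age is illustrative.

```sql
-- Databases ordered by XID age, relative to the forced-vacuum threshold
SELECT datname,
       age(datfrozenxid) AS xid_age,
       round(100.0 * age(datfrozenxid)
             / current_setting('autovacuum_freeze_max_age')::numeric, 1)
         AS pct_of_freeze_max_age
FROM pg_database
ORDER BY xid_age DESC;
```

A value at or above 100 means at least one table in that database has crossed the age at which an anti-wraparound autovacuum is forced.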
Cost-based throttling
To avoid I/O storms, vacuum accumulates a “cost” as it reads and dirties pages, and sleeps for autovacuum_vacuum_cost_delay each time the running total reaches autovacuum_vacuum_cost_limit. On modern SSD systems the defaults are often too conservative; raising the cost limit lets autovacuum keep up with high-churn tables.
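The throttling inputs are ordinary settings and can be inspected and changed without a restart. In the sketch below, the value 1000 is purely illustrative and not a recommendation; ALTER SYSTEM requires superuser privileges. Note that an autovacuum_vacuum_cost_limit of -1 means autovacuum falls back to vacuum_cost_limit.

```sql
-- Per-page costs, the budget, and the sleep applied when it is spent
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('vacuum_cost_page_hit',
               'vacuum_cost_page_miss',
               'vacuum_cost_page_dirty',
               'vacuum_cost_limit',
               'autovacuum_vacuum_cost_limit',
               'autovacuum_vacuum_cost_delay');

-- Raise the budget cluster-wide (illustrative value only)
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;

-- Apply the change without restarting the server
SELECT pg_reload_conf();
```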
Layer 3 — Watch it happen on your own database
-- Tables closest to wraparound trouble
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC LIMIT 10;
-- Dead-tuple counts and last autovacuum time
SELECT relname, n_dead_tup, n_live_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC LIMIT 10;
-- A vacuum in flight: which phase and how far along
SELECT * FROM pg_stat_progress_vacuum;
The first query is your wraparound early-warning system — anything whose xid_age approaches autovacuum_freeze_max_age (default 200M) is overdue. The second shows which tables are accumulating dead tuples faster than autovacuum clears them. The third lets you watch a live vacuum move through its scan, index, and cleanup phases.
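Since SELECT * on the progress view returns raw block counters, a slightly more readable variant may help. This is a sketch that limits the output to the current database, so that relid can be shown as a table name, and derives a rough percentage for the heap scan; the alias pct_scanned is illustrative.

```sql
-- Running vacuums in this database, with a rough heap-scan percentage
SELECT p.pid,
       p.relid::regclass AS table_name,
       p.phase,
       p.heap_blks_scanned,
       p.heap_blks_total,
       round(100.0 * p.heap_blks_scanned
             / nullif(p.heap_blks_total, 0), 1) AS pct_scanned,
       p.index_vacuum_count
FROM pg_stat_progress_vacuum AS p
WHERE p.datname = current_database();
```

An index_vacuum_count above 1 means the vacuum had to make more than one pass over the indexes, which usually points at too little memory for the dead-tuple list.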
Layer 4 — The levers this hands you
- Per-table scale factors. The 20% default is wrong for large tables; set autovacuum_vacuum_scale_factor low (e.g. 0.02) with a fixed autovacuum_vacuum_threshold on big, churny tables.
- Raise the cost limit on SSDs. Increase autovacuum_vacuum_cost_limit (or lower autovacuum_vacuum_cost_delay) so vacuum keeps pace with write volume.
- More workers for many hot tables. autovacuum_max_workers bounds concurrency; raise it if many tables qualify at once.
- Long transactions throttle vacuum globally by holding back the cutoff XID; fix the transaction, not vacuum.
- More indexes = slower vacuum because of the per-index cleanup phase; drop unused indexes.
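The last two levers need evidence before acting on them. The queries below are a sketch for finding it: the first lists the oldest open transactions, the second lists indexes that have not been scanned since statistics were last reset. An index with zero scans may still enforce a primary key or unique constraint, so treat the second list as candidates to investigate, not as a drop list.

```sql
-- Oldest open transactions: these hold back the cutoff XID
SELECT pid,
       usename,
       state,
       xact_start,
       now() - xact_start AS xact_age,
       backend_xmin
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;

-- Indexes with no recorded scans, largest first
SELECT schemaname,
       relname,
       indexrelname,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 10;
```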
Layer 5 — What an Oracle DBA should expect vs what they get
Oracle has no direct equivalent because it never created the problem autovacuum solves:
- No background “undo cleanup” to compare to. Oracle reclaims old row images from UNDO automatically and overwrites blocks in place, so there is no table-level vacuum at all. In PostgreSQL, vacuum is a first-class operational concern you must monitor and tune.
- Transaction-ID wraparound is uniquely Postgres. Oracle’s SCN is 64-bit and never wraps in practice; PostgreSQL’s 32-bit XID does, which is why freezing exists and why an ignored wraparound warning can stop writes entirely. There is no Oracle drill for this.
- “My table doesn’t shrink after a big delete.” Correct: lazy vacuum only makes space reusable inside the table, like Oracle’s high-water-mark behaviour. Reclaiming OS space needs VACUUM FULL (an exclusive-lock rewrite) or pg_repack, analogous to an Oracle segment shrink/move.
- The visibility map is the index-only-scan enabler. Oracle index-only access just needs the index; in Postgres an index-only scan also checks the all-visible bit and must visit the heap for any page where it is not set, so keeping vacuum current directly affects how many heap fetches an index-only scan can avoid.
Key takeaway
Autovacuum schedules per-database workers when a table’s dead-tuple estimate crosses threshold + scale_factor × reltuples, then vacuumlazy.c scans the heap (skipping all-visible pages), cleans indexes, defragments pages, and freezes old XIDs. It reclaims space for reuse and defends against wraparound — it does not shrink files. Tune scale factors and cost limits per table, hunt long transactions, and watch age(relfrozenxid): a healthy autovacuum is the difference between a database that quietly keeps up and one that seizes.