Scenario
An event log table retains only 90 days of data via a nightly DELETE job that removes millions of old rows. After 6 months, index files on created_at and event_type are 40GB each, while the table itself is 8GB. Query performance degrades as index scans traverse mostly-empty pages.
How to Identify
Conditions:
pg_relation_size(index) >> pg_relation_size(table) — index larger than table
- Index scans slower than expected despite good statistics
pgstattuple shows high dead tuple percentage in index
pg_stat_user_indexes.idx_scan shows index is used but queries are slow
Analysis Steps
-- 1. Find bloated indexes (index size >> table size)
SELECT
t.relname AS table_name,
i.indexrelname AS index_name,
pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
pg_size_pretty(pg_relation_size(t.relid)) AS table_size,
ROUND(pg_relation_size(i.indexrelid)::NUMERIC /
NULLIF(pg_relation_size(t.relid), 0), 2) AS index_to_table_ratio,
i.idx_scan
FROM pg_stat_user_indexes i
JOIN pg_stat_user_tables t ON t.relid = i.relid
ORDER BY pg_relation_size(i.indexrelid) DESC
LIMIT 20;
-- 2. Use pgstattuple for detailed bloat analysis (requires extension)
-- CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- SELECT * FROM pgstatindex('idx_events_created_at');
-- Look for: dead_leaf_pages and avg_leaf_density < 50%
-- 3. Check HOT update ratio
SELECT relname, n_tup_upd, n_tup_hot_upd,
ROUND(n_tup_hot_upd::NUMERIC / NULLIF(n_tup_upd, 0) * 100, 1) AS hot_ratio
FROM pg_stat_user_tables
WHERE relname = 'events';
Pitfalls
- Regular
VACUUM removes dead heap tuples but does NOT always reclaim all index pages. Only REINDEX fully rebuilds an index.
REINDEX (without CONCURRENTLY) takes ACCESS EXCLUSIVE lock — blocks all reads and writes. Always use REINDEX CONCURRENTLY in production.
- If
REINDEX CONCURRENTLY is interrupted, it leaves an invalid index that must be cleaned up with DROP INDEX CONCURRENTLY.
- HOT updates avoid index bloat — but only work when the updated column has no index. Adding an index to a frequently-updated column prevents HOT and causes index bloat.
Resolution Approach
REINDEX CONCURRENTLY rebuilds the index from scratch, producing a compact index with no dead pages. For tables with heavy DELETE patterns, schedule periodic REINDEX CONCURRENTLY during off-peak hours. Long-term: consider partitioning so old partitions can be dropped (which also drops indexes cleanly).