Statistics and Selectivity: Why the Planner Guesses Wrong

Applies to PostgreSQL 13–17 · Last reviewed May 2026 · Grounded in source

The one thing to understand first

The optimizer chooses plans by estimating how many rows each operation produces. Those estimates come entirely from statistics gathered by ANALYZE into the pg_statistic catalog (human-readable via pg_stats). Bad statistics → bad estimates → bad plans. Almost every “mysteriously slow query” traces back here.

The planner is only as smart as its newest ANALYZE, and it assumes your columns are independent until you tell it otherwise. Those two facts — stale stats and the independence assumption — explain the overwhelming majority of catastrophically wrong row estimates you will ever debug.
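To make the independence assumption concrete, here is a minimal Python sketch of the arithmetic, not the planner's actual code. The table size, the column names (city, country) and both selectivities are made-up values chosen for illustration. The function name `independent_estimate` is hypothetical as well.

```python
def independent_estimate(total_rows, selectivities):
    """Estimate matching rows by multiplying per-column selectivities,
    which is what the independence assumption amounts to."""
    combined = 1.0
    for selectivity in selectivities:
        combined *= selectivity
    return round(total_rows * combined)


# Hypothetical table: 1,000,000 address rows (illustrative numbers only).
TOTAL_ROWS = 1_000_000
SEL_CITY = 0.01     # assumed: 1% of rows have city = 'Paris'
SEL_COUNTRY = 0.05  # assumed: 5% of rows have country = 'France'

# Independence assumption: treat the two predicates as unrelated.
estimated_rows = independent_estimate(TOTAL_ROWS, [SEL_CITY, SEL_COUNTRY])

# In this made-up data every 'Paris' row is also a 'France' row, so the
# second predicate filters out nothing beyond what the first already did.
actual_rows = round(TOTAL_ROWS * SEL_CITY)

print(f"estimated rows: {estimated_rows}")
print(f"actual rows:    {actual_rows}")
print(f"underestimate:  {actual_rows / estimated_rows:.0f}x")
```

With these assumed numbers the multiplied estimate is 500 rows while 10,000 rows actually match, a 20x underestimate. An error of that size can push the planner toward a join or scan strategy suited to a much smaller result.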

What ANALYZE collects

For each column, ANALYZE samples rows (300 × the statistics target; at the default default_statistics_target of 100, that is 30,000 rows) and computes a set of per-column statistics.
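The sample-size arithmetic can be sketched in a few lines of Python. This is an illustration, not PostgreSQL's sampling code: the helper names and the example table sizes are hypothetical, and the default target of 100 is the stock default_statistics_target setting.

```python
ROWS_PER_TARGET_UNIT = 300       # multiplier quoted in the text
DEFAULT_STATISTICS_TARGET = 100  # stock default_statistics_target


def analyze_sample_rows(statistics_target=DEFAULT_STATISTICS_TARGET):
    """Approximate number of rows ANALYZE aims to sample."""
    return ROWS_PER_TARGET_UNIT * statistics_target


def sampled_fraction(table_rows, statistics_target=DEFAULT_STATISTICS_TARGET):
    """Share of a non-empty table covered by the sample (capped at 100%)."""
    sample = min(table_rows, analyze_sample_rows(statistics_target))
    return sample / table_rows


# Hypothetical table sizes, to show how coverage shrinks as tables grow.
for table_rows in (10_000, 1_000_000, 100_000_000):
    share = sampled_fraction(table_rows)
    print(f"{table_rows:>11,} rows -> sample covers {share:.4%}")
```

The sample size depends on the statistics target, not on the table size, so a very large table is summarized from a tiny fraction of its rows. That is one reason estimates for rare values in big tables can be coarse.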

Continue this lesson to learn:

How a single-column estimate is made
The multi-column trap
Extended statistics fix correlation
Raising resolution
Layer 3 — Watch it happen on your own database
Layer 4 — The levers this hands you
