PostgreSQL Crash Due to OOM Kill

Applies to PostgreSQL 13–17 Last reviewed May 2026 Grounded in source

Estimated investigation4 min

Investigation Path

Scenario

PostgreSQL restarts unexpectedly every few days. The PostgreSQL log shows the server started fresh — no crash-related errors. The DBA checks /var/log/kern.log and finds Out of memory: Kill process 12345 (postgres) score 900 or sacrifice child. The Linux OOM (Out of Memory) killer is terminating PostgreSQL backends when the server runs out of memory. The last OOM’d process was the one executing a large analytical query with high work_mem.

How to Identify

Conditions:

Unexpected PostgreSQL restarts with no error in postgresql.log
dmesg | grep oom or journalctl -k | grep "Out of memory" shows OOM events
Memory pressure: free -m shows very low available memory
work_mem × max_connections × sort operations exceeds available RAM
Large shared_buffers + many connections + high work_mem = OOM risk

Analysis Steps

-- Check memory configuration
SELECT name, setting, unit FROM pg_settings
WHERE name IN (
    'shared_buffers', 'work_mem', 'maintenance_work_mem',
    'max_connections', 'max_parallel_workers',
    'wal_buffers', 'effective_cache_size'
);

-- Estimate worst-case memory usage:
-- shared_buffers + (max_connections × work_mem × sort_operations_per_query)
-- + OS overhead
SELECT
    pg_size_pretty(pg_size_bytes(current_setting('shared_buffers')))  AS shared_buffers,
    current_setting('max_connections')::int                           AS max_connections,
    pg_size_pretty(pg_size_bytes(current_setting('work_mem')))        AS work_mem_per_sort,
    pg_size_pretty(
        current_setting('max_connections')::bigint
        * pg_size_bytes(current_setting('work_mem'))
    )  AS worst_case_work_mem;

-- Find current memory-intensive queries:
SELECT pid, usename, state,
       extract(epoch from now()-query_start) AS query_seconds,
       left(query, 200) AS query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY query_seconds DESC;

-- Check OS memory (run from shell):
-- free -m
-- cat /proc/meminfo | grep -E "MemTotal|MemFree|MemAvailable|Cached"

Pitfalls

work_mem is per sort/hash operation, not per connection. A single complex query can use 10× work_mem. This makes worst-case memory hard to predict.
Reducing work_mem globally hurts query performance. The correct approach is low global work_mem with higher per-role settings for analytics users.
Setting vm.overcommit_memory = 2 on Linux prevents OOM but means PostgreSQL starts failing allocations earlier. A better balance: vm.overcommit_memory = 1 (default heuristic).
huge_pages = try in PostgreSQL can reduce memory overhead significantly.
The Linux OOM killer has no knowledge of PostgreSQL’s internal consistency — killing a random backend causes an abrupt crash and full recovery on next start.

Resolution Approach

Calculate the actual memory budget, reduce work_mem globally (set higher per-role for analytics), set vm.overcommit_ratio to control how much memory Linux allows PostgreSQL to commit, and add monitoring to alert before OOM conditions occur.

This is a Pro lesson

Get every Learning Pathway and cookbook recipe — grounded in PostgreSQL source code, with diagnostics, fixes, and prevention for each topic.

Continue this lesson to learn:

Mitigation Actions

All 36 Learning Pathway lessons
170+ cookbook recipes
Source-grounded diagnostics & fixes

Unlock everything — $24.99/month $199/year — save 35%

Secure checkout Cancel anytime Source-grounded

Already a member? Log in · New here? Create a free account

PostgreSQL Crash Due to OOM Kill

What This Is

Why It Matters

Next Step

Scenario

Investigation Path

Scenario

How to Identify

Analysis Steps

Pitfalls

Resolution Approach

Career Impact

Related & next steps

More like this

Don't get paged twice for the same bug.

Scenario

Investigation Path

Scenario

How to Identify

Analysis Steps

Pitfalls

Resolution Approach

Related Knowledge

Related Interview Questions

Career Impact

Related & next steps

lan More like this

Don't get paged twice for the same bug.

More like this