SQLSTATE 08006 — Connection Failure | PostgreSQL Error Reference

Symptoms

Conditions:

ERROR: connection to server was lost
SSL connection has been closed unexpectedly
Disconnects happen during idle periods (30 seconds to several minutes), not under load
pgBouncer logs show closing because: server idle timeout or server login failed
pg_stat_activity shows idle sessions that then disappear without a client-initiated close
On AWS RDS / Aurora: disconnects cluster near the 350-second NAT gateway idle boundary
Application connection pool reports stale connections needing recreation at a regular interval

Environment

Difficulty: Intermediate | PostgreSQL versions: 12, 13, 14, 15, 16, 17

Most common in cloud environments (RDS, Aurora, Supabase, Neon, Cloud SQL) where NAT gateways and load balancers silently drop idle TCP connections. Also common with pgBouncer in transaction or statement pooling mode.

Root Cause

08006 fires when the TCP connection to a PostgreSQL backend is severed mid-session — the connection existed and was valid, then died without a clean close. Unlike 08001 (could not establish connection), 08006 means you had a connection and lost it.

Root causes by layer:

pgBouncer server_idle_timeout — pgBouncer drops idle server-side connections after a configurable timeout. If a client holds a session-mode connection that goes idle, pgBouncer closes the server side while the client still believes the connection is alive. On the next query the client gets 08006.
Firewall / NAT idle timeout — Cloud NAT gateways drop idle TCP flows from their tables without notifying either endpoint; a later packet on that flow is then discarded or answered with an RST. AWS default: 350s. GCP: 1200s. Azure: 240s. PostgreSQL does not know the connection is dead until the next write attempt.
idle_in_transaction_session_timeout — PostgreSQL kills sessions that hold an open transaction without activity beyond the configured timeout.
TCP keepalive misconfiguration — OS-level keepalives default to 2 hours, far longer than any NAT/firewall idle timeout.
OOM killer — OS kills a PostgreSQL backend under memory pressure; existing clients receive 08006 on the next interaction.
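The gap between the OS keepalive default and the NAT timeouts in the list above can be checked with simple arithmetic. The sketch below is illustrative only: the timeout figures are the ones quoted in this list, 7200 seconds is the Linux default behind the "2 hours" figure, and the function names are hypothetical.

```python
# NAT idle timeouts in seconds, as quoted in the root-cause list above.
NAT_IDLE_TIMEOUT_S = {"aws": 350, "gcp": 1200, "azure": 240}

# Linux default for the first keepalive probe (the "2 hours" mentioned above).
LINUX_DEFAULT_KEEPALIVE_IDLE_S = 7200


def first_probe_beats_nat(keepalive_idle_s: int, nat_timeout_s: int) -> bool:
    """True if the first keepalive probe is sent before the NAT entry expires."""
    return keepalive_idle_s < nat_timeout_s


def report(keepalive_idle_s: int) -> dict:
    """Map each provider to whether this keepalive idle setting is short enough."""
    return {
        provider: first_probe_beats_nat(keepalive_idle_s, timeout)
        for provider, timeout in NAT_IDLE_TIMEOUT_S.items()
    }


if __name__ == "__main__":
    print("OS default:", report(LINUX_DEFAULT_KEEPALIVE_IDLE_S))
    print("60 s idle: ", report(60))
```

With the OS default, no probe is sent before any of the listed NAT timeouts expires, which is why an explicit keepalive setting is needed.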

Investigation

-- 1. Find long-idle connections
SELECT pid, usename, application_name, state,
       now() - state_change AS idle_for,
       wait_event_type, wait_event, client_addr
FROM   pg_stat_activity
WHERE  state IN ('idle', 'idle in transaction')
ORDER  BY idle_for DESC;

-- 2. Check current timeout settings
SHOW idle_in_transaction_session_timeout;
SHOW tcp_keepalives_idle;
SHOW tcp_keepalives_interval;
SHOW tcp_keepalives_count;

-- 3. Check connection counts vs limit
SELECT count(*) AS total,
       sum(CASE WHEN state = 'idle' THEN 1 ELSE 0 END) AS idle,
       (SELECT setting::int FROM pg_settings WHERE name = 'max_connections') AS max_conn
FROM   pg_stat_activity
WHERE  backend_type = 'client backend';

-- 4. On Linux: check OOM kills
-- sudo dmesg | grep -i "oom\|killed process\|postgres"

Fix Now

Fix 1 — Enable TCP keepalives on the PostgreSQL server (most universal fix):

# postgresql.conf
tcp_keepalives_idle     = 60    # start probing after 60 s idle
tcp_keepalives_interval = 10    # probe every 10 s
tcp_keepalives_count    = 5     # drop after 5 failed probes

# Apply without restart (run as SQL, e.g. in psql)
SELECT pg_reload_conf();
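The same three values can also be set from the client side through libpq's keepalives, keepalives_idle, keepalives_interval and keepalives_count connection parameters. The sketch below builds such a connection string and estimates how long a dead peer takes to be noticed, roughly idle + interval × count. The host and database names are placeholders, and the helper names are hypothetical.

```python
def keepalive_params(idle_s: int = 60, interval_s: int = 10, count: int = 5) -> dict:
    """libpq keepalive parameters mirroring the postgresql.conf values above."""
    return {
        "keepalives": 1,                    # enable TCP keepalives on the client socket
        "keepalives_idle": idle_s,          # seconds of idle time before the first probe
        "keepalives_interval": interval_s,  # seconds between unanswered probes
        "keepalives_count": count,          # unanswered probes before the socket is dropped
    }


def detection_time_s(params: dict) -> int:
    """Approximate seconds until a silently dropped connection is detected."""
    return params["keepalives_idle"] + params["keepalives_interval"] * params["keepalives_count"]


def to_conninfo(base: str, params: dict) -> str:
    """Append keepalive settings to a libpq key=value connection string."""
    extras = " ".join(f"{key}={value}" for key, value in params.items())
    return f"{base} {extras}"


if __name__ == "__main__":
    params = keepalive_params()
    # Placeholder host and database; substitute your own.
    print(to_conninfo("host=db.example.internal dbname=app", params))
    print("approx. detection time (s):", detection_time_s(params))
```

With the values from Fix 1, the estimate is 60 + 10 × 5 = 110 seconds, which is below each of the NAT idle timeouts listed under Root Cause.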

Fix 2 — Align pgBouncer timeouts:

# pgbouncer.ini
server_idle_timeout = 600       # pgBouncer default; compare against the NAT idle timeout for your cloud provider
server_lifetime     = 3600
client_idle_timeout = 0         # let the client manage its own timeout

Fix 3 — Terminate stale in-transaction sessions and set a guard timeout:

-- Kill sessions stuck in open transactions
SELECT pg_terminate_backend(pid)
FROM   pg_stat_activity
WHERE  state = 'idle in transaction'
AND    now() - state_change > interval '5 minutes';

-- Prevent future occurrences
ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';
SELECT pg_reload_conf();

Resolution & Prevention

Set tcp_keepalives_idle = 60 in postgresql.conf — well below any NAT/firewall idle timeout on any cloud provider
Set idle_in_transaction_session_timeout = '5min' to auto-close sessions that forget to COMMIT or ROLLBACK
Configure connection pool keepalive at the application level (HikariCP: keepaliveTime, PostgreSQL JDBC driver: tcpKeepAlive=true)
On AWS RDS: review Security Group rules and set tcp_keepalives_idle below 350 seconds (the NAT gateway idle timeout)
Build retry logic in your application: catch 08006 / 08003 and retry on a fresh connection, using exponential backoff if more than one retry is allowed
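The retry advice in the last item could look roughly like the sketch below. It is driver-agnostic and makes several assumptions: connect and work are callables you supply, the exception exposes the SQLSTATE as .sqlstate (psycopg 3) or .pgcode (psycopg2), and run_with_reconnect is a hypothetical helper name. Some drivers may report a lost connection without any SQLSTATE; that case is not handled here. Retry only work that is safe to run twice.

```python
import time
from typing import Any, Callable, Optional

RETRYABLE_SQLSTATES = {"08006", "08003"}


def sqlstate_of(exc: BaseException) -> Optional[str]:
    """Read the SQLSTATE from a driver exception, if the driver provides one."""
    return getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)


def run_with_reconnect(
    connect: Callable[[], Any],
    work: Callable[[Any], Any],
    max_retries: int = 1,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run work(conn); on 08006/08003, retry on a fresh connection with backoff."""
    attempt = 0
    while True:
        conn = connect()  # always start each attempt on a new connection
        try:
            return work(conn)
        except Exception as exc:
            if sqlstate_of(exc) not in RETRYABLE_SQLSTATES or attempt >= max_retries:
                raise
            sleep(base_delay_s * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
            attempt += 1
        finally:
            close = getattr(conn, "close", None)
            if callable(close):
                try:
                    close()  # closing an already-dead connection may itself fail
                except Exception:
                    pass
```

The default of max_retries=1 matches a single retry; the backoff only becomes visible if that limit is raised.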