SQLSTATE 08006  |  ERROR  |  Class 08: Connection Exception

SQLSTATE 08006 connection_failure: Mid-Session Connection Drop

SQLSTATE 08006 connection_failure: diagnose mid-session connection drops caused by pgBouncer idle timeout, NAT gateway RST, tcp_keepalives misconfiguration, and idle_in_transaction_session_timeout.

PostgreSQL 12, 13, 14, 15, 16, 17, 18
Last reviewed: May 2026
Production impact: Medium  |  Competency: Connections & Pooling  |  Career: Resolve "too many connections" live  |  Frequency: Occasional

Symptoms

Conditions:

  • ERROR: connection to server was lost
  • SSL connection has been closed unexpectedly
  • Disconnects happen during idle periods (30 seconds to several minutes), not under load
  • pgBouncer logs show closing because: server idle timeout or server login failed
  • pg_stat_activity shows idle sessions that then disappear without a client-initiated close
  • On AWS RDS / Aurora: disconnects cluster near the 350-second NAT gateway idle boundary
  • Application connection pool reports stale connections needing recreation at a regular interval

Environment

Difficulty: Intermediate  |  PostgreSQL versions: 12, 13, 14, 15, 16, 17

Most common in cloud environments (RDS, Aurora, Supabase, Neon, Cloud SQL) where NAT gateways and load balancers silently drop idle TCP connections. Also common with pgBouncer in transaction or statement pooling mode.

Root Cause

08006 fires when the TCP connection to a PostgreSQL backend is severed mid-session — the connection existed and was valid, then died without a clean close. Unlike 08001 (could not establish connection), 08006 means you had a connection and lost it.

Root causes by layer:

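  • pgBouncer idle timeouts: pgBouncer closes pooled server connections that sit unused longer than server_idle_timeout and, when client_idle_timeout is non-zero, closes client connections that stay idle longer than that. A client whose connection was closed this way still believes it is alive and gets 08006 on its next query.
  • Firewall / NAT idle timeout: cloud NAT gateways and load balancers silently drop the state for TCP connections with no traffic, then typically answer the next packet with an RST. Idle timeouts: AWS NAT gateway 350 s, GCP Cloud NAT 1200 s (default), Azure 240 s (default). Neither PostgreSQL nor the client knows the connection is dead until the next write attempt or keepalive probe.
  • idle_in_transaction_session_timeout: PostgreSQL terminates sessions that hold an open transaction without activity beyond the configured timeout. The server reports the termination as SQLSTATE 25P03; a client that only notices the closed socket on its next statement may see 08006 instead.
  • TCP keepalive misconfiguration: on Linux the first keepalive probe is sent after 2 hours of idleness by default (tcp_keepalives_idle = 0 in PostgreSQL means "use the OS default"), far longer than the NAT/firewall idle timeouts above.
  • OOM killer: the OS kills a PostgreSQL backend under memory pressure. The postmaster treats this as a crash and resets every other session, so existing clients receive 08006 on their next interaction.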
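
When disconnects recur after a consistent idle interval, comparing that interval with the timeouts listed above can narrow down which layer is responsible. The following is a minimal sketch of that comparison, not a diagnostic tool: the table only holds the figures quoted on this page (600 s for pgBouncer is the value from the example config under Fix 2), and both the function name and the 10% tolerance are assumptions made for illustration.

```python
# Idle timeouts quoted in this article, in seconds. These are assumptions
# about a typical setup; replace them with the values you actually run.
KNOWN_IDLE_TIMEOUTS = {
    "Azure NAT / load balancer": 240,
    "AWS NAT gateway": 350,
    "pgBouncer server_idle_timeout": 600,
    "GCP Cloud NAT": 1200,
}


def candidate_causes(observed_idle_s, tolerance=0.10):
    """Return the layers whose timeout lies within `tolerance` (a fraction)
    of the idle time observed before the disconnect."""
    matches = []
    for layer, timeout_s in KNOWN_IDLE_TIMEOUTS.items():
        if abs(observed_idle_s - timeout_s) <= tolerance * timeout_s:
            matches.append(layer)
    return matches


if __name__ == "__main__":
    # Example: sessions that die after roughly 355 s of idleness.
    print(candidate_causes(355))
```

A match is only a hint. Confirm it against the pgBouncer log or the cloud provider's configuration before changing any timeout.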

Investigation

-- 1. Find long-idle connections
SELECT pid, usename, application_name, state,
       now() - state_change AS idle_for,
       wait_event_type, wait_event, client_addr
FROM   pg_stat_activity
WHERE  state IN ('idle', 'idle in transaction')
ORDER  BY idle_for DESC;

-- 2. Check current timeout settings
SHOW idle_in_transaction_session_timeout;
SHOW tcp_keepalives_idle;
SHOW tcp_keepalives_interval;
SHOW tcp_keepalives_count;

-- 3. Check connection counts vs limit
SELECT count(*) AS total,
       sum(CASE WHEN state = 'idle' THEN 1 ELSE 0 END) AS idle,
       (SELECT setting::int FROM pg_settings WHERE name = 'max_connections') AS max_conn
FROM   pg_stat_activity
WHERE  backend_type = 'client backend';

-- 4. On Linux: check OOM kills
-- sudo dmesg | grep -i "oom\|killed process\|postgres"

Fix Now

Fix 1 — Enable TCP keepalives on the PostgreSQL server (most universal fix):

# postgresql.conf
tcp_keepalives_idle     = 60    # start probing after 60 s idle
tcp_keepalives_interval = 10    # probe every 10 s
tcp_keepalives_count    = 5     # drop after 5 failed probes

-- Apply without restart (run from a SQL session, e.g. psql)
SELECT pg_reload_conf();
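
The three keepalive settings do two different jobs. The first probe must go out before the NAT idle timeout so the mapping stays alive, and the full probe sequence determines how long a dead peer goes unnoticed. A rough sketch of that arithmetic follows, assuming the usual Linux behaviour of one probe per interval after the idle period. The function names are illustrative, not part of any PostgreSQL API.

```python
def dead_peer_detection_seconds(idle, interval, count):
    """Approximate worst-case silence before the kernel gives up on a peer:
    the idle period, then `count` unanswered probes sent `interval` s apart."""
    return idle + interval * count


def first_probe_beats_timeout(idle, nat_timeout):
    """True if the first keepalive probe is sent before the NAT or firewall
    would drop the idle connection."""
    return idle < nat_timeout


if __name__ == "__main__":
    # Values from Fix 1: idle = 60, interval = 10, count = 5.
    print(dead_peer_detection_seconds(60, 10, 5))   # 110
    print(first_probe_beats_timeout(60, 240))       # True (Azure, 240 s)
    print(first_probe_beats_timeout(7200, 350))     # False (OS default vs AWS)
```

With the values from Fix 1, a dead connection is detected after about 110 seconds, and the first probe goes out well before the shortest timeout quoted above (240 s).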

Fix 2 — Align pgBouncer timeouts:

# pgbouncer.ini
server_idle_timeout = 600       # default; if a NAT sits between pgBouncer and PostgreSQL, set this below the NAT idle timeout
server_lifetime     = 3600
client_idle_timeout = 0         # let the client manage its own timeout

Fix 3 — Terminate stale in-transaction sessions and set a guard timeout:

-- Kill sessions stuck in open transactions
SELECT pg_terminate_backend(pid)
FROM   pg_stat_activity
WHERE  state = 'idle in transaction'
AND    now() - state_change > interval '5 minutes';

-- Prevent future occurrences
ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';
SELECT pg_reload_conf();

Resolution & Prevention

  • Set tcp_keepalives_idle = 60 in postgresql.conf, well below the NAT/firewall idle timeouts listed under Root Cause (240 s to 1200 s)
  • Set idle_in_transaction_session_timeout = '5min' to auto-close sessions that forget to COMMIT or ROLLBACK
  • Configure connection pool keepalive at the application level (HikariCP: keepaliveTime; pgJDBC: tcpKeepAlive=true)
  • On AWS RDS: review Security Group rules, and set tcp_keepalives_idle (through the DB parameter group) below 350 seconds, the NAT gateway idle timeout
  • Build retry logic in your application: catch 08006 / 08003 and retry on a fresh connection, with a small attempt limit and exponential backoff between attempts
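
The retry advice in the last bullet could look roughly like the sketch below. It is driver-agnostic and deliberately small. `connect` and `work` are hypothetical callables supplied by the application, and the SQLSTATE lookup assumes the driver exposes the code as `.sqlstate` (psycopg 3) or `.pgcode` (psycopg2).

```python
import time

# SQLSTATEs treated as "connection lost, safe to reconnect".
RETRYABLE_SQLSTATES = {"08006", "08003"}


def sqlstate_of(exc):
    """Best-effort SQLSTATE lookup; returns None if the driver sets neither."""
    return getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)


def run_with_retry(connect, work, max_attempts=3, base_delay=0.5,
                   sleep=time.sleep):
    """Run work(conn) on a fresh connection, retrying on connection loss.

    connect: hypothetical zero-argument factory returning a new connection.
    work:    hypothetical callable that runs one whole unit of work.
    """
    for attempt in range(1, max_attempts + 1):
        conn = None
        try:
            conn = connect()
            return work(conn)
        except Exception as exc:
            retryable = sqlstate_of(exc) in RETRYABLE_SQLSTATES
            if not retryable or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
        finally:
            if conn is not None:
                try:
                    conn.close()
                except Exception:
                    pass  # the connection may already be dead
```

Two caveats apply. Only retry work that is safe to run twice, because a connection can drop after the server has committed but before the client sees the result. Some drivers also raise a dropped connection without any SQLSTATE, in which case real code would need to match the driver's connection-error class as well.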
