SQLSTATE 22021 ERROR Class 22: Data Exception

character_not_in_repertoire invalid byte sequence for encoding “UTF8” — 22021

PostgreSQL error “invalid byte sequence for encoding "UTF8"” (SQLSTATE 22021): what it means, common causes, and how to fix it.

PG 9.6, 10, 11, 12, 13, 14, 15, 16, 17, 18
Last reviewed Jun 2026

Symptoms

Input contained a byte sequence that is not valid in the database’s server encoding. PostgreSQL raises SQLSTATE 22021 (character_not_in_repertoire).

  • A byte sequence in the input isn’t valid in the server encoding.
  • Most common when loading data that was written in a different encoding.
  • The error message shows the offending byte(s) in hex.

What the server log shows

ERROR:  invalid byte sequence for encoding "UTF8": 0xa9

Why PostgreSQL raises this — what the manual says

Section 23.3.3 Automatic Character Set Conversion Between Server and Client:

“the server will still check that incoming data is valid for that encoding”

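PostgreSQL validates that incoming bytes form legal characters in the relevant encoding. When they do not, for example because the data is really LATIN1 while client_encoding says UTF8, the bytes cannot be interpreted as characters, so PostgreSQL reports 22021. The byte 0xa9 in the log line above is a typical case: it is © in LATIN1, but in UTF-8 it can only appear as a continuation byte after a lead byte, so on its own it is invalid.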
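The validity rule can be reproduced outside the database. The following is a minimal sketch using Python's strict UTF-8 decoder, which applies similar (not necessarily identical) rules to the server's check. The sample bytes and the helper name is_valid_utf8 are made up for illustration.

```python
# Illustration only: the sample input is hypothetical, encoded as LATIN1.
raw = b"Copyright \xa9 2024"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xa9 is a UTF-8 continuation byte with no lead byte before it.
print(is_valid_utf8(raw))

# Interpreted as LATIN1, the same byte is the copyright sign.
text = raw.decode("latin-1")
print(text)

# Re-encoded as UTF-8, the character becomes the two bytes 0xc2 0xa9.
print(is_valid_utf8(text.encode("utf-8")))
```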

Common causes

  • Loading data in a different encoding without conversion.
  • client_encoding not matching the actual data.
  • Binary/garbage bytes in a text field.
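To see which of these causes applies, it can help to locate the offending bytes in the input before loading it. The sketch below scans raw bytes line by line and reports where UTF-8 decoding first fails. The function name find_invalid_utf8_lines and the sample data are hypothetical, and it assumes the server encoding is UTF8.

```python
from typing import Iterator, Tuple


def find_invalid_utf8_lines(data: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (line_number, byte_offset_in_line, first_bad_byte) for each
    line that does not decode as UTF-8."""
    # Splitting on 0x0a is safe: that byte never occurs inside a
    # multi-byte UTF-8 sequence.
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the index of the first undecodable byte.
            yield lineno, exc.start, line[exc.start:exc.start + 1]


# Hypothetical CSV content where "café" was written as LATIN1.
sample = b"id,name\n1,caf\xe9\n2,ok\n"
for lineno, offset, bad in find_invalid_utf8_lines(sample):
    print(f"line {lineno}, offset {offset}: 0x{bad.hex()}")
```

For a real file, the bytes would come from reading it in binary mode, for example open(path, "rb").read(). A bad byte in the 0x80 to 0xff range surrounded by otherwise readable text usually points to a single-byte source encoding such as LATIN1 or WIN1252 rather than to binary garbage.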

How to fix it

  1. Set client_encoding to the data’s actual encoding so the server converts it on input.
  2. Convert files to the server encoding before loading (e.g. with iconv).
  3. Strip or replace invalid bytes before loading, if the source encoding cannot be determined.
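Fixes 2 and 3 can be sketched in a few lines of Python, assuming a UTF8 server encoding. The function names to_utf8 and strip_invalid_utf8 are hypothetical helpers, not part of any PostgreSQL tooling. The first does roughly what iconv -f SOURCE -t UTF-8 does; the second discards data and is only appropriate when the original encoding is unknown.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def strip_invalid_utf8(data: bytes, replacement: bool = False) -> bytes:
    """Drop bytes that are not valid UTF-8, or substitute U+FFFD for them
    when replacement is True. This is lossy."""
    errors = "replace" if replacement else "ignore"
    return data.decode("utf-8", errors=errors).encode("utf-8")


latin1_bytes = b"caf\xe9!"  # hypothetical input: "café!" encoded as LATIN1

print(to_utf8(latin1_bytes, "latin-1"))             # lossless conversion
print(strip_invalid_utf8(latin1_bytes))             # bad byte removed
print(strip_invalid_utf8(latin1_bytes, True))       # bad byte replaced
```

Prefer conversion whenever the source encoding is known, since stripping silently changes the data. Note that this sketch does not remove NUL bytes (0x00): they are valid UTF-8 as far as Python is concerned, but PostgreSQL rejects them in text values.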

Related & next steps

Reference: PostgreSQL documentation, “Character Set Support” (Section 23.3 or 24.3, depending on the release).
