invalid byte sequence for encoding “UTF8”
SQLSTATE: 22021 · Condition: character_not_in_repertoire · Class: 22 (Data Exception) · Severity: ERROR
Reproduced & verified on PostgreSQL 14.23, 15.18, 16.14, 17.10 and 18.4 — identical message on every version.
Last reviewed 11 Jun 2026 · Reproduced live with the SQL on this page.
Symptoms
Input contained a byte sequence that is not valid in the database’s server encoding. PostgreSQL raises SQLSTATE 22021 (character_not_in_repertoire).
- A byte sequence isn’t valid in the server encoding.
- Common when loading data that is in a different encoding.
- The error message shows the offending byte or bytes in hex.
What the server log shows
ERROR: invalid byte sequence for encoding "UTF8": 0xa9
Why PostgreSQL raises this — what the manual says
Section 23.3.3 Automatic Character Set Conversion Between Server and Client:
“the server will still check that incoming data is valid for that encoding”
PostgreSQL validates that incoming bytes form legal characters in the relevant encoding. When they do not, typically because the data is in a different encoding from the one the client declared, the bytes can’t be represented as characters, so PostgreSQL reports 22021.
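As a rough client-side illustration of that check, the sketch below tests the byte from the log line above (0xa9) with Python's strict UTF-8 decoder. The helper name is_valid_utf8 is hypothetical, and Python's decoder only approximates the server's validation; it is not PostgreSQL's own code.

```python
# The byte reported in the server log above, as a one-byte payload.
payload = b"\xa9"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


# 0xa9 is a UTF-8 continuation byte, so on its own it is not a legal character.
standalone_ok = is_valid_utf8(payload)

# The same byte is the copyright sign in LATIN1 (ISO 8859-1).
as_latin1 = payload.decode("latin-1")

# Re-encoded as UTF-8, the copyright sign becomes the two bytes 0xc2 0xa9.
as_utf8 = as_latin1.encode("utf-8")
reencoded_ok = is_valid_utf8(as_utf8)
```

A lone 0xa9 in the message is therefore a strong hint, though not proof, that the data was written in a single-byte encoding such as LATIN1 or WIN1252.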
Common causes
- Loading data in a different encoding without conversion.
- client_encoding not matching the actual data.
- Binary/garbage bytes in a text field.
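To tell which of these causes applies, it can help to locate the offending bytes in the input before loading it. The following is a minimal sketch, assuming the server encoding is UTF8 and the data fits in memory. The function name find_invalid_utf8 is hypothetical, and the hex it reports may be formatted differently from the server's message.

```python
from typing import Iterator, Tuple


def find_invalid_utf8(data: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (offset, hex) for each byte run that is not valid UTF-8."""
    pos = 0
    while pos < len(data):
        try:
            # Strict decode of the remainder; success means no more bad bytes.
            data[pos:].decode("utf-8")
            return
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice just decoded.
            start = pos + exc.start
            end = pos + exc.end
            yield start, "0x" + data[start:end].hex()
            pos = end  # resume scanning after the bad run


# Sample input: LATIN1 "e acute" (0xe9) and a stray 0xa9 inside ASCII text.
sample = b"caf\xe9 \xa9 ok"
bad_runs = list(find_invalid_utf8(sample))
```

A few isolated hits in otherwise readable text usually point to a mislabelled single-byte encoding, while dense hits suggest binary data in a text field. The sketch re-slices the input after every hit, so it suits spot checks rather than very large files.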
How to fix it
- Set client_encoding to the source encoding so the server converts it.
- Convert files to the server encoding first (e.g. iconv).
- Strip invalid bytes before loading.
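The last two fixes can be sketched in a few lines of Python. This is an illustration, not a recommended pipeline: it assumes the server encoding is UTF8 and that the source encoding is already known (LATIN1 here). The names transcode_to_utf8 and strip_invalid_utf8 are hypothetical.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode raw from source_encoding to UTF-8, similar in spirit to iconv."""
    return raw.decode(source_encoding).encode("utf-8")


def strip_invalid_utf8(raw: bytes) -> bytes:
    """Drop byte runs that are not valid UTF-8. Lossy: characters are discarded."""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


# "cafe" with an e acute, as written by a LATIN1 application.
latin1_row = b"caf\xe9"

# Preferred: convert, so the accented character survives as 0xc3 0xa9.
converted = transcode_to_utf8(latin1_row, "latin-1")

# Last resort: strip, which silently loses the accented character.
stripped = strip_invalid_utf8(latin1_row)
```

Converting preserves the data, so it is usually the better choice when the source encoding can be identified. Stripping is only appropriate when the invalid bytes really are garbage.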
Related & next steps
Reference: PostgreSQL 18 Section 24.3 “Character Set Support”.