Diagnostic Queries
Symptoms
Input contained a byte sequence that is not valid in the database’s server encoding. PostgreSQL raises SQLSTATE 22021 (character_not_in_repertoire).
- A byte sequence isn’t valid in the server encoding.
- Common when loading data that is in a different encoding.
- The error message shows the offending byte(s) in hex.
What the server log shows
ERROR: invalid byte sequence for encoding "UTF8": 0xa9
Why PostgreSQL raises this — what the manual says
Section 23.3.3 Automatic Character Set Conversion Between Server and Client:
“the server will still check that incoming data is valid for that encoding”
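PostgreSQL validates that incoming bytes form legal characters in the relevant encoding. A byte sequence that fails this check cannot be stored as text, so PostgreSQL rejects it with SQLSTATE 22021. This typically happens when the data is really in one encoding (for example LATIN1) but is sent as if it were already in the server encoding.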
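The check can be approximated outside the database. The following is a minimal Python sketch, not PostgreSQL's own validator: it assumes the server encoding is UTF8, and the helper name `first_invalid_utf8` is hypothetical. It shows why a LATIN1-encoded "©" (the single byte 0xa9) is rejected as UTF-8. Python may report an invalid sequence slightly differently from the server, so treat the offset as a hint for locating the bad data.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, hex of offending byte) for the first invalid
    UTF-8 sequence in data, or None if the data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset where decoding failed.
        return exc.start, f"0x{data[exc.start]:02x}"
    return None


# In LATIN1 the copyright sign is the single byte 0xa9, which is only a
# continuation byte in UTF-8 and therefore cannot start a character.
latin1_bytes = "Copyright ©".encode("latin-1")
utf8_bytes = "Copyright ©".encode("utf-8")

print(first_invalid_utf8(latin1_bytes))  # (10, '0xa9')
print(first_invalid_utf8(utf8_bytes))    # None
```

Running a check like this over a file before loading it can locate the bad rows without a round trip to the server.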
Common causes
- Loading data in a different encoding without conversion.
- client_encoding not matching the actual data.
- Binary/garbage bytes in a text field.
How to fix it
- Set client_encoding to the source encoding so the server converts it.
- Convert files to the server encoding first (e.g. iconv).
- Strip invalid bytes before loading.
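The last two fixes can be sketched in Python as an alternative to iconv. This is an illustration under stated assumptions: the server encoding is UTF8, the source encoding is known to be LATIN1, and the helper names `to_utf8` and `strip_invalid_utf8` are hypothetical. Stripping is lossy, so converting from the correct source encoding is usually preferable when that encoding is known.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Transcode raw bytes from a known source encoding to UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def strip_invalid_utf8(raw: bytes) -> bytes:
    """Drop any bytes that do not form valid UTF-8 (lossy)."""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


# A sample row as it might appear in a LATIN1-encoded CSV file.
latin1_row = b"caf\xe9,\xa9 2024\n"

print(to_utf8(latin1_row, "latin-1"))  # b'caf\xc3\xa9,\xc2\xa9 2024\n'
print(strip_invalid_utf8(latin1_row))  # b'caf, 2024\n'
```

The second call shows the cost of stripping: the "é" and "©" characters are silently lost, whereas the conversion preserves them.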
Related & next steps
Reference: PostgreSQL 18 Section 23.3 “Character Set Support”.