invalid byte sequence for encoding “UTF8”: 0x…

SQLSTATE: 22021 · Condition: character_not_in_repertoire · Class: 22 (Data Exception) · Severity: ERROR

Reproduced & verified on PostgreSQL 14.23, 15.18, 16.14, 17.10 and 18.4 — identical message on every version.

Last reviewed 30 May 2025 · Reproduced live with the SQL on this page.

Symptoms

Input contained a byte sequence that is not valid in the UTF8 encoding. PostgreSQL raises SQLSTATE 22021 (character_not_in_repertoire).

The rejected bytes do not form a valid UTF8 sequence.
Common when loading data that is actually in a different encoding (e.g. Latin-1).
The message ends with the offending byte or bytes in hex (e.g. 0xe9).

What the server log shows

ERROR:  invalid byte sequence for encoding "UTF8": 0xe9

Why PostgreSQL raises this — what the manual says

Section 23.3.3 Automatic Character Set Conversion Between Server and Client:

“the server will still check that incoming data is valid for that encoding”

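A UTF8 database validates that incoming bytes form legal UTF8. A byte from another encoding is usually not legal on its own: 0xe9 is “é” in Latin-1, but in UTF8 it is the lead byte of a three-byte sequence and must be followed by two continuation bytes (0x80 to 0xbf). When they are missing, PostgreSQL rejects the input with 22021.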
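The difference between the two encodings can be seen outside the database. The following is a minimal sketch that uses Python's strict UTF-8 decoder as a stand-in for the server's check. It is an approximation: the helper name is_valid_utf8 is hypothetical, and the two validators are not guaranteed to agree on every edge case (PostgreSQL, for example, also rejects NUL bytes in text).

```python
# 0xE9 is "é" in Latin-1, but in UTF-8 it is the lead byte of a
# three-byte sequence and cannot stand alone.
latin1_bytes = b"caf\xe9"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Interpreting the bytes as Latin-1 and re-encoding gives legal UTF-8.
text = latin1_bytes.decode("latin-1")   # 'café'
utf8_bytes = text.encode("utf-8")       # b'caf\xc3\xa9'

print(is_valid_utf8(latin1_bytes))  # False
print(is_valid_utf8(utf8_bytes))    # True
```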

Common causes

Loading Latin-1/Windows-1252 data into a UTF8 database without conversion.
client_encoding says UTF8 (or defaults to it) while the client actually sends bytes in another encoding.
Binary/garbage bytes in a text field.
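To tell these causes apart, it can help to locate the offending bytes in the input before loading it. The sketch below assumes the data fits in memory; find_invalid_utf8 is a hypothetical helper, and it relies on Python's strict decoder rather than PostgreSQL's own validator. A few isolated high bytes such as 0xe9 usually point to Latin-1 or Windows-1252 text, while scattered control or 0xff bytes suggest binary data.

```python
from typing import Iterator, Tuple


def find_invalid_utf8(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, offending_bytes) for each span that fails strict UTF-8 decoding."""
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            return  # the remainder is valid UTF-8
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice we decoded.
            start = pos + exc.start
            end = pos + exc.end
            yield start, data[start:end]
            pos = end  # resume scanning after the bad span


sample = b"caf\xe9 au lait \xff"
for offset, bad in find_invalid_utf8(sample):
    print(offset, "0x" + bad.hex())
```

Re-slicing the input after every error keeps the sketch short but copies the remaining data each time, so a streaming approach would be preferable for large files.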

How to fix it

Set client_encoding to the source encoding so the server converts it (e.g. SET client_encoding = 'LATIN1';).
Convert the file to UTF8 first (e.g. iconv -f latin1 -t utf8).
If the data contains genuinely invalid bytes rather than text in another encoding, remove or replace them before loading.
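The second and third fixes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended tool: to_utf8 and scrub_invalid_utf8 are hypothetical helper names, the source encoding is assumed to be known in advance, and the scrubbing step is lossy because every undecodable span becomes U+FFFD (the Unicode replacement character).

```python
def to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode raw from source_encoding to UTF-8, similar in effect to iconv."""
    return raw.decode(source_encoding).encode("utf-8")


def scrub_invalid_utf8(raw: bytes) -> bytes:
    """Replace each undecodable byte span with U+FFFD. Lossy; use as a last resort."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


print(to_utf8(b"caf\xe9"))                # b'caf\xc3\xa9'
print(scrub_invalid_utf8(b"caf\xe9 ok"))  # b'caf\xef\xbf\xbd ok'
```

Converting from the real source encoding preserves the text, so it is generally preferable to scrubbing whenever that encoding can be identified.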

Related & next steps

Reference: PostgreSQL 18 Section 23.3 “Character Set Support”.