invalid byte sequence for encoding “UTF8”: 0x…

SQLSTATE 22021 · condition character_not_in_repertoire · class 22 (Data Exception) · severity ERROR
Reproduced & verified on PostgreSQL 14.23, 15.18, 16.14, 17.10 and 18.4 — identical message on every version.
Last reviewed 30 May 2025 · Reproduced live with the SQL on this page.

Symptoms

The input contains a byte sequence that is not valid in the UTF8 encoding, so PostgreSQL rejects the statement with SQLSTATE 22021 (character_not_in_repertoire).

What the server log shows

ERROR:  invalid byte sequence for encoding "UTF8": 0xe9

Why PostgreSQL raises this — what the manual says

Section 23.3.3 Automatic Character Set Conversion Between Server and Client:

“the server will still check that incoming data is valid for that encoding”

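A UTF8 database validates that incoming bytes form legal UTF8. A byte from a single-byte encoding often does not: 0xe9 is “é” in Latin-1, but in UTF8 it is the lead byte of a three-byte sequence and must be followed by two continuation bytes (0x80 to 0xbf). When those bytes are missing, PostgreSQL rejects the input with 22021.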
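The validity rule can be illustrated outside the database. The sketch below uses Python's strict UTF-8 decoder as a stand-in for the server-side check; it is not PostgreSQL's own code, and the helper name is_valid_utf8 is hypothetical.

```python
# Illustrative only: Python's strict UTF-8 decoder stands in for the
# server's validation. is_valid_utf8 is a hypothetical helper name.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1_bytes = "é".encode("latin-1")  # b'\xe9': one byte in Latin-1
utf8_bytes = "é".encode("utf-8")      # b'\xc3\xa9': two bytes in UTF-8

print(is_valid_utf8(latin1_bytes))  # False: lead byte with no continuation bytes
print(is_valid_utf8(utf8_bytes))    # True
```

Running a check like this over a file before loading it shows whether the data is already UTF8 or needs conversion first.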

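Common causes

  1. The client sends data in another encoding (e.g. Latin-1) while client_encoding is set to UTF8, so the server does not convert it.
  2. A file being loaded was saved in an encoding other than UTF8.
  3. The input contains stray or corrupted bytes that are not valid UTF8.
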
How to fix it

  1. Set client_encoding to the encoding the data is actually in, so the server converts it to UTF8 on input (e.g. SET client_encoding = 'LATIN1';).
  2. Convert the file to UTF8 before loading (e.g. iconv -f latin1 -t utf8).
  3. Remove or replace invalid bytes before loading (e.g. iconv -c discards characters it cannot convert).
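
Fixes 2 and 3 can also be applied in a small pre-load script. The following is a minimal sketch, assuming the source data is Latin-1; the function names latin1_to_utf8 and drop_invalid_utf8 are hypothetical, and only the Python standard library is used.

```python
# Minimal sketch of fixes 2 and 3. The Latin-1 source encoding is an
# assumption, and both function names are hypothetical.

def latin1_to_utf8(raw: bytes) -> bytes:
    """Fix 2: interpret the bytes as Latin-1 text, then encode as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


def drop_invalid_utf8(raw: bytes) -> bytes:
    """Fix 3: keep valid UTF-8 and silently discard bytes that are not."""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


sample = b"caf\xe9"  # "café" as Latin-1 bytes

print(latin1_to_utf8(sample))     # b'caf\xc3\xa9'
print(drop_invalid_utf8(sample))  # b'caf'
```

Converting preserves the character, while dropping loses it, so cleaning is best kept for input whose original encoding cannot be determined.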

Related & next steps

Reference: PostgreSQL 18 Section 23.3 “Character Set Support”.