invalid byte sequence for encoding “…”: …

SQLSTATE 22021 · condition character_not_in_repertoire · class 22 (Data Exception) · severity ERROR
Reproduced & verified on PostgreSQL 14.23, 15.18, 16.14, 17.10 and 18.4 — identical message on every version.
Last reviewed 30 May 2025 · Reproduced live with the SQL on this page.

Symptoms

The client sends, or COPY reads, bytes that are not a valid sequence in the encoding PostgreSQL expects for them (normally the client encoding). PostgreSQL raises SQLSTATE 22021 (character_not_in_repertoire) and reports the offending byte or bytes in hexadecimal.

What the server log shows

ERROR:  invalid byte sequence for encoding "UTF8": 0x80

Why PostgreSQL raises this — what the manual says

Section 23.3.3 Automatic Character Set Conversion Between Server and Client:

“the server will still check that incoming data is valid for that encoding”

PostgreSQL checks that incoming bytes form valid sequences in the declared client encoding before converting them to the database encoding. Bytes that do not (for example, Latin-1 text labelled as UTF-8) cannot be decoded, and the statement fails with SQLSTATE 22021.
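
The check can be approximated outside the database. The Python sketch below is an illustration only: it uses Python's strict UTF-8 decoder, not PostgreSQL's validator, so the two may disagree on edge cases (PostgreSQL also rejects a zero byte in text, for instance). The function name first_invalid_utf8_byte is hypothetical.

```python
def first_invalid_utf8_byte(data: bytes):
    """Return (offset, byte value) where strict UTF-8 decoding fails, or None."""
    try:
        data.decode("utf-8")  # strict mode: raises on any illegal sequence
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


# 0x80 is a continuation byte with no lead byte before it.
print(first_invalid_utf8_byte(b"\x80"))                  # (0, 128)

# "café" encoded as Latin-1: the é is the single byte 0xE9, which UTF-8
# treats as the start of a three-byte sequence that never completes.
print(first_invalid_utf8_byte("café".encode("latin-1")))  # (3, 233)

# The same text encoded as UTF-8 passes the check.
print(first_invalid_utf8_byte("café".encode("utf-8")))    # None
```

Running a suspect file's bytes through a check like this before loading shows where the first bad byte sits, which helps in guessing the file's real encoding.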

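Common causes

  1. The client encoding does not match the data actually sent, for example Latin-1 bytes sent while client_encoding is UTF8.
  2. A file loaded with COPY is in a different encoding from the one COPY assumes (the client encoding, unless ENCODING is given).
  3. The source data contains corrupt or mixed-encoding bytes.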

How to fix it

  1. Set the client encoding to match the data: SET client_encoding = 'LATIN1'; Alternatively, convert the file to UTF-8 first.
  2. For COPY, state the file’s encoding explicitly: COPY t FROM '…' WITH (ENCODING 'WIN1252');
  3. Clean or convert the source data with iconv before loading.
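
Where iconv is not available, the conversion in step 3 can be sketched in Python. This is a minimal sketch under stated assumptions: the source encoding is already known (cp1252, Python's name for WIN1252, is only the default here), the file fits in memory, and the function name transcode_to_utf8 is hypothetical.

```python
from pathlib import Path


def transcode_to_utf8(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Rewrite the file at src as UTF-8 at dst, assuming src is in source_encoding."""
    raw = Path(src).read_bytes()
    # Strict decoding: raises UnicodeDecodeError if a byte is undefined in
    # source_encoding, rather than silently substituting a character.
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

A call such as transcode_to_utf8("export.csv", "export.utf8.csv"), with both file names hypothetical, produces a file that can then be loaded with client_encoding set to UTF8. If the decode step raises, the assumed source encoding is probably wrong.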

Related & next steps

Reference: PostgreSQL 18 Section 24.3 “Character Set Support”.