invalid byte sequence for encoding "…": … — SQLSTATE 22021 | PostgreSQL Error Reference


Symptoms

Data arriving at the server contains bytes that are not valid in the declared encoding (the session's client_encoding, or the ENCODING given to COPY). PostgreSQL raises SQLSTATE 22021 (character_not_in_repertoire) and reports the offending byte sequence.

Most often seen during COPY or other bulk imports of files whose declared encoding is wrong.
The message names the offending byte sequence, e.g. 0x80.
The actual bytes in the file do not match client_encoding.

What the server log shows

ERROR:  invalid byte sequence for encoding "UTF8": 0x80

Why PostgreSQL raises this — what the manual says

Section 23.3.3 Automatic Character Set Conversion Between Server and Client:

“the server will still check that incoming data is valid for that encoding”

PostgreSQL checks that incoming bytes are well-formed in the declared client encoding, even when no conversion to the database encoding is needed. Bytes that are not well-formed (e.g. Latin-1 or Windows-1252 data labelled as UTF-8, where 0x80 is a continuation byte that cannot start a character) cannot be decoded, and the statement fails with 22021.
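The failure can be reproduced outside the database. The following is a minimal sketch that uses Python's strict UTF-8 decoder as a stand-in for the server's check; it is not PostgreSQL's own validator, and the helper name first_invalid_utf8 is made up for this illustration.

```python
# Windows-1252 encodes the euro sign (U+20AC) as the single byte 0x80.
raw = "\u20ac5".encode("cp1252")  # b'\x805'


def first_invalid_utf8(data: bytes):
    """Return (offset, byte) of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


bad = first_invalid_utf8(raw)
if bad is not None:
    offset, byte = bad
    print(f"invalid byte at offset {offset}: 0x{byte:02x}")

# The same bytes decode cleanly once the true encoding is declared.
assert raw.decode("cp1252") == "\u20ac5"
```

The bytes themselves are not corrupt; they are only invalid under the wrong label, which is why declaring the real encoding resolves the error.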

Common causes

Importing a Latin-1/Windows-1252 file as UTF-8.
A wrong client_encoding for the data being sent.
Mixed-encoding data in one file.
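To tell a wholly mislabelled file from one with mixed encodings, it can help to list which lines fail UTF-8 decoding before loading. The sketch below assumes newline-delimited data; find_non_utf8_lines and the sample rows are hypothetical, and the byte offsets Python reports may not match the sequence PostgreSQL prints.

```python
def find_non_utf8_lines(data: bytes):
    """Yield (line_number, byte_offset_in_line, bad_byte) for lines that are not valid UTF-8."""
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            yield lineno, exc.start, line[exc.start]


# Sample input: two UTF-8 lines followed by one Latin-1 line.
sample = (
    "id,name\n".encode("utf-8")
    + "1,Zo\u00eb\n".encode("utf-8")      # valid UTF-8 row
    + "2,Ren\u00e9\n".encode("latin-1")   # Latin-1 row: e-acute is the single byte 0xE9
)

report = list(find_non_utf8_lines(sample))
for lineno, col, byte in report:
    print(f"line {lineno}, byte offset {col}: 0x{byte:02x}")
```

If only a few lines are reported, the file is probably mixed; if nearly every line with non-ASCII text is reported, the whole file is more likely in a different encoding. For a real file, the bytes could be read with pathlib.Path(...).read_bytes().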

How to fix it

Set client_encoding to the encoding the data is actually in, e.g. SET client_encoding = 'LATIN1'; (or convert the file to UTF-8 first).
For COPY, specify the file’s encoding: COPY t FROM '…' WITH (ENCODING 'WIN1252');.
Clean/convert the source data with iconv before loading.
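Where iconv is not available, the conversion step can be approximated in Python. This is a rough sketch rather than a drop-in replacement: transcode_to_utf8 and the file names are invented for the example, the source encoding (cp1252 here) is an assumption you must confirm for your own data, and the whole file is read into memory.

```python
from pathlib import Path
import tempfile


def transcode_to_utf8(src: Path, dst: Path, source_encoding: str) -> None:
    """Rewrite src as UTF-8; raises if a byte is not defined in source_encoding."""
    text = src.read_bytes().decode(source_encoding)  # strict decoding by default
    dst.write_bytes(text.encode("utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "legacy.csv"       # hypothetical input file
    dst = Path(tmp) / "legacy.utf8.csv"  # hypothetical output file
    src.write_bytes(b"price\n\x805\n")   # 0x80 is the euro sign in Windows-1252
    transcode_to_utf8(src, dst, "cp1252")
    converted = dst.read_bytes()
    print(converted)
```

Strict decoding is deliberate: if the guessed source encoding is wrong for some byte, the script fails instead of silently writing replacement characters into the file you are about to load.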

Related & next steps

Reference: PostgreSQL 18 Section 23.3 “Character Set Support”.
