
ClickHouse Sink

Error Classification

When the sink fails to write a batch to ClickHouse, it classifies the error before deciding what to do next.

| Classification | Sink action | When |
|---|---|---|
| Retryable | NACK: NATS redelivers after a delay | Transient condition; same data would succeed once CH recovers |
| Permanent | DLQ: batch written to Dead Letter Queue | Data or schema problem; same message will fail again on retry |
| Unknown | DLQ (conservative) | Not yet classified; logged with needs_classification so gaps surface from real traffic |
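
The mapping in the table can be sketched as a small Go function. This is an illustrative sketch only: the type and function names (`Classification`, `Action`, `actionFor`) are hypothetical and are not taken from the sink's source code.

```go
package main

import "fmt"

// Classification mirrors the three categories in the table above.
type Classification int

const (
	Retryable Classification = iota
	Permanent
	Unknown
)

func (c Classification) String() string {
	switch c {
	case Retryable:
		return "Retryable"
	case Permanent:
		return "Permanent"
	default:
		return "Unknown"
	}
}

// Action is what the sink does with a failed batch (hypothetical names).
type Action int

const (
	Nack Action = iota // ask NATS to redeliver after a delay
	DLQ                // write the batch to the Dead Letter Queue
)

func (a Action) String() string {
	if a == Nack {
		return "NACK"
	}
	return "DLQ"
}

// actionFor maps a classification to a sink action.
// Unknown is handled like Permanent, which is the conservative choice.
func actionFor(c Classification) Action {
	if c == Retryable {
		return Nack
	}
	return DLQ
}

func main() {
	for _, c := range []Classification{Retryable, Permanent, Unknown} {
		fmt.Println(c, "->", actionFor(c))
	}
}
```

Only the retryable case results in a NACK; everything else falls through to the DLQ, so an unclassified error can never cause an endless redelivery loop.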

The delay between a NACK and redelivery is controlled by NatsConsumerNakDelay (default 5 s). NatsConsumerMaxDeliver (default 10) caps the number of delivery attempts. Once that limit is reached, NATS stops redelivering and the message is dead-lettered by NATS automatically.


Retryable errors

Transient conditions where the same message is expected to succeed once ClickHouse or the network recovers.

| Code | Name | Reason |
|---|---|---|
| 159 | TIMEOUT_EXCEEDED | Query timeout |
| 198 | DNS_ERROR | DNS resolution failure |
| 201 | QUOTA_EXPIRED | Quota exhausted; resets on schedule |
| 202 | TOO_MANY_SIMULTANEOUS_QUERIES | Server overloaded |
| 203 | NO_FREE_CONNECTION | Connection pool exhausted |
| 209 | SOCKET_TIMEOUT | Network timeout |
| 210 | NETWORK_ERROR | Network layer error |
| 236 | ABORTED | Server-initiated query abort |
| 241 | MEMORY_LIMIT_EXCEEDED | Transient resource pressure |
| 242 | TABLE_IS_READ_ONLY | Replica recovery in progress |
| 243 | NOT_ENOUGH_SPACE | Disk pressure (may clear) |
| 244 | UNEXPECTED_ZOOKEEPER_ERROR | Transient ZooKeeper/Keeper error |
| 254 | NO_ACTIVE_REPLICAS | All replicas temporarily down |
| 265 | NO_AVAILABLE_REPLICA | No replica available |
| 279 | ALL_CONNECTION_TRIES_FAILED | All replicas unreachable |
| 285 | TOO_LESS_LIVE_REPLICAS | Not enough live replicas for quorum |
| 286 | UNSATISFIED_QUORUM_FOR_PREVIOUS_WRITE | Previous write quorum not yet met |
| 289 | REPLICA_IS_NOT_IN_QUORUM | Replication lag |
| 290 | LIMIT_EXCEEDED | Rate or resource limit |
| 297 | SHARD_HAS_NO_CONNECTIONS | Shard connection pool empty |
| 364 | RECEIVED_ERROR_TOO_MANY_REQUESTS | HTTP 429 / CH rate limit |
| 384 | PART_IS_TEMPORARILY_LOCKED | Merge in progress, temporary lock |
| 999 | KEEPER_EXCEPTION | ClickHouse Keeper (ZooKeeper) exception |
| 1000 | POCO_EXCEPTION | Poco network/IO library exception |
| | Network/IO | io.EOF, io.ErrUnexpectedEOF, ECONNREFUSED, ECONNRESET, EPIPE, net timeout |

Permanent errors

Data or schema problems where retrying the same message will produce the same failure. Operator intervention is required.

| Code | Name | Reason |
|---|---|---|
| 6 | CANNOT_PARSE_TEXT | Malformed payload |
| 7 | INCORRECT_NUMBER_OF_COLUMNS | Schema mismatch |
| 16 | NO_SUCH_COLUMN_IN_TABLE | Column missing from table |
| 18 | CANNOT_INSERT_ELEMENT_INTO_CONSTANT_COLUMN | Bad data value |
| 20 | NUMBER_OF_COLUMNS_DOESNT_MATCH | Schema mismatch |
| 25 | CANNOT_PARSE_ESCAPE_SEQUENCE | Malformed payload |
| 26 | CANNOT_PARSE_QUOTED_STRING | Malformed payload |
| 27 | CANNOT_PARSE_INPUT_ASSERTION_FAILED | Malformed payload |
| 38 | CANNOT_PARSE_DATE | Bad date value in payload |
| 41 | CANNOT_PARSE_DATETIME | Bad datetime value in payload |
| 43 | ILLEGAL_TYPE_OF_ARGUMENT | Type mismatch |
| 44 | ILLEGAL_COLUMN | Column issue |
| 47 | UNKNOWN_IDENTIFIER | Unknown column reference |
| 53 | TYPE_MISMATCH | Type mismatch |
| 60 | UNKNOWN_TABLE | Table does not exist |
| 72 | CANNOT_PARSE_NUMBER | Bad numeric value in payload |
| 80 | INCORRECT_QUERY | Malformed query |
| 81 | UNKNOWN_DATABASE | Wrong database in connection config |
| 117 | INCORRECT_DATA | Bad data value |
| 164 | READONLY | ClickHouse in readonly mode |
| 192 | UNKNOWN_USER | Auth: user doesn’t exist |
| 193 | WRONG_PASSWORD | Auth: wrong password |
| 194 | REQUIRED_PASSWORD | Auth: password required |
| 195 | IP_ADDRESS_NOT_ALLOWED | Auth: IP not in allowlist |
| 291 | DATABASE_ACCESS_DENIED | Permission denied on database |
| 321 | VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE | Value out of range for column type |
| 349 | CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN | NULL into NOT NULL column |
| 392 | QUERY_IS_PROHIBITED | Query type prohibited by server policy |
| 516 | AUTHENTICATION_FAILED | Authentication failure |

Code 60 (UNKNOWN_TABLE) is classified as permanent. During a live schema migration the table may briefly not exist; if this causes unexpected DLQ traffic, pause the pipeline until the migration completes.


Unknown errors

Any error code not in the lists above is classified as unknown and treated conservatively as permanent — the batch is written to the DLQ. The sink logs these with a needs_classification attribute so they can be identified from metrics or logs and added to the appropriate list.

To add a new code, add one line to internal/sink/errors/classification.go:

// in retryableCodes or permanentCodes map
int32(chproto.ErrNewCode): {}, // NNN - reason
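
To show how such maps might drive the lookup, including the conservative fallback for unknown codes, here is a minimal sketch. It is not the contents of classification.go: the map names come from the snippet above, but the function name `classifyCode`, the `Classification` type, and the log format are assumptions. Literal codes replace the `chproto` constants so that the example builds with the standard library alone, and only a few codes from each table are included.

```go
package main

import (
	"fmt"
	"log"
)

// Classification is a hypothetical result type for this sketch.
type Classification string

const (
	Retryable Classification = "retryable"
	Permanent Classification = "permanent"
	Unknown   Classification = "unknown"
)

// A small subset of the retryable table, keyed by ClickHouse error code.
var retryableCodes = map[int32]struct{}{
	159: {}, // TIMEOUT_EXCEEDED
	202: {}, // TOO_MANY_SIMULTANEOUS_QUERIES
	241: {}, // MEMORY_LIMIT_EXCEEDED
}

// A small subset of the permanent table.
var permanentCodes = map[int32]struct{}{
	6:   {}, // CANNOT_PARSE_TEXT
	60:  {}, // UNKNOWN_TABLE
	516: {}, // AUTHENTICATION_FAILED
}

// classifyCode looks the code up in both maps. Codes found in neither are
// logged with needs_classification so they can be triaged later.
func classifyCode(code int32) Classification {
	if _, ok := retryableCodes[code]; ok {
		return Retryable
	}
	if _, ok := permanentCodes[code]; ok {
		return Permanent
	}
	log.Printf("clickhouse error code=%d needs_classification=true", code)
	return Unknown
}

func main() {
	for _, code := range []int32{159, 60, 12345} {
		fmt.Println(code, classifyCode(code))
	}
}
```

With this layout, classifying a new code is a one-line change to one of the maps, which matches the workflow described above.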