OTLP Receiver Back-Pressure
When the GlassFlow OTLP receiver cannot keep up with incoming data, it signals back-pressure to the upstream OTel Collector or SDK instead of silently dropping events or consuming unbounded memory.
How Back-Pressure Works
The receiver has two independent back-pressure gates:
Concurrency Limit
Each receiver pod enforces a maximum number of concurrent in-flight requests. When that limit is reached, any new request is rejected immediately with:
- HTTP — `429 Too Many Requests` + `Retry-After: 1`
- gRPC — `ResourceExhausted`
OTel Collector exporters and SDK exporters treat both responses as retryable and will back off automatically.
NATS Stream Full
Each pipeline’s internal NATS stream has a configurable capacity. When the stream is full — typically because the sink is writing to ClickHouse more slowly than the receiver is ingesting — publish retries are exhausted and the request is rejected with the same signals:
- HTTP — `429 Too Many Requests` + `Retry-After: 1`
- gRPC — `ResourceExhausted`
Both back-pressure conditions surface as the same protocol-level response (429 / ResourceExhausted), so upstream exporters handle them identically. The distinction matters only for tuning — see Configuration below.
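On the wire, a rejected HTTP export is just the status and header described above. A minimal sketch (the exact protocol version and any response body depend on your setup):

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 1
```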
What the Collector Sees
Standard OTel Collector exporters (otlphttp, otlp) retry on both 429 and ResourceExhausted by default. You do not need to change your Collector config to handle back-pressure — the Collector will queue and retry the failed batch automatically.
If your Collector’s retry queue is exhausted (e.g. the back-pressure is sustained for longer than the Collector’s configured retry_on_failure.max_elapsed_time), the Collector will drop the batch and log an error. Monitor your Collector’s otelcol_exporter_send_failed_spans / _logs / _metric_points metrics to detect sustained back-pressure.
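You can give the Collector more headroom during sustained back-pressure by enlarging the exporter's queue and retry window. A minimal sketch for the otlphttp exporter (the endpoint is a placeholder and the values are illustrative; check the exporterhelper options for your Collector version):

```yaml
exporters:
  otlphttp:
    endpoint: https://<your-glassflow-otlp-endpoint>  # placeholder
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s  # how long to keep retrying a batch before it is dropped
    sending_queue:
      enabled: true
      queue_size: 5000        # batches buffered while the receiver applies back-pressure
```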
Configuration
Concurrency limit
Controls the maximum number of requests processed simultaneously per receiver pod. Increase this if you have spare CPU and memory headroom on the pod; decrease it to protect a smaller pod.
```yaml
# values.yaml
sources:
  otlpReceiver:
    maxConcurrentRequests: 50  # default
```

| Env var | Default | Description |
|---|---|---|
| `GLASSFLOW_OTLP_MAX_CONCURRENT_REQUESTS` | 50 | Max concurrent in-flight batches per pod |
Sizing guidance — At 50 concurrent slots with the default 4 MiB request body limit, peak memory per pod is roughly 200 MiB, which fits within the default pod limit of 512 MiB. If you increase maxConcurrentRequests, increase the pod memory limit proportionally:
`peak_memory ≈ maxConcurrentRequests × maxBodyBytes`
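For example, doubling the limit to 100 slots gives a peak of roughly 400 MiB (100 × 4 MiB), so the pod memory limit should be raised to match. A sketch, assuming the default 4 MiB body limit (how the memory limit itself is set depends on how your chart exposes pod resources):

```yaml
# values.yaml (illustrative sizing)
sources:
  otlpReceiver:
    maxConcurrentRequests: 100  # peak memory ≈ 100 × 4 MiB ≈ 400 MiB; raise the pod memory limit accordingly
```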
NATS chunk size
Controls how many messages are published to NATS per async-publish round. This bounds the number of in-flight NATS futures and caps per-request memory inside the receiver.
```yaml
# values.yaml
sources:
  otlpReceiver:
    natsChunkSize: 1000  # default
```

| Env var | Default | Description |
|---|---|---|
| `GLASSFLOW_OTLP_NATS_CHUNK_SIZE` | 1000 | Messages per NATS publish chunk |
Pipeline stream capacity
The pipeline’s internal NATS stream buffer controls how much data can be queued between the receiver and the sink before back-pressure is triggered. Configure this on the pipeline resource:
"resources": {
"maxMsgs": 100000,
"maxBytes": 0
}A larger stream absorbs more burst traffic before back-pressure kicks in. See the Pipeline Configuration Reference for full details.
Resolving Sustained Back-Pressure
Back-pressure is a signal that your sink is slower than your ingest rate. To resolve it:
Scale up the sink
Increase the sink replica count so more ClickHouse writers are running in parallel. Set resources.sink.replicas when creating or updating the pipeline via the API:
"resources": {
"sink": {
"replicas": 3
}
}The default is 1. You can update an existing pipeline by sending a PATCH /api/v1/pipeline/{id} request with the new resources value.
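A minimal sketch of that update call, assuming the body shape shown above (host, authentication, and any other required fields depend on your deployment):

```http
PATCH /api/v1/pipeline/{id}
Content-Type: application/json

{
  "resources": {
    "sink": {
      "replicas": 3
    }
  }
}
```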
Scale up ClickHouse capacity
If the sink replicas are already at a reasonable count but ClickHouse insert latency is high, increase ClickHouse resources or consider using a MergeTree table with async inserts to reduce insert pressure.
Scale out receiver pods
If back-pressure is hitting the concurrency limit (not the stream), add more receiver pods to distribute the ingest load:
```yaml
# values.yaml
sources:
  otlpReceiver:
    replicas: 3
```

Scaling out receiver pods increases total ingest throughput but does not help if the bottleneck is a full NATS stream — all pods share the same stream and will all hit back-pressure at the same point. Fix the downstream sink first.
Monitoring Back-Pressure
The following metrics help identify which gate is triggering back-pressure. All are emitted by the OTLP receiver pod.
| Metric | Description |
|---|---|
| `gfm_receiver_request_count{status="error"}` | Requests that failed — includes both concurrency-limit and stream-full rejections |
| `gfm_receiver_request_duration_seconds` | Request latency; rising p99 often precedes sustained back-pressure |
The gfm_ingestor_backpressure_* and gfm_stream_depth* metrics are emitted by the Kafka ingestor component, which is not present in OTLP source pipelines. They do not apply here.
To distinguish concurrency-limit back-pressure from stream-full back-pressure, check the receiver pod logs — each rejection is logged with a reason field (overloaded or stream_backpressure).
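If these metrics are scraped into Prometheus, a simple alert on sustained rejections catches both gates. A sketch (the alert name, threshold, and windows are illustrative):

```yaml
groups:
  - name: glassflow-otlp-backpressure
    rules:
      - alert: OTLPReceiverBackpressure
        # any steady stream of rejected export requests over a 10-minute window
        expr: 'sum(rate(gfm_receiver_request_count{status="error"}[5m])) > 0'
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "GlassFlow OTLP receiver is rejecting export requests (back-pressure)"
```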
See the Metrics Reference for the full list of available metrics and how to access them.