GlassFlow Metrics
This guide describes the metrics GlassFlow exports, the labels attached to them, and monitoring best practices.
- Metrics are enabled by default and available at the OTEL collector endpoint
- All backend metrics follow the Prometheus format and can be scraped by Prometheus
- Metrics include component-specific labels for detailed monitoring
- The OpenTelemetry (OTEL) collector exposes metrics via its Prometheus exporter
- UI metrics are exported via OTLP, not Prometheus scraping
Metrics Overview
GlassFlow exports comprehensive metrics in Prometheus format through an OpenTelemetry collector. The metrics are designed to provide visibility into:
- Data Ingestion: Kafka record consumption rates and volumes
- Data Processing: Processing duration, throughput, and byte volume metrics
- Data Sinking: ClickHouse write operations and performance
- Error Handling: Dead Letter Queue (DLQ) operations
- OTLP Receiver: Incoming OTLP request rates and latency
- HTTP Server: API request rates and latency
- UI: Page views, interactions, and frontend API performance
Metric Naming Convention
All GlassFlow backend metrics follow a consistent naming pattern:
{namespace}_gfm_{metric_name}
Where:
- {namespace} - Deployment namespace prefix (e.g., “glassflow” if deployed in the glassflow namespace)
- gfm - GlassFlow Metrics prefix
- {metric_name} - Descriptive metric name
UI metrics use the prefix gfm_ui_ and are exported via OTLP.
The namespace prefix is automatically added based on your deployment configuration. If you deploy GlassFlow in a different namespace, the prefix will change accordingly.
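The naming pattern can be sketched as two small helpers. This is only an illustration of the convention described above: the function names are hypothetical, and the sketch assumes the namespace is already a valid Prometheus name fragment.

```python
def backend_metric_name(namespace: str, metric_name: str) -> str:
    """Build a backend metric name following {namespace}_gfm_{metric_name}."""
    return f"{namespace}_gfm_{metric_name}"


def ui_metric_name(metric_name: str) -> str:
    """Build a UI metric name; UI metrics use the fixed gfm_ui_ prefix."""
    return f"gfm_ui_{metric_name}"


# Hypothetical usage with the namespace used throughout this guide.
print(backend_metric_name("glassflow", "kafka_records_read_total"))
print(ui_metric_name("page_views_total"))
```

Keeping the namespace as a parameter makes it easier to reuse the same dashboards or scripts across deployments in different namespaces.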
Core Metrics
Data Ingestion Metrics
{namespace}_gfm_kafka_records_read_total
- Type: Counter
- Description: Total number of records read from Kafka
- Unit: Records
- Components: Ingestor
- Labels:
  - component: Component type (e.g., “ingestor”) - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
Example:
glassflow_gfm_kafka_records_read_total{component="ingestor",instance="ingestor-0-7f44fbbfd8-bqbw9",job="pipeline-load-pipeline-1-05b7/ingestor",pipeline_id="load-pipeline-1-05b7"} 16914
In this example, glassflow is the namespace prefix. If you deploy in a different namespace, the prefix will change accordingly.
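To read a sample line like the one above in a script, a minimal parser might look like the following. It is a simplified sketch, not a full Prometheus text-format parser: it assumes a single sample per line with no trailing timestamp, and it does not unescape label values. The function name parse_sample is hypothetical.

```python
import re

# name, optional {labels}, then a single value (no timestamp support).
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# key="value" pairs; escaped characters are matched but left as-is.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one exposition line into (metric name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognised sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


line = (
    'glassflow_gfm_kafka_records_read_total{component="ingestor",'
    'instance="ingestor-0-7f44fbbfd8-bqbw9",'
    'job="pipeline-load-pipeline-1-05b7/ingestor",'
    'pipeline_id="load-pipeline-1-05b7"} 16914'
)
name, labels, value = parse_sample(line)
print(name, labels["pipeline_id"], value)
```

For production use, a maintained parser from a Prometheus client library is likely a better choice than a regular expression.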
Data Processing Metrics
{namespace}_gfm_processing_duration_seconds
- Type: Histogram
- Description: Processing duration in seconds
- Unit: Seconds
- Components: Ingestor, Sink, Transform, Filter, Dedup
- Labels:
  - component: Component type - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - stage: (Optional) Processing stage - Added by GlassFlow. Values: dedup_filter, dedup_write, schema_mapping, total_preparation, per_message. Omitted when not applicable.
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
  - le: Histogram bucket boundary - Added by Prometheus
Histogram Buckets:
- 0.001s (1ms)
- 0.005s (5ms)
- 0.01s (10ms)
- 0.025s (25ms)
- 0.05s (50ms)
- 0.1s (100ms)
- 0.25s (250ms)
- 0.5s (500ms)
- 1.0s (1s)
- 2.5s (2.5s)
- 5.0s (5s)
- 10.0s (10s)
Example:
glassflow_gfm_processing_duration_seconds_bucket{component="ingestor",instance="ingestor-0-7f44fbbfd8-bqbw9",job="pipeline-load-pipeline-1-05b7/ingestor",pipeline_id="load-pipeline-1-05b7",le="5"} 16914
glassflow_gfm_processing_duration_seconds_sum{component="ingestor",instance="ingestor-0-7f44fbbfd8-bqbw9",job="pipeline-load-pipeline-1-05b7/ingestor",pipeline_id="load-pipeline-1-05b7"} 2.8343126270000267
glassflow_gfm_processing_duration_seconds_count{component="ingestor",instance="ingestor-0-7f44fbbfd8-bqbw9",job="pipeline-load-pipeline-1-05b7/ingestor",pipeline_id="load-pipeline-1-05b7"} 16914
In this example, glassflow is the namespace prefix. If you deploy in a different namespace, the prefix will change accordingly.
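The _sum and _count series of a histogram give the mean observation directly. The sketch below applies that arithmetic to the sample values shown above; the helper name is hypothetical, and the result is a lifetime average for that series, not a windowed rate.

```python
def mean_duration_seconds(duration_sum, observation_count):
    """Return sum / count, or None when nothing has been observed yet."""
    if observation_count == 0:
        return None
    return duration_sum / observation_count


# Values taken from the _sum and _count sample lines in the example.
mean = mean_duration_seconds(2.8343126270000267, 16914)
print(f"mean processing time: {mean * 1000:.3f} ms")
```

For the sample data this works out to roughly 0.17 ms per record. In PromQL, the windowed equivalent is usually the rate of _sum divided by the rate of _count.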
{namespace}_gfm_processor_messages_total
- Type: Counter
- Description: Total number of messages processed by a processor, by status
- Unit: Messages
- Components: Ingestor, Sink, Transform, Filter, Dedup
- Labels:
  - component: Component type - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - status: Outcome - Added by GlassFlow. Values: success, error, filtered, duplicate, out
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
There is no separate gfm_records_filtered_total metric. To track filtered records, query gfm_processor_messages_total with status="filtered".
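Since per-status counts all come from one metric, queries differ only in the status matcher. The helper below is a hypothetical sketch that builds such a PromQL expression as a string; the status values are the ones listed for this metric, while the 5m window and the optional pipeline_id matcher are illustrative defaults.

```python
# Status values documented for gfm_processor_messages_total.
VALID_STATUSES = {"success", "error", "filtered", "duplicate", "out"}


def processor_messages_query(namespace, status, window="5m", pipeline_id=None):
    """Build a rate() query over processor messages for one status."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    matchers = [f'status="{status}"']
    if pipeline_id is not None:
        matchers.append(f'pipeline_id="{pipeline_id}"')
    selector = ",".join(matchers)
    return f"rate({namespace}_gfm_processor_messages_total{{{selector}}}[{window}])"


print(processor_messages_query("glassflow", "filtered"))
print(processor_messages_query("glassflow", "error", pipeline_id="load-pipeline-1-05b7"))
```

Validating the status up front avoids silently querying a series that can never exist.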
{namespace}_gfm_bytes_processed_total
- Type: Counter
- Description: Total bytes processed
- Unit: Bytes
- Components: Ingestor, Sink, Transform, Filter, Dedup
- Labels:
  - component: Component type - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - direction: Data flow direction - Added by GlassFlow. Values: in, out
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
Data Sinking Metrics
{namespace}_gfm_clickhouse_records_written_total
- Type: Counter
- Description: Total number of records written to ClickHouse
- Unit: Records
- Components: Sink
- Labels:
  - component: Component type (e.g., “sink”) - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
Example:
glassflow_gfm_clickhouse_records_written_total{component="sink",instance="sink-7c87594fd9-jmdw2",job="pipeline-load-pipeline-1-05b7/sink",pipeline_id="load-pipeline-1-05b7"} 80000
In this example, glassflow is the namespace prefix. If you deploy in a different namespace, the prefix will change accordingly.
{namespace}_gfm_clickhouse_records_written_per_second
- Type: Gauge
- Description: Number of records written to ClickHouse per second
- Unit: Records per second
- Components: Sink
- Labels:
  - component: Component type (e.g., “sink”) - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
Example:
glassflow_gfm_clickhouse_records_written_per_second{component="sink",instance="sink-7c87594fd9-jmdw2",job="pipeline-load-pipeline-1-05b7/sink",pipeline_id="load-pipeline-1-05b7"} 430120.2485745206
In this example, glassflow is the namespace prefix. If you deploy in a different namespace, the prefix will change accordingly.
Error Handling Metrics
{namespace}_gfm_dlq_records_written_total
- Type: Counter
- Description: Total number of records written to dead letter queue
- Unit: Records
- Components: Ingestor, Sink
- Labels:
  - component: Component type - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
This metric is defined in the code but may not appear in the sample metrics if no records have been written to the DLQ during the observation period.
HTTP Server Metrics
{namespace}_gfm_http_server_request_count
- Type: Counter
- Description: Total number of HTTP requests
- Unit: Requests
- Components: API server
- Labels:
  - method: HTTP method - Added by GlassFlow
  - path: Route path template - Added by GlassFlow
  - status: HTTP response status code (integer) - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
HTTP server metrics do not include component or pipeline_id labels. They are scoped by method, path, and status only.
{namespace}_gfm_http_server_request_duration_seconds
- Type: Histogram
- Description: Duration of HTTP requests in seconds
- Unit: Seconds
- Components: API server
- Labels:
  - method: HTTP method - Added by GlassFlow
  - path: Route path template - Added by GlassFlow
  - status: HTTP response status code (integer) - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
  - le: Histogram bucket boundary - Added by Prometheus
Histogram Buckets: Same as processing_duration_seconds (0.001 to 10.0).
OTLP Receiver Metrics
{namespace}_gfm_receiver_request_count
- Type: Counter
- Description: Total number of OTLP receiver requests
- Unit: Requests
- Components: otlp.logs, otlp.metrics, otlp.traces
- Labels:
  - component: Component type - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - transport: Transport protocol - Added by GlassFlow. Values: http, grpc
  - status: Request outcome - Added by GlassFlow. Values: ok, error
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
{namespace}_gfm_receiver_request_duration_seconds
- Type: Histogram
- Description: Duration of OTLP receiver requests in seconds
- Unit: Seconds
- Components: otlp.logs, otlp.metrics, otlp.traces
- Labels:
  - component: Component type - Added by GlassFlow
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - transport: Transport protocol - Added by GlassFlow. Values: http, grpc
  - status: Request outcome - Added by GlassFlow. Values: ok, error
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
  - le: Histogram bucket boundary - Added by Prometheus
Histogram Buckets: Same as processing_duration_seconds (0.001 to 10.0).
Back-Pressure Metrics
These metrics are emitted by the ingestor component and the NATS stream sampler. They help identify whether back-pressure is active and how long episodes last.
{namespace}_gfm_ingestor_backpressure_active
- Type: Gauge (Int64)
- Description: Set to 1 while the ingestor is blocked waiting for NATS to drain; 0 otherwise
- Unit: None (0/1 flag)
- Components: Ingestor
- Labels:
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
{namespace}_gfm_ingestor_backpressure_events_total
- Type: Counter
- Description: Total number of times the ingestor entered a back-pressure episode
- Unit: Events
- Components: Ingestor
- Labels:
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
{namespace}_gfm_ingestor_backpressure_duration_seconds
- Type: Histogram
- Description: Duration of each ingestor back-pressure episode in seconds
- Unit: Seconds
- Components: Ingestor
- Labels:
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
  - le: Histogram bucket boundary - Added by Prometheus
Histogram Buckets: 0.1, 0.5, 1, 2.5, 5, 10, 30, 60, 120, 300, 600, 1800 seconds (second-to-minute scale, unlike the millisecond-scale default buckets used by request/processing histograms).
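The effect of the wider bucket range can be illustrated by finding the smallest bucket that would contain a given episode duration. The sketch below uses the two bucket lists documented in this guide; the helper name is hypothetical, and it relies on Prometheus bucket bounds being inclusive upper limits (le).

```python
import bisect

# Bucket bounds as documented in this guide, in seconds.
DEFAULT_BUCKETS = [0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]
BACKPRESSURE_BUCKETS = [0.1, 0.5, 1, 2.5, 5, 10, 30, 60, 120, 300, 600, 1800]


def smallest_bucket(duration_seconds, bounds):
    """Return the le value of the smallest bucket that includes the observation."""
    # bisect_left finds the first bound >= duration, matching the inclusive le rule.
    index = bisect.bisect_left(bounds, duration_seconds)
    return "+Inf" if index == len(bounds) else str(bounds[index])


# A 45-second episode is resolved by the back-pressure buckets,
# but would only be counted in +Inf with the default buckets.
print(smallest_bucket(45, BACKPRESSURE_BUCKETS))
print(smallest_bucket(45, DEFAULT_BUCKETS))
```

Episodes longer than the largest finite bound are counted only in the +Inf bucket, so quantile estimates for them are capped at that bound.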
{namespace}_gfm_stream_depth
- Type: Gauge (Int64)
- Description: Number of messages currently stored in a JetStream stream
- Unit: Messages
- Components: Ingestor
- Labels:
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - stream: JetStream stream name - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
{namespace}_gfm_stream_depth_ratio
- Type: Gauge (Float64)
- Description: Stream depth divided by max_messages; ranges from 0.0 to 1.0. Sustained values near 1.0 indicate the stream is near capacity and back-pressure is likely
- Unit: Ratio (0.0–1.0)
- Components: Ingestor
- Labels:
  - pipeline_id: Unique pipeline identifier - Added by GlassFlow
  - stream: JetStream stream name - Added by GlassFlow
  - instance: Instance identifier - Added by Prometheus
  - job: Job identifier - Added by Prometheus
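The relationship between the two stream gauges is a simple division. The sketch below reproduces it outside GlassFlow for illustration; the function names and the 0.9 threshold are assumptions, not values taken from GlassFlow.

```python
def stream_depth_ratio(depth, max_messages):
    """Stream depth divided by max_messages, as described for stream_depth_ratio."""
    if max_messages <= 0:
        raise ValueError("max_messages must be positive")
    return depth / max_messages


def is_near_capacity(ratio, threshold=0.9):
    """Flag ratios at or above a threshold (0.9 is an illustrative choice)."""
    return ratio >= threshold


ratio = stream_depth_ratio(750, 1000)
print(ratio, is_near_capacity(ratio))
```

A single high reading is not necessarily a problem; the description above refers to sustained values near 1.0.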
UI Metrics
UI metrics are exported via OTLP (not Prometheus scraping). All UI metrics use the gfm_ui_ prefix.
Interaction Metrics
gfm_ui_page_views_total
- Type: Counter
- Description: Total page views
- Labels: path, component
gfm_ui_button_clicks_total
- Type: Counter
- Description: Total button clicks
- Labels: button_name, component
gfm_ui_form_submissions_total
- Type: Counter
- Description: Total form submissions
- Labels: form_name, success, component
Pipeline Lifecycle Metrics
gfm_ui_pipeline_created_total
- Type: Counter
- Description: Total pipelines created from the UI
- Labels: pipeline_type, component
gfm_ui_pipeline_deleted_total
- Type: Counter
- Description: Total pipelines deleted from the UI
- Labels: pipeline_id, component
gfm_ui_pipeline_status_changed_total
- Type: Counter
- Description: Total pipeline status changes from the UI
- Labels: pipeline_id, from_status, to_status, component
UI API Metrics
gfm_ui_api_request_count
- Type: Counter
- Description: Total API requests made from the UI
- Labels: method, path, status, component
gfm_ui_api_request_errors_total
- Type: Counter
- Description: Total API request errors from the UI (HTTP status >= 400)
- Labels: method, path, status, component
gfm_ui_api_request_duration_seconds
- Type: Histogram
- Description: Duration of API requests made from the UI
- Labels: method, path, status, component
gfm_ui_page_load_duration_seconds
- Type: Histogram
- Description: Page load duration
- Labels: path, component
Component-Specific Metrics
Ingestor Component
The Ingestor component primarily exports:
- {namespace}_gfm_kafka_records_read_total - Records consumed from Kafka
- {namespace}_gfm_processing_duration_seconds - Processing time for ingested records
- {namespace}_gfm_processor_messages_total - Message counts by status
- {namespace}_gfm_bytes_processed_total - Bytes processed (in/out)
- {namespace}_gfm_dlq_records_written_total - Records sent to DLQ on processing errors
- {namespace}_gfm_ingestor_backpressure_active - 1 while blocked on NATS back-pressure
- {namespace}_gfm_ingestor_backpressure_events_total - Count of back-pressure episodes
- {namespace}_gfm_ingestor_backpressure_duration_seconds - Duration of each back-pressure episode
- {namespace}_gfm_stream_depth - Current message count in the JetStream stream
- {namespace}_gfm_stream_depth_ratio - Stream fill ratio (0.0–1.0)
Sink Component
The Sink component primarily exports:
- {namespace}_gfm_clickhouse_records_written_total - Records written to ClickHouse
- {namespace}_gfm_clickhouse_records_written_per_second - Write rate to ClickHouse
- {namespace}_gfm_processing_duration_seconds - Processing time for sink operations (with optional stage: dedup_filter, dedup_write, schema_mapping, total_preparation, per_message)
- {namespace}_gfm_processor_messages_total - Message counts by status
- {namespace}_gfm_bytes_processed_total - Bytes processed (in/out)
- {namespace}_gfm_dlq_records_written_total - Records sent to DLQ on write errors
Transform Component
The Transform component primarily exports:
- {namespace}_gfm_processing_duration_seconds - Processing time for transform operations
- {namespace}_gfm_processor_messages_total - Message counts by status
- {namespace}_gfm_bytes_processed_total - Bytes processed (in/out)
Filter Component
The Filter component primarily exports:
- {namespace}_gfm_processing_duration_seconds - Processing time for filter operations
- {namespace}_gfm_processor_messages_total - Message counts by status (use status="filtered" to track filtered records)
- {namespace}_gfm_bytes_processed_total - Bytes processed (in/out)
Dedup Component
The Dedup component primarily exports:
- {namespace}_gfm_processing_duration_seconds - Processing time for dedup operations (with stage: dedup_filter, dedup_write)
- {namespace}_gfm_processor_messages_total - Message counts by status (use status="duplicate" to track deduplicated records)
- {namespace}_gfm_bytes_processed_total - Bytes processed (in/out)
API Server
The API server exports HTTP metrics when metrics are enabled:
- {namespace}_gfm_http_server_request_count - HTTP request count by method, path, and status
- {namespace}_gfm_http_server_request_duration_seconds - HTTP request duration by method, path, and status
OTLP Receiver
The OTLP receiver components (otlp.logs, otlp.metrics, otlp.traces) export:
- {namespace}_gfm_receiver_request_count - Receiver request count by transport (http/grpc) and status (ok/error)
- {namespace}_gfm_receiver_request_duration_seconds - Receiver request duration by transport and status
- {namespace}_gfm_bytes_processed_total - Bytes processed (in/out)
The gfm_ingestor_backpressure_* and gfm_stream_depth* metrics are emitted by the Kafka ingestor and are not available for OTLP source pipelines.
UI
The UI exports interaction and performance metrics via OTLP:
- gfm_ui_page_views_total - Page view counts
- gfm_ui_button_clicks_total - Button click counts
- gfm_ui_form_submissions_total - Form submission counts
- gfm_ui_api_request_count - API request counts from the UI
- gfm_ui_api_request_errors_total - API request error counts from the UI
- gfm_ui_api_request_duration_seconds - API request duration from the UI
- gfm_ui_page_load_duration_seconds - Page load duration
- gfm_ui_pipeline_created_total - Pipeline creation counts
- gfm_ui_pipeline_deleted_total - Pipeline deletion counts
- gfm_ui_pipeline_status_changed_total - Pipeline status change counts
Metric Labels
GlassFlow metrics include labels from two sources:
Application Labels (Added by GlassFlow)
These labels are added by the GlassFlow application code:
| Label | Description | Example Values |
|---|---|---|
| component | Component type | ingestor, sink, dedup, transform, filter, api, otlp.logs, otlp.metrics, otlp.traces |
| pipeline_id | Unique pipeline identifier | load-pipeline-1-05b7 |
| stage | Processing stage (optional, for processing_duration_seconds) | dedup_filter, dedup_write, schema_mapping, total_preparation, per_message |
| status | Outcome (processor messages), request result (receiver), or HTTP status code (HTTP/UI metrics) | success, error, filtered, duplicate, out, ok, or HTTP code e.g. 200 |
| direction | Data flow direction (for bytes_processed_total) | in, out |
| transport | Transport protocol (for receiver metrics) | http, grpc |
| stream | JetStream stream name (for stream depth metrics) | pipeline stream name |
| method | HTTP method (HTTP and UI API metrics) | GET, POST, PUT, DELETE |
| path | Route path template (HTTP and UI metrics) | /api/v1/pipelines, /health |
Prometheus Labels (Added by Prometheus)
These labels are automatically added by Prometheus during the scraping process:
| Label | Description | Example Values |
|---|---|---|
| instance | Instance identifier (typically pod name) | ingestor-0-7f44fbbfd8-bqbw9 |
| job | Job identifier (from Prometheus config) | pipeline-load-pipeline-1-05b7/ingestor |
| le | Histogram bucket boundary (for histogram metrics only) | 0.001, 0.005, 1.0, +Inf |
Label Sources:
- Application labels (component, pipeline_id) are added by GlassFlow code and are consistent across all deployments
- Prometheus labels (instance, job, le) are added by Prometheus during scraping and depend on your monitoring setup
- The job label comes from your Prometheus configuration’s job_name field
- The instance label typically contains the Kubernetes pod name or target endpoint
- HTTP server metrics (gfm_http_server_request_count, gfm_http_server_request_duration_seconds) do not include component or pipeline_id labels
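When comparing series across environments, it can help to separate the two label sources. The following sketch partitions a label set using the Prometheus-added label names listed above; the function name split_labels is hypothetical.

```python
# Labels that this guide attributes to Prometheus rather than GlassFlow.
PROMETHEUS_LABELS = {"instance", "job", "le"}


def split_labels(labels):
    """Return (application labels, Prometheus labels) from one label dict."""
    application = {k: v for k, v in labels.items() if k not in PROMETHEUS_LABELS}
    prometheus = {k: v for k, v in labels.items() if k in PROMETHEUS_LABELS}
    return application, prometheus


sample = {
    "component": "ingestor",
    "pipeline_id": "load-pipeline-1-05b7",
    "instance": "ingestor-0-7f44fbbfd8-bqbw9",
    "job": "pipeline-load-pipeline-1-05b7/ingestor",
}
app_labels, prom_labels = split_labels(sample)
print(app_labels)
print(prom_labels)
```

Grouping by the application labels only gives results that stay stable when pods restart or scrape jobs are renamed.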
Accessing Metrics
Metrics Endpoint
The OTEL collector service exposes metrics at:
{release-name}-otel-collector.{namespace}.svc.cluster.local:9090/metrics
For example, if you installed GlassFlow with the release name glassflow-chart in the glassflow namespace:
glassflow-chart-otel-collector.glassflow.svc.cluster.local:9090/metrics
Prometheus Scraping
To scrape metrics with Prometheus, add the following configuration to your Prometheus config:
# GlassFlow OTEL Collector metrics
- job_name: 'glassflow-otel-collector'
  static_configs:
    - targets: ['glassflow-chart-otel-collector.glassflow.svc.cluster.local:9090']
  metrics_path: /metrics
  scrape_interval: 15s
Replace glassflow-chart and glassflow with your actual release name and namespace if different.
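Before wiring up Prometheus, the endpoint can be checked by hand from inside the cluster. The sketch below fetches the metrics text and keeps only GlassFlow backend series. It assumes the endpoint is served over plain HTTP and is reachable from where the script runs; the function names are hypothetical, and the sample text is made up to keep the example self-contained.

```python
from urllib.request import urlopen


def fetch_metrics_text(url, timeout=5.0):
    """Download the raw Prometheus exposition text from a metrics endpoint."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def gfm_lines(text, namespace="glassflow"):
    """Keep only sample lines whose name starts with {namespace}_gfm_."""
    prefix = f"{namespace}_gfm_"
    return [line for line in text.splitlines() if line.startswith(prefix)]


# Offline demonstration with illustrative text instead of a live request.
sample_text = "\n".join([
    "# TYPE glassflow_gfm_kafka_records_read_total counter",
    'glassflow_gfm_kafka_records_read_total{component="ingestor"} 16914',
    "some_other_metric 1",
])
print(gfm_lines(sample_text))

# Inside the cluster (assumed http scheme):
# text = fetch_metrics_text(
#     "http://glassflow-chart-otel-collector.glassflow.svc.cluster.local:9090/metrics")
```

If the filtered list is empty, the namespace prefix is the first thing to check, since it changes with the deployment namespace.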
Understanding the job Label
The job label in your metrics comes from the job_name field in your Prometheus configuration. For example:
- If your Prometheus config has job_name: 'glassflow-otel-collector', then job="glassflow-otel-collector"
- If you use Kubernetes service discovery, the job name might be auto-generated based on the service name
- The job name helps Prometheus identify which scrape configuration was used to collect the metrics
Job Label Examples:
- job="pipeline-load-pipeline-1-05b7/ingestor" - Indicates this metric came from an ingestor component
- job="pipeline-load-pipeline-1-05b7/sink" - Indicates this metric came from a sink component
- job="glassflow-otel-collector" - Indicates this metric came from the OTEL collector endpoint
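The per-component job values in these examples appear to follow the shape pipeline-{pipeline_id}/{component}. That pattern is inferred from the examples only and is not a documented contract, so the sketch below treats anything else as unrecognised. The function name is hypothetical.

```python
def split_job_label(job):
    """Split a job label of the assumed form pipeline-{pipeline_id}/{component}.

    Returns (pipeline_id, component), or (None, None) for other job values.
    """
    prefix = "pipeline-"
    if job.startswith(prefix) and "/" in job:
        pipeline_id, component = job[len(prefix):].rsplit("/", 1)
        return pipeline_id, component
    return None, None


print(split_job_label("pipeline-load-pipeline-1-05b7/ingestor"))
print(split_job_label("glassflow-otel-collector"))
```

Where available, the pipeline_id and component labels are a more reliable source for this information than the job label, since job depends on your scrape configuration.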
Monitoring Best Practices
Key Metrics to Monitor
- Throughput Metrics:
  - rate({namespace}_gfm_kafka_records_read_total[5m]) - Kafka consumption rate
  - {namespace}_gfm_clickhouse_records_written_per_second - ClickHouse write rate
  - rate({namespace}_gfm_bytes_processed_total{direction="in"}[5m]) - Bytes ingestion rate
  - rate({namespace}_gfm_bytes_processed_total{direction="out"}[5m]) - Bytes output rate
- Latency Metrics:
  - histogram_quantile(0.95, rate({namespace}_gfm_processing_duration_seconds_bucket[5m])) - 95th percentile processing time
  - histogram_quantile(0.99, rate({namespace}_gfm_processing_duration_seconds_bucket[5m])) - 99th percentile processing time
  - Use the stage label on processing_duration_seconds for per-stage timing (e.g., dedup_filter, dedup_write, schema_mapping, total_preparation, per_message)
  - histogram_quantile(0.95, rate({namespace}_gfm_http_server_request_duration_seconds_bucket[5m])) - 95th percentile API latency
- Error Metrics:
  - rate({namespace}_gfm_dlq_records_written_total[5m]) - DLQ write rate
  - rate({namespace}_gfm_processor_messages_total{status="error"}[5m]) - Processor error rate
  - rate({namespace}_gfm_processor_messages_total{status="filtered"}[5m]) - Record filtering rate
  - rate({namespace}_gfm_processor_messages_total{status="duplicate"}[5m]) - Deduplication rate
- Receiver Metrics:
  - rate({namespace}_gfm_receiver_request_count{status="error"}[5m]) - OTLP receiver error rate
  - histogram_quantile(0.95, rate({namespace}_gfm_receiver_request_duration_seconds_bucket[5m])) - 95th percentile receiver latency
- Back-Pressure Metrics (Kafka ingestor pipelines only):
  - {namespace}_gfm_ingestor_backpressure_active - Currently in back-pressure (1) or not (0)
  - rate({namespace}_gfm_ingestor_backpressure_events_total[5m]) - Rate of new back-pressure episodes
  - histogram_quantile(0.95, rate({namespace}_gfm_ingestor_backpressure_duration_seconds_bucket[5m])) - 95th percentile episode duration
  - {namespace}_gfm_stream_depth_ratio - Stream fill ratio; alert when sustained near 1.0
  - {namespace}_gfm_stream_depth - Absolute message backlog in stream
- Health Metrics:
  - {namespace}_up - Service availability
  - {namespace}_target_info - Service metadata
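The percentile queries above rely on histogram_quantile, which estimates a quantile by linear interpolation inside the bucket that contains the target rank. The sketch below is a simplified Python rendering of that idea for classic cumulative buckets, written for illustration; it is not Prometheus code, and the bucket counts are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Rank falls in +Inf (or an empty bucket): report the last finite bound.
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Illustrative cumulative counts, not real GlassFlow data.
example = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.95, example))
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket bounds bracket the real latencies, which is why the back-pressure histogram uses wider buckets than the request histograms.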
Installation and Setup
For detailed installation instructions and configuration options, see the Observability Installation Guide.