
Kubernetes Components

GlassFlow deploys its shared infrastructure in the glassflow namespace and runs each pipeline's workloads in a dedicated per-pipeline namespace.

Shared Components

These components are deployed once and shared across all pipelines.

Core

| Pod name | Kind | Purpose |
| --- | --- | --- |
| glassflow-api-* | Deployment | REST API for pipeline CRUD, orchestration, and integration with the UI and Python SDK |
| glassflow-ui-* | Deployment | Web interface for pipeline configuration and real-time monitoring |
| glassflow-controller-manager-* | Deployment | Kubernetes operator that watches Pipeline CRDs and manages their lifecycle, scaling, and updates |
| glassflow-postgresql-* | StatefulSet | PostgreSQL database for storing pipeline configuration |

Messaging (NATS)

| Pod name | Kind | Purpose |
| --- | --- | --- |
| glassflow-nats-{0..N} | StatefulSet | NATS JetStream cluster (3 or 5 nodes) providing persistent messaging, high availability, and automatic failover between pipeline stages |
| glassflow-nats-box-* | Deployment | Utility container with NATS CLI tools for debugging and cluster administration |
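
The 3-or-5 node sizing follows from majority quorum: JetStream uses Raft consensus, so the cluster keeps working only while more than half of its nodes are reachable. The sketch below works through that arithmetic. The function names are illustrative and not part of GlassFlow or NATS, and it assumes the replication factor equals the cluster size.

```python
def quorum(nodes: int) -> int:
    """Smallest majority of a cluster with `nodes` members."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Number of nodes that can fail while a majority stays reachable."""
    # Assumption: replication factor equals the cluster size.
    return nodes - quorum(nodes)


for n in (3, 5):
    print(f"{n} nodes: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

A 3-node cluster tolerates one failed node and a 5-node cluster tolerates two. An even size adds no tolerance (4 nodes still tolerate only one failure), which is why odd sizes are the usual choice.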

OTLP Receiver

| Pod name | Kind | Purpose |
| --- | --- | --- |
| glassflow-otlp-receiver-* | Deployment | Accepts OpenTelemetry logs, traces, and metrics over gRPC (4317) and HTTP (4318). Flattens OTLP protobuf to JSON and routes data to the correct pipeline via the x-glassflow-pipeline-id header. Deployed once and shared across all OTLP pipelines. Only deployed when at least one pipeline uses an OTLP source. |
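
To illustrate the header-based routing, here is a minimal sketch that builds, but does not send, an OTLP/HTTP logs request tagged for one pipeline. Several details are assumptions rather than documented GlassFlow behavior: the Service hostname is hypothetical, the `/v1/logs` path is the standard OTLP/HTTP default, and the receiver may expect protobuf rather than OTLP's JSON encoding.

```python
import json
import urllib.request

# Hypothetical in-cluster address: the real Service name depends on your install.
# The /v1/logs path is the OTLP/HTTP default and is assumed here.
RECEIVER_URL = "http://glassflow-otlp-receiver.glassflow.svc:4318/v1/logs"


def build_otlp_logs_request(pipeline_id: str, message: str) -> urllib.request.Request:
    """Build (without sending) an OTLP/HTTP logs request for one pipeline."""
    # Assumption: the receiver accepts OTLP's JSON encoding over HTTP.
    payload = {
        "resourceLogs": [
            {"scopeLogs": [{"logRecords": [{"body": {"stringValue": message}}]}]}
        ]
    }
    return urllib.request.Request(
        RECEIVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Routing header the receiver uses to select the target pipeline.
            "x-glassflow-pipeline-id": pipeline_id,
        },
        method="POST",
    )


req = build_otlp_logs_request("my-pipeline-id", "hello from the sketch")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. In practice an OpenTelemetry SDK exporter configured with the same header is the more likely client.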

Observability

| Pod name | Kind | Purpose |
| --- | --- | --- |
| glassflow-otel-collector-* | Deployment | Collects metrics and logs from pipeline components; exposes Prometheus metrics at /metrics on port 9090 |
| glassflow-prometheus-nats-exporter-* | Deployment | Exports NATS server and JetStream metrics for Prometheus |

Custom Resources

| CRD name | Purpose |
| --- | --- |
| pipelines.etl.glassflow.io | Declarative Pipeline resource managed by the controller. Defines sources, transforms, join, sink, and scaling for each pipeline |

Per-Pipeline Resources

Each pipeline runs in its own namespace, named pipeline-{pipeline-name}-{unique-id}, which isolates its resources and gives it a separate security boundary.
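
The naming pattern can be sketched as a small helper. The function names and example values are hypothetical. The validation reflects the general Kubernetes rule that a namespace name is a DNS label of at most 63 characters. How GlassFlow itself sanitizes or truncates pipeline names is not covered here.

```python
import re

# Kubernetes namespace names must be RFC 1123 DNS labels.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def pipeline_namespace(pipeline_name: str, unique_id: str) -> str:
    """Format a namespace name as pipeline-{pipeline-name}-{unique-id}."""
    namespace = f"pipeline-{pipeline_name}-{unique_id}"
    if len(namespace) > 63 or not _DNS_LABEL.fullmatch(namespace):
        raise ValueError(f"not a valid Kubernetes namespace name: {namespace!r}")
    return namespace


def is_pipeline_namespace(namespace: str) -> bool:
    """True for per-pipeline namespaces, False for shared ones such as 'glassflow'."""
    return namespace.startswith("pipeline-")


# Hypothetical pipeline name and id, for illustration only.
print(pipeline_namespace("orders", "a1b2c3"))
```

A prefix check like `is_pipeline_namespace` is enough to separate pipeline namespaces from the shared glassflow namespace when filtering a namespace listing.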

| Workload | Kind | Name pattern | Purpose |
| --- | --- | --- | --- |
| Ingestor | StatefulSet | ingestor-{replica-id} | Consumes data from the configured source (Kafka or OTLP) and publishes to NATS JetStream. Scales horizontally. |
| Transform | StatefulSet | dedup-{replica-id} | Runs filter (expr), deduplication (BadgerDB), and stateless transformations (expr) in order. Each capability activates only when configured. Scales horizontally. |
| Sink | StatefulSet | sink-{replica-id} | Batches messages from NATS JetStream and writes to ClickHouse with retry logic and connection pooling. Scales horizontally. |

For guidance on replica counts and resource values, see the Scaling Guide.
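
When reading pod listings, the name patterns in the table above map back to workloads. Note that Transform pods carry the dedup prefix. The helper below is an illustrative sketch and not part of GlassFlow. It assumes the replica id is everything after the first hyphen.

```python
from typing import Optional

# Pod name prefix -> workload, taken from the per-pipeline table.
WORKLOAD_BY_PREFIX = {
    "ingestor": "Ingestor",
    "dedup": "Transform",
    "sink": "Sink",
}


def workload_for_pod(pod_name: str) -> Optional[str]:
    """Return the workload a per-pipeline pod belongs to, or None if unknown."""
    # Assumption: names look like "<prefix>-<replica-id>".
    prefix, sep, replica_id = pod_name.partition("-")
    if not sep or not replica_id:
        return None
    return WORKLOAD_BY_PREFIX.get(prefix)


for name in ("ingestor-0", "dedup-1", "sink-0", "glassflow-api-abc"):
    print(name, "->", workload_for_pod(name))
```

Shared pods such as glassflow-api-* return None here because they follow the shared-component naming rather than the per-pipeline patterns.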

Infrastructure

| Area | Details |
| --- | --- |
| Namespace | All shared components run in the glassflow namespace with dedicated RBAC, network policies, and resource quotas |
| High availability | NATS cluster (3 or 5 nodes) provides fault tolerance, automatic failover, and data replication |
| Storage | Persistent volumes for NATS JetStream data and application logs; ConfigMaps and Secrets for configuration |
| Networking | Internal communication via Kubernetes DNS; NATS cluster via headless services; external access via LoadBalancer or Ingress; TLS encryption for NATS cluster traffic |