GlassFlow Helm Values Configuration
This comprehensive guide covers all available configuration options in the GlassFlow Helm chart’s values.yaml file. Use this reference to customize your GlassFlow deployment for production environments.
Quick Start: For basic installations, you can use the default values. For production deployments, review the sections below to optimize your configuration.
Global Settings
Global settings apply across all components of the GlassFlow deployment.
```yaml
global:
  # Global image registry - prepended to all image repositories
  imageRegistry: "ghcr.io/glassflow/"

  # Observability configuration
  observability:
    metrics:
      enabled: true # Enable metrics collection
    logs:
      enabled: false # Enable log export
      exporter:
        otlp: {} # OTLP exporter configuration

  # OpenTelemetry collector sidecar
  otelCollector:
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 512Mi

  # NATS global configuration
  nats:
    # NATS address for operator connection
    # Defaults to {{ .Release.Name }}-nats.{{ .Release.Namespace }}.svc.cluster.local
    address: ""
    stream:
      maxAge: 24h # Maximum age of messages in streams
      maxBytes: 0 # Maximum size of streams (0 = unlimited)

  # Pipeline namespace configuration
  pipelines:
    namespace:
      auto: true # When true, operator creates per-pipeline namespaces (pipeline-<id>)
      name: "glassflow-pipelines" # Fixed namespace to deploy all pipelines into (when auto is false)
      create: true # When auto is false, Helm can optionally create the namespace

  usageStats:
    enabled: true
    installationId: ""
```

Pipeline Namespaces:
- By default, the operator creates per-pipeline namespaces (`pipeline-<id>`)
- To use a fixed namespace for all pipelines, set `global.pipelines.namespace.auto: false`
- When `auto` is `false`, all pipelines deploy to the namespace specified in `global.pipelines.namespace.name`
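For example, a minimal override file that routes every pipeline into one fixed namespace might look like this (the filename and namespace value are illustrative):

```yaml
# values-fixed-namespace.yaml - deploy all pipelines into a single namespace
global:
  pipelines:
    namespace:
      auto: false                 # disable per-pipeline namespaces
      name: "glassflow-pipelines" # every pipeline deploys here
      create: true                # let Helm create the namespace if missing
```

Apply it with `helm upgrade --install glassflow glassflow/glassflow-etl -f values-fixed-namespace.yaml`.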
Key Global Settings
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `imageRegistry` | Global Docker registry prefix | `ghcr.io/glassflow/` | - |
| `observability.metrics.enabled` | Enable metrics collection | `true` | Keep enabled for monitoring |
| `observability.logs.enabled` | Enable log export | `false` | Enable for production monitoring |
| `observability.logs.exporter.otlp` | Your OTLP collector endpoint | `{}` | Configure the OTLP endpoint where GlassFlow will send logs. See OTLP Exporter Configuration for detailed setup |
| `nats.stream.maxAge` | Message retention period | `24h` | Adjust based on your data retention needs |
| `nats.stream.maxBytes` | Maximum stream size | `0` (unlimited) | Set a byte limit based on expected data volume |
| `otelCollector.resources` | OTel collector sidecar resources | 100m/128Mi req, 500m/512Mi lim | Increase if log/metric volume is high |
| `pipelines.namespace.auto` | Create per-pipeline namespaces | `true` | Set to `false` to use a fixed namespace |
| `pipelines.namespace.name` | Fixed namespace for all pipelines | `glassflow-pipelines` | Used when `auto` is `false` |
| `pipelines.namespace.create` | Create namespace if it doesn't exist | `true` | Only applies when `auto` is `false` |
| `usageStats.enabled` | Send anonymous usage statistics | `true` | Set to `false` to opt out |
API Component
Configure the GlassFlow backend API service.
```yaml
api:
  # Scaling configuration
  replicas: 1
  logLevel: "INFO"

  # Container image settings
  image:
    repository: glassflow-etl-be
    tag: v2.11.2
    pullPolicy: IfNotPresent

  # Resource allocation
  resources:
    requests:
      memory: "100Mi"
      cpu: "100m"
    limits:
      memory: "200Mi"
      cpu: "250m"

  # Service configuration
  service:
    type: ClusterIP
    port: 8081
    targetPort: 8081

  # Environment variables
  env: []
```

API Configuration Options
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `replicas` | Number of API instances | 1 | 1 is sufficient for API operations |
| `logLevel` | Logging verbosity | `INFO` | Use `DEBUG` for troubleshooting |
| `resources.requests` | Minimum resources | 100Mi/100m | Scale based on load |
| `resources.limits` | Maximum resources | 200Mi/250m | Set appropriate limits |
UI Component
Configure the GlassFlow frontend user interface.
```yaml
ui:
  # Scaling configuration
  replicas: 1

  # Container image settings
  image:
    repository: glassflow-etl-fe
    tag: v2.11.2
    pullPolicy: IfNotPresent

  # Resource allocation
  resources:
    requests:
      memory: "512Mi"
      cpu: "100m"
    limits:
      memory: "1Gi"
      cpu: "200m"

  # Service configuration
  service:
    type: ClusterIP
    port: 8080
    targetPort: 8080

  # Environment variables (object format)
  env: {}

  # Kafka Kerberos Gateway sidecar (for connecting to Kerberos-secured Kafka clusters)
  kafkaGateway:
    enabled: true
    image:
      repository: kafka-kerberos-gateway
      tag: latest
      pullPolicy: IfNotPresent
    resources:
      requests:
        memory: "128Mi"
        cpu: "50m"
      limits:
        memory: "256Mi"
        cpu: "200m"
    port: 8082 # Internal port within the UI pod

  # Auth0 authentication configuration
  auth0:
    enabled: false
    profileRoute: "/api/auth/me"
    secret: ""
    appBaseUrl: "http://localhost:8080"
    domain: ""
    issuerBaseUrl: ""
    clientId: ""
    clientSecret: ""
```

Auth0 integration is disabled by default. Set `enabled: true` and configure the domain, client ID, client secret, and base URL to enable authentication.
UI Configuration Options
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `replicas` | Number of UI instances | 1 | 1 is sufficient for the UI pod |
| `resources.requests` | Minimum resources | 512Mi/100m | The frontend typically needs more memory |
| `resources.limits` | Maximum resources | 1Gi/200m | Adjust based on user load |
| `env` | Environment variables (object format) | `{}` | Use object format, not array |
| `kafkaGateway.enabled` | Enable Kafka Kerberos Gateway sidecar | `true` | Enable if connecting to Kerberos-secured Kafka |
| `kafkaGateway.resources` | Gateway resource requests/limits | 128Mi/50m req, 256Mi/200m lim | Adjust based on usage |
| `kafkaGateway.port` | Gateway internal port | 8082 | Internal port within the UI pod |
GlassFlow Operator
Configure the Kubernetes operator that manages ETL pipeline resources in Kubernetes. The operator chart and code live in a separate repository and are deployed as a dependency chart.
```yaml
glassflow-operator:
  controllerManager:
    replicas: 1
    manager:
      # Maximum duration a reconcile operation can run before timing out
      reconcileTimeout: 15m

      # Operator image configuration
      image:
        repository: glassflow-etl-k8s-operator
        tag: v2.1.0
        pullPolicy: IfNotPresent

      # Resource allocation
      resources:
        requests:
          cpu: 10m
          memory: 64Mi
        limits:
          cpu: 500m
          memory: 128Mi

    # Service account configuration
    serviceAccount:
      annotations: {}

  # ETL component configurations
  glassflowComponents:
    ingestor:
      image:
        repository: glassflow-etl-ingestor
        tag: v2.11.2
      logLevel: "INFO"
      resources:
        requests:
          cpu: 1000m
          memory: 256Mi
        limits:
          cpu: 1500m
          memory: 512Mi
      affinity: {}

    join:
      image:
        repository: glassflow-etl-join
        tag: v2.11.2
      logLevel: "INFO"
      resources:
        requests:
          cpu: 500m
          memory: 256Mi
        limits:
          cpu: 1000m
          memory: 1Gi
      affinity: {}

    sink:
      image:
        repository: glassflow-etl-sink
        tag: v2.11.2
      logLevel: "INFO"
      resources:
        requests:
          cpu: 1000m
          memory: 500Mi
        limits:
          cpu: 1500m
          memory: 1.5Gi
      affinity: {}

    dedup:
      image:
        repository: glassflow-etl-dedup
        tag: v2.11.2
        pullPolicy: IfNotPresent
      logLevel: "INFO"
      resources:
        requests:
          cpu: 1000m
          memory: 1Gi
        limits:
          cpu: 2000m
          memory: 2Gi
      storage:
        size: "40Gi"
        className: ""
      affinity: {}
```

Operator Configuration Options
| Component | CPU Request | Memory Request | CPU Limit | Memory Limit |
|---|---|---|---|---|
| Controller Manager | 10m | 64Mi | 500m | 128Mi |
| Ingestor | 1000m | 256Mi | 1500m | 512Mi |
| Join | 500m | 256Mi | 1000m | 1Gi |
| Sink | 1000m | 500Mi | 1500m | 1.5Gi |
| Dedup | 1000m | 1Gi | 2000m | 2Gi |
The `dedup.storage` field provisions a PersistentVolumeClaim for the deduplication state store. Set `storage.size` based on your expected deduplication key volume, and set `storage.className` to a StorageClass name if you need a specific provisioner.
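As a sketch, sizing the dedup state store with an explicit StorageClass might look like this; the `gp3` class name is only an example and must exist in your cluster:

```yaml
glassflow-operator:
  glassflowComponents:
    dedup:
      storage:
        size: "100Gi"    # grows with deduplication key cardinality and window length
        className: "gp3" # example StorageClass; leave "" for the cluster default
```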
NATS Configuration
NATS is the messaging system used for internal communication between GlassFlow components. It is deployed as a dependency chart from the official NATS Helm charts repository.
```yaml
nats:
  # Enable/disable NATS deployment
  enabled: true

  # NATS configuration
  config:
    # Clustering for high availability
    cluster:
      enabled: true
      port: 6222
      replicas: 3 # Must be 2+ when JetStream is enabled

    # JetStream for persistent messaging
    jetstream:
      enabled: true

      # Memory store (fast, non-persistent)
      memoryStore:
        enabled: false
        maxSize: 1Gi

      # File store (persistent, recommended for production)
      fileStore:
        enabled: true
        dir: /data
        pvc:
          enabled: true
          size: 100Gi
          storageClassName: ""

  # Container resource allocation
  container:
    merge:
      resources:
        requests:
          memory: "3Gi"
          cpu: "4000m"
        limits:
          memory: "3Gi"
          cpu: "4000m"
```

NATS Configuration Options
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `enabled` | Deploy NATS with GlassFlow | `true` | Use external NATS for large deployments |
| `config.cluster.replicas` | Number of NATS nodes | 3 | Use 3+ for production |
| `config.jetstream.fileStore.pvc.size` | Storage size | 100Gi | Scale based on data volume |
| `container.merge.resources.requests` | Minimum resources | 3Gi/4000m | NATS is I/O and CPU intensive at high throughput |
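To run against an externally managed NATS cluster instead of the bundled dependency chart, a sketch along these lines should work; the address below is a placeholder:

```yaml
# Disable the bundled NATS dependency chart
nats:
  enabled: false

# Point the operator at the external cluster
global:
  nats:
    address: "nats://nats.example.internal:4222" # placeholder address
```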
NATS Prometheus Exporter
The NATS Prometheus exporter collects all NATS-related metrics. These metrics are provided together with GlassFlow metrics on the /metrics endpoint.
Details on accessing GlassFlow metrics can be found here
```yaml
natsPrometheusExporter:
  image:
    repository: natsio/prometheus-nats-exporter
    tag: 0.17.3
    pullPolicy: IfNotPresent

  # Metrics to collect
  metrics:
    accstatz: true
    connz: true
    connz_detailed: true
    jsz: true
    gatewayz: true
    leafz: true
    routez: true
    subz: true
    varz: true

  service:
    type: ClusterIP
    port: 80
    targetPort: 7777
    protocol: TCP
    name: http
```

PostgreSQL Configuration
GlassFlow requires PostgreSQL for persisting pipeline definitions, connection credentials, and run history. By default, the chart deploys a single-node PostgreSQL instance. Set `postgresql.enabled: false` and configure `global.postgres.connection_url` (or `global.postgres.secret`) to use an external PostgreSQL instance.
```yaml
postgresql:
  enabled: true
  image:
    repository: postgres
    tag: "17-alpine"
    pullPolicy: IfNotPresent
  replicaCount: 1
  auth:
    enabled: true
    database: "glassflow"
    sslmode: "disable"
    username: "glassflow"
    password: "glassflow123"
    existingSecret:
      enabled: false
      name: ""
      keys:
        usernameKey: username
        passwordKey: password
        databaseKey: database
  service:
    type: ClusterIP
    port: 5432
  persistence:
    enabled: true
    size: 10Gi
    storageClass: ""
  resources:
    requests:
      memory: "512Mi"
      cpu: "100m"
    limits:
      memory: "2Gi"
      cpu: "1000m"
```

Change `postgresql.auth.password` before deploying to production. For production, prefer `auth.existingSecret` to avoid storing credentials in values.yaml.
PostgreSQL Configuration Options
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `enabled` | Deploy PostgreSQL with GlassFlow | `true` | Set to `false` to use an external instance |
| `auth.password` | Database password | `glassflow123` | Change before deploying to production |
| `auth.existingSecret.enabled` | Source credentials from a Kubernetes Secret | `false` | Enable for production to avoid plaintext credentials in values.yaml |
| `persistence.size` | PVC size for PostgreSQL data | 10Gi | Scale based on pipeline and connection volume |
| `global.postgres.connection_url` | External PostgreSQL URL | `""` | Set when `postgresql.enabled` is `false` |
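An external-PostgreSQL override might look like the sketch below; the connection URL is a placeholder, and per the note above you may prefer `global.postgres.secret` to an inline URL:

```yaml
# Disable the bundled PostgreSQL
postgresql:
  enabled: false

# Point GlassFlow at an external instance (placeholder URL)
global:
  postgres:
    connection_url: "postgres://glassflow:<password>@db.example.internal:5432/glassflow"
```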
Notification Service
The notification service sends Slack and email alerts for pipeline events. Notifications are activated by setting `global.notifications.enabled: true`; the channel details are configured in this section.
```yaml
notificationService:
  replicas: 1
  image:
    repository: glassflow-notifier
    tag: v1.0.1
    pullPolicy: IfNotPresent
  resources:
    requests:
      memory: "100Mi"
      cpu: "100m"
    limits:
      memory: "200Mi"
      cpu: "250m"
  slack:
    enabled: "false"
    webhookUrl: ""
    defaultChannel: "#notifications"
  email:
    enabled: "false"
    smtpHost: ""
    smtpPort: 587
    smtpUsername: ""
    smtpPassword: ""
    fromAddress: ""
    toAddress: ""
```

Notification Service Configuration Options
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `slack.enabled` | Enable Slack notifications | `"false"` | Set to `"true"` and provide `webhookUrl` |
| `slack.webhookUrl` | Incoming webhook URL | `""` | Required when Slack is enabled |
| `slack.defaultChannel` | Default Slack channel | `#notifications` | Override per-alert in pipeline config |
| `email.enabled` | Enable email notifications | `"false"` | Set to `"true"` and configure SMTP fields |
| `email.smtpHost` | SMTP server host | `""` | Required when email is enabled |
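Putting it together, enabling Slack alerts takes both the global switch and the channel settings; a sketch (the webhook URL and channel name are placeholders, and note the quoted booleans):

```yaml
# Global switch that activates notifications
global:
  notifications:
    enabled: true

notificationService:
  slack:
    enabled: "true" # string, not boolean
    webhookUrl: "https://hooks.slack.com/services/T000/B000/XXXX" # placeholder
    defaultChannel: "#glassflow-alerts"
```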
Ingress Configuration
Configure external access to GlassFlow services.
```yaml
ingress:
  # Enable external access
  enabled: false

  # Ingress controller class
  ingressClassName: "nginx" # or "traefik", "istio"

  # Ingress annotations
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"

  # Host configurations
  hosts:
    - host: "glassflow.example.com"
      paths:
        - path: "/"
          pathType: Prefix
          serviceName: "glassflow-ui"
          servicePort: 8080
        - path: "/api/v1"
          pathType: Prefix
          serviceName: "glassflow-api"
          servicePort: 8081

  # TLS configuration
  tls:
    - hosts:
        - "glassflow.example.com"
      secretName: "glassflow-tls-secret"
```

Ingress Configuration Options
By default, the Helm deployment does not expose GlassFlow to the internet. See Using Ingress for details on enabling external access.
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `enabled` | Enable external access | `false` | Set to `true` for production |
| `ingressClassName` | Ingress controller | `""` | Specify your controller |
| `hosts` | Domain configurations | `[]` | Configure your domains |
| `tls` | HTTPS configuration | `[]` | Enable for production |
Security Settings
Configure security contexts and service accounts.
```yaml
# Pod security context
podSecurityContext: {}
  # fsGroup: 2000
  # runAsNonRoot: true
  # runAsUser: 1000

# Container security context
securityContext: {}
  # capabilities:
  #   drop:
  #     - ALL
  # readOnlyRootFilesystem: true
  # runAsNonRoot: true
  # runAsUser: 1000

# Service account configuration
serviceAccount:
  create: true
  automount: true
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT:role/ROLE"
  name: ""
```

Security Configuration Options
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `podSecurityContext` | Pod-level security context | `{}` | Configure `fsGroup`, `runAsNonRoot`, etc. for proper permissions |
| `securityContext` | Container-level security context | `{}` | Enable `readOnlyRootFilesystem`, `runAsNonRoot`, etc. for hardened containers |
| `serviceAccount.create` | Create service account | `true` | Use an existing account for production |
| `serviceAccount.automount` | Automount API credentials | `true` | Keep enabled if pods need the service account token |
| `serviceAccount.name` | Service account name | `""` | Use a custom name if needed |
| `serviceAccount.annotations` | Service account annotations | `{}` | Useful for IAM roles, OIDC providers |
Credential Encryption
GlassFlow can encrypt Kafka and ClickHouse credentials before they are persisted in PostgreSQL. When enabled, the API uses AES-256-GCM to encrypt the following fields at rest:
- Kafka: SASL password, TLS private key, Kerberos keytab
- ClickHouse: connection password
Credentials are decrypted transparently at runtime and are never stored or logged in plaintext when encryption is active.
Enable credential encryption for all production deployments. When `global.encryption.enabled` is `false` (the default), Kafka and ClickHouse credentials are stored in plaintext in the PostgreSQL connections table, where any user or process with read access to the database can retrieve them.
How it works
The encryption key is a 32-byte (256-bit) random value stored in a Kubernetes Secret that you create and manage. You must supply the Secret before enabling encryption — the chart does not generate one automatically. This keeps the key lifecycle outside of Helm and prevents accidental key rotation during upgrades.
When `global.encryption.enabled` is `true`, `global.encryption.existingSecret.name` must be set. The chart fails with a validation error if the field is empty.
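One way to provide the key is a Secret manifest like this sketch (the Secret name is your choice; replace the placeholder value with the output of `openssl rand -base64 32`, which Kubernetes decodes back to 32 raw bytes):

```yaml
# glassflow-encryption-secret.yaml - create before enabling encryption
apiVersion: v1
kind: Secret
metadata:
  name: glassflow-encryption # any name; reference it in global.encryption.existingSecret.name
type: Opaque
data:
  # base64 of 32 random bytes, e.g. the output of: openssl rand -base64 32
  encryption-key: "<base64-encoded 32-byte key>"
```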
Configuration
```yaml
global:
  encryption:
    # Set to true to enable AES-256-GCM encryption of credentials in PostgreSQL.
    # When false (default), credentials are stored in plaintext.
    enabled: false

    # Reference an existing Kubernetes Secret that contains the encryption key.
    # Required when enabled=true; the chart does not generate this Secret.
    existingSecret:
      # Name of the Kubernetes Secret in the same namespace as GlassFlow.
      name: ""
      # Key within the Secret whose value is the 32-byte encryption key.
      key: "encryption-key"
```

Encryption Configuration Options
| Setting | Type | Default | Description |
|---|---|---|---|
| `global.encryption.enabled` | bool | `false` | Enable AES-256-GCM encryption of credentials stored in PostgreSQL |
| `global.encryption.existingSecret.name` | string | `""` | Name of a pre-existing Kubernetes Secret containing the encryption key. Required when `enabled` is `true` |
| `global.encryption.existingSecret.key` | string | `"encryption-key"` | Key inside the Secret whose value is the 32-byte encryption key |
Pod Configuration
Configure pod-level settings for scheduling and labeling.
```yaml
# Pod annotations (useful for monitoring, logging, etc.)
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"
  prometheus.io/path: "/metrics"

# Pod labels
podLabels: {}

# Node selector for main pods (API and UI)
nodeSelector: {}

# Pod tolerations
tolerations: []

# Pod affinity rules
affinity: {}
```

Pod Configuration Options
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `podAnnotations` | Additional pod annotations | `{}` | Add monitoring annotations |
| `podLabels` | Additional pod labels | `{}` | Useful for service discovery |
| `nodeSelector` | Node selector for scheduling | `{}` | Use for dedicated nodes |
| `tolerations` | Pod tolerations | `[]` | For tainted nodes |
| `affinity` | Pod affinity/anti-affinity rules | `{}` | Control pod placement |
Autoscaling Configuration
Configure horizontal pod autoscaling for the API and UI components.
```yaml
autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 80
```

Autoscaling Configuration Options
| Setting | Description | Default | Production Recommendation |
|---|---|---|---|
| `enabled` | Enable autoscaling | `false` | Enable for production workloads |
| `minReplicas` | Minimum number of replicas | 1 | Set based on minimum load |
| `maxReplicas` | Maximum number of replicas | 5 | Set based on peak load |
| `targetCPUUtilizationPercentage` | Target CPU utilization | 80 | Adjust based on workload |
Best Practices
Production Checklist:
- Use 3+ NATS replicas for high availability
- Set appropriate resource requests and limits
- Enable ingress with TLS
- Configure persistent storage for NATS
- Set up monitoring and logging
- Use node selectors for dedicated resources
Resource Sizing Guidelines
| Environment | API CPU | API Memory | UI CPU | UI Memory | NATS CPU | NATS Memory | NATS Replicas | NATS Storage | Dedup CPU | Dedup Memory |
|---|---|---|---|---|---|---|---|---|---|---|
| Development | 50m | 50Mi | 50m | 256Mi | 500m | 1Gi | 1 | 10Gi | 500m | 512Mi |
| Production | 500m | 500Mi | 200m | 1Gi | 4000m | 4Gi | 3 | 100Gi | 1000m | 1Gi |
| High-Performance | 1000m | 1Gi | 500m | 2Gi | 4000m+ | 8Gi | 5 | 500Gi | 2000m | 2Gi |
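The "Production" row of the table above translates into a values overlay roughly like this sketch (values shown as requests; set limits to match your headroom policy):

```yaml
api:
  resources:
    requests: { cpu: "500m", memory: "500Mi" }
ui:
  resources:
    requests: { cpu: "200m", memory: "1Gi" }
nats:
  config:
    cluster:
      replicas: 3
    jetstream:
      fileStore:
        pvc:
          size: 100Gi
  container:
    merge:
      resources:
        requests: { cpu: "4000m", memory: "4Gi" }
glassflow-operator:
  glassflowComponents:
    dedup:
      resources:
        requests: { cpu: "1000m", memory: "1Gi" }
```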
Monitoring Configuration
```yaml
# Enable comprehensive monitoring
global:
  observability:
    metrics:
      enabled: true
    logs:
      enabled: true
      exporter:
        otlp:
          endpoint: "https://your-otel-collector:4317"
          tls:
            insecure: false

# Add monitoring annotations
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"
  prometheus.io/path: "/metrics"
```

Troubleshooting
Common Issues
- NATS Connection Issues

```yaml
# Ensure NATS is properly configured
nats:
  config:
    cluster:
      replicas: 3 # Must be 2+ for JetStream
```

- Resource Constraints

```yaml
# Check resource requests vs limits
resources:
  requests:
    memory: "100Mi" # Should be realistic
    cpu: "100m"
  limits:
    memory: "200Mi" # Should be higher than requests
    cpu: "250m"
```

- Ingress Not Working

```yaml
# Verify ingress configuration
ingress:
  enabled: true
  ingressClassName: "nginx" # Must match your controller
  hosts:
    - host: "your-domain.com"
```
Validation Commands
```shell
# Validate Helm values
helm template glassflow glassflow/glassflow-etl -f values.yaml --dry-run

# Check resource usage
kubectl top pods -n glassflow

# Verify services
kubectl get svc -n glassflow

# Check ingress
kubectl get ingress -n glassflow
```

Next Steps
After configuring your values.yaml:
- Install GlassFlow:
- Install GlassFlow: `helm install glassflow glassflow/glassflow-etl -f values.yaml`
- Verify Installation: Check pod status and service endpoints
- Configure Monitoring: Set up Prometheus/Grafana dashboards
- Set Up Logging: Configure log aggregation
- Test Functionality: Create your first ETL pipeline
For more information, see the Installation Guide and Pipeline JSON Reference.