Run a demo pipeline
GlassFlow comes with a comprehensive demo environment that allows you to test its capabilities locally. This guide walks you through a local installation using GlassFlow CLI. The CLI brings up a Kind cluster and deploys Kafka, ClickHouse, and GlassFlow ETL. You will then create a pipeline and run through a deduplication demo.
Demo Overview
This guide has you set up a pipeline using the GlassFlow UI, then verify deduplication with sample events.
Install GlassFlow locally
Install the GlassFlow CLI so you can run the demo. You need Docker (or a compatible container runtime), Helm (for example, brew install helm), and kubectl (installed automatically if you use the recommended Homebrew install).
Verify installation
glassflow version
glassflow version 2.2.1
commit: 7b4743328312791cbeb0ba8235dd5b85719e9f2d
date: 2026-03-04T11:15:36Z
Start GlassFlow locally
With the CLI installed, bring up the local environment (Kind, Kafka, ClickHouse, and GlassFlow ETL):
glassflow up
This will:
- Create a Kind cluster (if needed)
- Install GlassFlow ETL (glassflow namespace) via Helm
- Install Kafka (kafka namespace) and ClickHouse (clickhouse namespace) via Helm
- Wait for all services to be ready (up to ~25 minutes)
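Once glassflow up finishes, an optional sanity check is to confirm that pods exist in each of the three namespaces. A sketch, assuming kubectl is on your PATH and its current context points at the Kind cluster:

```shell
# Optional sanity check: list pods in each demo namespace.
# Assumes kubectl is installed and targets the Kind cluster created above.
for ns in glassflow kafka clickhouse; do
  echo "checking namespace: $ns"
  kubectl get pods -n "$ns" 2>/dev/null || echo "  (kubectl not reachable or namespace missing)"
done
```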
The following steps guide you through creating a pipeline in the GlassFlow UI.
For more options and details, see the Installation Guide.
Set up the pipeline via the UI
Create a new topic in Kafka
kubectl exec -n kafka svc/kafka -- bash -c 'cat > /tmp/client.properties << EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="user1" password="glassflow-demo-password";
EOF
kafka-topics.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 \
--command-config /tmp/client.properties \
--create --topic duplicated-events \
--partitions 1 --replication-factor 1'
Send example data to the topic
Send one sample event to the duplicated-events topic so the pipeline will have data to ingest once you create it in the UI:
kubectl exec -n kafka svc/kafka -- bash -c 'cat > /tmp/events.json << "EOFEVENTS"
{"event_id": "49a6fdd6f305428881f3436eb498fc9d", "type": "page_view", "source": "web", "created_at": "2025-03-20T10:00:00Z"}
EOFEVENTS
cat > /tmp/client.properties << EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user1" password="glassflow-demo-password";
EOF
cat /tmp/events.json | kafka-console-producer.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 \
--producer.config /tmp/client.properties \
--topic duplicated-events'
Create a new table in ClickHouse
kubectl exec -n clickhouse svc/clickhouse -- clickhouse-client \
--user default \
--password glassflow-demo-password \
--query "CREATE TABLE IF NOT EXISTS deduplicated_events (event_id UUID, type String, source String, created_at DateTime) ENGINE = MergeTree ORDER BY event_id"
Configure pipeline in the UI
Once the environment is running, the following endpoints are available (ports may vary if you chose alternatives during installation):
- GlassFlow UI: http://localhost:30080
- GlassFlow API: http://localhost:30180
- ClickHouse HTTP: http://localhost:30090
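As an optional reachability check (assuming the default ports above), you can probe each endpoint with curl:

```shell
# Probe the demo endpoints; adjust the ports if you chose alternatives.
for url in http://localhost:30080 http://localhost:30180 http://localhost:30090; do
  echo "probing $url"
  curl -fs -o /dev/null --max-time 2 "$url" 2>/dev/null && echo "  reachable" || echo "  not reachable"
done
```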
Open the GlassFlow UI and use the connection details below to create a pipeline.
- In this demo you will create a single-topic pipeline.
- Give the pipeline a name (for example, "Demo Pipeline").
- The UI automatically generates a pipeline ID.
Kafka Connection
Authentication Method: SASL/PLAIN
Security Protocol: SASL_PLAINTEXT
Bootstrap Servers: kafka.kafka.svc.cluster.local:9092
Username: user1
Password: glassflow-demo-password
Kafka Topic
Topic Name: duplicated-events
Consumer Group Initial Offset: latest
Schema:
{
  "event_id": "ddccabe2-c673-4d8a-affc-8647db00f7b5",
  "type": "page_view",
  "source": "web",
  "created_at": "2025-12-03T15:17:34.907877Z"
}
Deduplication
Enabled: true
Deduplicate Key: event_id
Deduplicate Key Type: string
Time Window: 1h
Skip Filter and skip Transform.
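To make the deduplication setting concrete: within the configured time window, events that share the same event_id are collapsed so only one copy reaches ClickHouse (in this demo the first copy survives, as the verification step later shows). A rough local illustration of keep-first deduplication, using plain awk rather than the GlassFlow engine:

```shell
# Keep only the first occurrence of each event_id. Splitting on double quotes
# makes the event_id value field 4, so awk drops lines whose key was seen before.
printf '%s\n' \
  '{"event_id": "49a6fdd6f305428881f3436eb498fc9d", "type": "page_view"}' \
  '{"event_id": "49a6fdd6f305428881f3436eb498fc9d", "type": "page_view"}' \
  '{"event_id": "f0ed455046a543459d9a51502cdc756d", "type": "page_view"}' \
| awk -F'"' '!seen[$4]++'
# prints two lines: one per unique event_id
```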
ClickHouse Connection
Host: clickhouse.clickhouse.svc.cluster.local
HTTP/S Port: 8123
Native Port: 9000
Username: default
Password: glassflow-demo-password
Use SSL: false
ClickHouse Table
Table: deduplicated_events
Wait for the pipeline to be deployed; the UI redirects to the pipeline page once deployment completes.
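One detail worth noting before sending data: the demo events carry event_id as 32 hex characters without dashes, while the ClickHouse column is typed UUID, so the rows you query later come back in the canonical dashed 8-4-4-4-12 form. The regrouping itself is mechanical; a sed sketch of it:

```shell
# Regroup a 32-char hex string into the canonical dashed UUID layout (8-4-4-4-12).
raw="49a6fdd6f305428881f3436eb498fc9d"
echo "$raw" | sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
# prints 49a6fdd6-f305-4288-81f3-436eb498fc9d
```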
Send data to Kafka
Run the following on your machine in a terminal:
# Send multiple JSON events to Kafka
kubectl exec -n kafka svc/kafka -- bash -c 'cat > /tmp/events.json << "EOFEVENTS"
{"event_id": "49a6fdd6f305428881f3436eb498fc9d", "type": "page_view", "source": "web", "created_at": "2025-03-20T10:00:00Z"}
{"event_id": "49a6fdd6f305428881f3436eb498fc9d", "type": "page_view", "source": "web", "created_at": "2025-03-20T10:01:00Z"}
{"event_id": "f0ed455046a543459d9a51502cdc756d", "type": "page_view", "source": "web", "created_at": "2025-03-20T10:03:00Z"}
EOFEVENTS
cat > /tmp/client.properties << EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user1" password="glassflow-demo-password";
EOF
cat /tmp/events.json | kafka-console-producer.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 \
--producer.config /tmp/client.properties \
--topic duplicated-events'
Verify Results
After a short wait (up to the pipeline's maximum delay time, 1 minute by default), you should see the deduplicated events in ClickHouse:
kubectl exec -n clickhouse svc/clickhouse -- clickhouse-client \
--user default \
--password glassflow-demo-password \
--format prettycompact \
--query "SELECT * FROM deduplicated_events"
   ┌─event_id─────────────────────────────┬─type──────┬─source─┬──────────created_at─┐
1. │ 49a6fdd6-f305-4288-81f3-436eb498fc9d │ page_view │ web    │ 2025-03-20 10:00:00 │
2. │ f0ed4550-46a5-4345-9d9a-51502cdc756d │ page_view │ web    │ 2025-03-20 10:03:00 │
   └──────────────────────────────────────┴───────────┴────────┴─────────────────────┘
Congratulations! You've completed the demo.
Summary of what you achieved:
- You created a GlassFlow pipeline via the UI; it runs on your local Kubernetes cluster (started with the GlassFlow CLI).
- You sent eventsβincluding duplicatesβto a Kafka topic.
- GlassFlow consumed from the topic, deduplicated by event_id, and wrote the result to a ClickHouse table.
- You verified the deduplicated data in the ClickHouse table.
To start creating your own pipelines with the UI, you can follow the Web UI Usage guide.
Cleaning Up
To clean up the demo environment:
glassflow down
or force deletion by adding the --force flag:
glassflow down --force