
Run a demo pipeline

GlassFlow comes with a comprehensive demo environment that allows you to test its capabilities locally. This guide walks you through a local installation using GlassFlow CLI. The CLI brings up a Kind cluster and deploys Kafka, ClickHouse, and GlassFlow ETL. You will then create a pipeline and run through a deduplication demo.

Demo Overview

This guide has you set up a pipeline using the GlassFlow UI, then verify deduplication with sample events.

Install GlassFlow locally

Install the GlassFlow CLI so you can run the demo. You need Docker (or a compatible container runtime), Helm (for example, brew install helm), and kubectl (installed automatically via Homebrew if you use the recommended install).

Verify installation

glassflow version
glassflow version 2.2.1 commit: 7b4743328312791cbeb0ba8235dd5b85719e9f2d date: 2026-03-04T11:15:36Z

Start GlassFlow locally

With the CLI installed, bring up the local environment (Kind, Kafka, ClickHouse, and GlassFlow ETL):

glassflow up

This will:

  • Create a Kind cluster (if needed)
  • Install GlassFlow ETL (glassflow namespace) via Helm
  • Install Kafka (kafka namespace) and ClickHouse (clickhouse namespace) via Helm
  • Wait for all services to be ready (up to ~25 minutes)

The following steps guide you through creating a pipeline in the GlassFlow UI.

For more options and details, see the Installation Guide.

Set up the pipeline via the UI

Create a new topic in Kafka

kubectl exec -n kafka svc/kafka -- bash -c 'cat > /tmp/client.properties << EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="user1" password="glassflow-demo-password";
EOF
kafka-topics.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 \
  --command-config /tmp/client.properties \
  --create --topic duplicated-events \
  --partitions 1 --replication-factor 1'

Send example data to the topic

Send one sample event to the duplicated-events topic so the pipeline will have data to ingest once you create it in the UI:

kubectl exec -n kafka svc/kafka -- bash -c 'cat > /tmp/events.json << "EOFEVENTS"
{"event_id": "49a6fdd6f305428881f3436eb498fc9d", "type": "page_view", "source": "web", "created_at": "2025-03-20T10:00:00Z"}
EOFEVENTS
cat > /tmp/client.properties << EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user1" password="glassflow-demo-password";
EOF
cat /tmp/events.json | kafka-console-producer.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 \
  --producer.config /tmp/client.properties \
  --topic duplicated-events'
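Note that kafka-console-producer.sh treats each line of /tmp/events.json as one message, so the file must be newline-delimited JSON (one object per line). A minimal Python sketch of building such a payload; the make_event helper is hypothetical, but the field names follow the demo schema:

```python
# Build a newline-delimited JSON payload like /tmp/events.json above.
# make_event is a hypothetical helper for this sketch.
import json
import uuid
from datetime import datetime, timezone

def make_event(event_id=None):
    """Return one demo event; event_id defaults to a fresh undashed UUID."""
    return {
        "event_id": event_id or uuid.uuid4().hex,
        "type": "page_view",
        "source": "web",
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

# One JSON object per line -- the format the console producer expects.
payload = "\n".join(json.dumps(make_event()) for _ in range(3))
print(payload)
```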

Create a new table in ClickHouse

kubectl exec -n clickhouse svc/clickhouse -- clickhouse-client \
  --user default \
  --password glassflow-demo-password \
  --query "CREATE TABLE IF NOT EXISTS deduplicated_events (event_id UUID, type String, source String, created_at DateTime) ENGINE = MergeTree ORDER BY event_id"

Configure pipeline in the UI

Once everything is running, open the GlassFlow UI in your browser (the port may vary if you chose alternatives during installation) and use the connection details below to create a pipeline.

  • In this demo you will create a single-topic pipeline.
  • Give the pipeline a name (for example, β€œDemo Pipeline”).
  • The UI will automatically generate a pipeline ID for the pipeline.

Kafka Connection

Authentication Method: SASL/PLAIN
Security Protocol: SASL_PLAINTEXT
Bootstrap Servers: kafka.kafka.svc.cluster.local:9092
Username: user1
Password: glassflow-demo-password

Kafka Topic

Topic Name: duplicated-events
Consumer Group Initial Offset: latest
Schema:

{
  "event_id": "ddccabe2-c673-4d8a-affc-8647db00f7b5",
  "type": "page_view",
  "source": "web",
  "created_at": "2025-12-03T15:17:34.907877Z"
}

Deduplication

Enabled: true
Deduplicate Key: event_id
Deduplicate Key Type: string
Time Window: 1h
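Conceptually, time-window deduplication keeps the first event seen for each deduplicate key and drops later events with the same key inside the window. A minimal Python sketch of that behavior (illustrative only, not GlassFlow's actual implementation):

```python
# Illustrative sketch of time-window deduplication: keep the first event
# seen for each event_id; drop later events with the same id unless they
# arrive more than `window` after the first kept occurrence.
from datetime import datetime, timedelta

def deduplicate(events, key="event_id", window=timedelta(hours=1)):
    first_seen = {}  # key value -> timestamp of first kept occurrence
    kept = []
    for ev in events:
        ts = datetime.fromisoformat(ev["created_at"].replace("Z", "+00:00"))
        prev = first_seen.get(ev[key])
        if prev is None or ts - prev > window:
            first_seen[ev[key]] = ts
            kept.append(ev)
    return kept

events = [
    {"event_id": "49a6fdd6f305428881f3436eb498fc9d", "created_at": "2025-03-20T10:00:00Z"},
    {"event_id": "49a6fdd6f305428881f3436eb498fc9d", "created_at": "2025-03-20T10:01:00Z"},  # duplicate within 1h
    {"event_id": "f0ed455046a543459d9a51502cdc756d", "created_at": "2025-03-20T10:03:00Z"},
]
print(len(deduplicate(events)))  # 2
```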

Skip Filter and skip Transform.

ClickHouse Connection

Host: clickhouse.clickhouse.svc.cluster.local
HTTP/S Port: 8123
Native Port: 9000
Username: default
Password: glassflow-demo-password
Use SSL: false

ClickHouse Table

Table: deduplicated_events

Wait for the deployment to finish; the UI will redirect to the pipeline page once the pipeline is deployed.

Send data to Kafka

Run the following on your machine in a terminal:

# Send multiple JSON events to Kafka
kubectl exec -n kafka svc/kafka -- bash -c 'cat > /tmp/events.json << "EOFEVENTS"
{"event_id": "49a6fdd6f305428881f3436eb498fc9d", "type": "page_view", "source": "web", "created_at": "2025-03-20T10:00:00Z"}
{"event_id": "49a6fdd6f305428881f3436eb498fc9d", "type": "page_view", "source": "web", "created_at": "2025-03-20T10:01:00Z"}
{"event_id": "f0ed455046a543459d9a51502cdc756d", "type": "page_view", "source": "web", "created_at": "2025-03-20T10:03:00Z"}
EOFEVENTS
cat > /tmp/client.properties << EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user1" password="glassflow-demo-password";
EOF
cat /tmp/events.json | kafka-console-producer.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 \
  --producer.config /tmp/client.properties \
  --topic duplicated-events'

Verify Results

After a few seconds (up to the pipeline's maximum delay time, 1 minute by default), you should see the deduplicated events in ClickHouse:

kubectl exec -n clickhouse svc/clickhouse -- clickhouse-client \
  --user default \
  --password glassflow-demo-password \
  --format prettycompact \
  --query "SELECT * FROM deduplicated_events"

   ┌─event_id─────────────────────────────┬─type──────┬─source─┬──────────created_at─┐
1. │ 49a6fdd6-f305-4288-81f3-436eb498fc9d │ page_view │ web    │ 2025-03-20 10:00:00 │
2. │ f0ed4550-46a5-4345-9d9a-51502cdc756d │ page_view │ web    │ 2025-03-20 10:03:00 │
   └──────────────────────────────────────┴───────────┴────────┴─────────────────────┘

Congratulations! You’ve completed the demo.

Here’s what you achieved:

  • You created a GlassFlow pipeline via the UI; it runs on your local Kubernetes cluster (started with the GlassFlow CLI).
  • You sent eventsβ€”including duplicatesβ€”to a Kafka topic.
  • GlassFlow consumed from the topic, deduplicated by event_id, and wrote the result to a ClickHouse table.
  • You verified the deduplicated data in the ClickHouse table.

To start creating your own pipelines with the UI, you can follow the Web UI Usage guide.

Cleaning Up

To clean up the demo environment:

glassflow down

or force deletion by adding the --force flag:

glassflow down --force