Publish and Consume data
This page explains how to publish data to and consume data from GlassFlow pipelines.
Ways to publish data into GlassFlow
Publishing data in GlassFlow means sending data to a pipeline for processing. You can use built-in integrations to publish data, or write Python code to create a custom integration.
Publishing and Consuming data via GlassFlow managed connectors
GlassFlow can automatically publish and consume data on your behalf using managed connectors, which you configure during pipeline creation. Managed source connectors include data sources such as Google Pub/Sub and AWS SQS. On the sink side, you can send data directly to a webhook or to a ClickHouse database using the managed sink connectors.
Visit the Integrations page for more information.
Publishing data using Python SDK
The Python SDK provides a programmatic way to interact with GlassFlow pipelines and to produce or consume data continuously. Using the SDK, you can build a custom connector for any data source in Python.
Prerequisites
You have created a pipeline.
You have the pipeline credentials: PIPELINE_ID and PIPELINE_ACCESS_TOKEN.
Install GlassFlow Python SDK
Install the GlassFlow SDK using the pip command in a terminal.
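For example, assuming the SDK is published on PyPI under the package name glassflow:

```bash
pip install glassflow
```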
Push Data to the pipeline
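The snippet below is a minimal sketch of publishing an event, assuming the SDK exposes a GlassFlowClient with a pipeline_client helper and a publish method, as used in the GlassFlow examples repository. The pipeline ID, access token, and event payload are placeholders; check the Python SDK docs for the exact, current API.

```python
import glassflow

# Create the client and bind it to your pipeline using the
# credentials from the prerequisites (placeholders below).
client = glassflow.GlassFlowClient()
pipeline_client = client.pipeline_client(
    pipeline_id="<PIPELINE_ID>",
    pipeline_access_token="<PIPELINE_ACCESS_TOKEN>",
)

# Events are plain Python dicts; the SDK sends them to the pipeline as JSON.
event = {"user_id": 42, "action": "signup"}

response = pipeline_client.publish(request_body=event)
if response.status_code == 200:
    print("Event published successfully")
```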
As you can see, with just a few lines of Python code you can send data to your GlassFlow pipeline. The pipeline always exchanges data in JSON format.
Your pipeline's transformation function runs automatically on the published data, which is made available to the function as JSON.
Consume Data from the pipeline
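Here is a minimal sketch of consuming transformed events, again assuming the pipeline_client helper and a consume method that returns the transformed event as JSON, as in the GlassFlow examples repository. The polling loop and sleep interval are illustrative.

```python
import time

import glassflow

client = glassflow.GlassFlowClient()
pipeline_client = client.pipeline_client(
    pipeline_id="<PIPELINE_ID>",
    pipeline_access_token="<PIPELINE_ACCESS_TOKEN>",
)

# Poll the pipeline output; a 200 response carries a transformed event.
while True:
    response = pipeline_client.consume()
    if response.status_code == 200:
        print(response.json())  # transformed event as JSON
    time.sleep(1)
```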
The GlassFlow examples GitHub repository has usage examples in Jupyter Notebook format that you can use to learn more about sending data via Python.
You can also take a look at the Python SDK docs to learn more about what else you can do with the GlassFlow Python SDK.