Define a transformation function
This page explains how to create a custom transformation function in Python with GlassFlow.
Data transformation enables the conversion or mapping of data from one format or structure into another. GlassFlow facilitates this process using a custom Python transformation function, allowing for a wide range of transformation scenarios including data cleaning, validation, normalization, enrichment, and more.
Implementing Transformations
To perform data transformations in GlassFlow, you write a Python script containing a mandatory handler function. This function is where you define your transformation logic.
GlassFlow automatically invokes this function when a data pipeline runs and passes it two arguments:
- data: the event dispatched to the pipeline, accessible within the function as a JSON object or Python dictionary.
- log: a Python logging object for generating logs. Any logs you create are included in the pipeline logs, which can be viewed through the CLI.
The handler function processes this data and returns the transformed data as a JSON object or Python dictionary.
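The following is a minimal sketch of such a function, not an official GlassFlow example. It assumes the event arrives as a Python dictionary and uses a hypothetical "email" field for illustration; the local call at the end stands in for the pipeline by passing a standard-library logger as the log argument.

```python
import logging


def handler(data, log):
    """Normalize one event and return it as a dictionary."""
    log.info("Received event with keys: %s", sorted(data))

    # Copy the event so the input dictionary is not mutated.
    transformed = dict(data)

    # "email" is a hypothetical field used only for illustration.
    if "email" in transformed:
        transformed["email"] = transformed["email"].strip().lower()

    return transformed


# Local check: call the function the way the pipeline is described to call it.
logging.basicConfig(level=logging.INFO)
result = handler({"email": "  Ada@Example.COM "}, logging.getLogger("handler"))
print(result)  # {'email': 'ada@example.com'}
```

Returning a new dictionary instead of editing the input in place keeps the function easy to test with plain sample events.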
Default Transformation Function
When you create a pipeline in GlassFlow without a custom transformation function, a default "echo" function is created automatically. It returns each event unchanged.
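As a rough sketch, an echo function of this kind might look like the following. The script GlassFlow actually generates may differ in its details; the log message here is an assumption, and only the handler name and its two arguments come from the description above.

```python
import logging


def handler(data, log):
    # Echo behaviour: record the event, then return it unchanged.
    log.info("Echo: %s", data)
    return data


# Local check with a standard-library logger standing in for the pipeline's log.
event = {"message": "hello"}
assert handler(event, logging.getLogger("echo")) is event
```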
To customize the transformation function, modify the handler.py file to include your transformation logic.
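For illustration, a customized handler.py that combines validation and enrichment could look like the sketch below. The "temperature_f", "temperature_c", and "valid" fields are hypothetical, and flagging invalid events rather than dropping them is a design choice of this example, not documented GlassFlow behaviour.

```python
import logging


def handler(data, log):
    """Validate a temperature reading and enrich the event with Celsius."""
    # "temperature_f" is a hypothetical input field.
    reading = data.get("temperature_f")

    # Validation: reject missing or non-numeric values (bool is excluded
    # explicitly because it is a subclass of int).
    if isinstance(reading, bool) or not isinstance(reading, (int, float)):
        log.warning("Missing or non-numeric temperature_f: %r", reading)
        return {**data, "valid": False}

    # Enrichment: add the converted value alongside the original.
    celsius = round((reading - 32) * 5 / 9, 2)
    return {**data, "temperature_c": celsius, "valid": True}


log = logging.getLogger("handler")
print(handler({"temperature_f": 212}, log))
# {'temperature_f': 212, 'temperature_c': 100.0, 'valid': True}
```

Because the function only depends on its two arguments, it can be exercised locally with sample events before the pipeline is created.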
You can also use other Python dependencies (Python packages that you import into your script) in the transformation function. To declare them, add a requirements.txt file when creating the pipeline.
Next
In the Create a Pipeline section, you will learn how to configure a new pipeline.