Weaviate
Integrating Weaviate as a Sink with GlassFlow Using Webhook Connector
© 2023 GlassFlow
In this guide, you will learn how to integrate Weaviate as a data sink with GlassFlow using the Webhook connector. With this setup, GlassFlow automatically pushes the data it processes into Weaviate, where you can run vector searches over it and build AI applications on top. Read more about the use case with Weaviate and GlassFlow.
Before you begin, make sure you have the following:
A Weaviate cluster instance that is up and running.
A GlassFlow account. Sign up for a free GlassFlow account.
Step 1: Set Up a Weaviate Collection
Log in to the Weaviate console.
Create a new collection.
Choose a vectorizer type, such as text2vec-openai, and a model, such as text-embedding-3-small.
Copy the Weaviate API URL and the Admin API key.
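If you prefer to define the collection outside the console, the same choices can be written as a class definition for Weaviate's REST schema endpoint (POST /v1/schema). The sketch below only builds that definition; the collection name "Article" and the properties "title" and "body" are hypothetical, and the field names should be checked against the Weaviate version you run.

```python
import json


def build_collection_definition(name, text_properties):
    """Return a class definition suitable for Weaviate's POST /v1/schema."""
    return {
        "class": name,
        # Vectorizer and model mirror the choices made in the console.
        "vectorizer": "text2vec-openai",
        "moduleConfig": {
            "text2vec-openai": {"model": "text-embedding-3-small"}
        },
        # Every property is declared as plain text in this sketch.
        "properties": [
            {"name": prop, "dataType": ["text"]} for prop in text_properties
        ],
    }


# "Article", "title" and "body" are illustrative names, not part of the guide.
definition = build_collection_definition("Article", ["title", "body"])
print(json.dumps(definition, indent=2))
```

The printed JSON could be sent as the request body to the schema endpoint with the Admin API key; creating the collection in the console achieves the same result.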
Step 2: Log in to GlassFlow and Create a New Pipeline
Log in to the GlassFlow WebApp.
Navigate to the GlassFlow WebApp and log in with your credentials.
Create a New Pipeline.
Go to the "Pipelines" section and click on "Create New Pipeline."
Provide a name for your pipeline, for example, Weaviate-Sink-Pipeline.
Select the "Space" you want the pipeline to reside in.
Step 3: Configure the Webhook as a Data Source
Choose "Webhook" as the Data Source.
During the pipeline creation process, select "Webhook" as your data source connector.
GlassFlow will provide you with a unique Webhook URL. This is where you can send data from any source to be processed and sent to Weaviate.
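As a rough illustration of sending data to the Webhook URL, the sketch below builds a JSON POST request with Python's standard library. The URL is a placeholder, the event fields are made up, and any authentication header your pipeline requires is an assumption you would pass through extra_headers using the values shown in the WebApp.

```python
import json
import urllib.request


def build_webhook_request(webhook_url, event, extra_headers=None):
    """Build (but do not send) a JSON POST request for a webhook URL."""
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(event).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_webhook_request(
    "https://example.invalid/your-glassflow-webhook",  # placeholder URL
    {"title": "Hello", "body": "First event"},  # hypothetical event
)
# Calling urllib.request.urlopen(request) would send the event;
# it is left out so the sketch runs without network access.
```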
Step 4: Add a Transformation Stage
Configure the Transformation Stage.
In the pipeline setup, you will see an option to add a transformation function. This is where you can define how the data should be processed before it reaches Weaviate.
Upload a Python script (transform.py) or write your transformation logic directly in the GlassFlow WebApp.
For instance, if you need to enrich the incoming data or convert it into vectors with an embedding model, such as one from OpenAI, you can implement that logic here.
Choose Dependencies (if needed).
If your transformation requires external libraries (e.g., pandas, openai), you can select them from the dependency menu in GlassFlow.
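A transformation for this pipeline might look like the minimal sketch below. It assumes that GlassFlow calls a function named handler(data, log) once per event with a dictionary and a logger-like object, and that the collection is called "Article" with "title" and "body" properties; all of these are assumptions to adapt to your setup. It reshapes each event into the body that Weaviate's /v1/objects endpoint accepts. No vector is computed here, on the assumption that the collection's vectorizer generates embeddings on insert.

```python
import logging


def handler(data, log):
    """Reshape one incoming event into a Weaviate object payload."""
    # Normalize the fields we care about; missing fields become empty strings.
    title = str(data.get("title", "")).strip()
    body = str(data.get("body", "")).strip()
    log.info("Preparing Weaviate object for title=%r", title)
    return {
        "class": "Article",  # hypothetical collection name
        "properties": {"title": title, "body": body},
    }


# Local check with a standard logger standing in for the log argument.
example = handler(
    {"title": " Hello ", "body": "First event"},
    logging.getLogger("transform"),
)
print(example)
```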
Step 5: Choose Weaviate as a Data Sink
Select "Webhook" as Your Data Sink.
After setting up the transformation stage, choose "Webhook" as your data sink connector. The connector delivers each transformed event to Weaviate's REST API.
Configure the Data Sink.
Fill in the URL and headers under Connector Details:
Method: POST
URL: https://${WEAVIATE_API_URL}/v1/objects
Headers:
Content-Type: application/json
Authorization: Bearer ${WEAVIATE_API_KEY}
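To check the connector details before deploying, it can help to reproduce the request by hand. The sketch below builds an equivalent request with Python's standard library; Weaviate expects the API key as a Bearer token in the Authorization header. The fallback host name, key, and object are placeholders, and the function name is hypothetical.

```python
import json
import os
import urllib.request


def build_sink_request(api_url, api_key, weaviate_object):
    """Build (but do not send) the POST request for /v1/objects."""
    return urllib.request.Request(
        f"https://{api_url}/v1/objects",
        data=json.dumps(weaviate_object).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


sink_request = build_sink_request(
    # Placeholders are used when the environment variables are not set.
    os.environ.get("WEAVIATE_API_URL", "my-cluster.example.invalid"),
    os.environ.get("WEAVIATE_API_KEY", "placeholder-key"),
    {"class": "Article", "properties": {"title": "Hello"}},
)
# urllib.request.urlopen(sink_request) would create the object in Weaviate.
```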
Step 6: Finalize and Deploy the Pipeline
Review and Deploy the Pipeline.
Review all the settings in the GlassFlow WebApp and click "Create Pipeline."
Your pipeline will now be active and start sending processed data to Weaviate via the Webhook connector.
Step 7: Query and Analyze Data in Weaviate
Monitor Incoming Data in Weaviate.
Once data starts flowing into Weaviate, you can use its query capabilities to analyze and retrieve data.
Perform vector searches, filter results based on properties, and build intelligent applications on top of your Weaviate data.
Advanced Searches (Optional).
Leverage Weaviate’s vector search capabilities to perform more sophisticated queries and provide better results for your end-users.
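One way to try a vector search is Weaviate's GraphQL endpoint (POST /v1/graphql) with a nearText filter. The sketch below only builds the request body; the collection "Article", its fields, and the search phrase are hypothetical, and nearText assumes a text vectorizer is enabled on the collection.

```python
import json


def build_near_text_query(collection, concept, fields, limit=5):
    """Return the JSON body for a nearText search via POST /v1/graphql."""
    # json.dumps yields a quoted, escaped list that is also valid GraphQL.
    concepts = json.dumps([concept])
    field_list = " ".join(fields)
    query = (
        "{ Get { %s(nearText: {concepts: %s}, limit: %d) "
        "{ %s _additional { distance } } } }"
        % (collection, concepts, limit, field_list)
    )
    return {"query": query}


payload = build_near_text_query(
    "Article", "streaming data pipelines", ["title", "body"], limit=3
)
print(payload["query"])
```

The payload would be sent with the same Content-Type and Authorization headers used for the sink, and the response lists the closest objects along with their distances.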
Conclusion
By following this tutorial, you’ve set up a real-time data pipeline that streams processed data directly from GlassFlow into Weaviate using the Webhook sink connector. This integration is well suited to applications that need fast, scalable, and intelligent data retrieval.