Real-time log data anomaly detection
A practical example of creating a pipeline for real-time logs data anomaly detection using AI.
Last updated
A practical example of creating a pipeline for real-time logs data anomaly detection using AI.
Last updated
© 2023 GlassFlow
This example data transformation pipeline demonstrates data anomaly detection with GlassFlow and AI to monitor server logs to detect unusual patterns or suspicious activities and send notifications to Slack.
Link to the GitHub project repository
You will use the GlassFlow WebApp to create a data processing pipeline.
To start with the pipeline creation, you need the following.
You have an OpenAI API account.
Slack account: If don't have a Slack account, sign up for a new free one here and go to the Slack Get Started page.
Slack workspace: You need access to a Slack workspace where you're an admin. If you are creating just a new workspace, follow this guide.
You created an incoming webhook for your Slack workspace.
Navigate to the GlassFlow WebApp and log in with your credentials.
Click on "Create New Pipeline" and provide a name. You can name it "Log Data Anomaly Detection".
Select "SDK" to configure the pipeline to use Python SDK to ingest log data from a source like PostgreSQL. For the sake of the demo, we use a sample server logs generator Python script.
AI-powered transformation function in Python detects anomalies in log data using LLMs(Large Language Models) like GPT-3.5-turbo
from OpenAI. Create an API key and set the API key
in the transformation code below. Paste the updated transformation function code into the transformer's built-in editor.
Note that the handler function is mandatory to implement in your code. Without it, the transformation function will not be successful.
The transformation function uses openai external library in the code, so we need to choose it from the Dependencies dropdown menu. GlassFlow includes the library in the function deployment and runtime. Read more about Python dependencies for transformation.
Select "Webhook" as a data sink to configure the pipeline to use the Slack Incoming Webhook URL.
Fill in the URL and headers under Connector Details:
Method: POST
URL: https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
Headers:
Content-Type
: application/json
Confirm the pipeline settings in the final step and click "Create Pipeline".
Once the pipeline is created, copy its credentials such as Pipeline ID and Access Token.
To complete this part you'll need the following:
Python is installed on your machine.
Download and Install Pip to manage project packages.
Clone the glassflow-examples
repository to your local machine:
Navigate to the project directory:
Create a new virtual environment:
Install the required dependencies:
Add a .env
file in the project directory and add the following configuration variables:
Replace your_pipeline_id
and your_pipeline_access_token
with appropriate values obtained from your GlassFlow account.
Run producer.py
Python script in a terminal to publish sample server log data to the GlassFlow pipeline:
GlassFlow pipeline automatically sends transformed events to Slack in case any suspicious or unusual activities are detected in the sample logs. You should see an output indicating that messages are being received on Slack.
Following this tutorial, you’ve set up a real-time log data anomaly detection pipeline using GlassFlow, Open AI, and Slack. Enriched logs, containing identified anomalies, can also be sent to Amazon S3 or OpenSearch Service for further analysis and long-term storage. Additionally, alert notifications can be integrated with communication platforms such as Microsoft Teams or SMS services like Twilio.
This pipeline can be easily adapted for other real-time alerting use cases. That includes monitoring financial transactions for fraud, detecting security breaches, tracking performance metrics, and ensuring compliance with regulatory requirements.