Real-time log data anomaly detection

A practical example of creating a pipeline for real-time log data anomaly detection using AI.

This example pipeline uses GlassFlow and an AI model to monitor server logs, detect unusual patterns or suspicious activity, and send notifications to Slack.

Setting Up the Pipeline with GlassFlow

You will use the GlassFlow WebApp to create a data processing pipeline.

Prerequisites

To start with the pipeline creation, you need the following.

Step 1. Log in to GlassFlow WebApp

Navigate to the GlassFlow WebApp and log in with your credentials.

Step 2. Create a New Pipeline

Click on "Create New Pipeline" and provide a name. You can name it "Log Data Anomaly Detection".

Step 3. Configure a Data Source

Select "SDK" to configure the pipeline to use the Python SDK for ingesting log data from a source such as PostgreSQL. For this demo, a sample Python script generates the server logs.
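To give a feel for what such a generator might produce, here is a minimal sketch of a synthetic log event. The field names (timestamp, ip, method, endpoint, status, response_time_ms) and the value pools are assumptions made for illustration; the actual script in the repository may use a different schema. Publishing to the pipeline is left out on purpose, since that part depends on the GlassFlow SDK.

```python
import random
from datetime import datetime, timezone

# Hypothetical value pools; the real generator may use different ones.
METHODS = ["GET", "POST", "PUT", "DELETE"]
ENDPOINTS = ["/login", "/api/orders", "/api/users", "/health"]
STATUS_CODES = [200, 200, 200, 201, 404, 500]  # weighted towards success


def generate_log_event(rng: random.Random) -> dict:
    """Return one synthetic server log entry as a JSON-serializable dict."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "ip": ".".join(str(rng.randint(1, 254)) for _ in range(4)),
        "method": rng.choice(METHODS),
        "endpoint": rng.choice(ENDPOINTS),
        "status": rng.choice(STATUS_CODES),
        "response_time_ms": rng.randint(5, 3000),
    }


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed makes the random fields repeatable
    for _ in range(3):
        print(generate_log_event(rng))
```

Each generated dict would then be published to the pipeline as one event.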

Step 4. Define the Transformer

The AI-powered transformation function, written in Python, detects anomalies in log data using a large language model (LLM) such as OpenAI's GPT-3.5-turbo. Create an OpenAI API key and set it in the transformation code:

openai.api_key="{REPLACE_WITH_YOUR_OPENAI_API_KEY}"

Then paste the updated transformation function code into the transformer's built-in editor.

Note that your code must implement a handler function. Without it, the transformation will fail.
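The overall shape of such a transformation might look like the sketch below. It assumes the handler receives one event as a dict plus a logger, i.e. handler(data, log), and returns the transformed event. The OpenAI call is replaced by a hypothetical offline stub, ask_llm, so the example runs without an API key; the verdict fields anomaly and reason are also illustrative and not taken from the tutorial's code.

```python
import json


def ask_llm(log_entry: dict) -> str:
    """Stand-in for the OpenAI chat call (hypothetical stub).

    A real implementation would send the log entry to the model and return
    its text reply. This stub flags 5xx statuses so the sketch runs offline.
    """
    status = log_entry.get("status")
    is_anomaly = isinstance(status, int) and status >= 500
    reason = "server error status" if is_anomaly else ""
    return json.dumps({"anomaly": is_anomaly, "reason": reason})


def parse_verdict(reply: str) -> dict:
    """Parse the model reply defensively; models do not always return valid JSON."""
    fallback = {"anomaly": False, "reason": "unparseable model reply"}
    try:
        verdict = json.loads(reply)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(verdict, dict):
        return fallback
    return {
        "anomaly": bool(verdict.get("anomaly", False)),
        "reason": str(verdict.get("reason", "")),
    }


def handler(data, log):
    """Entry point: receives one event and returns the enriched event."""
    verdict = parse_verdict(ask_llm(data))
    log.info(f"anomaly={verdict['anomaly']}")
    return {**data, **verdict}
```

Keeping the reply parsing separate from the model call makes it easier to swap the stub for a real OpenAI request later without touching the handler.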

Step 5. Choose a transformer dependency

The transformation function uses the external openai library, so select it from the Dependencies dropdown menu. GlassFlow then includes the library in the function's deployment and runtime. Read more about Python dependencies for transformation.

Step 6. Configure a Data Sink

Select "Webhook" as the data sink so that the pipeline posts its output to a Slack Incoming Webhook URL.

Fill in the URL and headers under Connector Details:

  1. Method: POST

  2. URL: https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX

  3. Headers:

    • Content-Type: application/json
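For reference, the request described by these settings can be sketched with the Python standard library. Slack Incoming Webhooks accept a JSON body with a text field; whether the transformer's output has to be shaped this way before it reaches the sink is an assumption here, and the helper name build_slack_request is hypothetical. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack Incoming Webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL from the tutorial; replace it with your own webhook URL.
    url = "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
    request = build_slack_request(url, "Anomaly detected: repeated 500s on /login")
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Sending the same request by hand is a quick way to check the webhook URL before wiring it into the pipeline.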

Step 7. Confirm the Pipeline

Confirm the pipeline settings in the final step and click "Create Pipeline".

Step 8. Copy the Pipeline Credentials

Once the pipeline is created, copy its credentials: the Pipeline ID and the Access Token.

Send sample log data to the pipeline

Prerequisites

To complete this part you'll need the following:

Installation

  1. Clone the glassflow-examples repository to your local machine:

    git clone https://github.com/glassflow/glassflow-examples.git
  2. Navigate to the project directory:

    cd glassflow-examples/use-cases/real-time-data-anomaly-detection
  3. Create a new virtual environment:

    python -m venv .venv && source .venv/bin/activate
  4. Install the required dependencies:

    pip install -r requirements.txt

Create an environment configuration file

Create a .env file in the project directory with the following configuration variables:

PIPELINE_ID=your_pipeline_id
PIPELINE_ACCESS_TOKEN=your_pipeline_access_token

Replace your_pipeline_id and your_pipeline_access_token with the Pipeline ID and Access Token you copied after creating the pipeline.
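A script can read these two values in several ways. The sketch below uses only the standard library and is not necessarily how producer.py does it (a helper package such as python-dotenv is a common alternative); the function names load_env and require are hypothetical.

```python
import os
from pathlib import Path


def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require(values: dict, key: str) -> str:
    """Return a setting from the file or the process environment, or fail loudly."""
    value = values.get(key) or os.environ.get(key)
    if not value:
        raise RuntimeError(f"Missing required setting: {key}")
    return value


if __name__ == "__main__":
    settings = load_env() if Path(".env").exists() else {}
    print("pipeline id set:", bool(settings.get("PIPELINE_ID")))
```

Failing early on a missing PIPELINE_ID or PIPELINE_ACCESS_TOKEN gives a clearer error than an authentication failure later on.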

Run the Log Producer

Run the producer.py script in a terminal to publish sample server log data to the GlassFlow pipeline:

python producer.py

The GlassFlow pipeline automatically sends transformed events to Slack when it detects suspicious or unusual activity in the sample logs. You should see the resulting messages arrive in Slack.

Summary

By following this tutorial, you've set up a real-time log data anomaly detection pipeline using GlassFlow, OpenAI, and Slack. Enriched logs containing the identified anomalies can also be sent to Amazon S3 or OpenSearch Service for further analysis and long-term storage. Alert notifications can likewise be integrated with communication platforms such as Microsoft Teams or SMS services like Twilio.

This pipeline can be adapted to other real-time alerting use cases, including monitoring financial transactions for fraud, detecting security breaches, tracking performance metrics, and ensuring compliance with regulatory requirements.

© 2023 GlassFlow