Migration Guide: v2.4.x to v2.5.x

This guide provides detailed instructions for migrating from GlassFlow v2.4.x to v2.5.x. This is a major architectural upgrade that introduces PostgreSQL metadata storage and requires careful migration planning.

⚠️ Breaking Change: v2.5.x introduces breaking changes that require migration. Rolling back from v2.5.x to v2.4.x is not supported.

What Changes in v2.5.x

Major Architectural Changes

  1. PostgreSQL Metadata Storage - Replaces NATS KV for storing pipeline configurations, schemas, and state
  2. File-Based Deduplication - New BadgerDB-based deduplication service with persistent storage
  3. Enhanced Pipeline Schema - Improved pipeline configuration format and validation
  4. New User Interface - Completely redesigned UI with better user experience

Infrastructure Requirements

  • Additional Storage - PostgreSQL requires persistent storage (default: 10Gi)
  • Deduplication Storage - File-based deduplication requires persistent storage
  • Memory Requirements - Pipelines with deduplication enabled need drastically less NATS memory than in v2.4.x

Pre-Migration Checklist

Backup Current Configuration

Before starting migration, create backups of your current pipeline configurations:

Manual backup through the web interface:

  1. Access your GlassFlow UI at http://your-glassflow-url
  2. Navigate to the pipelines list
  3. For each pipeline, click “Download config” to save the pipeline.json file
  4. Store all downloaded files in a backup directory

This method is recommended for users with a small number of pipelines or those who prefer a visual approach.

Additional Backups

    # Export current Helm values
    helm get values your-release-name -n glassflow > current-values.yaml

Verify System Requirements

Ensure your Kubernetes cluster meets the requirements:

    # Check available storage
    kubectl get storageclass

    # Verify node resources
    kubectl describe nodes

    # Check current GlassFlow version
    helm list -n glassflow
    kubectl get pods -n glassflow -o wide

Stop All Running Pipelines

Critical: All pipelines must be stopped before migration. You can stop them via the UI or API:

    # Via API - Stop each pipeline individually
    curl -X POST http://your-glassflow-api/api/v1/pipeline/{pipeline-id}/stop

    # Verify all pipelines are stopped
    curl http://your-glassflow-api/api/v1/pipeline | jq '.[] | {id: .id, status: .status}'
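With many pipelines, the stop step can be scripted. The following is a minimal sketch, not an official GlassFlow tool: `stop_pipelines` is a hypothetical helper that reads pipeline IDs from standard input and calls the stop endpoint shown above for each one. It assumes `curl` is installed and that the endpoint answers with a non-2xx status when a stop request fails.

```shell
#!/bin/sh
# stop_pipelines BASE_URL
# Hypothetical helper: reads pipeline IDs (one per line) from stdin and
# sends POST BASE_URL/api/v1/pipeline/<id>/stop for each of them.
# Returns 1 if any request failed, 0 otherwise.
stop_pipelines() {
  base=$1
  failed=0
  while IFS= read -r id; do
    # Skip blank lines in the input
    [ -n "$id" ] || continue
    # -f makes curl exit non-zero on HTTP errors; stdin is detached so
    # curl cannot consume the remaining IDs
    if curl -fsS -X POST "$base/api/v1/pipeline/$id/stop" </dev/null >/dev/null; then
      echo "stopped $id"
    else
      echo "failed to stop $id" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example (assumes the list endpoint returns a JSON array of objects with
# an "id" field, as the jq filter above suggests):
#   curl -s http://your-glassflow-api/api/v1/pipeline | jq -r '.[].id' \
#     | stop_pipelines http://your-glassflow-api
```

After it finishes, re-run the verification command above to confirm that every pipeline reports a stopped status before upgrading.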

Migration Process

Update Helm Repository

    helm repo update

    # Verify new chart version is available
    helm search repo glassflow/glassflow-etl --versions

Expected output should show version 0.4.1 with app version 2.5.1:

    NAME                     CHART VERSION  APP VERSION  DESCRIPTION
    glassflow/glassflow-etl  0.4.1          2.5.1        A Helm chart for Kubernetes

Prepare Helm Values

Create or update your Helm values file to include PostgreSQL configuration:

    # values-v2.5.x.yaml

    # PostgreSQL configuration (new in v2.5.x)
    postgresql:
      enabled: true
      auth:
        database: "glassflow"
        username: "glassflow"
        password: "your-secure-password"  # Change this!

    # Existing configuration (preserve your current settings)
    # ... your existing Helm values ...

Execute Helm Upgrade

Run the Helm upgrade with an extended timeout so the migration has time to complete:

    helm upgrade your-release-name glassflow/glassflow-etl \
      --namespace glassflow \
      --version 0.4.1 \
      --values values-v2.5.x.yaml \
      --wait \
      --timeout 600s

Monitor Migration Progress

The migration process includes several automated steps. Monitor the progress:

    # Watch overall upgrade progress
    kubectl get pods -n glassflow -w

    # Monitor migration job specifically
    kubectl get jobs -n glassflow | grep migration

    # Follow migration logs in real-time
    kubectl logs -n glassflow job/your-release-name-glassflow-etl-migration -f

Verify Migration Completion

Check that migration completed successfully:

    # Verify migration job completed
    kubectl get jobs -n glassflow
    # Should show COMPLETIONS 1/1 for migration job

    # Check PostgreSQL is running
    kubectl get pods -n glassflow | grep postgresql

    # Verify API is responding
    kubectl port-forward -n glassflow svc/your-release-name-glassflow-etl 8080:8080 &
    curl http://localhost:8080/health
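A port-forward started in the background can take a moment to accept connections, so the first `curl` may fail even when the API is healthy. The sketch below retries the health check instead of calling it once. `wait_for_health` is a hypothetical helper, and the default attempt count and delay are arbitrary choices; only the `/health` URL comes from the commands above.

```shell
#!/bin/sh
# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL with curl until it returns a successful
# HTTP status, or gives up after ATTEMPTS tries.
wait_for_health() {
  url=$1
  attempts=${2:-30}   # arbitrary default
  delay=${3:-2}       # arbitrary default, in seconds
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -sS: quiet but still print errors
    if curl -fsS -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Example:
#   wait_for_health http://localhost:8080/health
```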

Post-Migration Verification

Verify Pipeline Data Migration

Check that all pipelines were migrated successfully:

    # List pipelines via API
    curl http://localhost:8080/api/v1/pipeline | jq '.[] | {id: .pipeline_id, name: .name}'

    # Compare with pre-migration backup
    # Count should match the number of .json files in your backup directory
    ls ./pipeline-backup-*/pipeline-*.json | wc -l
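The count comparison described above can also be automated. This is a sketch under assumptions: `count_backups` and `check_pipeline_count` are hypothetical helpers, the backup directory is assumed to hold one `.json` file per pipeline, and the API count is passed in as a plain number.

```shell
#!/bin/sh
# count_backups DIR
# Prints the number of regular *.json files directly inside DIR.
count_backups() {
  n=0
  for f in "$1"/*.json; do
    # The glob stays literal when nothing matches, so test each entry
    [ -f "$f" ] && n=$((n + 1))
  done
  echo "$n"
}

# check_pipeline_count DIR API_COUNT
# Compares the number of backed-up configs in DIR with API_COUNT.
check_pipeline_count() {
  backed_up=$(count_backups "$1")
  if [ "$backed_up" -eq "$2" ]; then
    echo "OK: $2 pipelines in API, $backed_up in backup"
  else
    echo "MISMATCH: $2 pipelines in API, $backed_up in backup" >&2
    return 1
  fi
}

# Example (assumes the list endpoint returns a JSON array and that
# ./pipeline-backup is your backup directory):
#   api_count=$(curl -s http://localhost:8080/api/v1/pipeline | jq 'length')
#   check_pipeline_count ./pipeline-backup "$api_count"
```

A mismatch does not say which pipeline is missing; compare the IDs from the API listing against the backup files to find it.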

Test New UI

Access the new user interface:

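    # Port forward to UI
    # (skip this if the port-forward from the verification step is still running,
    # because local port 8080 can only be bound once)
    kubectl port-forward -n glassflow svc/your-release-name-glassflow-etl 8080:8080

    # Open browser to http://localhost:8080
    # Verify all pipelines are visible and manageable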

Resume Pipeline Operations

Once verification is complete, resume your pipelines:

    # Via new UI - Start pipelines individually

    # Or via API
    curl -X POST http://localhost:8080/api/v1/pipeline/{pipeline-id}/start

Monitor System Health

Monitor the system after migration:

    # Check all pods are healthy
    kubectl get pods -n glassflow

    # Monitor PostgreSQL
    kubectl logs -n glassflow deployment/your-release-name-postgresql

Migration Logs Analysis

Successful Migration Example

    {"time":"2025-12-03T17:44:48.050Z","level":"INFO","msg":"Starting App","version":"2.5.0"}
    {"time":"2025-12-03T17:44:48.089Z","level":"INFO","msg":"postgres connection established","max_conns":25,"min_conns":5}
    {"time":"2025-12-03T17:44:48.089Z","level":"INFO","msg":"Starting data migration from NATS KV to PostgreSQL","kv_store_name":"glassflow-pipelines"}
    {"time":"2025-12-03T17:44:48.093Z","level":"INFO","msg":"Found pipelines in NATS KV store","store_name":"glassflow-pipelines","count":5}
    {"time":"2025-12-03T17:44:48.093Z","level":"INFO","msg":"Migrating pipeline with same ID","pipeline_id":"demo-pipeline-1","name":"demo-dedup"}
    {"time":"2025-12-03T17:44:48.095Z","level":"INFO","msg":"inserting pipeline","pipeline_id":"demo-pipeline-1","pipeline_name":"demo-dedup"}
    {"time":"2025-12-03T17:44:48.102Z","level":"INFO","msg":"pipeline inserted successfully","pipeline_id":"demo-pipeline-1","pipeline_name":"demo-dedup"}
    {"time":"2025-12-03T17:44:48.103Z","level":"INFO","msg":"Pipeline migrated successfully","pipeline_id":"demo-pipeline-1","name":"demo-dedup"}
    {"time":"2025-12-03T17:44:48.103Z","level":"INFO","msg":"Data migration completed","migrated":5,"skipped":0,"errors":0,"store_name":"glassflow-pipelines"}
    {"time":"2025-12-03T17:44:48.103Z","level":"INFO","msg":"data migration from NATS KV completed","kv_store_name":"glassflow-pipelines"}

Key Migration Metrics

  • migrated: Number of pipelines successfully migrated
  • skipped: Number of pipelines skipped (already exist in PostgreSQL)
  • errors: Number of migration errors (should be 0)
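These three counters can be pulled out of the logs without reading them line by line. The sketch below assumes the summary line has the same shape as the "Data migration completed" entry in the example above; `migration_summary` is a hypothetical helper that uses only `grep`, `tail`, and `sed`.

```shell
#!/bin/sh
# migration_summary
# Hypothetical helper: reads JSON log lines on stdin, finds the last
# "Data migration completed" entry and prints its counters.
# Returns 0 only when a summary line exists and errors is 0.
migration_summary() {
  line=$(grep '"msg":"Data migration completed"' | tail -n 1)
  if [ -z "$line" ]; then
    echo "no summary line found" >&2
    return 2
  fi
  # Extract each numeric field from the JSON line
  migrated=$(printf '%s\n' "$line" | sed -n 's/.*"migrated":\([0-9][0-9]*\).*/\1/p')
  skipped=$(printf '%s\n' "$line" | sed -n 's/.*"skipped":\([0-9][0-9]*\).*/\1/p')
  errors=$(printf '%s\n' "$line" | sed -n 's/.*"errors":\([0-9][0-9]*\).*/\1/p')
  echo "migrated=$migrated skipped=$skipped errors=$errors"
  [ "$errors" = "0" ]
}

# Example:
#   kubectl logs -n glassflow job/your-release-name-glassflow-etl-migration \
#     | migration_summary
```

The non-zero exit status on errors makes the helper usable as a gate in a script, for example before resuming pipelines.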

Troubleshooting

Common Issues and Solutions

Migration Job Fails

    # Check migration job logs
    kubectl logs -n glassflow job/your-release-name-glassflow-etl-migration

    # Common causes:
    # 1. Insufficient storage for PostgreSQL
    # 2. PostgreSQL connection issues
    # 3. NATS KV access problems

PostgreSQL Connection Issues

    # Check PostgreSQL pod status
    kubectl get pods -n glassflow | grep postgresql

    # Check PostgreSQL logs
    kubectl logs -n glassflow deployment/your-release-name-postgresql

    # Verify PostgreSQL service
    kubectl get svc -n glassflow | grep postgresql

Pipeline Data Missing

    # Verify migration job completed
    kubectl get jobs -n glassflow

    # Check if pipelines exist in PostgreSQL
    kubectl exec -n glassflow deployment/your-release-name-postgresql -- \
      psql -U glassflow -d glassflow -c "SELECT id, name FROM pipelines;"

UI Not Loading

    # Check UI pod status
    kubectl get pods -n glassflow | grep ui

    # Check UI logs
    kubectl logs -n glassflow deployment/your-release-name-ui

    # Verify UI service
    kubectl get svc -n glassflow | grep ui

Getting Help

If you encounter any issues during migration, the commands below collect the information that is useful for a support request.

Useful Commands for Support

    # Collect migration information
    kubectl get all -n glassflow > glassflow-resources.txt
    kubectl logs -n glassflow job/your-release-name-glassflow-etl-migration > migration-logs.txt
    helm get values your-release-name -n glassflow > helm-values.txt

    # Check system resources
    kubectl top nodes > node-resources.txt
    kubectl top pods -n glassflow > pod-resources.txt

Migration Support: The GlassFlow team is available to help with migration issues. Contact us at [email protected] with your migration logs and system information.
