Backup and restore

Back up your Infrahub deployment to protect against data loss and enable disaster recovery. This page covers backup strategies, backup procedures, and restoration.

What to back up

A complete Infrahub backup includes:

Neo4j database - All graph data, schemas, branches, and version history
Prefect PostgreSQL database - Task execution history and workflow state
Artifact storage - Generated artifacts, files, and Git repositories
Configuration - Environment variables and configuration files

Backup strategies

Full backup

Capture all components at a consistent point in time:

Best for disaster recovery
Ensures data consistency across all services
Requires stopping or pausing services
Suitable for scheduled maintenance windows

Hot backup

Back up while services are running:

Minimizes downtime
May capture slight inconsistencies between components
Suitable for continuous operations
Requires careful coordination

Incremental backup

Back up only the data changed since the last backup:

Reduces backup size and time
Requires baseline full backup
More complex restoration process
Best for large deployments
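
As a rough illustration of the incremental idea for file-based data such as a local copy of artifact storage, the sketch below archives only files modified after a marker file. It does not cover the Neo4j or PostgreSQL databases. The function name `incremental_archive`, the marker file, and all paths are hypothetical and not part of Infrahub.

```shell
# Sketch: archive only the files changed since the previous run.
# incremental_archive, the marker file, and all paths are illustrative assumptions.
# Pass absolute paths, because the function changes into the source directory.
incremental_archive() (
  src=$1      # directory to back up, for example a local copy of artifact storage
  marker=$2   # file whose modification time records the previous run
  out=$3      # archive to create
  list=$(mktemp) || exit 1
  trap 'rm -f "$list"' EXIT

  # Stamp the start time first so files changed during this run are picked up next time.
  touch "$marker.new" || exit 1

  cd "$src" || exit 1
  if [ -e "$marker" ]; then
    find . -type f -newer "$marker" > "$list"
  else
    find . -type f > "$list"   # no marker yet: this run is the baseline full backup
  fi

  if [ -s "$list" ]; then
    tar -czf "$out" -T "$list" || exit 1
  else
    echo "no changes since last backup" >&2
  fi
  mv "$marker.new" "$marker"
)
```

Restoring from such a set means extracting the baseline archive first and then each incremental archive in order, which is the added restoration complexity noted above.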

Backup tools

infrahub-backup CLI (recommended)

The infrahub-backup tool provides automated backup and restore:

# Download the tool
curl https://infrahub.opsmill.io/ops/$(uname -s)/$(uname -m)/infrahub-backup \
  -o infrahub-backup
chmod +x infrahub-backup

# Create backup
./infrahub-backup create

# Restore from backup
./infrahub-backup restore infrahub_backup_20250302_120000.tar.gz

Features:

Automatic Neo4j and PostgreSQL backup
Integrity verification with SHA256 checksums
Coordinated service shutdown and restart
Metadata preservation

Limitations:

Artifact storage backup not yet included
Requires Docker Compose deployment

For detailed usage, see the backup guide.

Manual backup procedures

For custom backup workflows or Kubernetes deployments, follow the manual procedures in the sections below.

Backup procedures

Docker Compose backup

Step 1: Stop task workers

Prevent new tasks from starting:

docker compose stop task-worker

Step 2: Back up Neo4j

docker exec -it -u neo4j infrahub-database-1 bash
mkdir -p backups
neo4j-admin database backup --to-path=backups/ neo4j
exit

# Copy to host
docker cp infrahub-database-1:/var/lib/neo4j/backups/neo4j-2025-03-02T12-00-00.backup .
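
The backup file name embeds a timestamp, so the `docker cp` path differs on every run. For scripted use, a non-interactive variant might look like the sketch below. `neo4j_backup` is a hypothetical helper; the container name and the `/var/lib/neo4j/backups` path are taken from the commands above and may differ in your deployment.

```shell
# Sketch: non-interactive variant of the Neo4j backup step.
# neo4j_backup is a hypothetical helper; container name and paths mirror the manual steps.
neo4j_backup() (
  container=${1:-infrahub-database-1}
  dest=${2:-.}
  dir=/var/lib/neo4j/backups

  docker exec -u neo4j "$container" mkdir -p "$dir" || exit 1
  docker exec -u neo4j "$container" \
    neo4j-admin database backup --to-path="$dir" neo4j || exit 1

  # Look up the newest .backup file instead of hardcoding the timestamp.
  latest=$(docker exec -u neo4j "$container" sh -c "ls -t $dir/*.backup | head -n 1") || exit 1
  [ -n "$latest" ] || { echo "no backup file found in $dir" >&2; exit 1; }

  # Copy the backup to the host.
  docker cp "$container:$latest" "$dest"
)

# Usage:
# neo4j_backup infrahub-database-1 /backup/neo4j
```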

Step 3: Back up PostgreSQL

docker compose exec -T task-manager-db \
  pg_dump -Fc -U postgres -d prefect > prefect_backup.dump
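
Because the dump is written through a shell redirect, the output file is created even when `pg_dump` fails, which can leave an empty or truncated file behind. A cheap sanity check, sketched below, confirms the file is non-empty and starts with the `PGDMP` header used by custom-format (`-Fc`) archives. `check_pg_dump` is a hypothetical helper, and the check does not prove the dump is complete.

```shell
# Sketch: cheap sanity check for a custom-format pg_dump archive.
# check_pg_dump is a hypothetical helper, not part of Infrahub or PostgreSQL.
check_pg_dump() {
  # Custom-format (-Fc) archives begin with the magic string "PGDMP".
  if [ -s "$1" ] && [ "$(head -c 5 "$1")" = "PGDMP" ]; then
    echo "ok: $1"
  else
    echo "invalid or empty dump: $1" >&2
    return 1
  fi
}

# Usage:
# check_pg_dump prefect_backup.dump
```

Where the PostgreSQL client tools are installed, `pg_restore --list prefect_backup.dump` gives a more thorough check by reading the archive's table of contents.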

Step 4: Back up artifacts

# For local storage
docker compose cp infrahub-server:/opt/infrahub/storage /backup/artifacts/

# For S3 storage
aws s3 sync s3://your-infrahub-bucket /backup/artifacts/

Step 5: Restart services

docker compose start task-worker
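
If one of the backup steps fails partway through a script, the task workers stay stopped until someone restarts them. One way to guard against that is a wrapper with an exit trap, sketched below. `run_with_workers_stopped` and `my_backup_steps` are hypothetical names introduced for illustration.

```shell
# Sketch: guarantee the task workers are restarted even if a backup step fails.
# run_with_workers_stopped is a hypothetical wrapper around the manual steps.
run_with_workers_stopped() (
  docker compose stop task-worker || exit 1
  # The EXIT trap fires when this subshell ends, on success or on failure.
  trap 'docker compose start task-worker' EXIT
  "$@"
)

# Usage (my_backup_steps is a placeholder for a function running Steps 2 to 4):
# run_with_workers_stopped my_backup_steps
```

The wrapper returns the exit status of the wrapped command, so a calling script can still detect a failed backup.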

Kubernetes backup

For Kubernetes deployments, see the dedicated Kubernetes Backup Guide.

Restore procedures

Docker Compose restore

Step 1: Stop services

docker compose stop task-worker infrahub-server task-manager

Step 2: Restore Neo4j

# Copy backup to container
docker cp neo4j-2025-03-02T12-00-00.backup infrahub-database-1:/var/lib/neo4j/

# Connect to container
docker exec -it -u neo4j infrahub-database-1 bash

# Drop existing database
cypher-shell -d system -u neo4j
DROP DATABASE neo4j;
:exit

# Clean data directories
rm -rf /data/databases/neo4j
rm -rf /data/transactions/neo4j

# Restore backup
neo4j-admin database restore \
  --from-path=/var/lib/neo4j/neo4j-2025-03-02T12-00-00.backup neo4j \
  --overwrite-destination=true

# Recreate database
cypher-shell -d system -u neo4j
CREATE DATABASE neo4j;
:exit

# Leave the container shell
exit

Step 3: Restore PostgreSQL

# The dump file is on the host, so feed it to pg_restore through stdin
docker compose exec -T task-manager-db \
  pg_restore -d postgres -U postgres --clean --create < prefect_backup.dump

Step 4: Restore artifacts

# For local storage
docker compose cp /backup/artifacts/ infrahub-server:/opt/infrahub/storage

# For S3 storage
aws s3 sync /backup/artifacts/ s3://your-infrahub-bucket

Step 5: Restart services

docker compose start task-manager
docker compose start infrahub-server
docker compose start task-worker

Step 6: Verify restoration

# Check API health
curl http://localhost:8000/api/schema/summary

# Check database
docker compose exec database cypher-shell -u neo4j "SHOW DATABASES;"
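
Services can take a while to accept requests after a restart, so a check run immediately may fail even when the restore succeeded. A small retry loop, sketched below, repeats a command until it succeeds or a limit is reached. `wait_for` is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```shell
# Sketch: retry a check until it succeeds or the attempts run out.
# wait_for is a hypothetical helper; tune attempts and delay to your environment.
wait_for() (
  attempts=$1   # maximum number of tries
  delay=$2      # seconds to sleep between tries
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" > /dev/null 2>&1; then
      exit 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up after $attempts attempts: $*" >&2
  exit 1
)

# Usage: poll the API endpoint from the step above, up to 30 times, 2 seconds apart.
# wait_for 30 2 curl -fsS http://localhost:8000/api/schema/summary
```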

Neo4j cluster backup and restore

For Neo4j Enterprise clusters, follow these specialized procedures:

Backup from cluster node

# Connect to a follower node
docker exec -it -u neo4j infrahub-database-core2-1 bash

# Create backup
mkdir -p backups
neo4j-admin database backup --to-path=backups/ neo4j

Restore to cluster node

Step 1: Transfer backup

docker cp neo4j-2025-03-02T12-00-00.backup infrahub-database-core3-1:/var/lib/neo4j/

Step 2: Drop database cluster-wide

cypher-shell -d system -u neo4j
DROP DATABASE neo4j;
SHOW SERVERS;

Step 3: Clean target node

docker exec -it -u neo4j infrahub-database-core3-1 bash
rm -rf /data/databases/neo4j
rm -rf /data/transactions/neo4j
exit

docker restart infrahub-database-core3-1

Step 4: Restore backup

docker exec -it -u neo4j infrahub-database-core3-1 bash
neo4j-admin database restore \
  --from-path=/var/lib/neo4j/neo4j-2025-03-02T12-00-00.backup neo4j

Step 5: Get seed instance ID

cypher-shell -d system -u neo4j
SHOW SERVERS;

Note the serverId for the target node.

Step 6: Recreate database from seed

CREATE DATABASE neo4j
TOPOLOGY 3 PRIMARIES
OPTIONS {
  existingData: 'use',
  existingDataSeedInstance: 'd05fce79-e63e-485a-9ce7-1abbf9d18fce'
};

Step 7: Verify cluster sync

SHOW DATABASES;
SHOW SERVERS;

For detailed cluster procedures, see the backup guide.

Backup automation

Scheduled backups with cron

Create a backup script:

backup.sh

#!/bin/bash
set -e

BACKUP_DIR="/backup/infrahub"
DATE=$(date +%Y%m%d_%H%M%S)

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Run infrahub-backup
/usr/local/bin/infrahub-backup create

# Move backup to storage
mv infrahub_backup_*.tar.gz "$BACKUP_DIR/infrahub_backup_$DATE.tar.gz"

# Clean old backups (keep last 7 days)
find "$BACKUP_DIR" -name "infrahub_backup_*.tar.gz" -mtime +7 -delete

# Upload to S3 (optional)
aws s3 sync "$BACKUP_DIR" s3://your-backup-bucket/infrahub/

Schedule with cron:

# Edit crontab
crontab -e

# Run daily at 2 AM
0 2 * * * /usr/local/bin/backup.sh >> /var/log/infrahub-backup.log 2>&1

Kubernetes CronJob

Create a Kubernetes CronJob:

backup-cronjob.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: infrahub-backup
  namespace: infrahub
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: registry.opsmill.io/opsmill/infrahub-backup:latest
            env:
            - name: BACKUP_DESTINATION
              value: s3://your-backup-bucket/infrahub/
            volumeMounts:
            - name: backup-storage
              mountPath: /backup
          volumes:
          - name: backup-storage
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure

Apply:

kubectl apply -f backup-cronjob.yaml
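
To exercise the CronJob without waiting for its schedule, a Job can be created from it by hand. The sketch below assumes the `infrahub-backup` CronJob name and `infrahub` namespace from the manifest above. `trigger_backup_now` and the generated Job name are hypothetical.

```shell
# Sketch: run the backup CronJob once, immediately.
# trigger_backup_now is a hypothetical helper; the CronJob name and namespace
# come from backup-cronjob.yaml.
trigger_backup_now() {
  job="infrahub-backup-manual-$(date +%Y%m%d%H%M%S)"
  kubectl create job "$job" --from=cronjob/infrahub-backup -n infrahub > /dev/null || return 1
  echo "$job"   # print the Job name so the caller can follow it
}

# Usage:
# job=$(trigger_backup_now)
# kubectl logs -n infrahub "job/$job" --follow
```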

Backup retention

Retention policy example

Hourly backups: Keep for 24 hours
Daily backups: Keep for 7 days
Weekly backups: Keep for 4 weeks
Monthly backups: Keep for 12 months

Implement retention with script

retention.sh

#!/bin/bash
BACKUP_DIR="/backup/infrahub"

# Keep hourly backups for 1 day (1440 minutes)
find "$BACKUP_DIR/hourly" -type f -mmin +1440 -delete

# Keep daily backups for 7 days
find "$BACKUP_DIR/daily" -type f -mtime +7 -delete

# Keep weekly backups for 28 days
find "$BACKUP_DIR/weekly" -type f -mtime +28 -delete

# Keep monthly backups for 365 days
find "$BACKUP_DIR/monthly" -type f -mtime +365 -delete
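
retention.sh expects backups to be sorted into hourly, daily, weekly, and monthly directories, which backup.sh does not do. One possible way to pick a tier for each new backup is sketched below. The promotion rules (the 02:00 run counts as the daily backup, Sunday as weekly, the first of the month as monthly) are assumptions for illustration, and `backup_tier` is a hypothetical helper.

```shell
# Sketch: choose a retention tier for a backup taken at a given time.
# The promotion rules below are illustrative assumptions, not an Infrahub convention.
backup_tier() {
  day_of_month=$1   # 01-31, as printed by: date +%d
  day_of_week=$2    # 1-7 with Monday as 1, as printed by: date +%u
  hour=$3           # 00-23, as printed by: date +%H
  if [ "$hour" != "02" ]; then
    echo hourly
  elif [ "$day_of_month" = "01" ]; then
    echo monthly
  elif [ "$day_of_week" = "7" ]; then
    echo weekly
  else
    echo daily
  fi
}

# Usage: file the newest archive under the matching tier directory.
# tier=$(backup_tier "$(date +%d)" "$(date +%u)" "$(date +%H)")
# mkdir -p "/backup/infrahub/$tier"
# mv infrahub_backup_*.tar.gz "/backup/infrahub/$tier/"
```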

Testing backups

Verify backup integrity

# Verify checksum
sha256sum -c infrahub_backup_20250302_120000.tar.gz.sha256

# Test archive extraction
tar -tzf infrahub_backup_20250302_120000.tar.gz > /dev/null
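
The two checks can be combined into one function for use in scripts. The sketch below assumes the checksum file sits next to the archive, is named `<archive>.sha256`, and refers to the archive by its bare file name. `verify_backup` is a hypothetical helper.

```shell
# Sketch: run the checksum and archive checks together.
# verify_backup is a hypothetical helper; it assumes <archive>.sha256 sits beside the archive.
verify_backup() (
  archive=$1
  # sha256sum -c resolves file names relative to the current directory.
  cd "$(dirname "$archive")" || exit 1
  name=$(basename "$archive")
  sha256sum -c "$name.sha256" > /dev/null || { echo "checksum mismatch: $name" >&2; exit 1; }
  tar -tzf "$name" > /dev/null || { echo "unreadable archive: $name" >&2; exit 1; }
  echo "verified: $name"
)

# Usage:
# verify_backup /backup/infrahub/infrahub_backup_20250302_120000.tar.gz
```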

Test restoration

Periodically test restoration in a separate environment:

Deploy fresh Infrahub instance
Restore from backup
Verify data integrity
Test API functionality
Document any issues

Related resources

Database backup guide - Detailed backup procedures
Docker Compose deployment - Deployment configuration
Kubernetes deployment - Kubernetes-specific backups
Monitoring - Monitor backup health
