Understanding data transformations in Infrahub

Transformations convert infrastructure data from Infrahub’s graph database into formats that external systems require. Using Jinja2 templates or Python code, transformations enable Infrahub to serve as a source of truth that generates device configurations, API payloads, documentation, and other artifacts in the exact format each system expects.

Why transformations matter

Infrastructure data exists in a canonical form in Infrahub—devices, interfaces, IP addresses stored as interconnected nodes. However, external systems need specific formats:

Network devices expect vendor-specific configuration syntax
Cloud APIs require JSON payloads with specific structure
Documentation systems need Markdown or HTML
Monitoring tools expect CSV or custom formats
Ticketing systems need structured data

Transformations bridge this gap, converting Infrahub’s graph data into whatever format each system requires. This design provides several benefits:

Single source of truth: Infrastructure data lives in one place (Infrahub), not scattered across configuration files
Consistency: Generated artifacts always match the source data, eliminating drift
Flexibility: Change the output format by updating the transformation without touching the source data
Reusability: A single transformation can generate artifacts for multiple devices or objects
Version control: Transformations live in Git repositories, providing full history and collaboration

Core concepts

Transformation components

A transformation consists of two main components:

GraphQL query: Defines the input data. The query retrieves exactly the data needed from Infrahub’s graph database:

query GetDeviceConfig($device_id: String!) {
  InfraDevice(ids: [$device_id]) {
    edges {
      node {
        hostname {
          value
        }
        interfaces {
          edges {
            node {
              name {
                value
              }
              enabled {
                value
              }
              ip_addresses {
                edges {
                  node {
                    address {
                      value
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Transformation logic: Processes the query results and generates output. This can be:

A Jinja2 template for text-based formats (configurations, documentation)
A Python class for complex logic or structured data (JSON, YAML)

The transformation automatically inherits parameters from the GraphQL query. In the example above, $device_id becomes a parameter the transformation accepts.
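
The nested edges/node/value shape of the query result is what the transformation logic receives. As a minimal sketch, assuming the response mirrors the query above, a small helper can reduce it to plain Python values before any formatting happens. The name flatten_device is hypothetical and not part of the Infrahub SDK, and the sample response is invented for illustration.

```python
from typing import Any


def flatten_device(data: dict[str, Any]) -> dict[str, Any]:
    """Reduce the edges/node/value nesting of the GetDeviceConfig result.

    Hypothetical helper: assumes the response has the same shape as the
    query above and contains at least one device.
    """
    device = data["InfraDevice"]["edges"][0]["node"]
    interfaces = []
    for edge in device["interfaces"]["edges"]:
        iface = edge["node"]
        interfaces.append(
            {
                "name": iface["name"]["value"],
                "enabled": iface["enabled"]["value"],
                "addresses": [
                    ip["node"]["address"]["value"]
                    for ip in iface["ip_addresses"]["edges"]
                ],
            }
        )
    return {"hostname": device["hostname"]["value"], "interfaces": interfaces}


def _iface(name: str, enabled: bool, addresses: list[str]) -> dict[str, Any]:
    """Build one interface edge in the nested response format (sample data only)."""
    return {
        "node": {
            "name": {"value": name},
            "enabled": {"value": enabled},
            "ip_addresses": {
                "edges": [{"node": {"address": {"value": a}}} for a in addresses]
            },
        }
    }


# Invented sample response for a single device
sample = {
    "InfraDevice": {
        "edges": [
            {
                "node": {
                    "hostname": {"value": "router-1"},
                    "interfaces": {
                        "edges": [
                            _iface("Ethernet1", True, ["10.0.0.1/24"]),
                            _iface("Ethernet2", False, []),
                        ]
                    },
                }
            }
        ]
    }
}

flat = flatten_device(sample)
print(flat["hostname"], [i["name"] for i in flat["interfaces"]])
```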

Jinja2 transformations

Jinja2 transformations render templates using data from GraphQL queries. They excel at generating text-based formats:

Device configurations: Generate vendor-specific configurations:

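{# Assumes the query result is exposed to the template as "data" #}
{% set device = data.InfraDevice.edges[0].node %}
hostname {{ device.hostname.value }}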
!
{% for interface in device.interfaces.edges %}
interface {{ interface.node.name.value }}
  {% if interface.node.enabled.value %}
  no shutdown
  {% else %}
  shutdown
  {% endif %}
  {% for ip in interface.node.ip_addresses.edges %}
  ip address {{ ip.node.address.value }}
  {% endfor %}
{% endfor %}

Documentation: Generate Markdown documentation:

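{# Assumes the query result is exposed to the template as "data" #}
{% set device = data.InfraDevice.edges[0].node %}
# Device: {{ device.hostname.value }}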

## Interfaces

{% for interface in device.interfaces.edges %}
### {{ interface.node.name.value }}

- Status: {{ "Enabled" if interface.node.enabled.value else "Disabled" }}
- IP Addresses:
{% for ip in interface.node.ip_addresses.edges %}
  - {{ ip.node.address.value }}
{% endfor %}
{% endfor %}

A Jinja2 transformation is stored as a CoreTransformJinja2 object that records the path to its template in the repository:

name: device_config
template_path: templates/arista_eos.j2
query: device_query
repository: my_repo

Python transformations

Python transformations provide full programming flexibility for complex logic or structured output:

from infrahub_sdk.transforms import InfrahubTransform

class CloudFormationTransform(InfrahubTransform):
    async def transform(self, data: dict) -> dict:
        """Generate CloudFormation template from device data."""
        device = data["InfraDevice"]["edges"][0]["node"]
        
        resources = {}
        for interface in device["interfaces"]["edges"]:
            iface = interface["node"]
            resource_name = f"NetworkInterface{iface['name']['value']}"
            
            resources[resource_name] = {
                "Type": "AWS::EC2::NetworkInterface",
                "Properties": {
                    "Description": f"{device['hostname']['value']} {iface['name']['value']}",
                    "SubnetId": {"Ref": "SubnetId"},
                    # Only one private address per network interface may be primary,
                    # so mark the first address and leave the rest secondary.
                    "PrivateIpAddresses": [
                        {"PrivateIpAddress": ip["node"]["address"]["value"], "Primary": index == 0}
                        for index, ip in enumerate(iface["ip_addresses"]["edges"])
                    ]
                }
            }
        
        return {
            "AWSTemplateFormatVersion": "2010-09-09",
            "Resources": resources
        }

Python transformations (CoreTransformPython) reference the class in the repository:

name: cloudformation_template
file_path: transforms/cloudformation.py
class_name: CloudFormationTransform
query: device_query
repository: my_repo
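
Because transform is a coroutine that takes the query result and returns a value, its logic can be exercised without a running Infrahub instance. The sketch below illustrates that idea under stated assumptions: StubTransform is a stand-in defined here so the example runs on its own (it is not the SDK's InfrahubTransform), and both InterfaceSummaryTransform and the sample data are invented.

```python
import asyncio
import json


class StubTransform:
    """Stand-in base class so the example is self-contained (not the SDK class)."""


class InterfaceSummaryTransform(StubTransform):
    async def transform(self, data: dict) -> dict:
        """Return a JSON-serializable summary of a device's interfaces."""
        device = data["InfraDevice"]["edges"][0]["node"]
        return {
            "hostname": device["hostname"]["value"],
            "interfaces": [
                edge["node"]["name"]["value"]
                for edge in device["interfaces"]["edges"]
            ],
        }


# Invented query result for one device with two interfaces
sample_data = {
    "InfraDevice": {
        "edges": [
            {
                "node": {
                    "hostname": {"value": "router-1"},
                    "interfaces": {
                        "edges": [
                            {"node": {"name": {"value": "Ethernet1"}}},
                            {"node": {"name": {"value": "Ethernet2"}}},
                        ]
                    },
                }
            }
        ]
    }
}

# Drive the coroutine directly, as a local test might
result = asyncio.run(InterfaceSummaryTransform().transform(sample_data))
print(json.dumps(result, indent=2))
```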

Transformation parameters

Transformations inherit parameters from their GraphQL queries. This enables:

Static transformations: No parameters, generate the same output each time:

query GetAllDevices {
  InfraDevice {
    edges {
      node {
        hostname { value }
      }
    }
  }
}

Dynamic transformations: Parameters allow generating output for specific objects:

query GetDevice($device_id: String!) {
  InfraDevice(ids: [$device_id]) {
    # ...
  }
}

When rendering, provide parameter values:

infrahubctl render my_transform --param device_id=<uuid>
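
To make the inheritance concrete: the parameters a transformation accepts are the variables declared in the query's operation header. The sketch below extracts them with a regular expression. It is a simplification for illustration only (a real implementation would use a GraphQL parser), and query_parameters is a hypothetical helper, not an Infrahub function.

```python
import re

# Matches "$name: Type" pairs in the operation header, e.g. "$device_id: String!"
_VARIABLE = re.compile(r"\$(\w+)\s*:\s*([\[\]\w!]+)")


def query_parameters(query: str) -> dict[str, bool]:
    """Map each declared query variable to whether it is required (type ends in '!')."""
    header = query.split("{", 1)[0]  # text before the first selection set
    return {name: type_.endswith("!") for name, type_ in _VARIABLE.findall(header)}


static_query = "query GetAllDevices {\n  InfraDevice { edges { node { hostname { value } } } }\n}"
dynamic_query = "query GetDevice($device_id: String!) {\n  InfraDevice(ids: [$device_id]) { edges { node { id } } }\n}"

print(query_parameters(static_query))   # no variables declared
print(query_parameters(dynamic_query))  # one required variable
```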

Architecture and implementation

Transformation execution flow

When a transformation runs:

Query execution: The GraphQL query executes against Infrahub, retrieving data from the graph database
Data preparation: Query results are formatted and passed to the transformation logic
Transformation: The Jinja2 template renders or Python class executes, producing output
Result return: The output is returned to the caller (API, artifact system, CLI)
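
The four steps above can be sketched as a small pipeline. This is an illustration of the flow rather than Infrahub's implementation: run_transformation, fake_execute_query, and hostname_transform are hypothetical names, and unwrapping a top-level "data" key is an assumption based on the standard GraphQL response envelope.

```python
from typing import Any, Callable

QueryRunner = Callable[[str, dict[str, Any]], dict[str, Any]]
Transform = Callable[[dict[str, Any]], str]


def run_transformation(
    query: str,
    variables: dict[str, Any],
    execute_query: QueryRunner,
    transform: Transform,
) -> str:
    raw = execute_query(query, variables)  # 1. Query execution
    data = raw.get("data", raw)            # 2. Data preparation (unwrap the envelope)
    output = transform(data)               # 3. Transformation
    return output                          # 4. Result return


def fake_execute_query(query: str, variables: dict[str, Any]) -> dict[str, Any]:
    """Stub standing in for a real GraphQL call; returns invented data."""
    return {
        "data": {
            "InfraDevice": {"edges": [{"node": {"hostname": {"value": "router-1"}}}]}
        }
    }


def hostname_transform(data: dict[str, Any]) -> str:
    device = data["InfraDevice"]["edges"][0]["node"]
    return f"hostname {device['hostname']['value']}\n"


output = run_transformation(
    "query GetDevice($device_id: String!) { ... }",
    {"device_id": "<uuid>"},
    fake_execute_query,
    hostname_transform,
)
print(output, end="")
```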

This flow occurs in several contexts:

On-demand rendering: REST API endpoint for immediate transformation:

curl -X POST http://infrahub/api/transform/my_transform \
  -H "Content-Type: application/json" \
  -d '{"device_id": "<uuid>"}'

Artifact generation: Transformations run as part of artifact definitions, storing results in object storage

Development/testing: CLI tools render transformations locally:

infrahubctl render my_transform --param device_id=<uuid>

Repository integration

Transformations live in Git repositories alongside schemas, checks, and other infrastructure-as-code:

my_infrahub_repo/
├── .infrahub.yml           # Declares transformations
├── queries/
│   └── device_query.gql    # GraphQL queries
├── templates/
│   ├── arista_eos.j2       # Jinja2 templates
│   └── cisco_ios.j2
└── transforms/
    └── cloudformation.py   # Python transformations

The .infrahub.yml file registers transformations:

jinja2_transforms:
  - name: arista_config
    description: Generate Arista EOS configuration
    query: device_query
    template_path: templates/arista_eos.j2

python_transforms:
  - name: cloudformation_template
    description: Generate AWS CloudFormation template
    query: device_query
    class_name: CloudFormationTransform
    file_path: transforms/cloudformation.py

When Infrahub syncs the repository, it creates CoreTransformJinja2 or CoreTransformPython objects in the database.
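
Since every registered transformation points at a file in the repository, a quick pre-commit check can catch paths that do not exist. The sketch below is an assumption-laden illustration: missing_files is a hypothetical helper, and the configuration is written as a Python dict mirroring the .infrahub.yml above so that no YAML parser is needed.

```python
import tempfile
from pathlib import Path


def missing_files(config: dict, repo_root: Path) -> list[str]:
    """Return referenced template and Python files that are absent from the repo."""
    referenced = [t["template_path"] for t in config.get("jinja2_transforms", [])]
    referenced += [t["file_path"] for t in config.get("python_transforms", [])]
    return [path for path in referenced if not (repo_root / path).is_file()]


# The same declarations as the .infrahub.yml above, expressed as a dict
config = {
    "jinja2_transforms": [
        {
            "name": "arista_config",
            "query": "device_query",
            "template_path": "templates/arista_eos.j2",
        }
    ],
    "python_transforms": [
        {
            "name": "cloudformation_template",
            "query": "device_query",
            "class_name": "CloudFormationTransform",
            "file_path": "transforms/cloudformation.py",
        }
    ],
}

# Build a throwaway repository that contains only the Jinja2 template
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "templates").mkdir()
    (root / "templates" / "arista_eos.j2").write_text("hostname test\n")
    missing = missing_files(config, root)

print(missing)  # lists the Python transform file, which was never created
```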

Transformation with groups

Transformations themselves are generic—they process data without knowing which objects they’ll transform. Targeting happens through artifact definitions and generators, which combine transformations with groups:

artifacts:
  - name: device_configs
    transformation: arista_config
    targets: arista_devices_group

This pattern enables:

Reusable logic: One transformation applies to multiple groups through different artifact definitions
Bulk processing: Artifact definitions automatically apply transformations to all group members
Dynamic targeting: Adding devices to groups automatically includes them in transformation processing

See the Generators concept for how this pattern extends to creating objects.
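
The targeting pattern can be pictured as applying one function to every member of a group. The sketch below is a conceptual model only, not how Infrahub implements artifact definitions: generate_artifacts and the device dictionaries are invented for illustration.

```python
from typing import Callable


def generate_artifacts(
    transform: Callable[[dict], str], members: list[dict]
) -> dict[str, str]:
    """Apply one transformation to every group member, keyed by hostname."""
    return {member["hostname"]: transform(member) for member in members}


def arista_config(device: dict) -> str:
    """Generic transformation: it knows nothing about which group it serves."""
    return f"hostname {device['hostname']}\n"


arista_devices_group = [{"hostname": "leaf-1"}, {"hostname": "leaf-2"}]
artifacts = generate_artifacts(arista_config, arista_devices_group)

# Adding a member to the group is enough to include it in the next run
arista_devices_group.append({"hostname": "leaf-3"})
artifacts = generate_artifacts(arista_config, arista_devices_group)
print(sorted(artifacts))
```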

Branch-aware transformations

Transformations are branch-aware schema objects. This means:

Different transformations per branch: A feature branch can have modified transformation logic without affecting production
Schema-aware: Transformations automatically adapt to schema changes in the branch
Data-aware: Transformations see the branch’s data view, not the main branch

When testing infrastructure changes in a branch, transformations generate artifacts using the branch’s data and schema.

Implementation examples

Jinja2 transformation with filters

Jinja2 transformations support custom filters for common operations:

{# Format IP addresses with netmask #}
interface {{ interface.name.value }}
  ip address {{ interface.ip_address.value | ipaddr('address') }} {{ interface.ip_address.value | ipaddr('netmask') }}

{# Convert interface names #}
interface {{ interface.name.value | replace('Ethernet', 'Eth') }}

{# Filter enabled interfaces #}
{% for interface in device.interfaces.edges | selectattr('node.enabled.value') %}
interface {{ interface.node.name.value }}
  no shutdown
{% endfor %}

Infrahub provides infrastructure-specific filters beyond Jinja2’s built-in filters.
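
For readers unfamiliar with address filters, the following sketch shows in plain Python what filters such as ipaddr('address') and ipaddr('netmask') in the template above are expected to compute. It uses only the standard ipaddress module. The function names are hypothetical and this is not the filter implementation Infrahub ships.

```python
import ipaddress


def ip_address_only(value: str) -> str:
    """Return the host address from an 'address/prefix' string."""
    return str(ipaddress.ip_interface(value).ip)


def ip_netmask(value: str) -> str:
    """Return the dotted netmask from an 'address/prefix' string."""
    return str(ipaddress.ip_interface(value).netmask)


def ip_address_line(value: str) -> str:
    """Build the 'ip address <address> <netmask>' line used in the template."""
    return f"ip address {ip_address_only(value)} {ip_netmask(value)}"


print(ip_address_line("10.0.0.1/24"))
```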

Python transformation accessing files

Python transformations can access files in the repository:

from infrahub_sdk.transforms import InfrahubTransform
from pathlib import Path

class TemplateBasedTransform(InfrahubTransform):
    async def transform(self, data: dict) -> dict:
        # self.root_directory points to repository root
        template_path = Path(self.root_directory) / "templates" / "config.j2"
        
        with open(template_path) as f:
            template_content = f.read()
        
        # Process template_content with data
        # ...

This allows Python transformations to use Jinja2 templates or other files while maintaining complex logic.

Multi-vendor transformations

A common pattern is vendor-specific transformations selected by device attribute:

from pathlib import Path

from infrahub_sdk.transforms import InfrahubTransform

class MultiVendorConfigTransform(InfrahubTransform):
    async def transform(self, data: dict) -> str:
        device = data["InfraDevice"]["edges"][0]["node"]
        vendor = device["vendor"]["node"]["name"]["value"]
        
        # Select template based on vendor
        template_map = {
            "Arista": "arista_eos.j2",
            "Cisco": "cisco_ios.j2",
            "Juniper": "juniper_junos.j2"
        }
        
        template_path = Path(self.root_directory) / "templates" / template_map[vendor]
        
        # Render vendor-specific template
        # ...
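
One gap in a plain dictionary lookup is that an unexpected vendor surfaces as a bare KeyError. A minimal sketch of the selection step with an explicit error follows. select_template and UnsupportedVendorError are hypothetical names introduced here, and the vendor names are the ones from the example above.

```python
from pathlib import Path

TEMPLATE_MAP = {
    "Arista": "arista_eos.j2",
    "Cisco": "cisco_ios.j2",
    "Juniper": "juniper_junos.j2",
}


class UnsupportedVendorError(ValueError):
    """Raised when no template is registered for a vendor (local to this sketch)."""


def select_template(vendor: str, root_directory: str) -> Path:
    """Return the template path for a vendor, or fail with a descriptive error."""
    try:
        template_name = TEMPLATE_MAP[vendor]
    except KeyError:
        supported = ", ".join(sorted(TEMPLATE_MAP))
        raise UnsupportedVendorError(
            f"No template for vendor {vendor!r}; supported vendors: {supported}"
        ) from None
    return Path(root_directory) / "templates" / template_name


print(select_template("Arista", "/repo"))
```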

Transformation error handling

Transformations should handle errors gracefully:

from infrahub_sdk.transforms import InfrahubTransform
from infrahub_sdk.exceptions import ValidationError

class ValidatingTransform(InfrahubTransform):
    async def transform(self, data: dict) -> dict:
        try:
            device = data["InfraDevice"]["edges"][0]["node"]
        except (KeyError, IndexError) as e:
            raise ValidationError(f"Invalid query response: {e}")
        
        if not device.get("hostname", {}).get("value"):
            raise ValidationError("Device missing required hostname")
        
        # Transform logic
        # ...

Proper error handling helps diagnose issues during artifact generation.
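
When several fields can be missing at once, it can help to report every problem in a single error instead of stopping at the first. The sketch below shows that approach under stated assumptions: TransformInputError is a local exception defined for the example (not an SDK class), and validate_device is a hypothetical helper.

```python
class TransformInputError(Exception):
    """Local exception for this sketch; not part of the Infrahub SDK."""


def validate_device(data: dict) -> dict:
    """Return the first device node, or raise one error listing every problem found."""
    edges = (data.get("InfraDevice") or {}).get("edges") or []
    if not edges:
        raise TransformInputError("Query returned no devices")

    device = edges[0]["node"]
    problems = []
    if not (device.get("hostname") or {}).get("value"):
        problems.append("device is missing a hostname")
    for position, edge in enumerate((device.get("interfaces") or {}).get("edges") or []):
        if not (edge["node"].get("name") or {}).get("value"):
            problems.append(f"interface #{position} is missing a name")

    if problems:
        raise TransformInputError("; ".join(problems))
    return device


good = {"InfraDevice": {"edges": [{"node": {"hostname": {"value": "router-1"}}}]}}
print(validate_device(good)["hostname"]["value"])
```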

Testing transformations

Infrahub provides a testing framework for transformations with minimal configuration. Tests execute locally during development and automatically in the CI pipeline when changes affect transformations.

Create test fixtures in the repository:

# tests/fixtures/device_fixture.json
{
  "InfraDevice": {
    "edges": [
      {
        "node": {
          "hostname": {"value": "router-1"},
          "interfaces": {
            "edges": [
              {
                "node": {
                  "name": {"value": "Ethernet1"},
                  "enabled": {"value": true}
                }
              }
            ]
          }
        }
      }
    ]
  }
}

Define expected output:

# tests/expected/device_config.txt
hostname router-1
!
interface Ethernet1
  no shutdown

The testing framework compares transformation output with expected output, failing tests on mismatch. See the Resource Testing Framework documentation for details.
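
The comparison step can be illustrated with the standard library. This sketch is not the Resource Testing Framework itself. It only shows the idea of diffing rendered output against an expected file, using a hypothetical compare_output helper and the expected text from the example above.

```python
import difflib


def compare_output(actual: str, expected: str) -> list[str]:
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


expected = "hostname router-1\n!\ninterface Ethernet1\n  no shutdown\n"
rendered = "hostname router-1\n!\ninterface Ethernet1\n  shutdown\n"

diff = compare_output(rendered, expected)
if diff:
    print("\n".join(diff))  # a test runner would fail here and show the diff
```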

Design trade-offs

Jinja2 vs. Python

Jinja2 transformations are simpler but less flexible. Python transformations are more powerful but more complex. Choose based on requirements:

Use Jinja2 when:

Output is text-based (configurations, documentation)
Logic is simple (loops, conditionals, basic formatting)
Template maintainability matters (non-programmers edit templates)

Use Python when:

Output is structured data (JSON, YAML for APIs)
Complex logic is required (calculations, external API calls)
Type safety and IDE support matter

Transformation granularity

Should you create many small transformations or few large ones?

Many small transformations:

Easier to test (focused scope)
More reusable (compose into artifacts)
Clearer purpose
More objects to manage

Few large transformations:

Fewer objects to manage
May duplicate logic
Harder to test
Less flexible

Recommendation: Prefer smaller, focused transformations that can be composed.
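
One way to picture the recommendation is as small rendering functions that are each testable on their own and then combined. The sketch below works on already-flattened device data. All function names and the data shape are invented for illustration and do not correspond to Infrahub objects.

```python
def render_hostname(device: dict) -> str:
    """Small unit: only the hostname section."""
    return f"hostname {device['hostname']}"


def render_interfaces(device: dict) -> str:
    """Small unit: only the interface sections."""
    lines = []
    for iface in device["interfaces"]:
        lines.append(f"interface {iface['name']}")
        lines.append("  no shutdown" if iface["enabled"] else "  shutdown")
    return "\n".join(lines)


def render_device_config(device: dict) -> str:
    """Composition: join the focused sections with a '!' separator line."""
    sections = [render_hostname(device), render_interfaces(device)]
    return "\n!\n".join(sections) + "\n"


device = {
    "hostname": "router-1",
    "interfaces": [{"name": "Ethernet1", "enabled": True}],
}
print(render_device_config(device), end="")
```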

Query complexity

GraphQL queries can fetch deeply nested data in one request, but complex queries impact performance. Balance between:

Fewer complex queries: Fetch everything needed in one query

Pros: Fewer database round-trips
Cons: Slower queries, more data transferred

More simple queries: Fetch only what’s needed

Pros: Faster individual queries
Cons: Multiple round-trips, coordination complexity

Recommendation: Start with simpler queries and combine them only when performance requires it.

Related topics

Generators - Using transformations to create objects
Artifacts - Storing transformation output
GraphQL - Writing efficient queries
Repository Integration - Managing transformations in Git
