Understanding data transformations in Infrahub
Transformations convert infrastructure data from Infrahub’s graph database into formats that external systems require. Using Jinja2 templates or Python code, transformations enable Infrahub to serve as a source of truth that generates device configurations, API payloads, documentation, and other artifacts in the exact format each system expects.
Infrastructure data exists in a canonical form in Infrahub—devices, interfaces, IP addresses stored as interconnected nodes. However, external systems need specific formats:
- Network devices expect vendor-specific configuration syntax
- Cloud APIs require JSON payloads with specific structure
- Documentation systems need Markdown or HTML
- Monitoring tools expect CSV or custom formats
- Ticketing systems need structured data
Transformations bridge this gap, converting Infrahub’s graph data into whatever format each system requires. This design provides several benefits:
- Single source of truth: Infrastructure data lives in one place (Infrahub), not scattered across configuration files
- Consistency: Generated artifacts are always consistent with the source data, eliminating drift
- Flexibility: Change the output format by updating the transformation without touching the source data
- Reusability: A single transformation can generate artifacts for multiple devices or objects
- Version control: Transformations live in Git repositories, providing full history and collaboration
Core concepts
A transformation consists of two main components:
GraphQL query: Defines the input data. The query retrieves exactly the data needed from Infrahub’s graph database:
query GetDeviceConfig($device_id: String!) {
  InfraDevice(ids: [$device_id]) {
    edges {
      node {
        hostname {
          value
        }
        interfaces {
          edges {
            node {
              name {
                value
              }
              enabled {
                value
              }
              ip_addresses {
                edges {
                  node {
                    address {
                      value
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
Transformation logic: Processes the query results and generates output. This can be:
- A Jinja2 template for text-based formats (configurations, documentation)
- A Python class for complex logic or structured data (JSON, YAML)
The transformation automatically inherits parameters from the GraphQL query. In the example above, $device_id becomes a parameter the transformation accepts.
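To make the parameter inheritance concrete, the sketch below pulls the variable declarations out of a query's operation header with a regular expression. The helper name `query_parameters` is hypothetical, and this is not how Infrahub itself inspects queries; a production implementation would use a proper GraphQL parser rather than a regex.

```python
import re


def query_parameters(query: str) -> dict[str, str]:
    """Return {variable_name: declared_type} from a GraphQL operation header.

    Hypothetical helper for illustration only; it handles simple headers
    such as `query Name($id: String!)` and nothing more exotic.
    """
    header = re.search(r"\b(?:query|mutation)\s+\w*\s*\(([^)]*)\)", query)
    if header is None:
        return {}  # no declared variables: the transformation takes no parameters
    return dict(re.findall(r"\$(\w+)\s*:\s*([\w\[\]!]+)", header.group(1)))


QUERY = """
query GetDeviceConfig($device_id: String!) {
  InfraDevice(ids: [$device_id]) { edges { node { hostname { value } } } }
}
"""

print(query_parameters(QUERY))
```

A query with no variable declarations yields an empty dictionary, which corresponds to a transformation that accepts no parameters.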
Jinja2 transformations render templates using data from GraphQL queries. They excel at generating text-based formats:
Device configurations: Generate vendor-specific configurations:
hostname {{ device.hostname.value }}
!
{% for interface in device.interfaces.edges %}
interface {{ interface.node.name.value }}
{% if interface.node.enabled.value %}
no shutdown
{% else %}
shutdown
{% endif %}
{% for ip in interface.node.ip_addresses.edges %}
ip address {{ ip.node.address.value }}
{% endfor %}
{% endfor %}
Documentation: Generate Markdown documentation:
# Device: {{ device.hostname.value }}
## Interfaces
{% for interface in device.interfaces.edges %}
### {{ interface.node.name.value }}
- Status: {{ "Enabled" if interface.node.enabled.value else "Disabled" }}
- IP Addresses:
{% for ip in interface.node.ip_addresses.edges %}
  - {{ ip.node.address.value }}
{% endfor %}
{% endfor %}
A Jinja2 transformation (CoreTransformJinja2) records the path to its template within the repository, along with the query it uses:
name: device_config
template_path: templates/arista_eos.j2
query: device_query
repository: my_repo
Python transformations provide full programming flexibility for complex logic or structured output:
from infrahub_sdk.transforms import InfrahubTransform


class CloudFormationTransform(InfrahubTransform):
    async def transform(self, data: dict) -> dict:
        """Generate CloudFormation template from device data."""
        device = data["InfraDevice"]["edges"][0]["node"]
        resources = {}
        for interface in device["interfaces"]["edges"]:
            iface = interface["node"]
            resource_name = f"NetworkInterface{iface['name']['value']}"
            resources[resource_name] = {
                "Type": "AWS::EC2::NetworkInterface",
                "Properties": {
                    "Description": f"{device['hostname']['value']} {iface['name']['value']}",
                    "SubnetId": {"Ref": "SubnetId"},
                    # CloudFormation accepts only one primary address per
                    # interface, so flag just the first one.
                    "PrivateIpAddresses": [
                        {"PrivateIpAddress": ip["node"]["address"]["value"], "Primary": index == 0}
                        for index, ip in enumerate(iface["ip_addresses"]["edges"])
                    ],
                },
            }
        return {
            "AWSTemplateFormatVersion": "2010-09-09",
            "Resources": resources,
        }
Python transformations (CoreTransformPython) reference the class in the repository:
name: cloudformation_template
file_path: transforms/cloudformation.py
class_name: CloudFormationTransform
query: device_query
repository: my_repo
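Python transformations spend much of their code walking the `edges` / `node` / `value` wrappers in the query result, as the `CloudFormationTransform` example shows. The sketch below is one way to flatten that structure before applying business logic. The `unwrap` helper is hypothetical (it is not part of `infrahub_sdk`), and it assumes wrapper dictionaries contain only the single key `edges`, `node`, or `value`.

```python
from typing import Any


def unwrap(obj: Any) -> Any:
    """Recursively strip GraphQL `edges`/`node`/`value` wrappers.

    Hypothetical helper: only dictionaries whose sole key is one of the
    wrapper names are collapsed; everything else is kept as-is.
    """
    if isinstance(obj, dict):
        keys = set(obj)
        if keys == {"value"}:
            return obj["value"]
        if keys == {"edges"}:
            return [unwrap(edge["node"]) for edge in obj["edges"]]
        if keys == {"node"}:
            return unwrap(obj["node"])
        return {key: unwrap(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [unwrap(item) for item in obj]
    return obj


raw = {
    "InfraDevice": {
        "edges": [
            {
                "node": {
                    "hostname": {"value": "router-1"},
                    "interfaces": {
                        "edges": [
                            {"node": {"name": {"value": "Ethernet1"}, "enabled": {"value": True}}}
                        ]
                    },
                }
            }
        ]
    }
}

flat = unwrap(raw)
print(flat["InfraDevice"][0]["interfaces"][0]["name"])
```

If a query requests additional attribute metadata alongside `value`, those dictionaries have more than one key and are left untouched, so the helper degrades safely rather than discarding data.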
Transformations inherit parameters from their GraphQL queries. This enables:
Static transformations: No parameters, generate the same output each time:
query GetAllDevices {
  InfraDevice {
    edges {
      node {
        hostname { value }
      }
    }
  }
}
Dynamic transformations: Parameters allow generating output for specific objects:
query GetDevice($device_id: String!) {
  InfraDevice(ids: [$device_id]) {
    # ...
  }
}
When rendering, provide parameter values:
infrahubctl render my_transform --param device_id=<uuid>
Architecture and implementation
When a transformation runs:
- Query execution: The GraphQL query executes against Infrahub, retrieving data from the graph database
- Data preparation: Query results are formatted and passed to the transformation logic
- Transformation: The Jinja2 template renders or Python class executes, producing output
- Result return: The output is returned to the caller (API, artifact system, CLI)
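The four steps above can be sketched as a small pipeline. This is an illustration of the flow, not Infrahub's implementation: `run_transformation`, `fake_query`, and `hostname_line` are hypothetical names, and the only assumption about the query response is the standard GraphQL shape with `data` and optional `errors` keys.

```python
from typing import Any, Callable, Mapping


def run_transformation(
    execute_query: Callable[[Mapping[str, Any]], dict],
    transform: Callable[[dict], Any],
    variables: Mapping[str, Any],
) -> Any:
    """Run query -> prepare data -> transform -> return (illustrative only)."""
    response = execute_query(variables)  # 1. query execution
    if response.get("errors"):
        raise RuntimeError(f"query failed: {response['errors']}")
    data = response["data"]  # 2. data preparation
    output = transform(data)  # 3. transformation
    return output  # 4. result return


def fake_query(variables: Mapping[str, Any]) -> dict:
    # Stand-in for a real GraphQL call; always returns the same device.
    return {"data": {"InfraDevice": {"edges": [{"node": {"hostname": {"value": "router-1"}}}]}}}


def hostname_line(data: dict) -> str:
    node = data["InfraDevice"]["edges"][0]["node"]
    return f"hostname {node['hostname']['value']}"


print(run_transformation(fake_query, hostname_line, {"device_id": "abc"}))
```

Passing the query executor and the transformation logic in as callables mirrors the separation between the two components of a transformation, and makes each step easy to replace with a stub during testing.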
This flow occurs in several contexts:
On-demand rendering: REST API endpoint for immediate transformation:
curl -X POST http://infrahub/api/transform/my_transform \
-H "Content-Type: application/json" \
-d '{"device_id": "<uuid>"}'
Artifact generation: Transformations run as part of artifact definitions, storing results in object storage
Development/testing: CLI tools render transformations locally:
infrahubctl render my_transform --param device_id=<uuid>
Repository integration
Transformations live in Git repositories alongside schemas, checks, and other infrastructure-as-code:
my_infrahub_repo/
├── .infrahub.yml            # Declares transformations
├── queries/
│   └── device_query.gql     # GraphQL queries
├── templates/
│   ├── arista_eos.j2        # Jinja2 templates
│   └── cisco_ios.j2
└── transforms/
    └── cloudformation.py    # Python transformations
The .infrahub.yml file registers transformations:
jinja2_transforms:
  - name: arista_config
    description: Generate Arista EOS configuration
    query: device_query
    template_path: templates/arista_eos.j2
python_transforms:
  - name: cloudformation_template
    description: Generate AWS CloudFormation template
    query: device_query
    class_name: CloudFormationTransform
    file_path: transforms/cloudformation.py
When Infrahub syncs the repository, it creates CoreTransformJinja2 or CoreTransformPython objects in the database.
Transformations themselves are generic—they process data without knowing which objects they’ll transform. Targeting happens through artifact definitions and generators, which combine transformations with groups:
artifacts:
  - name: device_configs
    transformation: arista_config
    targets: arista_devices_group
This pattern enables:
- Reusable logic: One transformation applies to multiple groups through different artifact definitions
- Bulk processing: Artifact definitions automatically apply transformations to all group members
- Dynamic targeting: Adding devices to groups automatically includes them in transformation processing
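The relationship between a generic transformation and a target group can be pictured as a simple loop. The sketch below is a conceptual model only: `generate_artifacts` and `banner` are hypothetical names, group members are simplified to plain dictionaries, and nothing here reflects how Infrahub schedules or stores artifacts.

```python
from typing import Callable


def generate_artifacts(
    transform: Callable[[dict], str], members: list[dict]
) -> dict[str, str]:
    """Apply one transformation to every member of a group (conceptual model)."""
    return {member["hostname"]: transform(member) for member in members}


def banner(device: dict) -> str:
    # The transformation knows nothing about which group it is applied to.
    return f"hostname {device['hostname']}\n"


arista_devices_group = [{"hostname": "leaf-1"}, {"hostname": "leaf-2"}]
artifacts = generate_artifacts(banner, arista_devices_group)

# Dynamic targeting: a device added to the group is picked up on the next run.
arista_devices_group.append({"hostname": "leaf-3"})
artifacts = generate_artifacts(banner, arista_devices_group)
print(sorted(artifacts))
```

The transformation function never changes between runs; only the group membership does, which is the property the artifact definition relies on.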
See the Generators concept for how this pattern extends to creating objects.
Transformations are branch-aware schema objects. This means:
- Different transformations per branch: A feature branch can have modified transformation logic without affecting production
- Schema-aware: Transformations automatically adapt to schema changes in the branch
- Data-aware: Transformations see the branch's view of the data, not the main branch's
When testing infrastructure changes in a branch, transformations generate artifacts using the branch’s data and schema.
Implementation examples
Jinja2 transformations support custom filters for common operations:
{# Format IP addresses with netmask #}
interface {{ interface.name.value }}
ip address {{ interface.ip_address.value | ipaddr('address') }} {{ interface.ip_address.value | ipaddr('netmask') }}
{# Convert interface names #}
interface {{ interface.name.value | replace('Ethernet', 'Eth') }}
{# Filter enabled interfaces #}
{% for interface in device.interfaces.edges | selectattr('node.enabled.value') %}
interface {{ interface.node.name.value }}
no shutdown
{% endfor %}
Infrahub provides infrastructure-specific filters beyond Jinja2’s built-in filters.
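For readers unfamiliar with what the address and netmask filters in the first snippet compute, the same split can be reproduced with Python's standard `ipaddress` module. This is an equivalent calculation shown for illustration, not the implementation behind the Jinja2 filters; `address_and_netmask` is a hypothetical helper name.

```python
import ipaddress


def address_and_netmask(value: str) -> tuple[str, str]:
    """Split an interface address in CIDR notation into address and netmask."""
    iface = ipaddress.ip_interface(value)
    return str(iface.ip), str(iface.netmask)


address, netmask = address_and_netmask("10.0.0.1/24")
print(f"ip address {address} {netmask}")
```

The same function works for IPv6 input, although the netmask is then returned in expanded IPv6 form, which few device syntaxes expect.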
Python transformations can access files in the repository:
from pathlib import Path

from infrahub_sdk.transforms import InfrahubTransform


class TemplateBasedTransform(InfrahubTransform):
    async def transform(self, data: dict) -> dict:
        # self.root_directory points to repository root
        template_path = Path(self.root_directory) / "templates" / "config.j2"
        template_content = template_path.read_text()
        # Process template_content with data
        # ...
This allows Python transformations to use Jinja2 templates or other files while maintaining complex logic.
A common pattern is vendor-specific transformations selected by device attribute:
from pathlib import Path

from infrahub_sdk.transforms import InfrahubTransform


class MultiVendorConfigTransform(InfrahubTransform):
    async def transform(self, data: dict) -> str:
        device = data["InfraDevice"]["edges"][0]["node"]
        vendor = device["vendor"]["node"]["name"]["value"]
        # Select template based on vendor
        template_map = {
            "Arista": "arista_eos.j2",
            "Cisco": "cisco_ios.j2",
            "Juniper": "juniper_junos.j2",
        }
        template_path = Path(self.root_directory) / "templates" / template_map[vendor]
        # Render vendor-specific template
        # ...
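A direct dictionary lookup on the vendor name raises a bare `KeyError` when a device has a vendor with no template. One possible refinement, sketched below with standard-library code only, is to turn that into a descriptive error. The function name `select_template` and the choice of `ValueError` are assumptions made for this illustration; inside a real transformation the root directory would come from `self.root_directory`.

```python
from pathlib import Path

# Same mapping as in the transformation above.
TEMPLATE_MAP = {
    "Arista": "arista_eos.j2",
    "Cisco": "cisco_ios.j2",
    "Juniper": "juniper_junos.j2",
}


def select_template(vendor: str, root_directory: str) -> Path:
    """Return the template path for a vendor, or fail with a clear message."""
    try:
        filename = TEMPLATE_MAP[vendor]
    except KeyError:
        supported = ", ".join(sorted(TEMPLATE_MAP))
        raise ValueError(
            f"No template for vendor {vendor!r}; supported vendors: {supported}"
        ) from None
    return Path(root_directory) / "templates" / filename


print(select_template("Arista", "/repo"))
```

Listing the supported vendors in the message makes a failed artifact generation easier to diagnose than a traceback ending in `KeyError: 'Nokia'`.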
Transformations should handle errors gracefully:
from infrahub_sdk.transforms import InfrahubTransform
from infrahub_sdk.exceptions import ValidationError


class ValidatingTransform(InfrahubTransform):
    async def transform(self, data: dict) -> dict:
        try:
            device = data["InfraDevice"]["edges"][0]["node"]
        except (KeyError, IndexError) as e:
            # Chain the original exception so the traceback shows the missing key or index
            raise ValidationError(f"Invalid query response: {e}") from e
        if not device.get("hostname", {}).get("value"):
            raise ValidationError("Device missing required hostname")
        # Transform logic
        # ...
Proper error handling helps diagnose issues during artifact generation.
Infrahub provides a testing framework for transformations with minimal configuration. Tests execute locally during development and automatically in the CI pipeline when changes affect transformations.
Create test fixtures in the repository:
# tests/fixtures/device_fixture.json
{
  "InfraDevice": {
    "edges": [
      {
        "node": {
          "hostname": {"value": "router-1"},
          "interfaces": {
            "edges": [
              {
                "node": {
                  "name": {"value": "Ethernet1"},
                  "enabled": {"value": true}
                }
              }
            ]
          }
        }
      }
    ]
  }
}
Define expected output:
# tests/expected/device_config.txt
hostname router-1
!
interface Ethernet1
no shutdown
The testing framework compares transformation output with expected output, failing tests on mismatch.
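The comparison step can be approximated locally with Python's `difflib`, which is handy when working out why a rendered output does not match the expected file. This is a standalone sketch, not the testing framework's own code; `diff_output` is a hypothetical helper.

```python
import difflib


def diff_output(actual: str, expected: str) -> list[str]:
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


expected = "hostname router-1\n!\ninterface Ethernet1\nno shutdown\n"
actual = "hostname router-1\n!\ninterface Ethernet1\nshutdown\n"

for line in diff_output(actual, expected):
    print(line)
```

Lines prefixed with `-` appear only in the expected output and lines prefixed with `+` only in the actual output, so a mismatch points directly at the template or fixture line that needs attention.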
See the Resource Testing Framework documentation for details.
Design trade-offs
Jinja2 vs. Python
Jinja2 transformations are simpler but less flexible. Python transformations are more powerful but more complex. Choose based on requirements:
Use Jinja2 when:
- Output is text-based (configurations, documentation)
- Logic is simple (loops, conditionals, basic formatting)
- Template maintainability matters (non-programmers edit templates)
Use Python when:
- Output is structured data (JSON, YAML for APIs)
- Complex logic is required (calculations, external API calls)
- Type safety and IDE support matter
Should you create many small transformations or few large ones?
Many small transformations:
- Pro: Easier to test (focused scope)
- Pro: More reusable (compose into artifacts)
- Pro: Clearer purpose
- Con: More objects to manage
Few large transformations:
- Pro: Fewer objects to manage
- Con: May duplicate logic
- Con: Harder to test
- Con: Less flexible
Recommendation: Prefer smaller, focused transformations that can be composed.
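At the code level, one way to keep transformation logic small and focused is to write each concern as its own function and combine them at the end. The sketch below uses plain functions over the query-result shape from the fixtures shown earlier; the function names are hypothetical, and how such pieces map onto separate Infrahub transformations or artifacts is a design decision this example does not settle.

```python
def render_hostname(device: dict) -> str:
    """Render only the hostname line."""
    return f"hostname {device['hostname']['value']}"


def render_interfaces(device: dict) -> str:
    """Render only the interface stanzas."""
    lines = []
    for edge in device["interfaces"]["edges"]:
        iface = edge["node"]
        lines.append(f"interface {iface['name']['value']}")
        lines.append("no shutdown" if iface["enabled"]["value"] else "shutdown")
    return "\n".join(lines)


def render_device(device: dict) -> str:
    """Compose the focused renderers into one configuration."""
    return "\n!\n".join([render_hostname(device), render_interfaces(device)])


device = {
    "hostname": {"value": "router-1"},
    "interfaces": {
        "edges": [{"node": {"name": {"value": "Ethernet1"}, "enabled": {"value": True}}}]
    },
}

print(render_device(device))
```

Each renderer can be tested against a tiny fixture of its own, which is the testing advantage the list above attributes to small transformations.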
Query complexity
GraphQL queries can fetch deeply nested data in one request, but complex queries impact performance. Balance between:
Fewer complex queries: Fetch everything needed in one query
- Pros: Fewer database round-trips
- Cons: Slower queries, more data transferred
More simple queries: Fetch only what’s needed
- Pros: Faster individual queries
- Cons: Multiple round-trips, coordination complexity
Recommendation: Start with simpler queries and combine them only when performance requires it.