Rate Limits - Infrahub

Rate limiting ensures fair resource allocation and prevents API abuse. Infrahub does not enforce request limits yet, but understanding the limits it may introduce and the metrics it already tracks helps you design resilient integrations.

Current Rate Limiting Status

As of version 0.15.0, Infrahub does not enforce hard rate limits on API requests. However, this may change in future versions. It is recommended to implement rate limiting best practices in your integrations now to ensure forward compatibility.

Future Rate Limiting

While not currently enforced, future versions of Infrahub may implement:

Per-User Limits

Requests per minute: 1000 requests/minute per user account
Concurrent connections: 50 simultaneous connections per user
WebSocket subscriptions: 100 active subscriptions per user

Per-API-Key Limits

Requests per minute: 5000 requests/minute per API key
Concurrent connections: 200 simultaneous connections per API key
WebSocket subscriptions: 500 active subscriptions per API key

Anonymous Access Limits

If anonymous access is enabled (SETTINGS.main.allow_anonymous_access = true):

Requests per minute: 100 requests/minute per IP address
Concurrent connections: 10 simultaneous connections per IP
No WebSocket subscriptions
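One way to stay under a per-minute budget such as the figures above is a client-side token bucket. The sketch below is an illustration only: the class name, the `capacity` parameter, and the injectable `clock` are assumptions made for this example, not part of Infrahub or its SDK.

```python
import time


class TokenBucket:
    """Client-side throttle that allows `rate_per_minute` requests on average."""

    def __init__(self, rate_per_minute: float, capacity: int, clock=time.monotonic):
        self.refill_per_second = rate_per_minute / 60.0
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)     # start with a full bucket
        self.clock = clock                # injectable for testing
        self.last_refill = clock()

    def try_acquire(self) -> bool:
        """Consume one token and return True if a request may be sent now."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Add the tokens earned since the last call, capped at capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A caller would check `try_acquire()` before each request and sleep briefly when it returns False. A bucket built with `rate_per_minute=1000` mirrors the possible per-user limit listed above.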

GraphQL Query Complexity

Infrahub monitors GraphQL query complexity through metrics:

Depth Limits

Maximum query depth: Currently unlimited, but tracked via graphql_query_depth metric
Recommended depth: Keep queries under 10 levels deep for optimal performance

Height Limits

Maximum query height: Currently unlimited, but tracked via graphql_query_height metric
Recommended height: Limit to 100 fields per query for optimal performance

See backend/infrahub/graphql/app.py:281 in the Infrahub source tree for complexity metric tracking.
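To check a query against the recommended depth before sending it, a rough client-side estimate can be obtained by counting brace nesting. This is a sketch under stated assumptions: `estimate_query_depth` is a hypothetical helper, and Infrahub's own graphql_query_depth metric may count levels differently.

```python
def estimate_query_depth(query: str) -> int:
    """Return the deepest brace nesting level found in a GraphQL document.

    Braces inside string literals and # comments are ignored. Inline input
    objects such as {name: {value: "x"}} also count as nesting, so treat the
    result as an upper bound rather than an exact selection-set depth.
    """
    depth = max_depth = 0
    in_string = in_comment = escaped = False
    for ch in query:
        if in_comment:
            in_comment = ch != "\n"      # a comment ends at the newline
        elif in_string:
            if escaped:
                escaped = False          # character after a backslash
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == "#":
            in_comment = True
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth
```

A client could log a warning whenever the estimate exceeds 10, the recommended depth given above.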

Response Size Limits

GraphQL Responses

Maximum response size: Currently unlimited
Metric tracking: Response sizes are tracked via graphql_response_size metric
Recommendation: Use pagination to keep responses under 1MB

File Operations

Maximum file size: Depends on repository configuration
Recommendation: Stream large files rather than loading them entirely into memory
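The streaming recommendation can be sketched with plain file-like objects. Nothing here is specific to Infrahub: `iter_chunks`, `copy_stream`, and the 64 KiB default chunk size are illustrative choices, and the in-memory buffers stand in for a real download and a real destination file.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive chunks so only one chunk is held in memory at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def copy_stream(source: BinaryIO, destination: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy source to destination chunk by chunk and return the byte count."""
    total = 0
    for chunk in iter_chunks(source, chunk_size):
        destination.write(chunk)
        total += len(chunk)
    return total


# In-memory demonstration; in practice the source would be an HTTP response
# body or an open file, and the destination a file opened in binary mode.
source = io.BytesIO(b"x" * 200_000)
destination = io.BytesIO()
copied = copy_stream(source, destination)
```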

Best Practices

1. Implement Exponential Backoff

When encountering errors, use exponential backoff:

import time
import random

def make_request_with_backoff(request_func, max_retries=5):
    for attempt in range(max_retries):
        try:
            return request_func()
        except Exception:
            if attempt == max_retries - 1:
                raise
            
            # Exponential backoff with jitter
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait_time)

2. Use Pagination

Always paginate large result sets:

query {
  InfraDevice(offset: 0, limit: 50) {
    count
    edges {
      node {
        id
        name { value }
      }
    }
  }
}
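On the client side, a paginated query is usually driven by a loop that keeps requesting pages until none remain. The generator below is a minimal sketch: `iterate_pages` and the `fetch_page` callback are hypothetical names, and the callback is assumed to return the page's items together with the position of the next page (an offset or a cursor), or None when there are no more pages.

```python
from typing import Callable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")
P = TypeVar("P")


def iterate_pages(
    fetch_page: Callable[[Optional[P]], Tuple[List[T], Optional[P]]],
) -> Iterator[T]:
    """Yield every item across all pages, fetching one page at a time."""
    position: Optional[P] = None  # None means "start from the first page"
    while True:
        items, position = fetch_page(position)
        yield from items
        if position is None:
            break


# Stand-in for a real API call: serves a list in pages of three items.
_DATA = list(range(7))


def fake_fetch(position: Optional[int]) -> Tuple[List[int], Optional[int]]:
    start = position or 0
    end = start + 3
    return _DATA[start:end], (end if end < len(_DATA) else None)


all_items = list(iterate_pages(fake_fetch))
```

Because the generator is lazy, a consumer that stops early never requests the remaining pages.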

3. Optimize Query Complexity

Request only the fields you need:

# Good - Minimal field selection
query {
  InfraDevice {
    edges {
      node {
        id
        name { value }
      }
    }
  }
}

# Avoid - Requesting unnecessary nested data
query {
  InfraDevice {
    edges {
      node {
        id
        name { value }
        site {
          node {
            id
            name { value }
            region {
              node {
                id
                name { value }
                # Too deep...
              }
            }
          }
        }
      }
    }
  }
}

4. Cache Responses

Implement client-side caching:

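import time
from functools import lru_cache

class InfrahubClient:
    def get_schema(self, branch: str, ttl: int = 300):
        # The time bucket changes every `ttl` seconds, so a cached schema
        # is reused for at most 5 minutes by default
        return self._get_schema_cached(branch, int(time.time() // ttl))

    @lru_cache(maxsize=100)
    def _get_schema_cached(self, branch: str, time_bucket: int):
        # time_bucket is only part of the cache key; a new bucket forces a refetch
        return self._fetch_schema(branch)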

5. Batch Operations

Group related operations together:

mutation {
  device1: InfraDeviceCreate(data: {name: {value: "router-01"}}) { ok }
  device2: InfraDeviceCreate(data: {name: {value: "router-02"}}) { ok }
  device3: InfraDeviceCreate(data: {name: {value: "router-03"}}) { ok }
}
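An aliased mutation like the one above can be generated from a list of names. The helper below is a sketch: `build_batch_create_mutation` is a hypothetical function, and it reuses the InfraDeviceCreate mutation and field layout from the example. It relies on `json.dumps` to quote and escape each name; for untrusted input, GraphQL variables are the safer route.

```python
import json
from typing import Iterable


def build_batch_create_mutation(
    names: Iterable[str], mutation_name: str = "InfraDeviceCreate"
) -> str:
    """Build one mutation document that creates a device per name."""
    lines = ["mutation {"]
    for index, name in enumerate(names, start=1):
        # json.dumps yields a double-quoted string with special characters escaped
        literal = json.dumps(name)
        lines.append(
            f"  device{index}: {mutation_name}"
            f"(data: {{name: {{value: {literal}}}}}) {{ ok }}"
        )
    lines.append("}")
    return "\n".join(lines)


mutation = build_batch_create_mutation(["router-01", "router-02", "router-03"])
```

Keeping batches to a moderate size avoids trading many small requests for one very large one.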

6. Use WebSocket Subscriptions Wisely

Limit the number of active subscriptions:

class SubscriptionManager {
  constructor(maxSubscriptions = 10) {
    this.maxSubscriptions = maxSubscriptions;
    this.activeSubscriptions = new Map();
  }
  
  subscribe(id, query) {
    if (this.activeSubscriptions.size >= this.maxSubscriptions) {
      throw new Error('Maximum subscriptions reached');
    }
    
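    // Create subscription. createSubscription is a placeholder for your
    // WebSocket client's subscribe call; it must return an object with close()
    const subscription = createSubscription(query);
    this.activeSubscriptions.set(id, subscription);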
  }
  
  unsubscribe(id) {
    const subscription = this.activeSubscriptions.get(id);
    if (subscription) {
      subscription.close();
      this.activeSubscriptions.delete(id);
    }
  }
}

7. Monitor Your Usage

Track your API usage patterns:

import time
from collections import deque

class RateLimitTracker:
    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.requests = deque()
    
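    def _prune(self, now):
        # Remove requests that fall outside the window
        while self.requests and self.requests[0] < now - self.window_seconds:
            self.requests.popleft()
    
    def record_request(self):
        now = time.time()
        self.requests.append(now)
        self._prune(now)
    
    def get_rate(self):
        # Average requests per second over the window; prune first so the
        # rate decays even when no new requests are being recorded
        self._prune(time.time())
        return len(self.requests) / self.window_seconds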

8. Handle Authentication Errors

Refresh tokens proactively before they expire:

import time

import jwt  # PyJWT

class TokenManager:
    def __init__(self, client):
        self.client = client
        self.access_token = None
        self.refresh_token = None
        self.token_expiry = None
    
    def get_token(self):
        # Refresh if token expires in less than 5 minutes
        if self.token_expiry and time.time() > self.token_expiry - 300:
            self.refresh_access_token()
        
        return self.access_token
    
    def refresh_access_token(self):
        response = self.client.refresh_token(self.refresh_token)
        self.access_token = response['access_token']
        # JWT exp is in seconds since epoch
        self.token_expiry = jwt.decode(
            self.access_token, 
            options={"verify_signature": False}
        )['exp']

Monitoring Rate Limits

Prometheus Metrics

Infrahub exposes metrics that help you monitor API usage:

graphql_duration - Query execution time (can indicate throttling)
graphql_query_errors - Error count (may spike if limits are hit)
graphql_response_size - Response size distribution
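If you scrape these metrics yourself rather than through a Prometheus server, the text exposition format is simple enough to filter by hand. The sketch below assumes the metric names appear in the scrape output with the prefixes listed above; `select_metric_lines` is a hypothetical helper, and the exact sample names (for example suffixes such as `_sum` or `_count`) depend on the metric type.

```python
from typing import Iterable, List

DEFAULT_PREFIXES = (
    "graphql_duration",
    "graphql_query_errors",
    "graphql_response_size",
)


def select_metric_lines(
    exposition_text: str, prefixes: Iterable[str] = DEFAULT_PREFIXES
) -> List[str]:
    """Return the sample lines whose metric name starts with a given prefix."""
    wanted = tuple(prefixes)
    selected = []
    for raw_line in exposition_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines
        if not line or line.startswith("#"):
            continue
        if line.startswith(wanted):
            selected.append(line)
    return selected
```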

Custom Monitoring

Implement custom monitoring in your applications:

import logging

logger = logging.getLogger('infrahub_client')

class MonitoredClient:
    def __init__(self, client):
        self.client = client
        self.request_count = 0
        self.error_count = 0
    
    async def query(self, *args, **kwargs):
        self.request_count += 1
        
        try:
            result = await self.client.query(*args, **kwargs)
            
            if result.get('errors'):
                self.error_count += 1
                logger.warning(
                    f"GraphQL errors: {result['errors']}",
                    extra={'request_count': self.request_count}
                )
            
            return result
        except Exception as e:
            self.error_count += 1
            logger.error(
                f"Request failed: {e}",
                extra={'request_count': self.request_count}
            )
            raise

Response Headers

While rate limiting is not currently enforced, future versions may include these headers in responses:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1704067200
X-RateLimit-Resource: user

Prepare your clients to handle these headers:

import logging

logger = logging.getLogger('infrahub_client')

def handle_rate_limit_headers(response):
    # Headers are absent while rate limiting is not enforced, so use .get()
    limit = response.headers.get('X-RateLimit-Limit')
    remaining = response.headers.get('X-RateLimit-Remaining')
    reset = response.headers.get('X-RateLimit-Reset')
    
    if remaining and int(remaining) < 10:
        logger.warning(
            f"Rate limit nearly exhausted: {remaining}/{limit} remaining, "
            f"resets at {reset}"
        )
    
    return response
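A client can also turn the reset header into a wait time. The sketch below assumes X-RateLimit-Reset carries a Unix timestamp in seconds, which is what the example value 1704067200 suggests; since the headers are not yet implemented, that format is an assumption, and `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str], now: Optional[float] = None
) -> Optional[float]:
    """Return how long to wait until the limit resets, or None if unknown."""
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return None
    try:
        reset_at = float(reset)  # assumed: Unix timestamp in seconds
    except ValueError:
        return None
    current = time.time() if now is None else now
    # Never return a negative wait if the reset time has already passed
    return max(0.0, reset_at - current)
```

The `now` parameter exists only to make the function easy to test; production callers can omit it.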

Rate Limit Exceeded Responses

If rate limiting is implemented in future versions, expect these responses:

HTTP 429 Too Many Requests

{
  "errors": ["Rate limit exceeded. Please retry after 60 seconds."],
  "retry_after": 60
}

GraphQL Error

{
  "data": null,
  "errors": [
    {
      "message": "Rate limit exceeded",
      "extensions": {
        "code": "RATE_LIMIT_EXCEEDED",
        "retry_after": 60
      }
    }
  ]
}
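Since the two payload shapes above carry the retry delay in different places, a client may want a single function that handles both. This is a sketch against the hypothetical responses shown here, not a confirmed Infrahub format; `extract_retry_after` is an illustrative name.

```python
from typing import Any, Mapping, Optional


def extract_retry_after(payload: Mapping[str, Any]) -> Optional[float]:
    """Return the retry delay in seconds from either response shape, if any."""
    # HTTP 429 body: the delay sits at the top level
    top_level = payload.get("retry_after")
    if isinstance(top_level, (int, float)):
        return float(top_level)

    # GraphQL error: the delay sits under errors[].extensions
    for error in payload.get("errors") or []:
        if not isinstance(error, dict):
            continue  # plain-string errors carry no extensions
        extensions = error.get("extensions") or {}
        if extensions.get("code") == "RATE_LIMIT_EXCEEDED":
            value = extensions.get("retry_after")
            if isinstance(value, (int, float)):
                return float(value)
    return None
```

The returned value can be fed straight into `time.sleep` before the request is retried.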

Contact

For questions about rate limits or to request increased limits for enterprise deployments, contact your Infrahub administrator or OpsMill support.
