[BETA] Generic Guardrail API - Integrate Without a PR

The Problem

As a guardrail provider, integrating with LiteLLM traditionally requires:

Making a PR to the LiteLLM repository
Waiting for review and merge
Maintaining provider-specific code in LiteLLM's codebase
Updating the integration for changes to your API

The Solution

The Generic Guardrail API lets you integrate with LiteLLM instantly by implementing a simple API endpoint. No PR required.

Key Benefits

No PR Needed - Deploy and integrate immediately
Universal Support - Works across ALL LiteLLM endpoints (chat, embeddings, image generation, etc.)
Simple Contract - One endpoint, three response types
Multi-Modal Support - Handle both text and images in requests/responses
Custom Parameters - Pass provider-specific params via config
Full Control - You own and maintain your guardrail API

Supported Endpoints

The Generic Guardrail API works with the following LiteLLM endpoints:

/v1/chat/completions - OpenAI Chat Completions
/v1/completions - OpenAI Text Completions
/v1/responses - OpenAI Responses API
/v1/images/generations - OpenAI Image Generation
/v1/audio/transcriptions - OpenAI Audio Transcriptions
/v1/audio/speech - OpenAI Text-to-Speech
/v1/messages - Anthropic Messages
/v1/rerank - Cohere Rerank
Pass-through endpoints

How It Works

LiteLLM extracts text and images from any request (chat messages, embeddings, image prompts, etc.)
Sends extracted content + metadata to your API endpoint
Your API responds with: BLOCKED, NONE, or GUARDRAIL_INTERVENED
LiteLLM enforces the decision and applies any modifications

API Contract

Endpoint

Implement POST /beta/litellm_basic_guardrail_api

Request Format

{
  "texts": ["extracted text from the request"],  // array of text strings
  "images": ["base64_encoded_image_data"],  // optional array of images
  "tools": [  // tool calls sent to the LLM (in the OpenAI Chat Completions spec)
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          }
        }
      }
    }
  ],
  "tool_calls": [  // tool calls received from the LLM (in the OpenAI Chat Completions spec)
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\": \"San Francisco\"}"
      }
    }
  ],
  "structured_messages": [  // optional, full messages in OpenAI format (for chat endpoints)
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello"}
  ],
  "request_data": {
    "user_api_key_hash": "hash of the litellm virtual key used",
    "user_api_key_alias": "alias of the litellm virtual key used",
    "user_api_key_user_id": "user id associated with the litellm virtual key used",
    "user_api_key_user_email": "user email associated with the litellm virtual key used",
    "user_api_key_team_id": "team id associated with the litellm virtual key used",
    "user_api_key_team_alias": "team alias associated with the litellm virtual key used",
    "user_api_key_end_user_id": "end user id associated with the litellm virtual key used",
    "user_api_key_org_id": "org id associated with the litellm virtual key used"
  },
  "request_headers": {  // optional: inbound request headers (allowlist). Allowed headers show their value; all others show "[present]" to indicate the header existed.
    "User-Agent": "OpenAI/Python 2.17.0",
    "Content-Type": "application/json",
    "X-Request-Id": "[present]"
  },
  "litellm_version": "1.x.y",  // optional: LiteLLM library version running this proxy
  "input_type": "request",  // "request" or "response"
  "litellm_call_id": "unique_call_id",  // the call id of the individual LLM call
  "litellm_trace_id": "trace_id",  // the trace id of the LLM call - useful if there are multiple LLM calls for the same conversation
  "additional_provider_specific_params": {
    // your custom params from config
  }
}

Response Format

{
  "action": "BLOCKED" | "NONE" | "GUARDRAIL_INTERVENED",
  "blocked_reason": "why content was blocked",  // required if action=BLOCKED
  "texts": ["modified text"],  // optional array of modified text strings
  "images": ["modified_base64_image"]  // optional array of modified images
}

Actions:

BLOCKED - LiteLLM raises error and blocks request
NONE - Request proceeds unchanged
GUARDRAIL_INTERVENED - Request proceeds with modified texts/images (provide texts and/or images fields)

Parameters

`tools` Parameter

The tools parameter provides information about available function/tool definitions in the request.

Format: OpenAI ChatCompletionToolParam format (see OpenAI API reference)

Built-in tools that carry only a type and no function block, such as {"type": "code_interpreter"} or {"type": "file_search", "vector_store_ids": [...]}, are also accepted and forwarded to your endpoint intact (including their tool-specific config). Your endpoint should treat function as optional and branch on type.

Example:

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get the current weather in a location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "City and state, e.g. San Francisco, CA"
        },
        "unit": {
          "type": "string",
          "enum": ["celsius", "fahrenheit"]
        }
      },
      "required": ["location"]
    }
  }
}

Availability:

Input only: Tools are only passed for input_type="request" (pre-call guardrails). Output/response guardrails do not currently receive tool definitions.
Supported endpoints: The tools parameter is supported on: /v1/chat/completions, /v1/responses, and /v1/messages. Other endpoints do not have tool support.

Use cases:

Enforce tool permission policies (e.g., only allow certain users/teams to access specific tools)
Validate tool schemas before sending to LLM
Log tool usage for audit purposes
Block sensitive tools based on user context

`tool_calls` Parameter

The tool_calls parameter contains actual function/tool invocations being made in the request or response.

Format: OpenAI ChatCompletionMessageToolCall format (see OpenAI API reference)

Example:

{
  "id": "call_abc123",
  "type": "function",
  "function": {
    "name": "get_weather",
    "arguments": "{\"location\": \"San Francisco\", \"unit\": \"celsius\"}"
  }
}

Key Difference from tools:

tools = Tool definitions/schemas (what tools are available)
tool_calls = Tool invocations/executions (what tools are being called with what arguments)

Availability:

Both input and output: Tool calls can be present in both input_type="request" (assistant messages requesting tool calls) and input_type="response" (LLM responses with tool calls).
Supported endpoints: The tool_calls parameter is supported on: /v1/chat/completions, /v1/responses, and /v1/messages.

Use cases:

Validate tool call arguments before execution
Redact sensitive data from tool call arguments (e.g., PII)
Log tool invocations for audit/debugging
Block tool calls with dangerous parameters
Modify tool call arguments (e.g., enforce constraints, sanitize inputs)
Monitor tool usage patterns across users/teams

`structured_messages` Parameter

The structured_messages parameter provides the full input in OpenAI chat completion spec format, useful for distinguishing between system and user messages.

Format: Array of OpenAI chat completion messages (see OpenAI API reference)

Example:

[
  {"role": "system", "content": "You are a helpful assistant"},
  {"role": "user", "content": "Hello"}
]

Availability:

Supported endpoints: /v1/chat/completions, /v1/messages, /v1/responses
Input only: Only passed for input_type="request" (pre-call guardrails)

Use cases:

Apply different policies for system vs user messages
Enforce role-based content restrictions
Log structured conversation context

LiteLLM Configuration

Add to config.yaml:

litellm_settings:
  guardrails:
    - guardrail_name: "my-guardrail"
      litellm_params:
        guardrail: generic_guardrail_api
        mode: pre_call  # or post_call, during_call
        api_base: https://your-guardrail-api.com
        api_key: os.environ/YOUR_GUARDRAIL_API_KEY  # optional
        unreachable_fallback: fail_closed  # default: fail_closed. Set to fail_open to proceed if the guardrail endpoint is unreachable (network errors, or HTTP 502/503/504 from an upstream proxy/LB).
        fail_on_error: true  # default: true (fail closed). Set to false to proceed on ANY guardrail error. See "Error handling" below before changing this.
        additional_provider_specific_params:
          # your custom parameters
          threshold: 0.8
          language: "en"

Error handling: `unreachable_fallback` and `fail_on_error`

Two settings control what LiteLLM does when the guardrail itself fails, rather than returning a verdict. They sit on a spectrum from strict to permissive, and they compose:

unreachable_fallback (default fail_closed) only reacts to the guardrail endpoint being unreachable: network errors, timeouts, or an HTTP 502/503/504 from an upstream proxy/load balancer. Set it to fail_open to let requests proceed in just those cases.
fail_on_error (default true) is the broader control. It governs any guardrail error, not only unreachability.

`fail_on_error`	Behavior on a guardrail error
`true` (default)	Fail closed. Any error blocks the request: a non-2xx response, a malformed or unparseable body, a network failure, or an internal serialization/validation error. This preserves LiteLLM's existing behavior
`false`	Fail open (complete). Any guardrail error is downgraded to a critical-level log line and the request proceeds as if the guardrail were not configured

Only a valid guardrail response can act. With fail_on_error: false, a parsed BLOCKED decision still blocks; everything that is not a valid response (errors, malformed bodies, unreachable endpoints) is bypassed. This applies to both the request hook (pre_call) and the response hook (post_call); on the response path, a fail-open returns the already-generated model output, while fail-closed turns a successful generation into an error.

danger

fail_on_error: false is a complete bypass on failure. It means that any failure in the guardrail or its endpoint, for any reason, will cause the guardrail to be skipped for that request rather than block it. Enable it only if you have understood and accepted that tradeoff: choose it when your availability and operational constraints are stronger than your security constraints. If the guardrail is a hard security boundary, leave it at the default true (fail closed).

The default is fail closed precisely because a guardrail is usually a security control. Every fail-open bypass is logged at critical level (Generic Guardrail API error (fail-open) ...) with the call id and trace id, so you can alert on it and audit how often it happens.

Static and dynamic headers

You can send two kinds of headers to your guardrail endpoint:

Static headers (headers): A key/value map sent with every request to your guardrail. Use this for fixed values (e.g. API keys, X-Service-Name). Configure in litellm_params:

litellm_params:
  guardrail: generic_guardrail_api
  api_base: https://your-guardrail-api.com
  headers:
    X-Service-Name: "my-app"
    X-API-Key: "secret"

Dynamic headers (extra_headers): A list of header names that are forwarded from the client request to your guardrail. Only headers in this list (plus a small default allowlist such as x-litellm-*) have their values sent; others are sent as [present]. Use this to pass through client-provided headers (e.g. x-request-id, x-correlation-id). Configure in litellm_params:
```
litellm_params:
  guardrail: generic_guardrail_api
  api_base: https://your-guardrail-api.com
  extra_headers:
    - x-request-id
    - x-correlation-id
    - x-custom-auth
```

This mirrors the MCP static and extra headers behavior.

Example: Pillar Security

Pillar Security uses the Generic Guardrail API to provide comprehensive AI security scanning including prompt injection protection, PII/PCI detection, secret detection, and content moderation.

guardrails:
  - guardrail_name: "pillar-security"
    litellm_params:
      guardrail: generic_guardrail_api
      mode: [pre_call, post_call]
      api_base: https://api.pillar.security/api/v1/integrations/litellm
      api_key: os.environ/PILLAR_API_KEY
      default_on: true
      additional_provider_specific_params:
        plr_mask: true      # Enable automatic masking of sensitive data
        plr_evidence: true  # Include detection evidence in response
        plr_scanners: true  # Include scanner details in response

See the Pillar Security documentation for full configuration options.

Usage

Users apply your guardrail by name:

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "hello"}],
    guardrails=["my-guardrail"]
)

Or with dynamic parameters:

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "hello"}],
    guardrails=[{
        "my-guardrail": {
            "extra_body": {
                "custom_threshold": 0.9
            }
        }
    }]
)

Implementation Example

See mock_bedrock_guardrail_server.py for a complete reference implementation.

Minimal FastAPI example:

from fastapi import FastAPI
from pydantic import BaseModel
from typing import List, Optional, Dict, Any

app = FastAPI()

class GuardrailRequest(BaseModel):
    texts: List[str]
    images: Optional[List[str]] = None
    tools: Optional[List[Dict[str, Any]]] = None  # OpenAI ChatCompletionToolParam format (tool definitions)
    tool_calls: Optional[List[Dict[str, Any]]] = None  # OpenAI ChatCompletionMessageToolCall format (tool invocations)
    structured_messages: Optional[List[Dict[str, Any]]] = None  # OpenAI messages format (for chat endpoints)
    request_data: Dict[str, Any]
    input_type: str  # "request" or "response"
    litellm_call_id: Optional[str] = None
    litellm_trace_id: Optional[str] = None
    additional_provider_specific_params: Dict[str, Any]

class GuardrailResponse(BaseModel):
    action: str  # BLOCKED, NONE, or GUARDRAIL_INTERVENED
    blocked_reason: Optional[str] = None
    texts: Optional[List[str]] = None
    images: Optional[List[str]] = None

@app.post("/beta/litellm_basic_guardrail_api")
async def apply_guardrail(request: GuardrailRequest):
    # Your guardrail logic here
    
    # Example: Check text content
    for text in request.texts:
        if "badword" in text.lower():
            return GuardrailResponse(
                action="BLOCKED",
                blocked_reason="Content contains prohibited terms"
            )
    
    # Example: Check tool definitions (if present in request)
    if request.tools:
        for tool in request.tools:
            if tool.get("type") == "function":
                function_name = tool.get("function", {}).get("name", "")
                # Block sensitive tool definitions
                if function_name in ["delete_data", "access_admin_panel"]:
                    return GuardrailResponse(
                        action="BLOCKED",
                        blocked_reason=f"Tool '{function_name}' is not allowed"
                    )
    
    # Example: Check tool calls (if present in request or response)
    if request.tool_calls:
        for tool_call in request.tool_calls:
            if tool_call.get("type") == "function":
                function_name = tool_call.get("function", {}).get("name", "")
                arguments_str = tool_call.get("function", {}).get("arguments", "{}")
                
                # Parse arguments and validate
                import json
                try:
                    arguments = json.loads(arguments_str)
                    # Block dangerous arguments
                    if "file_path" in arguments and ".." in str(arguments["file_path"]):
                        return GuardrailResponse(
                            action="BLOCKED",
                            blocked_reason="Tool call contains path traversal attempt"
                        )
                except json.JSONDecodeError:
                    pass
    
    # Example: Check structured messages (if present in request)
    if request.structured_messages:
        for message in request.structured_messages:
            if message.get("role") == "system":
                # Apply stricter policies to system messages
                if "admin" in message.get("content", "").lower():
                    return GuardrailResponse(
                        action="BLOCKED",
                        blocked_reason="System message contains restricted terms"
                    )
    
    return GuardrailResponse(action="NONE")

When to Use This

✅ Use Generic Guardrail API when:

You want instant integration without waiting for PRs
You maintain your own guardrail service
You need full control over updates and features
You want to support all LiteLLM endpoints automatically

❌ Make a PR when:

You want deeper integration with LiteLLM internals
Your guardrail requires complex LiteLLM-specific logic
You want to be featured as a built-in provider

Questions?

This is a beta API. We're actively improving it based on feedback. Open an issue or PR if you need additional capabilities.

The Problem​

The Solution​

Key Benefits​

Supported Endpoints​

How It Works​

API Contract​

Endpoint​

Request Format​

Response Format​

Parameters​

tools Parameter​

tool_calls Parameter​

structured_messages Parameter​

LiteLLM Configuration​

Error handling: unreachable_fallback and fail_on_error​

Static and dynamic headers​

Example: Pillar Security​

Usage​

Implementation Example​

When to Use This​

Questions?​

The Problem

The Solution

Key Benefits

Supported Endpoints

How It Works

API Contract

Endpoint

Request Format

Response Format

Parameters

`tools` Parameter

`tool_calls` Parameter

`structured_messages` Parameter

LiteLLM Configuration

Error handling: `unreachable_fallback` and `fail_on_error`

Static and dynamic headers

Example: Pillar Security

Usage

Implementation Example

When to Use This

Questions?