Auto Routing

LiteLLM can automatically select the best model for a request based on routing rules you define.

LiteLLM Python SDK

Auto routing allows you to define routing rules that automatically select the best model for a request based on the input content. This is useful for directing different types of queries to specialized models.

Setup

  1. Create a router configuration file (e.g., router.json):
{
  "encoder_type": "openai",
  "encoder_name": "text-embedding-3-large",
  "routes": [
    {
      "name": "litellm-gpt-4.1",
      "utterances": [
        "litellm is great"
      ],
      "description": "positive affirmation",
      "function_schemas": null,
      "llm": null,
      "score_threshold": 0.5,
      "metadata": {}
    },
    {
      "name": "litellm-claude-35",
      "utterances": [
        "how to code a program in [language]"
      ],
      "description": "coding assistant",
      "function_schemas": null,
      "llm": null,
      "score_threshold": 0.5,
      "metadata": {}
    }
  ]
}
  2. Configure the Router with auto routing models:
from litellm import Router
import os

router = Router(
    model_list=[
        # Embedding model used for routing
        {
            "model_name": "custom-text-embedding-model",
            "litellm_params": {
                "model": "text-embedding-3-large",
                "api_key": os.getenv("OPENAI_API_KEY"),
            },
        },
        # Your target models
        {
            "model_name": "litellm-gpt-4.1",
            "litellm_params": {
                "model": "gpt-4.1",
            },
            "model_info": {"id": "openai-id"},
        },
        {
            "model_name": "litellm-claude-35",
            "litellm_params": {
                "model": "claude-3-5-sonnet-latest",
            },
            "model_info": {"id": "claude-id"},
        },
        # Auto router configuration
        {
            "model_name": "auto_router1",
            "litellm_params": {
                "model": "auto_router/auto_router_1",
                "auto_router_config_path": "router.json",
                "auto_router_default_model": "gpt-4o-mini",
                "auto_router_embedding_model": "custom-text-embedding-model",
            },
        },
    ],
)

Usage

Once configured, use the auto router by calling it with your auto router model name:

# This request will be routed to gpt-4.1 based on the utterance match
response = await router.acompletion(
    model="auto_router1",
    messages=[{"role": "user", "content": "litellm is great"}],
)

# This request will be routed to claude-3-5-sonnet-latest for coding queries
response = await router.acompletion(
    model="auto_router1",
    messages=[{"role": "user", "content": "how to code a program in python"}],
)

Configuration Parameters

  • auto_router_config_path: Path to your router.json configuration file
  • auto_router_default_model: Fallback model when no route matches
  • auto_router_embedding_model: Model used for generating embeddings to match against utterances
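
For reference, these parameters map onto the auto router entry from the Setup section as follows (comments added for illustration):

{
    "model_name": "auto_router1",
    "litellm_params": {
        "model": "auto_router/auto_router_1",
        "auto_router_config_path": "router.json",    # routing rules file
        "auto_router_default_model": "gpt-4o-mini",  # used when no route matches
        "auto_router_embedding_model": "custom-text-embedding-model",  # embeds inputs for utterance matching
    },
}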

Router Configuration Schema

The router.json file supports the following structure:

  • encoder_type: Type of encoder (e.g., "openai")
  • encoder_name: Name of the embedding model
  • routes: Array of routing rules, each with:
    • name: Target model name (must match a model in your model_list)
    • utterances: Example phrases/patterns to match against
    • description: Human-readable description of the route
    • function_schemas and llm: Optional advanced fields; set to null in the example above
    • score_threshold: Minimum similarity score required to trigger this route (0.0-1.0)
    • metadata: Additional metadata for the route

LiteLLM Proxy Server

Setup

Navigate to the LiteLLM UI and go to Models+Endpoints > Add Model > Auto Router Tab.

Configure the following required fields:

  • Auto Router Name - The model name that developers will use when making LLM API requests to LiteLLM
  • Default Model - The fallback model used when no route is matched (e.g., if set to "gpt-4o-mini", unmatched requests will be routed to gpt-4o-mini)
  • Embedding Model - The model used to generate embeddings for input messages. These embeddings are used to semantically match input against the utterances defined in your routes
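
If you manage the proxy through config.yaml rather than the UI, the same three fields are expressed as litellm_params on an auto router entry. A sketch, assuming the SDK parameter names shown earlier carry over to the proxy config:

model_list:
  - model_name: auto_router1                  # Auto Router Name
    litellm_params:
      model: auto_router/auto_router_1
      auto_router_config_path: router.json
      auto_router_default_model: gpt-4o-mini  # Default Model
      auto_router_embedding_model: custom-text-embedding-model  # Embedding Model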

Route Configuration



Click Add Route to create a new routing rule. Each route consists of utterances that are matched against input messages to determine the target model.

Configure each route with:

  • Utterances - Example phrases that will trigger this route. Use placeholders in brackets for variables:
"how to code a program in [language]",
"can you explain this [language] code",
"can you explain this [language] script",
"can you convert this [language] code to [target_language]"
  • Description - A human-readable description of what this route handles
  • Score Threshold - The minimum similarity score (0.0-1.0) required to trigger this route

Usage

Once the auto router is added, developers set model="auto_router1" in the model field of the LLM API request.

import openai

client = openai.OpenAI(
    api_key="sk-1234",  # replace with your LiteLLM API key
    base_url="http://localhost:4000",
)

# This request will be auto-routed based on the content
response = client.chat.completions.create(
    model="auto_router1",
    messages=[
        {
            "role": "user",
            "content": "how to code a program in python",
        }
    ],
)

print(response)
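
Assuming the proxy echoes the resolved deployment back in the standard OpenAI response shape, you can check which underlying model served the request:

# Which model did the auto router pick?
print(response.model)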

How It Works

  1. When a request comes in, LiteLLM generates embeddings for the input message
  2. It compares these embeddings against the utterances defined in your routes
  3. If a route's similarity score exceeds the threshold, the request is routed to that model
  4. If no route matches, the request goes to the default model
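
Conceptually, the matching step looks like the sketch below. This is an illustration of the technique, not LiteLLM's internal code; embed() is a hypothetical helper standing in for the configured embedding model:

import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical helper: return the embedding vector for `text`
    # (in LiteLLM this comes from auto_router_embedding_model).
    raise NotImplementedError

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pick_route(message: str, routes: list[dict], default_model: str) -> str:
    # Best-scoring route above its threshold wins; otherwise fall back
    # to the default model.
    msg_vec = embed(message)
    best_model, best_score = default_model, 0.0
    for route in routes:
        # Score a route by its closest utterance to the input message
        score = max(cosine(msg_vec, embed(u)) for u in route["utterances"])
        if score >= route["score_threshold"] and score > best_score:
            best_model, best_score = route["name"], score
    return best_model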