Auto Routing
LiteLLM can automatically select the best model for a request based on rules you define.
LiteLLM Python SDK
Auto routing allows you to define routing rules that automatically select the best model for a request based on the input content. This is useful for directing different types of queries to specialized models.
Setup
- Create a router configuration file (e.g., router.json):
{
  "encoder_type": "openai",
  "encoder_name": "text-embedding-3-large",
  "routes": [
    {
      "name": "litellm-gpt-4.1",
      "utterances": [
        "litellm is great"
      ],
      "description": "positive affirmation",
      "function_schemas": null,
      "llm": null,
      "score_threshold": 0.5,
      "metadata": {}
    },
    {
      "name": "litellm-claude-35",
      "utterances": [
        "how to code a program in [language]"
      ],
      "description": "coding assistant",
      "function_schemas": null,
      "llm": null,
      "score_threshold": 0.5,
      "metadata": {}
    }
  ]
}
- Configure the Router with auto routing models:
import os

from litellm import Router

router = Router(
    model_list=[
        # Embedding model used for routing
        {
            "model_name": "custom-text-embedding-model",
            "litellm_params": {
                "model": "text-embedding-3-large",
                "api_key": os.getenv("OPENAI_API_KEY"),
            },
        },
        # Your target models (model_name must match a route name in router.json)
        {
            "model_name": "litellm-gpt-4.1",
            "litellm_params": {
                "model": "gpt-4.1",
            },
            "model_info": {"id": "openai-id"},
        },
        {
            "model_name": "litellm-claude-35",
            "litellm_params": {
                "model": "claude-3-5-sonnet-latest",
            },
            "model_info": {"id": "claude-id"},
        },
        # Auto router configuration
        {
            "model_name": "auto_router1",
            "litellm_params": {
                "model": "auto_router/auto_router_1",
                "auto_router_config_path": "router.json",
                "auto_router_default_model": "gpt-4o-mini",
                "auto_router_embedding_model": "custom-text-embedding-model",
            },
        },
    ],
)
Usage
Once configured, use the auto router by calling it with your auto router model name:
import asyncio

async def main():
    # This request will be routed to gpt-4.1 based on the utterance match
    response = await router.acompletion(
        model="auto_router1",
        messages=[{"role": "user", "content": "litellm is great"}],
    )
    # This request will be routed to claude-3-5-sonnet-latest for coding queries
    response = await router.acompletion(
        model="auto_router1",
        messages=[{"role": "user", "content": "how to code a program in python"}],
    )

# acompletion is a coroutine, so run it inside an event loop
asyncio.run(main())
Configuration Parameters
- auto_router_config_path: Path to your router.json configuration file
- auto_router_default_model: Fallback model when no route matches
- auto_router_embedding_model: Model used for generating embeddings to match against utterances
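As a minimal sketch of how these three parameters fit together, the helper below assembles a model_list entry shaped like the one in the Setup example. The function name build_auto_router_entry and its router_id argument are hypothetical and not part of LiteLLM; only the dictionary keys and the "auto_router/" prefix come from the example above.

```python
def build_auto_router_entry(model_name, config_path, default_model,
                            embedding_model, router_id="auto_router_1"):
    """Hypothetical helper: build a model_list entry for an auto router."""
    for label, value in (("config_path", config_path),
                         ("default_model", default_model),
                         ("embedding_model", embedding_model)):
        if not value:
            raise ValueError(f"{label} must be a non-empty string")
    return {
        "model_name": model_name,
        "litellm_params": {
            # The "auto_router/" prefix mirrors the Setup example
            "model": f"auto_router/{router_id}",
            "auto_router_config_path": config_path,
            "auto_router_default_model": default_model,
            "auto_router_embedding_model": embedding_model,
        },
    }

entry = build_auto_router_entry(
    model_name="auto_router1",
    config_path="router.json",
    default_model="gpt-4o-mini",
    embedding_model="custom-text-embedding-model",
)
```

The resulting dictionary can be appended to the model_list passed to Router, alongside the embedding model and target model entries.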
Router Configuration Schema
The router.json file supports the following structure:
- encoder_type: Type of encoder (e.g., "openai")
- encoder_name: Name of the embedding model
- routes: Array of routing rules with:
  - name: Target model name (must match a model in your model_list)
  - utterances: Example phrases/patterns to match against
  - description: Human-readable description of the route
  - score_threshold: Minimum similarity score to trigger this route (0.0-1.0)
  - metadata: Additional metadata for the route
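Because each route name must match a model in your model_list, it can help to check a configuration before loading it. The following is a rough sketch of such a check, not a LiteLLM feature: check_router_config is a hypothetical helper, and it assumes every route sets utterances and score_threshold as in the example file.

```python
import json

def check_router_config(config, model_names):
    """Hypothetical pre-flight check; returns a list of problem descriptions."""
    problems = []
    for key in ("encoder_type", "encoder_name", "routes"):
        if key not in config:
            problems.append(f"missing top-level key: {key}")
    for i, route in enumerate(config.get("routes", [])):
        name = route.get("name")
        if name not in model_names:
            problems.append(f"route {i}: name {name!r} is not in model_list")
        if not route.get("utterances"):
            problems.append(f"route {i}: no utterances defined")
        threshold = route.get("score_threshold")
        if not isinstance(threshold, (int, float)) or not 0.0 <= threshold <= 1.0:
            problems.append(f"route {i}: score_threshold must be between 0.0 and 1.0")
    return problems

sample = json.loads("""
{
  "encoder_type": "openai",
  "encoder_name": "text-embedding-3-large",
  "routes": [
    {"name": "litellm-gpt-4.1", "utterances": ["litellm is great"], "score_threshold": 0.5},
    {"name": "litellm-claude-35", "utterances": [], "score_threshold": 1.5}
  ]
}
""")

# Only one of the two route names is registered in this illustrative model_list
problems = check_router_config(sample, {"litellm-gpt-4.1"})
for problem in problems:
    print(problem)
```

In practice the set of model names would be collected from the model_name values in the model_list you pass to Router.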
LiteLLM Proxy Server
Setup
Navigate to the LiteLLM UI and go to Models+Endpoints > Add Model > Auto Router Tab.
Configure the following required fields:
- Auto Router Name - The model name that developers will use when making LLM API requests to LiteLLM
- Default Model - The fallback model used when no route is matched (e.g., if set to "gpt-4o-mini", unmatched requests will be routed to gpt-4o-mini)
- Embedding Model - The model used to generate embeddings for input messages. These embeddings are used to semantically match input against the utterances defined in your routes
Route Configuration
Click Add Route to create a new routing rule. Each route consists of utterances that are matched against input messages to determine the target model.
Configure each route with:
- Utterances - Example phrases that will trigger this route. Use placeholders in brackets for variables:
"how to code a program in [language]",
"can you explain this [language] code",
"can you explain this [language] script",
"can you convert this [language] code to [target_language]"
- Description - A human-readable description of what this route handles
- Score Threshold - The minimum similarity score (0.0-1.0) required to trigger this route
Usage
Once the auto router has been added, developers set model="auto_router1" in the model field of their LLM API requests.
Examples follow for the OpenAI Python SDK (v1.0.0+) and for curl:
import openai

client = openai.OpenAI(
    api_key="sk-1234",  # replace with your LiteLLM API key
    base_url="http://localhost:4000"
)

# This request will be auto-routed based on the content
response = client.chat.completions.create(
    model="auto_router1",
    messages=[
        {
            "role": "user",
            "content": "how to code a program in python"
        }
    ]
)
print(response)
curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -d '{
    "model": "auto_router1",
    "messages": [{"role": "user", "content": "how to code a program in python"}]
  }'
How It Works
- When a request comes in, LiteLLM generates embeddings for the input message
- It compares these embeddings against the utterances defined in your routes
- If a route's similarity score exceeds the threshold, the request is routed to that model
- If no route matches, the request goes to the default model
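The steps above can be illustrated with a small self-contained sketch. It is not LiteLLM's implementation: the embed function is a toy bag-of-words stand-in for the configured embedding model, and picking the highest-scoring route when several pass their threshold is an assumption. The names embed, cosine, and pick_model are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: lowercase word counts
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def pick_model(message, routes, default_model):
    query = embed(message)
    best_name, best_score = default_model, 0.0
    for route in routes:
        # A route's score is its best-matching utterance
        score = max((cosine(query, embed(u)) for u in route["utterances"]),
                    default=0.0)
        # Assumption: among routes above their threshold, the highest score wins
        if score > route["score_threshold"] and score > best_score:
            best_name, best_score = route["name"], score
    return best_name

routes = [
    {"name": "litellm-gpt-4.1",
     "utterances": ["litellm is great"],
     "score_threshold": 0.5},
    {"name": "litellm-claude-35",
     "utterances": ["how to code a program in [language]"],
     "score_threshold": 0.5},
]

print(pick_model("how to code a program in python", routes, "gpt-4o-mini"))
print(pick_model("what is the weather today", routes, "gpt-4o-mini"))
```

With real embeddings, semantically similar messages can match even when they share no words with an utterance, which the word-count stand-in here cannot capture.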