[Preview] v1.80.8.rc.1 - Introducing A2A Agent Gateway
Deploy this version
Docker:
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.80.8.rc.1
```
Pip:
```shell
pip install litellm==1.80.8
```
Key Highlights
- Agent Gateway (A2A) - Invoke agents through the AI Gateway with request/response logging and access controls
- Guardrails API v2 - Generic Guardrail API with streaming support, structured messages, and tool call checks
- Customer (End User) Usage UI - Track and visualize end-user spend directly in the dashboard
- vLLM Batch + Files API - Support for batch and files API with vLLM deployments
- Dynamic Rate Limiting on Teams - Enable dynamic rate limits and priority reservation at the team level
- Google Cloud Chirp3 HD - New text-to-speech provider with Chirp3 HD voices
Agent Gateway (A2A)
This release introduces A2A Agent Gateway for LiteLLM, allowing you to invoke and manage A2A agents with the same controls you have for LLM APIs.
As a LiteLLM Gateway Admin, you can now do the following:
- Request/Response Logging - Every agent invocation is logged to the Logs page with full request and response tracking.
- Access Control - Control which Team/Key can access which agents.
As a developer, you can keep using the A2A SDK; all you need to do is point your A2AClient at the LiteLLM proxy URL and use your LiteLLM API key.
Works with the A2A SDK:
```python
from a2a.client import A2AClient

client = A2AClient(
    base_url="http://localhost:4000",  # Your LiteLLM proxy
    api_key="sk-1234"                  # LiteLLM API key
)

response = client.send_message(
    agent_id="my-agent",
    message="What's the status of my order?"
)
```
Get started with Agent Gateway here: Agent Gateway Documentation
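If you prefer plain HTTP over the SDK, agents can also be invoked via the new `/v1/agents/invoke` endpoint (listed in the endpoints table below). The sketch below uses `requests`; the JSON body fields simply mirror the SDK example above and are illustrative only, so check the Agent Gateway documentation for the authoritative request schema.
```python
import requests

# Minimal sketch: invoke an A2A agent through the LiteLLM proxy over HTTP.
# NOTE: the JSON body fields mirror the SDK example above and are illustrative;
# see the Agent Gateway docs for the authoritative schema.
resp = requests.post(
    "http://localhost:4000/v1/agents/invoke",     # Your LiteLLM proxy
    headers={"Authorization": "Bearer sk-1234"},   # LiteLLM API key
    json={
        "agent_id": "my-agent",
        "message": "What's the status of my order?",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```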
Customer (End User) Usage UI
Users can now filter usage statistics by customers, providing the same granular filtering capabilities available for teams and organizations.
Details:
- Filter usage analytics, spend logs, and activity metrics by customer ID
- View customer-level breakdowns alongside existing team and user-level filters
- Consistent filtering experience across all usage and analytics views
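Customer-level usage is keyed off the end-user identifier sent with each request. The sketch below (proxy URL, API key, and model name are placeholders) passes the standard `user` field through the OpenAI SDK pointed at the LiteLLM proxy, which is how end-user (customer) spend is attributed:
```python
from openai import OpenAI

# Point the OpenAI SDK at the LiteLLM proxy (placeholder URL and key).
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# LiteLLM attributes end-user (customer) spend to the `user` field,
# so this request should appear under customer "customer-42" in the Usage UI.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # any model configured on your proxy
    messages=[{"role": "user", "content": "Hello!"}],
    user="customer-42",
)
print(response.choices[0].message.content)
```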
New Providers and Endpoints
New Providers (5 new providers)
| Provider | Supported LiteLLM Endpoints | Description |
|---|---|---|
| Z.AI (Zhipu AI) | /v1/chat/completions, /v1/responses, /v1/messages | Built-in support for Zhipu AI GLM models |
| RAGFlow | /v1/chat/completions, /v1/responses, /v1/messages, /v1/vector_stores | RAG-based chat completions with vector store support |
| PublicAI | /v1/chat/completions, /v1/responses, /v1/messages | OpenAI-compatible provider via JSON config |
| Google Cloud Chirp3 HD | /v1/audio/speech, /v1/audio/speech/stream | Text-to-speech with Google Cloud Chirp3 HD voices |
New LLM API Endpoints (2 new endpoints)
| Endpoint | Method | Description | Documentation |
|---|---|---|---|
| /v1/agents/invoke | POST | Invoke A2A agents through the AI Gateway | Agent Gateway |
| /cursor/chat/completions | POST | Cursor BYOK endpoint - accepts Responses API input, returns Chat Completions output | Cursor Integration |
New Models / Updated Models
New Model Support (33 new models)
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| OpenAI | gpt-5.1-codex-max | 400K | $1.25 | $10.00 | Reasoning, vision, PDF input, responses API |
| Azure | azure/gpt-5.1-codex-max | 400K | $1.25 | $10.00 | Reasoning, vision, PDF input, responses API |
| Anthropic | claude-opus-4-5 | 200K | $5.00 | $25.00 | Computer use, reasoning, vision |
| Bedrock | global.anthropic.claude-opus-4-5-20251101-v1:0 | 200K | $5.00 | $25.00 | Computer use, reasoning, vision |
| Bedrock | amazon.nova-2-lite-v1:0 | 1M | $0.30 | $2.50 | Reasoning, vision, video, PDF input |
| Bedrock | amazon.titan-image-generator-v2:0 | - | - | $0.008/image | Image generation |
| Fireworks | fireworks_ai/deepseek-v3p2 | 164K | $1.20 | $1.20 | Function calling, response schema |
| Fireworks | fireworks_ai/kimi-k2-instruct-0905 | 262K | $0.60 | $2.50 | Function calling, response schema |
| DeepSeek | deepseek/deepseek-v3.2 | 164K | $0.28 | $0.40 | Reasoning, function calling |
| Mistral | mistral/mistral-large-3 | 256K | $0.50 | $1.50 | Function calling, vision |
| Azure AI | azure_ai/mistral-large-3 | 256K | $0.50 | $1.50 | Function calling, vision |
| Moonshot | moonshot/kimi-k2-0905-preview | 262K | $0.60 | $2.50 | Function calling, web search |
| Moonshot | moonshot/kimi-k2-turbo-preview | 262K | $1.15 | $8.00 | Function calling, web search |
| Moonshot | moonshot/kimi-k2-thinking-turbo | 262K | $1.15 | $8.00 | Function calling, web search |
| OpenRouter | openrouter/deepseek/deepseek-v3.2 | 164K | $0.28 | $0.40 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-haiku-4-5 | 200K | $1.00 | $5.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-opus-4 | 200K | $15.00 | $75.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-opus-4-1 | 200K | $15.00 | $75.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-opus-4-5 | 200K | $5.00 | $25.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-sonnet-4 | 200K | $3.00 | $15.00 | Reasoning, function calling |
| Databricks | databricks/databricks-claude-sonnet-4-1 | 200K | $3.00 | $15.00 | Reasoning, function calling |
| Databricks | databricks/databricks-gemini-2-5-flash | 1M | $0.30 | $2.50 | Function calling |
| Databricks | databricks/databricks-gemini-2-5-pro | 1M | $1.25 | $10.00 | Function calling |
| Databricks | databricks/databricks-gpt-5 | 400K | $1.25 | $10.00 | Function calling |
| Databricks | databricks/databricks-gpt-5-1 | 400K | $1.25 | $10.00 | Function calling |
| Databricks | databricks/databricks-gpt-5-mini | 400K | $0.25 | $2.00 | Function calling |
| Databricks | databricks/databricks-gpt-5-nano | 400K | $0.05 | $0.40 | Function calling |
| Vertex AI | vertex_ai/chirp | - | $30.00/1M chars | - | Text-to-speech (Chirp3 HD) |
| Z.AI | zai/glm-4.6 | 200K | $0.60 | $2.20 | Function calling |
| Z.AI | zai/glm-4.5 | 128K | $0.60 | $2.20 | Function calling |
| Z.AI | zai/glm-4.5v | 128K | $0.60 | $1.80 | Function calling, vision |
| Z.AI | zai/glm-4.5-flash | 128K | Free | Free | Function calling |
| Vertex AI | vertex_ai/bge-large-en-v1.5 | - | - | - | BGE Embeddings |
Features
Azure
- Allow reasoning_effort='none' for Azure gpt-5.1 models - PR #17311
Bedrock
- Add Nova embedding support - PR #17253
- Add support for Bedrock Qwen 2 imported model - PR #17461
- Bedrock OpenAI model support - PR #17368
- Add support for file content download for Bedrock batches - PR #17470
- Make streaming chunk size configurable in Bedrock API - PR #17357
- Add experimental latest-user filtering for Bedrock - PR #17282
- Handle Cohere v4 embed response dictionary format - PR #17220
- Remove incompatible beta header from Bedrock - PR #17301
- Add model price and details for Global Opus 4.5 Bedrock endpoint - PR #17380
Gemini (Google AI Studio + Vertex AI)
- Add better handling in image generation for Gemini models - PR #17292
- Fix reasoning_content showing duplicate content in streaming responses - PR #17266
- Handle partial JSON chunks after first valid chunk - PR #17496
- Fix Gemini 3 last chunk thinking block - PR #17403
- Fix Gemini image_tokens treated as text tokens in cost calculation - PR #17554
- Make sure that media resolution is only for Gemini 3 model - PR #17137
Z.AI
- Add Z.AI as built-in provider - PR #17307
Databricks
- Update Databricks model pricing and add new models - PR #17277
OVHcloud
- Add support for audio transcription with OVHcloud - PR #17305
Mistral
- Add Mistral Large 3 model support - PR #17547
Moonshot
- Fix missing Moonshot turbo models and fix incorrect pricing - PR #17432
Together AI
- Add context window exception mapping for Together AI - PR #17284
DeepSeek
- Support Deepseek 3.2 with Reasoning - PR #17384
Amazon Nova
- Add Nova Lite 2 reasoning support with reasoningConfig - PR #17371
Ollama
- Fix auth not working with ollama.com - PR #17191
General
- Fix supports_response_schema before using json_tool_call workaround - PR #17438
vLLM
- Fix empty response + vLLM streaming - PR #17516
TwelveLabs
- Add support for TwelveLabs Pegasus video understanding - PR #17193
Bug Fixes
Bedrock
- Fix extra_headers in messages API bedrock invoke - PR #17271
- Fix Bedrock models in model map - PR #17419
- Make Bedrock converse messages respect modify_params as expected - PR #17427
- Fix Anthropic beta headers for Bedrock imported Qwen models - PR #17467
- Preserve usage from JSON response for OpenAI provider in Bedrock - PR #17589
SambaNova
- Fix acompletion throwing an error with SambaNova models - PR #17217
General
LLM API Endpoints
Features
- Add passthrough cost tracking for Veo - PR #17296
- Add missing OCR and aOCR to CallTypes enum - PR #17435
General
- Support routing to only websearch supported deployments - PR #17500
Bugs
General
Management Endpoints / UI
Features
New Login Page
Customer (End User) Usage
Virtual Keys
Models + Endpoints
Callbacks
Management Routes
OCI Configuration
- Enable Oracle Cloud Infrastructure configuration via UI - PR #17159
Bugs
UI Fixes
- Fix Request and Response Panel JSONViewer - PR #17233
- Adding Button Loading States to Edit Settings - PR #17236
- Fix Various Text, button state, and test changes - PR #17237
- Fix Fallbacks Immediately Deleting before API resolves - PR #17238
- Remove Feature Flags - PR #17240
- Fix metadata tags and model name display in UI for Azure passthrough - PR #17258
- Change labeling around Vertex Fields - PR #17383
- Remove second scrollbar when sidebar is expanded + tooltip z index - PR #17436
- Fix Select in Edit Membership Modal - PR #17524
- Change useAuthorized Hook to redirect to new Login Page - PR #17553
SSO
Auth / JWT
- JWT Auth - Allow using regular OIDC flow with user info endpoints - PR #17324
- Fix litellm user auth not passing issue - PR #17342
- Add other routes in JWT auth - PR #17345
- Fix new org team validate against org - PR #17333
- Fix litellm_enterprise ensure imported routes exist - PR #17337
- Use organization.members instead of deprecated organization field - PR #17557
Organizations/Teams
AI Integrations (2 new integrations)
Logging (1 new integration)
New Integration
Improvements & Fixes
Datadog
- Fix Datadog callback regression when ddtrace is installed - PR #17393
Arize Phoenix
- Fix clean arize-phoenix traces - PR #16611
MLflow
- Fix MLflow streaming spans for Anthropic passthrough - PR #17288
Langfuse
- Fix Langfuse logger test mock setup - PR #17591
General
- Improve PII anonymization handling in logging callbacks - PR #17207
Guardrails (1 new integration)
New Integration
- Generic Guardrail API
- Generic Guardrail API - allows guardrail providers to add INSTANT support for LiteLLM w/out PR to repo - PR #17175
- Guardrails API V2 - user api key metadata, session id, specify input type (request/response), image support - PR #17338
- Guardrails API - add streaming support - PR #17400
- Guardrails API - support tool call checks on OpenAI `/chat/completions`, OpenAI `/responses`, Anthropic `/v1/messages` - PR #17459
- Guardrails API - new `structured_messages` param - PR #17518
- Correctly map a v1/messages call to the anthropic unified guardrail - PR #17424
- Support during_call event type for unified guardrails - PR #17514
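To make the "INSTANT support without a PR" item above concrete: a guardrail provider only needs to expose an HTTP endpoint that LiteLLM can call with request or response content. The sketch below is a hypothetical provider-side endpoint; the route, field names, and verdict values are assumptions for illustration, not the actual Generic Guardrail API schema (which, including `structured_messages` and tool call checks, is defined in the Guardrails API docs).
```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Illustrative payload shapes only -- the real Generic Guardrail API schema
# (structured_messages, input type, tool call checks, etc.) is defined in
# the LiteLLM Guardrails API documentation.
class GuardrailCheckRequest(BaseModel):
    messages: list[dict]

class GuardrailCheckResponse(BaseModel):
    action: str                 # hypothetical verdict, e.g. "allow" or "block"
    reason: str | None = None

@app.post("/guardrail/check")   # hypothetical route
async def check(req: GuardrailCheckRequest) -> GuardrailCheckResponse:
    # Toy rule: block any message that looks like it contains card data.
    flagged = any(
        "credit card" in str(m.get("content", "")).lower() for m in req.messages
    )
    if flagged:
        return GuardrailCheckResponse(action="block", reason="possible PCI data")
    return GuardrailCheckResponse(action="allow")
```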
Improvements & Fixes
Noma
- Refactor Noma guardrail to use shared Responses transformation and include system instructions - PR #17315
AIM
- Fix AIM guardrail tests - PR #17499
Bedrock Guardrails
- Fix Bedrock Guardrail indent and import - PR #17378
General Guardrails
Secret Managers
- Allow setting SSL verify to false - PR #17433
General
- Make email and secret manager operations independent in key management hooks - PR #17551
Spend Tracking, Budgets and Rate Limiting
Rate Limiting
Spend Logs
Enforce User Param
- Apply enforce_user_param enforcement to OpenAI POST endpoints - PR #17407
MCP Gateway
MCP Configuration
MCP Tool Results
- Preserve tool metadata in CallToolResult - PR #17561
Agent Gateway (A2A)
Agent Invocation
Agent Access Control
- Enforce Allowed agents by key, team + add agent access groups on backend - PR #17502
Agent Gateway UI
Performance / Loadbalancing / Reliability improvements
Audio/Speech Performance
- Fix `/audio/speech` performance by using `shared_sessions` - PR #16739
Memory Optimization
Database
Proxy Caching
- Fix proxy caching between requests in aiohttp transport - PR #17122
Session Management
Vector Store
- Fix vector store configuration synchronization failure - PR #17525
Documentation Updates
Provider Documentation
Guides
Projects
Cleanup
Infrastructure / CI/CD
Helm Chart
- Add ingress-only labels - PR #17348
Docker
OpenAPI Schema
- Refactor add_schema_to_components to move definitions to components/schemas - PR #17389
Security
New Contributors
- @weichiet made their first contribution in PR #17242
- @AndyForest made their first contribution in PR #17220
- @omkar806 made their first contribution in PR #17217
- @v0rtex20k made their first contribution in PR #17178
- @hxomer made their first contribution in PR #17207
- @orgersh92 made their first contribution in PR #17316
- @dannykopping made their first contribution in PR #17313
- @rioiart made their first contribution in PR #17333
- @codgician made their first contribution in PR #17278
- @epistoteles made their first contribution in PR #17277
- @kothamah made their first contribution in PR #17368
- @flozonn made their first contribution in PR #17371
- @richardmcsong made their first contribution in PR #17389
- @matt-greathouse made their first contribution in PR #17384
- @mossbanay made their first contribution in PR #17380
- @mhielpos-asapp made their first contribution in PR #17376
- @Joilence made their first contribution in PR #17367
- @deepaktammali made their first contribution in PR #17357
- @axiomofjoy made their first contribution in PR #16611
- @DevajMody made their first contribution in PR #17445
- @andrewtruong made their first contribution in PR #17439
- @AnasAbdelR made their first contribution in PR #17490
- @dominicfeliton made their first contribution in PR #17516
- @kristianmitk made their first contribution in PR #17504
- @rgshr made their first contribution in PR #17130
- @dominicfallows made their first contribution in PR #17489
- @irfansofyana made their first contribution in PR #17467
- @GusBricker made their first contribution in PR #17191
- @OlivverX made their first contribution in PR #17255
- @withsmilo made their first contribution in PR #17585

