v1.82.0 - Realtime Guardrails, Projects Management, and 10+ Performance Optimizations
Deploy this version

Docker:

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-1.82.0-stable
```

Pip:

```shell
pip install litellm==1.82.0
```
Key Highlights
- Realtime API guardrails — Full guardrails support for `/v1/realtime` WebSocket sessions with pre/post-call enforcement, voice transcription hooks, session termination policies, and Vertex AI Gemini Live support - PR #22152, PR #22153, PR #22161, PR #22165
- Projects Management — New Projects UI with full CRUD, project-scoped virtual keys, and an admin opt-in toggle — organize teams and keys by project - PR #22315, PR #22360, PR #22373, PR #22412
- Guardrail ecosystem expansion — Noma v2, Lakera v2 post-call, Singapore regulatory policies (PDPA + MAS), employment discrimination blockers, code execution blocker, guardrail policy versioning, and production monitoring - PR #21400, PR #21783, PR #21948
- OpenAI Codex 5.3 — day 0 — Full support for `gpt-5.3-codex` on OpenAI and Azure, plus `gpt-audio-1.5` and `gpt-realtime-1.5` model coverage - PR #22035
- 10+ performance optimizations — Streaming hot-path fixes, Redis pipeline batching, database task batching, ModelResponse init skip, and router cache improvements — lower latency and CPU on every request
- `/v1/messages` → `/responses` routing — `/v1/messages` requests are now routed to the Responses API by default for OpenAI/Azure models
v1/messages routing change
This version starts routing `/v1/messages` requests to the `/responses` API by default. To opt out and continue using `chat/completions`, set the environment variable `LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES=true` or `litellm_settings.use_chat_completions_url_for_anthropic_messages: true` in your config.
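As a config-file sketch, the opt-out flag named above sits under `litellm_settings` in the proxy's `config.yaml`:

```yaml
# config.yaml — keep routing /v1/messages to chat/completions
litellm_settings:
  use_chat_completions_url_for_anthropic_messages: true
```

The environment-variable form has the same effect and is convenient for containerized deployments where the config file is not easily edited.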
New Models / Updated Models
New Model Support (20 new models)
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| OpenAI | gpt-5.3-codex | 272K | $1.75 | $14.00 | Reasoning, coding |
| Azure OpenAI | azure/gpt-5.3-codex | 272K | $1.75 | $14.00 | Azure deployment |
| OpenAI | gpt-audio-1.5 | 128K | $2.50 | $10.00 | Audio model |
| Azure OpenAI | azure/gpt-audio-1.5-2026-02-23 | 128K | $2.50 | $10.00 | Audio model |
| OpenAI | gpt-realtime-1.5 | 32K | $4.00 | $16.00 | Realtime model |
| Azure OpenAI | azure/gpt-realtime-1.5-2026-02-23 | 32K | $4.00 | $16.00 | Realtime model |
| Groq | groq/openai/gpt-oss-safeguard-20b | 131K | $0.075 | $0.30 | Guardrail inference |
| Google Vertex AI | vertex_ai/gemini-3.1-flash-image-preview | - | - | - | Image generation |
| Perplexity | perplexity/perplexity/sonar | - | - | - | Sonar search |
| Perplexity | perplexity/openai/gpt-5.1 | - | - | - | Hosted routing |
| Perplexity | perplexity/openai/gpt-5-mini | - | - | - | Hosted routing |
| Perplexity | perplexity/google/gemini-2.5-flash | - | - | - | Hosted routing |
| Perplexity | perplexity/google/gemini-2.5-pro | - | - | - | Hosted routing |
| Perplexity | perplexity/google/gemini-3-flash-preview | - | - | - | Hosted routing |
| Perplexity | perplexity/google/gemini-3-pro-preview | - | - | - | Hosted routing |
| Perplexity | perplexity/anthropic/claude-haiku-4-5 | - | - | - | Hosted routing |
| Perplexity | perplexity/anthropic/claude-sonnet-4-5 | - | - | - | Hosted routing |
| Perplexity | perplexity/anthropic/claude-opus-4-5 | - | - | - | Hosted routing |
| Perplexity | perplexity/anthropic/claude-opus-4-6 | - | - | - | Hosted routing |
| Perplexity | perplexity/xai/grok-4-1-fast-non-reasoning | - | - | - | Hosted routing |
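To route to one of the new models through the proxy, a minimal `model_list` entry could look like the sketch below. The config schema is LiteLLM's standard proxy format; the alias and the environment variable name are illustrative:

```yaml
# config.yaml — sketch of a deployment entry for a model from the table above
model_list:
  - model_name: gpt-5.3-codex            # alias that clients request
    litellm_params:
      model: openai/gpt-5.3-codex        # provider/model identifier
      api_key: os.environ/OPENAI_API_KEY # resolved from the environment
```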
Features
- New Azure OpenAI models 2026-02-25 - PR #22114
- Adjust `mistral-small-2503` input/output cost per token - PR #22097
- Add `groq/openai/gpt-oss-safeguard-20b` model pricing - PR #21951
- Update AIML model pricing - PR #22139
- Thread `api_base` to `get_model_info` + graceful fallback - PR #21970
- Fix function calling for PublicAI Apertus models - PR #21582
- Add deprecation dates for `grok-2-vision-1212` and `grok-3-mini` models - PR #20102
Bug Fixes
- Fix converse handling for `parallel_tool_calls` - PR #22267
- Restore `parallel_tool_calls` mapping in `map_openai_params` - PR #22333
- Correct `modelInput` format for Converse API batch models - PR #21656
- Prevent double UUID in `create_file` S3 key - PR #21650
- Filter internal `json_tool_call` when mixed with real tools - PR #21107
- Pass timeout param to Bedrock rerank HTTP client - PR #22021
- Fix model cost map for anthropic fast and `inference_geo` - PR #21904
LLM API Endpoints
Features
- Guardrails support for `/v1/realtime` WebSocket endpoint - PR #22152
- Vertex AI Gemini Live via unified `/realtime` endpoint - PR #22153
- Guardrails with `pre_call`/`post_call` mode on realtime WebSocket - PR #22161
- `end_session_after_n_fails` + Endpoint Settings wizard step - PR #22165
- Guardrail hook for voice transcription - PR #21976
- Fix guardrails not firing for Gemini/Vertex AI and `provider_config` realtime sessions - PR #22168
- Add logging, spend tracking support + tool tracing - PR #22105
- Enable local file support for OCR - PR #22133
- Preserve thinking blocks in agentic loop follow-up messages - PR #21604
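As a sketch of how a realtime guardrail could be wired up: the `guardrails` block below follows LiteLLM's standard config shape, but the guardrail name and provider value are illustrative, not taken from this release:

```yaml
# config.yaml — hypothetical guardrail enforced on realtime sessions
guardrails:
  - guardrail_name: "realtime-content-guard"  # illustrative name
    litellm_params:
      guardrail: lakera_v2   # any supported guardrail provider
      mode: "pre_call"       # run before the model call; "post_call" also supported
```

With a guardrail configured this way, the new realtime support means the same policy applies to `/v1/realtime` WebSocket turns as to regular completion requests.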
Bugs
- Fix mypy attr-defined errors on realtime websocket calls - PR #22202
Management Endpoints / UI
Features
- Projects
  - Add Projects page with list and create flows - PR #22315
  - Add Project Details page with edit modal - PR #22360
  - Add project keys table and project dropdown on key create/edit - PR #22373
  - Add delete project action to Projects table - PR #22412
  - Add Projects Opt-In Toggle in Admin Settings - PR #22416
  - Include `created_at` and `updated_at` in `/project/list` response - PR #22323
  - Add tags in project - PR #22216
- Virtual Keys + Access Groups
  - Add bidirectional team/key sync for Access Group CRUD flows - PR #22253
  - Add pagination and search to `/key/aliases` to prevent OOMs - PR #22137
  - Add paginated key alias selector in UI - PR #22157
  - Add `project_id` and `access_group_id` filters for key list endpoint - PR #22356
  - Add KeyInfoHeader component - PR #22047
  - Restrict Edit Settings to key owners - PR #21985
  - Fix virtual key grace period from env/UI - PR #20321
- Agents
- Proxy Auth / SSO
- Usage / Spend Logs
  - Add user filtering to usage page - PR #22059
  - Allow using AI to understand usage patterns - PR #22042
  - Use backend `request_duration_ms` and make Duration sortable in Logs - PR #22122
  - Add `request_duration_ms` to SpendLogs - PR #22066
  - Enrich failure spend logs with key/team metadata - PR #22049
  - Show real tool names in logs for Anthropic-format tools - PR #22048
- Models + Endpoints
- UI Improvements
- Health Checks
Bugs
- Populate `user_id` and `user_info` for admin users in `/user/info` - PR #22239
- Fix virtual keys pagination stale totals when filtering - PR #22222
- Fix Spend Update Queue aggregation never triggering with default presets - PR #21963
- Fix timezone config lookup and replace hardcoded timezone map with `ZoneInfo` - PR #21754
- Fix custom auth budget issue - PR #22164
- Fix missing OAuth session state - PR #21992
- Fix Transport Type for OpenAPI Spec on UI - PR #22005
- Fix Claude Code plugin schema - PR #22271
- Add missing migration for `LiteLLM_ClaudeCodePluginTable` - PR #22335
- Only tag selected deployment in access group creation - PR #21655
- State management fixes for CheckBatchCost - PR #21921
- Remove duplicate antd import in ToolPolicies - PR #22107