[Preview] v1.82.0 - Realtime Guardrails, Projects Management, and 10+ Performance Optimizations
Deploy this version

**Docker**

```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-1.82.0
```

**Pip**

```shell
pip install litellm==1.82.0
```
Key Highlights

- **Realtime API guardrails** — Full guardrails support for `/v1/realtime` WebSocket sessions with pre/post-call enforcement, voice transcription hooks, session termination policies, and Vertex AI Gemini Live support - PR #22152, PR #22153, PR #22161, PR #22165
- **Projects Management** — New Projects UI with full CRUD, project-scoped virtual keys, and an admin opt-in toggle — organize teams and keys by project - PR #22315, PR #22360, PR #22373, PR #22412
- **Guardrail ecosystem expansion** — Noma v2, Lakera v2 post-call, Singapore regulatory policies (PDPA + MAS), employment discrimination blockers, a code execution blocker, guardrail policy versioning, and production monitoring - PR #21400, PR #21783, PR #21948
- **OpenAI Codex 5.3 — day 0** — Full support for `gpt-5.3-codex` on OpenAI and Azure, plus `gpt-audio-1.5` and `gpt-realtime-1.5` model coverage - PR #22035
- **10+ performance optimizations** — Streaming hot-path fixes, Redis pipeline batching, database task batching, `ModelResponse` init skip, and router cache improvements — lower latency and CPU on every request
- **`/v1/messages` → `/responses` routing** — `/v1/messages` requests are now routed to the Responses API by default for OpenAI/Azure models
**v1/messages routing change**

This version starts routing `/v1/messages` requests to the `/responses` API by default. To opt out and continue using `chat/completions`, set the environment variable `LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES=true` or set `litellm_settings.use_chat_completions_url_for_anthropic_messages: true` in your config.
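For config-file deployments, the opt-out is a one-line `litellm_settings` entry. A minimal sketch:

```yaml
# config.yaml: keep /v1/messages on chat/completions instead of
# the new default /responses routing
litellm_settings:
  use_chat_completions_url_for_anthropic_messages: true
```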
New Models / Updated Models

New Model Support (20 new models)
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| OpenAI | `gpt-5.3-codex` | 272K | $1.75 | $14.00 | Reasoning, coding |
| Azure OpenAI | `azure/gpt-5.3-codex` | 272K | $1.75 | $14.00 | Azure deployment |
| OpenAI | `gpt-audio-1.5` | 128K | $2.50 | $10.00 | Audio model |
| Azure OpenAI | `azure/gpt-audio-1.5-2026-02-23` | 128K | $2.50 | $10.00 | Audio model |
| OpenAI | `gpt-realtime-1.5` | 32K | $4.00 | $16.00 | Realtime model |
| Azure OpenAI | `azure/gpt-realtime-1.5-2026-02-23` | 32K | $4.00 | $16.00 | Realtime model |
| Groq | `groq/openai/gpt-oss-safeguard-20b` | 131K | $0.075 | $0.30 | Guardrail inference |
| Google Vertex AI | `vertex_ai/gemini-3.1-flash-image-preview` | - | - | - | Image generation |
| Perplexity | `perplexity/perplexity/sonar` | - | - | - | Sonar search |
| Perplexity | `perplexity/openai/gpt-5.1` | - | - | - | Hosted routing |
| Perplexity | `perplexity/openai/gpt-5-mini` | - | - | - | Hosted routing |
| Perplexity | `perplexity/google/gemini-2.5-flash` | - | - | - | Hosted routing |
| Perplexity | `perplexity/google/gemini-2.5-pro` | - | - | - | Hosted routing |
| Perplexity | `perplexity/google/gemini-3-flash-preview` | - | - | - | Hosted routing |
| Perplexity | `perplexity/google/gemini-3-pro-preview` | - | - | - | Hosted routing |
| Perplexity | `perplexity/anthropic/claude-haiku-4-5` | - | - | - | Hosted routing |
| Perplexity | `perplexity/anthropic/claude-sonnet-4-5` | - | - | - | Hosted routing |
| Perplexity | `perplexity/anthropic/claude-opus-4-5` | - | - | - | Hosted routing |
| Perplexity | `perplexity/anthropic/claude-opus-4-6` | - | - | - | Hosted routing |
| Perplexity | `perplexity/xai/grok-4-1-fast-non-reasoning` | - | - | - | Hosted routing |
Features

- New Azure OpenAI models 2026-02-25 - PR #22114
- Adjust `mistral-small-2503` input/output cost per token - PR #22097
- Add `groq/openai/gpt-oss-safeguard-20b` model pricing - PR #21951
- Update AIML model pricing - PR #22139
- Thread `api_base` to `get_model_info` + graceful fallback - PR #21970
- Fix function calling for PublicAI Apertus models - PR #21582
- Add deprecation dates for `grok-2-vision-1212` and `grok-3-mini` models - PR #20102
Bug Fixes

- Fix converse handling for `parallel_tool_calls` - PR #22267
- Restore `parallel_tool_calls` mapping in `map_openai_params` - PR #22333
- Correct `modelInput` format for Converse API batch models - PR #21656
- Prevent double UUID in `create_file` S3 key - PR #21650
- Filter internal `json_tool_call` when mixed with real tools - PR #21107
- Pass timeout param to Bedrock rerank HTTP client - PR #22021
- Fix model cost map for anthropic fast and `inference_geo` - PR #21904
LLM API Endpoints

Features

- Guardrails support for the `/v1/realtime` WebSocket endpoint - PR #22152
- Vertex AI Gemini Live via the unified `/realtime` endpoint - PR #22153
- Guardrails with `pre_call`/`post_call` mode on realtime WebSocket - PR #22161
- `end_session_after_n_fails` + Endpoint Settings wizard step - PR #22165
- Guardrail hook for voice transcription - PR #21976
- Fix guardrails not firing for Gemini/Vertex AI and `provider_config` realtime sessions - PR #22168
- Add logging, spend tracking support + tool tracing - PR #22105
- Enable local file support for OCR - PR #22133
- Preserve thinking blocks in agentic loop follow-up messages - PR #21604
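As a sketch of how the new realtime guardrail options could fit together in a proxy config (the guardrail name and provider below are placeholders, and the `end_session_after_n_fails` placement is inferred from the PR title, so verify it against the guardrails docs):

```yaml
guardrails:
  - guardrail_name: "realtime-moderation"   # placeholder name
    litellm_params:
      guardrail: lakera_v2                  # any supported guardrail provider
      mode: ["pre_call", "post_call"]       # now also enforced on /v1/realtime WebSocket sessions
      end_session_after_n_fails: 3          # assumed key: terminate the session after repeated violations
```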
Bugs

- Fix mypy attr-defined errors on realtime websocket calls - PR #22202
Management Endpoints / UI

Features

**Projects**

- Add Projects page with list and create flows - PR #22315
- Add Project Details page with edit modal - PR #22360
- Add project keys table and project dropdown on key create/edit - PR #22373
- Add delete project action to Projects table - PR #22412
- Add Projects Opt-In Toggle in Admin Settings - PR #22416
- Include `created_at` and `updated_at` in `/project/list` response - PR #22323
- Add tags in project - PR #22216

**Virtual Keys + Access Groups**

- Add bidirectional team/key sync for Access Group CRUD flows - PR #22253
- Add pagination and search to `/key/aliases` to prevent OOMs - PR #22137
- Add paginated key alias selector in UI - PR #22157
- Add `project_id` and `access_group_id` filters for the key list endpoint - PR #22356
- Add KeyInfoHeader component - PR #22047
- Restrict Edit Settings to key owners - PR #21985
- Fix virtual key grace period from env/UI - PR #20321
**Usage / Spend Logs**

- Add user filtering to usage page - PR #22059
- Allow using AI to understand usage patterns - PR #22042
- Use backend `request_duration_ms` and make Duration sortable in Logs - PR #22122
- Add `request_duration_ms` to SpendLogs - PR #22066
- Enrich failure spend logs with key/team metadata - PR #22049
- Show real tool names in logs for Anthropic-format tools - PR #22048
Bugs

- Populate `user_id` and `user_info` for admin users in `/user/info` - PR #22239
- Fix virtual keys pagination stale totals when filtering - PR #22222
- Fix Spend Update Queue aggregation never triggering with default presets - PR #21963
- Fix timezone config lookup and replace hardcoded timezone map with `ZoneInfo` - PR #21754
- Fix custom auth budget issue - PR #22164
- Fix missing OAuth session state - PR #21992
- Fix Transport Type for OpenAPI Spec on UI - PR #22005
- Fix Claude Code plugin schema - PR #22271
- Add missing migration for `LiteLLM_ClaudeCodePluginTable` - PR #22335
- Only tag selected deployment in access group creation - PR #21655
- State management fixes for CheckBatchCost - PR #21921
- Remove duplicate antd import in ToolPolicies - PR #22107
AI Integrations

Logging

- Fix Langfuse OTEL trace issues - PR #21309
- Fix nested traces coexistence with OTEL callback - PR #22169
- Add optional digest mode for Slack alert types - PR #21683
Guardrails

- Noma guardrails v2 based on the custom guardrails framework - PR #21400
- Add Lakera v2 post-call hook with fixed PII masking - PR #21783

**Built-in Guardrails**

- Block code execution guardrail to prevent agents from executing code - PR #22154
- Employment discrimination topic blockers for 5 protected classes - PR #21962
- Claims agent guardrails (5 categories + policy template) - PR #22113
- New code execution evaluation dataset - PR #22065
- Tool policies: auto-discover tools + policy enforcement - PR #22041
**Guardrail Monitoring**

- Guardrail Monitor — measure guardrail reliability in production - PR #21944

**Security**

- Fix unauthenticated RCE and sandbox escape in custom code guardrail - PR #22095
Prompt Management

No major prompt management changes in this release.

Secret Managers

No major secret manager changes in this release.
Spend Tracking, Budgets and Rate Limiting

- Priority PayGo cost tracking for Gemini/Vertex AI - PR #21909
- Add `request_duration_ms` to SpendLogs for latency tracking per request - PR #22066
- Add `in_flight_requests` metric to `/health/backlog` + Prometheus - PR #22319
- Enrich failure spend logs with key/team metadata - PR #22049
- Add spend tracking lifecycle logging for debugging spend flows - PR #22029
- Fix budget timezone config lookup and replace hardcoded timezone map with `ZoneInfo` - PR #21754
- Fix Spend Update Queue aggregation never triggering with default presets - PR #21963
- Avoid mutating caller-owned dicts in `SpendUpdateQueue` aggregation - PR #21742
- Optimize old spendlog deletion cron job - PR #21930
- Health check max tokens configuration - PR #22299
MCP Gateway

- Pass MCP auth headers from request context to tool fetch for `/v1/responses` and `/chat/completions` - PR #22291
- Default `available_on_public_internet` to true for MCP server behavior consistency - PR #22331
- Clear error messages for IP filtering / no available tools - PR #22142
- Strip stale `mcp-session-id` header to prevent 400 errors across proxy workers - PR #21417
- Skip health check for MCP with passthrough token auth - PR #21982
- Fix missing OAuth session state - PR #21992
- Fix Transport Type for OpenAPI Spec on UI - PR #22005
- Add e2e test for stateless StreamableHTTP behavior - PR #22033
Performance / Loadbalancing / Reliability improvements

**Streaming & hot-path**

- Streaming latency improvements — 4 targeted hot-path fixes - PR #22346
- Skip throwaway `Usage()` construction in `ModelResponse.__init__` - PR #21611
- Optimize `is_model_o_series_model` with `startswith` - PR #21690
- Use cached `_safe_get_request_headers` instead of per-request construction - PR #21430
- Emit `x-litellm-overhead-duration-ms` header for streaming requests - PR #22027
**Database & Redis**

- Batch 11 `create_task()` calls into 1 in `update_database()` - PR #22028
- Redis pipeline spend updates for batched writes - PR #22044
- Recover from prisma-query-engine zombie process - PR #21899
- Optimize old spendlog deletion cron job - PR #21930
**Router & caching**

- Add cache invalidation for `_cached_get_model_group_info` - PR #20376
- Remove cache eviction `close()` that kills in-use httpx clients - PR #22247
- Store background task references in `LLMClientCache._remove_key` to prevent unawaited coroutine warnings - PR #22143
- Fix `ensure_arrival_time` set before calculating queue time - PR #21918
**Connection management**

- Only set `enable_cleanup_closed` on aiohttp when required - PR #21897
- Prometheus child_exit cleanup for gunicorn workers - PR #22324
- Prometheus multiprocess cleanup - PR #22221
- Limit concurrent health checks with `health_check_concurrency` - PR #20584
- Isolate `get_config` failures from model sync loop - PR #22224
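The new health-check cap would likely be set in the proxy config. A hypothetical sketch, with key placement unverified:

```yaml
# Hypothetical placement: verify the exact key against the health-check docs
general_settings:
  health_check_concurrency: 10  # run at most 10 endpoint health checks in parallel
```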
**Other**

- Semantic cache: support configurable vector dimensions - PR #21649
- Honor `MAX_STRING_LENGTH_PROMPT_IN_DB` from config env vars - PR #22106
- Enhance `MidStreamFallbackError` to preserve original status code and attributes - PR #22225
- Network mock utility for testing - PR #21942
- Add missing return type annotations to iterator protocol methods in streaming_handler - PR #21750
Security
- Fix critical/high CVEs in OS-level libs and NPM transitive dependencies - PR #22008
- Fix unauthenticated RCE and sandbox escape in custom code guardrail - PR #22095
- Remove hardcoded base64 string flagged by secret scanner - PR #22125
Documentation Updates
- Add OpenAI Agents SDK tutorial with LiteLLM Proxy - PR #21221
- Add OpenClaw integration tutorial - PR #21605
- Add Google GenAI SDK tutorial (JS & Python) - PR #21885
- Add Gollem Go agent framework cookbook example - PR #21747
- Update AssemblyAI docs with Universal-3 Pro, Speech Understanding, and LLM Gateway - PR #21130
- Add `store_model_in_db` release docs - PR #21863
- Add Credential Usage Tracking docs - PR #22112
- Add proxy request tags docs - PR #22129
- Add trailing slash to `/mcp` endpoint URLs - PR #20509
- Add pre-PR checklist to UI contributing guide - PR #21886
- Replace Azure OpenAI key with mock key in docs - PR #21997
- Add performance & reliability section to v1.81.14 release notes - PR #21950
- Update v1.81.12-stable release notes to point to stable.1 - PR #22036
- Add security vulnerability scan report to v1.81.14 release notes - PR #22385
New Contributors
- @janfrederickk made their first contribution in PR #21660
- @hztBUAA made their first contribution in PR #21656
- @LeeJuOh made their first contribution in PR #21754
- @WhoisMonesh made their first contribution in PR #21750
- @trevorprater made their first contribution in PR #21747
- @edwiniac made their first contribution in PR #21870
- @stakeswky made their first contribution in PR #21867
- @ta-stripe made their first contribution in PR #21701
- @ron-zhong made their first contribution in PR #21948
- @Arindam200 made their first contribution in PR #21221
- @Canvinus made their first contribution in PR #21964
- @nicolopignatelli made their first contribution in PR #21951
- @MarshHawk made their first contribution in PR #20584
- @gavksingh made their first contribution in PR #22106
- @roni-frantchi made their first contribution in PR #22090
- @noahnistler made their first contribution in PR #22133
- @dylan-duan-aai made their first contribution in PR #21130
- @rasmi made their first contribution in PR #22322
Diff Summary

02/28/2026
- New Models / Updated Models: 26
- LLM API Endpoints: 14
- Management Endpoints / UI: 38
- AI Integrations: 25
- Spend Tracking, Budgets and Rate Limiting: 10
- MCP Gateway: 8
- Performance / Loadbalancing / Reliability improvements: 22
- Security: 3
- Documentation Updates: 14

