
v1.81.9 - Control which MCP Servers are exposed on the Internet

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Deploy this version

docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  docker.litellm.ai/berriai/litellm:main-v1.81.9

Key Highlights

MCP Servers on the Public Internet

This release adds public/private visibility and IP-based access control for MCP servers, so you can expose them on the public internet with less risk. You can run internet-facing MCP services while restricting access to trusted networks and keeping internal tools private.
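
As a rough illustration of how public/private visibility can combine with IP-based access control, here is a minimal sketch. It is not LiteLLM's implementation: the `McpServer` class, the `visible_servers` helper, and the `TRUSTED_NETWORKS` list are hypothetical names chosen for this example. Only the `available_on_public_internet` flag name comes from this release's schema change.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical trusted ranges; a real deployment would load these from config.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


@dataclass
class McpServer:
    name: str
    # Mirrors the name of the new LiteLLM_MCPServerTable column in this release.
    available_on_public_internet: bool = False


def is_trusted(client_ip: str) -> bool:
    """Return True when the caller's IP falls inside a trusted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in TRUSTED_NETWORKS)


def visible_servers(servers, client_ip):
    """Trusted callers see every server; everyone else sees only public ones."""
    if is_trusted(client_ip):
        return list(servers)
    return [s for s in servers if s.available_on_public_internet]
```

In this sketch, a caller from `10.1.2.3` would see both private and public servers, while a caller from an untrusted address would see only servers flagged as public.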


UI Team Soft Budget Alerts

Set a soft budget on any team to receive email alerts when spending crosses the threshold, without blocking any requests. Configure the threshold and the alert recipients directly in the Admin UI; no proxy restart is needed.
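
The difference between a soft budget and a hard limit can be sketched as follows. This is an assumption-laden illustration, not the proxy's code: the `Team` class, `record_spend`, `alert_emails`, and `soft_alert_sent` are hypothetical, and sending only one alert per crossing is an assumption made here. Only the `soft_budget` field name comes from this release.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Team:
    team_id: str
    spend: float = 0.0
    soft_budget: Optional[float] = None  # field name taken from this release
    alert_emails: List[str] = field(default_factory=list)  # hypothetical
    soft_alert_sent: bool = False  # hypothetical de-duplication flag


def record_spend(team: Team, cost: float, send_email: Callable[[str, str], None]) -> bool:
    """Add cost to the team's spend and alert on crossing the soft budget.

    Always returns True: a soft budget only notifies, it never blocks.
    """
    team.spend += cost
    crossed = team.soft_budget is not None and team.spend >= team.soft_budget
    if crossed and not team.soft_alert_sent:
        message = f"Team {team.team_id} passed its soft budget of ${team.soft_budget:.2f}"
        for address in team.alert_emails:
            send_email(address, message)
        team.soft_alert_sent = True
    return True
```

A hard budget would return False (or raise) once the limit is reached; here the request is allowed either way and the only side effect is the email.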


Let's dive in.


New Models / Updated Models

New Model Support (13 new models)

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| --- | --- | --- | --- | --- |
| Anthropic | claude-opus-4-6 | 1M | $5.00 | $25.00 |
| AWS Bedrock | anthropic.claude-opus-4-6-v1 | 1M | $5.00 | $25.00 |
| Azure AI | azure_ai/claude-opus-4-6 | 200K | $5.00 | $25.00 |
| Vertex AI | vertex_ai/claude-opus-4-6 | 1M | $5.00 | $25.00 |
| Google Gemini | gemini/deep-research-pro-preview-12-2025 | 65K | $2.00 | $12.00 |
| Vertex AI | vertex_ai/deep-research-pro-preview-12-2025 | 65K | $2.00 | $12.00 |
| Moonshot | moonshot/kimi-k2.5 | 262K | $0.60 | $3.00 |
| OpenRouter | openrouter/qwen/qwen3-235b-a22b-2507 | 262K | $0.07 | $0.10 |
| OpenRouter | openrouter/qwen/qwen3-235b-a22b-thinking-2507 | 262K | $0.11 | $0.60 |
| Together AI | together_ai/zai-org/GLM-4.7 | 200K | $0.45 | $2.00 |
| Together AI | together_ai/moonshotai/Kimi-K2.5 | 256K | $0.50 | $2.80 |
| ElevenLabs | elevenlabs/eleven_v3 | - | $0.18/1K chars | - |
| ElevenLabs | elevenlabs/eleven_multilingual_v2 | - | $0.18/1K chars | - |
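
To make the per-1M-token prices concrete, the following sketch estimates the cost of a single request. The `PRICES_PER_MILLION` dictionary and `estimate_cost` function are hypothetical helpers written for this example; the two price pairs are copied from the table above.

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES_PER_MILLION = {
    "claude-opus-4-6": (5.00, 25.00),
    "moonshot/kimi-k2.5": (0.60, 3.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For example, 10,000 input tokens and 2,000 output tokens on claude-opus-4-6 come to 10,000 × $5.00 / 1M + 2,000 × $25.00 / 1M = $0.05 + $0.05 = $0.10.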

Features

Bug Fixes


LLM API Endpoints

Features

  • Messages API

    • Filter unsupported Claude Code beta headers for non-Anthropic providers - PR #20578
    • Fix inconsistent response format in anthropic.messages.acreate() when using non-Anthropic providers - PR #20442
    • Fix 404 on /api/event_logging/batch endpoint that caused Claude Code "route not found" errors - PR #20504
  • A2A Agent Gateway

    • Allow calling A2A agents through LiteLLM /chat/completions API - PR #20358
    • Use A2A registered agents with /chat/completions - PR #20362
    • Fix A2A agents deployed with localhost/internal URLs in their agent cards - PR #20604
  • Files API

    • Add support for delete and GET via file_id for Gemini - PR #20329
  • General

    • Add User-Agent customization support - PR #19881
    • Fix search tools not found when using per-request routers - PR #19818
    • Forward extra headers in chat - PR #20386

Management Endpoints / UI

Features

  • SSO Configuration

    • SSO Config Team Mappings - PR #20111
    • UI - SSO: Add Team Mappings - PR #20299
    • Extract user roles from JWT access token for Keycloak compatibility - PR #20591
  • Auth / SDK

    • Add proxy_auth for auto OAuth2/JWT token management in SDK - PR #20238
  • Virtual Keys

    • Key reset_spend endpoint - PR #20305
    • UI - Keys: Allowed Routes to Key Info and Edit Pages - PR #20369
    • Add Key info endpoint object permission data - PR #20407
    • Keys and Teams Router Setting + Allow Override of Router Settings - PR #20205
  • Teams & Budgets

    • Add soft_budget to Team Table + Create/Update Endpoints - PR #20530
    • Team Soft Budget Email Alerts - PR #20553
    • UI - Team Settings: Soft Budget + Alerting Emails - PR #20634
    • UI - User Budget Page: Unlimited Budget Checkbox - PR #20380
    • /user/update allow for max_budget resets - PR #20375
  • UI Improvements

    • Default Team Settings: Migrate to use Reusable Model Select - PR #20310
    • Navbar: Option to Hide Community Engagement Buttons - PR #20308
    • Show team alias on Models health page - PR #20359
    • Admin Settings: Add option for Authentication for public AI Hub - PR #20444
    • Adjust daily spend date filtering for user timezone - PR #20472
  • SCIM

    • Add base /scim/v2 endpoint for SCIM resource discovery - PR #20301
  • Proxy CLI

Bugs

  • Fix: Remove unnecessary key blocking on UI login that prevented access - PR #20210
  • UI - Team Settings: Disable Global Guardrail Persistence - PR #20307
  • UI - Model Info Page: Fix Input and Output Labels - PR #20462
  • UI - Model Page: Column Resizing on Smaller Screens - PR #20599
  • Fix /key/list user_id Empty String Edge Case - PR #20623
  • Add array type checks for model, agent, and MCP hub data to prevent UI crashes - PR #20469
  • Fix unique constraint on daily tables + logging when updates fail - PR #20394

Logging / Guardrail / Prompt Management Integrations

Bug Fixes (3 fixes)

  • Langfuse

    • Fix Langfuse OTEL trace export failing when spans contain null attributes - PR #20382
  • Prometheus

    • Fix incorrect failure metrics labels causing miscounted error rates - PR #20152
  • Slack Alerts

    • Fix Slack alert delivery failing for certain budget threshold configurations - PR #20257

Guardrails (7 updates)

  • Custom Code Guardrails

    • Add HTTP support to custom code guardrails + Unified guardrails for MCP + Agent guardrail support - PR #20619
    • Custom Code Guardrails UI Playground - PR #20377
  • Team-Based Guardrails

    • Implement team-based isolation guardrails management - PR #20318
  • OpenAI Moderations

    • Ensure OpenAI Moderations Guard works with OpenAI Embeddings - PR #20523
  • GraySwan / Cygnal

    • Fix fail-open for GraySwan and pass metadata to Cygnal API endpoint - PR #19837
  • General

    • Check for model_response_choices before guardrail input - PR #19784
    • Preserve streaming content on guardrail-sampled chunks - PR #20027

Spend Tracking, Budgets and Rate Limiting

  • Support 0 cost models - Allow zero-cost model entries for internal/free-tier models - PR #20249

MCP Gateway (9 updates)

  • MCP Semantic Filtering - Filter MCP tools using semantic similarity to reduce tool sprawl for LLM calls - PR #20296, PR #20316
  • UI - MCP Semantic Filtering - Add support for MCP Semantic Filtering configuration on UI - PR #20454
  • MCP IP-Based Access Control - Mark MCP servers as private or as publicly available on the internet, with IP-based restrictions - PR #20607, PR #20620
  • Fix MCP "Session not found" error on VSCode reconnect - PR #20298
  • Fix OAuth2 'Capabilities: none' bug for upstream MCP servers - PR #20602
  • Include Config Defined Search Tools in /search_tools/list - PR #20371
  • UI - Search Tools: Show Config Defined Search Tools - PR #20436
  • Ensure MCP permissions are enforced when using JWT Auth - PR #20383
  • Fix gcs_bucket_name not being passed correctly for MCP server storage configuration - PR #20491
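
The idea behind semantic filtering of MCP tools can be sketched as ranking tool descriptions by similarity to the user's request and passing only the best matches to the model. The sketch below is an illustration under stated assumptions, not the gateway's implementation: `embed`, `cosine`, and `filter_tools` are hypothetical names, and the word-count "embedding" is a toy stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_tools(query: str, tools: dict, top_k: int = 2) -> list:
    """Return up to top_k tool names whose descriptions best match the query."""
    q = embed(query)
    scored = [(cosine(q, embed(desc)), name) for name, desc in tools.items()]
    # Drop tools with no overlap at all, then keep the highest-scoring ones.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]
```

With a real embedding model the scoring function changes, but the shape stays the same: score every tool against the request, then forward only the top few so the LLM call carries fewer tool definitions.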

Performance / Loadbalancing / Reliability improvements (14 improvements)

  • Prometheus ~40% CPU reduction - Parallelize budget metrics, fix caching bug, reduce CPU usage - PR #20544
  • Prevent closed client errors by reverting httpx client caching - PR #20025
  • Avoid unnecessary Router creation when no models or search tools are configured - PR #20661
  • Optimize wrapper_async with CallTypes caching and reduced lookups - PR #20204
  • Cache _get_relevant_args_to_use_for_logging() at module level - PR #20077
  • LRU cache for normalize_request_route - PR #19812
  • Optimize get_standard_logging_metadata with set intersection - PR #19685
  • Early-exit guards in completion_cost for unused features - PR #20020
  • Optimize get_litellm_params with sparse kwargs extraction - PR #19884
  • Guard debug log f-strings and remove redundant dict copies - PR #19961
  • Replace enum construction with frozenset lookup - PR #20302
  • Guard debug f-string in update_environment_variables - PR #20360
  • Warn when budget lookup fails to surface silent caching misses - PR #20545
  • Add INFO-level session reuse logging per request for better observability - PR #20597
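
Several items above rely on common Python micro-optimizations: LRU caching of a pure function, frozenset membership tests, and guarding debug f-strings. The sketch below shows the general techniques only. All names (`normalize_route`, `STREAMING_CALL_TYPES`, `log_request`) and the ID-matching regex are hypothetical and do not reflect LiteLLM's actual functions.

```python
import logging
import re
from functools import lru_cache

logger = logging.getLogger("perf_sketch")

# Hypothetical set of call types; a frozenset membership test avoids
# constructing an enum member just to compare it.
STREAMING_CALL_TYPES = frozenset({"acompletion", "atext_completion"})

# Hypothetical rule: treat long hex-looking path segments as IDs.
_ID_SEGMENT = re.compile(r"/[0-9a-f]{8,}(?=/|$)")


def is_streaming_call(call_type: str) -> bool:
    return call_type in STREAMING_CALL_TYPES


@lru_cache(maxsize=1024)
def normalize_route(path: str) -> str:
    """Collapse ID-like segments so repeated routes hit the cache."""
    return _ID_SEGMENT.sub("/{id}", path)


def log_request(payload: dict) -> None:
    # Build the f-string only when DEBUG logging is actually enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug(f"request payload: {payload}")
```

Each of these saves a small amount of work per call, which adds up on a hot request path.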

Database Changes

Schema Updates

| Table | Change Type | Description | PR | Migration |
| --- | --- | --- | --- | --- |
| LiteLLM_TeamTable | New Column | Added allow_team_guardrail_config boolean field for team-based guardrail isolation | PR #20318 | Migration |
| LiteLLM_DeletedTeamTable | New Column | Added allow_team_guardrail_config boolean field | PR #20318 | Migration |
| LiteLLM_TeamTable | New Column | Added soft_budget (double precision) for soft budget alerting | PR #20530 | Migration |
| LiteLLM_DeletedTeamTable | New Column | Added soft_budget (double precision) | PR #20653 | Migration |
| LiteLLM_MCPServerTable | New Column | Added available_on_public_internet boolean for MCP IP-based access control | PR #20607 | Migration |

Documentation Updates (14 updates)

  • Add FAQ for setting up and verifying LITELLM_LICENSE - PR #20284
  • Model request tags documentation - PR #20290
  • Add Prisma migration troubleshooting guide - PR #20300
  • MCP Semantic Filtering documentation - PR #20316
  • Add CopilotKit SDK doc as supported agents SDK - PR #20396
  • Add documentation for Nova Sonic - PR #20320
  • Update Vertex AI Text to Speech doc to show use of audio - PR #20255
  • Improve Okta SSO setup guide with step-by-step instructions - PR #20353
  • Langfuse doc update - PR #20443
  • Expose MCPs on public internet documentation - PR #20626
  • Add blog post: Achieving Sub-Millisecond Proxy Overhead - PR #20309
  • Add blog post about litellm-observatory - PR #20622
  • Update Opus 4.6 blog with adaptive thinking - PR #20637
  • gpt-5-search-api docs clarifications - PR #20512

New Contributors

  • @Quentin-M made their first contribution in PR #19818
  • @amirzaushnizer made their first contribution in PR #20235
  • @cscguochang made their first contribution in PR #20214
  • @krauckbot made their first contribution in PR #20273
  • @agrattan0820 made their first contribution in PR #19784
  • @nina-hu made their first contribution in PR #20472
  • @swayambhu94 made their first contribution in PR #20469
  • @ssadedin made their first contribution in PR #20566

Full Changelog

v1.81.6-nightly...v1.81.9