Engineering Blog

Announcing Router Plugins: Customize Routing Signals

Router plugins are now live on LiteLLM. Configure a plugin pipeline to determine which models to pick for a given input. Plugins can be chained as well.

Krrish Dholakia — July 17, 2026

Auto Router v2: one router for complexity, semantic, and adaptive routing

Auto Router v2 folds LiteLLM's complexity, semantic, and adaptive routers into a single router with an LLM classifier, keyword tiers, multi-model pools, and adaptive Thompson sampling.

Krrish Dholakia — July 13, 2026

Incident Report: Prompt Cache Invalidation for Claude Code on Bedrock Invoke

Date: July 4 to July 10, 2026

Mateo Wang Krrish Dholakia Ishaan Jaffer — July 13, 2026

July stability update: hardening MCP auth and cutting pass-through memory

A two-week product quality update. We addressed two major issues (MCP credential resolution and pass-through memory) and shipped 134 bug fixes in total. Plus our next goal: 95% end-to-end test coverage.

Ishaan Jaffer Tin Lo Mateo Wang Yassin Kortam — July 11, 2026

July Townhall: Product + Roadmap Updates

Join the LiteLLM July townhall on Thursday, 23 July at 7:30 AM PT to learn about LiteLLM's product updates and roadmap.

Krrish Dholakia Ishaan Jaffer — July 9, 2026

Day 0 Support: GPT-5.6 (Sol, Terra, Luna)

Day 0 support for the GPT-5.6 family (Sol, Terra, and Luna) on LiteLLM.

Mateo Wang Krrish Dholakia Ishaan Jaffer — July 9, 2026

5 ways to cut Claude Code costs with LiteLLM

Practical levers a platform admin can pull on the LiteLLM proxy to reduce Claude Code spend without asking developers to change a thing.

Krrish Dholakia — July 4, 2026

Day 0 Support: Claude Sonnet 5

Day 0 support for Claude Sonnet 5 on the LiteLLM AI Gateway. Use it across Anthropic, Azure, Vertex AI, and Bedrock.

Mateo Wang Krrish Dholakia Ishaan Jaffer — June 30, 2026

LiteLLM × Headroom: Use 60-95% fewer tokens with Claude Code

Cut input tokens on Claude Code and other LLM traffic by attaching Headroom as a pre_call guardrail on LiteLLM.

Krrish Dholakia Ishaan Jaffer — June 30, 2026

June Townhall Updates: 94 Bug Fixes, OCR + Realtime are in Rust, and a Zero-Regression Commitment

A recap of the June LiteLLM town hall covering security hardening, our zero-regression commitment, 78 feature commits, and the gradual migration of the gateway to Rust.

Krrish Dholakia Ishaan Jaffer — June 26, 2026

Swap OpenAI Code Interpreter for E2B/OpenSandbox

Keep the OpenAI code_interpreter tool in your requests and run the code in your own sandbox. LiteLLM intercepts the tool call and routes it to E2B or OpenSandbox, with no client changes.

Krrish Dholakia — June 23, 2026

Migrating LiteLLM to Rust - Building the Fastest and Litest AI Gateway

LiteLLM is moving its AI gateway to Rust: 15x throughput, 11x less memory, and sub-1ms per-request overhead. No v2, no migration, your config stays the same.

Ishaan Jaffer — June 22, 2026

LiteLLM version support: focusing on the four most recent stable lines

Starting Monday, June 29, 2026, LiteLLM actively supports the four most recent stable minor lines. Older lines reach end of life, and the window rolls forward as new stable lines ship.

Yuneng Jiang — June 20, 2026

Semantic Caching on Valkey and AWS ElastiCache

LiteLLM now supports semantic prompt caching on Valkey clusters running the valkey-search module, including AWS ElastiCache for Valkey, with no RediSearch, Redis Stack, or Qdrant required.

Yassin Kortam — June 17, 2026

June Townhall: Product + Roadmap Updates

Join the LiteLLM June townhall on Thursday, 25 June at 7:30 AM PT to learn about LiteLLM's product updates and roadmap.

Krrish Dholakia Ishaan Jaffer — June 16, 2026

June Stability Update: We're Making Stability a First-Class Citizen at LiteLLM

Ishaan Jaffer Varoon Raghav — June 15, 2026

Day 0 Support: Claude Fable 5

Day 0 support for Claude Fable 5 on the LiteLLM AI Gateway. Use it across Anthropic, Azure, Vertex AI, and Bedrock.

Mateo Wang Krrish Dholakia Ishaan Jaffer — June 10, 2026

A Unified Agent Control Plane

The AI Gateway is moving up the stack: from routing model calls to routing agent work.

Krrish Dholakia — June 10, 2026

Announcing LiteLLM x Microsoft ASSERT

LiteLLM now integrates with Microsoft ASSERT for policy-driven agent evaluation — catch safety and quality defects before they reach production.

Mubashir Osmani Krrish Dholakia — June 3, 2026

LiteLLM Labs: Announcing Lite-Harness SDK — Unified API for Claude Code, Codex, and Pi AI

One SDK. Swap between Claude Code, Codex, and Pi AI by changing a string. Pairs with the LiteLLM AI Gateway for keys, budgets, logs, and fallbacks.

Krrish Dholakia Ishaan Jaffer — June 2, 2026

Fixed in 1.84.0+ - Version Update: Authentication Bypass via Host Header Injection (GHSA-4xpc-pv4p-pm3w)

Disclosure of a Host-header authentication bypass in the LiteLLM proxy. Addressed in v1.84.0. Only a very limited set of deployments is potentially affected, and no LiteLLM Cloud customers were affected.

Krrish Dholakia Ishaan Jaffer Yuneng Jiang — June 1, 2026

Day 0 Support: Claude Opus 4.8

Day 0 support for Claude Opus 4.8 on the LiteLLM AI Gateway. Use it across Anthropic, Azure, Vertex AI, and Bedrock.

Mateo Wang Krrish Dholakia Ishaan Jaffer — May 28, 2026

How we built a background agent to cover 30% of our backlog

How we built a background agent on the LiteLLM AI Gateway that merges PRs with no human in the loop (the infra, harness, and credential-scoping calls behind it).

Krrish Dholakia Ishaan Jaffer — May 27, 2026

May Townhall Updates: Security Hardening, Release Versioning, and the Agent Platform

A recap of the May LiteLLM town hall covering 89 security fixes, new release versioning, MCP toolsets, performance wins, and the LiteLLM Agent Platform.

Krrish Dholakia Ishaan Jaffer — May 26, 2026

DAY 0 Support: Gemini 3.5 Flash on LiteLLM

Guide to using Gemini 3.5 Flash on LiteLLM Proxy and SDK with day 0 support.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — May 19, 2026

Google AI Studio Managed Agents on LiteLLM

LiteLLM now supports the Google AI Studio Managed Agents API. Create, manage, and run custom agents through LiteLLM.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — May 19, 2026

May Townhall: Product + Roadmap Updates

Join the LiteLLM May townhall on Tuesday, 19 May at 7:30 AM PT to learn about LiteLLM's product updates and roadmap.

Krrish Dholakia Ishaan Jaffer — May 19, 2026

Announcing Componentized Deployments

How LiteLLM's componentized deployment isolates the management/UI control plane from the LLM data plane, improving reliability at scale.

Yassin Kortam — May 18, 2026

Security Update: Mistral AI PyPI Supply Chain Attack — LiteLLM Not Impacted

On May 11, 2026, a malicious version of the mistralai PyPI package was published as part of a coordinated supply chain attack. LiteLLM is not affected — we call Mistral exclusively via httpx, never by importing the mistralai SDK.

Mubashir Osmani — May 12, 2026

LiteLLM Managed Agents Platform — Alpha Now Open for Public Preview

Spawn sandboxed agent sessions on the LiteLLM Gateway — a control plane for managed agents, now in public preview.

Krrish Dholakia Ishaan Jaffer — May 8, 2026

Security Update: CVE-2026-42208 in LiteLLM Proxy

CVE-2026-42208 (SQL injection in LiteLLM Proxy's API key verification path) is fixed. Upgrade to v1.83.10-stable.

Krrish Dholakia Ishaan Jaffer — April 29, 2026

Incident Report: Prisma DB Reconnect Blocks the Event Loop and Kills Liveliness

Date: April 2026

Yuneng Jiang — April 29, 2026

LiteLLM release versioning is changing: standard names, MINOR for weekly, PATCH for hotfixes

Dropping `-stable` and `-nightly` suffixes. Weekly releases bump MINOR; PATCH is now reserved for actual hotfixes. Old releases keep their tags forever; new ones start with `1.84.0`.

Yuneng Jiang — April 28, 2026

Gemini Embedding 2 (GA): Multimodal Embeddings on LiteLLM

Use the generally available gemini-embedding-2 for multimodal embeddings on LiteLLM via the Gemini API and Vertex AI: the same flows as the preview, with a stable model ID.

Sameer Kankute — April 24, 2026

Day 0 Support: GPT-5.5 and GPT-5.5 Pro

Day 0 support for GPT-5.5 and GPT-5.5 Pro on LiteLLM.

Mateo Wang Krrish Dholakia Ishaan Jaffer — April 24, 2026

Security Update: CVE-2026-30623 — Command Injection via Anthropic's MCP SDK

CVE-2026-30623 (authenticated RCE via MCP stdio transport) is fixed. Upgrade to v1.83.6-nightly or v1.83.7-stable or later.

Krrish Dholakia Ishaan Jaffer — April 21, 2026

LiteLLM × Akto: Model-Based Detection Alongside Built-in Guardrails

Chain Akto's model-based detection with LiteLLM's built-in guardrails — catch PII, prompt injection, and policy violations that pattern-based checks miss.

Krrish Dholakia Ishaan Jaffer — April 21, 2026

Day 0 Support: Claude Opus 4.7

Day 0 support for Claude Opus 4.7 on the LiteLLM AI Gateway. Use it across Anthropic, Azure, Vertex AI, and Bedrock.

Sameer Kankute Ishaan Jaffer Krrish Dholakia — April 16, 2026

Making the AI Gateway Resilient to Redis Failures

How LiteLLM's production AI Gateway handles Redis degradation at scale without cascading failures — circuit breaker pattern, 0ms fast-fail, automatic recovery.

Ishaan Jaffer — April 11, 2026

April Townhall Updates: CI/CD v2, Stability, and Product Roadmap

A recap of the April LiteLLM town hall covering CI/CD v2, product stability work, and the near-term roadmap.

Krrish Dholakia Ishaan Jaffer — April 10, 2026

Security Update: Vulnerability Disclosures and Ongoing Hardening

Disclosure of security vulnerabilities fixed in LiteLLM v1.83.0, and the launch of our bug bounty program.

Krrish Dholakia Ishaan Jaffer — April 3, 2026

April Townhall: Security + Product Roadmap

Join the LiteLLM April townhall on Friday, 10 April at 7:30 AM to learn about LiteLLM's security and product roadmap.

Krrish Dholakia Ishaan Jaffer — April 2, 2026

Announcing CI/CD v2 for LiteLLM

CI/CD v2 introduces isolated environments, stronger security gates, and safer release separation for LiteLLM.

Krrish Dholakia — March 30, 2026

LiteLLM + Vanta: SOC 2 Type 2 and ISO 27001 Recertification

LiteLLM is partnering with Vanta on SOC 2 Type 2 and ISO 27001 recertification and engaging independent auditors for verification.

Krrish Dholakia — March 30, 2026

Security Townhall Updates

What happened, what we've done, and what comes next for LiteLLM's release and security processes.

Krrish Dholakia Ishaan Jaffer — March 27, 2026

Security Update: Suspected Supply Chain Incident

As of 2:00 PM ET on March 24, 2026

Krrish Dholakia Ishaan Jaffer — March 24, 2026

Incident Report: Guardrail logging exposed secret headers in spend logs and traces

Date: March 18, 2026

LiteLLM Team — March 18, 2026

Day 0 Support: GPT-5.4-mini and GPT-5.4-nano

GPT-5.4-mini and GPT-5.4-nano model support in LiteLLM.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — March 17, 2026

New Video Characters, Edit and Extension API support

LiteLLM now supports creating, retrieving, and managing reusable video characters across multiple video generations.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — March 16, 2026

Realtime WebRTC HTTP Endpoints

Use the LiteLLM proxy to route OpenAI-style WebRTC realtime via HTTP: client_secrets and SDP exchange.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — March 12, 2026

Day 0 Support: GPT-5.4

GPT-5.4 model support in LiteLLM.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — March 5, 2026

DAY 0 Support: Gemini 3.1 Flash Lite Preview on LiteLLM

Guide to using Gemini 3.1 Flash Lite Preview on LiteLLM Proxy and SDK with day 0 support.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — March 3, 2026

Incident Report: Cache Eviction Closes In-Use httpx Clients

Date: February 27, 2026

Ryan Crabbe Ishaan Jaffer Krrish Dholakia — February 27, 2026

Day 0 Support: GPT-5.3-Codex

Day 0 support for GPT-5.3-Codex on LiteLLM, including phase parameter handling for Responses API.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — February 24, 2026

Incident Report: Encrypted Content Failures in Multi-Region Responses API Load Balancing

Date: Feb 24, 2026

Sameer Kankute Krrish Dholakia Ishaan Jaffer — February 24, 2026

Incident Report: Wildcard Blocking New Models After Cost Map Reload

Date: Feb 23, 2026

Sameer Kankute Krrish Dholakia Ishaan Jaffer — February 23, 2026

Incident Report: SERVER_ROOT_PATH regression broke UI routing

Date: January 22, 2026

Yuneng Jiang Ishaan Jaffer Krrish Dholakia — February 21, 2026

DAY 0 Support: Gemini 3.1 Pro on LiteLLM

Guide to using Gemini 3.1 Pro on LiteLLM Proxy and SDK with day 0 support.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — February 19, 2026

Incident Report: vLLM Embeddings Broken by encoding_format Parameter

Date: Feb 16, 2026

Sameer Kankute Krrish Dholakia Ishaan Jaffer — February 18, 2026

Day 0 Support: Claude Sonnet 4.6

Day 0 support for Claude Sonnet 4.6 on the LiteLLM AI Gateway. Use it across Anthropic, Azure, Vertex AI, and Bedrock.

Ishaan Jaffer Krrish Dholakia — February 17, 2026

Incident Report: Invalid beta headers with Claude Code

Date: February 13, 2026

Sameer Kankute Ishaan Jaffer Krrish Dholakia — February 16, 2026

Day 0 Support: MiniMax-M2.5

Day 0 support for MiniMax-M2.5 on LiteLLM.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — February 12, 2026

Incident Report: Invalid model cost map on main

Date: January 27, 2026

Ishaan Jaffer — February 10, 2026

Your Middleware Could Be a Bottleneck

How we improved LiteLLM proxy latency and throughput by replacing a single middleware base class.

Krrish Dholakia Ishaan Jaffer Ryan Crabbe — February 7, 2026

Improve release stability with 24 hour load tests

How we built a long-running, release-validation system to catch regressions before they reach users.

Alexsander Hamir Krrish Dholakia Ishaan Jaffer — February 6, 2026

Day 0 Support: Claude Opus 4.6

Day 0 support for Claude Opus 4.6 on the LiteLLM AI Gateway. Use it across Anthropic, Azure, Vertex AI, and Bedrock.

Sameer Kankute Ishaan Jaffer Krrish Dholakia — February 5, 2026

Achieving Sub-Millisecond Proxy Overhead

Our Q1 performance target and architectural direction for achieving sub-millisecond proxy overhead on modest hardware.

Alexsander Hamir Krrish Dholakia Ishaan Jaffer — February 2, 2026

DAY 0 Support: Gemini 3 Flash on LiteLLM

Guide to using Gemini 3 Flash on LiteLLM Proxy and SDK with day 0 support.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — December 17, 2025

Day 0 Support: Claude 4.5 Opus (+Advanced Features)

Guide to Claude Opus 4.5 and advanced features in LiteLLM: Tool Search, Programmatic Tool Calling, and Effort Parameter.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — November 25, 2025

DAY 0 Support: Gemini 3 on LiteLLM

Common questions and best practices for using gemini-3-pro-preview with LiteLLM Proxy and SDK.

Sameer Kankute Krrish Dholakia Ishaan Jaffer — November 19, 2025

Gemini Embedding 2 Preview: Multimodal Embeddings on LiteLLM

Generate embeddings from text, images, audio, video, and PDFs with gemini-embedding-2-preview on LiteLLM via Gemini API (one vector per input, OpenAI-compatible) and Vertex AI (single unified vector per request).

Sameer Kankute — March 11, 2025