v1.89.3 - Guardrails & Cache-Control Fixes

Update: no performance regression found

An earlier version of this note flagged a potential throughput regression. We investigated and could not confirm or reproduce any regression in the released version. The one report we received came from a deployment running custom code on top of what we shipped, and our testing points to those changes, not LiteLLM, as the likely cause.

Correctness and error rates were never affected. If you're on this version, there's nothing you need to do.

We're still monitoring incoming reports and will update this note if anything changes.

Deploy this version

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:1.89.3

v1.89.3 is a patch release on top of v1.89.2. It backports guardrail correctness fixes (a single pre-call hook for model-level guardrails, no DB re-init on every poll, 400 instead of 500 when AIM blocks a request) and caps Anthropic cache-control injection at the 4-block limit.
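To illustrate the cache-control cap described above, here is a minimal sketch of one way to limit `cache_control` markers to four per request. It is not LiteLLM's implementation: the function name `cap_cache_control`, the constant `MAX_CACHE_CONTROL_BLOCKS`, and the choice to keep the first four markers and strip the rest are all assumptions made for illustration.

```python
from copy import deepcopy

# Assumed constant name; the value 4 is the block limit named in this release note.
MAX_CACHE_CONTROL_BLOCKS = 4


def cap_cache_control(blocks, limit=MAX_CACHE_CONTROL_BLOCKS):
    """Return a copy of `blocks` with at most `limit` cache_control markers.

    Hypothetical helper: keeps the first `limit` markers and strips the rest.
    """
    capped = []
    seen = 0
    for block in blocks:
        block = deepcopy(block)  # leave the caller's data untouched
        if "cache_control" in block:
            if seen < limit:
                seen += 1
            else:
                del block["cache_control"]
        capped.append(block)
    return capped


# Six content blocks, each marked for caching (two more than the limit).
blocks = [
    {"type": "text", "text": f"part {i}", "cache_control": {"type": "ephemeral"}}
    for i in range(6)
]
capped = cap_cache_control(blocks)
marked = sum("cache_control" in b for b in capped)
print(marked)
```

Which markers to keep is a design choice. This sketch keeps the earliest ones for simplicity; a real integration might prefer the markers that cover the largest stable prefix.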

What's Changed

  • fix(integrations): cap Anthropic cache_control injection at 4 blocks - PR #30480
  • fix(guardrails): run pre_call hook once for model-level guardrails - PR #30543
  • fix(guardrails): stop re-initializing DB guardrails on every poll - PR #30542
  • fix(guardrails): return 400 not 500 when AIM blocks a request - PR #30573
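The last fix in the list changes the status code returned when a guardrail blocks a request. The sketch below shows the general idea of mapping a deliberate block to a client error rather than a server error. It is not LiteLLM's code: `GuardrailBlocked`, `status_for`, `handle`, and `demo_guardrail` are hypothetical names used only for illustration.

```python
class GuardrailBlocked(Exception):
    """Hypothetical exception raised when a guardrail rejects a request."""


def status_for(exc):
    # A deliberate block is a problem with the request content, so report 400.
    if isinstance(exc, GuardrailBlocked):
        return 400
    # Anything else is an unexpected server-side failure, so report 500.
    return 500


def handle(request_text, guardrail):
    """Run the guardrail and return a (status, message) pair."""
    try:
        guardrail(request_text)
    except Exception as exc:
        return status_for(exc), str(exc)
    return 200, "ok"


def demo_guardrail(text):
    # Stand-in policy for the example: reject any text containing "forbidden".
    if "forbidden" in text:
        raise GuardrailBlocked("blocked by policy")


blocked = handle("forbidden prompt", demo_guardrail)
allowed = handle("hello", demo_guardrail)
print(blocked, allowed)
```

Returning 400 lets clients tell a policy rejection apart from an outage, which matters for retry logic and alerting.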

Full Changelog

https://github.com/BerriAI/litellm/compare/v1.89.2...v1.89.3
