
v1.63.14-stable

Krrish Dholakia
Ishaan Jaffer

These are the changes since v1.63.11-stable.

This release brings:

  • LLM Translation Improvements (MCP Support and Bedrock Application Profiles)
  • Perf improvements for Usage-based Routing
  • Streaming guardrail support via websockets
  • Azure OpenAI client perf fix (from previous release)

Docker Run LiteLLM Proxy​

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.14-stable.patch1

Demo Instance​

Here's a Demo Instance to test changes:

New Models / Updated Models​

  • Azure gpt-4o - fixed pricing to latest global pricing - PR
  • O1-Pro - add pricing + model information - PR
  • Azure AI - mistral 3.1 small pricing added - PR
  • Azure - gpt-4.5-preview pricing added - PR

LLM Translation​

  1. New LLM Features
  • Bedrock: Support bedrock application inference profiles Docs
    • Infer AWS region from the Bedrock application profile ID (arn:aws:bedrock:us-east-1:...) - see the sketch after this list
  • Ollama - support calling via /v1/completions Get Started
  • Bedrock - support us.deepseek.r1-v1:0 model name Docs
  • OpenRouter - OPENROUTER_API_BASE env var support Docs
  • Azure - add audio model parameter support - Docs
  • OpenAI - PDF File support Docs
  • OpenAI - o1-pro Responses API streaming support Docs
  • [BETA] MCP - Use MCP Tools with LiteLLM SDK Docs
  2. Bug Fixes
  • Voyage: prompt token on embedding tracking fix - PR
  • Sagemaker - Fix ‘Too little data for declared Content-Length’ error - PR
  • OpenAI-compatible models - fix issue when calling openai-compatible models w/ custom_llm_provider set - PR
  • VertexAI - Embedding ‘outputDimensionality’ support - PR
  • Anthropic - return consistent json response format on streaming/non-streaming - PR
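
To make the Bedrock application inference profile support above concrete, here is a minimal sketch. The ARN, account ID, and profile name are placeholders, and it assumes a profile ARN can be passed as the bedrock/ model ID, with the region (us-east-1 here) inferred from the ARN itself:

```python
import litellm

# Placeholder ARN - substitute a real application inference profile.
# The AWS region (us-east-1) is assumed to be inferred from the ARN itself.
response = litellm.completion(
    model="bedrock/arn:aws:bedrock:us-east-1:111122223333:application-inference-profile/my-profile",
    messages=[{"role": "user", "content": "Hello via an application inference profile"}],
)
print(response.choices[0].message.content)
```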

Spend Tracking Improvements​

  • litellm_proxy/ - support reading the LiteLLM response cost header from the proxy when using the client SDK (see the sketch after this list)
  • Reset Budget Job - fix budget reset error on keys/teams/users PR
  • Streaming - Prevents final chunk w/ usage from being ignored (impacted bedrock streaming + cost tracking) PR
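
Here is a minimal sketch of reading the proxy-reported cost from the client SDK, per the litellm_proxy/ item above. The proxy URL and key are placeholders, and the response_cost hidden param is assumed to be populated from the proxy's cost header:

```python
import litellm

# Placeholder proxy endpoint and virtual key.
response = litellm.completion(
    model="litellm_proxy/gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    api_base="http://localhost:4000",
    api_key="sk-1234",
)
# Assumption: the SDK copies the proxy's response-cost header into hidden params.
print(response._hidden_params.get("response_cost"))
```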

UI​

  1. Users Page
    • Feature: Control default internal user settings PR
  2. Icons:
    • Feature: Replace external "artificialanalysis.ai" icons with local SVGs PR
  3. Sign In/Sign Out
    • Fix: Default login when default_user_id user does not exist in DB PR

Logging Integrations​

  • Support post-call guardrails for streaming responses Get Started
  • Arize Get Started
    • fix invalid package import PR
    • migrate to using StandardLoggingPayload for metadata, ensuring spans land successfully PR
    • fix logging to just log the LLM I/O PR
    • Dynamic API Key/Space param support Get Started
  • StandardLoggingPayload - log litellm_model_name in the payload, so you can tell which model name was actually sent to the API provider (see the sketch after this list) Get Started
  • Prompt Management - Allow building custom prompt management integration Get Started
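
As a rough illustration of the StandardLoggingPayload item above, the sketch below reads litellm_model_name inside a custom callback. The standard_logging_object kwarg is how LiteLLM exposes the payload to callbacks; the exact field access is an assumption:

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger

class ModelNameLogger(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # Assumption: the StandardLoggingPayload is attached as
        # kwargs["standard_logging_object"], with the provider-facing
        # model name under "litellm_model_name".
        payload = kwargs.get("standard_logging_object") or {}
        print("model sent to provider:", payload.get("litellm_model_name"))

litellm.callbacks = [ModelNameLogger()]
```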

Performance / Reliability improvements​

  • Redis Caching - add a 5s default timeout, preventing a hanging Redis connection from impacting LLM calls PR
  • Allow disabling all spend updates / writes to DB via a flag PR
  • Azure OpenAI - correctly re-use azure openai client, fixes perf issue from previous Stable release PR
  • Azure OpenAI - uses litellm.ssl_verify on Azure/OpenAI clients PR
  • Usage-based routing - Wildcard model support Get Started
  • Usage-based routing - support batch writing increments to Redis - reduces latency to match 'simple-shuffle' (see the router sketch after this list) PR
  • Router - show reason for model cooldown on the 'no healthy deployments available' error PR
  • Caching - add a max value limit (1MB) per item in the in-memory cache - prevents OOM errors when large image URLs are sent through the proxy PR
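
For reference, a minimal usage-based routing setup that the Redis batching above applies to; deployment details and Redis coordinates are placeholders:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4o",
            "litellm_params": {
                "model": "azure/gpt-4o",  # placeholder deployment
                "api_base": "https://my-endpoint.openai.azure.com",  # placeholder
                "api_key": "sk-placeholder",
            },
        },
    ],
    routing_strategy="usage-based-routing-v2",  # TPM/RPM-aware strategy
    redis_host="localhost",  # Redis tracks usage across instances
    redis_port=6379,
)

response = router.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
)
```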

General Improvements​

  • Passthrough Endpoints - support returning api-base on pass-through endpoints Response Headers Docs
  • SSL - support reading the SSL security level from an env var, allowing users to specify lower security settings Get Started
  • Credentials - only poll Credentials table when STORE_MODEL_IN_DB is True PR
  • Image URL Handling - new architecture doc on image url handling Docs
  • OpenAI - bump to pip install "openai==1.68.2" PR
  • Gunicorn - security fix - bump gunicorn==23.0.0 PR

Complete Git Diff​

Here's the complete git diff

v1.63.11-stable

Krrish Dholakia
Ishaan Jaffer

These are the changes since v1.63.2-stable.

This release is primarily focused on:

  • [Beta] Responses API Support (see the sketch below)
  • Snowflake Cortex Support, Amazon Nova Image Generation
  • UI - Credential Management, re-use credentials when adding new models
  • UI - Test Connection to LLM Provider before adding a model
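
A minimal sketch of the beta Responses API support above; it assumes the litellm.responses() entrypoint mirrors OpenAI's /v1/responses request shape:

```python
import litellm

# Assumed beta entrypoint mirroring OpenAI's /v1/responses API.
response = litellm.responses(
    model="openai/o1-pro",
    input="Write a one-line summary of this release.",
)
print(response)
```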

Known Issues​

  • 🚨 Known issue on Azure OpenAI - We don't recommend upgrading if you use Azure OpenAI. This version failed our Azure OpenAI load test

Docker Run LiteLLM Proxy​

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.11-stable

Demo Instance​

Here's a Demo Instance to test changes:

New Models / Updated Models​

  • Image Generation support for Amazon Nova Canvas (see the sketch after this list) Getting Started
  • Add pricing for Jamba new models PR
  • Add pricing for Amazon EU models PR
  • Add Bedrock Deepseek R1 model pricing PR
  • Update Gemini pricing: Gemma 3, Flash 2 thinking update, LearnLM PR
  • Mark Cohere Embedding 3 models as Multimodal PR
  • Add Azure Data Zone pricing PR
    • LiteLLM tracks cost for azure/eu and azure/us models
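
A minimal sketch of the Amazon Nova Canvas support above, assuming the usual bedrock/<model-id> naming convention:

```python
import litellm

# Assumed model ID following the bedrock/<model-id> convention.
response = litellm.image_generation(
    model="bedrock/amazon.nova-canvas-v1:0",
    prompt="A watercolor lighthouse at dawn",
)
print(response.data[0])  # image payload (URL or base64, provider-dependent)
```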

LLM Translation​

  1. New Endpoints
  2. New LLM Providers
  3. New LLM Features
  4. Bug Fixes
  • OpenAI: Return code, param and type on bad request error More information on litellm exceptions
  • Bedrock: Fix converse chunk parsing to only return empty dict on tool use PR
  • Bedrock: Support extra_headers PR
  • Azure: Fix Function Calling Bug & Update Default API Version to 2025-02-01-preview PR
  • Azure: Fix AI services URL PR
  • Vertex AI: Handle HTTP 201 status code in response PR
  • Perplexity: Fix incorrect streaming response PR
  • Triton: Fix streaming completions bug PR
  • Deepgram: Support bytes.IO when handling audio files for transcription PR
  • Ollama: Fix '"system" role has become unacceptable' error PR
  • All Providers (Streaming): Fix the literal string "data:" being stripped from streamed response content PR

Spend Tracking Improvements​

  1. Support Bedrock converse cache token tracking Getting Started
  2. Cost Tracking for Responses API Getting Started
  3. Fix Azure Whisper cost tracking Getting Started

UI​

Re-Use Credentials on UI​

You can now onboard LLM provider credentials on the LiteLLM UI. Once these credentials are added, you can re-use them when adding new models. Getting Started

Test Connections before adding models​

Before adding a model, you can test the connection to the LLM provider to verify you have set up your API Base + API Key correctly.

General UI Improvements​

  1. Add Models Page
    • Allow adding Cerebras, SambaNova, Perplexity, Fireworks, OpenRouter, TogetherAI, and Text-Completion OpenAI models on the Admin UI
    • Allow adding EU OpenAI models
    • Fix: Instantly show edits + deletes to models
  2. Keys Page
    • Fix: Instantly show newly created keys on Admin UI (don't require refresh)
    • Fix: Allow clicking into Top Keys when showing users' Top API Keys
    • Fix: Allow filtering keys by Team Alias, Key Alias, and Org
    • UI Improvements: Show 100 keys per page, use full height, increase width of key alias column
  3. Users Page
    • Fix: Show correct count of internal user keys on Users Page
    • Fix: Metadata not updating in Team UI
  4. Logs Page
    • UI Improvements: Keep expanded log in focus on LiteLLM UI
    • UI Improvements: Minor improvements to logs page
    • Fix: Allow internal user to query their own logs
    • Allow switching off storing Error Logs in DB Getting Started
  5. Sign In/Sign Out

Security​

  1. Support for Rotating Master Keys Getting Started
  2. Fix: Internal User Viewer Permissions, don't allow internal_user_viewer role to see Test Key Page or Create Key Button More information on role based access controls
  3. Emit audit logs on all user + model Create/Update/Delete endpoints Getting Started
  4. JWT
    • Support multiple JWT OIDC providers Getting Started
    • Fix JWT access with Groups not working when team is assigned All Proxy Models access
  5. Using K/V pairs in 1 AWS Secret Getting Started

Logging Integrations​

  1. Prometheus: Track Azure LLM API latency metric Getting Started
  2. Athina: Added tags, user_feedback and model_options to additional_keys which can be sent to Athina Getting Started

Performance / Reliability improvements​

  1. Redis + litellm router - Fix Redis cluster mode for litellm router PR

General Improvements​

  1. OpenWebUI Integration - display thinking tokens

Complete Git Diff​

Here's the complete git diff