---
title: "v1.74.0-stable"
slug: "v1-74-0-stable"
date: 2025-07-05T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

```bash
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.0-stable
```

</TabItem>
<TabItem value="pip" label="Pip">

```bash
pip install litellm==1.74.0.post2
```

</TabItem>
</Tabs>

## Key Highlights

- **MCP Gateway: Namespaced MCP Servers** - Clients connecting to LiteLLM can now specify which MCP servers to use.
- **Key/Team Based Logging on UI** - Proxy Admins can configure team or key-based logging settings directly in the UI.
- **Azure Content Safety Guardrails** - Added support for prompt injection and text moderation with Azure Content Safety Guardrails.
- **VertexAI DeepSeek Models** - Support for calling VertexAI DeepSeek models with LiteLLM's `/chat/completions` or `/responses` API.
- **GitHub Copilot API** - You can now use GitHub Copilot as an LLM API provider (see the sketch below).
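
A minimal sketch of the GitHub Copilot highlight, calling it through the proxy's chat completions endpoint. Note that the `github_copilot/gpt-4o` model name is an assumption following LiteLLM's usual `<provider>/<model>` naming convention; check the provider docs for exact model names:

```bash
# Hedged sketch: the "github_copilot/<model>" prefix is assumed, not confirmed by these notes.
curl -X POST '<your-litellm-proxy-base-url>/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $LITELLM_API_KEY" \
--data '{
    "model": "github_copilot/gpt-4o",
    "messages": [{"role": "user", "content": "Hello from LiteLLM"}]
}'
```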

## MCP Gateway: Namespaced MCP Servers

This release brings support for namespacing MCP servers on the LiteLLM MCP Gateway: you can pass the `x-mcp-servers` header to control which servers a request can list and call tools from.

This is useful when you want to point MCP clients at specific MCP servers on LiteLLM.

### Usage

Pass the `x-mcp-servers` header with the names of the servers a request may use. Here, the OpenAI Responses API is pointed at LiteLLM's MCP Gateway as a remote MCP server:

```bash
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
    "model": "gpt-4o",
    "tools": [
        {
            "type": "mcp",
            "server_label": "litellm",
            "server_url": "<your-litellm-proxy-base-url>/mcp",
            "require_approval": "never",
            "headers": {
                "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
                "x-mcp-servers": "Zapier_Gmail"
            }
        }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
}'
```

In this example, the request will only have access to tools from the "Zapier_Gmail" MCP server.

You can also call the LiteLLM proxy's Responses API directly, and pass multiple servers as a comma-separated list:

```bash
curl --location '<your-litellm-proxy-base-url>/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $LITELLM_API_KEY" \
--data '{
    "model": "gpt-4o",
    "tools": [
        {
            "type": "mcp",
            "server_label": "litellm",
            "server_url": "<your-litellm-proxy-base-url>/mcp",
            "require_approval": "never",
            "headers": {
                "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
                "x-mcp-servers": "Zapier_Gmail,Server2"
            }
        }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
}'
```

This configuration restricts the request to tools from the specified MCP servers only.

For Cursor IDE, add the same header to your MCP settings:

```json
{
  "mcpServers": {
    "LiteLLM": {
      "url": "<your-litellm-proxy-base-url>/mcp",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY",
        "x-mcp-servers": "Zapier_Gmail,Server2"
      }
    }
  }
}
```

This configuration in Cursor IDE settings limits tool access to the specified MCP servers only.

## Team / Key Based Logging on UI

<Image img={require('../../img/release_notes/team_key_logging.png')} style={{width: '100%', display: 'block', margin: '2rem auto'}} />


This release brings support for Proxy Admins to configure Team/Key Based Logging Settings directly in the UI. This allows routing LLM request/response logs to different Langfuse/Arize projects based on the team or key.

For developers using LiteLLM, logs are automatically routed to their specific Arize/Langfuse projects. With this release, we support the following integrations for key/team based logging:

- `langfuse`
- `arize`
- `langsmith`
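
If you prefer to script the same configuration the UI manages, here is a sketch against the proxy's team callback endpoint; the endpoint shape and `callback_vars` names follow LiteLLM's team-based logging docs, and the team id, key, and Langfuse credentials below are placeholders:

```bash
# Hedged sketch: attach a Langfuse logging callback to a specific team.
# Replace YOUR_TEAM_ID, the proxy admin key, and the Langfuse credentials with your own values.
curl -X POST '<your-litellm-proxy-base-url>/team/YOUR_TEAM_ID/callback' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $LITELLM_API_KEY" \
--data '{
    "callback_name": "langfuse",
    "callback_type": "success",
    "callback_vars": {
        "langfuse_public_key": "pk-lf-...",
        "langfuse_secret_key": "sk-lf-...",
        "langfuse_host": "https://cloud.langfuse.com"
    }
}'
```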

## Azure Content Safety Guardrails

<Image img={require('../../img/azure_content_safety_guardrails.jpg')} style={{width: '100%', display: 'block', margin: '2rem auto'}} />


LiteLLM now supports Azure Content Safety Guardrails for Prompt Injection and Text Moderation. This is great for internal chat-ui use cases: you can create guardrails that detect Azure's harm categories, specify custom severity thresholds, and run them across 100+ LLMs for just that use case (or across all your calls).

Get Started
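
As a sketch of what this can look like in your proxy `config.yaml` - the `azure/prompt_shield` and `azure/text_moderations` identifiers and the `severity_threshold` option are assumptions based on LiteLLM's standard guardrails config format, so check the docs linked above for exact names:

```yaml
guardrails:
  - guardrail_name: "azure-prompt-injection"
    litellm_params:
      guardrail: azure/prompt_shield          # assumed identifier for prompt injection detection
      mode: pre_call                          # run before the LLM call
      api_key: os.environ/AZURE_CONTENT_SAFETY_API_KEY
      api_base: os.environ/AZURE_CONTENT_SAFETY_API_BASE
  - guardrail_name: "azure-text-moderation"
    litellm_params:
      guardrail: azure/text_moderations       # assumed identifier for text moderation
      mode: post_call                         # run on the LLM response
      api_key: os.environ/AZURE_CONTENT_SAFETY_API_KEY
      api_base: os.environ/AZURE_CONTENT_SAFETY_API_BASE
      severity_threshold: 2                   # assumed knob for Azure harm-category severity
```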

## Python SDK: 2.3 Seconds Faster Import Times

This release brings significant performance improvements to the Python SDK, with import times reduced by 2.3 seconds. We've refactored the initialization process to cut startup overhead, a major improvement for applications that need to initialize LiteLLM quickly.
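
To sanity-check the import-time improvement in your own environment, you can time a cold import using only the Python standard library:

```bash
python -c "import time; t0 = time.perf_counter(); import litellm; print(f'litellm import: {time.perf_counter() - t0:.2f}s')"
```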


## New Models / Updated Models

#### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type |
| -------- | ----- | -------------- | ------------------- | -------------------- | ---- |
| Watsonx | `watsonx/mistralai/mistral-large` | 131k | $3.00 | $10.00 | New |
| Azure AI | `azure_ai/cohere-rerank-v3.5` | 4k | $2.00/1k queries | - | New (Rerank) |

#### Features

#### Bugs

- **Mistral**
    - Fix `transform_response` handling for empty string content - PR
    - Move Mistral to use `llm_http_handler` - PR
- **Gemini**
    - Fix tool call sequence - PR
    - Fix custom `api_base` path preservation - PR
- **Anthropic**
    - Fix `user_id` validation logic - PR
- **Bedrock**
    - Support optional args for Bedrock - PR
- **Ollama**
    - Fix default parameters for `ollama-chat` - PR
- **VLLM**
    - Add `audio_url` message type support - PR

## LLM API Endpoints

#### Features

- **`/batches`**
    - Support batch retrieve with target model query param - PR
    - Anthropic completion bridge improvements - PR
- **`/responses`**
    - Azure Responses API bridge improvements - PR
    - Fix Responses API error handling - PR
- **`/mcp` (MCP Gateway)**
    - Add MCP URL masking on frontend - PR
    - Add MCP servers header to scope - PR
    - Add LiteLLM MCP tool prefix - PR
    - Segregate MCP tools on connections using headers - PR
    - MCP URL wrapping changes - PR

#### Bugs

- **`/v1/messages`**
    - Remove hardcoded model name on streaming - PR
    - Support lowest-latency routing - PR
    - Return token usage for non-Anthropic models - PR
- **`/chat/completions`**
    - Support Cursor IDE `tool_choice` format `{"type": "auto"}` - PR
- **`/generateContent`**
    - Allow passing `litellm_params` - PR
    - Only pass supported params when using OpenAI models - PR
    - Fix using gemini-cli with Vertex Anthropic models - PR
- **Streaming**
    - Fix `Error code: 307` for LlamaAPI streaming chat - PR
    - Store finish reason even if `is_finished` - PR

## Spend Tracking / Budget Improvements

#### Bugs

- Allow strings in cost calculation - PR
- Fix VertexAI Anthropic streaming cost tracking with prompt caching - PR

## Management Endpoints / UI

#### Bugs

- **Team Management**
    - Prevent team model reset on model add - PR
    - Return team-only models on `/v2/model/info` - PR
    - Render team member budget correctly - PR
- **UI Rendering**
    - Fix rendering UI on non-root images - PR
    - Correctly display 'Internal Viewer' user role - PR
- **Configuration**
    - Handle empty `config.yaml` - PR
    - Fix Gemini `/models` - replace `models/` prefix as expected - PR

#### Features

- **Team Management**
    - Allow adding team-specific logging callbacks - PR
    - Add Arize team-based logging - PR
    - Allow viewing/editing team-based callbacks - PR
- **UI Improvements**
    - Comma-separated spend and budget display - PR
    - Add logos to callback list - PR
- **CLI**
    - Add `litellm-proxy` CLI login for getting started with the LiteLLM proxy - PR
- **Email Templates**
    - Customizable email template - subject and signature - PR

## Logging / Guardrail Integrations

#### Features

#### Bugs

- **Security**
    - Ensure only LLM API route failures get logged on Langfuse - PR
- **OpenMeter**
    - Fix integration error handling - PR
- **Message Redaction**
    - Ensure message redaction works for Responses API logging - PR
- **Bedrock Guardrails**
    - Fix Bedrock guardrails `post_call` for streaming responses - PR

## Performance / Loadbalancing / Reliability Improvements

#### Features

- **Python SDK**
    - 2 seconds faster import times - PR
    - Reduce Python SDK import time by 0.3s - PR
- **Error Handling**
    - Add error handling for MCP tools not found or invalid server - PR
- **SSL/TLS**
    - Fix SSL certificate error - PR
    - Fix custom CA bundle support in aiohttp transport - PR

## General Proxy Improvements

- **Startup**
    - Add new banner on startup - PR
- **Dependencies**
    - Update pydantic version - PR

## New Contributors

## Git Diff