--- title: "v1.74.9-stable - Auto-Router" slug: "v1-74-9" date: 2025-07-27T10:00:00 authors: - name: Krrish Dholakia title: CEO, LiteLLM url: https://www.linkedin.com/in/krish-d/ image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg - name: Ishaan Jaffer title: CTO, LiteLLM url: https://www.linkedin.com/in/reffajnaahsi/ image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg hide_table_of_contents: false --- import Image from '@theme/IdealImage'; import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ## Deploy this version ``` showLineNumbers title="docker run litellm" docker run \ -e STORE_MODEL_IN_DB=True \ -p 4000:4000 \ ghcr.io/berriai/litellm:v1.74.9-stable.patch.1 ``` ``` showLineNumbers title="pip install litellm" pip install litellm==1.74.9.post2 ``` --- ## Key Highlights - **Auto-Router** - Automatically route requests to specific models based on request content. - **Model-level Guardrails** - Only run guardrails when specific models are used. - **MCP Header Propagation** - Propagate headers from client to backend MCP. - **New LLM Providers** - Added Bedrock inpainting support and Recraft API image generation / image edits support. --- ## Auto-Router
This release introduces auto-routing to models based on request content. This means **Proxy Admins** can define a set of keywords that always route to specific models when **users** opt in to using the auto-router. This is great for internal use cases where you don't want **users** to think about which model to use - for example, routing coding requests to Claude models and ad-copy generation to GPT models.
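From the client side, opting in is just a matter of sending the request to the auto-router's model name. A minimal sketch, assuming the Proxy Admin has exposed the auto-router under the hypothetical model name `auto_router1`, with a placeholder proxy URL and key:

```python
import openai

# Point the OpenAI SDK at your LiteLLM proxy (hypothetical URL + key).
client = openai.OpenAI(api_key="sk-1234", base_url="http://localhost:4000")

# The auto-router inspects the request content and picks the underlying
# model - here a coding request, which keyword rules could send to Claude.
response = client.chat.completions.create(
    model="auto_router1",  # hypothetical auto-router name from your config
    messages=[{"role": "user", "content": "Write a Python function to parse a CSV file"}],
)
print(response.choices[0].message.content)
```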
[Read More](../../docs/proxy/auto_routing)

---

## Model-level Guardrails

This release brings model-level guardrails support to your config.yaml + UI. This is great for cases when you have an on-prem and a hosted model, and just want to prevent sending PII to the hosted model.

```yaml
model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
      api_base: https://api.anthropic.com/v1
      guardrails: ["azure-text-moderation"] # 👈 KEY CHANGE

guardrails:
  - guardrail_name: azure-text-moderation
    litellm_params:
      guardrail: azure/text_moderations
      mode: "post_call"
      api_key: os.environ/AZURE_GUARDRAIL_API_KEY
      api_base: os.environ/AZURE_GUARDRAIL_API_BASE
```
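With this config, the guardrail only runs for the model that lists it. A minimal client-side sketch (hypothetical proxy URL and key):

```python
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://localhost:4000")

# "azure-text-moderation" runs on this call because claude-sonnet-4 lists it
# under guardrails; requests to models without that entry skip the check.
response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Hi, how are you?"}],
)
print(response.choices[0].message.content)
```

Since the guardrail runs in `post_call` mode, a flagged response would typically surface to the client as an API error rather than a completion.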
[Read More](../../docs/proxy/guardrails/quick_start#model-level-guardrails)

---

## MCP Header Propagation

v1.74.9-stable allows you to propagate MCP server specific authentication headers via LiteLLM:

- Users can specify which `header_name` is propagated to which `mcp_server` via headers
- Different deployments of the same MCP server type can be added with different authentication headers

[Read More](https://docs.litellm.ai/docs/mcp#new-server-specific-auth-headers-recommended)
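A minimal sketch of what this could look like from an MCP client, using the official `mcp` Python SDK. The `x-mcp-github-authorization` header name, server name, proxy URL, and keys are all hypothetical - check the linked docs for the exact server-specific header format:

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    headers = {
        "x-litellm-api-key": "Bearer sk-1234",  # auth to the LiteLLM proxy
        # Hypothetical server-specific header: a personal access token that
        # LiteLLM propagates only to the "github" MCP server.
        "x-mcp-github-authorization": "Bearer ghp_xxx",
    }
    async with streamablehttp_client("http://localhost:4000/mcp", headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])

asyncio.run(main())
```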
---

## New Models / Updated Models

#### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| -------- | ----- | -------------- | ------------------- | -------------------- |
| Fireworks AI | `fireworks/models/kimi-k2-instruct` | 131k | $0.6 | $2.5 |
| OpenRouter | `openrouter/qwen/qwen-vl-plus` | 8192 | $0.21 | $0.63 |
| OpenRouter | `openrouter/qwen/qwen3-coder` | 8192 | $1 | $5 |
| OpenRouter | `openrouter/bytedance/ui-tars-1.5-7b` | 128k | $0.10 | $0.20 |
| Groq | `groq/qwen/qwen3-32b` | 131k | $0.29 | $0.59 |
| VertexAI | `vertex_ai/meta/llama-3.1-8b-instruct-maas` | 128k | $0.00 | $0.00 |
| VertexAI | `vertex_ai/meta/llama-3.1-405b-instruct-maas` | 128k | $5 | $16 |
| VertexAI | `vertex_ai/meta/llama-3.2-90b-vision-instruct-maas` | 128k | $0.00 | $0.00 |
| Google AI Studio | `gemini/gemini-2.0-flash-live-001` | 1,048,576 | $0.35 | $1.5 |
| Google AI Studio | `gemini/gemini-2.5-flash-lite` | 1,048,576 | $0.1 | $0.4 |
| VertexAI | `vertex_ai/gemini-2.0-flash-lite-001` | 1,048,576 | $0.35 | $1.5 |
| OpenAI | `gpt-4o-realtime-preview-2025-06-03` | 128k | $5 | $20 |

#### Features

- **[Lambda AI](../../docs/providers/lambda_ai)**
    - New LLM API provider - [PR #12817](https://github.com/BerriAI/litellm/pull/12817)
- **[Github Copilot](../../docs/providers/github_copilot)**
    - Dynamic endpoint support - [PR #12827](https://github.com/BerriAI/litellm/pull/12827)
- **[Morph](../../docs/providers/morph)**
    - New LLM API provider - [PR #12821](https://github.com/BerriAI/litellm/pull/12821)
- **[Groq](../../docs/providers/groq)**
    - Remove deprecated `groq/qwen-qwq-32b` - [PR #12831](https://github.com/BerriAI/litellm/pull/12831)
- **[Recraft](../../docs/providers/recraft)**
    - New image generation API - [PR #12832](https://github.com/BerriAI/litellm/pull/12832)
    - New image edits API - [PR #12874](https://github.com/BerriAI/litellm/pull/12874)
- **[Azure OpenAI](../../docs/providers/azure/azure)**
    - Support DefaultAzureCredential without hard-coded environment variables - [PR #12841](https://github.com/BerriAI/litellm/pull/12841)
- **[Hyperbolic](../../docs/providers/hyperbolic)**
    - New LLM API provider - [PR #12826](https://github.com/BerriAI/litellm/pull/12826)
- **[OpenAI](../../docs/providers/openai)**
    - `/realtime` API - pass through intent query param - [PR #12838](https://github.com/BerriAI/litellm/pull/12838)
- **[Bedrock](../../docs/providers/bedrock)**
    - Add inpainting support for Amazon Nova Canvas - [PR #12949](https://github.com/BerriAI/litellm/pull/12949) s/o @[SantoshDhaladhuli](https://github.com/SantoshDhaladhuli)

#### Bugs

- **Gemini ([Google AI Studio](../../docs/providers/gemini) + [VertexAI](../../docs/providers/vertex))**
    - Fix leaking file descriptor error on sync calls - [PR #12824](https://github.com/BerriAI/litellm/pull/12824)
- **IBM Watsonx**
    - Use correct parameter name for tool choice - [PR #9980](https://github.com/BerriAI/litellm/pull/9980)
- **[Anthropic](../../docs/providers/anthropic)**
    - Only show `reasoning_effort` for supported models - [PR #12847](https://github.com/BerriAI/litellm/pull/12847)
    - Handle `$id` and `$schema` in tool call requests (Anthropic API stopped accepting them) - [PR #12959](https://github.com/BerriAI/litellm/pull/12959)
- **[Openrouter](../../docs/providers/openrouter)**
    - Filter out `cache_control` flag for non-Anthropic models (allows usage with Claude Code) - [PR #12850](https://github.com/BerriAI/litellm/pull/12850)
- **[Gemini](../../docs/providers/gemini)**
    - Shorten Gemini `tool_call_id` for OpenAI compatibility - [PR #12941](https://github.com/BerriAI/litellm/pull/12941) s/o @[tonga54](https://github.com/tonga54)

---

## LLM API Endpoints

#### Features

- **[Passthrough endpoints](../../docs/pass_through/)**
    - Make key/user/team cost tracking OSS - [PR #12847](https://github.com/BerriAI/litellm/pull/12847)
- **[/v1/models](../../docs/providers/passthrough)**
    - Return fallback models as part of the API response - [PR #12811](https://github.com/BerriAI/litellm/pull/12811) s/o @[murad-khafizov](https://github.com/murad-khafizov)
- **[/vector_stores](../../docs/providers/passthrough)**
    - Make permission management OSS - [PR #12990](https://github.com/BerriAI/litellm/pull/12990)

#### Bugs

1. `/batches`
    1. Skip invalid batch during cost tracking check (previously this would stop all checks) - [PR #12782](https://github.com/BerriAI/litellm/pull/12782)
2. `/chat/completions`
    1. Fix async retryer on `.acompletion()` - [PR #12886](https://github.com/BerriAI/litellm/pull/12886)

---

## [MCP Gateway](../../docs/mcp)

#### Features

- **[Permission Management](../../docs/mcp#grouping-mcps-access-groups)**
    - Make permission management by key/team OSS - [PR #12988](https://github.com/BerriAI/litellm/pull/12988)
- **[MCP Alias](../../docs/mcp#mcp-aliases)**
    - Support MCP server aliases (useful for calling long MCP server names in Cursor) - [PR #12994](https://github.com/BerriAI/litellm/pull/12994)
- **Header Propagation**
    - Support propagating headers from the client to the backend MCP (useful for sending personal access tokens to the backend MCP) - [PR #13003](https://github.com/BerriAI/litellm/pull/13003)

---

## Management Endpoints / UI

#### Features

- **Usage**
    - Support viewing usage by model group - [PR #12890](https://github.com/BerriAI/litellm/pull/12890)
- **Virtual Keys**
    - New `key_type` field on `/key/generate` - allows specifying if a key can call LLM API vs. management routes (see the sketch after this list) - [PR #12909](https://github.com/BerriAI/litellm/pull/12909)
- **Models**
    - Add 'auto router' on UI - [PR #12960](https://github.com/BerriAI/litellm/pull/12960)
    - Show global retry policy on UI - [PR #12969](https://github.com/BerriAI/litellm/pull/12969)
    - Add model-level guardrails on create + update - [PR #13006](https://github.com/BerriAI/litellm/pull/13006)
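A minimal sketch of generating a scoped key with the new field. The `key_type` value shown is an assumption based on this release's description - check PR #12909 for the exact accepted values; the proxy URL and admin key are placeholders:

```python
import requests

# Generate a key restricted to LLM API routes (hypothetical proxy + admin key).
resp = requests.post(
    "http://localhost:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "key_alias": "inference-only-key",
        "key_type": "llm_api",  # assumed value: key can call LLM API routes only
    },
)
print(resp.json()["key"])
```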
#### Bugs

- **SSO**
    - Fix logout when SSO is enabled - [PR #12703](https://github.com/BerriAI/litellm/pull/12703)
    - Fix SSO being reset when `ui_access_mode` is updated - [PR #13011](https://github.com/BerriAI/litellm/pull/13011)
- **Guardrails**
    - Show correct guardrails when editing a team - [PR #12823](https://github.com/BerriAI/litellm/pull/12823)
- **Virtual Keys**
    - Get updated token on regenerate key - [PR #12788](https://github.com/BerriAI/litellm/pull/12788)
    - Fix CVE with key injection - [PR #12840](https://github.com/BerriAI/litellm/pull/12840)

---

## Logging / Guardrail Integrations

#### Features

- **[Google Cloud Model Armor](../../docs/proxy/guardrails/model_armor)**
    - Document new guardrail - [PR #12492](https://github.com/BerriAI/litellm/pull/12492)
- **[Pillar Security](../../docs/proxy/guardrails/pillar_security)**
    - New LLM Guardrail - [PR #12791](https://github.com/BerriAI/litellm/pull/12791)
- **CloudZero**
    - Allow exporting spend to CloudZero - [PR #12908](https://github.com/BerriAI/litellm/pull/12908)
- **Model-level Guardrails**
    - Support model-level guardrails - [PR #12968](https://github.com/BerriAI/litellm/pull/12968)

#### Bugs

- **[Prometheus](../../docs/proxy/prometheus)**
    - Fix `[tag]=false` when tag is set for tag-based metrics - [PR #12916](https://github.com/BerriAI/litellm/pull/12916)
- **[Guardrails AI](../../docs/proxy/guardrails/guardrails_ai)**
    - Use `validatedOutput` to allow usage of "fix" guards - [PR #12891](https://github.com/BerriAI/litellm/pull/12891) s/o @[DmitriyAlergant](https://github.com/DmitriyAlergant)

---

## Performance / Loadbalancing / Reliability improvements

#### Features

- **[Auto-Router](../../docs/proxy/auto_routing)**
    - New auto-router powered by `semantic-router` - [PR #12955](https://github.com/BerriAI/litellm/pull/12955)

#### Bugs

- **forward_clientside_headers**
    - Filter out `content-length` from headers (caused backend requests to hang) - [PR #12886](https://github.com/BerriAI/litellm/pull/12886/files)
- **Message Redaction**
    - Fix "cannot pickle coroutine object" error - [PR #13005](https://github.com/BerriAI/litellm/pull/13005)

---

## General Proxy Improvements

#### Features

- **Benchmarks**
    - Updated litellm proxy benchmarks (p50, p90, p99 overhead) - [PR #12842](https://github.com/BerriAI/litellm/pull/12842)
- **Request Headers**
    - Added new `x-litellm-num-retries` request header (see the sketch after this list)
- **Swagger**
    - Support local swagger on custom root paths - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
- **Health**
    - Track cost + add tags for health checks done by LiteLLM Proxy - [PR #12880](https://github.com/BerriAI/litellm/pull/12880)
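A minimal sketch of setting per-request retries via the new header. The proxy URL, key, and model name are placeholders, and it is an assumption that the header value is the retry count for this request:

```python
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://localhost:4000")

# Assumption: the header value is the number of retries the proxy should
# attempt for this request.
response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={"x-litellm-num-retries": "2"},
)
print(response.choices[0].message.content)
```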
#### Bugs

- **Proxy Startup**
    - Fix issue where a team member budget of None would block startup - [PR #12843](https://github.com/BerriAI/litellm/pull/12843)
- **Docker**
    - Move non-root Docker image to Chainguard base image (fewer vulnerabilities) - [PR #12707](https://github.com/BerriAI/litellm/pull/12707)
    - Add `azure-keyvault==4.2.0` to the Docker image - [PR #12873](https://github.com/BerriAI/litellm/pull/12873)
- **Separate Health App**
    - Pass through cmd args via supervisord (enables user config to still work via Docker) - [PR #12871](https://github.com/BerriAI/litellm/pull/12871)
- **Swagger**
    - Bump DOMPurify version (fixes vulnerability) - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
    - Add back local swagger bundle (enables swagger to work in air-gapped environments) - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
- **Request Headers**
    - Make `user_header_name` field check case insensitive (fixes customer budget enforcement for Open WebUI) - [PR #12950](https://github.com/BerriAI/litellm/pull/12950)
- **SpendLogs**
    - Fix issues writing to DB when `custom_llm_provider` is None - [PR #13001](https://github.com/BerriAI/litellm/pull/13001)

---

## New Contributors

* @magicalne made their first contribution in https://github.com/BerriAI/litellm/pull/12804
* @pavangudiwada made their first contribution in https://github.com/BerriAI/litellm/pull/12798
* @mdiloreto made their first contribution in https://github.com/BerriAI/litellm/pull/12707
* @murad-khafizov made their first contribution in https://github.com/BerriAI/litellm/pull/12811
* @eagle-p made their first contribution in https://github.com/BerriAI/litellm/pull/12791
* @apoorv-sharma made their first contribution in https://github.com/BerriAI/litellm/pull/12920
* @SantoshDhaladhuli made their first contribution in https://github.com/BerriAI/litellm/pull/12949
* @tonga54 made their first contribution in https://github.com/BerriAI/litellm/pull/12941
* @sings-to-bees-on-wednesdays made their first contribution in https://github.com/BerriAI/litellm/pull/12950

## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.74.7-stable...v1.74.9.rc-draft)**