---
title: "v1.74.9-stable - Auto-Router"
slug: "v1-74-9"
date: 2025-07-27T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
hide_table_of_contents: false
---
import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Deploy this version
<Tabs>
<TabItem value="docker" label="Docker">
``` showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.9-stable.patch.1
```
</TabItem>
<TabItem value="pip" label="Pip">
``` showLineNumbers title="pip install litellm"
pip install litellm==1.74.9.post2
```
</TabItem>
</Tabs>
---
## Key Highlights
- **Auto-Router** - Automatically route requests to specific models based on request content.
- **Model-level Guardrails** - Only run guardrails when specific models are used.
- **MCP Header Propagation** - Propagate headers from client to backend MCP.
- **New LLM Providers** - Added Bedrock inpainting support and Recraft API image generation / image edits support.
---
## Auto-Router
<Image img={require('../../img/release_notes/auto_router.png')} />
<br/>
This release introduces auto-routing to models based on request content. This means **Proxy Admins** can define a set of keywords that always route to specific models when **users** opt in to using the auto-router.
This is great for internal use cases where you don't want **users** to think about which model to use - for example, using Claude models for coding vs. GPT models for generating ad copy.
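Below is a minimal sketch of what opting in looks like from the user side, assuming the Proxy Admin has registered an auto-router deployment named `auto-router-1` (the name and request are illustrative; the actual keyword-to-model configuration is covered in the docs linked below):

```shell showLineNumbers title="curl /chat/completions (illustrative)"
# Hypothetical setup: "auto-router-1" is whatever name the Proxy Admin gave
# the auto-router deployment in config.yaml. The user opts in by sending that
# name as the model; LiteLLM then picks the underlying model based on the
# request content.
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto-router-1",
    "messages": [{"role": "user", "content": "Write a Python function that parses a CSV file"}]
  }'
```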
[Read More](../../docs/proxy/auto_routing)
---
## Model-level Guardrails
<Image img={require('../../img/release_notes/model_level_guardrails.jpg')} />
<br/>
This release brings model-level guardrails support to your config.yaml + UI. This is great for cases where you have both an on-prem and a hosted model, and just want to prevent sending PII to the hosted model.
```yaml showLineNumbers title="config.yaml"
model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
      api_base: https://api.anthropic.com/v1
      guardrails: ["azure-text-moderation"] # 👈 KEY CHANGE

guardrails:
  - guardrail_name: azure-text-moderation
    litellm_params:
      guardrail: azure/text_moderations
      mode: "post_call"
      api_key: os.environ/AZURE_GUARDRAIL_API_KEY
      api_base: os.environ/AZURE_GUARDRAIL_API_BASE
```
[Read More](../../docs/proxy/guardrails/quick_start#model-level-guardrails)
---
## MCP Header Propagation
<Image img={require('../../img/release_notes/mcp_header_propogation.png')} />
<br/>
v1.74.9-stable allows you to propagate MCP server-specific authentication headers via LiteLLM:
- Users can specify which `header_name` is propagated to which `mcp_server` via request headers
- Different deployments of the same MCP server type can use different authentication headers
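As a rough sketch (the server alias `github`, endpoint path, and token below are illustrative assumptions; the exact header convention is described in the docs linked below):

```shell showLineNumbers title="MCP call with a server-specific auth header (illustrative)"
# Assumption: an MCP server is registered on the proxy under the alias "github",
# and headers of the form x-mcp-<server_alias>-<header_name> are forwarded to
# that server as <header_name>.
curl http://localhost:4000/mcp \
  -H "Authorization: Bearer sk-1234" \
  -H "x-mcp-github-authorization: Bearer ghp_xxxxxxxx"
```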
[Read More](https://docs.litellm.ai/docs/mcp#new-server-specific-auth-headers-recommended)
---
## New Models / Updated Models
#### Pricing / Context Window Updates
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- |
| Fireworks AI | `fireworks/models/kimi-k2-instruct` | 131k | $0.60 | $2.50 |
| OpenRouter | `openrouter/qwen/qwen-vl-plus` | 8,192 | $0.21 | $0.63 |
| OpenRouter | `openrouter/qwen/qwen3-coder` | 8,192 | $1.00 | $5.00 |
| OpenRouter | `openrouter/bytedance/ui-tars-1.5-7b` | 128k | $0.10 | $0.20 |
| Groq | `groq/qwen/qwen3-32b` | 131k | $0.29 | $0.59 |
| VertexAI | `vertex_ai/meta/llama-3.1-8b-instruct-maas` | 128k | $0.00 | $0.00 |
| VertexAI | `vertex_ai/meta/llama-3.1-405b-instruct-maas` | 128k | $5.00 | $16.00 |
| VertexAI | `vertex_ai/meta/llama-3.2-90b-vision-instruct-maas` | 128k | $0.00 | $0.00 |
| Google AI Studio | `gemini/gemini-2.0-flash-live-001` | 1,048,576 | $0.35 | $1.50 |
| Google AI Studio | `gemini/gemini-2.5-flash-lite` | 1,048,576 | $0.10 | $0.40 |
| VertexAI | `vertex_ai/gemini-2.0-flash-lite-001` | 1,048,576 | $0.35 | $1.50 |
| OpenAI | `gpt-4o-realtime-preview-2025-06-03` | 128k | $5.00 | $20.00 |
#### Features
- **[Lambda AI](../../docs/providers/lambda_ai)**
    - New LLM API provider - [PR #12817](https://github.com/BerriAI/litellm/pull/12817)
- **[Github Copilot](../../docs/providers/github_copilot)**
    - Dynamic endpoint support - [PR #12827](https://github.com/BerriAI/litellm/pull/12827)
- **[Morph](../../docs/providers/morph)**
    - New LLM API provider - [PR #12821](https://github.com/BerriAI/litellm/pull/12821)
- **[Groq](../../docs/providers/groq)**
    - Remove deprecated groq/qwen-qwq-32b - [PR #12831](https://github.com/BerriAI/litellm/pull/12831)
- **[Recraft](../../docs/providers/recraft)**
    - New image generation API - [PR #12832](https://github.com/BerriAI/litellm/pull/12832)
    - New image edits API - [PR #12874](https://github.com/BerriAI/litellm/pull/12874)
- **[Azure OpenAI](../../docs/providers/azure/azure)**
    - Support DefaultAzureCredential without hard-coded environment variables - [PR #12841](https://github.com/BerriAI/litellm/pull/12841)
- **[Hyperbolic](../../docs/providers/hyperbolic)**
    - New LLM API provider - [PR #12826](https://github.com/BerriAI/litellm/pull/12826)
- **[OpenAI](../../docs/providers/openai)**
    - `/realtime` API - pass through intent query param - [PR #12838](https://github.com/BerriAI/litellm/pull/12838)
- **[Bedrock](../../docs/providers/bedrock)**
    - Add inpainting support for Amazon Nova Canvas - [PR #12949](https://github.com/BerriAI/litellm/pull/12949) s/o @[SantoshDhaladhuli](https://github.com/SantoshDhaladhuli)
#### Bugs
- **Gemini ([Google AI Studio](../../docs/providers/gemini) + [VertexAI](../../docs/providers/vertex))**
    - Fix leaking file descriptor error on sync calls - [PR #12824](https://github.com/BerriAI/litellm/pull/12824)
- **IBM Watsonx**
    - Use correct parameter name for tool choice - [PR #9980](https://github.com/BerriAI/litellm/pull/9980)
- **[Anthropic](../../docs/providers/anthropic)**
    - Only show reasoning_effort for supported models - [PR #12847](https://github.com/BerriAI/litellm/pull/12847)
    - Handle $id and $schema in tool call requests (Anthropic API stopped accepting them) - [PR #12959](https://github.com/BerriAI/litellm/pull/12959)
- **[OpenRouter](../../docs/providers/openrouter)**
    - Filter out `cache_control` flag for non-Anthropic models (allows usage with Claude Code) - [PR #12850](https://github.com/BerriAI/litellm/pull/12850)
- **[Gemini](../../docs/providers/gemini)**
    - Shorten Gemini tool_call_id for OpenAI compatibility - [PR #12941](https://github.com/BerriAI/litellm/pull/12941) s/o @[tonga54](https://github.com/tonga54)
---
## LLM API Endpoints
#### Features
- **[Passthrough endpoints](../../docs/pass_through/)**
    - Make key/user/team cost tracking OSS - [PR #12847](https://github.com/BerriAI/litellm/pull/12847)
- **[/v1/models](../../docs/providers/passthrough)**
    - Return fallback models as part of API response - [PR #12811](https://github.com/BerriAI/litellm/pull/12811) s/o @[murad-khafizov](https://github.com/murad-khafizov)
- **[/vector_stores](../../docs/providers/passthrough)**
    - Make permission management OSS - [PR #12990](https://github.com/BerriAI/litellm/pull/12990)
#### Bugs
1. `/batches`
    1. Skip invalid batch during cost tracking check (previously one invalid batch would stop all checks) - [PR #12782](https://github.com/BerriAI/litellm/pull/12782)
2. `/chat/completions`
    1. Fix async retryer on `.acompletion()` - [PR #12886](https://github.com/BerriAI/litellm/pull/12886)
---
## [MCP Gateway](../../docs/mcp)
#### Features
- **[Permission Management](../../docs/mcp#grouping-mcps-access-groups)**
    - Make permission management by key/team OSS - [PR #12988](https://github.com/BerriAI/litellm/pull/12988)
- **[MCP Alias](../../docs/mcp#mcp-aliases)**
    - Support MCP server aliases (useful for calling MCP servers with long names from Cursor) - [PR #12994](https://github.com/BerriAI/litellm/pull/12994)
- **Header Propagation**
    - Support propagating headers from client to backend MCP (useful for sending personal access tokens to backend MCP) - [PR #13003](https://github.com/BerriAI/litellm/pull/13003)
---
## Management Endpoints / UI
#### Features
- **Usage**
    - Support viewing usage by model group - [PR #12890](https://github.com/BerriAI/litellm/pull/12890)
- **Virtual Keys**
    - New `key_type` field on `/key/generate` - allows specifying if a key can call LLM API vs. Management routes (see the sketch after this list) - [PR #12909](https://github.com/BerriAI/litellm/pull/12909)
- **Models**
    - Add auto router on UI - [PR #12960](https://github.com/BerriAI/litellm/pull/12960)
    - Show global retry policy on UI - [PR #12969](https://github.com/BerriAI/litellm/pull/12969)
    - Add model-level guardrails on create + update - [PR #13006](https://github.com/BerriAI/litellm/pull/13006)
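A sketch of the new `key_type` field (the value `"llm_api"` is an assumption for illustration; see PR #12909 for the accepted values):

```shell showLineNumbers title="key generate with key_type (illustrative)"
# Generate a key that can only call LLM API routes, not Management routes.
# "llm_api" is an assumed value - consult PR #12909 / the docs for exact values.
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"key_type": "llm_api"}'
```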
#### Bugs
- **SSO**
    - Fix logout when SSO is enabled - [PR #12703](https://github.com/BerriAI/litellm/pull/12703)
    - Fix SSO being reset when `ui_access_mode` is updated - [PR #13011](https://github.com/BerriAI/litellm/pull/13011)
- **Guardrails**
    - Show correct guardrails when editing a team - [PR #12823](https://github.com/BerriAI/litellm/pull/12823)
- **Virtual Keys**
    - Get updated token on regenerate key - [PR #12788](https://github.com/BerriAI/litellm/pull/12788)
    - Fix CVE with key injection - [PR #12840](https://github.com/BerriAI/litellm/pull/12840)
---
## Logging / Guardrail Integrations
#### Features
- **[Google Cloud Model Armor](../../docs/proxy/guardrails/model_armor)**
    - Document new guardrail - [PR #12492](https://github.com/BerriAI/litellm/pull/12492)
- **[Pillar Security](../../docs/proxy/guardrails/pillar_security)**
    - New LLM Guardrail - [PR #12791](https://github.com/BerriAI/litellm/pull/12791)
- **CloudZero**
    - Allow exporting spend to CloudZero - [PR #12908](https://github.com/BerriAI/litellm/pull/12908)
- **Model-level Guardrails**
    - Support model-level guardrails - [PR #12968](https://github.com/BerriAI/litellm/pull/12968)
#### Bugs
- **[Prometheus](../../docs/proxy/prometheus)**
    - Fix `[tag]=false` when tag is set for tag-based metrics - [PR #12916](https://github.com/BerriAI/litellm/pull/12916)
- **[Guardrails AI](../../docs/proxy/guardrails/guardrails_ai)**
    - Use validatedOutput to allow usage of “fix” guards - [PR #12891](https://github.com/BerriAI/litellm/pull/12891) s/o @[DmitriyAlergant](https://github.com/DmitriyAlergant)
---
## Performance / Loadbalancing / Reliability improvements
#### Features
- **[Auto-Router](../../docs/proxy/auto_routing)**
    - New auto-router powered by `semantic-router` - [PR #12955](https://github.com/BerriAI/litellm/pull/12955)
#### Bugs
- **forward_clientside_headers**
    - Filter out `content-length` from headers (caused backend requests to hang) - [PR #12886](https://github.com/BerriAI/litellm/pull/12886/files)
- **Message Redaction**
    - Fix cannot pickle coroutine object error - [PR #13005](https://github.com/BerriAI/litellm/pull/13005)
---
## General Proxy Improvements
#### Features
- **Benchmarks**
    - Updated LiteLLM proxy benchmarks (p50, p90, p99 overhead) - [PR #12842](https://github.com/BerriAI/litellm/pull/12842)
- **Request Headers**
    - Added new `x-litellm-num-retries` request header (see the sketch after this list)
- **Swagger**
    - Support local Swagger on custom root paths - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
- **Health**
    - Track cost + add tags for health checks done by LiteLLM Proxy - [PR #12880](https://github.com/BerriAI/litellm/pull/12880)
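A minimal sketch of the new retry header, assuming it takes an integer retry count for that request:

```shell showLineNumbers title="per-request retries (illustrative)"
# Assumption: x-litellm-num-retries sets the number of retries for this request.
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "x-litellm-num-retries: 3" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}'
```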
#### Bugs
- **Proxy Startup**
    - Fix startup issue where a team member budget of None would block startup - [PR #12843](https://github.com/BerriAI/litellm/pull/12843)
- **Docker**
    - Move non-root Docker build to Chainguard base image (fewer vulnerabilities) - [PR #12707](https://github.com/BerriAI/litellm/pull/12707)
    - Add azure-keyvault==4.2.0 to the Docker image - [PR #12873](https://github.com/BerriAI/litellm/pull/12873)
- **Separate Health App**
    - Pass through cmd args via supervisord (enables user config to still work via Docker) - [PR #12871](https://github.com/BerriAI/litellm/pull/12871)
- **Swagger**
    - Bump DOMPurify version (fixes vulnerability) - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
    - Add back local Swagger bundle (enables Swagger to work in air-gapped environments) - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
- **Request Headers**
    - Make `user_header_name` field check case-insensitive (fixes customer budget enforcement for OpenWebUI) - [PR #12950](https://github.com/BerriAI/litellm/pull/12950)
- **SpendLogs**
    - Fix issues writing to DB when `custom_llm_provider` is None - [PR #13001](https://github.com/BerriAI/litellm/pull/13001)
---
## New Contributors
* @magicalne made their first contribution in https://github.com/BerriAI/litellm/pull/12804
* @pavangudiwada made their first contribution in https://github.com/BerriAI/litellm/pull/12798
* @mdiloreto made their first contribution in https://github.com/BerriAI/litellm/pull/12707
* @murad-khafizov made their first contribution in https://github.com/BerriAI/litellm/pull/12811
* @eagle-p made their first contribution in https://github.com/BerriAI/litellm/pull/12791
* @apoorv-sharma made their first contribution in https://github.com/BerriAI/litellm/pull/12920
* @SantoshDhaladhuli made their first contribution in https://github.com/BerriAI/litellm/pull/12949
* @tonga54 made their first contribution in https://github.com/BerriAI/litellm/pull/12941
* @sings-to-bees-on-wednesdays made their first contribution in https://github.com/BerriAI/litellm/pull/12950
## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.74.7-stable...v1.74.9.rc-draft)**