---
title: "v1.74.9-stable - Auto-Router"
slug: "v1-74-9"
date: 2025-07-27T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

```showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.9-stable.patch.1
```

</TabItem>

<TabItem value="pip" label="Pip">

```showLineNumbers title="pip install litellm"
pip install litellm==1.74.9.post2
```

</TabItem>
</Tabs>

---

## Key Highlights

- **Auto-Router** - Automatically route requests to specific models based on request content.
- **Model-level Guardrails** - Only run guardrails when specific models are used.
- **MCP Header Propagation** - Propagate headers from the client to the backend MCP server.
- **New LLM Providers** - Added Bedrock (Amazon Nova Canvas) inpainting support and Recraft API image generation / image edit support.

---

## Auto-Router

<Image img={require('../../img/release_notes/auto_router.png')} />

<br/>

This release introduces auto-routing to models based on request content. This means **Proxy Admins** can define a set of keywords that always route to specific models when **users** opt in to using the auto-router.

This is great for internal use cases where you don't want **users** to think about which model to use - for example, using Claude models for coding and GPT models for generating ad copy.

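For a concrete sense of the shape this takes, here is a minimal config sketch. The model names, file path, and parameter names are illustrative assumptions; the routing rules themselves live in a `semantic-router` config file that maps example utterances to model names (see the docs linked below for the exact schema):

```yaml
model_list:
  # Embedding model the auto-router uses to classify request content
  # (illustrative deployment).
  - model_name: custom-text-embedding-model
    litellm_params:
      model: openai/text-embedding-3-large
      api_key: os.environ/OPENAI_API_KEY

  # The auto-router itself: users send model="auto_router1" to opt in,
  # and LiteLLM picks the underlying model based on request content.
  - model_name: auto_router1
    litellm_params:
      model: auto_router/auto_router_1
      auto_router_config_path: auto_router.json   # semantic-router routes file (assumed path)
      auto_router_default_model: gpt-4o-mini      # fallback when no keyword matches
      auto_router_embedding_model: custom-text-embedding-model
```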

[Read More](../../docs/proxy/auto_routing)

---

## Model-level Guardrails

<Image img={require('../../img/release_notes/model_level_guardrails.jpg')} />

<br/>

This release brings model-level guardrail support to your config.yaml + UI. This is great for cases where you have an on-prem and a hosted model, and you just want to prevent sending PII to the hosted model.

```yaml
model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
      api_base: https://api.anthropic.com/v1
      guardrails: ["azure-text-moderation"] # 👈 KEY CHANGE

guardrails:
  - guardrail_name: azure-text-moderation
    litellm_params:
      guardrail: azure/text_moderations
      mode: "post_call"
      api_key: os.environ/AZURE_GUARDRAIL_API_KEY
      api_base: os.environ/AZURE_GUARDRAIL_API_BASE
```

[Read More](../../docs/proxy/guardrails/quick_start#model-level-guardrails)

---

## MCP Header Propagation

<Image img={require('../../img/release_notes/mcp_header_propogation.png')} />

<br/>

v1.74.9-stable allows you to propagate MCP server-specific authentication headers via LiteLLM:

- Users can specify which `header_name` is propagated to which `mcp_server` via request headers (see the sketch below).
- Different deployments of the same MCP server type can use different authentication headers.
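As a rough sketch of how this fits together (the server names and URLs are made up, and the exact header naming scheme is an assumption; the linked docs below are authoritative):

```yaml
# Two deployments of the same MCP server type, registered under
# different names (illustrative config).
mcp_servers:
  github_prod:
    url: "https://api.githubcopilot.com/mcp"
  github_staging:
    url: "https://mcp-staging.example.com/mcp"   # hypothetical second deployment

# A client can then scope auth headers per server, e.g.:
#   x-mcp-github_prod-authorization: Bearer <prod-token>
#   x-mcp-github_staging-authorization: Bearer <staging-token>
# so each token is forwarded only to its matching backend server.
```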

[Read More](https://docs.litellm.ai/docs/mcp#new-server-specific-auth-headers-recommended)

---

## New Models / Updated Models

#### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| -------- | ----- | -------------- | ------------------- | -------------------- |
| Fireworks AI | `fireworks/models/kimi-k2-instruct` | 131k | $0.60 | $2.50 |
| OpenRouter | `openrouter/qwen/qwen-vl-plus` | 8,192 | $0.21 | $0.63 |
| OpenRouter | `openrouter/qwen/qwen3-coder` | 8,192 | $1.00 | $5.00 |
| OpenRouter | `openrouter/bytedance/ui-tars-1.5-7b` | 128k | $0.10 | $0.20 |
| Groq | `groq/qwen/qwen3-32b` | 131k | $0.29 | $0.59 |
| VertexAI | `vertex_ai/meta/llama-3.1-8b-instruct-maas` | 128k | $0.00 | $0.00 |
| VertexAI | `vertex_ai/meta/llama-3.1-405b-instruct-maas` | 128k | $5.00 | $16.00 |
| VertexAI | `vertex_ai/meta/llama-3.2-90b-vision-instruct-maas` | 128k | $0.00 | $0.00 |
| Google AI Studio | `gemini/gemini-2.0-flash-live-001` | 1,048,576 | $0.35 | $1.50 |
| Google AI Studio | `gemini/gemini-2.5-flash-lite` | 1,048,576 | $0.10 | $0.40 |
| VertexAI | `vertex_ai/gemini-2.0-flash-lite-001` | 1,048,576 | $0.35 | $1.50 |
| OpenAI | `gpt-4o-realtime-preview-2025-06-03` | 128k | $5.00 | $20.00 |

#### Features

- **[Lambda AI](../../docs/providers/lambda_ai)**
    - New LLM API provider - [PR #12817](https://github.com/BerriAI/litellm/pull/12817)
- **[GitHub Copilot](../../docs/providers/github_copilot)**
    - Dynamic endpoint support - [PR #12827](https://github.com/BerriAI/litellm/pull/12827)
- **[Morph](../../docs/providers/morph)**
    - New LLM API provider - [PR #12821](https://github.com/BerriAI/litellm/pull/12821)
- **[Groq](../../docs/providers/groq)**
    - Remove deprecated `groq/qwen-qwq-32b` - [PR #12831](https://github.com/BerriAI/litellm/pull/12831)
- **[Recraft](../../docs/providers/recraft)**
    - New image generation API - [PR #12832](https://github.com/BerriAI/litellm/pull/12832)
    - New image edits API - [PR #12874](https://github.com/BerriAI/litellm/pull/12874)
- **[Azure OpenAI](../../docs/providers/azure/azure)**
    - Support DefaultAzureCredential without hard-coded environment variables - [PR #12841](https://github.com/BerriAI/litellm/pull/12841)
- **[Hyperbolic](../../docs/providers/hyperbolic)**
    - New LLM API provider - [PR #12826](https://github.com/BerriAI/litellm/pull/12826)
- **[OpenAI](../../docs/providers/openai)**
    - `/realtime` API - pass through intent query param - [PR #12838](https://github.com/BerriAI/litellm/pull/12838)
- **[Bedrock](../../docs/providers/bedrock)**
    - Add inpainting support for Amazon Nova Canvas - [PR #12949](https://github.com/BerriAI/litellm/pull/12949) s/o @[SantoshDhaladhuli](https://github.com/SantoshDhaladhuli)

#### Bugs

- **Gemini ([Google AI Studio](../../docs/providers/gemini) + [VertexAI](../../docs/providers/vertex))**
    - Fix leaking file descriptor error on sync calls - [PR #12824](https://github.com/BerriAI/litellm/pull/12824)
- **IBM Watsonx**
    - Use correct parameter name for tool choice - [PR #9980](https://github.com/BerriAI/litellm/pull/9980)
- **[Anthropic](../../docs/providers/anthropic)**
    - Only show 'reasoning_effort' for supported models - [PR #12847](https://github.com/BerriAI/litellm/pull/12847)
    - Handle `$id` and `$schema` in tool call requests (Anthropic API stopped accepting them) - [PR #12959](https://github.com/BerriAI/litellm/pull/12959)
- **[OpenRouter](../../docs/providers/openrouter)**
    - Filter out `cache_control` flag for non-Anthropic models (allows usage with Claude Code) - [PR #12850](https://github.com/BerriAI/litellm/pull/12850)
- **[Gemini](../../docs/providers/gemini)**
    - Shorten Gemini tool_call_id for OpenAI compatibility - [PR #12941](https://github.com/BerriAI/litellm/pull/12941) s/o @[tonga54](https://github.com/tonga54)

---

## LLM API Endpoints

#### Features

- **[Passthrough endpoints](../../docs/pass_through/)**
    - Make key/user/team cost tracking OSS - [PR #12847](https://github.com/BerriAI/litellm/pull/12847)
- **[/v1/models](../../docs/providers/passthrough)**
    - Return fallback models as part of the API response - [PR #12811](https://github.com/BerriAI/litellm/pull/12811) s/o @[murad-khafizov](https://github.com/murad-khafizov)
- **[/vector_stores](../../docs/providers/passthrough)**
    - Make permission management OSS - [PR #12990](https://github.com/BerriAI/litellm/pull/12990)

#### Bugs
1. `/batches`
    1. Skip invalid batch during cost tracking check (previously this would stop all checks) - [PR #12782](https://github.com/BerriAI/litellm/pull/12782)
2. `/chat/completions`
    1. Fix async retryer on `.acompletion()` - [PR #12886](https://github.com/BerriAI/litellm/pull/12886)

---

## [MCP Gateway](../../docs/mcp)

#### Features
- **[Permission Management](../../docs/mcp#grouping-mcps-access-groups)**
    - Make permission management by key/team OSS - [PR #12988](https://github.com/BerriAI/litellm/pull/12988)
- **[MCP Alias](../../docs/mcp#mcp-aliases)**
    - Support MCP server aliases, useful for calling MCP servers with long names from Cursor (see the sketch below) - [PR #12994](https://github.com/BerriAI/litellm/pull/12994)
- **Header Propagation**
    - Support propagating headers from client to backend MCP (useful for sending personal access tokens to the backend MCP) - [PR #13003](https://github.com/BerriAI/litellm/pull/13003)
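As an illustration of the alias feature, something along these lines should map a short name onto a verbosely-named server (the `alias` key and the URL are assumptions based on the PR; confirm against the MCP docs):

```yaml
mcp_servers:
  zapier_mcp_server_us_east_deployment:
    url: "https://mcp.example.com/zapier/sse"   # illustrative URL
    alias: "zapier"   # assumed field name: clients (e.g. Cursor) reference the server as "zapier"
```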
---

## Management Endpoints / UI

#### Features
- **Usage**
    - Support viewing usage by model group - [PR #12890](https://github.com/BerriAI/litellm/pull/12890)
- **Virtual Keys**
    - New `key_type` field on `/key/generate` - allows specifying whether a key can call LLM API routes vs. management routes (see the sketch below) - [PR #12909](https://github.com/BerriAI/litellm/pull/12909)
- **Models**
    - Add 'auto router' on UI - [PR #12960](https://github.com/BerriAI/litellm/pull/12960)
    - Show global retry policy on UI - [PR #12969](https://github.com/BerriAI/litellm/pull/12969)
    - Add model-level guardrails on create + update - [PR #13006](https://github.com/BerriAI/litellm/pull/13006)
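For instance, minting a key that can only call LLM API routes might look like the payload below (rendered as YAML for consistency with the other examples here; the `key_type` values shown are assumptions, so check PR #12909 for the accepted set):

```yaml
# POST /key/generate - hypothetical request body (sent as JSON in practice).
key_alias: "ci-pipeline"
key_type: "llm_api"       # assumed value: key may call LLM API routes only
# key_type: "management"  # assumed value: key limited to management routes
```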
#### Bugs
- **SSO**
    - Fix logout when SSO is enabled - [PR #12703](https://github.com/BerriAI/litellm/pull/12703)
    - Fix reset SSO when ui_access_mode is updated - [PR #13011](https://github.com/BerriAI/litellm/pull/13011)
- **Guardrails**
    - Show correct guardrails when editing a team - [PR #12823](https://github.com/BerriAI/litellm/pull/12823)
- **Virtual Keys**
    - Get updated token on regenerate key - [PR #12788](https://github.com/BerriAI/litellm/pull/12788)
    - Fix CVE with key injection - [PR #12840](https://github.com/BerriAI/litellm/pull/12840)

---

## Logging / Guardrail Integrations

#### Features
- **[Google Cloud Model Armor](../../docs/proxy/guardrails/model_armor)**
    - Document new guardrail - [PR #12492](https://github.com/BerriAI/litellm/pull/12492)
- **[Pillar Security](../../docs/proxy/guardrails/pillar_security)**
    - New LLM Guardrail - [PR #12791](https://github.com/BerriAI/litellm/pull/12791)
- **CloudZero**
    - Allow exporting spend to CloudZero - [PR #12908](https://github.com/BerriAI/litellm/pull/12908)
- **Model-level Guardrails**
    - Support model-level guardrails - [PR #12968](https://github.com/BerriAI/litellm/pull/12968)

#### Bugs
- **[Prometheus](../../docs/proxy/prometheus)**
    - Fix `[tag]=false` when tag is set for tag-based metrics - [PR #12916](https://github.com/BerriAI/litellm/pull/12916)
- **[Guardrails AI](../../docs/proxy/guardrails/guardrails_ai)**
    - Use 'validatedOutput' to allow usage of "fix" guards - [PR #12891](https://github.com/BerriAI/litellm/pull/12891) s/o @[DmitriyAlergant](https://github.com/DmitriyAlergant)

---

## Performance / Loadbalancing / Reliability improvements

#### Features
- **[Auto-Router](../../docs/proxy/auto_routing)**
    - New auto-router powered by `semantic-router` - [PR #12955](https://github.com/BerriAI/litellm/pull/12955)

#### Bugs
- **forward_clientside_headers**
    - Filter out `content-length` from headers (caused backend requests to hang) - [PR #12886](https://github.com/BerriAI/litellm/pull/12886/files)
- **Message Redaction**
    - Fix `cannot pickle coroutine object` error - [PR #13005](https://github.com/BerriAI/litellm/pull/13005)

---

## General Proxy Improvements

#### Features
- **Benchmarks**
    - Updated LiteLLM proxy benchmarks (p50, p90, p99 overhead) - [PR #12842](https://github.com/BerriAI/litellm/pull/12842)
- **Request Headers**
    - Added new `x-litellm-num-retries` request header
- **Swagger**
    - Support local Swagger on custom root paths - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
- **Health**
    - Track cost + add tags for health checks done by LiteLLM Proxy - [PR #12880](https://github.com/BerriAI/litellm/pull/12880)

#### Bugs

- **Proxy Startup**
    - Fix startup issue where a team member budget of `None` would block startup - [PR #12843](https://github.com/BerriAI/litellm/pull/12843)
- **Docker**
    - Move non-root Docker to Chainguard image (fewer vulnerabilities) - [PR #12707](https://github.com/BerriAI/litellm/pull/12707)
    - Add `azure-keyvault==4.2.0` to the Docker image - [PR #12873](https://github.com/BerriAI/litellm/pull/12873)
- **Separate Health App**
    - Pass through cmd args via supervisord (enables user config to still work via Docker) - [PR #12871](https://github.com/BerriAI/litellm/pull/12871)
- **Swagger**
    - Bump DOMPurify version (fixes vulnerability) - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
    - Add back local Swagger bundle (enables Swagger to work in air-gapped environments) - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
- **Request Headers**
    - Make `user_header_name` field check case-insensitive (fixes customer budget enforcement for Open WebUI) - [PR #12950](https://github.com/BerriAI/litellm/pull/12950)
- **SpendLogs**
    - Fix issues writing to DB when custom_llm_provider is None - [PR #13001](https://github.com/BerriAI/litellm/pull/13001)

---

## New Contributors
* @magicalne made their first contribution in https://github.com/BerriAI/litellm/pull/12804
* @pavangudiwada made their first contribution in https://github.com/BerriAI/litellm/pull/12798
* @mdiloreto made their first contribution in https://github.com/BerriAI/litellm/pull/12707
* @murad-khafizov made their first contribution in https://github.com/BerriAI/litellm/pull/12811
* @eagle-p made their first contribution in https://github.com/BerriAI/litellm/pull/12791
* @apoorv-sharma made their first contribution in https://github.com/BerriAI/litellm/pull/12920
* @SantoshDhaladhuli made their first contribution in https://github.com/BerriAI/litellm/pull/12949
* @tonga54 made their first contribution in https://github.com/BerriAI/litellm/pull/12941
* @sings-to-bees-on-wednesdays made their first contribution in https://github.com/BerriAI/litellm/pull/12950

## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.74.7-stable...v1.74.9.rc-draft)**