Added LiteLLM to the stack

2025-08-18 09:40:50 +00:00
parent 0648c1968c
commit d220b04e32
2682 changed files with 533609 additions and 1 deletions
--- a/Development/litellm/docs/my-website/release_notes/v1.71.1-stable/index.md
+++ b/Development/litellm/docs/my-website/release_notes/v1.71.1-stable/index.md
@@ -0,0 +1,284 @@
+---
+title: v1.71.1-stable - 2x Higher Requests Per Second (RPS)
+slug: v1.71.1-stable
+date: 2025-05-24T10:00:00
+authors:
+  - name: Krrish Dholakia
+    title: CEO, LiteLLM
+    url: https://www.linkedin.com/in/krish-d/
+    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
+  - name: Ishaan Jaffer
+    title: CTO, LiteLLM
+    url: https://www.linkedin.com/in/reffajnaahsi/
+    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
+
+hide_table_of_contents: false
+---
+
+import Image from '@theme/IdealImage';
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Deploy this version
+
+<Tabs>
+<TabItem value="docker" label="Docker">
+
+``` showLineNumbers title="docker run litellm"
+docker run \
+-e STORE_MODEL_IN_DB=True \
+-p 4000:4000 \
+ghcr.io/berriai/litellm:main-v1.71.1-stable
+```
+</TabItem>
+
+<TabItem value="pip" label="Pip">
+
+``` showLineNumbers title="pip install litellm"
+pip install litellm==1.71.1
+```
+</TabItem>
+</Tabs>
+
+## Key Highlights
+
+LiteLLM v1.71.1-stable is live now. Here are the key highlights of this release:
+
+- **Performance improvements**: LiteLLM can now scale to 200 RPS per instance with a 74ms median response time. 
+- **File Permissions**: Control file access across OpenAI, Azure, and VertexAI.
+- **MCP x OpenAI**: Use MCP servers with the OpenAI Responses API.
+
+
+
+## Performance Improvements
+
+<Image img={require('../../img/perf_imp.png')}  style={{ width: '800px', height: 'auto' }} />
+
+<br/>
+
+
+This release brings aiohttp support for all LLM API providers. As a result, LiteLLM can now scale to 200 RPS per instance with a 40ms median latency overhead.
+
+This change doubles the RPS LiteLLM can scale to at this latency overhead.
+
+You can opt into this by enabling the flag below. (We expect to make this the default in 1 week.)
+
+
+### Flag to enable
+
+**On LiteLLM Proxy**
+
+Set `USE_AIOHTTP_TRANSPORT=True` as an environment variable.
+
+```bash showLineNumbers title="Environment Variable"
+export USE_AIOHTTP_TRANSPORT="True"
+```
+
+**On LiteLLM Python SDK**
+
+Set `litellm.use_aiohttp_transport = True` to enable the aiohttp transport.
+
+```python showLineNumbers title="Python SDK"
+import litellm
+
+litellm.use_aiohttp_transport = True # default is False, enable this to use aiohttp transport
+result = litellm.completion(
+    model="openai/gpt-4o",
+    messages=[{"role": "user", "content": "Hello, world!"}],
+)
+print(result)
+```
+
+## File Permissions
+
+<Image img={require('../../img/files_api_graphic.png')}  style={{ width: '800px', height: 'auto' }} />
+
+<br/>
+
+This release brings support for [File Permissions](../../docs/proxy/litellm_managed_files#file-permissions) and [Finetuning APIs](../../docs/proxy/managed_finetuning) to [LiteLLM Managed Files](../../docs/proxy/litellm_managed_files). This is great for: 
+
+- **Proxy Admins**: users can only view, edit, or delete files they’ve created, even when using shared OpenAI/Azure/Vertex deployments.
+- **Developers**: get a standard interface to use Files across Chat/Finetuning/Batch APIs.
+
+
+## New Models / Updated Models
+
+- **Gemini [VertexAI](https://docs.litellm.ai/docs/providers/vertex), [Google AI Studio](https://docs.litellm.ai/docs/providers/gemini)**
+    - New gemini models - [PR 1](https://github.com/BerriAI/litellm/pull/10991), [PR 2](https://github.com/BerriAI/litellm/pull/10998)
+        - `gemini-2.5-flash-preview-tts`
+        - `gemini-2.0-flash-preview-image-generation`
+        - `gemini/gemini-2.5-flash-preview-05-20`
+        - `gemini-2.5-flash-preview-05-20`
+- **[Anthropic](../../docs/providers/anthropic)**
+    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
+- **[Bedrock](../../docs/providers/bedrock)**
+    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
+    - Support for `reasoning_effort` and `thinking` parameters for Claude-4 - [PR](https://github.com/BerriAI/litellm/pull/11114)
+- **[VertexAI](../../docs/providers/vertex)**
+    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
+    - Global endpoints support - [PR](https://github.com/BerriAI/litellm/pull/10658)
+    - `authorized_user` credentials type support - [PR](https://github.com/BerriAI/litellm/pull/10899)
+- **[xAI](../../docs/providers/xai)**
+    - `xai/grok-3` pricing information - [PR](https://github.com/BerriAI/litellm/pull/11028)
+- **[LM Studio](../../docs/providers/lm_studio)**
+    - Structured JSON schema outputs support - [PR](https://github.com/BerriAI/litellm/pull/10929)
+- **[SambaNova](../../docs/providers/sambanova)**
+    - Updated models and parameters - [PR](https://github.com/BerriAI/litellm/pull/10900)
+- **[Databricks](../../docs/providers/databricks)**
+    - Llama 4 Maverick model cost - [PR](https://github.com/BerriAI/litellm/pull/11008)
+    - Claude 3.7 Sonnet output token cost correction - [PR](https://github.com/BerriAI/litellm/pull/11007)
+- **[Azure](../../docs/providers/azure)**
+    - Mistral Medium 25.05 support - [PR](https://github.com/BerriAI/litellm/pull/11063)
+    - Certificate-based authentication support - [PR](https://github.com/BerriAI/litellm/pull/11069)
+- **[Mistral](../../docs/providers/mistral)**
+    - `devstral-small-2505` model pricing and context window - [PR](https://github.com/BerriAI/litellm/pull/11103)
+- **[Ollama](../../docs/providers/ollama)**
+    - Wildcard model support - [PR](https://github.com/BerriAI/litellm/pull/10982)
+- **[CustomLLM](../../docs/providers/custom_llm_server)**
+    - Embeddings support added - [PR](https://github.com/BerriAI/litellm/pull/10980)
+- **[Featherless AI](../../docs/providers/featherless_ai)**
+    - Access to 4200+ models - [PR](https://github.com/BerriAI/litellm/pull/10596)
+
+## LLM API Endpoints
+
+- **[Image Edits](../../docs/image_generation)**
+    - Support for the `/v1/images/edits` endpoint - [PR](https://github.com/BerriAI/litellm/pull/11020) [PR](https://github.com/BerriAI/litellm/pull/11123)
+    - Content policy violation error mapping - [PR](https://github.com/BerriAI/litellm/pull/11113)
+- **[Responses API](../../docs/response_api)**
+    - MCP support for Responses API - [PR](https://github.com/BerriAI/litellm/pull/11029)
+- **[Files API](../../docs/fine_tuning)**
+    - LiteLLM Managed Files support for finetuning - [PR](https://github.com/BerriAI/litellm/pull/11039) [PR](https://github.com/BerriAI/litellm/pull/11040)
+    - Validation for file operations (retrieve/list/delete) - [PR](https://github.com/BerriAI/litellm/pull/11081)
+
+## Management Endpoints / UI
+
+- **Teams**
+    - Key and member count display - [PR](https://github.com/BerriAI/litellm/pull/10950)
+    - Spend rounded to 4 decimal points - [PR](https://github.com/BerriAI/litellm/pull/11013)
+    - Organization and team create buttons repositioned - [PR](https://github.com/BerriAI/litellm/pull/10948)
+- **Keys**
+    - Key reassignment and 'updated at' column - [PR](https://github.com/BerriAI/litellm/pull/10960)
+    - Show model access groups during creation - [PR](https://github.com/BerriAI/litellm/pull/10965)
+- **Logs**
+    - Model filter on logs - [PR](https://github.com/BerriAI/litellm/pull/11048)
+    - Passthrough endpoint error logs support - [PR](https://github.com/BerriAI/litellm/pull/10990)
+- **Guardrails**
+    - Config.yaml guardrails display - [PR](https://github.com/BerriAI/litellm/pull/10959)
+- **Organizations/Users**
+    - Spend rounded to 4 decimal points - [PR](https://github.com/BerriAI/litellm/pull/11023)
+    - Show clear error when adding a user to a team - [PR](https://github.com/BerriAI/litellm/pull/10978)
+- **Audit Logs**
+    - `/list` and `/info` endpoints for Audit Logs - [PR](https://github.com/BerriAI/litellm/pull/11102)
+
+## Logging / Alerting Integrations
+
+- **[Prometheus](../../docs/proxy/prometheus)**
+    - Track `route` on proxy_* metrics - [PR](https://github.com/BerriAI/litellm/pull/10992)
+- **[Langfuse](../../docs/proxy/logging#langfuse)**
+    - Support for `prompt_label` parameter - [PR](https://github.com/BerriAI/litellm/pull/11018)
+    - Consistent modelParams logging - [PR](https://github.com/BerriAI/litellm/pull/11018)
+- **[DeepEval/ConfidentAI](../../docs/proxy/logging#deepeval)**
+    - Logging enabled for proxy and SDK - [PR](https://github.com/BerriAI/litellm/pull/10649)
+- **[Logfire](../../docs/proxy/logging)**
+    - Fix otel proxy server initialization when using Logfire - [PR](https://github.com/BerriAI/litellm/pull/11091)
+
+## Authentication & Security
+
+- **[JWT Authentication](../../docs/proxy/token_auth)**
+    - Support for applying default internal user parameters when upserting a user via JWT authentication - [PR](https://github.com/BerriAI/litellm/pull/10995)
+    - Map a user to a team when upserting a user via JWT authentication - [PR](https://github.com/BerriAI/litellm/pull/11108)
+- **Custom Auth**
+    - Support for switching between custom auth and API key auth - [PR](https://github.com/BerriAI/litellm/pull/11070)
+
+## Performance / Reliability Improvements
+
+- **aiohttp Transport**
+    - 97% lower median latency (feature flagged) - [PR](https://github.com/BerriAI/litellm/pull/11097) [PR](https://github.com/BerriAI/litellm/pull/11132)
+- **Background Health Checks**
+    - Improved reliability - [PR](https://github.com/BerriAI/litellm/pull/10887)
+- **Response Handling**
+    - Better streaming status code detection - [PR](https://github.com/BerriAI/litellm/pull/10962)
+    - Response ID propagation improvements - [PR](https://github.com/BerriAI/litellm/pull/11006)
+- **Thread Management**
+    - Removed thread creation that was causing errors, for better reliability - [PR](https://github.com/BerriAI/litellm/pull/11066)
+
+## General Proxy Improvements
+
+- **[Proxy CLI](../../docs/proxy/cli)**
+    - Skip server startup flag - [PR](https://github.com/BerriAI/litellm/pull/10665)
+    - Avoid DATABASE_URL override when provided - [PR](https://github.com/BerriAI/litellm/pull/11076)
+- **Model Management**
+    - Clear cache and reload after model updates - [PR](https://github.com/BerriAI/litellm/pull/10853)
+    - Computer use support tracking - [PR](https://github.com/BerriAI/litellm/pull/10881)
+- **Helm Chart**
+    - LoadBalancer class support - [PR](https://github.com/BerriAI/litellm/pull/11064)
+
+## Bug Fixes
+
+This release includes numerous bug fixes to improve stability and reliability:
+
+- **LLM Provider Fixes**
+    - VertexAI: 
+        - Fixed quota_project_id parameter issue - [PR](https://github.com/BerriAI/litellm/pull/10915)
+        - Fixed credential refresh exceptions - [PR](https://github.com/BerriAI/litellm/pull/10969)
+    - Cohere: 
+        - Fixed adding Cohere models through the LiteLLM UI - [PR](https://github.com/BerriAI/litellm/pull/10822)
+    - Anthropic: 
+        - Fixed streaming dict object handling for /v1/messages - [PR](https://github.com/BerriAI/litellm/pull/11032)
+    - OpenRouter: 
+        - Fixed stream usage ID issues - [PR](https://github.com/BerriAI/litellm/pull/11004)
+
+- **Authentication & Users**
+    - Fixed invitation email link generation - [PR](https://github.com/BerriAI/litellm/pull/10958) 
+    - Fixed JWT authentication default role - [PR](https://github.com/BerriAI/litellm/pull/10995)
+    - Fixed user budget reset functionality - [PR](https://github.com/BerriAI/litellm/pull/10993)
+    - Fixed SSO user compatibility and email validation - [PR](https://github.com/BerriAI/litellm/pull/11106)
+
+- **Database & Infrastructure**
+    - Fixed DB connection parameter handling - [PR](https://github.com/BerriAI/litellm/pull/10842)
+    - Fixed email invitation link - [PR](https://github.com/BerriAI/litellm/pull/11031)
+
+- **UI & Display**
+    - Fixed MCP tool rendering when no arguments required - [PR](https://github.com/BerriAI/litellm/pull/11012)
+    - Fixed team model alias deletion - [PR](https://github.com/BerriAI/litellm/pull/11121)
+    - Fixed team viewer permissions - [PR](https://github.com/BerriAI/litellm/pull/11127)
+
+- **Model & Routing**
+    - Fixed team model mapping in route requests - [PR](https://github.com/BerriAI/litellm/pull/11111)
+    - Fixed standard optional parameter passing - [PR](https://github.com/BerriAI/litellm/pull/11124)
+
+
+## New Contributors
+* [@DarinVerheijke](https://github.com/DarinVerheijke) made their first contribution in PR [#10596](https://github.com/BerriAI/litellm/pull/10596)
+* [@estsauver](https://github.com/estsauver) made their first contribution in PR [#10929](https://github.com/BerriAI/litellm/pull/10929)
+* [@mohittalele](https://github.com/mohittalele) made their first contribution in PR [#10665](https://github.com/BerriAI/litellm/pull/10665)
+* [@pselden](https://github.com/pselden) made their first contribution in PR [#10899](https://github.com/BerriAI/litellm/pull/10899)
+* [@unrealandychan](https://github.com/unrealandychan) made their first contribution in PR [#10842](https://github.com/BerriAI/litellm/pull/10842)
+* [@dastaiger](https://github.com/dastaiger) made their first contribution in PR [#10946](https://github.com/BerriAI/litellm/pull/10946)
+* [@slytechnical](https://github.com/slytechnical) made their first contribution in PR [#10881](https://github.com/BerriAI/litellm/pull/10881)
+* [@daarko10](https://github.com/daarko10) made their first contribution in PR [#11006](https://github.com/BerriAI/litellm/pull/11006)
+* [@sorenmat](https://github.com/sorenmat) made their first contribution in PR [#10658](https://github.com/BerriAI/litellm/pull/10658)
+* [@matthid](https://github.com/matthid) made their first contribution in PR [#10982](https://github.com/BerriAI/litellm/pull/10982)
+* [@jgowdy-godaddy](https://github.com/jgowdy-godaddy) made their first contribution in PR [#11032](https://github.com/BerriAI/litellm/pull/11032)
+* [@bepotp](https://github.com/bepotp) made their first contribution in PR [#11008](https://github.com/BerriAI/litellm/pull/11008)
+* [@jmorenoc-o](https://github.com/jmorenoc-o) made their first contribution in PR [#11031](https://github.com/BerriAI/litellm/pull/11031)
+* [@martin-liu](https://github.com/martin-liu) made their first contribution in PR [#11076](https://github.com/BerriAI/litellm/pull/11076)
+* [@gunjan-solanki](https://github.com/gunjan-solanki) made their first contribution in PR [#11064](https://github.com/BerriAI/litellm/pull/11064)
+* [@tokoko](https://github.com/tokoko) made their first contribution in PR [#10980](https://github.com/BerriAI/litellm/pull/10980)
+* [@spike-spiegel-21](https://github.com/spike-spiegel-21) made their first contribution in PR [#10649](https://github.com/BerriAI/litellm/pull/10649)
+* [@kreatoo](https://github.com/kreatoo) made their first contribution in PR [#10927](https://github.com/BerriAI/litellm/pull/10927)
+* [@baejooc](https://github.com/baejooc) made their first contribution in PR [#10887](https://github.com/BerriAI/litellm/pull/10887)
+* [@keykbd](https://github.com/keykbd) made their first contribution in PR [#11114](https://github.com/BerriAI/litellm/pull/11114)
+* [@dalssoft](https://github.com/dalssoft) made their first contribution in PR [#11088](https://github.com/BerriAI/litellm/pull/11088)
+* [@jtong99](https://github.com/jtong99) made their first contribution in PR [#10853](https://github.com/BerriAI/litellm/pull/10853)
+
+## Demo Instance
+
+Here's a Demo Instance to test changes:
+
+- Instance: https://demo.litellm.ai/
+- Login Credentials:
+    - Username: admin
+    - Password: sk-1234
+
+## [Git Diff](https://github.com/BerriAI/litellm/releases)