Added LiteLLM to the stack
---
title: v1.71.1-stable - 2x Higher Requests Per Second (RPS)
slug: v1.71.1-stable
date: 2025-05-24T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

```shell showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.71.1-stable
```
</TabItem>

<TabItem value="pip" label="Pip">

```shell showLineNumbers title="pip install litellm"
pip install litellm==1.71.1
```
</TabItem>
</Tabs>

## Key Highlights

LiteLLM v1.71.1-stable is live now. Here are the key highlights of this release:

- **Performance improvements**: LiteLLM can now scale to 200 RPS per instance with a 74ms median response time.
- **File Permissions**: Control file access across OpenAI, Azure, and VertexAI.
- **MCP x OpenAI**: Use MCP servers with the OpenAI Responses API.

## Performance Improvements

<Image img={require('../../img/perf_imp.png')} style={{ width: '800px', height: 'auto' }} />

<br/>

This release brings aiohttp support for all LLM API providers. This means that LiteLLM can now scale to 200 RPS per instance with a 40ms median latency overhead.

This change doubles the RPS LiteLLM can scale to at this latency overhead.

You can opt into this by enabling the flag below. (We expect to make this the default in 1 week.)

### Flag to enable

**On LiteLLM Proxy**

Set `USE_AIOHTTP_TRANSPORT=True` in the proxy's environment variables.

```shell showLineNumbers title="Environment Variable"
export USE_AIOHTTP_TRANSPORT="True"
```

**On LiteLLM Python SDK**

Set `litellm.use_aiohttp_transport = True` to enable the aiohttp transport.

```python showLineNumbers title="Python SDK"
import litellm

litellm.use_aiohttp_transport = True  # default is False; set to True to use the aiohttp transport
result = litellm.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(result)
```
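
To compare figures like the RPS and median latency overhead quoted above against your own deployment, one option is to summarize timing samples from a load test. The sketch below is only an illustration: the `summarize_load_test` helper and the sample numbers are hypothetical and are not part of LiteLLM or its benchmark setup. It assumes you have recorded, for each request, the end-to-end latency through the proxy and the time spent waiting on the upstream provider.

```python
import statistics


def summarize_load_test(latencies_ms, provider_latencies_ms, duration_s):
    """Summarize one load-test window (hypothetical helper, not a LiteLLM API).

    latencies_ms: end-to-end latency of each request through the proxy.
    provider_latencies_ms: upstream provider time for the same requests.
    duration_s: wall-clock length of the test window in seconds.
    """
    if len(latencies_ms) != len(provider_latencies_ms):
        raise ValueError("expected one provider latency per request")
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")

    # Overhead is whatever the proxy adds on top of the provider's own time.
    overheads = [
        total - upstream
        for total, upstream in zip(latencies_ms, provider_latencies_ms)
    ]
    return {
        "rps": len(latencies_ms) / duration_s,
        "median_response_ms": statistics.median(latencies_ms),
        "median_overhead_ms": statistics.median(overheads),
    }


# Made-up samples for illustration only; these are not LiteLLM measurements.
summary = summarize_load_test(
    latencies_ms=[120, 95, 110, 130, 100],
    provider_latencies_ms=[80, 60, 75, 85, 70],
    duration_s=2.0,
)
print(summary)
```

Running the same summary with the flag off and then on gives a like-for-like view of the change in overhead.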

## File Permissions

<Image img={require('../../img/files_api_graphic.png')} style={{ width: '800px', height: 'auto' }} />

<br/>

This release brings support for [File Permissions](../../docs/proxy/litellm_managed_files#file-permissions) and [Finetuning APIs](../../docs/proxy/managed_finetuning) to [LiteLLM Managed Files](../../docs/proxy/litellm_managed_files). This is great for:

- **Proxy Admins**: users can only view, edit, or delete files they've created, even when using shared OpenAI/Azure/Vertex deployments.
- **Developers**: get a standard interface for using Files across the Chat, Finetuning, and Batch APIs.
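
The ownership rule described above (users can only view, edit, or delete files they created) can be pictured with a small in-memory model. This is a sketch of the idea only: `ManagedFileStore`, `PermissionDenied`, and the user and file IDs are hypothetical names for illustration, and the code does not reflect how LiteLLM implements file permissions internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class PermissionDenied(Exception):
    """Raised when a user tries to access a file they did not create."""


@dataclass
class ManagedFileStore:
    """Hypothetical in-memory store; a real proxy would use a database."""

    # Maps file_id -> (owner_user_id, content).
    _files: Dict[str, Tuple[str, bytes]] = field(default_factory=dict)

    def create(self, user_id: str, file_id: str, content: bytes) -> None:
        self._files[file_id] = (user_id, content)

    def _owned_content(self, user_id: str, file_id: str) -> bytes:
        owner, content = self._files[file_id]  # KeyError if the file is unknown
        if owner != user_id:
            raise PermissionDenied(f"{user_id} did not create {file_id}")
        return content

    def retrieve(self, user_id: str, file_id: str) -> bytes:
        return self._owned_content(user_id, file_id)

    def delete(self, user_id: str, file_id: str) -> None:
        self._owned_content(user_id, file_id)  # ownership check before deleting
        del self._files[file_id]

    def list_ids(self, user_id: str) -> List[str]:
        # Listing is filtered by owner, so other users' files stay invisible.
        return [fid for fid, (owner, _) in self._files.items() if owner == user_id]


store = ManagedFileStore()
store.create("alice", "file-1", b"training data")
store.create("bob", "file-2", b"eval data")

print(store.list_ids("alice"))  # only the IDs of files Alice created
try:
    store.delete("bob", "file-1")  # Bob does not own file-1
except PermissionDenied as err:
    print("denied:", err)
```

Both users here could be sharing one upstream deployment; the check happens on the caller's identity, not on the provider credentials.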

## New Models / Updated Models

- **Gemini [VertexAI](https://docs.litellm.ai/docs/providers/vertex), [Google AI Studio](https://docs.litellm.ai/docs/providers/gemini)**
    - New Gemini models - [PR 1](https://github.com/BerriAI/litellm/pull/10991), [PR 2](https://github.com/BerriAI/litellm/pull/10998)
        - `gemini-2.5-flash-preview-tts`
        - `gemini-2.0-flash-preview-image-generation`
        - `gemini/gemini-2.5-flash-preview-05-20`
        - `gemini-2.5-flash-preview-05-20`
- **[Anthropic](../../docs/providers/anthropic)**
    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
- **[Bedrock](../../docs/providers/bedrock)**
    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
    - Support for `reasoning_effort` and `thinking` parameters for Claude-4 - [PR](https://github.com/BerriAI/litellm/pull/11114)
- **[VertexAI](../../docs/providers/vertex)**
    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
    - Global endpoints support - [PR](https://github.com/BerriAI/litellm/pull/10658)
    - `authorized_user` credentials type support - [PR](https://github.com/BerriAI/litellm/pull/10899)
- **[xAI](../../docs/providers/xai)**
    - `xai/grok-3` pricing information - [PR](https://github.com/BerriAI/litellm/pull/11028)
- **[LM Studio](../../docs/providers/lm_studio)**
    - Structured JSON schema outputs support - [PR](https://github.com/BerriAI/litellm/pull/10929)
- **[SambaNova](../../docs/providers/sambanova)**
    - Updated models and parameters - [PR](https://github.com/BerriAI/litellm/pull/10900)
- **[Databricks](../../docs/providers/databricks)**
    - Llama 4 Maverick model cost - [PR](https://github.com/BerriAI/litellm/pull/11008)
    - Claude 3.7 Sonnet output token cost correction - [PR](https://github.com/BerriAI/litellm/pull/11007)
- **[Azure](../../docs/providers/azure)**
    - Mistral Medium 25.05 support - [PR](https://github.com/BerriAI/litellm/pull/11063)
    - Certificate-based authentication support - [PR](https://github.com/BerriAI/litellm/pull/11069)
- **[Mistral](../../docs/providers/mistral)**
    - `devstral-small-2505` model pricing and context window - [PR](https://github.com/BerriAI/litellm/pull/11103)
- **[Ollama](../../docs/providers/ollama)**
    - Wildcard model support - [PR](https://github.com/BerriAI/litellm/pull/10982)
- **[CustomLLM](../../docs/providers/custom_llm_server)**
    - Embeddings support added - [PR](https://github.com/BerriAI/litellm/pull/10980)
- **[Featherless AI](../../docs/providers/featherless_ai)**
    - Access to 4200+ models - [PR](https://github.com/BerriAI/litellm/pull/10596)

## LLM API Endpoints

- **[Image Edits](../../docs/image_generation)**
    - `/v1/images/edits` - Support for /images/edits endpoint - [PR](https://github.com/BerriAI/litellm/pull/11020) [PR](https://github.com/BerriAI/litellm/pull/11123)
    - Content policy violation error mapping - [PR](https://github.com/BerriAI/litellm/pull/11113)
- **[Responses API](../../docs/response_api)**
    - MCP support for Responses API - [PR](https://github.com/BerriAI/litellm/pull/11029)
- **[Files API](../../docs/fine_tuning)**
    - LiteLLM Managed Files support for finetuning - [PR](https://github.com/BerriAI/litellm/pull/11039) [PR](https://github.com/BerriAI/litellm/pull/11040)
    - Validation for file operations (retrieve/list/delete) - [PR](https://github.com/BerriAI/litellm/pull/11081)
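
For the MCP support in the Responses API listed above, a request would plausibly carry an OpenAI-style `mcp` tool entry. The sketch below only builds and prints such a request body; it assumes the proxy accepts the same `tools` shape that the OpenAI Responses API defines for remote MCP servers. The helper name, model name, server label, and server URL are placeholders chosen for illustration, not values from this release.

```python
import json


def build_mcp_responses_request(model, prompt, server_label, server_url):
    """Build a JSON body for a Responses API call that exposes one MCP server.

    Hypothetical helper: the field names follow the OpenAI Responses API
    `mcp` tool type; whether your deployment needs extra fields is an
    assumption to verify against the linked docs.
    """
    return {
        "model": model,
        "input": prompt,
        "tools": [
            {
                "type": "mcp",
                "server_label": server_label,
                "server_url": server_url,
                # "never" skips per-call approval; choose what fits your policy.
                "require_approval": "never",
            }
        ],
    }


body = build_mcp_responses_request(
    model="openai/gpt-4o",                     # placeholder model name
    prompt="Summarize the latest release notes.",
    server_label="example_docs",               # placeholder label
    server_url="https://mcp.example.com/mcp",  # placeholder MCP server URL
)
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of a POST to the proxy's Responses endpoint with any HTTP client.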

## Management Endpoints / UI

- **Teams**
    - Key and member count display - [PR](https://github.com/BerriAI/litellm/pull/10950)
    - Spend rounded to 4 decimal points - [PR](https://github.com/BerriAI/litellm/pull/11013)
    - Organization and team create buttons repositioned - [PR](https://github.com/BerriAI/litellm/pull/10948)
- **Keys**
    - Key reassignment and 'updated at' column - [PR](https://github.com/BerriAI/litellm/pull/10960)
    - Show model access groups during creation - [PR](https://github.com/BerriAI/litellm/pull/10965)
- **Logs**
    - Model filter on logs - [PR](https://github.com/BerriAI/litellm/pull/11048)
    - Passthrough endpoint error logs support - [PR](https://github.com/BerriAI/litellm/pull/10990)
- **Guardrails**
    - Config.yaml guardrails display - [PR](https://github.com/BerriAI/litellm/pull/10959)
- **Organizations/Users**
    - Spend rounded to 4 decimal points - [PR](https://github.com/BerriAI/litellm/pull/11023)
    - Show clear error when adding a user to a team - [PR](https://github.com/BerriAI/litellm/pull/10978)
- **Audit Logs**
    - `/list` and `/info` endpoints for Audit Logs - [PR](https://github.com/BerriAI/litellm/pull/11102)

## Logging / Alerting Integrations

- **[Prometheus](../../docs/proxy/prometheus)**
    - Track `route` on `proxy_*` metrics - [PR](https://github.com/BerriAI/litellm/pull/10992)
- **[Langfuse](../../docs/proxy/logging#langfuse)**
    - Support for `prompt_label` parameter - [PR](https://github.com/BerriAI/litellm/pull/11018)
    - Consistent modelParams logging - [PR](https://github.com/BerriAI/litellm/pull/11018)
- **[DeepEval/ConfidentAI](../../docs/proxy/logging#deepeval)**
    - Logging enabled for proxy and SDK - [PR](https://github.com/BerriAI/litellm/pull/10649)
- **[Logfire](../../docs/proxy/logging)**
    - Fix otel proxy server initialization when using Logfire - [PR](https://github.com/BerriAI/litellm/pull/11091)

## Authentication & Security

- **[JWT Authentication](../../docs/proxy/token_auth)**
    - Support for applying default internal user parameters when upserting a user via JWT authentication - [PR](https://github.com/BerriAI/litellm/pull/10995)
    - Map a user to a team when upserting a user via JWT authentication - [PR](https://github.com/BerriAI/litellm/pull/11108)
- **Custom Auth**
    - Support for switching between custom auth and API key auth - [PR](https://github.com/BerriAI/litellm/pull/11070)

## Performance / Reliability Improvements

- **aiohttp Transport**
    - 97% lower median latency (feature flagged) - [PR](https://github.com/BerriAI/litellm/pull/11097) [PR](https://github.com/BerriAI/litellm/pull/11132)
- **Background Health Checks**
    - Improved reliability - [PR](https://github.com/BerriAI/litellm/pull/10887)
- **Response Handling**
    - Better streaming status code detection - [PR](https://github.com/BerriAI/litellm/pull/10962)
    - Response ID propagation improvements - [PR](https://github.com/BerriAI/litellm/pull/11006)
- **Thread Management**
    - Removed error-creating threads for reliability - [PR](https://github.com/BerriAI/litellm/pull/11066)

## General Proxy Improvements

- **[Proxy CLI](../../docs/proxy/cli)**
    - Skip server startup flag - [PR](https://github.com/BerriAI/litellm/pull/10665)
    - Avoid DATABASE_URL override when provided - [PR](https://github.com/BerriAI/litellm/pull/11076)
- **Model Management**
    - Clear cache and reload after model updates - [PR](https://github.com/BerriAI/litellm/pull/10853)
    - Computer use support tracking - [PR](https://github.com/BerriAI/litellm/pull/10881)
- **Helm Chart**
    - LoadBalancer class support - [PR](https://github.com/BerriAI/litellm/pull/11064)

## Bug Fixes

This release includes numerous bug fixes to improve stability and reliability:

- **LLM Provider Fixes**
    - VertexAI:
        - Fixed `quota_project_id` parameter issue - [PR](https://github.com/BerriAI/litellm/pull/10915)
        - Fixed credential refresh exceptions - [PR](https://github.com/BerriAI/litellm/pull/10969)
    - Cohere:
        - Fixes for adding Cohere models through LiteLLM UI - [PR](https://github.com/BerriAI/litellm/pull/10822)
    - Anthropic:
        - Fixed streaming dict object handling for /v1/messages - [PR](https://github.com/BerriAI/litellm/pull/11032)
    - OpenRouter:
        - Fixed stream usage ID issues - [PR](https://github.com/BerriAI/litellm/pull/11004)

- **Authentication & Users**
    - Fixed invitation email link generation - [PR](https://github.com/BerriAI/litellm/pull/10958)
    - Fixed JWT authentication default role - [PR](https://github.com/BerriAI/litellm/pull/10995)
    - Fixed user budget reset functionality - [PR](https://github.com/BerriAI/litellm/pull/10993)
    - Fixed SSO user compatibility and email validation - [PR](https://github.com/BerriAI/litellm/pull/11106)

- **Database & Infrastructure**
    - Fixed DB connection parameter handling - [PR](https://github.com/BerriAI/litellm/pull/10842)
    - Fixed email invitation link - [PR](https://github.com/BerriAI/litellm/pull/11031)

- **UI & Display**
    - Fixed MCP tool rendering when no arguments required - [PR](https://github.com/BerriAI/litellm/pull/11012)
    - Fixed team model alias deletion - [PR](https://github.com/BerriAI/litellm/pull/11121)
    - Fixed team viewer permissions - [PR](https://github.com/BerriAI/litellm/pull/11127)

- **Model & Routing**
    - Fixed team model mapping in route requests - [PR](https://github.com/BerriAI/litellm/pull/11111)
    - Fixed standard optional parameter passing - [PR](https://github.com/BerriAI/litellm/pull/11124)

## New Contributors

* [@DarinVerheijke](https://github.com/DarinVerheijke) made their first contribution in PR [#10596](https://github.com/BerriAI/litellm/pull/10596)
* [@estsauver](https://github.com/estsauver) made their first contribution in PR [#10929](https://github.com/BerriAI/litellm/pull/10929)
* [@mohittalele](https://github.com/mohittalele) made their first contribution in PR [#10665](https://github.com/BerriAI/litellm/pull/10665)
* [@pselden](https://github.com/pselden) made their first contribution in PR [#10899](https://github.com/BerriAI/litellm/pull/10899)
* [@unrealandychan](https://github.com/unrealandychan) made their first contribution in PR [#10842](https://github.com/BerriAI/litellm/pull/10842)
* [@dastaiger](https://github.com/dastaiger) made their first contribution in PR [#10946](https://github.com/BerriAI/litellm/pull/10946)
* [@slytechnical](https://github.com/slytechnical) made their first contribution in PR [#10881](https://github.com/BerriAI/litellm/pull/10881)
* [@daarko10](https://github.com/daarko10) made their first contribution in PR [#11006](https://github.com/BerriAI/litellm/pull/11006)
* [@sorenmat](https://github.com/sorenmat) made their first contribution in PR [#10658](https://github.com/BerriAI/litellm/pull/10658)
* [@matthid](https://github.com/matthid) made their first contribution in PR [#10982](https://github.com/BerriAI/litellm/pull/10982)
* [@jgowdy-godaddy](https://github.com/jgowdy-godaddy) made their first contribution in PR [#11032](https://github.com/BerriAI/litellm/pull/11032)
* [@bepotp](https://github.com/bepotp) made their first contribution in PR [#11008](https://github.com/BerriAI/litellm/pull/11008)
* [@jmorenoc-o](https://github.com/jmorenoc-o) made their first contribution in PR [#11031](https://github.com/BerriAI/litellm/pull/11031)
* [@martin-liu](https://github.com/martin-liu) made their first contribution in PR [#11076](https://github.com/BerriAI/litellm/pull/11076)
* [@gunjan-solanki](https://github.com/gunjan-solanki) made their first contribution in PR [#11064](https://github.com/BerriAI/litellm/pull/11064)
* [@tokoko](https://github.com/tokoko) made their first contribution in PR [#10980](https://github.com/BerriAI/litellm/pull/10980)
* [@spike-spiegel-21](https://github.com/spike-spiegel-21) made their first contribution in PR [#10649](https://github.com/BerriAI/litellm/pull/10649)
* [@kreatoo](https://github.com/kreatoo) made their first contribution in PR [#10927](https://github.com/BerriAI/litellm/pull/10927)
* [@baejooc](https://github.com/baejooc) made their first contribution in PR [#10887](https://github.com/BerriAI/litellm/pull/10887)
* [@keykbd](https://github.com/keykbd) made their first contribution in PR [#11114](https://github.com/BerriAI/litellm/pull/11114)
* [@dalssoft](https://github.com/dalssoft) made their first contribution in PR [#11088](https://github.com/BerriAI/litellm/pull/11088)
* [@jtong99](https://github.com/jtong99) made their first contribution in PR [#10853](https://github.com/BerriAI/litellm/pull/10853)

## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/releases)