---
title: "v1.72.0-stable"
slug: "v1-72-0-stable"
date: 2025-05-31T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

```shell showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.72.0-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

```shell showLineNumbers title="pip install litellm"
pip install litellm==1.72.0
```

</TabItem>
</Tabs>

## Key Highlights

LiteLLM v1.72.0-stable.rc is live now. Here are the key highlights of this release:

- **Vector Store Permissions**: Control Vector Store access at the Key, Team, and Organization level.
- **Rate Limiting Sliding Window support**: Improved accuracy for Key/Team/User rate limits by tracking requests across minute boundaries.
- **aiohttp Transport used by default**: aiohttp is now the default transport for LiteLLM networking requests. This gives users 2x higher RPS per instance with a 40ms median latency overhead.
- **Bedrock Agents**: Call Bedrock Agents with the `/chat/completions` and `/responses` endpoints.
- **Anthropic File API**: Upload and analyze CSV files with Claude-4 on Anthropic via LiteLLM.
- **Prometheus**: End users (`end_user`) are no longer tracked by default on Prometheus; tracking them is now opt-in. This prevents the `/metrics` response from growing too large. [Read More](../../docs/proxy/prometheus#tracking-end_user-on-prometheus)

---

## Vector Store Permissions

This release brings support for managing vector store permissions by Keys, Teams, and Organizations (entities) on LiteLLM. When a request attempts to query a vector store, LiteLLM blocks it if the requesting entity lacks the proper permissions.

This is useful when a vector store contains restricted data that only specific entities should be able to query.

Over the next week we plan on adding permission management for MCP Servers.

---

## Aiohttp Transport used by default

aiohttp is now the default transport for LiteLLM networking requests. This gives users 2x higher RPS per instance with a 40ms median latency overhead. It has been live on LiteLLM Cloud for a week and has gone through a week of alpha user testing.

If you encounter any issues, you can disable the aiohttp transport in the following ways:

**On LiteLLM Proxy**

Set `DISABLE_AIOHTTP_TRANSPORT=True` in your environment variables.

```shell showLineNumbers title="Environment Variable"
export DISABLE_AIOHTTP_TRANSPORT="True"
```

**On LiteLLM Python SDK**

Set `litellm.disable_aiohttp_transport = True` to disable the aiohttp transport.

```python showLineNumbers title="Python SDK"
import litellm

litellm.disable_aiohttp_transport = True  # default is False; set True to fall back to httpx
result = litellm.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(result)
```

---

## New Models / Updated Models

- **[Bedrock](../../docs/providers/bedrock)**
    - Video support for Bedrock Converse - [PR](https://github.com/BerriAI/litellm/pull/11166)
    - InvokeAgents support as a /chat/completions route - [PR](https://github.com/BerriAI/litellm/pull/11239), [Get Started](../../docs/providers/bedrock_agents)
    - AI21 Jamba models compatibility fixes - [PR](https://github.com/BerriAI/litellm/pull/11233)
    - Fixed duplicate maxTokens parameter for Claude with thinking - [PR](https://github.com/BerriAI/litellm/pull/11181)
- **[Gemini (Google AI Studio + Vertex AI)](https://docs.litellm.ai/docs/providers/gemini)**
    - Parallel tool calling support with the `parallel_tool_calls` parameter - [PR](https://github.com/BerriAI/litellm/pull/11125)
    - All Gemini models now support parallel function calling - [PR](https://github.com/BerriAI/litellm/pull/11225)
- **[VertexAI](../../docs/providers/vertex)**
    - codeExecution tool support and anyOf handling - [PR](https://github.com/BerriAI/litellm/pull/11195)
    - Vertex AI Anthropic support on /v1/messages - [PR](https://github.com/BerriAI/litellm/pull/11246)
    - Thinking, global regions, and parallel tool calling improvements - [PR](https://github.com/BerriAI/litellm/pull/11194)
    - Web Search support - [PR](https://github.com/BerriAI/litellm/commit/06484f6e5a7a2f4e45c490266782ed28b51b7db6)
- **[Anthropic](../../docs/providers/anthropic)**
    - Thinking blocks on streaming support - [PR](https://github.com/BerriAI/litellm/pull/11194)
    - Files API with form-data support on passthrough - [PR](https://github.com/BerriAI/litellm/pull/11256)
    - File ID support on /chat/completions - [PR](https://github.com/BerriAI/litellm/pull/11256)
- **[xAI](../../docs/providers/xai)**
    - Web Search support - [PR](https://github.com/BerriAI/litellm/commit/06484f6e5a7a2f4e45c490266782ed28b51b7db6)
- **[Google AI Studio](../../docs/providers/gemini)**
    - Web Search support - [PR](https://github.com/BerriAI/litellm/commit/06484f6e5a7a2f4e45c490266782ed28b51b7db6)
- **[Mistral](../../docs/providers/mistral)**
    - Updated mistral-medium prices and context sizes - [PR](https://github.com/BerriAI/litellm/pull/10729)
- **[Ollama](../../docs/providers/ollama)**
    - Tool call parsing on streaming - [PR](https://github.com/BerriAI/litellm/pull/11171)
- **[Cohere](../../docs/providers/cohere)**
    - Swapped Cohere and Cohere Chat provider positioning - [PR](https://github.com/BerriAI/litellm/pull/11173)
- **[Nebius AI Studio](../../docs/providers/nebius)**
    - New provider integration - [PR](https://github.com/BerriAI/litellm/pull/11143)
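
Parallel tool calling follows the OpenAI-compatible parameter shape. A sketch of what such a request payload could look like (the tool definition and model name here are made-up examples; you would pass these fields to `litellm.completion(...)`):

```python
import json

# Hypothetical tool definition for illustration.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_payload = {
    "model": "gemini/gemini-2.0-flash",  # example model name
    "messages": [{"role": "user", "content": "Weather in Paris and Tokyo?"}],
    "tools": [get_weather_tool],
    "parallel_tool_calls": True,  # allow the model to emit multiple tool calls in one turn
}

print(json.dumps(request_payload, indent=2))
```

With `parallel_tool_calls=True`, the model may return several tool calls in a single response (e.g. one `get_weather` call per city) instead of one call per turn.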
## LLM API Endpoints

- **[Image Edits API](../../docs/image_generation)**
    - Azure support for /v1/images/edits - [PR](https://github.com/BerriAI/litellm/pull/11160)
    - Cost tracking for the image edits endpoint (OpenAI, Azure) - [PR](https://github.com/BerriAI/litellm/pull/11186)
- **[Completions API](../../docs/completion/chat)**
    - Codestral latency overhead tracking on /v1/completions - [PR](https://github.com/BerriAI/litellm/pull/10879)
- **[Audio Transcriptions API](../../docs/audio/speech)**
    - GPT-4o mini audio preview pricing without date - [PR](https://github.com/BerriAI/litellm/pull/11207)
    - Non-default params support for audio transcription - [PR](https://github.com/BerriAI/litellm/pull/11212)
- **[Responses API](../../docs/response_api)**
    - Session management fixes when using non-OpenAI models - [PR](https://github.com/BerriAI/litellm/pull/11254)

## Management Endpoints / UI

- **Vector Stores**
    - Permission management for LiteLLM Keys, Teams, and Organizations - [PR](https://github.com/BerriAI/litellm/pull/11213)
    - UI display of vector store permissions - [PR](https://github.com/BerriAI/litellm/pull/11277)
    - Vector store access controls enforcement - [PR](https://github.com/BerriAI/litellm/pull/11281)
    - Object permissions fixes and QA improvements - [PR](https://github.com/BerriAI/litellm/pull/11291)
- **Teams**
    - "All proxy models" display when no models are selected - [PR](https://github.com/BerriAI/litellm/pull/11187)
    - Removed redundant teamInfo call, reusing the existing teamsList - [PR](https://github.com/BerriAI/litellm/pull/11051)
    - Improved model tags display on the Keys, Teams, and Org pages - [PR](https://github.com/BerriAI/litellm/pull/11022)
- **SSO/SCIM**
    - Bug fixes for showing the SCIM token on the UI - [PR](https://github.com/BerriAI/litellm/pull/11220)
- **General UI**
    - Fixed "UI Session Expired. Logging out" - [PR](https://github.com/BerriAI/litellm/pull/11279)
    - Support for forwarding /sso/key/generate to the server root path URL - [PR](https://github.com/BerriAI/litellm/pull/11165)

## Logging / Guardrails Integrations

#### Logging

- **[Prometheus](../../docs/proxy/prometheus)**
    - End users are no longer tracked by default on Prometheus; tracking `end_user` is now opt-in - [PR](https://github.com/BerriAI/litellm/pull/11192)
- **[Langfuse](../../docs/proxy/logging#langfuse)**
    - Performance improvements: fixed the "Max langfuse clients reached" issue - [PR](https://github.com/BerriAI/litellm/pull/11285)
- **[Helicone](../../docs/observability/helicone_integration)**
    - Base URL support - [PR](https://github.com/BerriAI/litellm/pull/11211)
- **[Sentry](../../docs/proxy/logging#sentry)**
    - Added Sentry sample rate configuration - [PR](https://github.com/BerriAI/litellm/pull/10283)

#### Guardrails

- **[Bedrock Guardrails](../../docs/proxy/guardrails/bedrock)**
    - Streaming support for Bedrock post guard - [PR](https://github.com/BerriAI/litellm/pull/11247)
    - Auth parameter persistence fixes - [PR](https://github.com/BerriAI/litellm/pull/11270)
- **[Pangea Guardrails](../../docs/proxy/guardrails/pangea)**
    - Added Pangea provider to the Guardrails hook - [PR](https://github.com/BerriAI/litellm/pull/10775)

## Performance / Reliability Improvements

- **aiohttp Transport**
    - Handling for aiohttp.ClientPayloadError - [PR](https://github.com/BerriAI/litellm/pull/11162)
    - SSL verification settings support - [PR](https://github.com/BerriAI/litellm/pull/11162)
    - Rollback to httpx==0.27.0 for stability - [PR](https://github.com/BerriAI/litellm/pull/11146)
- **Request Limiting**
    - Sliding window logic for parallel request limiter v2 - [PR](https://github.com/BerriAI/litellm/pull/11283)

## Bug Fixes

- **LLM API Fixes**
    - Added missing request_kwargs to the get_available_deployment call - [PR](https://github.com/BerriAI/litellm/pull/11202)
    - Fixed calling Azure O-series models - [PR](https://github.com/BerriAI/litellm/pull/11212)
    - Support for dropping non-OpenAI params via additional_drop_params - [PR](https://github.com/BerriAI/litellm/pull/11246)
    - Fixed frequency_penalty to repeat_penalty parameter mapping - [PR](https://github.com/BerriAI/litellm/pull/11284)
    - Fix for embedding cache hits on string input - [PR](https://github.com/BerriAI/litellm/pull/11211)
- **General**
    - OIDC provider improvements and audience bug fix - [PR](https://github.com/BerriAI/litellm/pull/10054)
    - Removed the AzureCredentialType restriction on AZURE_CREDENTIAL - [PR](https://github.com/BerriAI/litellm/pull/11272)
    - Prevention of sensitive key leakage to Langfuse - [PR](https://github.com/BerriAI/litellm/pull/11165)
    - Fixed healthcheck test using curl when curl is not in the image - [PR](https://github.com/BerriAI/litellm/pull/9737)

## New Contributors

* [@agajdosi](https://github.com/agajdosi) made their first contribution in [#9737](https://github.com/BerriAI/litellm/pull/9737)
* [@ketangangal](https://github.com/ketangangal) made their first contribution in [#11161](https://github.com/BerriAI/litellm/pull/11161)
* [@Aktsvigun](https://github.com/Aktsvigun) made their first contribution in [#11143](https://github.com/BerriAI/litellm/pull/11143)
* [@ryanmeans](https://github.com/ryanmeans) made their first contribution in [#10775](https://github.com/BerriAI/litellm/pull/10775)
* [@nikoizs](https://github.com/nikoizs) made their first contribution in [#10054](https://github.com/BerriAI/litellm/pull/10054)
* [@Nitro963](https://github.com/Nitro963) made their first contribution in [#11202](https://github.com/BerriAI/litellm/pull/11202)
* [@Jacobh2](https://github.com/Jacobh2) made their first contribution in [#11207](https://github.com/BerriAI/litellm/pull/11207)
* [@regismesquita](https://github.com/regismesquita) made their first contribution in [#10729](https://github.com/BerriAI/litellm/pull/10729)
* [@Vinnie-Singleton-NN](https://github.com/Vinnie-Singleton-NN) made their first contribution in [#10283](https://github.com/BerriAI/litellm/pull/10283)
* [@trashhalo](https://github.com/trashhalo) made their first contribution in [#11219](https://github.com/BerriAI/litellm/pull/11219)
* [@VigneshwarRajasekaran](https://github.com/VigneshwarRajasekaran) made their first contribution in [#11223](https://github.com/BerriAI/litellm/pull/11223)
* [@AnilAren](https://github.com/AnilAren) made their first contribution in [#11233](https://github.com/BerriAI/litellm/pull/11233)
* [@fadil4u](https://github.com/fadil4u) made their first contribution in [#11242](https://github.com/BerriAI/litellm/pull/11242)
* [@whitfin](https://github.com/whitfin) made their first contribution in [#11279](https://github.com/BerriAI/litellm/pull/11279)
* [@hcoona](https://github.com/hcoona) made their first contribution in [#11272](https://github.com/BerriAI/litellm/pull/11272)
* [@keyute](https://github.com/keyute) made their first contribution in [#11173](https://github.com/BerriAI/litellm/pull/11173)
* [@emmanuel-ferdman](https://github.com/emmanuel-ferdman) made their first contribution in [#11230](https://github.com/BerriAI/litellm/pull/11230)

## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/releases)