---
title: v1.55.10
slug: v1.55.10
date: 2024-12-24T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [batches, guardrails, team management, custom auth]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

# v1.55.10

`batches`, `guardrails`, `team management`, `custom auth`

<Image img={require('../../img/batches_cost_tracking.png')} />

<br/>

:::info

Get a free 7-day LiteLLM Enterprise trial. [Start here](https://www.litellm.ai/enterprise#trial)

**No call needed**

:::

## ✨ Cost Tracking, Logging for Batches API (`/batches`)

Track cost and usage for batch creation jobs. [Start here](https://docs.litellm.ai/docs/batches)

## ✨ `/guardrails/list` endpoint

Show users the available guardrails. [Start here](https://litellm-api.up.railway.app/#/Guardrails)

## ✨ Allow teams to add models

This enables team admins to call their own fine-tuned models via the LiteLLM proxy. [Start here](https://docs.litellm.ai/docs/proxy/team_model_add)
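
For example, a team admin could register their fine-tuned model via the `/model/new` endpoint. A minimal sketch - the key, model name, fine-tune ID, and team id below are placeholders:

```bash
curl -X POST 'http://0.0.0.0:4000/model/new' \
  -H 'Authorization: Bearer sk-team-admin-key' \
  -H 'Content-Type: application/json' \
  -d '{
    "model_name": "my-team-finetuned-model",
    "litellm_params": {
      "model": "openai/ft:gpt-4o-mini-2024-07-18:my-org::abc123",
      "api_key": "os.environ/OPENAI_API_KEY"
    },
    "team_id": "team-1234"
  }'
```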

## ✨ Common checks for custom auth

Custom auth implementations can now call LiteLLM's internal `common_checks` function, enforced as an enterprise feature. This lets admins reuse LiteLLM's default budget/auth checks inside their custom auth implementation. [Start here](https://docs.litellm.ai/docs/proxy/virtual_keys#custom-auth)

## ✨ Assigning team admins

Team admins are graduating from beta and moving to our enterprise tier. This allows proxy admins to let others manage keys/models for their own teams (useful for projects in production). [Start here](https://docs.litellm.ai/docs/proxy/virtual_keys#restricting-key-generation)
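
As a sketch, a proxy admin can promote a user to team admin via `/team/member_add` - the team id and user id here are placeholders:

```bash
curl -X POST 'http://0.0.0.0:4000/team/member_add' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_id": "team-1234",
    "member": {"role": "admin", "user_id": "krrish@berri.ai"}
  }'
```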
---
title: v1.55.8-stable
slug: v1.55.8-stable
date: 2024-12-22T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [langfuse, fallbacks, new models, azure_storage]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

# v1.55.8-stable

A new LiteLLM Stable release [just went out](https://github.com/BerriAI/litellm/releases/tag/v1.55.8-stable). Here are 5 updates since v1.52.2-stable.

`langfuse`, `fallbacks`, `new models`, `azure_storage`

<Image img={require('../../img/langfuse_prmpt_mgmt.png')} />

## Langfuse Prompt Management

This makes it easy to run experiments or swap specific models (e.g. `gpt-4o` to `gpt-4o-mini`) on Langfuse, instead of making changes in your application. [Start here](https://docs.litellm.ai/docs/proxy/prompt_management)

## Control fallback prompts client-side

> Claude prompts are different from OpenAI's

Pass in model-specific prompts when doing fallbacks. [Start here](https://docs.litellm.ai/docs/proxy/reliability#control-fallback-prompts)
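
A minimal sketch of what this looks like against the proxy - the models and messages are placeholders; the key idea is that the fallback entry carries its own `messages`:

```bash
curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "OpenAI-style prompt..."}],
    "fallbacks": [{
      "model": "claude-3-5-sonnet",
      "messages": [{"role": "user", "content": "Claude-specific version of the prompt..."}]
    }]
  }'
```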

## New Providers / Models

- [NVIDIA Triton](https://developer.nvidia.com/triton-inference-server) `/infer` endpoint. [Start here](https://docs.litellm.ai/docs/providers/triton-inference-server)
- [Infinity](https://github.com/michaelfeil/infinity) Rerank Models. [Start here](https://docs.litellm.ai/docs/providers/infinity)

## ✨ Azure Data Lake Storage Support

Send LLM usage (spend, tokens) data to [Azure Data Lake](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction). This makes it easy to consume usage data on other services (e.g. Databricks).
[Start here](https://docs.litellm.ai/docs/proxy/logging#azure-blob-storage)

## Docker Run LiteLLM

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.55.8-stable
```

## Get Daily Updates

LiteLLM ships new releases every day. [Follow us on LinkedIn](https://www.linkedin.com/company/berri-ai/) to get daily updates.
---
title: v1.56.1
slug: v1.56.1
date: 2024-12-27T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [key management, budgets/rate limits, logging, guardrails]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

# v1.56.1

`key management`, `budgets/rate limits`, `logging`, `guardrails`

:::info

Get a free 7-day LiteLLM Enterprise trial [here](https://litellm.ai/#trial).

**No call needed**

:::

## ✨ Budget / Rate Limit Tiers

Define tiers with rate limits. Assign them to keys.

Use this to control access and budgets across many keys.

**[Start here](https://docs.litellm.ai/docs/proxy/rate_limit_tiers)**

```bash
curl -L -X POST 'http://0.0.0.0:4000/budget/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "budget_id": "high-usage-tier",
    "model_max_budget": {
      "gpt-4o": {"rpm_limit": 1000000}
    }
  }'
```
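
You can then attach the tier to a key by passing the `budget_id` on key generation - a sketch, reusing the `high-usage-tier` budget created above:

```bash
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "budget_id": "high-usage-tier"
  }'
```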

## OTEL Bug Fix

LiteLLM was double-logging the `litellm_request` span. This is now fixed.

[Relevant PR](https://github.com/BerriAI/litellm/pull/7435)

## Logging for Finetuning Endpoints

Logs for fine-tuning requests are now available on all logging providers (e.g. Datadog).

What's logged per request:

- `file_id`
- `finetuning_job_id`
- any key/team metadata

**Start Here:**
- [Setup Finetuning](https://docs.litellm.ai/docs/fine_tuning)
- [Setup Logging](https://docs.litellm.ai/docs/proxy/logging#datadog)

## Dynamic Params for Guardrails

You can now set custom parameters (like a success threshold) for your guardrails in each request.

[See the guardrails spec for more details](https://docs.litellm.ai/docs/proxy/guardrails/custom_guardrail#-pass-additional-parameters-to-guardrail)
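
As a sketch, per-request params ride along with the guardrail name in the request body. The guardrail name (`custom-pre-guard`) and the `success_threshold` param below are illustrative - the exact fields your guardrail accepts (and the precise nesting) are defined by the guardrail spec linked above:

```bash
curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "hi"}],
    "guardrails": [
      {"custom-pre-guard": {"extra_body": {"success_threshold": 0.9}}}
    ]
  }'
```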
---
title: v1.56.3
slug: v1.56.3
date: 2024-12-28T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [guardrails, logging, virtual key management, new models]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

`guardrails`, `logging`, `virtual key management`, `new models`

:::info

Get a free 7-day LiteLLM Enterprise trial [here](https://litellm.ai/#trial).

**No call needed**

:::

## New Features

### ✨ Log Guardrail Traces

Track guardrail failure rates and catch a guardrail that is going rogue and failing requests. [Start here](https://docs.litellm.ai/docs/proxy/guardrails/quick_start)

#### Traced Guardrail Success

<Image img={require('../../img/gd_success.png')} />

#### Traced Guardrail Failure

<Image img={require('../../img/gd_fail.png')} />

### `/guardrails/list`

`/guardrails/list` allows clients to view the available guardrails + supported guardrail params.

```shell
curl -X GET 'http://0.0.0.0:4000/guardrails/list'
```

Expected response:

```json
{
  "guardrails": [
    {
      "guardrail_name": "aporia-post-guard",
      "guardrail_info": {
        "params": [
          {
            "name": "toxicity_score",
            "type": "float",
            "description": "Score between 0-1 indicating content toxicity level"
          },
          {
            "name": "pii_detection",
            "type": "boolean"
          }
        ]
      }
    }
  ]
}
```

### ✨ Guardrails with Mock LLM

Send `mock_response` to test guardrails without making an LLM call. More info on `mock_response` [here](https://docs.litellm.ai/docs/proxy/guardrails/quick_start)

```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "hi my email is ishaan@berri.ai"}
    ],
    "mock_response": "This is a mock response",
    "guardrails": ["aporia-pre-guard", "aporia-post-guard"]
  }'
```

### Assign Keys to Users

You can now assign keys to users via the Proxy UI.

<Image img={require('../../img/ui_key.png')} />

## New Models

- `openrouter/openai/o1`
- `vertex_ai/mistral-large@2411`

## Fixes

- Fix `vertex_ai/` mistral model pricing: https://github.com/BerriAI/litellm/pull/7345
- Fix missing `model_group` field in logs for `aspeech` call types: https://github.com/BerriAI/litellm/pull/7392
---
title: v1.56.4
slug: v1.56.4
date: 2024-12-29T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [deepgram, fireworks ai, vision, admin ui, dependency upgrades]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

`deepgram`, `fireworks ai`, `vision`, `admin ui`, `dependency upgrades`

## New Models

### **Deepgram Speech to Text**

New Speech-to-Text support for Deepgram models. [**Start Here**](https://docs.litellm.ai/docs/providers/deepgram)

```python
from litellm import transcription
import os

# set api keys
os.environ["DEEPGRAM_API_KEY"] = ""
audio_file = open("/path/to/audio.mp3", "rb")

response = transcription(model="deepgram/nova-2", file=audio_file)

print(f"response: {response}")
```

### **Fireworks AI - Vision** support for all models

LiteLLM supports document inlining for Fireworks AI models. This is useful for models that are not vision models but still need to parse documents/images/etc.

LiteLLM adds `#transform=inline` to the URL of the `image_url`, if the model is not a vision model. [See Code](https://github.com/BerriAI/litellm/blob/1ae9d45798bdaf8450f2dfdec703369f3d2212b7/litellm/llms/fireworks_ai/chat/transformation.py#L114)
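
A sketch of what this looks like from the SDK - the model name and document URL below are placeholders; because the model isn't a vision model, LiteLLM rewrites the URL with `#transform=inline` before sending:

```python
import litellm

response = litellm.completion(
    model="fireworks_ai/accounts/fireworks/models/llama-v3p3-70b-instruct",  # placeholder model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize this document."},
            # non-vision model: LiteLLM appends #transform=inline to this url
            {"type": "image_url", "image_url": {"url": "https://example.com/doc.pdf"}},
        ],
    }],
)
print(response.choices[0].message.content)
```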

## Proxy Admin UI

- `Test Key` tab displays the `model` used in the response

<Image img={require('../../img/release_notes/ui_model.png')} />

- `Test Key` tab renders content in `.md`, `.py` (any code/markdown format)

<Image img={require('../../img/release_notes/ui_format.png')} />

## Dependency Upgrades

- (Security fix) Upgrade to `fastapi==0.115.5` https://github.com/BerriAI/litellm/pull/7447

## Bug Fixes

- Add health check support for realtime models [Here](https://docs.litellm.ai/docs/proxy/health#realtime-models)
- Fix health check error with `audio_transcription` models https://github.com/BerriAI/litellm/issues/5999
---
title: v1.57.3 - New Base Docker Image
slug: v1.57.3
date: 2025-01-08T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [docker image, security, vulnerability]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

`docker image`, `security`, `vulnerability`

# 0 Critical/High Vulnerabilities

<Image img={require('../../img/release_notes/security.png')} />

## What changed?

- The LiteLLM base image now uses `cgr.dev/chainguard/python:latest-dev`

## Why the change?

To ensure there are 0 critical/high vulnerabilities on the LiteLLM Docker image.

## Migration Guide

- If you use a custom Dockerfile with the LiteLLM image as a base image + `apt-get`:

  Use `apk` instead of `apt-get`; the base LiteLLM image will no longer have `apt-get` installed.

**You are only impacted if you use `apt-get` in your Dockerfile.**

```dockerfile
# Use the provided base image
FROM ghcr.io/berriai/litellm:main-latest

# Set the working directory
WORKDIR /app

# Install dependencies - CHANGE THIS to `apk`
RUN apt-get update && apt-get install -y dumb-init
```

Before Change

```dockerfile
RUN apt-get update && apt-get install -y dumb-init
```

After Change

```dockerfile
RUN apk update && apk add --no-cache dumb-init
```
---
title: v1.57.7
slug: v1.57.7
date: 2025-01-10T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [langfuse, management endpoints, ui, prometheus, secret management]
hide_table_of_contents: false
---

`langfuse`, `management endpoints`, `ui`, `prometheus`, `secret management`

## Langfuse Prompt Management

Langfuse Prompt Management is being labelled as BETA. This allows us to iterate quickly on the feedback we're receiving, while making the status clearer to users. We expect this feature to be stable by next month (February 2025).

Changes:
- Include the client message in the LLM API request. (Previously only the prompt template was sent, and the client message was ignored.)
- Log the prompt template in the logged request (e.g. to s3/langfuse).
- Log the 'prompt_id' and 'prompt_variables' in the logged request (e.g. to s3/langfuse).

[Start Here](https://docs.litellm.ai/docs/proxy/prompt_management)
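
A sketch of the SDK usage, assuming a Langfuse prompt named `my-prompt` with a `user_message` variable (both hypothetical names):

```python
import os
import litellm

os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""

response = litellm.completion(
    model="langfuse/gpt-4o",                      # langfuse/ prefix routes through the prompt manager
    prompt_id="my-prompt",                        # hypothetical prompt name on Langfuse
    prompt_variables={"user_message": "hi"},      # filled into the prompt template
    messages=[{"role": "user", "content": "hi"}],
)
```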

## Team/Organization Management + UI Improvements

Managing teams and organizations on the UI is now easier.

Changes:
- Support for editing the user role within a team on the UI.
- Support updating the team member role to admin via API - `/team/member_update`
- Show team admins all keys for their team.
- Add organizations with budgets.
- Assign teams to orgs on the UI.
- Auto-assign SSO users to teams.

[Start Here](https://docs.litellm.ai/docs/proxy/self_serve)

## Hashicorp Vault Support

We now support writing LiteLLM Virtual API keys to Hashicorp Vault.

[Start Here](https://docs.litellm.ai/docs/proxy/vault)

## Custom Prometheus Metrics

Define custom Prometheus metrics, and track usage/latency/number of requests against them.

This allows for more fine-grained tracking - e.g. on a prompt template passed in request metadata.

[Start Here](https://docs.litellm.ai/docs/proxy/prometheus#beta-custom-metrics)
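
As a sketch, custom metrics are driven from the proxy config. The key name and label values here are assumptions - see the linked docs for the exact setting:

```yaml
litellm_settings:
  callbacks: ["prometheus"]
  # assumed key name - check the linked docs for the exact setting
  custom_prometheus_metadata_labels: ["metadata.prompt_template", "metadata.initiative"]
```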
---
title: v1.57.8-stable
slug: v1.57.8-stable
date: 2025-01-11T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [langfuse, humanloop, alerting, prometheus, secret management, management endpoints, ui, prompt management, finetuning, batch]
hide_table_of_contents: false
---

`alerting`, `prometheus`, `secret management`, `management endpoints`, `ui`, `prompt management`, `finetuning`, `batch`

## New / Updated Models

1. Mistral large pricing - https://github.com/BerriAI/litellm/pull/7452
2. Cohere command-r7b-12-2024 pricing - https://github.com/BerriAI/litellm/pull/7553/files
3. Voyage - new models, prices and context window information - https://github.com/BerriAI/litellm/pull/7472
4. Anthropic - bump Bedrock claude-3-5-haiku max_output_tokens to 8192

## General Proxy Improvements

1. Health check support for realtime models
2. Support calling Azure realtime routes via virtual keys
3. Support custom tokenizer on `/utils/token_counter` - useful when checking token counts for self-hosted models
4. Request Prioritization - support on `/v1/completion` endpoint as well

## LLM Translation Improvements

1. Deepgram STT support. [Start Here](https://docs.litellm.ai/docs/providers/deepgram)
2. OpenAI Moderations - `omni-moderation-latest` support. [Start Here](https://docs.litellm.ai/docs/moderation)
3. Azure O1 - fake streaming support. This ensures that if `stream=true` is passed, the response is streamed. [Start Here](https://docs.litellm.ai/docs/providers/azure)
4. Anthropic - non-whitespace char stop sequence handling - [PR](https://github.com/BerriAI/litellm/pull/7484)
5. Azure OpenAI - support Entra ID username + password based auth. [Start Here](https://docs.litellm.ai/docs/providers/azure#entra-id---use-tenant_id-client_id-client_secret)
6. LM Studio - embedding route support. [Start Here](https://docs.litellm.ai/docs/providers/lm-studio)
7. WatsonX - ZenAPIKeyAuth support. [Start Here](https://docs.litellm.ai/docs/providers/watsonx)

## Prompt Management Improvements

1. Langfuse integration
2. HumanLoop integration
3. Support for using load balanced models
4. Support for loading optional params from the prompt manager

[Start Here](https://docs.litellm.ai/docs/proxy/prompt_management)

## Finetuning + Batch APIs Improvements

1. Improved unified endpoint support for Vertex AI finetuning - [PR](https://github.com/BerriAI/litellm/pull/7487)
2. Add support for retrieving Vertex API batch jobs - [PR](https://github.com/BerriAI/litellm/commit/13f364682d28a5beb1eb1b57f07d83d5ef50cbdc)

## *NEW* Alerting Integration

PagerDuty Alerting Integration.

Handles two types of alerts:

- High LLM API Failure Rate. Configure X fails in Y seconds to trigger an alert.
- High Number of Hanging LLM Requests. Configure X hangs in Y seconds to trigger an alert.

[Start Here](https://docs.litellm.ai/docs/proxy/pagerduty)
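
A minimal sketch of wiring this up - set your PagerDuty key in the environment and enable the alerter in the proxy config (the failure/hanging thresholds are configured per the linked docs):

```yaml
# requires PAGERDUTY_API_KEY set in the proxy's environment
general_settings:
  alerting: ["pagerduty"]
```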

## Prometheus Improvements

Added support for tracking latency/spend/tokens based on custom metrics. [Start Here](https://docs.litellm.ai/docs/proxy/prometheus#beta-custom-metrics)

## *NEW* Hashicorp Secret Manager Support

Support for reading credentials + writing LLM API keys. [Start Here](https://docs.litellm.ai/docs/secret#hashicorp-vault)

## Management Endpoints / UI Improvements

1. Create and view organizations + assign org admins on the Proxy UI
2. Support deleting keys by key_alias
3. Allow assigning teams to an org on the UI
4. Disable using the UI session token for the 'test key' pane
5. Show the model used in the 'test key' pane
6. Support markdown output in the 'test key' pane

## Helm Improvements

1. Prevent istio injection for the db migrations cron job
2. Allow using the `migrationJob.enabled` variable within the job

## Logging Improvements

1. Braintrust logging: respect `project_id`, add more metrics - https://github.com/BerriAI/litellm/pull/7613
2. Athina - support base url - `ATHINA_BASE_URL`
3. Lunary - allow passing a custom parent run id to LLM calls

## Git Diff

This is the diff between v1.56.3-stable and v1.57.8-stable.

Use this to see the changes in the codebase.

[Git Diff](https://github.com/BerriAI/litellm/compare/v1.56.3-stable...189b67760011ea313ca58b1f8bd43aa74fbd7f55)
---
title: v1.59.0
slug: v1.59.0
date: 2025-01-17T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [admin ui, logging, db schema]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

# v1.59.0

:::info

Get a free 7-day LiteLLM Enterprise trial [here](https://litellm.ai/#trial).

**No call needed**

:::

## UI Improvements

### [Opt In] Admin UI - view messages / responses

You can now view messages and response logs on the Admin UI.

<Image img={require('../../img/release_notes/ui_logs.png')} />

How to enable it - add `store_prompts_in_spend_logs: true` to your `proxy_config.yaml`.

Once this flag is enabled, your `messages` and `responses` will be stored in the `LiteLLM_Spend_Logs` table.

```yaml
general_settings:
  store_prompts_in_spend_logs: true
```

## DB Schema Change

Added `messages` and `responses` to the `LiteLLM_Spend_Logs` table.

**By default this is not logged.** If you want `messages` and `responses` to be logged, you need to opt in with this setting:

```yaml
general_settings:
  store_prompts_in_spend_logs: true
```
---
title: v1.59.8-stable
slug: v1.59.8-stable
date: 2025-01-31T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [admin ui, logging, db schema]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

# v1.59.8-stable

:::info

Get a free 7-day LiteLLM Enterprise trial [here](https://litellm.ai/#trial).

**No call needed**

:::

## New Models / Updated Models

1. New OpenAI `/image/variations` endpoint BETA support [Docs](../../docs/image_variations)
2. Topaz API support on the OpenAI `/image/variations` BETA endpoint [Docs](../../docs/providers/topaz)
3. Deepseek - r1 support w/ reasoning_content ([Deepseek API](../../docs/providers/deepseek#reasoning-models), [Vertex AI](../../docs/providers/vertex#model-garden), [Bedrock](../../docs/providers/bedrock#deepseek))
4. Azure - add azure o1 pricing [See Here](https://github.com/BerriAI/litellm/blob/b8b927f23bc336862dacb89f59c784a8d62aaa15/model_prices_and_context_window.json#L952)
5. Anthropic - handle the `-latest` tag in the model name for cost calculation
6. Gemini-2.0-flash-thinking - add model pricing (it's 0.0) [See Here](https://github.com/BerriAI/litellm/blob/b8b927f23bc336862dacb89f59c784a8d62aaa15/model_prices_and_context_window.json#L3393)
7. Bedrock - add stability sd3 model pricing [See Here](https://github.com/BerriAI/litellm/blob/b8b927f23bc336862dacb89f59c784a8d62aaa15/model_prices_and_context_window.json#L6814) (s/o [Marty Sullivan](https://github.com/marty-sullivan))
8. Bedrock - add us.amazon.nova-lite-v1:0 to the model cost map [See Here](https://github.com/BerriAI/litellm/blob/b8b927f23bc336862dacb89f59c784a8d62aaa15/model_prices_and_context_window.json#L5619)
9. TogetherAI - add new together_ai llama3.3 models [See Here](https://github.com/BerriAI/litellm/blob/b8b927f23bc336862dacb89f59c784a8d62aaa15/model_prices_and_context_window.json#L6985)

## LLM Translation

1. LM Studio -> fix async embedding call
2. GPT-4o models - fix response_format translation
3. Bedrock nova - expand supported document types to include .md, .csv, etc. [Start Here](../../docs/providers/bedrock#usage---pdf--document-understanding)
4. Bedrock - docs on IAM role based access for Bedrock - [Start Here](https://docs.litellm.ai/docs/providers/bedrock#sts-role-based-auth)
5. Bedrock - cache IAM role credentials when used
6. Google AI Studio (`gemini/`) - support gemini 'frequency_penalty' and 'presence_penalty'
7. Azure O1 - fix model name check
8. WatsonX - ZenAPIKey support for WatsonX [Docs](../../docs/providers/watsonx)
9. Ollama Chat - support json schema response format [Start Here](../../docs/providers/ollama#json-schema-support)
10. Bedrock - return the correct Bedrock status code and error message if an error occurs during streaming
11. Anthropic - support nested json schema on Anthropic calls
12. OpenAI - `metadata` param preview support
    1. SDK - enable via `litellm.enable_preview_features = True`
    2. PROXY - enable via `litellm_settings::enable_preview_features: true`
13. Replicate - retry completion response on status=processing

## Spend Tracking Improvements

1. Bedrock - QA asserts all Bedrock regional models have the same `supported_` fields as the base model
2. Bedrock - fix Bedrock converse cost tracking w/ region name specified
3. Spend Logs reliability fix - handle the case where `user` passed in the request body is an int instead of a string
4. Ensure 'base_model' cost tracking works across all endpoints
5. Fixes for image generation cost tracking
6. Anthropic - fix Anthropic end user cost tracking
7. JWT / OIDC Auth - add end user id tracking from JWT auth

## Management Endpoints / UI

1. Allow a team member to become admin post-add (UI + endpoints)
2. New edit/delete button for updating team membership on the UI
3. If team admin - show all team keys
4. Model Hub - clarify that the cost of models is per 1M tokens
5. Invitation Links - fix invalid URL generated
6. New - SpendLogs Table Viewer - allows proxy admins to view spend logs on the UI
    1. New spend logs - allow proxy admins to 'opt in' to logging request/response in the spend logs table - enables easier abuse detection
    2. Show country of origin in spend logs
    3. Add pagination + filtering by key name/team name
7. `/key/delete` - allow team admins to delete team keys
8. Internal User 'view' - fix spend calculation when a team is selected
9. Model Analytics is now on Free
10. Usage page - show days when spend = 0, and round spend on charts to 2 significant figures
11. Public Teams - allow admins to expose teams for new users to 'join' on the UI - [Start Here](https://docs.litellm.ai/docs/proxy/public_teams)
12. Guardrails
    1. Set/edit guardrails on a virtual key
    2. Allow setting guardrails on a team
    3. Set guardrails on team create + edit page
13. Support temporary budget increases on `/key/update` - new `temp_budget_increase` and `temp_budget_expiry` fields (see the sketch after this list) - [Start Here](../../docs/proxy/virtual_keys#temporary-budget-increase)
14. Support writing the new key alias to AWS Secret Manager - on key rotation [Start Here](../../docs/secret#aws-secret-manager)
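
A sketch of a temporary budget bump via `/key/update` - the key, amount, and expiry format below are illustrative:

```bash
curl -L -X POST 'http://0.0.0.0:4000/key/update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key": "sk-...",
    "temp_budget_increase": 100,
    "temp_budget_expiry": "2025-02-07"
  }'
```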

## Helm

1. Add securityContext and pull policy values to the migration job (s/o https://github.com/Hexoplon)
2. Allow specifying envVars on values.yaml
3. New helm lint test

## Logging / Guardrail Integrations

1. Log the used prompt when prompt management is used. [Start Here](../../docs/proxy/prompt_management)
2. Support s3 logging with team alias prefixes - [Start Here](https://docs.litellm.ai/docs/proxy/logging#team-alias-prefix-in-object-key)
3. Prometheus [Start Here](../../docs/proxy/prometheus)
    1. Fix litellm_llm_api_time_to_first_token_metric not populating for Bedrock models
    2. Emit the remaining team budget metric on a regular basis (even when no call is made) - allows for more stable metrics on Grafana/etc.
    3. Add key and team level budget metrics
    4. Emit `litellm_overhead_latency_metric`
    5. Emit `litellm_team_budget_reset_at_metric` and `litellm_api_key_budget_remaining_hours_metric`
4. Datadog - support logging spend tags to Datadog. [Start Here](../../docs/proxy/enterprise#tracking-spend-for-custom-tags)
5. Langfuse - fix logging request tags, read from the standard logging payload
6. GCS - don't truncate the payload on logging
7. New GCS Pub/Sub logging support [Start Here](https://docs.litellm.ai/docs/proxy/logging#google-cloud-storage---pubsub-topic)
8. Add AIM Guardrails support [Start Here](../../docs/proxy/guardrails/aim_security)

## Security

1. New Enterprise SLA for patching security vulnerabilities. [See Here](../../docs/enterprise#slas--professional-support)
2. Hashicorp - support using a vault namespace for TLS auth. [Start Here](../../docs/secret#hashicorp-vault)
3. Azure - DefaultAzureCredential support

## Health Checks

1. Clean up pricing-only model names from the wildcard route list - prevents bad health checks
2. Allow specifying a health check model for wildcard routes - https://docs.litellm.ai/docs/proxy/health#wildcard-routes
3. New `health_check_timeout` param with a default 1-minute upper bound, to prevent a bad model's health check from hanging and causing pod restarts. [Start Here](../../docs/proxy/health#health-check-timeout)
4. Datadog - add Datadog service health check + expose new `/health/services` endpoint. [Start Here](../../docs/proxy/health#healthservices)

## Performance / Reliability improvements

1. 3x increase in RPS - moving to orjson for reading the request body
2. LLM Routing speedup - using cached get model group info
3. SDK speedup - using a cached get model info helper - reduces CPU work to get model info
4. Proxy speedup - only read the request body once per request
5. Infinite loop detection scripts added to the codebase
6. Bedrock - pure async image transformation requests
7. Cooldowns - cool down a single-deployment model group if 100% of calls fail in high traffic - prevents an o1 outage from impacting other calls
8. Response Headers - return
    1. `x-litellm-timeout`
    2. `x-litellm-attempted-retries`
    3. `x-litellm-overhead-duration-ms`
    4. `x-litellm-response-duration-ms`
9. Ensure duplicate callbacks are not added to the proxy
10. Requirements.txt - bump certifi version

## General Proxy Improvements

1. JWT / OIDC Auth - new `enforce_rbac` param, allows proxy admins to prevent any unmapped yet authenticated JWT tokens from calling the proxy. [Start Here](../../docs/proxy/token_auth#enforce-role-based-access-control-rbac)
2. Fix custom openapi schema generation for customized Swagger docs
3. Request Headers - support reading the `x-litellm-timeout` param from request headers (see the sketch after this list). Enables model timeout control when using Vercel's AI SDK + LiteLLM Proxy. [Start Here](../../docs/proxy/request_headers#litellm-headers)
4. JWT / OIDC Auth - new `role` based permissions for model authentication. [See Here](https://docs.litellm.ai/docs/proxy/jwt_auth_arch)
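
For example, a client can cap a single request's timeout via the `x-litellm-timeout` header - a sketch against the proxy; the model and key are placeholders:

```bash
curl http://0.0.0.0:4000/v1/chat/completions \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -H 'x-litellm-timeout: 30' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "hi"}]
  }'
```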

## Complete Git Diff

This is the diff between v1.57.8-stable and v1.59.8-stable.

Use this to see the changes in the codebase.

[**Git Diff**](https://github.com/BerriAI/litellm/compare/v1.57.8-stable...v1.59.8-stable)
---
title: v1.61.20-stable
slug: v1.61.20-stable
date: 2025-03-01T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [llm translation, rerank, ui, thinking, reasoning_content, claude-3-7-sonnet]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

# v1.61.20-stable

These are the changes since `v1.61.13-stable`.

This release is primarily focused on:
- LLM Translation improvements (claude-3-7-sonnet + 'thinking'/'reasoning_content' support)
- UI improvements (add model flow, user management, etc.)

## Demo Instance

Here's a Demo Instance to test changes:
- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## New Models / Updated Models

1. Anthropic claude-3-7-sonnet support + cost tracking (Anthropic API + Bedrock + Vertex AI + OpenRouter)
    1. Anthropic API [Start here](https://docs.litellm.ai/docs/providers/anthropic#usage---thinking--reasoning_content)
    2. Bedrock API [Start here](https://docs.litellm.ai/docs/providers/bedrock#usage---thinking--reasoning-content)
    3. Vertex AI API [See here](../../docs/providers/vertex#usage---thinking--reasoning_content)
    4. OpenRouter [See here](https://github.com/BerriAI/litellm/blob/ba5bdce50a0b9bc822de58c03940354f19a733ed/model_prices_and_context_window.json#L5626)
2. gpt-4.5-preview support + cost tracking [See here](https://github.com/BerriAI/litellm/blob/ba5bdce50a0b9bc822de58c03940354f19a733ed/model_prices_and_context_window.json#L79)
3. Azure AI - Phi-4 cost tracking [See here](https://github.com/BerriAI/litellm/blob/ba5bdce50a0b9bc822de58c03940354f19a733ed/model_prices_and_context_window.json#L1773)
4. Claude-3.5-sonnet - vision support updated on the Anthropic API [See here](https://github.com/BerriAI/litellm/blob/ba5bdce50a0b9bc822de58c03940354f19a733ed/model_prices_and_context_window.json#L2888)
5. Bedrock llama vision support [See here](https://github.com/BerriAI/litellm/blob/ba5bdce50a0b9bc822de58c03940354f19a733ed/model_prices_and_context_window.json#L7714)
6. Cerebras llama3.3-70b pricing [See here](https://github.com/BerriAI/litellm/blob/ba5bdce50a0b9bc822de58c03940354f19a733ed/model_prices_and_context_window.json#L2697)

## LLM Translation

1. Infinity Rerank - support returning documents when return_documents=True [Start here](../../docs/providers/infinity#usage---returning-documents)
2. Amazon Deepseek - `<think>` param extraction into 'reasoning_content' [Start here](https://docs.litellm.ai/docs/providers/bedrock#bedrock-imported-models-deepseek-deepseek-r1)
3. Amazon Titan Embeddings - filter out 'aws_' params from the request body [Start here](https://docs.litellm.ai/docs/providers/bedrock#bedrock-embedding)
4. Anthropic 'thinking' + 'reasoning_content' translation support (Anthropic API, Bedrock, Vertex AI) - see the sketch after this list. [Start here](https://docs.litellm.ai/docs/reasoning_content)
5. VLLM - support 'video_url' [Start here](../../docs/providers/vllm#send-video-url-to-vllm)
6. Call the proxy via the litellm SDK: support `litellm_proxy/` for embedding, image_generation, transcription, speech, rerank [Start here](https://docs.litellm.ai/docs/providers/litellm_proxy)
7. OpenAI Pass-through - allow using Assistants GET, DELETE on /openai pass-through routes [Start here](https://docs.litellm.ai/docs/pass_through/openai_passthrough)
8. Message Translation - fix the OpenAI message for an assistant msg if the role is missing - OpenAI allows this
9. O1/O3 - support 'drop_params' for the o3-mini and o1 parallel_tool_calls param (not currently supported) [See here](https://docs.litellm.ai/docs/completion/drop_params)
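
Here's a sketch of the 'thinking' usage referenced in item 4 - the model string and token budget are placeholders:

```python
from litellm import completion

response = completion(
    model="anthropic/claude-3-7-sonnet-20250219",  # placeholder model string
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
)
# the reasoning/thinking text is surfaced on the message
print(response.choices[0].message.reasoning_content)
```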

## Spend Tracking Improvements

1. Cost tracking for rerank via Bedrock [See PR](https://github.com/BerriAI/litellm/commit/b682dc4ec8fd07acf2f4c981d2721e36ae2a49c5)
2. Anthropic pass-through - fix a race condition causing cost to not be tracked [See PR](https://github.com/BerriAI/litellm/pull/8874)
3. Anthropic pass-through: ensure accurate token counting [See PR](https://github.com/BerriAI/litellm/pull/8880)

## Management Endpoints / UI

1. Models Page - allow sorting models by 'created at'
2. Models Page - edit model flow improvements
3. Models Page - fix adding Azure, Azure AI Studio models on the UI
4. Internal Users Page - allow bulk adding internal users on the UI
5. Internal Users Page - allow sorting users by 'created at'
6. Virtual Keys Page - allow searching for user IDs in the dropdown when assigning a user to a team [See PR](https://github.com/BerriAI/litellm/pull/8844)
7. Virtual Keys Page - allow creating a user when assigning keys to users [See PR](https://github.com/BerriAI/litellm/pull/8844)
8. Model Hub Page - fix text overflow issue [See PR](https://github.com/BerriAI/litellm/pull/8749)
9. Admin Settings Page - allow adding MSFT SSO on the UI
10. Backend - don't allow creating duplicate internal users in the DB

## Helm

1. Support ttlSecondsAfterFinished on the migration job - [See PR](https://github.com/BerriAI/litellm/pull/8593)
2. Enhance the migrations job with additional configurable properties - [See PR](https://github.com/BerriAI/litellm/pull/8636)

## Logging / Guardrail Integrations

1. Arize Phoenix support
2. 'No-log' - fix 'no-log' param support on embedding calls

## Performance / Loadbalancing / Reliability improvements

1. Single Deployment Cooldown logic - use allowed_fails or allowed_fail_policy if set [Start here](https://docs.litellm.ai/docs/routing#advanced-custom-retries-cooldowns-based-on-error-type)

## General Proxy Improvements

1. Hypercorn - fix reading / parsing the request body
2. Windows - fix running the proxy on Windows
3. DD-Trace - fix dd-trace enablement on the proxy

## Complete Git Diff

View the complete git diff [here](https://github.com/BerriAI/litellm/compare/v1.61.13-stable...v1.61.20-stable).
---
title: v1.63.0 - Anthropic 'thinking' response update
slug: v1.63.0
date: 2025-03-05T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [llm translation, thinking, reasoning_content, claude-3-7-sonnet]
hide_table_of_contents: false
---

v1.63.0 fixes the Anthropic 'thinking' response on streaming to return the `signature` block. [Github Issue](https://github.com/BerriAI/litellm/issues/8964)

It also moves the response structure from `signature_delta` to `signature`, to match Anthropic. [Anthropic Docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#implementing-extended-thinking)

## Diff

```diff
"message": {
    ...
    "reasoning_content": "The capital of France is Paris.",
    "thinking_blocks": [
        {
            "type": "thinking",
            "thinking": "The capital of France is Paris.",
-           "signature_delta": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 OLD FORMAT
+           "signature": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 KEY CHANGE
        }
    ]
}
```
---
title: v1.63.11-stable
slug: v1.63.11-stable
date: 2025-03-15T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
tags: [credential management, thinking content, responses api, snowflake]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

These are the changes since `v1.63.2-stable`.

This release is primarily focused on:
- [Beta] Responses API Support
- Snowflake Cortex Support, Amazon Nova Image Generation
- UI - Credential Management, re-use credentials when adding new models
- UI - Test Connection to LLM Provider before adding a model

## Known Issues
- 🚨 Known issue on Azure OpenAI - we don't recommend upgrading if you use Azure OpenAI. This version failed our Azure OpenAI load test.

## Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.63.11-stable
```

## Demo Instance

Here's a Demo Instance to test changes:
- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## New Models / Updated Models

- Image Generation support for Amazon Nova Canvas [Getting Started](https://docs.litellm.ai/docs/providers/bedrock#image-generation)
- Add pricing for Jamba new models [PR](https://github.com/BerriAI/litellm/pull/9032/files)
- Add pricing for Amazon EU models [PR](https://github.com/BerriAI/litellm/pull/9056/files)
- Add Bedrock Deepseek R1 model pricing [PR](https://github.com/BerriAI/litellm/pull/9108/files)
- Update Gemini pricing: Gemma 3, Flash 2 thinking update, LearnLM [PR](https://github.com/BerriAI/litellm/pull/9190/files)
- Mark Cohere Embedding 3 models as Multimodal [PR](https://github.com/BerriAI/litellm/pull/9176/commits/c9a576ce4221fc6e50dc47cdf64ab62736c9da41)
- Add Azure Data Zone pricing [PR](https://github.com/BerriAI/litellm/pull/9185/files#diff-19ad91c53996e178c1921cbacadf6f3bae20cfe062bd03ee6bfffb72f847ee37)
- LiteLLM tracks cost for `azure/eu` and `azure/us` models

## LLM Translation

<Image img={require('../../img/release_notes/responses_api.png')} />

1. **New Endpoints**
    - [Beta] POST `/responses` API (see the sketch at the end of this section). [Getting Started](https://docs.litellm.ai/docs/response_api)

2. **New LLM Providers**
    - Snowflake Cortex [Getting Started](https://docs.litellm.ai/docs/providers/snowflake)

3. **New LLM Features**

    - Support OpenRouter `reasoning_content` on streaming [Getting Started](https://docs.litellm.ai/docs/reasoning_content)

4. **Bug Fixes**

    - OpenAI: Return `code`, `param` and `type` on bad request errors [More information on litellm exceptions](https://docs.litellm.ai/docs/exception_mapping)
    - Bedrock: Fix converse chunk parsing to only return an empty dict on tool use [PR](https://github.com/BerriAI/litellm/pull/9166)
    - Bedrock: Support extra_headers [PR](https://github.com/BerriAI/litellm/pull/9113)
    - Azure: Fix function calling bug & update default API version to `2025-02-01-preview` [PR](https://github.com/BerriAI/litellm/pull/9191)
    - Azure: Fix AI services URL [PR](https://github.com/BerriAI/litellm/pull/9185)
    - Vertex AI: Handle HTTP 201 status code in response [PR](https://github.com/BerriAI/litellm/pull/9193)
    - Perplexity: Fix incorrect streaming response [PR](https://github.com/BerriAI/litellm/pull/9081)
    - Triton: Fix streaming completions bug [PR](https://github.com/BerriAI/litellm/pull/8386)
    - Deepgram: Support bytes.IO when handling audio files for transcription [PR](https://github.com/BerriAI/litellm/pull/9071)
    - Ollama: Fix "system" role has become unacceptable [PR](https://github.com/BerriAI/litellm/pull/9261)
    - All Providers (Streaming): Fix the string `data:` being stripped from content in streamed responses [PR](https://github.com/BerriAI/litellm/pull/9070)
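
Here's the Responses API sketch referenced above - the model choice is a placeholder:

```python
import litellm

response = litellm.responses(
    model="openai/o1-pro",  # placeholder model
    input="Tell me a three-sentence bedtime story about a unicorn.",
    max_output_tokens=100,
)
print(response)
```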

## Spend Tracking Improvements

1. Support Bedrock converse cache token tracking [Getting Started](https://docs.litellm.ai/docs/completion/prompt_caching)
2. Cost tracking for Responses API [Getting Started](https://docs.litellm.ai/docs/response_api)
3. Fix Azure Whisper cost tracking [Getting Started](https://docs.litellm.ai/docs/audio_transcription)

## UI

### Re-Use Credentials on UI

You can now onboard LLM provider credentials on the LiteLLM UI. Once these credentials are added, you can re-use them when adding new models. [Getting Started](https://docs.litellm.ai/docs/proxy/ui_credentials)

<Image img={require('../../img/release_notes/credentials.jpg')} />

### Test Connections before adding models

Before adding a model, you can test the connection to the LLM provider to verify you have set up your API Base + API Key correctly.

<Image img={require('../../img/release_notes/litellm_test_connection.gif')} />

### General UI Improvements

1. Add Models Page
    - Allow adding Cerebras, Sambanova, Perplexity, Fireworks, Openrouter, TogetherAI Models, Text-Completion OpenAI on the Admin UI
    - Allow adding EU OpenAI models
    - Fix: Instantly show edits + deletes to models
2. Keys Page
    - Fix: Instantly show newly created keys on the Admin UI (don't require a refresh)
    - Fix: Allow clicking into Top Keys when showing users their Top API Key
    - Fix: Allow filtering keys by Team Alias, Key Alias and Org
    - UI Improvements: Show 100 keys per page, use full height, increase the width of key alias
3. Users Page
    - Fix: Show the correct count of internal user keys on the Users Page
    - Fix: Metadata not updating in the Team UI
4. Logs Page
    - UI Improvements: Keep the expanded log in focus on the LiteLLM UI
    - UI Improvements: Minor improvements to the logs page
    - Fix: Allow internal users to query their own logs
    - Allow switching off storing error logs in the DB [Getting Started](https://docs.litellm.ai/docs/proxy/ui_logs)
5. Sign In/Sign Out
    - Fix: Correctly use `PROXY_LOGOUT_URL` when set [Getting Started](https://docs.litellm.ai/docs/proxy/self_serve#setting-custom-logout-urls)

## Security

1. Support for rotating master keys [Getting Started](https://docs.litellm.ai/docs/proxy/master_key_rotations)
2. Fix: Internal User Viewer permissions - don't allow the `internal_user_viewer` role to see the `Test Key Page` or the `Create Key Button` [More information on role based access controls](https://docs.litellm.ai/docs/proxy/access_control)
3. Emit audit logs on all user + model Create/Update/Delete endpoints [Getting Started](https://docs.litellm.ai/docs/proxy/multiple_admins)
4. JWT
    - Support multiple JWT OIDC providers [Getting Started](https://docs.litellm.ai/docs/proxy/token_auth)
    - Fix JWT access with Groups not working when a team is assigned All Proxy Models access
5. Using K/V pairs in 1 AWS Secret [Getting Started](https://docs.litellm.ai/docs/secret#using-kv-pairs-in-1-aws-secret)

## Logging Integrations

1. Prometheus: Track Azure LLM API latency metric [Getting Started](https://docs.litellm.ai/docs/proxy/prometheus#request-latency-metrics)
2. Athina: Added tags, user_feedback and model_options to additional_keys which can be sent to Athina [Getting Started](https://docs.litellm.ai/docs/observability/athina_integration)

## Performance / Reliability improvements

1. Redis + litellm router - fix Redis cluster mode for the litellm router [PR](https://github.com/BerriAI/litellm/pull/9010)

## General Improvements

1. OpenWebUI Integration - display `thinking` tokens
    - Guide on getting started with LiteLLM x OpenWebUI. [Getting Started](https://docs.litellm.ai/docs/tutorials/openweb_ui)
    - Display `thinking` tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) [Getting Started](https://docs.litellm.ai/docs/tutorials/openweb_ui#render-thinking-content-on-openweb-ui)

<Image img={require('../../img/litellm_thinking_openweb.gif')} />

## Complete Git Diff

[Here's the complete git diff](https://github.com/BerriAI/litellm/compare/v1.63.2-stable...v1.63.11-stable)
---
title: v1.63.14-stable
slug: v1.63.14-stable
date: 2025-03-22T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
tags: [credential management, thinking content, responses api, snowflake]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
|
||||
|
||||
These are the changes since `v1.63.11-stable`.
|
||||
|
||||
This release brings:
|
||||
- LLM Translation Improvements (MCP Support and Bedrock Application Profiles)
|
||||
- Perf improvements for Usage-based Routing
|
||||
- Streaming guardrail support via websockets
|
||||
- Azure OpenAI client perf fix (from previous release)
|
||||
|
||||
## Docker Run LiteLLM Proxy
|
||||
|
||||
```
|
||||
docker run
|
||||
-e STORE_MODEL_IN_DB=True
|
||||
-p 4000:4000
|
||||
ghcr.io/berriai/litellm:main-v1.63.14-stable.patch1
|
||||
```
|
||||
|
||||
## Demo Instance
|
||||
|
||||
Here's a Demo Instance to test changes:
|
||||
- Instance: https://demo.litellm.ai/
|
||||
- Login Credentials:
|
||||
- Username: admin
|
||||
- Password: sk-1234
|
||||
|
||||
|
||||
|
||||
## New Models / Updated Models
|
||||
|
||||
- Azure gpt-4o - fixed pricing to latest global pricing - [PR](https://github.com/BerriAI/litellm/pull/9361)
|
||||
- O1-Pro - add pricing + model information - [PR](https://github.com/BerriAI/litellm/pull/9397)
|
||||
- Azure AI - mistral 3.1 small pricing added - [PR](https://github.com/BerriAI/litellm/pull/9453)
|
||||
- Azure - gpt-4.5-preview pricing added - [PR](https://github.com/BerriAI/litellm/pull/9453)
|
||||
|
||||
|
||||
|
||||
## LLM Translation
|
||||
|
||||
1. **New LLM Features**
|
||||
|
||||
- Bedrock: Support bedrock application inference profiles [Docs](https://docs.litellm.ai/docs/providers/bedrock#bedrock-application-inference-profile)
|
||||
- Infer aws region from bedrock application profile id - (`arn:aws:bedrock:us-east-1:...`)
|
||||
- Ollama - support calling via `/v1/completions` [Get Started](../../docs/providers/ollama#using-ollama-fim-on-v1completions)
|
||||
- Bedrock - support `us.deepseek.r1-v1:0` model name [Docs](../../docs/providers/bedrock#supported-aws-bedrock-models)
|
||||
- OpenRouter - `OPENROUTER_API_BASE` env var support [Docs](../../docs/providers/openrouter.md)
|
||||
- Azure - add audio model parameter support - [Docs](../../docs/providers/azure#azure-audio-model)
|
||||
- OpenAI - PDF File support [Docs](../../docs/completion/document_understanding#openai-file-message-type)
|
||||
- OpenAI - o1-pro Responses API streaming support [Docs](../../docs/response_api.md#streaming)
|
||||
- [BETA] MCP - Use MCP Tools with LiteLLM SDK [Docs](../../docs/mcp)
|
||||
|
||||
2. **Bug Fixes**
|
||||
|
||||
- Voyage: prompt token on embedding tracking fix - [PR](https://github.com/BerriAI/litellm/commit/56d3e75b330c3c3862dc6e1c51c1210e48f1068e)
|
||||
- Sagemaker - Fix ‘Too little data for declared Content-Length’ error - [PR](https://github.com/BerriAI/litellm/pull/9326)
|
||||
- OpenAI-compatible models - fix issue when calling openai-compatible models w/ custom_llm_provider set - [PR](https://github.com/BerriAI/litellm/pull/9355)
|
||||
- VertexAI - Embedding ‘outputDimensionality’ support - [PR](https://github.com/BerriAI/litellm/commit/437dbe724620675295f298164a076cbd8019d304)
|
||||
- Anthropic - return consistent json response format on streaming/non-streaming - [PR](https://github.com/BerriAI/litellm/pull/9437)
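
Here's a sketch of what the new Ollama `/v1/completions` route looks like when called through the proxy (the model name and demo credentials below are placeholders for your own setup):

```shell title="Call Ollama via /v1/completions" showLineNumbers
# Placeholder model + demo credentials - assumes an ollama model is
# configured on your proxy under this name
curl http://localhost:4000/v1/completions \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "ollama/codellama",
    "prompt": "def fibonacci(n):",
    "max_tokens": 50
  }'
```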

## Spend Tracking Improvements

- `litellm_proxy/` - support reading the litellm response cost header from the proxy, when using the client sdk
- Reset Budget Job - fix budget reset error on keys/teams/users [PR](https://github.com/BerriAI/litellm/pull/9329)
- Streaming - Prevents the final chunk w/ usage from being ignored (impacted bedrock streaming + cost tracking) [PR](https://github.com/BerriAI/litellm/pull/9314)

## UI

1. Users Page
    - Feature: Control default internal user settings [PR](https://github.com/BerriAI/litellm/pull/9328)
2. Icons:
    - Feature: Replace external "artificialanalysis.ai" icons with local svg [PR](https://github.com/BerriAI/litellm/pull/9374)
3. Sign In/Sign Out
    - Fix: Default login when `default_user_id` user does not exist in DB [PR](https://github.com/BerriAI/litellm/pull/9395)

## Logging Integrations

- Support post-call guardrails for streaming responses [Get Started](../../docs/proxy/guardrails/custom_guardrail#1-write-a-customguardrail-class)
- Arize [Get Started](../../docs/observability/arize_integration)
    - fix invalid package import [PR](https://github.com/BerriAI/litellm/pull/9338)
    - migrate to using standardloggingpayload for metadata, ensures spans land successfully [PR](https://github.com/BerriAI/litellm/pull/9338)
    - fix logging to just log the LLM I/O [PR](https://github.com/BerriAI/litellm/pull/9353)
    - Dynamic API Key/Space param support [Get Started](../../docs/observability/arize_integration#pass-arize-spacekey-per-request)
- StandardLoggingPayload - Log litellm_model_name in the payload, so you know which model name was actually sent to the API provider [Get Started](../../docs/proxy/logging_spec#standardlogginghiddenparams)
- Prompt Management - Allow building a custom prompt management integration [Get Started](../../docs/proxy/custom_prompt_management.md)

## Performance / Reliability improvements

- Redis Caching - add a 5s default timeout, prevents a hanging redis connection from impacting llm calls [PR](https://github.com/BerriAI/litellm/commit/db92956ae33ed4c4e3233d7e1b0c7229817159bf)
- Allow disabling all spend updates / writes to DB - patch to allow disabling all spend updates to DB with a flag [PR](https://github.com/BerriAI/litellm/pull/9331)
- Azure OpenAI - correctly re-use the azure openai client, fixes perf issue from the previous stable release [PR](https://github.com/BerriAI/litellm/commit/f2026ef907c06d94440930917add71314b901413)
- Azure OpenAI - use litellm.ssl_verify on Azure/OpenAI clients [PR](https://github.com/BerriAI/litellm/commit/f2026ef907c06d94440930917add71314b901413)
- Usage-based routing - Wildcard model support [Get Started](../../docs/proxy/usage_based_routing#wildcard-model-support)
- Usage-based routing - Support batch writing increments to redis - reduces latency to the same as 'simple-shuffle' [PR](https://github.com/BerriAI/litellm/pull/9357)
- Router - show the reason for model cooldown on the 'no healthy deployments available' error [PR](https://github.com/BerriAI/litellm/pull/9438)
- Caching - add a max value limit for an item in the in-memory cache (1MB) - prevents OOM errors when large image URLs are sent through the proxy [PR](https://github.com/BerriAI/litellm/pull/9448)

## General Improvements

- Passthrough Endpoints - support returning api-base in the Response Headers of pass-through endpoints [Docs](../../docs/proxy/response_headers#litellm-specific-headers)
- SSL - support reading the ssl security level from an env var - allows users to specify lower security settings [Get Started](../../docs/guides/security_settings)
- Credentials - only poll the Credentials table when `STORE_MODEL_IN_DB` is True [PR](https://github.com/BerriAI/litellm/pull/9376)
- Image URL Handling - new architecture doc on image url handling [Docs](../../docs/proxy/image_handling)
- OpenAI - bump to pip install "openai==1.68.2" [PR](https://github.com/BerriAI/litellm/commit/e85e3bc52a9de86ad85c3dbb12d87664ee567a5a)
- Gunicorn - security fix - bump gunicorn==23.0.0 [PR](https://github.com/BerriAI/litellm/commit/7e9fc92f5c7fea1e7294171cd3859d55384166eb)

## Complete Git Diff

[Here's the complete git diff](https://github.com/BerriAI/litellm/compare/v1.63.11-stable...v1.63.14.rc)
@@ -0,0 +1,112 @@
---
title: v1.63.2-stable
slug: v1.63.2-stable
date: 2025-03-08T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGiM7ZrUwqu_Q/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1675971026692?e=1741824000&v=beta&t=eQnRdXPJo4eiINWTZARoYTfqh064pgZ-E21pQTSy8jc
tags: [llm translation, thinking, reasoning_content, claude-3-7-sonnet]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

These are the changes since `v1.61.20-stable`.

This release is primarily focused on:
- LLM Translation improvements (more `thinking` content improvements)
- UI improvements (Error logs now shown on UI)

:::info

This release will be live on 03/09/2025

:::

<Image img={require('../../img/release_notes/v1632_release.jpg')} />

## Demo Instance

Here's a Demo Instance to test changes:
- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## New Models / Updated Models

1. Add `supports_pdf_input` for specific Bedrock Claude models [PR](https://github.com/BerriAI/litellm/commit/f63cf0030679fe1a43d03fb196e815a0f28dae92)
2. Add pricing for amazon `eu` models [PR](https://github.com/BerriAI/litellm/commits/main/model_prices_and_context_window.json)
3. Fix Azure O1 mini pricing [PR](https://github.com/BerriAI/litellm/commit/52de1949ef2f76b8572df751f9c868a016d4832c)

## LLM Translation

<Image img={require('../../img/release_notes/anthropic_thinking.jpg')}/>

1. Support `/openai/` passthrough for Assistant endpoints. [Get Started](https://docs.litellm.ai/docs/pass_through/openai_passthrough)
2. Bedrock Claude - fix tool calling transformation on invoke route. [Get Started](../../docs/providers/bedrock#usage---function-calling--tool-calling)
3. Bedrock Claude - response_format support for claude on invoke route. [Get Started](../../docs/providers/bedrock#usage---structured-output--json-mode)
4. Bedrock - pass `description` if set in response_format. [Get Started](../../docs/providers/bedrock#usage---structured-output--json-mode)
5. Bedrock - Fix passing response_format: `{"type": "text"}`. [PR](https://github.com/BerriAI/litellm/commit/c84b489d5897755139aa7d4e9e54727ebe0fa540)
6. OpenAI - Handle sending image_url as str to openai. [Get Started](https://docs.litellm.ai/docs/completion/vision)
7. Deepseek - fix 'reasoning_content' missing on streaming. [Get Started](https://docs.litellm.ai/docs/reasoning_content)
8. Caching - Support caching on reasoning content. [Get Started](https://docs.litellm.ai/docs/proxy/caching)
9. Bedrock - handle thinking blocks in assistant message. [Get Started](https://docs.litellm.ai/docs/providers/bedrock#usage---thinking--reasoning-content)
10. Anthropic - Return `signature` on streaming. [Get Started](https://docs.litellm.ai/docs/providers/bedrock#usage---thinking--reasoning-content)
    - Note: We've also migrated from `signature_delta` to `signature`. [Read more](https://docs.litellm.ai/release_notes/v1.63.0)
11. Support format param for specifying image type. [Get Started](../../docs/completion/vision.md#explicitly-specify-image-type)
12. Anthropic - `/v1/messages` endpoint - `thinking` param support (see the sketch after this list). [Get Started](../../docs/anthropic_unified.md)
    - Note: this refactors the [BETA] unified `/v1/messages` endpoint, to just work for the Anthropic API.
13. Vertex AI - handle $id in response schema when calling vertex ai. [Get Started](https://docs.litellm.ai/docs/providers/vertex#json-schema)
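
For illustration, a `thinking` request to the unified `/v1/messages` endpoint might look like the following sketch (the model name, token budget, and demo credentials are placeholders):

```shell title="/v1/messages with thinking param" showLineNumbers
# Placeholder model + demo credentials - the thinking budget is illustrative
curl http://localhost:4000/v1/messages \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-3-7-sonnet-20250219",
    "max_tokens": 1024,
    "thinking": {"type": "enabled", "budget_tokens": 512},
    "messages": [{"role": "user", "content": "Solve 24*17 step by step."}]
  }'
```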

## Spend Tracking Improvements

1. Batches API - Fix cost calculation to run on retrieve_batch. [Get Started](https://docs.litellm.ai/docs/batches)
2. Batches API - Log batch models in spend logs / standard logging payload. [Get Started](../../docs/proxy/logging_spec.md#standardlogginghiddenparams)

## Management Endpoints / UI

<Image img={require('../../img/release_notes/error_logs.jpg')} />

1. Virtual Keys Page
    - Allow team/org filters to be searchable on the Create Key Page
    - Add created_by and updated_by fields to Keys table
    - Show 'user_email' on key table
    - Show 100 Keys Per Page, use full height, increase width of key alias
2. Logs Page
    - Show Error Logs on LiteLLM UI
    - Allow Internal Users to View their own logs
3. Internal Users Page
    - Allow admin to control default model access for internal users
4. Fix session handling with cookies

## Logging / Guardrail Integrations

1. Fix prometheus metrics w/ custom metrics, when keys containing team_id make requests. [PR](https://github.com/BerriAI/litellm/pull/8935)

## Performance / Loadbalancing / Reliability improvements

1. Cooldowns - Support cooldowns on models called with client side credentials. [Get Started](https://docs.litellm.ai/docs/proxy/clientside_auth#pass-user-llm-api-keys--api-base)
2. Tag-based Routing - ensures tag-based routing across all endpoints (`/embeddings`, `/image_generation`, etc.). [Get Started](https://docs.litellm.ai/docs/proxy/tag_routing)

## General Proxy Improvements

1. Raise BadRequestError when an unknown model is passed in a request
2. Enforce model access restrictions on the Azure OpenAI proxy route
3. Reliability fix - Handle emojis in text - fix orjson error
4. Model Access Patch - don't overwrite litellm.anthropic_models when running auth checks
5. Enable setting timezone information in the docker image

## Complete Git Diff

[Here's the complete git diff](https://github.com/BerriAI/litellm/compare/v1.61.20-stable...v1.63.2-stable)
@@ -0,0 +1,160 @@
---
title: v1.65.0-stable - Model Context Protocol
slug: v1.65.0-stable
date: 2025-03-30T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
tags: [mcp, custom_prompt_management]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

v1.65.0-stable is live now. Here are the key highlights of this release:
- **MCP Support**: Support for adding and using MCP servers on the LiteLLM proxy.
- **UI view total usage after 1M+ logs**: You can now view usage analytics after crossing 1M+ logs in DB.

## Model Context Protocol (MCP)

This release introduces support for centrally adding MCP servers on LiteLLM. This allows you to add MCP server endpoints and your developers can `list` and `call` MCP tools through LiteLLM.

Read more about MCP [here](https://docs.litellm.ai/docs/mcp).

<Image
  img={require('../../img/release_notes/mcp_ui.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>
<p style={{textAlign: 'left', color: '#666'}}>
  Expose and use MCP servers through LiteLLM
</p>

## UI view total usage after 1M+ logs

This release brings the ability to view total usage analytics even after exceeding 1M+ logs in your database. We've implemented a scalable architecture that stores only aggregate usage data, resulting in significantly more efficient queries and reduced database CPU utilization.

<Image
  img={require('../../img/release_notes/ui_usage.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>
<p style={{textAlign: 'left', color: '#666'}}>
  View total usage after 1M+ logs
</p>

- How this works:
    - We now aggregate usage data into a dedicated DailyUserSpend table, significantly reducing query load and CPU usage even beyond 1M+ logs.
- Daily Spend Breakdown API:
    - Retrieve granular daily usage data (by model, provider, and API key) with a single endpoint.

Example Request:

```shell title="Daily Spend Breakdown API" showLineNumbers
curl -L -X GET 'http://localhost:4000/user/daily/activity?start_date=2025-03-20&end_date=2025-03-27' \
-H 'Authorization: Bearer sk-...'
```

```json title="Daily Spend Breakdown API Response" showLineNumbers
{
    "results": [
        {
            "date": "2025-03-27",
            "metrics": {
                "spend": 0.0177072,
                "prompt_tokens": 111,
                "completion_tokens": 1711,
                "total_tokens": 1822,
                "api_requests": 11
            },
            "breakdown": {
                "models": {
                    "gpt-4o-mini": {
                        "spend": 1.095e-05,
                        "prompt_tokens": 37,
                        "completion_tokens": 9,
                        "total_tokens": 46,
                        "api_requests": 1
                    }
                },
                "providers": { "openai": { ... }, "azure_ai": { ... } },
                "api_keys": { "3126b6eaf1...": { ... } }
            }
        }
    ],
    "metadata": {
        "total_spend": 0.7274667,
        "total_prompt_tokens": 280990,
        "total_completion_tokens": 376674,
        "total_api_requests": 14
    }
}
```

## New Models / Updated Models

- Support for Vertex AI gemini-2.0-flash-lite & Google AI Studio gemini-2.0-flash-lite [PR](https://github.com/BerriAI/litellm/pull/9523)
- Support for Vertex AI Fine-Tuned LLMs [PR](https://github.com/BerriAI/litellm/pull/9542)
- Nova Canvas image generation support [PR](https://github.com/BerriAI/litellm/pull/9525)
- OpenAI gpt-4o-transcribe support [PR](https://github.com/BerriAI/litellm/pull/9517)
- Added new Vertex AI text embedding model [PR](https://github.com/BerriAI/litellm/pull/9476)

## LLM Translation

- OpenAI Web Search Tool Call Support [PR](https://github.com/BerriAI/litellm/pull/9465)
- Vertex AI topLogprobs support [PR](https://github.com/BerriAI/litellm/pull/9518)
- Support for sending images and video to Vertex AI multimodal embedding [Doc](https://docs.litellm.ai/docs/providers/vertex#multi-modal-embeddings)
- Support litellm.api_base for Vertex AI + Gemini across completion, embedding, image_generation [PR](https://github.com/BerriAI/litellm/pull/9516)
- Bug fix for returning `response_cost` when using the litellm python SDK with LiteLLM Proxy [PR](https://github.com/BerriAI/litellm/commit/6fd18651d129d606182ff4b980e95768fc43ca3d)
- Support for `max_completion_tokens` on Mistral API [PR](https://github.com/BerriAI/litellm/pull/9606)
- Refactored Vertex AI passthrough routes - fixes unpredictable behaviour with auto-setting default_vertex_region on router model add [PR](https://github.com/BerriAI/litellm/pull/9467)

## Spend Tracking Improvements

- Log 'api_base' on spend logs [PR](https://github.com/BerriAI/litellm/pull/9509)
- Support for Gemini audio token cost tracking [PR](https://github.com/BerriAI/litellm/pull/9535)
- Fixed OpenAI audio input token cost tracking [PR](https://github.com/BerriAI/litellm/pull/9535)

## UI

### Model Management
- Allowed team admins to add/update/delete models on UI [PR](https://github.com/BerriAI/litellm/pull/9572)
- Added render supports_web_search on model hub [PR](https://github.com/BerriAI/litellm/pull/9469)

### Request Logs
- Show API base and model ID on request logs [PR](https://github.com/BerriAI/litellm/pull/9572)
- Allow viewing keyinfo on request logs [PR](https://github.com/BerriAI/litellm/pull/9568)

### Usage Tab
- Added Daily User Spend Aggregate view - allows UI Usage tab to work > 1m rows [PR](https://github.com/BerriAI/litellm/pull/9538)
- Connected UI to "LiteLLM_DailyUserSpend" spend table [PR](https://github.com/BerriAI/litellm/pull/9603)

## Logging Integrations

- Fixed StandardLoggingPayload for GCS Pub Sub Logging Integration [PR](https://github.com/BerriAI/litellm/pull/9508)
- Track `litellm_model_name` on `StandardLoggingPayload` [Docs](https://docs.litellm.ai/docs/proxy/logging_spec#standardlogginghiddenparams)

## Performance / Reliability Improvements

- LiteLLM Redis semantic caching implementation [PR](https://github.com/BerriAI/litellm/pull/9356)
- Gracefully handle exceptions when the DB is having an outage [PR](https://github.com/BerriAI/litellm/pull/9533)
- Allow Pods to start up + pass /health/readiness when allow_requests_on_db_unavailable: True and the DB is down [PR](https://github.com/BerriAI/litellm/pull/9569)

## General Improvements

- Support for exposing MCP tools on litellm proxy [PR](https://github.com/BerriAI/litellm/pull/9426)
- Support discovering Gemini, Anthropic, xAI models by calling their `/v1/models` endpoint [PR](https://github.com/BerriAI/litellm/pull/9530)
- Fixed route check for non-proxy admins on JWT auth [PR](https://github.com/BerriAI/litellm/pull/9454)
- Added baseline Prisma database migrations [PR](https://github.com/BerriAI/litellm/pull/9565)
- View all wildcard models on /model/info (see the sketch below) [PR](https://github.com/BerriAI/litellm/pull/9572)
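
As a sketch, listing all models (now including wildcard models) via the proxy's `/model/info` endpoint could look like this (demo URL and key shown; substitute your own):

```shell title="View wildcard models on /model/info" showLineNumbers
# Demo credentials - returns model info, including wildcard (e.g. provider/*) models
curl 'http://localhost:4000/model/info' \
  -H 'Authorization: Bearer sk-1234'
```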

## Security

- Bumped next from 14.2.21 to 14.2.25 in UI dashboard [PR](https://github.com/BerriAI/litellm/pull/9458)

## Complete Git Diff

[Here's the complete git diff](https://github.com/BerriAI/litellm/compare/v1.63.14-stable.patch1...v1.65.0-stable)
@@ -0,0 +1,34 @@
---
title: v1.65.0 - Team Model Add - update
slug: v1.65.0
date: 2025-03-28T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
tags: [management endpoints, team models, ui]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

v1.65.0 updates the `/model/new` endpoint to prevent non-team admins from creating team models.

This means that only proxy admins or team admins can create team models.

## Additional Changes

- Allows team admins to call `/model/update` to update team models.
- Allows team admins to call `/model/delete` to delete team models.
- Introduces a new `user_models_only` param for `/v2/model/info` - only return models added by this user (see the sketch below).
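
For illustration, the new param might be used like this (demo URL/key; the exact boolean format is an assumption):

```shell title="List only your own models" showLineNumbers
# Demo credentials - 'user_models_only=true' filters /v2/model/info
# down to models added by the calling user
curl 'http://localhost:4000/v2/model/info?user_models_only=true' \
  -H 'Authorization: Bearer sk-1234'
```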

These changes enable team admins to add and manage models for their team on the LiteLLM UI + API.

<Image img={require('../../img/release_notes/team_model_add.png')} />
@@ -0,0 +1,176 @@
---
title: v1.65.4-stable
slug: v1.65.4-stable
date: 2025-04-05T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
tags: []
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4-stable
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.65.4.post1
```
</TabItem>
</Tabs>

v1.65.4-stable is live. Here are the improvements since v1.65.0-stable.

## Key Highlights
- **Preventing DB Deadlocks**: Fixes a high-traffic issue where multiple instances writing to the DB at the same time caused deadlocks.
- **New Usage Tab**: Enables viewing spend by model and customizing the date range

Let's dive in.

### Preventing DB Deadlocks

<Image img={require('../../img/prevent_deadlocks.jpg')} />

This release fixes the DB deadlocking issue that users faced in high traffic (10K+ RPS). This is great because it means user/key/team spend tracking works at that scale.

Read more about the new architecture [here](https://docs.litellm.ai/docs/proxy/db_deadlocks)

### New Usage Tab

<Image img={require('../../img/release_notes/spend_by_model.jpg')} />

The new Usage tab now brings the ability to track daily spend by model. Combined with the ability to view successful requests and token usage, this makes it easier to catch any spend tracking or token counting errors.

To test this out, just go to Experimental > New Usage > Activity.

## New Models / Updated Models

1. Databricks - claude-3-7-sonnet cost tracking [PR](https://github.com/BerriAI/litellm/blob/52b35cd8093b9ad833987b24f494586a1e923209/model_prices_and_context_window.json#L10350)
2. VertexAI - `gemini-2.5-pro-exp-03-25` cost tracking [PR](https://github.com/BerriAI/litellm/blob/52b35cd8093b9ad833987b24f494586a1e923209/model_prices_and_context_window.json#L4492)
3. VertexAI - `gemini-2.0-flash` cost tracking [PR](https://github.com/BerriAI/litellm/blob/52b35cd8093b9ad833987b24f494586a1e923209/model_prices_and_context_window.json#L4689)
4. Groq - add whisper ASR models to model cost map [PR](https://github.com/BerriAI/litellm/blob/52b35cd8093b9ad833987b24f494586a1e923209/model_prices_and_context_window.json#L3324)
5. IBM - Add watsonx/ibm/granite-3-8b-instruct to model cost map [PR](https://github.com/BerriAI/litellm/blob/52b35cd8093b9ad833987b24f494586a1e923209/model_prices_and_context_window.json#L91)
6. Google AI Studio - add gemini/gemini-2.5-pro-preview-03-25 to model cost map [PR](https://github.com/BerriAI/litellm/blob/52b35cd8093b9ad833987b24f494586a1e923209/model_prices_and_context_window.json#L4850)

## LLM Translation

1. Vertex AI - Support anyOf param for OpenAI json schema translation [Get Started](https://docs.litellm.ai/docs/providers/vertex#json-schema)
2. Anthropic - response_format + thinking param support (works across Anthropic API, Bedrock, Vertex) [Get Started](https://docs.litellm.ai/docs/reasoning_content)
3. Anthropic - if a thinking token budget is specified and max tokens is not, ensure the max tokens sent to anthropic is higher than the thinking tokens (works across Anthropic API, Bedrock, Vertex) [PR](https://github.com/BerriAI/litellm/pull/9594)
4. Bedrock - latency optimized inference support [Get Started](https://docs.litellm.ai/docs/providers/bedrock#usage---latency-optimized-inference)
5. Sagemaker - handle special tokens + multibyte character codes in response [Get Started](https://docs.litellm.ai/docs/providers/aws_sagemaker)
6. MCP - add support for using SSE MCP servers [Get Started](https://docs.litellm.ai/docs/mcp#usage)
7. Anthropic - new `litellm.messages.create` interface for calling Anthropic `/v1/messages` via passthrough [Get Started](https://docs.litellm.ai/docs/anthropic_unified#usage)
8. Anthropic - support 'file' content type in message param (works across Anthropic API, Bedrock, Vertex) [Get Started](https://docs.litellm.ai/docs/providers/anthropic#usage---pdf)
9. Anthropic - map openai 'reasoning_effort' to anthropic 'thinking' param (works across Anthropic API, Bedrock, Vertex) [Get Started](https://docs.litellm.ai/docs/providers/anthropic#usage---thinking--reasoning_content)
10. Google AI Studio (Gemini) - [BETA] `/v1/files` upload support [Get Started](../../docs/providers/google_ai_studio/files)
11. Azure - fix o-series tool calling [Get Started](../../docs/providers/azure#tool-calling--function-calling)
12. Unified file id - [ALPHA] allow calling multiple providers with the same file id [PR](https://github.com/BerriAI/litellm/pull/9718)
    - This is experimental, and not recommended for production use.
    - We plan to have a production-ready implementation by next week.
13. Google AI Studio (Gemini) - return logprobs [PR](https://github.com/BerriAI/litellm/pull/9713)
14. Anthropic - Support prompt caching for Anthropic tool calls [Get Started](https://docs.litellm.ai/docs/completion/prompt_caching)
15. OpenRouter - unwrap extra body on open router calls [PR](https://github.com/BerriAI/litellm/pull/9747)
16. VertexAI - fix credential caching issue [PR](https://github.com/BerriAI/litellm/pull/9756)
17. XAI - filter out 'name' param for XAI [PR](https://github.com/BerriAI/litellm/pull/9761)
18. Gemini - image generation output support [Get Started](../../docs/providers/gemini#image-generation)
19. Databricks - support claude-3-7-sonnet w/ thinking + response_format [Get Started](../../docs/providers/databricks#usage---thinking--reasoning_content)

## Spend Tracking Improvements

1. Reliability fix - Check sent and received model for cost calculation [PR](https://github.com/BerriAI/litellm/pull/9669)
2. Vertex AI - Multimodal embedding cost tracking [Get Started](https://docs.litellm.ai/docs/providers/vertex#multi-modal-embeddings), [PR](https://github.com/BerriAI/litellm/pull/9623)

## Management Endpoints / UI

<Image img={require('../../img/release_notes/new_activity_tab.png')} />

1. New Usage Tab
    - Report 'total_tokens' + report success/failure calls
    - Remove double bars on scroll
    - Ensure the 'daily spend' chart is ordered from earliest to latest date
    - Show spend per model per day
    - Show key alias on usage tab
    - Allow non-admins to view their activity
    - Add date picker to new usage tab
2. Virtual Keys Tab
    - Remove 'default key' on user signup
    - Fix showing user models available for personal key creation
3. Test Key Tab
    - Allow testing image generation models
4. Models Tab
    - Fix bulk adding models
    - Support reusable credentials for passthrough endpoints
    - Allow team members to see team models
5. Teams Tab
    - Fix json serialization error on update team metadata
6. Request Logs Tab
    - Add reasoning_content token tracking across all providers on streaming
7. API
    - Return key alias on /user/daily/activity [Get Started](../../docs/proxy/cost_tracking#daily-spend-breakdown-api)
8. SSO
    - Allow assigning SSO users to teams on MSFT SSO [PR](https://github.com/BerriAI/litellm/pull/9745)

## Logging / Guardrail Integrations

1. Console Logs - Add json formatting for uncaught exceptions [PR](https://github.com/BerriAI/litellm/pull/9619)
2. Guardrails - AIM Guardrails support for virtual key based policies [Get Started](../../docs/proxy/guardrails/aim_security)
3. Logging - fix completion start time tracking [PR](https://github.com/BerriAI/litellm/pull/9688)
4. Prometheus
    - Allow adding authentication on Prometheus /metrics endpoints (see the sketch after this list) [PR](https://github.com/BerriAI/litellm/pull/9766)
    - Distinguish LLM Provider Exception vs. LiteLLM Exception in metric naming [PR](https://github.com/BerriAI/litellm/pull/9760)
    - Emit operational metrics for new DB Transaction architecture [PR](https://github.com/BerriAI/litellm/pull/9719)
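
As a sketch, once authentication is enabled, scraping `/metrics` becomes an ordinary authenticated request (demo key shown; how auth is enforced depends on your proxy configuration, so treat this as an assumption):

```shell title="Scrape an authenticated /metrics endpoint" showLineNumbers
# Assumes the proxy was configured to require auth on /metrics
curl http://localhost:4000/metrics \
  -H 'Authorization: Bearer sk-1234'
```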

## Performance / Loadbalancing / Reliability improvements

1. Preventing Deadlocks
    - Reduce DB Deadlocks by storing spend updates in Redis and then committing to DB [PR](https://github.com/BerriAI/litellm/pull/9608)
    - Ensure no deadlocks occur when updating DailyUserSpendTransaction [PR](https://github.com/BerriAI/litellm/pull/9690)
    - High Traffic fix - ensure the new DB + Redis architecture accurately tracks spend [PR](https://github.com/BerriAI/litellm/pull/9673)
    - Use Redis for the PodLock Manager instead of PG (ensures no deadlocks occur) [PR](https://github.com/BerriAI/litellm/pull/9715)
    - v2 DB Deadlock Reduction Architecture - Add Max Size for In-Memory Queue + Backpressure Mechanism [PR](https://github.com/BerriAI/litellm/pull/9759)
2. Prisma Migrations [Get Started](../../docs/proxy/prod#9-use-prisma-migrate-deploy)
    - Connects litellm proxy to litellm's prisma migration files
    - Handle db schema updates from the new `litellm-proxy-extras` sdk
3. Redis - support password for sync sentinel clients [PR](https://github.com/BerriAI/litellm/pull/9622)
4. Fix "Circular reference detected" error when max_parallel_requests = 0 [PR](https://github.com/BerriAI/litellm/pull/9671)
5. Code QA - Ban hardcoded numbers [PR](https://github.com/BerriAI/litellm/pull/9709)

## Helm

1. Fix: wrong indentation of ttlSecondsAfterFinished in chart [PR](https://github.com/BerriAI/litellm/pull/9611)

## General Proxy Improvements

1. Fix - only apply service_account_settings.enforced_params on service accounts [PR](https://github.com/BerriAI/litellm/pull/9683)
2. Fix - handle metadata null on `/chat/completion` [PR](https://github.com/BerriAI/litellm/issues/9717)
3. Fix - Move daily user transaction logging outside of the 'disable_spend_logs' flag, as they're unrelated [PR](https://github.com/BerriAI/litellm/pull/9772)

## Demo

Try this on the demo instance [today](https://docs.litellm.ai/docs/proxy/demo)

## Complete Git Diff

See the complete git diff since v1.65.0-stable, [here](https://github.com/BerriAI/litellm/releases/tag/v1.65.4-stable)
@@ -0,0 +1,197 @@
---
title: v1.66.0-stable - Realtime API Cost Tracking
slug: v1.66.0-stable
date: 2025-04-12T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
tags: ["sso", "unified_file_id", "cost_tracking", "security"]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.0-stable
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.66.0.post1
```
</TabItem>
</Tabs>

v1.66.0-stable is live now. Here are the key highlights of this release:

## Key Highlights
- **Realtime API Cost Tracking**: Track cost of realtime API calls
- **Microsoft SSO Auto-sync**: Auto-sync groups and group members from Azure Entra ID to LiteLLM
- **xAI grok-3**: Added support for `xai/grok-3` models
- **Security Fixes**: Fixed [CVE-2025-0330](https://www.cve.org/CVERecord?id=CVE-2025-0330) and [CVE-2024-6825](https://www.cve.org/CVERecord?id=CVE-2024-6825) vulnerabilities

Let's dive in.

## Realtime API Cost Tracking

<Image
  img={require('../../img/realtime_api.png')}
  style={{width: '100%', display: 'block'}}
/>

This release adds Realtime API logging + cost tracking.
- **Logging**: LiteLLM now logs the complete response from realtime calls to all logging integrations (DB, S3, Langfuse, etc.)
- **Cost Tracking**: You can now set 'base_model' and custom pricing for realtime models. [Custom Pricing](../../docs/proxy/custom_pricing)
- **Budgets**: Your key/user/team budgets now work for realtime models as well.

Start [here](https://docs.litellm.ai/docs/realtime)

## Microsoft SSO Auto-sync

<Image
  img={require('../../img/release_notes/sso_sync.png')}
  style={{width: '100%', display: 'block'}}
/>
<p style={{textAlign: 'left', color: '#666'}}>
  Auto-sync groups and members from Azure Entra ID to LiteLLM
</p>

This release adds support for auto-syncing groups and members on Microsoft Entra ID with LiteLLM. This means that LiteLLM proxy administrators can spend less time managing teams and members, and LiteLLM handles the following:

- Auto-create teams that exist on Microsoft Entra ID
- Sync team members on Microsoft Entra ID with LiteLLM teams

Get started with this [here](https://docs.litellm.ai/docs/tutorials/msft_sso)

## New Models / Updated Models

- **xAI**
    1. Added reasoning_effort support for `xai/grok-3-mini-beta` (see the sketch after this list) [Get Started](https://docs.litellm.ai/docs/providers/xai#reasoning-usage)
    2. Added cost tracking for `xai/grok-3` models [PR](https://github.com/BerriAI/litellm/pull/9920)
- **Hugging Face**
    1. Added inference providers support [Get Started](https://docs.litellm.ai/docs/providers/huggingface#serverless-inference-providers)
- **Azure**
    1. Updated Azure Phi-4 pricing [PR](https://github.com/BerriAI/litellm/pull/9862)
    2. Added azure/gpt-4o-realtime-audio cost tracking [PR](https://github.com/BerriAI/litellm/pull/9893)
- **VertexAI**
    1. Added enterpriseWebSearch tool support [Get Started](https://docs.litellm.ai/docs/providers/vertex#grounding---web-search)
    2. Moved to only passing keys accepted by the Vertex AI response schema [PR](https://github.com/BerriAI/litellm/pull/8992)
- **Google AI Studio**
    1. Added cost tracking for `gemini-2.5-pro` [PR](https://github.com/BerriAI/litellm/pull/9837)
    2. Fixed pricing for 'gemini/gemini-2.5-pro-preview-03-25' [PR](https://github.com/BerriAI/litellm/pull/9896)
    3. Fixed handling file_data being passed in [PR](https://github.com/BerriAI/litellm/pull/9786)
- **Databricks**
    1. Removed reasoning_effort from parameters [PR](https://github.com/BerriAI/litellm/pull/9811)
    2. Fixed custom endpoint check for Databricks [PR](https://github.com/BerriAI/litellm/pull/9925)
- **General**
    1. Added litellm.supports_reasoning() util to track if an llm supports reasoning [Get Started](https://docs.litellm.ai/docs/providers/anthropic#reasoning)
    2. Function Calling - Handle pydantic base model in message tool calls, handle tools = [], and support fake streaming on tool calls for meta.llama3-3-70b-instruct-v1:0 [PR](https://github.com/BerriAI/litellm/pull/9774)
    3. LiteLLM Proxy - Allow passing `thinking` param to litellm proxy via client sdk [PR](https://github.com/BerriAI/litellm/pull/9386)
    4. Fixed correctly translating 'thinking' param for litellm [PR](https://github.com/BerriAI/litellm/pull/9904)
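
For illustration, a `reasoning_effort` request to `grok-3-mini-beta` through the proxy might look like this (demo URL/key; treat the exact payload as an assumption):

```shell title="xAI reasoning_effort example" showLineNumbers
# Demo credentials - reasoning_effort is forwarded to xAI
curl http://localhost:4000/v1/chat/completions \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "xai/grok-3-mini-beta",
    "reasoning_effort": "low",
    "messages": [{"role": "user", "content": "What is 101 * 3?"}]
  }'
```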

## Spend Tracking Improvements

- **OpenAI, Azure**
    1. Realtime API Cost tracking with token usage metrics in spend logs [Get Started](https://docs.litellm.ai/docs/realtime)
- **Anthropic**
    1. Fixed Claude Haiku cache read pricing per token [PR](https://github.com/BerriAI/litellm/pull/9834)
    2. Added cost tracking for Claude responses with base_model [PR](https://github.com/BerriAI/litellm/pull/9897)
    3. Fixed Anthropic prompt caching cost calculation and trimmed logged message in db [PR](https://github.com/BerriAI/litellm/pull/9838)
- **General**
    1. Added token tracking and log usage object in spend logs [PR](https://github.com/BerriAI/litellm/pull/9843)
    2. Handle custom pricing at deployment level [PR](https://github.com/BerriAI/litellm/pull/9855)

## Management Endpoints / UI

- **Test Key Tab**
    1. Added rendering of Reasoning content, ttft, usage metrics on test key page [PR](https://github.com/BerriAI/litellm/pull/9931)

<Image
  img={require('../../img/release_notes/chat_metrics.png')}
  style={{width: '100%', display: 'block'}}
/>
<p style={{textAlign: 'left', color: '#666'}}>
  View input, output, reasoning tokens, ttft metrics.
</p>

- **Tag / Policy Management**
    1. Added Tag/Policy Management. Create routing rules based on request metadata. This allows you to enforce that requests with `tags="private"` only go to specific models. [Get Started](https://docs.litellm.ai/docs/tutorials/tag_management)

<br />

<Image
  img={require('../../img/release_notes/tag_management.png')}
  style={{width: '100%', display: 'block'}}
/>
<p style={{textAlign: 'left', color: '#666'}}>
  Create and manage tags.
</p>

- **Redesigned Login Screen**
    1. Polished login screen [PR](https://github.com/BerriAI/litellm/pull/9778)
- **Microsoft SSO Auto-Sync**
    1. Added debug route to allow admins to debug SSO JWT fields [PR](https://github.com/BerriAI/litellm/pull/9835)
    2. Added ability to use MSFT Graph API to assign users to teams [PR](https://github.com/BerriAI/litellm/pull/9865)
    3. Connected litellm to Azure Entra ID Enterprise Application [PR](https://github.com/BerriAI/litellm/pull/9872)
    4. Added ability for admins to set `default_team_params` for when litellm SSO creates default teams [PR](https://github.com/BerriAI/litellm/pull/9895)
    5. Fixed MSFT SSO to use the correct field for user email [PR](https://github.com/BerriAI/litellm/pull/9886)
    6. Added UI support for setting the Default Team setting when litellm SSO auto-creates teams [PR](https://github.com/BerriAI/litellm/pull/9918)
- **UI Bug Fixes**
    1. Prevented team, key, org, model numerical values changing on scrolling [PR](https://github.com/BerriAI/litellm/pull/9776)
    2. Instantly reflect key and team updates in UI [PR](https://github.com/BerriAI/litellm/pull/9825)

## Logging / Guardrail Improvements

- **Prometheus**
    1. Emit Key and Team Budget metrics on a cron job schedule [Get Started](https://docs.litellm.ai/docs/proxy/prometheus#initialize-budget-metrics-on-startup)

## Security Fixes

- Fixed [CVE-2025-0330](https://www.cve.org/CVERecord?id=CVE-2025-0330) - Leakage of Langfuse API keys in team exception handling [PR](https://github.com/BerriAI/litellm/pull/9830)
- Fixed [CVE-2024-6825](https://www.cve.org/CVERecord?id=CVE-2024-6825) - Remote code execution in post call rules [PR](https://github.com/BerriAI/litellm/pull/9826)

## Helm

- Added service annotations to litellm-helm chart [PR](https://github.com/BerriAI/litellm/pull/9840)
- Added extraEnvVars to the helm deployment [PR](https://github.com/BerriAI/litellm/pull/9292)

## Demo

Try this on the demo instance [today](https://docs.litellm.ai/docs/proxy/demo)

## Complete Git Diff

See the complete git diff since v1.65.4-stable, [here](https://github.com/BerriAI/litellm/releases/tag/v1.66.0-stable)
@@ -0,0 +1,153 @@
---
title: v1.67.0-stable - SCIM Integration
slug: v1.67.0-stable
date: 2025-04-19T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
tags: ["sso", "unified_file_id", "cost_tracking", "security"]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Key Highlights

- **SCIM Integration**: Enables identity providers (Okta, Azure AD, OneLogin, etc.) to automate user and team (group) provisioning, updates, and deprovisioning
- **Team and Tag based usage tracking**: You can now see usage and spend by team and tag at 1M+ spend logs.
- **Unified Responses API**: Support for calling Anthropic, Gemini, Groq, etc. via OpenAI's new Responses API.

Let's dive in.

## SCIM Integration

<Image img={require('../../img/scim_integration.png')}/>

This release adds SCIM support to LiteLLM. This allows your SSO provider (Okta, Azure AD, etc) to automatically create/delete users, teams, and memberships on LiteLLM. This means that when you remove a team on your SSO provider, your SSO provider will automatically delete the corresponding team on LiteLLM.

[Read more](../../docs/tutorials/scim_litellm)

## Team and Tag based usage tracking

<Image img={require('../../img/release_notes/new_team_usage_highlight.jpg')}/>

This release improves team and tag based usage tracking at 1M+ spend logs, making it easy to monitor your LLM API spend in production. This covers:

- View **daily spend** by teams + tags
- View **usage / spend by key**, within teams
- View **spend by multiple tags**
- Allow **internal users** to view spend of teams they're a member of

[Read more](#management-endpoints--ui)

## Unified Responses API

This release allows you to call Azure OpenAI, Anthropic, AWS Bedrock, and Google Vertex AI models via the POST /v1/responses endpoint on LiteLLM. This means you can now use popular tools like [OpenAI Codex](https://docs.litellm.ai/docs/tutorials/openai_codex) with your own models.

<Image img={require('../../img/release_notes/unified_responses_api_rn.png')}/>

[Read more](https://docs.litellm.ai/docs/response_api)
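
As a sketch, a Responses API call through the proxy might look like this (the model alias and demo credentials are placeholders for your own config):

```shell title="Unified Responses API example" showLineNumbers
# Placeholder model alias + demo credentials
curl http://localhost:4000/v1/responses \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "anthropic/claude-3-7-sonnet-20250219",
    "input": "Write a one-sentence summary of LiteLLM."
  }'
```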

## New Models / Updated Models

- **OpenAI**
    1. gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o3-mini, o4-mini pricing - [Get Started](../../docs/providers/openai#usage), [PR](https://github.com/BerriAI/litellm/pull/9990)
    2. o4 - correctly map o4 to the openai o_series model
- **Azure AI**
    1. Phi-4 output cost per token fix - [PR](https://github.com/BerriAI/litellm/pull/9880)
    2. Responses API support [Get Started](../../docs/providers/azure#azure-responses-api), [PR](https://github.com/BerriAI/litellm/pull/10116)
- **Anthropic**
    1. Redacted message thinking support - [Get Started](../../docs/providers/anthropic#usage---thinking--reasoning_content), [PR](https://github.com/BerriAI/litellm/pull/10129)
- **Cohere**
    1. `/v2/chat` Passthrough endpoint support w/ cost tracking - [Get Started](../../docs/pass_through/cohere), [PR](https://github.com/BerriAI/litellm/pull/9997)
- **Azure**
    1. Support azure tenant_id/client_id env vars - [Get Started](../../docs/providers/azure#entra-id---use-tenant_id-client_id-client_secret), [PR](https://github.com/BerriAI/litellm/pull/9993)
    2. Fix response_format check for 2025+ api versions - [PR](https://github.com/BerriAI/litellm/pull/9993)
    3. Add gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o3-mini, o4-mini pricing
- **VLLM**
    1. Files - Support 'file' message type for VLLM video urls - [Get Started](../../docs/providers/vllm#send-video-url-to-vllm), [PR](https://github.com/BerriAI/litellm/pull/10129)
    2. Passthrough - new `/vllm/` passthrough endpoint support [Get Started](../../docs/pass_through/vllm), [PR](https://github.com/BerriAI/litellm/pull/10002)
- **Mistral**
    1. New `/mistral` passthrough endpoint support [Get Started](../../docs/pass_through/mistral), [PR](https://github.com/BerriAI/litellm/pull/10002)
- **AWS**
    1. New mapped bedrock regions - [PR](https://github.com/BerriAI/litellm/pull/9430)
- **VertexAI / Google AI Studio**
    1. Gemini - Response format - Retain schema field ordering for google gemini and vertex by specifying propertyOrdering - [Get Started](../../docs/providers/vertex#json-schema), [PR](https://github.com/BerriAI/litellm/pull/9828)
    2. Gemini-2.5-flash - return reasoning content [Google AI Studio](../../docs/providers/gemini#usage---thinking--reasoning_content), [Vertex AI](../../docs/providers/vertex#thinking--reasoning_content)
    3. Gemini-2.5-flash - pricing + model information [PR](https://github.com/BerriAI/litellm/pull/10125)
    4. Passthrough - new `/vertex_ai/discovery` route - enables calling AgentBuilder API routes [Get Started](../../docs/pass_through/vertex_ai#supported-api-endpoints), [PR](https://github.com/BerriAI/litellm/pull/10084)
- **Fireworks AI**
    1. Return tool calling responses in the `tool_calls` field (fireworks incorrectly returns this as a json str in content) [PR](https://github.com/BerriAI/litellm/pull/10130)
- **Triton**
    1. Remove fixed bad_words / stop words from the `/generate` call - [Get Started](../../docs/providers/triton-inference-server#triton-generate---chat-completion), [PR](https://github.com/BerriAI/litellm/pull/10163)
- **Other**
    1. Support for all litellm providers on the Responses API (works with Codex) - [Get Started](../../docs/tutorials/openai_codex), [PR](https://github.com/BerriAI/litellm/pull/10132)
    2. Fix combining multiple tool calls in streaming response - [Get Started](../../docs/completion/stream#helper-function), [PR](https://github.com/BerriAI/litellm/pull/10040)

## Spend Tracking Improvements

- **Cost Control** - inject cache control points in the prompt for cost reduction [Get Started](../../docs/tutorials/prompt_caching), [PR](https://github.com/BerriAI/litellm/pull/10000)
- **Spend Tags** - spend tags in headers - support x-litellm-tags even if tag based routing is not enabled (see the sketch after this list) [Get Started](../../docs/proxy/request_headers#litellm-headers), [PR](https://github.com/BerriAI/litellm/pull/10000)
- **Gemini-2.5-flash** - support cost calculation for reasoning tokens [PR](https://github.com/BerriAI/litellm/pull/10141)
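
For illustration, sending spend tags via the `x-litellm-tags` header might look like this (demo URL/key and tag names are placeholders):

```shell title="Spend tags in headers" showLineNumbers
# Demo credentials - tags in the header are attached to this request's spend logs
curl http://localhost:4000/v1/chat/completions \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -H 'x-litellm-tags: private,prod' \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]}'
```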

## Management Endpoints / UI

- **Users**
    1. Show created_at and updated_at on the users page - [PR](https://github.com/BerriAI/litellm/pull/10033)
- **Virtual Keys**
    1. Filter by key alias - [PR](https://github.com/BerriAI/litellm/pull/10085)
- **Usage Tab**
    1. Team based usage
        - New `LiteLLM_DailyTeamSpend` table for aggregate team based usage logging - [PR](https://github.com/BerriAI/litellm/pull/10039)
        - New Team based usage dashboard + new `/team/daily/activity` API - [PR](https://github.com/BerriAI/litellm/pull/10081)
        - Return team alias on the /team/daily/activity API - [PR](https://github.com/BerriAI/litellm/pull/10157)
        - Allow internal users to view spend for teams they belong to - [PR](https://github.com/BerriAI/litellm/pull/10157)
        - Allow viewing top keys by team - [PR](https://github.com/BerriAI/litellm/pull/10157)

<Image img={require('../../img/release_notes/new_team_usage.png')}/>

    2. Tag Based Usage
        - New `LiteLLM_DailyTagSpend` table for aggregate tag based usage logging - [PR](https://github.com/BerriAI/litellm/pull/10071)
        - Restrict to only Proxy Admins - [PR](https://github.com/BerriAI/litellm/pull/10157)
        - Allow viewing top keys by tag
        - Return tags passed in the request (i.e. dynamic tags) on the `/tag/list` API - [PR](https://github.com/BerriAI/litellm/pull/10157)

<Image img={require('../../img/release_notes/new_tag_usage.png')}/>

    3. Track prompt caching metrics in daily user, team, tag tables - [PR](https://github.com/BerriAI/litellm/pull/10029)
    4. Show usage by key (on all up, team, and tag usage dashboards) - [PR](https://github.com/BerriAI/litellm/pull/10157)
    5. Swap the old Usage tab for the new Usage tab
- **Models**
    1. Make columns resizable/hideable - [PR](https://github.com/BerriAI/litellm/pull/10119)
- **API Playground**
    1. Allow internal users to call the api playground - [PR](https://github.com/BerriAI/litellm/pull/10157)
- **SCIM**
    1. Add LiteLLM SCIM Integration for Team and User management - [Get Started](../../docs/tutorials/scim_litellm), [PR](https://github.com/BerriAI/litellm/pull/10072)

## Logging / Guardrail Integrations

- **GCS**
    1. Fix gcs pub sub logging with env var GCS_PROJECT_ID - [Get Started](../../docs/observability/gcs_bucket_integration#usage), [PR](https://github.com/BerriAI/litellm/pull/10042)
- **AIM**
    1. Add litellm call id passing to Aim guardrails on pre and post-hooks calls - [Get Started](../../docs/proxy/guardrails/aim_security), [PR](https://github.com/BerriAI/litellm/pull/10021)
- **Azure blob storage**
    1. Ensure logging works in high throughput scenarios - [Get Started](../../docs/proxy/logging#azure-blob-storage), [PR](https://github.com/BerriAI/litellm/pull/9962)

## General Proxy Improvements

- **Support setting `litellm.modify_params` via env var** [PR](https://github.com/BerriAI/litellm/pull/9964)
- **Model Discovery** - Check the provider's `/models` endpoints when calling the proxy's `/v1/models` endpoint - [Get Started](../../docs/proxy/model_discovery), [PR](https://github.com/BerriAI/litellm/pull/9958)
- **`/utils/token_counter`** - fix retrieving custom tokenizer for db models - [Get Started](../../docs/proxy/configs#set-custom-tokenizer), [PR](https://github.com/BerriAI/litellm/pull/10047)
- **Prisma migrate** - handle existing columns in db table - [PR](https://github.com/BerriAI/litellm/pull/10138)
@@ -0,0 +1,197 @@
---
title: v1.67.4-stable - Improved User Management
slug: v1.67.4-stable
date: 2025-04-26T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

tags: ["responses_api", "ui_improvements", "security", "session_management"]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.4-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.67.4.post1
```

</TabItem>
</Tabs>
## Key Highlights

- **Improved User Management**: This release enables search and filtering across users, keys, teams, and models.
- **Responses API Load Balancing**: Route requests across provider regions and ensure session continuity.
- **UI Session Logs**: Group several requests to LiteLLM into a session.

## Improved User Management

<Image img={require('../../img/release_notes/ui_search_users.png')}/>
<br/>

This release makes it easier to manage users and keys on LiteLLM. You can now search and filter across users, keys, teams, and models, and control user settings more easily.

New features include:

- Search for users by email, ID, role, or team.
- See all of a user's models, teams, and keys in one place.
- Change user roles and model access right from the Users Tab.

These changes help you spend less time on user setup and management on LiteLLM.
## Responses API Load Balancing

<Image img={require('../../img/release_notes/ui_responses_lb.png')}/>
<br/>

This release introduces load balancing for the Responses API, allowing you to route requests across provider regions and ensure session continuity. It works as follows:

- If a `previous_response_id` is provided, LiteLLM routes the request to the original deployment that generated the prior response, ensuring session continuity.
- If no `previous_response_id` is provided, LiteLLM load-balances requests across your available deployments.

A sketch of what this looks like from the client side is shown after the link below.

[Read more](https://docs.litellm.ai/docs/response_api#load-balancing-with-session-continuity)
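
A minimal sketch of session continuity from the client side, assuming an OpenAI client pointed at a LiteLLM proxy (the base URL, virtual key, and model alias are placeholders):

```python showLineNumbers title="Responses API session continuity"
from openai import OpenAI

# placeholders - point these at your LiteLLM proxy and virtual key
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# first request - LiteLLM load-balances across available deployments
first = client.responses.create(model="gpt-4o", input="Remember the number 42.")

# follow-up with previous_response_id - routed back to the same deployment
followup = client.responses.create(
    model="gpt-4o",
    previous_response_id=first.id,
    input="What number did I ask you to remember?",
)
print(followup.output_text)
```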
## UI Session Logs

<Image img={require('../../img/ui_session_logs.png')}/>
<br/>

This release allows you to group requests to the LiteLLM proxy into a session. If you specify a `litellm_session_id` in your request, LiteLLM will automatically group all logs with the same session ID. This lets you easily track usage and request content per session. A sketch of passing the session ID is shown after the link below.

[Read more](https://docs.litellm.ai/docs/proxy/ui_logs_sessions)
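
A minimal sketch, assuming the session ID can be passed via the OpenAI client's `extra_body` (the linked docs show the exact field placement; endpoint, key, and model are placeholders):

```python showLineNumbers title="Grouping requests into a session"
import uuid
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
session_id = str(uuid.uuid4())  # reuse this ID for every request in the session

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short bio for a mobile app."}],
    extra_body={"litellm_session_id": session_id},  # groups this log under the session
)
print(response.choices[0].message.content)
```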
## New Models / Updated Models

- **OpenAI**
    1. Added `gpt-image-1` cost tracking [Get Started](https://docs.litellm.ai/docs/image_generation)
    2. Bug fix: added cost tracking for gpt-image-1 when quality is unspecified [PR](https://github.com/BerriAI/litellm/pull/10247)
- **Azure**
    1. Fixed passing timestamp granularities to Whisper on Azure [Get Started](https://docs.litellm.ai/docs/audio_transcription)
    2. Added azure/gpt-image-1 pricing [Get Started](https://docs.litellm.ai/docs/image_generation), [PR](https://github.com/BerriAI/litellm/pull/10327)
    3. Added cost tracking for `azure/computer-use-preview`, `azure/gpt-4o-audio-preview-2024-12-17`, `azure/gpt-4o-mini-audio-preview-2024-12-17` [PR](https://github.com/BerriAI/litellm/pull/10178)
- **Bedrock**
    1. Added support for all compatible Bedrock parameters when model="arn:.." (Bedrock application inference profile models) [Get started](https://docs.litellm.ai/docs/providers/bedrock#bedrock-application-inference-profile), [PR](https://github.com/BerriAI/litellm/pull/10256)
    2. Fixed wrong system prompt transformation [PR](https://github.com/BerriAI/litellm/pull/10120)
- **VertexAI / Google AI Studio**
    1. Allow setting `budget_tokens=0` for `gemini-2.5-flash` [Get Started](https://docs.litellm.ai/docs/providers/gemini#usage---thinking--reasoning_content), [PR](https://github.com/BerriAI/litellm/pull/10198)
    2. Ensure returned `usage` includes thinking token usage [PR](https://github.com/BerriAI/litellm/pull/10198)
    3. Added cost tracking for `gemini-2.5-pro-preview-03-25` [PR](https://github.com/BerriAI/litellm/pull/10178)
- **Cohere**
    1. Added support for cohere command-a-03-2025 [Get Started](https://docs.litellm.ai/docs/providers/cohere), [PR](https://github.com/BerriAI/litellm/pull/10295)
- **SageMaker**
    1. Added support for the max_completion_tokens parameter [Get Started](https://docs.litellm.ai/docs/providers/sagemaker), [PR](https://github.com/BerriAI/litellm/pull/10300)
- **Responses API**
    1. Added support for GET and DELETE operations - `/v1/responses/{response_id}` [Get Started](../../docs/response_api)
    2. Added session management support for non-OpenAI models [PR](https://github.com/BerriAI/litellm/pull/10321)
    3. Added routing affinity to maintain model consistency within sessions [Get Started](https://docs.litellm.ai/docs/response_api#load-balancing-with-routing-affinity), [PR](https://github.com/BerriAI/litellm/pull/10193)
## Spend Tracking Improvements

- **Bug Fix**: Fixed spend tracking bug, ensuring default litellm params aren't modified in memory [PR](https://github.com/BerriAI/litellm/pull/10167)
- **Deprecation Dates**: Added deprecation dates for Azure, VertexAI models [PR](https://github.com/BerriAI/litellm/pull/10308)
## Management Endpoints / UI

#### Users

- **Filtering and Searching**:
    - Filter users by user_id, role, team, sso_id
    - Search users by email

<br/>

<Image img={require('../../img/release_notes/user_filters.png')}/>

- **User Info Panel**: Added a new user information pane [PR](https://github.com/BerriAI/litellm/pull/10213)
    - View teams, keys, models associated with a user
    - Edit user role, model permissions

#### Teams

- **Filtering and Searching**:
    - Filter teams by Organization, Team ID [PR](https://github.com/BerriAI/litellm/pull/10324)
    - Search teams by Team Name [PR](https://github.com/BerriAI/litellm/pull/10324)

<br/>

<Image img={require('../../img/release_notes/team_filters.png')}/>

#### Keys

- **Key Management**:
    - Support for cross-filtering and filtering by key hash [PR](https://github.com/BerriAI/litellm/pull/10322)
    - Fixed key alias reset when resetting filters [PR](https://github.com/BerriAI/litellm/pull/10099)
    - Fixed table rendering on key creation [PR](https://github.com/BerriAI/litellm/pull/10224)

#### UI Logs Page

- **Session Logs**: Added UI Session Logs [Get Started](https://docs.litellm.ai/docs/proxy/ui_logs_sessions)

#### UI Authentication & Security

- **Required Authentication**: Authentication is now required for all dashboard pages [PR](https://github.com/BerriAI/litellm/pull/10229)
- **SSO Fixes**: Fixed SSO user login invalid token error [PR](https://github.com/BerriAI/litellm/pull/10298)
- [BETA] **Encrypted Tokens**: Moved UI to encrypted token usage [PR](https://github.com/BerriAI/litellm/pull/10302)
- **Token Expiry**: Support token refresh by re-routing to the login page (fixes issue where an expired token would show a blank page) [PR](https://github.com/BerriAI/litellm/pull/10250)

#### UI General fixes

- **Fixed UI Flicker**: Addressed UI flickering issues in the Dashboard [PR](https://github.com/BerriAI/litellm/pull/10261)
- **Improved Terminology**: Better loading and no-data states on Keys and Tools pages [PR](https://github.com/BerriAI/litellm/pull/10253)
- **Azure Model Support**: Fixed editing Azure public model names and changing model names after creation [PR](https://github.com/BerriAI/litellm/pull/10249)
- **Team Model Selector**: Bug fix for team model selection [PR](https://github.com/BerriAI/litellm/pull/10171)
## Logging / Guardrail Integrations

- **Datadog**:
    1. Fixed Datadog LLM observability logging [Get Started](https://docs.litellm.ai/docs/proxy/logging#datadog), [PR](https://github.com/BerriAI/litellm/pull/10206)
- **Prometheus / Grafana**:
    1. Enable datasource selection on the LiteLLM Grafana Template [Get Started](https://docs.litellm.ai/docs/proxy/prometheus#-litellm-maintained-grafana-dashboards-), [PR](https://github.com/BerriAI/litellm/pull/10257)
- **AgentOps**:
    1. Added AgentOps Integration [Get Started](https://docs.litellm.ai/docs/observability/agentops_integration), [PR](https://github.com/BerriAI/litellm/pull/9685)
- **Arize**:
    1. Added missing attributes for Arize & Phoenix Integration [Get Started](https://docs.litellm.ai/docs/observability/arize_integration), [PR](https://github.com/BerriAI/litellm/pull/10215)
## General Proxy Improvements

- **Caching**: Fixed caching to account for `thinking` or `reasoning_effort` when calculating the cache key [PR](https://github.com/BerriAI/litellm/pull/10140)
- **Model Groups**: Fixed handling for cases where the user sets model_group inside model_info [PR](https://github.com/BerriAI/litellm/pull/10191)
- **Passthrough Endpoints**: Ensured `PassthroughStandardLoggingPayload` is logged with method, URL, request/response body [PR](https://github.com/BerriAI/litellm/pull/10194)
- **Fix SQL Injection**: Fixed a potential SQL injection vulnerability in spend_management_endpoints.py [PR](https://github.com/BerriAI/litellm/pull/9878)

## Helm

- Fixed serviceAccountName on migration job [PR](https://github.com/BerriAI/litellm/pull/10258)

## Full Changelog

The complete list of changes can be found in the [GitHub release notes](https://github.com/BerriAI/litellm/compare/v1.67.0-stable...v1.67.4-stable).
@@ -0,0 +1,182 @@
---
title: v1.68.0-stable
slug: v1.68.0-stable
date: 2025-05-03T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.68.0-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.68.0.post1
```

</TabItem>
</Tabs>
## Key Highlights

LiteLLM v1.68.0-stable will be live soon. Here are the key highlights of this release:

- **Bedrock Knowledge Base**: You can now query your Bedrock Knowledge Base with all LiteLLM models via the `/chat/completion` or `/responses` API.
- **Rate Limits**: This release brings accurate rate limiting across multiple instances, reducing spillover to at most 10 additional requests in high traffic.
- **Meta Llama API**: Added support for Meta Llama API [Get Started](https://docs.litellm.ai/docs/providers/meta_llama)
- **LlamaFile**: Added support for LlamaFile [Get Started](https://docs.litellm.ai/docs/providers/llamafile)
## Bedrock Knowledge Base (Vector Store)

<Image img={require('../../img/release_notes/bedrock_kb.png')}/>
<br/>

This release adds support for Bedrock vector stores (knowledge bases) in LiteLLM. With this update, you can:

- Use Bedrock vector stores in the OpenAI /chat/completions spec with all LiteLLM supported models (see the sketch after the link below).
- View all available vector stores through the LiteLLM UI or API.
- Configure vector stores to be always active for specific models.
- Track vector store usage in LiteLLM Logs.

For the next release, we plan on allowing you to set key, user, team, and org permissions for vector stores.

[Read more here](https://docs.litellm.ai/docs/completion/knowledgebase)
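
A rough sketch of what querying a knowledge base through the proxy could look like, assuming the OpenAI `file_search`-style tool shape is accepted here (check the linked docs for the exact request format; the base URL, key, model alias, and knowledge base ID are placeholders):

```python showLineNumbers title="Querying a Bedrock Knowledge Base (illustrative)"
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

response = client.chat.completions.create(
    model="claude-3-5-sonnet",  # any LiteLLM-supported model
    messages=[{"role": "user", "content": "What is our refund policy?"}],
    # assumption: the vector store is attached via an OpenAI-spec file_search tool
    tools=[{"type": "file_search", "vector_store_ids": ["T37J8R4WTM"]}],
)
print(response.choices[0].message.content)
```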
## Rate Limiting

<Image img={require('../../img/multi_instance_rate_limiting.png')}/>
<br/>

This release brings accurate multi-instance rate limiting across keys/users/teams. The key engineering changes are outlined below:

- **Change**: Instances now increment the cache value instead of setting it. To avoid calling Redis on each request, this is synced every 0.01s.
- **Accuracy**: In testing, we saw a maximum spillover of 10 requests beyond the expected limit in high traffic (100 RPS, 3 instances), vs. a 189-request spillover previously.
- **Performance**: Our load tests show this reduces median response time by 100ms in high traffic.

This is currently behind a feature flag, and we plan to make it the default by next week. To enable it today, just add this environment variable:

```bash showLineNumbers title="Environment Variable"
export LITELLM_RATE_LIMIT_ACCURACY=true
```

[Read more here](../../docs/proxy/users#beta-multi-instance-rate-limiting)
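
To illustrate the increment-and-sync idea described above, here is a minimal, self-contained sketch of the pattern - not LiteLLM's actual implementation. Each instance counts requests locally and periodically flushes the delta to Redis with an atomic `INCRBY` (the Redis key name and interval are illustrative):

```python showLineNumbers title="Increment-and-sync counter (illustrative)"
import threading
import time

import redis


class SyncedCounter:
    """Counts requests locally and flushes the delta to Redis on an interval."""

    def __init__(self, client: redis.Redis, key: str, sync_interval: float = 0.01):
        self.client = client
        self.key = key
        self.sync_interval = sync_interval
        self._pending = 0
        self._lock = threading.Lock()
        threading.Thread(target=self._sync_loop, daemon=True).start()

    def increment(self) -> None:
        # no Redis round-trip on the hot path
        with self._lock:
            self._pending += 1

    def _sync_loop(self) -> None:
        while True:
            time.sleep(self.sync_interval)
            with self._lock:
                delta, self._pending = self._pending, 0
            if delta:
                # INCRBY is atomic, so concurrent instances never overwrite each other
                self.client.incrby(self.key, delta)


counter = SyncedCounter(redis.Redis(), "rpm:my-key")
counter.increment()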
## New Models / Updated Models

- **Gemini ([VertexAI](https://docs.litellm.ai/docs/providers/vertex#usage-with-litellm-proxy-server) + [Google AI Studio](https://docs.litellm.ai/docs/providers/gemini))**
    - Handle more JSON schema - OpenAPI schema conversion edge cases [PR](https://github.com/BerriAI/litellm/pull/10351)
    - Tool calls - return `finish_reason="tool_calls"` on Gemini tool calling responses [PR](https://github.com/BerriAI/litellm/pull/10485)
- **[VertexAI](../../docs/providers/vertex#metallama-api)**
    - Meta/llama-4 model support [PR](https://github.com/BerriAI/litellm/pull/10492)
    - Meta/llama3 - handle tool call result in content [PR](https://github.com/BerriAI/litellm/pull/10492)
    - Meta/* - return `finish_reason="tool_calls"` on tool calling responses [PR](https://github.com/BerriAI/litellm/pull/10492)
- **[Bedrock](../../docs/providers/bedrock#litellm-proxy-usage)**
    - [Image Generation](../../docs/providers/bedrock#image-generation) - Support new 'stable-image-core' models - [PR](https://github.com/BerriAI/litellm/pull/10351)
    - [Knowledge Bases](../../docs/completion/knowledgebase) - support using Bedrock knowledge bases with `/chat/completions` [PR](https://github.com/BerriAI/litellm/pull/10413)
    - [Anthropic](../../docs/providers/bedrock#litellm-proxy-usage) - add 'supports_pdf_input' for claude-3.7-bedrock models [PR](https://github.com/BerriAI/litellm/pull/9917), [Get Started](../../docs/completion/document_understanding#checking-if-a-model-supports-pdf-input)
- **[OpenAI](../../docs/providers/openai)**
    - Support OPENAI_BASE_URL in addition to OPENAI_API_BASE [PR](https://github.com/BerriAI/litellm/pull/10423)
    - Correctly re-raise 504 timeout errors [PR](https://github.com/BerriAI/litellm/pull/10462)
    - Native gpt-4o-mini-tts support [PR](https://github.com/BerriAI/litellm/pull/10462)
- 🆕 **[Meta Llama API](../../docs/providers/meta_llama)** provider [PR](https://github.com/BerriAI/litellm/pull/10451)
- 🆕 **[LlamaFile](../../docs/providers/llamafile)** provider [PR](https://github.com/BerriAI/litellm/pull/10482)
## LLM API Endpoints

- **[Response API](../../docs/response_api)**
    - Fix for handling multi-turn sessions [PR](https://github.com/BerriAI/litellm/pull/10415)
- **[Embeddings](../../docs/embedding/supported_embedding)**
    - Caching fixes - [PR](https://github.com/BerriAI/litellm/pull/10424)
        - Handle str -> list cache
        - Return usage tokens for cache hit
        - Combine usage tokens on partial cache hits
- 🆕 **[Vector Stores](../../docs/completion/knowledgebase)**
    - Allow defining Vector Store Configs - [PR](https://github.com/BerriAI/litellm/pull/10448)
    - New StandardLoggingPayload field for requests made when a vector store is used - [PR](https://github.com/BerriAI/litellm/pull/10509)
    - Show Vector Store / KB Request on LiteLLM Logs Page - [PR](https://github.com/BerriAI/litellm/pull/10514)
    - Allow using vector stores in the OpenAI API spec with tools - [PR](https://github.com/BerriAI/litellm/pull/10516)
- **[MCP](../../docs/mcp)**
    - Ensure non-admin virtual keys can access /mcp routes - [PR](https://github.com/BerriAI/litellm/pull/10473)

  **Note:** Currently, all virtual keys are able to access the MCP endpoints. We are working on a feature to allow restricting MCP access by keys/teams/users/orgs. Follow [here](https://github.com/BerriAI/litellm/discussions/9891) for updates.

- **Moderations**
    - Add logging callback support for the `/moderations` API - [PR](https://github.com/BerriAI/litellm/pull/10390)
## Spend Tracking / Budget Improvements

- **[OpenAI](../../docs/providers/openai)**
    - [computer-use-preview](../../docs/providers/openai/responses_api#computer-use) cost tracking / pricing [PR](https://github.com/BerriAI/litellm/pull/10422)
    - [gpt-4o-mini-tts](../../docs/providers/openai/text_to_speech) input cost tracking - [PR](https://github.com/BerriAI/litellm/pull/10462)
- **[Fireworks AI](../../docs/providers/fireworks_ai)** - pricing updates - new `0-4b` model pricing tier + llama4 model pricing
- **[Budgets](../../docs/proxy/users#set-budgets)**
    - [Budget resets](../../docs/proxy/users#reset-budgets) now happen at the start of the day/week/month - [PR](https://github.com/BerriAI/litellm/pull/10333)
    - Trigger [Soft Budget Alerts](../../docs/proxy/alerting#soft-budget-alerts-for-virtual-keys) when a key crosses its threshold - [PR](https://github.com/BerriAI/litellm/pull/10491)
- **[Token Counting](../../docs/completion/token_usage#3-token_counter)**
    - Rewrite of the token_counter() function to prevent undercounting tokens - [PR](https://github.com/BerriAI/litellm/pull/10409)
## Management Endpoints / UI

- **Virtual Keys**
    - Fix filtering on key alias - [PR](https://github.com/BerriAI/litellm/pull/10455)
    - Support global filtering on keys - [PR](https://github.com/BerriAI/litellm/pull/10455)
    - Pagination - fix clicking on next/back buttons on table - [PR](https://github.com/BerriAI/litellm/pull/10528)
- **Models**
    - Triton - Support adding model/provider on UI - [PR](https://github.com/BerriAI/litellm/pull/10456)
    - VertexAI - Fix adding vertex models with reusable credentials - [PR](https://github.com/BerriAI/litellm/pull/10528)
    - LLM Credentials - show existing credentials for easy editing - [PR](https://github.com/BerriAI/litellm/pull/10519)
- **Teams**
    - Allow reassigning a team to another org - [PR](https://github.com/BerriAI/litellm/pull/10527)
- **Organizations**
    - Fix showing org budget on table - [PR](https://github.com/BerriAI/litellm/pull/10528)
## Logging / Guardrail Integrations

- **[Langsmith](../../docs/observability/langsmith_integration)**
    - Respect the [langsmith_batch_size](../../docs/observability/langsmith_integration#local-testing---control-batch-size) param - [PR](https://github.com/BerriAI/litellm/pull/10411)

## Performance / Loadbalancing / Reliability improvements

- **[Redis](../../docs/proxy/caching)**
    - Ensure all Redis queues are periodically flushed; this fixes an issue where the Redis queue size was growing indefinitely when request tags were used - [PR](https://github.com/BerriAI/litellm/pull/10393)
- **[Rate Limits](../../docs/proxy/users#set-rate-limit)**
    - [Multi-instance rate limiting](../../docs/proxy/users#beta-multi-instance-rate-limiting) support across keys/teams/users/customers - [PR](https://github.com/BerriAI/litellm/pull/10458), [PR](https://github.com/BerriAI/litellm/pull/10497), [PR](https://github.com/BerriAI/litellm/pull/10500)
- **[Azure OpenAI OIDC](../../docs/providers/azure#entra-id---use-azure_ad_token)**
    - Allow using litellm-defined params for [OIDC Auth](../../docs/providers/azure#entra-id---use-azure_ad_token) - [PR](https://github.com/BerriAI/litellm/pull/10394)
## General Proxy Improvements

- **Security**
    - Allow [blocking web crawlers](../../docs/proxy/enterprise#blocking-web-crawlers) - [PR](https://github.com/BerriAI/litellm/pull/10420)
- **Auth**
    - Support the [`x-litellm-api-key` header param by default](../../docs/pass_through/vertex_ai#use-with-virtual-keys); this fixes an issue from the prior release where `x-litellm-api-key` was not being used on Vertex AI passthrough requests - [PR](https://github.com/BerriAI/litellm/pull/10392)
    - Allow a key at max budget to call non-LLM API endpoints - [PR](https://github.com/BerriAI/litellm/pull/10392)
- 🆕 **[Python Client Library](../../docs/proxy/management_cli) for LiteLLM Proxy management endpoints**
    - Initial PR - [PR](https://github.com/BerriAI/litellm/pull/10445)
    - Support for doing HTTP requests - [PR](https://github.com/BerriAI/litellm/pull/10452)
- **Dependencies**
    - Don’t require uvloop for Windows - [PR](https://github.com/BerriAI/litellm/pull/10483)
@@ -0,0 +1,200 @@
---
title: v1.69.0-stable - Loadbalance Batch API Models
slug: v1.69.0-stable
date: 2025-05-10T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.69.0-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.69.0.post1
```

</TabItem>
</Tabs>
## Key Highlights

LiteLLM v1.69.0-stable brings the following key improvements:

- **Loadbalance Batch API Models**: Easily loadbalance across multiple Azure batch deployments using LiteLLM Managed Files.
- **Email Invites 2.0**: Send new users onboarded to LiteLLM an email invite.
- **Nscale**: LLM API for compliance with European regulations.
- **Bedrock /v1/messages**: Use Bedrock Anthropic models with Anthropic's /v1/messages spec.
## Batch API Load Balancing

<Image
  img={require('../../img/release_notes/lb_batch.png')}
  style={{width: '100%', display: 'block', margin: '0 0 2rem 0'}}
/>

This release brings LiteLLM Managed File support to Batches. This is great for:

- Proxy Admins: You can now control which Batch models users can call.
- Developers: You no longer need to know the Azure deployment name when creating your batch .jsonl files - just specify the model your LiteLLM key has access to.

Over time, we expect LiteLLM Managed Files to be the way most teams use Files across the `/chat/completions`, `/batch`, and `/fine_tuning` endpoints. A sketch of the developer workflow is shown after the link below.

[Read more here](https://docs.litellm.ai/docs/proxy/managed_batches)
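
A minimal sketch of the developer workflow, assuming an OpenAI client pointed at the proxy (the base URL, key, file contents, and model alias are placeholders; the linked docs cover the exact managed-files flow, including any target-model parameters):

```python showLineNumbers title="Creating a batch with a model alias"
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# the .jsonl references the LiteLLM model alias, not an Azure deployment name,
# e.g. lines like {"body": {"model": "gpt-4o", ...}, ...}
batch_input = client.files.create(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
)

batch = client.batches.create(
    input_file_id=batch_input.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id)
```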
## Email Invites

<Image
  img={require('../../img/email_2_0.png')}
  style={{width: '100%', display: 'block', margin: '0 0 2rem 0'}}
/>

This release brings the following improvements to our email invite integration:

- New templates for user invited and key created events.
- Fixes for using SMTP email providers.
- Native support for the Resend API.
- Ability for Proxy Admins to control email events.

For LiteLLM Cloud users, please reach out to us if you want this enabled for your instance.

[Read more here](https://docs.litellm.ai/docs/proxy/email)
## New Models / Updated Models

- **Gemini ([VertexAI](https://docs.litellm.ai/docs/providers/vertex#usage-with-litellm-proxy-server) + [Google AI Studio](https://docs.litellm.ai/docs/providers/gemini))**
    - Added `gemini-2.5-pro-preview-05-06` models with pricing and context window info - [PR](https://github.com/BerriAI/litellm/pull/10597)
    - Set correct context window length for all Gemini 2.5 variants - [PR](https://github.com/BerriAI/litellm/pull/10690)
- **[Perplexity](../../docs/providers/perplexity)**:
    - Added new Perplexity models - [PR](https://github.com/BerriAI/litellm/pull/10652)
    - Added sonar-deep-research model pricing - [PR](https://github.com/BerriAI/litellm/pull/10537)
- **[Azure OpenAI](../../docs/providers/azure)**:
    - Fixed passing through of the azure_ad_token_provider parameter - [PR](https://github.com/BerriAI/litellm/pull/10694)
- **[OpenAI](../../docs/providers/openai)**:
    - Added support for PDF URLs in the 'file' parameter - [PR](https://github.com/BerriAI/litellm/pull/10640)
- **[Sagemaker](../../docs/providers/aws_sagemaker)**:
    - Fix content length for the `sagemaker_chat` provider - [PR](https://github.com/BerriAI/litellm/pull/10607)
- **[Azure AI Foundry](../../docs/providers/azure_ai)**:
    - Added cost tracking for the following models [PR](https://github.com/BerriAI/litellm/pull/9956)
        - DeepSeek V3 0324
        - Llama 4 Scout
        - Llama 4 Maverick
- **[Bedrock](../../docs/providers/bedrock)**:
    - Added cost tracking for Bedrock Llama 4 models - [PR](https://github.com/BerriAI/litellm/pull/10582)
    - Fixed template conversion for Llama 4 models in Bedrock - [PR](https://github.com/BerriAI/litellm/pull/10582)
    - Added support for using Bedrock Anthropic models with the /v1/messages format - [PR](https://github.com/BerriAI/litellm/pull/10681)
    - Added streaming support for Bedrock Anthropic models with the /v1/messages format - [PR](https://github.com/BerriAI/litellm/pull/10710)
- **[OpenAI](../../docs/providers/openai)**: Added `reasoning_effort` support for `o3` models - [PR](https://github.com/BerriAI/litellm/pull/10591)
- **[Databricks](../../docs/providers/databricks)**:
    - Fixed an issue when Databricks uses an external model and the delta could be empty - [PR](https://github.com/BerriAI/litellm/pull/10540)
- **[Cerebras](../../docs/providers/cerebras)**: Fixed Llama-3.1-70b model pricing and context window - [PR](https://github.com/BerriAI/litellm/pull/10648)
- **[Ollama](../../docs/providers/ollama)**:
    - Fixed custom price cost tracking and added 'max_completion_token' support - [PR](https://github.com/BerriAI/litellm/pull/10636)
    - Fixed KeyError when using JSON response format - [PR](https://github.com/BerriAI/litellm/pull/10611)
- 🆕 **[Nscale](../../docs/providers/nscale)**:
    - Added support for chat, image generation endpoints - [PR](https://github.com/BerriAI/litellm/pull/10638)
## LLM API Endpoints

- **[Messages API](../../docs/anthropic_unified)**:
    - 🆕 Added support for using Bedrock Anthropic models with the /v1/messages format - [PR](https://github.com/BerriAI/litellm/pull/10681) and streaming support - [PR](https://github.com/BerriAI/litellm/pull/10710)
- **[Moderations API](../../docs/moderations)**:
    - Fixed bug to allow using LiteLLM UI credentials for the /moderations API - [PR](https://github.com/BerriAI/litellm/pull/10723)
- **[Realtime API](../../docs/realtime)**:
    - Fixed setting 'headers' in scope for websocket auth requests and infinite loop issues - [PR](https://github.com/BerriAI/litellm/pull/10679)
- **[Files API](../../docs/proxy/litellm_managed_files)**:
    - Unified File ID output support - [PR](https://github.com/BerriAI/litellm/pull/10713)
    - Support for writing files to all deployments - [PR](https://github.com/BerriAI/litellm/pull/10708)
    - Added target model name validation - [PR](https://github.com/BerriAI/litellm/pull/10722)
- **[Batches API](../../docs/batches)**:
    - Complete unified batch ID support - replacing the model in the .jsonl with the deployment model name - [PR](https://github.com/BerriAI/litellm/pull/10719)
    - Beta support for unified file IDs (managed files) for batches - [PR](https://github.com/BerriAI/litellm/pull/10650)
## Spend Tracking / Budget Improvements

- Bug Fix - PostgreSQL Integer Overflow Error in DB Spend Tracking - [PR](https://github.com/BerriAI/litellm/pull/10697)
## Management Endpoints / UI

- **Models**
    - Fixed model info overwriting when editing a model on the UI - [PR](https://github.com/BerriAI/litellm/pull/10726)
    - Fixed team admin model updates and organization creation with specific models - [PR](https://github.com/BerriAI/litellm/pull/10539)
- **Logs**:
    - Bug Fix - copying Request/Response on the Logs Page - [PR](https://github.com/BerriAI/litellm/pull/10720)
    - Bug Fix - log did not remain in focus on the QA Logs page + text overflow on error logs - [PR](https://github.com/BerriAI/litellm/pull/10725)
    - Added an index for session_id on LiteLLM_SpendLogs for better query performance - [PR](https://github.com/BerriAI/litellm/pull/10727)
- **User Management**:
    - Added user management functionality to the Python client library & CLI - [PR](https://github.com/BerriAI/litellm/pull/10627)
    - Bug Fix - Fixed SCIM token creation on the Admin UI - [PR](https://github.com/BerriAI/litellm/pull/10628)
    - Bug Fix - Added 404 response when trying to delete verification tokens that don't exist - [PR](https://github.com/BerriAI/litellm/pull/10605)
## Logging / Guardrail Integrations

- **Custom Logger API**: v2 Custom Callback API (send LLM logs to a custom API) - [PR](https://github.com/BerriAI/litellm/pull/10575), [Get Started](https://docs.litellm.ai/docs/proxy/logging#custom-callback-apis-async)
- **OpenTelemetry**:
    - Fixed OpenTelemetry to follow genai semantic conventions + support for the 'instructions' param for TTS - [PR](https://github.com/BerriAI/litellm/pull/10608)
- **Bedrock PII**:
    - Add support for PII Masking with Bedrock guardrails - [Get Started](https://docs.litellm.ai/docs/proxy/guardrails/bedrock#pii-masking-with-bedrock-guardrails), [PR](https://github.com/BerriAI/litellm/pull/10608)
- **Documentation**:
    - Added documentation for StandardLoggingVectorStoreRequest - [PR](https://github.com/BerriAI/litellm/pull/10535)
## Performance / Reliability Improvements

- **Python Compatibility**:
    - Added support for Python 3.11 and earlier (fixed datetime UTC handling) - [PR](https://github.com/BerriAI/litellm/pull/10701)
    - Fixed UnicodeDecodeError: 'charmap' on Windows during litellm import - [PR](https://github.com/BerriAI/litellm/pull/10542)
- **Caching**:
    - Fixed embedding string caching result - [PR](https://github.com/BerriAI/litellm/pull/10700)
    - Fixed cache miss for Gemini models with response_format - [PR](https://github.com/BerriAI/litellm/pull/10635)
## General Proxy Improvements

- **Proxy CLI**:
    - Added `--version` flag to the `litellm-proxy` CLI - [PR](https://github.com/BerriAI/litellm/pull/10704)
    - Added dedicated `litellm-proxy` CLI - [PR](https://github.com/BerriAI/litellm/pull/10578)
- **Alerting**:
    - Fixed Slack alerting not working when using a DB - [PR](https://github.com/BerriAI/litellm/pull/10370)
- **Email Invites**:
    - Added V2 emails with fixes for sending emails when creating keys + Resend API support - [PR](https://github.com/BerriAI/litellm/pull/10602)
    - Added user invitation emails - [PR](https://github.com/BerriAI/litellm/pull/10615)
    - Added endpoints to manage email settings - [PR](https://github.com/BerriAI/litellm/pull/10646)
- **General**:
    - Fixed a bug where duplicate JSON logs were getting emitted - [PR](https://github.com/BerriAI/litellm/pull/10580)
## New Contributors

- [@zoltan-ongithub](https://github.com/zoltan-ongithub) made their first contribution in [PR #10568](https://github.com/BerriAI/litellm/pull/10568)
- [@mkavinkumar1](https://github.com/mkavinkumar1) made their first contribution in [PR #10548](https://github.com/BerriAI/litellm/pull/10548)
- [@thomelane](https://github.com/thomelane) made their first contribution in [PR #10549](https://github.com/BerriAI/litellm/pull/10549)
- [@frankzye](https://github.com/frankzye) made their first contribution in [PR #10540](https://github.com/BerriAI/litellm/pull/10540)
- [@aholmberg](https://github.com/aholmberg) made their first contribution in [PR #10591](https://github.com/BerriAI/litellm/pull/10591)
- [@aravindkarnam](https://github.com/aravindkarnam) made their first contribution in [PR #10611](https://github.com/BerriAI/litellm/pull/10611)
- [@xsg22](https://github.com/xsg22) made their first contribution in [PR #10648](https://github.com/BerriAI/litellm/pull/10648)
- [@casparhsws](https://github.com/casparhsws) made their first contribution in [PR #10635](https://github.com/BerriAI/litellm/pull/10635)
- [@hypermoose](https://github.com/hypermoose) made their first contribution in [PR #10370](https://github.com/BerriAI/litellm/pull/10370)
- [@tomukmatthews](https://github.com/tomukmatthews) made their first contribution in [PR #10638](https://github.com/BerriAI/litellm/pull/10638)
- [@keyute](https://github.com/keyute) made their first contribution in [PR #10652](https://github.com/BerriAI/litellm/pull/10652)
- [@GPTLocalhost](https://github.com/GPTLocalhost) made their first contribution in [PR #10687](https://github.com/BerriAI/litellm/pull/10687)
- [@husnain7766](https://github.com/husnain7766) made their first contribution in [PR #10697](https://github.com/BerriAI/litellm/pull/10697)
- [@claralp](https://github.com/claralp) made their first contribution in [PR #10694](https://github.com/BerriAI/litellm/pull/10694)
- [@mollux](https://github.com/mollux) made their first contribution in [PR #10690](https://github.com/BerriAI/litellm/pull/10690)
@@ -0,0 +1,248 @@
---
title: v1.70.1-stable - Gemini Realtime API Support
slug: v1.70.1-stable
date: 2025-05-17T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.70.1-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.70.1
```

</TabItem>
</Tabs>
## Key Highlights

LiteLLM v1.70.1-stable is live now. Here are the key highlights of this release:

- **Gemini Realtime API**: You can now call Gemini's Live API via the OpenAI /v1/realtime API.
- **Spend Logs Retention Period**: Enable deleting spend logs older than a certain period.
- **PII Masking 2.0**: Easily configure masking or blocking specific PII/PHI entities on the UI.
## Gemini Realtime API

<Image img={require('../../img/gemini_realtime.png')}/>

This release brings support for calling Gemini's realtime models (e.g. gemini-2.0-flash-live) via OpenAI's /v1/realtime API. This is great for developers as it lets them switch from OpenAI to Gemini by just changing the model name.

Key Highlights:

- Support for text + audio input/output
- Support for setting session configurations (modality, instructions, activity detection) in the OpenAI format
- Support for logging + usage tracking for realtime sessions

This is currently supported via Google AI Studio. We plan to release VertexAI support over the coming week. A sketch of connecting through the OpenAI client is shown after the link below.

[**Read more**](../../docs/providers/google_ai_studio/realtime)
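
A minimal sketch, assuming the OpenAI Python SDK's beta realtime client can be pointed at the proxy (event names follow the OpenAI realtime spec; the base URL, key, and model alias are placeholders - the linked docs have the exact connection details):

```python showLineNumbers title="Calling Gemini via /v1/realtime"
import asyncio

from openai import AsyncOpenAI


async def main() -> None:
    client = AsyncOpenAI(base_url="http://localhost:4000", api_key="sk-1234")
    # just change the model name to switch from OpenAI to Gemini
    async with client.beta.realtime.connect(model="gemini-2.0-flash-live") as conn:
        await conn.session.update(session={"modalities": ["text"]})
        await conn.conversation.item.create(item={
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": "Say hello!"}],
        })
        await conn.response.create()
        async for event in conn:
            if event.type == "response.text.delta":
                print(event.delta, end="")
            elif event.type == "response.done":
                break


asyncio.run(main())
```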
## Spend Logs Retention Period

<Image img={require('../../img/delete_spend_logs.jpg')}/>

This release enables deleting LiteLLM Spend Logs older than a certain period. Since we now enable storing the raw request/response in the logs, deleting old logs ensures the database remains performant in production. A sketch of the config is shown after the link below.

[**Read more**](../../docs/proxy/spend_logs_deletion)
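
A minimal sketch of the proxy config, assuming the retention period is set under `general_settings` (the exact key name, here `maximum_spend_logs_retention_period`, should be checked against the linked docs):

```yaml showLineNumbers title="proxy_config.yaml"
general_settings:
  # assumption: delete spend logs older than 7 days
  maximum_spend_logs_retention_period: "7d"
```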
## PII Masking 2.0

<Image img={require('../../img/pii_masking_v2.png')}/>

This release brings improvements to our Presidio PII integration. As a Proxy Admin, you now have the ability to:

- Mask or block specific entities (e.g., block medical licenses while masking other entities like emails).
- Monitor guardrails in production. LiteLLM Logs will now show you the guardrail run, the entities it detected, and its confidence score for each entity.

A sketch of an entity-level config is shown after the link below.

[**Read more**](../../docs/proxy/guardrails/pii_masking_v2)
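
A minimal sketch of per-entity behavior in the proxy config, assuming the Presidio guardrail accepts a `pii_entities_config` map as in the linked docs (the guardrail name and entity keys are illustrative):

```yaml showLineNumbers title="proxy_config.yaml"
guardrails:
  - guardrail_name: "presidio-pii"
    litellm_params:
      guardrail: presidio
      mode: "pre_call"
      pii_entities_config:
        MEDICAL_LICENSE: "BLOCK"   # reject requests containing medical licenses
        EMAIL_ADDRESS: "MASK"      # mask emails but let the request through
```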
## New Models / Updated Models

- **Gemini ([VertexAI](https://docs.litellm.ai/docs/providers/vertex#usage-with-litellm-proxy-server) + [Google AI Studio](https://docs.litellm.ai/docs/providers/gemini))**
    - `/chat/completion`
        - Handle audio input - [PR](https://github.com/BerriAI/litellm/pull/10739)
        - Fixes maximum recursion depth issue when using deeply nested response schemas with Vertex AI by increasing DEFAULT_MAX_RECURSE_DEPTH from 10 to 100 in constants. [PR](https://github.com/BerriAI/litellm/pull/10798)
        - Capture reasoning tokens in streaming mode - [PR](https://github.com/BerriAI/litellm/pull/10789)
- **[Google AI Studio](../../docs/providers/google_ai_studio/realtime)**
    - `/realtime`
        - Gemini Multimodal Live API support
        - Audio input/output support, optional param mapping, accurate usage calculation - [PR](https://github.com/BerriAI/litellm/pull/10909)
- **[VertexAI](../../docs/providers/vertex#metallama-api)**
    - `/chat/completion`
        - Fix llama streaming error - where the model response was nested in the returned streaming chunk - [PR](https://github.com/BerriAI/litellm/pull/10878)
- **[Ollama](../../docs/providers/ollama)**
    - `/chat/completion`
        - Structured responses fix - [PR](https://github.com/BerriAI/litellm/pull/10617)
- **[Bedrock](../../docs/providers/bedrock#litellm-proxy-usage)**
    - [`/chat/completion`](../../docs/providers/bedrock#litellm-proxy-usage)
        - Handle thinking_blocks when assistant.content is None - [PR](https://github.com/BerriAI/litellm/pull/10688)
        - Fixes to only allow accepted fields for tool json schema - [PR](https://github.com/BerriAI/litellm/pull/10062)
        - Add Bedrock Sonnet prompt caching cost information
        - Mistral Pixtral support - [PR](https://github.com/BerriAI/litellm/pull/10439)
        - Tool caching support - [PR](https://github.com/BerriAI/litellm/pull/10897)
    - [`/messages`](../../docs/anthropic_unified)
        - Allow using dynamic AWS params - [PR](https://github.com/BerriAI/litellm/pull/10769)
- **[Nvidia NIM](../../docs/providers/nvidia_nim)**
    - [`/chat/completion`](../../docs/providers/nvidia_nim#usage---litellm-proxy-server)
        - Add tools, tool_choice, parallel_tool_calls support - [PR](https://github.com/BerriAI/litellm/pull/10763)
- **[Novita AI](../../docs/providers/novita)**
    - New provider added for `/chat/completion` routes - [PR](https://github.com/BerriAI/litellm/pull/9527)
- **[Azure](../../docs/providers/azure)**
    - [`/image/generation`](../../docs/providers/azure#image-generation)
        - Fix Azure DALL-E 3 calls with a custom model name - [PR](https://github.com/BerriAI/litellm/pull/10776)
- **[Cohere](../../docs/providers/cohere)**
    - [`/embeddings`](../../docs/providers/cohere#embedding)
        - Migrate embedding to use `/v2/embed` - adds support for the output_dimensions param - [PR](https://github.com/BerriAI/litellm/pull/10809)
- **[Anthropic](../../docs/providers/anthropic)**
    - [`/chat/completion`](../../docs/providers/anthropic#usage-with-litellm-proxy)
        - Web search tool support - native + OpenAI format - [Get Started](../../docs/providers/anthropic#anthropic-hosted-tools-computer-text-editor-web-search)
- **[VLLM](../../docs/providers/vllm)**
    - [`/embeddings`](../../docs/providers/vllm#embeddings)
        - Support embedding input as a list of integers
- **[OpenAI](../../docs/providers/openai)**
    - [`/chat/completion`](../../docs/providers/openai#usage---litellm-proxy-server)
        - Fix - b64 file data input handling - [Get Started](../../docs/providers/openai#pdf-file-parsing)
        - Add 'supports_pdf_input' to all vision models - [PR](https://github.com/BerriAI/litellm/pull/10897)
## LLM API Endpoints

- [**Responses API**](../../docs/response_api)
    - Fix delete API support - [PR](https://github.com/BerriAI/litellm/pull/10845)
- [**Rerank API**](../../docs/rerank)
    - `/v2/rerank` now registered as an `llm_api_route` - enabling non-admins to call it - [PR](https://github.com/BerriAI/litellm/pull/10861)
## Spend Tracking Improvements

- **`/chat/completion`, `/messages`**
    - Anthropic - web search tool cost tracking - [PR](https://github.com/BerriAI/litellm/pull/10846)
    - Groq - update model max tokens + cost information - [PR](https://github.com/BerriAI/litellm/pull/10077)
- **`/audio/transcription`**
    - Azure - Add gpt-4o-mini-tts pricing - [PR](https://github.com/BerriAI/litellm/pull/10807)
    - Proxy - Fix tracking spend by tag - [PR](https://github.com/BerriAI/litellm/pull/10832)
- **`/embeddings`**
    - Azure AI - Add cohere embed v4 pricing - [PR](https://github.com/BerriAI/litellm/pull/10806)
## Management Endpoints / UI

- **Models**
    - Ollama - adds API base param to the UI
- **Logs**
    - Add team ID, key alias, key hash filters on logs - [PR](https://github.com/BerriAI/litellm/pull/10831)
    - Guardrail tracing now in Logs UI - [PR](https://github.com/BerriAI/litellm/pull/10893)
- **Teams**
    - Patch for updating team info when the team is in an org and members are not in the org - [PR](https://github.com/BerriAI/litellm/pull/10835)
- **Guardrails**
    - Add Bedrock, Presidio, Lakera guardrails on the UI - [PR](https://github.com/BerriAI/litellm/pull/10874)
    - See guardrail info page - [PR](https://github.com/BerriAI/litellm/pull/10904)
    - Allow editing guardrails on the UI - [PR](https://github.com/BerriAI/litellm/pull/10907)
- **Test Key**
    - Select guardrails to test on the UI
## Logging / Alerting Integrations

- **[StandardLoggingPayload](../../docs/proxy/logging_spec)**
    - Log any `x-` headers in requester metadata - [Get Started](../../docs/proxy/logging_spec#standardloggingmetadata)
    - Guardrail tracing now in the standard logging payload - [Get Started](../../docs/proxy/logging_spec#standardloggingguardrailinformation)
- **[Generic API Logger](../../docs/proxy/logging#custom-callback-apis-async)**
    - Support passing the application/json header
- **[Arize Phoenix](../../docs/observability/phoenix_integration)**
    - Fix: URL-encode OTEL_EXPORTER_OTLP_TRACES_HEADERS for the Phoenix integration - [PR](https://github.com/BerriAI/litellm/pull/10654)
    - Add guardrail tracing to OTEL, Arize Phoenix - [PR](https://github.com/BerriAI/litellm/pull/10896)
- **[PagerDuty](../../docs/proxy/pagerduty)**
    - PagerDuty is now a free feature - [PR](https://github.com/BerriAI/litellm/pull/10857)
- **[Alerting](../../docs/proxy/alerting)**
    - Sending Slack alerts on virtual key/user/team updates is now free - [PR](https://github.com/BerriAI/litellm/pull/10863)
## Guardrails

- **Guardrails**
    - New `/apply_guardrail` endpoint for directly testing a guardrail (see the sketch after this list) - [PR](https://github.com/BerriAI/litellm/pull/10867)
- **[Lakera](../../docs/proxy/guardrails/lakera_ai)**
    - `/v2` endpoints support - [PR](https://github.com/BerriAI/litellm/pull/10880)
- **[Presidio](../../docs/proxy/guardrails/pii_masking_v2)**
    - Fixes handling of message content on the Presidio guardrail integration - [PR](https://github.com/BerriAI/litellm/pull/10197)
    - Allow specifying a PII Entities Config - [PR](https://github.com/BerriAI/litellm/pull/10810)
- **[Aim Security](../../docs/proxy/guardrails/aim_security)**
    - Support for anonymization in AIM Guardrails - [PR](https://github.com/BerriAI/litellm/pull/10757)
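
A rough sketch of testing a guardrail directly, assuming the endpoint is mounted at `/apply_guardrail` as named above and accepts a guardrail name and raw text (the request body fields here are illustrative; the linked PR has the exact path and schema):

```python showLineNumbers title="Testing a guardrail via /apply_guardrail"
import requests

resp = requests.post(
    "http://localhost:4000/apply_guardrail",  # LiteLLM proxy (placeholder URL)
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "guardrail_name": "presidio-pii",            # illustrative guardrail name
        "text": "My email is jane.doe@example.com",  # text to run the guardrail on
    },
)
print(resp.json())
```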
## Performance / Loadbalancing / Reliability improvements

- **Allow overriding all constants using a .env variable** - [PR](https://github.com/BerriAI/litellm/pull/10803)
- **[Maximum retention period for spend logs](../../docs/proxy/spend_logs_deletion)**
    - Add retention flag to config - [PR](https://github.com/BerriAI/litellm/pull/10815)
    - Support for cleaning up logs based on the configured time period - [PR](https://github.com/BerriAI/litellm/pull/10872)
## General Proxy Improvements

- **Authentication**
    - Handle `Bearer $LITELLM_API_KEY` in the `x-litellm-api-key` custom header [PR](https://github.com/BerriAI/litellm/pull/10776)
- **New Enterprise pip package** - `litellm-enterprise` - fixes an issue where the `enterprise` folder was not found when using the pip package
- **[Proxy CLI](../../docs/proxy/management_cli)**
    - Add `models import` command - [PR](https://github.com/BerriAI/litellm/pull/10581)
- **[OpenWebUI](../../docs/tutorials/openweb_ui#per-user-tracking)**
    - Configure LiteLLM to parse user headers from Open Web UI
- **[LiteLLM Proxy w/ LiteLLM SDK](../../docs/providers/litellm_proxy#send-all-sdk-requests-to-litellm-proxy)**
    - Option to force/always use the LiteLLM proxy when calling via the LiteLLM SDK
## New Contributors

* [@imdigitalashish](https://github.com/imdigitalashish) made their first contribution in PR [#10617](https://github.com/BerriAI/litellm/pull/10617)
* [@LouisShark](https://github.com/LouisShark) made their first contribution in PR [#10688](https://github.com/BerriAI/litellm/pull/10688)
* [@OscarSavNS](https://github.com/OscarSavNS) made their first contribution in PR [#10764](https://github.com/BerriAI/litellm/pull/10764)
* [@arizedatngo](https://github.com/arizedatngo) made their first contribution in PR [#10654](https://github.com/BerriAI/litellm/pull/10654)
* [@jugaldb](https://github.com/jugaldb) made their first contribution in PR [#10805](https://github.com/BerriAI/litellm/pull/10805)
* [@daikeren](https://github.com/daikeren) made their first contribution in PR [#10781](https://github.com/BerriAI/litellm/pull/10781)
* [@naliotopier](https://github.com/naliotopier) made their first contribution in PR [#10077](https://github.com/BerriAI/litellm/pull/10077)
* [@damienpontifex](https://github.com/damienpontifex) made their first contribution in PR [#10813](https://github.com/BerriAI/litellm/pull/10813)
* [@Dima-Mediator](https://github.com/Dima-Mediator) made their first contribution in PR [#10789](https://github.com/BerriAI/litellm/pull/10789)
* [@igtm](https://github.com/igtm) made their first contribution in PR [#10814](https://github.com/BerriAI/litellm/pull/10814)
* [@shibaboy](https://github.com/shibaboy) made their first contribution in PR [#10752](https://github.com/BerriAI/litellm/pull/10752)
* [@camfarineau](https://github.com/camfarineau) made their first contribution in PR [#10629](https://github.com/BerriAI/litellm/pull/10629)
* [@ajac-zero](https://github.com/ajac-zero) made their first contribution in PR [#10439](https://github.com/BerriAI/litellm/pull/10439)
* [@damgem](https://github.com/damgem) made their first contribution in PR [#9802](https://github.com/BerriAI/litellm/pull/9802)
* [@hxdror](https://github.com/hxdror) made their first contribution in PR [#10757](https://github.com/BerriAI/litellm/pull/10757)
* [@wwwillchen](https://github.com/wwwillchen) made their first contribution in PR [#10894](https://github.com/BerriAI/litellm/pull/10894)
## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/releases)
@@ -0,0 +1,284 @@
---
title: v1.71.1-stable - 2x Higher Requests Per Second (RPS)
slug: v1.71.1-stable
date: 2025-05-24T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.71.1-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.71.1
```

</TabItem>
</Tabs>
## Key Highlights

LiteLLM v1.71.1-stable is live now. Here are the key highlights of this release:

- **Performance improvements**: LiteLLM can now scale to 200 RPS per instance with a 74ms median response time.
- **File Permissions**: Control file access across OpenAI, Azure, VertexAI.
- **MCP x OpenAI**: Use MCP servers with the OpenAI Responses API.
## Performance Improvements

<Image img={require('../../img/perf_imp.png')} style={{ width: '800px', height: 'auto' }} />

<br/>

This release brings aiohttp support for all LLM API providers. This means that LiteLLM can now scale to 200 RPS per instance with a 40ms median latency overhead.

This change doubles the RPS LiteLLM can scale to at this latency overhead.

You can opt in by enabling the flag below. (We expect to make this the default in 1 week.)

### Flag to enable

**On LiteLLM Proxy**

Set `USE_AIOHTTP_TRANSPORT=True` in your environment variables.

```bash showLineNumbers title="Environment Variable"
export USE_AIOHTTP_TRANSPORT="True"
```

**On LiteLLM Python SDK**

Set `litellm.use_aiohttp_transport = True` to enable the aiohttp transport.

```python showLineNumbers title="Python SDK"
import litellm

litellm.use_aiohttp_transport = True  # default is False, enable this to use aiohttp transport

result = litellm.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(result)
```
## File Permissions

<Image img={require('../../img/files_api_graphic.png')} style={{ width: '800px', height: 'auto' }} />

<br/>

This release brings support for [File Permissions](../../docs/proxy/litellm_managed_files#file-permissions) and [Finetuning APIs](../../docs/proxy/managed_finetuning) to [LiteLLM Managed Files](../../docs/proxy/litellm_managed_files). This is great for:

- **Proxy Admins**: users can only view/edit/delete files they’ve created - even when using shared OpenAI/Azure/Vertex deployments.
- **Developers**: get a standard interface to use Files across Chat/Finetuning/Batch APIs.
## New Models / Updated Models

- **Gemini [VertexAI](https://docs.litellm.ai/docs/providers/vertex), [Google AI Studio](https://docs.litellm.ai/docs/providers/gemini)**
    - New Gemini models - [PR 1](https://github.com/BerriAI/litellm/pull/10991), [PR 2](https://github.com/BerriAI/litellm/pull/10998)
        - `gemini-2.5-flash-preview-tts`
        - `gemini-2.0-flash-preview-image-generation`
        - `gemini/gemini-2.5-flash-preview-05-20`
        - `gemini-2.5-flash-preview-05-20`
- **[Anthropic](../../docs/providers/anthropic)**
    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
- **[Bedrock](../../docs/providers/bedrock)**
    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
    - Support for `reasoning_effort` and `thinking` parameters for Claude-4 - [PR](https://github.com/BerriAI/litellm/pull/11114)
- **[VertexAI](../../docs/providers/vertex)**
    - Claude-4 model family support - [PR](https://github.com/BerriAI/litellm/pull/11060)
    - Global endpoints support - [PR](https://github.com/BerriAI/litellm/pull/10658)
    - authorized_user credentials type support - [PR](https://github.com/BerriAI/litellm/pull/10899)
- **[xAI](../../docs/providers/xai)**
    - `xai/grok-3` pricing information - [PR](https://github.com/BerriAI/litellm/pull/11028)
- **[LM Studio](../../docs/providers/lm_studio)**
    - Structured JSON schema outputs support - [PR](https://github.com/BerriAI/litellm/pull/10929)
- **[SambaNova](../../docs/providers/sambanova)**
    - Updated models and parameters - [PR](https://github.com/BerriAI/litellm/pull/10900)
- **[Databricks](../../docs/providers/databricks)**
    - Llama 4 Maverick model cost - [PR](https://github.com/BerriAI/litellm/pull/11008)
    - Claude 3.7 Sonnet output token cost correction - [PR](https://github.com/BerriAI/litellm/pull/11007)
- **[Azure](../../docs/providers/azure)**
    - Mistral Medium 25.05 support - [PR](https://github.com/BerriAI/litellm/pull/11063)
    - Certificate-based authentication support - [PR](https://github.com/BerriAI/litellm/pull/11069)
- **[Mistral](../../docs/providers/mistral)**
    - devstral-small-2505 model pricing and context window - [PR](https://github.com/BerriAI/litellm/pull/11103)
- **[Ollama](../../docs/providers/ollama)**
    - Wildcard model support - [PR](https://github.com/BerriAI/litellm/pull/10982)
- **[CustomLLM](../../docs/providers/custom_llm_server)**
    - Embeddings support added - [PR](https://github.com/BerriAI/litellm/pull/10980)
- **[Featherless AI](../../docs/providers/featherless_ai)**
    - Access to 4200+ models - [PR](https://github.com/BerriAI/litellm/pull/10596)

## LLM API Endpoints

- **[Image Edits](../../docs/image_generation)**
    - `/v1/images/edits` - support for the /images/edits endpoint - [PR](https://github.com/BerriAI/litellm/pull/11020), [PR](https://github.com/BerriAI/litellm/pull/11123)
    - Content policy violation error mapping - [PR](https://github.com/BerriAI/litellm/pull/11113)
- **[Responses API](../../docs/response_api)**
    - MCP support for the Responses API - [PR](https://github.com/BerriAI/litellm/pull/11029) (see the sketch after this list)
- **[Files API](../../docs/fine_tuning)**
    - LiteLLM Managed Files support for finetuning - [PR](https://github.com/BerriAI/litellm/pull/11039), [PR](https://github.com/BerriAI/litellm/pull/11040)
    - Validation for file operations (retrieve/list/delete) - [PR](https://github.com/BerriAI/litellm/pull/11081)
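
As a sketch of the MCP x Responses API support, assuming an OpenAI-compatible client pointed at your LiteLLM proxy (the server label and URL below are illustrative placeholders):

```python showLineNumbers title="MCP server via Responses API (sketch)"
from openai import OpenAI

client = OpenAI(api_key="sk-my-virtual-key", base_url="http://localhost:4000")

# Attach a remote MCP server as a tool; the model can invoke its tools mid-response.
response = client.responses.create(
    model="openai/gpt-4o",
    tools=[{
        "type": "mcp",
        "server_label": "docs-server",                 # placeholder label
        "server_url": "https://mcp.example.com/mcp",   # placeholder URL
        "require_approval": "never",
    }],
    input="Use the available MCP tools to summarize the project docs.",
)
print(response.output_text)
```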

## Management Endpoints / UI

- **Teams**
    - Key and member count display - [PR](https://github.com/BerriAI/litellm/pull/10950)
    - Spend rounded to 4 decimal points - [PR](https://github.com/BerriAI/litellm/pull/11013)
    - Organization and team create buttons repositioned - [PR](https://github.com/BerriAI/litellm/pull/10948)
- **Keys**
    - Key reassignment and 'updated at' column - [PR](https://github.com/BerriAI/litellm/pull/10960)
    - Show model access groups during creation - [PR](https://github.com/BerriAI/litellm/pull/10965)
- **Logs**
    - Model filter on logs - [PR](https://github.com/BerriAI/litellm/pull/11048)
    - Passthrough endpoint error logs support - [PR](https://github.com/BerriAI/litellm/pull/10990)
- **Guardrails**
    - Config.yaml guardrails display - [PR](https://github.com/BerriAI/litellm/pull/10959)
- **Organizations/Users**
    - Spend rounded to 4 decimal points - [PR](https://github.com/BerriAI/litellm/pull/11023)
    - Show a clear error when adding a user to a team - [PR](https://github.com/BerriAI/litellm/pull/10978)
- **Audit Logs**
    - `/list` and `/info` endpoints for Audit Logs - [PR](https://github.com/BerriAI/litellm/pull/11102)

## Logging / Alerting Integrations

- **[Prometheus](../../docs/proxy/prometheus)**
    - Track `route` on proxy_* metrics - [PR](https://github.com/BerriAI/litellm/pull/10992)
- **[Langfuse](../../docs/proxy/logging#langfuse)**
    - Support for `prompt_label` parameter - [PR](https://github.com/BerriAI/litellm/pull/11018)
    - Consistent modelParams logging - [PR](https://github.com/BerriAI/litellm/pull/11018)
- **[DeepEval/ConfidentAI](../../docs/proxy/logging#deepeval)**
    - Logging enabled for proxy and SDK - [PR](https://github.com/BerriAI/litellm/pull/10649)
- **[Logfire](../../docs/proxy/logging)**
    - Fix OTEL proxy server initialization when using Logfire - [PR](https://github.com/BerriAI/litellm/pull/11091)

## Authentication & Security

- **[JWT Authentication](../../docs/proxy/token_auth)**
    - Support for applying default internal user parameters when upserting a user via JWT authentication - [PR](https://github.com/BerriAI/litellm/pull/10995)
    - Map a user to a team when upserting a user via JWT authentication - [PR](https://github.com/BerriAI/litellm/pull/11108)
- **Custom Auth**
    - Support for switching between custom auth and API key auth - [PR](https://github.com/BerriAI/litellm/pull/11070)

## Performance / Reliability Improvements

- **aiohttp Transport**
    - 97% lower median latency (feature flagged) - [PR](https://github.com/BerriAI/litellm/pull/11097), [PR](https://github.com/BerriAI/litellm/pull/11132)
- **Background Health Checks**
    - Improved reliability - [PR](https://github.com/BerriAI/litellm/pull/10887)
- **Response Handling**
    - Better streaming status code detection - [PR](https://github.com/BerriAI/litellm/pull/10962)
    - Response ID propagation improvements - [PR](https://github.com/BerriAI/litellm/pull/11006)
- **Thread Management**
    - Removed error-prone thread creation, for reliability - [PR](https://github.com/BerriAI/litellm/pull/11066)

## General Proxy Improvements

- **[Proxy CLI](../../docs/proxy/cli)**
    - Skip server startup flag - [PR](https://github.com/BerriAI/litellm/pull/10665)
    - Avoid DATABASE_URL override when provided - [PR](https://github.com/BerriAI/litellm/pull/11076)
- **Model Management**
    - Clear cache and reload after model updates - [PR](https://github.com/BerriAI/litellm/pull/10853)
    - Computer use support tracking - [PR](https://github.com/BerriAI/litellm/pull/10881)
- **Helm Chart**
    - LoadBalancer class support - [PR](https://github.com/BerriAI/litellm/pull/11064)

## Bug Fixes

This release includes numerous bug fixes to improve stability and reliability:

- **LLM Provider Fixes**
    - VertexAI:
        - Fixed quota_project_id parameter issue - [PR](https://github.com/BerriAI/litellm/pull/10915)
        - Fixed credential refresh exceptions - [PR](https://github.com/BerriAI/litellm/pull/10969)
    - Cohere:
        - Fixed adding Cohere models through the LiteLLM UI - [PR](https://github.com/BerriAI/litellm/pull/10822)
    - Anthropic:
        - Fixed streaming dict object handling for /v1/messages - [PR](https://github.com/BerriAI/litellm/pull/11032)
    - OpenRouter:
        - Fixed stream usage ID issues - [PR](https://github.com/BerriAI/litellm/pull/11004)

- **Authentication & Users**
    - Fixed invitation email link generation - [PR](https://github.com/BerriAI/litellm/pull/10958)
    - Fixed JWT authentication default role - [PR](https://github.com/BerriAI/litellm/pull/10995)
    - Fixed user budget reset functionality - [PR](https://github.com/BerriAI/litellm/pull/10993)
    - Fixed SSO user compatibility and email validation - [PR](https://github.com/BerriAI/litellm/pull/11106)

- **Database & Infrastructure**
    - Fixed DB connection parameter handling - [PR](https://github.com/BerriAI/litellm/pull/10842)
    - Fixed email invitation link - [PR](https://github.com/BerriAI/litellm/pull/11031)

- **UI & Display**
    - Fixed MCP tool rendering when no arguments are required - [PR](https://github.com/BerriAI/litellm/pull/11012)
    - Fixed team model alias deletion - [PR](https://github.com/BerriAI/litellm/pull/11121)
    - Fixed team viewer permissions - [PR](https://github.com/BerriAI/litellm/pull/11127)

- **Model & Routing**
    - Fixed team model mapping in route requests - [PR](https://github.com/BerriAI/litellm/pull/11111)
    - Fixed standard optional parameter passing - [PR](https://github.com/BerriAI/litellm/pull/11124)

## New Contributors

* [@DarinVerheijke](https://github.com/DarinVerheijke) made their first contribution in PR [#10596](https://github.com/BerriAI/litellm/pull/10596)
* [@estsauver](https://github.com/estsauver) made their first contribution in PR [#10929](https://github.com/BerriAI/litellm/pull/10929)
* [@mohittalele](https://github.com/mohittalele) made their first contribution in PR [#10665](https://github.com/BerriAI/litellm/pull/10665)
* [@pselden](https://github.com/pselden) made their first contribution in PR [#10899](https://github.com/BerriAI/litellm/pull/10899)
* [@unrealandychan](https://github.com/unrealandychan) made their first contribution in PR [#10842](https://github.com/BerriAI/litellm/pull/10842)
* [@dastaiger](https://github.com/dastaiger) made their first contribution in PR [#10946](https://github.com/BerriAI/litellm/pull/10946)
* [@slytechnical](https://github.com/slytechnical) made their first contribution in PR [#10881](https://github.com/BerriAI/litellm/pull/10881)
* [@daarko10](https://github.com/daarko10) made their first contribution in PR [#11006](https://github.com/BerriAI/litellm/pull/11006)
* [@sorenmat](https://github.com/sorenmat) made their first contribution in PR [#10658](https://github.com/BerriAI/litellm/pull/10658)
* [@matthid](https://github.com/matthid) made their first contribution in PR [#10982](https://github.com/BerriAI/litellm/pull/10982)
* [@jgowdy-godaddy](https://github.com/jgowdy-godaddy) made their first contribution in PR [#11032](https://github.com/BerriAI/litellm/pull/11032)
* [@bepotp](https://github.com/bepotp) made their first contribution in PR [#11008](https://github.com/BerriAI/litellm/pull/11008)
* [@jmorenoc-o](https://github.com/jmorenoc-o) made their first contribution in PR [#11031](https://github.com/BerriAI/litellm/pull/11031)
* [@martin-liu](https://github.com/martin-liu) made their first contribution in PR [#11076](https://github.com/BerriAI/litellm/pull/11076)
* [@gunjan-solanki](https://github.com/gunjan-solanki) made their first contribution in PR [#11064](https://github.com/BerriAI/litellm/pull/11064)
* [@tokoko](https://github.com/tokoko) made their first contribution in PR [#10980](https://github.com/BerriAI/litellm/pull/10980)
* [@spike-spiegel-21](https://github.com/spike-spiegel-21) made their first contribution in PR [#10649](https://github.com/BerriAI/litellm/pull/10649)
* [@kreatoo](https://github.com/kreatoo) made their first contribution in PR [#10927](https://github.com/BerriAI/litellm/pull/10927)
* [@baejooc](https://github.com/baejooc) made their first contribution in PR [#10887](https://github.com/BerriAI/litellm/pull/10887)
* [@keykbd](https://github.com/keykbd) made their first contribution in PR [#11114](https://github.com/BerriAI/litellm/pull/11114)
* [@dalssoft](https://github.com/dalssoft) made their first contribution in PR [#11088](https://github.com/BerriAI/litellm/pull/11088)
* [@jtong99](https://github.com/jtong99) made their first contribution in PR [#10853](https://github.com/BerriAI/litellm/pull/10853)

## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/releases)
---
title: "v1.72.0-stable"
slug: "v1-72-0-stable"
date: 2025-05-31T10:00:00
authors:
- name: Krrish Dholakia
  title: CEO, LiteLLM
  url: https://www.linkedin.com/in/krish-d/
  image_url: https://media.licdn.com/dms/image/v2/D4D03AQGrlsJ3aqpHmQ/profile-displayphoto-shrink_400_400/B4DZSAzgP7HYAg-/0/1737327772964?e=1749686400&v=beta&t=Hkl3U8Ps0VtvNxX0BNNq24b4dtX5wQaPFp6oiKCIHD8
- name: Ishaan Jaffer
  title: CTO, LiteLLM
  url: https://www.linkedin.com/in/reffajnaahsi/
  image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.72.0-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.72.0
```

</TabItem>
</Tabs>

## Key Highlights

LiteLLM v1.72.0-stable is live now. Here are the key highlights of this release:

- **Vector Store Permissions**: Control vector store access at the Key, Team, and Organization level.
- **Rate Limiting Sliding Window support**: Improved accuracy for Key/Team/User rate limits, with request tracking across minute boundaries.
- **Aiohttp Transport used by default**: The aiohttp transport is now the default transport for LiteLLM networking requests, giving users 2x higher RPS per instance with a 40ms median latency overhead.
- **Bedrock Agents**: Call Bedrock Agents with the `/chat/completions` and `/response` endpoints (see the sketch below).
- **Anthropic File API**: Upload and analyze CSV files with Claude-4 on Anthropic via LiteLLM.
- **Prometheus**: End users (`end_user`) are no longer tracked on Prometheus by default; tracking them is now opt-in. This prevents the `/metrics` response from becoming too large. [Read More](../../docs/proxy/prometheus#tracking-end_user-on-prometheus)
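
As a sketch of the Bedrock Agents support, an agent can be called like any other model via the SDK (the `bedrock/agent/...` model format and the agent/alias IDs below are illustrative placeholders - check the Bedrock Agents docs for the exact format):

```python showLineNumbers title="Bedrock Agents via completion (sketch)"
import litellm

# Route a chat-completions-style call to a Bedrock Agent.
response = litellm.completion(
    model="bedrock/agent/AGENT_ID/AGENT_ALIAS_ID",  # placeholder IDs
    messages=[{"role": "user", "content": "What is the status of my order?"}],
)
print(response.choices[0].message.content)
```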

---

## Vector Store Permissions

This release brings support for managing vector store permissions by Keys, Teams, and Organizations (entities) on LiteLLM. When a request attempts to query a vector store, LiteLLM will block it if the requesting entity lacks the proper permissions.

This is useful when vector stores contain restricted data that not everyone should be able to query.

Over the next week, we plan to add permission management for MCP Servers.
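
As a sketch of how an admin might scope vector store access when generating a key (the `object_permission` field and the vector store ID below are illustrative - check the vector store permissions docs for the exact schema):

```python showLineNumbers title="Scoped vector store access (sketch)"
import requests

# Generate a virtual key that may only query one vector store.
resp = requests.post(
    "http://localhost:4000/key/generate",
    headers={"Authorization": "Bearer sk-admin-key"},  # placeholder admin key
    json={"object_permission": {"vector_stores": ["vs-restricted-hr-docs"]}},
)
print(resp.json())
```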
---

## Aiohttp Transport used by default

The aiohttp transport is now the default transport for LiteLLM networking requests. This gives users 2x higher RPS per instance with a 40ms median latency overhead. It has been live on LiteLLM Cloud for a week and has gone through a week of alpha user testing.

If you encounter any issues, you can disable the aiohttp transport in the following ways:

**On LiteLLM Proxy**

Set `DISABLE_AIOHTTP_TRANSPORT=True` in your environment variables.

```bash showLineNumbers title="Environment Variable"
export DISABLE_AIOHTTP_TRANSPORT="True"
```

**On LiteLLM Python SDK**

Set `litellm.disable_aiohttp_transport = True` to disable the aiohttp transport.

```python showLineNumbers title="Python SDK"
import litellm

litellm.disable_aiohttp_transport = True  # default is False; set True to fall back to the httpx transport
result = litellm.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(result)
```

---

## New Models / Updated Models

- **[Bedrock](../../docs/providers/bedrock)**
    - Video support for Bedrock Converse - [PR](https://github.com/BerriAI/litellm/pull/11166)
    - InvokeAgents support as a /chat/completions route - [PR](https://github.com/BerriAI/litellm/pull/11239), [Get Started](../../docs/providers/bedrock_agents)
    - AI21 Jamba models compatibility fixes - [PR](https://github.com/BerriAI/litellm/pull/11233)
    - Fixed duplicate maxTokens parameter for Claude with thinking - [PR](https://github.com/BerriAI/litellm/pull/11181)
- **[Gemini (Google AI Studio + Vertex AI)](https://docs.litellm.ai/docs/providers/gemini)**
    - Parallel tool calling support with the `parallel_tool_calls` parameter - [PR](https://github.com/BerriAI/litellm/pull/11125)
    - All Gemini models now support parallel function calling - [PR](https://github.com/BerriAI/litellm/pull/11225)
- **[VertexAI](../../docs/providers/vertex)**
    - codeExecution tool support and anyOf handling - [PR](https://github.com/BerriAI/litellm/pull/11195)
    - Vertex AI Anthropic support on /v1/messages - [PR](https://github.com/BerriAI/litellm/pull/11246)
    - Thinking, global regions, and parallel tool calling improvements - [PR](https://github.com/BerriAI/litellm/pull/11194)
    - Web Search support - [PR](https://github.com/BerriAI/litellm/commit/06484f6e5a7a2f4e45c490266782ed28b51b7db6)
- **[Anthropic](../../docs/providers/anthropic)**
    - Thinking blocks on streaming support - [PR](https://github.com/BerriAI/litellm/pull/11194)
    - Files API with form-data support on passthrough - [PR](https://github.com/BerriAI/litellm/pull/11256)
    - File ID support on /chat/completions - [PR](https://github.com/BerriAI/litellm/pull/11256)
- **[xAI](../../docs/providers/xai)**
    - Web Search support - [PR](https://github.com/BerriAI/litellm/commit/06484f6e5a7a2f4e45c490266782ed28b51b7db6)
- **[Google AI Studio](../../docs/providers/gemini)**
    - Web Search support - [PR](https://github.com/BerriAI/litellm/commit/06484f6e5a7a2f4e45c490266782ed28b51b7db6)
- **[Mistral](../../docs/providers/mistral)**
    - Updated mistral-medium prices and context sizes - [PR](https://github.com/BerriAI/litellm/pull/10729)
- **[Ollama](../../docs/providers/ollama)**
    - Tool call parsing on streaming - [PR](https://github.com/BerriAI/litellm/pull/11171)
- **[Cohere](../../docs/providers/cohere)**
    - Swapped Cohere and Cohere Chat provider positioning - [PR](https://github.com/BerriAI/litellm/pull/11173)
- **[Nebius AI Studio](../../docs/providers/nebius)**
    - New provider integration - [PR](https://github.com/BerriAI/litellm/pull/11143)

## LLM API Endpoints

- **[Image Edits API](../../docs/image_generation)**
    - Azure support for /v1/images/edits - [PR](https://github.com/BerriAI/litellm/pull/11160)
    - Cost tracking for the image edits endpoint (OpenAI, Azure) - [PR](https://github.com/BerriAI/litellm/pull/11186)
- **[Completions API](../../docs/completion/chat)**
    - Codestral latency overhead tracking on /v1/completions - [PR](https://github.com/BerriAI/litellm/pull/10879)
- **[Audio Transcriptions API](../../docs/audio/speech)**
    - GPT-4o mini audio preview pricing without date - [PR](https://github.com/BerriAI/litellm/pull/11207)
    - Non-default params support for audio transcription - [PR](https://github.com/BerriAI/litellm/pull/11212)
- **[Responses API](../../docs/response_api)**
    - Session management fixes when using non-OpenAI models - [PR](https://github.com/BerriAI/litellm/pull/11254)

## Management Endpoints / UI

- **Vector Stores**
    - Permission management for LiteLLM Keys, Teams, and Organizations - [PR](https://github.com/BerriAI/litellm/pull/11213)
    - UI display of vector store permissions - [PR](https://github.com/BerriAI/litellm/pull/11277)
    - Vector store access controls enforcement - [PR](https://github.com/BerriAI/litellm/pull/11281)
    - Object permissions fixes and QA improvements - [PR](https://github.com/BerriAI/litellm/pull/11291)
- **Teams**
    - "All proxy models" display when no models are selected - [PR](https://github.com/BerriAI/litellm/pull/11187)
    - Removed a redundant teamInfo call, using the existing teamsList - [PR](https://github.com/BerriAI/litellm/pull/11051)
    - Improved model tags display on the Keys, Teams, and Org pages - [PR](https://github.com/BerriAI/litellm/pull/11022)
- **SSO/SCIM**
    - Bug fixes for showing the SCIM token on the UI - [PR](https://github.com/BerriAI/litellm/pull/11220)
- **General UI**
    - Fix "UI Session Expired. Logging out" - [PR](https://github.com/BerriAI/litellm/pull/11279)
    - Support for forwarding /sso/key/generate to the server root path URL - [PR](https://github.com/BerriAI/litellm/pull/11165)

## Logging / Guardrails Integrations

#### Logging

- **[Prometheus](../../docs/proxy/prometheus)**
    - End users are no longer tracked on Prometheus by default; tracking `end_user` is now opt-in - [PR](https://github.com/BerriAI/litellm/pull/11192)
- **[Langfuse](../../docs/proxy/logging#langfuse)**
    - Performance improvements: fixed the "Max langfuse clients reached" issue - [PR](https://github.com/BerriAI/litellm/pull/11285)
- **[Helicone](../../docs/observability/helicone_integration)**
    - Base URL support - [PR](https://github.com/BerriAI/litellm/pull/11211)
- **[Sentry](../../docs/proxy/logging#sentry)**
    - Added Sentry sample rate configuration - [PR](https://github.com/BerriAI/litellm/pull/10283)

#### Guardrails

- **[Bedrock Guardrails](../../docs/proxy/guardrails/bedrock)**
    - Streaming support for Bedrock post guard - [PR](https://github.com/BerriAI/litellm/pull/11247)
    - Auth parameter persistence fixes - [PR](https://github.com/BerriAI/litellm/pull/11270)
- **[Pangea Guardrails](../../docs/proxy/guardrails/pangea)**
    - Added Pangea provider to the Guardrails hook - [PR](https://github.com/BerriAI/litellm/pull/10775)

## Performance / Reliability Improvements

- **aiohttp Transport**
    - Handling for aiohttp.ClientPayloadError - [PR](https://github.com/BerriAI/litellm/pull/11162)
    - SSL verification settings support - [PR](https://github.com/BerriAI/litellm/pull/11162)
    - Rollback to httpx==0.27.0 for stability - [PR](https://github.com/BerriAI/litellm/pull/11146)
- **Request Limiting**
    - Sliding window logic for the parallel request limiter v2 - [PR](https://github.com/BerriAI/litellm/pull/11283)

## Bug Fixes

- **LLM API Fixes**
    - Added missing request_kwargs to the get_available_deployment call - [PR](https://github.com/BerriAI/litellm/pull/11202)
    - Fixed calling Azure O-series models - [PR](https://github.com/BerriAI/litellm/pull/11212)
    - Support for dropping non-OpenAI params via additional_drop_params - [PR](https://github.com/BerriAI/litellm/pull/11246)
    - Fixed frequency_penalty to repeat_penalty parameter mapping - [PR](https://github.com/BerriAI/litellm/pull/11284)
    - Fix for embedding cache hits on string input - [PR](https://github.com/BerriAI/litellm/pull/11211)
- **General**
    - OIDC provider improvements and audience bug fix - [PR](https://github.com/BerriAI/litellm/pull/10054)
    - Removed the AzureCredentialType restriction on AZURE_CREDENTIAL - [PR](https://github.com/BerriAI/litellm/pull/11272)
    - Prevention of sensitive key leakage to Langfuse - [PR](https://github.com/BerriAI/litellm/pull/11165)
    - Fixed the healthcheck test to not rely on curl when curl is not in the image - [PR](https://github.com/BerriAI/litellm/pull/9737)

## New Contributors

* [@agajdosi](https://github.com/agajdosi) made their first contribution in [#9737](https://github.com/BerriAI/litellm/pull/9737)
* [@ketangangal](https://github.com/ketangangal) made their first contribution in [#11161](https://github.com/BerriAI/litellm/pull/11161)
* [@Aktsvigun](https://github.com/Aktsvigun) made their first contribution in [#11143](https://github.com/BerriAI/litellm/pull/11143)
* [@ryanmeans](https://github.com/ryanmeans) made their first contribution in [#10775](https://github.com/BerriAI/litellm/pull/10775)
* [@nikoizs](https://github.com/nikoizs) made their first contribution in [#10054](https://github.com/BerriAI/litellm/pull/10054)
* [@Nitro963](https://github.com/Nitro963) made their first contribution in [#11202](https://github.com/BerriAI/litellm/pull/11202)
* [@Jacobh2](https://github.com/Jacobh2) made their first contribution in [#11207](https://github.com/BerriAI/litellm/pull/11207)
* [@regismesquita](https://github.com/regismesquita) made their first contribution in [#10729](https://github.com/BerriAI/litellm/pull/10729)
* [@Vinnie-Singleton-NN](https://github.com/Vinnie-Singleton-NN) made their first contribution in [#10283](https://github.com/BerriAI/litellm/pull/10283)
* [@trashhalo](https://github.com/trashhalo) made their first contribution in [#11219](https://github.com/BerriAI/litellm/pull/11219)
* [@VigneshwarRajasekaran](https://github.com/VigneshwarRajasekaran) made their first contribution in [#11223](https://github.com/BerriAI/litellm/pull/11223)
* [@AnilAren](https://github.com/AnilAren) made their first contribution in [#11233](https://github.com/BerriAI/litellm/pull/11233)
* [@fadil4u](https://github.com/fadil4u) made their first contribution in [#11242](https://github.com/BerriAI/litellm/pull/11242)
* [@whitfin](https://github.com/whitfin) made their first contribution in [#11279](https://github.com/BerriAI/litellm/pull/11279)
* [@hcoona](https://github.com/hcoona) made their first contribution in [#11272](https://github.com/BerriAI/litellm/pull/11272)
* [@keyute](https://github.com/keyute) made their first contribution in [#11173](https://github.com/BerriAI/litellm/pull/11173)
* [@emmanuel-ferdman](https://github.com/emmanuel-ferdman) made their first contribution in [#11230](https://github.com/BerriAI/litellm/pull/11230)

## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/releases)
---
title: "v1.72.2-stable"
slug: "v1-72-2-stable"
date: 2025-06-07T10:00:00
authors:
- name: Krrish Dholakia
  title: CEO, LiteLLM
  url: https://www.linkedin.com/in/krish-d/
  image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
- name: Ishaan Jaffer
  title: CTO, LiteLLM
  url: https://www.linkedin.com/in/reffajnaahsi/
  image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.72.2-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.72.2.post1
```

</TabItem>
</Tabs>

## TLDR

* **Why Upgrade**
    - Performance improvements for /v1/messages: LiteLLM Proxy overhead for this endpoint is now down to 50ms at 250 RPS.
    - Accurate rate limiting: Multi-instance rate limiting now tracks rate limits across keys, models, teams, and users with 0 spillover.
    - Audit Logs on UI: Track when Keys, Teams, and Models were deleted by viewing Audit Logs on the LiteLLM UI.
    - /v1/messages all-models support: You can now use all LiteLLM models (`gpt-4.1`, `o1-pro`, `gemini-2.5-pro`) with the /v1/messages API.
    - [Anthropic MCP](../../docs/providers/anthropic#mcp-tool-calling): Use remote MCP servers with Anthropic models.
* **Who Should Read**
    - Teams using the `/v1/messages` API (Claude Code)
    - Proxy Admins using LiteLLM Virtual Keys and setting rate limits
* **Risk of Upgrade**
    - **Medium**
        - Upgraded `ddtrace==3.8.0`; if you use DataDog tracing, this is a medium-level risk. We recommend monitoring logs for any issues.

---

## `/v1/messages` Performance Improvements

<Image
  img={require('../../img/release_notes/v1_messages_perf.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

This release brings significant performance improvements to the /v1/messages API on LiteLLM.

For this endpoint, LiteLLM Proxy overhead is now down to 50ms, and each instance can handle 250 RPS. We validated these improvements through load testing with payloads containing over 1,000 streaming chunks.

This is great for real-time use cases with large requests (e.g. multi-turn conversations, Claude Code).

## Multi-Instance Rate Limiting Improvements

<Image
  img={require('../../img/release_notes/multi_instance_rate_limits_v3.jpg')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

LiteLLM now accurately tracks rate limits across keys, models, teams, and users with 0 spillover.

This is a significant improvement over the previous version, which faced issues with leakage and spillover in high-traffic, multi-instance setups.

**Key Changes:**

- Redis is now part of the rate limit check, instead of being a background sync. This ensures accuracy and reduces read/write operations during low activity.
- LiteLLM now uses Lua scripts to ensure all checks are atomic (see the sketch below).
- In-memory caching uses Redis values. This prevents drift and reduces Redis queries once objects are over their limit.

These changes are currently behind the feature flag `EXPERIMENTAL_ENABLE_MULTI_INSTANCE_RATE_LIMITING=True`. We plan to GA this in our next release, subject to feedback.
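
For intuition, here is a minimal, self-contained sketch of an atomic sliding-window check using a Redis Lua script - illustrative only, not LiteLLM's actual implementation:

```python showLineNumbers title="Atomic sliding-window check (sketch)"
import time
import uuid

import redis

# The whole check-and-record runs atomically inside Redis, so multiple
# proxy instances cannot race each other past the limit.
SLIDING_WINDOW = """
local key    = KEYS[1]
local now_ms = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit  = tonumber(ARGV[3])
-- drop requests that fell out of the window, then count what's left
redis.call('ZREMRANGEBYSCORE', key, 0, now_ms - window)
if redis.call('ZCARD', key) < limit then
  redis.call('ZADD', key, now_ms, ARGV[4])
  redis.call('PEXPIRE', key, window)
  return 1  -- allowed
end
return 0    -- rate limited
"""

r = redis.Redis()
allowed = r.eval(
    SLIDING_WINDOW, 1, "ratelimit:key-123",
    int(time.time() * 1000),  # now, in ms
    60_000,                   # 1 minute window
    100,                      # limit per window
    uuid.uuid4().hex,         # unique member per request
)
print("allowed" if allowed else "rate limited")
```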

## Audit Logs on UI

<Image
  img={require('../../img/release_notes/ui_audit_log.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

This release introduces support for viewing audit logs in the UI. As a Proxy Admin, you can now check if and when a key was deleted, along with who performed the action.

LiteLLM tracks changes to the following entities and actions:

- **Entities:** Keys, Teams, Users, Models
- **Actions:** Create, Update, Delete, Regenerate

## New Models / Updated Models

**Newly Added Models**

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- |
| Anthropic | `claude-4-opus-20250514` | 200K | $15.00 | $75.00 |
| Anthropic | `claude-4-sonnet-20250514` | 200K | $3.00 | $15.00 |
| VertexAI, Google AI Studio | `gemini-2.5-pro-preview-06-05` | 1M | $1.25 | $10.00 |
| OpenAI | `codex-mini-latest` | 200K | $1.50 | $6.00 |
| Cerebras | `qwen-3-32b` | 128K | $0.40 | $0.80 |
| SambaNova | `DeepSeek-R1` | 32K | $5.00 | $7.00 |
| SambaNova | `DeepSeek-R1-Distill-Llama-70B` | 131K | $0.70 | $1.40 |

### Model Updates

- **[Anthropic](../../docs/providers/anthropic)**
    - Cost tracking added for new Claude models - [PR](https://github.com/BerriAI/litellm/pull/11339)
        - `claude-4-opus-20250514`
        - `claude-4-sonnet-20250514`
    - Support for MCP tool calling with Anthropic models - [PR](https://github.com/BerriAI/litellm/pull/11474)
- **[Google AI Studio](../../docs/providers/gemini)**
    - Google Gemini 2.5 Pro Preview 06-05 support - [PR](https://github.com/BerriAI/litellm/pull/11447)
    - Gemini streaming thinking content parsing with `reasoning_content` - [PR](https://github.com/BerriAI/litellm/pull/11298)
    - Support for a no-reasoning option for Gemini models - [PR](https://github.com/BerriAI/litellm/pull/11393)
    - URL context support for Gemini models - [PR](https://github.com/BerriAI/litellm/pull/11351)
    - Gemini embeddings-001 model prices and context window - [PR](https://github.com/BerriAI/litellm/pull/11332)
- **[OpenAI](../../docs/providers/openai)**
    - Cost tracking for `codex-mini-latest` - [PR](https://github.com/BerriAI/litellm/pull/11492)
- **[Vertex AI](../../docs/providers/vertex)**
    - Cache token tracking on streaming calls - [PR](https://github.com/BerriAI/litellm/pull/11387)
    - Return a response_id matching the upstream response ID for stream and non-stream - [PR](https://github.com/BerriAI/litellm/pull/11456)
- **[Cerebras](../../docs/providers/cerebras)**
    - cerebras/qwen-3-32b model pricing and context window - [PR](https://github.com/BerriAI/litellm/pull/11373)
- **[HuggingFace](../../docs/providers/huggingface)**
    - Fixed embeddings using non-default `input_type` - [PR](https://github.com/BerriAI/litellm/pull/11452)
- **[DataRobot](../../docs/providers/datarobot)**
    - New provider integration for enterprise AI workflows - [PR](https://github.com/BerriAI/litellm/pull/10385)
- **[DeepSeek](../../docs/providers/together_ai)**
    - DeepSeek R1 family model configuration via Together AI - [PR](https://github.com/BerriAI/litellm/pull/11394)
    - DeepSeek R1 pricing and context window configuration - [PR](https://github.com/BerriAI/litellm/pull/11339)
---

## LLM API Endpoints

- **[Images API](../../docs/image_generation)**
    - Azure endpoint support for image endpoints - [PR](https://github.com/BerriAI/litellm/pull/11482)
- **[Anthropic Messages API](../../docs/completion/chat)**
    - Support for ALL LiteLLM providers (OpenAI, Azure, Bedrock, Vertex, DeepSeek, etc.) on the /v1/messages API spec - [PR](https://github.com/BerriAI/litellm/pull/11502) (see the sketch after this list)
    - Performance improvements for the /v1/messages route - [PR](https://github.com/BerriAI/litellm/pull/11421)
    - Return streaming usage statistics when using LiteLLM with Bedrock models - [PR](https://github.com/BerriAI/litellm/pull/11469)
- **[Embeddings API](../../docs/embedding/supported_embedding)**
    - Provider-specific optional params handling for embedding calls - [PR](https://github.com/BerriAI/litellm/pull/11346)
    - Proper Sagemaker request attribute usage for embeddings - [PR](https://github.com/BerriAI/litellm/pull/11362)
- **[Rerank API](../../docs/rerank/supported_rerank)**
    - New HuggingFace rerank provider support - [PR](https://github.com/BerriAI/litellm/pull/11438), [Guide](../../docs/providers/huggingface_rerank)
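
As a sketch of the all-providers support on `/v1/messages`, assuming the Anthropic SDK pointed at your LiteLLM proxy (the base URL and key below are placeholders):

```python showLineNumbers title="Any model via /v1/messages (sketch)"
from anthropic import Anthropic

# The Anthropic SDK talks to the proxy; the model can be any LiteLLM model.
client = Anthropic(api_key="sk-my-virtual-key", base_url="http://localhost:4000")

message = client.messages.create(
    model="gpt-4.1",  # a non-Anthropic model served over the Messages API spec
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello from the Messages API!"}],
)
print(message.content)
```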
---

## Spend Tracking

- Added token tracking for Anthropic batch calls via the /anthropic passthrough route - [PR](https://github.com/BerriAI/litellm/pull/11388)

---

## Management Endpoints / UI

- **SSO/Authentication**
    - SSO configuration endpoints and UI integration with persistent settings - [PR](https://github.com/BerriAI/litellm/pull/11417)
    - Update proxy admin ID role in DB + handle SSO redirects with a custom root path - [PR](https://github.com/BerriAI/litellm/pull/11384)
    - Support returning a virtual key in custom auth - [PR](https://github.com/BerriAI/litellm/pull/11346)
    - User ID validation to ensure it is not an email or phone number - [PR](https://github.com/BerriAI/litellm/pull/10102)
- **Teams**
    - Fixed Create/Update team member API 500 error - [PR](https://github.com/BerriAI/litellm/pull/10479)
    - Enterprise feature gating for RegenerateKeyModal in KeyInfoView - [PR](https://github.com/BerriAI/litellm/pull/11400)
- **SCIM**
    - Fixed SCIM patch operation case sensitivity - [PR](https://github.com/BerriAI/litellm/pull/11335)
- **General**
    - Converted action buttons to sticky footer action buttons - [PR](https://github.com/BerriAI/litellm/pull/11293)
    - Custom Server Root Path - support for serving the UI on a custom root path - [Guide](../../docs/proxy/custom_root_ui)

---

## Logging / Guardrails Integrations

#### Logging

- **[S3](../../docs/proxy/logging#s3)**
    - Async + batched S3 logging for improved performance - [PR](https://github.com/BerriAI/litellm/pull/11340)
- **[DataDog](../../docs/observability/datadog_integration)**
    - Add instrumentation for streaming chunks - [PR](https://github.com/BerriAI/litellm/pull/11338)
    - Add the DD profiler to monitor the Python profile of LiteLLM CPU% - [PR](https://github.com/BerriAI/litellm/pull/11375)
    - Bump DD trace version - [PR](https://github.com/BerriAI/litellm/pull/11426)
- **[Prometheus](../../docs/proxy/prometheus)**
    - Pass custom metadata labels in litellm_total_token metrics - [PR](https://github.com/BerriAI/litellm/pull/11414)
- **[GCS](../../docs/proxy/logging#google-cloud-storage)**
    - Update GCSBucketBase to handle the GSM project ID if passed - [PR](https://github.com/BerriAI/litellm/pull/11409)

#### Guardrails

- **[Presidio](../../docs/proxy/guardrails/presidio)**
    - Add presidio_language YAML configuration support for guardrails - [PR](https://github.com/BerriAI/litellm/pull/11331)

---

## Performance / Reliability Improvements

- **Performance Optimizations**
    - Don't run auth on /health/liveliness endpoints - [PR](https://github.com/BerriAI/litellm/pull/11378)
    - Don't create a task for every hanging request alert - [PR](https://github.com/BerriAI/litellm/pull/11385)
    - Add a debugging endpoint to track active /asyncio-tasks - [PR](https://github.com/BerriAI/litellm/pull/11382)
    - Make the batch size for maximum retention in spend logs controllable - [PR](https://github.com/BerriAI/litellm/pull/11459)
    - Expose a flag to disable the token counter - [PR](https://github.com/BerriAI/litellm/pull/11344)
    - Support pipelined Redis lpop for older Redis versions - [PR](https://github.com/BerriAI/litellm/pull/11425)

---

## Bug Fixes

- **LLM API Fixes**
    - **Anthropic**: Fixed a regression when passing file URLs to the 'file_id' parameter - [PR](https://github.com/BerriAI/litellm/pull/11387)
    - **Vertex AI**: Fixed Vertex AI any_of issues for Description and Default - [PR](https://github.com/BerriAI/litellm/issues/11383)
    - Fixed transcription model name mapping - [PR](https://github.com/BerriAI/litellm/pull/11333)
    - **Image Generation**: Fixed None values in the usage field for gpt-image-1 model responses - [PR](https://github.com/BerriAI/litellm/pull/11448)
    - **Responses API**: Fixed `_transform_responses_api_content_to_chat_completion_content` not supporting the file content type - [PR](https://github.com/BerriAI/litellm/pull/11494)
    - **Fireworks AI**: Fixed rate limit exception mapping - detect "rate limit" text in error messages - [PR](https://github.com/BerriAI/litellm/pull/11455)
- **Spend Tracking/Budgets**
    - Respect the user_header_name property for budget selection and user identification - [PR](https://github.com/BerriAI/litellm/pull/11419)
- **MCP Server**
    - Remove duplicate server_id MCP config servers - [PR](https://github.com/BerriAI/litellm/pull/11327)
- **Function Calling**
    - supports_function_calling works with llm_proxy models - [PR](https://github.com/BerriAI/litellm/pull/11381)
- **Knowledge Base**
    - Fixed Knowledge Base calls returning an error - [PR](https://github.com/BerriAI/litellm/pull/11467)

---

## New Contributors

* [@mjnitz02](https://github.com/mjnitz02) made their first contribution in [#10385](https://github.com/BerriAI/litellm/pull/10385)
* [@hagan](https://github.com/hagan) made their first contribution in [#10479](https://github.com/BerriAI/litellm/pull/10479)
* [@wwells](https://github.com/wwells) made their first contribution in [#11409](https://github.com/BerriAI/litellm/pull/11409)
* [@likweitan](https://github.com/likweitan) made their first contribution in [#11400](https://github.com/BerriAI/litellm/pull/11400)
* [@raz-alon](https://github.com/raz-alon) made their first contribution in [#10102](https://github.com/BerriAI/litellm/pull/10102)
* [@jtsai-quid](https://github.com/jtsai-quid) made their first contribution in [#11394](https://github.com/BerriAI/litellm/pull/11394)
* [@tmbo](https://github.com/tmbo) made their first contribution in [#11362](https://github.com/BerriAI/litellm/pull/11362)
* [@wangsha](https://github.com/wangsha) made their first contribution in [#11351](https://github.com/BerriAI/litellm/pull/11351)
* [@seankwalker](https://github.com/seankwalker) made their first contribution in [#11452](https://github.com/BerriAI/litellm/pull/11452)
* [@pazevedo-hyland](https://github.com/pazevedo-hyland) made their first contribution in [#11381](https://github.com/BerriAI/litellm/pull/11381)
* [@cainiaoit](https://github.com/cainiaoit) made their first contribution in [#11438](https://github.com/BerriAI/litellm/pull/11438)
* [@vuanhtu52](https://github.com/vuanhtu52) made their first contribution in [#11508](https://github.com/BerriAI/litellm/pull/11508)

---

## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/releases/tag/v1.72.2-stable)
---
title: "v1.72.6-stable - MCP Gateway Permission Management"
slug: "v1-72-6-stable"
date: 2025-06-14T10:00:00
authors:
- name: Krrish Dholakia
  title: CEO, LiteLLM
  url: https://www.linkedin.com/in/krish-d/
  image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
- name: Ishaan Jaffer
  title: CTO, LiteLLM
  url: https://www.linkedin.com/in/reffajnaahsi/
  image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.72.6-stable
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.72.6.post2
```

</TabItem>
</Tabs>

## TLDR

* **Why Upgrade**
    - Codex-mini on Claude Code: You can now use `codex-mini` (OpenAI's code assistant model) via Claude Code.
    - MCP Permissions Management: Manage permissions for MCP Servers by Keys, Teams, and Organizations (entities) on LiteLLM.
    - UI: Turn auto-refresh on/off on the logs view.
    - Rate Limiting: Support for output-token-only rate limiting.
* **Who Should Read**
    - Teams using the `/v1/messages` API (Claude Code)
    - Teams using **MCP**
    - Teams giving access to self-hosted models and setting rate limits
* **Risk of Upgrade**
    - **Low**
        - No major changes to existing functionality or package updates.

---

## Key Highlights

### MCP Permissions Management

<Image img={require('../../img/release_notes/mcp_permissions.png')}/>

This release brings support for managing permissions for MCP Servers by Keys, Teams, and Organizations (entities) on LiteLLM. When an MCP client attempts to list tools, LiteLLM will only return the tools the entity has permission to access.

This is great for use cases that require access to restricted data (e.g. a Jira MCP server) that you don't want everyone to use.

For Proxy Admins, this enables centralized management of all MCP Servers with access control. For developers, this means you'll only see the MCP tools assigned to you.

### Codex-mini on Claude Code

<Image img={require('../../img/release_notes/codex_on_claude_code.jpg')} />

This release brings support for calling `codex-mini` (OpenAI's code assistant model) via Claude Code.

LiteLLM now enables any Responses API model (including `o3-pro`) to be called via the `/chat/completions` and `/v1/messages` endpoints. This includes:

- Streaming calls
- Non-streaming calls
- Cost tracking on success + failure for Responses API models

Here's how to use it [today](../../docs/tutorials/claude_responses_api)
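
A minimal sketch of the bridge, assuming an OpenAI-compatible client pointed at your LiteLLM proxy (the base URL and key below are placeholders):

```python showLineNumbers title="codex-mini via /chat/completions (sketch)"
from openai import OpenAI

client = OpenAI(api_key="sk-my-virtual-key", base_url="http://localhost:4000")

# codex-mini is a Responses-API-only model; LiteLLM bridges the
# /chat/completions call to the Responses API behind the scenes.
response = client.chat.completions.create(
    model="codex-mini",
    messages=[{"role": "user", "content": "Write a function that reverses a linked list."}],
)
print(response.choices[0].message.content)
```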
---

## New / Updated Models

### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- | -------------------- |
| VertexAI | `vertex_ai/claude-opus-4` | 200K | $15.00 | $75.00 | New |
| OpenAI | `gpt-4o-audio-preview-2025-06-03` | 128K | $2.50 (text), $40.00 (audio) | $10.00 (text), $80.00 (audio) | New |
| OpenAI | `o3-pro` | 200K | $20.00 | $80.00 | New |
| OpenAI | `o3-pro-2025-06-10` | 200K | $20.00 | $80.00 | New |
| OpenAI | `o3` | 200K | $2.00 | $8.00 | Updated |
| OpenAI | `o3-2025-04-16` | 200K | $2.00 | $8.00 | Updated |
| Azure | `azure/gpt-4o-mini-transcribe` | 16K | $1.25 (text), $3.00 (audio) | $5.00 (text) | New |
| Mistral | `mistral/magistral-medium-latest` | 40K | $2.00 | $5.00 | New |
| Mistral | `mistral/magistral-small-latest` | 40K | $0.50 | $1.50 | New |

- Deepgram: `nova-3` cost-per-second pricing is [now supported](https://github.com/BerriAI/litellm/pull/11634).

### Updated Models

#### Bugs

- **[Watsonx](../../docs/providers/watsonx)**
    - Ignore space id on Watsonx deployments (it was throwing JSON errors) - [PR](https://github.com/BerriAI/litellm/pull/11527)
- **[Ollama](../../docs/providers/ollama)**
    - Set the tool call id for streaming calls - [PR](https://github.com/BerriAI/litellm/pull/11528)
- **Gemini ([VertexAI](../../docs/providers/vertex) + [Google AI Studio](../../docs/providers/gemini))**
    - Fix tool call indexes - [PR](https://github.com/BerriAI/litellm/pull/11558)
    - Handle empty strings for arguments in function calls - [PR](https://github.com/BerriAI/litellm/pull/11601)
    - Add audio/ogg mime type support when inferring from file URLs - [PR](https://github.com/BerriAI/litellm/pull/11635)
- **[Custom LLM](../../docs/providers/custom_llm_server)**
    - Fix passing api_base, api_key, and litellm_params_dict to custom_llm embedding methods - [PR](https://github.com/BerriAI/litellm/pull/11450) s/o [ElefHead](https://github.com/ElefHead)
- **[Huggingface](../../docs/providers/huggingface)**
    - Add /chat/completions to the endpoint URL when missing - [PR](https://github.com/BerriAI/litellm/pull/11630)
- **[Deepgram](../../docs/providers/deepgram)**
    - Support async httpx calls - [PR](https://github.com/BerriAI/litellm/pull/11641)
- **[Anthropic](../../docs/providers/anthropic)**
    - Append the prefix (if set) to the assistant content start - [PR](https://github.com/BerriAI/litellm/pull/11719)

#### Features

- **[VertexAI](../../docs/providers/vertex)**
    - Support Vertex credentials set via env var on passthrough - [PR](https://github.com/BerriAI/litellm/pull/11527)
    - Support for choosing the `global` region when a model is only available there - [PR](https://github.com/BerriAI/litellm/pull/11566)
    - Anthropic passthrough cost calculation + token tracking - [PR](https://github.com/BerriAI/litellm/pull/11611)
    - Support the `global` Vertex region on passthrough - [PR](https://github.com/BerriAI/litellm/pull/11661)
- **[Anthropic](../../docs/providers/anthropic)**
    - `none` tool choice param support - [PR](https://github.com/BerriAI/litellm/pull/11695), [Get Started](../../docs/providers/anthropic#disable-tool-calling)
- **[Perplexity](../../docs/providers/perplexity)**
    - Add `reasoning_effort` support - [PR](https://github.com/BerriAI/litellm/pull/11562), [Get Started](../../docs/providers/perplexity#reasoning-effort)
- **[Mistral](../../docs/providers/mistral)**
    - Add Mistral reasoning support - [PR](https://github.com/BerriAI/litellm/pull/11642), [Get Started](../../docs/providers/mistral#reasoning)
- **[SGLang](../../docs/providers/openai_compatible)**
    - Map context window exceeded errors for proper handling - [PR](https://github.com/BerriAI/litellm/pull/11575/)
- **[Deepgram](../../docs/providers/deepgram)**
    - Provider-specific params support - [PR](https://github.com/BerriAI/litellm/pull/11638)
- **[Azure](../../docs/providers/azure)**
    - Return content safety filter results - [PR](https://github.com/BerriAI/litellm/pull/11655)
---

## LLM API Endpoints

#### Bugs

- **[Chat Completion](../../docs/completion/input)**
    - Streaming - ensure a consistent 'created' value across chunks - [PR](https://github.com/BerriAI/litellm/pull/11528)

#### Features

- **MCP**
    - Add controls for MCP Permission Management - [PR](https://github.com/BerriAI/litellm/pull/11598), [Docs](../../docs/mcp#-mcp-permission-management)
    - Add permission management for MCP List + Call Tool operations - [PR](https://github.com/BerriAI/litellm/pull/11682), [Docs](../../docs/mcp#-mcp-permission-management)
    - Streamable HTTP server support - [PR](https://github.com/BerriAI/litellm/pull/11628), [PR](https://github.com/BerriAI/litellm/pull/11645), [Docs](../../docs/mcp#using-your-mcp)
    - Use experimental dedicated REST endpoints for listing and calling MCP tools - [PR](https://github.com/BerriAI/litellm/pull/11684)
- **[Responses API](../../docs/response_api)**
    - NEW API endpoint - List input items - [PR](https://github.com/BerriAI/litellm/pull/11602)
    - Background mode for OpenAI + Azure OpenAI - [PR](https://github.com/BerriAI/litellm/pull/11640)
    - Langfuse/other logging support on Responses API requests - [PR](https://github.com/BerriAI/litellm/pull/11685)
- **[Chat Completions](../../docs/completion/input)**
    - Bridge for the Responses API - allows calling codex-mini via `/chat/completions` and `/v1/messages` - [PR](https://github.com/BerriAI/litellm/pull/11632), [PR](https://github.com/BerriAI/litellm/pull/11685)
---

## Spend Tracking

#### Bugs

- **[End Users](../../docs/proxy/customers)**
    - Update end-user spend and budget reset date based on budget duration - [PR](https://github.com/BerriAI/litellm/pull/8460) (s/o [laurien16](https://github.com/laurien16))
- **[Custom Pricing](../../docs/proxy/custom_pricing)**
    - Convert scientific notation strings to int - [PR](https://github.com/BerriAI/litellm/pull/11655)

---

## Management Endpoints / UI

#### Bugs

- **[Users](../../docs/proxy/users)**
    - `/user/info` - fixed passing a user id containing `+`
    - Add an admin-initiated password reset flow - [PR](https://github.com/BerriAI/litellm/pull/11618)
    - Fixed a default user settings UI rendering error - [PR](https://github.com/BerriAI/litellm/pull/11674)
- **[Budgets](../../docs/proxy/users)**
    - Correct success message when a new user budget is created - [PR](https://github.com/BerriAI/litellm/pull/11608)

#### Features

- **Leftnav**
    - Show remaining Enterprise users on the UI
- **MCP**
    - New server add form - [PR](https://github.com/BerriAI/litellm/pull/11604)
    - Allow editing MCP servers - [PR](https://github.com/BerriAI/litellm/pull/11693)
- **Models**
    - Add Deepgram models on the UI
    - Model Access Group support on the UI - [PR](https://github.com/BerriAI/litellm/pull/11719)
- **Keys**
    - Trim long user IDs - [PR](https://github.com/BerriAI/litellm/pull/11488)
- **Logs**
    - Add a live tail feature to the logs view, allowing users to disable auto refresh in high traffic - [PR](https://github.com/BerriAI/litellm/pull/11712)
    - Audit Logs - preview screenshot - [PR](https://github.com/BerriAI/litellm/pull/11715)

---

## Logging / Guardrails Integrations

#### Bugs

- **[Arize](../../docs/observability/arize_integration)**
    - Change the space_key header to space_id - [PR](https://github.com/BerriAI/litellm/pull/11595) (s/o [vanities](https://github.com/vanities))
- **[Prometheus](../../docs/proxy/prometheus)**
    - Fix total requests increment - [PR](https://github.com/BerriAI/litellm/pull/11718)

#### Features

- **[Lasso Guardrails](../../docs/proxy/guardrails/lasso_security)**
    - [NEW] Lasso Guardrails support - [PR](https://github.com/BerriAI/litellm/pull/11565)
- **[Users](../../docs/proxy/users)**
    - New `organizations` param on `/user/new` - allows adding users to orgs on creation - [PR](https://github.com/BerriAI/litellm/pull/11572/files)
- **Prevent double logging when using bridge logic** - [PR](https://github.com/BerriAI/litellm/pull/11687)

---

## Performance / Reliability Improvements

#### Bugs

- **[Tag based routing](../../docs/proxy/tag_routing)**
    - Do not consider 'default' models when a request specifies a tag - [PR](https://github.com/BerriAI/litellm/pull/11454) (s/o [thiagosalvatore](https://github.com/thiagosalvatore))

#### Features

- **[Caching](../../docs/caching/all_caches)**
    - New optional 'litellm[caching]' pip install for adding disk cache dependencies - [PR](https://github.com/BerriAI/litellm/pull/11600)

---
|
||||
|
||||
## General Proxy Improvements
|
||||
|
||||
#### Bugs
|
||||
- **aiohttp**
|
||||
- fixes for transfer encoding error on aiohttp transport - [PR](https://github.com/BerriAI/litellm/pull/11561)
|
||||
|
||||
#### Features
|
||||
- **aiohttp**
|
||||
- Enable System Proxy Support for aiohttp transport - [PR](https://github.com/BerriAI/litellm/pull/11616) (s/o [idootop](https://github.com/idootop))
|
||||
- **CLI**
|
||||
- Make all commands show server URL - [PR](https://github.com/BerriAI/litellm/pull/10801)
|
||||
- **Unicorn**
|
||||
- Allow setting keep alive timeout - [PR](https://github.com/BerriAI/litellm/pull/11594)
|
||||
- **Experimental Rate Limiting v2** (enable via `EXPERIMENTAL_MULTI_INSTANCE_RATE_LIMITING="True"`)
|
||||
- Support specifying rate limit by output_tokens only - [PR](https://github.com/BerriAI/litellm/pull/11646)
|
||||
- Decrement parallel requests on call failure - [PR](https://github.com/BerriAI/litellm/pull/11646)
|
||||
- In-memory only rate limiting support - [PR](https://github.com/BerriAI/litellm/pull/11646)
|
||||
- Return remaining rate limits by key/user/team - [PR](https://github.com/BerriAI/litellm/pull/11646)
|
||||
- **Helm**
|
||||
- support extraContainers in migrations-job.yaml - [PR](https://github.com/BerriAI/litellm/pull/11649)
|
||||
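
Enabling the experimental rate limiter is just an environment variable on the proxy container. A minimal sketch, following the standard Docker setup shown in the deploy sections (the image tag is a placeholder):

```bash title="enable experimental rate limiting v2" showLineNumbers
docker run \
  -e STORE_MODEL_IN_DB=True \
  -e EXPERIMENTAL_MULTI_INSTANCE_RATE_LIMITING="True" \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-stable
```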

---

## New Contributors
* @laurien16 made their first contribution in https://github.com/BerriAI/litellm/pull/8460
* @fengbohello made their first contribution in https://github.com/BerriAI/litellm/pull/11547
* @lapinek made their first contribution in https://github.com/BerriAI/litellm/pull/11570
* @yanwork made their first contribution in https://github.com/BerriAI/litellm/pull/11586
* @dhs-shine made their first contribution in https://github.com/BerriAI/litellm/pull/11575
* @ElefHead made their first contribution in https://github.com/BerriAI/litellm/pull/11450
* @idootop made their first contribution in https://github.com/BerriAI/litellm/pull/11616
* @stevenaldinger made their first contribution in https://github.com/BerriAI/litellm/pull/11649
* @thiagosalvatore made their first contribution in https://github.com/BerriAI/litellm/pull/11454
* @vanities made their first contribution in https://github.com/BerriAI/litellm/pull/11595
* @alvarosevilla95 made their first contribution in https://github.com/BerriAI/litellm/pull/11661

---

## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/compare/v1.72.2-stable...1.72.6.rc)
@@ -0,0 +1,337 @@
---
title: "v1.73.0-stable - Set default team for new users"
slug: "v1-73-0-stable"
date: 2025-06-21T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

:::warning

## Known Issues

The `non-root` docker image has a known issue around the UI not loading. If you use the `non-root` docker image, we recommend waiting before upgrading to this version. We will post a patch fix for this.

:::

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.73.0-stable
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.73.0.post1
```

</TabItem>
</Tabs>

## TLDR

* **Why Upgrade**
    - User Management: Set default team for new users - enables giving all users $10 API keys for exploration.
    - Passthrough Endpoints v2: Enhanced support for subroutes and custom cost tracking for passthrough endpoints.
    - Health Check Dashboard: New frontend UI for monitoring model health and status.
* **Who Should Read**
    - Teams using **Passthrough Endpoints**
    - Teams using **User Management** on LiteLLM
    - Teams using the **Health Check Dashboard** for models
    - Teams using **Claude Code** with LiteLLM
* **Risk of Upgrade**
    - **Low**
        - No major breaking changes to existing functionality.
    - **Major Changes**
        - `User Agent` is now auto-tracked as a tag on the LiteLLM UI Logs page. This means you will see a `User Agent` tag on the logs page for all LLM requests.

---

## Key Highlights

### Set Default Team for New Users

<Image img={require('../../img/default_teams_product_ss.jpg')}/>

<br/>

v1.73.0 introduces the ability to assign new users to Default Teams. This makes it much easier to enable experimentation with LLMs within your company, while also **ensuring spend for exploration is tracked correctly.**

What this means for **Proxy Admins**:
- Set a max budget per team member: This sets a max amount an individual can spend within a team.
- Set a default team for new users: When a new user signs in via SSO / invitation link, they will be automatically added to this team.

What this means for **Developers**:
- View models across teams: You can now go to `Models + Endpoints` and view the models you have access to, across all teams you're a member of.
- Safe create key modal: If you have no model access outside of a team (the default behaviour), you are now nudged to select a team on the Create Key modal. This resolves a common confusion point for new users onboarding to the proxy.

[Get Started](https://docs.litellm.ai/docs/tutorials/default_team_self_serve)

### Passthrough Endpoints v2

<Image img={require('../../img/release_notes/v2_pt.png')}/>

<br/>

This release brings support for adding billing and full URL forwarding for passthrough endpoints.

Previously, you could only map simple endpoints; now you can add just `/bria` and all subroutes automatically get forwarded - for example, `/bria/v1/text-to-image/base/model` and `/bria/v1/enhance_image` will both be forwarded to the target URL with the same path structure. A sketch of a subroute call is shown below.
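
For illustration, once a `/bria` passthrough endpoint is configured, a subroute call through the proxy might look like this - the key and request body are placeholders, and the exact payload depends on the target API:

```bash title="passthrough subroute call" showLineNumbers
curl -X POST 'http://localhost:4000/bria/v1/enhance_image' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"image_url": "https://example.com/cat.png"}'
```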

This means you as Proxy Admin can onboard third-party endpoints like Bria API and Mistral OCR, set a cost per request, and give your developers access to the complete API functionality.

[Learn more about Passthrough Endpoints](../../docs/proxy/pass_through)

### v2 Health Checks

<Image img={require('../../img/release_notes/v2_health.png')}/>

<br/>

This release brings support for Proxy Admins to select which specific models to health check, and to see each model's health status as soon as its individual check completes, along with last check times.

This allows Proxy Admins to immediately identify which specific models are in a bad state and view the full error stack trace for faster troubleshooting.

---

## New / Updated Models

### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- | ---- |
| Google VertexAI | `vertex_ai/imagen-4` | N/A | Image Generation | Image Generation | New |
| Google VertexAI | `vertex_ai/imagen-4-preview` | N/A | Image Generation | Image Generation | New |
| Gemini | `gemini-2.5-pro` | 2M | $1.25 | $5.00 | New |
| Gemini | `gemini-2.5-flash-lite` | 1M | $0.075 | $0.30 | New |
| OpenRouter | Various models | Updated | Updated | Updated | Updated |
| Azure | `azure/o3` | 200k | $2.00 | $8.00 | Updated |
| Azure | `azure/o3-pro` | 200k | $2.00 | $8.00 | Updated |
| Azure OpenAI | Azure Codex Models | Various | Various | Various | New |

### Updated Models

#### Features
- **[Azure](../../docs/providers/azure)**
    - Support for new /v1 preview Azure OpenAI API - [PR](https://github.com/BerriAI/litellm/pull/11934), [Get Started](../../docs/providers/azure/azure_responses#azure-codex-models)
    - Add Azure Codex Models support - [PR](https://github.com/BerriAI/litellm/pull/11934), [Get Started](../../docs/providers/azure/azure_responses#azure-codex-models)
    - Make Azure AD scope configurable - [PR](https://github.com/BerriAI/litellm/pull/11621)
    - Handle more GPT custom naming patterns - [PR](https://github.com/BerriAI/litellm/pull/11914)
    - Update o3 pricing to match OpenAI pricing - [PR](https://github.com/BerriAI/litellm/pull/11937)
- **[VertexAI](../../docs/providers/vertex)**
    - Add Vertex Imagen-4 models - [PR](https://github.com/BerriAI/litellm/pull/11767), [Get Started](../../docs/providers/vertex_image)
    - Anthropic streaming passthrough cost tracking - [PR](https://github.com/BerriAI/litellm/pull/11734)
- **[Gemini](../../docs/providers/gemini)**
    - Working Gemini TTS support via `/v1/speech` endpoint - [PR](https://github.com/BerriAI/litellm/pull/11832)
    - Fix gemini 2.5 flash config - [PR](https://github.com/BerriAI/litellm/pull/11830)
    - Add missing `gemini-2.5-flash-lite` model and fix pricing - [PR](https://github.com/BerriAI/litellm/pull/11901)
    - Mark all gemini-2.5 models as supporting PDF input - [PR](https://github.com/BerriAI/litellm/pull/11907)
    - Add `gemini-2.5-pro` with reasoning support - [PR](https://github.com/BerriAI/litellm/pull/11927)
- **[AWS Bedrock](../../docs/providers/bedrock)**
    - AWS credentials no longer mandatory - [PR](https://github.com/BerriAI/litellm/pull/11765)
    - Add AWS Bedrock profiles for APAC region - [PR](https://github.com/BerriAI/litellm/pull/11883)
    - Fix AWS Bedrock Claude tool call index - [PR](https://github.com/BerriAI/litellm/pull/11842)
    - Handle base64 file data with `qs:..` prefix - [PR](https://github.com/BerriAI/litellm/pull/11908)
    - Add Mistral Small to BEDROCK_CONVERSE_MODELS - [PR](https://github.com/BerriAI/litellm/pull/11760)
- **[Mistral](../../docs/providers/mistral)**
    - Enhance Mistral API with parallel tool calls support - [PR](https://github.com/BerriAI/litellm/pull/11770)
- **[Meta Llama API](../../docs/providers/meta_llama)**
    - Enable tool calling for meta_llama models - [PR](https://github.com/BerriAI/litellm/pull/11895)
- **[Volcengine](../../docs/providers/volcengine)**
    - Add thinking parameter support - [PR](https://github.com/BerriAI/litellm/pull/11914)

#### Bugs

- **[VertexAI](../../docs/providers/vertex)**
    - Handle missing tokenCount in promptTokensDetails - [PR](https://github.com/BerriAI/litellm/pull/11896)
    - Fix Vertex AI Claude thinking params - [PR](https://github.com/BerriAI/litellm/pull/11796)
- **[Gemini](../../docs/providers/gemini)**
    - Fix web search error with Responses API - [PR](https://github.com/BerriAI/litellm/pull/11894), [Get Started](../../docs/completion/web_search#responses-litellmresponses)
- **[Custom LLM](../../docs/providers/custom_llm_server)**
    - Set anthropic custom LLM provider property - [PR](https://github.com/BerriAI/litellm/pull/11907)
- **[Anthropic](../../docs/providers/anthropic)**
    - Bump anthropic package version - [PR](https://github.com/BerriAI/litellm/pull/11851)
- **[Ollama](../../docs/providers/ollama)**
    - Update ollama_embeddings to work on sync API - [PR](https://github.com/BerriAI/litellm/pull/11746)
    - Fix response_format not working - [PR](https://github.com/BerriAI/litellm/pull/11880)

---

## LLM API Endpoints

#### Features
- **[Responses API](../../docs/response_api)**
    - Day-0 support for OpenAI reusable prompts on the Responses API - [PR](https://github.com/BerriAI/litellm/pull/11782), [Get Started](../../docs/providers/openai/responses_api#reusable-prompts)
    - Support passing image URLs in the Completion-to-Responses bridge - [PR](https://github.com/BerriAI/litellm/pull/11833)
- **[MCP Gateway](../../docs/mcp)**
    - Add Allowed MCPs to Creating/Editing Organizations - [PR](https://github.com/BerriAI/litellm/pull/11893), [Get Started](../../docs/mcp#-mcp-permission-management)
    - Allow connecting to MCP with authentication headers - [PR](https://github.com/BerriAI/litellm/pull/11891), [Get Started](../../docs/mcp#using-your-mcp-with-client-side-credentials)
- **[Speech API](../../docs/speech)**
    - Working Gemini TTS support via OpenAI's `/v1/speech` endpoint (see the sketch after this list) - [PR](https://github.com/BerriAI/litellm/pull/11832)
- **[Passthrough Endpoints](../../docs/proxy/pass_through)**
    - Add support for subroutes for passthrough endpoints - [PR](https://github.com/BerriAI/litellm/pull/11827)
    - Support for setting custom cost per passthrough request - [PR](https://github.com/BerriAI/litellm/pull/11870)
    - Ensure "Request" is tracked for passthrough requests on LiteLLM Proxy - [PR](https://github.com/BerriAI/litellm/pull/11873)
    - Add V2 Passthrough endpoints on UI - [PR](https://github.com/BerriAI/litellm/pull/11905)
    - Move passthrough endpoints under Models + Endpoints in UI - [PR](https://github.com/BerriAI/litellm/pull/11871)
    - QA improvements for adding passthrough endpoints - [PR](https://github.com/BerriAI/litellm/pull/11909), [PR](https://github.com/BerriAI/litellm/pull/11939)
- **[Models API](../../docs/completion/model_alias)**
    - Allow `/models` to return correct models for custom wildcard prefixes - [PR](https://github.com/BerriAI/litellm/pull/11784)
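
As a sketch, calling a Gemini TTS model through the proxy's OpenAI-compatible speech route could look like the following. This assumes the standard OpenAI-style `/v1/audio/speech` path and a Gemini TTS deployment already configured on the proxy; the model alias, voice, and key are placeholders:

```bash title="gemini tts via speech endpoint" showLineNumbers
curl -X POST 'http://localhost:4000/v1/audio/speech' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-tts",
    "input": "Hello from LiteLLM",
    "voice": "alloy"
  }' \
  --output speech.mp3
```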

#### Bugs

- **[Messages API](../../docs/anthropic_unified)**
    - Fix `/v1/messages` endpoint always using us-central1 with vertex_ai-anthropic models - [PR](https://github.com/BerriAI/litellm/pull/11831)
    - Fix model_group tracking for `/v1/messages` and `/moderations` - [PR](https://github.com/BerriAI/litellm/pull/11933)
    - Fix cost tracking and logging via the `/v1/messages` API when using Claude Code - [PR](https://github.com/BerriAI/litellm/pull/11928)
- **[MCP Gateway](../../docs/mcp)**
    - Fix using MCPs defined on config.yaml - [PR](https://github.com/BerriAI/litellm/pull/11824)
- **[Chat Completion API](../../docs/completion/input)**
    - Allow dict for tool_choice argument in acompletion - [PR](https://github.com/BerriAI/litellm/pull/11860)
- **[Passthrough Endpoints](../../docs/pass_through/langfuse)**
    - Don't log Langfuse passthrough requests back to Langfuse - [PR](https://github.com/BerriAI/litellm/pull/11768)

---

## Spend Tracking

#### Features
- **[User Agent Tracking](../../docs/proxy/cost_tracking)**
    - Automatically track spend by user agent (allows cost tracking for Claude Code) - [PR](https://github.com/BerriAI/litellm/pull/11781)
    - Add user agent tags in spend logs payload - [PR](https://github.com/BerriAI/litellm/pull/11872)
- **[Tag Management](../../docs/proxy/cost_tracking)**
    - Support adding public model names in tag management - [PR](https://github.com/BerriAI/litellm/pull/11908)

---

## Management Endpoints / UI

#### Features
- **Test Key Page**
    - Allow testing `/v1/messages` on the Test Key Page - [PR](https://github.com/BerriAI/litellm/pull/11930)
- **[SSO](../../docs/proxy/sso)**
    - Allow passing additional headers - [PR](https://github.com/BerriAI/litellm/pull/11781)
- **[JWT Auth](../../docs/proxy/jwt_auth)**
    - Correctly return user email - [PR](https://github.com/BerriAI/litellm/pull/11783)
- **[Model Management](../../docs/proxy/model_management)**
    - Allow editing the model access group for an existing model - [PR](https://github.com/BerriAI/litellm/pull/11783)
- **[Team Management](../../docs/proxy/team_management)**
    - Allow setting a default team for new users - [PR](https://github.com/BerriAI/litellm/pull/11874), [PR](https://github.com/BerriAI/litellm/pull/11877)
    - Fix default team settings - [PR](https://github.com/BerriAI/litellm/pull/11887)
- **[SCIM](../../docs/proxy/scim)**
    - Add error handling for existing user on SCIM - [PR](https://github.com/BerriAI/litellm/pull/11862)
    - Add SCIM PATCH and PUT operations for users - [PR](https://github.com/BerriAI/litellm/pull/11863)
- **Health Check Dashboard**
    - Implement health check backend API and storage functionality - [PR](https://github.com/BerriAI/litellm/pull/11852)
    - Add LiteLLM_HealthCheckTable to database schema - [PR](https://github.com/BerriAI/litellm/pull/11677)
    - Implement health check frontend UI components and dashboard integration - [PR](https://github.com/BerriAI/litellm/pull/11679)
    - Add success modal for health check responses - [PR](https://github.com/BerriAI/litellm/pull/11899)
    - Fix clickable model ID in health check table - [PR](https://github.com/BerriAI/litellm/pull/11898)
    - Fix health check UI table design - [PR](https://github.com/BerriAI/litellm/pull/11897)

---

## Logging / Guardrails Integrations

#### Bugs
- **[Prometheus](../../docs/observability/prometheus)**
    - Fix bug for using prometheus metrics config - [PR](https://github.com/BerriAI/litellm/pull/11779)

---

## Security & Reliability

#### Security Fixes
- **[Documentation Security](../../docs)**
    - Security fixes for docs - [PR](https://github.com/BerriAI/litellm/pull/11776)
    - Add Trivy Security Scan for UI + Docs folder - remove all vulnerabilities - [PR](https://github.com/BerriAI/litellm/pull/11778)

#### Reliability Improvements
- **[Dependencies](../../docs)**
    - Fix aiohttp version requirement - [PR](https://github.com/BerriAI/litellm/pull/11777)
    - Bump next from 14.2.26 to 14.2.30 in UI dashboard - [PR](https://github.com/BerriAI/litellm/pull/11720)
- **[Networking](../../docs)**
    - Allow using CA Bundles - [PR](https://github.com/BerriAI/litellm/pull/11906)
    - Add workload identity federation between GCP and AWS - [PR](https://github.com/BerriAI/litellm/pull/10210)

---

## General Proxy Improvements

#### Features
- **[Deployment](../../docs/proxy/deploy)**
    - Add deployment annotations for Kubernetes - [PR](https://github.com/BerriAI/litellm/pull/11849)
    - Add ciphers in command and pass to hypercorn for proxy - [PR](https://github.com/BerriAI/litellm/pull/11916)
- **[Custom Root Path](../../docs/proxy/deploy)**
    - Fix loading UI on custom root path - [PR](https://github.com/BerriAI/litellm/pull/11912)
- **[SDK Improvements](../../docs/proxy/reliability)**
    - LiteLLM SDK / Proxy improvement (don't transform message client-side) - [PR](https://github.com/BerriAI/litellm/pull/11908)

#### Bugs
- **[Observability](../../docs/observability)**
    - Fix boto3 tracer wrapping for observability - [PR](https://github.com/BerriAI/litellm/pull/11869)

---

## New Contributors
* @kjoth made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11621)
* @shagunb-acn made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11760)
* @MadsRC made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11765)
* @Abiji-2020 made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11746)
* @salzubi401 made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11803)
* @orolega made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11826)
* @X4tar made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11796)
* @karen-veigas made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11858)
* @Shankyg made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11859)
* @pascallim made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/10210)
* @lgruen-vcgs made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11883)
* @rinormaloku made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11851)
* @InvisibleMan1306 made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11849)
* @ervwalter made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11937)
* @ThakeeNathees made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11880)
* @jnhyperion made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11842)
* @Jannchie made their first contribution in [PR](https://github.com/BerriAI/litellm/pull/11860)

---

## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/compare/v1.72.6-stable...v1.73.0.rc)
@@ -0,0 +1,271 @@
---
title: "v1.73.6-stable"
slug: "v1-73-6-stable"
date: 2025-06-28T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.73.6-stable.patch.1
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.73.6.post1
```

</TabItem>
</Tabs>

---

## Key Highlights

### Claude on gemini-cli

<Image img={require('../../img/release_notes/gemini_cli.png')} />

<br/>

This release brings support for using gemini-cli with LiteLLM.

You can use claude-sonnet-4, gemini-2.5-flash (Vertex AI & Google AI Studio), gpt-4.1, and any LiteLLM supported model on gemini-cli.

When you use gemini-cli with LiteLLM you get the following benefits:

**Developer Benefits:**
- Universal Model Access: Use any LiteLLM supported model (Anthropic, OpenAI, Vertex AI, Bedrock, etc.) through the gemini-cli interface.
- Higher Rate Limits & Reliability: Load balance across multiple models and providers to avoid hitting individual provider limits, with fallbacks to ensure you get responses even if one provider fails.

**Proxy Admin Benefits:**
- Centralized Management: Control access to all models through a single LiteLLM proxy instance, without giving your developers API keys for each provider.
- Budget Controls: Set spending limits and track costs across all gemini-cli usage.

[Get Started](../../docs/tutorials/litellm_gemini_cli)

<br/>

### Batch API Cost Tracking

<Image img={require('../../img/release_notes/batch_api_cost_tracking.jpg')}/>

<br/>

v1.73.6 brings cost tracking for [LiteLLM Managed Batch API](../../docs/proxy/managed_batches) calls to LiteLLM. Previously, this was not done for Batch API calls using LiteLLM Managed Files. Now, LiteLLM stores the status of each batch call in the DB and polls incomplete batch jobs in the background, emitting a spend log for cost tracking once the batch is complete.

There is no new flag / change needed on your end. Over the next few weeks, we hope to extend this to cover batch cost tracking for the Anthropic passthrough as well.

[Get Started](../../docs/proxy/managed_batches)

---

## New Models / Updated Models

### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- | ---- |
| Azure OpenAI | `azure/o3-pro` | 200k | $20.00 | $80.00 | New |
| OpenRouter | `openrouter/mistralai/mistral-small-3.2-24b-instruct` | 32k | $0.10 | $0.30 | New |
| OpenAI | `o3-deep-research` | 200k | $10.00 | $40.00 | New |
| OpenAI | `o3-deep-research-2025-06-26` | 200k | $10.00 | $40.00 | New |
| OpenAI | `o4-mini-deep-research` | 200k | $2.00 | $8.00 | New |
| OpenAI | `o4-mini-deep-research-2025-06-26` | 200k | $2.00 | $8.00 | New |
| Deepseek | `deepseek-r1` | 65k | $0.55 | $2.19 | New |
| Deepseek | `deepseek-v3` | 65k | $0.27 | $0.07 | New |

### Updated Models

#### Bugs
- **[Sambanova](../../docs/providers/sambanova)**
    - Handle float timestamps - [PR](https://github.com/BerriAI/litellm/pull/11971) s/o [@neubig](https://github.com/neubig)
- **[Azure](../../docs/providers/azure)**
    - Support Azure authentication methods (Azure AD token, API keys) on the Responses API - [PR](https://github.com/BerriAI/litellm/pull/11941) s/o [@hsuyuming](https://github.com/hsuyuming)
    - Map 'image_url' str as nested dict - [PR](https://github.com/BerriAI/litellm/pull/12075) s/o [@davis-featherstone](https://github.com/davis-featherstone)
- **[Watsonx](../../docs/providers/watsonx)**
    - Set 'model' field to None when the model is part of a custom deployment - fixes an error raised by Watsonx in those cases - [PR](https://github.com/BerriAI/litellm/pull/11854) s/o [@cbjuan](https://github.com/cbjuan)
- **[Perplexity](../../docs/providers/perplexity)**
    - Support web_search_options - [PR](https://github.com/BerriAI/litellm/pull/11983)
    - Support citation token and search queries cost calculation - [PR](https://github.com/BerriAI/litellm/pull/11938)
- **[Anthropic](../../docs/providers/anthropic)**
    - Handle null values in the usage block - [PR](https://github.com/BerriAI/litellm/pull/12068)
- **Gemini ([Google AI Studio](../../docs/providers/gemini) + [VertexAI](../../docs/providers/vertex))**
    - Only use accepted format values (enum and datetime) - else Gemini raises errors - [PR](https://github.com/BerriAI/litellm/pull/11989)
    - Cache tools if passed alongside cached content (else Gemini raises an error) - [PR](https://github.com/BerriAI/litellm/pull/11989)
    - JSON schema translation improvement: fix unpack_def handling of nested $ref inside anyof items - [PR](https://github.com/BerriAI/litellm/pull/11964)
- **[Mistral](../../docs/providers/mistral)**
    - Fix thinking prompt to match the Hugging Face recommendation - [PR](https://github.com/BerriAI/litellm/pull/12007)
    - Add `supports_response_schema: true` for all Mistral models except codestral-mamba - [PR](https://github.com/BerriAI/litellm/pull/12024)
- **[Ollama](../../docs/providers/ollama)**
    - Fix unnecessary await on embedding calls - [PR](https://github.com/BerriAI/litellm/pull/12024)

#### Features
- **[Azure OpenAI](../../docs/providers/azure)**
    - Check if o-series model supports reasoning effort (enables drop_params to work for o1 models)
    - Assistant + tool use cost tracking - [PR](https://github.com/BerriAI/litellm/pull/12045)
- **[Nvidia Nim](../../docs/providers/nvidia_nim)**
    - Add 'response_format' param support - [PR](https://github.com/BerriAI/litellm/pull/12003) @shagunb-acn
- **[ElevenLabs](../../docs/providers/elevenlabs)**
    - New STT provider - [PR](https://github.com/BerriAI/litellm/pull/12119)

---

## LLM API Endpoints

#### Features
- [**/mcp**](../../docs/mcp)
    - Send the appropriate auth string value to the `/tool/call` endpoint with `x-mcp-auth` (see the sketch after this list) - [PR](https://github.com/BerriAI/litellm/pull/11968) s/o [@wagnerjt](https://github.com/wagnerjt)
- [**/v1/messages**](../../docs/anthropic_unified)
    - [Custom LLM](../../docs/providers/custom_llm_server#anthropic-v1messages) support - [PR](https://github.com/BerriAI/litellm/pull/12016)
- [**/chat/completions**](../../docs/completion/input)
    - Azure Responses API via chat completion support - [PR](https://github.com/BerriAI/litellm/pull/12016)
- [**/responses**](../../docs/response_api)
    - Add reasoning content support for non-OpenAI providers - [PR](https://github.com/BerriAI/litellm/pull/12055)
- **[NEW] /generateContent**
    - New endpoints for gemini-cli support - [PR](https://github.com/BerriAI/litellm/pull/12040)
    - Support calling Google AI Studio / VertexAI Gemini models in their native format - [PR](https://github.com/BerriAI/litellm/pull/12046)
    - Add logging + cost tracking for stream + non-stream vertex/google ai studio routes - [PR](https://github.com/BerriAI/litellm/pull/12058)
    - Add Bridge from generateContent to /chat/completions - [PR](https://github.com/BerriAI/litellm/pull/12081)
- [**/batches**](../../docs/batches)
    - Filter deployments to only those where the managed file was written - [PR](https://github.com/BerriAI/litellm/pull/12048)
    - Save all model / file id mappings in the DB (previously only the first one was saved) - enables true load balancing - [PR](https://github.com/BerriAI/litellm/pull/12048)
    - Support List Batches with target model name specified - [PR](https://github.com/BerriAI/litellm/pull/12049)
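
For illustration, a sketch of passing upstream MCP credentials with the `x-mcp-auth` header, following the MCP client pattern shown in later releases - the URL and both keys are placeholders, and the exact header value format should be confirmed in the docs:

```bash title="MCP connection with x-mcp-auth" showLineNumbers
curl --location '<your-litellm-proxy-base-url>/mcp' \
  --header 'x-litellm-api-key: Bearer sk-1234' \
  --header 'x-mcp-auth: Bearer <your-upstream-mcp-api-key>'
```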

---

## Spend Tracking / Budget Improvements

#### Features
- [**Passthrough**](../../docs/pass_through)
    - [Bedrock](../../docs/pass_through/bedrock) - cost tracking (`/invoke` + `/converse` routes) on streaming + non-streaming - [PR](https://github.com/BerriAI/litellm/pull/12123)
    - [VertexAI](../../docs/pass_through/vertex_ai) - Anthropic cost calculation support - [PR](https://github.com/BerriAI/litellm/pull/11992)
- [**Batches**](../../docs/batches)
    - Background job for cost tracking of LiteLLM Managed batches - [PR](https://github.com/BerriAI/litellm/pull/12125)

---

## Management Endpoints / UI

#### Bugs
- **General UI**
    - Fix today selector date mutation in dashboard components - [PR](https://github.com/BerriAI/litellm/pull/12042)
- **Usage**
    - Aggregate usage data across all pages of paginated endpoint - [PR](https://github.com/BerriAI/litellm/pull/12033)
- **Teams**
    - De-duplicate models in team settings dropdown - [PR](https://github.com/BerriAI/litellm/pull/12074)
- **Models**
    - Preserve public model name when selecting 'test connect' with an Azure model (previously it would reset) - [PR](https://github.com/BerriAI/litellm/pull/11713)
- **Invitation Links**
    - Ensure invite link emails contain the correct invite id when using the Terraform provider - [PR](https://github.com/BerriAI/litellm/pull/12130)

#### Features
- **Models**
    - Add 'last success' column to health check table - [PR](https://github.com/BerriAI/litellm/pull/11903)
- **MCP**
    - New UI component to support auth types: api key, bearer token, basic auth - [PR](https://github.com/BerriAI/litellm/pull/11968) s/o [@wagnerjt](https://github.com/wagnerjt)
    - Ensure internal users can access /mcp and /mcp/ routes - [PR](https://github.com/BerriAI/litellm/pull/12106)
- **SCIM**
    - Ensure default_internal_user_params are applied for new users - [PR](https://github.com/BerriAI/litellm/pull/12015)
- **Team**
    - Support default key expiry for team member keys - [PR](https://github.com/BerriAI/litellm/pull/12023)
    - Expand team member add check to cover user email - [PR](https://github.com/BerriAI/litellm/pull/12082)
- **UI**
    - Restrict UI access by SSO group - [PR](https://github.com/BerriAI/litellm/pull/12023)
- **Keys**
    - Add new `new_key` param for regenerating a key (see the sketch after this list) - [PR](https://github.com/BerriAI/litellm/pull/12087)
- **Test Keys**
    - New 'get code' button for getting a runnable Python code snippet based on the UI configuration - [PR](https://github.com/BerriAI/litellm/pull/11629)
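
As a sketch, the regenerate flow with the new param might look like this. The route shape follows the existing key regenerate endpoint in the LiteLLM docs; the key values are placeholders, and the exact body contract for `new_key` should be confirmed against the PR:

```bash title="regenerate key with new_key" showLineNumbers
curl -X POST 'http://localhost:4000/key/sk-old-1234/regenerate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"new_key": "sk-my-chosen-key"}'
```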

---

## Logging / Guardrail Integrations

#### Bugs
- **Braintrust**
    - Add model to metadata to enable Braintrust cost estimation - [PR](https://github.com/BerriAI/litellm/pull/12022)

#### Features
- **Callbacks**
    - (Enterprise) - disable logging callbacks in request headers - [PR](https://github.com/BerriAI/litellm/pull/11985)
    - Add List Callbacks API Endpoint - [PR](https://github.com/BerriAI/litellm/pull/11987)
- **Bedrock Guardrail**
    - Don't raise an exception on the intervene action - [PR](https://github.com/BerriAI/litellm/pull/11875)
    - Ensure PII Masking is applied to streaming and non-streaming response content when using post call - [PR](https://github.com/BerriAI/litellm/pull/12086)
- **[NEW] Palo Alto Networks Prisma AIRS Guardrail**
    - [PR](https://github.com/BerriAI/litellm/pull/12116)
- **ElasticSearch**
    - New Elasticsearch Logging Tutorial - [PR](https://github.com/BerriAI/litellm/pull/11761)
- **Message Redaction**
    - Preserve usage / model information for Embedding redaction - [PR](https://github.com/BerriAI/litellm/pull/12088)

---

## Performance / Loadbalancing / Reliability improvements

#### Bugs
- **Team-only models**
    - Filter team-only models from routing logic for non-team calls
- **Context Window Exceeded error**
    - Catch anthropic exceptions - [PR](https://github.com/BerriAI/litellm/pull/12113)

#### Features
- **Router**
    - Allow using a dynamic cooldown time for a specific deployment - [PR](https://github.com/BerriAI/litellm/pull/12037)
    - Handle cooldown_time = 0 for deployments - [PR](https://github.com/BerriAI/litellm/pull/12108)
- **Redis**
    - Add better debugging to see which variables are set - [PR](https://github.com/BerriAI/litellm/pull/12073)

---

## General Proxy Improvements

#### Bugs
- **aiohttp**
    - Check HTTP_PROXY vars in networking requests
    - Allow using HTTP_PROXY settings with trust_env

#### Features
- **Docs**
    - Add recommended spec - [PR](https://github.com/BerriAI/litellm/pull/11980)
- **Swagger**
    - Introduce a new environment variable, NO_REDOC, to opt out of Redoc (see the sketch after this list) - [PR](https://github.com/BerriAI/litellm/pull/12092)
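
For example, a minimal way to start the proxy with Redoc disabled (the config path is a placeholder):

```bash title="disable Redoc" showLineNumbers
export NO_REDOC="True"
litellm --config /path/to/config.yaml
```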

---

## New Contributors
* @mukesh-dream11 made their first contribution in https://github.com/BerriAI/litellm/pull/11969
* @cbjuan made their first contribution in https://github.com/BerriAI/litellm/pull/11854
* @ryan-castner made their first contribution in https://github.com/BerriAI/litellm/pull/12055
* @davis-featherstone made their first contribution in https://github.com/BerriAI/litellm/pull/12075
* @Gum-Joe made their first contribution in https://github.com/BerriAI/litellm/pull/12068
* @jroberts2600 made their first contribution in https://github.com/BerriAI/litellm/pull/12116
* @ohmeow made their first contribution in https://github.com/BerriAI/litellm/pull/12022
* @amarrella made their first contribution in https://github.com/BerriAI/litellm/pull/11942
* @zhangyoufu made their first contribution in https://github.com/BerriAI/litellm/pull/12092
* @bougou made their first contribution in https://github.com/BerriAI/litellm/pull/12088
* @codeugar made their first contribution in https://github.com/BerriAI/litellm/pull/11972
* @glgh made their first contribution in https://github.com/BerriAI/litellm/pull/12133

## **[Git Diff](https://github.com/BerriAI/litellm/compare/v1.73.0-stable...v1.73.6.rc-draft)**
@@ -0,0 +1,375 @@
---
title: "v1.74.0-stable"
slug: "v1-74-0-stable"
date: 2025-07-05T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.0-stable
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.74.0.post2
```

</TabItem>
</Tabs>

---

## Key Highlights

- **MCP Gateway Namespaced Servers** - Clients connecting to LiteLLM can now specify which MCP servers to use.
- **Key/Team Based Logging on UI** - Proxy Admins can configure team or key-based logging settings directly in the UI.
- **Azure Content Safety Guardrails** - Added support for prompt injection and text moderation with Azure Content Safety Guardrails.
- **VertexAI DeepSeek Models** - Support for calling VertexAI DeepSeek models with LiteLLM's `/chat/completions` or `/responses` API.
- **GitHub Copilot API** - You can now use GitHub Copilot as an LLM API provider.

### MCP Gateway: Namespaced MCP Servers

This release brings support for namespacing MCP servers on the LiteLLM MCP Gateway. This means you can set the `x-mcp-servers` header to specify which servers to list tools from.

This is useful when you want to point MCP clients at specific MCP servers on LiteLLM.

#### Usage

<Tabs>
<TabItem value="openai" label="OpenAI API">

```bash title="cURL Example with Server Segregation" showLineNumbers
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
    "model": "gpt-4o",
    "tools": [
        {
            "type": "mcp",
            "server_label": "litellm",
            "server_url": "<your-litellm-proxy-base-url>/mcp",
            "require_approval": "never",
            "headers": {
                "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
                "x-mcp-servers": "Zapier_Gmail"
            }
        }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
}'
```

In this example, the request will only have access to tools from the "Zapier_Gmail" MCP server.

</TabItem>

<TabItem value="litellm" label="LiteLLM Proxy">

```bash title="cURL Example with Server Segregation" showLineNumbers
curl --location '<your-litellm-proxy-base-url>/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $LITELLM_API_KEY" \
--data '{
    "model": "gpt-4o",
    "tools": [
        {
            "type": "mcp",
            "server_label": "litellm",
            "server_url": "<your-litellm-proxy-base-url>/mcp",
            "require_approval": "never",
            "headers": {
                "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
                "x-mcp-servers": "Zapier_Gmail,Server2"
            }
        }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
}'
```

This configuration restricts the request to only use tools from the specified MCP servers.

</TabItem>

<TabItem value="cursor" label="Cursor IDE">

```json title="Cursor MCP Configuration with Server Segregation" showLineNumbers
{
  "mcpServers": {
    "LiteLLM": {
      "url": "<your-litellm-proxy-base-url>/mcp",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY",
        "x-mcp-servers": "Zapier_Gmail,Server2"
      }
    }
  }
}
```

This configuration in Cursor IDE settings will limit tool access to only the specified MCP servers.

</TabItem>
</Tabs>

### Team / Key Based Logging on UI

<Image
  img={require('../../img/release_notes/team_key_logging.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br />

This release brings support for Proxy Admins to configure Team/Key Based Logging Settings on the UI. This allows routing LLM request/response logs to different Langfuse/Arize projects based on the team or key.

For developers using LiteLLM, their logs are automatically routed to their specific Arize/Langfuse projects. As of this release, we support the following integrations for key/team based logging:

- `langfuse`
- `arize`
- `langsmith`

### Azure Content Safety Guardrails

<Image
  img={require('../../img/azure_content_safety_guardrails.jpg')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br />

LiteLLM now supports **Azure Content Safety Guardrails** for Prompt Injection and Text Moderation. This is **great for internal chat-ui** use cases, as you can now create guardrails with detection for Azure's Harm Categories, specify custom severity thresholds, and run them across 100+ LLMs for just that use case (or across all your calls). A request-time sketch is shown below.
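
For illustration, once an Azure Content Safety guardrail is configured on the proxy, it can be applied per-request via the standard `guardrails` request field - the guardrail name here is a placeholder for whatever you named it in your config:

```bash title="apply guardrail per request" showLineNumbers
curl -X POST 'http://localhost:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hi, what can you do?"}],
    "guardrails": ["azure-prompt-injection"]
  }'
```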

[Get Started](../../docs/proxy/guardrails/azure_content_guardrail)

### Python SDK: 2.3 Second Faster Import Times

This release brings significant performance improvements to the Python SDK: imports are now 2.3 seconds faster. We've refactored the initialization process to reduce startup overhead - a major improvement for applications that need to initialize LiteLLM quickly.

---

## New Models / Updated Models

#### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- | ---- |
| Watsonx | `watsonx/mistralai/mistral-large` | 131k | $3.00 | $10.00 | New |
| Azure AI | `azure_ai/cohere-rerank-v3.5` | 4k | $2.00/1k queries | - | New (Rerank) |

#### Features
- **[🆕 GitHub Copilot](../../docs/providers/github_copilot)** - Use the GitHub Copilot API with LiteLLM - [PR](https://github.com/BerriAI/litellm/pull/12325), [Get Started](../../docs/providers/github_copilot)
- **[🆕 VertexAI DeepSeek](../../docs/providers/vertex)** - Add support for VertexAI DeepSeek models - [PR](https://github.com/BerriAI/litellm/pull/12312), [Get Started](../../docs/providers/vertex_partner#vertexai-deepseek)
- **[Azure AI](../../docs/providers/azure_ai)**
    - Add azure_ai cohere rerank v3.5 - [PR](https://github.com/BerriAI/litellm/pull/12283), [Get Started](../../docs/providers/azure_ai#rerank-endpoint)
- **[Vertex AI](../../docs/providers/vertex)**
    - Add size parameter support for image generation - [PR](https://github.com/BerriAI/litellm/pull/12292), [Get Started](../../docs/providers/vertex_image)
- **[Custom LLM](../../docs/providers/custom_llm_server)**
    - Pass through extra_ properties on the "custom" llm provider - [PR](https://github.com/BerriAI/litellm/pull/12185)

#### Bugs
- **[Mistral](../../docs/providers/mistral)**
    - Fix transform_response handling for empty string content - [PR](https://github.com/BerriAI/litellm/pull/12202)
    - Move Mistral to use llm_http_handler - [PR](https://github.com/BerriAI/litellm/pull/12245)
- **[Gemini](../../docs/providers/gemini)**
    - Fix tool call sequence - [PR](https://github.com/BerriAI/litellm/pull/11999)
    - Fix custom api_base path preservation - [PR](https://github.com/BerriAI/litellm/pull/12215)
- **[Anthropic](../../docs/providers/anthropic)**
    - Fix user_id validation logic - [PR](https://github.com/BerriAI/litellm/pull/11432)
- **[Bedrock](../../docs/providers/bedrock)**
    - Support optional args for bedrock - [PR](https://github.com/BerriAI/litellm/pull/12287)
- **[Ollama](../../docs/providers/ollama)**
    - Fix default parameters for ollama-chat - [PR](https://github.com/BerriAI/litellm/pull/12201)
- **[VLLM](../../docs/providers/vllm)**
    - Add 'audio_url' message type support - [PR](https://github.com/BerriAI/litellm/pull/12270)

---

## LLM API Endpoints

#### Features

- **[/batches](../../docs/batches)**
    - Support batch retrieve with target model query param - [PR](https://github.com/BerriAI/litellm/pull/12228)
    - Anthropic completion bridge improvements - [PR](https://github.com/BerriAI/litellm/pull/12228)
- **[/responses](../../docs/response_api)**
    - Azure Responses API bridge improvements - [PR](https://github.com/BerriAI/litellm/pull/12224)
    - Fix Responses API error handling - [PR](https://github.com/BerriAI/litellm/pull/12225)
- **[/mcp (MCP Gateway)](../../docs/mcp)**
    - Add MCP url masking on frontend - [PR](https://github.com/BerriAI/litellm/pull/12247)
    - Add MCP servers header to scope - [PR](https://github.com/BerriAI/litellm/pull/12266)
    - LiteLLM MCP tool prefix - [PR](https://github.com/BerriAI/litellm/pull/12289)
    - Segregate MCP tools on connections using headers - [PR](https://github.com/BerriAI/litellm/pull/12296)
    - Changes to MCP url wrapping - [PR](https://github.com/BerriAI/litellm/pull/12207)

#### Bugs
- **[/v1/messages](../../docs/anthropic_unified)**
    - Remove hardcoded model name on streaming - [PR](https://github.com/BerriAI/litellm/pull/12131)
    - Support lowest latency routing - [PR](https://github.com/BerriAI/litellm/pull/12180)
    - Return token usage for non-Anthropic models - [PR](https://github.com/BerriAI/litellm/pull/12184)
- **[/chat/completions](../../docs/providers/anthropic_unified)**
    - Support Cursor IDE tool_choice format `{"type": "auto"}` (see the sketch after this list) - [PR](https://github.com/BerriAI/litellm/pull/12168)
- **[/generateContent](../../docs/generate_content)**
    - Allow passing litellm_params - [PR](https://github.com/BerriAI/litellm/pull/12177)
    - Only pass supported params when using OpenAI models - [PR](https://github.com/BerriAI/litellm/pull/12297)
    - Fix using gemini-cli with Vertex Anthropic Models - [PR](https://github.com/BerriAI/litellm/pull/12246)
- **Streaming**
    - Fix Error code: 307 for LlamaAPI Streaming Chat - [PR](https://github.com/BerriAI/litellm/pull/11946)
    - Store finish reason even if is_finished - [PR](https://github.com/BerriAI/litellm/pull/12250)
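
For reference, the Cursor-style `tool_choice` dict now passes through `/chat/completions` unchanged. A minimal sketch - the tool definition and key are placeholders:

```bash title="cursor tool_choice format" showLineNumbers
curl -X POST 'http://localhost:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in SF?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}}
      }
    }],
    "tool_choice": {"type": "auto"}
  }'
```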

---

## Spend Tracking / Budget Improvements

#### Bugs
- Fix calculate cost to allow strings - [PR](https://github.com/BerriAI/litellm/pull/12200)
- VertexAI Anthropic streaming cost tracking with prompt caching fixes - [PR](https://github.com/BerriAI/litellm/pull/12188)

---

## Management Endpoints / UI

#### Bugs
- **Team Management**
    - Prevent team model reset on model add - [PR](https://github.com/BerriAI/litellm/pull/12144)
    - Return team-only models on /v2/model/info - [PR](https://github.com/BerriAI/litellm/pull/12144)
    - Render team member budget correctly - [PR](https://github.com/BerriAI/litellm/pull/12144)
- **UI Rendering**
    - Fix rendering UI on non-root images - [PR](https://github.com/BerriAI/litellm/pull/12226)
    - Correctly display 'Internal Viewer' user role - [PR](https://github.com/BerriAI/litellm/pull/12284)
- **Configuration**
    - Handle empty config.yaml - [PR](https://github.com/BerriAI/litellm/pull/12189)
    - Fix gemini /models - replace models/ as expected - [PR](https://github.com/BerriAI/litellm/pull/12189)

#### Features
- **Team Management**
    - Allow adding team-specific logging callbacks - [PR](https://github.com/BerriAI/litellm/pull/12261)
    - Add Arize Team Based Logging - [PR](https://github.com/BerriAI/litellm/pull/12264)
    - Allow Viewing/Editing Team Based Callbacks - [PR](https://github.com/BerriAI/litellm/pull/12265)
- **UI Improvements**
    - Comma-separated spend and budget display - [PR](https://github.com/BerriAI/litellm/pull/12317)
    - Add logos to callback list - [PR](https://github.com/BerriAI/litellm/pull/12244)
- **CLI**
    - Add litellm-proxy cli login for starting to use litellm proxy - [PR](https://github.com/BerriAI/litellm/pull/12216)
- **Email Templates**
    - Customizable email template - subject and signature - [PR](https://github.com/BerriAI/litellm/pull/12218)

---

## Logging / Guardrail Integrations

#### Features
- **Guardrails**
    - All guardrails are now supported on the UI - [PR](https://github.com/BerriAI/litellm/pull/12349)
- **[Azure Content Safety](../../docs/guardrails/azure_content_safety)**
    - Add Azure Content Safety Guardrails to the LiteLLM proxy - [PR](https://github.com/BerriAI/litellm/pull/12268)
    - Add Azure Content Safety Guardrails to the UI - [PR](https://github.com/BerriAI/litellm/pull/12309)
- **[DeepEval](../../docs/observability/deepeval_integration)**
    - Fix DeepEval logging format for failure events - [PR](https://github.com/BerriAI/litellm/pull/12303)
- **[Arize](../../docs/proxy/logging#arize)**
    - Add Arize Team Based Logging - [PR](https://github.com/BerriAI/litellm/pull/12264)
- **[Langfuse](../../docs/proxy/logging#langfuse)**
    - Langfuse prompt_version support - [PR](https://github.com/BerriAI/litellm/pull/12301)
- **[Sentry Integration](../../docs/observability/sentry)**
    - Add Sentry scrubbing - [PR](https://github.com/BerriAI/litellm/pull/12210)
- **[AWS SQS Logging](../../docs/proxy/logging#aws-sqs)**
    - New AWS SQS Logging Integration - [PR](https://github.com/BerriAI/litellm/pull/12176)
- **[S3 Logger](../../docs/proxy/logging#s3-buckets)**
    - Add failure logging support - [PR](https://github.com/BerriAI/litellm/pull/12299)
- **[Prometheus Metrics](../../docs/proxy/prometheus)**
    - Add better error validation for Prometheus metrics and labels - [PR](https://github.com/BerriAI/litellm/pull/12182)

#### Bugs
- **Security**
    - Ensure only LLM API route failures get logged on Langfuse - [PR](https://github.com/BerriAI/litellm/pull/12308)
- **OpenMeter**
    - Integration error handling fix - [PR](https://github.com/BerriAI/litellm/pull/12147)
- **Message Redaction**
    - Ensure message redaction works for Responses API logging - [PR](https://github.com/BerriAI/litellm/pull/12291)
- **Bedrock Guardrails**
    - Fix Bedrock guardrails post_call for streaming responses - [PR](https://github.com/BerriAI/litellm/pull/12252)

---

## Performance / Loadbalancing / Reliability improvements

#### Features
- **Python SDK**
    - 2 second faster import times - [PR](https://github.com/BerriAI/litellm/pull/12135)
    - Reduce Python SDK import time by 0.3s - [PR](https://github.com/BerriAI/litellm/pull/12140)
- **Error Handling**
    - Add error handling for MCP tools not found or an invalid server - [PR](https://github.com/BerriAI/litellm/pull/12223)
- **SSL/TLS**
    - Fix SSL certificate error - [PR](https://github.com/BerriAI/litellm/pull/12327)
    - Fix custom CA bundle support in aiohttp transport - [PR](https://github.com/BerriAI/litellm/pull/12281)

---

## General Proxy Improvements

- **Startup**
    - Add new banner on startup - [PR](https://github.com/BerriAI/litellm/pull/12328)
- **Dependencies**
    - Update pydantic version - [PR](https://github.com/BerriAI/litellm/pull/12213)

---

## New Contributors
* @wildcard made their first contribution in https://github.com/BerriAI/litellm/pull/12157
* @colesmcintosh made their first contribution in https://github.com/BerriAI/litellm/pull/12168
* @seyeong-han made their first contribution in https://github.com/BerriAI/litellm/pull/11946
* @dinggh made their first contribution in https://github.com/BerriAI/litellm/pull/12162
* @raz-alon made their first contribution in https://github.com/BerriAI/litellm/pull/11432
* @tofarr made their first contribution in https://github.com/BerriAI/litellm/pull/12200
* @szafranek made their first contribution in https://github.com/BerriAI/litellm/pull/12179
* @SamBoyd made their first contribution in https://github.com/BerriAI/litellm/pull/12147
* @lizzij made their first contribution in https://github.com/BerriAI/litellm/pull/12219
* @cipri-tom made their first contribution in https://github.com/BerriAI/litellm/pull/12201
* @zsimjee made their first contribution in https://github.com/BerriAI/litellm/pull/12185
* @jroberts2600 made their first contribution in https://github.com/BerriAI/litellm/pull/12175
* @njbrake made their first contribution in https://github.com/BerriAI/litellm/pull/12202
* @NANDINI-star made their first contribution in https://github.com/BerriAI/litellm/pull/12244
* @utsumi-fj made their first contribution in https://github.com/BerriAI/litellm/pull/12230
* @dcieslak19973 made their first contribution in https://github.com/BerriAI/litellm/pull/12283
* @hanouticelina made their first contribution in https://github.com/BerriAI/litellm/pull/12286
* @lowjiansheng made their first contribution in https://github.com/BerriAI/litellm/pull/11999
* @JoostvDoorn made their first contribution in https://github.com/BerriAI/litellm/pull/12281
* @takashiishida made their first contribution in https://github.com/BerriAI/litellm/pull/12239

## **[Git Diff](https://github.com/BerriAI/litellm/compare/v1.73.6-stable...v1.74.0-stable)**
@@ -0,0 +1,291 @@
---
title: "v1.74.15-stable"
slug: "v1-74-15"
date: 2025-08-02T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:v1.74.15-stable
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.74.15.post2
```

</TabItem>
</Tabs>

---

## Key Highlights

- **User Agent Activity Tracking** - Track how much usage each coding tool gets.
- **Prompt Management** - Use GitOps-style prompt management with prompt templates.
- **MCP Gateway: Guardrails** - Support for using guardrails with MCP servers.
- **Google AI Studio Imagen4** - Support for using Imagen4 models on Google AI Studio.

---

## User Agent Activity Tracking

<Image
  img={require('../../img/agent_1.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br/>

This release brings support for tracking usage and costs of AI-powered coding tools like Claude Code, Roo Code, and Gemini CLI through LiteLLM. You can now track LLM cost, total tokens used, and DAU/WAU/MAU for each coding tool.

This is great for central AI platform teams looking to track how they are helping developer productivity.

[Read More](https://docs.litellm.ai/docs/tutorials/cost_tracking_coding)

---

## Prompt Management

Use GitOps-style prompt management with prompt templates on LiteLLM.

[Read More](../../docs/proxy/prompt_management)

---

## New Models / Updated Models

#### New Model Support

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Cost per Image |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- | -------------- |
| OpenRouter | `openrouter/x-ai/grok-4` | 256k | $3 | $15 | N/A |
| Google AI Studio | `gemini/imagen-4.0-generate-preview-06-06` | N/A | N/A | N/A | $0.04 |
| Google AI Studio | `gemini/imagen-4.0-ultra-generate-preview-06-06` | N/A | N/A | N/A | $0.06 |
| Google AI Studio | `gemini/imagen-4.0-fast-generate-preview-06-06` | N/A | N/A | N/A | $0.02 |
| Google AI Studio | `gemini/imagen-3.0-generate-002` | N/A | N/A | N/A | $0.04 |
| Google AI Studio | `gemini/imagen-3.0-generate-001` | N/A | N/A | N/A | $0.04 |
| Google AI Studio | `gemini/imagen-3.0-fast-generate-001` | N/A | N/A | N/A | $0.02 |

#### Features

- **[Google AI Studio](../../docs/providers/gemini)**
    - Added Google AI Studio Imagen4 model family support - [PR #13065](https://github.com/BerriAI/litellm/pull/13065), [Get Started](../../docs/providers/google_ai_studio/image_gen)
- **[Azure OpenAI](../../docs/providers/azure/azure)**
    - Azure `api_version="preview"` support - [PR #13072](https://github.com/BerriAI/litellm/pull/13072), [Get Started](../../docs/providers/azure/azure#setting-api-version)
    - Password-protected certificate file support - [PR #12995](https://github.com/BerriAI/litellm/pull/12995), [Get Started](../../docs/providers/azure/azure#authentication)
- **[AWS Bedrock](../../docs/providers/bedrock)**
    - Cost tracking via Anthropic `/v1/messages` - [PR #13072](https://github.com/BerriAI/litellm/pull/13072)
    - Computer use support - [PR #13150](https://github.com/BerriAI/litellm/pull/13150)
- **[OpenRouter](../../docs/providers/openrouter)**
    - Added Grok4 model support - [PR #13018](https://github.com/BerriAI/litellm/pull/13018)
- **[Anthropic](../../docs/providers/anthropic)**
    - Auto cache control injection - improved `cache_control_injection_points` with negative index support - [PR #13187](https://github.com/BerriAI/litellm/pull/13187), [Get Started](../../docs/tutorials/prompt_caching)
    - Working mid-stream fallbacks with token usage tracking - [PR #13149](https://github.com/BerriAI/litellm/pull/13149), [PR #13170](https://github.com/BerriAI/litellm/pull/13170)
- **[Perplexity](../../docs/providers/perplexity)**
    - Citation annotations support - [PR #13225](https://github.com/BerriAI/litellm/pull/13225)

#### Bugs

- **[Gemini](../../docs/providers/gemini)**
    - Fix `merge_reasoning_content_in_choices` parameter issue - [PR #13066](https://github.com/BerriAI/litellm/pull/13066), [Get Started](../../docs/tutorials/openweb_ui#render-thinking-content-on-open-webui)
    - Added support for using the `GOOGLE_API_KEY` environment variable for Google AI Studio - [PR #12507](https://github.com/BerriAI/litellm/pull/12507)
- **[vLLM/OpenAI-like](../../docs/providers/vllm)**
    - Fix missing `extra_headers` support for embeddings - [PR #13198](https://github.com/BerriAI/litellm/pull/13198)

---

## LLM API Endpoints

#### Bugs

- **[/generateContent](../../docs/generateContent)**
    - Support for query_params in generateContent routes for API key setting - [PR #13100](https://github.com/BerriAI/litellm/pull/13100)
    - Ensure the `x-goog-api-key` header is used for auth to Google AI Studio when using /generateContent on LiteLLM - [PR #13098](https://github.com/BerriAI/litellm/pull/13098)
    - Ensure tool calling works as expected on generateContent - [PR #13189](https://github.com/BerriAI/litellm/pull/13189)
- **[/vertex_ai (Passthrough)](../../docs/pass_through/vertex_ai)**
    - Ensure multimodal embedding responses are logged properly - [PR #13050](https://github.com/BerriAI/litellm/pull/13050)

---

## [MCP Gateway](../../docs/mcp)

#### Features

- **Health Check Improvements**
    - Add health check endpoints for MCP servers - [PR #13106](https://github.com/BerriAI/litellm/pull/13106)
- **Guardrails Integration**
    - Add pre and during call hooks initialization - [PR #13067](https://github.com/BerriAI/litellm/pull/13067)
    - Move pre and during hooks to ProxyLogging - [PR #13109](https://github.com/BerriAI/litellm/pull/13109)
    - MCP pre and during guardrails implementation - [PR #13188](https://github.com/BerriAI/litellm/pull/13188)
- **Protocol & Header Support**
    - Add protocol headers support - [PR #13062](https://github.com/BerriAI/litellm/pull/13062)
- **URL & Namespacing**
    - Improve MCP server URL validation for internal/Kubernetes URLs - [PR #13099](https://github.com/BerriAI/litellm/pull/13099)

#### Bugs

- **UI**
    - Fix scrolling issue with MCP tools - [PR #13015](https://github.com/BerriAI/litellm/pull/13015)
    - Fix MCP client list failure - [PR #13114](https://github.com/BerriAI/litellm/pull/13114)

[Read More](../../docs/mcp)

---

## Management Endpoints / UI

#### Features

- **Usage Analytics**
    - New tab for user agent activity tracking - [PR #13146](https://github.com/BerriAI/litellm/pull/13146)
    - Daily usage per user analytics - [PR #13147](https://github.com/BerriAI/litellm/pull/13147)
    - Default usage chart date range set to last 7 days - [PR #12917](https://github.com/BerriAI/litellm/pull/12917)
    - New advanced date range picker component - [PR #13141](https://github.com/BerriAI/litellm/pull/13141), [PR #13221](https://github.com/BerriAI/litellm/pull/13221)
    - Show loader on usage cost charts after date selection - [PR #13113](https://github.com/BerriAI/litellm/pull/13113)
- **Models**
    - Added Voyage, Jina AI, Deepinfra and VolcEngine providers on UI - [PR #13131](https://github.com/BerriAI/litellm/pull/13131)
    - Added SageMaker on UI - [PR #13117](https://github.com/BerriAI/litellm/pull/13117)
    - Preserve model order in `/v1/models` and `/model_group/info` endpoints - [PR #13178](https://github.com/BerriAI/litellm/pull/13178)
- **Key Management**
    - Properly parse JSON options for key generation in UI - [PR #12989](https://github.com/BerriAI/litellm/pull/12989)
- **Authentication**
    - **JWT Fields**
        - Add dot notation support for all JWT fields - [PR #13013](https://github.com/BerriAI/litellm/pull/13013)

#### Bugs

- **Permissions**
    - Fix object permission for organizations - [PR #13142](https://github.com/BerriAI/litellm/pull/13142)
    - Fix list team v2 security check - [PR #13094](https://github.com/BerriAI/litellm/pull/13094)
- **Models**
    - Fix model reload on model update - [PR #13216](https://github.com/BerriAI/litellm/pull/13216)
- **Router Settings**
    - Fix displaying models for fallbacks in UI - [PR #13191](https://github.com/BerriAI/litellm/pull/13191)
    - Fix wildcard model name handling with custom values - [PR #13116](https://github.com/BerriAI/litellm/pull/13116)
    - Fix fallback delete functionality - [PR #12606](https://github.com/BerriAI/litellm/pull/12606)

---

## Logging / Guardrail Integrations

#### Features

- **[MLflow](../../docs/proxy/logging#mlflow)**
    - Allow adding tags for MLflow logging requests - [PR #13108](https://github.com/BerriAI/litellm/pull/13108)
- **[Langfuse OTEL](../../docs/proxy/logging#langfuse)**
    - Add comprehensive metadata support to the Langfuse OpenTelemetry integration - [PR #12956](https://github.com/BerriAI/litellm/pull/12956)
- **[Datadog LLM Observability](../../docs/proxy/logging#datadog)**
    - Allow redacting message/response content for specific logging integrations - [PR #13158](https://github.com/BerriAI/litellm/pull/13158)

#### Bugs

- **API Key Logging**
    - Fix API key being logged inappropriately - [PR #12978](https://github.com/BerriAI/litellm/pull/12978)
- **MCP Spend Tracking**
    - Set default value for MCP namespace tool name in spend table - [PR #12894](https://github.com/BerriAI/litellm/pull/12894)

---

## Performance / Loadbalancing / Reliability improvements

#### Features

- **Background Health Checks**
    - Allow disabling background health checks for specific deployments - [PR #13186](https://github.com/BerriAI/litellm/pull/13186)
- **Database Connection Management**
    - Ensure stale Prisma clients disconnect DB connections properly - [PR #13140](https://github.com/BerriAI/litellm/pull/13140)
- **Jitter Improvements**
    - Fix jitter calculation (it should be added, not multiplied) - [PR #12901](https://github.com/BerriAI/litellm/pull/12901)

#### Bugs

- **Anthropic Streaming**
    - Always use choice index=0 for Anthropic streaming responses - [PR #12666](https://github.com/BerriAI/litellm/pull/12666)
- **Custom Auth**
    - Bubble up custom exceptions properly - [PR #13093](https://github.com/BerriAI/litellm/pull/13093)
- **OTEL with Managed Files**
    - Fix using managed files with the OTEL integration - [PR #13171](https://github.com/BerriAI/litellm/pull/13171)

---

## General Proxy Improvements

#### Features

- **Database Migration**
    - Move to use_prisma_migrate by default - [PR #13117](https://github.com/BerriAI/litellm/pull/13117)
    - Resolve team-only models on auth checks - [PR #13117](https://github.com/BerriAI/litellm/pull/13117)
- **Infrastructure**
    - Loosened MCP Python version restrictions - [PR #13102](https://github.com/BerriAI/litellm/pull/13102)
    - Migrate build_and_test to CI/CD Postgres DB - [PR #13166](https://github.com/BerriAI/litellm/pull/13166)
- **Helm Charts**
    - Allow Helm hooks for migration jobs - [PR #13174](https://github.com/BerriAI/litellm/pull/13174)
    - Fix Helm migration job schema updates - [PR #12809](https://github.com/BerriAI/litellm/pull/12809)

#### Bugs

- **Docker**
    - Remove obsolete `version` attribute in docker-compose - [PR #13172](https://github.com/BerriAI/litellm/pull/13172)
    - Add openssl in runtime stage for non-root Dockerfile - [PR #13168](https://github.com/BerriAI/litellm/pull/13168)
- **Database Configuration**
    - Fix DB config through environment variables - [PR #13111](https://github.com/BerriAI/litellm/pull/13111)
- **Logging**
    - Suppress httpx logging - [PR #13217](https://github.com/BerriAI/litellm/pull/13217)
- **Token Counting**
    - Ignore unsupported keys like `prefix` in token counter - [PR #11954](https://github.com/BerriAI/litellm/pull/11954)

---

## New Contributors

* @5731la made their first contribution in https://github.com/BerriAI/litellm/pull/12989
* @restato made their first contribution in https://github.com/BerriAI/litellm/pull/12980
* @strickvl made their first contribution in https://github.com/BerriAI/litellm/pull/12956
* @Ne0-1 made their first contribution in https://github.com/BerriAI/litellm/pull/12995
* @maxrabin made their first contribution in https://github.com/BerriAI/litellm/pull/13079
* @lvuna made their first contribution in https://github.com/BerriAI/litellm/pull/12894
* @Maximgitman made their first contribution in https://github.com/BerriAI/litellm/pull/12666
* @pathikrit made their first contribution in https://github.com/BerriAI/litellm/pull/12901
* @huetterma made their first contribution in https://github.com/BerriAI/litellm/pull/12809
* @betterthanbreakfast made their first contribution in https://github.com/BerriAI/litellm/pull/13029
* @phosae made their first contribution in https://github.com/BerriAI/litellm/pull/12606
* @sahusiddharth made their first contribution in https://github.com/BerriAI/litellm/pull/12507
* @Amit-kr26 made their first contribution in https://github.com/BerriAI/litellm/pull/11954
* @kowyo made their first contribution in https://github.com/BerriAI/litellm/pull/13172
* @AnandKhinvasara made their first contribution in https://github.com/BerriAI/litellm/pull/13187
* @unique-jakub made their first contribution in https://github.com/BerriAI/litellm/pull/13174
* @tyumentsev4 made their first contribution in https://github.com/BerriAI/litellm/pull/13134
* @aayush-malviya-acquia made their first contribution in https://github.com/BerriAI/litellm/pull/12978
* @kankute-sameer made their first contribution in https://github.com/BerriAI/litellm/pull/13225
* @AlexanderYastrebov made their first contribution in https://github.com/BerriAI/litellm/pull/13178

## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.74.9-stable...v1.74.15.rc)**
@@ -0,0 +1,323 @@
---
title: "v1.74.3-stable"
slug: "v1-74-3-stable"
date: 2025-07-12T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:v1.74.3-stable
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.74.3.post1
```

</TabItem>
</Tabs>

---

## Key Highlights

- **MCP: Model Access Groups** - Add MCP servers to access groups, for easily managing access across users and teams.
- **MCP: Tool Cost Tracking** - Set prices for each MCP tool.
- **Model Hub v2** - New OSS Model Hub for showing developers which models are available on the proxy.
- **Bytez** - New LLM API provider.
- **Dashscope API** - Call Alibaba's Qwen models via the new Dashscope API provider.

---

## MCP Gateway: Model Access Groups

<Image
  img={require('../../img/release_notes/mcp_access_groups.png')}
  style={{width: '80%', display: 'block', margin: '0'}}
/>

<br/>

v1.74.3-stable adds support for adding MCP servers to access groups, making it **easier for Proxy Admins** to manage access to MCP servers across users and teams.

For **developers**, this means you can now connect to multiple MCP servers by passing the access group name in the `x-mcp-servers` header.
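
As a rough sketch of the admin side, access groups could be assigned where MCP servers are defined in the proxy's `config.yaml`. The server names, URLs, and the `access_groups` key below are illustrative assumptions, not taken from this release - see the link below for the exact schema:

```yaml
# Hypothetical sketch - server names, URLs, and keys are assumed for illustration
mcp_servers:
  github_mcp:
    url: "https://example.com/github/mcp"
    access_groups: ["dev-tools"]   # keys/teams with access to "dev-tools" can use this server
  deepwiki_mcp:
    url: "https://example.com/deepwiki/mcp"
    access_groups: ["dev-tools"]
```

A developer could then send `x-mcp-servers: dev-tools` on a request to connect to every server in the group.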

Read more [here](https://docs.litellm.ai/docs/mcp#grouping-mcps-access-groups)

---

## MCP Gateway: Tool Cost Tracking

<Image
  img={require('../../img/release_notes/mcp_tool_cost_tracking.png')}
  style={{width: '80%', display: 'block', margin: '0'}}
/>

<br/>

This release adds cost tracking for MCP tool calls. This is great for **Proxy Admins** giving MCP access to developers, as you can now attribute MCP tool call costs to specific LiteLLM keys and teams.

You can set (see the sketch below):
- **Uniform server cost**: Set a uniform cost for all tools from a server.
- **Individual tool cost**: Define individual costs for specific tools (e.g., search_tool costs $10, get_weather costs $5).
- **Dynamic costs**: For use cases where you want to set costs based on the MCP's response, you can write a custom post-MCP-call hook to parse responses and set costs dynamically.
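
As a rough sketch, the uniform and per-tool costs might be configured alongside the MCP server definition in `config.yaml`. The field names below are assumptions for illustration; the linked docs have the exact schema:

```yaml
# Hypothetical sketch - field names are assumed, check the docs for the real schema
mcp_servers:
  search_server:
    url: "https://example.com/search/mcp"
    mcp_info:
      mcp_server_cost_info:
        default_cost_per_query: 0.01      # uniform cost for every tool on this server
        tool_name_to_cost_per_query:
          search_tool: 10.0               # per-tool overrides
          get_weather: 5.0
```

Dynamic costs would instead be set in a custom post-MCP-call hook that parses the server's response.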

[Get started](https://docs.litellm.ai/docs/mcp#mcp-cost-tracking)

---

## Model Hub v2

<Image
  img={require('../../img/release_notes/model_hub_v2.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br/>

v1.74.3-stable introduces a new OSS Model Hub. This is great for **Proxy Admins**, as you can now show developers which models are available on the proxy.

This improves on the previous model hub by enabling:

- The ability to show **Developers** models, even if they don't have a LiteLLM key.
- The ability for **Proxy Admins** to select specific models to be public on the model hub.
- Improved search and filtering capabilities:
    - search for models by partial name (e.g. `xai grok-4`)
    - filter by provider and feature (e.g. 'vision' models)
    - sort by cost (e.g. cheapest vision model from OpenAI)

[Get started](../../docs/proxy/model_hub)

---

## New Models / Updated Models

#### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- | ---- |
| Xai | `xai/grok-4` | 256k | $3.00 | $15.00 | New |
| Xai | `xai/grok-4-0709` | 256k | $3.00 | $15.00 | New |
| Xai | `xai/grok-4-latest` | 256k | $3.00 | $15.00 | New |
| Mistral | `mistral/devstral-small-2507` | 128k | $0.10 | $0.30 | New |
| Mistral | `mistral/devstral-medium-2507` | 128k | $0.40 | $2.00 | New |
| Azure OpenAI | `azure/o3-deep-research` | 200k | $10.00 | $40.00 | New |

#### Features

- **[Xinference](../../docs/providers/xinference)**
    - Image generation API support - [PR](https://github.com/BerriAI/litellm/pull/12439)
- **[Bedrock](../../docs/providers/bedrock)**
    - API key auth support for the AWS Bedrock API - [PR](https://github.com/BerriAI/litellm/pull/12495)
- **[🆕 Dashscope](../../docs/providers/dashscope)**
    - New integration from Alibaba (enables Qwen usage) - [PR](https://github.com/BerriAI/litellm/pull/12361)
- **[🆕 Bytez](../../docs/providers/bytez)**
    - New /chat/completions integration - [PR](https://github.com/BerriAI/litellm/pull/12121)

#### Bugs

- **[Github Copilot](../../docs/providers/github_copilot)**
    - Fix API base URL for Github Copilot - [PR](https://github.com/BerriAI/litellm/pull/12418)
- **[Bedrock](../../docs/providers/bedrock)**
    - Ensure supported `bedrock/converse/` params = `bedrock/` params - [PR](https://github.com/BerriAI/litellm/pull/12466)
    - Fix cache token cost calculation - [PR](https://github.com/BerriAI/litellm/pull/12488)
- **[XAI](../../docs/providers/xai)**
    - Ensure `finish_reason` includes tool calls when xAI responds with tool calls - [PR](https://github.com/BerriAI/litellm/pull/12545)

---

## LLM API Endpoints

#### Features

- **[/completions](../../docs/text_completion)**
    - Return `reasoning_content` on streaming - [PR](https://github.com/BerriAI/litellm/pull/12377)
- **[/chat/completions](../../docs/completion/input)**
    - Add 'thinking blocks' to stream chunk builder - [PR](https://github.com/BerriAI/litellm/pull/12395)
- **[/v1/messages](../../docs/anthropic_unified)**
    - Fallbacks support - [PR](https://github.com/BerriAI/litellm/pull/12440)
    - Tool call handling for non-Anthropic models (/v1/messages to /chat/completions bridge) - [PR](https://github.com/BerriAI/litellm/pull/12473)

---

## [MCP Gateway](../../docs/mcp)

<Image
  img={require('../../img/release_notes/mcp_tool_cost_tracking.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

#### Features

- **[Cost Tracking](../../docs/mcp#-mcp-cost-tracking)**
    - Add cost tracking - [PR](https://github.com/BerriAI/litellm/pull/12385)
    - Add usage tracking - [PR](https://github.com/BerriAI/litellm/pull/12397)
    - Add custom cost configuration for each MCP tool - [PR](https://github.com/BerriAI/litellm/pull/12499)
    - Add support for editing MCP cost per tool - [PR](https://github.com/BerriAI/litellm/pull/12501)
    - Allow using a custom post-call MCP hook for cost tracking - [PR](https://github.com/BerriAI/litellm/pull/12469)
- **[Auth](../../docs/mcp#using-your-mcp-with-client-side-credentials)**
    - Allow customizing which client-side auth header to use - [PR](https://github.com/BerriAI/litellm/pull/12460)
    - Raise an error when the MCP server header is malformed in the request - [PR](https://github.com/BerriAI/litellm/pull/12494)
- **[MCP Server](../../docs/mcp#adding-your-mcp)**
    - Allow using stdio MCPs with LiteLLM (enables using the Circle CI MCP w/ LiteLLM) - [PR](https://github.com/BerriAI/litellm/pull/12530), [Get Started](../../docs/mcp#adding-a-stdio-mcp-server)

#### Bugs

- **General**
    - Fix "task group is not initialized" error - [PR](https://github.com/BerriAI/litellm/pull/12411) s/o [@juancarlosm](https://github.com/juancarlosm)
- **[MCP Server](../../docs/mcp#adding-your-mcp)**
    - Fix MCP tool separator to work with Claude Code - [PR](https://github.com/BerriAI/litellm/pull/12430), [Get Started](../../docs/mcp#adding-your-mcp)
    - Add validation to MCP server name to not allow "-" (enables namespaces to work) - [PR](https://github.com/BerriAI/litellm/pull/12515)

---

## Management Endpoints / UI

<Image
  img={require('../../img/release_notes/model_hub_v2.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

#### Features

- **Model Hub**
    - New model hub table view - [PR](https://github.com/BerriAI/litellm/pull/12468)
    - New /public/model_hub endpoint - [PR](https://github.com/BerriAI/litellm/pull/12468)
    - Make Model Hub OSS - [PR](https://github.com/BerriAI/litellm/pull/12553)
    - New 'make public' modal flow for showing proxy models on the public model hub - [PR](https://github.com/BerriAI/litellm/pull/12555)
- **MCP**
    - Support for internal users to use and manage MCP servers - [PR](https://github.com/BerriAI/litellm/pull/12458)
    - Adds UI support to add MCP access groups (similar to namespaces) - [PR](https://github.com/BerriAI/litellm/pull/12470)
    - MCP Tool Testing Playground - [PR](https://github.com/BerriAI/litellm/pull/12520)
    - Show cost config on root of MCP settings - [PR](https://github.com/BerriAI/litellm/pull/12526)
- **Test Key**
    - Sticky sessions - [PR](https://github.com/BerriAI/litellm/pull/12365)
    - Allow MCP access groups - [PR](https://github.com/BerriAI/litellm/pull/12529)
- **Usage**
    - Truncate long labels and improve tooltip in Top API Keys chart - [PR](https://github.com/BerriAI/litellm/pull/12371)
    - Improve chart readability for tag usage - [PR](https://github.com/BerriAI/litellm/pull/12378)
- **Teams**
    - Prevent navigation reset after team member operations - [PR](https://github.com/BerriAI/litellm/pull/12424)
    - Team members - reset budget, if duration set - [PR](https://github.com/BerriAI/litellm/pull/12534)
    - Use central team member budget when `max_budget_in_team` is set on UI - [PR](https://github.com/BerriAI/litellm/pull/12533)
- **SSO**
    - Allow users to run a custom SSO login handler - [PR](https://github.com/BerriAI/litellm/pull/12465)
- **Navbar**
    - Improve user dropdown UI with premium badge and cleaner layout - [PR](https://github.com/BerriAI/litellm/pull/12502)
- **General**
    - Consistent layout for Create and Back buttons on all pages - [PR](https://github.com/BerriAI/litellm/pull/12542)
    - Align Show Password with checkbox - [PR](https://github.com/BerriAI/litellm/pull/12538)
    - Prevent writing default user setting updates to YAML (caused an error in non-root environments) - [PR](https://github.com/BerriAI/litellm/pull/12533)

#### Bugs

- **Model Hub**
    - Fix duplicates in /model_group/info - [PR](https://github.com/BerriAI/litellm/pull/12468)
- **MCP**
    - Fix UI not syncing MCP access groups properly with object permissions - [PR](https://github.com/BerriAI/litellm/pull/12523)

---

## Logging / Guardrail Integrations

#### Features

- **[Langfuse](../../docs/observability/langfuse_integration)**
    - Version bump - [PR](https://github.com/BerriAI/litellm/pull/12376)
    - `LANGFUSE_TRACING_ENVIRONMENT` support - [PR](https://github.com/BerriAI/litellm/pull/12376)
- **[Bedrock Guardrails](../../docs/proxy/guardrails/bedrock)**
    - Raise Bedrock output text on 'BLOCKED' actions from guardrail - [PR](https://github.com/BerriAI/litellm/pull/12435)
- **[OTEL](../../docs/observability/opentelemetry_integration)**
    - `OTEL_RESOURCE_ATTRIBUTES` support - [PR](https://github.com/BerriAI/litellm/pull/12468)
- **[Guardrails AI](../../docs/proxy/guardrails/guardrails_ai)**
    - Pre-call + logging-only guardrail (PII detection/competitor names) support - [PR](https://github.com/BerriAI/litellm/pull/12506)
- **[Guardrails](../../docs/proxy/guardrails/quick_start)**
    - [Enterprise] Support tag-based mode for guardrails - [PR](https://github.com/BerriAI/litellm/pull/12508), [Get Started](../../docs/proxy/guardrails/quick_start#-tag-based-guardrail-modes)
- **[OpenAI Moderations API](../../docs/proxy/guardrails/openai_moderation)**
    - New guardrail integration - [PR](https://github.com/BerriAI/litellm/pull/12519)
- **[Prometheus](../../docs/proxy/prometheus)**
    - Support tag-based metrics (enables Prometheus metrics for measuring Roo Code/Cline/Claude Code engagement) - [PR](https://github.com/BerriAI/litellm/pull/12534), [Get Started](../../docs/proxy/prometheus#custom-tags)
- **[Datadog LLM Observability](../../docs/observability/datadog)**
    - Added `total_cost` field to track costs in DataDog LLM observability metrics - [PR](https://github.com/BerriAI/litellm/pull/12467)

#### Bugs

- **[Prometheus](../../docs/proxy/prometheus)**
    - Remove experimental `_by_tag` metrics (fixes cardinality issue) - [PR](https://github.com/BerriAI/litellm/pull/12395)
- **[Slack Alerting](../../docs/proxy/alerting)**
    - Fix Slack alerting for outage and region outage alerts - [PR](https://github.com/BerriAI/litellm/pull/12464), [Get Started](../../docs/proxy/alerting#region-outage-alerting--enterprise-feature)

---

## Performance / Loadbalancing / Reliability improvements

#### Bugs

- **[Responses API Bridge](../../docs/response_api#calling-non-responses-api-endpoints-responses-to-chatcompletions-bridge)**
    - Add image support for the Responses API when falling back on Chat Completions - [PR](https://github.com/BerriAI/litellm/pull/12204) s/o [@ryan-castner](https://github.com/ryan-castner)
- **aiohttp**
    - Properly close aiohttp client sessions to prevent resource leaks - [PR](https://github.com/BerriAI/litellm/pull/12251)
- **Router**
    - Don't add invalid deployments to router pattern match - [PR](https://github.com/BerriAI/litellm/pull/12459)

---

## General Proxy Improvements

#### Bugs

- **S3**
    - S3 config.yaml file - ensure YAML safe load is used - [PR](https://github.com/BerriAI/litellm/pull/12373)
- **Audit Logs**
    - Add audit logs for model updates - [PR](https://github.com/BerriAI/litellm/pull/12396)
- **Startup**
    - Fix multiple API keys being created on startup when max_budget is enabled - [PR](https://github.com/BerriAI/litellm/pull/12436)
- **Auth**
    - Resolve model group alias on auth (if a user has access to the underlying model, allow the alias request to work) - [PR](https://github.com/BerriAI/litellm/pull/12440)
- **config.yaml**
    - Fix parsing `environment_variables` from config.yaml - [PR](https://github.com/BerriAI/litellm/pull/12482)
- **Security**
    - Log hashed JWT w/ prefix instead of the actual value - [PR](https://github.com/BerriAI/litellm/pull/12524)

#### Features

- **MCP**
    - Bump mcp version in the Docker image - [PR](https://github.com/BerriAI/litellm/pull/12362)
- **Request Headers**
    - Forward the 'anthropic-beta' header when `forward_client_headers_to_llm_api` is true - [PR](https://github.com/BerriAI/litellm/pull/12462)

---

## New Contributors

* @kanaka made their first contribution in https://github.com/BerriAI/litellm/pull/12418
* @juancarlosm made their first contribution in https://github.com/BerriAI/litellm/pull/12411
* @DmitriyAlergant made their first contribution in https://github.com/BerriAI/litellm/pull/12356
* @Rayshard made their first contribution in https://github.com/BerriAI/litellm/pull/12487
* @minghao51 made their first contribution in https://github.com/BerriAI/litellm/pull/12361
* @jdietzsch91 made their first contribution in https://github.com/BerriAI/litellm/pull/12488
* @iwinux made their first contribution in https://github.com/BerriAI/litellm/pull/12473
* @andresC98 made their first contribution in https://github.com/BerriAI/litellm/pull/12413
* @EmaSuriano made their first contribution in https://github.com/BerriAI/litellm/pull/12509
* @strawgate made their first contribution in https://github.com/BerriAI/litellm/pull/12528
* @inf3rnus made their first contribution in https://github.com/BerriAI/litellm/pull/12121

## **[Git Diff](https://github.com/BerriAI/litellm/compare/v1.74.0-stable...v1.74.3-stable)**
@@ -0,0 +1,345 @@
---
title: "v1.74.7-stable"
slug: "v1-74-7"
date: 2025-07-19T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:v1.74.7-stable.patch.1
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.74.7.post2
```

</TabItem>
</Tabs>

---

## Key Highlights

- **Vector Stores** - Support for Vertex RAG Engine, PG Vector, OpenAI & Azure OpenAI Vector Stores.
- **Bulk Editing Users** - Bulk editing users on the UI.
- **Health Check Improvements** - Prevent unnecessary pod restarts during high traffic.
- **New LLM Providers** - Added Moonshot AI and Vercel v0 provider support.

---

## Vector Stores API

<Image img={require('../../img/release_notes/vector_stores.png')} />

This release introduces support for using VertexAI RAG Engine, PG Vector, Bedrock Knowledge Bases, and OpenAI Vector Stores with LiteLLM.

This is ideal for use cases requiring external knowledge sources with LLMs.

This brings the following benefits for LiteLLM users:

**Proxy Admin Benefits:**
- Fine-grained access control: determine which keys and teams can access specific vector stores
- Complete usage tracking and monitoring across all vector store operations

**Developer Benefits:**
- Simple, unified interface for querying vector stores and using them with LLM API requests
- Consistent API experience across all supported vector store providers
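
As a rough illustration of that unified interface, a vector store could be registered once in the proxy's `config.yaml` and then referenced by name from any LLM request. The registry keys and IDs below are assumptions for illustration; see the Get started link for the exact schema:

```yaml
# Hypothetical sketch - keys and IDs are assumed, not taken from this release
vector_store_registry:
  - vector_store_name: "internal-docs"
    litellm_params:
      vector_store_id: "vs-123"          # ID from your vector store provider
      custom_llm_provider: "bedrock"     # e.g. a Bedrock Knowledge Base
```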

[Get started](../../docs/completion/knowledgebase)

---

## Bulk Editing Users

<Image img={require('../../img/bulk_edit_graphic.png')} />

v1.74.7-stable introduces bulk editing users on the UI. This is useful for:
- adding all existing users to a default team (useful for controlling access / tracking spend by team)
- controlling personal model access for existing users

[Read more](https://docs.litellm.ai/docs/proxy/ui/bulk_edit_users)

---

## Health Check Server

<Image alt="Separate Health App Architecture" img={require('../../img/separate_health_app_architecture.png')} style={{ borderRadius: '8px', marginBottom: '1em', maxWidth: '100%' }} />

This release brings reliability improvements that prevent unnecessary pod restarts during high traffic. Previously, when the main LiteLLM app was busy serving traffic, health endpoints would time out even when pods were healthy.

Starting with this release, you can run health endpoints on an isolated process with a dedicated port. This ensures liveness and readiness probes remain responsive even when the main LiteLLM app is under heavy load.
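
As a sketch of how this could look in a Kubernetes deployment, the health app gets its own port next to the main proxy port. The environment variable names below are assumptions based on the linked docs; verify them there before use:

```yaml
# Hypothetical Kubernetes container snippet - env var names are assumed
containers:
  - name: litellm
    image: ghcr.io/berriai/litellm:v1.74.7-stable.patch.1
    env:
      - name: SEPARATE_HEALTH_APP        # run health endpoints in an isolated process
        value: "1"
      - name: SEPARATE_HEALTH_PORT       # dedicated port for liveness/readiness probes
        value: "4001"
    ports:
      - containerPort: 4000              # main LiteLLM app
      - containerPort: 4001              # health check app
```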

[Read More](https://docs.litellm.ai/docs/proxy/prod#10-use-a-separate-health-check-app)

---

## New Models / Updated Models

#### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- |
| Azure AI | `azure_ai/grok-3` | 131k | $3.30 | $16.50 |
| Azure AI | `azure_ai/global/grok-3` | 131k | $3.00 | $15.00 |
| Azure AI | `azure_ai/global/grok-3-mini` | 131k | $0.25 | $1.27 |
| Azure AI | `azure_ai/grok-3-mini` | 131k | $0.275 | $1.38 |
| Azure AI | `azure_ai/jais-30b-chat` | 8k | $3200 | $9710 |
| Groq | `groq/moonshotai-kimi-k2-instruct` | 131k | $1.00 | $3.00 |
| AI21 | `jamba-large-1.7` | 256k | $2.00 | $8.00 |
| AI21 | `jamba-mini-1.7` | 256k | $0.20 | $0.40 |
| Together.ai | `together_ai/moonshotai/Kimi-K2-Instruct` | 131k | $1.00 | $3.00 |
| v0 | `v0/v0-1.0-md` | 128k | $3.00 | $15.00 |
| v0 | `v0/v0-1.5-md` | 128k | $3.00 | $15.00 |
| v0 | `v0/v0-1.5-lg` | 512k | $15.00 | $75.00 |
| Moonshot | `moonshot/moonshot-v1-8k` | 8k | $0.20 | $2.00 |
| Moonshot | `moonshot/moonshot-v1-32k` | 32k | $1.00 | $3.00 |
| Moonshot | `moonshot/moonshot-v1-128k` | 131k | $2.00 | $5.00 |
| Moonshot | `moonshot/moonshot-v1-auto` | 131k | $2.00 | $5.00 |
| Moonshot | `moonshot/kimi-k2-0711-preview` | 131k | $0.60 | $2.50 |
| Moonshot | `moonshot/moonshot-v1-32k-0430` | 32k | $1.00 | $3.00 |
| Moonshot | `moonshot/moonshot-v1-128k-0430` | 131k | $2.00 | $5.00 |
| Moonshot | `moonshot/moonshot-v1-8k-0430` | 8k | $0.20 | $2.00 |
| Moonshot | `moonshot/kimi-latest` | 131k | $2.00 | $5.00 |
| Moonshot | `moonshot/kimi-latest-8k` | 8k | $0.20 | $2.00 |
| Moonshot | `moonshot/kimi-latest-32k` | 32k | $1.00 | $3.00 |
| Moonshot | `moonshot/kimi-latest-128k` | 131k | $2.00 | $5.00 |
| Moonshot | `moonshot/kimi-thinking-preview` | 131k | $30.00 | $30.00 |
| Moonshot | `moonshot/moonshot-v1-8k-vision-preview` | 8k | $0.20 | $2.00 |
| Moonshot | `moonshot/moonshot-v1-32k-vision-preview` | 32k | $1.00 | $3.00 |
| Moonshot | `moonshot/moonshot-v1-128k-vision-preview` | 131k | $2.00 | $5.00 |

#### Features

- **[🆕 Moonshot API (Kimi)](../../docs/providers/moonshot)**
    - New LLM API integration for accessing Kimi models (see the config sketch after this list) - [PR #12592](https://github.com/BerriAI/litellm/pull/12592), [Get Started](../../docs/providers/moonshot)
- **[🆕 v0 Provider](../../docs/providers/v0)**
    - New provider integration for v0.dev - [PR #12751](https://github.com/BerriAI/litellm/pull/12751), [Get Started](../../docs/providers/v0)
- **[OpenAI](../../docs/providers/openai)**
    - Use OpenAI DeepResearch models with `litellm.completion` (`/chat/completions`) - [PR #12627](https://github.com/BerriAI/litellm/pull/12627) **DOC NEEDED**
    - Add `input_fidelity` parameter for OpenAI image generation - [PR #12662](https://github.com/BerriAI/litellm/pull/12662), [Get Started](../../docs/image_generation)
- **[Azure OpenAI](../../docs/providers/azure_openai)**
    - Use Azure OpenAI DeepResearch models with `litellm.completion` (`/chat/completions`) - [PR #12627](https://github.com/BerriAI/litellm/pull/12627) **DOC NEEDED**
    - Added `response_format` support for OpenAI gpt-4.1 models - [PR #12745](https://github.com/BerriAI/litellm/pull/12745)
- **[Anthropic](../../docs/providers/anthropic)**
    - Tool cache control support - [PR #12668](https://github.com/BerriAI/litellm/pull/12668)
- **[Bedrock](../../docs/providers/bedrock)**
    - Claude 4 /invoke route support - [PR #12599](https://github.com/BerriAI/litellm/pull/12599), [Get Started](../../docs/providers/bedrock)
    - Application inference profile tool choice support - [PR #12599](https://github.com/BerriAI/litellm/pull/12599)
- **[Gemini](../../docs/providers/gemini)**
    - Custom TTL support for context caching - [PR #12541](https://github.com/BerriAI/litellm/pull/12541)
    - Fix implicit caching cost calculation for Gemini 2.x models - [PR #12585](https://github.com/BerriAI/litellm/pull/12585)
- **[VertexAI](../../docs/providers/vertex)**
    - Added Vertex AI RAG Engine support (use with the OpenAI-compatible `/vector_stores` API) - [PR #12595](https://github.com/BerriAI/litellm/pull/12595), [Get Started](../../docs/completion/knowledgebase)
- **[vLLM](../../docs/providers/vllm)**
    - Added support for using Rerank endpoints with vLLM - [PR #12738](https://github.com/BerriAI/litellm/pull/12738), [Get Started](../../docs/providers/vllm#rerank)
- **[AI21](../../docs/providers/ai21)**
    - Added ai21/jamba-1.7 model family pricing - [PR #12593](https://github.com/BerriAI/litellm/pull/12593), [Get Started](../../docs/providers/ai21)
- **[Together.ai](../../docs/providers/together_ai)**
    - [New Model] Add together_ai/moonshotai/Kimi-K2-Instruct - [PR #12645](https://github.com/BerriAI/litellm/pull/12645), [Get Started](../../docs/providers/together_ai)
- **[Groq](../../docs/providers/groq)**
    - Add groq/moonshotai-kimi-k2-instruct model configuration - [PR #12648](https://github.com/BerriAI/litellm/pull/12648), [Get Started](../../docs/providers/groq)
- **[Github Copilot](../../docs/providers/github_copilot)**
    - Change system prompts to assistant prompts for GH Copilot - [PR #12742](https://github.com/BerriAI/litellm/pull/12742), [Get Started](../../docs/providers/github_copilot)
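
Since Moonshot is a new provider, here is a minimal, hypothetical `config.yaml` entry for exposing a Kimi model through the proxy. The model alias and env var name are assumptions; the `moonshot/` prefix comes from the pricing table above:

```yaml
# Hypothetical sketch - model alias and env var name are assumed
model_list:
  - model_name: kimi-k2                          # alias clients will request
    litellm_params:
      model: moonshot/kimi-k2-0711-preview       # provider prefix from the table above
      api_key: os.environ/MOONSHOT_API_KEY       # assumed env var name
```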

#### Bugs

- **[Anthropic](../../docs/providers/anthropic)**
    - Fix streaming + response_format + tools bug - [PR #12463](https://github.com/BerriAI/litellm/pull/12463)
- **[XAI](../../docs/providers/xai)**
    - grok-4 does not support the `stop` param - [PR #12646](https://github.com/BerriAI/litellm/pull/12646)
- **[AWS](../../docs/providers/bedrock)**
    - Role chaining with web authentication for AWS Bedrock - [PR #12607](https://github.com/BerriAI/litellm/pull/12607)
- **[VertexAI](../../docs/providers/vertex)**
    - Add project_id to cached credentials - [PR #12661](https://github.com/BerriAI/litellm/pull/12661)
- **[Bedrock](../../docs/providers/bedrock)**
    - Fix Bedrock Nova Micro and Nova Lite context window info - [PR #12619](https://github.com/BerriAI/litellm/pull/12619)

---

## LLM API Endpoints

#### Features

- **[/chat/completions](../../docs/completion/input)**
    - Include tool calls in output of trim_messages - [PR #11517](https://github.com/BerriAI/litellm/pull/11517)
- **[/v1/vector_stores](../../docs/vector_stores/search)**
    - New OpenAI-compatible vector store endpoints - [PR #12699](https://github.com/BerriAI/litellm/pull/12699), [Get Started](../../docs/vector_stores/search)
    - Vector store search endpoint - [PR #12749](https://github.com/BerriAI/litellm/pull/12749), [Get Started](../../docs/vector_stores/search)
    - Support for using PG Vector as a vector store - [PR #12667](https://github.com/BerriAI/litellm/pull/12667), [Get Started](../../docs/completion/knowledgebase)
- **[/streamGenerateContent](../../docs/generateContent)**
    - Non-Gemini model support - [PR #12647](https://github.com/BerriAI/litellm/pull/12647)

#### Bugs

- **[/vector_stores](../../docs/vector_stores/search)**
    - Knowledge base call returning an error when passed as `tools` - [PR #12628](https://github.com/BerriAI/litellm/pull/12628)

---

## [MCP Gateway](../../docs/mcp)

#### Features

- **[Access Groups](../../docs/mcp#grouping-mcps-access-groups)**
    - Allow MCP access groups to be added via the litellm proxy config.yaml - [PR #12654](https://github.com/BerriAI/litellm/pull/12654)
    - List tools from access list for keys - [PR #12657](https://github.com/BerriAI/litellm/pull/12657)
- **[Namespacing](../../docs/mcp#mcp-namespacing)**
    - URL-based namespacing for better segregation - [PR #12658](https://github.com/BerriAI/litellm/pull/12658)
    - Make MCP_TOOL_PREFIX_SEPARATOR configurable from env - [PR #12603](https://github.com/BerriAI/litellm/pull/12603)
- **[Gateway Features](../../docs/mcp#mcp-gateway-features)**
    - Allow using MCPs with all LLM APIs (VertexAI, Gemini, Groq, etc.) when using /responses - [PR #12546](https://github.com/BerriAI/litellm/pull/12546)

#### Bugs

- Fix to update object permission on update/delete key/team - [PR #12701](https://github.com/BerriAI/litellm/pull/12701)
- Include /mcp in list of available routes on proxy - [PR #12612](https://github.com/BerriAI/litellm/pull/12612)

---

## Management Endpoints / UI

#### Features

- **Keys**
    - Regenerate key state management improvements - [PR #12729](https://github.com/BerriAI/litellm/pull/12729)
- **Models**
    - Wildcard model filter support - [PR #12597](https://github.com/BerriAI/litellm/pull/12597)
    - Fixes for handling team-only models on UI - [PR #12632](https://github.com/BerriAI/litellm/pull/12632)
- **Usage Page**
    - Fix Y-axis labels overlap on Spend per Tag chart - [PR #12754](https://github.com/BerriAI/litellm/pull/12754)
- **Teams**
    - Allow setting custom key duration + show key creation stats - [PR #12722](https://github.com/BerriAI/litellm/pull/12722)
    - Enable team admins to update member roles - [PR #12629](https://github.com/BerriAI/litellm/pull/12629)
- **Users**
    - New `/user/bulk_update` endpoint - [PR #12720](https://github.com/BerriAI/litellm/pull/12720)
- **Logs Page**
    - Add `end_user` filter on UI Logs Page - [PR #12663](https://github.com/BerriAI/litellm/pull/12663)
- **MCP Servers**
    - Copy MCP server name functionality - [PR #12760](https://github.com/BerriAI/litellm/pull/12760)
- **Vector Stores**
    - UI support for clicking into Vector Stores - [PR #12741](https://github.com/BerriAI/litellm/pull/12741)
    - Allow adding Vertex RAG Engine, OpenAI, Azure through the UI - [PR #12752](https://github.com/BerriAI/litellm/pull/12752)
- **General**
    - Add copy-on-click for all IDs (Key, Team, Organization, MCP Server) - [PR #12615](https://github.com/BerriAI/litellm/pull/12615)
- **[SCIM](../../docs/proxy/scim)**
    - Add GET /ServiceProviderConfig endpoint - [PR #12664](https://github.com/BerriAI/litellm/pull/12664)

#### Bugs

- **Teams**
    - Ensure user id is correctly added when creating new teams - [PR #12719](https://github.com/BerriAI/litellm/pull/12719)
    - Fixes for handling team-only models on UI - [PR #12632](https://github.com/BerriAI/litellm/pull/12632)

---

## Logging / Guardrail Integrations

#### Features

- **[Google Cloud Model Armor](../../docs/proxy/guardrails/google_cloud_model_armor)**
    - New guardrails integration - [PR #12492](https://github.com/BerriAI/litellm/pull/12492)
- **[Bedrock Guardrails](../../docs/proxy/guardrails/bedrock)**
    - Allow disabling exception on 'BLOCKED' action - [PR #12693](https://github.com/BerriAI/litellm/pull/12693)
- **[Guardrails AI](../../docs/proxy/guardrails/guardrails_ai)**
    - Support `llmOutput`-based guardrails as pre-call hooks - [PR #12674](https://github.com/BerriAI/litellm/pull/12674)
- **[DataDog LLM Observability](../../docs/proxy/logging#datadog)**
    - Add support for tracking the correct span type based on the LLM endpoint used - [PR #12652](https://github.com/BerriAI/litellm/pull/12652)
- **[Custom Logging](../../docs/proxy/logging)**
    - Allow reading custom logger Python scripts from an S3 or GCS bucket - [PR #12623](https://github.com/BerriAI/litellm/pull/12623)

#### Bugs

- **[General Logging](../../docs/proxy/logging)**
    - StandardLoggingPayload on cache_hits should track custom llm provider - [PR #12652](https://github.com/BerriAI/litellm/pull/12652)
- **[S3 Buckets](../../docs/proxy/logging#s3-buckets)**
    - S3 v2 log uploader crashes when used with guardrails - [PR #12733](https://github.com/BerriAI/litellm/pull/12733)

---

## Performance / Loadbalancing / Reliability improvements

#### Features

- **Health Checks**
    - Separate health app for liveness probes - [PR #12669](https://github.com/BerriAI/litellm/pull/12669)
    - Health check app on separate port - [PR #12718](https://github.com/BerriAI/litellm/pull/12718)
- **Caching**
    - Add Azure Blob cache support - [PR #12587](https://github.com/BerriAI/litellm/pull/12587)
- **Router**
    - Handle ZeroDivisionError with zero completion tokens in lowest_latency strategy - [PR #12734](https://github.com/BerriAI/litellm/pull/12734)

#### Bugs

- **Database**
    - Use upsert for managed object table to avoid UniqueViolationError - [PR #11795](https://github.com/BerriAI/litellm/pull/11795)
    - Refactor to support use_prisma_migrate for the Helm hook - [PR #12600](https://github.com/BerriAI/litellm/pull/12600)
- **Cache**
    - Fix Redis caching for embedding response models - [PR #12750](https://github.com/BerriAI/litellm/pull/12750)

---

## Helm Chart

- DB migration hook: refactor to support use_prisma_migrate - [PR](https://github.com/BerriAI/litellm/pull/12600)
- Add envVars and extraEnvVars support to the Helm migrations job - [PR #12591](https://github.com/BerriAI/litellm/pull/12591)

## General Proxy Improvements

#### Features

- **Control Plane + Data Plane Architecture**
    - Control Plane + Data Plane support - [PR #12601](https://github.com/BerriAI/litellm/pull/12601)
- **Proxy CLI**
    - Add "keys import" command to CLI - [PR #12620](https://github.com/BerriAI/litellm/pull/12620)
- **Swagger Documentation**
    - Add Swagger docs for LiteLLM /chat/completions, /embeddings, /responses - [PR #12618](https://github.com/BerriAI/litellm/pull/12618)
- **Dependencies**
    - Loosen rich version from ==13.7.1 to >=13.7.1 - [PR #12704](https://github.com/BerriAI/litellm/pull/12704)

#### Bugs

- Fix verbose log being enabled by default - [PR #12596](https://github.com/BerriAI/litellm/pull/12596)
- Add support for disabling callbacks in the request body - [PR #12762](https://github.com/BerriAI/litellm/pull/12762)
- Handle circular references in spend tracking metadata JSON serialization - [PR #12643](https://github.com/BerriAI/litellm/pull/12643)

---

## New Contributors

* @AntonioKL made their first contribution in https://github.com/BerriAI/litellm/pull/12591
* @marcelodiaz558 made their first contribution in https://github.com/BerriAI/litellm/pull/12541
* @dmcaulay made their first contribution in https://github.com/BerriAI/litellm/pull/12463
* @demoray made their first contribution in https://github.com/BerriAI/litellm/pull/12587
* @staeiou made their first contribution in https://github.com/BerriAI/litellm/pull/12631
* @stefanc-ai2 made their first contribution in https://github.com/BerriAI/litellm/pull/12622
* @RichardoC made their first contribution in https://github.com/BerriAI/litellm/pull/12607
* @yeahyung made their first contribution in https://github.com/BerriAI/litellm/pull/11795
* @mnguyen96 made their first contribution in https://github.com/BerriAI/litellm/pull/12619
* @rgambee made their first contribution in https://github.com/BerriAI/litellm/pull/11517
* @jvanmelckebeke made their first contribution in https://github.com/BerriAI/litellm/pull/12725
* @jlaurendi made their first contribution in https://github.com/BerriAI/litellm/pull/12704
* @doublerr made their first contribution in https://github.com/BerriAI/litellm/pull/12661

## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.74.3-stable...v1.74.7-stable)**
@@ -0,0 +1,299 @@
---
title: "v1.74.9-stable - Auto-Router"
slug: "v1-74-9"
date: 2025-07-27T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:v1.74.9-stable.patch.1
```
</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.74.9.post2
```

</TabItem>
</Tabs>

---

## Key Highlights

- **Auto-Router** - Automatically route requests to specific models based on request content.
- **Model-level Guardrails** - Only run guardrails when specific models are used.
- **MCP Header Propagation** - Propagate headers from the client to the backend MCP.
- **New LLM Providers** - Added Bedrock inpainting support and Recraft API image generation / image edits support.

---

## Auto-Router

<Image img={require('../../img/release_notes/auto_router.png')} />

<br/>

This release introduces auto-routing to models based on request content. This means **Proxy Admins** can define a set of keywords that always route to specific models when **users** opt in to using the auto-router.

This is great for internal use cases where you don't want **users** to think about which model to use - for example, use Claude models for coding vs. GPT models for generating ad copy.
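
As a minimal sketch of what opting in could look like in `config.yaml`, the auto-router is exposed as its own model entry that points at a routing config. The parameter names and file path below are assumptions, so check the Read More link for the exact schema:

```yaml
# Hypothetical sketch - parameter names and paths are assumed
model_list:
  - model_name: auto_router1                       # what users request to opt in
    litellm_params:
      model: auto_router/auto_router_1
      auto_router_config_path: auto_router.json    # keyword -> model routing rules
      auto_router_default_model: gpt-4o-mini       # fallback when no rule matches
  - model_name: claude-sonnet-4                    # one of the routing targets
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
```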

[Read More](../../docs/proxy/auto_routing)

---
|
||||
|
||||
## Model-level Guardrails
|
||||
|
||||
<Image img={require('../../img/release_notes/model_level_guardrails.jpg')} />
|
||||
|
||||
<br/>
|
||||
|
||||
This release brings model-level guardrails support to your config.yaml + UI. This is great for cases when you have an on-prem and hosted model, and just want to run prevent sending PII to the hosted model.
```yaml
model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
      api_base: https://api.anthropic.com/v1
      guardrails: ["azure-text-moderation"] # 👈 KEY CHANGE

guardrails:
  - guardrail_name: azure-text-moderation
    litellm_params:
      guardrail: azure/text_moderations
      mode: "post_call"
      api_key: os.environ/AZURE_GUARDRAIL_API_KEY
      api_base: os.environ/AZURE_GUARDRAIL_API_BASE
```

[Read More](../../docs/proxy/guardrails/quick_start#model-level-guardrails)

---

## MCP Header Propagation

<Image img={require('../../img/release_notes/mcp_header_propogation.png')} />

<br/>

v1.74.9-stable allows you to propagate MCP server-specific authentication headers via LiteLLM:

- Lets users specify which `header_name` is propagated to which `mcp_server` via request headers
- Allows different deployments of the same MCP server type to use different authentication headers

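As a hedged sketch of what this enables - assuming the `x-mcp-{server_alias}-{header_name}` convention described in the linked docs, where LiteLLM forwards the header to the matching backend server as `{header_name}` - a client could send a personal access token to one specific MCP server. The gateway route, server alias, and tokens below are illustrative placeholders.

```python showLineNumbers title="mcp_header_propagation_sketch.py"
# Hedged sketch: forward a personal access token to one backend MCP server
# through the LiteLLM MCP gateway. The /mcp/tools/list route, the
# "github_mcp" alias, and both tokens are hypothetical placeholders.
import requests

resp = requests.post(
    "http://localhost:4000/mcp/tools/list",
    headers={
        "Authorization": "Bearer sk-1234",  # LiteLLM virtual key
        # Forwarded only to the server aliased "github_mcp", as "authorization":
        "x-mcp-github_mcp-authorization": "Bearer ghp_xxx",
    },
)
print(resp.status_code)
```
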
[Read More](https://docs.litellm.ai/docs/mcp#new-server-specific-auth-headers-recommended)

---

## New Models / Updated Models

#### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- |
| Fireworks AI | `fireworks/models/kimi-k2-instruct` | 131k | $0.6 | $2.5 |
| OpenRouter | `openrouter/qwen/qwen-vl-plus` | 8192 | $0.21 | $0.63 |
| OpenRouter | `openrouter/qwen/qwen3-coder` | 8192 | $1 | $5 |
| OpenRouter | `openrouter/bytedance/ui-tars-1.5-7b` | 128k | $0.10 | $0.20 |
| Groq | `groq/qwen/qwen3-32b` | 131k | $0.29 | $0.59 |
| VertexAI | `vertex_ai/meta/llama-3.1-8b-instruct-maas` | 128k | $0.00 | $0.00 |
| VertexAI | `vertex_ai/meta/llama-3.1-405b-instruct-maas` | 128k | $5 | $16 |
| VertexAI | `vertex_ai/meta/llama-3.2-90b-vision-instruct-maas` | 128k | $0.00 | $0.00 |
| Google AI Studio | `gemini/gemini-2.0-flash-live-001` | 1,048,576 | $0.35 | $1.5 |
| Google AI Studio | `gemini/gemini-2.5-flash-lite` | 1,048,576 | $0.1 | $0.4 |
| VertexAI | `vertex_ai/gemini-2.0-flash-lite-001` | 1,048,576 | $0.35 | $1.5 |
| OpenAI | `gpt-4o-realtime-preview-2025-06-03` | 128k | $5 | $20 |

#### Features

- **[Lambda AI](../../docs/providers/lambda_ai)**
    - New LLM API provider - [PR #12817](https://github.com/BerriAI/litellm/pull/12817)
- **[Github Copilot](../../docs/providers/github_copilot)**
    - Dynamic endpoint support - [PR #12827](https://github.com/BerriAI/litellm/pull/12827)
- **[Morph](../../docs/providers/morph)**
    - New LLM API provider - [PR #12821](https://github.com/BerriAI/litellm/pull/12821)
- **[Groq](../../docs/providers/groq)**
    - Remove deprecated groq/qwen-qwq-32b - [PR #12832](https://github.com/BerriAI/litellm/pull/12831)
- **[Recraft](../../docs/providers/recraft)**
    - New image generation API - [PR #12832](https://github.com/BerriAI/litellm/pull/12832)
    - New image edits API - [PR #12874](https://github.com/BerriAI/litellm/pull/12874)
- **[Azure OpenAI](../../docs/providers/azure/azure)**
    - Support DefaultAzureCredential without hard-coded environment variables - [PR #12841](https://github.com/BerriAI/litellm/pull/12841)
- **[Hyperbolic](../../docs/providers/hyperbolic)**
    - New LLM API provider - [PR #12826](https://github.com/BerriAI/litellm/pull/12826)
- **[OpenAI](../../docs/providers/openai)**
    - `/realtime` API - pass through intent query param - [PR #12838](https://github.com/BerriAI/litellm/pull/12838)
- **[Bedrock](../../docs/providers/bedrock)**
    - Add inpainting support for Amazon Nova Canvas - [PR #12949](https://github.com/BerriAI/litellm/pull/12949) s/o @[SantoshDhaladhuli](https://github.com/SantoshDhaladhuli)

#### Bugs

- **Gemini ([Google AI Studio](../../docs/providers/gemini) + [VertexAI](../../docs/providers/vertex))**
    - Fix leaking file descriptor error on sync calls - [PR #12824](https://github.com/BerriAI/litellm/pull/12824)
- **IBM Watsonx**
    - Use correct parameter name for tool choice - [PR #9980](https://github.com/BerriAI/litellm/pull/9980)
- **[Anthropic](../../docs/providers/anthropic)**
    - Only show ‘reasoning_effort’ for supported models - [PR #12847](https://github.com/BerriAI/litellm/pull/12847)
    - Handle $id and $schema in tool call requests, which the Anthropic API stopped accepting (see the sketch after this list) - [PR #12959](https://github.com/BerriAI/litellm/pull/12959)
- **[Openrouter](../../docs/providers/openrouter)**
    - Filter out cache_control flag for non-anthropic models (allows usage with Claude Code) - [PR #12850](https://github.com/BerriAI/litellm/pull/12850)
- **[Gemini](../../docs/providers/gemini)**
    - Shorten Gemini tool_call_id for OpenAI compatibility - [PR #12941](https://github.com/BerriAI/litellm/pull/12941) s/o @[tonga54](https://github.com/tonga54)

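To illustrate the `$id`/`$schema` fix: JSON Schema generators often embed those metadata keys in tool definitions, and the Anthropic API began rejecting them. Below is an illustrative sketch of the kind of sanitization involved - the function name and recursion strategy are ours, not LiteLLM's exact code.

```python showLineNumbers title="strip_schema_metadata_sketch.py"
# Illustrative sketch: recursively drop JSON Schema metadata keys that the
# Anthropic API stopped accepting in tool definitions.
def strip_schema_metadata(schema, banned=("$id", "$schema")):
    if isinstance(schema, dict):
        return {
            k: strip_schema_metadata(v, banned)
            for k, v in schema.items()
            if k not in banned
        }
    if isinstance(schema, list):
        return [strip_schema_metadata(item, banned) for item in schema]
    return schema

tool_params = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {"city": {"type": "string"}},
}
print(strip_schema_metadata(tool_params))
# -> {'type': 'object', 'properties': {'city': {'type': 'string'}}}
```
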
---

## LLM API Endpoints

#### Features

- **[Passthrough endpoints](../../docs/pass_through/)**
    - Make key/user/team cost tracking OSS - [PR #12847](https://github.com/BerriAI/litellm/pull/12847)
- **[/v1/models](../../docs/providers/passthrough)**
    - Return fallback models as part of the API response - [PR #12811](https://github.com/BerriAI/litellm/pull/12811) s/o @[murad-khafizov](https://github.com/murad-khafizov)
- **[/vector_stores](../../docs/providers/passthrough)**
    - Make permission management OSS - [PR #12990](https://github.com/BerriAI/litellm/pull/12990)

#### Bugs

1. `/batches`
    1. Skip invalid batch during cost tracking check (previously this would stop all checks) - [PR #12782](https://github.com/BerriAI/litellm/pull/12782)
2. `/chat/completions`
    1. Fix async retryer on `.acompletion()` - [PR #12886](https://github.com/BerriAI/litellm/pull/12886)

---

## [MCP Gateway](../../docs/mcp)

#### Features

- **[Permission Management](../../docs/mcp#grouping-mcps-access-groups)**
    - Make permission management by key/team OSS - [PR #12988](https://github.com/BerriAI/litellm/pull/12988)
- **[MCP Alias](../../docs/mcp#mcp-aliases)**
    - Support MCP server aliases (useful for calling long MCP server names on Cursor) - [PR #12994](https://github.com/BerriAI/litellm/pull/12994)
- **Header Propagation**
    - Support propagating headers from the client to the backend MCP server (useful for sending personal access tokens to the backend MCP) - [PR #13003](https://github.com/BerriAI/litellm/pull/13003)

---

## Management Endpoints / UI

#### Features

- **Usage**
    - Support viewing usage by model group - [PR #12890](https://github.com/BerriAI/litellm/pull/12890)
- **Virtual Keys**
    - New `key_type` field on `/key/generate` - allows specifying if a key can call LLM API vs. Management routes (see the sketch after this list) - [PR #12909](https://github.com/BerriAI/litellm/pull/12909)
- **Models**
    - Add ‘auto router’ on UI - [PR #12960](https://github.com/BerriAI/litellm/pull/12960)
    - Show global retry policy on UI - [PR #12969](https://github.com/BerriAI/litellm/pull/12969)
    - Add model-level guardrails on create + update - [PR #13006](https://github.com/BerriAI/litellm/pull/13006)

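A hedged sketch of the new `key_type` field - the `"llm_api"` value shown is an assumption based on the feature description, so check [PR #12909](https://github.com/BerriAI/litellm/pull/12909) for the accepted values:

```python showLineNumbers title="key_type_sketch.py"
# Hedged sketch: create a virtual key restricted to LLM API routes.
# The "llm_api" value is an assumption; see PR #12909 for accepted values.
import requests

resp = requests.post(
    "http://localhost:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # proxy admin key (placeholder)
    json={"key_type": "llm_api"},
)
print(resp.json().get("key"))
```
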
#### Bugs

- **SSO**
    - Fix logout when SSO is enabled - [PR #12703](https://github.com/BerriAI/litellm/pull/12703)
    - Fix reset SSO when ui_access_mode is updated - [PR #13011](https://github.com/BerriAI/litellm/pull/13011)
- **Guardrails**
    - Show correct guardrails when editing a team - [PR #12823](https://github.com/BerriAI/litellm/pull/12823)
- **Virtual Keys**
    - Get updated token on regenerate key - [PR #12788](https://github.com/BerriAI/litellm/pull/12788)
    - Fix CVE with key injection - [PR #12840](https://github.com/BerriAI/litellm/pull/12840)

---

## Logging / Guardrail Integrations

#### Features

- **[Google Cloud Model Armor](../../docs/proxy/guardrails/model_armor)**
    - Document new guardrail - [PR #12492](https://github.com/BerriAI/litellm/pull/12492)
- **[Pillar Security](../../docs/proxy/guardrails/pillar_security)**
    - New LLM Guardrail - [PR #12791](https://github.com/BerriAI/litellm/pull/12791)
- **CloudZero**
    - Allow exporting spend to CloudZero - [PR #12908](https://github.com/BerriAI/litellm/pull/12908)
- **Model-level Guardrails**
    - Support model-level guardrails - [PR #12968](https://github.com/BerriAI/litellm/pull/12968)

#### Bugs

- **[Prometheus](../../docs/proxy/prometheus)**
    - Fix `[tag]=false` when tag is set for tag-based metrics - [PR #12916](https://github.com/BerriAI/litellm/pull/12916)
- **[Guardrails AI](../../docs/proxy/guardrails/guardrails_ai)**
    - Use ‘validatedOutput’ to allow usage of “fix” guards - [PR #12891](https://github.com/BerriAI/litellm/pull/12891) s/o @[DmitriyAlergant](https://github.com/DmitriyAlergant)

---

## Performance / Loadbalancing / Reliability improvements

#### Features

- **[Auto-Router](../../docs/proxy/auto_routing)**
    - New auto-router powered by `semantic-router` - [PR #12955](https://github.com/BerriAI/litellm/pull/12955)

#### Bugs

- **forward_clientside_headers**
    - Filter out `content-length` from headers (caused backend requests to hang) - [PR #12886](https://github.com/BerriAI/litellm/pull/12886/files)
- **Message Redaction**
    - Fix cannot pickle coroutine object error - [PR #13005](https://github.com/BerriAI/litellm/pull/13005)

---

## General Proxy Improvements

#### Features

- **Benchmarks**
    - Updated litellm proxy benchmarks (p50, p90, p99 overhead) - [PR #12842](https://github.com/BerriAI/litellm/pull/12842)
- **Request Headers**
    - Added new `x-litellm-num-retries` request header (see the sketch after this list)
- **Swagger**
    - Support local swagger on custom root paths - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
- **Health**
    - Track cost + add tags for health checks done by LiteLLM Proxy - [PR #12880](https://github.com/BerriAI/litellm/pull/12880)

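A hedged sketch of the new retry header - that the value is a per-request integer retry count is inferred from the header name, so treat it as an assumption:

```python showLineNumbers title="num_retries_header_sketch.py"
# Hedged sketch: override retries for a single request via the new header.
# The integer-count semantics are inferred from the header name.
import requests

resp = requests.post(
    "http://localhost:4000/chat/completions",
    headers={
        "Authorization": "Bearer sk-1234",  # placeholder virtual key
        "x-litellm-num-retries": "3",       # retry this request up to 3 times
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "hi"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```
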
#### Bugs

- **Proxy Startup**
    - Fix startup issue where a `None` team member budget would block startup - [PR #12843](https://github.com/BerriAI/litellm/pull/12843)
- **Docker**
    - Move non-root docker to Chainguard image (fewer vulnerabilities) - [PR #12707](https://github.com/BerriAI/litellm/pull/12707)
    - Add azure-keyvault==4.2.0 to the Docker image - [PR #12873](https://github.com/BerriAI/litellm/pull/12873)
- **Separate Health App**
    - Pass through cmd args via supervisord (enables user config to still work via docker) - [PR #12871](https://github.com/BerriAI/litellm/pull/12871)
- **Swagger**
    - Bump DOMPurify version (fixes vulnerability) - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
    - Add back local swagger bundle (enables swagger to work in air-gapped environments) - [PR #12911](https://github.com/BerriAI/litellm/pull/12911)
- **Request Headers**
    - Make ‘user_header_name’ field check case insensitive (fixes customer budget enforcement for OpenWebUI) - [PR #12950](https://github.com/BerriAI/litellm/pull/12950)
- **SpendLogs**
    - Fix issues writing to DB when custom_llm_provider is None - [PR #13001](https://github.com/BerriAI/litellm/pull/13001)

---

## New Contributors

* @magicalne made their first contribution in https://github.com/BerriAI/litellm/pull/12804
* @pavangudiwada made their first contribution in https://github.com/BerriAI/litellm/pull/12798
* @mdiloreto made their first contribution in https://github.com/BerriAI/litellm/pull/12707
* @murad-khafizov made their first contribution in https://github.com/BerriAI/litellm/pull/12811
* @eagle-p made their first contribution in https://github.com/BerriAI/litellm/pull/12791
* @apoorv-sharma made their first contribution in https://github.com/BerriAI/litellm/pull/12920
* @SantoshDhaladhuli made their first contribution in https://github.com/BerriAI/litellm/pull/12949
* @tonga54 made their first contribution in https://github.com/BerriAI/litellm/pull/12941
* @sings-to-bees-on-wednesdays made their first contribution in https://github.com/BerriAI/litellm/pull/12950

## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.74.7-stable...v1.74.9.rc-draft)**

@@ -0,0 +1,299 @@
---
title: "v1.75.5-stable - Redis latency improvements"
slug: "v1-75-5"
date: 2025-08-10T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:v1.75.5.rc.1
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.75.5.post1
```

</TabItem>
</Tabs>

---

## Key Highlights

- **Redis - Latency Improvements** - Reduces P99 latency by 50% with Redis enabled.
- **Responses API Session Management** - Support for managing Responses API sessions with images.
- **Oracle Cloud Infrastructure** - New LLM provider for calling models on Oracle Cloud Infrastructure.
- **Digital Ocean's Gradient AI** - New LLM provider for calling models on Digital Ocean's Gradient AI platform.

### Risk of Upgrade

If you build the proxy from the pip package, you should hold off on upgrading. This version makes `prisma migrate deploy` our default for managing the DB. This is safer, as it doesn't reset the DB, but it requires a manual `prisma generate` step.

Users of our Docker image are **not** affected by this change.

---

## Redis Latency Improvements

<Image
    img={require('../../img/release_notes/faster_caching_calls.png')}
    style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br/>

This release adds in-memory caching in front of Redis, enabling faster response times in high-traffic environments. LiteLLM instances now check their in-memory cache for a hit before checking Redis, reducing caching-related latency for LLM API calls from ~100ms to sub-1ms on cache hits.

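Conceptually this is a two-tier read-through cache. A minimal Python sketch of the pattern (illustrative only - not LiteLLM's internal implementation):

```python showLineNumbers title="two_tier_cache_sketch.py"
# Minimal two-tier cache sketch: check a local in-memory dict before Redis.
import redis  # assumes a reachable Redis at localhost:6379

r = redis.Redis(host="localhost", port=6379)
local_cache: dict[str, bytes] = {}

def get(key: str) -> bytes | None:
    if key in local_cache:        # tier 1: sub-1ms in-memory hit
        return local_cache[key]
    value = r.get(key)            # tier 2: network round-trip to Redis
    if value is not None:
        local_cache[key] = value  # populate tier 1 for next time
    return value
```

A production version would also bound the in-memory tier with a TTL, so instances don't serve stale entries indefinitely.
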
---

## Responses API Session Management w/ Images

<Image
    img={require('../../img/release_notes/responses_api_session_mgt_images.jpg')}
    style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br/>

LiteLLM now supports session management for Responses API requests with images. This is great for use cases like chatbots that use the Responses API to track the state of a conversation. LiteLLM session management works across **ALL** LLM APIs (including Anthropic, Bedrock, OpenAI, etc.), and it works by storing the request and response content in an S3 bucket you can specify.

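A hedged usage sketch via the OpenAI SDK pointed at a LiteLLM proxy, chaining turns with `previous_response_id` (the standard Responses API session mechanism). The base URL, key, and model name are placeholders.

```python showLineNumbers title="responses_session_sketch.py"
# Hedged sketch: multi-turn Responses API session through a LiteLLM proxy.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

first = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",  # works across providers
    input="Here is a photo of my receipt. Total it for me.",
)

# Follow-up turn: the proxy reconstructs prior context (including images)
# from its session store when previous_response_id is passed.
followup = client.responses.create(
    model="anthropic/claude-sonnet-4-20250514",
    previous_response_id=first.id,
    input="Now convert that total to EUR.",
)
print(followup.output_text)
```
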
---

## New Models / Updated Models

#### New Model Support

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- |
| Bedrock | `bedrock/us.anthropic.claude-opus-4-1-20250805-v1:0` | 200k | $15 | $75 |
| Bedrock | `bedrock/openai.gpt-oss-20b-1:0` | 200k | $0.07 | $0.3 |
| Bedrock | `bedrock/openai.gpt-oss-120b-1:0` | 200k | $0.15 | $0.6 |
| Fireworks AI | `fireworks_ai/accounts/fireworks/models/glm-4p5` | 128k | $0.55 | $2.19 |
| Fireworks AI | `fireworks_ai/accounts/fireworks/models/glm-4p5-air` | 128k | $0.22 | $0.88 |
| Fireworks AI | `fireworks_ai/accounts/fireworks/models/gpt-oss-120b` | 131072 | $0.15 | $0.6 |
| Fireworks AI | `fireworks_ai/accounts/fireworks/models/gpt-oss-20b` | 131072 | $0.05 | $0.2 |
| Groq | `groq/openai/gpt-oss-20b` | 131072 | $0.1 | $0.5 |
| Groq | `groq/openai/gpt-oss-120b` | 131072 | $0.15 | $0.75 |
| OpenAI | `openai/gpt-5` | 400k | $1.25 | $10 |
| OpenAI | `openai/gpt-5-2025-08-07` | 400k | $1.25 | $10 |
| OpenAI | `openai/gpt-5-mini` | 400k | $0.25 | $2 |
| OpenAI | `openai/gpt-5-mini-2025-08-07` | 400k | $0.25 | $2 |
| OpenAI | `openai/gpt-5-nano` | 400k | $0.05 | $0.4 |
| OpenAI | `openai/gpt-5-nano-2025-08-07` | 400k | $0.05 | $0.4 |
| OpenAI | `openai/gpt-5-chat` | 400k | $1.25 | $10 |
| OpenAI | `openai/gpt-5-chat-latest` | 400k | $1.25 | $10 |
| Azure | `azure/gpt-5` | 400k | $1.25 | $10 |
| Azure | `azure/gpt-5-2025-08-07` | 400k | $1.25 | $10 |
| Azure | `azure/gpt-5-mini` | 400k | $0.25 | $2 |
| Azure | `azure/gpt-5-mini-2025-08-07` | 400k | $0.25 | $2 |
| Azure | `azure/gpt-5-nano-2025-08-07` | 400k | $0.05 | $0.4 |
| Azure | `azure/gpt-5-nano` | 400k | $0.05 | $0.4 |
| Azure | `azure/gpt-5-chat` | 400k | $1.25 | $10 |
| Azure | `azure/gpt-5-chat-latest` | 400k | $1.25 | $10 |

#### Features

- **[OCI](../../docs/providers/oci)**
    - New LLM provider - [PR #13206](https://github.com/BerriAI/litellm/pull/13206)
- **[JinaAI](../../docs/providers/jina_ai)**
    - Support multimodal embedding models - [PR #13181](https://github.com/BerriAI/litellm/pull/13181)
- **GPT-5 ([OpenAI](../../docs/providers/openai)/[Azure](../../docs/providers/azure))**
    - Support drop_params for temperature - [PR #13390](https://github.com/BerriAI/litellm/pull/13390)
    - Map max_tokens to max_completion_tokens - [PR #13390](https://github.com/BerriAI/litellm/pull/13390)
- **[Anthropic](../../docs/providers/anthropic)**
    - Add claude-opus-4-1 to the model cost map - [PR #13384](https://github.com/BerriAI/litellm/pull/13384)
- **[OpenRouter](../../docs/providers/openrouter)**
    - Add gpt-oss to the model cost map - [PR #13442](https://github.com/BerriAI/litellm/pull/13442)
- **[Cerebras](../../docs/providers/cerebras)**
    - Add gpt-oss to the model cost map - [PR #13442](https://github.com/BerriAI/litellm/pull/13442)
- **[Azure](../../docs/providers/azure)**
    - Support drop params for ‘temperature’ on o-series models - [PR #13353](https://github.com/BerriAI/litellm/pull/13353)
- **[GradientAI](../../docs/providers/gradient_ai)**
    - New LLM Provider - [PR #12169](https://github.com/BerriAI/litellm/pull/12169)

#### Bugs

- **[OpenAI](../../docs/providers/openai)**
    - Add ‘service_tier’ and ‘safety_identifier’ as supported Responses API params - [PR #13258](https://github.com/BerriAI/litellm/pull/13258)
    - Correct pricing for web search on 4o-mini - [PR #13269](https://github.com/BerriAI/litellm/pull/13269)
- **[Mistral](../../docs/providers/mistral)**
    - Handle $id and $schema fields when calling Mistral - [PR #13389](https://github.com/BerriAI/litellm/pull/13389)

---

## LLM API Endpoints

#### Features

- `/responses`
    - Responses API Session Handling w/ support for images - [PR #13347](https://github.com/BerriAI/litellm/pull/13347)
    - Fix failure when the input contains a ResponseReasoningItem - [PR #13465](https://github.com/BerriAI/litellm/pull/13465)
    - Support custom tools - [PR #13418](https://github.com/BerriAI/litellm/pull/13418)

#### Bugs

- `/chat/completions`
    - Fix completion_token_details usage object missing ‘text’ tokens - [PR #13234](https://github.com/BerriAI/litellm/pull/13234)
    - (SDK) Handle tool being a Pydantic object - [PR #13274](https://github.com/BerriAI/litellm/pull/13274)
    - Include cost in streaming usage object - [PR #13418](https://github.com/BerriAI/litellm/pull/13418)
    - Exclude none fields on /chat/completions (allows usage with n8n) - [PR #13320](https://github.com/BerriAI/litellm/pull/13320)
- `/responses`
    - Transform function calls in responses for non-OpenAI models (Gemini/Anthropic) - [PR #13260](https://github.com/BerriAI/litellm/pull/13260)
    - Fix unsupported operand error with model groups - [PR #13293](https://github.com/BerriAI/litellm/pull/13293)
    - Responses API session management for streaming responses - [PR #13396](https://github.com/BerriAI/litellm/pull/13396)
- `/v1/messages`
    - Added LiteLLM Claude Code count tokens support - [PR #13261](https://github.com/BerriAI/litellm/pull/13261)
- `/vector_stores`
    - Fix create/search vector store errors - [PR #13285](https://github.com/BerriAI/litellm/pull/13285)

---

## [MCP Gateway](../../docs/mcp)

#### Features

- Add route check for internal users - [PR #13350](https://github.com/BerriAI/litellm/pull/13350)
- Docs for MCP Guardrails - [PR #13392](https://github.com/BerriAI/litellm/pull/13392)

#### Bugs

- Fix auth on UI for bearer token servers - [PR #13312](https://github.com/BerriAI/litellm/pull/13312)
- Allow access groups on MCP tool retrieval - [PR #13425](https://github.com/BerriAI/litellm/pull/13425)

---

## Management Endpoints / UI

#### Features

- **Teams**
    - Add team deletion check for teams with keys - [PR #12953](https://github.com/BerriAI/litellm/pull/12953)
- **Models**
    - Add ability to set model alias per key/team - [PR #13276](https://github.com/BerriAI/litellm/pull/13276)
    - New button to reload model pricing from the model cost map - [PR #13464](https://github.com/BerriAI/litellm/pull/13464), [PR #13470](https://github.com/BerriAI/litellm/pull/13470)
- **Keys**
    - Make ‘team’ field required when creating service account keys - [PR #13302](https://github.com/BerriAI/litellm/pull/13302)
    - Gray out key-based logging settings for non-enterprise users - prevents confusion about whether logging overall is supported - [PR #13431](https://github.com/BerriAI/litellm/pull/13431)
- **Navbar**
    - Add logo customization for LiteLLM admin UI - [PR #12958](https://github.com/BerriAI/litellm/pull/12958)
- **Logs**
    - Add token breakdowns on logs + session page - [PR #13357](https://github.com/BerriAI/litellm/pull/13357)
- **Usage**
    - Ensure the Usage page loads when the DB has large entries - [PR #13400](https://github.com/BerriAI/litellm/pull/13400)
- **Test Key Page**
    - Allow uploading images for /chat/completions and /responses - [PR #13445](https://github.com/BerriAI/litellm/pull/13445)
- **MCP**
    - Add auth tokens to local storage auth - [PR #13473](https://github.com/BerriAI/litellm/pull/13473)

#### Bugs

- **Custom Root Path**
    - Fix login route when SSO is enabled - [PR #13267](https://github.com/BerriAI/litellm/pull/13267)
- **Customers/End-users**
    - Allow calling /v1/models when an end user is over budget (model listing on OpenWebUI keeps working when a customer is over budget) - [PR #13320](https://github.com/BerriAI/litellm/pull/13320)
- **Teams**
    - Remove user-team membership when a user is removed from a team - [PR #13433](https://github.com/BerriAI/litellm/pull/13433)
- **Errors**
    - Bubble up network errors to the user for the Logging and Alerts page - [PR #13427](https://github.com/BerriAI/litellm/pull/13427)
- **Model Hub**
    - Show pricing for Azure models when the base model is set - [PR #13418](https://github.com/BerriAI/litellm/pull/13418)

---

## Logging / Guardrail Integrations

#### Features

- **Bedrock Guardrails**
    - Redact sensitive information in Bedrock guardrails error messages - [PR #13356](https://github.com/BerriAI/litellm/pull/13356)
- **Standard Logging Payload**
    - Fix ‘can’t register atexit’ bug - [PR #13436](https://github.com/BerriAI/litellm/pull/13436)

#### Bugs

- **Braintrust**
    - Allow setting the Braintrust callback base URL - [PR #13368](https://github.com/BerriAI/litellm/pull/13368)
- **OTEL**
    - Track pre_call hook latency - [PR #13362](https://github.com/BerriAI/litellm/pull/13362)

---

## Performance / Loadbalancing / Reliability improvements

#### Features

- **Team-BYOK models**
    - Add wildcard model support - [PR #13278](https://github.com/BerriAI/litellm/pull/13278)
- **Caching**
    - GCP IAM auth support for caching - [PR #13275](https://github.com/BerriAI/litellm/pull/13275)
- **Latency**
    - Reduce p99 latency by 50% with Redis enabled - model usage is now only updated when TPM/RPM limits are set - [PR #13362](https://github.com/BerriAI/litellm/pull/13362)

---

## General Proxy Improvements

#### Features

- **Models**
    - Support /v1/models/\{model_id\} retrieval (see the sketch after this list) - [PR #13268](https://github.com/BerriAI/litellm/pull/13268)
- **Multi-instance**
    - Ensure disable_llm_api_endpoints works - [PR #13278](https://github.com/BerriAI/litellm/pull/13278)
- **Logs**
    - Add apscheduler log suppression - [PR #13299](https://github.com/BerriAI/litellm/pull/13299)
- **Helm**
    - Add labels to migrations job template - [PR #13343](https://github.com/BerriAI/litellm/pull/13343) s/o [@unique-jakub](https://github.com/unique-jakub)

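A minimal sketch of the new model retrieval route (OpenAI-compatible `GET /v1/models/{model_id}`; base URL, key, and model ID are placeholders):

```python showLineNumbers title="get_model_by_id_sketch.py"
# Minimal sketch: retrieve a single model via the OpenAI-compatible route.
import requests

resp = requests.get(
    "http://localhost:4000/v1/models/gpt-4o-mini",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder virtual key
)
print(resp.json())
```
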
#### Bugs

- **Non-root image**
    - Fix non-root image for migration - [PR #13379](https://github.com/BerriAI/litellm/pull/13379)
- **Get Routes**
    - Load GET routes when using fastapi-offline - [PR #13466](https://github.com/BerriAI/litellm/pull/13466)
- **Health checks**
    - Generate unique trace IDs for Langfuse health checks - [PR #13468](https://github.com/BerriAI/litellm/pull/13468)
- **Swagger**
    - Allow using Swagger for /chat/completions - [PR #13469](https://github.com/BerriAI/litellm/pull/13469)
- **Auth**
    - Fix JWT access not working with model access groups - [PR #13474](https://github.com/BerriAI/litellm/pull/13474)

---

## New Contributors

* @bbartels made their first contribution in https://github.com/BerriAI/litellm/pull/13244
* @breno-aumo made their first contribution in https://github.com/BerriAI/litellm/pull/13206
* @pascalwhoop made their first contribution in https://github.com/BerriAI/litellm/pull/13122
* @ZPerling made their first contribution in https://github.com/BerriAI/litellm/pull/13045
* @zjx20 made their first contribution in https://github.com/BerriAI/litellm/pull/13181
* @edwarddamato made their first contribution in https://github.com/BerriAI/litellm/pull/13368
* @msannan2 made their first contribution in https://github.com/BerriAI/litellm/pull/12169

## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.74.15-stable...v1.75.5-stable.rc-draft)**

@@ -0,0 +1,231 @@
---
title: "[PRE-RELEASE] v1.75.8"
slug: "v1-75-8"
date: 2025-08-16T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg

hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

``` showLineNumbers title="docker run litellm"
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:v1.75.8
```

</TabItem>

<TabItem value="pip" label="Pip">

``` showLineNumbers title="pip install litellm"
pip install litellm==1.75.8
```

</TabItem>
</Tabs>

---

## Key Highlights

- **Team Member Rate Limits** - Individual rate limiting for team members with JWT authentication support.
- **Performance Improvements** - New experimental HTTP handler flag for a 100+ RPS improvement on OpenAI calls.
- **GPT-5 Model Family Support** - Full support for OpenAI's GPT-5 models with the `reasoning_effort` parameter and Azure OpenAI integration.
- **Azure AI Flux Image Generation** - Support for Azure AI's Flux image generation models.

---

## New Models / Updated Models

#### New Model Support

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
| ----------- | -------------------------------------- | -------------- | ------------------- | -------------------- | -------- |
| Azure AI | `azure_ai/FLUX-1.1-pro` | - | - | $40/image | Image generation |
| Azure AI | `azure_ai/FLUX.1-Kontext-pro` | - | - | $40/image | Image generation |
| Vertex AI | `vertex_ai/deepseek-ai/deepseek-r1-0528-maas` | 65k | $1.35 | $5.4 | Chat completions + reasoning |
| OpenRouter | `openrouter/deepseek/deepseek-chat-v3-0324` | 65k | $0.14 | $0.28 | Chat completions |

#### Features

- **[OpenAI](../../docs/providers/openai)**
    - Added `reasoning_effort` parameter support for the GPT-5 model family (see the sketch after this list) - [PR #13475](https://github.com/BerriAI/litellm/pull/13475), [Get Started](../../docs/providers/openai#openai-chat-completion-models)
    - Support for `reasoning` parameter in Responses API - [PR #13475](https://github.com/BerriAI/litellm/pull/13475), [Get Started](../../docs/response_api)
- **[Azure OpenAI](../../docs/providers/azure/azure)**
    - GPT-5 support with max_tokens and `reasoning` parameter - [PR #13510](https://github.com/BerriAI/litellm/pull/13510), [Get Started](../../docs/providers/azure/azure#gpt-5-models)
- **[AWS Bedrock](../../docs/providers/bedrock)**
    - Streaming support for the Bedrock gpt-oss model family - [PR #13346](https://github.com/BerriAI/litellm/pull/13346), [Get Started](../../docs/providers/bedrock#openai-gpt-oss)
    - `/messages` endpoint compatibility with `bedrock/converse/<model>` - [PR #13627](https://github.com/BerriAI/litellm/pull/13627)
    - Cache point support for assistant and tool messages - [PR #13640](https://github.com/BerriAI/litellm/pull/13640)
- **[Azure AI](../../docs/providers/azure)**
    - New Azure AI Flux Image Generation provider - [PR #13592](https://github.com/BerriAI/litellm/pull/13592), [Get Started](../../docs/providers/azure_ai_img)
    - Fixed Content-Type header for image generation - [PR #13584](https://github.com/BerriAI/litellm/pull/13584)
- **[CometAPI](../../docs/providers/comet)**
    - New provider support with chat completions and streaming - [PR #13458](https://github.com/BerriAI/litellm/pull/13458)
- **[SambaNova](../../docs/providers/sambanova)**
    - Added embedding model support - [PR #13308](https://github.com/BerriAI/litellm/pull/13308), [Get Started](../../docs/providers/sambanova#sambanova---embeddings)
- **[Vertex AI](../../docs/providers/vertex)**
    - Added `/countTokens` endpoint support for Gemini CLI integration - [PR #13545](https://github.com/BerriAI/litellm/pull/13545)
    - Token counter support for VertexAI models - [PR #13558](https://github.com/BerriAI/litellm/pull/13558)
- **[hosted_vllm](../../docs/providers/vllm)**
    - Added `reasoning_effort` parameter support - [PR #13620](https://github.com/BerriAI/litellm/pull/13620), [Get Started](../../docs/providers/vllm#reasoning-effort)

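A hedged sketch of the new parameter via the LiteLLM SDK - `reasoning_effort` is named in the notes above, but the `"minimal"` value shown is an assumption, so check the linked Get Started pages for accepted values:

```python showLineNumbers title="gpt5_reasoning_effort_sketch.py"
# Hedged sketch: pass reasoning_effort to a GPT-5 model through the SDK.
import os
import litellm

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder

resp = litellm.completion(
    model="openai/gpt-5",
    messages=[{"role": "user", "content": "Summarize this release in one line."}],
    reasoning_effort="minimal",  # assumed value - see the Get Started page
)
print(resp.choices[0].message.content)
```
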
#### Bugs

- **[OCI](../../docs/providers/oci)**
    - Fixed streaming issues - [PR #13437](https://github.com/BerriAI/litellm/pull/13437)
- **[Ollama](../../docs/providers/ollama)**
    - Fixed GPT-OSS streaming with 'thinking' field - [PR #13375](https://github.com/BerriAI/litellm/pull/13375)
- **[VolcEngine](../../docs/providers/volcengine)**
    - Fixed thinking-disabled parameter handling - [PR #13598](https://github.com/BerriAI/litellm/pull/13598)
- **[Streaming](../../docs/completion/stream)**
    - Consistent 'finish_reason' chunk indexing - [PR #13560](https://github.com/BerriAI/litellm/pull/13560)

---

## LLM API Endpoints

#### Features

- **[/messages](../../docs/anthropic/messages)**
    - Tool use arguments properly returned for non-Anthropic models - [PR #13638](https://github.com/BerriAI/litellm/pull/13638)

#### Bugs

- **[Real-time API](../../docs/realtime)**
    - Fixed endpoint for no-intent scenarios - [PR #13476](https://github.com/BerriAI/litellm/pull/13476)
- **[Responses API](../../docs/response_api)**
    - Fixed `stream=True` + `background=True` with Responses API - [PR #13654](https://github.com/BerriAI/litellm/pull/13654)

---

## [MCP Gateway](../../docs/mcp)

#### Features

- **Access Control & Configuration**
    - Enhanced MCPServerManager with access groups and description support - [PR #13549](https://github.com/BerriAI/litellm/pull/13549)

#### Bugs

- **Authentication**
    - Fixed MCP gateway key authentication - [PR #13630](https://github.com/BerriAI/litellm/pull/13630)

[Read More](../../docs/mcp)

---

## Management Endpoints / UI

#### Features

- **Team Management**
    - Team Member Rate Limits implementation - [PR #13601](https://github.com/BerriAI/litellm/pull/13601)
    - JWT authentication support for team member rate limits - [PR #13601](https://github.com/BerriAI/litellm/pull/13601)
    - Show team member TPM/RPM limits in UI - [PR #13662](https://github.com/BerriAI/litellm/pull/13662)
    - Allow editing team member RPM/TPM limits - [PR #13669](https://github.com/BerriAI/litellm/pull/13669)
    - Allow unsetting TPM and RPM in Teams Settings - [PR #13430](https://github.com/BerriAI/litellm/pull/13430)
    - Team Member Permissions Page access column changes - [PR #13145](https://github.com/BerriAI/litellm/pull/13145)
- **Key Management**
    - Display errors from backend on the UI Keys page - [PR #13435](https://github.com/BerriAI/litellm/pull/13435)
    - Added confirmation modal before deleting keys - [PR #13655](https://github.com/BerriAI/litellm/pull/13655)
    - Support for `user` parameter in LiteLLM SDK to Proxy communication - [PR #13555](https://github.com/BerriAI/litellm/pull/13555)
- **UI Improvements**
    - Fixed internal users table overflow - [PR #12736](https://github.com/BerriAI/litellm/pull/12736)
    - Enhanced chart readability with short-form notation for large numbers - [PR #12370](https://github.com/BerriAI/litellm/pull/12370)
    - Fixed image overflow in LiteLLM model display - [PR #13639](https://github.com/BerriAI/litellm/pull/13639)
    - Removed ambiguous network response errors - [PR #13582](https://github.com/BerriAI/litellm/pull/13582)
- **Credentials**
    - Added CredentialDeleteModal component and integration with CredentialsPanel - [PR #13550](https://github.com/BerriAI/litellm/pull/13550)
- **Admin & Permissions**
    - Allow routes for admin viewer - [PR #13588](https://github.com/BerriAI/litellm/pull/13588)

#### Bugs

- **SCIM Integration**
    - Fixed SCIM Team Memberships metadata handling - [PR #13553](https://github.com/BerriAI/litellm/pull/13553)
- **Authentication**
    - Fixed incorrect key info endpoint - [PR #13633](https://github.com/BerriAI/litellm/pull/13633)

---

## Logging / Guardrail Integrations

#### Features

- **[Langfuse OTEL](../../docs/proxy/logging#langfuse)**
    - Added key/team logging for Langfuse OTEL Logger - [PR #13512](https://github.com/BerriAI/litellm/pull/13512)
    - Fixed LangfuseOtelSpanAttributes constants to match expected values - [PR #13659](https://github.com/BerriAI/litellm/pull/13659)
- **[MLflow](../../docs/proxy/logging#mlflow)**
    - Updated MLflow logger usage span attributes - [PR #13561](https://github.com/BerriAI/litellm/pull/13561)

#### Bugs

- **Security**
    - Hide sensitive data in `/model/info` - Azure Entra client_secret - [PR #13577](https://github.com/BerriAI/litellm/pull/13577)
    - Fixed trivy/secrets false positives - [PR #13631](https://github.com/BerriAI/litellm/pull/13631)

---

## Performance / Loadbalancing / Reliability improvements

#### Features

- **HTTP Performance**
    - New 'EXPERIMENTAL_OPENAI_BASE_LLM_HTTP_HANDLER' flag for a +100 RPS improvement on OpenAI calls (see the sketch after this list) - [PR #13625](https://github.com/BerriAI/litellm/pull/13625)
- **Database Monitoring**
    - Added DB metrics to Prometheus - [PR #13626](https://github.com/BerriAI/litellm/pull/13626)
- **Error Handling**
    - Added safe divide-by-0 protection to prevent crashes - [PR #13624](https://github.com/BerriAI/litellm/pull/13624)

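A hedged sketch of opting into the experimental handler - the flag name comes from the notes above, but that it is toggled via an environment variable set to `"True"` is an assumption; see [PR #13625](https://github.com/BerriAI/litellm/pull/13625):

```python showLineNumbers title="experimental_http_handler_sketch.py"
# Hedged sketch: enable the experimental OpenAI HTTP handler.
# That this is an env var toggled with "True" is an assumption; see PR #13625.
import os

os.environ["EXPERIMENTAL_OPENAI_BASE_LLM_HTTP_HANDLER"] = "True"

import litellm  # imported after setting the flag so it is picked up

resp = litellm.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```
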
#### Bugs

- **Dependencies**
    - Updated boto3 to 1.36.0 and aioboto3 to 13.4.0 - [PR #13665](https://github.com/BerriAI/litellm/pull/13665)

---

## General Proxy Improvements

#### Features

- **Database**
    - Removed redundant `use_prisma_migrate` flag - now the default - [PR #13555](https://github.com/BerriAI/litellm/pull/13555)
- **LLM Translation**
    - Added model ID check - [PR #13507](https://github.com/BerriAI/litellm/pull/13507)
    - Refactored Anthropic configurations and added support for `anthropic_beta` headers - [PR #13590](https://github.com/BerriAI/litellm/pull/13590)

---

## New Contributors

* @TensorNull made their first contribution in [PR #13458](https://github.com/BerriAI/litellm/pull/13458)
* @MajorD00m made their first contribution in [PR #13577](https://github.com/BerriAI/litellm/pull/13577)
* @VerunicaM made their first contribution in [PR #13584](https://github.com/BerriAI/litellm/pull/13584)
* @huangyafei made their first contribution in [PR #13607](https://github.com/BerriAI/litellm/pull/13607)
* @TomeHirata made their first contribution in [PR #13561](https://github.com/BerriAI/litellm/pull/13561)
* @willfinnigan made their first contribution in [PR #13659](https://github.com/BerriAI/litellm/pull/13659)
* @dcbark01 made their first contribution in [PR #13633](https://github.com/BerriAI/litellm/pull/13633)
* @javacruft made their first contribution in [PR #13631](https://github.com/BerriAI/litellm/pull/13631)

---

## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.75.5-stable.rc-draft...v1.75.8-nightly)**