---
title: "v1.75.5-stable - Redis latency improvements"
slug: "v1-75-5"
date: 2025-08-10T10:00:00
authors:
- name: Krrish Dholakia
title: CEO, LiteLLM
url: https://www.linkedin.com/in/krish-d/
image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
- name: Ishaan Jaffer
title: CTO, LiteLLM
url: https://www.linkedin.com/in/reffajnaahsi/
image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
hide_table_of_contents: false
---
import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Deploy this version
```shell showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.75.5.rc.1
```
```shell showLineNumbers title="pip install litellm"
pip install litellm==1.75.5.post1
```
---
## Key Highlights
- **Redis - Latency Improvements** - Reduces P99 latency by 50% with Redis enabled.
- **Responses API Session Management** - Support for managing Responses API sessions that include images.
- **Oracle Cloud Infrastructure** - New LLM provider for calling models on Oracle Cloud Infrastructure.
- **DigitalOcean Gradient AI** - New LLM provider for calling models on DigitalOcean's Gradient AI platform.
### Risk of Upgrade
If you build the proxy from the pip package, you should hold off on upgrading. This version makes `prisma migrate deploy` our default for managing the DB. This is safer, as it doesn't reset the DB, but it requires a manual `prisma generate` step (see the sketch below).
Users of our Docker image are **not** affected by this change.
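If you do upgrade a pip-based deployment, the flow looks roughly like this - a minimal sketch, assuming the `prisma` CLI is available and `DATABASE_URL` points at your LiteLLM database:
```shell showLineNumbers title="pip upgrade with manual prisma step (sketch)"
pip install litellm==1.75.5.post1

# Manual step: regenerate the Prisma client before starting the proxy.
# Run from the directory containing your schema.prisma, or pass --schema.
prisma generate

# Start the proxy as usual - migrations are now applied via
# `prisma migrate deploy`, which does not reset the DB.
litellm --config config.yaml
```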
---
## Redis Latency Improvements
This release adds in-memory caching for Redis requests, enabling faster response times under high traffic. LiteLLM instances now check their in-memory cache for a hit before checking Redis. On cache hits, this reduces caching-related latency for LLM API calls from roughly 100ms to sub-1ms.
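The in-memory layer sits in front of your existing Redis cache automatically. For reference, a minimal sketch of running the proxy with Redis enabled via environment variables (the host and password values here are placeholders):
```shell showLineNumbers title="proxy with Redis enabled (sketch)"
docker run \
-e STORE_MODEL_IN_DB=True \
-e REDIS_HOST=my-redis.internal \
-e REDIS_PORT=6379 \
-e REDIS_PASSWORD=my-redis-password \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.75.5.rc.1
```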
---
## Responses API Session Management w/ Images
LiteLLM now supports session management for Responses API requests that include images. This is great for use cases like chatbots that use the Responses API to track conversation state. LiteLLM session management works across **ALL** LLM APIs (including Anthropic, Bedrock, OpenAI, etc.), and works by storing request and response content in an S3 bucket you specify.
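A minimal sketch of a follow-up turn in a session, assuming a proxy on localhost:4000, a virtual key `sk-1234`, and a response ID returned by an earlier image request (all placeholders):
```shell showLineNumbers title="Responses API session follow-up (sketch)"
curl http://localhost:4000/v1/responses \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "What color is the cat in the image I sent earlier?",
    "previous_response_id": "resp_abc123"
  }'
```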
---
## New Models / Updated Models
#### New Model Support
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
| ------------ | ----------------------------------------------------- | -------------- | ------------------- | -------------------- |
| Bedrock | `bedrock/us.anthropic.claude-opus-4-1-20250805-v1:0` | 200k | $15.00 | $75.00 |
| Bedrock | `bedrock/openai.gpt-oss-20b-1:0` | 200k | $0.07 | $0.30 |
| Bedrock | `bedrock/openai.gpt-oss-120b-1:0` | 200k | $0.15 | $0.60 |
| Fireworks AI | `fireworks_ai/accounts/fireworks/models/glm-4p5` | 128k | $0.55 | $2.19 |
| Fireworks AI | `fireworks_ai/accounts/fireworks/models/glm-4p5-air` | 128k | $0.22 | $0.88 |
| Fireworks AI | `fireworks_ai/accounts/fireworks/models/gpt-oss-120b` | 131k | $0.15 | $0.60 |
| Fireworks AI | `fireworks_ai/accounts/fireworks/models/gpt-oss-20b` | 131k | $0.05 | $0.20 |
| Groq | `groq/openai/gpt-oss-20b` | 131k | $0.10 | $0.50 |
| Groq | `groq/openai/gpt-oss-120b` | 131k | $0.15 | $0.75 |
| OpenAI | `openai/gpt-5` | 400k | $1.25 | $10.00 |
| OpenAI | `openai/gpt-5-2025-08-07` | 400k | $1.25 | $10.00 |
| OpenAI | `openai/gpt-5-mini` | 400k | $0.25 | $2.00 |
| OpenAI | `openai/gpt-5-mini-2025-08-07` | 400k | $0.25 | $2.00 |
| OpenAI | `openai/gpt-5-nano` | 400k | $0.05 | $0.40 |
| OpenAI | `openai/gpt-5-nano-2025-08-07` | 400k | $0.05 | $0.40 |
| OpenAI | `openai/gpt-5-chat` | 400k | $1.25 | $10.00 |
| OpenAI | `openai/gpt-5-chat-latest` | 400k | $1.25 | $10.00 |
| Azure | `azure/gpt-5` | 400k | $1.25 | $10.00 |
| Azure | `azure/gpt-5-2025-08-07` | 400k | $1.25 | $10.00 |
| Azure | `azure/gpt-5-mini` | 400k | $0.25 | $2.00 |
| Azure | `azure/gpt-5-mini-2025-08-07` | 400k | $0.25 | $2.00 |
| Azure | `azure/gpt-5-nano-2025-08-07` | 400k | $0.05 | $0.40 |
| Azure | `azure/gpt-5-nano` | 400k | $0.05 | $0.40 |
| Azure | `azure/gpt-5-chat` | 400k | $1.25 | $10.00 |
| Azure | `azure/gpt-5-chat-latest` | 400k | $1.25 | $10.00 |
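As a usage reference, a minimal sketch of calling one of the new models through the proxy, assuming it's running on localhost:4000 with `gpt-5` added to your config and a virtual key `sk-1234` (both placeholders):
```shell showLineNumbers title="call gpt-5 via the proxy (sketch)"
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```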
#### Features
- **[OCI](../../docs/providers/oci)**
- New LLM provider - [PR #13206](https://github.com/BerriAI/litellm/pull/13206)
- **[JinaAI](../../docs/providers/jina_ai)**
- support multimodal embedding models - [PR #13181](https://github.com/BerriAI/litellm/pull/13181)
- **GPT-5 ([OpenAI](../../docs/providers/openai)/[Azure](../../docs/providers/azure))**
- Support `drop_params` for `temperature` - [PR #13390](https://github.com/BerriAI/litellm/pull/13390)
- Map `max_tokens` to `max_completion_tokens` - [PR #13390](https://github.com/BerriAI/litellm/pull/13390)
- **[Anthropic](../../docs/providers/anthropic)**
- Add claude-opus-4-1 on model cost map - [PR #13384](https://github.com/BerriAI/litellm/pull/13384)
- **[OpenRouter](../../docs/providers/openrouter)**
- Add gpt-oss to model cost map - [PR #13442](https://github.com/BerriAI/litellm/pull/13442)
- **[Cerebras](../../docs/providers/cerebras)**
- Add gpt-oss to model cost map - [PR #13442](https://github.com/BerriAI/litellm/pull/13442)
- **[Azure](../../docs/providers/azure)**
- Support `drop_params` for `temperature` on o-series models - [PR #13353](https://github.com/BerriAI/litellm/pull/13353)
- **[GradientAI](../../docs/providers/gradient_ai)**
- New LLM Provider - [PR #12169](https://github.com/BerriAI/litellm/pull/12169)
#### Bugs
- **[OpenAI](../../docs/providers/openai)**
- Add `service_tier` and `safety_identifier` as supported Responses API params - [PR #13258](https://github.com/BerriAI/litellm/pull/13258)
- Correct pricing for web search on 4o-mini - [PR #13269](https://github.com/BerriAI/litellm/pull/13269)
- **[Mistral](../../docs/providers/mistral)**
- Handle `$id` and `$schema` fields when calling Mistral - [PR #13389](https://github.com/BerriAI/litellm/pull/13389)
---
## LLM API Endpoints
#### Features
- `/responses`
- Responses API Session Handling w/ support for images - [PR #13347](https://github.com/BerriAI/litellm/pull/13347)
- Fix failure when the input contains a `ResponseReasoningItem` - [PR #13465](https://github.com/BerriAI/litellm/pull/13465)
- Support custom tools - [PR #13418](https://github.com/BerriAI/litellm/pull/13418)
#### Bugs
- `/chat/completions`
- Fix `completion_token_details` usage object missing `text` tokens - [PR #13234](https://github.com/BerriAI/litellm/pull/13234)
- (SDK) Handle tool being a Pydantic object - [PR #13274](https://github.com/BerriAI/litellm/pull/13274)
- Include cost in the streaming usage object - [PR #13418](https://github.com/BerriAI/litellm/pull/13418)
- Exclude `None` fields on `/chat/completions` - allows usage with n8n - [PR #13320](https://github.com/BerriAI/litellm/pull/13320)
- `/responses`
- Transform function call in response for non-openai models (gemini/anthropic) - [PR #13260](https://github.com/BerriAI/litellm/pull/13260)
- Fix unsupported operand error with model groups - [PR #13293](https://github.com/BerriAI/litellm/pull/13293)
- Responses API session management for streaming responses - [PR #13396](https://github.com/BerriAI/litellm/pull/13396)
- `/v1/messages`
- Add Claude Code count tokens support - [PR #13261](https://github.com/BerriAI/litellm/pull/13261)
- `/vector_stores`
- Fix create/search vector store errors - [PR #13285](https://github.com/BerriAI/litellm/pull/13285)
---
## [MCP Gateway](../../docs/mcp)
#### Features
- Add route check for internal users - [PR #13350](https://github.com/BerriAI/litellm/pull/13350)
- MCP Guardrails - docs - [PR #13392](https://github.com/BerriAI/litellm/pull/13392)
#### Bugs
- Fix auth on UI for bearer token servers - [PR #13312](https://github.com/BerriAI/litellm/pull/13312)
- Allow access groups on MCP tool retrieval - [PR #13425](https://github.com/BerriAI/litellm/pull/13425)
---
## Management Endpoints / UI
#### Features
- **Teams**
- Add team deletion check for teams with keys - [PR #12953](https://github.com/BerriAI/litellm/pull/12953)
- **Models**
- Add ability to set model alias per key/team - [PR #13276](https://github.com/BerriAI/litellm/pull/13276)
- New button to reload model pricing from model cost map - [PR #13464](https://github.com/BerriAI/litellm/pull/13464), [PR #13470](https://github.com/BerriAI/litellm/pull/13470)
- **Keys**
- Make `team` field required when creating service account keys - [PR #13302](https://github.com/BerriAI/litellm/pull/13302)
- Gray out key-based logging settings for non-enterprise users - prevents confusion about whether logging overall is supported - [PR #13431](https://github.com/BerriAI/litellm/pull/13431)
- **Navbar**
- Add logo customization for LiteLLM admin UI - [PR #12958](https://github.com/BerriAI/litellm/pull/12958)
- **Logs**
- Add token breakdowns on logs + session page - [PR #13357](https://github.com/BerriAI/litellm/pull/13357)
- **Usage**
- Ensure Usage Page loads after the DB has large entries - [PR #13400](https://github.com/BerriAI/litellm/pull/13400)
- **Test Key Page**
- Allow uploading images for `/chat/completions` and `/responses` - [PR #13445](https://github.com/BerriAI/litellm/pull/13445)
- **MCP**
- Add auth tokens to local storage auth - [PR #13473](https://github.com/BerriAI/litellm/pull/13473)
#### Bugs
- **Custom Root Path**
- Fix login route when SSO is enabled - [PR #13267](https://github.com/BerriAI/litellm/pull/13267)
- **Customers/End-users**
- Allow calling `/v1/models` when an end user is over budget - lets model listing work in OpenWebUI when the customer is over budget - [PR #13320](https://github.com/BerriAI/litellm/pull/13320)
- **Teams**
- Remove the user's team membership when the user is removed from the team - [PR #13433](https://github.com/BerriAI/litellm/pull/13433)
- **Errors**
- Bubble up network errors to user for Logging and Alerts page - [PR #13427](https://github.com/BerriAI/litellm/pull/13427)
- **Model Hub**
- Show pricing for Azure models when the base model is set - [PR #13418](https://github.com/BerriAI/litellm/pull/13418)
---
## Logging / Guardrail Integrations
#### Features
- **Bedrock Guardrails**
- Redact sensitive information in Bedrock Guardrails error messages - [PR #13356](https://github.com/BerriAI/litellm/pull/13356)
- **Standard Logging Payload**
- Fix `can't register atexit` bug - [PR #13436](https://github.com/BerriAI/litellm/pull/13436)
#### Bugs
- **Braintrust**
- Allow setting the Braintrust callback base URL - [PR #13368](https://github.com/BerriAI/litellm/pull/13368)
- **OTEL**
- Track pre_call hook latency - [PR #13362](https://github.com/BerriAI/litellm/pull/13362)
---
## Performance / Loadbalancing / Reliability improvements
#### Features
- **Team-BYOK models**
- Add wildcard model support - [PR #13278](https://github.com/BerriAI/litellm/pull/13278)
- **Caching**
- GCP IAM auth support for caching - [PR #13275](https://github.com/BerriAI/litellm/pull/13275)
- **Latency**
- Reduce P99 latency by 50% with Redis enabled - model usage is now only updated when TPM/RPM limits are set - [PR #13362](https://github.com/BerriAI/litellm/pull/13362)
---
## General Proxy Improvements
#### Features
- **Models**
- Support /v1/models/\{model_id\} retrieval - [PR #13268](https://github.com/BerriAI/litellm/pull/13268)
- **Multi-instance**
- Ensure disable_llm_api_endpoints works - [PR #13278](https://github.com/BerriAI/litellm/pull/13278)
- **Logs**
- Add APScheduler log suppression - [PR #13299](https://github.com/BerriAI/litellm/pull/13299)
- **Helm**
- Add labels to migrations job template - [PR #13343](https://github.com/BerriAI/litellm/pull/13343) s/o [@unique-jakub](https://github.com/unique-jakub)
#### Bugs
- **Non-root image**
- Fix non-root image for migration - [PR #13379](https://github.com/BerriAI/litellm/pull/13379)
- **Get Routes**
- Load GET routes when using fastapi-offline - [PR #13466](https://github.com/BerriAI/litellm/pull/13466)
- **Health checks**
- Generate unique trace IDs for Langfuse health checks - [PR #13468](https://github.com/BerriAI/litellm/pull/13468)
- **Swagger**
- Allow using Swagger for `/chat/completions` - [PR #13469](https://github.com/BerriAI/litellm/pull/13469)
- **Auth**
- Fix JWT access not working with model access groups - [PR #13474](https://github.com/BerriAI/litellm/pull/13474)
---
## New Contributors
* @bbartels made their first contribution in https://github.com/BerriAI/litellm/pull/13244
* @breno-aumo made their first contribution in https://github.com/BerriAI/litellm/pull/13206
* @pascalwhoop made their first contribution in https://github.com/BerriAI/litellm/pull/13122
* @ZPerling made their first contribution in https://github.com/BerriAI/litellm/pull/13045
* @zjx20 made their first contribution in https://github.com/BerriAI/litellm/pull/13181
* @edwarddamato made their first contribution in https://github.com/BerriAI/litellm/pull/13368
* @msannan2 made their first contribution in https://github.com/BerriAI/litellm/pull/12169
## **[Full Changelog](https://github.com/BerriAI/litellm/compare/v1.74.15-stable...v1.75.5-stable.rc-draft)**