Added LiteLLM to the stack
# 💥 LiteLLM Proxy Server
LiteLLM Server manages:
- Calling 100+ LLMs (Huggingface/Bedrock/TogetherAI/etc.) in the OpenAI ChatCompletions & Completions format
- Setting custom prompt templates + model-specific configs (temperature, max_tokens, etc.; see the sketch after this list)
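As a sketch of what a model-specific config looks like in practice, the proxy CLI accepts per-model defaults as flags; the exact flag names below are assumptions, so check `litellm --help` for the current options:

```shell
# hypothetical invocation: pin a default temperature and max_tokens
# for every request the proxy forwards to this model
$ litellm --model huggingface/bigcode/starcoder --temperature 0.2 --max_tokens 256
```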
## Quick Start
View all the supported args for the Proxy CLI here
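If the docs aren't handy, the CLI's built-in help prints the same list; `--help` is a standard flag on the LiteLLM CLI:

```shell
# show all supported proxy CLI arguments
litellm --help
```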
```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:8000
```
## Test
In a new shell, run the following; it makes an openai.ChatCompletion request to the proxy:

```shell
litellm --test
```
This will automatically route any requests for gpt-3.5-turbo to bigcode/starcoder, hosted on Huggingface Inference Endpoints.
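To see the same routing from outside the OpenAI SDK, a plain HTTP request against the proxy's `/chat/completions` endpoint behaves identically; a minimal sketch, assuming the proxy is still running on port 8000:

```shell
# the model name in the body is remapped by the proxy to the model
# it was started with (here: huggingface/bigcode/starcoder)
curl http://0.0.0.0:8000/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hey!"}]}'
```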
## Replace openai base
```python
import openai

# point the OpenAI SDK at the local LiteLLM proxy
openai.base_url = "http://0.0.0.0:8000"
openai.api_key = "anything"  # placeholder; the local proxy does not require a real key

print(openai.chat.completions.create(model="test", messages=[{"role": "user", "content": "Hey!"}]))
```
## Supported LLMs
- Bedrock
- Huggingface (TGI)
- Anthropic
- VLLM
- OpenAI Compatible Server
- TogetherAI
- Replicate
- Petals
- Palm
- Azure OpenAI
- AI21
- Cohere
For example, to run the proxy against a Bedrock model, export your AWS credentials and pass the Bedrock model string:

```shell
$ export AWS_ACCESS_KEY_ID=""
$ export AWS_REGION_NAME="" # e.g. us-west-2
$ export AWS_SECRET_ACCESS_KEY=""

$ litellm --model bedrock/anthropic.claude-v2
```
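Other providers follow the same export-credentials-then-point-the-CLI pattern; as an assumed example for Anthropic (the env var name and model string follow LiteLLM's conventions but are not shown in this snippet's source):

```shell
# assumed Anthropic setup; verify the variable name in the LiteLLM docs
$ export ANTHROPIC_API_KEY=""

$ litellm --model claude-instant-1
```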
## Server Endpoints
- POST /chat/completions - chat completions endpoint to call 100+ LLMs
- POST /completions - completions endpoint
- POST /embeddings - embedding endpoint for Azure, OpenAI, Huggingface endpoints
- GET /models - available models on server
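As a quick sanity check once the proxy is up, the read-only endpoint above can be hit directly; a minimal sketch, assuming the default host and port from the Quick Start:

```shell
# list the models the running proxy exposes
curl http://0.0.0.0:8000/models
```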