Vertex AI - Anthropic, DeepSeek, Model Garden
Supported Partner Providers
Provider | LiteLLM Route | Vertex Documentation |
---|---|---|
Anthropic (Claude) | vertex_ai/claude-* | Vertex AI - Anthropic Models |
DeepSeek | vertex_ai/deepseek-ai/{MODEL} | Vertex AI - DeepSeek Models |
Meta/Llama | vertex_ai/meta/{MODEL} | Vertex AI - Meta Models |
Mistral | vertex_ai/mistral-* | Vertex AI - Mistral Models |
AI21 (Jamba) | vertex_ai/jamba-* | Vertex AI - AI21 Models |
Model Garden | vertex_ai/openai/{MODEL_ID} or vertex_ai/{MODEL_ID} | Vertex Model Garden |
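The route prefix is the only thing that changes between providers. A minimal sketch of the call pattern (model IDs are illustrative and credentials are assumed to be configured, e.g. via GOOGLE_APPLICATION_CREDENTIALS):
from litellm import completion

messages = [{"role": "user", "content": "hi"}]

# Anthropic (Claude) on Vertex
completion(model="vertex_ai/claude-3-5-sonnet@20240620", messages=messages)
# DeepSeek on Vertex
completion(model="vertex_ai/deepseek-ai/deepseek-r1-0528-maas", messages=messages)
# Model Garden, OpenAI-compatible endpoint
completion(model="vertex_ai/openai/<your-endpoint-id>", messages=messages)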
Vertex AI - Anthropic (Claude)
Model Name | Function Call |
---|---|
claude-3-opus@20240229 | completion('vertex_ai/claude-3-opus@20240229', messages) |
claude-3-5-sonnet@20240620 | completion('vertex_ai/claude-3-5-sonnet@20240620', messages) |
claude-3-sonnet@20240229 | completion('vertex_ai/claude-3-sonnet@20240229', messages) |
claude-3-haiku@20240307 | completion('vertex_ai/claude-3-haiku@20240307', messages) |
claude-3-7-sonnet@20250219 | completion('vertex_ai/claude-3-7-sonnet@20250219', messages) |
Usage
- SDK
- Proxy
from litellm import completion
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""  # path to your service account JSON
model = "claude-3-sonnet@20240229"
vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]
response = completion(
model="vertex_ai/" + model,
messages=[{"role": "user", "content": "hi"}],
temperature=0.7,
vertex_ai_project=vertex_ai_project,
vertex_ai_location=vertex_ai_location,
)
print("\nModel Response", response)
1. Add to config
model_list:
- model_name: anthropic-vertex
litellm_params:
model: vertex_ai/claude-3-sonnet@20240229
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-east-1"
- model_name: anthropic-vertex
litellm_params:
model: vertex_ai/claude-3-sonnet@20240229
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-west-1"
2. Start proxy
litellm --config /path/to/config.yaml
# RUNNING at http://0.0.0.0:4000
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic-vertex", # ๐ the 'model_name' in config
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
}'
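Because the proxy exposes an OpenAI-compatible API, you can also call it with the OpenAI SDK instead of curl; a minimal sketch (the key and base URL match the proxy settings above):
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="anthropic-vertex",  # the 'model_name' in config
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(response)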
Usage - thinking / reasoning_content
- SDK
- PROXY
from litellm import completion
resp = completion(
model="vertex_ai/claude-3-7-sonnet-20250219",
messages=[{"role": "user", "content": "What is the capital of France?"}],
thinking={"type": "enabled", "budget_tokens": 1024},
)
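A minimal sketch of reading the reasoning back out of the response, assuming reasoning_content is surfaced on the returned message (see the Expected Response below for the full shape):
print(resp.choices[0].message.content)            # final answer
print(resp.choices[0].message.reasoning_content)  # the model's reasoning text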
- Setup config.yaml
- model_name: claude-3-7-sonnet-20250219
litellm_params:
model: vertex_ai/claude-3-7-sonnet-20250219
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-west-1"
- Start proxy
litellm --config /path/to/config.yaml
- Test it!
curl http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <YOUR-LITELLM-KEY>" \
-d '{
"model": "claude-3-7-sonnet-20250219",
"messages": [{"role": "user", "content": "What is the capital of France?"}],
"thinking": {"type": "enabled", "budget_tokens": 1024}
}'
Expected Response
ModelResponse(
id='chatcmpl-c542d76d-f675-4e87-8e5f-05855f5d0f5e',
created=1740470510,
model='claude-3-7-sonnet-20250219',
object='chat.completion',
system_fingerprint=None,
choices=[
Choices(
finish_reason='stop',
index=0,
message=Message(
content="The capital of France is Paris.",
role='assistant',
tool_calls=None,
function_call=None,
provider_specific_fields={
'citations': None,
'thinking_blocks': [
{
'type': 'thinking',
'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
'signature': 'EuYBCkQYAiJAy6...'
}
]
}
),
thinking_blocks=[
{
'type': 'thinking',
'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
'signature': 'EuYBCkQYAiJAy6AGB...'
}
],
reasoning_content='The capital of France is Paris. This is a very straightforward factual question.'
)
],
usage=Usage(
completion_tokens=68,
prompt_tokens=42,
total_tokens=110,
completion_tokens_details=None,
prompt_tokens_details=PromptTokensDetailsWrapper(
audio_tokens=None,
cached_tokens=0,
text_tokens=None,
image_tokens=None
),
cache_creation_input_tokens=0,
cache_read_input_tokens=0
)
)
VertexAI DeepSeek
Property | Details |
---|---|
Provider Route | vertex_ai/deepseek-ai/{MODEL} |
Vertex Documentation | Vertex AI - DeepSeek Models |
Usage
LiteLLM supports all Vertex AI DeepSeek models. Ensure you use the vertex_ai/deepseek-ai/ prefix for all Vertex AI DeepSeek models.
Model Name | Usage |
---|---|
vertex_ai/deepseek-ai/deepseek-r1-0528-maas | completion('vertex_ai/deepseek-ai/deepseek-r1-0528-maas', messages) |
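The DeepSeek route follows the same SDK pattern as the other partner models; a minimal sketch (project and location values are placeholders):
from litellm import completion
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""  # path to your service account JSON

response = completion(
    model="vertex_ai/deepseek-ai/deepseek-r1-0528-maas",
    messages=[{"role": "user", "content": "hi"}],
    vertex_ai_project="your-vertex-project",
    vertex_ai_location="your-vertex-location",
)
print("\nModel Response", response)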
VertexAI Meta/Llama API
Model Name | Function Call |
---|---|
meta/llama-3.2-90b-vision-instruct-maas | completion('vertex_ai/meta/llama-3.2-90b-vision-instruct-maas', messages) |
meta/llama3-8b-instruct-maas | completion('vertex_ai/meta/llama3-8b-instruct-maas', messages) |
meta/llama3-70b-instruct-maas | completion('vertex_ai/meta/llama3-70b-instruct-maas', messages) |
meta/llama3-405b-instruct-maas | completion('vertex_ai/meta/llama3-405b-instruct-maas', messages) |
meta/llama-4-scout-17b-16e-instruct-maas | completion('vertex_ai/meta/llama-4-scout-17b-16e-instruct-maas', messages) |
meta/llama-4-scout-17b-128e-instruct-maas | completion('vertex_ai/meta/llama-4-scout-17b-128e-instruct-maas', messages) |
meta/llama-4-maverick-17b-128e-instruct-maas | completion('vertex_ai/meta/llama-4-maverick-17b-128e-instruct-maas', messages) |
meta/llama-4-maverick-17b-16e-instruct-maas | completion('vertex_ai/meta/llama-4-maverick-17b-16e-instruct-maas', messages) |
Usage
- SDK
- Proxy
from litellm import completion
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""  # path to your service account JSON
model = "meta/llama3-405b-instruct-maas"
vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]
response = completion(
model="vertex_ai/" + model,
messages=[{"role": "user", "content": "hi"}],
vertex_ai_project=vertex_ai_project,
vertex_ai_location=vertex_ai_location,
)
print("\nModel Response", response)
1. Add to config
model_list:
- model_name: vertex-llama
litellm_params:
model: vertex_ai/meta/llama3-405b-instruct-maas
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-east-1"
- model_name: vertex-llama
litellm_params:
model: vertex_ai/meta/llama3-405b-instruct-maas
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-west-1"
2. Start proxy
litellm --config /path/to/config.yaml
# RUNNING at http://0.0.0.0:4000
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic-llama", # ๐ the 'model_name' in config
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
}'
VertexAI Mistral API
Model Name | Function Call |
---|---|
mistral-large@latest | completion('vertex_ai/mistral-large@latest', messages) |
mistral-large@2407 | completion('vertex_ai/mistral-large@2407', messages) |
mistral-nemo@latest | completion('vertex_ai/mistral-nemo@latest', messages) |
codestral@latest | completion('vertex_ai/codestral@latest', messages) |
codestral@2405 | completion('vertex_ai/codestral@2405', messages) |
Usage
- SDK
- Proxy
from litellm import completion
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""  # path to your service account JSON
model = "mistral-large@2407"
vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]
response = completion(
model="vertex_ai/" + model,
messages=[{"role": "user", "content": "hi"}],
vertex_ai_project=vertex_ai_project,
vertex_ai_location=vertex_ai_location,
)
print("\nModel Response", response)
1. Add to config
model_list:
- model_name: vertex-mistral
litellm_params:
model: vertex_ai/mistral-large@2407
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-east-1"
- model_name: vertex-mistral
litellm_params:
model: vertex_ai/mistral-large@2407
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-west-1"
2. Start proxy
litellm --config /path/to/config.yaml
# RUNNING at http://0.0.0.0:4000
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
"model": "vertex-mistral", # ๐ the 'model_name' in config
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
}'
Usage - Codestral FIM
Call Codestral on VertexAI via the OpenAI /v1/completions endpoint for FIM tasks.
Note: You can also call Codestral via /chat/completions.
- SDK
- Proxy
from litellm import text_completion
import os
# os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""
# OR run `!gcloud auth print-access-token` in your terminal
model = "codestral@2405"
vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]
response = text_completion(
model="vertex_ai/" + model,
vertex_ai_project=vertex_ai_project,
vertex_ai_location=vertex_ai_location,
prompt="def is_odd(n): \n return n % 2 == 1 \ndef test_is_odd():",
suffix="return True", # optional
temperature=0, # optional
top_p=1, # optional
max_tokens=10, # optional
min_tokens=10, # optional
seed=10, # optional
stop=["return"], # optional
)
print("\nModel Response", response)
1. Add to config
model_list:
- model_name: vertex-codestral
litellm_params:
model: vertex_ai/codestral@2405
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-east-1"
- model_name: vertex-codestral
litellm_params:
model: vertex_ai/codestral@2405
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-west-1"
2. Start proxy
litellm --config /path/to/config.yaml
# RUNNING at http://0.0.0.0:4000
3. Test it!
curl -X POST 'http://0.0.0.0:4000/completions' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"model": "vertex-codestral", # ๐ the 'model_name' in config
"prompt": "def is_odd(n): \n return n % 2 == 1 \ndef test_is_odd():",
"suffix":"return True", # optional
"temperature":0, # optional
"top_p":1, # optional
"max_tokens":10, # optional
"min_tokens":10, # optional
"seed":10, # optional
"stop":["return"], # optional
}'
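The same FIM request can be sent with the OpenAI SDK pointed at the proxy; a minimal sketch:
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.completions.create(
    model="vertex-codestral",  # the 'model_name' in config
    prompt="def is_odd(n): \n return n % 2 == 1 \ndef test_is_odd():",
    suffix="return True",
    temperature=0,
    max_tokens=10,
)
print(response.choices[0].text)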
VertexAI AI21 Models
Model Name | Function Call |
---|---|
jamba-1.5-mini@001 | completion(model='vertex_ai/jamba-1.5-mini@001', messages) |
jamba-1.5-large@001 | completion(model='vertex_ai/jamba-1.5-large@001', messages) |
Usage
- SDK
- Proxy
from litellm import completion
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""  # path to your service account JSON
model = "meta/jamba-1.5-mini@001"
vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]
response = completion(
model="vertex_ai/" + model,
messages=[{"role": "user", "content": "hi"}],
vertex_ai_project=vertex_ai_project,
vertex_ai_location=vertex_ai_location,
)
print("\nModel Response", response)
1. Add to config
model_list:
- model_name: jamba-1.5-mini
litellm_params:
model: vertex_ai/jamba-1.5-mini@001
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-east-1"
- model_name: jamba-1.5-large
litellm_params:
model: vertex_ai/jamba-1.5-large@001
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-west-1"
2. Start proxy
litellm --config /path/to/config.yaml
# RUNNING at http://0.0.0.0:4000
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
"model": "jamba-1.5-large",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
}'
Model Garden
Tip: All OpenAI compatible models from Vertex Model Garden are supported.
Using Model Garden
Almost all Vertex Model Garden models are OpenAI compatible.
- OpenAI Compatible Models
- Non-OpenAI Compatible Models
Property | Details |
---|---|
Provider Route | vertex_ai/openai/{MODEL_ID} |
Vertex Documentation | SDK for Deploy & OpenAI Chat Completions, Vertex Model Garden |
Supported Operations | /chat/completions, /embeddings |
- SDK
- Proxy
from litellm import completion
import os
## set ENV variables
os.environ["VERTEXAI_PROJECT"] = "hardy-device-38811"
os.environ["VERTEXAI_LOCATION"] = "us-central1"
response = completion(
model="vertex_ai/openai/<your-endpoint-id>",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)
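The route also supports /embeddings (see the properties table above); a minimal sketch, assuming your Model Garden endpoint serves an embedding model and reusing the env vars set above:
from litellm import embedding

response = embedding(
    model="vertex_ai/openai/<your-endpoint-id>",
    input=["Hello, how are you?"],
)
print(response)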
1. Add to config
model_list:
- model_name: llama3-1-8b-instruct
litellm_params:
model: vertex_ai/openai/5464397967697903616
vertex_ai_project: "my-test-project"
vertex_ai_location: "us-east-1"
2. Start proxy
litellm --config /path/to/config.yaml
# RUNNING at http://0.0.0.0:4000
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
"model": "llama3-1-8b-instruct", # ๐ the 'model_name' in config
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
}'
Non-OpenAI compatible models use the vertex_ai/{MODEL_ID} route directly:
from litellm import completion
import os
## set ENV variables
os.environ["VERTEXAI_PROJECT"] = "hardy-device-38811"
os.environ["VERTEXAI_LOCATION"] = "us-central1"
response = completion(
model="vertex_ai/<your-endpoint-id>",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)