Vertex AI - Anthropic, DeepSeek, Model Garden

Supported Partner Providers

Provider	LiteLLM Route	Vertex Documentation
Anthropic (Claude)	`vertex_ai/claude-*`	Vertex AI - Anthropic Models
DeepSeek	`vertex_ai/deepseek-ai/{MODEL}`	Vertex AI - DeepSeek Models
Meta/Llama	`vertex_ai/meta/{MODEL}`	Vertex AI - Meta Models
Mistral	`vertex_ai/mistral-*`	Vertex AI - Mistral Models
AI21 (Jamba)	`vertex_ai/jamba-*`	Vertex AI - AI21 Models
Model Garden	`vertex_ai/openai/{MODEL_ID}` or `vertex_ai/{MODEL_ID}`	Vertex Model Garden

Vertex AI - Anthropic (Claude)

Model Name	Function Call
claude-3-opus@20240229	`completion('vertex_ai/claude-3-opus@20240229', messages)`
claude-3-5-sonnet@20240620	`completion('vertex_ai/claude-3-5-sonnet@20240620', messages)`
claude-3-sonnet@20240229	`completion('vertex_ai/claude-3-sonnet@20240229', messages)`
claude-3-haiku@20240307	`completion('vertex_ai/claude-3-haiku@20240307', messages)`
claude-3-7-sonnet@20250219	`completion('vertex_ai/claude-3-7-sonnet@20250219', messages)`

Usage

SDK
Proxy

from litellm import completion
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""

model = "claude-3-sonnet@20240229"

vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]

response = completion(
    model="vertex_ai/" + model,
    messages=[{"role": "user", "content": "hi"}],
    temperature=0.7,
    vertex_ai_project=vertex_ai_project,
    vertex_ai_location=vertex_ai_location,
)
print("\nModel Response", response)

1. Add to config

model_list:
    - model_name: anthropic-vertex
      litellm_params:
        model: vertex_ai/claude-3-sonnet@20240229
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-east-1"
    - model_name: anthropic-vertex
      litellm_params:
        model: vertex_ai/claude-3-sonnet@20240229
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-west-1"

2. Start proxy

litellm --config /path/to/config.yaml

# RUNNING at http://0.0.0.0:4000

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
      --header 'Authorization: Bearer sk-1234' \
      --header 'Content-Type: application/json' \
      --data '{
            "model": "anthropic-vertex", # 👈 the 'model_name' in config
            "messages": [
                {
                "role": "user",
                "content": "what llm are you"
                }
            ],
        }'

Usage - `thinking` / `reasoning_content`

SDK
PROXY

from litellm import completion

resp = completion(
    model="vertex_ai/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
)

Setup config.yaml

- model_name: claude-3-7-sonnet-20250219
  litellm_params:
    model: vertex_ai/claude-3-7-sonnet-20250219
    vertex_ai_project: "my-test-project"
    vertex_ai_location: "us-west-1"

Start proxy

litellm --config /path/to/config.yaml

Test it!

curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR-LITELLM-KEY>" \
  -d '{
    "model": "claude-3-7-sonnet-20250219",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "thinking": {"type": "enabled", "budget_tokens": 1024}
  }'

Expected Response

ModelResponse(
    id='chatcmpl-c542d76d-f675-4e87-8e5f-05855f5d0f5e',
    created=1740470510,
    model='claude-3-7-sonnet-20250219',
    object='chat.completion',
    system_fingerprint=None,
    choices=[
        Choices(
            finish_reason='stop',
            index=0,
            message=Message(
                content="The capital of France is Paris.",
                role='assistant',
                tool_calls=None,
                function_call=None,
                provider_specific_fields={
                    'citations': None,
                    'thinking_blocks': [
                        {
                            'type': 'thinking',
                            'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                            'signature': 'EuYBCkQYAiJAy6...'
                        }
                    ]
                }
            ),
            thinking_blocks=[
                {
                    'type': 'thinking',
                    'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                    'signature': 'EuYBCkQYAiJAy6AGB...'
                }
            ],
            reasoning_content='The capital of France is Paris. This is a very straightforward factual question.'
        )
    ],
    usage=Usage(
        completion_tokens=68,
        prompt_tokens=42,
        total_tokens=110,
        completion_tokens_details=None,
        prompt_tokens_details=PromptTokensDetailsWrapper(
            audio_tokens=None,
            cached_tokens=0,
            text_tokens=None,
            image_tokens=None
        ),
        cache_creation_input_tokens=0,
        cache_read_input_tokens=0
    )
)

VertexAI DeepSeek

Property	Details
Provider Route	`vertex_ai/deepseek-ai/{MODEL}`
Vertex Documentation	Vertex AI - DeepSeek Models

Usage

LiteLLM Supports all Vertex AI DeepSeek Models. Ensure you use the vertex_ai/deepseek-ai/ prefix for all Vertex AI DeepSeek models.

Model Name	Usage
vertex_ai/deepseek-ai/deepseek-r1-0528-maas	`completion('vertex_ai/deepseek-ai/deepseek-r1-0528-maas', messages)`

VertexAI Meta/Llama API

Model Name	Function Call
meta/llama-3.2-90b-vision-instruct-maas	`completion('vertex_ai/meta/llama-3.2-90b-vision-instruct-maas', messages)`
meta/llama3-8b-instruct-maas	`completion('vertex_ai/meta/llama3-8b-instruct-maas', messages)`
meta/llama3-70b-instruct-maas	`completion('vertex_ai/meta/llama3-70b-instruct-maas', messages)`
meta/llama3-405b-instruct-maas	`completion('vertex_ai/meta/llama3-405b-instruct-maas', messages)`
meta/llama-4-scout-17b-16e-instruct-maas	`completion('vertex_ai/meta/llama-4-scout-17b-16e-instruct-maas', messages)`
meta/llama-4-scout-17-128e-instruct-maas	`completion('vertex_ai/meta/llama-4-scout-128b-16e-instruct-maas', messages)`
meta/llama-4-maverick-17b-128e-instruct-maas	`completion('vertex_ai/meta/llama-4-maverick-17b-128e-instruct-maas',messages)`
meta/llama-4-maverick-17b-16e-instruct-maas	`completion('vertex_ai/meta/llama-4-maverick-17b-16e-instruct-maas',messages)`

Usage

SDK
Proxy

from litellm import completion
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""

model = "meta/llama3-405b-instruct-maas"

vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]

response = completion(
    model="vertex_ai/" + model,
    messages=[{"role": "user", "content": "hi"}],
    vertex_ai_project=vertex_ai_project,
    vertex_ai_location=vertex_ai_location,
)
print("\nModel Response", response)

1. Add to config

model_list:
    - model_name: anthropic-llama
      litellm_params:
        model: vertex_ai/meta/llama3-405b-instruct-maas
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-east-1"
    - model_name: anthropic-llama
      litellm_params:
        model: vertex_ai/meta/llama3-405b-instruct-maas
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-west-1"

2. Start proxy

litellm --config /path/to/config.yaml

# RUNNING at http://0.0.0.0:4000

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
      --header 'Authorization: Bearer sk-1234' \
      --header 'Content-Type: application/json' \
      --data '{
            "model": "anthropic-llama", # 👈 the 'model_name' in config
            "messages": [
                {
                "role": "user",
                "content": "what llm are you"
                }
            ],
        }'

VertexAI Mistral API

Supported OpenAI Params

LiteLLM Supports all Vertex AI Mistral Models. Ensure you use the vertex_ai/mistral- prefix for all Vertex AI Mistral models.

Overview

Property	Details
Provider Route	`vertex_ai/mistral-{MODEL}`
Vertex Documentation	Vertex AI - Mistral Models

Model Name	Function Call
mistral-large@latest	`completion('vertex_ai/mistral-large@latest', messages)`
mistral-large@2407	`completion('vertex_ai/mistral-large@2407', messages)`
mistral-small-2503	`completion('vertex_ai/mistral-small-2503', messages)`
mistral-large-2411	`completion('vertex_ai/mistral-large-2411', messages)`
mistral-nemo@latest	`completion('vertex_ai/mistral-nemo@latest', messages)`
codestral@latest	`completion('vertex_ai/codestral@latest', messages)`
codestral@@2405	`completion('vertex_ai/codestral@2405', messages)`

Usage

SDK
Proxy

from litellm import completion
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""

model = "mistral-large@2407"

vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]

response = completion(
    model="vertex_ai/" + model,
    messages=[{"role": "user", "content": "hi"}],
    vertex_ai_project=vertex_ai_project,
    vertex_ai_location=vertex_ai_location,
)
print("\nModel Response", response)

1. Add to config

model_list:
    - model_name: vertex-mistral
      litellm_params:
        model: vertex_ai/mistral-large@2407
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-east-1"
    - model_name: vertex-mistral
      litellm_params:
        model: vertex_ai/mistral-large@2407
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-west-1"

2. Start proxy

litellm --config /path/to/config.yaml

# RUNNING at http://0.0.0.0:4000

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
      --header 'Authorization: Bearer sk-1234' \
      --header 'Content-Type: application/json' \
      --data '{
            "model": "vertex-mistral", # 👈 the 'model_name' in config
            "messages": [
                {
                "role": "user",
                "content": "what llm are you"
                }
            ],
        }'

Usage - Codestral FIM

Call Codestral on VertexAI via the OpenAI /v1/completion endpoint for FIM tasks.

Note: You can also call Codestral via /chat/completion.

SDK
Proxy

from litellm import completion
import os

# os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""
# OR run `!gcloud auth print-access-token` in your terminal

model = "codestral@2405"

vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]

response = text_completion(
    model="vertex_ai/" + model,
    vertex_ai_project=vertex_ai_project,
    vertex_ai_location=vertex_ai_location,
    prompt="def is_odd(n): \n return n % 2 == 1 \ndef test_is_odd():", 
    suffix="return True",                                              # optional
    temperature=0,                                                     # optional
    top_p=1,                                                           # optional
    max_tokens=10,                                                     # optional
    min_tokens=10,                                                     # optional
    seed=10,                                                           # optional
    stop=["return"],                                                   # optional
)

print("\nModel Response", response)

1. Add to config

model_list:
    - model_name: vertex-codestral
      litellm_params:
        model: vertex_ai/codestral@2405
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-east-1"
    - model_name: vertex-codestral
      litellm_params:
        model: vertex_ai/codestral@2405
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-west-1"

2. Start proxy

litellm --config /path/to/config.yaml

# RUNNING at http://0.0.0.0:4000

3. Test it!

curl -X POST 'http://0.0.0.0:4000/completions' \
      -H 'Authorization: Bearer sk-1234' \
      -H 'Content-Type: application/json' \
      -d '{
            "model": "vertex-codestral", # 👈 the 'model_name' in config
            "prompt": "def is_odd(n): \n return n % 2 == 1 \ndef test_is_odd():", 
            "suffix":"return True",                                              # optional
            "temperature":0,                                                     # optional
            "top_p":1,                                                           # optional
            "max_tokens":10,                                                     # optional
            "min_tokens":10,                                                     # optional
            "seed":10,                                                           # optional
            "stop":["return"],                                                   # optional
        }'

VertexAI AI21 Models

Model Name	Function Call
jamba-1.5-mini@001	`completion(model='vertex_ai/jamba-1.5-mini@001', messages)`
jamba-1.5-large@001	`completion(model='vertex_ai/jamba-1.5-large@001', messages)`

Usage

SDK
Proxy

from litellm import completion
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""

model = "meta/jamba-1.5-mini@001"

vertex_ai_project = "your-vertex-project" # can also set this as os.environ["VERTEXAI_PROJECT"]
vertex_ai_location = "your-vertex-location" # can also set this as os.environ["VERTEXAI_LOCATION"]

response = completion(
    model="vertex_ai/" + model,
    messages=[{"role": "user", "content": "hi"}],
    vertex_ai_project=vertex_ai_project,
    vertex_ai_location=vertex_ai_location,
)
print("\nModel Response", response)

1. Add to config

model_list:
    - model_name: jamba-1.5-mini
      litellm_params:
        model: vertex_ai/jamba-1.5-mini@001
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-east-1"
    - model_name: jamba-1.5-large
      litellm_params:
        model: vertex_ai/jamba-1.5-large@001
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-west-1"

2. Start proxy

litellm --config /path/to/config.yaml

# RUNNING at http://0.0.0.0:4000

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
      --header 'Authorization: Bearer sk-1234' \
      --header 'Content-Type: application/json' \
      --data '{
            "model": "jamba-1.5-large",
            "messages": [
                {
                "role": "user",
                "content": "what llm are you"
                }
            ],
        }'

Model Garden

tip

All OpenAI compatible models from Vertex Model Garden are supported.

Using Model Garden

Almost all Vertex Model Garden models are OpenAI compatible.

OpenAI Compatible Models
Non-OpenAI Compatible Models

Property	Details
Provider Route	`vertex_ai/openai/{MODEL_ID}`
Vertex Documentation	Model Garden LiteLLM Inference, Vertex Model Garden
Supported Operations	`/chat/completions`, `/embeddings`

SDK
Proxy

from litellm import completion
import os

## set ENV variables
os.environ["VERTEXAI_PROJECT"] = "hardy-device-38811"
os.environ["VERTEXAI_LOCATION"] = "us-central1"

response = completion(
  model="vertex_ai/openai/<your-endpoint-id>", 
  messages=[{ "content": "Hello, how are you?","role": "user"}]
)

1. Add to config

model_list:
    - model_name: llama3-1-8b-instruct
      litellm_params:
        model: vertex_ai/openai/5464397967697903616
        vertex_ai_project: "my-test-project"
        vertex_ai_location: "us-east-1"

2. Start proxy

litellm --config /path/to/config.yaml

# RUNNING at http://0.0.0.0:4000

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
      --header 'Authorization: Bearer sk-1234' \
      --header 'Content-Type: application/json' \
      --data '{
            "model": "llama3-1-8b-instruct", # 👈 the 'model_name' in config
            "messages": [
                {
                "role": "user",
                "content": "what llm are you"
                }
            ],
        }'

from litellm import completion
import os

## set ENV variables
os.environ["VERTEXAI_PROJECT"] = "hardy-device-38811"
os.environ["VERTEXAI_LOCATION"] = "us-central1"

response = completion(
  model="vertex_ai/<your-endpoint-id>", 
  messages=[{ "content": "Hello, how are you?","role": "user"}]
)

Supported Partner Providers​

Vertex AI - Anthropic (Claude)​

Usage​

Usage - thinking / reasoning_content​

VertexAI DeepSeek​

Usage​

VertexAI Meta/Llama API​

Usage​

VertexAI Mistral API​

Usage​

Usage - Codestral FIM​

VertexAI AI21 Models​

Usage​

Model Garden​

Using Model Garden​

Supported Partner Providers

Vertex AI - Anthropic (Claude)

Usage

Usage - `thinking` / `reasoning_content`

VertexAI DeepSeek

Usage

VertexAI Meta/Llama API

Usage

VertexAI Mistral API

Usage

Usage - Codestral FIM

VertexAI AI21 Models

Usage

Model Garden

Using Model Garden