Day 0 Support: Claude Opus 4.7
LiteLLM now supports Claude Opus 4.7 on Day 0. Use it across Anthropic, Azure, Vertex AI, and Bedrock through the LiteLLM AI Gateway.
Docker Image
docker pull ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7
Usage - Anthropic
- LiteLLM Proxy
1. Set up config.yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: anthropic/claude-opus-4-7
      api_key: os.environ/ANTHROPIC_API_KEY
2. Start the proxy
docker run -d \
  -p 4000:4000 \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7 \
  --config /app/config.yaml
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
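The same smoke test can be built from Python. This is a minimal sketch that uses only the standard library; `build_chat_request` and `send` are hypothetical helper names, and the proxy URL, model alias, and prompt are taken from the curl example above.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the proxy's /chat/completions route."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply; needs a running proxy."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the proxy running, `send(build_chat_request("http://0.0.0.0:4000", key, "claude-opus-4-7", "what llm are you"))` should return the decoded JSON response, where `key` is your LiteLLM key.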
Usage - Azure
- LiteLLM Proxy
1. Set up config.yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: azure_ai/claude-opus-4-7
      api_key: os.environ/AZURE_AI_API_KEY
      api_base: os.environ/AZURE_AI_API_BASE # https://<resource>.services.ai.azure.com
2. Start the proxy
docker run -d \
  -p 4000:4000 \
  -e AZURE_AI_API_KEY=$AZURE_AI_API_KEY \
  -e AZURE_AI_API_BASE=$AZURE_AI_API_BASE \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7 \
  --config /app/config.yaml
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
Usage - Vertex AI
- LiteLLM Proxy
1. Set up config.yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: vertex_ai/claude-opus-4-7
      vertex_project: os.environ/VERTEX_PROJECT
      vertex_location: us-east5
2. Start the proxy
docker run -d \
  -p 4000:4000 \
  -e VERTEX_PROJECT=$VERTEX_PROJECT \
  -e GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -v $(pwd)/credentials.json:/app/credentials.json \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7 \
  --config /app/config.yaml
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
Usage - Bedrock
- LiteLLM Proxy
1. Set up config.yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: bedrock/anthropic.claude-opus-4-7
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      aws_region_name: us-east-1
2. Start the proxy
docker run -d \
  -p 4000:4000 \
  -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7 \
  --config /app/config.yaml
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
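Each provider setup above relies on a different set of host environment variables being present before `docker run`. As a rough pre-flight check, the sketch below lists which of them are unset. `REQUIRED_ENV` and `missing_env` are hypothetical names introduced for illustration; the variable names themselves come from the config and `docker run` examples above. For Vertex AI, the credentials are mounted as a file rather than passed from the host environment, so only `VERTEX_PROJECT` is checked.

```python
import os
from collections.abc import Mapping

# Host environment variables each provider's docker run command forwards.
# Vertex AI also needs ./credentials.json on disk, which is not checked here.
REQUIRED_ENV = {
    "anthropic": ["ANTHROPIC_API_KEY"],
    "azure_ai": ["AZURE_AI_API_KEY", "AZURE_AI_API_BASE"],
    "vertex_ai": ["VERTEX_PROJECT"],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
}


def missing_env(provider: str, environ: Mapping = os.environ) -> list:
    """Return the required variables that are unset or empty for a provider."""
    if provider not in REQUIRED_ENV:
        raise ValueError(f"unknown provider: {provider!r}")
    return [name for name in REQUIRED_ENV[provider] if not environ.get(name)]
```

Calling `missing_env("bedrock")` before starting the container returns an empty list when both AWS variables are set.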
Advanced Features
Adaptive Thinking
With Claude Opus 4.7, every reasoning_effort value (low, medium, high, xhigh, max) maps to thinking: {type: "adaptive"}. Opus 4.7 supports only adaptive thinking; the Anthropic API rejects explicit budgets set via thinking: {type: "enabled", budget_tokens: ...} with a 400 error. To control thinking depth, pair adaptive thinking with output_config.effort (see Effort Levels below) instead of a fixed budget.
On /chat/completions, LiteLLM exposes adaptive thinking through the reasoning_effort parameter:
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "Solve this complex problem: What is the optimal strategy for..."
      }
    ],
    "reasoning_effort": "high"
  }'
On /v1/messages, set the thinking parameter to type: "adaptive" to enable adaptive thinking mode:
curl --location 'http://0.0.0.0:4000/v1/messages' \
  --header 'x-api-key: sk-12345' \
  --header 'content-type: application/json' \
  --data '{
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {
      "type": "adaptive"
    },
    "messages": [
      {
        "role": "user",
        "content": "Explain why the sum of two even numbers is always even."
      }
    ]
  }'
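The mapping described in this section can be pictured with a small sketch. This is an illustration of the documented behavior, not LiteLLM's actual implementation; `thinking_for_reasoning_effort` and `check_thinking_param` are hypothetical names.

```python
# The five reasoning_effort values named in this section.
VALID_EFFORTS = ("low", "medium", "high", "xhigh", "max")


def thinking_for_reasoning_effort(reasoning_effort: str) -> dict:
    """Illustrative only: every accepted value yields adaptive thinking."""
    if reasoning_effort not in VALID_EFFORTS:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
    return {"type": "adaptive"}


def check_thinking_param(thinking: dict) -> dict:
    """Fail early on explicit budgets, which the API would reject with a 400."""
    if thinking.get("type") == "enabled" or "budget_tokens" in thinking:
        raise ValueError("Opus 4.7 supports only adaptive thinking; remove budget_tokens")
    return thinking
```

A client-side check like `check_thinking_param` is optional; it only surfaces the error before the request leaves your process.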
Effort Levels
Claude Opus 4.7 supports five effort levels: low, medium, high (default), xhigh, and max. They give finer-grained control over how much reasoning the model applies to a task. Pass the effort level via the output_config parameter.
xhigh is a new effort level introduced with Opus 4.7. It sits above high and is the recommended starting point for coding and agentic work. max sits above xhigh and offers the highest capability; reserve it for genuinely frontier problems, since on most workloads it adds significant token cost for relatively small quality gains.
Using /chat/completions:
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing"
      }
    ],
    "output_config": {
      "effort": "xhigh"
    }
  }'
Using OpenAI SDK:
import openai

client = openai.OpenAI(
    api_key="your-litellm-key",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    extra_body={"output_config": {"effort": "xhigh"}}
)
Using LiteLLM SDK:
from litellm import completion

response = completion(
    model="anthropic/claude-opus-4-7",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    output_config={"effort": "xhigh"},
)
You can combine reasoning_effort with output_config for even more fine-grained control over the model's behavior.
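As a sketch of that combination, the helper below assembles keyword arguments for the OpenAI SDK call, carrying output_config in `extra_body` and adding reasoning_effort only when requested. `build_completion_kwargs` is a hypothetical name, and the sketch assumes the proxy accepts both fields in one request, as the sentence above states.

```python
from typing import Optional

EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")


def build_completion_kwargs(
    prompt: str,
    effort: str = "high",
    reasoning_effort: Optional[str] = None,
) -> dict:
    """Assemble kwargs for client.chat.completions.create (nothing is sent)."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unsupported effort: {effort!r}")
    kwargs = {
        "model": "claude-opus-4-7",
        "messages": [{"role": "user", "content": prompt}],
        # output_config is not a standard OpenAI argument, so it goes in extra_body.
        "extra_body": {"output_config": {"effort": effort}},
    }
    if reasoning_effort is not None:
        kwargs["reasoning_effort"] = reasoning_effort
    return kwargs
```

With the client from the OpenAI SDK example, this could be used as `client.chat.completions.create(**build_completion_kwargs("Explain quantum computing", effort="xhigh", reasoning_effort="high"))`.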
Using /v1/messages:
curl --location 'http://0.0.0.0:4000/v1/messages' \
  --header 'x-api-key: sk-12345' \
  --header 'content-type: application/json' \
  --data '{
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing"
      }
    ],
    "output_config": {
      "effort": "xhigh"
    }
  }'
Effort level guide:
| Effort | When to use |
|---|---|
| low | Short, fast responses: simple lookups, formatting, classification |
| medium | Balanced tradeoff for everyday Q&A and light reasoning |
| high (default) | Complex reasoning, code generation, analysis |
| xhigh | Hard problems and the recommended starting point for coding and agentic work: multi-step math, deep research, agentic planning |
| max | Genuinely frontier problems where the extra token cost is justified |
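Because the levels are ordered (low, medium, high, xhigh, max), one way to apply the guide is to start at the default and escalate only when a task needs it. The sketch below is one possible approach; `EFFORT_ORDER`, `output_config`, and `step_up` are hypothetical names, and the default ceiling of xhigh is an assumption reflecting the advice to reserve max for frontier problems.

```python
# Effort levels from lowest to highest, as listed in this section.
EFFORT_ORDER = ["low", "medium", "high", "xhigh", "max"]
DEFAULT_EFFORT = "high"


def output_config(effort: str = DEFAULT_EFFORT) -> dict:
    """Build the output_config payload, validating the level first."""
    if effort not in EFFORT_ORDER:
        raise ValueError(f"unsupported effort: {effort!r}")
    return {"effort": effort}


def step_up(effort: str, ceiling: str = "xhigh") -> str:
    """Return the next level up, without climbing past the ceiling."""
    current = EFFORT_ORDER.index(effort)  # raises ValueError if unknown
    cap = EFFORT_ORDER.index(ceiling)
    if current >= cap:
        return effort
    return EFFORT_ORDER[current + 1]
```

For example, a retry loop could call `step_up` after an unsatisfactory answer and pass `output_config(new_level)` on the next request, opting into max only by setting `ceiling="max"` explicitly.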


