/converse

Call Bedrock's /converse endpoint through LiteLLM Proxy.

Feature        | Supported
Cost Tracking  | ✅
Logging        | ✅
Streaming      | ✅ via /converse-stream
Load Balancing | ✅

Quick Start

1. Set up config.yaml

model_list:
  - model_name: my-bedrock-model
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-west-2
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID        # reads from environment
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      custom_llm_provider: bedrock

Set AWS credentials in your environment:

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"

2. Start Proxy

litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000

3. Call the /converse endpoint

curl -X POST 'http://0.0.0.0:4000/bedrock/model/my-bedrock-model/converse' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
  "messages": [
    {
      "role": "user",
      "content": [{"text": "Hello, how are you?"}]
    }
  ],
  "inferenceConfig": {
    "temperature": 0.5,
    "maxTokens": 100
  }
}'
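
The same call can be made from Python. The sketch below assumes the requests package is installed and reuses the placeholder proxy URL, model name, and API key from this guide.

import requests

# Call the proxy's /converse passthrough route for the configured model
resp = requests.post(
    "http://0.0.0.0:4000/bedrock/model/my-bedrock-model/converse",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "messages": [
            {"role": "user", "content": [{"text": "Hello, how are you?"}]}
        ],
        "inferenceConfig": {"temperature": 0.5, "maxTokens": 100},
    },
)
resp.raise_for_status()

# The Converse API returns the assistant reply under output.message.content
print(resp.json()["output"]["message"]["content"][0]["text"])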

Streaming

For streaming responses, use /converse-stream:

curl -X POST 'http://0.0.0.0:4000/bedrock/model/my-bedrock-model/converse-stream' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
  "messages": [
    {
      "role": "user",
      "content": [{"text": "Tell me a short story"}]
    }
  ],
  "inferenceConfig": {
    "temperature": 0.7,
    "maxTokens": 200
  }
}'

Load Balancing

Define multiple deployments with the same model_name for automatic load balancing:

model_list:
  # Deployment 1 - us-west-2
  - model_name: my-bedrock-model
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-west-2
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      custom_llm_provider: bedrock

  # Deployment 2 - us-east-1
  - model_name: my-bedrock-model
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-east-1
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      custom_llm_provider: bedrock

The proxy automatically distributes requests across both regions.
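
If you want more control over how requests are distributed, router_settings can be added to the same config.yaml. The snippet below is a minimal sketch: simple-shuffle is LiteLLM's default strategy, and the retry and timeout values are illustrative, not required.

router_settings:
  routing_strategy: simple-shuffle   # default: pick a deployment at random per request
  num_retries: 2                     # retries for failed requests
  timeout: 30                        # per-request timeout in seconds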

Using the boto3 SDK

import boto3
import json
import os

# Set dummy AWS credentials (required by boto3, but not used by LiteLLM proxy)
os.environ['AWS_ACCESS_KEY_ID'] = 'dummy'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'dummy'
os.environ['AWS_BEARER_TOKEN_BEDROCK'] = "sk-1234"  # your litellm proxy api key

# Point boto3 to the LiteLLM proxy
bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-west-2',
    endpoint_url='http://0.0.0.0:4000/bedrock'
)

response = bedrock_runtime.converse(
    modelId='my-bedrock-model',  # Your model_name from config.yaml
    messages=[
        {
            "role": "user",
            "content": [{"text": "Hello, how are you?"}]
        }
    ],
    inferenceConfig={
        "temperature": 0.5,
        "maxTokens": 100
    }
)

print(response['output']['message']['content'][0]['text'])
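
Streaming works through the same client with boto3's converse_stream, which calls the proxy's /converse-stream route. A minimal sketch, reusing the bedrock_runtime client and example model from above:

stream_response = bedrock_runtime.converse_stream(
    modelId='my-bedrock-model',
    messages=[
        {
            "role": "user",
            "content": [{"text": "Tell me a short story"}]
        }
    ],
    inferenceConfig={
        "temperature": 0.7,
        "maxTokens": 200
    }
)

# Text deltas arrive as contentBlockDelta events on the response stream
for event in stream_response['stream']:
    if 'contentBlockDelta' in event:
        print(event['contentBlockDelta']['delta'].get('text', ''), end='')
print()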

More Info

For complete documentation, including Guardrails, Knowledge Bases, and Agents, see: