/responses/compact

Compress conversation history using OpenAI's /responses/compact endpoint.

| Feature | Supported |
|---|---|
| Supported LiteLLM Versions | 1.72.0+ |
| Supported Providers | openai |

Usage

LiteLLM Python SDK

Compact Response
import litellm

response = litellm.compact_responses(
    model="openai/gpt-4o",
    input=[{"role": "user", "content": "Hello, how are you?"}],
    instructions="Be helpful",
    previous_response_id="resp_abc123",  # optional
)

print(response.id)
print(response.object)  # "response.compaction"
print(response.output)

LiteLLM Proxy

Compact Request
curl http://localhost:4000/v1/responses/compact \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "openai/gpt-4o",
    "input": [{"role": "user", "content": "Hello"}],
    "instructions": "Be helpful"
  }'
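
The same proxy request can be issued from Python using only the standard library. This is a minimal sketch, assuming the proxy address and key from the curl example above (`http://localhost:4000`, `sk-1234`); `build_compact_request` and `send_compact_request` are hypothetical helper names for illustration, not part of LiteLLM.

```python
import json
import urllib.request


def build_compact_request(base_url, api_key, payload):
    """Build a POST request for the proxy's /v1/responses/compact route.

    Hypothetical helper: only the route and headers come from the curl example.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/responses/compact",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_compact_request(request, timeout=30.0):
    """Send the request and decode the JSON body (requires a running proxy)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same body as the curl example; the URL and key are placeholders.
request = build_compact_request(
    "http://localhost:4000",
    "sk-1234",
    {
        "model": "openai/gpt-4o",
        "input": [{"role": "user", "content": "Hello"}],
        "instructions": "Be helpful",
    },
)
# send_compact_request(request) would perform the call against a running proxy.
```

Building the request separately from sending it makes the URL, headers, and body easy to inspect before any network traffic occurs.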

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model to use for compaction |
| input | string or array | Yes | Input messages to compact |
| instructions | string | No | System instructions |
| previous_response_id | string | No | ID of previous response to continue from |
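
The parameter table above can be mirrored by a small payload builder that enforces the two required fields and omits optional ones when unset. This is an illustrative sketch only: `build_compact_payload` is a hypothetical helper, and the type checks reflect the table rather than any validation LiteLLM itself performs.

```python
from typing import Optional, Union


def build_compact_payload(
    model: str,
    input_data: Union[str, list],
    instructions: Optional[str] = None,
    previous_response_id: Optional[str] = None,
) -> dict:
    """Assemble a request body from the documented parameters (hypothetical helper)."""
    # `model` and `input` are marked as required in the table.
    if not isinstance(model, str):
        raise ValueError("model must be a string")
    if not isinstance(input_data, (str, list)):
        raise ValueError("input must be a string or an array")

    payload = {"model": model, "input": input_data}
    # Optional parameters are included only when the caller supplies them.
    if instructions is not None:
        payload["instructions"] = instructions
    if previous_response_id is not None:
        payload["previous_response_id"] = previous_response_id
    return payload


payload = build_compact_payload(
    "openai/gpt-4o",
    [{"role": "user", "content": "Hello"}],
    instructions="Be helpful",
)
print(payload)
```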

Response Format

{
  "id": "resp_abc123",
  "object": "response.compaction",
  "created_at": 1734366691,
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [...]
    },
    {
      "type": "compaction",
      "encrypted_content": "..."
    }
  ],
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50,
    "total_tokens": 150
  }
}
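
A response shaped like the one above can be separated into its message items and compaction items by inspecting each output entry's `type`. The sketch below works on a plain dictionary with the same keys as the example; `split_compact_output` is a hypothetical helper, and the empty `content` list and `"..."` value are placeholders standing in for the elided fields, not real data.

```python
def split_compact_output(response: dict):
    """Return (messages, compactions) from a response-shaped dict (hypothetical helper)."""
    output = response.get("output", [])
    messages = [item for item in output if item.get("type") == "message"]
    compactions = [item for item in output if item.get("type") == "compaction"]
    return messages, compactions


# Placeholder data with the same keys as the documented response format.
sample_response = {
    "id": "resp_abc123",
    "object": "response.compaction",
    "created_at": 1734366691,
    "output": [
        {"type": "message", "role": "assistant", "content": []},
        {"type": "compaction", "encrypted_content": "..."},
    ],
    "usage": {"input_tokens": 100, "output_tokens": 50, "total_tokens": 150},
}

messages, compactions = split_compact_output(sample_response)
print(len(messages), len(compactions))
print(sample_response["usage"]["total_tokens"])
```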