EmpirioLabs AI
Overview
| Property | Details |
|---|---|
| Description | EmpirioLabs AI hosts open, proprietary, and custom models behind one OpenAI-compatible API with pay-as-you-go pricing across text, image, video, audio, search, and 3D endpoints. |
| Provider Route on LiteLLM | empiriolabs/ |
| Link to Provider Doc | EmpirioLabs AI Documentation ↗ |
| Base URL | https://api.empiriolabs.ai/v1 |
| Supported Operations | /chat/completions, /responses |
We support all EmpirioLabs chat models. Set empiriolabs/ as the model-name prefix when sending completion requests.
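As a rough illustration of the prefix rule, the sketch below adds empiriolabs/ to a bare model name and then sends a request through the /responses operation listed in the table. The helper name with_provider_prefix is hypothetical, and the sketch assumes that litellm.responses accepts this provider route the same way it does for other providers. The network call only runs when EMPIRIOLABS_API_KEY is set.

```python
import os

PROVIDER_PREFIX = "empiriolabs/"  # provider route from the overview table


def with_provider_prefix(model_name: str) -> str:
    """Hypothetical helper: add the provider prefix only if it is missing."""
    if model_name.startswith(PROVIDER_PREFIX):
        return model_name
    return PROVIDER_PREFIX + model_name


if __name__ == "__main__" and os.environ.get("EMPIRIOLABS_API_KEY"):
    # Imported lazily so the helper above has no third-party dependency.
    import litellm

    # Assumption: the /responses route is reachable via litellm.responses.
    response = litellm.responses(
        model=with_provider_prefix("qwen3-7-plus"),
        input="Say hello in one sentence.",
    )
    print(response)
```

Making the helper idempotent means it can be applied to names that already carry the prefix without producing empiriolabs/empiriolabs/....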
Available Models (selection)
The full live catalog with pricing is at empiriolabs.ai/models. Popular chat models:
| Model | Description | Context Window |
|---|---|---|
| empiriolabs/qwen3-7-max | Qwen3.7 Max flagship text model for coding, agents, and deep thinking | 1M tokens |
| empiriolabs/qwen3-7-plus | Cost-effective Qwen3.7 vision-language model (text, image, video input) | 1M tokens |
| empiriolabs/deepseek-v4-pro | DeepSeek V4 flagship MoE (1.6T total / 49B active parameters) | 1M tokens |
| empiriolabs/deepseek-v4-flash | Lightweight DeepSeek V4 MoE (284B total / 13B active parameters) | 1M tokens |
| empiriolabs/glm-5-1 | Zhipu AI long-context reasoning model with tool use | 202K tokens |
| empiriolabs/kimi-k2-6 | Moonshot Kimi K2.6 multimodal reasoning model | 256K tokens |
| empiriolabs/minimax-m3 | MiniMax M3 multimodal reasoning for coding and agents | 524K tokens |
| empiriolabs/gemma-4-26b-a4b | Google Gemma 4 26B A4B open multimodal model | 256K tokens |
Required Variables
Environment Variables
os.environ["EMPIRIOLABS_API_KEY"] = "" # your EmpirioLabs API key
Get an API key from the EmpirioLabs dashboard.
Usage - LiteLLM Python SDK
Non-streaming
EmpirioLabs Non-streaming Completion
import os
import litellm
from litellm import completion
os.environ["EMPIRIOLABS_API_KEY"] = "" # your EmpirioLabs API key
messages = [{"content": "Hello, how are you?", "role": "user"}]
# EmpirioLabs call
response = completion(model="empiriolabs/qwen3-7-plus", messages=messages)
print(response)
Streaming
EmpirioLabs Streaming Completion
import os
import litellm
from litellm import completion
os.environ["EMPIRIOLABS_API_KEY"] = "" # your EmpirioLabs API key
messages = [{"content": "Hello, how are you?", "role": "user"}]
# EmpirioLabs call with streaming
response = completion(
    model="empiriolabs/qwen3-7-plus",
    messages=messages,
    stream=True,
)
for chunk in response:
    print(chunk)
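Printing raw chunks is useful for debugging, but most callers want the assembled text. The sketch below is one way to join the pieces, assuming the chunks follow the OpenAI-style streaming shape that LiteLLM returns (chunk.choices[0].delta.content, which may be None). The helper name collect_stream_text and the fake chunks are illustrative only, so the example runs without an API key.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream_text(chunks: Iterable) -> str:
    """Join the text deltas of an OpenAI-style chunk stream into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks carry no choices (e.g. usage-only chunks)
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:  # skip None and empty deltas
            parts.append(content)
    return "".join(parts)


def _fake_chunk(text):
    """Stand-in for a streamed chunk, used only for this offline demo."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [_fake_chunk("Hel"), _fake_chunk(None), _fake_chunk("lo")]
print(collect_stream_text(demo))  # prints "Hello"
```

With a real call, pass the object returned by completion(..., stream=True) in place of demo.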
Usage - LiteLLM Proxy
Add the following to your LiteLLM Proxy configuration file:
config.yaml
model_list:
  - model_name: qwen3-7-plus
    litellm_params:
      model: empiriolabs/qwen3-7-plus
      api_key: os.environ/EMPIRIOLABS_API_KEY
  - model_name: deepseek-v4-flash
    litellm_params:
      model: empiriolabs/deepseek-v4-flash
      api_key: os.environ/EMPIRIOLABS_API_KEY
Start your LiteLLM Proxy server:
Start LiteLLM Proxy
litellm --config config.yaml
# RUNNING on http://0.0.0.0:4000
EmpirioLabs via Proxy
from openai import OpenAI
# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key",  # Your proxy API key
)
# Non-streaming response
response = client.chat.completions.create(
    model="qwen3-7-plus",
    messages=[{"role": "user", "content": "hello from litellm"}],
)
print(response.choices[0].message.content)
EmpirioLabs via Proxy - cURL
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-api-key" \
  -d '{
    "model": "qwen3-7-plus",
    "messages": [{"role": "user", "content": "hello from litellm"}]
  }'
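For environments without the OpenAI SDK, the same request as the cURL call can be sketched with the Python standard library alone. The helper build_proxy_request and the LITELLM_PROXY_API_KEY environment variable are hypothetical names chosen for this example. The request is only sent when that variable is set.

```python
import json
import os
import urllib.request


def build_proxy_request(base_url, api_key, model, prompt):
    """Build (but do not send) a chat completion request for the proxy."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical variable name; use whatever holds your proxy key.
proxy_key = os.environ.get("LITELLM_PROXY_API_KEY")
if __name__ == "__main__" and proxy_key:
    live_request = build_proxy_request(
        "http://localhost:4000", proxy_key, "qwen3-7-plus", "hello from litellm"
    )
    with urllib.request.urlopen(live_request, timeout=30) as resp:
        body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```

Separating request construction from sending keeps the payload and headers easy to inspect before any network traffic happens.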
Additional Notes
- Thinking-capable models accept reasoning_effort (none, low, medium, high, max); the gateway maps it onto each model's native thinking controls.
- Per-model parameters, limits, and live pricing are listed at docs.empiriolabs.ai and on each model page at empiriolabs.ai/models.
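A minimal sketch of passing reasoning_effort through the LiteLLM Python SDK follows. It validates the value against the list in the note above before building the call arguments. The helper build_completion_kwargs is a hypothetical name, and it is an assumption that LiteLLM forwards every listed value (in particular max) to the gateway unchanged, so check the model page if a value is rejected. The call itself only runs when EMPIRIOLABS_API_KEY is set.

```python
import os

# Values listed in the note above.
ALLOWED_EFFORTS = ("none", "low", "medium", "high", "max")


def build_completion_kwargs(model, prompt, reasoning_effort="medium"):
    """Hypothetical helper: validate the effort level and build call kwargs."""
    if reasoning_effort not in ALLOWED_EFFORTS:
        raise ValueError(
            f"reasoning_effort must be one of {ALLOWED_EFFORTS}, "
            f"got {reasoning_effort!r}"
        )
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,
    }


if __name__ == "__main__" and os.environ.get("EMPIRIOLABS_API_KEY"):
    # Imported lazily so the helper above has no third-party dependency.
    from litellm import completion

    kwargs = build_completion_kwargs(
        "empiriolabs/qwen3-7-max", "Explain binary search briefly.", "high"
    )
    response = completion(**kwargs)
    print(response.choices[0].message.content)
```

Validating locally gives a clear error for a misspelled level instead of relying on the gateway's response.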