
OpenRouter — User Guide
One API for many models.
Strengths
- One API key to access 200+ AI models
- Automatic routing to the cheapest or fastest available provider
- Real-time price comparison and transparent fee display
- Supports the OpenAI-compatible API format
- Some models are completely free
Best for
- Unified management of API calls for multiple AI models
- Automatically select the cheapest model provider
- Test and compare the output of different models
- Build applications that support multi-model switching
- Access models that cannot be called directly in China
Quick start
OpenRouter uses an OpenAI compatible format, requiring little modification to existing code.
Call any model

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your-openrouter-key",
)

# You can call any model; just change the model parameter
response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",  # or any other model
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={
        "HTTP-Referer": "https://your-site.com",
        "X-Title": "Your App Name",
    },
)
print(response.choices[0].message.content)

# Just change one line to switch models:
# model="openai/gpt-4o"
# model="google/gemini-pro-1.5"
# model="meta-llama/llama-3.1-70b-instruct"
```

One API key gives access to all models; to switch models, just change the model parameter. Billing is managed centrally in OpenRouter.
At openrouter.ai/models you can view real-time prices for all models and choose the one that suits you best.
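As a sketch of how that price comparison could be automated, the helper below sorts a models payload by prompt price. It assumes the response shape of OpenRouter's public `GET /api/v1/models` endpoint (each entry carrying an `id` and a `pricing` object with per-token prices as strings); verify the actual schema in the OpenRouter docs before relying on it.

```python
def cheapest_by_prompt_price(models, top=3):
    """Return (model id, prompt price per token) pairs, cheapest first.

    Entries without a numeric prompt price (e.g. some free models) are skipped.
    """
    priced = []
    for m in models:
        pricing = m.get("pricing", {})
        try:
            priced.append((m["id"], float(pricing["prompt"])))
        except (KeyError, ValueError):
            continue
    priced.sort(key=lambda pair: pair[1])
    return priced[:top]

# Fetching live data (assumed endpoint shape; see openrouter.ai/docs):
#   import json, urllib.request
#   with urllib.request.urlopen("https://openrouter.ai/api/v1/models") as resp:
#       listing = json.load(resp)["data"]
#   print(cheapest_by_prompt_price(listing))
```

A helper like this makes it easy to re-check prices periodically, since provider pricing on OpenRouter can change over time.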
Use free models

```python
# Free models on OpenRouter (2025)
free_models = [
    "google/gemini-flash-1.5-8b",             # Google, free
    "meta-llama/llama-3.2-3b-instruct:free",  # Meta, free
    "mistralai/mistral-7b-instruct:free",     # Mistral, free
    "deepseek/deepseek-r1:free",              # DeepSeek, free
]

# Use a free model
response = client.chat.completions.create(
    model="deepseek/deepseek-r1:free",
    messages=[{"role": "user", "content": "Explain machine learning"}],
)
```

Free models have rate limits, but they are entirely sufficient for testing and low-frequency use. The free DeepSeek R1 is of notably high quality. Because rate limits on free models are usually strict, paid models are recommended for production applications.
Automatic routing and cost optimization
OpenRouter automatically selects the cheapest provider, reducing API costs.
Use the automatic routing feature

```python
# With "auto" routing, OpenRouter selects the optimal model for you
response = client.chat.completions.create(
    model="openrouter/auto",
    messages=[{"role": "user", "content": "Write a poem"}],
)

# Check which model was actually used
print(response.model)

# Control provider selection
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "provider": {
            "order": ["OpenAI", "Azure"],  # priority order
            "allow_fallbacks": True,       # allow fallback to other providers
        }
    },
)
```

Automatic routing selects the cheapest or fastest provider available at the moment and switches to a backup when the primary provider is unavailable, which improves application reliability. For production applications, it is recommended to set allow_fallbacks=True to avoid service interruptions caused by a single provider failure.
Compared with similar tools
| Tool | Strength | Best for | Pricing |
|---|---|---|---|
| OpenRouter (this tool) | Unified access to all models, automatic price comparison, some free models | Using multiple models with unified API management | Pay as you go (transparent pricing) |
| OpenAI API | Direct access with minimal latency | Using only OpenAI models and needing the lowest latency | Pay per token |
| Together AI | Cheaper open-source models | Heavy use of open-source models | Pay per token |
| Groq | Fastest inference speed | Extremely demanding speed requirements | Free quota / paid tier |
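Since all of these services bill per token, a small estimator helps compare them concretely. The prices below are illustrative placeholders, not quotes; real per-million-token prices are listed at openrouter.ai/models.

```python
def request_cost(prompt_tokens, completion_tokens,
                 prompt_price_per_m, completion_price_per_m):
    """Cost in USD for one request, given prices in USD per million tokens."""
    return (prompt_tokens * prompt_price_per_m
            + completion_tokens * completion_price_per_m) / 1_000_000

# Illustrative prices only -- check openrouter.ai/models for current ones
cost = request_cost(1_000, 500, prompt_price_per_m=2.50, completion_price_per_m=10.00)
print(f"${cost:.4f}")  # $0.0075
```

Multiplying that per-request cost by expected daily traffic gives a quick budget estimate before committing to a model.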