
Replicate — User Guide
Run open models in the cloud.
Strengths
- No GPU environment to configure; call the API directly
- Supports thousands of open-source models (image, video, audio, text)
- Pay as you go, with no minimum spend
- Deploy your own models for others to use
- Easy-to-use Python and JavaScript SDKs
Best for
- Quickly calling image generation models such as Stable Diffusion and FLUX
- Running computationally intensive models such as video generation and audio processing
- Testing AI features during the prototyping phase
- Deploying and sharing your own trained models
- Running open-source models without a GPU
Quickly call an image generation model
The most common use of Replicate is to invoke image generation models without requiring a local GPU.
Call FLUX to generate images
import replicate

# Set the API token (obtained from replicate.com/account)
# export REPLICATE_API_TOKEN=your-token

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={
        "prompt": "A serene Japanese garden with cherry blossoms, "
                  "traditional stone lantern, photorealistic",
        "num_outputs": 1,
        "aspect_ratio": "16:9",
        "output_format": "webp",
        "output_quality": 90
    }
)

# output is a list of image URLs
print(output[0])  # print the image URL
# Download the image
import requests

img_data = requests.get(output[0]).content
with open("output.webp", "wb") as f:
    f.write(img_data)

Generates a high-quality image in seconds and returns a temporary URL.
FLUX Schnell is one of the fastest high-quality image generation models available, though FLUX Dev trades some speed for higher quality. Choose based on your needs.
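The trade-off above can be captured in a small helper that picks a model slug for `replicate.run`. Only `black-forest-labs/flux-schnell` appears in this guide; the `flux-dev` slug is assumed from the same naming pattern, so verify it on replicate.com/explore before relying on it:

```python
# Pick a FLUX variant based on whether speed or quality matters more.
# NOTE: "black-forest-labs/flux-dev" is an assumed slug following the
# flux-schnell naming pattern; confirm it on replicate.com/explore.
def pick_flux_model(prefer_quality: bool = False) -> str:
    if prefer_quality:
        return "black-forest-labs/flux-dev"      # higher quality, slower
    return "black-forest-labs/flux-schnell"      # fastest, slightly lower quality

# The returned slug plugs straight into replicate.run(model_slug, input={...})
print(pick_flux_model())
print(pick_flux_model(prefer_quality=True))
```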
Call a video generation model
import replicate

output = replicate.run(
    "minimax/video-01",
    input={
        "prompt": "A butterfly landing on a flower, "
                  "macro photography, slow motion",
        "duration": 5,
    }
)
# Video generation takes a while
print(output)  # returns the video URL

Generates 5 seconds of high-quality video; generation takes about 2-5 minutes and returns a downloadable video URL.
Video generation takes much longer than image generation, so asynchronous calls are recommended.
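One way to avoid blocking on a long job is to start a prediction and poll its status. The loop below shows the polling logic against an injected status function so it runs without a live API call; the SDK usage in the comments (`replicate.predictions.create`, `prediction.reload()`, `prediction.status`) is a sketch based on the Replicate Python client and should be checked against your SDK version:

```python
import time

# Generic polling loop: call `get_status` until it reports a terminal state.
# With the Replicate SDK this would look roughly like:
#   prediction = replicate.predictions.create(model="minimax/video-01", input={...})
#   ...then periodically prediction.reload() and check prediction.status
def wait_until_done(get_status, poll_interval=1.0, max_polls=600):
    for _ in range(max_polls):
        status = get_status()
        if status in ("succeeded", "failed", "canceled"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("prediction did not finish in time")

# Simulated prediction that succeeds on the third poll
states = iter(["starting", "processing", "succeeded"])
print(wait_until_done(lambda: next(states), poll_interval=0.0))
```

The same loop works for any slow model; only the status source changes.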
Browse and discover models
Replicate's model library contains thousands of models, so it pays to learn how to search for and evaluate them.
Find the right model on the site
At replicate.com/explore:
1. Browse by category:
   - Image generation
   - Video generation
   - Audio (audio processing)
   - Language (language models)
2. Indicators for evaluating a model:
   - Run count (the higher, the more battle-tested)
   - Cost per run
   - Run time
   - Last updated
3. View the model's API sample code
4. Try it for free on the website (limited number of runs)
Find a model that suits your needs, check its cost estimate, and copy the sample code to use directly.
Prefer models with a high run count and recent updates; their quality and maintenance are more dependable.
Compared with similar tools
| Tool | Strength | Best for | Pricing |
|---|---|---|---|
| Replicate (this tool) | Largest variety of models, pay as you go, no GPU setup required | Quickly calling various open-source models; prototype development | Pay per use (roughly $0.001-0.05 per image) |
| Hugging Face Inference API | Direct integration with the Hugging Face model library | Using models from Hugging Face | Free quota/paid version |
| Together AI | Language model inference is faster and cheaper | Calling language models a lot | Pay by token |
| Ollama | Runs completely locally, no fees | Having a local GPU and strict data privacy requirements | Free |
Sources & references:
- Replicate official website (2025-03)
- Replicate Python documentation (2025-03)