
Replicate — User Guide
Run open models in the cloud.
Strengths
- No GPU environment to configure; call the API directly
- Supports thousands of open-source models (image, video, audio, text)
- Pay as you go, with no minimum spend
- Deploy your own models for others to use
- Easy-to-use Python and JavaScript SDKs
Best for
- Quickly calling image generation models such as Stable Diffusion and FLUX
- Running computationally intensive models such as video generation and audio processing
- Testing AI features during the prototyping phase
- Deploying and sharing your own trained models
- Running open-source models without a GPU
Quickly call an image generation model
The most common use of Replicate is to invoke image generation models without requiring a local GPU.
Call FLUX to generate images
import replicate

# Set the API token (obtained from replicate.com/account)
# export REPLICATE_API_TOKEN=your-token

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={
        "prompt": "A serene Japanese garden with cherry blossoms, "
                  "traditional stone lantern, photorealistic",
        "num_outputs": 1,
        "aspect_ratio": "16:9",
        "output_format": "webp",
        "output_quality": 90
    }
)

# output is a list of image URLs
print(output[0])  # print the image URL
# Download the image
import requests

img_data = requests.get(output[0]).content
with open("output.webp", "wb") as f:
    f.write(img_data)

Generates a high-quality image in seconds and returns a temporary URL.
FLUX Schnell is one of the fastest high-quality image generation models available, though FLUX Dev trades some speed for higher quality. Choose based on your needs.
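The trade-off above can be captured in a small helper that picks a model slug for `replicate.run`. Only `black-forest-labs/flux-schnell` appears in this guide; the `flux-dev` slug is assumed from the same naming pattern, so verify it on replicate.com/explore before relying on it:

```python
# Pick a FLUX variant based on whether speed or quality matters more.
# NOTE: "black-forest-labs/flux-dev" is an assumed slug following the
# flux-schnell naming pattern; confirm it on replicate.com/explore.
def pick_flux_model(prefer_quality: bool = False) -> str:
    if prefer_quality:
        return "black-forest-labs/flux-dev"      # higher quality, slower
    return "black-forest-labs/flux-schnell"      # fastest, slightly lower quality

# The returned slug plugs straight into replicate.run(model_slug, input={...})
print(pick_flux_model())
print(pick_flux_model(prefer_quality=True))
```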
Call a video generation model
import replicate

output = replicate.run(
    "minimax/video-01",
    input={
        "prompt": "A butterfly landing on a flower, "
                  "macro photography, slow motion",
        "duration": 5,
    }
)
# Video generation takes a while
print(output)  # returns the video URL

Generates 5 seconds of high-quality video; generation takes about 2-5 minutes and returns a downloadable video URL.
Video generation takes much longer than image generation, so asynchronous calls are recommended.
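One way to avoid blocking on a long job is to start a prediction and poll its status. The loop below shows the polling logic against an injected status function so it runs without a live API call; the SDK usage in the comments (`replicate.predictions.create`, `prediction.reload()`, `prediction.status`) is a sketch based on the Replicate Python client and should be checked against your SDK version:

```python
import time

# Generic polling loop: call `get_status` until it reports a terminal state.
# With the Replicate SDK this would look roughly like:
#   prediction = replicate.predictions.create(model="minimax/video-01", input={...})
#   ...then periodically prediction.reload() and check prediction.status
def wait_until_done(get_status, poll_interval=1.0, max_polls=600):
    for _ in range(max_polls):
        status = get_status()
        if status in ("succeeded", "failed", "canceled"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("prediction did not finish in time")

# Simulated prediction that succeeds on the third poll
states = iter(["starting", "processing", "succeeded"])
print(wait_until_done(lambda: next(states), poll_interval=0.0))
```

The same loop works for any slow model; only the status source changes.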
Browse and discover models
Replicate's model library contains thousands of models, so it pays to learn how to search for and evaluate them.
Find the right model on the site
At replicate.com/explore:
1. Browse by category:
   - Image generation
   - Video generation
   - Audio (audio processing)
   - Language (language models)
2. Indicators for evaluating a model:
   - Run count (the higher, the more battle-tested)
   - Cost per run
   - Run time
   - Last updated
3. View the model's API sample code
4. Try it for free on the website (limited number of runs)
Find a model that suits your needs, check its cost estimate, and copy the sample code to use directly.
Prefer models with a high run count and recent updates; their quality and maintenance are more dependable.
Compared with similar tools
| Tool | Strength | Best for | Pricing |
|---|---|---|---|
| Replicate (this tool) | Largest variety of models, pay as you go, no GPU setup required | Quickly calling various open-source models; prototype development | Pay per use (roughly $0.001-0.05 per image) |
| Hugging Face Inference API | Direct integration with the Hugging Face model library | Using models from Hugging Face | Free quota/paid version |
| Together AI | Language model inference is faster and cheaper | Calling language models a lot | Pay by token |
| Ollama | Runs completely locally, no fees | Having a local GPU and strict data privacy requirements | Free |
Sources & references:
- Replicate official website (2025-03)
- Replicate Python documentation (2025-03)