MiniMax Hailuo: text or image to video.
The more explicit the shot, the better this model performs. Give it a subject, a move, a frame, and a mood so the output feels directed instead of guessed.
Best results start with a directed prompt or a strong first frame.
Hailuo Video on Pixio is MiniMax Hailuo: text-to-video and image-to-video with cinematic camera control, character consistency, and realistic physics. Clips are typically 6s or 10s at 768p or 1080p (1080p may be limited to 6s depending on backend), 25 FPS, with aspect ratios within 2:5–5:2. Use it when you want MiniMax quality for short clips from a prompt or keyframe—drafts, ads, or narrative beats—or as an alternative to Runway/ByteDance in Pixio.
| Mode | Input | Best for |
|---|---|---|
| Text to Video | Prompt only | Scenes from scratch; one clear motion and composition per clip |
| Image to Video | One image + prompt | Keyframe-driven clips; image defines look, prompt describes motion |

| Option | Values | Notes |
|---|---|---|
| Duration | 6s, 10s | 1080p may be limited to 6s on some backends—check Pixio |
| Resolution | 768p, 1080p | 768p (e.g. 1366×768) or 1080p; higher res uses more credits |
| Aspect ratio | Within 2:5–5:2 | Shorter side typically >300px; 16:9, 9:16 common |
| Frame rate | 25 FPS (typical) | Check Pixio for variant (e.g. Hailuo 2.0 / 2.3) |
Credits depend on duration, resolution, and variant (e.g. standard vs fast). Check the model card in Pixio for current rates.
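
The limits in the options table can be checked before submitting a job. The following is a minimal sketch, not part of any Pixio or MiniMax API: the function name `check_settings`, the constants, and the idea of passing explicit pixel dimensions are all assumptions made for illustration. The 300px threshold follows the "typically >300px" note, so treat it as approximate.

```python
from fractions import Fraction

# Values taken from the options table; all names here are hypothetical.
ALLOWED_DURATIONS = {6, 10}              # seconds
ALLOWED_RESOLUTIONS = {"768p", "1080p"}
MIN_RATIO = Fraction(2, 5)               # narrowest width:height
MAX_RATIO = Fraction(5, 2)               # widest width:height
MIN_SHORT_SIDE = 300                     # pixels, "typically >300px"


def check_settings(duration_s, resolution, width, height):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if duration_s not in ALLOWED_DURATIONS:
        problems.append(f"duration {duration_s}s is not 6s or 10s")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"resolution {resolution} is not 768p or 1080p")
    if width <= 0 or height <= 0:
        problems.append("width and height must be positive")
        return problems
    ratio = Fraction(width, height)
    if not (MIN_RATIO <= ratio <= MAX_RATIO):
        problems.append(f"aspect ratio {width}:{height} is outside 2:5 to 5:2")
    if min(width, height) <= MIN_SHORT_SIDE:
        problems.append(f"shorter side must exceed {MIN_SHORT_SIDE}px")
    # The table notes that 1080p may be capped at 6s on some backends.
    if resolution == "1080p" and duration_s == 10:
        problems.append("1080p at 10s may be unavailable; check Pixio")
    return problems


if __name__ == "__main__":
    for issue in check_settings(10, "1080p", 1920, 1080):
        print("warning:", issue)
```

A check like this only mirrors the documented ranges; the model card in Pixio remains the source of truth for what a given variant accepts.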
Structure each prompt as [Scene] + [Motion] + [Camera] + [Style], keeping each part short and clear. For image-to-video, describe only motion and style, since the image defines the look. One primary action and one camera move per prompt work best.
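
As a rough sketch of the [Scene] + [Motion] + [Camera] + [Style] pattern, the helper below joins the four parts into one prompt string. The function name `build_prompt` and its parameters are hypothetical and exist only to illustrate the structure; Hailuo itself simply receives the resulting text.

```python
def build_prompt(scene="", motion="", camera="", style=""):
    """Join the four prompt parts in order, skipping any that are empty."""
    cleaned = []
    for part in (scene, motion, camera, style):
        text = part.strip().rstrip(".")
        if text:
            # Each part becomes its own short sentence.
            cleaned.append(text + ".")
    return " ".join(cleaned)


if __name__ == "__main__":
    # Text to video: all four parts.
    print(build_prompt(
        scene="A luxury watch rests on a black velvet surface",
        motion="Light glints across the metal",
        camera="Camera orbits 90 degrees around the watch, smooth and slow",
        style="High-end product commercial, clean reflections",
    ))
    # Image to video: the image defines the look, so the scene is omitted.
    print(build_prompt(
        motion="Leaves rustle in the wind",
        camera="Camera slowly pushes in",
    ))
```

Keeping the parts as separate arguments makes it easy to hold the scene fixed while iterating on a single camera move or style.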
Cinematic:
"Wide shot of a lone astronaut walking across a red Martian landscape at golden hour. Dust kicks up with each step. Camera slowly dollies backward, keeping the figure small in frame. Cinematic, anamorphic feel, shallow depth of field."
Product:
"A luxury watch rests on a black velvet surface. Soft key light from the left, subtle rim light on the metal. Camera orbits 90 degrees around the watch, smooth and slow. High-end product commercial, 24p, clean reflections."
Narrative:
"A woman in a red coat walks through a rainy city street at night. Camera follows from behind at a steady pace. Neon signs reflect on wet pavement; streetlights glow in the mist. Cinematic, moody, film-noir atmosphere."
Image-to-video (motion only):
"Camera slowly pushes in. Leaves rustle in the wind. Woman turns her head slightly toward camera. Background stays soft and still."
| Scenario | Best choice |
|---|---|
| Short cinematic clip from a prompt or keyframe, MiniMax quality | Hailuo Video |
| Best Runway or ByteDance quality | Gen-4, Seedance 2 Pro |
| Quick draft, lower cost | Kling or Gen-4 Turbo |
| Talking head / lip-sync | Fabric, Character 3, OmniHuman |
Start with a strong first frame when consistency matters more than surprise.
Keep each prompt focused on one primary motion direction.
Use shorter runs for iteration, then scale up for finals.
For narratives, structure the idea as Shot 1 / Shot 2 / Shot 3 instead of one flat blob.
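
The Shot 1 / Shot 2 / Shot 3 structure can be produced from a plain list of shot descriptions. This is only an illustrative sketch: `format_shots` is a hypothetical helper, and whether the labeled shots go into one prompt or into separate generations is left to the workflow, since the guidance above does not specify it.

```python
def format_shots(shots):
    """Label each shot description as 'Shot N: ...', one per line."""
    lines = []
    for number, shot in enumerate(shots, start=1):
        lines.append(f"Shot {number}: {shot.strip()}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_shots([
        "Wide shot of a rainy city street at night. Camera holds steady.",
        "A woman in a red coat walks away. Camera follows from behind.",
        "Close on neon reflections in wet pavement. Camera slowly pushes in.",
    ]))
```

Each entry still follows the one action, one camera move guideline, so the list reads as a sequence of small shot briefs.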
A strong video prompt gives the scene a subject, a move, camera behavior, and a mood to hold onto.
Start from a text prompt and specify camera intent, pacing, atmosphere, and shot design in a single pass.
Start from a frame or reference when consistency matters more than improvisation.
Continue or refine the clip without throwing away the visual language you already established.
Hailuo Video works well when the prompt needs motion, framing, and visual direction, not just subject matter.
Use it for sequences that need a strong first frame, continuity, or a clearly controlled camera idea.
Treat each generation like a shot brief instead of a loose caption to get more cinematic outputs.
Start with either a directed text brief or a strong frame, depending on how locked the look already is.
Write the motion like a director: subject, action, camera behavior, environment, lighting, and tone.
Iterate fast on shorter runs, then move to longer or higher-resolution finals once the rhythm feels right.
Use it to build a stronger first frame, then hand that frame to the video model for motion and continuity.
Pair it with frame extraction, merge tools, or image prep so the motion workflow stays clean end to end.