Veed Fabric: turn a face image and audio into a lip-synced talking-head video. Use 1.0 Fast for speed, or 1.0 for higher quality when it matters.
This model gets stronger as the shot becomes more explicit. Give it a subject, a move, a frame, and a mood so the output feels directed instead of guessed.
Best results start with a directed prompt or a strong first frame.
Fabric 1.0 / 1.0 Fast on Pixio is Veed Fabric’s talking-head pipeline: one face image + audio → lip-synced video. The model drives the mouth and expression from the audio so the character speaks naturally. Choose 1.0 Fast for speed and lower cost, or 1.0 for higher quality when it matters. Use it when you need a spokesperson, avatar, or talking head that matches your script or voiceover.
| Mode | Input | Best for |
|---|---|---|
| Face + Audio to Video | One face image + audio file (or a script, when TTS is supported) | Lip-synced talking head; expression driven by audio |

| Option | Values | Notes |
|---|---|---|
| Variant | 1.0 Fast, 1.0 | Fast = speed/cost; 1.0 = higher fidelity |
| Face reference | One image (clear face, front or three-quarter) | Good lighting, neutral or slight expression |
| Audio | Voice track or script (when TTS supported) | Clean audio improves lip-sync |

Credits depend on variant (1.0 Fast vs 1.0) and duration. Fast costs less per clip. Check the model card in Pixio for current rates.
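
The options in the table above can be captured as a small job description before anything is submitted. The sketch below is illustrative only: `FabricJob`, its field names, and the dictionary keys are hypothetical and are not taken from any Pixio or Veed API. It only encodes what the table states: one variant, one face image, and either a voice track or a script.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Variant labels as they appear in the options table.
VARIANTS = ("1.0 Fast", "1.0")


@dataclass(frozen=True)
class FabricJob:
    """Hypothetical container for one Fabric run; not a Pixio API type."""

    variant: str
    face_image: Path
    audio: Optional[Path] = None   # voice track
    script: Optional[str] = None   # text input, only where TTS is supported

    def validate(self) -> None:
        if self.variant not in VARIANTS:
            raise ValueError(f"variant must be one of {VARIANTS}")
        # Exactly one speech source: a voice track or a script, not both.
        if (self.audio is None) == (self.script is None):
            raise ValueError("provide exactly one of audio or script")

    def to_payload(self) -> dict:
        """Return a plain dict; the key names are assumptions for illustration."""
        self.validate()
        payload = {"variant": self.variant, "face_image": str(self.face_image)}
        if self.audio is not None:
            payload["audio"] = str(self.audio)
        else:
            payload["script"] = self.script
        return payload


job = FabricJob(variant="1.0 Fast", face_image=Path("face.png"), audio=Path("voice.wav"))
print(job.to_payload())
```

Validating the inputs up front makes it easy to iterate on 1.0 Fast and then switch the same job to 1.0 for the final render by changing a single field.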
Fabric is built for one job: one face image + one audio track → one talking-head video. The model handles lip-sync and expression from the audio, so you don’t need to animate the mouth or set timing by hand. Use a clear face reference (front or three-quarter, good lighting) and clean audio for best results. For character-driven motion without speech (e.g. waving, gesturing), use Gen-4 Act-Two or Character 3 instead.

| Scenario | Best choice |
|---|---|
| Lip-synced talking head (face + audio) | Fabric 1.0 / 1.0 Fast |
| Talking head, Hedra pipeline | Character 3 |
| ByteDance talking head | OmniHuman v1.5 |
| Character motion without speech | Gen-4 Act-Two |
| General image-to-video | Gen-4, Seedance, Kling |
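
The scenario table above can also be kept as a simple lookup, for example in a script that routes briefs to models. This is a minimal sketch: the scenario keys and the `recommend` helper are hypothetical names chosen for illustration, and the values are copied from the table as written.

```python
# Scenario -> recommended model, copied from the table above.
# The dictionary keys are hypothetical identifiers, not Pixio names.
RECOMMENDED = {
    "lip_sync_face_audio": "Fabric 1.0 / 1.0 Fast",
    "talking_head_hedra": "Character 3",
    "talking_head_bytedance": "OmniHuman v1.5",
    "character_motion_no_speech": "Gen-4 Act-Two",
    "general_image_to_video": "Gen-4, Seedance, Kling",
}


def recommend(scenario: str) -> str:
    """Return the table's recommendation for a known scenario key."""
    try:
        return RECOMMENDED[scenario]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED))
        raise ValueError(f"unknown scenario {scenario!r}; expected one of: {known}") from None


print(recommend("lip_sync_face_audio"))
```
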
Start with a strong first frame when consistency matters more than surprise.
Keep each prompt focused on one primary motion direction.
Use shorter runs for iteration, then scale up for finals.
For narratives, structure the idea as Shot 1 / Shot 2 / Shot 3 instead of one flat blob.
A strong video prompt gives the scene a subject, a move, camera behavior, and a mood to hold onto.
Start from language and push for camera intent, pacing, atmosphere, and shot design in one move.
Start from a frame or reference when consistency matters more than improvisation.
Continue or refine the clip without throwing away the visual language you already established.
Fabric 1.0 / 1.0 Fast works well when the prompt needs motion, framing, and visual direction, not just subject matter.
Use it for sequences that need a strong first frame, continuity, or a clearly controlled camera idea.
Treat each generation like a shot brief instead of a loose caption to get more cinematic outputs.
Start with either a directed text brief or a strong frame, depending on how locked the look already is.
Write the motion like a director: subject, action, camera behavior, environment, lighting, and tone.
Iterate fast on shorter runs, then move to stronger finals once the rhythm feels right.
Use it to build a stronger first frame, then hand that frame to the video model for motion and continuity.
Pair it with frame extraction, merge tools, or image prep so the motion workflow stays clean end to end.