The most lens-controlled AI video model
PixVerse v6, released March 30, 2026, is PixVerse’s most comprehensive video generation model. The headline feature is a system of 20+ cinematic lens controls: parameters such as focal length, aperture, depth of field, lens distortion, chromatic aberration, and vignetting, which are normally properties of a physical camera setup. Combined with native audio-video generation and a multi-shot storytelling engine, v6 narrows the gap between AI generation and professional cinematography. Multilingual text rendering within frames (subtitles, labels, titles) supports localized content production without a separate post-production step.
Capabilities
20+ cinematic lens controls
Focal length, aperture, depth of field, lens distortion, chromatic aberration, vignetting, and more — precise optical simulation for cinematic results.
Native audio-visual generation
Audio and video generated simultaneously from a single prompt — dialogue, sound effects, and ambient audio synchronized to the visual output.
Multi-shot engine
Structured multi-shot storytelling within a single generation — scene cuts, transitions, and narrative arcs handled automatically from your prompt.
Multilingual text in-frame
Renders accurate text in multiple languages directly within the video frame — titles, subtitles, labels, and signage in your target locale.
1080p at 15 seconds
Full HD output up to 15 seconds — one of the longer native generation windows available at 1080p.
Scene extension and transitions
Supports video extension and scene transition generation — seamlessly continue or connect scenes without re-generating from scratch.
Specifications
| Feature | Details |
|---|---|
| Developer | PixVerse |
| Released | March 30, 2026 |
| Resolution | 1080p |
| Duration | Up to 15 seconds |
| Aspect ratios | 16:9, 9:16, 1:1, 4:3, 3:4 |
| Audio | Native — dialogue, SFX, ambient |
| Lens controls | 20+ (focal length, aperture, depth of field, distortion, chromatic aberration, vignetting) |
| Text in-frame | Yes, multilingual |
| Multi-shot | Yes |
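
The limits in the table above can be checked before a generation request is submitted. The following is a minimal sketch in Python, not an official PixVerse API: `GenerationSettings` and `validate_settings` are hypothetical names introduced for illustration, and the lower bound on duration (greater than zero) is an assumption. Only the 15-second maximum and the aspect-ratio list come from the table.

```python
from dataclasses import dataclass

# Limits copied from the specifications table.
MAX_DURATION_SECONDS = 15
SUPPORTED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}


@dataclass
class GenerationSettings:
    """Hypothetical settings container; not an official PixVerse type."""
    duration_seconds: int
    aspect_ratio: str


def validate_settings(settings: GenerationSettings) -> list[str]:
    """Return a list of human-readable problems (empty if the settings fit the table)."""
    problems = []
    # Assumption: any positive duration up to the documented maximum is acceptable.
    if not 0 < settings.duration_seconds <= MAX_DURATION_SECONDS:
        problems.append(
            f"duration must be between 1 and {MAX_DURATION_SECONDS} seconds, "
            f"got {settings.duration_seconds}"
        )
    if settings.aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {settings.aspect_ratio}")
    return problems


if __name__ == "__main__":
    print(validate_settings(GenerationSettings(duration_seconds=10, aspect_ratio="16:9")))
    print(validate_settings(GenerationSettings(duration_seconds=20, aspect_ratio="21:9")))
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range setting at once.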
How to use
Write your prompt
Include cinematic language — lens type, aperture mood, audio cues, and scene progression. v6 interprets optical and cinematographic vocabulary directly.
Configure lens controls (optional)
Use advanced settings to tune specific optical parameters — focal length for compression/expansion, aperture for depth of field, vignetting for atmosphere.
Include audio direction
Describe the sound environment explicitly in the prompt for the native audio generation to match your visual intent.
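
The three steps above amount to combining a scene description, optical vocabulary, and audio direction into one prompt string. As a rough illustration, the Python sketch below assembles such a prompt. `build_prompt` and its parameters are hypothetical helpers, not part of any PixVerse SDK, and the sentence order is only one reasonable choice.

```python
from typing import Optional


def build_prompt(
    scene: str,
    lens: Optional[str] = None,
    audio: Optional[str] = None,
    duration_seconds: Optional[int] = None,
) -> str:
    """Join scene, lens, audio, and duration fragments into a single prompt."""
    fragments = [scene]
    if lens:
        # Optical vocabulary, e.g. "an 85mm lens with a wide-open aperture".
        fragments.append(f"Shot on {lens}")
    if audio:
        # Explicit sound direction for the native audio generation.
        fragments.append(audio)
    if duration_seconds is not None:
        fragments.append(f"{duration_seconds} seconds")
    # Normalize every fragment so it ends with exactly one period.
    return " ".join(f.strip().rstrip(".") + "." for f in fragments)


if __name__ == "__main__":
    print(
        build_prompt(
            "A couple walks on a sunset beach",
            lens="a 35mm lens, f/1.8, slight vignette",
            audio="Ocean waves audible",
            duration_seconds=10,
        )
    )
```

Keeping the lens and audio fragments as separate arguments makes it easy to vary one aspect of a shot while holding the others fixed.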
Prompting tips
- Use optical terminology — “Shot on an 85mm lens with a wide-open aperture, shallow depth of field” will be interpreted and applied as actual optical rendering properties.
- Describe audio in the prompt — “With the sound of waves crashing and seagulls in the distance” triggers native audio generation alongside the visual.
- Leverage chromatic aberration for mood — A subtle chromatic aberration setting adds a cinematic, slightly analog feel to otherwise perfect digital footage.
- Multi-shot: use transition cues — “CUT TO:” or “THEN:” in prompts cue the multi-shot engine to generate discrete scene transitions.
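
The transition-cue tip can be applied programmatically when shot descriptions come from a script or a storyboard. The Python sketch below joins shot descriptions with a "CUT TO:" cue and appends an optional audio direction. `build_multishot_prompt` is a hypothetical helper written for this page, and how the multi-shot engine weighs the cue is not specified here.

```python
def build_multishot_prompt(shots: list[str], audio: str = "", cue: str = "CUT TO:") -> str:
    """Join shot descriptions with a transition cue and append audio direction."""
    # Drop empty entries so the cue never appears twice in a row.
    cleaned = [shot.strip() for shot in shots if shot.strip()]
    if not cleaned:
        raise ValueError("at least one shot description is required")
    prompt = f" {cue} ".join(cleaned)
    if audio.strip():
        prompt = f"{prompt} {audio.strip()}"
    return prompt


if __name__ == "__main__":
    print(
        build_multishot_prompt(
            [
                "Smartphone spinning on a white surface, dramatic lighting.",
                "Close-up of the screen.",
            ],
            audio="Upbeat electronic music throughout.",
        )
    )
```

Passing `cue="THEN:"` switches to the other cue mentioned in the tips.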
Example prompts
A couple walks on a sunset beach, shot on a 35mm lens, f/1.8, golden bokeh in the background. Ocean waves audible. Slight vignette. Cinematic, 10 seconds.
A product launch promo: SHOT 1 — smartphone spinning on a white surface, dramatic lighting. CUT TO SHOT 2 — close-up of screen with “NEW ERA” text in bold. Upbeat electronic music building throughout. 15 seconds, 16:9.
Compare models
| Model | Audio | Lens controls | Multi-shot | Duration | Best for |
|---|---|---|---|---|---|
| PixVerse v6 | Yes | 20+ | Yes | 15s | Cinematic optical control, A/V |
| PixVerse v5.5 | Yes | Limited | Yes | 10s | Script-first, multi-shot |
| PixVerse v5 | No | None | No | 15s | Fast 1080p, character animation |
| Kling 3.0 Pro | Yes | None | Up to 6 shots | 15s | 4K cinematic storytelling |

