VIDEO MODEL · by Google DeepMind · Released October 2025

Google Veo 3.1

Google DeepMind’s flagship video generation model — 4–8 seconds of output, native 48kHz stereo audio including dialogue and voiceover, 4K resolution with selectable frame rates (24, 30, or 60 FPS), and the highest visual fidelity in the Veo family for broadcast-quality production.

Duration: 4–8 seconds
Resolution: Up to 4K
Audio: 48kHz stereo
Frame rates: 24, 30, or 60 FPS

Google’s most capable video model

Google Veo 3.1, released October 2025, is the production flagship of the Veo 3.1 family. Each generation produces a 4, 6, or 8 second clip with natively generated 48kHz stereo audio (dialogue, voiceover, and sound effects) at up to 4K resolution, which positions the model for broadcast, commercial, and high-end production. Selectable frame rates (24 FPS for cinematic, 30 FPS for standard, 60 FPS for sports and fast motion) give it a flexibility across production contexts that no other model in the lineup matches.

Capabilities

4–8 seconds

Choose 4-, 6-, or 8-second clips, enough for complete commercial spots, narrative sequences, and high-quality short-form content.

48kHz stereo audio

Broadcast-quality audio at 48kHz stereo — includes dialogue, voiceover, sound effects, and ambient soundscapes generated natively.

4K resolution

4K output with the Transformer backbone’s spatial fidelity — production-ready for broadcast, cinema, and large-format display.

Selectable frame rates

Choose 24 FPS (cinematic standard), 30 FPS (broadcast/digital), or 60 FPS (sports, action, smooth motion) per generation.

Highest visual fidelity in Veo family

Maximum quality configuration of the Veo 3.1 Transformer backbone — richer texture, more stable scene composition, and more precise prompt adherence than Fast and Lite tiers.

Cinematic prompt control

Responds accurately to cinematographic language — camera movements, lighting conditions, scene pacing, and narrative cues all translate to the output.
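
To make the duration, frame rate, and audio figures above concrete, the following minimal sketch works out how many video frames and audio samples a clip contains. It is plain arithmetic on the numbers listed on this page; the function name `clip_totals` is hypothetical and not part of any ImagineArt or Google API.

```python
SAMPLE_RATE_HZ = 48_000  # 48kHz audio, as listed under Capabilities
CHANNELS = 2             # stereo


def clip_totals(duration_s: int, fps: int) -> dict:
    """Return total video frames and audio samples for one clip."""
    return {
        "frames": duration_s * fps,
        "samples_per_channel": duration_s * SAMPLE_RATE_HZ,
        "samples_total": duration_s * SAMPLE_RATE_HZ * CHANNELS,
    }


# Print the frame count for every duration / frame-rate combination.
for duration_s in (4, 6, 8):
    for fps in (24, 30, 60):
        totals = clip_totals(duration_s, fps)
        print(f"{duration_s}s at {fps} FPS: {totals['frames']} frames")
```

For example, an 8 second clip holds 192 frames at 24 FPS and 480 frames at 60 FPS.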

Specifications

| Feature | Details |
| --- | --- |
| Developer | Google DeepMind |
| Released | October 2025 |
| Resolution | 720p, 1080p, 4K |
| Duration | 4–8 seconds (4, 6, or 8s selectable) |
| Frame rates | 24 FPS, 30 FPS, 60 FPS (selectable) |
| Audio | 48kHz stereo (dialogue, voiceover, SFX, ambient) |
| Aspect ratios | 16:9, 9:16 |
| Architecture | Transformer backbone, spatio-temporal patches |
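
When planning delivery, it can help to translate a resolution and aspect ratio from the table into pixel dimensions. The sketch below does that under a stated assumption: the pixel sizes are the common conventions for these labels (with 4K taken as UHD 3840×2160), not values confirmed on this page. The names `LANDSCAPE_SIZES` and `frame_size` are hypothetical.

```python
# Assumed 16:9 pixel sizes for each label. These are common conventions,
# not figures published on this page; 4K is assumed to mean UHD.
LANDSCAPE_SIZES = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def frame_size(resolution: str, aspect_ratio: str) -> tuple[int, int]:
    """Return (width, height) in pixels for a resolution and aspect ratio."""
    if resolution not in LANDSCAPE_SIZES:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    width, height = LANDSCAPE_SIZES[resolution]
    if aspect_ratio == "16:9":
        return (width, height)
    if aspect_ratio == "9:16":
        # Portrait output: swap the landscape dimensions.
        return (height, width)
    raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")


print(frame_size("1080p", "16:9"))
print(frame_size("4K", "9:16"))
```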

How to use

1. Open the AI Video Generator

Log in to ImagineArt and go to the AI Video Generator.

2. Select Google Veo 3.1

Choose Google Veo 3.1 from the model dropdown.

3. Write a detailed prompt

Structure your prompt as a scene description with clear narrative progression: describe the arc of what happens over the full duration of the clip.

4. Set duration

Choose your desired length: 4, 6, or 8 seconds.

5. Select frame rate

Choose 24 FPS for cinematic, 30 FPS for standard delivery, or 60 FPS for smooth high-speed action.

6. Select resolution

Choose 720p, 1080p, or 4K based on delivery requirements.

7. Generate

Click Generate. 4K generations take longer than 720p or 1080p.
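
If you plan generations in a script or spreadsheet before entering them in the UI, the choices in the steps above can be checked up front. The following is a minimal sketch of such a check; `GenerationSettings` and its default values are hypothetical and illustrative only, and the code does not call any ImagineArt API.

```python
from dataclasses import dataclass

# Allowed values, taken from the steps above.
ALLOWED_DURATIONS_S = (4, 6, 8)
ALLOWED_FPS = (24, 30, 60)
ALLOWED_RESOLUTIONS = ("720p", "1080p", "4K")


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for one planned generation."""

    prompt: str
    duration_s: int = 8         # default chosen for illustration only
    fps: int = 24               # default chosen for illustration only
    resolution: str = "1080p"   # default chosen for illustration only

    def __post_init__(self) -> None:
        # Reject any value the steps above do not offer.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.duration_s not in ALLOWED_DURATIONS_S:
            raise ValueError(f"duration must be one of {ALLOWED_DURATIONS_S}")
        if self.fps not in ALLOWED_FPS:
            raise ValueError(f"fps must be one of {ALLOWED_FPS}")
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")


settings = GenerationSettings(
    prompt="Steam rises from a mug on a kitchen table at sunrise.",
    duration_s=6,
    fps=30,
    resolution="4K",
)
print(settings)
```

Constructing the object with an unsupported value, such as a 5 second duration, raises `ValueError` before any credits are spent.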

Prompting tips

  • Structure prompts as narratives: “The scene opens on… then transitions to… building to a conclusion where…” Veo 3.1 understands temporal narrative structure.
  • Specify audio timing: “The music builds from quiet to full orchestral by the halfway point.” Temporal audio cues are interpreted accurately.
  • Match FPS to content: 24 FPS for drama and cinema; 30 FPS for documentary and commercial; 60 FPS for sports or a fluid slow-motion effect.
  • Use 4K for large-format or print-quality moments: product launches, hero brand shots, and cinematic sequences benefit most from 4K.
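
The tips above can be sketched as a small helper that assembles a narrative prompt from ordered story beats and suggests a frame rate by content type. This is only an illustration: `build_prompt`, `suggest_fps`, and the fallback of 24 FPS for unlisted content types are assumptions, not part of the product.

```python
# Frame-rate suggestions, mirroring the "Match FPS to content" tip.
FPS_BY_CONTENT = {
    "drama": 24,
    "cinema": 24,
    "documentary": 30,
    "commercial": 30,
    "sports": 60,
}


def suggest_fps(content_type: str) -> int:
    """Return a suggested FPS; unlisted types fall back to 24 (an assumption)."""
    return FPS_BY_CONTENT.get(content_type.lower(), 24)


def build_prompt(beats: list[str], audio_cue: str = "") -> str:
    """Join ordered story beats into one narrative prompt sentence."""
    if not beats:
        raise ValueError("at least one beat is required")
    parts = []
    for i, beat in enumerate(beats):
        if i == 0:
            lead = "The scene opens on"
        elif i == len(beats) - 1:
            lead = "building to a conclusion where"
        else:
            lead = "then transitions to"
        parts.append(f"{lead} {beat}")
    prompt = ", ".join(parts) + "."
    if audio_cue:
        # Append the audio timing cue as its own sentence.
        prompt += f" {audio_cue}"
    return prompt


print(build_prompt(
    ["a frozen tundra at dusk",
     "a wolf pack moving in formation",
     "the alpha reaches the ridge"],
    audio_cue="The music builds from quiet to full orchestral by the halfway point.",
))
print(suggest_fps("documentary"))
```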

Example prompts

A nature documentary segment: A wolf pack hunts across a frozen tundra at dusk. The alpha leads, the pack follows in formation. Wide shot establishing the landscape, intercutting with close-ups of paw prints, breath visible in the cold air. Narration: “In the far north, survival depends on the pack.” Ambient wind and distant howling. 8 seconds, 4K.
A coffee brand spot: Product sits on a warm kitchen table at sunrise. A pair of hands wrap around the mug. Steam rises in close-up. Brand voice: “Start every morning right.” Warm, cinematic. 6 seconds.

Compare models

| Model | Duration | Audio | Resolution | FPS options | Best for |
| --- | --- | --- | --- | --- | --- |
| Veo 3.1 | 4–8s | 48kHz stereo | Up to 4K | 24/30/60 | Broadcast, commercial, high-end |
| Veo 3.1 Fast | 4/6/8s | SFX + ambient | Up to 4K | 24 | Short-form production, 4K |
| Veo 3.1 Lite | 8s | No | 1080p | | Cost-efficient, no audio |
| Sora 2 Pro | 25s | Dialogue + SFX | 1080p | | Long-form A/V, physics |

Choose Veo 3.1 when you need the highest visual fidelity in the Veo family with full audio (dialogue, voiceover, SFX) and up to 4K. For faster generation, Veo 3.1 Fast offers the same 4–8s durations and 4K output at a reduced credit cost.
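
The comparison can also be read as a filter over requirements. The sketch below encodes the table rows as data and returns every model that satisfies a request, in table order. It makes two assumptions that the table does not confirm: the Duration column is treated as a maximum, and a model with no listed FPS options is excluded whenever a specific frame rate is requested. The names `MODELS` and `matching_models` are hypothetical.

```python
# Rows from the comparison table. "fps" is None where the table lists none.
MODELS = [
    {"name": "Veo 3.1", "max_duration_s": 8, "audio": True,
     "dialogue": True, "max_resolution": "4K", "fps": (24, 30, 60)},
    {"name": "Veo 3.1 Fast", "max_duration_s": 8, "audio": True,
     "dialogue": False, "max_resolution": "4K", "fps": (24,)},
    {"name": "Veo 3.1 Lite", "max_duration_s": 8, "audio": False,
     "dialogue": False, "max_resolution": "1080p", "fps": None},
    {"name": "Sora 2 Pro", "max_duration_s": 25, "audio": True,
     "dialogue": True, "max_resolution": "1080p", "fps": None},
]

RESOLUTION_RANK = {"720p": 0, "1080p": 1, "4K": 2}


def matching_models(duration_s, resolution, needs_audio=False,
                    needs_dialogue=False, fps=None):
    """Return names of models that meet every stated requirement."""
    matches = []
    for model in MODELS:
        if duration_s > model["max_duration_s"]:
            continue
        if RESOLUTION_RANK[resolution] > RESOLUTION_RANK[model["max_resolution"]]:
            continue
        if needs_audio and not model["audio"]:
            continue
        if needs_dialogue and not model["dialogue"]:
            continue
        # A requested FPS only matches models that list it explicitly.
        if fps is not None and (model["fps"] is None or fps not in model["fps"]):
            continue
        matches.append(model["name"])
    return matches


print(matching_models(8, "4K", needs_dialogue=True, fps=60))
print(matching_models(20, "1080p"))
```

Returning all matches rather than a single pick leaves the cost and speed trade-off to the reader, since the page gives only relative credit costs.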