Video model by Kling AI (Kling 3 family)

Kling 3.0 Pro

Kling AI’s most advanced video model — 1080p at 60 FPS, Omni Native Audio with multilingual dialogue and environmental soundscapes, and the ability to generate up to 6 distinct shots in a single 15-second output.

Resolution: 1080p
Frame rate: 60 FPS
Duration: Up to 15 seconds
Shots per generation: Up to 6

Kling 3.0 Pro is built on Kling AI’s Multi-modal Visual Language (MVL) architecture, which handles text, image, video, and audio inputs in a single model. It outputs 1080p video at 60 frames per second with Omni Native Audio, and it supports multi-shot storyboarding: up to 6 distinct shots generated from one prompt, each with its own duration, shot size, perspective, narrative, and camera movement.

Capabilities

1080p at 60 FPS

Generates 1080p video at 60 frames per second — smooth, high frame-rate output for cinematic and action-heavy content.

Omni Native Audio

Generates audio natively alongside the video: multilingual dialogue (including English, Japanese, Korean, and Spanish) plus environmental soundscapes.

Multi-shot storyboarding

Specify up to 6 shots in a single 15-second generation — each with its own duration, shot size, perspective, camera movement, and narrative.

MVL architecture

Multi-modal Visual Language architecture natively processes text, images, video, and audio as unified inputs for coherent multimodal output.

Up to 10 reference images

Accepts up to 10 reference images for subject appearance, style, and composition anchoring across a multi-shot sequence.

Complex action accuracy

Handles fast, intricate physical actions — martial arts, dance, sports — with consistent body mechanics and no ghosting artifacts.

Specifications

| Feature | Details |
| --- | --- |
| Developer | Kling AI (Kuaishou) |
| Base credits | 300 |
| Resolution | 1080p |
| Frame rate | 60 FPS |
| Duration | Up to 15 seconds |
| Shots per generation | Up to 6 |
| Audio | Omni Native Audio: dialogue, SFX, music |
| Languages | English, Japanese, Korean, Spanish + more |
| Max reference images | 10 |
| Architecture | Multi-modal Visual Language (MVL) |
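
The figures above imply some simple arithmetic that is useful when planning a storyboard. The sketch below is illustrative only: the constants are copied from the specifications, while the helper names `frame_count` and `average_shot_seconds` are hypothetical and not part of any ImagineArt or Kling API.

```python
# Limits copied from the specifications above.
FRAME_RATE_FPS = 60
MAX_DURATION_S = 15
MAX_SHOTS = 6


def frame_count(duration_s: float, fps: int = FRAME_RATE_FPS) -> int:
    """Number of frames in a clip of the given length (hypothetical helper)."""
    return round(duration_s * fps)


def average_shot_seconds(total_s: float = MAX_DURATION_S, shots: int = MAX_SHOTS) -> float:
    """Average length of each shot if `shots` shots share `total_s` seconds."""
    if shots < 1:
        raise ValueError("shots must be at least 1")
    return total_s / shots


print(frame_count(MAX_DURATION_S))          # 900 frames in a full 15-second clip
print(average_shot_seconds())               # 2.5 seconds per shot with 6 shots
print(frame_count(average_shot_seconds()))  # 150 frames per shot
```

In other words, a generation that uses all 6 shots leaves about 2.5 seconds per shot on average, so shot descriptions should stay short and focused.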

How to use

1. Open the AI Video Generator

Log in to ImagineArt and go to the AI Video Generator.

2. Select Kling 3.0 Pro

Choose Kling 3.0 Pro from the model dropdown.

3. Structure your prompt for multi-shot

For multi-shot output, label each shot explicitly with its duration and framing: “SHOT 1 (3s, wide, establishing): … SHOT 2 (2s, close-up): …” Kling 3.0 Pro interprets these cues to generate distinct cinematographic cuts.

4. Add reference images (optional)

Upload up to 10 reference images for character appearance, environment style, or composition guidance.

5. Include audio direction

Describe the audio landscape (dialogue lines, ambient environment, music style) within the prompt for Omni Native Audio.

6. Generate

Click Generate. Kling 3.0 Pro produces a 1080p, 60 FPS output with synchronized audio.
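
If you assemble prompts programmatically before pasting them into the generator, the shot structure from step 3 can be produced by a small helper. This is a minimal sketch, not an official tool: the `Shot` class, `build_multishot_prompt`, and the trailing `Audio:` label are assumptions made for illustration. Only the `SHOT n (Ns, framing):` pattern and the limits of 6 shots and 15 seconds come from this page.

```python
from dataclasses import dataclass

MAX_SHOTS = 6           # "Up to 6" shots per generation
MAX_TOTAL_SECONDS = 15  # "Up to 15 seconds" per generation


@dataclass
class Shot:
    """Hypothetical description of one shot in a storyboard."""
    seconds: int
    framing: str      # e.g. "wide, establishing" or "close-up"
    description: str  # action and camera movement


def build_multishot_prompt(shots: list[Shot], audio: str = "") -> str:
    """Join shots into one prompt string, checking the documented limits first."""
    if not 1 <= len(shots) <= MAX_SHOTS:
        raise ValueError(f"expected 1 to {MAX_SHOTS} shots, got {len(shots)}")
    total = sum(shot.seconds for shot in shots)
    if total > MAX_TOTAL_SECONDS:
        raise ValueError(f"shots total {total}s, limit is {MAX_TOTAL_SECONDS}s")
    parts = [
        f"SHOT {i} ({shot.seconds}s, {shot.framing}): {shot.description.strip()}"
        for i, shot in enumerate(shots, start=1)
    ]
    if audio:
        # The "Audio:" label is an assumption; the page only says to describe audio in the prompt.
        parts.append(f"Audio: {audio.strip()}")
    return " ".join(parts)


prompt = build_multishot_prompt(
    [
        Shot(3, "wide, establishing", "A harbor at sunrise, slow pan right."),
        Shot(2, "close-up", "A fisherman ties a knot."),
    ],
    audio="Gulls and gentle waves.",
)
print(prompt)
```

Validating the shot count and total duration before submitting avoids spending credits on a prompt that cannot fit the documented limits.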

Prompting tips

  • Structure shots explicitly — “SHOT 1: wide establishing exterior, 3 seconds, slow pan right. SHOT 2: medium close-up on protagonist, 2 seconds, static camera.” Kling 3.0 Pro follows cinematographic structure in prompts.
  • Specify language for dialogue — If your scene requires characters speaking a specific language, state it clearly: “The character speaks in Japanese with a formal tone.”
  • Reference images anchor identity — For character consistency across shots, upload a reference image and describe the character consistently in each shot description.
  • Use technical camera terms — “Shallow depth of field,” “Dutch angle,” “rack focus,” and “tracking shot” all meaningfully influence the cinematic output.

Example prompts

SHOT 1 (4s, wide, cinematic): A samurai stands at the edge of a misty forest at dawn. Slow pan left, revealing a village in the distance. Traditional Japanese ambient sounds. SHOT 2 (3s, close-up): The samurai’s hand grips a sword hilt. Rain begins to fall. SHOT 3 (3s, medium): The samurai turns and walks into the mist.

A professional basketball player dribbles through defenders and dunks. Wide angle, 60 FPS, 5 seconds. Arena crowd roaring in the background, sneakers squeaking on hardwood.
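
To confirm that a multi-shot prompt fits the 6-shot, 15-second limits, the shot headers can be read back out of the text. The sketch below is one possible check, assuming headers follow the `SHOT n (Ns, framing):` pattern used in the first example. The function name `summarize_shots` is hypothetical, and the shot descriptions are shortened here for brevity.

```python
import re

# Matches headers such as "SHOT 2 (3s, close-up):" and captures number, seconds, framing.
SHOT_PATTERN = re.compile(r"SHOT\s+(\d+)\s*\((\d+)s,\s*([^)]*)\):")


def summarize_shots(prompt: str) -> list[tuple[int, int, str]]:
    """Return (shot number, seconds, framing) for every shot header found."""
    return [
        (int(number), int(seconds), framing.strip())
        for number, seconds, framing in SHOT_PATTERN.findall(prompt)
    ]


# Headers copied from the samurai example; descriptions abbreviated.
example = (
    "SHOT 1 (4s, wide, cinematic): A samurai stands at the edge of a misty forest. "
    "SHOT 2 (3s, close-up): A hand grips a sword hilt. "
    "SHOT 3 (3s, medium): The samurai walks into the mist."
)

shots = summarize_shots(example)
total_seconds = sum(seconds for _, seconds, _ in shots)
print(len(shots), total_seconds)  # 3 10
```

The samurai example therefore uses 3 of the 6 available shots and 10 of the 15 available seconds.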

Compare models

| Model | Resolution | FPS | Audio | Shots | Best for |
| --- | --- | --- | --- | --- | --- |
| Kling 3.0 Pro | 1080p | 60 | Omni Native | Up to 6 | Multi-shot storytelling, 60 FPS |
| Kling O3 | 4K | 60 | Yes | Up to 6 | Advanced physics, 6 generation modes |
| Kling 2.6 Pro | 1080p | 48 | Lip-sync | | Audio-synced content, fast motion |
| Kling 2.5 Pro | 1080p | | No | | Cost-efficient HD production |

Kling 3.0 Pro is the right choice when you need structured multi-shot storytelling at 60 FPS with native audio in a single generation. For 4K output, compare with Kling O3.
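
The comparison can also be expressed as a small lookup for shortlisting models by requirement. This is an illustrative sketch rather than an ImagineArt feature: the `Model` tuple and `candidates` function are hypothetical, the values are transcribed from the comparison table, and cells the table leaves blank are stored as `None` and treated as "unknown", which excludes that model whenever the corresponding requirement is set.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    resolution: str
    fps: Optional[int]        # None where the comparison table has no value
    audio: str
    max_shots: Optional[int]  # None where the comparison table has no value


MODELS = [
    Model("Kling 3.0 Pro", "1080p", 60, "Omni Native", 6),
    Model("Kling O3", "4K", 60, "Yes", 6),
    Model("Kling 2.6 Pro", "1080p", 48, "Lip-sync", None),
    Model("Kling 2.5 Pro", "1080p", None, "No", None),
]


def candidates(need_4k: bool = False, need_multishot: bool = False, min_fps: int = 0) -> list[str]:
    """Names of models that satisfy every requested requirement."""
    result = []
    for model in MODELS:
        if need_4k and model.resolution != "4K":
            continue
        if need_multishot and (model.max_shots is None or model.max_shots < 2):
            continue
        if min_fps and (model.fps is None or model.fps < min_fps):
            continue
        result.append(model.name)
    return result


print(candidates(need_multishot=True, min_fps=60))  # ['Kling 3.0 Pro', 'Kling O3']
print(candidates(need_4k=True))                     # ['Kling O3']
```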