VIDEO MODEL · by Alibaba · Wan family

Wan 2.2

Alibaba’s Mixture of Experts video model, featuring the Video Animation Control Engine (VACE) with camera trajectory controls, subject locking, and background stabilization, plus a few-shot LoRA pipeline for custom style adaptation from just 10–20 images. Native 1080p at 30 FPS.

Resolution: Native 1080p
Frame rate: 30 FPS
Architecture: MoE (~10B params)
Camera control: VACE engine

Advanced camera control and style adaptation

Wan 2.2 is built on a Mixture of Experts (MoE) diffusion architecture: approximately 10 billion parameters arranged as specialized experts for the high-noise and low-noise diffusion stages, making it more efficient, with higher-quality output, than the 14B monolithic architecture of Wan 2.1. The Video Animation Control Engine (VACE) is the central differentiating feature. It provides explicit camera trajectory inputs, subject locking (keeping a defined subject visually stable relative to camera movement), and background stabilization for controlled scene composition. Combined with a few-shot LoRA pipeline that adapts the model to a custom visual style using only 10–20 reference images, this makes Wan 2.2 the most configurable model in the Wan family. The underlying model is licensed under Apache 2.0 and is open for commercial use.

Capabilities

VACE camera controls

Explicit camera trajectory inputs — pans, zooms, focus pulls, and custom paths — for precise cinematographic control over the generated video.
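To make the idea of a custom path concrete, the sketch below samples a 180-degree orbit into one camera position per frame. The `orbit_path` helper and its parameters are hypothetical and only illustrate how a trajectory might be described as data; this page does not document the actual VACE input format.

```python
import math


def orbit_path(radius, sweep_degrees, num_frames, height=0.0):
    """Sample camera positions on a circular arc around a subject at the origin.

    Hypothetical helper: the real VACE trajectory format is not documented here.
    Returns a list of (x, y, z) tuples, one per frame.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames to define a path")
    positions = []
    for i in range(num_frames):
        # Fraction of the sweep completed at this frame, from 0.0 to 1.0.
        t = i / (num_frames - 1)
        angle = math.radians(sweep_degrees * t)
        positions.append((radius * math.cos(angle), height, radius * math.sin(angle)))
    return positions


# 5 seconds at 30 FPS gives 150 frames.
path = orbit_path(radius=2.0, sweep_degrees=180.0, num_frames=150)
```

The camera stays at a constant distance from the origin, which is one simple way to think about keeping a subject centered while the camera moves.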

Subject locking

Keep a defined subject visually stable relative to camera movement — useful for product showcase, character focus, and controlled scene composition.

Background stabilization

Stabilize the background while the subject moves, or vice versa — independent control over foreground and background motion.

Few-shot LoRA style adaptation

Adapt the model to a custom visual style using just 10–20 reference images via a few-shot LoRA pipeline — style consistency across generations.
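LoRA in general trains two small low-rank matrices per adapted layer instead of the full weight matrix, which is part of why it can work with few reference images. The sketch below compares trainable parameter counts for one layer. The layer sizes and rank are illustrative assumptions, not values taken from Wan 2.2.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix.

    Full fine-tuning updates all d_out * d_in entries. LoRA instead trains
    B (d_out x rank) and A (rank x d_in) and applies W + B @ A.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Illustrative sizes only; not taken from Wan 2.2.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=16)
ratio = lora / full  # fraction of the full matrix that LoRA actually trains
```

With these assumed sizes the adapter trains under 1% of the parameters that full fine-tuning of the same layer would.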

MoE architecture

~10B parameter Mixture of Experts model — more efficient than 14B monolithic architectures, with specialized processing for different noise levels.
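One way to picture "specialized processing for different noise levels" is a router that hands each denoising step to one of two experts depending on how noisy the current sample is. The sketch below is a toy illustration under that assumption; the function name, the expert labels, and the 0.5 boundary are hypothetical and not taken from the model.

```python
def pick_expert(noise_level, boundary=0.5):
    """Choose which expert handles a denoising step.

    noise_level: 1.0 is pure noise (start of sampling), 0.0 is a clean frame.
    boundary: hypothetical switch point; the real value is not given here.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0, 1]")
    return "high_noise_expert" if noise_level >= boundary else "low_noise_expert"


# Walk a 10-step schedule from noisy to clean and record the routing.
num_steps = 10
schedule = [1.0 - i / (num_steps - 1) for i in range(num_steps)]
routing = [pick_expert(level) for level in schedule]
```

Only one expert runs per step in this sketch, so the compute per step is that of a single expert rather than the whole parameter set.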

Native 1080p at 30 FPS

Full HD output at 30 frames per second without upscaling.

Specifications

| Feature | Details |
| --- | --- |
| Developer | Alibaba (Wan Video) |
| Architecture | Mixture of Experts (MoE), ~10B parameters |
| Resolution | Native 1080p |
| Frame rate | 30 FPS |
| Duration | Up to 5 seconds (T2V); multiple for I2V |
| Camera control | VACE: pans, zooms, focus pulls, custom paths |
| Style adaptation | Few-shot LoRA (10–20 images) |
| Audio | No native audio |
| License | Apache 2.0 |
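For a sense of scale, the resolution, frame rate, and maximum text-to-video duration listed above translate into a frame and pixel budget as follows. This assumes 1080p means the standard 16:9 frame of 1920×1080 pixels, which this page does not state explicitly.

```python
WIDTH, HEIGHT = 1920, 1080  # assumed 16:9 Full HD frame
FPS = 30
DURATION_SECONDS = 5        # maximum T2V duration from the specifications

frame_count = FPS * DURATION_SECONDS        # 150 frames
pixels_per_frame = WIDTH * HEIGHT           # 2,073,600 pixels
total_pixels = frame_count * pixels_per_frame  # 311,040,000 pixels per clip
```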

How to use

1. Open the AI Video Generator

Log into ImagineArt and go to the AI Video Generator.

2. Select Wan 2.2

Choose Wan 2.2 from the model dropdown.

3. Configure camera controls

Use VACE parameters to set your camera trajectory: specify pans, zoom direction, focus pull behavior, and any subject locking requirements.

4. Write your prompt

Describe the scene with detail on subjects, setting, style, and motion. Wan 2.2’s semantic understanding handles complex compositional instructions.

5. Generate

Click Generate for native 1080p output at 30 FPS.
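The choices made in the steps above (model, camera settings, prompt) can be thought of as one settings object. The sketch below is purely illustrative: `CameraSettings`, `GenerationRequest`, and every field name are hypothetical and do not correspond to a documented ImagineArt API. It only shows how the settings might be gathered and sanity-checked before generating.

```python
from dataclasses import dataclass, field

ALLOWED_PANS = {"none", "left", "right"}
ALLOWED_ZOOMS = {"none", "in", "out"}


@dataclass
class CameraSettings:
    """Hypothetical stand-in for the camera options chosen in step 3."""
    pan: str = "none"
    zoom: str = "none"
    focus_pull: bool = False
    lock_subject: bool = False


@dataclass
class GenerationRequest:
    """Hypothetical bundle of the choices made in steps 2 to 4."""
    prompt: str
    model: str = "Wan 2.2"
    camera: CameraSettings = field(default_factory=CameraSettings)

    def validate(self):
        # Catch obvious mistakes before spending a generation.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.camera.pan not in ALLOWED_PANS:
            raise ValueError(f"unknown pan direction: {self.camera.pan!r}")
        if self.camera.zoom not in ALLOWED_ZOOMS:
            raise ValueError(f"unknown zoom direction: {self.camera.zoom!r}")
        return self


request = GenerationRequest(
    prompt="A product bottle sits on a frosted glass surface.",
    camera=CameraSettings(pan="right", lock_subject=True),
).validate()
```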

Prompting tips

  • Be explicit with camera trajectory — “Camera slowly pans right while maintaining focus on the stationary subject” is processed as a VACE instruction, not just a suggestion.
  • Describe background vs. foreground separately — “Background blurs softly while the subject remains sharp and stationary in frame center” activates subject locking and background stabilization.
  • Use style references for LoRA — For brand content requiring a specific visual style, the few-shot LoRA pipeline can adapt the model to your reference aesthetic.
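The tips above suggest keeping the camera instruction, the foreground, and the background as separate, explicit sentences. A small helper can enforce that structure when prompts are assembled from parts. `build_prompt` is a hypothetical convenience function, not part of any ImagineArt or Wan tooling.

```python
def build_prompt(scene, camera, foreground=None, background=None, style=None):
    """Join prompt parts into separate sentences, skipping any that are empty."""
    sentences = []
    for part in (scene, camera, foreground, background, style):
        if not part:
            continue
        # Normalize trailing punctuation so each part ends with one period.
        text = part.strip().rstrip(".")
        if text:
            sentences.append(text + ".")
    return " ".join(sentences)


prompt = build_prompt(
    scene="A product bottle sits on a frosted glass surface",
    camera="Camera slowly pans right while maintaining focus on the stationary subject",
    background="Background blurs softly",
)
```

Keeping the camera clause as its own argument makes it harder to forget the explicit trajectory that the first tip recommends.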

Example prompts

A product bottle sits on a frosted glass surface. The camera orbits slowly around it 180 degrees, keeping the bottle perfectly centered. Studio lighting, clean white background. 5 seconds, 1080p.
A wide shot of a mountain valley. The camera slowly pushes forward through a field of wildflowers, foreground flowers in focus, mountains slightly blurred in background. Golden hour light. 5 seconds.

Compare models

| Model | Camera control | Style adapt | Audio | Architecture | Best for |
| --- | --- | --- | --- | --- | --- |
| Wan 2.2 | VACE explicit | LoRA (10–20 imgs) | No | MoE 10B | Camera-precise, brand style |
| Wan 2.5 | Prompt-based | No | Yes | | Audio-visual sync |
| Wan 2.6 | Prompt-based | No | Yes | | Character reference, audio |
| Kling 2.5 Pro | Prompt-based | No | No | | Fast, affordable 1080p |

Wan 2.2 is the best choice when precise camera control and brand-style consistency matter. The VACE system gives you a level of programmatic camera control not available in prompt-only models.