Create Amazing Videos with Wan AI 2.1

Wan AI Featured Examples

Wan AI Text Generation Demo 1

Prompt: A stunning young woman emerges from crystal-clear turquoise water in slow motion. Her long dark hair flows gracefully as water cascades down her face. Shot from a slightly low angle, capturing the moment her face breaks through the water's surface. Soft natural sunlight filters through the water, creating ethereal light rays and sparkles on the water droplets. Her expression is serene and peaceful, with water droplets glistening on her skin. The background shows a blurred underwater scene with gentle ripples. Cinematic color grading with rich aqua tones, professional lighting, shot in 4K quality with high dynamic range. The movement is smooth and elegant, emphasizing the transition between underwater and surface.

Wan AI Text Generation Demo 2

Prompt: A wide cinematic shot from the audience's perspective capturing a vibrant hip-hop crew dominating the stage. Comprising five dancers with urban streetwear, confident expressions, and synchronized movements, they perform under dynamic stage lighting with side-angled beams.

Wan AI Text Generation Demo 3

Prompt: In the Baroque style European palace, sparkling crystal chandeliers cast a soft glow, illuminating a pair of dancing lovers in the center. The man is dressed in a black tailcoat, paired with a snow-white shirt and bow tie, showcasing a gentlemanly demeanor; The lady wore a floor length long skirt, with delicate lace embellishments at the hem, creating a light and elegant look. They embraced tightly, their arms elegantly intertwined, spinning and jumping with the rhythm of the waltz, each step interpreting romance and passion. The audience around cast envious glances, and the air was filled with the atmosphere of nobility and the sweetness of love. Mid shot, using stable follow-up shooting to capture every moment of rotation.

Wan AI 2.1 leverages cutting-edge architectures:

Wan-VAE: Advanced 3D Causal VAE

Novel 3D causal VAE architecture specifically designed for video generation

• Unlimited-length 1080P video processing
• Temporal information preservation
• Efficient spatio-temporal compression
• Reduced memory footprint
• Superior performance efficiency
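
To make the word "causal" concrete, here is a minimal sketch of a causal convolution along the time axis only. It is a toy written for this page, not the Wan-VAE implementation: frames are reduced to single numbers, and the function names (causal_temporal_conv, stream_in_chunks) are hypothetical. The point is that each output frame depends only on the current and earlier frames, so a long video can in principle be processed chunk by chunk with a small cache instead of being held in memory all at once.

```python
def causal_temporal_conv(frames, kernel):
    """Convolve along time so that output[t] depends only on frames[0..t].

    frames: list of per-frame scalar features (a stand-in for latent frames).
    kernel: list of weights; kernel[-1] multiplies the current frame and
            kernel[0] the oldest frame in the window.
    """
    k = len(kernel)
    # Pad only on the "past" side; a non-causal conv would pad both sides.
    padded = [0.0] * (k - 1) + list(frames)
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(frames))
    ]


def stream_in_chunks(frames, kernel, chunk_size):
    """Same result as causal_temporal_conv, computed one chunk at a time."""
    k = len(kernel)
    cache = [0.0] * (k - 1)  # the last k-1 frames seen so far
    out = []
    for start in range(0, len(frames), chunk_size):
        chunk = list(frames[start:start + chunk_size])
        window = cache + chunk
        out.extend(
            sum(w * window[t + i] for i, w in enumerate(kernel))
            for t in range(len(chunk))
        )
        cache = window[len(window) - (k - 1):]
    return out


frames = [1.0, 2.0, 3.0, 4.0]
kernel = [0.5, 0.5]  # average of the previous and the current frame

out = causal_temporal_conv(frames, kernel)

# Changing a future frame must not change earlier outputs.
edited = frames[:3] + [100.0]
out_edited = causal_temporal_conv(edited, kernel)

# Chunked processing with a cache matches the full pass.
out_streamed = stream_in_chunks(frames, kernel, chunk_size=2)
```

Only the last k-1 frames have to be kept between chunks in this toy, which is one way a causal design can keep memory use independent of video length.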

Video Diffusion DiT

Flow Matching framework with Diffusion Transformers
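
The page does not say which Flow Matching variant is used, so the following is only a sketch under a common assumption: a straight-line path between a noise sample and a data sample, with the model trained to predict the constant velocity along that path. All function names are hypothetical, and an "oracle" velocity stands in for the Diffusion Transformer that would normally make the prediction.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Regression target for the network under the linear-path assumption."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(x0, velocity_fn, steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4)]
data = [1.0, -2.0, 0.5, 3.0]

# In training, a model would be fit so that model(interpolate(noise, data, t), t)
# approximates target_velocity(noise, data). Here the exact target is used
# directly as a stand-in for a trained model.
v_star = target_velocity(noise, data)


def oracle(x, t):
    return v_star


sample = euler_sample(noise, oracle, steps=8)
```

With a perfect velocity prediction, the Euler steps carry the noise sample onto the data sample; a real model only approximates this, which is why the number of sampling steps affects quality.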

Model       Dimension   Heads   Layers
Wan 1.3B    1536        12      30
Wan 14B     5120        40      40
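
As a quick sanity check on the figures above, the per-head width of the attention layers follows from dividing the model dimension by the number of heads. The sketch below assumes that "1536D" and "5120D" are the transformer hidden dimensions and that heads split that dimension evenly, which is the usual convention but is not stated on this page; the dictionary layout and function name are illustrative only.

```python
# Figures copied from the model listing above; the key names are hypothetical.
MODELS = {
    "Wan 1.3B": {"dim": 1536, "heads": 12, "layers": 30},
    "Wan 14B": {"dim": 5120, "heads": 40, "layers": 40},
}


def head_dim(cfg):
    """Width of one attention head, assuming heads split the dimension evenly."""
    dim, heads = cfg["dim"], cfg["heads"]
    if dim % heads != 0:
        raise ValueError("dimension is not divisible by the number of heads")
    return dim // heads


head_dims = {name: head_dim(cfg) for name, cfg in MODELS.items()}
for name, width in head_dims.items():
    print(f"{name}: {width} dimensions per head")
```

Under these assumptions both models use the same head width, so the larger model scales by adding heads and layers rather than by widening each head.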

Create Amazing Videos with Wan AI 2.1

Wan AI 2.1 is a state-of-the-art open-source video generation platform that turns your photos and text prompts into video. It is built to deliver professional-quality results on consumer-grade GPUs.

Consumer-Friendly

Generate high-quality videos on consumer GPUs: the T2V-1.3B model needs only 8.19 GB of VRAM. A 5-second 480P video takes about 4 minutes on an RTX 4090.

SOTA Performance

Experience industry-leading video quality that outperforms existing open-source models and commercial solutions across multiple benchmarks.

Multi-Task Support

One model family covering Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio generation.

Visual Text Generation

The first video model capable of rendering both Chinese and English text inside generated videos, which makes it more useful for practical applications.

Wan AI Advanced Features

Powerful Video VAE

Wan-VAE can encode and decode 1080P videos of unlimited length while preserving temporal information, which makes it well suited as a foundation for video generation.

Resolution Options

Support for both 480P and 720P video generation, with optimized performance for different GPU configurations.

Open Source

Fully open-source implementation with comprehensive documentation, supporting community development and innovation.

Wan AI Technical Specifications

Wan 2.1 leverages cutting-edge AI architectures and optimizations:

• Advanced Flow Matching framework with Diffusion Transformers
• Novel 3D causal VAE architecture for superior video compression
• Optimized for both single and multi-GPU inference
• Supports T2V-14B, T2V-1.3B, and I2V-14B models