Quick Start

This page uses Wan2.1 as an example to show how to run text-to-video inference with MindIE SD. For more model-specific inference details, see Modelers - MindIE/Wan2.1.

Prerequisites

Before running inference, complete the environment preparation and install MindIE SD by following the Installation Guide.

Run inference

Clone the Wan2.1 model repository to any location and install its requirements. Then copy the inference script from the MindIE SD examples directory into the repository and run it. Set --model_base to the location of your weights, for example /home/{user}/Wan2.1-T2V-14B. Parameter details are documented in parameter_config.md.

git clone https://modelers.cn/MindIE/Wan2.1.git && cd Wan2.1
pip install -r requirements.txt

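# 8-card inference for Wan2.1-T2V-14B
# Copy the launch script from your MindIE-SD checkout (adjust the source path if the checkout is located elsewhere)
cp MindIE-SD/examples/wan/infer_t2v.sh ./
bash infer_t2v.sh --model_base="/home/{user}/Wan2.1-T2V-14B"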
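
A long multi-card run that fails because of a mistyped weight path wastes time, so a small pre-flight check can help. The following is a minimal sketch, not part of MindIE SD: check_model_base is a hypothetical helper, and the MODEL_BASE variable and its default location are assumptions made for illustration. It only verifies that the directory exists and is not empty; it does not validate the weights themselves.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by MindIE SD): succeeds only if the
# given directory exists and contains at least one entry.
check_model_base() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# MODEL_BASE and its default are assumptions; point it at your own weights,
# for example /home/{user}/Wan2.1-T2V-14B with {user} filled in.
MODEL_BASE="${MODEL_BASE:-$HOME/Wan2.1-T2V-14B}"

if check_model_base "$MODEL_BASE"; then
    echo "Weights found at $MODEL_BASE"
    bash infer_t2v.sh --model_base="$MODEL_BASE"
else
    echo "Weight directory missing or empty: $MODEL_BASE" >&2
fi
```

Run it from the directory that contains infer_t2v.sh, for example with MODEL_BASE=/path/to/weights set in the environment.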

Acceleration results

The following Wan2.1 example shows the effect of different acceleration features on an Atlas 800I A2 inference server (1*64G), including both single-card and multi-card runs.

Single-card acceleration

Cache acceleration

| Baseline | + Cache ratio 1.6 | + Cache ratio 2.0 | + Cache ratio 2.4 |
| --- | --- | --- | --- |
| 860.2s | 631.7s (1.36x) | 541.8s (1.59x) | 516.9s (*1.66x) |
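
The speedup figures above are consistent with dividing the baseline time by the accelerated time. As a minimal sketch for reproducing them from your own timings (speedup is a hypothetical helper name, and awk is assumed to be available):

```shell
#!/usr/bin/env bash
# speedup BASELINE_SECONDS ACCELERATED_SECONDS
# Prints baseline / accelerated, rounded to two decimals, with an "x" suffix.
speedup() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2fx\n", base / t }'
}

baseline=860.2   # single-card baseline time in seconds, from the results above
for t in 631.7 541.8 516.9; do
    echo "${t}s -> $(speedup "$baseline" "$t")"
done
```

With the times listed above, this prints 1.36x, 1.59x, and 1.66x, matching the reported values.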

Parallel strategy results

Two-card single-strategy results

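| Model | Cards | Parallel strategy | Output resolution | Operator optimization | Cache optimization | FA sparse | 50-step E2E time (s) | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wan2.1 | 2 | VAE | 832*480 | | | | 548.8 | 1.02x |
| Wan2.1 | 2 | TP | 832*480 | | | | 502.8 | 1.12x |
| Wan2.1 | 2 | CFG | 832*480 | | | | 332.6 | 1.69x |
| Wan2.1 | 2 | Ulysses | 832*480 | | | | 327.6 | *1.71x |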

Note: * marks the best acceleration result.

Multi-card combined-strategy results

| Model | Cards | Parallel strategy | Output resolution | Operator optimization | Cache optimization | FA sparse | 50-step E2E time (s) | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wan2.1 | 4 | TP=4, VAE | 832*480 | | | | 204.0 | 2.75x |
| Wan2.1 | 4 | CFG=2, TP=2, VAE | 832*480 | | | | 175.8 | 3.19x |
| Wan2.1 | 4 | Ulysses=4, VAE | 832*480 | | | | 151.1 | 3.71x |
| Wan2.1 | 4 | CFG=2, Ulysses=2, VAE | 832*480 | | | | 147.9 | *3.79x |
| Wan2.1 | 8 | TP=8, VAE | 832*480 | | | | 141.5 | 3.96x |
| Wan2.1 | 8 | CFG=2, TP=4, VAE | 832*480 | | | | 102.9 | 5.45x |
| Wan2.1 | 8 | Ulysses=8, VAE | 832*480 | | | | 78.1 | 7.18x |
| Wan2.1 | 8 | CFG=2, Ulysses=4, VAE | 832*480 | | | | 76.4 | *7.34x |

Note: * marks the best acceleration result for each card count.
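
In every combined-strategy row above, the numeric degrees in the Parallel strategy column multiply to the card count (for example CFG=2 and Ulysses=4 on 8 cards), while VAE carries no degree. This is an observation from the listed results, not a documented rule. A minimal sketch for sanity-checking a planned combination, where degree_product is a hypothetical helper and awk is assumed to be available:

```shell
#!/usr/bin/env bash
# degree_product "CFG=2, Ulysses=4, VAE"
# Multiplies every NAME=N degree in a comma-separated strategy string.
# Entries without "=N" (such as VAE) are skipped. Prints the product.
degree_product() {
    printf '%s\n' "$1" | awk -F',' '{
        product = 1
        for (i = 1; i <= NF; i++) {
            n = split($i, kv, "=")
            if (n == 2) product *= kv[2] + 0
        }
        print product
    }'
}

cards=8
strategy="CFG=2, Ulysses=4, VAE"
if [ "$(degree_product "$strategy")" -eq "$cards" ]; then
    echo "Strategy '$strategy' uses all $cards cards"
else
    echo "Strategy '$strategy' does not match $cards cards" >&2
fi
```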