Feature List

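This manual describes the features of MindSpeed Core, covering each feature's commercial release status and whether it is supported on the PyTorch framework.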

Note

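In the table below, the "Released" column indicates whether the feature has shipped in a commercial release; "✅" means supported and "❌" means not supported.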

Table 1 Feature list

Feature type	Feature name	Released	PyTorch framework support
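Megatron features	Megatron data parallelism	✅	✅
	Megatron tensor parallelism	✅	✅
	Megatron pipeline parallelism	✅	✅
	Megatron virtual pipeline parallelism	✅	✅
	Megatron distributed optimizer	✅	✅
	Megatron sequence parallelism	✅	✅
	Megatron asynchronous DDP	✅	✅
	Megatron weight-update communication hiding	✅	✅
	Megatron recomputation	✅	✅
	Megatron distributed weights	✅	✅
	Megatron fully sharded parallelism	✅	✅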
	Megatron Transformer Engine	✅	✅
	Megatron Multi-head Latent Attention	✅	✅
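Parallelism strategy features	Ascend Ulysses long-sequence parallelism	✅	✅
	Ascend Ring Attention long-sequence parallelism	✅	✅
	Ascend Double Ring Attention long-sequence parallelism	✅	✅
	Ascend hybrid long-sequence parallelism	✅	✅
	Ascend custom no-op layers	✅	✅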
	Ascend DualPipeV	✅	✅
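Memory optimization features	Ascend activation function recomputation	✅	✅
	Ascend independent pipeline scheduling for recomputation	✅	✅
	Ascend Mask unification	✅	✅
	Ascend BF16 parameter copy reuse	✅	✅
	Ascend swap_attention	✅	✅
	Ascend Norm recomputation	✅	✅
	Ascend adaptive Hccl Buffer	✅	✅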
	Ascend Swap Optimizer	✅	✅
	Virtual Optimizer	✅	✅
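Affinity computation features	Ascend rms_norm fused operator	✅	✅
	Ascend swiglu fused operator	✅	✅
	Ascend rotary_embedding fused operator	✅	✅
	Ascend flash attention	✅	✅
	Ascend Moe Token Permute and Unpermute fused operator	✅	✅
	Ascend npu_matmul_add_fp32 gradient accumulation fused operator	✅	✅
	Ascend computation-communication parallelism optimization	❌	✅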
	Ascend MC2	❌	✅
	Ascend fusion_attention_v2	❌	✅
Communication optimization features	Ascend Gloo checkpoint save-to-disk optimization	✅	✅
	Ascend high-dimensional tensor parallelism	✅	✅
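Mcore MoE features	Ascend Megatron MoE GMM	✅	✅
	Ascend Megatron MoE Allgather Dispatcher performance optimization	✅	✅
	Ascend Megatron MoE Alltoall Dispatcher performance optimization	✅	✅
	Ascend Megatron MoE TP extended to EP	✅	✅
	Megatron MoE alltoall dispatcher branch communication hiding optimization	❌	✅
	Megatron MoE allgather dispatcher branch communication hiding optimization	✅	✅
	Ascend shared experts	✅	✅
	1F1B Overlap	✅	✅
	Expert-parallel dynamic load balancing (data-parameter mutual search)	✅	✅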
Key scenario features	Ascend EOD Reset training scenario	✅	✅
	Ascend alibi	❌	✅
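Multimodal features	Ascend fused ema adamw optimizer	❌	✅
	Ascend PP support for dynamic shapes	✅	✅
	Ascend PP support for multi-parameter passing	✅	✅
	Ascend PP support for multi-parameter passing and dynamic shapes	✅	✅
	Ascend unaligned linear layers	✅	✅
	Ascend unaligned Ulysses long-sequence parallelism	✅	✅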
Other features	Ascend TFLOPS calculation	✅	✅
	Ascend Auto Settings automatic parallelism strategy search system	❌	✅
	Ascend deterministic computation	❌	✅
	Ascend MindStudio Training Tools accuracy comparison	❌	✅