File | Last commit | Last updated
perf(fp8): enhance te
Co-authored-by: Muu<koimuu@163.com>
# message auto-generated for no-merge-commit merge: !3064 merge feature_fix into master
perf(fp8): enhance te
Created-by: Muuyo
Commit-by: Muu
Merged-by: ascend-robot
Description:
1. Introduce low-precision recomputation
2. After mxfp8 mm, free the memory held by unused quant tensors and scales
3. Refactor te linear and abstract the dw flow
4. Extract the gmm op
4. Add gemm_gradient_accumulation_fusion to GMMFunction
5. Support the argument (--moe-router-dtype fp8) to enable low-precision computation for topK routing
6. Remove the extra transpose operation introduced in mxfp8 mm
7. GMM add uses high precision only
https://wiki.huawei.com/domains/76578/wiki/233229/WIKI202512189479523
See merge request: Ascend/MindSpeed!3064
Last updated: 5 months ago
feat(fp8): quant mm add
Co-authored-by: Muu<koimuu@163.com>
# message auto-generated for no-merge-commit merge: !3272 merge feature_quant_mm_add into master
feat(fp8): quant mm add
Created-by: Muuyo
Commit-by: Muu
Merged-by: ascend-robot
Description:
1. The mxfp8 strategy supports the npu_add_quant_matmul operator
2. tensorwise supports the mc2 strategy
3. Fix the failure to obtain the device of FP8Tensor2D during backward
Self-test report: https://wiki.huawei.com/domains/76578/wiki/233229/WIKI2026030210253506
See merge request: Ascend/MindSpeed!3272
Last updated: 2 months ago