File | Last commit | Last updated
[fix]:solve the problem of chunk_bwd_dqkwg triton's time degradation.
Co-authored-by: LinShua<707894133@qq.com>
# message auto-generated for no-merge-commit merge:
!3379 merge 26.0.0_core_r0.12.1_dqkwg_triton_time into 26.0.0_core_r0.12.1
[fix]:solve the problem of chunk_bwd_dqkwg triton's time degradation.
Created-by: LinShua
Commit-by: LinShua
Merged-by: ascend-robot
Description:
What this PR does / why we need it?
Fixes the variable-length performance regression introduced by the fixed-length performance optimization of the chunk_bwd_dqkwg operator.
chunk_bwd_kernel_dqkwg timings:
Fixed-length: original logic 26158 us; after the optimization 18855 us
Variable-length: original logic 33450 us; after the optimization 89597 us
This PR: fixed-length 18769 us; variable-length 33509 us
Does this PR introduce any user-facing change? NA
How was this patch tested? See PR: [fix]:solve the UT of GDN triton.
See merge request: Ascend/MindSpeed!3379
Last updated: 1 month ago
Add MindSeed Lite
Co-authored-by: JialiZheng<jializheng@huawei.com>
# message auto-generated for no-merge-commit merge:
!2978 merge lite into master
Add MindSeed Lite
Created-by: JialiZheng1
Commit-by: JialiZheng
Merged-by: ascend-robot
Description: Add MindSeed Lite
See merge request: Ascend/MindSpeed!2978
Last updated: 6 months ago
feat:【FSDP backend】Change expert parallel to support combined experts weights.
Co-authored-by: JialiZheng<jializheng@huawei.com>
# message auto-generated for no-merge-commit merge:
!3147 merge master into master
feat:【FSDP backend】Change expert parallel to support combined experts weights.
Created-by: JialiZheng1
Commit-by: JialiZheng
Merged-by: ascend-robot
Description:
Change EP (expert parallel) to support combined expert weights.
Eager dispatcher accuracy verification: http://image.huawei.com/tiny-lts/v1/images/mdstorm/729781f03639b9a5e430e986c51fc6ee_665x772.png
Fused dispatcher accuracy verification: http://image.huawei.com/tiny-lts/v1/images/mdstorm/52a71f80ec498d1136da7bb83b3b585d_665x770.png
See merge request: Ascend/MindSpeed!3147
Last updated: 4 months ago
Add MindSeed Lite
Co-authored-by: JialiZheng<jializheng@huawei.com>
# message auto-generated for no-merge-commit merge:
!2978 merge lite into master
Add MindSeed Lite
Created-by: JialiZheng1
Commit-by: JialiZheng
Merged-by: ascend-robot
Description: Add MindSeed Lite
See merge request: Ascend/MindSpeed!2978
Last updated: 6 months ago