File | Last commit | Last updated
!2246 refactor: context parallel refactor Merge pull request !2246 from YE ZHENYUAN/master | 1 year ago
[Feature] Support full attention via Ring Attention under TND layout. Co-authored-by: zs-derrick1<1434012475@qq.com> # message auto-generated for no-merge-commit merge: !3021 merge master into master [Feature] Support full attention via Ring Attention under TND layout. Created-by: zs-derrick1 Commit-by: zs-derrick1 Merged-by: ascend-robot Description: [Feature] Support full attention via Ring Attention under TND layout. Add the new AttentionWithCpTNDGeneral and TNDGeneralAttentionStrategy to support Ring Attention for the TND layout in the General scenario. Verification results: https://wiki.huawei.com/domains/149956/wiki/300791/WIKI202511048896138?title=_c3cd4b74 See merge request: Ascend/MindSpeed!3021 | 6 months ago
fix(torch/cp): use sbnd format before all2all Co-authored-by: clc2025<chenlucong@huawei.com> # message auto-generated for no-merge-commit merge: !3282 merge fixbug_ulysses_tnd into master fix(torch/cp): use sbnd format before all2all Created-by: clc2025 Commit-by: clc2025 Merged-by: ascend-robot Description: fixbug for ulysses tnd See merge request: Ascend/MindSpeed!3282 | 3 months ago
!2696 verl cp adaptor and cp fix Merge pull request !2696 from YE ZHENYUAN/master | 9 months ago
fix: bugfix for mla in ulysses Co-authored-by: wuweiqiang24<wuweiqiang11@huawei.com> # message auto-generated for no-merge-commit merge: !3170 merge bugfix_ulysses_tnd into master fix: bugfix for mla in ulysses Created-by: wuweiqiang24 Commit-by: wuweiqiang24 Merged-by: ascend-robot Description: 1. Adapt to the Ulysses+TND causal scenario. 2. Fix an MLA-CP bug in the Ulysses+TND scenario. See merge request: Ascend/MindSpeed!3170 | 4 months ago
fix(torch/cp): use sbnd format before all2all Co-authored-by: clc2025<chenlucong@huawei.com> # message auto-generated for no-merge-commit merge: !3282 merge fixbug_ulysses_tnd into master fix(torch/cp): use sbnd format before all2all Created-by: clc2025 Commit-by: clc2025 Merged-by: ascend-robot Description: fixbug for ulysses tnd See merge request: Ascend/MindSpeed!3282 | 3 months ago
fix: bugfix for mla in ulysses Co-authored-by: wuweiqiang24<wuweiqiang11@huawei.com> # message auto-generated for no-merge-commit merge: !3170 merge bugfix_ulysses_tnd into master fix: bugfix for mla in ulysses Created-by: wuweiqiang24 Commit-by: wuweiqiang24 Merged-by: ascend-robot Description: 1. Adapt to the Ulysses+TND causal scenario. 2. Fix an MLA-CP bug in the Ulysses+TND scenario. See merge request: Ascend/MindSpeed!3170 | 4 months ago
!2342 core_r0.12.0 add cp v2 ut & bugfix Merge pull request !2342 from YE ZHENYUAN/master | 11 months ago
feat: kvallgather_cp_algo Co-authored-by: wuweiqiang24<wuweiqiang11@huawei.com> # message auto-generated for no-merge-commit merge: !3104 merge add_allgather_cp into master feat: kvallgather_cp_algo Created-by: wuweiqiang24 Commit-by: wuweiqiang24 Merged-by: ascend-robot Description: 1. Add the standardized CPStrategyFactory and BaseCPStrategy interfaces in TE. 2. Add the KVAllGatherCPStrategy strategy. 3. Add the KVAllGatherCP README. 4. Add KVAllGatherCP unit tests. Accuracy has been aligned and performance tests were run: https://wiki.huawei.com/domains/137239/wiki/268925/WIKI202512179461897 See merge request: Ascend/MindSpeed!3104 | 4 months ago
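[bugfix!!!]7 bugfix&eod update Co-authored-by: EX_mitsu<yangjie409@h-partners.com> # message auto-generated for no-merge-commit merge: !3070 merge master into master [bugfix!!!]7 bugfix&eod update Created-by: EX_mitsuX Commit-by: EX_mitsu Merged-by: ascend-robot Description: 1. Fix the newly added synchronization being enabled incorrectly when the permute fused operator is turned on. 2. Fix the memory of the prob intermediate variable not being released correctly in the overlap feature. 3. Optimize performance in EoD acceleration mode by removing redundant CPU operations; EoD acceleration mode performance improves by roughly 2% or more. 4. Fix incorrect positional-encoding computation under TND (THD) in EoD mode. 5. Sync the README description; add a check that rejects --variable-seq-lengths in EoD acceleration mode, since otherwise the FA operator raises an error in the forward pass in this scenario; and reorganize the EoD validation. 6. Rename some variables and functions to better reflect their roles, and reduce redundant get_args usage. 7. Fix an error raised when ETP and shared experts are both enabled under alltoall overlap. 8. Fix a multi-stream synchronization issue caused by enabling shared experts with fbov. 9. Fix the permute fused operator not taking effect in the recomputation part when zero_memory is enabled in alltoall_ov and fb_ov. See merge request: Ascend/MindSpeed!3070 | 5 months ago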