File  Last commit  Last updated
!24702 Code security Merge pull request !24702 from SCh-zx/safem 8 months ago
[SHMEM] support npu shmem. Co-authored-by: wangchao430 <wangchao430@huawei.com>. Auto-generated message for no-merge-commit merge: !26027 merge v2.99.0_shmem1 into master. Created-by: wangchao430. Commit-by: wangchao430. Merged-by: ascend-robot. See merge request: Ascend/pytorch!26027 6 months ago
[SHMEM] support npu shmem. Co-authored-by: wangchao430 <wangchao430@huawei.com>. Auto-generated message for no-merge-commit merge: !26027 merge v2.99.0_shmem1 into master. Created-by: wangchao430. Commit-by: wangchao430. Merged-by: ascend-robot. See merge request: Ascend/pytorch!26027 6 months ago
check and log hccl error. Co-authored-by: zhaoyu <nanzhaogang@qq.com>. Auto-generated message for no-merge-commit merge: !26292 merge hccl_error into master. Created-by: zhaoyu65. Commit-by: zhaoyu. Merged-by: ascend-robot. /kind feature. Special notes for reviewers: print HCCL errors to the screen; PTA passes through the HCCL error code and error message. Code logic: first check whether an HCCL error occurred; if so, fetch the most recent error through the aclGetRecentErrMsg interface and write it to the log. See merge request: Ascend/pytorch!26292 6 months ago
check and log hccl error. Co-authored-by: zhaoyu <nanzhaogang@qq.com>. Auto-generated message for no-merge-commit merge: !26292 merge hccl_error into master. Created-by: zhaoyu65. Commit-by: zhaoyu. Merged-by: ascend-robot. /kind feature. Special notes for reviewers: print HCCL errors to the screen; PTA passes through the HCCL error code and error message. Code logic: first check whether an HCCL error occurred; if so, fetch the most recent error through the aclGetRecentErrMsg interface and write it to the log. See merge request: Ascend/pytorch!26292 6 months ago
[feat] pg register window memory. Co-authored-by: XDaoHong <xudaohong@huawei.com>. Auto-generated message for no-merge-commit merge: merge master into master. Created-by: XDaoHong. Commit-by: XDaoHong. Merged-by: ascend-robot. What this PR does: processgroup adds interfaces for registering and allocating window memory. See merge request: Ascend/pytorch!24847 7 months ago
[feat] pg register window memory. Co-authored-by: XDaoHong <xudaohong@huawei.com>. Auto-generated message for no-merge-commit merge: merge master into master. Created-by: XDaoHong. Commit-by: XDaoHong. Merged-by: ascend-robot. What this PR does: processgroup adds interfaces for registering and allocating window memory. See merge request: Ascend/pytorch!24847 7 months ago
!6511 Set compile_options -fvisibility=hidden Merge pull request !6511 from 姜怡文/master_TORCH_API 2 years ago
!19461 cleanCode Fix Merge pull request !19461 from 叶子凡/master_cleancode_0322 1 year ago
!17775 add interface for LCCL backend Merge pull request !17775 from 闫鹏全/master_LCCL 1 year ago
!19725 fix paralleltcpstore wait bug Merge pull request !19725 from 王超/v2.7.0_tcpstore 1 year ago
!19725 fix paralleltcpstore wait bug Merge pull request !19725 from 王超/v2.7.0_tcpstore 1 year ago
!22137 Added permission verification and prevents memory overflow Merge pull request !22137 from yuhaiyan/master-dev3 11 months ago
!21002 Parallel tcpstore support domain name Merge pull request !21002 from 王超/v2.7.0_storeup 1 year ago
!21462 TORCH MAIN SYNC : link fmt && clone in c10d/Store Merge pull request !21462 from dilililiwhy/main_sync_20250527 11 months ago
!21462 TORCH MAIN SYNC : link fmt && clone in c10d/Store Merge pull request !21462 from dilililiwhy/main_sync_20250527 11 months ago
make watchdog thread a class. Co-authored-by: wanghonghao <wanghonghao9@huawei.com>. Auto-generated message for no-merge-commit merge: !27136 merge pg_update into master. Created-by: wanghonghao. Commit-by: wanghonghao. Merged-by: ascend-robot. See merge request: Ascend/pytorch!27136 5 months ago
make watchdog thread a class. Co-authored-by: wanghonghao <wanghonghao9@huawei.com>. Auto-generated message for no-merge-commit merge: !27136 merge pg_update into master. Created-by: wanghonghao. Commit-by: wanghonghao. Merged-by: ascend-robot. See merge request: Ascend/pytorch!27136 5 months ago
add lock for workspaceallocator. Co-authored-by: huangyunlong2022 <huangyunlong4@h-partners.com>. Co-authored-by: zhaoyu65 <nanzhaogang@qq.com>. Auto-generated message for no-merge-commit merge: !26720 merge 2.10ts into master. Created-by: huangyunlong2022. Commit-by: zhaoyu65; huangyunlong2022. Merged-by: ascend-robot. What this PR does: 1. An environment variable controls whether each stream gets its own taskqueue; this is disabled by default. 2. When enabled, the taskqueue is initialized at enqueue time rather than at stream creation, which avoids starting too many taskqueue threads; initialization is protected by a lock so that multiple threads cannot initialize it more than once. 3. Initialization dispatches to the current stream by default. Compute operators are currently all dispatched to the current stream; communication operators are dispatched to the communication stream, so that stream is passed into enqueue and used for dispatch. 4. The queue is drained when a stream is fetched, and only the queue of that stream is drained, which avoids unnecessary drain time. 5. An event must currently be recorded before it is waited on. To preserve ordering with multiple taskqueues, a wait must ensure at enqueue time that the record has already been dispatched. (Checking the record count at the dequeue stage fails when events are reused: a record issued after the wait invalidates the count check and causes a hang.) 6. An event may be destroyed only after its record and wait have been dispatched; to avoid blocking, lazy destroy is currently used. 7. workspaceallocator is protected by a lock to prevent races among multiple taskqueues. See merge request: Ascend/pytorch!26720 6 months ago
!19093 cleancodeFix Merge pull request !19093 from 叶子凡/master_cleancode_0313 1 year ago
!21002 Parallel tcpstore support domain name Merge pull request !21002 from 王超/v2.7.0_storeup 1 year ago
!21002 Parallel tcpstore support domain name Merge pull request !21002 from 王超/v2.7.0_storeup 1 year ago
!22240 Remove code that doesn't have much effect Merge pull request !22240 from yuhaiyan/master-dev1 11 months ago
!16974 parallel tcpstore support tiered optimizer Merge pull request !16974 from wuxiaotong/cherry-pick-1734348942 1 year ago
!22218 JSON serialization for hccl status dump Merge pull request !22218 from huangyunlong/2.8js1 11 months ago
!18875 CleanCode Fix Merge pull request !18875 from 叶子凡/master_cleancode_0307 1 year ago
!17257 [distributed] Adjust space indentation to 4 spaces. Merge pull request !17257 from 王超/master_cleancode 1 year ago
fix code issue. Variables must be declared before use. Co-authored-by: liangsongwei <liangsongwei@huawei.com>. Auto-generated message for no-merge-commit merge: !26365 merge cherry-pick-mr-26289-1762593406721-auto into master. Created-by: liangsongwei. Commit-by: liangsongwei. Merged-by: ascend-robot. See merge request: Ascend/pytorch!26365 6 months ago
!19093 cleancodeFix Merge pull request !19093 from 叶子凡/master_cleancode_0313 1 year ago
!18875 CleanCode Fix Merge pull request !18875 from 叶子凡/master_cleancode_0307 1 year ago