pytorch/torch_npu/contrib · yanzi2021/pytorch - AtomGit

ascend-robot: Adapt parameter "device_id" for init_process_group

118b1c42 created on December 9, 2025 · Commit history

File	Last commit	Last updated
function	!22490 cleancode. Merge pull request !22490 from SCh-zx/cleancodem	10 months ago
module	wqbmm sync A5 (Co-authored-by: wangmingli<wangmingli6@h-partners.com>; merge request Ascend/pytorch!26928, created and committed by wangmingli, merged by ascend-robot)	6 months ago
__init__.py	!19783 Remove expose for deprecated apis. Merge pull request !19783 from 刘嘉巍/master	1 year ago
apis_config.json	!11855 [transfer_to_npu] Streamline API config file. Merge pull request !11855 from louyujing/master	2 years ago
transfer_to_npu.py	Adapt parameter "device_id" for init_process_group (Co-authored-by: l30044004<louyujing@huawei.com>; merge request Ascend/pytorch!27487, created by louyujing, committed by l30044004, merged by ascend-robot). torch.distributed.init_process_group accepts a device_id parameter. If the device_id passed in contains cuda, a later call to torch.distributed.barrier() retrieves that cuda device_id from the process_group, which causes an error when running on NPU. The change therefore adds a decorator to torch.distributed.init_process_group that replaces cuda with npu in the device_id parameter.	5 months ago
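
The decorator described in the transfer_to_npu.py commit message can be illustrated with a minimal, dependency-free sketch. This is not the code from the merge request: the names `_cuda_to_npu`, `wrap_device_id` and `fake_init_process_group` are hypothetical, the stand-in function only mimics the `device_id` keyword of `torch.distributed.init_process_group`, and device specs are modeled as plain strings rather than `torch.device` objects so the example runs without torch or torch_npu. It also only handles `device_id` passed as a keyword argument.

```python
import functools


def _cuda_to_npu(device_id):
    """Rewrite a string device spec such as "cuda:0" to "npu:0".

    Hypothetical helper: non-string values (e.g. None or an int index)
    are returned unchanged.
    """
    if isinstance(device_id, str) and "cuda" in device_id:
        return device_id.replace("cuda", "npu")
    return device_id


def wrap_device_id(fn):
    """Decorator that rewrites the `device_id` keyword before calling fn."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if kwargs.get("device_id") is not None:
            kwargs["device_id"] = _cuda_to_npu(kwargs["device_id"])
        return fn(*args, **kwargs)

    return wrapper


# Stand-in for torch.distributed.init_process_group (assumption: only the
# parameters relevant to this illustration are modeled).
@wrap_device_id
def fake_init_process_group(backend=None, device_id=None):
    return {"backend": backend, "device_id": device_id}


result = fake_init_process_group(backend="hccl", device_id="cuda:0")
print(result["device_id"])
```

Because the rewrite happens before the wrapped function stores the device, any later lookup of the device on the process group would see the npu spec instead of the cuda one, which is the failure mode the commit message describes for torch.distributed.barrier().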