| Adapt parameter "device_id" for init_process_group
Co-authored-by: l30044004<louyujing@huawei.com>
# message auto-generated for no-merge-commit merge:
!27487 merge master into master
Adapt parameter "device_id" for init_process_group
Created-by: louyujing
Commit-by: l30044004
Merged-by: ascend-robot
Description:
**What type of PR is this?**
> Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:
>
> /kind bug
> /kind task
> /kind feature
**What does this PR do / why do we need it**:
**Which issue(s) this PR fixes**:
Fixes #
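torch.distributed.init_process_group accepts a device_id parameter. If the device_id passed in contains cuda, a later call to torch.distributed.barrier() retrieves that cuda device_id from the process group, which causes an error when running on NPU.
Therefore, this PR adds a decorator to torch.distributed.init_process_group that replaces cuda with npu in the device_id parameter.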
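The decorator approach described above can be illustrated with a minimal sketch. This is not the code from the merge request: `replace_cuda_with_npu`, `adapt_device_id`, and `fake_init_process_group` are hypothetical names, the stand-in function replaces the real `torch.distributed.init_process_group` so the example runs without torch, and only string and keyword-argument forms of `device_id` are handled. A real patch would presumably also need to rebuild a `torch.device` object rather than a string.

```python
import functools


def replace_cuda_with_npu(device_id):
    """Swap 'cuda' for 'npu' in a string device spec (hypothetical helper).

    Non-string values are returned unchanged in this sketch; handling
    torch.device objects is left out to keep the example torch-free.
    """
    if isinstance(device_id, str) and "cuda" in device_id:
        return device_id.replace("cuda", "npu")
    return device_id


def adapt_device_id(fn):
    """Decorator that rewrites the `device_id` keyword before calling fn."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if kwargs.get("device_id") is not None:
            kwargs["device_id"] = replace_cuda_with_npu(kwargs["device_id"])
        return fn(*args, **kwargs)

    return wrapper


# Stand-in for torch.distributed.init_process_group (assumption: the real
# function is wrapped the same way, with device_id passed by keyword).
@adapt_device_id
def fake_init_process_group(backend=None, device_id=None):
    return {"backend": backend, "device_id": device_id}


result = fake_init_process_group(backend="hccl", device_id="cuda:0")
print(result["device_id"])
```

With the wrapper in place, the process group never stores a cuda device, so a later barrier call would not pick one up.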
**Special notes for your reviewers**:
See merge request: Ascend/pytorch!27487 | 5 months ago |