MAPPO for PyTorch

Overview
Preparing the training environment
Starting training
Training results
Release notes

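Overview

Summary

Multi-Agent Proximal Policy Optimization (MAPPO) is a policy gradient algorithm for multi-agent settings. It builds on the existing Proximal Policy Optimization (PPO) algorithm and leaves the algorithm architecture unchanged. Through hyperparameter tuning alone, it achieves performance comparable to most off-policy algorithms in multi-agent environments.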
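As background on the objective that MAPPO inherits from PPO, the sketch below evaluates the standard PPO clipped surrogate term. It illustrates the published PPO formulation and is not code taken from this repository; the names `clipped_surrogate` and `mean_surrogate` and the default `clip_eps=0.2` are assumptions made for the example.

```python
def clipped_surrogate(ratio: float, advantage: float, clip_eps: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate objective (to be maximized).

    ratio     -- pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage -- advantage estimate for the sampled action
    clip_eps  -- clipping range (hypothetical default, chosen for illustration)
    """
    # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Keep the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


def mean_surrogate(ratios, advantages, clip_eps=0.2):
    """Average the clipped surrogate over a batch of samples."""
    terms = [clipped_surrogate(r, a, clip_eps) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)
```

With a positive advantage, pushing the ratio above `1 + clip_eps` earns no additional objective, which limits how far a single update can move the policy. With a negative advantage, the unclipped term is kept whenever it is the lower of the two.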

Reference implementation:

url=https://github.com/marlbenchmark/on-policy
commit_id=b21e0f743bd4516086825318452bb6927a33538d

Implementation adapted for Ascend AI processors:

url=https://gitcode.com/ascend/ModelZoo-PyTorch.git
code_path=PyTorch/built-in/rl/

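Preparing the training environment

Preparing the environment

We recommend using the latest versions to set up the training environment.

Table 1 Version compatibility

Software	Version	Installation guide
Driver	AscendHDK 25.0.RC1.1	Driver and Firmware Installation Guide
Firmware	AscendHDK 25.0.RC1.1	Driver and Firmware Installation Guide
CANN	CANN 8.1.RC1	CANN Software Installation Guide
PyTorch	2.1.0	Ascend Extension for PyTorch Configuration and Installation
torch_npu	release v7.0.0-pytorch2.1.0	Ascend Extension for PyTorch Configuration and Installation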
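Once the software stack is installed, it can help to confirm that the Ascend backend is usable from Python. The sketch below is a minimal check, assuming that importing `torch_npu` registers the `torch.npu` namespace as in Ascend Extension for PyTorch. The helper name `npu_status` is hypothetical and not part of this repository.

```python
def npu_status() -> str:
    """Report whether PyTorch can see an Ascend NPU (illustrative helper)."""
    try:
        import torch
        import torch_npu  # noqa: F401  (assumed to register torch.npu on import)
    except ImportError as exc:
        # Either torch or torch_npu is not installed in this environment.
        return f"unavailable: could not import {exc.name}"
    if torch.npu.is_available():
        return "available"
    return "installed, but no NPU device detected"


if __name__ == "__main__":
    print(npu_status())
```

The function returns a message instead of raising, so it can be run on a machine without the Ascend stack.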

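Third-party library dependencies are listed in the table below.

Table 2 Third-party library dependencies

Torch_Version	Third-party dependency versions
PyTorch 2.1	absl-py==1.4.0; gym==0.17.2; protobuf==3.20.0; wandb==0.10.5

Install the dependencies.

Run the following commands in the model root directory to install the dependencies required by the corresponding PyTorch version.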
```
pip install -r requirements.txt  
pip install -e .
```
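After installation, the pinned versions from Table 2 can be compared against what is actually installed. The following is a minimal sketch using only the standard library; the helper names (`parse_pins`, `installed_version`, `check_pins`) are hypothetical, and the pin string is copied from Table 2, not read from requirements.txt.

```python
from importlib import metadata

# Pins copied from Table 2 (PyTorch 2.1 row).
PINS = "absl-py==1.4.0; gym==0.17.2; protobuf==3.20.0; wandb==0.10.5"


def parse_pins(spec: str) -> dict:
    """Turn 'name==version; name==version' into {name: version}."""
    pins = {}
    for item in spec.split(";"):
        item = item.strip()
        if item:
            name, _, version = item.partition("==")
            pins[name.strip()] = version.strip()
    return pins


def installed_version(name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins: dict, lookup=installed_version) -> dict:
    """Map each package to 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif found == wanted:
            report[name] = "ok"
        else:
            report[name] = f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for package, status in check_pins(parse_pins(PINS)).items():
        print(f"{package}: {status}")
```

The `lookup` parameter exists only so the comparison logic can be exercised without touching the real environment.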

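Preparing the dataset

None.

Obtaining a pretrained model

None.

Starting training

Training the model

This document uses the MPE Comm scenario as an example to show how to train. For other scenarios, replace the launch script with the one for that scenario.

Change to the root directory of the extracted source package.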
```
cd /${model_directory_name}
```
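Run the training script.

The model supports single-node, single-card training.
- Single-node, single-card training
```
bash test/train_full_1p.sh  # single-card training
```
- Single-node, single-card performance
```
bash test/train_performance_1p.sh  # single-card performance
```
After training completes, the weight files are saved under onpolicy/scripts/results, and the model's training accuracy and performance information is printed.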
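To locate the saved weights after a run, a short script can walk the results directory. This is a sketch only: the `*.pt` pattern is an assumption about the weight file extension, and `find_weight_files` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def find_weight_files(results_dir="onpolicy/scripts/results", pattern="*.pt"):
    """Recursively list files matching `pattern` under `results_dir`, newest first.

    The default pattern is an assumption; adjust it to the actual file names.
    """
    root = Path(results_dir)
    if not root.is_dir():
        # No results yet, or the script was started from a different directory.
        return []
    files = [p for p in root.rglob(pattern) if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in find_weight_files():
        print(path)
```

Run it from the model root directory so that the relative path resolves.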

Training results

Table 3 Training results

NAME	FPS	MAX Training TimeSteps	Average Reward
1p-competitor V	1789	2000000	-15.9
1p-NPU	885	2000000	-15.9

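Note: The table above contains historical data and is for reference only. The performance data updated on May 10, 2025 is as follows:

NAME	Precision type	FPS
1p-competitor	FP16	1789
1p-Atlas 900 A2 PoDc	FP16	1791.33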
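The FPS figures can be turned into relative throughput and a rough training-time estimate. The sketch below assumes that FPS counts environment timesteps per second, that throughput is constant over the run, and that startup overhead is ignored; none of these assumptions is stated in the tables. The function names are hypothetical.

```python
def relative_throughput(fps: float, baseline_fps: float) -> float:
    """Throughput as a multiple of the baseline."""
    return fps / baseline_fps


def wall_clock_seconds(total_timesteps: int, fps: float) -> float:
    """Estimated run time, assuming FPS means timesteps per second."""
    return total_timesteps / fps


# FPS values copied from the two tables above.
ROWS = {
    "1p-competitor V (historical)": 1789.0,
    "1p-NPU (historical)": 885.0,
    "1p-Atlas 900 A2 PoDc (2025-05-10)": 1791.33,
}
TOTAL_TIMESTEPS = 2_000_000  # MAX Training TimeSteps from Table 3
BASELINE_FPS = 1789.0        # competitor figure used as the baseline

for name, fps in ROWS.items():
    ratio = relative_throughput(fps, BASELINE_FPS)
    minutes = wall_clock_seconds(TOTAL_TIMESTEPS, fps) / 60
    print(f"{name}: {ratio:.3f}x baseline, about {minutes:.1f} min")
```

Under these assumptions, the historical NPU figure is roughly half the competitor's throughput, while the updated Atlas 900 A2 PoDc figure is on par with it.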

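Public network address statement

For the public network addresses used in the code, see public_address_statement.md.

Release notes

Changes

2023.09.05: Initial release.

FAQ

1. The Atlas 350 accelerator card is not currently supported.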