
GFocalV2

This implements training of GFocalV2 on the COCO dataset, mainly modified from pytorch/examples.

GFocalV2 Detail

As of the current date, Ascend-Pytorch is still inefficient for contiguous operations. Therefore, the GFocalV2 model needs to be modified in the following aspects:

Converting tensors with dynamic shapes into tensors with fixed shapes. (This is the hardest one.)
Several operations, such as the sum of INT64 tensors, are not supported on the NPU, so we change the tensors' dtype where needed.
Framework bottlenecks lead to poor performance, so we modified the original code to improve the model's performance.
We use Apex for mmdetection due to the hardware defects of the NPU.
...
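
The first item, converting dynamic shapes into fixed shapes, is commonly handled by padding to a fixed size and carrying a validity mask. The following is a minimal sketch of that idea using plain Python lists rather than tensors. The helper name `pad_to_fixed` and the zero-box padding value are assumptions made for illustration; this is not the code used in this repository.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]


def pad_to_fixed(boxes: List[Box], max_num: int) -> Tuple[List[Box], List[bool]]:
    """Pad or truncate a variable-length box list to exactly max_num entries.

    Returns the fixed-length list and a mask marking which entries are real.
    """
    kept = boxes[:max_num]                      # truncate if there are too many
    num_pad = max_num - len(kept)               # how many dummy entries to add
    padded = kept + [(0.0, 0.0, 0.0, 0.0)] * num_pad
    mask = [True] * len(kept) + [False] * num_pad
    return padded, mask


# Two real boxes padded to a fixed length of four.
boxes = [(10.0, 20.0, 50.0, 80.0), (5.0, 5.0, 15.0, 25.0)]
padded, mask = pad_to_fixed(boxes, max_num=4)
```

Downstream code then works on a constant shape and uses the mask to ignore the padded entries.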

Requirements

Install the run package that matches the NPU
Python 3.7.5
PyTorch (NPU version)
apex (NPU version)
MMCV v1.2.7

Document and data preparation

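Download and compress the modelzoo\contrib\PyTorch\cv\image_object_detection\GFocalV2 folder
Extract the GFocalV2 archive on the NPU server
Download the COCO 2017 dataset
Place the COCO dataset under the GFocalV2/data directory, with the following directory structure: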

GFocalV2
├── configs
├── data
│   ├── coco
│       ├── annotations   796M
│       ├── train2017     19G
│       ├── val2017       788M
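
Before training, it can help to confirm that the layout above is in place. The following is a small sketch of such a check; the function name `missing_coco_dirs` is hypothetical and is not part of this repository.

```python
from pathlib import Path
from typing import List

# Sub-directories expected under <data_root>/coco, taken from the tree above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(data_root: str) -> List[str]:
    """Return the expected COCO sub-directories that are absent under data_root/coco."""
    coco = Path(data_root) / "coco"
    return [name for name in EXPECTED_SUBDIRS if not (coco / name).is_dir()]


# Example usage, run from the parent directory of GFocalV2:
# print(missing_coco_dirs("GFocalV2/data"))
```

An empty list means all three directories were found.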

Download and modify mmcv

In the parent directory of GFocalV2, download mmcv, preferably version 1.2.7 (the required version is 1.2.5 or above and below 1.3.0)

git clone -b v1.2.7 --depth=1 https://github.com/open-mmlab/mmcv.git
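
The version constraint can be checked programmatically. The sketch below assumes the range is inclusive of 1.2.5 and exclusive of 1.3.0, and that the version string is a plain dotted number such as `1.2.7` or `v1.2.7`; suffixes like `rc1` are not handled. Both function names are hypothetical.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.2.7' or 'v1.2.7' into (1, 2, 7)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def mmcv_version_ok(version: str) -> bool:
    """Check 1.2.5 <= version < 1.3.0 (bounds are an assumption, see above)."""
    return (1, 2, 5) <= parse_version(version) < (1, 3, 0)


# Example usage with the tag cloned above:
# mmcv_version_ok("v1.2.7")
```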

Enter the GFocalV2 directory and replace the corresponding files in mmcv with the files in mmcv_need. mmcv was cloned into the parent directory, so the target paths start with ../mmcv

cp -f ./mmcv_need/_functions.py ../mmcv/mmcv/parallel/
cp -f ./mmcv_need/builder.py ../mmcv/mmcv/runner/optimizer/
cp -f ./mmcv_need/distributed.py ../mmcv/mmcv/parallel/
cp -f ./mmcv_need/data_parallel.py ../mmcv/mmcv/parallel/
cp -f ./mmcv_need/dist_utils.py ../mmcv/mmcv/runner/
cp -f ./mmcv_need/optimizer.py ../mmcv/mmcv/runner/hooks/

The following three files are replaced so that FPS information is printed in the log

cp -f ./mmcv_need/iter_timer.py ../mmcv/mmcv/runner/hooks/
cp -f ./mmcv_need/base_runner.py ../mmcv/mmcv/runner/
cp -f ./mmcv_need/epoch_based_runner.py ../mmcv/mmcv/runner/
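
The nine copy commands can also be expressed as one mapping, which makes it easier to notice a missing source file or a wrong target directory. This is a sketch only: the function name `replace_mmcv_files` is hypothetical, and the mapping simply mirrors the `cp` commands in this section.

```python
import shutil
from pathlib import Path
from typing import List

# File name in mmcv_need -> target directory inside the mmcv clone.
REPLACEMENTS = {
    "_functions.py": "mmcv/parallel",
    "builder.py": "mmcv/runner/optimizer",
    "distributed.py": "mmcv/parallel",
    "data_parallel.py": "mmcv/parallel",
    "dist_utils.py": "mmcv/runner",
    "optimizer.py": "mmcv/runner/hooks",
    # The next three make the log print FPS information.
    "iter_timer.py": "mmcv/runner/hooks",
    "base_runner.py": "mmcv/runner",
    "epoch_based_runner.py": "mmcv/runner",
}


def replace_mmcv_files(patch_dir: str, mmcv_root: str) -> List[Path]:
    """Copy every patched file over its counterpart in the mmcv clone."""
    copied = []
    for name, rel_dir in REPLACEMENTS.items():
        src = Path(patch_dir) / name
        dst_dir = Path(mmcv_root) / rel_dir
        if not src.is_file():
            raise FileNotFoundError(f"patched file not found: {src}")
        if not dst_dir.is_dir():
            raise NotADirectoryError(f"target directory not found: {dst_dir}")
        copied.append(Path(shutil.copy(src, dst_dir / name)))
    return copied


# Example usage from inside the GFocalV2 directory, assuming mmcv was
# cloned into the parent directory:
# replace_mmcv_files("./mmcv_need", "../mmcv")
```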

Configure the environment

First, source the environment variables
source GFocalV2/test/env_npu.sh

Configure and install mmcv

cd mmcv
MMCV_WITH_OPS=1 pip3.7 install -e .
cd ..
pip3 list | grep mmcv  # check the installed version and path

Configure and install mmdet

cd GFocalV2
pip3 install -r requirements/build.txt
python3 setup.py develop
cd ..

Modify line 113 in apex, mainly to support O1. Reference path: root/archiconda3/envs/fcos/lib/python3.7/site-packages/apex/amp/utils.py

if cached_x.grad_fn.next_functions[1][0].variable is not x:

Change it to

if cached_x.grad_fn.next_functions[0][0].variable is not x:
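
If editing by hand is inconvenient, the same one-line change can be applied to the file's text. The sketch below operates on a string and refuses to change anything unless the old line appears exactly once, which guards against an apex version where the line differs. The function name `patch_utils_source` is hypothetical; reading and writing the actual `utils.py` (and backing it up first) is left to the caller.

```python
OLD_LINE = "if cached_x.grad_fn.next_functions[1][0].variable is not x:"
NEW_LINE = "if cached_x.grad_fn.next_functions[0][0].variable is not x:"


def patch_utils_source(source: str) -> str:
    """Return source with the single occurrence of OLD_LINE replaced by NEW_LINE."""
    count = source.count(OLD_LINE)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(OLD_LINE, NEW_LINE)
```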

Train MODEL

Enter the GFocalV2 directory

1p

Import the environment variables, make train_full_1p.sh executable, and run it

chmod +x ./test/train_full_1p.sh
bash ./test/train_full_1p.sh --data_path=./data/coco

8p

Import the environment variables, make train_full_8p.sh executable, and run it

bash ./test/train_full_8p.sh --data_path=./data/coco

Eval

Make train_eval_1p.sh executable and run it

chmod +x ./test/train_eval_1p.sh
bash ./test/train_eval_1p.sh --data_path=./data/coco

finetuning

Make train_finetune_1p.sh executable and run it

chmod +x ./test/train_finetune_1p.sh
bash ./test/train_finetune_1p.sh --data_path=./data/coco --checkpoint=xxx (optional: weight file of the gfocal model)

Demo

source ./test/env_npu.sh
python3 demo.py --checkpoint xxx (optional: weight file of the gfocal model, default ./work_dirs/gfocal_r50_fpn_1x/latest.pth) --img xxx (optional: test image)

GFocalV2 training result

Acc@1	FPS	Npu/Gpu_nums	Epochs	AMP_Type	Loss_Scale
23.2	14.6	1p Gpu	1	O1	128.0
41.0	71.3	8p Gpu	12	O1	128.0
23.5	3.46	1p Npu	1	O1	128.0
40.9	27.15	8p Npu	12	O1	128.0
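
To compare throughput between devices, the FPS columns can be turned into a ratio. The snippet below copies the four rows from the table above; the helper name `npu_to_gpu_fps_ratio` is hypothetical.

```python
# (accuracy, fps, device, epochs), copied from the results table.
RESULTS = [
    (23.2, 14.6, "1p Gpu", 1),
    (41.0, 71.3, "8p Gpu", 12),
    (23.5, 3.46, "1p Npu", 1),
    (40.9, 27.15, "8p Npu", 12),
]


def npu_to_gpu_fps_ratio(cards: str) -> float:
    """Return NPU FPS divided by GPU FPS for a card count such as '1p' or '8p'."""
    fps = {device: value for _, value, device, _ in RESULTS}
    return fps[f"{cards} Npu"] / fps[f"{cards} Gpu"]
```

With these numbers the NPU reaches roughly 0.24 of the GPU throughput at 1p and roughly 0.38 at 8p, while the accuracy figures are within 0.3 of each other.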

Public network address statement

For the public network addresses involved in the code, see public_address_statement.md