SOLOv2
This implements training of SOLOv2 on the Coco dataset, mainly modified from pytorch/examples.
SOLOv2 Detail
As of the current date, Ascend-Pytorch is still inefficient for contiguous operations. Therefore, SOLOv2 model need to be modified in the following aspects:
- Converting tensors with the dynamic shapes into tensors with fixed shapes. (This is the hardest one)
- Several operations, like the sum of
INT64, are not supported on the NPU, so we modified tensors'dtypewhen needed - Framework bottlenecks lead to poor performance, so we improve the original code to improve the performance of the model
- We used Apex for mmdtection due to the hardware defects of the NPU
- ...
Requirements
- NPU配套的run包安装
- Python 3.7.5
- PyTorch(NPU版本)
- apex(NPU版本)
- MMCV v0.2.16
Document and data preparation
- 下载压缩modelzoo\contrib\PyTorch\cv\instance_segmentation\SOLOv2 文件夹
- 于npu服务器解压SOLOv2压缩包
- 下载coco2017数据集
- 下载预训练模型,下载地址:https://download.pytorch.org/models/resnet50-19c8e357.pth,下载完成后重命名为resnet50.pth
- 将coco数据集放于SOLOv2/data目录下,目录结构如下:
GFocalV2
├── configs
├── data
│ ├── coco
│ ├── annotations 796M
│ ├── train2017 19G
│ ├── val2017 788M
│ ├── pretrained
│ ├── resnet50.pth 98M
Configure the environment
进入SOLOv2目录,source环境变量
cd SOLOv2
source test/env_npu.sh
- 配置安装mmcv
cd mmcv
python3 setup.py build_ext
python3 setup.py develop
cd ..
pip list | grep mmcv # 查看版本和路径
- 配置安装mmdet
pip install -r requirements/build.txt
pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"
pip install -v -e .
Train MODEL
进入SOLOv2目录下
1p
导入环境变量,修改train_full_1p.sh权限并运行
chmod +x ./test/train_full_1p.sh
bash ./test/train_full_1p.sh --data_path=./data/coco
8p
导入环境变量,修改train_full_8p.sh权限并运行
chmod +x ./test/train_full_8p.sh
bash ./test/train_full_8p.sh --data_path=./data/coco
Eval
修改train_eval_1p.sh权限并运行
chmod +x ./test/train_eval_1p.sh
bash ./test/train_eval_1p.sh --data_path=./data/coco
finetuning
修改train_finetune_1p.sh权限并运行
chmod +x ./test/train_eval_1p.sh
bash ./test/train_finetune_1p.sh --data_path=./data/coco
SOLOv2 training result
| Acc@1 | FPS | Npu/Gpu_nums | Epochs | AMP_Type | Loss_Scale |
|---|---|---|---|---|---|
| 18.4 | 8.1 | 1p Gpu | 1 | O1 | 128.0 |
| 34.8 | 38.3 | 8p Gpu | 12 | O1 | 128.0 |
| 18.4 | 1.1 | 1p Npu | 1 | O1 | 128.0 |
| 34.3 | 6.2 | 8p Npu | 12 | O1 | 128.0 |