
FrameworkPTAdapter 3.0.RC2 Release Notes

FrameworkPTAdapter 3.0.RC2

Before You Start

This framework is based on the open-source PyTorch 1.5.0 and 1.8.1 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, so code can be reused, and existing networks can be ported to NPUs by modifying only the device type or data type.
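As a rough illustration of what changing only the device type can look like, the sketch below keeps the device choice in one place so the rest of a training script stays untouched. The helper names `select_device` and `port_model` are hypothetical, and the `torch_npu` module name and `"npu:0"` device string are assumptions based on the plugin named in these notes; consult the adapter's own documentation for the exact usage.

```python
import importlib.util


def select_device(preferred="npu"):
    """Pick a device string for a training script.

    Assumption: the NPU plugin is importable as `torch_npu` and exposes
    devices as "npu:<index>". Falls back to "cpu" if it is not installed.
    """
    if preferred == "npu" and importlib.util.find_spec("torch_npu") is not None:
        return "npu:0"
    return "cpu"


def port_model(model, device):
    """Move a model to the chosen device.

    Works with any object that has a PyTorch-style `.to(device)` method,
    so the only value that changes when porting is the device string.
    """
    return model.to(device)


# Typical use in an existing script (requires PyTorch to be installed):
#   device = select_device()
#   model = port_model(MyNet(), device)   # MyNet is a placeholder model class
#   inputs = inputs.to(device)
```

Keeping the device string in a single variable means the same script can run on a CPU during development and on an NPU in training without further edits.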

The PyTorch 1.8.1 adaptation is delivered as a plugin and is completely decoupled from the native PyTorch framework. Its functions and performance are basically the same as those of the PyTorch 1.5.0 adaptation, and it provides a good development experience for backend operator adaptation. AOE-based tuning is supported.

New Features

Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
| --- | --- | --- |
| PyTorch 1.5.0 features adapted to NPUs | Basic framework functions | Fixed a few bugs. |
| PyTorch 1.8.1 features adapted to NPUs | Decoupled as plugins | Decoupled the torch_npu plugin from the native PyTorch framework completely. |
| | Performance optimization | Optimized the single-operator delivery and collective communication functions. |
| | API satisfaction | Added more APIs. (For details, see the API list.) |

Modified Features

N/A

Resolved Issues

N/A

Known Issues

| Known Issue | Description |
| --- | --- |
| Data type support | NPUs do not support the input or output of inf/nan data of the float16 type. |
| Data format | Dimensions cannot be reduced when a format larger than 4D is used. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. Only 1, 2, 4, or 8 processors can be allocated. Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom optimized CUDA kernels in Apex are not supported. |
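The collective communication restrictions above can be checked before a job is launched. The following is a minimal sketch and not part of the adapter: `check_collective_config` and its parameters are hypothetical names, and data types are passed as plain strings to keep the example free of dependencies.

```python
# Limits taken from the "Restrictions on collective communication" entry.
ALLOWED_PROCESSOR_COUNTS = (1, 2, 4, 8)
ALLOWED_DTYPES = ("int8", "int32", "float16", "float32")


def check_collective_config(num_processors, dtypes):
    """Return a list of problems found; an empty list means the config passes."""
    problems = []
    if num_processors not in ALLOWED_PROCESSOR_COUNTS:
        problems.append(
            f"{num_processors} processors requested; only 1, 2, 4, or 8 are supported"
        )
    for dtype in dtypes:
        if dtype not in ALLOWED_DTYPES:
            problems.append(
                f"dtype {dtype} is not supported for collective communication"
            )
    return problems


if __name__ == "__main__":
    # A valid configuration yields no problems.
    print(check_collective_config(8, ["float16", "float32"]))
    # An invalid processor count and an unsupported dtype yield two problems.
    print(check_collective_config(3, ["float64"]))
```

The requirement that every device executes the same graph cannot be verified this way and still has to be ensured by the training script itself.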

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e

FrameworkPTAdapter 3.0.RC1

Before You Start

This framework is based on the open-source PyTorch 1.5.0 and 1.8.1 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, so code can be reused, and existing networks can be ported to NPUs by modifying only the device type or data type.

The PyTorch 1.8.1 adaptation uses the plugin mode and inherits the features of the PyTorch 1.5.0 adaptation. The two provide basically the same functions, but PyTorch 1.8.1 offers a better development experience for backend operator adaptation. It also supports AOE-based tuning.

New Features

Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
| --- | --- | --- |
| PyTorch 1.5.0 features adapted to NPUs | AOE-based tuning | Supported the AOE auto tuning tool to improve model performance. |
| | Basic framework functions | Added support for adapted operator development. For details, see the API list. |
| PyTorch 1.8.1 features adapted to NPUs | Plugin decoupling | Moved the NPU-adapted code into the **torch_npu** plugin, decoupling the Ascend-adapted code from the native PyTorch code. |
| | AOE-based tuning | Supported the AOE auto tuning tool to improve model performance. |
| | Improvement of framework API satisfaction | Supported the APIs that have been adapted to Ascend PyTorch 1.5.0. For details, see the API list. |

Modified Features

N/A

Resolved Issues

N/A

Known Issues

| Known Issue | Symptom |
| --- | --- |
| Data type support | NPUs do not support the input or output of inf/nan data of the float16 type. |
| Data format | Dimensions cannot be reduced when a format larger than 4D is used. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. Only 1, 2, 4, or 8 processors can be allocated. Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom optimized CUDA kernels in Apex are not supported. |
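Because inf/nan values of the float16 type are not supported on NPUs, it can help to check values on the host before they are cast. The following is a minimal sketch, assuming plain Python floats as input. `FLOAT16_MAX` is the largest finite IEEE 754 half-precision value (65504), and `fits_float16` is a hypothetical helper name, not an adapter API.

```python
import math

FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_float16(values):
    """True if every value is finite and lies within the finite float16 range."""
    return all(math.isfinite(v) and abs(v) <= FLOAT16_MAX for v in values)


if __name__ == "__main__":
    print(fits_float16([0.5, -1024.0, 65504.0]))
    print(fits_float16([1.0, float("nan")]))
    print(fits_float16([70000.0]))
```

The check is deliberately conservative: it rejects anything above 65504 rather than modeling rounding behavior during the cast.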

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e

FrameworkPTAdapter 2.0.4

Before You Start

This framework is based on the open-source PyTorch 1.5.0 and 1.8.1 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, so code can be reused, and existing networks can be ported to NPUs by modifying only the device type or data type.

The PyTorch 1.8.1 adaptation inherits the features of the PyTorch 1.5.0 adaptation. The two provide basically the same functions, but PyTorch 1.8.1 offers a better development experience for backend operator adaptation. Currently, PyTorch 1.8.1 supports only the ResNet-50 network model.

New Features

Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
| --- | --- | --- |
| PyTorch 1.5.0 features adapted to NPUs | Optimization of Model Accuracy Analyzer | Supported mapping between IR and TBE operators and enabled NPU dump data to be loaded to the GPU side for comparison. |
| | E2E prof | Integrated the profile data obtained by the native PyTorch Profiling tool and the CANN prof tool to implement end-to-end model and operator performance analysis. |
| | Basic framework functions | Added support for adapted operator development. For details, see the operator list. |
| PyTorch 1.8.1 features adapted to NPUs | AMP | Supported native PyTorch training with automatic mixed precision (AMP). |
| | Profiling | Supported the native profiling function of PyTorch. |
| OS compatibility | OS compatibility | Supported Ubuntu 20.04 (x86 and ARM) and EulerOS 2.10 (ARM). |
| | Python version compatibility | Supported compilation and use with Python 3.9 (only in PyTorch 1.8.1). |

Modified Features

N/A

Resolved Issues

N/A

Known Issues

| Known Issue | Description |
| --- | --- |
| Data type support | NPUs do not support the input or output of inf/nan data of the float16 type. |
| Data format | Dimensions cannot be reduced when a format larger than 4D is used. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. Only 1, 2, 4, or 8 processors can be allocated. Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom optimized CUDA kernels in Apex are not supported. |

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e

FrameworkPTAdapter 2.0.3

Before You Start

This framework is based on the open-source PyTorch 1.5.0 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, so code can be reused, and existing networks can be ported to NPUs by modifying only the device type or data type.

PyTorch 1.8.1 is supported from this version onward. The PyTorch 1.8.1 adaptation inherits the features of PyTorch 1.5.0 and provides the same functions, except for the Profiling tool. It also optimizes backend operator adaptation. Currently, PyTorch 1.8.1 supports only the ResNet-50 network model.

New Features

Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
| --- | --- | --- |
| PyTorch features adapted to NPUs | PyTorch 1.8.1 | Added PyTorch 1.8.1. Currently, only the ResNet-50 network is supported, including the distributed data parallel (DDP) training scenario. |
| | Python 3.8 | Supported compilation and use with Python 3.8. |
| | Operator overflow/underflow detection tool | Supported IR-level operator overflow/underflow detection in the PyTorch framework. When an AI Core operator overflow/underflow occurs, the IR information is displayed. |
| | OS compatibility | Supported UOS 20 1020e (ARM). |
| | Basic framework functions | Added support for adapted operator development. For details, see the operator list. |
| Model training | CenterFace | - |
| | PCB | - |
| | UNet++ | - |
| | FCN8S | - |
| | OSNetRetinaFace | - |
| | PSPnet | - |
| | EDSR | - |
| | Tsm | - |
| | pnasnet5large | - |
| | Gaitset | - |
| | fcn | - |
| | Albert | - |
| | AdvancedEast | - |
| | ReidStrongBaseline | - |
| | Fast-scnn | - |
| | RDN | - |
| | SRFlow | - |
| | MGN | - |
| | Roberta | - |
| | RegNetY | - |
| | VoVNet-39 | - |
| | RegNetX | - |
| | RefineNet | - |
| | RefineDet | - |
| | AlignedReID | - |
| | FaceBoxes | - |

Modified Features

N/A

Resolved Issues

N/A

Known Issues

| Known Issue | Description |
| --- | --- |
| Data type support | NPUs do not support the input or output of inf/nan data of the float16 type. |
| Data format | Dimensions cannot be reduced when a format larger than 4D is used. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. Only 1, 2, 4, or 8 processors can be allocated. Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom optimized CUDA kernels in Apex are not supported. |

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, EulerOS 2.8, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e

FrameworkPTAdapter 2.0.2

Before You Start

This framework is based on the open-source PyTorch 1.5.0 primarily developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, so code can be reused, and existing networks can be ported to NPUs by modifying only the device type or data type.

New Features

Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
| --- | --- | --- |
| Adapted training models | YOLOv4 | - |
| | YOLOv3 | - |
| | DB | - |
| | RFCN | - |
| | CRNN | - |
| | DenseNet161 | - |
| | DenseNet191 | - |
| PyTorch features adapted to NPUs | Basic framework functions | Added support for adapted operator development. For details, see the operator list. |
| | Model Accuracy Analyzer | Added the Model Accuracy Analyzer and supported training accuracy demarcation. |
| | Ascend 710 AI Processor | Supported online inference on Ascend 710 AI Processors. |
| | OS compatibility | Supported Ubuntu 18.04.5 and openEuler 20.03 LTS. |

Modified Features

N/A

Resolved Issues

N/A

Known Issues

| Known Issue | Description |
| --- | --- |
| Data type support | NPUs do not support the input or output of inf/nan data of the float16 type. |
| Data format | Dimensions cannot be reduced when a format larger than 4D is used. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. Only 1, 2, 4, or 8 processors can be allocated. Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom optimized CUDA kernels in Apex are not supported. |

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, EulerOS 2.8, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS