FrameworkPTAdapter 3.0.RC2 Release Notes
FrameworkPTAdapter 3.0.RC2
Before You Start
This framework is based on the open-source PyTorch 1.5.0 and 1.8.1 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, code can be reused, and existing networks can be ported to NPUs by modifying only the device types or data types.
PyTorch 1.8.1 is adapted through a plugin that is completely decoupled from the native PyTorch framework. Its current functions and performance are basically the same as those of PyTorch 1.5.0, and it provides a good development experience for backend operator adaptation.
AOE-based tuning is supported.
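As a rough illustration of porting by changing only the device type, the sketch below rewrites CUDA device strings to NPU ones. The helper name `to_npu_device` is hypothetical, and the sketch assumes the NPU backend is addressed with the device type string `npu`; it is not part of the adapter itself.

```python
def to_npu_device(device: str) -> str:
    """Map a CUDA device string such as "cuda:1" to "npu:1".

    Hypothetical helper for illustration only. Assumes the NPU backend
    uses the device type string "npu". Any non-CUDA device string
    (for example "cpu") is returned unchanged.
    """
    kind, sep, index = device.partition(":")
    if kind != "cuda":
        return device
    return "npu" + sep + index


if __name__ == "__main__":
    for name in ("cuda:0", "cuda", "cpu"):
        print(name, "->", to_npu_device(name))
```

In a training script, the returned string would replace the device argument that the script already passes to its tensors and models, leaving the rest of the network code untouched.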
New Features
Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
|---|---|---|
| PyTorch 1.5.0 features adapted to NPUs | Basic framework functions | Fixed a few bugs. |
| PyTorch 1.8.1 features adapted to NPUs | Plugin decoupling | Decoupled the torch_npu plugin from the native PyTorch framework completely. |
| | Performance optimization | Optimized the single-operator delivery and collective communication functions. |
| | API coverage | Added more APIs. (For details, see the API list.) |

Modified Features
N/A
Resolved Issues
N/A
Known Issues

| Known Issue | Description |
|---|---|
| Data type support | NPUs do not support inf/nan values of the float16 type as input or output. |
| Data format | Dimension reduction is not supported for formats with more than four dimensions. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. |
| | Only 1, 2, 4, or 8 processors can be allocated. |
| | Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported. |

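The collective communication restrictions above can be checked before a job is launched. The following is a minimal sketch in plain Python; the function name `check_collective_config` and the constant names are hypothetical, and only the processor counts and data types come from the known issues listed here.

```python
# Values taken from the known issues; the names are illustrative assumptions.
SUPPORTED_PROCESSOR_COUNTS = {1, 2, 4, 8}
SUPPORTED_DTYPES = {"int8", "int32", "float16", "float32"}


def check_collective_config(num_processors: int, dtype: str) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if num_processors not in SUPPORTED_PROCESSOR_COUNTS:
        problems.append(f"unsupported processor count: {num_processors}")
    if dtype not in SUPPORTED_DTYPES:
        problems.append(f"unsupported data type: {dtype}")
    return problems


if __name__ == "__main__":
    print(check_collective_config(8, "float16"))
    print(check_collective_config(3, "float64"))
```

The check does not cover the requirement that all devices execute the same graph, which depends on the training script itself.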
Compatibility
Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS
Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e
FrameworkPTAdapter 3.0.RC1
Before You Start
This framework is based on the open-source PyTorch 1.5.0 and 1.8.1 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, code can be reused, and existing networks can be ported to NPUs by modifying only the device types or data types.
PyTorch 1.8.1 adopts the plugin adaptation mode and inherits the features of PyTorch 1.5.0. Their functions are basically the same, but PyTorch 1.8.1 provides a better development experience for backend operator adaptation. It supports AOE-based tuning.
New Features
Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
|---|---|---|
| PyTorch 1.5.0 features adapted to NPUs | AOE-based tuning | Supported the AOE auto tuning tool to improve model performance. |
| | Basic framework functions | Added the function of adapted operator development. For details, see the API list. |
| PyTorch 1.8.1 features adapted to NPUs | Plugin decoupling | Moved the NPU-adapted code into the **torch_npu** plugin, decoupling the Ascend-adapted code from the native PyTorch code. |
| | AOE-based tuning | Supported the AOE auto tuning tool to improve model performance. |
| | Improvement of framework API coverage | Supported APIs that have been adapted to Ascend PyTorch 1.5.0. For details, see the API list. |

Modified Features
N/A
Resolved Issues
N/A
Known Issues

| Known Issue | Symptom |
|---|---|
| Data type support | NPUs do not support inf/nan values of the float16 type as input or output. |
| Data format | Dimension reduction is not supported for formats with more than four dimensions. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. |
| | Only 1, 2, 4, or 8 processors can be allocated. |
| | Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported. |

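Because float16 inf/nan values are not supported as input or output, it can help to screen values before casting them to float16. The sketch below is a conservative host-side check in plain Python. The function name `unsafe_for_float16` is hypothetical; the threshold 65504 is the largest finite IEEE 754 half-precision value, so anything larger in magnitude is flagged because it may become inf after the cast.

```python
import math

# Largest finite value representable in IEEE 754 half precision (float16).
FLOAT16_MAX = 65504.0


def unsafe_for_float16(values):
    """Return the values that are inf/nan or exceed the finite float16 range."""
    return [v for v in values
            if not math.isfinite(v) or abs(v) > FLOAT16_MAX]


if __name__ == "__main__":
    sample = [1.0, -3.5, 70000.0, float("inf")]
    print(unsafe_for_float16(sample))
```

A non-empty result suggests rescaling or clipping the data, or keeping the affected tensors in float32.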
Compatibility
Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS
Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e
FrameworkPTAdapter 2.0.4
Before You Start
This framework is based on the open-source PyTorch 1.5.0 and 1.8.1 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, code can be reused, and existing networks can be ported to NPUs by modifying only the device types or data types.
PyTorch 1.8.1 inherits the features of PyTorch 1.5.0. Their functions are basically the same, but PyTorch 1.8.1 provides a better development experience for backend operator adaptation. Currently, PyTorch 1.8.1 supports only the ResNet-50 network model.
New Features
Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
|---|---|---|
| PyTorch 1.5.0 features adapted to NPUs | Optimization of Model Accuracy Analyzer | Supported mapping between IR and TBE operators and enabled the NPU dump data to be loaded to the GPU side for comparison. |
| | E2E prof | Integrated the profile data obtained by the native PyTorch Profiling tool and the CANN prof tool to implement end-to-end model and operator performance analysis. |
| | Basic framework functions | Added the function of adapted operator development. For details, see the operator list. |
| PyTorch 1.8.1 features adapted to NPUs | AMP | Supported native PyTorch training with automatic mixed precision (AMP). |
| | Profiling | Supported the native profiling function of PyTorch. |
| OS compatibility | OS compatibility | Supported Ubuntu 20.04 (x86 and ARM) and EulerOS 2.10 (ARM). |
| | Python version compatibility | Supported compilation and use of Python 3.9 (only in PyTorch 1.8.1). |

Modified Features
N/A
Resolved Issues
N/A
Known Issues

| Known Issue | Description |
|---|---|
| Data type support | NPUs do not support inf/nan values of the float16 type as input or output. |
| Data format | Dimension reduction is not supported for formats with more than four dimensions. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. |
| | Only 1, 2, 4, or 8 processors can be allocated. |
| | Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported. |

Compatibility
Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS
Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e
FrameworkPTAdapter 2.0.3
Before You Start
This framework is based on the open-source PyTorch 1.5.0 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, code can be reused, and existing networks can be ported to NPUs by modifying only the device types or data types.
This version and later support PyTorch 1.8.1, which inherits the features of PyTorch 1.5.0 and provides the same functions except for the Profiling tool. It also optimizes backend operator adaptation. Currently, PyTorch 1.8.1 supports only the ResNet-50 network model.
New Features
Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
|---|---|---|
| PyTorch features adapted to NPUs | PyTorch 1.8.1 | Added PyTorch 1.8.1. Currently, only the ResNet-50 network is supported, including the training scenario for distributed data parallel (DDP). |
| | Python 3.8 | Supported compilation and use of Python 3.8. |
| | Operator overflow/underflow detection tool | Supported IR-level operator overflow/underflow detection in the PyTorch framework. When an AI Core operator overflow/underflow occurs, the IR information is displayed. |
| | OS compatibility | Supported UOS 20 1020e ARM. |
| | Basic framework functions | Added the function of adapted operator development. For details, see the operator list. |
| Model training | CenterFace | - |
| | PCB | - |
| | UNet++ | - |
| | FCN8S | - |
| | OSNet | - |
| | RetinaFace | - |
| | PSPnet | - |
| | EDSR | - |
| | Tsm | - |
| | pnasnet5large | - |
| | Gaitset | - |
| | fcn | - |
| | Albert | - |
| | AdvancedEast | - |
| | ReidStrongBaseline | - |
| | Fast-scnn | - |
| | RDN | - |
| | SRFlow | - |
| | MGN | - |
| | Roberta | - |
| | RegNetY | - |
| | VoVNet-39 | - |
| | RegNetX | - |
| | RefineNet | - |
| | RefineDet | - |
| | AlignedReID | - |
| | FaceBoxes | - |

Modified Features
N/A
Resolved Issues
N/A
Known Issues

| Known Issue | Description |
|---|---|
| Data type support | NPUs do not support inf/nan values of the float16 type as input or output. |
| Data format | Dimension reduction is not supported for formats with more than four dimensions. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. |
| | Only 1, 2, 4, or 8 processors can be allocated. |
| | Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported. |

Compatibility
Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS
Atlas 800 (model 9000): CentOS 7.6, EulerOS 2.8, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e
FrameworkPTAdapter 2.0.2
Before You Start
This framework is based on the open-source PyTorch 1.5.0 primarily developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity, code can be reused, and existing networks can be ported to NPUs by modifying only the device types or data types.
New Features
Table 1 Features supported by PyTorch

| Level-1 Feature | Level-2 Feature | Description |
|---|---|---|
| Adapted training models | YOLOv4 | - |
| | YOLOv3 | - |
| | DB | - |
| | RFCN | - |
| | CRNN | - |
| | DenseNet161 | - |
| | DenseNet191 | - |
| PyTorch features adapted to NPUs | Basic framework functions | Added the function of adapted operator development. For details, see the operator list. |
| | Model Accuracy Analyzer | Added the Model Accuracy Analyzer and supported training accuracy demarcation. |
| | Ascend 710 AI Processor | Supported online inference on Ascend 710 AI Processors. |
| | OS compatibility | Supported Ubuntu 18.04.5 and openEuler 20.03 LTS. |

Modified Features
N/A
Resolved Issues
N/A
Known Issues

| Known Issue | Description |
|---|---|
| Data type support | NPUs do not support inf/nan values of the float16 type as input or output. |
| Data format | Dimension reduction is not supported for formats with more than four dimensions. |
| Restrictions on collective communication | The graphs executed on different devices in a training job must be the same. |
| | Only 1, 2, 4, or 8 processors can be allocated. |
| | Only the int8, int32, float16, and float32 data types are supported. |
| Apex function | In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported. |

Compatibility
Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS
Atlas 800 (model 9000): CentOS 7.6, EulerOS 2.8, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS
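For scripted environment checks, the 2.0.2 compatibility list above could be encoded as a lookup table. This is only a sketch: the names `COMPATIBILITY_2_0_2` and `is_supported` are hypothetical, and the entries simply mirror the two lines above.

```python
# Compatibility data for FrameworkPTAdapter 2.0.2, copied from the list above.
COMPATIBILITY_2_0_2 = {
    "Atlas 800 (model 9010)": [
        "CentOS 7.6", "Ubuntu 18.04", "BC-Linux 7.6",
        "Debian 9.9", "Debian 10", "openEuler 20.03 LTS",
    ],
    "Atlas 800 (model 9000)": [
        "CentOS 7.6", "EulerOS 2.8", "Kylin V10",
        "BC-Linux 7.6", "openEuler 20.03 LTS",
    ],
}


def is_supported(server: str, os_name: str) -> bool:
    """Case-insensitively check whether os_name is listed for the server."""
    supported = COMPATIBILITY_2_0_2.get(server, [])
    return os_name.casefold() in (name.casefold() for name in supported)


if __name__ == "__main__":
    print(is_supported("Atlas 800 (model 9000)", "kylin v10"))
    print(is_supported("Atlas 800 (model 9000)", "Ubuntu 18.04"))
```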