FrameworkPTAdapter 3.0.RC2 Release Notes

FrameworkPTAdapter 3.0.RC2

Before You Start

This framework is based on the open-source PyTorch 1.5.0 and 1.8.1 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity so that code can be reused, and existing networks can be ported to NPUs by changing only the device type or data type.
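As a minimal sketch of what "only the device type changes" can look like in a training script, the helper below rewrites a CUDA-style device string to an NPU one. The helper name `to_npu_device` is hypothetical, and the `"npu"` device-type string is an assumption about how the Ascend adapter names its devices; check the adapter documentation for the version you use.

```python
def to_npu_device(device: str) -> str:
    """Rewrite a CUDA device string such as "cuda:0" to "npu:0"; leave others as-is."""
    # Split "cuda:0" into ("cuda", ":", "0"); a bare "cuda" gives ("cuda", "", "").
    kind, sep, index = device.partition(":")
    if kind.lower() != "cuda":
        return device
    # Assumption: the NPU backend keeps the same "<type>:<index>" convention.
    return "npu" + sep + index


# Hypothetical use in a training script (needs PyTorch plus the Ascend adapter):
#   model = model.to(to_npu_device("cuda:0"))
print(to_npu_device("cuda:0"))
print(to_npu_device("cpu"))
```

Keeping the device string in one place like this means the rest of the model code can stay unchanged when switching between backends.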

PyTorch 1.8.1 is adapted as a plugin and is completely decoupled from the native PyTorch framework. Its functions and performance are currently basically the same as those of PyTorch 1.5.0, and it provides a good development experience for backend operator adaptation. AOE-based tuning is supported.

New Features

Table 1 Features supported by PyTorch

Level-1 Feature	Level-2 Feature	Description
PyTorch 1.5.0 features adapted to NPUs	Basic framework functions	Fixed a few bugs.
PyTorch 1.8.1 features adapted to NPUs	Decoupled as plugins	Decoupled the torch_npu plugin from the native PyTorch framework completely.
	Performance optimization	Optimized the single-operator delivery and collective communication functions.
	API coverage	Added more APIs. For details, see the API list.

Modified Features

N/A

Resolved Issues

N/A

Known Issues

Known Issue	Description
Data type support	NPUs do not support inf or nan values of the float16 type as input or output.
Data format	Dimensions cannot be reduced when a format with more than four dimensions is used.
Restrictions on collective communication	The graphs executed on different devices in a training job must be the same.
	Only allocations of 1, 2, 4, or 8 processors are supported.
	Only the int8, int32, float16, and float32 data types are supported.
Apex function	In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported.
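The collective communication restrictions above lend themselves to a pre-flight check before a distributed job is launched. The following is a sketch only: the function name `check_collective_config` and the use of plain strings for data types are illustrative assumptions, not part of the adapter's API.

```python
from typing import List

# Values taken from the Known Issues table above.
SUPPORTED_DEVICE_COUNTS = {1, 2, 4, 8}
SUPPORTED_DTYPES = {"int8", "int32", "float16", "float32"}


def check_collective_config(num_devices: int, dtype: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config passes."""
    problems = []
    if num_devices not in SUPPORTED_DEVICE_COUNTS:
        problems.append(
            "device count %d is not one of %s"
            % (num_devices, sorted(SUPPORTED_DEVICE_COUNTS))
        )
    if dtype not in SUPPORTED_DTYPES:
        problems.append(
            "data type %r is not one of %s" % (dtype, sorted(SUPPORTED_DTYPES))
        )
    return problems


for issue in check_collective_config(3, "float64"):
    print("unsupported:", issue)
```

Returning a list rather than raising on the first failure lets a launcher report every violated restriction at once.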

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e

FrameworkPTAdapter 3.0.RC1

Before You Start

PyTorch 1.8.1 is adapted as a plugin and inherits the features of PyTorch 1.5.0. Their functions are basically the same, but PyTorch 1.8.1 provides a better development experience for backend operator adaptation. It supports AOE-based tuning.

New Features

Table 1 Features supported by PyTorch

Level-1 Feature	Level-2 Feature	Description
PyTorch 1.5.0 features adapted to NPUs	AOE-based tuning	Supported the AOE auto tuning tool to improve the model performance.
PyTorch 1.5.0 features adapted to NPUs	Basic framework functions	Added the function of adapted operator development. For details, see the API list.
PyTorch 1.8.1 features adapted to NPUs	Plugin decoupling	Moved the NPU adaptation code into the torch_npu plugin, decoupling the Ascend adaptation code from the native PyTorch code.
	AOE-based tuning	Supported the AOE auto tuning tool to improve the model performance.
	Improved framework API coverage	Supported the APIs that have already been adapted in Ascend PyTorch 1.5.0. For details, see the API list.

Modified Features

N/A

Resolved Issues

N/A

Known Issues

Known Issue	Description
Data type support	NPUs do not support inf or nan values of the float16 type as input or output.
Data format	Dimensions cannot be reduced when a format with more than four dimensions is used.
Restrictions on collective communication	The graphs executed on different devices in a training job must be the same.
	Only allocations of 1, 2, 4, or 8 processors are supported.
	Only the int8, int32, float16, and float32 data types are supported.
Apex function	In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported.
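Because float16 inf and nan values are not supported on NPUs, it can help to screen host-side values before they are cast to half precision. The sketch below uses only the Python standard library; the helper names `fits_float16` and `find_unsafe` are hypothetical, and the check models IEEE 754 half precision in general rather than any NPU-specific behavior.

```python
import math
import struct


def fits_float16(value: float) -> bool:
    """True if value is finite and stays finite when stored as IEEE 754 half precision."""
    if not math.isfinite(value):
        return False
    try:
        # The "e" format packs a half-precision float and rejects values
        # that are too large to represent as a finite float16.
        struct.pack("<e", value)
    except (OverflowError, struct.error):
        return False
    return True


def find_unsafe(values):
    """Return the indices of values that would be inf or nan as float16."""
    return [i for i, v in enumerate(values) if not fits_float16(v)]


print(find_unsafe([1.0, 65504.0, 70000.0, float("nan")]))
```

The largest finite float16 value is 65504, so anything well above that overflows to inf when cast and is flagged here.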

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e

FrameworkPTAdapter 2.0.4

Before You Start

PyTorch 1.8.1 inherits the features of PyTorch 1.5.0. Their functions are basically the same, but PyTorch 1.8.1 provides a better development experience for backend operator adaptation. Currently, PyTorch 1.8.1 supports only the ResNet-50 network model.

New Features

Table 1 Features supported by PyTorch

Level-1 Feature	Level-2 Feature	Description
PyTorch 1.5.0 features adapted to NPUs	Optimization of Model Accuracy Analyzer	Supported mapping between IR and TBE operators and enabled the NPU dump data to be loaded to the GPU side for comparison.
	E2E prof	Integrated the profile data obtained by the native PyTorch Profiling tool and CANN prof tool to implement end-to-end model and operator performance analysis.
	Basic framework functions	Added the function of adapted operator development. For details, see the operator list.
PyTorch 1.8.1 features adapted to NPUs	AMP	Supported the native training with automatic mixed precision (AMP) of PyTorch.
PyTorch 1.8.1 features adapted to NPUs	Profiling	Supported the native profiling function of PyTorch.
OS compatibility	OS compatibility	Supported Ubuntu 20.04 (x86 and ARM) and EulerOS 2.10 (ARM).
OS compatibility	Python version compatibility	Supported compilation and use of Python 3.9 (only in PyTorch 1.8.1).

Modified Features

N/A

Resolved Issues

N/A

Known Issues

Known Issue	Description
Data type support	NPUs do not support inf or nan values of the float16 type as input or output.
Data format	Dimensions cannot be reduced when a format with more than four dimensions is used.
Restrictions on collective communication	The graphs executed on different devices in a training job must be the same.
	Only allocations of 1, 2, 4, or 8 processors are supported.
	Only the int8, int32, float16, and float32 data types are supported.
Apex function	In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported.

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04/20.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, Ubuntu 18.04/20.04, EulerOS 2.8/2.10, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e

FrameworkPTAdapter 2.0.3

Before You Start

This framework is based on the open-source PyTorch 1.5.0 developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity so that code can be reused, and existing networks can be ported to NPUs by changing only the device type or data type.

PyTorch 1.8.1 is supported in this version and later. It inherits the features of PyTorch 1.5.0 and provides the same functions, except for the Profiling tool, and it optimizes backend operator adaptation. Currently, PyTorch 1.8.1 supports only the ResNet-50 network model.

New Features

Table 1 Features supported by PyTorch

Level-1 Feature	Level-2 Feature	Description
PyTorch features adapted to NPUs	PyTorch 1.8.1	Added support for PyTorch 1.8.1. Currently, only the ResNet-50 network is supported, including distributed data parallel (DDP) training.
	Python 3.8	Supported compilation and use of Python 3.8.
	Operator overflow/underflow detection tool	Supported IR-level operator overflow/underflow detection in the PyTorch framework. When an AI Core operator overflow/underflow occurs, the IR information is displayed.
	OS compatibility	Supported UOS 20 1020e ARM.
	Basic framework functions	Added the function of adapted operator development. For details, see the operator list.
Model training	CenterFace	-
	PCBU	-
	Net++	-
	FCN8S	-
	OSNet	-
	RetinaFace	-
	PSPnet	-
	EDSR	-
	Tsm	-
	pnasnet5large	-
	Gaitset	-
	fcn	-
	Albert	-
	AdvancedEast	-
	ReidStrongBaseline	-
	Fast-scnn	-
	RDN	-
	SRFlow	-
	MGN	-
	Roberta	-
	RegNetY	-
	VoVNet-39	-
	RegNetX	-
	RefineNet	-
	RefineDet	-
	AlignedReID	-
	FaceBoxes	-

Modified Features

N/A

Resolved Issues

N/A

Known Issues

Known Issue	Description
Data type support	NPUs do not support inf or nan values of the float16 type as input or output.
Data format	Dimensions cannot be reduced when a format with more than four dimensions is used.
Restrictions on collective communication	The graphs executed on different devices in a training job must be the same.
	Only allocations of 1, 2, 4, or 8 processors are supported.
	Only the int8, int32, float16, and float32 data types are supported.
Apex function	In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported.

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, EulerOS 2.8, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS, UOS 20 1020e

FrameworkPTAdapter 2.0.2

Before You Start

This framework is based on the open-source PyTorch 1.5.0 primarily developed by Facebook. It inherits native PyTorch features and uses NPUs for dynamic graph training. Models are adapted at operator granularity so that code can be reused, and existing networks can be ported to NPUs by changing only the device type or data type.

New Features

Table 1 Features supported by PyTorch

Level-1 Feature	Level-2 Feature	Description
Adapted training models	YOLOv4	-
	YOLOv3	-
	DB	-
	RFCN	-
	CRNN	-
	DenseNet161	-
	DenseNet191	-
PyTorch features adapted to NPUs	Basic framework functions	Added the function of adapted operator development. For details, see the operator list.
	Model Accuracy Analyzer	Added the Model Accuracy Analyzer, which helps locate the source of training accuracy problems.
	Ascend 710 AI Processor	Supported online inference on Ascend 710 AI Processors.
	OS compatibility	Supported Ubuntu 18.04.5 and openEuler 20.03 LTS.

Modified Features

N/A

Resolved Issues

N/A

Known Issues

Known Issue	Description
Data type support	NPUs do not support inf or nan values of the float16 type as input or output.
Data format	Dimensions cannot be reduced when a format with more than four dimensions is used.
Restrictions on collective communication	The graphs executed on different devices in a training job must be the same.
	Only allocations of 1, 2, 4, or 8 processors are supported.
	Only the int8, int32, float16, and float32 data types are supported.
Apex function	In the current version, Apex is implemented mainly in Python, and the custom CUDA kernel optimizations in Apex are not supported.

Compatibility

Atlas 800 (model 9010): CentOS 7.6, Ubuntu 18.04, BC-Linux 7.6, Debian 9.9, Debian 10, openEuler 20.03 LTS

Atlas 800 (model 9000): CentOS 7.6, EulerOS 2.8, Kylin V10, BC-Linux 7.6, openEuler 20.03 LTS
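For installation scripts, the FrameworkPTAdapter 2.0.2 compatibility list above can be encoded as a small lookup table. This is an illustrative sketch: the dictionary layout and the `is_supported` helper are assumptions made for this example, and the entries cover only the 2.0.2 release.

```python
# Supported operating systems per Atlas 800 model for FrameworkPTAdapter 2.0.2,
# copied from the compatibility list above.
SUPPORTED_OS = {
    "9010": {
        "CentOS 7.6", "Ubuntu 18.04", "BC-Linux 7.6",
        "Debian 9.9", "Debian 10", "openEuler 20.03 LTS",
    },
    "9000": {
        "CentOS 7.6", "EulerOS 2.8", "Kylin V10",
        "BC-Linux 7.6", "openEuler 20.03 LTS",
    },
}


def is_supported(model: str, os_name: str) -> bool:
    """Case-insensitive check of an OS name against the list for one server model."""
    wanted = os_name.strip().casefold()
    return any(wanted == name.casefold() for name in SUPPORTED_OS.get(model, ()))


print(is_supported("9000", "kylin v10"))
print(is_supported("9000", "Ubuntu 18.04"))
```

An unknown model number simply yields `False`, which keeps the check safe to call with unvalidated input.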