MindStudio Ops Generator User Guide

Introduction

Tool Overview

After analyzing an operator and defining the prototype, you can use MindStudio Ops Generator (msOpGen) to generate a custom operator project, build the project, and deploy it.

It provides the following functions:

  • Outputs operator projects based on the operator prototype definition.
  • Outputs the operator simulation pipeline file based on the dump data file generated in the performance simulation environment.

Tool Usage Process

For details, see Figure 1 msOpGen workflow.

Figure 1 msOpGen workflow

Preparations

After the environment is set up according to the requirements, you can directly use msOpGen.

Environment Setup

Before developing an operator, install the CANN Toolkit and ops operator package of the required version and configure CANN environment variables. For details, see CANN Software Installation Guide. No installation example is provided in this section.

Constraints

  • For security and least privilege purposes, you are advised to use a common user account instead of a high-permission user account (such as root) to run the tools in this code repository.

  • Before using the operator development tools, ensure that the running user's umask is 0027 or more restrictive. Failure to do so may result in excessively permissive permissions on the directories and files where profile data is stored.

  • Before using the operator development tools, ensure that the principle of least privilege is applied (for example, do not allow write access for others and avoid setting file permissions to 666 or 777).

  • You are advised not to configure or run custom scripts in directories owned by other users, to avoid privilege escalation.

  • When downloading the code sample, run the following command to specify the branch version:

    git clone https://gitee.com/ascend/samples.git -b master
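
The umask constraint above can be checked before running the tools. The following is a minimal sketch, not part of msOpGen: the helper names `is_restrictive_enough` and `current_umask` are hypothetical, and "0027 or more restrictive" is assumed to mean that every permission bit masked by 0027 is also masked by the current umask.

```python
import os

REQUIRED_UMASK = 0o027  # value named in the constraint above


def is_restrictive_enough(mask: int, required: int = REQUIRED_UMASK) -> bool:
    """Return True if every permission bit masked by `required` is also masked by `mask`."""
    return (mask & required) == required


def current_umask() -> int:
    """Read the process umask. os.umask() can only read by setting, so restore it at once."""
    old = os.umask(0o077)
    os.umask(old)
    return old


if __name__ == "__main__":
    mask = current_umask()
    status = "OK" if is_restrictive_enough(mask) else "too permissive"
    print(f"umask {mask:04o}: {status}")
```

Under this interpretation, 0027 and 0077 pass the check, while the common default 0022 does not, because it leaves the write bit for the group unmasked.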
    

Operator Project Creation Functions

Function Description

msOpGen supports the following functions: operator project creation, operator implementation (on both host and kernel), operator project build and deployment, and operator simulation pipeline file parsing.

Table 1 msOpGen functions

Function

Link

Operator project creation

Creating an Operator Project

Operator implementation (on both host and kernel)

Developing an Operator

Operator project build and deployment

Building and Deploying an Operator

Operator simulation pipeline file parsing

Viewing the Operator Simulation Pipeline

Precautions

After an operator template is generated based on the input configuration parameters, you are advised to check the security of the operator project code before running the template.

Command Syntax

Command

Run the following command. For details about the parameters, see Table 1 Parameters for creating an operator project.

msopgen gen -i {*.json} -f {framework type} -c {Compute Resource} -lan cpp -out {Output Path}

Parameter Description

Table 1 Parameters for creating an operator project

Parameter

Description

Required/Optional

gen

Generates operator development deliverables.

Required

-i, --input

Operator prototype definition file (.json) path, which can be an absolute path or a relative path. The tool running user must have the read permission on this path.

Required

-f, --framework

Framework type.

  • TensorFlow framework (default): tf or tensorflow
  • Caffe framework: caffe
    Note:

    Custom Ascend C operators do not support the Caffe framework.

  • PyTorch framework: pytorch
  • MindSpore framework: ms or mindspore
  • ONNX framework: onnx
Note:
  • All values are case insensitive.
  • TBE and TIK operators do not support single-operator API calls. By default, a TensorFlow framework project is generated.
  • Ascend C operator projects support the TensorFlow framework, the PyTorch framework, and single-operator API calls. By default, a TensorFlow framework project is generated.
  • When -f aclnn is used, an Ascend C operator project is generated.

Optional

-lan, --language

Programming language used to develop the operator.

  • cpp: Use C/C++ for operator development based on the Ascend C framework.
  • py: Use Python for operator development based on the DSL and TIK frameworks.

Default value: py.

Note:

cpp applies only to Ascend C operator development scenarios.

Optional

-c, --compute_unit

  • Compute resources used by the operator.

    Configuration format: ai_core-{soc version}. ai_core and {soc version} are connected by a hyphen (-).

    Select according to the actual AI processor version.

Note:

To determine the AI processor model soc_version, use the following method:

  • For servers other than the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info command on the server where the Ascend AI Processor is installed to obtain the chip name. The actual value is Ascend followed by the chip name. For example, if the chip name is xxxyy, the actual value is Ascendxxxyy. If Ascendxxxyy is used as a path in the code sample, set this parameter to ascendxxxyy.
  • For the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info -t board -i id -c chip_id command on the server where the Ascend AI Processor is installed to obtain the chip name and NPU name. The actual value is the chip name and the NPU name joined by an underscore (Chip name_NPU name). For example, if the chip name is Ascendxxx and the NPU name is 1234, the actual value is Ascendxxx_1234. If Ascendxxx_1234 is used as a path in the code sample, set this parameter to ascendxxx_1234.
    • id: device ID, which is the NPU ID obtained by running the `npu-smi info -l` command.
    • chip_id: chip ID, which is the same as the chip ID obtained by running the npu-smi info -m command.

Basic functions (operator development, build, and deployment based on the project) are applicable across operator projects created based on AI processor models from the same series.

  • For AI CPU operators, set this parameter to aicpu.
    Note:

    For the Atlas A3 training products/Atlas A3 inference products: Do not use the following compile options during compilation. Failure to comply may result in system malfunction.

    • -march=armv8-a+lse
    • -march=armv8.1-a
    • -march=armv8.2-a
    • -march=armv8.3-a

Required

-out, --output

Directory where the generated file will be stored. This path can be configured as an absolute path or a relative path. The user executing the tool must have read and write permissions for the specified directory.

If this parameter is not specified, the file is generated in the current path where the command is executed by default.

Note:

If a file with the same name as a template project exists in the user-specified output directory, the file in the output directory will be overwritten by the file from the template project.

Optional

-m, --mode

Deliverable generation mode.

  • 0: creates an operator project. If an operator project already exists in the specified path, an error is reported and the system exits.
  • 1: generates the deliverables to an existing operator project.

Default value: 0.

Optional

-op, --operator

Operator type, for example, Conv2DTik.

If this option is not set, the tool prompts you to select an operator when there are multiple operators in the operator prototype definition file.

Optional
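
The parameters in Table 1 can be assembled programmatically, for example when several operator projects are generated from a script. The following is a minimal sketch that only builds the argument list and does not run msOpGen. The helper names `ai_core_unit` and `build_gen_command` are hypothetical; the flags are the ones listed in the table above. `ai_core_unit` assumes a server other than the Atlas A3 products, where the soc version is Ascend followed by the chip name.

```python
import shlex
from typing import List, Optional


def ai_core_unit(chip_name: str) -> str:
    """Format the -c value as ai_core-{soc version}, with soc version = 'Ascend' + chip name."""
    return f"ai_core-Ascend{chip_name}"


def build_gen_command(
    json_path: str,
    compute_unit: str,
    framework: Optional[str] = None,
    language: Optional[str] = None,
    output: Optional[str] = None,
    mode: Optional[int] = None,
    operator: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `msopgen gen`; optional flags are added only when set."""
    if mode is not None and mode not in (0, 1):
        raise ValueError("-m accepts only 0 (new project) or 1 (existing project)")
    # -i and -c are marked Required in the table; everything else is optional.
    cmd = ["msopgen", "gen", "-i", json_path, "-c", compute_unit]
    optional = [
        ("-f", framework),
        ("-lan", language),
        ("-out", output),
        ("-m", mode),
        ("-op", operator),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    argv = build_gen_command(
        "add_custom.json",
        ai_core_unit("xxxyy"),  # placeholder chip name, as in the table
        framework="tf",
        language="cpp",
        output="./output_data",
    )
    print(shlex.join(argv))
```

Keeping the command as a list rather than a single string avoids shell quoting issues if the list is later passed to `subprocess.run`.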

Supplementary Information

For details about other parameters of the msOpGen tool, see Table 2 Parameter description.

Table 2 Parameter description

Parameter Description Remarks
compile Used when building a TBE&AI CPU operator project. For details, see Independent Compilation of Operator Deliverables.

Example

Creating an Operator Project

  1. Write the prototype definition JSON file of the operator. It is used to generate the operator development project. For details about the parameters in the JSON file, see Table 1 Parameters in the JSON file.

    For example, the JSON file of the AddCustom operator is named add_custom.json, and the file content is as follows:

    [
        {
            "op": "AddCustom",
            "input_desc": [
                {
                    "name": "x",
                    "param_type": "required",
                    "format": [
                        "ND",
                        "ND",
                        "ND"
                    ],
                    "type": [
                        "fp16",
                        "float",
                        "int32"
                    ]
                },
                {
                    "name": "y",
                    "param_type": "required",
                    "format": [
                        "ND",
                        "ND",
                        "ND"
                    ],
                    "type": [
                        "fp16",
                        "float",
                        "int32"
                    ]
                }
            ],
            "output_desc": [
                {
                    "name": "z",
                    "param_type": "required",
                    "format": [
                        "ND",
                        "ND",
                        "ND"
                    ],
                    "type": [
                        "fp16",
                        "float",
                        "int32"
                    ]
                }
            ]
        }
    ]
    

    For example, the JSON file of the ReduceMaxCustom operator (including attributes) is named reduce_max_custom.json, and the file content is as follows:

    [
        {
            "op": "ReduceMaxCustom",
            "input_desc": [
                {
                    "name": "x",
                    "param_type": "required",
                    "format": ["ND"],
                    "type": ["float16"]
                }
            ],
            "output_desc": [
                {
                    "name": "y",
                    "param_type": "required",
                    "format": ["ND"],
                    "type": ["float16"]
                },
                {
                    "name": "idx",
                    "param_type": "required",
                    "format": ["ND"],
                    "type": ["int32"]
                }
            ],
            "attr": [
                {
                    "name": "reduceDim",
                    "param_type": "required",
                    "type": "int"
                },
                {
                    "name": "isKeepDim",
                    "param_type": "optional",
                    "type": "int",
                    "default_value": 1
                }
            ]
        }
    ]
    

    Table 1 Parameters in the JSON file

    Configuration Field

    Type

    Description

    Required/Optional

    op

    -

    String

    Operator type.

    Required

    input_desc

    -

    List

    Input parameter description.

    Optional

    name

    String

    Name of the operator input parameter.

    param_type

    String

    Parameter type:

    • required
    • optional
    • dynamic

    If it is not configured, the default value is required.

    format

    List

    For parameters of the tensor type, set this field to the data layout format supported by the tensor.

    Values:

    ND, NHWC, NCHW, HWCN, NC1HWC0, FRACTAL_Z, and others.

    Note:

    format must match type. If one field contains only a single value, msOpGen automatically repeats that value to match the length of the other field. For example, if you set format:["ND"] and type:["fp16","float","int32"], msOpGen repeats the single format value (ND) to match the length of type, resulting in format:["ND","ND","ND"] and type:["fp16","float","int32"].

    type

    List

    Type of the operator parameter.

    • Ascend C or TBE operator: float, half, float16 (fp16), float32 (fp32), int8, int16, int32, int64, uint8, uint16, uint32, uint64, qint8, qint16, qint32, quint8, quint16, quint32, bool, double, string, resource, complex64, complex128, bf16, numbertype, realnumbertype, quantizedtype, all, BasicType, IndexNumberType, and bfloat16.
    • MindSpore data: None_None, BOOL_None, BOOL_Default, BOOL_5HD, BOOL_FracZ, BOOL_FracNZ, BOOL_C1HWNCoC0, BOOL_NCHW, BOOL_NHWC, BOOL_NDHWC, I8_None, I8_Default, I8_5HD, I8_FracZ, I8_FracNZ, I8_C1HWNCoC0, I8_NCHW, I8_NHWC, I8_HWCN, I8_NDHWC, U8_None, U8_Default, U8_5HD, U8_FracZ, U8_FracNZ, U8_C1HWNCoC0, U8_NCHW, U8_NHWC, U8_HWCN, U8_NDHWC, I16_None, I16_Default, I16_5HD, I16_FracZ, I16_FracNZ, I16_C1HWNCoC0, I16_NCHW, I16_NHWC, I16_HWCN, I16_NDHWC, U16_None, U16_Default, U16_5HD, U16_FracZ, U16_FracNZ, U16_C1HWNCoC0, U16_NCHW, U16_NHWC, U16_HWCN, U16_NDHWC, I32_None, I32_Default, I32_5HD, I32_FracZ, I32_FracNZ, I32_C1HWNCoC0, I32_NCHW, I32_NHWC, I32_HWCN, I32_NDHWC, U32_None, U32_Default, U32_5HD, U32_FracZ, U32_FracNZ, U32_C1HWNCoC0, U32_NCHW, U32_NHWC, U32_HWCN, U32_NDHWC, I64_None, I64_Default, I64_5HD, I64_FracZ, I64_FracNZ, I64_C1HWNCoC0, I64_NCHW, I64_NHWC, I64_HWCN, I64_NDHWC, U64_None, U64_Default, U64_5HD, U64_FracZ, U64_FracNZ, U64_C1HWNCoC0, U64_NCHW, U64_NHWC, U64_HWCN, U64_NDHWC, F16_None, F16_Default, F16_5HD, F16_FracZ, F16_FracNZ, F16_C1HWNCoC0, F16_NCHW, F16_NHWC, F16_HWCN, F16_NDHWC, F16_FracZNLSTM, F32_None, F32_Default, F32_5HD, F32_FracZ, F32_FracNZ, F32_C1HWNCoC0, F32_NCHW, F32_NHWC, F32_HWCN, F32_NDHWC, F32_FracZNLSTM, F64_None, F64_Default, F64_5HD, F64_FracZ, F64_FracNZ, F64_C1HWNCoC0, F64_NCHW, F64_NHWC, F64_HWCN, and F64_NDHWC.
    Note:
    • Different compute operations support different data types. For details, see Ascend C Operator Development APIs.
    • format must match type. If one field contains only a single value, msOpGen automatically repeats that value to match the length of the other field. For example, if you set format:["ND"] and type:["fp16","float","int32"], msOpGen repeats the single format value (ND) to match the length of type, resulting in format:["ND","ND","ND"] and type:["fp16","float","int32"].

    output_desc

    -

    List

    Output parameter description.

    Required

    name

    String

    Name of the operator output parameter.

    param_type

    String

    Parameter type:

    • required
    • optional
    • dynamic

    If it is not configured, the default value is required.

    format

    List

    For parameters of the tensor type, set this field to the data layout format supported by the tensor.

    Values:

    ND, NHWC, NCHW, HWCN, NC1HWC0, FRACTAL_Z, and others.

    Note:

    format must match type. If one field contains only a single value, msOpGen automatically repeats that value to match the length of the other field. For example, if you set format:["ND"] and type:["fp16","float","int32"], msOpGen repeats the single format value (ND) to match the length of type, resulting in format:["ND","ND","ND"] and type:["fp16","float","int32"].

    type

    List

    Type of the operator parameter.

    • Ascend C or TBE operator: float, half, float16 (fp16), float32 (fp32), int8, int16, int32, int64, uint8, uint16, uint32, uint64, qint8, qint16, qint32, quint8, quint16, quint32, bool, double, string, resource, complex64, complex128, bf16, numbertype, realnumbertype, quantizedtype, all, BasicType, IndexNumberType, and bfloat16.
    • MindSpore data: None_None, BOOL_None, BOOL_Default, BOOL_5HD, BOOL_FracZ, BOOL_FracNZ, BOOL_C1HWNCoC0, BOOL_NCHW, BOOL_NHWC, BOOL_NDHWC, I8_None, I8_Default, I8_5HD, I8_FracZ, I8_FracNZ, I8_C1HWNCoC0, I8_NCHW, I8_NHWC, I8_HWCN, I8_NDHWC, U8_None, U8_Default, U8_5HD, U8_FracZ, U8_FracNZ, U8_C1HWNCoC0, U8_NCHW, U8_NHWC, U8_HWCN, U8_NDHWC, I16_None, I16_Default, I16_5HD, I16_FracZ, I16_FracNZ, I16_C1HWNCoC0, I16_NCHW, I16_NHWC, I16_HWCN, I16_NDHWC, U16_None, U16_Default, U16_5HD, U16_FracZ, U16_FracNZ, U16_C1HWNCoC0, U16_NCHW, U16_NHWC, U16_HWCN, U16_NDHWC, I32_None, I32_Default, I32_5HD, I32_FracZ, I32_FracNZ, I32_C1HWNCoC0, I32_NCHW, I32_NHWC, I32_HWCN, I32_NDHWC, U32_None, U32_Default, U32_5HD, U32_FracZ, U32_FracNZ, U32_C1HWNCoC0, U32_NCHW, U32_NHWC, U32_HWCN, U32_NDHWC, I64_None, I64_Default, I64_5HD, I64_FracZ, I64_FracNZ, I64_C1HWNCoC0, I64_NCHW, I64_NHWC, I64_HWCN, I64_NDHWC, U64_None, U64_Default, U64_5HD, U64_FracZ, U64_FracNZ, U64_C1HWNCoC0, U64_NCHW, U64_NHWC, U64_HWCN, U64_NDHWC, F16_None, F16_Default, F16_5HD, F16_FracZ, F16_FracNZ, F16_C1HWNCoC0, F16_NCHW, F16_NHWC, F16_HWCN, F16_NDHWC, F16_FracZNLSTM, F32_None, F32_Default, F32_5HD, F32_FracZ, F32_FracNZ, F32_C1HWNCoC0, F32_NCHW, F32_NHWC, F32_HWCN, F32_NDHWC, F32_FracZNLSTM, F64_None, F64_Default, F64_5HD, F64_FracZ, F64_FracNZ, F64_C1HWNCoC0, F64_NCHW, F64_NHWC, F64_HWCN, and F64_NDHWC.
    Note:
    • Different compute operations support different data types. For details, see Ascend C Operator Development APIs.
    • format must match type. If one field contains only a single value, msOpGen automatically repeats that value to match the length of the other field. For example, if you set format:["ND"] and type:["fp16","float","int32"], msOpGen repeats the single format value (ND) to match the length of type, resulting in format:["ND","ND","ND"] and type:["fp16","float","int32"].

    attr

    -

    List

    Attribute description.

    Optional

    name

    String

    Name of the operator attribute parameter.

    param_type

    String

    Parameter type:

    • required
    • optional

    If it is not set, the default value is required.

    type

    String

    Type of the operator parameter.

    Values:

    int, bool, float, string, list_int, list_float, list_bool, and list_list_int. For more details, see "Host APIs" > "Prototype Registration and Management" > "OpAttrDef" > "OpAttrDef" in Ascend C Operator Development APIs.

    default_value

    -

    Default value.

    NOTE

    • Multiple operators can be configured in a JSON file, which contains a list, with each element representing an operator.
    • If two parameters in input_desc or output_desc have the same name, the latter overwrites the former.
    • The type and format fields in input_desc and output_desc must be matched in sequence. For example, type of the first input x is set to ["int8","int32"], type of the second input y is set to ["fp16","fp32"], and type of the output z is set to ["int32","int64"]. The operator then supports the inputs ("int8","fp16") to generate int32, or the inputs ("int32","fp32") to generate int64. That is, the type fields of the inputs are mapped to the type field of the output position by position, and the combinations cannot be mixed.
    • The number of types must be the same as the number of formats. If the value of type is one of the following (numbertype, realnumbertype, quantizedtype, BasicType, IndexNumberType, or all) and the number of types differs from the number of formats, an error message is displayed when you create the project, the formats are supplemented based on the number of types, and the operator project continues to be generated. If the value of type is a basic data type such as int32 and the type and format items cannot be matched, an error message is displayed during project generation and the generation is interrupted.
    • The JSON file can be used to configure operator attributes. For details, see compiling the prototype definition file.
    • The operator type must be named in UpperCamelCase style, that is, each word starts with an uppercase letter. For details, see the note in Building and Deploying an Operator.
  2. Run the following command to generate an operator project. The following uses the AddCustom operator as an example. For details about the parameters, see Table 1 Parameters for creating an operator project.

    msopgen gen -i {*.json} -f {framework type} -c {Compute Resource} -lan cpp -out {Output Path}
    
  3. After the command is executed, the operator project directory is generated in the specified directory. The project contains the operator implementation template file and compilation script.

    The operator project directory is generated in the ./output_data directory specified by -out. The directory is organized as follows:

    output_data
    ├── build.sh         // Compilation entry script
    ├── CMakeLists.txt   // CMakeLists.txt script of the operator project
    ├── CMakePresets.json // Compilation configuration items
    ├── framework        // Directory for storing the implementation file of the operator plugin. The generation of single-operator model files does not depend on the operator plugin and can be ignored.
    ├── op_host                      // Implementation file on the host.
    │   ├── add_custom.cpp         // Content file for operator prototype registration, shape derivation, information library, and tiling implementation.
    │   ├── CMakeLists.txt
    ├── op_kernel                   // Implementation file on the kernel
    │   ├── CMakeLists.txt   
    │   ├── add_custom.cpp        // Operator implementation file
    │   ├── add_custom_tiling.h    // Operator tiling definition file.
    
  4. Add an operator to an existing operator project. To add more custom operators to an existing operator project, include the -m 1 option in the command line.

    msopgen gen -i json_path/*.json -f tf -c ai_core-{Soc Version} -out ./output_data -m 1
    
    • -i: specifies the path of the operator prototype definition file add_custom.json.
    • -c: The value of {Soc Version} is the model of the AI processor.

    The operator is added to the *.json file in the operator project directory. Only operators based on the MindSpore framework can be added to the MindSpore operator project.

  5. After the operator project is created, develop the operator. For details, see Developing an Operator.
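
The format/type matching rules described for the prototype definition JSON file in step 1 can be sketched as follows. This is an illustrative approximation, not msOpGen's implementation: the function names `expand_desc` and `supported_combinations` are hypothetical, and the sketch raises an error on every length mismatch, so it does not model the case where msOpGen supplements formats for generic types such as numbertype.

```python
from typing import Dict, List, Tuple


def expand_desc(desc: Dict) -> Dict:
    """Return a copy of one input_desc/output_desc entry with format and type of equal length."""
    formats = list(desc["format"])
    types = list(desc["type"])
    # A field holding a single value is repeated to the length of the other field.
    if len(formats) == 1 and len(types) > 1:
        formats = formats * len(types)
    elif len(types) == 1 and len(formats) > 1:
        types = types * len(formats)
    if len(formats) != len(types):
        raise ValueError(f"{desc.get('name')}: {len(formats)} formats vs {len(types)} types")
    return {**desc, "format": formats, "type": types}


def supported_combinations(op: Dict) -> List[Tuple[Tuple[str, ...], Tuple[str, ...]]]:
    """Pair the i-th type of every input with the i-th type of every output."""
    inputs = [expand_desc(d)["type"] for d in op.get("input_desc", [])]
    outputs = [expand_desc(d)["type"] for d in op.get("output_desc", [])]
    lengths = {len(column) for column in inputs + outputs}
    if len(lengths) > 1:
        raise ValueError("all inputs and outputs must list the same number of types")
    count = lengths.pop() if lengths else 0
    return [
        (tuple(col[i] for col in inputs), tuple(col[i] for col in outputs))
        for i in range(count)
    ]


if __name__ == "__main__":
    # Mirrors the x/y/z example in the NOTE of step 1; the operator name is made up.
    op = {
        "op": "DemoCustom",
        "input_desc": [
            {"name": "x", "format": ["ND"], "type": ["int8", "int32"]},
            {"name": "y", "format": ["ND"], "type": ["fp16", "fp32"]},
        ],
        "output_desc": [
            {"name": "z", "format": ["ND"], "type": ["int32", "int64"]},
        ],
    }
    for in_types, out_types in supported_combinations(op):
        print(in_types, "->", out_types)
```

Running such a check on the JSON file before calling msopgen gen can surface mismatched list lengths early, but the tool's own validation remains authoritative.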

Developing an Operator

Procedure

  1. Complete operator development and adaptation, including the development of the operator kernel function and tiling implementation. For details, see "Project-based Operator Development" in Ascend C Operator Development Guide.
  2. Refer to AddCustom documentation to complete the implementation of op_host/add_custom_tiling.h, op_host/add_custom.cpp, and op_kernel/add_custom.cpp.
  3. After the operator is implemented, build and deploy the operator.

Building and Deploying an Operator

Preparations

  • Compile the kernel-side code implementation file *.cpp of the Ascend C operator. There are two release modes: source and binary.

    • Source release: The kernel implementation of the operator is not compiled, and the kernel source code file *.cpp of the operator is retained. This mode supports online operator compilation and operator compilation through ATC model conversion.
    • Binary release: Compile the kernel implementation of the operator and generate the JSON file *.json and operator binary file *.o that describe the operator information. If the operator binary needs to be directly called, use this compilation mode.
  • Compile the host-side code implementation files *.cpp and *.h of the Ascend C operator.

    • Compile the prototype definition and shape inference implementation into the operator prototype definition dynamic library libcust_opsproto_*.so and generate the external API op_proto.h of the operator prototype.
    • Compile the operator information library definition into the information library definition file *.json.
    • Compile the tiling implementation into the tiling dynamic library liboptiling.so.
    • Automatically generate the code and header file aclnn_*.h for the single-operator API call, and compile the dynamic library libcust_opapi.so for the single-operator API call.

Build Process

After the operator kernel and host are developed, build the operator project to generate a custom operator installation package (.run). For details about the build process, see Figure 1 Operator project build process.

Figure 1 Operator project build process

Procedure

  1. Modify the cacheVariables configuration item of CMakePresets.json in the project directory to complete the project build configuration. The content of the CMakePresets.json file is as follows. For details about the parameters, see Table 1 Common parameters to be configured by developers.

    {
        "version": 1,
        "cmakeMinimumRequired": {
            "major": 3,
            "minor": 19,
            "patch": 0
        },
        "configurePresets": [
            {
                "name": "default",
                "displayName": "Default Config",
                "description": "Default build using Unix Makefiles generator",
                "generator": "Unix Makefiles",
                "binaryDir": "${sourceDir}/build_out",
                "cacheVariables": {
                    "CMAKE_BUILD_TYPE": {
                        "type": "STRING",
                        "value": "Release"
                    },
                    "ENABLE_SOURCE_PACKAGE": {
                        "type": "BOOL",
                        "value": "True"
                    },
                    "ENABLE_BINARY_PACKAGE": {
                        "type": "BOOL",
                        "value": "True"
                    },
                    "ASCEND_COMPUTE_UNIT": {
                        "type": "STRING",
                        "value": "ascendxxx"
                    },
                    "ENABLE_TEST": {
                        "type": "BOOL",
                        "value": "True"
                    },
                    "vendor_name": {
                        "type": "STRING",
                        "value": "customize"
                    },
                    "ASCEND_PYTHON_EXECUTABLE": {
                        "type": "STRING",
                        "value": "python3"
                    },
                    "CMAKE_INSTALL_PREFIX": {
                        "type": "PATH",
                        "value": "${sourceDir}/build_out"
                    },
                    "ENABLE_CROSS_COMPILE": {      // Enable cross compilation. Configure it based on the actual environment.
                        "type": "BOOL",
                        "value": "False"
                    },
                    "CMAKE_CROSS_PLATFORM_COMPILER": {     // Replace it with the actual path after the cross compilation tool is installed.
                        "type": "PATH",
                        "value": "/usr/bin/aarch64-linux-gnu-g++"
                    },
                    "ASCEND_PACK_SHARED_LIBRARY": {
                        "type": "BOOL",
                        "value": "False"
                    }
                }
            }
        ]
    }
    

    Table 1 Common parameters to be configured by developers

    Parameter

    Description

    Default Value

    CMAKE_BUILD_TYPE

    Build mode.

    • Release: The release version does not contain debugging information and is the final version.
    • Debug: The debug version contains debugging information, facilitating development and debugging.

    Release

    ENABLE_SOURCE_PACKAGE

    Whether to enable source code compilation.

    True

    ENABLE_BINARY_PACKAGE

    Whether to enable binary compilation.

    True

    vendor_name

    Name of the vendor to which the custom operator belongs. You are advised to specify the vendor name to avoid conflicts with operator packages provided by other vendors.

    customize

  2. Customize compile options.

    Modify the CMakeLists.txt file in the op_kernel directory of the operator project and use add_ops_compile_options to add compile options.

    add_ops_compile_options(OpType COMPUTE_UNIT soc_version1 soc_version2 ... OPTIONS option1 option2 ...)
    

    Table 2 Parameter description

    Parameter

    Required/Optional

    Description

    OpType

    Required

    The first parameter must be set to the operator type. If the parameter needs to take effect for all operators in the operator project, set it to ALL.

    COMPUTE_UNIT

    Optional

    AI processor models on which the compile options take effect. Separate multiple models by spaces. If this parameter is not set, the configuration takes effect for all AI processor models.

    Note:

    COMPUTE_UNIT configuration:

    • For servers other than the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info command on the server where the Ascend AI Processor is installed to obtain the chip name. The actual value is Ascend followed by the chip name. For example, if the chip name is xxxyy, the actual value is Ascendxxxyy. If Ascendxxxyy is used as a path in the code sample, set this parameter to ascendxxxyy.
    • For the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info -t board -i id -c chip_id command on the server where the Ascend AI Processor is installed to obtain the chip name and NPU name. The actual value is the chip name and the NPU name joined by an underscore (Chip name_NPU name). For example, if the chip name is Ascendxxx and the NPU name is 1234, the actual value is Ascendxxx_1234. If Ascendxxx_1234 is used as a path in the code sample, set this parameter to ascendxxx_1234.
      • id: device ID, which is the NPU ID obtained by running the npu-smi info -l command.
      • chip_id: chip ID, which is obtained by running the npu-smi info -m command.

    OPTIONS

    Required

    Custom compile options. Multiple compile options are separated by spaces.

    Note:
    • Add compile options such as -sanitizer for debugging to enable the msSanitizer tool. For details, see section "Preparations" > "msOpGen Operator Project Compilation Scenarios" in MindStudio Sanitizer User Guide.
      add_ops_compile_options(ALL OPTIONS -sanitizer)
    • Add compile options such as -g for debugging to enable the code call stack and hot spot map functions in the msprof op simulator scenario of the msProf tool.
      add_ops_compile_options(ALL COMPUTE_UNIT Ascendxxxyy OPTIONS -g)
    • Add compile options such as -g -O0 for debugging to enable the msDebug tool.
      add_ops_compile_options(ALL OPTIONS -g -O0)
  3. Run the following command in the operator project directory to compile the operator project:

    ./build.sh
    

    After the compilation is successful, the build_out directory is created in the current directory, and the custom operator installation package custom_opp_<target_os>_<target_architecture>.run is generated in the directory.

    NOTE

    After the operator type is registered, the framework obtains the operator registration information based on the operator type. During compilation and running, it matches the operator implementation file name and kernel function name to the operator type based on fixed rules. To ensure correct matching, the operator type, operator implementation file name, and kernel function name must comply with the rules below.

    Generally, you only need to ensure that the operator type in the JSON prototype definition file is in upper camel case. The code automatically generated when the project is created then meets these rules. If you write the operator prototype definition and operator implementation file manually, name the operator type in upper camel case, starting each word with a single uppercase letter. The operator implementation file name and kernel function name must be the same, and both are obtained by converting the operator type to lowercase words separated by underscores (_).

    The operator type is converted to the operator implementation file name and kernel function name as follows:

    • Replace the first uppercase letter with a lowercase letter. Example: Abc -> abc
    • If a lowercase character or digit is used before an uppercase character, an underscore (_) is inserted before the uppercase character, and the uppercase character is converted to a lowercase character. Example: AbcDef -> abc_def
    • If the character before an uppercase character is an uppercase character and the character after the uppercase character is a lowercase character, underscores (_) are inserted before the uppercase characters and the uppercase characters are converted to lowercase characters. Example: AbcAAc -> abc_a_ac
    • Other uppercase characters are converted to lowercase characters, and lowercase characters remain unchanged.
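The conversion rules above can be sketched in Python as follows. This is an illustrative reading of the four rules, not the framework's actual implementation. The function name op_type_to_file_name is hypothetical.

```python
def op_type_to_file_name(op_type: str) -> str:
    """Convert an upper-camel-case operator type to the underscore form
    described by the rules above (hypothetical helper, for illustration)."""
    out = []
    for i, ch in enumerate(op_type):
        if not ch.isupper():
            # Lowercase characters and digits remain unchanged.
            out.append(ch)
            continue
        prev = op_type[i - 1] if i > 0 else ""
        nxt = op_type[i + 1] if i + 1 < len(op_type) else ""
        # Rule 2: an uppercase character preceded by a lowercase character or digit.
        after_lower_or_digit = prev.islower() or prev.isdigit()
        # Rule 3: an uppercase character between an uppercase and a lowercase character.
        between_upper_and_lower = prev.isupper() and nxt.islower()
        if after_lower_or_digit or between_upper_and_lower:
            out.append("_")
        # Rules 1 and 4: every uppercase character becomes lowercase.
        out.append(ch.lower())
    return "".join(out)


if __name__ == "__main__":
    for name in ("Abc", "AbcDef", "AbcAAc", "AddCustom"):
        print(name, "->", op_type_to_file_name(name))
```

For the AddCustom operator type, the sketch yields add_custom, which matches the add_custom.cpp file name used in the sample project.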
  4. Deploy the operator package.

Deploying the Operator Package

  1. Deploy the operator package.

    In the directory of the custom operator package, run the following command to install the operator package:

    ./custom_opp_<target_os>_<target_architecture>.run --install-path=<path>

    The --install-path option is optional and specifies the installation directory of the custom operator package. An absolute path can be specified. The running user must have the read and write permissions on the specified installation path.
    

    In the following paths, <vendor_name> corresponds to the value of the vendor_name field defined in the CMakePresets.json configuration file of the operator project. If vendor_name is not explicitly configured, the default value customize is used.

    • In the default installation scenario, the --install-path option is not set. After the installation is successful, the custom operator files generated after building are deployed to the ${INSTALL_DIR}/opp/vendors/<vendor_name> directory. Replace ${INSTALL_DIR} with the file storage path after the CANN software is installed. For example, if the Ascend-CANN-Toolkit software package is installed, the default installation directory is $HOME/Ascend/cann.

      NOTE

      The permission on the default installation path ${INSTALL_DIR}/opp/vendors of the custom operator package depends on the installation user and configuration of the CANN package. If the custom operator package fails to be installed due to insufficient permissions, either use the --install-path option and configure the environment variable ASCEND_CUSTOM_OPP_PATH to specify the installation directory (see Installation in a specified directory), or contact the CANN package installation user to modify the permission on the vendors directory. For details about such cases, see "FAQs" > "Failed to Open the config.ini File During Operator Calling and Insufficient Permission During Operator Package Deployment" in Ascend C Operator Development Guide.

    • For a specified directory installation scenario, configure the --install-path option. Upon successful installation, the files associated with the compiled custom operators will be deployed to the <path>/vendors/<vendor_name> directory. Additionally, a set_env.bash script will be created in the <path>/vendors/<vendor_name>/bin directory, which contains the environment variables required for the current custom operator package.

      NOTE

      If the --install-path option is configured to specify the installation directory of the operator package during deployment, run the source <path>/vendors/<vendor_name>/bin/set_env.bash command before using the custom operator. The set_env.bash script adds the installation path of the custom operator package to the environment variable ASCEND_CUSTOM_OPP_PATH so that the custom operator takes effect in the current environment.

    After the command is executed successfully, related files in the custom operator package are deployed in the current environment.

  2. You can view the directory structure after the deployment. The following example is based on the default installation scenario:

    ├── opp    // Operator library directory
    │   ├── vendors    // Directory of custom operators
    │       ├── config.ini
    │       ├── vendor_name1   // Stores the custom operators deployed by vendor vendor_name1. The vendor_name is configured during the build of the custom operator installation package. If vendor_name is not configured, the default value customize is used.
    │           ├── framework     // Custom operator plugin library
    │           ├── op_api
    │           │   ├── include
    │           │   │  └── aclnn_xx.h      // Declaration file of the operator API call
    │           │   └── lib
    │           │       └── libcust_opapi.so
    │           ├── op_impl
    │           │   └── ai_core
    │           │       └── tbe
    │           │           ├── config
    │           │           │   └── ${soc_version}     // AI processor type
    │           │           │       └── aic-${soc_version}-ops-info.json     // Custom operator information library file
    │           │           ├── vendor_name1_impl    // Code file for implementing the custom operator
    │           │           │   └── dynamic
    │           │           │       ├── xx.cpp
    │           │           │       └── xx.py
    │           │           ├── kernel     // Custom operator binary file
    │           │           │   ├── ${soc_version}     // AI processor type
    │           │           │   └── config
    │           │           └── op_tiling
    │           │               ├── lib
    │           │               └── liboptiling.so 
    │           └── op_proto     // Directory of custom operator prototype library
    │                ├── inc
    │                │   └── op_proto.h
    │                └── lib
    │       ├── vendor_name2   // Stores the custom operators deployed by vendor vendor_name2
    

    NOTE

    To obtain the value of ${soc_version}, do as follows:

    • For servers other than the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info command on the server where the Ascend AI Processor is installed to obtain the chip name. The actual value is Ascend followed by the chip name. For example, if the chip name is xxxyy, the actual value is Ascendxxxyy. If the value is used in the path of a code sample, use the lowercase form ascendxxxyy.

    • For the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info -t board -i id -c chip_id command on the server where the Ascend AI Processor is installed to obtain the chip name and NPU name. The actual value is the chip name and the NPU name joined by an underscore (Chip name_NPU name). For example, if the chip name is Ascendxxx and the NPU name is 1234, the actual value is Ascendxxx_1234. If the value is used in the path of a code sample, use the lowercase form ascendxxx_1234.

      • id: device ID, which is the NPU ID obtained by running the npu-smi info -l command.
      • chip_id: chip ID, which is the same as the chip ID obtained by running the npu-smi info -m command.
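The two naming rules above can be summarized in a short Python sketch. It assumes you have already read the chip name (and, for Atlas A3 products, the NPU name) from the npu-smi output yourself. The function soc_version and its parameters are hypothetical names used only for illustration.

```python
from typing import Optional


def soc_version(chip_name: str, npu_name: Optional[str] = None,
                for_sample_path: bool = False) -> str:
    """Compose the soc_version value from npu-smi fields (illustrative only)."""
    if npu_name is None:
        # Servers other than Atlas A3 products: "Ascend" + chip name.
        value = "Ascend" + chip_name
    else:
        # Atlas A3 products: chip name and NPU name joined by an underscore.
        # The chip name reported there already starts with "Ascend".
        value = f"{chip_name}_{npu_name}"
    # Code sample paths use the lowercase form.
    return value.lower() if for_sample_path else value


if __name__ == "__main__":
    print(soc_version("xxxyy"))                        # Ascendxxxyy
    print(soc_version("Ascendxxx", "1234"))            # Ascendxxx_1234
    print(soc_version("xxxyy", for_sample_path=True))  # ascendxxxyy
```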
  3. Configure the priorities of custom operators.

    If multiple operator packages coexist and custom operators with the same OpType exist in different operator package directories, the operator in the operator package directory with the highest priority is used. The following describes how to configure the operator package priority.

    • Default installation scenario

      If custom operators of multiple vendors exist in the opp/vendors directory, you can set the priority of the custom operator packages by editing the config.ini file in the opp/vendors directory.

      The following provides a configuration template of config.ini.

      load_priority=vendor_name1,vendor_name2,vendor_name3
      
      • load_priority: keyword of the priority sequence configuration, which cannot be modified.
      • vendor_name1,vendor_name2,vendor_name3: priority sequence of custom operator vendors, which is arranged in descending order of priority.
    • Installation in a specified directory

      In an installation scenario with a specified directory, if multiple custom operator packages need to take effect, run the set_env.bash script in the installation directory of each operator package. Each time the script is executed, the installation directory of the current operator package is added to the beginning of the ASCEND_CUSTOM_OPP_PATH environment variable. The priority is therefore determined by the script execution sequence: the later a script is executed, the higher the priority of its operator package.

      For example, if source <path>/vendor_name1/bin/set_env.bash is executed before source <path>/vendor_name2/bin/set_env.bash, the priority of the vendor_name2 operator package is higher than that of the vendor_name1 operator package. The following is an example of ASCEND_CUSTOM_OPP_PATH:

      ASCEND_CUSTOM_OPP_PATH=<path>/vendor_name2:<path>/vendor_name1
      
    • The priority of the operator package installed in a specified directory is higher than that of the operator package installed in the default mode.
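The priority rules above can be modeled in a few lines of Python. This is a sketch of the behavior described in this section, not the actual CANN lookup code. The function names are hypothetical, and the paths are the placeholders used in the examples above.

```python
from typing import List


def parse_load_priority(config_text: str) -> List[str]:
    """Read the vendor order from the text of a config.ini file."""
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "load_priority":
            return [v.strip() for v in value.split(",") if v.strip()]
    return []


def prepend_custom_opp_path(current: str, package_dir: str) -> str:
    """Model the described effect of set_env.bash: the package directory is
    added to the beginning of ASCEND_CUSTOM_OPP_PATH."""
    return package_dir if not current else f"{package_dir}:{current}"


def effective_order(custom_opp_path: str, load_priority: List[str]) -> List[str]:
    """Packages in a specified directory rank above default-mode packages."""
    specified = [p for p in custom_opp_path.split(":") if p]
    return specified + load_priority


if __name__ == "__main__":
    path = ""
    path = prepend_custom_opp_path(path, "<path>/vendor_name1")  # sourced first
    path = prepend_custom_opp_path(path, "<path>/vendor_name2")  # sourced later
    vendors = parse_load_priority("load_priority=vendor_name1,vendor_name2,vendor_name3")
    print(path)
    print(effective_order(path, vendors))
```

The first entry of the resulting list is the package whose operator is used when several packages define the same OpType.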

  4. Use msOpST to test the operator kernel and verify the operator functions.

  5. Use msSanitizer to detect operator memory and exceptions and identify operator accuracy exceptions.

  6. Use msDebug to debug the operator on the board and further confirm the operator accuracy exception.

  7. Use msOpProf to generate the compute memory heatmap, instruction pipeline chart, and operator instruction hot spot map to further optimize operator performance.

  8. After finishing the preceding steps, ensure that the operator accuracy and performance meet the delivery requirements.

Output Description

Viewing the Operator Simulation Pipeline

msOpGen parses dump files generated by users, and generates operator simulation pipeline files (trace.json).

  1. Run the install.sh file in the ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch directory and generate the CustomOp folder. For details, see the document.

    NOTE

    This sample project does not support Atlas A3 training products, Atlas A3 inference products, or Atlas training products.

    ./install.sh -v Ascendxxxyy   # xxxyy indicates the type of the chip used by the user.
    
  2. Build the operator project.

    1. Complete build configurations by referring to Preparations.

    2. Run the following command in the CustomOp operator project directory to build the operator project:

      NOTE

      To generate an operator simulation pipeline, change the value of CMAKE_BUILD_TYPE in the CMakePresets.json file in the current directory to Debug before running the build.

      ./build.sh

      After the build is complete, the .run operator package is generated in the build_out directory.
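If you prefer to script the CMAKE_BUILD_TYPE change rather than edit CMakePresets.json by hand, a minimal Python sketch is shown below. It assumes the file follows the standard CMake presets layout (configurePresets with cacheVariables), where a cache variable is either a plain string or an object with a value field. The layout of your generated file may differ, and set_build_type is a hypothetical helper.

```python
import json


def set_build_type(presets: dict, build_type: str = "Debug") -> dict:
    """Set CMAKE_BUILD_TYPE in every configure preset (illustrative only)."""
    for preset in presets.get("configurePresets", []):
        cache = preset.setdefault("cacheVariables", {})
        current = cache.get("CMAKE_BUILD_TYPE")
        if isinstance(current, dict):
            # Object form: {"type": "STRING", "value": "..."}
            current["value"] = build_type
        else:
            # Plain string form, or variable not present yet.
            cache["CMAKE_BUILD_TYPE"] = build_type
    return presets


if __name__ == "__main__":
    # In-memory sample; with a real project, load and save CMakePresets.json
    # using json.load() and json.dump() instead.
    sample = {
        "configurePresets": [
            {
                "name": "default",
                "cacheVariables": {
                    "CMAKE_BUILD_TYPE": {"type": "STRING", "value": "Release"}
                },
            }
        ]
    }
    print(json.dumps(set_build_type(sample), indent=2))
```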
      
  3. In the directory where the custom operator package is stored, run the following command to deploy the operator package:

    ./build_out/custom_opp_<target_os>_<target_architecture>.run
    
  4. Switch to the ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/AclNNInvocation directory and run the following command:

    ./run.sh
    
  5. Set the following environment variable, and then perform simulation by referring to "Tool Usage" > "msprof op simulator" in msProf User Guide to generate dump data.

    export LD_LIBRARY_PATH=${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/CustomOp/build_out/op_host/:$LD_LIBRARY_PATH
    
  6. Generate an operator simulation pipeline file.

    Run the following command. For details about the parameters, see Table 1 Parameter description.

    msopgen sim -c core{id} -d xx/{path of dump data} -subc {sub core id} -out {output path} -reloc {path of .o file or executable file} 
    

    Table 1 Parameter description

    Parameter

    Description

    Required/Optional

    sim

    Simulation operation.

    Note:

    The msopgen sim command will be removed from the next version of MindStudio. After the removal, you can use the simulation capability provided by msOpProf. For details, see "Tool Usage" in msProf User Guide.

    Required

    -c, --core-id

    Core ID.

    Processor ID, for example, core0.

    Required

    -d, --dump-dir

    Dump file path, which can be either absolute or relative.

    Required

    -subc, --subcore_id

    Sub-core ID. A single sub-core can be displayed.

    Specify the dump file to be parsed when the dump file name contains veccore{id} or cubecore{id}. For example, if the file name is core0.veccore0.instr_log.dump, then veccore0 corresponds to the subcore ID.

    Select either one.

    Note:

    Only for Atlas A3 training/Atlas A3 inference products and Atlas A2 training/Atlas A2 inference products.

    -mix, --mixcore-mode

    Mix operator.

    -reloc, --relocatable-file

    Path of the .o file or executable file generated by compiling the operator on the kernel side.

    When this option is set, the pipeline chart is mapped to the corresponding lines of code, and a .csv file containing the code lines and instruction execution time is generated.

    Note:

    Use a .o file compiled from an operator project that contains debug information (path: ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/CustomOp/build_out/op_kernel/binary/ascendxxxyy/add_custom/AddCustom_*.o). To include debug information, set CMAKE_BUILD_TYPE to Debug in the CMakePresets.json file. For details, see the build procedure.

    Optional

    -out, --output

    Output file path, which can be configured as an absolute or relative path. The executing user must have read and write permissions.

    Required

    -h, --help

    Help information.

    Optional

    Run the following commands.

    Example 1:

    msopgen sim -c core0 -d xx/{model}/ca/add_custom/add_custom_pre_static_add_custom -out ./output_data -subc cubecore0 -reloc xx/.o
    
    • -c specifies the core ID of the dump data file to be parsed, for example, core0.
    • -d specifies the path of the dump data file generated in the performance simulation environment, for example, {model}/ca/add_custom/add_custom_pre_static_add_custom.
    • -subc specifies the subcore ID of the dump file to be parsed. For example, if the file name is core0.cubecore0.instr_log.dump, cubecore0 is the subcore ID. (This parameter is required only for Atlas A3 training products/Atlas A3 inference products and Atlas A2 training products/Atlas A2 inference products.)
    • -reloc specifies the path of the .o file or executable file generated by compiling the operator on the kernel side.

    Example 2:

    msopgen sim -c core0 -d xx/{model}/ca/add_custom/add_custom_pre_static_add_custom -out ./output_data -mix
    
    • -c specifies the core ID of the dump data file to be parsed, for example, core0.
    • -d specifies the path of the dump data file generated in the performance simulation environment, for example, {model}/ca/add_custom/add_custom_pre_static_add_custom.
    • -mix specifies that the Mix operator is displayed.
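When the command is assembled from a script, the rules in Table 1 (in particular, that -subc and -mix are alternatives) can be checked before the tool is launched. The following Python sketch builds the argument list using only the options documented above. build_sim_command is a hypothetical helper, not part of msOpGen.

```python
from typing import List, Optional


def build_sim_command(core_id: str, dump_dir: str, output: str,
                      subcore_id: Optional[str] = None, mix: bool = False,
                      reloc: Optional[str] = None) -> List[str]:
    """Assemble an 'msopgen sim' argument list (illustrative only)."""
    if subcore_id and mix:
        # Table 1: select either -subc or -mix, not both.
        raise ValueError("-subc and -mix are alternatives; select either one")
    cmd = ["msopgen", "sim", "-c", core_id, "-d", dump_dir, "-out", output]
    if subcore_id:
        cmd += ["-subc", subcore_id]
    if mix:
        cmd.append("-mix")
    if reloc:
        cmd += ["-reloc", reloc]
    return cmd


if __name__ == "__main__":
    # The dump path and the .o path below are placeholders.
    print(" ".join(build_sim_command("core0", "dump_dir", "./output_data",
                                     subcore_id="cubecore0", reloc="kernel.o")))
    print(" ".join(build_sim_command("core0", "dump_dir", "./output_data", mix=True)))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where msopgen is installed.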
  7. View the operator simulation pipeline file.

    Enter chrome://tracing in the Chrome address bar, drag the dump2trace_core*.json file in the output path to the blank area to open it, and press the shortcut keys (W: zoom in; S: zoom out; A: move left; D: move right) to view the file.

    Figure 1 Display of a single sub-core

    Figure 2 Display of the Mix operator

    Table 2 Field description

    Field       Description
    VECTOR      Vector unit.
    SCALAR      Scalar unit.
    CUBE        Matrix multiplication unit.
    MTE1        Data transfer pipeline, from L1 to {L0A/L0B, UBUF}.
    MTE2        Data transfer pipeline, from {DDR/GM, L2} to {L1, L0A/B, UBUF}.
    MTE3        Data transfer pipeline, from UBUF to {DDR/GM, L2, L1}.
    FIXP        Data transfer pipeline, from FIXPIPE L0C to OUT/L1. (This field is displayed only for Atlas A3 training products/Atlas A3 inference products and Atlas A2 training products/Atlas A2 inference products.)
    FLOWCTRL    Flow control instruction.
    ICmiss      iCache miss.

  8. View the time consumption file of code lines or instructions.

    Open the time consumption file of code lines {Core_ID}_code_exe_prof.csv in the output path, as shown in the following figure.

    Figure 3 Time consumption file of code lines

    Open the time consumption file of instructions {Core_ID}_instr_exe_prof.csv in the output path, as shown in the following figure.

    Figure 4 Time consumption file of instructions

    The call count and cycles fields in these files show the number of times that a code line or instruction is called and its accumulated duration.
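To find the most expensive code lines or instructions quickly, the time consumption file can be sorted by the cycles field. The Python sketch below shows one way to do this. The sample rows and the "code line" column name are made up for illustration, and the exact header names in your generated .csv file may differ, so pass the real column name through cycles_field if needed.

```python
import csv
import io
from typing import Dict, List


def top_by_cycles(csv_text: str, n: int = 3,
                  cycles_field: str = "cycles") -> List[Dict[str, str]]:
    """Return the n rows with the largest accumulated cycles (illustrative only)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: float(row[cycles_field]), reverse=True)
    return rows[:n]


# Invented sample content; a real file is read with open(path, newline="").
SAMPLE = """code line,call count,cycles
add_custom.cpp:40,8,1200
add_custom.cpp:52,8,5400
add_custom.cpp:61,1,300
"""

if __name__ == "__main__":
    for row in top_by_cycles(SAMPLE, n=2):
        print(row["code line"], row["call count"], row["cycles"])
```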

Operator Test (msOpST)

Introduction

After using msOpGen to deploy the custom operator package, you can use MindStudio Ops System Test (msOpST) to run system tests (STs), which exercise the inputs and outputs of an operator in a real hardware environment to verify the operator functions.

Test cases usually include various types of data inputs and expected outputs, as well as tests for boundary situations and exceptions. The ST ensures that the operator can function and run properly in actual applications.

Preparations

Environment Requirements

Before developing an operator, install the CANN Toolkit and ops operator package of the required version and configure CANN environment variables. For details, see CANN Software Installation Guide. No installation example is provided in this section. After the configuration is complete, you can directly use the functions of the msOpST tool.

Constraints

Operator Test Functions

Function Description

msOpST can generate ST cases for operators and execute the cases in the hardware environment. It provides the following functions:

  • Displays the comparison result between the expected operator output and the actual output based on the user-defined function for generating expected operator output. For details, see Generating a Test Case Definition File.
  • Generates ST data and test case execution code based on the operator test case definition file, and executes the operator test cases on the hardware environment. For details, see Generating and Executing Test Cases.
  • Automatically generates a running report (st_report.json) that records the test case information and running status in each stage. For details, see Generating and Executing Test Cases.
  • Automatically generates the on-board test framework for calling the kernel function to test and verify the operator. For details, see Generating a Single-Operator On-board Test Framework.

Precautions

None

Command Syntax

Run the following command to generate the operator test case definition file. For details about the parameters, see Table 1 Parameters for generating the operator test case definition file.

msopst create -i {operator.cpp file} -out {output path} -m {pb file} -q

Parameter Description

  • Generate an operator test case definition file.

    Table 1 Parameters for generating an operator test case definition file

    Parameter

    Description

    Required/Optional

    create

    Creates an operator test case definition file (.json).

    Required

    -i, --input

    Path to the host-side operator implementation file (*.cpp), which can be specified as either an absolute path or a relative path.

    Required

    -out, --output

    Path where the generated file will be stored. This path can be configured as an absolute path or a relative path. The user executing the tool must have read and write permissions for the specified path.

    If this option is not specified, the file is generated in the current path where the command is executed by default.

    Optional

    -m, --model

    Path to the TensorFlow model file, configurable as either an absolute path or a relative path.

    If this option is set, the tool retrieves the shape information of the first-layer operator from the TensorFlow model file and automatically dumps the operator's shape, dtype, and attribute values defined in the operator information library definition file. If the dumped value falls within the range defined in the operator information library definition file, it will be automatically populated into the generated operator test case definition file; otherwise, an error will be reported.

    Note:

    If this option is set, TensorFlow version 1.15 or 2.6.5 must be installed in the system.

    Optional

    -q, --quiet

    Specifies whether to disable human-machine interaction. In the current version, this option takes effect only when used with -m.

    If this option is not configured, the system prompts you to modify the shape information of the first layer in the obtained model.

    If this option is configured, the system does not prompt you to modify the shape information of the first layer.

    Optional

    -h, --help

    Help information.

    Optional

  • Generate and execute test cases.

    Table 2 Parameters for generating and executing test cases

    Parameter

    Description

    Required/Optional

    run

    Runs the ST cases of an operator.

    Required

    -i, --input

    Path to the operator test case definition file, which can be specified as an absolute or relative path. For details, see the msOpST test case definition file in the MindStudio Ops Generator Typical Cases.

    Note:

    The JSON file supports up to 1000 test cases.

    Required

    -soc, --soc_version

    AI processor type.

    Note:
    • For servers other than the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info command on the server where the Ascend AI Processor is installed to obtain the chip name. The actual value is Ascend followed by the chip name. For example, if the chip name is xxxyy, the actual value is Ascendxxxyy. If the value is used in the path of a code sample, use the lowercase form ascendxxxyy.
    • For the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info -t board -i id -c chip_id command on the server where the Ascend AI Processor is installed to obtain the chip name and NPU name. The actual value is the chip name and the NPU name joined by an underscore (Chip name_NPU name). For example, if the chip name is Ascendxxx and the NPU name is 1234, the actual value is Ascendxxx_1234. If the value is used in the path of a code sample, use the lowercase form ascendxxx_1234.
      • id: device ID, which is the NPU ID obtained by running the npu-smi info -l command.
      • chip_id: chip ID, which is the same as the chip ID obtained by running the npu-smi info -m command.

    Required

    -out, --output

    Path where the generated file will be stored. This path can be configured as an absolute path or a relative path. The user executing the tool must have read and write permissions for the specified path. If this option is not specified, the outputs are generated to the current path where the command is executed.

    Optional

    -c, --case_name

    • Names of the cases to run; for multiple cases, use commas as separators.
    • If it is set to all or not specified, all cases are executed.

    Optional

    -d, --device_id

    NPU device ID: ID of the AI processor on which to run ST cases.

    If this option is not set, the default value 0 is used.

    Optional

    -err_thr, --error_threshold

    Custom precision criteria as a two-element list: "[threshold1, threshold2]".

    • threshold1: threshold of the error between the operator output result and the benchmark data. If the error is greater than this value, the data is recorded as error data.
    • threshold2: threshold of the ratio of error data to all data. If the actual ratio is less than this threshold, the accuracy is deemed acceptable. Otherwise, the precision does not meet the requirement.

    If this option is not set, the default value is "[0.01,0.05]".

    Value range: "[0.0,1.0]".

    Note:
    • Enclose the configuration list in quotation marks to avoid potential issues, for example, set the option as -err_thr "[0.01,0.05]".
    • If this option is set both in the JSON file of the test case and in the corresponding msOpST command, the value set in the msOpST command is used for comparison.
    • If this option is not set in either of them, the default precision standard [0.01, 0.05] set in the msOpST command is used for comparison.

    Optional

    -conf, --config_file

    Path to the ST advanced feature configuration file (msopst.ini), which can be specified as an absolute or relative path.

    Users can implement the following advanced features by modifying the msopst.ini configuration file.

    • Edit the ST source code.
    • Execute the edited ST source code.
    • Set the environment variable for the host log level.
    • Set whether to display logs on the console.
    • Set the log level for ATC model conversion.
    • Set the OS type and architecture of the operating environment for ATC model conversion.
    • Set the model accuracy.
    • Read profile data of the compute operator running on the AI processor.

    If --config_file is not specified, the model will forcibly use FP16 precision. For details about the msopst.ini file, see Table 1 Parameters in the msopst.ini file.

    Optional

    -err_report, --error_report

    Specifies whether to save, for cases that fail the comparison, the data for which the operator's expected output does not match the actual execution result. If this option is not specified, the default value false is used.

    • true: For cases that fail the comparison, the mismatched data between the expected results and the actual execution outputs is saved to the {case.name}_error_report.csv file.
    • false: No comparison failure data is saved.
      Note:
      • When set to true, the comparison data generates a separate CSV file per case name, with the {case.name}_error_report.csv file located in the {output_path}/{time_stamp}/{op_type}/run/out/test_data/data/st_error_reports directory.
      • The maximum number of lines per CSV file is 50,000; when exceeded, additional rows are saved to new sequentially named files (e.g., {case.name}_error_report0.csv).

    Optional

    -h, --help

    Help information.

    Optional

  • Generate a single-operator on-board test framework.

    Table 3 Parameters for generating a single-operator on-board test framework

    Parameter

    Description

    Required/Optional

    ascendc_test

    Test code for Ascend C operator kernel function on-board verification.

    Required

    -i, --input

    Path to the operator test case definition file (*.json), which can be specified as an absolute or relative path.

    Note:
    • The specified operator ST case definition file (*.json) supports only one test case.
    • Multiple types, formats, and shapes cannot be configured in the test case.

    Required

    -kernel, --kernel_file

    Path to the kernel-side implementation file (*.cpp) of the Ascend C operator, which can be specified as an absolute or relative path.

    Required

    -out, --output

    Output path for test framework code, which can be specified as an absolute or relative path. The user who executes the tool must have the read and write permissions on the path.

    Optional

    -h, --help

    Help information.

    Optional
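The precision criteria of the -err_thr option in Table 2 can be illustrated with a short Python sketch. It treats an element as error data when its error exceeds threshold1 and accepts the result when the ratio of error data is less than threshold2. The error metric itself is an assumption: relative error is used here, and msOpST may compute the error differently. compare_outputs is a hypothetical helper.

```python
from typing import Sequence, Tuple


def compare_outputs(actual: Sequence[float], expected: Sequence[float],
                    error_threshold: Tuple[float, float] = (0.01, 0.05)
                    ) -> Tuple[bool, float]:
    """Return (passed, error_ratio) for a [threshold1, threshold2] criterion."""
    threshold1, threshold2 = error_threshold
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    error_count = 0
    for a, e in zip(actual, expected):
        # Assumption: relative error, with a small floor to avoid division by zero.
        error = abs(a - e) / max(abs(e), 1e-9)
        if error > threshold1:
            error_count += 1  # recorded as error data
    ratio = error_count / len(expected) if expected else 0.0
    # Acceptable only if the ratio of error data is less than threshold2.
    return ratio < threshold2, ratio


if __name__ == "__main__":
    expected = [1.0] * 100
    actual = [1.0] * 98 + [1.5, 1.5]  # two elements exceed threshold1
    print(compare_outputs(actual, expected))
```

With the default "[0.01,0.05]", two error elements out of 100 give a ratio of 0.02, which is below 0.05, so the comparison passes.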

Example

Generating a Test Case Definition File

This section describes how to use the msOpST tool to generate the operator test case definition file (.json) as the input of the operator ST cases.

  1. Obtain and edit the implementation file (.cpp file) of the operator to be tested on the host.

    The msOpST tool generates the ST case definition file of the operator based on the implementation file of the operator to be tested on the host. The following describes the implementation file path of the operator on the host in the operator project file.

    You can obtain the host-side operator implementation file add_custom.cpp for reference.

    NOTE

    This sample project does not support Atlas A3 training products and Atlas A3 inference products.

    ├── framework/tf_plugin        // Directory for storing the implementation file of the operator plugin. The generation of single-operator model files does not depend on the operator plugin and can be ignored.
    ├── op_host                      // Implementation file on the host.
    │   ├── add_custom_tiling.h    // Operator tiling definition file.
    │   ├── add_custom.cpp         // Content file for operator prototype registration, shape derivation, information library, and tiling implementation.
    ├── op_kernel                   // Implementation file on the kernel
    │   ├── CMakeLists.txt   
    │   ├── add_custom.cpp        // Operator implementation file
    
  2. Run the following command to generate the operator test case definition file. For details about the parameters, see Table 1 Parameters for generating the operator test case definition file.

    msopst create -i {operator.cpp file} -out {output path} -m {pb file} -q
    

    NOTE

    The following command uses the AddCustom operator as an example:

    msopst create -i Op_implementation/add_custom.cpp -out ./output
    

    Replace Op_implementation with the path of the implementation file of the operator on the host. After the command is executed successfully, the operator test case definition file AddCustom_case_timestamp.json is generated in the output directory of the current path.

  3. Create the operator ST case definition file AddCustom_case.json. The following is the template of the file. You can modify the file based on the template. For details about all fields supported by the *.json file, see Operator test case definition file (.json). For details about the test case definition files in different scenarios, see "msOpST Test Case Definition File" in MindStudio Ops Generator Typical Cases.

    [
        {
            "case_name": "Test_OpType_001",
            "op": "OpType",
            "input_desc": [
                {
                    "format": [],
                    "type": [],
                    "shape": [],
                    "data_distribute": [
                        "uniform"
                    ],
                    "value_range": [
                        [
                            0.1,
                            1.0
                        ]
                    ]
                }
            ],
            "output_desc": [
                {
                    "format": [],
                    "type": [],
                    "shape": []
                }
            ]
        }
    ]
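The template can also be filled in programmatically. The Python sketch below builds one test case and checks the rule that every input_desc entry must list the same number of values for each field. The operator name AddCustom comes from the sample above, while the format, type, and shape values are illustrative assumptions. Both helper functions are hypothetical.

```python
import json
from typing import Dict, List


def make_desc(fmt: List[str], dtype: List[str], shape: List[List[int]]) -> Dict:
    """Build one input_desc or output_desc entry."""
    return {"format": fmt, "type": dtype, "shape": shape}


def inconsistent_fields(case: Dict) -> List[str]:
    """Return the fields whose number of values differs between input_desc entries."""
    fields = ("format", "type", "shape", "data_distribute", "value_range")
    bad = []
    for field in fields:
        counts = {len(d[field]) for d in case["input_desc"] if field in d}
        if len(counts) > 1:
            bad.append(field)
    return bad


def make_add_custom_case() -> Dict:
    inputs = []
    for _ in range(2):  # AddCustom takes two inputs
        desc = make_desc(["ND"], ["float16"], [[8, 2048]])  # assumed values
        desc["data_distribute"] = ["uniform"]
        desc["value_range"] = [[0.1, 1.0]]
        inputs.append(desc)
    return {
        "case_name": "Test_AddCustom_001",
        "op": "AddCustom",
        "input_desc": inputs,
        "output_desc": [make_desc(["ND"], ["float16"], [[8, 2048]])],
    }


if __name__ == "__main__":
    case = make_add_custom_case()
    assert not inconsistent_fields(case)
    print(json.dumps([case], indent=4))
```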
    

    Table 1 Operator test case definition file (.json)

    Parameter

    Description

    case_name

    -

    Required.

    String.

    Name of a test case.

    op

    -

    Required.

    String. Operator type. Must not be empty.

    error_threshold

    -

    Optional.

    Custom precision criteria as a two-element list: "[threshold1, threshold2]".

    • threshold1: threshold of the error between the operator output result and the benchmark data. If the error is greater than this value, the data is recorded as error data.
    • threshold2: threshold of the ratio of error data to all data. If the actual ratio is less than this threshold, the accuracy is deemed acceptable. Otherwise, the precision does not meet the requirement.

    If this option is not set, the default value is "[0.01,0.05]".

    Value range: "[0.0,1.0]".

    Note:
    • Enclose the configuration list in quotation marks to avoid potential issues, for example, set the option as -err_thr "[0.01,0.05]".
    • If this option is set both in the JSON file of the test case and in the corresponding msOpST command, the value set in the msOpST command is used for comparison.
    • If this option is not set in either of them, the default precision standard [0.01, 0.05] set in the msOpST command is used for comparison.

    st_mode

    -

    Optional.

    String.

    ST mode. The value can be "ms_python_train" (MindSpore operator project, Atlas training products only) or "pt_python_train" (PyTorch operator project).

    run_torch_api

    -

    Optional.

    Torch API used to invoke the operator, for example, torch.square, where square is the API name. Configure this parameter according to your specific use case.

    expect

    -

    Optional.

    Expected test result. It can be either of the following values. The default value is success.

    • success: indicates that the test case is expected to run successfully. If the model conversion fails, the process will be terminated. You can view ATC logs to locate the fault.
    • failed: indicates that the test case is expected to fail. If you need to run an abnormal case, change the value of the expect field to failed. If the model conversion fails, the process continues.

    In the statistical results, cases are categorized into "success count" if the status matches the expected value in STCaseReport, or "failed count" otherwise.

    fuzz_impl

    -

    Optional. String.

    Required if you need to use a fuzzing script to generate a large number of test cases. In this case, add this parameter and set it to the absolute or relative path of the fuzzing script followed by the function name, in the format "script path:function name".

    Note:

    You are not advised to invoke the fuzzing script in other users' directories to avoid privilege escalation risks.

    fuzz_case_num

    -

    Optional.

    Int.

    If fuzz_impl is configured, add this parameter to specify the number of test cases that the fuzzing script generates. Value range: 1–2000.

    input_desc

    -

    Required.

    Operator input description.

    Note:

    The number of parameter values in all input_desc entries must be consistent; otherwise, test case generation will fail.

    For example, if input1 supports two formats, input2 should also support two formats.

    Similarly, the number of values for type, shape, data_distribute, and value_range must remain consistent across all input_x entries.

    -

    name

    Optional.

    When dynamic multi-input is used, name is required and must be set as the corresponding inputx.name parameter value from the operator information library appended with an index starting from 0 and incrementing sequentially (0, 1, 2, ...) based on the number of inputs.

    For example, if the number of inputs specified in the operator information file is 4, four input descriptions need to be configured in input_desc. The names are xxx0, xxx1, xxx2, and xxx3, where xxx indicates the name of the input parameter.

    For details about the configuration example in the dynamic multi-input scenario, see "Test Case Definition File" >"Uncertain Number of Inputs of an Operator (Dynamic Multi-Input Scenario)" in Typical Cases of MindStudio Ops Generator.

    -

    format

    Required.

    String or string array.

    Format of the input tensor data. The value cannot be empty.

    The common data layout formats are as follows:
    • NCHW
    • NHWC
    • ND: indicates that any format is supported.
    • NC1HWC0: 5D data format. C0 is closely related to the micro-architecture, and the value is equal to the Cube Unit size, for example, 16. C1 is obtained by dividing the C dimension by C0, that is, C1 = C/C0. When the division is not exact, the last data segment is padded to C0.
    • FRACTAL_Z: format of the convolution weight.
    • FRACTAL_NZ: a fractal format. The format of the output matrix is NW1H1H0W0 during Cube Unit computation. A matrix is divided into (H1 × W1) fractals in column-major order, which looks like an N-shape layout. Each fractal consists of (H0 × W0) elements in row-major order, resembling a z-shaped layout. Thus the NW1H1H0W0 format is referred to as the Nz format. H0 and W0 indicate the size of a fractal, as shown in the following figure.

    • RESERVED: reserved. If this value is used, type must be UNDEFINED, indicating that the corresponding operator input is optional.
    • fuzz: Use the fuzzing script to automatically generate values in batches.

    -

    ori_format

    Optional.

    String or string array:

    • Set it to the original format of the input data.

      This field is required when the format implemented by the operator is different from the original format. If this field is not set, the format implemented by the operator is the same as the original format by default.

    • Set it to "fuzz" to automatically generate test cases in batches by using the fuzzing script.

    -

    type

    Required.

    String or string array.

    Input data type:

    • bool
    • int8
    • uint8
    • int16
    • uint16
    • int32
    • int64
    • uint32
    • uint64
    • float16
    • float32
    • float
    • bfloat16 (only supported by Atlas A3 training products/Atlas A3 inference products and Atlas A2 training products/Atlas A2 inference products)
    • UNDEFINED: The input type of the operator is optional.
    • fuzz: automatically generates test cases in batches by using the fuzzing script.

    For details about the configuration example when the input data type is complex number, see "Test Case Definition File" > "Complex Input and Output Types of an Operator" in Typical Cases of MindStudio Ops Generator.

    -

    shape

    Required.

    • Int. 1D or 2D array.

      Supported shapes of the input tensor:

      • Static shape:

        The shape dimension and value are fixed. The shape_range parameter does not need to be configured.

      • Dynamic shape:

        The shape contains -1. For example, (200, -1) indicates that the length of the second axis is unknown. In this scenario, this parameter must be used together with shape_range to specify the value range that -1 supports.

    • String type: "fuzz".

      Supports fuzzing and uses the fuzzing script to automatically generate values in batches.

    • Empty

      If the format and type are undefined, the shape can be empty.

    Note that the configured shape must match the format.

    -

    ori_shape

    Optional.

    • Int. 1D or 2D array.

      Original input shape. This parameter is required if the original operator shape is not consistent with the implemented one.

    • String type, "fuzz".

      Supports fuzzing and uses the fuzzing script to automatically generate values in batches.

    If this parameter is not set, the shape implemented by the operator is the same as the original shape by default.

    -

    typical_shape

    Optional.

    • Int. 1D or 2D array.

      Shape used for testing.

      If the configured shape parameter contains -1, you need to add the typical_shape parameter to the operator test case definition file and specify the static shape value for actual testing.

    • String type, "fuzz".

      Supports fuzzing and uses the fuzzing script to automatically generate values in batches.

    -

    shape_range

    Optional.

    • Int. 1D or 2D array.

      When the operator supports dynamic shape, this parameter indicates the supported shape range.

      Default value: [[1,-1]]. It indicates that the shape ranges from 1 to infinity.

      For example, if shape is set to (200, -1) and shape_range is set to [[1,-1]], the value of the second dimension of the shape ranges from 1 to infinity.

    • String type, "fuzz".

      Supports fuzzing and uses the fuzzing script to automatically generate values in batches.

    -

    is_const

    Optional.

    Bool.

    • true: configures test cases with constant inputs.
    • false: configures test cases with tensor inputs.

    For details about the configuration example when the input is a constant, see "Test Case Definition File" > "Complex Input and Output Types of an Operator" in Typical Cases of MindStudio Ops Generator.

    -

    data_distribute

    Required.

    String or string array.

    Data distribution mode used to generate test data. The following distribution modes are supported:
    • uniform: returns random values that are evenly distributed.
    • normal: returns random values of the normal distribution (Gaussian distribution).
    • beta: returns random values of the beta distribution.
    • laplace: returns random values of Laplace distribution.
    • triangular: returns random values of the triangular distribution.
    • relu: returns random values that are evenly distributed and activated by the ReLU function.
    • sigmoid: returns random values that are evenly distributed and activated by the sigmoid function.
    • softmax: returns random values that are evenly distributed and activated by the softmax function.
    • tanh: returns random values that are evenly distributed and activated by the tanh function.
    • fuzz: automatically generates test cases in batches by using the fuzzing script.

    -

    value_range

    Required.

    • Int or float type. 1D or 2D array.

      Value range. The value cannot be empty.

      [min_value, max_value] and min_value <= max_value.

    • String type, "fuzz".

      Supports fuzzing and uses the fuzzing script to automatically generate values in batches.

    -

    value

    Optional.

    String or tensor array.

    If you need to specify the input data, add this parameter. There are two configuration methods:
    • Directly enter the tensor data. For example, the tensor value is [1,2,3,4].

      "value": [1,2,3,4]

    • Specify the path of a binary file (for example, test.bin):

      "value": "../test.bin"

      The binary data file .bin needs to be prepared by the user. The path can be an absolute path or a relative path of the test case definition file.

    • Set it to "fuzz" to automatically generate test cases in batches by using the fuzzing script.
      Note:

      If the value parameter is added, the data_distribute and value_range parameters will be ignored. In addition, ensure that value matches format, type, and shape. Each test case supports one data type only. For details about the configuration example, see "Test Case Definition File" > "Specifying a Fixed Input" in Typical Cases of MindStudio Ops Generator.

    output_desc

    -

    Required.

    Operator output description.

    Note:

    The number of parameter values in output_desc must be the same as that in input_desc; otherwise, test case generation will fail.

    Example: If inputx supports 2 format types, the output must also support 2 format types.

    -

    name

    Optional. String.

    Output parameter name.

    When dynamic multi-output is used, name is required and must be set as the corresponding outputx.name parameter value from the operator information library appended with an index starting from 0 and incrementing sequentially (0, 1, 2, ...) based on the number of outputs.

    Example: If the number of outputs specified in the operator information file is 4, four output descriptions need to be configured in output_desc. The names are xxx0, xxx1, xxx2, and xxx3, where xxx indicates the name of the output parameter.

    -

    format

    Required.

    String or string array.

    Format of the output tensor data. The value cannot be empty.

    The following data layout formats are supported:

    • NCHW
    • NHWC
    • ND: indicates that any format is supported.
    • NC1HWC0: 5D data format. C0 is closely related to the micro-architecture, and the value is equal to the Cube Unit size, for example, 16. C1 is obtained by dividing the C dimension by C0, that is, C1 = C/C0. When the division is not exact, the last data segment is padded to C0.
    • FRACTAL_Z: format of the convolution weight.
    • FRACTAL_NZ: a fractal format. The format of the output matrix is NW1H1H0W0 during Cube Unit computation. A matrix is divided into (H1 × W1) fractals in column-major order, which looks like an N-shape layout. Each fractal consists of (H0 × W0) elements in row-major order, resembling a z-shaped layout. Thus the NW1H1H0W0 format is referred to as the Nz format. H0 and W0 indicate the size of a fractal, as shown in the following figure.

    • fuzz: Use the fuzzing script to automatically generate values in batches.

    -

    ori_format

    Optional.

    String or string array.

    • This parameter is required when the format implemented by the operator is different from the original format. Set it to the original format of the data.
    • Set it to "fuzz" to automatically generate test cases in batches by using the fuzzing script.

    If this parameter is not set, the format implemented by the operator is the same as the original format by default.

    -

    type

    Required.

    String, string array, or fuzz.

    Output data type:

    • bool
    • int8
    • uint8
    • int16
    • uint16
    • int32
    • int64
    • uint32
    • uint64
    • float16
    • float32
    • float
    • bfloat16 (only supported by Atlas A3 training products/Atlas A3 inference products and Atlas A2 training products/Atlas A2 inference products)
    • fuzz: automatically generates test cases in batches by using the fuzzing script.

    -

    shape

    Required.

    • Int. 1D or 2D array.

      Output tensor shape.

    • String type, "fuzz".

      Supports fuzzing and uses the fuzzing script to automatically generate values in batches.

    -

    ori_shape

    Optional.

    • Int. 1D or 2D array.

      Original shape of the output data. This parameter is required if the original operator shape is not consistent with the implemented one.

    • String type, "fuzz".

      Supports fuzzing and uses the fuzzing script to automatically generate values in batches.

    If this parameter is not set, the shape implemented by the operator is the same as the original shape by default.

    attr

    -

    Optional.

    -

    name

    If attr is configured, this parameter is required.

    String.

    Attribute name, which cannot be empty.

    -

    type

    If attr is configured, this parameter is required.

    String.

    Attribute type.

    • bool
    • int
    • float
    • string
    • list_bool
    • list_int
    • list_float
    • list_string
    • list_list_int
    • data_type: If value of attr is of a data type, the value of type must be data_type.

    -

    value

    If attr is configured, this parameter is required.

    Attribute value. The value varies according to type.

    • If type is set to bool, value must be set to true or false.
    • If type is set to int, value must be set to an integer.
    • If type is set to float, value must be set to a floating-point number.
    • If type is set to string, value must be set to a string, for example, NCHW.
    • If type is set to list_bool, an example value of value is [false, true].
    • If type is set to list_int, an example value of value is [1, 224, 224, 3].
    • If type is set to list_float, an example value of value is [1.0, 0.0].
    • If type is set to list_string, an example value of value is ["str1", "str2"].
    • If type is set to list_list_int, an example value of value is [[1, 3, 5, 7], [2, 4, 6, 8]].
    • If type is set to data_type, value can be int8, int32, int16, int64, uint8, uint16, uint32, uint64, float, float16, float32, bool, double, complex64, complex128, or bfloat16.
    • If value is set to "fuzz", the fuzzing script is used to automatically generate values in batches.

    calc_expect_func_file

    -

    Optional.

    String.

    File path and function name of the expected data generation function of the operator, in the format "file path:function name", for example, /home/test/test_*.py:function.

    /home/test/test_*.py is the implementation file of the expected data generation function of the operator, and function is the name of the corresponding function.

    Note:

    You are not advised to invoke the expected data generation script in other users' directories to avoid privilege escalation risks.
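
The consistency rule for input_desc and output_desc in Table 1 can be checked before msOpST is run. The following is a minimal sketch, not part of msOpST: the helper names value_count and find_inconsistent_fields are hypothetical, and the sketch assumes that a string or a 1D shape/value_range array counts as one value while a string array or a 2D array counts one value per element.

```python
# Hypothetical pre-check for a test case definition; not an msOpST API.
FIELDS = ("format", "type", "shape", "data_distribute", "value_range")


def value_count(field, value):
    """Return how many alternative values one field configures (assumed rule)."""
    if isinstance(value, str):
        return 1  # A single string such as "ND" or "fuzz".
    if field in ("shape", "value_range"):
        # A 2D array holds several values; a 1D (or empty) array holds one.
        if value and all(isinstance(item, list) for item in value):
            return len(value)
        return 1
    return len(value)  # String array.


def find_inconsistent_fields(case):
    """Return the fields whose value counts differ across the descriptions."""
    descs = case.get("input_desc", []) + case.get("output_desc", [])
    inconsistent = []
    for field in FIELDS:
        counts = {value_count(field, d[field]) for d in descs if field in d}
        if len(counts) > 1:
            inconsistent.append(field)
    return inconsistent


case = {
    "input_desc": [
        {"format": ["ND", "NCHW"], "type": ["float16", "float32"]},
        {"format": ["ND"], "type": ["float16", "float32"]},
    ],
    "output_desc": [
        {"format": ["ND", "NCHW"], "type": ["float16", "float32"]},
    ],
}
print(find_inconsistent_fields(case))
```

In this example the second input configures one format while the other entries configure two, so format is reported as inconsistent.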

  4. (Optional) Customize a data generation function if you want to compare the expected operator outputs with actual outputs. For details, refer to the following steps.

    1. Customize a function for generating the expected data of the Add operator.

      Implement the function in Python. The file directory including the file name can be user-defined, for example, /home/test/test_add_st.py.

      For example, the expected operator data generation function of the Add operator is implemented as follows:

      def calc_expect_func(x1, x2, y):    
          res = x1["value"] + x2["value"]
          return [res, ]
      

      NOTE

      You need to create the expected data generation function of the operator based on the developed custom operator. The names of all input, output, and attribute elements in the test case definition file are used as the input parameters of the expected data generation function of the operator. If an input is optional, specify a default value for it in the function signature.

      For example, if the x3 input is optional, define the expected data generation function of the operator as follows:

      def calc_expect_func(x1, x2, x3=None, y=None):
          ...  # Compute and return the list of expected outputs.
      
    2. Add a comparison function to the ST case definition file, that is, OpType_xx.json. Edit the test case definition file as required.

      In the operator test case definition file AddCustom_case_timestamp.json generated in Step 2 (Run the following command to generate the operator test case definition file), add the calc_expect_func_file parameter and set its value to /home/test/test_add_st.py:calc_expect_func.

      [
          {
              "case_name":"Test_AddCustom_001",         
              "op": "AddCustom",                             
              "calc_expect_func_file": "/home/test/test_add_st.py:calc_expect_func",   // Configure the implementation file for generating expected compute result.
              "input_desc": [...]
              ...
              ...
          }
      ]
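
To illustrate how the expected data is compared with the actual output, the following sketch applies the two error_threshold values described in Table 1. It is an illustration only: the function name precision_ok is hypothetical, and the per-element error metric (relative error) is an assumption, because this section does not state which metric msOpST uses.

```python
def precision_ok(actual, expected, error_threshold=(0.01, 0.05)):
    """Apply [threshold1, threshold2] to two flat sequences of numbers."""
    threshold1, threshold2 = error_threshold
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    if not actual:
        return True
    error_count = 0
    for a, e in zip(actual, expected):
        # Assumption: relative error, guarded against division by zero.
        error = abs(a - e) / max(abs(e), 1e-9)
        if error > threshold1:
            error_count += 1  # Recorded as error data.
    # Acceptable when the ratio of error data is below threshold2.
    return error_count / len(actual) < threshold2


expected = [1.0] * 100
mostly_right = [1.5] * 4 + [1.0] * 96   # 4% of the elements are error data.
often_wrong = [1.5] * 10 + [1.0] * 90   # 10% of the elements are error data.
print(precision_ok(mostly_right, expected), precision_ok(often_wrong, expected))
```

With the default thresholds, the first comparison passes because fewer than 5% of the elements exceed the 0.01 error threshold, and the second one fails.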
      

Generating and Executing Test Cases

This section walks through the workflow of generating the ST data and test case execution code based on the operator test case definition file, for running the test cases in the hardware environment.

Development and operating environments on the same server

  1. Configure the environment variables required for AscendCL application building as the AscendCL API is used to load and execute the single-operator model file during the execution of ST cases.

    export DDK_PATH=${INSTALL_DIR}
    export NPU_HOST_LIB=${INSTALL_DIR}/{arch-os}/devlib
    

    NOTE

    • Replace ${INSTALL_DIR} with the actual file storage path after the CANN software is installed. For example, if the installation is performed as the root user, the default file storage path after the installation is /usr/local/Ascend/cann.
    • In {arch-os}, arch indicates the OS architecture, and os indicates the operating system.
  2. Run the following command to generate or execute test cases. For details about the parameters, see Table 1 Parameters for generating the operator test case definition file.

    msopst run -i {*.json}  -soc {soc version}  -out {output path} -c {case name} -d {device id} -conf {msopst.ini path} -err_thr "[threshold1,threshold2]"
    
    • The path of the msopst.ini file is ${INSTALL_DIR}/python/site-packages/bin/.

    • The following table describes the parameters in the msopst.ini file.

      NOTE

      By default, the msopst.ini file uses the FP16 precision mode. To use another precision mode, manually modify the --precision_mode option of atc_singleop_advance_option in Table 1 msopst.ini file parameter description.

      Table 1 msopst.ini file parameter description

      Parameter

      Value

      Description

      only_gen_without_run

      • True
      • False (default)

      Running mode of the msOpST tool.

      For details, see Table 2 msOpST run modes.

      only_run_without_gen

      • True
      • False (default)

      performance_mode

      • True
      • False

      Whether to collect operator performance data. If this parameter is set to True, a series of performance result files is generated in the run/out/prof/JOBxxx/summary directory after successful execution. You only need to view the op_summary_0_1.csv file.

      This function requires the CANN package installation environment variable. Change the value based on the actual installation path.
      export install_path=${INSTALL_DIR}

      ASCEND_GLOBAL_LOG_LEVEL

      • 0: DEBUG
      • 1: INFO
      • 2: WARNING
      • 3: ERROR (default)
      • 4: NULL; no log export.

      Host log level.

      ASCEND_SLOG_PRINT_TO_STDOUT

      • 0: (default) disables console printing.
      • 1: enables console printing.

      Log printing control.

      atc_singleop_advance_option

      --log value:
      • debug: outputs run logs at the debug, info, warning, error, and event levels.
      • info: outputs run logs at the info, warning, error, and event levels.
      • warning: outputs run logs at the warning, error, and event levels.
      • error: (default) outputs run logs at the error and event levels.
      • null: does not output run logs.
      --precision_mode value:
      • force_fp16: (Default) When the operator supports both FP16 and FP32 precisions, this value forces the selection of FP16.
      • force_fp32: When the operator supports both FP16 and FP32 precisions, this mode forces the selection of FP32.
      • allow_fp32_to_fp16: If the operator supports FP32, retains original FP32 precision; otherwise, selects FP16.
      • must_keep_origin_dtype: retains the original precision of the model.
      • allow_mix_precision: enables mixed precision mode.

      --host_env_os value:
      • linux: sets the OS type to Linux.
      --host_env_cpu value:
      • x86_64: indicates that the OS architecture is x86_64.
      • aarch64: indicates that the OS architecture is AArch64.

      Example:

      atc_singleop_advance_option="--log=info --host_env_os=linux --host_env_cpu=aarch64 --precision_mode=force_fp16"

      Sets the advanced options for single-operator model conversion.

      If the OS and architecture of the model compilation environment are inconsistent with those of the model operating environment, set the --host_env_os and --host_env_cpu options to the OS type and architecture of the model operating environment. If these options are not set, the OS type and architecture of the model compilation environment where ATC is located are used by default.

      HOST_ARCH

      • X86_64: x86_64 architecture
      • aarch64: ARM64 architecture

      Example:

      HOST_ARCH="aarch64"

      Architecture of the execution machine.

      Generally, this parameter is configured in the separated deployment scenario.

      TOOL_CHAIN

      g++ path: g++ toolchain path

      Example:

      TOOL_CHAIN="/usr/bin/g++"

      C++ compiler path, which ends with g++.

      Generally, this parameter is configured in the separated deployment scenario.

      Table 2 msOpST run modes

      Mode | only_gen_without_run | only_run_without_gen | Run Mode
      1    | False                | False                | Generates and runs the ST code.
      2    | True                 | True/False           | Generates ST code only and does not run the ST code.
      3    | False                | True                 | Does not generate ST code but only runs ST code.
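
The mapping in Table 2 can be expressed as a small decision function. The following sketch is illustrative only; the function name msopst_run_mode is hypothetical and is not part of the tool.

```python
def msopst_run_mode(only_gen_without_run=False, only_run_without_gen=False):
    """Return the run mode number from Table 2 for the two msopst.ini switches."""
    if only_gen_without_run:
        # Mode 2: generate ST code only; only_run_without_gen may be True or False.
        return 2
    if only_run_without_gen:
        # Mode 3: run existing ST code only.
        return 3
    # Mode 1: generate and run the ST code (both switches are False by default).
    return 1


for gen_only in (False, True):
    for run_only in (False, True):
        print(gen_only, run_only, msopst_run_mode(gen_only, run_only))
```

The sketch makes one point of the table explicit: when only_gen_without_run is True, the value of only_run_without_gen does not change the resulting mode.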

    • Examples:

      • Disable the advanced functions of the msOpST tool and run the following command to generate and execute ST cases:

        msopst run -i xx/AddCustom_case_timestamp.json -soc {soc version} -out ./output
        
      • The following example enables the advanced features of msOpST to generate ST cases only, and then runs the ST cases after manual modification.

        1. Run the following command to edit the msopst.ini file:

          vim ${INSTALL_DIR}/python/site-packages/bin/msopst.ini
          

          Change the run mode of msOpST to mode 2 and change the values of only_gen_without_run and only_run_without_gen by referring to Table 2 msOpST run modes. In this mode, msopst generates ST code only.

        2. Run the following command to generate the ST source code:

          msopst run -i xx/AddCustom_case_timestamp.json -soc {soc version} -out ./output -conf xx/msopst.ini
          

          Set -conf to the actual path of the msopst.ini configuration file.

          Modify the test code as needed after the ST cases are generated.

        3. Modify the msopst.ini file to change the run mode to running ST cases only.

          Run the following command to edit the msopst.ini file:

          vim ${INSTALL_DIR}/python/site-packages/bin/msopst.ini
          

          Change the run mode of msOpST to mode 3 and change the values of only_gen_without_run and only_run_without_gen by referring to Table 2 msOpST run modes. In this mode, msopst runs ST code only.

        4. Run the following command to run the edited ST source code:

          msopst run -i xx/AddCustom_case_timestamp.json -soc {soc version} -out ./output -conf xx/msopst.ini
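
When the msopst run command is assembled from a script, the optional flags shown in the command syntax above can be appended only when they are needed. The following is a minimal sketch: the helper name build_msopst_run_cmd and all argument values are illustrative assumptions, and the sketch only builds the argument list without running it.

```python
import shlex


def build_msopst_run_cmd(case_json, soc_version, out_dir, case_name=None,
                         device_id=None, conf=None, err_thr=None):
    """Build the argument list for `msopst run` from the documented flags."""
    cmd = ["msopst", "run", "-i", case_json, "-soc", soc_version, "-out", out_dir]
    if case_name is not None:
        cmd += ["-c", case_name]
    if device_id is not None:
        cmd += ["-d", str(device_id)]
    if conf is not None:
        cmd += ["-conf", conf]
    if err_thr is not None:
        threshold1, threshold2 = err_thr
        # Passed as one argument, so the list needs no extra shell quoting.
        cmd += ["-err_thr", f"[{threshold1},{threshold2}]"]
    return cmd


cmd = build_msopst_run_cmd(
    "xx/AddCustom_case_timestamp.json",  # Placeholder path from this guide.
    "<soc version>",                      # Replace with the actual SoC version.
    "./output",
    conf="xx/msopst.ini",
    err_thr=(0.01, 0.05),
)
print(shlex.join(cmd))  # Shell-quoted form, for logging or manual reuse.
```

The resulting list can be passed to subprocess.run without shell=True, which keeps the threshold list as a single argument.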
          

        NOTE

        If the execution fails, analyze the cause as follows:

  3. View the execution result.

    • In the run mode where msOpST generates the ST code only, a timestamp directory is generated in the directory specified by -out, and under the timestamp directory, a folder named after OpType is generated for storing ST cases. The directory structure is as follows:

       {time_stamp}
      │   ├── OpType
      │   │   ├── CMakeLists.txt            // Build script
      │   │   ├── inc                       // Directory of header files required by the test case code
      │   │   │   └── common.h
      │   │   │   └── op_execute.h
      │   │   │   └── op_runner.h
      │   │   │   └── op_test_desc.h
      │   │   │   └── op_test.h
      │   │   ├── run                       // Directory related to test case execution
      │   │   │   └── out
      │   │   │       └── test_data
      │   │   │          └── config
      │   │   │             └── acl.json      // File for acl initialization. Keep it intact.
      │   │   │             └── acl_op.json   // Operator description file used to build a single-operator model
      │   │   │          └── data
      │   │   │             └── expect
      │   │   │             └── Test_xxx.bin
      │   │   ├── src
      │   │   │   └── CMakeLists.txt    // Build script
      │   │   │   └── common.cpp         // Common function file, used to read binary files
      │   │   │   └── main.cpp            // Code for initializing and executing the operator test cases
      │   │   │   └── op_execute.cpp        // Encapsulated AscendCL API for single-operator execution
      │   │   │   └── op_runner.cpp         // Encapsulated API for loading the single-operator model file for execution
      │   │   │   └── op_test.cpp          // Test class of the operator
      │   │   │   └── op_test_desc.cpp      // File for loading and reading the operator test case information
      │   │   │   └── testcase.cpp             // Test case definition file
      
    • In the run mode where msOpST generates and runs ST code, the execution result is printed after the command is executed. Then, a timestamp directory is generated in the directory specified by -out, and under the timestamp directory, a folder named after OpType is generated for storing both the ST cases and the execution results. The directory structure is as follows:

       {time_stamp}
      │   ├── OpType
      │   │   ├── build
      │   │   │   └── intermediates             // Directory of intermediate files generated at compile time
      │   │   │       └── xxx
      │   │   ├── CMakeLists.txt    // Build script
      │   │   ├── inc
      │   │   │   ├── common.h
      │   │   │   ├── op_execute.h
      │   │   │   ├── op_runner.h
      │   │   │   ├── op_test_desc.h
      │   │   │   └── op_test.h
      │   │   ├── run                           // Directory related to test case execution
      │   │   │   └── out
      │   │   │       ├── fusion_result.json
      │   │   │       ├── main           // Directory of executable files of the operator test cases
      │   │   │       ├── op_models         // Directory of single-operator offline model files
      │   │   │          ├── xx.om
      │   │   │       ├── result_files
      │   │   │          ├── result.txt
      │   │   │          ├── Test_xxx_output_x.bin   // Result binary generated after test case execution
      │   │   │       └── test_data         // Directory for storing test data files
      │   │   │          ├── config
      │   │   │             ├── acl_op.json    // Operator description file used to build a single-operator model
      │   │   │             ├── acl.json       // File for AscendCL initialization. Keep it intact.
      │   │   │          ├── data                // Constructed test data
      │   │   │             ├──expect
      │   │   │                 ├──Test_xxxx.bin      // Binary of the expected output
      │   │   │             ├──st_error_reports
      │   │   │                 ├──Test_xxxx.csv       // Used to store inconsistent data
      │   │   │             ├──Test_xxxx.bin      // Test data binary
      │   │   └── src
      │   │       ├── CMakeLists.txt    // Build script
      │   │       ├── common.cpp         // Common function file, used to read binaries
      │   │       ├── main.cpp            // Code for initializing and executing the operator test cases
      │   │       ├── op_execute.cpp        // Encapsulated AscendCL API for single-operator execution
      │   │       ├── op_runner.cpp         // Encapsulated API for loading the single-operator model file for execution
      │   │       ├── op_test.cpp          // Test class of the operator
      │   │       ├── op_test_desc.cpp      // File for loading and reading the operator test case information.
      │   │       └── testcase.cpp             // Test case definition file
      │   └── st_report.json        // Execution report
      

      After the command is executed successfully, an st_report.json report is automatically generated. The report records the detailed test information and the runtime states in each stage, which facilitates troubleshooting and test result comparison. The st_report.json file is saved in the path specified by "The st_report saved in" in Figure 1 Example of the running result.

      Figure 1 Example of the running result

      Table 3 Fields in the st_report.json report

      Field

      Description

      run_cmd

      -

      -

      Command.

      report_list

      -

      -

      Report list, which can contain reports of multiple test cases.

      trace_detail

      -

      Running details.

      st_case_info

      Test information, including:

      • expect_data_path: expected calculation result path.
      • case_name: test case name.
      • input_data_path: input data path.
      • planned_output_data_paths: actual calculation result output path.
      • op_params: operator parameter information.

      stage_result

      Result information in each running stage, including:

      • status: stage running status, indicating whether the running is successful or failed.
      • result: output result.
      • stage_name: stage name.
      • cmd: running command.

      case_name

      -

      Test name.

      status

      -

      Test result status, indicating whether the test is successful or failed.

      expect

      -

      Expected test result status, indicating whether the test is successful or failed.

      summary

      -

      -

      Result of comparison between the test case result status and the expected result status.

      test case count

      -

      Number of test cases.

      success count

      -

      Number of test cases whose result status is the same as the expected result status.

      failed count

      -

      Number of test cases whose result status is inconsistent with the expected result status.
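
The fields in Table 3 can be read programmatically to cross-check a report. The following sketch is an illustration under stated assumptions: the nesting (report_list entries that each contain case_name, status, expect, and a trace_detail with stage_result) is inferred from the table layout, the status string "success" is assumed, and the helper names are hypothetical. The sample data is made up for the example.

```python
def summarize_report(report):
    """Recompute the summary counts from the per-case entries."""
    cases = report.get("report_list", [])
    success = sum(1 for case in cases if case.get("status") == case.get("expect"))
    return {
        "test case count": len(cases),
        "success count": success,
        "failed count": len(cases) - success,
    }


def failed_stages(case):
    """Return the names of the stages of one case that did not succeed."""
    stages = case.get("trace_detail", {}).get("stage_result", [])
    return [s.get("stage_name") for s in stages if s.get("status") != "success"]


# Made-up sample with the assumed layout; a real report would be loaded with
# json.load() from the path printed after "The st_report saved in".
sample_report = {
    "report_list": [
        {"case_name": "Test_AddCustom_001", "status": "success", "expect": "success",
         "trace_detail": {"stage_result": [{"stage_name": "stage_a", "status": "success"}]}},
        {"case_name": "Test_AddCustom_002", "status": "failed", "expect": "success",
         "trace_detail": {"stage_result": [{"stage_name": "stage_a", "status": "success"},
                                           {"stage_name": "stage_b", "status": "failed"}]}},
    ]
}
print(summarize_report(sample_report))
print(failed_stages(sample_report["report_list"][1]))
```

A case counts as a success when its status matches its expect value, which mirrors the definitions of success count and failed count in the table.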

Development and operating environments on separate servers

  1. Configure the development environment based on the architecture of the operating environment.

    1. During ST case execution, the AscendCL APIs are used to load and execute the single-operator model file. Therefore, in the development environment, configure the environment variables required for building AscendCL applications according to the architecture of the operating environment.

      • When the architecture of the development environment is the same as that of the operating environment, the environment variables are as follows:

        export DDK_PATH=${INSTALL_DIR}
        export NPU_HOST_LIB=${INSTALL_DIR}/{arch-os}/devlib
        
      • When the architecture of the development environment is different from that of the operating environment, the environment variables are as follows:

        export DDK_PATH=${INSTALL_DIR}/{arch-os}
        export NPU_HOST_LIB=${INSTALL_DIR}/{arch-os}/devlib
        

      NOTE

      • Replace ${INSTALL_DIR} with the actual CANN installation path. For example, if CANN is installed as the root user, the default installation path is /usr/local/Ascend/cann.
      • In arch-os, arch indicates the OS architecture (select a value based on the architecture of the operating environment), and os indicates the operating system (select a value based on the OS of the operating environment).
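
The two cases above differ only in the value of DDK_PATH. The following minimal Python sketch prints the matching export lines. The helper names acl_build_env and as_exports are hypothetical, and aarch64-linux is only an assumed example of an arch-os value; use the value that matches the operating environment.

```python
def acl_build_env(install_dir, arch_os, same_architecture):
    """Return DDK_PATH and NPU_HOST_LIB for building the ST executable.

    same_architecture: True when the development and operating
    environments share the same architecture.
    """
    install_dir = install_dir.rstrip("/")
    arch_root = f"{install_dir}/{arch_os}"
    return {
        # DDK_PATH is the install root for the same architecture,
        # and the arch-os subdirectory otherwise.
        "DDK_PATH": install_dir if same_architecture else arch_root,
        # NPU_HOST_LIB has the same form in both cases.
        "NPU_HOST_LIB": f"{arch_root}/devlib",
    }


def as_exports(env):
    """Format the variables as shell export lines."""
    return "\n".join(f"export {name}={value}" for name, value in env.items())


# Example values only: the default root-user install path and an assumed arch-os.
print(as_exports(acl_build_env("/usr/local/Ascend/cann", "aarch64-linux", False)))
```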
  2. Enable the advanced features of the msOpST tool in the development environment to generate ST cases only.

    1. Run the following command to edit the msopst.ini file:

      vim ${INSTALL_DIR}/python/site-packages/bin/msopst.ini
      
    2. Change the run mode of msOpST to mode 2 by setting only_gen_without_run and only_run_without_gen as described in Table 2 msOpST run modes. In this mode, msOpST generates the ST code only and does not run it.

    3. If the architectures of the development environment and operating environment are different, change the values of HOST_ARCH and TOOL_CHAIN by referring to Table 1 Parameters in the msopst.ini file.

    4. Run the following command to generate the ST source code:

      msopst run -i xx/AddCustom_case_timestamp.json -soc {soc version} -out {output path} -conf xx/msopst.ini
      

      Set -conf to the actual path of the msopst.ini configuration file.

      Modify the test code as needed after the ST cases are generated.

    5. After the command is executed, ST cases are generated in the {output path} directory, and the executable file main is built using the g++ compiler. The information printed on the screen shows the number of executed test cases and their execution status. The st_report.json report is generated and saved in the path printed after "The st report saved in". For details about the report, see Table 3 Fields in the st_report.json report.
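
When the generation step is scripted, the command can be assembled as an argument list so that paths containing spaces do not need manual quoting. The following is a minimal Python sketch that uses only the options shown in the command above (-i, -soc, -out, -conf). The helper name msopst_run_command and all argument values are hypothetical placeholders.

```python
import shlex


def msopst_run_command(case_json, soc_version, output_dir, conf_path):
    """Assemble the 'msopst run' argument list for generating ST code."""
    return [
        "msopst", "run",
        "-i", case_json,        # ST case definition file (.json)
        "-soc", soc_version,    # AI processor model
        "-out", output_dir,     # directory that receives the generated ST cases
        "-conf", conf_path,     # edited msopst.ini
    ]


# Placeholder values; replace them with the actual paths and model.
cmd = msopst_run_command(
    "xx/AddCustom_case_timestamp.json", "ascendxxxyy", "./st_out", "xx/msopst.ini"
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) in an environment where msopst is on the PATH.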

  3. Run test cases.

    1. Copy the out folder from the run directory of the operator project in the development environment to any directory in the operating environment, for example, ${INSTALL_DIR}/Ascend_project/run_add/.

    2. Run the executable file in the out folder in the operating environment.

      Go to the directory where the out folder is located and run the following command:

      chmod +x main
      ./main
      
  4. View the execution result.

    After the execution is complete, the execution result of the test cases is displayed, as shown in Figure 2 Execution result.

    Figure 2 Execution result

Generating a Single-Operator On-board Test Framework

This section describes how to generate an on-board test framework that calls the kernel function of an Ascend C operator. You specify the ST case definition file (.json) and the operator implementation file kernel_name.cpp, and the tool generates the framework automatically. You can then run the framework to test the operator and check the output to verify that the operator functions correctly.

NOTE

  • This function applies only to Atlas inference products and Atlas training products, excluding the Atlas A2 training products/Atlas A2 inference products and Atlas A3 training products/Atlas A3 inference products.
  • The addr and tiling attributes cannot be specified for any parameter.
  • #ifndef __CCE_KT_TEST__ can be used to encapsulate kernel function calls.
  1. Prepare the following input files:

    • Operator ST case definition file (.json).
    • Operator implementation file (.cpp) on the kernel side. For details, see "Operator Implementation" > "Project-based Operator Development" > "Operator Implementation on the Kernel Side" in Ascend C Operator Development Guide.
  2. Run the following command to generate the test code to call the kernel function. For details about the parameters, see Parameters for generating single-operator on-board test framework.

    msopst ascendc_test -i xx/OpType_case.json -kernel xx/add_custom.cpp -out ./output_data
    
  3. View the execution result.

    After the command is executed, the message "Process finished!" is displayed. A timestamp directory is generated in the directory specified by -out. In the timestamp directory, a folder named after OpType of the operator is generated for storing test cases and test results. The directory structure is as follows:

    {time_stamp}
    ├── OpType
    │   ├── CMakeLists.txt     // Build script
    │   ├── data
    │   │   ├── xx.bin
    │   │   └── xx.bin
    │   ├── data_utils.h
    │   ├── main.cpp           // Test framework
    │   └── run.sh             // Script for invoking the test framework
    └── st_report.json         // Execution report
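
A quick way to confirm that generation completed is to check the timestamp directory against the layout above. The following is a minimal Python sketch. The helper name missing_entries is hypothetical, and the demonstration builds a throwaway directory (deliberately without run.sh) rather than using real tool output.

```python
import tempfile
from pathlib import Path

# Entries expected inside the OpType folder, taken from the directory structure above.
EXPECTED_IN_OP_DIR = ["CMakeLists.txt", "data", "data_utils.h", "main.cpp", "run.sh"]


def missing_entries(timestamp_dir, op_type):
    """List expected files or folders that are absent from timestamp_dir."""
    root = Path(timestamp_dir)
    expected = [root / "st_report.json"]
    expected += [root / op_type / name for name in EXPECTED_IN_OP_DIR]
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]


# Demonstration on a temporary directory that mimics the layout but lacks run.sh.
with tempfile.TemporaryDirectory() as tmp:
    op_dir = Path(tmp) / "AddCustom"
    (op_dir / "data").mkdir(parents=True)
    for name in ("CMakeLists.txt", "data_utils.h", "main.cpp"):
        (op_dir / name).touch()
    (Path(tmp) / "st_report.json").touch()
    demo_missing = missing_entries(tmp, "AddCustom")

print(demo_missing)
```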
    

    After the command is executed successfully, an st_report.json report is automatically generated. The report records detailed test information and the runtime status of each stage, which facilitates troubleshooting and comparison of test results.

    The st_report.json file is saved in the path printed after "The st report saved in" in the command output, for example:

    2024-01-17 08:40:55 (3271037) - [INFO] Create 1 sub test cases for Test_AddCustom_001.
    2024-01-17 08:40:55 (3271037) - [INFO] [STEP2] [data_generator.py] Generate data for testcase.
    2024-01-17 08:40:55 (3271037) - [INFO] Start to generate the input data for Test_AddCustom_001_case_001_ND_float.
    2024-01-17 08:40:55 (3271037) - [INFO] Generate data for testcase in $HOME/AddCustom/output/20240117084055/AddCustom/data.
    2024-01-17 08:40:55 (3271037) - [INFO] [STEP3] [gen_ascendc_test.py] Generate test code of calling of kernel function for AscendC operator.
    2024-01-17 08:40:55 (3271037) - [INFO] Content appended to $HOME/AddCustom/output/20240117084055/AddCustom/main.cpp successfully.
    2024-01-17 08:40:55 (3271037) - [INFO] AscendC operator test code files for kernel implement have been successfully generated.
    2024-01-17 08:40:55 (3271037) - [INFO] If you want to execute kernel function in Ascend aihost or cpu, please execute commands: cd $HOME/AddCustom/output/20240117084055/AddCustom && bash run.sh <KERNEL_NAME>(add_custom) <SOC_VERSION>(ascendxxxyy) <CORE_TYPE>(AiCore/VectorCore) <RUN_MODE>(cpu/npu). For example: cd $HOME/AddCustom/output/20240117084055/AddCustom && bash run.sh add_custom ascendxxxyy AiCore npu
    2024-01-17 08:40:55 (3271037) - [INFO] Process finished!
    2024-01-17 08:40:55 (3271037) - [INFO] The st report saved in: $HOME/AddCustom/output/20240117084055/st_report.json.
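
If the command output is captured by a script, the report path can be extracted from the last log line. The following is a minimal Python sketch. It assumes that the log line keeps the wording shown above ("The st report saved in: <path>.") and that the path ends with .json; the helper name find_report_path is hypothetical.

```python
import re

# Matches the path after the marker and drops the trailing period of the log sentence.
REPORT_LINE = re.compile(r"The st report saved in:\s*(?P<path>.+?\.json)\.?\s*$")


def find_report_path(log_text):
    """Return the st_report.json path found in the log text, or None."""
    for line in log_text.splitlines():
        match = REPORT_LINE.search(line)
        if match:
            return match.group("path")
    return None


log = (
    "2024-01-17 08:40:55 (3271037) - [INFO] Process finished!\n"
    "2024-01-17 08:40:55 (3271037) - [INFO] The st report saved in: "
    "$HOME/AddCustom/output/20240117084055/st_report.json.\n"
)
report_path = find_report_path(log)
print(report_path)
```

The sample log contains a literal $HOME, so the extracted string is not expanded; os.path.expandvars can be applied if a real path is needed.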
    

    Table 1 Fields in the st_report.json report

    | Field (level 1) | Field (level 2) | Field (level 3) | Description |
    |---|---|---|---|
    | run_cmd | - | - | Command. |
    | report_list | - | - | Report list, which can contain reports of multiple test cases. |
    | | trace_detail | - | Running details. |
    | | | st_case_info | Test information, including: expect_data_path (path of the expected calculation result); case_name (test case name); input_data_path (input data path); planned_output_data_paths (actual calculation result output path); op_params (operator parameter information). |
    | | | stage_result | Result information in each running stage, including: status (stage running status, indicating whether the stage succeeded or failed); result (output result); stage_name (stage name); cmd (running command). |
    | | case_name | - | Test name. |
    | | status | - | Test result status, indicating whether the test succeeded or failed. |
    | | expect | - | Expected test result status, indicating whether the test is expected to succeed or fail. |
    | summary | - | - | Result of comparison between the test case result status and the expected result status. |
    | | test case count | - | Number of test cases. |
    | | success count | - | Number of test cases whose result status is the same as the expected result status. |
    | | failed count | - | Number of test cases whose result status is inconsistent with the expected result status. |
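
For troubleshooting, the stage_result entries are usually the most useful part of the report. The following minimal Python sketch lists the stages that did not succeed. It assumes that trace_detail is an object and that stage_result is a list of objects with the fields named in Table 1. The helper name failed_stages, the sample data, and the status strings "success" and "failed" are hypothetical.

```python
def failed_stages(report, ok_status="success"):
    """Return (case_name, stage_name, cmd) for each stage whose status is not ok_status."""
    rows = []
    for case in report.get("report_list", []):
        detail = case.get("trace_detail", {})
        for stage in detail.get("stage_result", []):
            if stage.get("status") != ok_status:
                rows.append((case.get("case_name"), stage.get("stage_name"), stage.get("cmd")))
    return rows


# Hypothetical report fragment used only to exercise the function.
sample_report = {
    "report_list": [
        {
            "case_name": "Test_AddCustom_001_case_001_ND_float",
            "trace_detail": {
                "stage_result": [
                    {"stage_name": "gen_data", "status": "success", "cmd": "", "result": ""},
                    {"stage_name": "gen_test_code", "status": "failed", "cmd": "", "result": ""},
                ]
            },
        }
    ]
}

for row in failed_stages(sample_report):
    print(row)
```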

  4. Modify ASCEND_HOME_DIR in the run.sh file.

    ASCEND_HOME_DIR indicates the installation path of the CANN package. Change it to the actual path.

    # Point to the installation address of the Ascend software package and export environment variables.
    if [ ! $ASCEND_HOME_DIR ]; then
        export ASCEND_HOME_DIR=${INSTALL_DIR}     
    fi
    source $ASCEND_HOME_DIR/bin/set_env.bash
    
  5. Go to the directory where the script file for executing the test framework is located and run the following command to verify the test framework code on the board:

    bash run.sh <kernel_name> <soc_version> <core_type> <run_mode>
    

    Table 2 Script parameters

    | Parameter | Description | Value |
    |---|---|---|
    | <kernel_name> | Name of the Ascend C operator implementation file. | For example, if the implementation file of the Add operator is add_custom.cpp, pass add_custom. |
    | <soc_version> | Model of the AI processor where the operator runs. | For the Atlas training series and Atlas inference series products, set this parameter to the actual model in use, in the format ascendxxxyy. See the note below this table for how to obtain the value. |
    | <core_type> | Core type on which the operator is scheduled to run. | AiCore or VectorCore. |
    | <run_mode> | Execution mode of the operator. | cpu or npu. |

    Note on <soc_version>:
    • For servers other than the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info command on the server where the Ascend AI Processor is installed to obtain the chip name. The value to configure is Ascend followed by the chip name. For example, if the chip name is xxxyy, the value is Ascendxxxyy. If Ascendxxxyy is used as a path in the code sample, set this parameter to the lowercase form ascendxxxyy.
    • For the Atlas A3 training products/Atlas A3 inference products: Run the npu-smi info -t board -i id -c chip_id command on the server where the Ascend AI Processor is installed to obtain the chip name and NPU name. The value to configure is in the format Chip name_NPU name. For example, if the chip name is Ascendxxx and the NPU name is 1234, the value is Ascendxxx_1234. If Ascendxxx_1234 is used as a path in the code sample, set this parameter to the lowercase form ascendxxx_1234.
      • id: device ID, which is the NPU ID obtained by running the npu-smi info -l command.
      • chip_id: chip ID, which is the same as the chip ID obtained by running the npu-smi info -m command.
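
The arguments in Table 2 Script parameters can be validated before run.sh is called. The following is a minimal Python sketch. The helper names soc_version and run_sh_command are hypothetical. The accepted spellings AiCore/VectorCore and cpu/npu are taken from the hint printed by msopst ascendc_test ("bash run.sh <KERNEL_NAME>(add_custom) <SOC_VERSION>(ascendxxxyy) <CORE_TYPE>(AiCore/VectorCore) <RUN_MODE>(cpu/npu)"); treating them as the only valid values is an assumption.

```python
CORE_TYPES = ("AiCore", "VectorCore")
RUN_MODES = ("cpu", "npu")


def soc_version(chip_name, npu_name=None):
    """Build the lowercase <soc_version> argument from npu-smi output.

    chip_name: chip name, with or without the "Ascend" prefix.
    npu_name: pass it only for products whose value is Chip name_NPU name.
    """
    name = chip_name if chip_name.lower().startswith("ascend") else "Ascend" + chip_name
    if npu_name is not None:
        name = f"{name}_{npu_name}"
    return name.lower()


def run_sh_command(kernel_file, soc, core_type, run_mode):
    """Return the argument list for 'bash run.sh ...' after basic validation."""
    if core_type not in CORE_TYPES:
        raise ValueError(f"core_type must be one of {CORE_TYPES}")
    if run_mode not in RUN_MODES:
        raise ValueError(f"run_mode must be one of {RUN_MODES}")
    # <kernel_name> is the implementation file name without the .cpp extension.
    kernel_name = kernel_file[:-4] if kernel_file.endswith(".cpp") else kernel_file
    return ["bash", "run.sh", kernel_name, soc, core_type, run_mode]


command = run_sh_command("add_custom.cpp", soc_version("xxxyy"), "AiCore", "npu")
print(" ".join(command))
```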

    After the script is executed, information similar to the following is displayed. If "succeed" is displayed, the on-board verification is complete.

    INFO: compile op on npu succeed!
    [INFO] Succeeded to exec acl api aclrtCreateContext(&context, deviceId)
    [INFO] Succeeded to exec acl api aclrtCreateStream(&stream)
    [INFO] Succeeded to exec acl api aclrtMallocHost((void**)(&xHost), xByteSize)
    [INFO] Succeeded to exec acl api aclrtMalloc((void**)&xDevice, xByteSize, ACL_MEM_MALLOC_HUGE_FIRST)
    [INFO] Succeeded to exec acl api aclrtMemcpy(xDevice, xByteSize, xHost, xByteSize, ACL_MEMCPY_HOST_TO_DEVICE)
    [INFO] Succeeded to exec acl api aclrtMallocHost((void**)(&yHost), yByteSize)
    [INFO] Succeeded to exec acl api aclrtMalloc((void**)&yDevice, yByteSize, ACL_MEM_MALLOC_HUGE_FIRST)
    [INFO] Succeeded to exec acl api aclrtMemcpy(yDevice, yByteSize, yHost, yByteSize, ACL_MEMCPY_HOST_TO_DEVICE)
    [INFO] Succeeded to exec acl api aclrtMallocHost((void**)(&zHost), zByteSize)
    [INFO] Succeeded to exec acl api aclrtMalloc((void**)&zDevice, zByteSize, ACL_MEM_MALLOC_HUGE_FIRST)
    [INFO] Succeeded to exec acl api aclrtSynchronizeStream(stream)
    [INFO] Succeeded to exec acl api aclrtMemcpy(zHost, zByteSize, zDevice, zByteSize, ACL_MEMCPY_DEVICE_TO_HOST)
    [INFO] aclrtDestroyStream successfully.
    INFO: execute op on npu succeed!
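
In automated runs, the script output can be scanned for the two success messages instead of being read manually. The following is a minimal Python sketch. It assumes NPU mode and the exact message wording shown above; the wording in CPU mode is not shown in this guide and is not covered. The helper name npu_run_succeeded is hypothetical.

```python
# Messages copied from the sample output above (NPU mode only).
SUCCESS_MARKERS = (
    "INFO: compile op on npu succeed!",
    "INFO: execute op on npu succeed!",
)


def npu_run_succeeded(output):
    """Return True if both the compile and execute success messages are present."""
    lines = {line.strip() for line in output.splitlines()}
    return all(marker in lines for marker in SUCCESS_MARKERS)


sample_output = """INFO: compile op on npu succeed!
[INFO] Succeeded to exec acl api aclrtSynchronizeStream(stream)
[INFO] aclrtDestroyStream successfully.
INFO: execute op on npu succeed!
"""
print(npu_run_succeeded(sample_output))
```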
    

Output Description