Extended Functions

JSON Configuration File Description

Define the operator under test in a JSON file. For details about the parameters, see Table 1 Parameters in the JSON file and Table 2 test_cases parameters.

The following example shows a JSON configuration file named add_test.json. You can use it as a template and modify the test data and other configuration parameters as needed.

```
{
  "kernel_name": "add_custom",
  "kernel_path": "./add_custom.o",
  "blockdim": 8,
  "mode": "ca",
  "device_id": 0,
  "magic": "RT_DEV_BINARY_MAGIC_ELF_AIVEC",
  "test_cases": [
    {
      "case_name": "Test_AddCustom_001",
      "param_desc": [
        {
          "param_type": "input",
          "type": "float16",
          "shape": [
            8,
            2048
          ],
          "data_path": "./input_x.bin",
          "name": "x"
        },
        {
          "param_type": "input",
          "type": "float16",
          "shape": [
            8,
            2048
          ],
          "data_path": "./input_y.bin",
          "name": "y"
        },
        {
          "param_type": "output",
          "type": "float16",
          "shape": [
            8,
            2048
          ],
          "name": "z"
        },
        {
          "param_type": "workspace",
          "user_workspace_size": 4096
        },
        {
          "param_type": "tiling",
          "tiling_data_size": 8,
          "tiling_data_path": "./tiling.bin"
        }
      ]
    }
  ]
}
```
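
The example above reads its inputs from ./input_x.bin and ./input_y.bin. The following is a minimal sketch of how such files could be produced with the Python standard library. It assumes that the .bin files hold raw, header-less, little-endian float16 values in row-major order; the source does not state the layout, so treat this as an assumption. The helper name write_float16_bin, the value range, and the seed are illustrative only. The tiling.bin file is not covered because its layout depends on the operator.

```python
import random
import struct
from math import prod


def write_float16_bin(path, shape, low=-1.0, high=1.0, seed=0):
    """Write random float16 values for `shape` to `path` as raw binary data.

    Assumption: the tool expects plain element data with no header.
    Returns the number of bytes written.
    """
    rng = random.Random(seed)
    count = prod(shape)
    values = [rng.uniform(low, high) for _ in range(count)]
    # "e" is the struct format character for IEEE 754 half precision (2 bytes).
    payload = struct.pack(f"<{count}e", *values)
    with open(path, "wb") as handle:
        handle.write(payload)
    return len(payload)
```

For the shape [8, 2048] used in the example, calling write_float16_bin("input_x.bin", (8, 2048)) writes 8 × 2048 × 2 = 32768 bytes.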

Table 1 Parameters in the JSON file

Parameter	Description	Type	Mandatory
kernel_name	Kernel function name.	string	Yes
kernel_path	Path of the binary .o file of the kernel function. The path can be either absolute or relative.	string	Yes
blockdim	Number of cores required for running the kernel function. The default value is 1.	int	No
mode	Test mode. Valid values: onboard (on-board run) and ca (performance simulation).	string	Yes
device_id	ID of the AI processor used for running. The default value is 0.	int	No
tiling_key	Tiling key of the current dynamic operator.	uint64	No
magic	Operator type. Valid values: RT_DEV_BINARY_MAGIC_ELF_AICUBE (Cube operator), RT_DEV_BINARY_MAGIC_ELF_AIVEC (Vector operator), and RT_DEV_BINARY_MAGIC_ELF (mixed fusion operator; only for Atlas A3 training products, Atlas A3 inference products, Atlas A2 training products, and Atlas A2 inference products).	string	Yes
test_cases	Test data, given as a list in which each element is one test case. For details, see Table 2 test_cases parameters.	list	Yes

NOTE

- The tiling_key parameter applies only to dynamic operators.
- For Atlas inference products, the magic parameter must be set to RT_DEV_BINARY_MAGIC_ELF.
- For on-board or simulation tuning of an operator, test_cases can contain only one case.
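
The rules in Table 1 and the note above can be checked before the file is handed to the tool. The following is a minimal sketch of such a pre-check, not the tool's own validation. The function names check_top_level and with_defaults are hypothetical; the field names, valid values, and defaults are taken from Table 1.

```python
VALID_MODES = {"onboard", "ca"}
VALID_MAGIC = {
    "RT_DEV_BINARY_MAGIC_ELF_AICUBE",
    "RT_DEV_BINARY_MAGIC_ELF_AIVEC",
    "RT_DEV_BINARY_MAGIC_ELF",
}
MANDATORY = ("kernel_name", "kernel_path", "mode", "magic", "test_cases")


def check_top_level(config):
    """Return a list of problems in the top-level fields (empty if none)."""
    problems = [f"missing mandatory field: {key}"
                for key in MANDATORY if key not in config]
    if "mode" in config and config["mode"] not in VALID_MODES:
        problems.append(f"unknown mode: {config['mode']!r}")
    if "magic" in config and config["magic"] not in VALID_MAGIC:
        problems.append(f"unknown magic: {config['magic']!r}")
    cases = config.get("test_cases")
    if isinstance(cases, list) and len(cases) != 1:
        # Per the note: only one case for on-board or simulation tuning.
        problems.append("test_cases must contain exactly one case")
    return problems


def with_defaults(config):
    """Return a copy with the documented defaults for optional fields."""
    return {"blockdim": 1, "device_id": 0, **config}
```

A typical use is to load the file with json.load, pass the resulting dictionary to check_top_level, and stop if the returned list is not empty.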

Table 2 test_cases parameters

Parameter	Sub-parameter	Value	Description	Type	Mandatory
case_name	-	-	Test case name, which must be unique.	string	Yes
param_desc	-	-	Test case description. This can be a list, with each element representing a kernel function parameter.	list	Yes
-	param_type	input/output/workspace/tiling/fftsAddr	Parameter type.	string	Yes
-	type	-	Supported input and output data types, such as uint8, int16, int32, float16, float32 and float. This parameter is mandatory when param_type is set to input or output.	string	No
-	shape	-	Shapes supported by the input and output tensors, for example, [8, 3, 256, 256]. All input and output tensors must support the same number of shapes. An invalid shape, for example, [0], causes an error. This parameter is mandatory when param_type is set to input or output.	list	No
-	data_path	-	Path of the input data .bin file. When param_type is set to input, either data_path or value_range must be set; if both are set, data_path takes priority. To set an empty data_path in the JSON file, set "data_path":"null". For details about the JSON file, see JSON Configuration File Description.	string	No
-	name	-	Parameter name, which must be unique. This parameter is mandatory when param_type is set to input or output.	string	No
-	user_workspace_size	-	Size of workspace set by the user. This parameter is mandatory when param_type is set to workspace.	int	No
-	tiling_data_size	-	Size of tiling data. This parameter is mandatory when param_type is set to tiling.	int	No
-	tiling_data_path	-	Path of the tiling data .bin file. This parameter is mandatory when param_type is set to tiling.	string	No
-	data_size	-	Size of fftsAddr data. This parameter is mandatory when param_type is set to fftsAddr.	int	No

NOTICE

- Every input and output of an operator must list the same number of parameter values. Otherwise, test case generation fails. For example, if an input supports two types, each output must also support two types.
- Within each input or output, type, shape, and value_range must each have the same number of values.
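
The conditional "mandatory when param_type is set to ..." rules in Table 2 can be expressed as a lookup table. The following is a minimal sketch under that reading, not the tool's own validation. The function name check_param_desc is hypothetical, and the check covers only field presence and name uniqueness.

```python
# Fields that Table 2 marks as mandatory for each param_type.
REQUIRED_BY_TYPE = {
    "input": ("type", "shape", "name"),
    "output": ("type", "shape", "name"),
    "workspace": ("user_workspace_size",),
    "tiling": ("tiling_data_size", "tiling_data_path"),
    "fftsAddr": ("data_size",),
}


def check_param_desc(param_desc):
    """Return a list of problems found in a param_desc list (empty if none)."""
    problems = []
    seen_names = set()
    for index, param in enumerate(param_desc):
        kind = param.get("param_type")
        if kind not in REQUIRED_BY_TYPE:
            problems.append(f"param {index}: unknown param_type {kind!r}")
            continue
        for key in REQUIRED_BY_TYPE[kind]:
            if key not in param:
                problems.append(f"param {index} ({kind}): missing {key}")
        # An input needs a data source: data_path or value_range.
        if kind == "input" and not ("data_path" in param or "value_range" in param):
            problems.append(f"param {index} (input): needs data_path or value_range")
        name = param.get("name")
        if name is not None:
            if name in seen_names:
                problems.append(f"param {index}: duplicate name {name!r}")
            seen_names.add(name)
    return problems
```

Applied to the param_desc list of the add_test.json example, this check returns an empty list.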

mstx Extended Functions

mstx API Overview

MindStudio provides the mstx profiling API, which enables users to embed custom markers within their applications. These markers allow for the precise identification of critical code segments during performance analysis. For details, see Table 1 C/C++ mstx API List and Table 2 Python mstx API List. For further details about the API usage, see MindStudio mstx API Reference.

Table 1 C/C++ mstx API List

API	Description	msOpProf Support
mstxRangeStartA	Marks the beginning of a specific mstx range.	Supported
mstxRangeEnd	Marks the end of a specific mstx range.	Supported

Table 2 Python mstx API List

API	Description	msOpProf Support
mstx.range_start	Marks the beginning of a specific mstx range.	Supported
mstx.range_end	Marks the end of a specific mstx range.	Supported

mstx API Usage

With msOpProf, you can use the mstx API to tune specific operators: mark the start and end of a code segment or key function, identify key functions or computing APIs, and quickly locate performance issues.
mstx instrumentation is disabled by default. If the application calls the mstx API, enable instrumentation as required by the scenario. For example, the --mstx=on flag enables the mstx APIs within the user program, while --mstx-include can be used to target specific mstx APIs. For detailed usage, refer to the --mstx and --mstx-include parameters in the "Command Reference" sections of the msopprof User Guide and the msopprof Simulator Mode User Guide.
The mstx API can be used via library files or header files. An implementation example can be found at this link:
NOTE
- This sample project does not support Atlas A3 training products.
- Replace ${INSTALL_DIR} with the file storage path after CANN is installed. For example, if the installation is performed by the root user, the default file storage path is /usr/local/Ascend/cann.
- Add the libms_tools_ext.so library file located at ${INSTALL_DIR}/lib64/libms_tools_ext.so to the CMakeLists.txt file at ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/AclNNInvocation/src/CMakeLists.txt.
```
# Header path
include_directories(
    ...
    ${CUST_PKG_PATH}/include
)
...
target_link_libraries(
    ...
    dl
)
```
- In the main.cpp file at ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/AclNNInvocation/src/main.cpp, include the header file ms_tools_ext.h, which is located at ${INSTALL_DIR}/include/mstx. Then compile and link the user program with the dl library.
```
...
#include "mstx/ms_tools_ext.h"
...
```

Example

Run the command msprof op --mstx=on --mstx-include=range1 --launch-count=2 python cal.py, where cal.py is the script shown below. The command profiles only the operators inside the range1 scope, that is, the sub and mul operators.

```
import mstx
import torch
import torch_npu  # makes the NPU device available to torch

x = torch.Tensor([1, 2, 3, 4]).npu()
y = torch.Tensor([1, 2, 3, 4]).npu()

a = x + y  # outside every range

# range1 covers the sub and mul operators.
range1_id = mstx.range_start("range1", None)
b = a - x
c = a * x
mstx.range_end(range1_id)

# range2 covers div, abs, and add; range3 is nested inside range2.
range2_id = mstx.range_start("range2", None)
d = x / y
range3_id = mstx.range_start("range3", None)
e = torch.abs(y)
mstx.range_end(range3_id)
f = x + e
mstx.range_end(range2_id)
```
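
In the script above, each range_start call must be paired with a range_end call by hand. One possible way to keep the pairs balanced, even when the marked code raises an exception, is a small context manager. This is a sketch only: the helper mstx_range and its api parameter are hypothetical and not part of the mstx API. It relies solely on the two calls shown in the example, range_start(name, None) and range_end(range_id).

```python
from contextlib import contextmanager


@contextmanager
def mstx_range(name, api=None):
    """Open an mstx range on entry and close it on exit.

    `api` is any object that provides range_start(name, stream) and
    range_end(range_id). When omitted, the real mstx module is imported.
    """
    if api is None:
        import mstx as api  # requires the mstx Python package
    range_id = api.range_start(name, None)
    try:
        yield range_id
    finally:
        # Runs on normal exit and when the body raises.
        api.range_end(range_id)
```

With this helper, the range1 section of the script could be written as a with mstx_range("range1"): block that contains the b = a - x and c = a * x statements.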