Monitor
Overview
Monitor is a set of APIs integrated in MindStudio Monitor. You can call these APIs to start and stop performance monitoring and obtain monitoring data.
Preparations
Install msMonitor. For details, see msMonitor Installation Guide. Installing from the downloaded software package is recommended.
Monitor Functions
Function
Provides easy-to-use APIs to collect profile data of compute operators, communication operators, APIs, Runtime APIs, and MSTX. You can select metrics to be collected as needed.
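Because monitoring is bracketed by a start call and a stop call, one way to make sure the stop call always runs is to wrap the pair in a context manager. The following is a minimal sketch, not part of msMonitor: monitoring is a hypothetical helper, and FakeMonitor is a stand-in that only mimics the start(kinds=...) and stop() calls used in the example in this section, so the sketch runs without an NPU.

```python
from contextlib import contextmanager


@contextmanager
def monitoring(monitor, kinds):
    """Start `monitor` on entry and always stop it on exit (hypothetical helper)."""
    monitor.start(kinds=kinds)
    try:
        yield monitor
    finally:
        # Runs even if the monitored code raises, so collection is not left on.
        monitor.stop()


class FakeMonitor:
    """Stand-in for a Monitor object; it only records the calls it receives."""

    def __init__(self):
        self.events = []

    def start(self, kinds):
        self.events.append(("start", tuple(kinds)))

    def stop(self):
        self.events.append(("stop",))


fake = FakeMonitor()
try:
    with monitoring(fake, kinds=["Kernel"]):
        raise RuntimeError("simulated failure inside the monitored region")
except RuntimeError:
    pass

print(fake.events)
```

With a real Monitor instance, the same helper would be entered as `with monitoring(monitor, kinds=[...]):` around the model code, assuming the object exposes start and stop as shown in the example.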
API Description
For details, see Monitor Feature APIs for mindstudio_monitor.
Example
1. Import the Monitor APIs to the model Python script.

    from msmonitor import Monitor, ActivityKind

2. Call the Monitor APIs in the model Python script to start and stop performance monitoring.

    import torch
    import torch.nn as nn
    # Ascend adapter for PyTorch; provides torch.npu and Tensor.npu().
    # Recent releases may load it automatically.
    import torch_npu
    from msmonitor import Monitor, ActivityKind

    class FeatureExtractor(nn.Module):
        def __init__(self, in_channels=3, out_channels=16, kernel_size=3):
            super(FeatureExtractor, self).__init__()
            self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=1)
            self.relu = nn.ReLU()
            self.pool = nn.AdaptiveAvgPool2d((4, 4))

        def forward(self, x):
            x = self.conv(x)
            x = self.relu(x)
            x = self.pool(x)
            return x

    # Enable performance monitoring.
    monitor = Monitor()
    monitor.start(kinds=[
        ActivityKind.API,
        ActivityKind.Kernel,
        ActivityKind.Marker
    ])

    # Run the model.
    batch_size = 4
    input_tensor = torch.randn(batch_size, 3, 32, 32).npu()
    extractor = FeatureExtractor(in_channels=3, out_channels=16, kernel_size=3).npu()
    linear_layer = nn.Linear(in_features=256, out_features=128).npu()

    for i in range(10):
        # Mark each iteration as an mstx range so that it appears in the Marker data.
        range_id = torch.npu.mstx.range_start(f"step {i}", torch.npu.current_stream())
        features = extractor(input_tensor)
        flat_features = features.view(batch_size, -1)
        x = linear_layer(flat_features)
        w = torch.randn(128, 64).npu()
        y = torch.matmul(x, w)
        torch.npu.mstx.range_end(range_id)
    torch.npu.synchronize()

    # Stop performance monitoring.
    monitor.stop()

    # (Optional) Obtain profile data online. For details, see step 3.
    result = monitor.get_result()

    # (Optional) Save the profile data to a local file. For details, see step 4.
    monitor.save("monitor_result.xlsx")

3. (Optional) Obtain profile data online. For details about the returned data structure, see ActivityData Data Structure.

    # Obtain and print profile data.
    result = monitor.get_result()
    for kind, data in result.items():
        for item in data:
            print(f"kind: {kind}, name: {item.name}, durationNs: {item.endNs - item.startNs}")

4. (Optional) Save the profile data to a local file. Currently, only the Excel format is supported. For details about the file, see Output File Description.

    # Save the profile data to a local file.
    monitor.save("monitor_result.xlsx")
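The per-item printout in step 3 can get long, so it may help to total the durations per name first. The sketch below assumes only what step 3 shows: a mapping from kind to a list of items with name, startNs, and endNs attributes. The function total_duration_by_name is a hypothetical helper, and the sample records are made up (string keys stand in for the real kind values) so that the code runs without msMonitor.

```python
from collections import defaultdict
from types import SimpleNamespace


def total_duration_by_name(result):
    """Return {kind: {name: total duration in ns}} for a get_result()-style mapping."""
    totals = {}
    for kind, items in result.items():
        per_name = defaultdict(int)
        for item in items:
            per_name[item.name] += item.endNs - item.startNs
        totals[kind] = dict(per_name)
    return totals


# Made-up records that only mimic the attributes used in step 3.
sample = {
    "Kernel": [
        SimpleNamespace(name="MatMul", startNs=100, endNs=400),
        SimpleNamespace(name="MatMul", startNs=500, endNs=700),
        SimpleNamespace(name="Relu", startNs=710, endNs=760),
    ],
    "Marker": [SimpleNamespace(name="step 0", startNs=50, endNs=800)],
}

summary = total_duration_by_name(sample)
for kind, per_name in summary.items():
    # Largest total first, converted from nanoseconds to microseconds.
    for name, total_ns in sorted(per_name.items(), key=lambda kv: -kv[1]):
        print(f"{kind:8s} {name:10s} {total_ns / 1000:.3f} us")
```

In a real script, `sample` would be replaced by the value returned from monitor.get_result().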
Output File Description
The output Excel file contains multiple sheets. Each sheet corresponds to a data type, such as API, kernel, and marker. You can view different sheets to analyze the execution time of operators and APIs.
See the following figure.

The fields on each sheet are described as follows:
Marker
- Name: content of the mstx marker message
- SourceKind: message source type, either Host or Device
- Domain: name of the domain to which the message belongs
- ID: message ID
- Start(us): mstx marker start time, in microseconds
- End(us): mstx marker end time, in microseconds
- Pid: process ID when SourceKind is Host, and 0 when SourceKind is Device
- Tid: thread ID when SourceKind is Host, and 0 when SourceKind is Device
- Device ID: ID of the device to which the marker belongs when SourceKind is Device, or 0 when SourceKind is Host
- Stream ID: ID of the stream to which the marker belongs when SourceKind is Device, or 0 when SourceKind is Host
- Duration(us): mstx marker duration, in microseconds
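As an illustration of how the Marker fields might be used, the sketch below extracts per-step durations for one SourceKind. It assumes the rows have already been loaded into dictionaries keyed by the column names above. The helper step_durations and all values are hypothetical, and the "step N" names simply mirror the range names used in the example script.

```python
from statistics import mean


def step_durations(marker_rows, source_kind="Device", prefix="step "):
    """Collect Duration(us) of rows from one source whose Name starts with `prefix`."""
    return [
        float(row["Duration(us)"])
        for row in marker_rows
        if row["SourceKind"] == source_kind and row["Name"].startswith(prefix)
    ]


# Made-up rows using the Marker sheet's column names; values are illustrative only.
rows = [
    {"Name": "step 0", "SourceKind": "Host", "Duration(us)": 950.0},
    {"Name": "step 0", "SourceKind": "Device", "Duration(us)": 900.0},
    {"Name": "step 1", "SourceKind": "Host", "Duration(us)": 420.0},
    {"Name": "step 1", "SourceKind": "Device", "Duration(us)": 400.0},
    {"Name": "step 2", "SourceKind": "Device", "Duration(us)": 380.0},
]

device = step_durations(rows)
# The first iteration often includes warm-up cost, so it is reported separately.
print(f"first step: {device[0]:.1f} us")
print(f"mean of remaining steps: {mean(device[1:]):.1f} us")
```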
Kernel
- Name: name of a compute operator
- Start(us): operator execution start time, in microseconds
- End(us): operator execution end time, in microseconds
- Device ID: ID of the device where the operator is executed
- Stream ID: ID of the stream where the operator is executed
- Correlation ID: operator execution correlation ID, which is used to associate with API data
- Type: operator type, for example, KERNEL_AICORE, KERNEL_AIVEC, or KERNEL_AICPU
- Duration(us): operator execution time, in microseconds
Communication
- Name: name of a communication operator
- Start(us): operator execution start time, in microseconds
- End(us): operator execution end time, in microseconds
- Device ID: ID of the device where the operator is executed
- Stream ID: ID of the stream where the operator is executed
- Count: data volume transmitted by the operator
- DataType: data type transmitted by the operator, for example, FP32 or INT8
- CommName: name of the communicator to which the operator belongs
- AlgType: communication algorithm type of the operator, for example, RING or MESH
- Correlation ID: operator execution correlation ID, which is used to associate with API data
- Duration(us): operator execution time, in microseconds
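One possible use of Count, DataType, and Duration(us) is a rough payload estimate per communication operator. The sketch below rests on an assumption that the field description does not confirm: that Count is a number of elements rather than a number of bytes. The helpers, the operator names, and the values are hypothetical, and the size table covers only the two DataType values named above.

```python
# Bytes per element for the two DataType values named in the field list.
# Extend this table with whatever values appear in your own file.
DTYPE_BYTES = {"FP32": 4, "INT8": 1}


def payload_bytes(row):
    """Estimate payload size, ASSUMING Count is a number of elements."""
    return int(row["Count"]) * DTYPE_BYTES[row["DataType"]]


def bytes_per_us(row):
    """Payload bytes divided by Duration(us); not a measured link bandwidth."""
    return payload_bytes(row) / float(row["Duration(us)"])


# Made-up rows using the Communication sheet's column names.
rows = [
    {"Name": "allreduce_example", "Count": 1_000_000, "DataType": "FP32", "Duration(us)": 200.0},
    {"Name": "broadcast_example", "Count": 4096, "DataType": "INT8", "Duration(us)": 16.0},
]

for row in rows:
    print(f'{row["Name"]}: {payload_bytes(row)} bytes, {bytes_per_us(row):.1f} bytes/us')
```

If Count turns out to be expressed in bytes in your data, drop the multiplication by the element size.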
API, AclAPI, NodeAPI, and RuntimeAPI
- Name: API name
- Start(us): API call start time, in microseconds
- End(us): API call end time, in microseconds
- Pid: ID of the process that calls the API
- Tid: ID of the thread that calls the API
- Correlation ID: API call correlation ID, which is used to associate with kernel/communication data
- Duration(us): API call duration, in microseconds
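Since Correlation ID links API rows to kernel and communication rows, the two sides can be joined on that field. The following is a minimal sketch assuming rows loaded as dictionaries keyed by the column names above. The helper kernels_by_api, the row names, and the values are hypothetical, and whether one API call can map to several kernels in real data is an assumption made only for illustration.

```python
from collections import defaultdict


def kernels_by_api(api_rows, kernel_rows):
    """Group kernel rows under the API row that shares their Correlation ID."""
    by_corr = defaultdict(list)
    for kernel in kernel_rows:
        by_corr[kernel["Correlation ID"]].append(kernel)
    return [(api, by_corr.get(api["Correlation ID"], [])) for api in api_rows]


# Made-up rows; only the columns needed for the join are included.
api_rows = [
    {"Name": "api_a", "Correlation ID": 1, "Duration(us)": 30.0},
    {"Name": "api_b", "Correlation ID": 2, "Duration(us)": 12.0},
    {"Name": "api_c", "Correlation ID": 3, "Duration(us)": 5.0},
]
kernel_rows = [
    {"Name": "kernel_x", "Correlation ID": 1, "Duration(us)": 80.0},
    {"Name": "kernel_y", "Correlation ID": 1, "Duration(us)": 20.0},
    {"Name": "kernel_z", "Correlation ID": 2, "Duration(us)": 45.0},
]

pairs = kernels_by_api(api_rows, kernel_rows)
for api, kernels in pairs:
    device_us = sum(k["Duration(us)"] for k in kernels)
    names = ", ".join(k["Name"] for k in kernels) or "(no kernel)"
    print(f'{api["Name"]}: host {api["Duration(us)"]} us, device {device_us} us [{names}]')
```

The same join would apply to Communication rows, which carry a Correlation ID as well.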