super_kernel Minimal Sample
Use Case Function:
This sample demonstrates how to use super_kernel to complete operator fusion, including operator fusion definition, compilation, execution, and other steps. Core features:
- Simple dependencies, only depends on AscendC and runtime environment.
- Uses Python wrapper for underlying C interfaces, simplifying development workflow.
Directory Structure
├── super_kernel_runtime_ascendc_only # directory
└── superkernel_runtime_ascendc_basic.py # main entry, flow includes sub-kernel compilation, superkernel compilation, memory allocation, loading execution, and so on
└── compile_sk.py # compile sub_kernel, super_kernel operators
└── utils.py # utility functions
Use Case Introduction
graph TB
%% Overall view: two inputs -> superkernel -> output
input1[input1] --> sk[superkernel]
input2[input2] --> sk
sk --> output[output]
%% Enlarged internal view: subgraph shows superkernel internal structure
subgraph superkernel_internal_structure
inner1_in(inner_input1) --> pow(pow)
inner2_in(inner_input2) --> pow
pow --> isinf(isinf)
isinf --> inner_out(inner_output)
end
This use case demonstrates super_kernel basic functionality through compile-time dependency on ascendc and runtime dependency. Main steps:
-
- Initialization
-
- Compile sub_kernel, compile super_kernel, set sub-kernel topology relationship in super kernel for memory allocation
-
- Memory allocation, input data construction
-
- Kernel loading
-
- Launch execution, including args layout and so on
-
- Print output, output result validation
-
- Resource cleanup and release, including memory, kernel, stream, and so on
Explanation:
- Sub-kernel topology relationship is represented through strings. For example, if pow output is isinf input, then pow output and isinf input use the same string representation.
- When allocating memory, use strings to express same memory addresses.
- When launching args, layout follows [pow_in1, pow_in2, pow_ws, isinf_in1, isinf_out1, isinf_ws]
Execution Command
python3 superkernel_runtime_ascendc_basic.py
Expected Execution Result
After execution, print shows success:
execute sample success