CATLASS API List

CATLASS provides a layered GEMM API in which templates are assembled hierarchically from the bottom up (Basic → Tile → Block → Kernel → Device) to implement operators. Developers can reuse the low-level components as they are or write new high-level components on top of them to build customized operators.
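
The bottom-up composition described above can be illustrated with a small host-side sketch. All names here (`TileMmad`, `BlockMmad`, `GemmKernel`, `DeviceGemm`) are hypothetical and are not the CATLASS API; the code runs on the CPU and only shows how each layer is a template parameterized by the layer beneath it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is NOT the CATLASS API.
// Host-side sketch of bottom-up composition: Tile -> Block -> Kernel -> Device.

// Tile level: multiply-accumulate on one small tile (stands in for a microkernel).
template <std::size_t TileM, std::size_t TileN, std::size_t TileK>
struct TileMmad {
    static constexpr std::size_t kM = TileM;
    static constexpr std::size_t kN = TileN;
    static constexpr std::size_t kK = TileK;

    // C += A * B restricted to the tile at (m0, n0, k0); row-major, clipped to bounds.
    void operator()(const float* a, const float* b, float* c,
                    std::size_t m0, std::size_t n0, std::size_t k0,
                    std::size_t M, std::size_t N, std::size_t K) const {
        for (std::size_t m = m0; m < m0 + TileM && m < M; ++m)
            for (std::size_t n = n0; n < n0 + TileN && n < N; ++n)
                for (std::size_t k = k0; k < k0 + TileK && k < K; ++k)
                    c[m * N + n] += a[m * K + k] * b[k * N + n];
    }
};

// Block level: the main loop over K for one output tile, reusing the tile component.
template <class TileOp>
struct BlockMmad {
    using Tile = TileOp;

    void operator()(const float* a, const float* b, float* c,
                    std::size_t m0, std::size_t n0,
                    std::size_t M, std::size_t N, std::size_t K) const {
        TileOp tile;
        for (std::size_t k0 = 0; k0 < K; k0 += TileOp::kK)
            tile(a, b, c, m0, n0, k0, M, N, K);
    }
};

// Kernel level: orchestrates all blocks across the output matrix.
template <class BlockOp>
struct GemmKernel {
    void operator()(const float* a, const float* b, float* c,
                    std::size_t M, std::size_t N, std::size_t K) const {
        using Tile = typename BlockOp::Tile;
        BlockOp block;
        for (std::size_t m0 = 0; m0 < M; m0 += Tile::kM)
            for (std::size_t n0 = 0; n0 < N; n0 += Tile::kN)
                block(a, b, c, m0, n0, M, N, K);
    }
};

// Device level: host-facing adapter that owns the output and launches the kernel.
template <class Kernel>
struct DeviceGemm {
    std::vector<float> run(const std::vector<float>& a, const std::vector<float>& b,
                           std::size_t M, std::size_t N, std::size_t K) const {
        assert(a.size() == M * K && b.size() == K * N);
        std::vector<float> c(M * N, 0.0f);
        Kernel{}(a.data(), b.data(), c.data(), M, N, K);
        return c;
    }
};

// Assemble the layers from the bottom up.
using Tile = TileMmad<2, 2, 2>;
using Block = BlockMmad<Tile>;
using Kernel = GemmKernel<Block>;
using Gemm = DeviceGemm<Kernel>;
```

Swapping in a different tile shape or a different block-level main loop only changes one `using` line; the layers above it are reused unchanged, which is the kind of reuse the layered design is meant to enable.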

| Component Classification | Description |
| --- | --- |
| gemm/kernel | Entry point for device-side execution, representing the collective orchestration and execution logic of all blocks on the NPU. |
| gemm/block | The primary interface governing the main loop of block-level matrix multiplication and accumulation (MMAD). |
| gemm/tile | Leverages base APIs to construct the NPU microkernels required for GEMM primitives. |
| epilogue/block | The block-level epilogue component for GEMM, which can also be applied to non-GEMM computations. |
| epilogue/fusion | The graph orchestrator and foundational node components for EVG. |
| epilogue/tile | Leverages base APIs to construct the NPU microkernels required for epilogue operations. |
| TLA | Tensor Layout Architecture. Abstracts underlying data storage details and provides generalized algorithms for multidimensional array access. |
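
To make the TLA entry more concrete, the following is a minimal sketch of the general idea of a layout abstraction: a shape plus strides that map a multidimensional coordinate to a linear offset, so that an algorithm can be written once and work over different storage orders. The names `Layout`, `make_row_major`, `make_col_major`, and `sum_row` are hypothetical and do not correspond to the actual TLA types or functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sketch of a layout abstraction; NOT the actual TLA API.
template <std::size_t Rank>
struct Layout {
    std::array<std::size_t, Rank> shape;
    std::array<std::size_t, Rank> stride;

    // Linear offset = sum over dimensions of coord[i] * stride[i].
    std::size_t operator()(const std::array<std::size_t, Rank>& coord) const {
        std::size_t offset = 0;
        for (std::size_t i = 0; i < Rank; ++i) {
            assert(coord[i] < shape[i]);
            offset += coord[i] * stride[i];
        }
        return offset;
    }
};

// Row-major 2-D layout: consecutive columns are adjacent in memory.
inline Layout<2> make_row_major(std::size_t rows, std::size_t cols) {
    Layout<2> layout;
    layout.shape = {rows, cols};
    layout.stride = {cols, 1};
    return layout;
}

// Column-major 2-D layout: consecutive rows are adjacent in memory.
inline Layout<2> make_col_major(std::size_t rows, std::size_t cols) {
    Layout<2> layout;
    layout.shape = {rows, cols};
    layout.stride = {1, rows};
    return layout;
}

// A generic algorithm written against the layout, independent of storage order.
template <class LayoutT>
float sum_row(const float* data, const LayoutT& layout, std::size_t row) {
    float acc = 0.0f;
    for (std::size_t col = 0; col < layout.shape[1]; ++col) {
        std::array<std::size_t, 2> coord = {row, col};
        acc += data[layout(coord)];
    }
    return acc;
}
```

With this sketch, the same `sum_row` call gives the same result for a matrix stored row-major or column-major, because only the layout object knows how coordinates map to memory.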