triton-ascend/third_party/ascend/lib/DynamicCVPipeline · Ascend/triton-ascend - AtomGit

File	Last commit	Last updated
AddControlFlowCondition	!1829 merge 0512_ut into main feat(ssbuf): Fix tag missing and add ut for multi cache inner scope Created-by: cxtverygood123 Commit-by: cxtverygood123 Merged-by: ascend-robot Description: Fixes the missing block_id tag on the counter ops, changes the producer op, and adds the related unit tests. Main changes: 1. Fix the missing block_id tag on the ops of the intra-core multi-buffer round-robin counter; 2. Change the intra-core multi-buffer producer's physical write op to a copy, so that it provides the input required by the split-one-into-two pass; 3. Add unit test cases for the intra-core multi-buffer. See merge request: Ascend/triton-ascend!1829	20 days ago
AllocMultiCache	add ut for inner scope	20 days ago
Common	!1792 merge pr into main implementation code of four passes of the SplitDataflow Created-by: liurufeng25 Commit-by: liurufeng25 Merged-by: ascend-robot Description: Complete code for the four passes that insert data transfer and synchronization for the ssbuffer CV core split: 1. AddBlockIdForControlOpsPass: fills in the block_id tags on control-flow ops 2. DataDependencyAnalysisPass: data dependency analysis 3. InterCoreTransferAndSyncPass: inserts data transfer and synchronization 4. MarkMainLoopPass: marks the main loop See merge request: Ascend/triton-ascend!1792	21 days ago
PlanComputeBlock	!1709 merge cube into main feat: cube blocks for ssbuf Created-by: OliverCWY Commit-by: chenwuyang Merged-by: ascend-robot Description: Title: Implement PlanCubeBlockPass for Cube Operator Partitioning in Dynamic CV Pipeline Overview: This PR introduces the `PlanCubeBlockPass` within the `DynamicCVPipeline` under `third_party/ascend`. The primary objective of this pass is to partition Cube-type operations (matrix-related) into logical compute blocks tailored for Ascend hardware execution. Key Components: 1. Two-Phase Planning Strategy: * SeedRegionPlanner: Uses core matrix operators (e.g., `linalg.matmul`) as "seeds" and expands the group via Breadth-First Search (BFS), upstream through operands and downstream through users, to include closely related auxiliary operations (transposes, loads, pointer arithmetic, etc.) in a unified Cube block. * TopologicalPartitionPlanner: Processes any remaining unassigned Cube operations in topological order, ensuring all Cube ops are partitioned without violating data dependencies. 2. Dependency Cycle Detection: Implements a `DependencyCycleDetector` to identify and prevent the creation of illegal cyclic dependencies during the grouping process, guaranteeing a valid execution sequence. 3. Pipeline Integration: Integrates the pass into the `AddDynamicCVPipeline` workflow, positioned after operator classification (`OpClassifier`) and before vector partitioning (`PlanVectorBlock`), laying the groundwork for later resource allocation and instruction scheduling. 4. Infrastructure Updates: Adds core definitions like `kBlockId` and helper functions such as `isCubeOp` in `Utils.h` and `Common.h`. It also refactors `ComputeBlockIdManager` to support unified block ID management for Cube operations. See merge request: Ascend/triton-ascend!1709	21 days ago
SeparateMemoryFromCompute	feat: implement AddMultiBufferToGMLoad pass	22 days ago
SplitDataflow	!1719 merge CVScope into main feat(CV):Add separate CV scope split implementation Created-by: wdy45 Commit-by: wdy45 Merged-by: ascend-robot Description: Background: After cross-core dataflow analysis and main-loop marking, SplitDataflow must split mixed Vector/Cube computations into separate scopes and avoid losing important control-flow attributes during canonicalization. This PR adds two passes to address those needs and inserts them into the SplitDataflow pipeline. Changes: 1. Add SeparateCVScope pass: classifies ops/results by ssbuffer.core_type as VECTOR or CUBE; builds two scopes per function (VECTOR and CUBE) and retains the corresponding content; normalizes mixed-scope control-flow operations; neutralizes or preserves non-target-scope values at yield/terminator and loop-carry sites; removes temporary ssbuffer.core_type attributes at the end. 2. Add PreserveControlAttrsCanonicalize pass: runs greedy canonicalization with a listener to track inserts/replaces/erases; for tracked control-flow ops (scf::ForOp, scf::IfOp, scf::WhileOp, scf::ParallelOp), transfers non-conflicting attributes from the old op to the new op when an equivalent replacement is created; prevents loss of control-flow attributes during canonicalization. See merge request: Ascend/triton-ascend!1719	21 days ago
AddControlFlowCondition.cpp	Supplement ut use cases	21 days ago
AddDynamicCVPipeline.cpp	add plan cube pass	22 days ago
AllocMultiCache.cpp	initialization of outer scope	22 days ago
CMakeLists.txt	!1709 merge cube into main feat: cube blocks for ssbuf Created-by: OliverCWY Commit-by: chenwuyang Merged-by: ascend-robot See merge request: Ascend/triton-ascend!1709	21 days ago
PlanComputeBlock.cpp	!1709 merge cube into main feat: cube blocks for ssbuf Created-by: OliverCWY Commit-by: chenwuyang Merged-by: ascend-robot See merge request: Ascend/triton-ascend!1709	21 days ago
SeparateMemoryFromComputePass.cpp	feat: add gm_load_bufferable attribute for async load hoisting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	23 days ago
SplitDataflow.cpp	Add separate CV scope split implementation	21 days ago
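
The two-phase planning strategy described in merge request !1709 (seed-based BFS grouping, then a topological sweep over the remaining Cube ops) can be illustrated with a small standalone sketch. This is not the code in `PlanComputeBlock.cpp`: `ToyGraph` and `planCubeBlocks` are hypothetical names, ops are modelled as plain integer indices assumed to be listed in topological order, and the boolean vectors stand in for checks such as `isCubeOp`. The rule used in phase 2 (join the block of an already assigned producer, otherwise open a new block) is an assumption made for illustration, and the cycle check that the real pass performs through `DependencyCycleDetector` is omitted.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical toy model: ops are indices 0..n-1 in topological order.
struct ToyGraph {
  std::vector<std::vector<int>> operands; // producers of each op
  std::vector<bool> isCube;               // stand-in for an isCubeOp-style check
  std::vector<bool> isSeed;               // stand-in for matmul-like seed ops
};

// Returns one block id per op; -1 means "not a Cube op, left unassigned".
std::vector<int> planCubeBlocks(const ToyGraph &g) {
  const int n = static_cast<int>(g.operands.size());
  assert(g.isCube.size() == g.operands.size());
  assert(g.isSeed.size() == g.operands.size());

  // Derive the user lists (downstream edges) from the operand lists.
  std::vector<std::vector<int>> users(n);
  for (int op = 0; op < n; ++op)
    for (int producer : g.operands[op])
      users[producer].push_back(op);

  std::vector<int> blockId(n, -1);
  int nextId = 0;

  // Phase 1: grow a block from each seed with BFS over operands and users,
  // absorbing only Cube ops that have no block yet.
  for (int seed = 0; seed < n; ++seed) {
    if (!g.isSeed[seed] || !g.isCube[seed] || blockId[seed] != -1)
      continue;
    const int id = nextId++;
    std::queue<int> worklist;
    blockId[seed] = id;
    worklist.push(seed);
    while (!worklist.empty()) {
      const int op = worklist.front();
      worklist.pop();
      auto visit = [&](int neighbor) {
        if (g.isCube[neighbor] && blockId[neighbor] == -1) {
          blockId[neighbor] = id;
          worklist.push(neighbor);
        }
      };
      for (int producer : g.operands[op]) visit(producer);
      for (int user : users[op]) visit(user);
    }
  }

  // Phase 2: sweep the remaining Cube ops in topological order. Assumed rule:
  // reuse the block of the first assigned producer, else start a new block.
  for (int op = 0; op < n; ++op) {
    if (!g.isCube[op] || blockId[op] != -1)
      continue;
    for (int producer : g.operands[op]) {
      if (blockId[producer] != -1) {
        blockId[op] = blockId[producer];
        break;
      }
    }
    if (blockId[op] == -1)
      blockId[op] = nextId++;
  }
  return blockId;
}
```

For a chain load → transpose → matmul (seed) → vector op → two further Cube ops, the first three ops share the seed's block, the vector op stays unassigned, and the last two Cube ops end up together in a second block created during the sweep.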