File | Last commit | Last updated
!1829 merge 0512_ut into main
feat(ssbuf): Fix tag missing and add ut for multi cache inner scope
Created-by: cxtverygood123
Commit-by: cxtverygood123
Merged-by: ascend-robot
Description: Fixes the missing block_id tag on counter ops, changes the producer op, and adds the related UTs. Main changes:
1. Fix the missing block_id tag on the ops related to the intra-core multi-buffer round-robin counter.
2. Change the intra-core multi-buffer producer's physical write op to a copy, so that it provides the input required by the one-into-two split (一切二) pass.
3. Add unit test cases for intra-core multi-buffer.
See merge request: Ascend/triton-ascend!1829
20 days ago
add ut for inner scope 20 days ago
!1792 merge pr into main
implementation code of four passes of the SplitDataflow
Created-by: liurufeng25
Commit-by: liurufeng25
Merged-by: ascend-robot
Description: ssbuffer CV (Cube/Vector) core split: the part that inserts data transfers and synchronization. Complete code for four passes:
1. AddBlockIdForControlOpsPass: fills in the missing block_id tags on control-flow ops
2. DataDependencyAnalysisPass: data dependency analysis
3. InterCoreTransferAndSyncPass: inserts data transfers and synchronization
4. MarkMainLoopPass: marks the main loop
See merge request: Ascend/triton-ascend!1792
21 days ago
!1709 merge cube into main
feat: cube blocks for ssbuf
Created-by: OliverCWY
Commit-by: chenwuyang
Merged-by: ascend-robot
Description:
**Title: Implement PlanCubeBlockPass for Cube Operator Partitioning in Dynamic CV Pipeline**
**Overview:** This PR introduces the PlanCubeBlockPass within the DynamicCVPipeline under third_party/ascend. The primary objective of this pass is to partition Cube-type (matrix-related) operations into compute blocks suited to Ascend hardware.
**Key Components:**
1. **Two-Phase Planning Strategy:**
   * **SeedRegionPlanner:** Uses core matrix operators (e.g., linalg.matmul) as "seeds" and expands the group via breadth-first search (BFS), upstream through operands and downstream through users, to pull closely related auxiliary operations (transposes, loads, pointer arithmetic, etc.) into the same Cube block.
   * **TopologicalPartitionPlanner:** Processes the remaining unassigned Cube operations in topological order, so that all Cube ops are partitioned without violating data dependencies.
2. **Dependency Cycle Detection:** Implements a DependencyCycleDetector that checks for illegal cyclic dependencies while ops are being merged and prevents them, so that the resulting schedule order remains executable.
3. **Pipeline Integration:** Integrates the pass into the AddDynamicCVPipeline workflow, after operator classification (OpClassifier) and before vector partitioning (PlanVectorBlock), as a basis for later resource allocation and instruction scheduling.
4. **Infrastructure Updates:** Adds the kBlockId attribute definition and helper functions such as isCubeOp in Utils.h and Common.h, and refactors ComputeBlockIdManager to support unified block ID management for Cube operations.
See merge request: Ascend/triton-ascend!1709
21 days ago
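The seed expansion and cycle check described for !1709 can be illustrated with a small sketch. This is not the pass's actual code (the real pass is written against MLIR); it is a minimal Python model under stated assumptions: ops are plain strings, `operands` maps every op to the ops it consumes, and the names `plan_seed_regions`, `block_graph_is_acyclic`, `is_seed`, and `is_cube` are hypothetical. The cycle check here simply rebuilds the block-level graph and runs Kahn's algorithm after each tentative merge, which is simple rather than efficient.

```python
from collections import deque


def block_graph_is_acyclic(operands, block_of):
    """Kahn's algorithm on the graph whose nodes are blocks.

    Ops without a block are treated as singleton nodes (assumption).
    """
    def node(op):
        return block_of.get(op, ("op", op))

    succ, indeg = {}, {}
    for op in operands:
        succ.setdefault(node(op), set())
        indeg.setdefault(node(op), 0)
    for op, ins in operands.items():
        for producer in ins:
            a, b = node(producer), node(op)
            if a != b and b not in succ[a]:
                succ[a].add(b)
                indeg[b] += 1

    ready = deque(n for n, d in indeg.items() if d == 0)
    visited = 0
    while ready:
        n = ready.popleft()
        visited += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    return visited == len(indeg)


def plan_seed_regions(operands, is_seed, is_cube):
    """Grow one block per seed by BFS over operands and users."""
    users = {op: [] for op in operands}
    for op, ins in operands.items():
        for producer in ins:
            users[producer].append(op)

    block_of, next_id = {}, 0
    for seed in operands:
        if not is_seed(seed) or seed in block_of:
            continue
        block_id, next_id = next_id, next_id + 1
        block_of[seed] = block_id
        queue = deque([seed])
        while queue:
            op = queue.popleft()
            for neighbor in operands[op] + users[op]:
                if neighbor in block_of or not is_cube(neighbor):
                    continue
                block_of[neighbor] = block_id      # tentative merge
                if block_graph_is_acyclic(operands, block_of):
                    queue.append(neighbor)
                else:
                    del block_of[neighbor]         # would create a cycle
    return block_of


# Toy graph: "exp" is a vector op; "fixup" is a cube op that uses both
# "matmul" and "exp", so merging it into the matmul block would make the
# block depend on "exp" while "exp" depends on the block.
operands = {
    "load_a": [],
    "load_b": [],
    "trans_b": ["load_b"],
    "matmul": ["load_a", "trans_b"],
    "exp": ["matmul"],
    "fixup": ["matmul", "exp"],
}
cube_ops = {"load_a", "load_b", "trans_b", "matmul", "fixup"}
blocks = plan_seed_regions(
    operands,
    is_seed=lambda op: op == "matmul",
    is_cube=lambda op: op in cube_ops,
)
```

In this toy graph the loads and the transpose join the matmul's block, while `fixup` is left unassigned, which is the kind of leftover op a later topological pass would have to place.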
feat: implement AddMultiBufferToGMLoad pass 22 days ago
!1719 merge CVScope into main
feat(CV):Add separate CV scope split implementation
Created-by: wdy45
Commit-by: wdy45
Merged-by: ascend-robot
Description:
Background
After cross-core dataflow analysis and main-loop marking, SplitDataflow must split mixed Vector/Cube computations into separate scopes and avoid losing important control-flow attributes during canonicalization. This PR adds two passes to address those needs and inserts them into the SplitDataflow pipeline.
Changes
Add SeparateCVScope pass
- Classifies ops/results by ssbuffer.core_type as VECTOR or CUBE.
- Builds two scopes per function (VECTOR and CUBE) and retains the corresponding content.
- Normalizes mixed-scope control-flow operations.
- Neutralizes or preserves non-target-scope values at yield/terminator and loop-carry sites.
- Removes temporary ssbuffer.core_type attributes at the end.
Add PreserveControlAttrsCanonicalize pass
- Runs greedy canonicalization with a listener to track inserts/replaces/erases.
- For tracked control-flow ops (scf::ForOp, scf::IfOp, scf::WhileOp, scf::ParallelOp), transfers non-conflicting attributes from the old op to the new op when an equivalent replacement is created.
- Prevents loss of control-flow attributes during canonicalization.
See merge request: Ascend/triton-ascend!1719
21 days ago
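The two ideas in the !1719 description, splitting ops by core type and carrying attributes over to a replacement op, can be modelled in a few lines. This is only a sketch under assumptions and not the passes' real implementation: `Op`, `separate_cv_scopes`, and `transfer_non_conflicting` are hypothetical names, "non-conflicting" is taken to mean "not already set on the new op", and ops without a core type (such as shared control flow) are copied into both scopes. Only the attribute name `ssbuffer.core_type` comes from the description.

```python
from dataclasses import dataclass, field

CORE_TYPE = "ssbuffer.core_type"  # attribute name taken from the description


@dataclass
class Op:
    name: str
    attrs: dict = field(default_factory=dict)


def separate_cv_scopes(ops):
    """Split ops into VECTOR and CUBE scopes and drop the temporary tag."""
    scopes = {"VECTOR": [], "CUBE": []}
    for op in ops:
        core = op.attrs.get(CORE_TYPE)
        # Assumption: untagged ops (e.g. shared control flow) go to both scopes.
        targets = [core] if core in scopes else list(scopes)
        for target in targets:
            kept = {k: v for k, v in op.attrs.items() if k != CORE_TYPE}
            scopes[target].append(Op(op.name, kept))
    return scopes


def transfer_non_conflicting(old, new):
    """Copy attributes from old to new unless new already defines them."""
    for key, value in old.attrs.items():
        new.attrs.setdefault(key, value)
    return new


ops = [
    Op("scf.for", {"demo.tag": 1}),              # "demo.tag" is made up
    Op("linalg.matmul", {CORE_TYPE: "CUBE"}),
    Op("arith.addf", {CORE_TYPE: "VECTOR"}),
]
scopes = separate_cv_scopes(ops)

old_loop = Op("scf.for", {"demo.tag": 1, "demo.step": 2})
new_loop = transfer_non_conflicting(old_loop, Op("scf.for", {"demo.step": 4}))
```

Here the replacement loop keeps its own `demo.step` value and gains `demo.tag` from the op it replaced, which mirrors the stated goal of not losing control-flow attributes when canonicalization recreates an op.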
Supplement ut use cases 21 days ago
add plan cube pass 22 days ago
initialization of outer scope 22 days ago
feat: add gm_load_bufferable attribute for async load hoisting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> 23 days ago
Add separate CV scope split implementation 21 days ago