File  Last commit  Last updated
Merge Triton-Ascend 62eb951f into release/3.5.x 1 month ago
feat(conv1d): add conv1d op 25 days ago
testing: Add code coverage ut and mlir testing cases for TritonToUnstructure, TritonToHIVM, BubbleUpOp, DiscreteMask Co-authored-by: wutianyao<wutianyao1@huawei.com> !1649 merge main6 into main Created-by: wutianyao Commit-by: wutianyao Merged-by: ascend-robot See merge request: Ascend/triton-ascend!1649 1 month ago
!1829 merge 0512_ut into main feat(ssbuf): Fix tag missing and add ut for multi cache inner scope Created-by: cxtverygood123 Commit-by: cxtverygood123 Merged-by: ascend-robot Description: Fixes the missing block_id tag on the counter ops, changes the producer op, and adds the related UTs. Main changes: 1. Fix the missing block_id tag on the ops related to the intra-core multi-buffer polling counter; 2. Change the intra-core multi-buffer producer's physical write op to copy, to provide the input required by the split-in-two (一切二) pass; 3. Add unit test cases for intra-core multi-buffer. See merge request: Ascend/triton-ascend!1829 20 days ago
fix(ssbuf): implement backward synchronization and fix ssbuf for pipe_s flag Co-authored-by: lichengfeng497<1309837013@qq.com> !1579 merge fix_pipeS into main Created-by: lichengfeng497 Commit-by: lichengfeng497 Merged-by: ascend-robot Description: hstu_fwd verification screenshot: ![image.png](https://raw.gitcode.com/user-images/assets/7623225/521a6ed7-ec0f-4155-9594-c0e3a1d14294/image.png 'image.png') See merge request: Ascend/triton-ascend!1579 1 month ago
change triton-ascend to fork mode Co-authored-by: zhang-chunli01<zhangchunli19@huawei.com> Co-authored-by: Xuan Peng<pengxuan9@huawei.com> Co-authored-by: luobaiqing<luobaiqing1@huawei.com> Co-authored-by: zhuxuejie<zhuxuejie8@huawei.com> Co-authored-by: candyhong<1102229410@qq.com> Co-authored-by: wutianyao<wutianyao1@huawei.com> !1000 merge out-of-tree-refactor into main Created-by: zhang-chunli01 Commit-by: candyhong;luobaiqing;zhang-chunli01;zhuxuejie;wutianyao;Xuan Peng Merged-by: ascend-robot Description: ![image.png](https://raw.gitcode.com/user-images/assets/7623225/d78ef132-1952-4d6e-8d0e-420bd62d0332/image.png 'image.png') See merge request: Ascend/triton-ascend!1000 5 months ago
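feat: TTIR Graph Analysis Framework Co-authored-by: Jin Jing<jinjing21@huawei.com> !1498 merge main into main Created-by: jjin_750629 Commit-by: JinJing;Jin Jing Merged-by: huchengbei Description: Adds a graph analysis framework for TTIR to TA. It mainly builds a control flow graph (CFG) and a data flow graph (DFG), and adds Memory SSA data flow for Tensor and Global Memory, which is merged with TTIR's existing SSA data flow into a single data-flow engine. The control flow graph uses basic blocks (BasicBlock) as nodes and control-flow transfers as directed edges to describe the program's execution order. In MLIR, the body of each FuncOp / tt.func consists of several BasicBlocks; the operations inside a BasicBlock execute sequentially, and the terminator op at the end of the BasicBlock determines the successor jump targets. Depth-first and breadth-first traversal (DFS & BFS) are provided on the control flow graph. The interprocedural control flow graph (ICFG) extends the single-function CFG with call edges and return edges across FuncOps, forming a unified control-flow view of the whole program (whole module). In Triton, a module often contains multiple tt.func (device kernel + helper function). The alias graph describes the set of abstract memory objects that each pointer/memory reference in the program may be associated with. In Triton, this corresponds to the Tensor objects of the device memory blocks that a tt.ptr may point to. The data flow graph provides fast lookup of the definition and use relationships of a Value, combining the MemorySSA and SSA forms of data flow. See merge request: Ascend/triton-ascend!1498 1 month ago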
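The DFS and BFS traversals over a control flow graph mentioned in this commit description can be pictured with a small sketch. It is only an illustration and not the framework's code: the block names, the `toy_cfg` dictionary, and the `dfs_order` / `bfs_order` helpers are hypothetical, and the graph is a plain Python mapping from each basic block to its successors rather than MLIR blocks.

```python
from collections import deque

# Hypothetical toy CFG (not taken from the repository):
# each basic block name maps to the successors chosen by its terminator.
toy_cfg = {
    "entry": ["then", "else"],
    "then": ["then_inner"],
    "then_inner": ["merge"],
    "else": ["merge"],
    "merge": [],
}


def dfs_order(cfg, start):
    """Return blocks in depth-first preorder, visiting each block once."""
    visited, order, stack = set(), [], [start]
    while stack:
        block = stack.pop()
        if block in visited:
            continue
        visited.add(block)
        order.append(block)
        # Push successors in reverse so the first successor is explored first.
        stack.extend(reversed(cfg[block]))
    return order


def bfs_order(cfg, start):
    """Return blocks in breadth-first order, visiting each block once."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        block = queue.popleft()
        order.append(block)
        for succ in cfg[block]:
            if succ not in visited:
                visited.add(succ)
                queue.append(succ)
    return order


if __name__ == "__main__":
    print("DFS:", dfs_order(toy_cfg, "entry"))
    print("BFS:", bfs_order(toy_cfg, "entry"))
```

On this toy graph the depth-first order follows the `then` branch down to `merge` before visiting `else`, while the breadth-first order visits both branch heads before going deeper.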
feat(conv1d): add conv1d op 25 days ago
change triton-ascend to fork mode Co-authored-by: zhang-chunli01<zhangchunli19@huawei.com> Co-authored-by: Xuan Peng<pengxuan9@huawei.com> Co-authored-by: luobaiqing<luobaiqing1@huawei.com> Co-authored-by: zhuxuejie<zhuxuejie8@huawei.com> Co-authored-by: candyhong<1102229410@qq.com> Co-authored-by: wutianyao<wutianyao1@huawei.com> !1000 merge out-of-tree-refactor into main Created-by: zhang-chunli01 Commit-by: candyhong;luobaiqing;zhang-chunli01;zhuxuejie;wutianyao;Xuan Peng Merged-by: ascend-robot Description: ![image.png](https://raw.gitcode.com/user-images/assets/7623225/d78ef132-1952-4d6e-8d0e-420bd62d0332/image.png 'image.png') See merge request: Ascend/triton-ascend!1000 5 months ago
Merge Triton-Ascend 425236de into release/3.5.x 2 months ago
!1790 merge sn into main fix: Enhance offset analysis and change the tensor of the all-one type in the select op to continuous memory access Created-by: wutianyao Commit-by: wutianyao Merged-by: ascend-robot Description:
**bugfix1:** Removed the restriction in pointer analysis that forced all-ones tensor types (<1x1x1x...>) in the select op to be treated as non-contiguous. Under that restriction, tensor<1x> types were expanded into discrete memory accesses. When MLIR's built-in for-loop optimization (SimplifyTrivialLoops) recognized that the loop count was 1, it folded the loop away, but it did not remove the discrete memory access label ({DiscreteMemAccess}) that the discrete memory access optimization had applied to the load op. When the linalg pass later transforms the load and detects that label, it reads the init args of the load's enclosing for loop. Because SimplifyTrivialLoops has already removed the discrete memory access for loop, this leads to a series of errors (for example, if that loop has been erased but the kernel itself still contains a for loop, reading the init args as empty can cause a core dump).
**bugfix2:** When rewriteforloop recognizes an indextensor among the for-loop init args, it converts it to offset + stride. In some cases, however, an arg recognized as an indextensor is then analyzed as a scalar in blockptranalysis, while rewriteTerminator requires the stride to equal one, so an assertion fails.
**bugfix3:** When the mask is composed of splats, the discrete mask analysis pass identifies it as a continuous mask. However, in the Triton to Linalg pass, if all dimensions are 1, the splat op is converted into an insert op. The mask analysis cannot identify the insert op, so the mask is analyzed as discontinuous, which causes the failure.
**bugfix4:** In the use analysis, when there is indirect memory access, the 1st load op is initially marked as metause. Post-processing handles this by identifying the instruction chain involved in the indirect memory access and re-marking it as mixuse so that it is not eliminated in later conversion stages. If an op in such a chain, for example load(1st) -> computeOp -> load(2nd), is also used by an assert or print, it is marked as mixuse. The op is then cloned into a mixuse op and a metause op before post-processing: the mixuse op feeds the assert, and the metause op feeds the 2nd load. However, because the op was initially marked as mixuse, the 1st load op is also marked as mixuse, so post-processing is skipped. Since the split-off metause op feeds the 2nd load, its elimination can cause a series of issues. As a temporary workaround, cloning of select ops is disabled.
See merge request: Ascend/triton-ascend!1790 19 days ago
Merge Triton-Ascend 425236de into release/3.5.x 2 months ago
!1790 merge sn into main fix: Enhance offset analysis and change the tensor of the all-one type in the select op to continuous memory access Created-by: wutianyao Commit-by: wutianyao Merged-by: ascend-robot Description:
**bugfix1:** Removed the restriction in pointer analysis that forced all-ones tensor types (<1x1x1x...>) in the select op to be treated as non-contiguous. Under that restriction, tensor<1x> types were expanded into discrete memory accesses. When MLIR's built-in for-loop optimization (SimplifyTrivialLoops) recognized that the loop count was 1, it folded the loop away, but it did not remove the discrete memory access label ({DiscreteMemAccess}) that the discrete memory access optimization had applied to the load op. When the linalg pass later transforms the load and detects that label, it reads the init args of the load's enclosing for loop. Because SimplifyTrivialLoops has already removed the discrete memory access for loop, this leads to a series of errors (for example, if that loop has been erased but the kernel itself still contains a for loop, reading the init args as empty can cause a core dump).
**bugfix2:** When rewriteforloop recognizes an indextensor among the for-loop init args, it converts it to offset + stride. In some cases, however, an arg recognized as an indextensor is then analyzed as a scalar in blockptranalysis, while rewriteTerminator requires the stride to equal one, so an assertion fails.
**bugfix3:** When the mask is composed of splats, the discrete mask analysis pass identifies it as a continuous mask. However, in the Triton to Linalg pass, if all dimensions are 1, the splat op is converted into an insert op. The mask analysis cannot identify the insert op, so the mask is analyzed as discontinuous, which causes the failure.
**bugfix4:** In the use analysis, when there is indirect memory access, the 1st load op is initially marked as metause. Post-processing handles this by identifying the instruction chain involved in the indirect memory access and re-marking it as mixuse so that it is not eliminated in later conversion stages. If an op in such a chain, for example load(1st) -> computeOp -> load(2nd), is also used by an assert or print, it is marked as mixuse. The op is then cloned into a mixuse op and a metause op before post-processing: the mixuse op feeds the assert, and the metause op feeds the 2nd load. However, because the op was initially marked as mixuse, the 1st load op is also marked as mixuse, so post-processing is skipped. Since the split-off metause op feeds the 2nd load, its elimination can cause a series of issues. As a temporary workaround, cloning of select ops is disabled.
See merge request: Ascend/triton-ascend!1790 19 days ago
feat: support bf16 for specializeTypelessValueToConstant 22 days ago
refactor: refactoring the dynamic-cv-pipeline solution based on ssbufer. Co-authored-by: zhaojieshan<zhaojieshan@huawei.com> !1639 merge main into main Created-by: zhaojieshan Commit-by: zhaojieshan Merged-by: ascend-robot See merge request: Ascend/triton-ascend!1639 1 month ago