testing: Add code coverage ut and mlir testing cases for TritonToUnstructure, TritonToHIVM, BubbleUpOp, DiscreteMask
Co-authored-by: wutianyao<wutianyao1@huawei.com>
# message auto-generated for no-merge-commit merge:
!1649 merge main6 into main
testing: Add code coverage ut and mlir testing cases for TritonToUnstructure, TritonToHIVM, BubbleUpOp, DiscreteMask
Created-by: wutianyao
Commit-by: wutianyao
Merged-by: ascend-robot
Description: <!---
The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, **if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.**
Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.
-->
# New contributor declaration
- [ ] I am not making a trivial change, such as fixing a typo in a comment.
- [ ] I have written a PR description following these
[rules](https://cbea.ms/git-commit/#why-not-how).
- [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD.
- Select one of the following.
- [ ] I have added tests.
- /test for lit tests
- /unittest for C++ tests
- /python/test for end-to-end tests
- [ ] This PR does not need a test because FILL THIS IN.
- Select one of the following.
- [ ] I have not added any lit tests.
- [ ] The lit tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)
See merge request: Ascend/triton-ascend!1649
!1829 merge 0512_ut into main
feat(ssbuf): Fix tag missing and add ut for multi cache inner scope
Created-by: cxtverygood123
Commit-by: cxtverygood123
Merged-by: ascend-robot
Description: 修改计数器block_id标签缺失问题,生产者op,并补齐相关UT,主要修改点:
1.将核内多buffer轮询计数器相关op缺失block_id标签问题修复;
2.将原本核内多buffer生产者物理写入op修改为copy以满足一切二pass所需输入;
3.补齐核内多buffer单元测试用例。
<!---
The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, **if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.**
Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.
-->
# New contributor declaration
- [ ] I am not making a trivial change, such as fixing a typo in a comment.
- [ ] I have written a PR description following these
[rules](https://cbea.ms/git-commit/#why-not-how).
- [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD.
- Select one of the following.
- [ ] I have added tests.
- /test for lit tests
- /unittest for C++ tests
- /python/test for end-to-end tests
- [ ] This PR does not need a test because FILL THIS IN.
- Select one of the following.
- [ ] I have not added any lit tests.
- [ ] The lit tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)
See merge request: Ascend/triton-ascend!1829
fix(ssbuf): implement backward synchronization and fix ssbuf for pipe_s flag
Co-authored-by: lichengfeng497<1309837013@qq.com>
# message auto-generated for no-merge-commit merge:
!1579 merge fix_pipeS into main
fix(ssbuf): implement backward synchronization and fix ssbuf for pipe_s flag
Created-by: lichengfeng497
Commit-by: lichengfeng497
Merged-by: ascend-robot
Description: The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, **if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.**
Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.
- [ ] I am not making a trivial change, such as fixing a typo in a comment.
- [ ] I have written a PR description following these
[rules](https://cbea.ms/git-commit/#why-not-how).
- [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD.
- Select one of the following.
- [ ] I have added tests.
- /test for lit tests
- /unittest for C++ tests
- /python/test for end-to-end tests
- [ ] This PR does not need a test because FILL THIS IN.
- Select one of the following.
- [ ] I have not added any lit tests.
- [ ] The lit tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)
hstu_fwd 验证截图如下:

See merge request: Ascend/triton-ascend!1579
!1790 merge sn into main
fix: Enhance offset analysis and change the tensor of the all-one type in the select op to continuous memory access
Created-by: wutianyao
Commit-by: wutianyao
Merged-by: ascend-robot
Description: **bugfix1:** The restriction that forcibly treats tensor types with all 1(<1x1x1x...>) in the select op during pointer analysis as non-contiguous has been removed.
Because under the original restriction, it would lead to a situation where, when processing tensor<1x> types, they would be expanded into discrete memory access.However, when the MLIR's built-in forloop optimization(SimplifyTrivialLoops) recognizing that the loop count is 1, it would fall back. Yet, it did not remove the discrete memory access label applied to the load op during the discrete memory access optimization({DiscreteMemAccess}).
When the linalg pass later attempts to transform the load, if it detects that the load has a discrete memory access label, it will read the init args of the outer forloop of the load. Where the discrete memory access forloop has already been fall back by SimplifyTrivialLoops , this can lead to a series of errors (for example, if the discrete memory access forloop has been erased but the kernel itself still contains a forloop, reading the init args as empty can cause a core dump).
**bugfix2:** When recognize a indextensor in forloop init args in the rewriteforloop, it will be convert to offset + stride,but in some scene, for example,the args is recognize as indextensor, then in blockptranalysis, it will be analysised as scalar, and in rewriteTerminator it need the stride must equal one, so an assert error occurs.
**bugfix3:** When the mask is composed of splats, it is identified as a continuous mask in the discrete mask analysis pass. However, when it enters the Triton to Linalg pass, if all dimensions are 1, the splat op is converted into an insert op. In this case, the mask analysis cannot identify the insert op, and the mask is analyzed as discontinuous. As a result, the problem occurs.
**bugfix4:** In the use analysis, when there is indirect memory access, the 1st load op is initially marked as a meta use. In the Post-process, this situation is handled by identifying and marking the instruction chain related to indirect memory access, and then re-marking it as mixuse to ensure it is not eliminated in subsequent conversion stages.
If an op appears in a computation chain involving a set of indirect memory accesses, such as load(1st) -> computeOp -> load(2nd), and this op has been used through assert or print, it will be marked as Mixuse. In this case, the op will be clone into a mixuse op and a metause op before the Post-process phase.
The mixuse op is used for assert, and the metause op is used for 2nd loads. However, since the op was initially marked as Mixuse, the 1st load op is also marked as Mixuse, thus skipping the post-process. Since the split metause op is used for 2nd loads, its elimination can cause a series of issues.
Currently, a temporary modified for this situation is to disable cloning of select ops.
<!---
The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, **if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.**
Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.
-->
# New contributor declaration
- [ ] I am not making a trivial change, such as fixing a typo in a comment.
- [ ] I have written a PR description following these
[rules](https://cbea.ms/git-commit/#why-not-how).
- [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD.
- Select one of the following.
- [ ] I have added tests.
- /test for lit tests
- /unittest for C++ tests
- /python/test for end-to-end tests
- [ ] This PR does not need a test because FILL THIS IN.
- Select one of the following.
- [ ] I have not added any lit tests.
- [ ] The lit tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)
See merge request: Ascend/triton-ascend!1790
!1790 merge sn into main
fix: Enhance offset analysis and change the tensor of the all-one type in the select op to continuous memory access
Created-by: wutianyao
Commit-by: wutianyao
Merged-by: ascend-robot
Description: **bugfix1:** The restriction that forcibly treats tensor types with all 1(<1x1x1x...>) in the select op during pointer analysis as non-contiguous has been removed.
Because under the original restriction, it would lead to a situation where, when processing tensor<1x> types, they would be expanded into discrete memory access.However, when the MLIR's built-in forloop optimization(SimplifyTrivialLoops) recognizing that the loop count is 1, it would fall back. Yet, it did not remove the discrete memory access label applied to the load op during the discrete memory access optimization({DiscreteMemAccess}).
When the linalg pass later attempts to transform the load, if it detects that the load has a discrete memory access label, it will read the init args of the outer forloop of the load. Where the discrete memory access forloop has already been fall back by SimplifyTrivialLoops , this can lead to a series of errors (for example, if the discrete memory access forloop has been erased but the kernel itself still contains a forloop, reading the init args as empty can cause a core dump).
**bugfix2:** When recognize a indextensor in forloop init args in the rewriteforloop, it will be convert to offset + stride,but in some scene, for example,the args is recognize as indextensor, then in blockptranalysis, it will be analysised as scalar, and in rewriteTerminator it need the stride must equal one, so an assert error occurs.
**bugfix3:** When the mask is composed of splats, it is identified as a continuous mask in the discrete mask analysis pass. However, when it enters the Triton to Linalg pass, if all dimensions are 1, the splat op is converted into an insert op. In this case, the mask analysis cannot identify the insert op, and the mask is analyzed as discontinuous. As a result, the problem occurs.
**bugfix4:** In the use analysis, when there is indirect memory access, the 1st load op is initially marked as a meta use. In the Post-process, this situation is handled by identifying and marking the instruction chain related to indirect memory access, and then re-marking it as mixuse to ensure it is not eliminated in subsequent conversion stages.
If an op appears in a computation chain involving a set of indirect memory accesses, such as load(1st) -> computeOp -> load(2nd), and this op has been used through assert or print, it will be marked as Mixuse. In this case, the op will be clone into a mixuse op and a metause op before the Post-process phase.
The mixuse op is used for assert, and the metause op is used for 2nd loads. However, since the op was initially marked as Mixuse, the 1st load op is also marked as Mixuse, thus skipping the post-process. Since the split metause op is used for 2nd loads, its elimination can cause a series of issues.
Currently, a temporary modified for this situation is to disable cloning of select ops.
<!---
The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, **if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.**
Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.
-->
# New contributor declaration
- [ ] I am not making a trivial change, such as fixing a typo in a comment.
- [ ] I have written a PR description following these
[rules](https://cbea.ms/git-commit/#why-not-how).
- [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD.
- Select one of the following.
- [ ] I have added tests.
- /test for lit tests
- /unittest for C++ tests
- /python/test for end-to-end tests
- [ ] This PR does not need a test because FILL THIS IN.
- Select one of the following.
- [ ] I have not added any lit tests.
- [ ] The lit tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)
See merge request: Ascend/triton-ascend!1790
refactor: refactoring the dynamic-cv-pipeline solution based on ssbufer.
Co-authored-by: zhaojieshan<zhaojieshan@huawei.com>
# message auto-generated for no-merge-commit merge:
!1639 merge main into main
refactor: refactoring the dynamic-cv-pipeline solution based on ssbufer.
Created-by: zhaojieshan
Commit-by: zhaojieshan
Merged-by: ascend-robot
Description: <!---
The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, **if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.**
Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.
-->
# New contributor declaration
- [ ] I am not making a trivial change, such as fixing a typo in a comment.
- [ ] I have written a PR description following these
[rules](https://cbea.ms/git-commit/#why-not-how).
- [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD.
- Select one of the following.
- [ ] I have added tests.
- /test for lit tests
- /unittest for C++ tests
- /python/test for end-to-end tests
- [ ] This PR does not need a test because FILL THIS IN.
- Select one of the following.
- [ ] I have not added any lit tests.
- [ ] The lit tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)
See merge request: Ascend/triton-ascend!1639