[AMD] Add HIP AOT support to compile.py tool (#7007)
This commit adds HIP AOT compilation support to the
compile.py tool. It allows compiling Triton kernels
into a .h and .cpp file that can be integrated into
applications. Linking via link.py is not yet enabled and is
left as a follow-up task.
---------
Co-authored-by: Xinya Zhang <Xinya.Zhang@amd.com>
Co-authored-by: Lei Zhang <antiagainst@gmail.com>
fix: raise error for unsupported allow_tf32 in dot_op
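A minimal sketch of the behavior described by this fix, assuming the target
backend has no TF32 path for dot. The names `check_dot_options`,
`UnsupportedOptionError`, and `backend_supports_tf32` are hypothetical and
are not taken from the actual change; the sketch only illustrates rejecting
the option explicitly rather than silently ignoring it.

```python
# Hypothetical sketch: these names are illustrative, not the real backend API.
class UnsupportedOptionError(ValueError):
    """Raised when a dot option is not supported by the target backend."""


def check_dot_options(allow_tf32: bool, backend_supports_tf32: bool = False) -> None:
    # Fail loudly instead of quietly dropping the request for TF32.
    if allow_tf32 and not backend_supports_tf32:
        raise UnsupportedOptionError(
            "allow_tf32=True is not supported for dot on this backend"
        )


# The default (no TF32) is accepted without error.
check_dot_options(allow_tf32=False)

# Requesting TF32 on a backend without support is rejected.
try:
    check_dot_options(allow_tf32=True)
except UnsupportedOptionError as exc:
    print(f"rejected: {exc}")
```

Raising at option-checking time, under these assumptions, surfaces the
mismatch before any kernel is compiled.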
Co-authored-by: ZhangAiqiang <zhangaiqiang1@huawei.com>
# message auto-generated for no-merge-commit merge:
!1643 merge tf32_ban_main into main
fix: raise error for unsupported allow_tf32 in dot_op
Created-by: zaq15csdn
Commit-by: ZhangAiqiang
Merged-by: ascend-robot
Description:
# New contributor declaration
- [x] I am not making a trivial change, such as fixing a typo in a comment.
- [x] I have written a PR description following these
[rules](https://cbea.ms/git-commit/#why-not-how).
- [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD.
- Select one of the following.
- [x] I have added tests.
- /test for lit tests
- /unittest for C++ tests
- /python/test for end-to-end tests
- [ ] This PR does not need a test because FILL THIS IN.
- Select one of the following.
- [x] I have not added any lit tests.
- [ ] The lit tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)
See merge request: Ascend/triton-ascend!1643
[Gluon][Tests] Make frontend tests independent of the active target (#7706)
This changes all Gluon frontend tests to use run_parser. To achieve
this, I also expand run_parser to support kernel arguments and
non-Blackwell targets.
This also lets us test with the gfx1200 HIP target, which also has
32 threads per warp.
Closes #7667
[AMD] Enable dot_scaled on gfx11 (#7954)
I saw occasional test failures, but after disabling True16 the tests
seem to pass.
---------
Co-authored-by: Paul Trojahn <paul.trojahn@amd.com>