triton-ascend/python/triton · wutianyao/triton-ascend - AtomGit

ascend-robot: fix: raise error for unsupported allow_tf32 in dot_op

File	Last commit	Last updated
backends	[AMD] Add HIP AOT support to compile.py tool (#7007) This commit adds HIP AOT compilation support to the `compile.py` tool. It allows compiling Triton kernels into a `.h` and `.cpp` file that can be integrated into applications. Linking via `link.py` is not yet enabled and a task for later. --------- Co-authored-by: Xinya Zhang <Xinya.Zhang@amd.com> Co-authored-by: Lei Zhang <antiagainst@gmail.com>	11 months ago
compiler	[fix] (tritonparse): Adapt to the upstream community tritonparse and extract mix_mode as an independent module. Co-authored-by: zhudada0120 <zhuguanda@huawei.com>. Merge request Ascend/triton-ascend!1470 (fix_metadata_name into main; created by zhudada, committed by zhudada0120, merged by ascend-robot). Background: PR https://gitcode.com/Ascend/triton-ascend/pull/1422. This PR has the same goal as PR 1422: to fix problems found while adapting to the upstream tritonparse project, specifically in tritonparse's kernel reproducer feature. (1) The directory in which the reproducer script is generated contains spaces, so running the script fails because of the spaces in the path. (2) In the code-generated kernel reproducer, the operator import path contains spaces, so the reproducer script cannot be run directly. The approach in PR 1422 was flawed: after fixing problems 1 and 2 it introduced problem 3, where the imported kernel_name is the name with mix_mode attached rather than the real kernel_name, so the reproducer script cannot import the real kernel. This PR solves the problems from PR 1422 in a different way and addresses all three. Approach: pass mix_mode to load_binary as an independent parameter instead of passing it inside name and splitting the fields internally. Changes: (1) modify the load_binary signature in the npu driver to add a mix_mode parameter, and update the load_binary signature in cpu_driver to match; (2) add the mix_mode argument at the load_binary call sites; (3) modify the name concatenation logic in compiler.py. Testing: tritonparse functionality verified.	1 month ago
experimental	fix: restore cdiv in standard but not init cdiv	3 months ago
extension	fix(buffer): restore buffer_type correctly in for/if subregions	1 month ago
language	fix: raise error for unsupported allow_tf32 in dot_op Co-authored-by: ZhangAiqiang <zhangaiqiang1@huawei.com>. Merge request Ascend/triton-ascend!1643 (tf32_ban_main into main; created by zaq15csdn, committed by ZhangAiqiang, merged by ascend-robot). New contributor declaration: the change is not trivial; the PR description follows these [rules](https://cbea.ms/git-commit/#why-not-how); `pre-commit run --from-ref origin/main --to-ref HEAD` is not checked as run; tests were added; no `lit` tests were added.	1 month ago
runtime	fix(interpret): fix interpret implementation of compile_hint op	1 month ago
tools	Merge Triton-Ascend 425236de into release/3.5.x	2 months ago
__init__.py	fix(libentry): TypeError for function arguments Co-authored-by: candyhong <1102229410@qq.com>. Merge request jeshd/triton-ascend!55 (release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade; created by candyhong, committed by candyhong, merged by zhuxuejie). Description: fixes the missing-argument problem in libentry. After the fix, running `python ./triton-ascend/third_party/ascend/unittest/autotune_ut/04-libentry.py` passes.	2 months ago
_filecheck.py	[Gluon][Tests] Make frontend tests independent of the active target (#7706) This changes all gluon frontend tests to use `run_parser`. To achieve this, I also expand `run_parser` to support kernel arguments and non-blackwell targets. This also allows us to test with the gfx1200 hip target which also has 32 threads per warp. Closes #7667	10 months ago
_internal_testing.py	[AMD] Enable dot_scaled on gfx11 (#7954) I saw some occasional test failures, but after disabling True16, tests seem to pass fine. --------- Co-authored-by: Paul Trojahn <paul.trojahn@amd.com>	9 months ago
_utils.py	fix: remove outdated language/_utils.py and update root _utils.py Co-authored-by: candyhong <1102229410@qq.com>. Merge request jeshd/triton-ascend!20 (release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade; created by candyhong, committed by candyhong, merged by candyhong). Description: the functions in python/triton/language/_utils.py were merged into python/triton/_utils.py in version 3.4.x. This change deletes python/triton/language/_utils.py (syncing with upstream) and comments out the corresponding power-of-2 restriction.	3 months ago
errors.py	[FRONTEND] Unify Interpreter and JIT Compilation Errors (#3355)	2 years ago
knobs.py	Fix `NvidiaTool.from_path` when tool cannot be found (#7985) Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>	9 months ago
testing.py	[Testing] Add a div0 check in the benchmarking function (#6868) At Meta we try and reuse the Triton benchmarking infrastructure when comparing our Triton kernels to native comparisons. We have found a [rare case where comparing to a CK baseline is registering as "0ms"](https://github.com/pytorch-labs/tritonbench/blob/a13002697ff55096f495cd132d35cdc414ce36bf/tritonbench/operators/fp8_gemm_rowwise/operator.py#L204). This crashes our work-stream, so this adds a simple division by 0 check to prevent this issue. The default of 1000 is chosen arbitrarily.	1 year ago