File | Last commit | Last updated
[AMD] Add HIP AOT support to compile.py tool (#7007) This commit adds HIP AOT compilation support to the compile.py tool. It allows compiling Triton kernels into a .h and .cpp file that can be integrated into applications. Linking via link.py is not yet enabled and a task for later. --------- Co-authored-by: Xinya Zhang <Xinya.Zhang@amd.com> Co-authored-by: Lei Zhang <antiagainst@gmail.com> | 11 months ago
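[fix] (tritonparse):Adapt to the upstream community tritonparse and extract mix_mode as an independent module. Co-authored-by: zhudada0120<zhuguanda@huawei.com> # message auto-generated for no-merge-commit merge: !1470 merge fix_metadata_name into main [fix] (tritonparse):Adapt to the upstream community tritonparse and extract mix_mode as an independent module. Created-by: zhudada Commit-by: zhudada0120 Merged-by: ascend-robot Description: ## Background PR: https://gitcode.com/Ascend/triton-ascend/pull/1422 This PR has the same goal as PR 1422: to fix problems found while adapting to the upstream project tritonparse. In tritonparse's kernel reproducer feature: 1. The directory in which the reproducer script is generated contains spaces, so running the script fails because of the spaces in the path. 2. In the kernel reproducer code produced by codegen, the operator import path contains spaces, so the reproducer script cannot be run directly. The approach taken in PR 1422 was flawed: after fixing problems 1 and 2 it introduced a new problem 3: the imported kernel_name is the name with mix_mode attached rather than the real kernel_name, and because the name is wrong the reproducer script cannot import the real kernel. ![image.png](https://raw.gitcode.com/user-images/assets/7623225/6a34b2f1-498d-4dcd-871b-d21bfb85e889/image.png 'image.png') This PR solves the problems raised in PR 1422 in a different way and fixes all three problems above. ## Changes Pass mix_mode to load_binary directly as an independent parameter, instead of passing it inside name and splitting the fields inside load_binary. The changes are: 1. Change the signature of the npu driver's load_binary function to add a mix_mode parameter, and update the cpu_driver load_binary signature to match. 2. Add the mix_mode argument at the load_binary call sites. 3. Change the name-concatenation logic in compiler.py. ## Testing The tritonparse functionality was verified. ![85be6035-daa6-4c5e-a057-b173978744d8.png](https://raw.gitcode.com/user-images/assets/7623225/10d8dc85-531f-4e38-b33f-6359c3de5e35/85be6035-daa6-4c5e-a057-b173978744d8.png '85be6035-daa6-4c5e-a057-b173978744d8.png') See merge request: Ascend/triton-ascend!1470 | 1 month ago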
fix: restore cdiv in standard but not init cdiv | 3 months ago
fix(buffer): restore buffer_type correctly in for/if subregions | 1 month ago
fix: raise error for unsupported allow_tf32 in dot_op Co-authored-by: ZhangAiqiang<zhangaiqiang1@huawei.com> # message auto-generated for no-merge-commit merge: !1643 merge tf32_ban_main into main fix: raise error for unsupported allow_tf32 in dot_op Created-by: zaq15csdn Commit-by: ZhangAiqiang Merged-by: ascend-robot Description: <!--- The core Triton is a small number of people, and we receive many PRs (thank you!). To help us review your code more quickly, **if you are a new contributor (less than 3 PRs merged) we ask that you complete the following tasks and include the filled-out checklist in your PR description.** Complete the following tasks before sending your PR, and replace [ ] with [x] to indicate you have done them. --> # New contributor declaration - [x] I am not making a trivial change, such as fixing a typo in a comment. - [x] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how). - [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD. - Select one of the following. - [x] I have added tests. - /test for lit tests - /unittest for C++ tests - /python/test for end-to-end tests - [ ] This PR does not need a test because FILL THIS IN. - Select one of the following. - [x] I have not added any lit tests. - [ ] The lit tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.) See merge request: Ascend/triton-ascend!1643 | 1 month ago
fix(interpret): fix interpret implemention of compile_hint op | 1 month ago
Merge Triton-Ascend 425236de into release/3.5.x | 2 months ago
fix(libentry): TypeError for function arguments Co-authored-by: candyhong<1102229410@qq.com> # message auto-generated for no-merge-commit merge: !55 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade fix(libentry): TypeError for function arguments Created-by: candyhong Commit-by: candyhong Merged-by: zhuxuejie Description: Fixes the missing-argument problem in libentry: ![image.png](https://raw.gitcode.com/user-images/assets/8224568/826f809d-5c4a-447d-a9ee-7f64ed68665d/image.png 'image.png') After the fix, running: python ./triton-ascend/third_party/ascend/unittest/autotune_ut/04-libentry.py passes successfully: ![image.png](https://raw.gitcode.com/user-images/assets/8224568/c270cd44-8584-4da2-9fff-6aefb63a2437/image.png 'image.png') See merge request: jeshd/triton-ascend!55 | 2 months ago
[Gluon][Tests] Make frontend tests independent of the active target (#7706) This changes all gluon frontend tests to use run_parser. To achieve this, I also expand run_parser to support kernel arguments and non-blackwell targets. This also allows us to test with the gfx1200 hip target which also has 32 threads per warp. Closes #7667 | 10 months ago
[AMD] Enable dot_scaled on gfx11 (#7954) I saw some occasional test failures, but after disabling True16, tests seem to pass fine. --------- Co-authored-by: Paul Trojahn <paul.trojahn@amd.com> | 9 months ago
fix: remove outdated language/_utils.py and update root _utils.py Co-authored-by: candyhong<1102229410@qq.com> # message auto-generated for no-merge-commit merge: !20 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade fix: remove outdated language/_utils.py and update root _utils.py Created-by: candyhong Commit-by: candyhong Merged-by: candyhong Description: The functions in python/triton/language/_utils.py were merged into python/triton/_utils.py in version 3.4.x. This change deletes python/triton/language/_utils.py (syncing with upstream) and comments out the corresponding power-of-2 restriction. See merge request: jeshd/triton-ascend!20 | 3 months ago
[FRONTEND] Unify Interpreter and JIT Compilation Errors (#3355) | 2 years ago
Fix NvidiaTool.from_path when tool cannot be found (#7985) Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com> | 9 months ago
[Testing] Add a div0 check in the benchmarking function (#6868) At Meta we try and reuse the Triton benchmarking infrastructure when comparing our Triton kernels to native comparisons. We have found a [rare case where comparing to a CK baseline is registering as "0ms"](https://github.com/pytorch-labs/tritonbench/blob/a13002697ff55096f495cd132d35cdc414ce36bf/tritonbench/operators/fp8_gemm_rowwise/operator.py#L204). This crashes our work-stream, so this adds as simple division by 0 check to prevent this issue. The default of 1000 is chosen arbitrarily. | 1 year ago
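The division-by-zero guard described in the #6868 commit message can be illustrated roughly as follows. This is a minimal sketch, not the code from the commit: the function name `repeat_count` and its parameters are hypothetical, and only the fallback value of 1000 is taken from the commit message.

```python
def repeat_count(rep_ms: float, estimate_ms: float, fallback: int = 1000) -> int:
    """Return how many timed iterations fit into rep_ms.

    Hypothetical helper for illustration only. If the per-iteration
    estimate is reported as 0 ms, dividing by it would raise
    ZeroDivisionError, so an arbitrary fallback count is returned instead.
    """
    if estimate_ms == 0:
        # Guard: the baseline was measured as "0ms".
        return fallback
    # Normal case: at least one iteration, otherwise as many as fit.
    return max(1, int(rep_ms / estimate_ms))


if __name__ == "__main__":
    print(repeat_count(100, 0.5))  # normal estimate
    print(repeat_count(100, 0))    # zero estimate falls back
```

Returning a fixed count keeps the benchmark running when the baseline is too fast to measure, at the cost of an arbitrary number of iterations in that case.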