File | Last commit | Last updated
Reland "byte permutes in intra-warp layout conversion" (#7933): Reland https://github.com/triton-lang/triton/pull/7809, https://github.com/triton-lang/triton/pull/7825, https://github.com/triton-lang/triton/pull/7861. Add a workaround for a ptxas bug and add a regression test. | 9 months ago
[AMD] Use single LDS for both transposed and non-transposed access (#7813): This commit introduces a pass that detects a pair of tt.dot ops that both use the same tt.load result, one directly and one via tt.trans, and creates the same shared memory allocation for both. This allows the pipeliner to pick a single LDS layout and enables pipelining of the loads. | 9 months ago