File / Last commit message / Last updated
[BACKEND] Localize the use and definition of getShapePerCTATile in the AMD backend and aim for elimination (#7740) 9 months ago
[AMD] Support ExtractSliceOp for AxisInfo (#7094) This commit updates AxisInfo to support backend callbacks so that backend ops can be recognized. One can use ExtractSliceOp to slice tensors of pointers to refine tt.load or tt.store. TritonAMDGPUConvertToBufferOpsBase then fails to perform negativity analysis because of the ExtractSliceOp, which after rewrites slices tensors of offsets. This PR addresses the issue. 11 months ago
[AMD] improve RangeAnalysis to support persistent kernel (#6390) This PR implements some new features in RangeAnalysis to better support persistent kernels: 1. tl.assume on tt.func args; 2. tl.assume for scf.for bounds and analysis inside such scf.fors (this used to work only with static/constant bounds); 3. inference through unsigned arith ops. Strictly speaking, the last item isn't necessary, since integers are signed/signless in Triton, but it is useful to see how both the signed and unsigned values in the ConstantIntRanges actually change. Note: by tl.assume I mean hinting ranges for inputs using e.g. tl.assume(K > 1), tl.assume(K < 10). 1 year ago
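The effect of range hints such as tl.assume(K > 1) and tl.assume(K < 10) on a negativity check can be illustrated with a small interval-arithmetic sketch. This is not the Triton or MLIR implementation: the `IntRange` class, its methods, and the stride and lane values are hypothetical names chosen for illustration, and overflow/wrap-around is deliberately ignored.

```python
from dataclasses import dataclass

# Assumed 32-bit signed bounds for an integer with no hints attached.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


@dataclass(frozen=True)
class IntRange:
    """Hypothetical closed interval [lo, hi] of possible integer values."""
    lo: int = INT32_MIN
    hi: int = INT32_MAX

    def assume_gt(self, bound: int) -> "IntRange":
        # A hint like "value > bound" raises the lower end of the interval.
        return IntRange(max(self.lo, bound + 1), self.hi)

    def assume_lt(self, bound: int) -> "IntRange":
        # A hint like "value < bound" lowers the upper end of the interval.
        return IntRange(self.lo, min(self.hi, bound - 1))

    def __add__(self, other: "IntRange") -> "IntRange":
        # Sum of intervals; overflow is ignored in this sketch.
        return IntRange(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "IntRange") -> "IntRange":
        # Product of intervals: the extremes are among the corner products.
        corners = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return IntRange(min(corners), max(corners))

    def is_non_negative(self) -> bool:
        return self.lo >= 0


# Without hints, nothing can be proven about the sign of K.
unknown_k = IntRange()

# With the two hints from the commit message, K is narrowed to [2, 9].
k = unknown_k.assume_gt(1).assume_lt(10)

# A derived offset, e.g. K * stride + lane, with a made-up stride and lane range.
offset = k * IntRange(4, 4) + IntRange(0, 3)
```

With the hints applied, `offset.is_non_negative()` holds, whereas `unknown_k.is_non_negative()` does not; a non-negativity proof of this kind is what an offset-based rewrite would depend on.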