File | Last commit | Last updated
[AMD] Introduce amdgpu.buffer_load and amdgpu.buffer_store (#4903) This PR introduces support for two new AMDGPU-specific operations: - amdgpu.buffer_load: loads from global memory via a pointer and a tensor offset - amdgpu.buffer_store: stores a value in global memory via a pointer and a tensor offset. I am also adding conversion patterns in LoadStoreOpToLLVM.cpp. These are similar to the ones for tt.load and tt.store, but different enough to deserve a specific rewrite. I tried to hoist functionality common to the 4 different patterns to reduce duplication. | 1 year ago
[AMD] Add fast_expf to libdevice (#4937) This PR adds a fast_expf operator under libdevice for AMD hardware. In line with the other operators in the exp family, the handling of denorm inputs is controlled by __HIP_FTZ, which is currently fixed to True. - If __HIP_FTZ = 1, the operator uses llvm.amdgcn.exp2.f32, which flushes denorms in inputs and outputs; - If __HIP_FTZ = 0, the operator uses llvm.exp2.f32, which does not flush denorms. Ref: https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/cuda2gcn/src/precision.cl Fixes https://github.com/ROCm/triton-internal/issues/314 | 1 year ago
[AMD] Add pass to convert tt.load/tt.store to buffer operations (#4966) This PR only introduces a ttgir pass to convert tt.load/tt.store to amdgpu.buffer_load/amdgpu.buffer_store, _when this is possible_: this means we need to check 3 conditions: 1. The pointer arithmetic has been canonicalized (scalarPtr->splat->addptr->load/store) 2. The offsets are 32-bit 3. The offsets are non-negative. We use a mix of analysis and assumptions to verify these conditions. Right now the functionality is gated behind AMDGCN_USE_BUFFER_OPS, which now also covers the pointer canonicalization pass, since that pass is mostly meant to enable this conversion. | 1 year ago
[AMD] Add TritonAMDGPU dialect scaffolding (#4685) This PR adds a TritonAMDGPU dialect to host future AMD-specific ops to help with AMD backend CodeGen. --------- Co-authored-by: Ognjen Plavsic <ognjen.plavsic@luxoft.com> Co-authored-by: Lei Zhang <antiagainst@gmail.com> | 1 year ago
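
The fast_expf entry (#4937) above distinguishes a flush-to-zero path from a denorm-preserving path. The following is a minimal Python sketch of that difference, not the libdevice implementation. It assumes that exp(x) is obtained as exp2(x * log2(e)), and it emulates float32 rounding with `struct`; the helper names `to_f32` and `flush_denorm` and the `ftz` parameter are hypothetical stand-ins for the effect of `__HIP_FTZ`.

```python
import math
import struct

LOG2E = 1.4426950408889634      # log2(e)
F32_MIN_NORMAL = 2.0 ** -126    # smallest positive normal float32


def to_f32(x):
    """Round a Python float to the nearest float32 (hypothetical helper)."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        # Value is outside the float32 range: saturate to infinity.
        return math.copysign(math.inf, x)


def flush_denorm(x):
    """Replace a float32 denormal with a zero of the same sign."""
    if x != 0.0 and abs(x) < F32_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def fast_expf(x, ftz=True):
    """Sketch of exp(x) via exp2; ftz=True flushes denormal inputs and outputs."""
    x = to_f32(x)
    if ftz:
        x = flush_denorm(x)
    t = to_f32(x * LOG2E)
    if t >= 128.0:
        return math.inf          # 2**128 already overflows float32
    y = to_f32(2.0 ** t)
    if ftz:
        y = flush_denorm(y)
    return y
```

With `ftz=True`, an input such as -100.0, whose exact result is a float32 denormal, comes back as zero, while `ftz=False` keeps the small nonzero value.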
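
The three conditions listed in the buffer-operations entry (#4966) above can be read as a simple legality check. The sketch below is illustrative only and is written in Python rather than as an MLIR pass: the `MemoryAccess` record, its fields, and the `lower` function are hypothetical names, and the real pass derives these facts from analysis and assumptions on the IR rather than from a prefilled record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryAccess:
    """Hypothetical summary of one tt.load/tt.store (not a Triton type)."""
    canonical_ptr: bool                 # built as scalarPtr -> splat -> addptr
    offset_bits: int                    # bit width of the offset tensor elements
    offset_lower_bound: Optional[int]   # known lower bound, or None if unknown


def can_use_buffer_ops(access):
    """Return True only when all three conditions from the entry hold."""
    if not access.canonical_ptr:
        return False                    # condition 1: canonical pointer arithmetic
    if access.offset_bits != 32:
        return False                    # condition 2: 32-bit offsets
    if access.offset_lower_bound is None or access.offset_lower_bound < 0:
        return False                    # condition 3: offsets proven non-negative
    return True


def lower(op_name, access):
    """Pick the buffer op when legal; otherwise keep the original op."""
    buffer_ops = {"tt.load": "amdgpu.buffer_load",
                  "tt.store": "amdgpu.buffer_store"}
    if op_name in buffer_ops and can_use_buffer_ops(access):
        return buffer_ops[op_name]
    return op_name
```

An unknown lower bound is treated as a failure here, so the sketch converts only when non-negativity is established; that conservative choice is an assumption of this example.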