File | Last commit message | Last updated
Fix default FMA implementation for tensors with integer elements (#7419) | 10 months ago
[AMD] Introduce specialized Allocation pass (#7328) | 11 months ago
[AMD] Introduce specialized Allocation pass (#7328) | 11 months ago
[Warp Specialization] Allow worker partitions to steal registers from the default partition (#6798) | 1 year ago
[ConSan] Support for WGMMA. Checks on non-async shmem and tmem accesses (#7712) | 10 months ago
[AMD] Introduce specialized Allocation pass (#7328) | 11 months ago
[PROTON] Intra kernel profiling (#7258) | 10 months ago
Reland "byte permutes in intra-warp layout conversion" (#7933) | 9 months ago
[Triton][Gluon] Add map_elementwise (#7564) | 10 months ago
[PROTON] Intra kernel profiling (#7258) | 10 months ago
[NFC] Make toLinearLayout take a RankedTensorType or MemDescType (#7440) | 10 months ago
[Backend] Plumb ttg.warp_specialize through LLVM lowering (#5963) | 1 year ago
Fix histograms for complex replicated layouts (#7938) | 9 months ago
[NFC] replace TritonGPUToLLVM/Utility.h macros with TritonLLVMOpBuilder (#5717) | 1 year ago
[BACKEND] Fix vectorization for PaddedSharedEncoding with non default order (#7845) | 9 months ago
[Backend][NFC] Switch some inline PTX to NVVM ops/intrinsics (#7725) | 10 months ago
NFC: Fix common typos (#6309) | 1 year ago
[BACKEND] Move lowering of CF as the last step of conversion to LLVM (#7213) | 11 months ago
[Backend][NFC] Switch some inline PTX to NVVM ops/intrinsics (#7725) | 10 months ago
[LAYOUTS] Kill getWarpsPerCTA(Attribute) and prefer LinearLayout-based impl (#6252) | 1 year ago
Cleanup includes in TritonGPUToLLVM/Utility.h (#6818) | 1 year ago
[Backend] Fix various issues with smem base offsets (#7949) | 9 months ago
[Backend] Fix various issues with smem base offsets (#7949) | 9 months ago