File | Last commit message | Last updated
[Backend] Bump to llvm/llvm-project@bc773632355b (#7881) | 9 months ago
[WS] reorder partition-loops and lower-aref (#7927) | 9 months ago
[BACKEND] Don't use mmav5 with num warps < 4 (#7928) | 9 months ago
[WS] Reimplement PartitionLoops pass supporting general control flow (#7415) | 10 months ago
[NFC] Use RankedTensorType's clone and cloneWithEncoding member functions (#7464) | 10 months ago
[AMD] NFC: simplify pass/pattern constructor declaration (#7665) | 10 months ago
Add support for Nvidia Blackwell GPUs (#5724) | 1 year ago
[AMD] Support scaled dot for gfx12 (#7644) | 10 months ago
[BACKEND] bump llvm to 2f7ade4b (#5599) | 1 year ago
[RELAND][Blackwell] Optimize MMA warp specialization to allow multiple consumers of MMAv5 result (#6514) | 1 year ago
[BACKEND] Fix wrong check in tmem_alloc canonicalization pattern (#7719) | 10 months ago
[Utility] fix pass-by-reference addIterArgsToLoop API (#7029) | 1 year ago
[LAYOUTS] Fix memdesc_subviews when we don't slice along the swizzling pattern (#7480) | 10 months ago
[NFC] Use RankedTensorType's clone and cloneWithEncoding member functions (#7464) | 10 months ago
[TritonGPU] Split MemDescSubview into MemDescIndex and MemDescSubslice (#7622) | 10 months ago
[LAYOUTS] Move order to LinearEncoding implementation (#6243) | 1 year ago
[NFC] Use RankedTensorType's clone and cloneWithEncoding member functions (#7464) | 10 months ago
[BACKEND] Fix dereference nullptr (#5944) | 1 year ago
[NVWS] Add pass to insert aref for TMA load (#7581) | 10 months ago