| File | Last commit | Last updated |
|---|---|---|
| | refactor(distributed): Refactor distributed tools and simplify linear parameters Co-authored-by: Muu<koimuu@163.com> | 1 month ago |
| | refactor: Adapted to the latest TE linear layer. Co-authored-by: Muu<koimuu@163.com> | 1 month ago |
| | feat(FP4): Support W4A4-MXFP4 Co-authored-by: mingzhenwang<wangmingzhen4@huawei.com> | 15 days ago |
| | refactor(distributed): Refactor distributed tools and simplify linear parameters Co-authored-by: Muu<koimuu@163.com> | 1 month ago |
| | feat(fp8): finalize FP8 core implementation and fixes Co-authored-by: Muu<koimuu@163.com> | 2 months ago |
| | refactor(grouped_linear,gemm,fp8): overhaul grouped matmul with unified dispatch, NPU version check, and FP8 quantization cleanup Co-authored-by: Muu<koimuu@163.com> | 17 days ago |
| | linear fp8 refactor Co-authored-by: clc2025<chenlucong@huawei.com> | 1 month ago |
| | fix: set input_quantizer usage explicitly so a forward-only pass does not pollute the next training forward. Co-authored-by: Bruce-rl-hw<okwsl201210@gmail.com>. Merge request Ascend/TransformerEngineNPU!95 (fix/mxfp8-thd-columnwise into main; created and committed by Bruce-rl-hw, merged by ascend-robot). Description: Fix mxfp8 training in LayerNormLinear by setting the input quantizer's usage explicitly on every forward (rowwise=True, columnwise=backward_needs_input), so a prior forward-only pass (e.g. an RL log-prob recompute) can't leave columnwise=False and make the backward weight-gradient GEMM crash with "Cannot access storage of UndefinedTensorImpl". Verified on qwen3-0.6B/Ascend 950 DT. | 11 hours ago |
| | fix ut Co-authored-by: clc2025<chenlucong@huawei.com> | 28 days ago |
| | refactor(grouped_linear,gemm,fp8): overhaul grouped matmul with unified dispatch, NPU version check, and FP8 quantization cleanup Co-authored-by: Muu<koimuu@163.com> | 17 days ago |
| | refactor(quantized_tensor): replace QuantizedTensorView with tensor tuples Co-authored-by: Muu<koimuu@163.com> | 12 days ago |
| | feat: basic.rmsnorm precision and optimize ut of ops/act Co-authored-by: GuoHaifeng1999<guohaifeng12@huawei.com> | 1 month ago |