| add w8a16 quant
Co-authored-by: wangjunhang <wangjunhang7@huawei.com>
# message auto-generated for no-merge-commit merge:
!3460 merge Dev_W8A16 into master
add w8a16 quant
Created-by: goodflower9
Commit-by: wangjunhang
Merged-by: ascend-robot
Description: What this PR does / why we need it?
This PR adds W8A16 MXFP8 QAT support based on the existing QAT flow.
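For readers unfamiliar with the scheme, the following is a minimal NumPy sketch of what weight-only MXFP8 fake quantization generally looks like. It is not taken from this PR. The block size of 32, the E4M3 element format, and the function names `round_to_e4m3` and `fake_quant_mxfp8` are assumptions for illustration. In a W8A16 setup only the weights pass through this quantize-dequantize step; activations stay in 16-bit.
```python
import numpy as np

# Assumed format parameters (not confirmed by this PR): FP8 E4M3 elements
# with one shared power-of-two (E8M0-style) scale per block of 32 values.
E4M3_MAX = 448.0     # largest finite E4M3 magnitude
E4M3_EMAX = 8        # 2**8 <= 448 < 2**9
E4M3_MIN_EXP = -6    # smallest normal exponent
E4M3_MANT_BITS = 3
BLOCK = 32


def round_to_e4m3(v):
    """Round to the nearest E4M3-representable value, saturating at +-448."""
    a = np.abs(v)
    _, ex = np.frexp(a)                       # a = m * 2**ex, m in [0.5, 1)
    e = np.maximum(ex - 1, E4M3_MIN_EXP)      # subnormals share the lowest step
    step = np.ldexp(1.0, e - E4M3_MANT_BITS)  # spacing of values in this binade
    q = np.round(a / step) * step             # round half to even
    return np.sign(v) * np.minimum(q, E4M3_MAX)


def fake_quant_mxfp8(w, block=BLOCK):
    """Quantize-dequantize w in blocks along the last axis (hypothetical helper)."""
    w = np.asarray(w, dtype=np.float32)
    if w.shape[-1] % block != 0:
        raise ValueError("last dimension must be a multiple of the block size")
    blocks = w.astype(np.float64).reshape(-1, block)
    amax = np.max(np.abs(blocks), axis=1, keepdims=True)
    _, ex = np.frexp(amax)
    # Shared exponent: floor(log2(amax)) - emax, clipped to the E8M0 range.
    shared_exp = np.clip(ex - 1 - E4M3_EMAX, -127, 127)
    scale = np.ldexp(1.0, shared_exp)
    q = round_to_e4m3(blocks / scale)
    return (q * scale).reshape(w.shape).astype(np.float32)


if __name__ == "__main__":
    w = np.zeros(32, dtype=np.float32)
    w[0], w[1] = 1.0, 0.3
    print(fake_quant_mxfp8(w)[:2])
```
In a QAT flow this step is typically applied to the weights in the forward pass, with gradients passed through the rounding unchanged (a straight-through estimator); how the PR wires it into the existing flow is not shown here.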
Does this PR introduce any user-facing change?
Yes. Users can enable W8A16 MXFP8 QAT with:
```
--qat-scheme w8a16-mxfp8
```
Related doc:
```
docs/zh/features/qat_quant.md
```
How was this patch tested?
Tested by enabling fake quantization with the parameter --qat-scheme w8a16-mxfp8.
https://wiki.huawei.com/domains/76578/wiki/233229/WIKI2026051211063362
See merge request: Ascend/MindSpeed!3460 | 3 days ago |