| feat: QAT supports the W4A16 fake quantization option only at the MOE layer.
Co-authored-by: xusiyang<xusiyang2@huawei.com>
# message auto-generated for no-merge-commit merge:
!4446 merge master into master
feat: QAT supports the W4A16 fake quantization option only at the MOE layer.
Created-by: weixin_44492126
Commit-by: xusiyang
Merged-by: ascend-robot
Description:
## What this PR does / why we need it?
QAT supports an option to apply W4A16 fake quantization only at the MoE layers.
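
For reviewers unfamiliar with the term, here is a minimal sketch of what W4A16 fake quantization restricted to MoE layers could look like. It is not the MindSpeed-LLM implementation: `fake_quant_w4`, `apply_moe_only`, the `is_moe` flag, and the symmetric per-group scheme are all assumptions made for illustration. Weights are rounded to a signed 4-bit grid and immediately dequantized, while activations are left untouched (the "A16" part).

```python
def fake_quant_w4(weights, group_size=4):
    """Quantize-dequantize a flat list of weights to symmetric 4-bit, per group.

    Hypothetical helper for illustration; the grouping scheme is an assumption.
    """
    qmin, qmax = -8, 7  # signed 4-bit integer range
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # The scale maps the largest magnitude onto qmax; all-zero groups use 1.0.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        for w in group:
            q = max(qmin, min(qmax, round(w / scale)))  # integer code in [-8, 7]
            out.append(q * scale)  # dequantize back to floating point
    return out


def apply_moe_only(layers, group_size=4):
    """Fake-quantize weights only for layers flagged as MoE (hypothetical flag)."""
    return {
        name: fake_quant_w4(layer["weight"], group_size)
        if layer["is_moe"]
        else list(layer["weight"])  # non-MoE layers pass through unchanged
        for name, layer in layers.items()
    }
```

In this sketch only entries with `is_moe` set to `True` see quantization error during training; every other layer keeps its original weights.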
## Does this PR introduce any user-facing change?
Adds validation for the option that controls whether the MoE layers use W4A16.
## How was this patch tested?
Verified on a 910B machine.
https://wiki.huawei.com/domains/171785/wiki/358154/WIKI2026040110634082
See merge request: Ascend/MindSpeed-LLM!4446 | 22 days ago |