MindSpeed-LLM/mindspeed_llm/features_manager/qat · Ascend/MindSpeed-LLM - AtomGit

File	Last commit	Last updated
qat_quant_engine.py	feat: QAT supports the W4A16 fake quantization option only at the MOE layer. Co-authored-by: xusiyang<xusiyang2@huawei.com> # message auto-generated for no-merge-commit merge: !4446 merge master into master feat: QAT supports the W4A16 fake quantization option only at the MOE layer. Created-by: weixin_44492126 Commit-by: xusiyang Merged-by: ascend-robot Description: ## What this PR does / why we need it? QAT supports an option to apply W4A16 fake quantization only at the MoE layers. ## Does this PR introduce any user-facing change? Adds validation for the option that controls W4A16 use at the MoE layers. ## How was this patch tested? Verified on 910B machines. https://wiki.huawei.com/domains/171785/wiki/358154/WIKI2026040110634082 See merge request: Ascend/MindSpeed-LLM!4446	22 days ago