File | Last commit | Last updated

[msmodelslim] Refresh file header declarations and copyright
Co-authored-by: caishengcheng<caishengcheng@huawei.com>
# message auto-generated for no-merge-commit merge: !28 merge license into master
Created-by: caishengcheng
Commit-by: caishengcheng
Merged-by: ascend-robot
See merge request: Ascend/msmodelslim!28
Last updated: 4 months ago

test: add UT for processor/quant/autoround_utils and processor/quarot/common
Co-authored-by: tanxiangyuu<tanxiangyu2@huawei.com>
# message auto-generated for no-merge-commit merge: !511 merge feature/add-ut-quant-quarot into master
Created-by: tanxiangyuu
Commit-by: tanxiangyuu
Merged-by: ascend-robot
Description: For the coverage additions, see https://gitcode.com/tanxiangyuu/msmodelslim/issues/1
See merge request: Ascend/msmodelslim!511
Last updated: 13 hours ago
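
[feature] Support mixed quantization configuration and saving for FA3
Co-authored-by: wangsihao<wangsihao5@h-partners.com>
# message auto-generated for no-merge-commit merge: !483 merge master into master
Created-by: wangsihao
Commit-by: wangsihao
Merged-by: ascend-robot
Description:

Thank you for contributing this Pull Request! Before submitting, please read [CONTRIBUTING.md](https://gitcode.com/Ascend/msmodelslim/blob/master/CONTRIBUTING.md).

## PR description (What this PR does / why we need it?)

Supports mixing different strategies and granularities in FA3 quantization, for example dynamic quantization for fa_q and static quantization for fa_k/fa_v.
The supported quantization granularities are FP8 and INT8; INT8 supports static quantization only.

## Functional verification (How was this patch tested?)

- [_] Functional self-verification
- [_] Screenshots of local self-verification (redact personal identifiers and other sensitive information)
- [_] Whether new or changed content is covered by new or adapted UT test cases

See merge request: Ascend/msmodelslim!483
Last updated: 20 hours ago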
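
The mixed-configuration rule described in the PR above (per-tensor strategy for fa_q, fa_k and fa_v, with INT8 limited to static quantization) can be illustrated with a small validation sketch. This is not msmodelslim's actual configuration schema: the `FaTensorQuant` class, the `validate_fa3_config` helper, and the `dtype`/`mode` field names are hypothetical, and treating FP8/INT8 as a per-tensor data type is an assumption made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical value sets; only FP8/INT8 and dynamic/static are named in the PR.
ALLOWED_DTYPES = {"fp8", "int8"}
ALLOWED_MODES = {"dynamic", "static"}
REQUIRED_TENSORS = ("fa_q", "fa_k", "fa_v")


@dataclass(frozen=True)
class FaTensorQuant:
    """Quantization settings for one FA3 tensor (hypothetical schema)."""
    dtype: str
    mode: str

    def __post_init__(self):
        if self.dtype not in ALLOWED_DTYPES:
            raise ValueError(f"unsupported dtype: {self.dtype}")
        if self.mode not in ALLOWED_MODES:
            raise ValueError(f"unsupported mode: {self.mode}")
        # Rule stated in the PR: INT8 supports static quantization only.
        if self.dtype == "int8" and self.mode != "static":
            raise ValueError("INT8 supports static quantization only")


def validate_fa3_config(cfg: dict) -> dict:
    """Check that every FA3 tensor has a valid, possibly different, setting."""
    missing = set(REQUIRED_TENSORS) - set(cfg)
    if missing:
        raise ValueError(f"missing entries: {sorted(missing)}")
    return {name: FaTensorQuant(**cfg[name]) for name in REQUIRED_TENSORS}


# Mixed configuration from the PR's example: dynamic fa_q, static fa_k/fa_v.
mixed = validate_fa3_config({
    "fa_q": {"dtype": "fp8", "mode": "dynamic"},
    "fa_k": {"dtype": "fp8", "mode": "static"},
    "fa_v": {"dtype": "int8", "mode": "static"},
})
```

Validating in `__post_init__` keeps the INT8 constraint next to the data it restricts, so an invalid combination fails at construction time rather than during quantization.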

feat: LinearQuantProcessor DTS integration
Co-authored-by: rookie_hongchuan<hongchuan6@h-partners.com>
# message auto-generated for no-merge-commit merge: !395 merge feature/linearquant-dts-v2 into master
Created-by: rookie_hongchuan
Commit-by: rookie_hongchuan
Merged-by: ascend-robot
Description:

## Overview

Integrates LinearQuantProcessor with the DistributedTaskScheduler (DTS) to parallelize the calibration computation of data-free weight quantizers in the DP layer-wise quantization pipeline, reducing the end-to-end time of W4A8 SSZ quantization for MoE models.

## Motivation

W4A8 SSZ per_channel quantization of Qwen3-Next-80B (48 layers, 100 MoE experts per layer) is extremely compute-heavy: about 4 minutes per layer on a single card. In the DP layer-wise quantization pipeline, the LinearQuantProcessor for each layer handles 3 subgraphs (self_attn, mlp.experts, linear_attn), which together contain a large number of linear submodules. The original SSZ forward calls run serially and become the quantization bottleneck.

Key insight: the SSZ weight quantizer is **data-free** (a deterministic algorithm that needs no activation data), so its forward computation can safely run in parallel across ranks without exchanging calibration data.

## Design

### Architecture

```
AutoWeightQuantizer (base class, inherits DTSMixin)
├── distributed_sync() default → self.forward(None)   # each rank recomputes independently
└── WeightPerChannelSsz overrides distributed_sync()
    → _broadcast_quantizer_state: owner computes, broadcasts to all ranks
```

### DTS scheduling in LinearQuantProcessor

```
preprocess()
└── _calibrate_shared_data_free_with_dts()
    ├── iterate over LinearQuantizer submodules
    ├── filter: weight_quantizer.is_data_free() && dist_helper.is_shared(name)
    └── submit DTS tasks → executed in parallel on 4 ranks
```

### Broadcast protocol (SSZ only)

```
owner_rank → broadcast_object_list(metadata: shape/dtype)
           → broadcast_tensor_process_group_safe(3 tensors: scale, offset, storage)
non-owner  → load_quantized_from_broadcast_tensors()
           → rebuild QParam + QStorage
```

### Modified files

| File | Change |
|------|------|
| msmodelslim/core/quantizer/base.py | AutoWeightQuantizer inherits DTSMixin; the default distributed_sync triggers forward(None) |
| msmodelslim/core/quantizer/impl/ssz.py | WeightPerChannelSsz.distributed_sync implements the broadcast; adds load_quantized_from_broadcast_tensors |
| msmodelslim/processor/quant/linear.py | _calibrate_shared_data_free_with_dts method; DTS task submission and scheduling |

## Test results

### Environment

| Item | Value |
|------|------|
| Model | Qwen3-Next-80B-A3B-Instruct |
| Quantization method | W4A8 SSZ per_channel |
| Devices | 4x Ascend 910B2 (64GB HBM) |
| Distributed setup | DP LayerWise (4 ranks) + DTS inner-layer parallelism |

### Performance

| Metric | Single-card baseline | 4 cards + DTS | Speedup |
|------|---------|---------|--------|
| End-to-end time | ~2 hours | ~82 minutes | ~1.5x |
| SSZ compute (per-layer average) | ~155s serial | ~67s wall-clock | ~2.3x |

DTS efficiency: T_run / sum(task_exec) ≈ 0.43 (4 ranks, 1536 tasks per layer)

### Consistency verification

```
mstool cmp baseline/ 4card/
→ all tensors: abs diff = 0 ✅
```

The single-card and 4-card outputs are byte-for-byte identical across all 30,000+ tensors (weights, scales, offsets, scale_biases).

### Supporting tools

Adds tools/npu-affinity-keeper.sh: it monitors NPU processes via npu-smi info and automatically restores CPU affinity, working around run_in_background pinning child processes to a single core.

See merge request: Ascend/msmodelslim!395
Last updated: 8 days ago
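
The speedup and efficiency figures quoted in the PR above can be cross-checked with a few lines of arithmetic. The sketch below uses only the approximate numbers from the performance table. One assumption is labeled in the code: that sum(task_exec) is close to the serial SSZ time of about 155 s per layer, which the PR does not state explicitly.

```python
# Approximate figures quoted in the PR's performance table.
BASELINE_E2E_S = 2 * 60 * 60   # ~2 hours end to end on a single card
DTS_E2E_S = 82 * 60            # ~82 minutes end to end on 4 cards + DTS
SSZ_SERIAL_S = 155             # ~155 s per-layer SSZ compute, serial
SSZ_WALL_S = 67                # ~67 s per-layer SSZ wall-clock with DTS
NUM_RANKS = 4


def speedup(baseline_s: float, parallel_s: float) -> float:
    """Speedup is baseline time divided by parallel time."""
    return baseline_s / parallel_s


e2e_speedup = speedup(BASELINE_E2E_S, DTS_E2E_S)   # about 1.46, quoted as ~1.5x
ssz_speedup = speedup(SSZ_SERIAL_S, SSZ_WALL_S)    # about 2.31, quoted as ~2.3x

# Assumption: sum(task_exec) is approximated by the serial SSZ time, so
# T_run / sum(task_exec) becomes wall-clock time over serial time.
run_ratio = SSZ_WALL_S / SSZ_SERIAL_S              # about 0.43

# With perfect load balance and no overhead, 4 ranks would give a ratio of 0.25.
ideal_ratio = 1 / NUM_RANKS

print(f"end-to-end speedup: {e2e_speedup:.2f}x")
print(f"SSZ speedup:        {ssz_speedup:.2f}x")
print(f"T_run / sum(task_exec): {run_ratio:.2f} (ideal {ideal_ratio:.2f})")
```

Under that assumption, the ratio of about 0.43 sits well above the ideal 0.25 for 4 ranks, which is consistent with the per-layer SSZ speedup being about 2.3x rather than 4x.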

[test] add processor/quant unit test
Co-authored-by: tanxiangyuu<tanxiangyu2@huawei.com>
# message auto-generated for no-merge-commit merge: !509 merge dev_2 into master
Created-by: tanxiangyuu
Commit-by: tanxiangyuu
Merged-by: ascend-robot
Description: For the coverage additions, see https://gitcode.com/tanxiangyuu/msmodelslim/issues/1
See merge request: Ascend/msmodelslim!509
Last updated: 13 hours ago