msmodelslim/docs/zh/appendix · Ascend/MindStudio-ModelSlim - AtomGit

File	Last commit	Last updated
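images	【docs】Restructure the documentation architecture and host the docs on Read the Docs. Co-authored-by: keith_wa<keith_wwa@163.com> # message auto-generated for no-merge-commit merge: !109 merge pr_docs_re_copy into master Created-by: keith_wa Commit-by: keith_wa Merged-by: ascend-robot Description:

1. Motivation

Content: Improve the navigation structure: the original directory structure was illogical, deeply nested, and had complicated links. Improve the reading experience: the original traditional_quantization_v0 directory held 20+ scattered documents with fragmented content, so users could not quickly build a complete picture of the technology. Eliminate redundancy: several documents repeated the same dependency notes, procedures, and parameter descriptions, which raised maintenance cost and the risk of version inconsistencies.

Presentation: Provide professional documentation hosting: the original docs/ directory structure was unclear, directory and document names (in English) were not intuitive, navigation was cumbersome, and there was no search.

2. Changes

2.1 Reorganized the directory structure ![image.png](https://raw.gitcode.com/user-images/assets/8444818/8bba19f9-e84e-4f71-bd5e-ffd310dde142/image.png 'image.png')

2.2 Consolidated and restructured documents. The V0 and traditional quantization core documents were merged: more than 20 scattered documents were consolidated by function into 10 core guides:

# V0 framework documentation navigation (no longer evolving)

The documents in this directory are reorganized by model type and task scenario so that you can quickly find what you need.

## 1. Traditional model quantization and calibration
- [Traditional model quantization and calibration](traditional_model_quantization_and_calibration.md) - Covers PyTorch/ONNX/MindSpore post-training quantization and QAT.

## 2. Foundation model quantization and compression
- [Foundation model quantization and calibration](foundation_model_quantization_and_calibration.md) - Covers low-memory quantization, mixed calibration datasets, and FA3 quantization.
- [Compression and structural optimization (mainly foundation models)](foundation_model_compression.md) - Covers sparse quantization and weight compression, long-sequence compression, the weight compression workflow, and low-rank decomposition.

## 3. Training acceleration and model modification
- [Training acceleration and model modification](pruning_and_distillation.md) - Covers importance pruning, Transformer pruning, Sparse tool, and model distillation.
- [Sparse accelerated training](sparse_acceleration_training.md) - Covers the sparse training acceleration workflow for width-expanded and depth-expanded models.

## 4. Tools and ecosystem adaptation
- [Auxiliary tools and dedicated guides](compression_utils.md) - Covers the quantized weight format and the MindSpeed adapter.
- [Fake-quantization accuracy testing tool](fake_quantization_accuracy_testing_tool.md) - Covers Precision Tool usage and the testing workflow.
- [Inference optimization for multimodal generative models](inference_optimization_for_multimodal_generative_model.md) - Covers DiT cache optimization and the adaptive sampling optimization workflow.
- [Common code examples](quantization_and_sparse_quantization_scenario_import_code_examples.md) - Covers import code samples for common quantization and sparse quantization scenarios.

2.3 Configured Read the Docs hosting: https://modelslim.readthedocs.io/zh-cn/latest/

2.4 Configured DeepWiki: https://deepwiki.com/Keithwwa/ModelSlim

3. Verification

3.1 Gemini code review: https://github.com/Keithwwa/ModelSlim/pull/1

See merge request: Ascend/msmodelslim!109	3 months ago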
CONTRIBUTING.md	Add developer testing guide Co-authored-by: 李明宇<limingyu35@h-partners.com> # message auto-generated for no-merge-commit merge: !416 merge master-utdocs into master 【feature】【DOCS】Add developer testing guide Created-by: code_mingming Commit-by: code_mingming;李明宇 Merged-by: ascend-robot Description: PR description (What this PR does / why we need it?): 1. Adds a test case specification guide so that users can run the UT cases by following the documentation. 2. Modifies run_ut.sh so that a parameter can restrict the run to the modelslim_vl test cases only. 3. Fixes some UTs and adds the previously disabled ir directory. See merge request: Ascend/msmodelslim!416	12 days ago
faq.md	【docs】: Fix formatting that rendered incorrectly due to incompatible syntax Co-authored-by: zzm30<zhengzhimin1@h-partners.com> # message auto-generated for no-merge-commit merge: !361 merge master into master Created-by: zzm30 Commit-by: zzm30 Merged-by: ascend-robot Description: PR description (What this PR does / why we need it?): 1. Fixes formatting in the hosted documentation that rendered incorrectly due to incompatible syntax. 2. Fixes other minor errors. See merge request: Ascend/msmodelslim!361	1 month ago
public_urls.xlsx	【docs】Restructure the documentation architecture and host the docs on Read the Docs. (Same commit as the images entry.) See merge request: Ascend/msmodelslim!109	3 months ago
release_notes.md	[docs] improve the docs readability Co-authored-by: zhongzhoutan<1710115119@bjmu.edu.cn> # message auto-generated for no-merge-commit merge: !309 merge docs/docs-improvement into master Created-by: tangxuanya Commit-by: zhongzhoutan Merged-by: ascend-robot Description:

## 1. Bug fixes (correctness issues)

| File | Fix |
|------|-----|
| [SessionConfig.md](../docs/zh/python_api_v0/unified_multimodal_generation_apis/SessionConfig.md) | `act_method` option `'mixed'` → `'mix'` (verified against the `ACT_METHOD` enum in the source code) |
| [SessionConfig.md](../docs/zh/python_api_v0/unified_multimodal_generation_apis/SessionConfig.md) | Added the missing comma after `output_path="./"` in the code example |
| [[onnx]run().md](../docs/zh/python_api_v0/quantization_apis/[onnx]post_training_quantization/run().md) | Added the variable `input_model_path`, which was used but never defined |
| [traditional_model_quantization_and_calibration.md](../docs/zh/feature_guide/traditional_quantization_v0/traditional_model_quantization_and_calibration.md) | Moved `from ascend_utils.common.security import SafeWriteUmask` into the code block that actually uses it (the diffusers example) instead of the resnet50 example, where it had been misplaced |
| [典型模型量化支持特性设计说明书.md](../docs/zh/design/典型模型量化支持特性设计说明书.md) | Fixed two incorrect CLI subcommands: `msmodelslim quantize` → `msmodelslim quant` |
| [usage.md (one-click quantization)](../docs/zh/feature_guide/quick_quantization_v1/usage.md) | Fixed a grammatical error in the `tag` parameter description: "则于用户交互" → "则与用户交互，询问是否" (the predicate verb was missing) |
| [quantization_quick_start.md](../docs/zh/getting_started/quantization_quick_start.md) | Same `tag` parameter grammar fix as above |

---

## 2. Anchor and link fixes

| File | Fix |
|------|-----|
| [integrating_multimodal_understanding_model.md](../docs/zh/developer_guide/integrating_multimodal_understanding_model.md) | All anchors for numbered headings (such as `#5-校准数据准备` and `#34-辅助方法...`) were updated to match the de-numbered headings (such as `#校准数据准备` and `#辅助方法...`) |
| [sparse_quantization_accuracy_tuning_cases.md](../docs/zh/case_studies/sparse_quantization_accuracy_tuning_cases.md) | Fixed broken link targets |
| [w8a8_accuracy_tuning_policy.md](../docs/zh/case_studies/w8a8_accuracy_tuning_policy.md) | Fixed broken link targets |

---

## 3. Content improvements: `integrating_multimodal_understanding_model.md`

- Removed the `1.`/`2.`/`3.x` number prefixes from all headings (14 in total), for consistency with the documentation style
- Added inline explanations of terms: Merger, DeepStack, PatchMerger, MoE, IterSmooth, QuaRot
- Converted plain-text file paths into clickable GitCode links (`model_adapter.py`, `__init__.py`, `moe_utils.py`, and others)
- Condensed the calibration data section (formerly section 5) from about 60 lines of detail to one sentence plus a link to `usage.md#dataset`, eliminating duplicated content
- Added a table of YAML configuration fields that explains each `spec` block, with reference links to `linear_quant.md` and `usage.md`
- Removed numbering from the FAQ section headings and updated the anchors accordingly
- Appendix: fixed a minor code error; trimmed duplicate VLM data loader links

---

## 4. Content improvements: quantization algorithm documents

| File | Change |
|------|--------|
| [fa3_quant.md](../docs/zh/quantization_algorithms/quantization_algorithms/fa3_quant.md) | Removed the duplicate YAML configuration block in the "Usage" section; merged the usage notes into the "YAML configuration example" section |
| [float_sparse.md](../docs/zh/quantization_algorithms/quantization_algorithms/float_sparse.md) | Same as above; removed the duplicate YAML block and the empty "Model adaptation" section |
| [gptq.md](../docs/zh/quantization_algorithms/quantization_algorithms/gptq.md) | Added a "Use as a Processor" lead-in before the YAML example |
| [histogram_activation_quantization.md](../docs/zh/quantization_algorithms/quantization_algorithms/histogram_activation_quantization.md) | Removed the duplicate YAML; converted code paths to GitCode links; restructured the field table into a Histogram-specific constraints table; corrected the quantizer lookup explanation in the FAQ to reference the right path |
| [kvcache_quant.md](../docs/zh/quantization_algorithms/quantization_algorithms/kvcache_quant.md) | Merged two duplicate YAML configuration blocks |
| [minmax.md](../docs/zh/quantization_algorithms/quantization_algorithms/minmax.md) | Converted code paths to GitCode links; restructured the feature introduction into 3 subsections; completed all parameters in the YAML example; expanded the field description table |
| [pdmix.md](../docs/zh/quantization_algorithms/quantization_algorithms/pdmix.md) | Unified `per-token`/`per-tensor` as `per_token`/`per_tensor` (code formatting plus underscore style) throughout |

---

## 5. Content improvements: `quantization_result.md`

- Added the `optional/quarot.safetensors` entry to the directory tree and the description table
- Generalized the quantized weight file name from `quant_model_weight_w8a8.safetensors` to `quant_model_weight_{quant_type}.safetensors`, with an explanation
- Removed the redundant "Note" callout (its information moved into the table)
- Removed the numbering from all numbered section headings (`#### 1.` … `#### 17.`)
- Added a complete "QuaRot export results" section covering the directory structure, a file description table, an example of the `quant_model_description.json` JSON structure, and usage scenarios

---

## 6. example/ directory README fixes

| File | Change |
|------|--------|
| [Llama/README.md](../example/Llama/README.md) | Expanded the `model_type` parameter description to spell out `llama3.1_fp` (base model) and `llama3.1_instruct` (Instruct model); added `--model_type llama3.1_fp` to all 5 LLaMA3.1 base model commands |
| [Qwen3-Next/README.md](../example/Qwen3-Next/README.md) | Rewrote the transformers version requirement; removed heading numbering; changed placeholders to the `${MODEL_PATH}` style |
| [GLM/README.md](../example/GLM/README.md) | "L自动回退等级" → "自动回退等级" (removed a stray character) |
| [HunYuan/README.md](../example/HunYuan/README.md) | "HunYuan" → "Hunyuan" (official capitalization) |
| [GLM-4.1V/README.md](../example/multimodal_vlm/GLM-4.1V/README.md) | Added the full name "基于课程采样的强化学习" (Reinforcement Learning with Curriculum Sampling) at the first occurrence of "RLCS" |
| [example/README.md](../example/README.md) | Removed the empty "Quick start" section |

---

## 7. Miscellaneous fixes in other documents

| File | Change |
|------|--------|
| [dir_structure.md](../docs/zh/dir_structure.md) | "Smooth Quant" → "SmoothQuant" (official name) |
| [auto_precision_tuning/usage.md](../docs/zh/feature_guide/auto_precision_tuning/usage.md) | `device` parameter description: removed the text that pointed to a dead link and replaced it with a concise explanation of "Data Parallel (DP) layer-wise quantization" |
| [analyze_api_usage.md](../docs/zh/feature_guide/sensitive_layer_analysis/analyze_api_usage.md) | Removed an incorrect and confusing explanation of what `trust_remote_code=False` means |
| [mindspeed_adapter.md](../docs/zh/feature_guide/traditional_quantization_v0/mindspeed_adapter.md) | Removed a misleading inline comment on the `dev_type='npu'` parameter |
| [foundation_model_support_matrix.md](../docs/zh/model_support/foundation_model_support_matrix.md) | Trimmed footnote 4 (removed the vLLM workaround description) |
| [典型模型量化支持特性设计说明书.md](../docs/zh/design/典型模型量化支持特性设计说明书.md) | Added a 16-entry abbreviation glossary; fixed typos in Use Case names |
| [自动调优加速特性设计说明书.md](../docs/zh/design/自动调优加速特性设计说明书.md) | Fixed typos in Use Case names |
| [release_notes.md](../docs/zh/appendix/release_notes.md) | Replaced the empty "无" ("None") with appropriate content |
| [README.md](../README.md) | Added a link to the multimodal model integration guide in the developer guide section |

---

## Overall assessment

This commit changes documentation only; its scope is clear and its purpose well defined. Highlights:

- Fixed 7 correctness bugs: incorrect CLI command, incorrect enum value, missing variable, misplaced import, broken anchors, grammatical errors
- Removed duplicated content in 5+ files: a large amount of redundant material was trimmed from the quantization algorithm documents and the multimodal integration guide
- Removed heading numbering consistently: section numbers were removed in 4+ files, in line with the project's documentation style
- Introduced no functional changes: every changed anchor target matches its new de-numbered heading, and no regressions were introduced

See merge request: Ascend/msmodelslim!309	1 month ago
security_statement.md	【docs】: Fix minor errors found by large-model review Co-authored-by: zzm30<zhengzhimin1@h-partners.com> # message auto-generated for no-merge-commit merge: !333 merge master into master Created-by: zzm30 Commit-by: zzm30 Merged-by: ascend-robot See merge request: Ascend/msmodelslim!333	1 month ago