| 文件 | 最后提交记录 | 最后更新时间 |
|---|---|---|
feat:双机RoCE,删除不涉及的机型。 Co-authored-by: hqx<1343153389@qq.com> # message auto-generated for no-merge-commit merge: !287 merge ms1 into develop feat:双机RoCE,删除不涉及的机型。 Created-by: h7star Commit-by: hqx Merged-by: ascend-robot Description: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [x] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [ ] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 800I A3双机RoCE。不包含机型800T的机型A3_752T_128G_DIE,需删除A3_752T_128G_DIE_ROCE相关的改动。 ------ ## 📝 Modification / 修改内容 删除A3_752T_128G_DIE_ROCE相关的改动。 ------ ## 📐 Associated Test Results / 关联测试结果 **Please provide the related test results, such as test reports, etc.** **请提供相关测试结果,例如测试报告等。** ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [x] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [x] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!287 | 23 天前 | |
feat:支持A5硬件350系列性能评估 Co-authored-by: hqx<1343153389@qq.com> # message auto-generated for no-merge-commit merge: !336 merge ms4 into develop feat:支持A5硬件350系列性能评估 Created-by: h7star Commit-by: hqx Merged-by: ascend-robot Description: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [x] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [ ] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 支持对基于A5硬件模型进行建模。 ------ ## 📝 Modification / 修改内容 新增了对A5硬件的支持,--device增加以下硬件:ATLAS_350_425T_112G、ATLAS_350_425T_84G。 ------ ## 📐 Associated Test Results / 关联测试结果 python -m cli.inference.text_generate Qwen/Qwen3-8B --num-queries 8 --query-length 1024 --device ATLAS_350_425T_112G --num-devices 2 --tp-size 2 --compile --quantize-linear-action MXFP4  ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!336 | 16 天前 | |
feat:仿真建模支持deepseek-V4模型适配 Co-authored-by: ChenHuiwen<chenhuiwen7@huawei.com> # message auto-generated for no-merge-commit merge: !166 merge deepseek-v4 into develop feat:仿真建模支持deepseek-V4模型适配 Created-by: ChenHuiwen Commit-by: ChenHuiwen Merged-by: ascend-robot Description: Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [x] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [ ] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 为 msmodeling/tensor_cast 增加对 DeepSeek V4 (Flash/Pro) 模型的端到端支持,使其性能建模流水线能够覆盖 V4 引入的稀疏注意力(NSA / Window / Compressed / Heavily-Compressed 多 layer-type 路由)、HC(Head Compression)混合、Sinkhorn 拆分以及 Hash Routing MoE 等新结构,并补齐对应的 fake-tensor 语义算子与代价模型,让 V4 模型可以直接走通现有 analytic / multistream tracing 流程。 ------ ## 📝 Modification / 修改内容 新增文件 / New files - tensor_cast/transformers/builtin_model/deepseek_v4.py:DeepSeek V4 builtin model profile,包含 DeepseekV4Config / DeepseekV4Model 注册、layer-type 校验({0, 4, 128} 对应 sliding_attention / compressed_sparse_attention / heavily_compressed_attention)、以及与 transformers AutoConfig / AutoModel 的安全注册逻辑。 - tests/test_tensor_cast/test_deepseek_v4.py 与 tests/test_tensor_cast/data/deepseek_v4/*.json:V4 模型对应的测试数据集与用例(含合法/非法/缺失/截短的 ratios 配置)。 注意力 / Attention(tensor_cast/layers/mla.py,tensor_cast/ops/mla.py,tensor_cast/ops/rotary_embedding.py) - 新增 DeepseekV4SparseAttention 与 MultiheadLatentAttentionTensorCast 适配(含 requires_legacy_kv_b_decomposition、KV-cache window 写入路径等)。 - 新增 get_window_topk_idxs / get_compress_topk_idxs 索引生成工具。 - 新增 HC 路径语义算子:hc_pre_inv_rms、hc_pre_sinkhorn,分别对应参考实现中的 inverse-RMS 缩放与 Sinkhorn 加权 reduction。 - 新增 scatter_nd_update_mla 等 KV 写入算子的代价模型,按参考实现仅计 source 行读 + 更新行写,不计 slot_mapping / 整 cache 张量。 MoE / Gate(tensor_cast/layers/moe_layer.py,tensor_cast/ops/fused_moe.py) - MoELayer 增加 V4 统一 gating 路径:识别 gate 上的 is_v4 / hash 标志位,按参考 Gate.forward 顺序发出 matmul + score func + indices + gather/normalize/route_scale 各算子,使每一步按其真实 dtype(gate matmul 走 fp32)单独计费。 - 新增 moe_gating_top_k(带可选 bias 的 V4 非 hash 层)与 moe_gating_top_k_hash(基于 tid2eid 表的 hash 路由层)两个语义算子。 性能模型 / Performance Model(tensor_cast/performance_model/__init__.py) - 引入 _safe_max_int 工具:在 fake / meta / functional tensor 上 tensor.max().item() 不可用时回退为 None,让 caller 走 shape-based 估算。 - 注册 V4 新算子(scatter_nd_update_mla、HC 系列、MoE 新 gating tail 等)的 PerformanceProperties,与参考实现的内存访问语义对齐。 其他 / Misc - tensor_cast/core/config_resolver.py、input_generator.py、model_runner.py、device.py、transformers/transformations.py、 transformers/custom_model_registry.py、layers/utils.py、model_config.py、compilation/passes/multistream_pass.py:补齐 V4 在 config 解析、输入构造、runner 调度、device profile、模型变换与算子注册各环节的接入。 ------ ## 📐 Associated Test Results / 关联测试结果 **Please provide the related test results, such as test reports, etc.** **请提供相关测试结果,例如测试报告等。**   ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] [Linting tools](https://gitcode.com/Ascend/msmodeling/blob/develop/tensor_cast/README.md#coding-style) are used to fix the potential lint issues. / 使用 [lintrunner 工具](https://gitcode.com/Ascend/msmodeling/blob/develop/tensor_cast/README.md#coding-style) 来修复潜在的 lint 问题。 - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!166 | 22 天前 | |
docs: update RFC and design templates Co-authored-by: jhon-117<fangkai15@huawei.com> # message auto-generated for no-merge-commit merge: !203 merge design_template into develop docs: update RFC and design templates Created-by: jhon-117 Commit-by: jhon-117 Merged-by: ascend-robot Description: Title: Update RFC and design document templates Body: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [ ] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [x] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 This PR standardizes the repository's RFC and design document templates with an English-first bilingual format. The updated templates make the required sections clearer while preserving the original RFC semantics and keeping Chinese guidance available for contributors. ------ ## 📝 Modification / 修改内容 - Replace the old RFC template with a new English-first bilingual template under docs/RFC/rfc_template.md. - Remove the previous uppercase RFC template file docs/RFC/RFC_template.md. - Add a design document template under docs/design/docs_template.md. - Align the design document template style with the RFC template while preserving the original design-template content requirements. ------ ## 📐 Associated Test Results / 关联测试结果 No automated tests were run because this PR only updates Markdown documentation templates. Editor diagnostics were checked for the updated Markdown template files before commit, with no actionable issues found. ------ ## 🌟 Use cases (Optional) / 使用案例(可选) Contributors can use these templates when drafting new RFCs or detailed design documents, with English as the primary language and Chinese explanations retained for consistency and accessibility. ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [x] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [x] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!203 | 1 个月前 | |
doc: GLM5 tensor-cast adaptation design Co-authored-by: minghang_c<chiminghang@h-partners.com> # message auto-generated for no-merge-commit merge: !217 merge glm5-design-docs into develop doc: GLM5 tensor-cast adaptation design Created-by: minghang_c Commit-by: minghang_c Merged-by: ascend-robot Description: ## Summary - Add GLM5 tensor-cast adaptation design documentation. - Document GLM5 architecture mapping, model adaptation, conversion flow, validation plan, and known risks. ## Test Plan - [x] Ran /home/minghang/workspace/msmodeling/.venv/bin/python -m pytest (blocked during collection by existing duplicate test_main module mismatch and missing matplotlib dependency). See merge request: Ascend/msmodeling!217 | 1 个月前 | |
support kimi k2.5 - adopt MTP params Co-authored-by: wangshen001<wangshen34@h-partners.com> # message auto-generated for no-merge-commit merge: !259 merge kimi_k2.5_adopt_mtp into develop support kimi k2.5 - adopt MTP params Created-by: wangshen001 Commit-by: wangshen001 Merged-by: ascend-robot Description: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [ ] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [ ] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 1.kimi k2.5目前不支持mtp,仿真命令增加mtp参数后执行会报错,需要单独进行适配 2.执行kimi k2.5仿真命令时,会将 CommGrid 的大 tensor 以日志的形式全部打印出来,对于这些日志的打印可以进行简化,如下图:  ------ ## 📝 Modification / 修改内容 1.在kimi_k25.py中增加额外patch步骤,对mtp参数进行适配 2.把 tensor_cast\performance_model\analytic.py文件里面warning 中的 %s 改为只打印名称  ------ ## 📐 Associated Test Results / 关联测试结果 **Please provide the related test results, such as test reports, etc.** **请提供相关测试结果,例如测试报告等。** ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!259 | 18 天前 | |
【FEAT】MindStudio CLI 统一 stderr Logo Co-authored-by: liujiawang<anonymousdev@163.com> # message auto-generated for no-merge-commit merge: !307 merge feat-logo into develop 【FEAT】MindStudio CLI 统一 stderr Logo Created-by: AvadaKedavrua Commit-by: liujiawang Merged-by: ascend-robot Description: ## 修改原因 MindStudio Modeling 各 Python CLI 启动时缺少统一品牌标识,用户在仿真、吞吐寻优、适配与 profiling 工具间切换时难以从终端首屏确认产品归属。本 PR 在 parse_args 成功后向 stderr 输出固定四行 MindStudio Logo,并支持 TTY/TERM 降级与 Windows colorama 控制台初始化。 --- ## 修改内容 - 新增共享模块 cli/logo.py:render_logo / print_logo,65 列块 + 终端居中 + ANSI/纯文本降级 - 在 11 个 Python 入口(cli/inference/*、serving_cast/main.py、tools/perf_data_collection 驱动脚本)于 parse_args 后调用 print_logo();--help 路径不输出 Logo - 依赖: - **运行时** colorama>=0.4.6 — 写入 [project] dependencies 与 requirements.txt,Windows 上调用 just_fix_windows_console() 启用控制台 VT/ANSI 输出 - **CI 静态检查** types-colorama>=0.4.15 — 写入 [dependency-groups] ci,见下方说明 - 详设文档:docs/design/mindstudio-brand-logo-design.md(本仓仅 Python 范围) - 测试:tests/regression/cli/test_logo.py(14 条模块 UT)+ tests/regression/cli/test_logo_cli_hooks.py(help 抑制与入口 hook 回归,in-process run_module_main) ### 为何需要 types-colorama(CI 组,非运行时) cli/logo.py 在 Windows 路径下会调用 colorama.just_fix_windows_console()。colorama 包本身未提供完整的 inline 类型注解,mypy / 仓库 type_check 在无 stub 时会报 *Cannot find implementation or library stub for module named "colorama"*,或将其视为 untyped 调用。 types-colorama 是社区维护的 **PEP 561 stub 包**(.pyi),仅用于开发态与 CI 的类型检查,**不会**随默认 uv sync 进入用户仿真运行时环境(位于 dependency-groups.ci,与 pytest-cov 等工具同属 CI 组)。 加入该依赖的目的: 1. 让 uv sync --group ci + mypy 能正确解析 colorama API,满足本 PR 静态检查门禁,**无需**在业务代码中使用 # type: ignore 绕过规范 2. 与项目现有做法一致:第三方库缺类型时,在 ci 组补 types-* stub,而非放宽 mypy 配置 若仅安装运行时依赖(uv sync / pip install -r requirements.txt),**不需要**也**不会**安装 types-colorama;Logo 功能仅依赖运行时 colorama。 --- ## 自验证 ### Logo 四行块渲染(纯文本 / 80 列居中) 目的:确认固定四行布局、品牌行与 Slogan 居中、无前置空行。 步骤: 1. 在仓库根目录执行: bash uv run python -c "from cli.logo import render_logo; print(render_logo(color=False, terminal_cols=80))" 结果:  ### Logo 模块 + CLI hook 回归测试 目的:满足 CI Gate 对新增 print_logo 路径的覆盖;确认 --help 不泄漏 Logo,正常 parse_args 后 stderr 含品牌块。 步骤: 1. 在仓库根目录执行: bash uv run pytest tests/regression/cli/test_logo.py tests/regression/cli/test_logo_cli_hooks.py -v --tb=no 结果:  ### --help 不输出 Logo 目的:确认 argparse 在 print_logo 之前退出,help 路径保持干净。 步骤: 1. 执行: bash uv run python -m cli.inference.text_generate --help 2>&1 | head -5 结果:  ### 端到端  See merge request: Ascend/msmodeling!307 | 18 天前 | |
优化 msModeling README 与中英文文档结构 Co-authored-by: eveyin1<qianyin2022@hotmail.com> # message auto-generated for no-merge-commit merge: !329 merge doc_fix1 into develop 优化 msModeling README 与中英文文档结构 Created-by: eveyin1 Commit-by: eveyin1 Merged-by: ascend-robot Description: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [ ] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [x] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 **Please describe the motivation of this PR and the goal you want to achieve through this PR.** **请描述您的拉取请求的动机和您希望通过此拉取请求实现的目标。** ------ ## 📝 Modification / 修改内容 **Please briefly describe what modification is made in this PR.** **请简要描述此拉取请求中进行的修改。** 1 readme参考最新模板 内容整改 2 安装 快速入门内容补齐 3 docs 英文文档整理 4 文档易用性评分 5 通过docs-tool aidd检查 ------ ## 📐 Associated Test Results / 关联测试结果 **Please provide the related test results, such as test reports, etc.** **请提供相关测试结果,例如测试报告等。**  ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!329 | 16 天前 | |
【feature】吞吐寻优支持多硬件对比与终端 ASCII 绘图 Co-authored-by: eveyin1<qianyin2022@hotmail.com> # message auto-generated for no-merge-commit merge: !193 merge display into develop 【feature】吞吐寻优支持多硬件对比与终端 ASCII 绘图 Created-by: eveyin1 Commit-by: eveyin1 Merged-by: ascend-robot Description: Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。1 **PR Type / PR类型** - [x] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [ ] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 **Please describe the motivation of this PR and the goal you want to achieve through this PR.** **请描述您的拉取请求的动机和您希望通过此拉取请求实现的目标。** 1. 支持 --device DEVICE [DEVICE ...] 一次传入多个硬件,在一次命令内完成逐设备寻优与跨硬件汇总,汇总输出硬件最优性能的配置对比。 2. 单设备时自动输出终端 ASCII 曲线,覆盖聚合、拆解和 PD 比例三类模式;每种模式输出 Concurrency & TPS、TPOT & TPS 两张图。 3. 多设备时在跨硬件汇总表之后,额外输出每个 device 最优 case 的 scatter 对比图。 ------ ## 📝 Modification / 修改内容 **Please briefly describe what modification is made in this PR.** **请简要描述此拉取请求中进行的修改。** ### cli/inference/throughput_optimizer.py 吞吐寻优的命令行入口。校验 device、逐 hardware 跑寻优、决定是否绘制终端曲线,最后输出跨 hardware 汇总。 ### cli/utils.py 本次新增 check_device_targets(),统一处理 --device:未传时使用默认 profile、去重并保持顺序。 ### serving_cast/service/optimizer_curve_plots.py 终端 ASCII 绘图与多 device 执行编排的核心模块。单 device 时输出 sweep 折线图(聚合、拆解、PD 比例);多 device 时在汇总表之后输出各 hardware 最优 case 的 scatter 对比图。同时负责逐 device 调用 ParallelRunner、采集跨 hardware 对比行,并调度跨 hardware 汇总输出。 ### serving_cast/service/optimizer_summary.py 寻优结果的过滤、单 device 报表与跨 hardware 表格渲染。包括 Top N 结果表、从每个 device 抽取最优一行用于横向对比,以及聚合 / 拆解 / PD 比例 / 硬件参数 输出。表格侧使用 TTFT/TPOT SLA 过滤,与曲线侧仅过滤 OOM 的策略相互独立。 ------ ## 📐 Associated Test Results / 关联测试结果 **Please provide the related test results, such as test reports, etc.** **请提供相关测试结果,例如测试报告等。** ### 多device  ### 单device  ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] [Linting tools](https://gitcode.com/Ascend/msmodeling/blob/develop/tensor_cast/README.md#coding-style) are used to fix the potential lint issues. / 使用 [lintrunner 工具](https://gitcode.com/Ascend/msmodeling/blob/develop/tensor_cast/README.md#coding-style) 来修复潜在的 lint 问题。 - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!193 | 1 个月前 | |
补充详设 Co-authored-by: tt0cool<xujintao8@h-partners.com> # message auto-generated for no-merge-commit merge: !328 merge develop into develop 补充详设 Created-by: tt0cool Commit-by: tt0cool Merged-by: ascend-robot Description: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [ ] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [x] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 **Please describe the motivation of this PR and the goal you want to achieve through this PR.** **请描述您的拉取请求的动机和您希望通过此拉取请求实现的目标。** ------ ## 📝 Modification / 修改内容 **Please briefly describe what modification is made in this PR.** **请简要描述此拉取请求中进行的修改。** ------ ## 📐 Associated Test Results / 关联测试结果 **Please provide the related test results, such as test reports, etc.** **请提供相关测试结果,例如测试报告等。** ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!328 | 16 天前 | |
[feat]增加算子性能数据采集能力建设设计文档及修改配置项 Co-authored-by: zhenghaojie<zhenghaojie2@huawei.com> # message auto-generated for no-merge-commit merge: !286 merge msmodeling_docs into develop [feat]增加算子性能数据采集能力建设设计文档及修改配置项 Created-by: zhenghaojie Commit-by: zhenghaojie Merged-by: ascend-robot Description: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [ ] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [x] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [x] Other(其他) ## 🔍 Motivation / 变更动机 **Please describe the motivation of this PR and the goal you want to achieve through this PR.** **请描述您的拉取请求的动机和您希望通过此拉取请求实现的目标。** 1、提供算子性能数据采集能力建设设计文档 当前仿真建模主要基于roofline模型进行算子性能估算,虽然能覆盖通用场景下的性能分析需求,但是面对特定硬件平台、复杂算子实现及组合场景时,估算结果与真实执行性能存在一定偏差。以此为出发点,为提升建模精度、增强方案评估可信度,需要建立一套基于实测性能数据的建模能力体系。实测算子性能估算系统依赖 NPU Profiling 数据作为运行时性能模型的查询输入。这些数据需要从 NPU 设备上采集、解析、验证后才能使用,但此前缺乏一套标准化的离线数据生产流程。 2、修改扩样时每个kernel算子表格增加的行数--rows的默认值为1000 ------ ## 📝 Modification / 修改内容 **Please briefly describe what modification is made in this PR.** **请简要描述此拉取请求中进行的修改。** 一、提供本特性的设计文档 本特性提供完整的性能数据采集工具链(位于 tools/perf_data_collection/),其核心价值在于: 1、将解析profiling数据、扩充算子形状、算子回放、数据库回填串成一条可重复执行的数据生产链 2、明确性能数据工具链的结构,将“解析profiling数据、扩充算子形状、算子回放、数据库回填”等流程解耦为职责清晰的子模块 3、提供结构化、可追溯的算子性能数据资产。 二、修改扩样时每个kernel算子表格增加的行数--rows的默认值为1000 ------ ## 📐 Associated Test Results / 关联测试结果 **Please provide the related test results, such as test reports, etc.** **请提供相关测试结果,例如测试报告等。** ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!286 | 23 天前 | |
【REFACTOR】CI Gate 联合执行与 Nightly 流水线加固 Co-authored-by: liujiawang<anonymousdev@163.com> # message auto-generated for no-merge-commit merge: !342 merge fix into develop 【REFACTOR】CI Gate 联合执行与 Nightly 流水线加固 Created-by: AvadaKedavrua Commit-by: liujiawang Merged-by: ascend-robot Description: ## 修改原因 CI Gate / Nightly helpers 在 union 重构后存在多处逻辑与体验问题: - 全豁免测试仍跑 Phase 2(#112) - Nightly audit 使用过期 test_map(#113) - coverage fallback 在 union 重构后断裂(#119) - PRODUCT_SOURCE_PREFIXES 与 gate_policy.yaml 双源 roots - 重复 collect/git diff、日志含内部术语、nightly 缺终端摘要 本 PR 统一执行模型并补齐回归测试与文档。 --- ## 修改内容 **CI Gate (scripts/helpers/ci_gate/)** - 预跑硬阻断仅保留删除测试/删除源码;覆盖率映射改为跑后软策略 - compute_execution_plan 联合去重 pytest waves;产品/测试变更时附加 --cov - fetch_diff() 单次 git diff;load_gate_policy 按 yaml mtime 缓存 - roots 单一来源:tests/.ci/gate_policy.yaml - 用户向英文日志与成功摘要 print **Nightly (scripts/helpers/nightly/)** - 过期/冗余审计使用新鲜 test_map(#113) - 终端摘要、英文 phase 标签;移除 drift TODO - allowed_node_ids 复用、弱覆盖符号检测传 mapping **Common / Policy** - coverage_config.py 懒加载 product_roots() - gate_policy.yaml:tests/helpers/** exclude - 文档同步:tests/README.md、docs/design/ut_refactor.md、tests/SKILL.md 关联 issue:#112 #113 #114 #115 #116 #117 #119 #120 #121 #122 #123 #124 --- ## 优化特性对比(Before / After) ### 特性 1:联合去重 pytest 执行(核心优化) **场景**:同一 PR 同时改了产品代码 tensor_cast/foo.py 和测试文件 tests/regression/cli/test_shared.py,且 test_shared.py::test_x 恰好也是 foo.py 在 test_map 里的回归用例。 #### Before(分阶段、可能重复跑) text # 1) 预跑:新源码无 test_map 映射 → 直接硬阻断,pytest 还没跑 BLOCK: tensor_cast/foo.py has no test_map entry for symbol Foo.bar # 2) Phase 0:单独跑一轮带 --cov 的 pytest 做 coverage fallback pytest tests/regression/cli/test_shared.py -m not npu --cov ... # 3) Phase 1:跑变更测试 pytest tests/regression/cli/test_shared.py::test_x -m not npu # 4) Phase 2:再跑映射回归(与 Phase 1 重叠) pytest tests/regression/cli/test_shared.py::test_x \ tests/regression/cli/test_other.py::test_y \ -m "not npu and not nightly and not network" → test_shared.py::test_x 被执行 2~3 次;git diff 可能重复 fetch;coverage 与映射校验割裂 #### After(先跑、后验、去重) text # 1) 预跑硬阻断:仅删测试 / 删源码 Validating hard-blocking policy ... (no block — foo.py 是修改不是删除) # 2) 计划:联合调度,node id 去重(changed-test 优先) Scheduling 2 test node(s): new or changed test file Scheduling 1 test node(s): changed product file mapped regression Sample node(s): tests/regression/cli/test_shared.py::test_x, ... Execution uses 2 pytest wave(s) after deduplication # 3) 单次 union pytest(附带 --cov --cov-context=test) Wave 1 (-m not npu): tests/regression/cli/test_shared.py::test_x # changed-test,只跑一次 Wave 2 (-m not npu and not nightly and not network): tests/regression/cli/test_other.py::test_y # 纯回归,test_x 不再重复 # 4) 跑后软策略:用同一次 .coverage 做 mapping fallback Checking new/modified source coverage mapping against collected data ... CI gate passed: 2 test node(s) (new or changed test file; changed product file mapped regression) → 同一 node 只执行 1 次;coverage fallback 与 pytest 同轮完成;CI 耗时更短、行为更可预期 ### 特性 2:配置变更触发全量,跳过无效 collect **场景**:PR 只改了 pyproject.toml(依赖/测试配置变更)。 #### Before text config change detected → full suite still collect changed test files in Phase 1 ... # 多余 collect pytest tests/ -m not npu #### After text Config path(s): pyproject.toml Selected full test suite: dependency or test configuration changed pytest tests/ -m not npu # changed_test_nodes 为空,跳过 gate_new_tests collect CI gate passed: full test suite (pyproject.toml) ### 特性 3:Nightly 终端摘要(用户可读) #### Before text Phase 2a done. elapsed=1832.4s # 流水线结束,无一行总览;内部 phase 编号 #### After text Nightly pipeline finished. test_map: written (line 74.1%, branch 61.8%) nightly-marked: 142 passed in 1832s benchmark: 38 passed in 412s network: 12 passed in 89s weak coverage symbols: 3 report: /path/to/nightly_report.json --- ## 自验证 ### Helpers 回归测试 目的:确认 CI Gate / Nightly / Common helpers 全量回归通过 步骤: 1. 进入仓库根目录 2. 执行: bash uv run python -m pytest tests/regression/scripts/helpers/ -q 结果: 341 passed, 5 warnings in 1.02s ### gate_policy 缓存失效 目的:确认 yaml mtime 变化后 load_gate_policy 缓存失效 步骤: 1. 运行单测: bash uv run python -m pytest tests/regression/scripts/helpers/ci_gate/test_gate_policy.py::test_load_gate_policy_cached_until_yaml_mtime_changes -v 结果:PASSED(修复同秒写入 mtime 未变导致的 flaky) ### pre-commit 目的:提交前 hook 全绿 步骤: 1. git commit 触发 pre-commit(ruff、pylint、bandit、typos 等) 结果:全部 Passed See merge request: Ascend/msmodeling!342 | 15 天前 | |
前端设计文档 Co-authored-by: pengzhipin<pengzhipin1@h-partners.com> # message auto-generated for no-merge-commit merge: !280 merge msmodeling_0601 into develop 前端设计文档 Created-by: weixin_43113933 Commit-by: pengzhipin Merged-by: ascend-robot Description: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. 感谢您的贡献,我们非常重视。以下说明将使您的拉取请求更健康,更易于获得反馈。如果您不理解某些项目,请不要担心,只需提交拉取请求并从维护人员那里寻求帮助即可。 **PR Type / PR类型** - [ ] Feature(功能新增) - [ ] Bugfix(Bug 修复) - [x] Docs(文档更新) - [ ] CI/CD(持续集成/持续部署) - [ ] Refactor(代码重构) - [ ] Perf(性能优化) - [ ] Test-Cases(测试用例更新) - [ ] Other(其他) ## 🔍 Motivation / 变更动机 **Please describe the motivation of this PR and the goal you want to achieve through this PR.** **请描述您的拉取请求的动机和您希望通过此拉取请求实现的目标。** 新增 Web UI 前端架构演进与体验优化设计,用于梳理当前 web_ui/ 在功能持续扩展后的结构与体验问题,并给出后续可分阶段落地的优化方向。 当前 Web UI 已支持 Simulator、Optimizer、LLM/VL/Video 多类工作流,以及参数配置、命令预览、批量执行、结果解析、图表分析、历史缓存与导出等能力。随着参数规模和分析能力继续增长,app.py、callbacks.py 等文件逐渐承担过多职责,页面参数密度、回调维护成本、状态组织和测试覆盖也开始承压。 本设计文档 的目标是明确后续 Web UI 的演进原则:在保持 CLI、执行器和解析器主链路稳定的前提下,逐步优化参数分层、场景化配置、结果工作台、任务治理和代码结构,降低后续功能扩展与维护成本。 ------ ## 📝 Modification / 修改内容 **Please briefly describe what modification is made in this PR.** **请简要描述此拉取请求中进行的修改。** - 梳理当前 Web UI 的架构现状、核心数据流、数据契约与缓存机制。 - 分析 Text Generate、MoE、高级并行覆盖、Profiling、Video DiT Cache、Optimizer 等场景下的参数和功能复杂度。 - 总结当前页面体验、代码结构、状态管理、测试与 DFX 方面的主要问题。 - 提出 Web UI 后续演进方向,包括产品结构分层、代码结构拆分、表单对象化、结果工作台与任务中心建设。 - 给出分阶段实施建议:低风险体验优化、结构治理、分析与协作能力增强。 - 补充测试设计、风险分析与应对策略,为后续 Web UI 改造提供依据。 ------ ## 📐 Associated Test Results / 关联测试结果 **Please provide the related test results, such as test reports, etc.** **请提供相关测试结果,例如测试报告等。** ------ ## 🌟 Use cases (Optional) / 使用案例(可选) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** **如果此拉取请求引入了新功能,最好在此处列出一些用例并更新文档。** ------ ## ✅ Checklist / 检查列表 **Before PR**: - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. / 修复的 Bug 已完全由单元测试覆盖,导致 Bug 的情况应在单元测试中添加。 - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. / 此拉取请求中的修改已完全由单元测试覆盖。如果不是,请添加更多单元测试以确保正确性。 - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. / 所有相关文档(API 文档、文档字符串、示例教程)已更新以反映这些更改。 - [ ] Please ensure code files contain no Chinese comments. / 请保证代码文件中不含中文注释。 ------ See merge request: Ascend/msmodeling!280 | 15 天前 |
| 文件 | 最后提交记录 | 最后更新时间 |
|---|---|---|
| 23 天前 | ||
| 16 天前 | ||
| 22 天前 | ||
| 1 个月前 | ||
| 1 个月前 | ||
| 18 天前 | ||
| 18 天前 | ||
| 16 天前 | ||
| 1 个月前 | ||
| 16 天前 | ||
| 23 天前 | ||
| 15 天前 | ||
| 15 天前 |