ascend-robot refactor(tensor_cast): unify word embedding tp config

File	Last commit	Last updated
.ci	【REFACTOR】 CI Gate union execution and Nightly pipeline hardening

Co-authored-by: liujiawang<anonymousdev@163.com>
# message auto-generated for no-merge-commit merge: !342 merge fix into develop
Created-by: AvadaKedavrua
Commit-by: liujiawang
Merged-by: ascend-robot

Description:

## Reason for change

After the union refactor, the CI Gate / Nightly helpers had several logic and usability problems:

- Fully exempted tests still ran Phase 2 (#112)
- Nightly audit used a stale test_map (#113)
- coverage fallback broke after the union refactor (#119)
- `PRODUCT_SOURCE_PREFIXES` and `gate_policy.yaml` were two separate sources for roots
- Repeated collect / git diff, logs containing internal jargon, and no terminal summary for nightly

This PR unifies the execution model and fills in the missing regression tests and documentation.

---

## Changes

CI Gate (`scripts/helpers/ci_gate/`)

- Pre-run hard blocking now covers only deleted tests / deleted sources; coverage mapping becomes a post-run soft policy
- `compute_execution_plan` unions and deduplicates pytest waves; `--cov` is appended when product or test files change
- `fetch_diff()` runs a single git diff; `load_gate_policy` is cached by yaml mtime
- Single source for `roots`: `tests/.ci/gate_policy.yaml`
- User-facing English logs and a success summary via `print`

Nightly (`scripts/helpers/nightly/`)

- Stale/redundant audits use a fresh test_map (#113)
- Terminal summary and English phase labels; the drift TODO is removed
- `allowed_node_ids` is reused; weak-coverage symbol detection receives the mapping

Common / Policy

- `coverage_config.py` lazily loads `product_roots()`
- `gate_policy.yaml`: exclude `tests/helpers/`
- Docs synced: `tests/README.md`, `docs/design/ut_refactor.md`, `tests/SKILL.md`

Related issues: #112 #113 #114 #115 #116 #117 #119 #120 #121 #122 #123 #124

---

## Feature comparison (Before / After)

### Feature 1: union-deduplicated pytest execution (core optimization)

Scenario: one PR changes both the product file `tensor_cast/foo.py` and the test file `tests/regression/cli/test_shared.py`, and `test_shared.py::test_x` is also a regression case mapped to `foo.py` in `test_map`.

#### Before (phased, may run the same test repeatedly)

    # 1) Pre-run: new source has no test_map mapping → hard block immediately, pytest has not run yet
    BLOCK: tensor_cast/foo.py has no test_map entry for symbol Foo.bar
    # 2) Phase 0: a separate pytest run with --cov for coverage fallback
    pytest tests/regression/cli/test_shared.py -m not npu --cov ...
    # 3) Phase 1: run the changed tests
    pytest tests/regression/cli/test_shared.py::test_x -m not npu
    # 4) Phase 2: run the mapped regression again (overlaps Phase 1)
    pytest tests/regression/cli/test_shared.py::test_x \
      tests/regression/cli/test_other.py::test_y \
      -m "not npu and not nightly and not network"

→ `test_shared.py::test_x` runs 2 to 3 times; git diff may be fetched repeatedly; coverage and mapping validation are disconnected.

#### After (run first, validate afterwards, deduplicate)

    # 1) Pre-run hard block: only deleted tests / deleted sources
    Validating hard-blocking policy ...
    (no block: foo.py is modified, not deleted)
    # 2) Plan: union scheduling, node ids deduplicated (changed-test first)
    Scheduling 2 test node(s): new or changed test file
    Scheduling 1 test node(s): changed product file mapped regression
    Sample node(s): tests/regression/cli/test_shared.py::test_x, ...
    Execution uses 2 pytest wave(s) after deduplication
    # 3) Single union pytest (with --cov --cov-context=test)
    Wave 1 (-m not npu): tests/regression/cli/test_shared.py::test_x   # changed-test, runs only once
    Wave 2 (-m not npu and not nightly and not network): tests/regression/cli/test_other.py::test_y   # regression only, test_x is not repeated
    # 4) Post-run soft policy: mapping fallback using the same .coverage
    Checking new/modified source coverage mapping against collected data ...
    CI gate passed: 2 test node(s) (new or changed test file; changed product file mapped regression)

→ Each node runs exactly once; coverage fallback completes in the same pytest round; CI takes less time and behaves more predictably.

### Feature 2: config changes trigger the full suite and skip the useless collect

Scenario: the PR changes only `pyproject.toml` (dependency / test configuration change).

#### Before

    config change detected → full suite
    still collect changed test files in Phase 1 ...   # redundant collect
    pytest tests/ -m not npu

#### After

    Config path(s): pyproject.toml
    Selected full test suite: dependency or test configuration changed
    pytest tests/ -m not npu
    # changed_test_nodes is empty, gate_new_tests collect is skipped
    CI gate passed: full test suite (pyproject.toml)

### Feature 3: Nightly terminal summary (user-readable)

#### Before

    Phase 2a done. elapsed=1832.4s
    # the pipeline ends without a single overview line; internal phase numbering

#### After

    Nightly pipeline finished.
      test_map: written (line 74.1%, branch 61.8%)
      nightly-marked: 142 passed in 1832s
      benchmark: 38 passed in 412s
      network: 12 passed in 89s
      weak coverage symbols: 3
      report: /path/to/nightly_report.json

---

## Self-verification

### Helpers regression tests

Purpose: confirm that the full CI Gate / Nightly / Common helpers regression passes.

Steps:

1. Go to the repository root
2. Run: `uv run python -m pytest tests/regression/scripts/helpers/ -q`

Result: `341 passed, 5 warnings in 1.02s`

### gate_policy cache invalidation

Purpose: confirm that the `load_gate_policy` cache is invalidated after the yaml mtime changes.

Steps:

1. Run the unit test: `uv run python -m pytest tests/regression/scripts/helpers/ci_gate/test_gate_policy.py::test_load_gate_policy_cached_until_yaml_mtime_changes -v`

Result: PASSED (fixes the flakiness caused by the mtime not changing when the file is written within the same second)

### pre-commit

Purpose: all pre-commit hooks are green.

Steps:

1. `git commit` triggers pre-commit (ruff, pylint, bandit, typos, etc.)

Result: all Passed

See merge request: Ascend/msmodeling!342	15 days ago
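The union-and-deduplicate scheduling described in Feature 1 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual `compute_execution_plan`: the function name `plan_waves`, its parameters, and the `(marker, node_ids)` return shape are hypothetical; only the two marker expressions and the "changed-test first" priority come from the description above.

```python
from typing import Iterable, List, Tuple

# Marker expressions taken from the log sample in the description.
CHANGED_MARKER = "not npu"
REGRESSION_MARKER = "not npu and not nightly and not network"


def plan_waves(
    changed_test_nodes: Iterable[str],
    mapped_regression_nodes: Iterable[str],
) -> List[Tuple[str, List[str]]]:
    """Hypothetical helper: schedule every node id at most once.

    Changed-test nodes are placed first, so a node that is both a changed
    test and a mapped regression case only appears in the first wave.
    """
    seen = set()
    waves: List[Tuple[str, List[str]]] = []
    for marker, nodes in (
        (CHANGED_MARKER, changed_test_nodes),
        (REGRESSION_MARKER, mapped_regression_nodes),
    ):
        unique = []
        for node in nodes:
            if node not in seen:  # drop nodes already scheduled in an earlier wave
                seen.add(node)
                unique.append(node)
        if unique:  # empty waves are not emitted
            waves.append((marker, unique))
    return waves


if __name__ == "__main__":
    for marker, nodes in plan_waves(
        ["tests/regression/cli/test_shared.py::test_x"],
        [
            "tests/regression/cli/test_shared.py::test_x",
            "tests/regression/cli/test_other.py::test_y",
        ],
    ):
        print(f"-m {marker!r}: {nodes}")
```

Keeping the waves separate, rather than merging everything into one invocation, preserves the different `-m` marker expressions per wave while still guaranteeing a single execution per node.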
assets	Add model adapter onboarding automation

Co-authored-by: jhon-117<fangkai15@huawei.com>
# message auto-generated for no-merge-commit merge: !282 merge codex/model-adaptation-efficiency-v2 into develop
Created-by: jhon-117
Commit-by: jhon-117
Merged-by: ascend-robot

Description:

# PR Template

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

PR Type

- [x] Feature
- [ ] Bugfix
- [ ] Docs
- [ ] CI/CD
- [ ] Refactor
- [ ] Perf
- [ ] Test-Cases
- [ ] Other

## 🔍 Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

------

## 📝 Modification

This PR implements the workflow for improving TensorCast new-model onboarding efficiency. It is built around an adaptation mode in which the user is only required to provide the raw Insight profiling export file plus the corresponding simulation command, and it fills in the doctor, evidence, patch discovery, profile draft, ST case generation, and qwen3-vl replay verification capabilities.

Main changes:

- New `tensor_cast.adapter` automation module:
  - Simulation command parsing and AdaptationContext
  - Raw MindStudio Insight profiling parsing
  - User hints loading, conflict detection, and provenance
  - Profile candidate generation with review/validation
  - Evidence draft generation and verifier mismatch classification
  - PatchReport, patch discovery, and profile draft rendering
  - ST guardrail case generation
- New CLIs:
  - `python -m cli.inference.model_doctor`
  - `python -m cli.inference.verify_model_profile`
- `model_doctor` supports:
  - `--from-command-file`
  - `--raw-insight-file`
  - `--hints-file`
  - `--patch-failure-file`
  - `--ignore-existing-profile`
  - `--profile-draft-output`
- Enhanced qwen3-vl replay:
  - New tiny config-only fixture: `tests/assets/model_config/qwen3_vl_tiny/config.json`
  - With `--ignore-existing-profile qwen3_vl`, VL profile fields can be discovered from the installed transformers source
  - Patch discovery can generate a `patch_method_for_qwen3_vl` draft from a qwen3-vl placeholder/mask meta failure
- New documentation:
  - `docs/design/model_adaptation_efficiency_design.md`
  - `docs/en/tensor_cast_new_model_adaptation.md`
- Enhanced runtime/transformations:
  - Expose the information needed by the runtime summary
  - Record patch reports
  - Support profile registry replay/audit ignore

------

## 📐 Associated Test Results

    pytest tests/test_tensor_cast/test_adapter_automation.py -q
    # 29 passed
    pytest tests/test_tensor_cast -k "adapter or doctor or evidence" -q
    # 29 passed
    python -m compileall -q tensor_cast/adapter cli/inference/model_doctor.py cli/inference/verify_model_profile.py cli/inference/adapter_cli.py tests/test_tensor_cast/test_adapter_automation.py
    # passed
    python -m cli.inference.model_doctor --help
    # passed
    python -m cli.inference.verify_model_profile --help
    # passed

Additional smoke checks:

- qwen3-vl tiny CLI replay smoke: passed
- qwen3-vl patch code draft CLI smoke: passed
- deepseek fixture doctor/replay smoke: passed; only the rope parameter warning that ships with the fixture appeared, which does not affect the result.

------

## 🌟 Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here and update the documentation.

------

## ✅ Checklist

Before PR:

- [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests.
- [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
- [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes.
- [ ] Please ensure code files contain no Chinese comments.

------

See merge request: Ascend/msmodeling!282	23 days ago
benchmark	refactor(tensor_cast): unify word embedding tp config

Co-authored-by: Kudo__shinichi<liuning119@huawei.com>
# message auto-generated for no-merge-commit merge: !344 merge codex/word-embedding-tp-normalize into develop
Created-by: Kudo__shinichi
Commit-by: Kudo__shinichi
Merged-by: ascend-robot

Description:

# PR Template

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

PR Type

- [ ] Feature
- [ ] Bugfix
- [x] Docs
- [ ] CI/CD
- [x] Refactor
- [ ] Perf
- [x] Test-Cases
- [ ] Other

## 🔍 Motivation

`word_embedding_tp` and `word_embedding_tp_mode` represented the same configuration concept in two fields: one field toggled word embedding TP, and the other selected the TP mode. This PR reduces the public and internal configuration shape to a single parameter so users only need to configure `word_embedding_tp` as disabled, `col`, or `row`.

------

## 📝 Modification

- Make `UserInputConfig.word_embedding_tp` the single nullable word embedding TP mode field.
- Remove `word_embedding_tp_mode` and `embedding_parallel_mode` from the config model.
- Pass the normalized `word_embedding_tp` mode directly into `ParallelConfig.embedding_parallel` and the embedding transformation.
- Keep legacy bool input normalization for compatibility: `True -> col`, `False/None -> disabled`.
- Remove redundant CLI-side bool/mode conversion and update related benchmark cases and user guide docs.
- Add regression coverage for single-field config, legacy bool normalization, and invalid `word_embedding_tp` values.

------

## 📐 Associated Test Results

- `python -m pytest tests/regression/tensor_cast/test_user_config.py -q`: 6 passed
- `python -m pytest tests/regression/tensor_cast/test_user_config.py tests/regression/web_ui/test_command_builder.py tests/regression/tensor_cast/test_adapter_automation.py -q`: 98 passed
- `python -m pytest tests/regression/tensor_cast/test_text_generate.py -k word_embedding_parallel -q`: 2 passed, 113 deselected
- `python -m pytest tests/regression/tensor_cast/test_sequence_parallel_pass.py -o addopts= -m "nightly and not npu and not network" -q`: 2 passed
- `python -m pytest tests/benchmark/models/test_model_regression.py --collect-only -q`: 15 tests collected
- `python -m ruff check <changed python files>`: All checks passed
- `python -m pre_commit run --from-ref origin/develop --to-ref HEAD`: passed
- `git diff --check HEAD~1 HEAD`: passed

------

## 🌟 Use cases (Optional)

- Disable word embedding TP: `word_embedding_tp=None`
- Enable column mode: `word_embedding_tp="col"`
- Enable row mode: `word_embedding_tp="row"`
- CLI usage: `--word-embedding-tp col` or `--word-embedding-tp row`

------

## ✅ Checklist

Before PR:

- [x] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests.
- [x] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
- [x] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes.
- [x] Please ensure code files contain no Chinese comments.

------

See merge request: Ascend/msmodeling!344	15 days ago
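The normalization rules listed in !344 (`True -> col`, `False/None -> disabled`, `col`/`row` passed through, anything else rejected) could look roughly like the sketch below. The function name `normalize_word_embedding_tp` and the choice of `ValueError` are assumptions made for illustration; the real validation inside `UserInputConfig` may be structured differently.

```python
from typing import Optional, Union

_VALID_MODES = ("col", "row")


def normalize_word_embedding_tp(value: Union[bool, str, None]) -> Optional[str]:
    """Hypothetical normalizer: return None (disabled), "col", or "row"."""
    # Legacy bool input: False and None both mean "disabled".
    if value is None or value is False:
        return None
    # Legacy bool input: True used to mean "enabled", which maps to column mode.
    if value is True:
        return "col"
    if isinstance(value, str) and value in _VALID_MODES:
        return value
    # The exception type is an assumption; the PR only says invalid values are covered by tests.
    raise ValueError(
        f"word_embedding_tp must be None, 'col', or 'row' (legacy bool allowed), got {value!r}"
    )


if __name__ == "__main__":
    for raw in (None, False, True, "col", "row"):
        print(repr(raw), "->", repr(normalize_word_embedding_tp(raw)))
```

Checking `value is True` by identity rather than by truthiness keeps integers such as `1` from being silently accepted as a legacy boolean.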
helpers	【FIX】 Fix nightly Feishu false reports and web_ui conftest global pollution

Co-authored-by: liujiawang<anonymousdev@163.com>
# message auto-generated for no-merge-commit merge: !306 merge issue-fix into develop
Created-by: AvadaKedavrua
Commit-by: AvadaKedavrua;liujiawang
Merged-by: ascend-robot

Description:

## Reason for change

1. #97 nightly Feishu false report: when the pytest phases end with `exit=1` (collection/import failure) but the JUnit output is empty, Feishu still showed `All passed` / `Passed: 0 | Failed: 0`.
2. #96 web_ui conftest pollution: `tests/regression/web_ui/conftest.py` set `sys.modules["tensor_cast"] = MagicMock()` at import time. During full collection under xdist this polluted the global module table and triggered `tensor_cast.__spec__ is not set`.
3. transformers 5.8.0: duplicate registration of `deepseek_v4` caused collection to fail.

---

## Changes

### Nightly report (#97)

- `FeishuReportInput` / `PhaseBreakdownEntry` as structured report input
- `build_phase_breakdown()`, `resolve_first_error()`, and `extract_pytest_log_snippet()` parse the phase logs
- The Feishu payload derives its status from `overall_exit` plus errors, and shows the per-phase breakdown and infra failures
- Added/updated regression cases under `tests/regression/scripts/helpers/nightly/`

### Conftest and CI Gate (#96)

- Remove the module-level `sys.modules` mock from `web_ui/conftest.py`
- `is_config_path()` matches `tests/*/conftest.py`, so conftest changes trigger the full regression
- Add the `tests/smoke/test_conftest_hygiene.py` guard

### Test convention hardening

- Add `tests/helpers/junit_xml.py` for shared JUnit XML construction
- Add the integration test `test_emit_report_four_phase_all_infra_failure`, in which all four phases fail with infra failures

### Other

- Cap `transformers` at `<5.8.0`
- pytest xdist defaults to `--dist=worksteal` (`scripts/lib/common.sh` and each `run_*.sh`)
- Update `tests/README.md` / `tests/SKILL.md` / `docs/design/ut_refactor.md`

---

## Self-verification

### Nightly toolchain UT collect-only

Purpose: confirm that the new/refactored cases can be collected by pytest without import errors.

Steps:

1. Sync the CI dependencies: `uv sync --frozen --group ci`
2. Collect only the nightly regression cases: `uv run pytest tests/regression/scripts/helpers/nightly/ --collect-only -q`

Result: `85 tests collected in 0.04s`

### Lint / type check

Purpose: confirm that the changed files meet the project's ruff / mypy requirements.

Steps: run MCP `lint` and `type_check` (`explicit_package_bases=true`) on the following paths:

- `scripts/helpers/nightly/*.py`
- `tests/helpers/junit_xml.py`
- `tests/regression/scripts/helpers/nightly/test_*.py`

Result:

    ruff: clean
    mypy: clean (follow-imports=skip, explicit-package-bases)

### Conftest hygiene (#96 regression scenario)

Purpose: verify that the web_ui conftest no longer pollutes `tensor_cast.__spec__`.

Steps: `uv run pytest tests/smoke/test_conftest_hygiene.py -q`

Result: this case was added in commit `dfb66f0`; the full regression was verified by the user in a remote environment (see the session log).

See merge request: Ascend/msmodeling!306	20 days ago
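The core of the #97 fix, deriving the reported status from the exit codes and errors instead of from JUnit counts alone, can be illustrated with the following sketch. `PhaseResult` and `resolve_status` are hypothetical names and the status strings are invented for the example; they are not the real `FeishuReportInput` / `PhaseBreakdownEntry` API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PhaseResult:
    """Hypothetical per-phase record: pytest exit code plus JUnit counts."""
    name: str
    exit_code: int
    passed: int = 0
    failed: int = 0


def resolve_status(phases: Sequence[PhaseResult], errors: Sequence[str] = ()) -> str:
    """Return a status string that never reports success for a non-zero exit."""
    overall_exit = any(p.exit_code != 0 for p in phases)
    # A phase that exited non-zero without recording any test is treated as an
    # infrastructure failure (for example a collection or import error).
    infra = [p.name for p in phases if p.exit_code != 0 and p.passed + p.failed == 0]
    if infra:
        return "infra failure: " + ", ".join(infra)
    if overall_exit or errors or any(p.failed for p in phases):
        return "failed"
    return "all passed"


if __name__ == "__main__":
    # Empty JUnit with exit=1 previously rendered as "All passed".
    print(resolve_status([PhaseResult("nightly", exit_code=1)]))
```

The important property is that an empty JUnit report combined with `exit=1` no longer falls through to the success branch.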
regression	refactor(tensor_cast): unify word embedding tp config Co-authored-by: Kudo__shinichi<liuning119@huawei.com> # message auto-generated for no-merge-commit merge: !344 merge codex/word-embedding-tp-normalize into develop Created-by: Kudo__shinichi Commit-by: Kudo__shinichi Merged-by: ascend-robot See merge request: Ascend/msmodeling!344	15 days ago
smoke	【FIX】【TEST】 Fix broken README/documentation links and run the full benchmark suite by default

Co-authored-by: liujiawang<anonymousdev@163.com>
# message auto-generated for no-merge-commit merge: !331 merge fix into develop
Created-by: AvadaKedavrua
Commit-by: liujiawang
Merged-by: ascend-robot

Description:

## Reason for change

1. The official-account QR code in the community section of `README.md` pointed to an old path in the `msinsight` repository. The resource returned 404, so scanning or previewing it failed.
2. The relative path to the Op Mapping skill in `OP_PLUGIN_MAPPING_TUTORIAL.md` was wrong, so the in-document link did not resolve.
3. The benchmark entry point ran only `tests/benchmark/ops/` by default; the model regression in `tests/benchmark/models/` was silently skipped, leaving CI/nightly coverage insufficient.
4. With the full benchmark enabled, the `qwen3-30b-a3b` decode/prefill baselines no longer matched the current compile output and had to be refreshed.

---

## Changes

| Category | File | Change |
|------|------|------|
| Documentation links | `README.md` | Replace the official-account image URL with a working `user-images` resource; complete the TOC anchors for sections such as Contributions / Community |
| Documentation links | `docs/perf_database/tutorial/OP_PLUGIN_MAPPING_TUTORIAL.md` | skill path `../skills/...` → `../../../.agents/skills/op-mapping/SKILL.md` |
| Benchmark default behavior | `scripts/run_benchmark.sh`, `scripts/helpers/nightly/main.py` | Remove the `MSMODELING_BENCHMARK_MODELS` switch and always run the whole `tests/benchmark/` directory |
| Design document | `docs/design/ut_refactor.md` | Sync the benchmark phase description |
| baseline | `tests/benchmark/models/cases/qwen3-30b-a3b-{decode,prefill}.json` | Refresh `baseline_time_s` and the operator top-N |
| lint | `experimental/optix/`, `scripts/`, `tensor_cast/`, `tests/`, etc. | Add `pylint: disable` comments for `inspect.*` false positives |

---

## Self-verification

### README official-account image link

Purpose: confirm that the old link returns 404 and the new link is reachable.

Steps:

1. Check the HTTP status of the old URL
2. Check the HTTP status of the new URL

    curl -sI "https://raw.gitcode.com/Ascend/msinsight/raw/master/docs/zh/user_guide/figures/readme/officialAccount.jpg" | head -1
    curl -sI "https://raw.gitcode.com/user-images/assets/8428112/2a22a707-de26-4bb3-b312-4952035e021b/30be980e7fd65b2486d251b48a7999f3.jpg" | head -1

Result:

    HTTP/1.1 404 Not Found
    HTTP/1.1 200 OK

### Op Mapping skill documentation path

Purpose: confirm that the link in the tutorial points to a real file.

Steps:

1. Check from the repository root that the skill file exists: `test -f .agents/skills/op-mapping/SKILL.md && echo OK`

Result: `OK`

### Benchmark entry point runs the full suite by default

Purpose: confirm that `run_benchmark.sh` no longer depends on `MSMODELING_BENCHMARK_MODELS` and covers the models subdirectory by default.

Steps:

1. Inspect the benchmark target configuration in the script: `grep -n "TESTS_BENCHMARK" scripts/run_benchmark.sh`

Result:

    run_pytest "${TESTS_BENCHMARK}/" \

### CI pipeline

Purpose: confirm that the change does not break the existing CI / docs CI.

Steps:

1. Check the CI label status of PR #331

Result: the PR is labeled `ci-pipeline-passed` and `docs-ci-pipeline-success`.

See merge request: Ascend/msmodeling!331	16 days ago
README.md	【REFACTOR】 CI Gate union execution and Nightly pipeline hardening Co-authored-by: liujiawang<anonymousdev@163.com> # message auto-generated for no-merge-commit merge: !342 merge fix into develop Created-by: AvadaKedavrua Commit-by: liujiawang Merged-by: ascend-robot See merge request: Ascend/msmodeling!342	15 days ago
SKILL.md	【REFACTOR】 CI Gate union execution and Nightly pipeline hardening Co-authored-by: liujiawang<anonymousdev@163.com> # message auto-generated for no-merge-commit merge: !342 merge fix into develop Created-by: AvadaKedavrua Commit-by: liujiawang Merged-by: ascend-robot See merge request: Ascend/msmodeling!342	15 days ago
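!342 mentions that `load_gate_policy` is cached by the yaml mtime, and that a same-second write with an unchanged mtime had made the cache test flaky. One way such a cache could be keyed is sketched below. Everything here is an assumption for illustration: `load_cached`, the `parse` callback, and the `(st_mtime_ns, st_size)` stamp are not taken from the repository, and the actual fix may work differently.

```python
import os
from pathlib import Path
from typing import Any, Callable, Dict, Tuple

# path -> ((mtime in ns, size in bytes), parsed value)
_CACHE: Dict[str, Tuple[Tuple[int, int], Any]] = {}


def load_cached(path: str, parse: Callable[[str], Any]) -> Any:
    """Hypothetical loader: re-parse only when the file stamp changes."""
    st = os.stat(path)
    # Nanosecond mtime plus size narrows the window in which a rewrite within
    # the same second goes unnoticed; it does not remove it entirely.
    stamp = (st.st_mtime_ns, st.st_size)
    key = str(Path(path).resolve())
    hit = _CACHE.get(key)
    if hit is not None and hit[0] == stamp:
        return hit[1]
    value = parse(Path(path).read_text(encoding="utf-8"))
    _CACHE[key] = (stamp, value)
    return value


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        policy = os.path.join(tmp, "gate_policy.yaml")
        Path(policy).write_text("roots: [tensor_cast]\n", encoding="utf-8")
        print(load_cached(policy, str.strip))  # parsed on first use
        print(load_cached(policy, str.strip))  # served from the cache
```

A rewrite that keeps both the size and the nanosecond mtime identical would still be missed; a content hash would close that gap at the cost of reading the file on every call.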
conftest.py	Refactor: testing system overhaul — network marker, nightly 4-phase pipeline, config drift check, and in-process CLI coverage Co-authored-by: liujiawang<anonymousdev@163.com> # message auto-generated for no-merge-commit merge: !281 merge fix-tests into develop Refactor: testing system overhaul — network marker, nightly 4-phase pipeline, config drift check, and in-process CLI coverage Created-by: AvadaKedavrua Commit-by: liujiawang;AvadaKedavrua Merged-by: ascend-robot Description: ## 🔍 Motivation / 变更动机 The current `msmodeling` testing system has several structural weaknesses: UT/ST responsibilities are blurred, slow test cases are concentrated, Hub access is uncontrolled, and CLI tests running via subprocess are invisible to coverage collection. This PR completes a testing system overhaul by introducing a `network` marker to isolate live-Hub tests, splitting nightly into a four-phase pipeline with a non-blocking config drift check, centralizing model/config caching, and replacing subprocess CLI invocations with in-process runners so coverage and `test_map` can observe the real execution paths. 
------ ## 📝 Modification / 修改内容 ### Design Document (`docs/design/ut_refactor.md`) - Introduce third cross-cutting marker `network` alongside `nightly` and `npu` - Change CI incremental marker from `not npu and not nightly` to `not npu and not nightly and not network` - Nightly pipeline expands from three phases to four: Phase 1 (test_map UT), Phase 2a (nightly), Phase 2b (benchmark), Phase 2c (network with live Hub) - Add non-blocking config drift check after Phase 2c comparing vendored remote configs against live Hub - CI gate no longer enforces coverage thresholds (moved to nightly report-only); remove per-symbol coverage advisory from CI path - Remove `MIN_LINES_PER_SYMBOL` threshold; `collect_from_coverage` now registers any hit - Coverage gate thresholds changed from 70/50 (blocking) to 60/40 (nightly report-only) - Session model cache centralized to `tests/helpers/model_cache.py` - Shell scripts use `-q --tb=short` instead of `-vv` - Feishu payload now includes per-phase breakdown, slowest tests, and config drift warnings - Gate policy format changed from `gate_policy.json` to `gate_policy.yaml` ### pyproject.toml - Register `network` marker; global `addopts` now excludes `npu`, `nightly`, and `network` ### CI Gate (`scripts/helpers/ci_gate/`) - `diff.py`: classify modified test files (`modified_test` field) - `models.py`: add `modified_test` to `ChangeSet` - `rules.py`: `gate_new_tests` now includes `modified_test` paths - `main.py`: remove coverage gate enforcement and per-symbol advisory; modified tests are re-run and re-mapped via Phase 0; pytest runs with `-vv` instead of `-q --no-header` ### Nightly (`scripts/helpers/nightly/`) - `main.py`: four-phase pipeline with Phase 2c (network), config drift check, Feishu-aware console suppression - `feishu_notifier.py`: payload extended with per-phase breakdown, slowest tests, and drift warnings - `pytest_parser.py`: add `parse_junit_file` and `slowest_testcases` ### Coverage Helpers - `build_test_map.py`: 
remove `MIN_LINES_PER_SYMBOL=3` filter - `coverage_gate.py`: docstring clarification (threshold check only, not gate) ### Execution Scripts - All `run_*.sh`: adopt `not npu and not network` marker (plus `not nightly` for ci_gate); switch from `-vv` to `-q --tb=short` - `run_nightly.sh`: updated for four-phase pipeline and config drift check ### Test Infrastructure - `tests/helpers/model_cache.py`: centralized session-scoped caches for HF configs and built models - `tests/helpers/cli_runner.py`: in-process CLI runner for coverage visibility - `tests/conftest.py`: add `_seed_rng` and `_restore_environ` autouse fixtures; `cfg_registry` delegates to `model_cache` - `tests/regression/tensor_cast/conftest.py`: delegate to `model_cache` - `tests/README.md`: comprehensive update covering all changes ### Test Coverage - `tests/regression/cli/test_op_replay.py`: switch from subprocess/ast parsing to real imports - `tests/regression/cli/test_throughput_optimizer.py`: switch to in-process runner - `tests/regression/serving_cast/test_main_st.py`: switch to in-process runner - `tests/regression/scripts/helpers/`: new tests for `modified_test` classification, network marker, drift check, phase breakdown payload, slowest tests - `tests/regression/tensor_cast/test_auto_model_config.py`: split local and network-marked remote test cases ------ ## 📐 Associated Test Results All unit tests covering the helper modules (ci_gate, nightly, test_map) pass: `tests/regression/scripts/helpers/ci_gate/: all passed; tests/regression/scripts/helpers/nightly/: all passed; tests/regression/scripts/helpers/common/: all passed; tests/helpers/tests/: all passed; tests/regression/cli/: all passed; tests/regression/web_ui/: all passed` CLI tests migrated to in-process runner (`tests/helpers/cli_runner.py`) show full coverage on core paths that were previously invisible. 
------ ## ✅ Checklist Before PR: - [x] Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests. - [x] The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness. - [x] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. - [x] Please ensure code files contain no Chinese comments. See merge request: Ascend/msmodeling!281	24 days ago

tests/ Directory Description

Directory Structure and Semantics

tests/
├── conftest.py              # Global: Hub offline policy, cache paths, session end weight cleanup
├── .ci/
│   ├── gate_policy.yaml     # CI gate omit / exemption / test_discovery policy
│   └── approvers.yaml       # approvers required for gate_policy.yaml changes
├── smoke/                   # Smoke test cases
├── regression/              # Regression test cases
│   ├── tensor_cast/
│   ├── serving_cast/
│   ├── cli/
│   ├── scripts/             # ci_gate / nightly toolchain UT (mirrors scripts/helpers/)
│   └── web_ui/
├── assets/
├── helpers/                 # Shared test builders/assertions (see below)
└── benchmark/
    ├── models/              # Model-level precision/performance guardianship
    └── ops/                 # Operator-level perf_database guardianship

The repository root scripts/ provides CI entry points; the implementation lives in scripts/helpers/. scripts/lib/common.sh gives all entry scripts unified environment initialization (running uv sync --frozen --group ci when uv is used) and pytest invocation wrappers.

Layering is by directory (smoke / regression / benchmark). Markers: nightly (long-running), npu (hardware), network (live Hub access).

Execution Model

Entry	When	What runs
`bash scripts/run_ci_gate.sh`	PR `compile` comment	Plan-first incremental gate via external `test_map` (`scripts/helpers/ci_gate/main.py`). Pre-run policy: classify diff, validate `gate_policy.yaml` if changed, build `CiGatePlan`; `blocking_errors` → exit 1, pytest skipped. Execution (after policy passes): deduplicated union of (1) new/changed test node ids, (2) `test_map` regression nodes for modified source, (3) deleted-source guard nodes — up to two pytest waves: changed tests `-m not npu`; mapped/guard tests `-m "not npu and not nightly and not network"`. Same node id scheduled once (changed-test reason wins). All targets exempt → skip pytest, exit 0. Config-triggered full suite: `tests/` with `-m not npu` only (includes nightly/network under `tests/`). Selected-test pytest failure prints copy-paste `exemptions.tests` YAML; full-suite failures do not. Collect-first xdist sizing, `-vv --tb=short`, no `--cov`.
`bash scripts/run_smoke.sh`	Local; CI `/run_tests smoke`	Full `tests/smoke/`; `-o addopts=` clears pyproject default markers; `-m "not npu and not network"` (includes nightly); collect-then-xdist (`-n auto --dist=worksteal`); `-vv --tb=short`
`bash scripts/run_regression.sh`	Local; CI `/run_tests regression`	Full `tests/regression/`; same flags as smoke
`bash scripts/run_benchmark.sh`	Local; CI `/run_tests benchmark`	Full `tests/benchmark/`; `-o addopts=`; `-m "not npu and not network"`; `-vv --tb=short`; xdist only when `MSMODELING_BENCHMARK_PARALLEL=1`
`bash scripts/run_nightly.sh`	Scheduled CI only	Phase 1 UT (`not npu and not nightly and not network`) `-n auto --dist=worksteal` + `--cov` → refresh `test_map` → Phase 2a nightly → Phase 2b benchmark → Phase 2c network (live Hub) → config drift check → report; 60/40 coverage thresholds

Local: always full smoke/regression (no external test_map). CI incremental: requires MSMODELING_TEST_MAP_PATH pointing to a JSON file on the runner (maintained by nightly).
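
The deduplicated union described for run_ci_gate.sh in the table above can be pictured with a small sketch. This is an illustration under stated assumptions, not the code in scripts/helpers/ci_gate/: the names Wave and plan_waves are hypothetical, and only the two marker expressions and the "changed-test reason wins" rule are taken from this README.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Marker expressions quoted from the execution model table.
CHANGED_MARKER = "not npu"
MAPPED_MARKER = "not npu and not nightly and not network"


@dataclass
class Wave:
    """One pytest invocation: a marker expression plus the node ids it runs."""

    marker: str
    node_ids: list[str] = field(default_factory=list)


def plan_waves(changed: list[str], mapped: list[str], guard: list[str]) -> list[Wave]:
    """Schedule each node id once; a changed-test entry wins over mapped/guard."""
    seen: set[str] = set()
    changed_wave = Wave(CHANGED_MARKER)
    mapped_wave = Wave(MAPPED_MARKER)
    for node_id in changed:
        if node_id not in seen:
            seen.add(node_id)
            changed_wave.node_ids.append(node_id)
    for node_id in [*mapped, *guard]:
        if node_id not in seen:  # already scheduled as a changed test
            seen.add(node_id)
            mapped_wave.node_ids.append(node_id)
    # Empty waves are dropped, so the result holds at most two waves.
    return [wave for wave in (changed_wave, mapped_wave) if wave.node_ids]
```

With a node id that is both changed and mapped, the sketch places it in the first wave only, so the second wave carries just the pure regression nodes.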

Coverage + xdist: Nightly phase 1 uses -n auto --dist=worksteal with --cov and --cov-context=test. ci_gate uses collect-first xdist sizing only (no --cov). pytest-xdist collects on the controller, then distributes items to workers (collect-then-xdist). Worksteal scheduling helps when case durations vary widely. pyproject.toml sets [tool.coverage.run] parallel = true, and pytest-cov merges worker fragments into the repo-root .coverage used by nightly build_test_map and coverage totals.

-o addopts=: pyproject.toml sets addopts = "-m 'not npu and not nightly and not network'". Shell entry scripts pass -o addopts= so their explicit -m expressions are not stacked on top of the global default. ci_gate passes -o addopts= on every wave and supplies its own -m (changed-test wave: not npu; mapped/guard wave: not npu and not nightly and not network; config-triggered full suite: not npu).
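
As a rough illustration of the override above, the following sketch assembles a pytest argument list that clears the pyproject default before supplying its own marker. The helper name build_pytest_argv is hypothetical; only the -o addopts= and -m options come from this README.

```python
def build_pytest_argv(targets: list[str], marker: str) -> list[str]:
    """Return pytest arguments whose -m replaces the pyproject default."""
    return [
        "-o", "addopts=",  # clear the addopts value configured in pyproject.toml
        "-m", marker,      # the explicit marker expression for this run
        *targets,
    ]
```

For example, build_pytest_argv(["tests/smoke/"], "not npu and not network") yields the flag shape the smoke entry script is described as using.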

Nightly phases: run_nightly.sh runs four pytest phases in order — Phase 1 (not npu and not nightly and not network, with coverage + test_map), Phase 2a (not npu and nightly and not network), Phase 2b (benchmark), Phase 2c (not npu and network, real model Hub, run serially). After Phase 2c a non-blocking config drift check compares vendored remote configs under tests/assets/model_config/ against the live Hub and surfaces any mismatch as a report warning without failing the run. When FEISHU_WEBHOOK_URL is set, each phase's pytest output is captured to a per-phase log file and the console is kept quiet (the detailed report rides the Feishu card instead); the phase breakdown, slowest tests, and drift warnings are rendered into that card.
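
The non-blocking nature of the drift check can be sketched as a comparison that returns warnings instead of raising. This is a minimal illustration, assuming configs are plain dicts loaded from config.json; the function name config_drift_warnings and the message wording are hypothetical and do not reflect the nightly helper's actual output.

```python
def config_drift_warnings(name: str, vendored: dict, live: dict) -> list[str]:
    """Compare a vendored config with the live one and describe each mismatch."""
    warnings = []
    for key in sorted(vendored.keys() | live.keys()):
        if key not in live:
            warnings.append(f"{name}: '{key}' is vendored but missing from the live config")
        elif key not in vendored:
            warnings.append(f"{name}: '{key}' exists in the live config but is not vendored")
        elif vendored[key] != live[key]:
            warnings.append(
                f"{name}: '{key}' differs (vendored={vendored[key]!r}, live={live[key]!r})"
            )
    # The caller appends these to the report; nothing here fails the run.
    return warnings
```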

Marker Semantics

Marker	Usage
`nightly`	Long-running cases under smoke/regression; included in full `run_smoke.sh` / `run_regression.sh`; excluded from mapped/guard ci_gate wave (`not npu and not nightly and not network`), but new/modified test files run in the changed-test wave (`-m not npu` only — nightly cases execute)
`npu`	Hardware-dependent; excluded from all `run_*.sh`
`network`	Requires live model Hub access (HuggingFace/ModelScope); excluded by default (`pyproject.toml` `addopts`) and from every `run_*.sh`; validated only in nightly Phase 2c

Model Configs: Offline by Default

Model-config tests are split so the default/PR path never touches the network:

Local (offline) fixtures live under tests/assets/model_config/<name>/ (vendored config.json, and optionally configuration_*.py / modeling_*.py for remote-code models). Tests that load these run fully offline by default and carry no marker — e.g. the local cases in tests/regression/tensor_cast/test_auto_model_config.py.
Remote loading that resolves a model id against the live Hub is gathered under @pytest.mark.network (e.g. AutoModelAndConfigRemoteTestCase) and therefore runs nightly-only (Phase 2c), never on PR or local run_*.sh.

Vendoring a new model config (move it offline): use scripts/prefetch_model_configs.py to fetch a model id's config-only snapshot (weight shards are ignored), then copy the resulting config.json (plus any configuration_*.py / modeling_*.py for trust-remote-code models) into a new tests/assets/model_config/<name>/ directory and add a local case. Once vendored, the model can be exercised offline and the remote variant stays under @pytest.mark.network.
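
The copy step of vendoring might look like the sketch below. It assumes the snapshot has already been fetched into a local directory; the function name vendor_model_config and its parameters are hypothetical, and only the file patterns and the tests/assets/model_config/<name>/ layout come from the text above.

```python
import shutil
from pathlib import Path

# Files worth vendoring; weight shards and anything else are left behind.
PATTERNS = ("config.json", "configuration_*.py", "modeling_*.py")


def vendor_model_config(snapshot_dir: Path, assets_root: Path, name: str) -> list[Path]:
    """Copy config-only files from a fetched snapshot into assets_root/name/."""
    dest = assets_root / name
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in PATTERNS:
        for src in sorted(snapshot_dir.glob(pattern)):
            copied.append(Path(shutil.copy2(src, dest / src.name)))
    return copied
```

A call such as vendor_model_config(snapshot, Path("tests/assets/model_config"), "my_model") would then leave only the config files under the new directory, ready for a local test case.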

Local Execution

# full smoke (includes @pytest.mark.nightly under tests/smoke/)
bash scripts/run_smoke.sh

# full regression
bash scripts/run_regression.sh

# benchmark (sequential unless MSMODELING_BENCHMARK_PARALLEL=1)
bash scripts/run_benchmark.sh

# CI gate — requires MSMODELING_TEST_MAP_PATH (not for local use)
MSMODELING_TEST_MAP_PATH=/path/to/test_map.json bash scripts/run_ci_gate.sh

# nightly — requires MSMODELING_TEST_MAP_PATH (CI scheduled job)
MSMODELING_TEST_MAP_PATH=/path/to/test_map.json bash scripts/run_nightly.sh

# prefetch model configs
PYTHONPATH=. python3 scripts/prefetch_model_configs.py --dest-dir tests/assets/cache

Environment Variables

Boolean variables accept 0/1/true/false/yes/no/on/off (case-insensitive). Shell scripts apply the defaults; Python helpers read the environment without a fallback.
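
A minimal sketch of parsing the accepted boolean spellings follows. The function name parse_bool is hypothetical, and stripping whitespace and raising on unknown values are assumptions; the project's helpers may behave differently.

```python
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def parse_bool(raw: str) -> bool:
    """Interpret one of the documented boolean spellings, case-insensitively."""
    value = raw.strip().lower()  # assumption: surrounding whitespace is ignored
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"not a recognized boolean value: {raw!r}")
```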

CI / scripts

Variable	Required	Default	Used by	Description
`MSMODELING_TEST_MAP_PATH`	ci_gate, nightly	—	`run_ci_gate.sh`, `run_nightly.sh`	Path to external test_map JSON file (must exist for ci_gate; created by nightly on UT success)
`MSMODELING_TEST_BASE_BRANCH`	Optional	`master`	`run_ci_gate.sh`	merge-base for incremental diff
`MSMODELING_TEST_LINE_THRESHOLD`	Optional	`60`	nightly	Line coverage report threshold (%)
`MSMODELING_TEST_BRANCH_THRESHOLD`	Optional	`40`	nightly	Branch coverage report threshold (%)
`MSMODELING_TEST_WEIGHTS_PRUNE`	Optional	`0`	all `run_*.sh`	Prune Hub weights after session
`MSMODELING_BENCHMARK_PARALLEL`	Optional	`0`	`run_benchmark.sh`, nightly benchmark phase	`1` → pytest `-n auto --dist=worksteal`
`FEISHU_WEBHOOK_URL`	Optional	—	nightly	Feishu webhook (includes coverage summary)
`PYTHON`	Optional	—	`common.sh`	Python interpreter override
`PRE_COMMIT_LLM_FILTER`	Optional	unset	pre-commit hooks	`1` → compact LLM-friendly hook output via `pre-commit/llm_render.py`

Pytest output: smoke / regression / benchmark run -vv --no-header --tb=short (with --durations=20). Nightly phases run -q --no-header --tb=short. run_ci_gate.sh delegates to ci_gate/main.py, which plans tests before any pytest run and logs selection as -vv --tb=short --disable-warnings with collect-first xdist sizing.

The test_map collection scope is hardcoded (not an env override): build_test_map and nightly phase 1 use not npu and not nightly and not network over tests/smoke/ and tests/regression/ — same marker as the ci_gate mapped/guard pytest wave. Benchmark cases never participate in mapping.

Pytest session (`tests/conftest.py`)

Variable	Default	Description
`MSMODELING_OFFLINE`	unset	`1` → Hub offline triplet
`MSMODELING_HF_TRUST_REMOTE_CODE_TIMEOUT`	`0`	HF trust-remote-code timeout (seconds); `0` disables
`MSMODELING_MODELSCOPE_CONFIG_ONLY`	`1`	ModelScope config-only fetch; skip weight shards
`HF_ENDPOINT`	—	Hub mirror
`TORCH_HOME` / `HF_HOME` / `MODELSCOPE_CACHE`	`.msmodeling_cache`	Cache dirs

CI gate policy (`tests/.ci/gate_policy.yaml`)

Gate policy is read by run_ci_gate.sh. Source omit is not in gate_policy — use pyproject.toml [tool.coverage.run] omit as the single source of truth (scripts/helpers/common/coverage_omit.py).

Product roots (single source of truth): gate_policy.yaml roots is the authoritative list of product source prefixes. The same list drives:

Consumer	How `roots` is used
`coverage_config.product_roots()` / `COV_PACKAGES`	`load_gate_policy()` → `roots`; coverage `--cov` flags use `rstrip('/')` package names
`classify_changes` / ci_gate diff	Product vs test vs config classification under merge-base diff
`test_map` keys (`load_test_map`, `build_test_map`)	Map keys must start with a `roots` prefix
Nightly `build_test_map`	`collect_from_coverage` scopes measured files to `roots`

To add a new product tree (e.g. a top-level package), append one roots entry in gate_policy.yaml only — do not duplicate prefixes in Python constants.
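
To make the single-source idea concrete, the sketch below derives the three uses listed in the table from one roots list. The function names are hypothetical and the test_map key shape is assumed for illustration; only the prefix rule and the rstrip('/') package names come from this README.

```python
def is_product_path(path: str, roots: list[str]) -> bool:
    """A path is product source when it starts with one of the roots prefixes."""
    return any(path.startswith(root) for root in roots)


def invalid_test_map_keys(test_map: dict, roots: list[str]) -> list[str]:
    """Return the map keys that do not start with a roots prefix."""
    return sorted(key for key in test_map if not is_product_path(key, roots))


def coverage_packages(roots: list[str]) -> list[str]:
    """Package names for --cov flags: each root without its trailing slash."""
    return [root.rstrip("/") for root in roots]
```

Because each root ends with "/", a sibling directory such as tensor_cast_extra/ does not match the tensor_cast/ prefix by accident.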

Section	Purpose
`roots`	Product source path prefixes (each ends with `/`) — SSOT for diff classification, `test_map` key validation, coverage packages, and gate rules — e.g. `cli/`, `tensor_cast/`, `tools/`
`exemptions.sources`	Temporary product-symbol waivers when `test_map` coverage is not yet available. Each `symbols` entry is `product/path.py::qualified_name` (exactly one `::`, path under `roots`). Skips coverage checks for that source symbol. Requires `reason`, `applicant`, `approver`, `deadline`.
`exemptions.tests`	Temporary pytest-node waivers (same metadata shape). Each `symbols` entry is a pytest node id: `tests/.../test_foo.py::test_bar` — must include `::`, must name a concrete test function/method (no class-only `::TestClass`, no parametrized bracket ids like `::test_x[param]`). Skips matching nodes when scheduling changed tests and mapped regression tests. When every node in a changed test file is exempt, that file is not run. Selected-test pytest failure prints a copy-paste YAML hint; full-suite failures do not.
`test_discovery`	Which paths under `tests/` count as gate test modules

Exemption semantics: exemptions.sources waives product code symbols (AST qualified names under roots). exemptions.tests waives individual pytest nodes (test functions/methods), not whole directories. Prefer fixing or narrowing tests over broad file-level entries.

Example:

exemptions:
  tests:
    - symbols:
        - tests/regression/cli/test_run.py::test_run
      reason: "Fixture unavailable on PR runners"
      applicant: alice
      approver: fangkai
      deadline: 2026-12-31
      ticket: "issue-123"
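
The shape rules for exemption symbols can be approximated with the sketch below. It is not the gate's validator: the function names are hypothetical, and telling a class-only node from a test function relies on pytest's default naming convention (functions start with "test", classes with "Test"), which is an assumption about this project's configuration.

```python
def is_valid_source_symbol(symbol: str, roots: list[str]) -> bool:
    """exemptions.sources: 'product/path.py::qualified_name', exactly one '::'."""
    if symbol.count("::") != 1:
        return False
    path, name = symbol.split("::")
    return bool(name) and path.endswith(".py") and any(path.startswith(r) for r in roots)


def is_valid_test_node(node_id: str) -> bool:
    """exemptions.tests: a concrete test function or method, not parametrized."""
    if "::" not in node_id or "[" in node_id:
        return False  # a bare file path, or a parametrized bracket id
    path, *parts = node_id.split("::")
    leaf = parts[-1]
    # Heuristic (assumption): a leaf such as 'TestClass' names a class, not a test.
    return path.startswith("tests/") and leaf.startswith("test")
```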

Coverage omit (SSOT): pyproject.toml [tool.coverage.run] omit (e.g. */builtin_model/*) — gate and nightly test_map skip matching product sources under roots.

Coverage fallback (import-time symbols): during pre-run policy in build_ci_gate_plan, unmapped changed symbols may pass when repo-root .coverage shows an executed line in the symbol span — including import-time or conftest contexts (empty coverage context). Implemented in coverage_symbol_check.symbol_lines_covered_in_data; no separate Phase 0 pytest+cov or in-memory test_map merge. Nightly phase 1 maintains the external test_map; ci_gate reads it only.
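
The idea of "an executed line in the symbol span" can be sketched with the standard ast module. This is an illustration only, not coverage_symbol_check.symbol_lines_covered_in_data: the function names are hypothetical, the executed lines are passed in as a plain set rather than read from .coverage, and nested functions are addressed with simple dotted names.

```python
from __future__ import annotations

import ast


def symbol_span(source: str, qualname: str) -> tuple[int, int] | None:
    """Return (first_line, last_line) of the function or class named qualname."""

    def walk(body: list[ast.stmt], prefix: str) -> tuple[int, int] | None:
        for node in body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                name = f"{prefix}{node.name}"
                if name == qualname:
                    return node.lineno, node.end_lineno
                found = walk(node.body, name + ".")
                if found:
                    return found
        return None

    return walk(ast.parse(source).body, "")


def span_has_executed_line(span: tuple[int, int], executed: set[int]) -> bool:
    """True when at least one executed line falls inside the symbol span."""
    first, last = span
    return any(first <= line <= last for line in executed)
```

Note that a def line executes when its module is imported, which is one way a symbol can count as hit without any test calling it.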

Changes to gate_policy.yaml require an approver listed in tests/.ci/approvers.yaml.

CodeArts Integration

Trigger	Command
PR comment `compile`	`MSMODELING_TEST_MAP_PATH=… bash scripts/run_ci_gate.sh`
Comment `/run_tests smoke`	`bash scripts/run_smoke.sh`
Comment `/run_tests regression`	`bash scripts/run_regression.sh`
Comment `/run_tests benchmark`	`bash scripts/run_benchmark.sh`
Scheduled nightly	`MSMODELING_TEST_MAP_PATH=… bash scripts/run_nightly.sh`

All run_*.sh scripts source common.sh, which runs uv sync --frozen --group ci when uv is on PATH and PYTHON is unset.

Shared Test Helpers

tests/helpers/ holds reusable builders and assertions for regression cases. Public APIs (read each module for full signatures):

Module	Public API	Role
`model_cache.py`	`get_hf_config(model_id)`, `get_built_model(user_config)`, `user_config_build_cache_key(user_config)`	Single session-scoped cache for HF configs (handed out as deepcopies) and `build_model` results (shared, read-only). Shared by pytest fixtures and unittest `TestCase` paths.
`model_builder.py`	`make_user_input_config(*, model_id, ...)`, `build_or_get_cached_model(user_config, cache)`	Build a minimal `UserInputConfig`; build-once-per-key into a caller-provided cache dict.
`config_factory.py`	`build_case_matrix(*dimensions)`, `build_latency_thresholds(*, ttft_ms, tpot_ms, tolerance_ms=0.1)`	Cartesian parametrize matrices; shared serving latency threshold dicts.
`op_registry.py`	`build_op_registry(cfg_registry)`	Lightweight per-model op registry from the shared hf-config cache.
`assert_utils.py`	`assert_tensor_close(actual, expected, *, rtol, atol, equal_nan)`, `assert_latency_within(actual_ms, expected_ms, *, metric, tolerance_ms, rel_tolerance)`	Tensor closeness (torch semantics) and latency-tolerance assertions.
`cli_runner.py`	`run_module_main(module_name, argv)`, `run_cli_main(main_callable, argv, *, prog)`, `CliResult(returncode, stdout, stderr)`	Run a CLI `main()` in-process so coverage/`test_map` see the real path (subprocess CLI tests measure zero coverage).
`fake_subprocess.py`	`FakeCompleted(returncode, stdout, stderr)`	Minimal `subprocess.CompletedProcess` stand-in for tests that monkeypatch `subprocess.run`.

Self-tests live under tests/helpers/tests/.

`conftest.py` Rules

Pytest loads every tests/**/conftest.py during collection. Side effects at import time leak across the whole suite (including unrelated directories and xdist workers).

Rule	Why
Never assign `sys.modules["tensor_cast"]` (or other product packages) in a conftest	Replaces real modules with mocks → `tensor_cast.__spec__ is not set`, pickle failures in other layers
Use fixture-scoped `monkeypatch` / `@patch` in individual tests when isolation is needed	Scope stays inside one test
Put `pytest_plugins = (...)` only in `tests/conftest.py`	Subdirectory `pytest_plugins` is invalid; root registration shares fixtures across smoke/regression
Subdirectory conftest is for directory-local fixtures only	No global import hacks; project already depends on `torch`
Any change under `tests/**/conftest.py`, `requirements.txt`, `uv.lock`, or standard pytest/coverage config filenames triggers CI full `tests/` with `-m not npu`	See `is_config_path()` in `scripts/helpers/common/test_map_config.py`. Changes to `tests/.ci/gate_policy.yaml` do not trigger the full suite; they are validated via `validate_gate_policy_if_changed` only

Guard test: tests/smoke/test_conftest_hygiene.py — loads conftest modules like pytest and asserts tensor_cast.__spec__ stays valid.

Cross-layer fixtures (tensor_cast / serving_cast session caches) are registered in root tests/conftest.py via pytest_plugins, not by mocking imports in leaf conftests.
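
As a sketch of the scoped alternative to assigning sys.modules in a conftest, the example below uses unittest.mock.patch.dict, which restores the mapping when the block exits. The module name hypothetical_pkg is made up for illustration; a real test would patch only what it needs and never a product package at import time.

```python
import sys
import types
from unittest.mock import patch


def use_stub_module() -> str:
    """Swap in a stub module for the duration of one block only."""
    stub = types.ModuleType("hypothetical_pkg")
    stub.answer = "stubbed"
    # patch.dict undoes its changes on exit, so the stub cannot leak
    # into other tests, directories, or xdist workers.
    with patch.dict(sys.modules, {"hypothetical_pkg": stub}):
        import hypothetical_pkg

        return hypothetical_pkg.answer
```

After use_stub_module() returns, sys.modules no longer contains the stub, which is the property the rules in the table above are protecting.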

Adding New Test Cases

Step 1: Choose the directory

Your test intent	Directory	Example
Quick path validation, PR-level guard	`tests/smoke/`	`test_compile_paths_smoke.py`
Functional / integration verification	`tests/regression/`	`test_text_generate.py`
Precision or performance baseline	`tests/benchmark/models/` or `tests/benchmark/ops/`	`test_model_regression.py`

Do not add layer markers (smoke, regression, benchmark). Layering is expressed by directory placement. Only use @pytest.mark.nightly (long-running compile paths), @pytest.mark.npu (hardware-dependent), or @pytest.mark.network (live Hub access) when applicable.

Step 2: Reuse shared helpers

Need	Module	Key API
Build a `UserInputConfig`	`tests/helpers/model_builder.py`	`make_user_input_config(model_id=..., ...)`
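Build / cache a model	`tests/helpers/model_cache.py`, `tests/helpers/model_builder.py`	`get_built_model(user_config)` (session cache, in `model_cache.py`) or `build_or_get_cached_model(user_config, cache)` (caller-provided cache, in `model_builder.py`)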
Get a HF config	`tests/helpers/model_cache.py`	`get_hf_config(model_id)` (deepcopy per call)
Assert tensor / latency	`tests/helpers/assert_utils.py`	`assert_tensor_close(...)`, `assert_latency_within(...)`
Build op registry	`tests/helpers/op_registry.py`	`build_op_registry(cfg_registry)`
Run a CLI `main()` in-process	`tests/helpers/cli_runner.py`	`run_module_main(module_name, argv) -> CliResult`
Stub `subprocess.run` result	`tests/helpers/fake_subprocess.py`	`FakeCompleted(returncode, stdout, stderr)`

CLI tests should call run_module_main instead of spawning a subprocess, so coverage and test_map observe the real core path:

from tests.helpers.cli_runner import run_module_main

def test_cli_reports_config():
    result = run_module_main(
        "cli.inference.throughput_optimizer",
        ["--input-length=1", "--output-length=1", "Qwen/Qwen3-32B"],
    )
    assert result.returncode == 0
    assert "Input Configuration:" in result.stdout
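
For readers curious how an in-process runner can work at all, here is a minimal sketch. It is not the code in tests/helpers/cli_runner.py: the names InProcessResult and run_main_in_process are hypothetical, and passing arguments through sys.argv is an assumption about how the target main() reads them.

```python
import contextlib
import io
import sys
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class InProcessResult:
    returncode: int
    stdout: str
    stderr: str


def run_main_in_process(
    main: Callable[[], object], argv: Sequence[str], *, prog: str = "cli"
) -> InProcessResult:
    """Call main() in this interpreter, capturing its output and exit code."""
    out, err = io.StringIO(), io.StringIO()
    returncode = 0
    saved_argv = sys.argv
    sys.argv = [prog, *argv]
    try:
        with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
            try:
                result = main()
                returncode = result if isinstance(result, int) else 0
            except SystemExit as exc:  # argparse and sys.exit() end up here
                code = exc.code
                returncode = code if isinstance(code, int) else (0 if code is None else 1)
    finally:
        sys.argv = saved_argv  # always restore the caller's argv
    return InProcessResult(returncode, out.getvalue(), err.getvalue())
```

Because main() runs inside the test process, the coverage tracer sees every line it executes, which is the property the subprocess approach lacks.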

Step 3: Use session-level fixtures (regression)

Regression tests under tests/regression/tensor_cast/ have access to session-scoped model and config caches:

from tests.helpers.model_builder import make_user_input_config
from tests.regression.tensor_cast.conftest import get_session_model

def test_my_feature():
    user_config = make_user_input_config(model_id="my-model-id")
    model = get_session_model(user_config)  # cached across the session via tests.helpers.model_cache
    # ... run assertions

get_session_model / get_session_hf_config delegate to tests.helpers.model_cache, so the build cache is shared across both pytest fixtures and unittest TestCase code paths. This avoids rebuilding the same model for every test function.
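
The caching behaviour described above (configs handed out as deep copies, built models shared) can be sketched as follows. This is an illustration, not tests/helpers/model_cache.py: the function names and the loader/builder callbacks are hypothetical.

```python
import copy
from typing import Callable

_CONFIGS = {}  # model_id -> loaded config, kept pristine
_MODELS = {}   # cache key -> built model, shared by every caller


def get_config(model_id: str, loader: Callable[[str], dict]) -> dict:
    """Load a config once per session; each caller receives its own deep copy."""
    if model_id not in _CONFIGS:
        _CONFIGS[model_id] = loader(model_id)
    return copy.deepcopy(_CONFIGS[model_id])


def get_model(key: str, builder: Callable[[], object]) -> object:
    """Build once per key; the same object is returned and treated as read-only."""
    if key not in _MODELS:
        _MODELS[key] = builder()
    return _MODELS[key]
```

Returning copies for configs lets a test mutate its config freely, while sharing the built model avoids paying the build cost again in every test function.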

Step 4: Add a benchmark case (for precision or performance guardianship)

Create a JSON config file under tests/benchmark/models/cases/ (or tests/benchmark/ops/perf_database/).
Set baseline_time_s and tolerance fields. If no baseline exists yet, set baseline_time_s: 0 — the auto-baseline mechanism will establish a reference on first run.
The test runner (TestModelRegression) loads all JSON cases automatically.

Step 5: Verify locally

# Run only your new layer
bash scripts/run_smoke.sh        # or run_regression.sh / run_benchmark.sh

# Check that your new test appears in the test_map collection scope
PYTHONPATH=. python -m pytest tests/smoke/ tests/regression/ \
  -m "not npu and not nightly and not network" --collect-only -q

Checklist for new cases

Case is in the correct directory (smoke / regression / benchmark)
No layer markers (smoke, regression, benchmark); only nightly, npu, or network when needed
Shared helpers used where applicable (no copy-paste of builder/assertion logic)
Session fixtures used for model construction in regression (no per-function rebuilds)
If @pytest.mark.nightly is added, a corresponding smoke guard exists under tests/smoke/
New or edited conftest.py has no module-level sys.modules / global mocks (see conftest.py Rules above)
New product symbols are covered or listed in tests/.ci/gate_policy.yaml (exemptions.sources for product symbols, exemptions.tests for pytest node ids)
Local smoke + regression pass before push

Merge Checklist

Test case in correct directory; nightly / npu / network markers only when needed
New product symbols covered by tests or listed in gate_policy.yaml (exemptions.sources / exemptions.tests)
Local smoke + regression pass before push
Core path changes considered for nightly impact