| File | Last commit | Last updated |
|---|---|---|
| | optimize memory peak for servingcast & support model_config from tensorcast. Co-authored-by: stormchasingg<sh_ding@zju.edu.cn>. Message auto-generated for no-merge-commit merge: !360 merge enhance-servingcast into master. Created-by: stormchasingg. Commit-by: stormchasingg. Merged-by: ascend-robot. Description: **PR Template** Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. **PR Type:** Feature, Test-Cases. **Motivation:** This PR aligns TensorCast/ServingCast throughput simulation with vLLM-Ascend MoE optimization behavior and keeps the ServingCast throughput-simulation configuration consistent with TensorCast, especially for shared expert tensor parallelism (TP), sequence parallel configuration, and fused MoE communication paths. **Modification:** Add throughput optimizer options for shared expert TP, sequence parallel, word embedding TP mode, and chrome trace output; propagate optimizer CLI options into UserInputConfig and per-parallel-search model runner configs; apply sequence-parallel compilation configuration inside each parallel runner task; add TP/DP suffixes to chrome trace filenames to avoid overwriting trace files across parallel search candidates; adjust MoE shared expert TP execution to decrease the memory peak in servingcast; enable dispatch-FFN-combine fusion by default in the compilation config. **Associated Test Results:** Omitted. Test coverage included: None. **Use cases (Optional):** This change is useful when evaluating MoE models with vLLM-style shared expert TP and sequence parallel optimizations, and when collecting chrome traces for multiple TP/DP candidates in one throughput search. Dense model: `python3 -m cli.inference.throughput_optimizer $dense_model_path --device ATLAS_800_A3_752T_128G_DIE --num-devices 16 --input-length 4096 --output-length 1 --compile --tp-sizes 8 16 --batch-range 16 16 --enable-sequence-parallel --word-embedding-tp row --quantize-linear-action DISABLED --ttft-limits 2000 --log-level info 2>&1 \| tee ./run_sc_1.log` MoE model: `python3 -m cli.inference.throughput_optimizer $moe_model_path --device ATLAS_800_A3_752T_128G_DIE --num-devices 16 --input-length 4096 --output-length 1 --compile --quantize-linear-action W8A8_STATIC --disagg --ttft-limits 2000 --tp-sizes 8 16 --batch-range 4 4 --reserved-memory-gb 10 --enable-shared-expert-tp --word-embedding-tp row --chrome-trace trace_decode.json --log-level info 2>&1 \| tee ./run_sc3_2.log` **Checklist (Before PR):** [ ] Bug fixes are fully covered by unit tests, and the case that causes the bug is added to the unit tests; [x] the modification is covered by complete unit tests; [ ] all relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes; [x] code files contain no Chinese comments. See merge request: Ascend/msmodeling!360 | 3 days ago |
| | [FEAT] Release the unified msmodeling wheel package and CLI entry point. Co-authored-by: liujiawang<anonymousdev@163.com>. Message auto-generated for no-merge-commit merge: !370 merge pack into master. Created-by: AvadaKedavrua. Commit-by: liujiawang. Merged-by: ascend-robot. Description: **Reason for change:** msmodeling previously had no standard pip installation path, OptiX was maintained separately under experimental/, and the CLI entry points were scattered. They need to be unified into a main wheel installable with `pip install msmodeling`, while keeping the source-tree `python -m` usage as a second track. **Changes:** Rename the package to msmodeling (0.2.0) and build a py3-none-any wheel with hatchling + `uv build --wheel`; move experimental/optix/ to optix/ in the repository root and merge it into the main package, with its dedicated dependencies (loguru, pyswarms, etc.) becoming core dependencies; add cli/main.py as the only console script: `msmodeling inference\|optix ...`; support `python -m optix`, with the logo integrated into the OptiX CLI (not printed for `--help`); package web_ui/ into the main wheel by default and keep gradio in the core dependencies; add scripts/build.sh, which supports MSMODELING_WHEEL_OUTPUT_DIR to control the wheel output directory (for CI archiving); add optix/ as a coverage source root in the CI gate tests/.ci/gate_policy.yaml; keep `python -m cli.inference.*` available in parallel. **Self-verification:** *Wheel build and package boundary.* Purpose: confirm that the wheel artifact name, contents, and package boundary match the plan. Steps: 1. Run `bash scripts/build.sh`. 2. Inspect the contents of dist/msmodeling-0.2.0-py3-none-any.whl. Result: `Building wheel...` `Successfully built dist/msmodeling-0.2.0-py3-none-any.whl` `Built wheel: dist/msmodeling-0.2.0-py3-none-any.whl` `Archive: dist/msmodeling-0.2.0-py3-none-any.whl` `2206 cli/main.py` `4738 optix/config.toml`. *CLI and logo.* Purpose: confirm the unified entry point, `python -m optix`, and the logo behavior. Steps: 1. `uv run msmodeling -h` 2. `uv run python -m optix --help` 3. `uv run pytest tests/regression/cli/test_logo_cli_hooks.py -q`. Result: `5 passed in 0.59s`. *CI output directory environment variable.* Purpose: confirm that CI can set the wheel archive directory through an environment variable. Steps: 1. `MSMODELING_WHEEL_OUTPUT_DIR=/tmp/msmodeling-whl-test bash scripts/build.sh` 2. `ls /tmp/msmodeling-whl-test/`. Result: `msmodeling-0.2.0-py3-none-any.whl`. See merge request: Ascend/msmodeling!370 | 10 days ago |
| | [Sync] [Non-development code] Sync code from develop to master. Co-authored-by: yydyzr<liuyuncong1@huawei.com>. Co-authored-by: gcw_61YBRfIt<chuzhenxing@huawei.com>. Co-authored-by: 孔炳翔<1120200577@qq.com>. Co-authored-by: zhengxinqian<qianzhengxin@huawei.com>. Co-authored-by: hw_whx<wanghexiang7@huawei.com>. Co-authored-by: jgong5<steven.gong@gmail.com>. Co-authored-by: hw_whx<2952154980@qq.com>. Message auto-generated for no-merge-commit merge: !330 merge master into master. Created-by: AvadaKedavrua. Commit-by: liujiawang;ascend-robot;AvadaKedavrua;lutean;Horacehxw;eveyin1;minghang_c;zwt__;tt0cool;elrond-g;jia_ya_nan;zhenyu_zhang;ChenHuiwen;wangshen001;Hudingyi;wendellX;Secluded_Ocean;jhon-117;yaohan404;jiangruitao;zhenghaojie;stormchasingg;panyj1993;cmh1056291129;yuyinkai1;sunguozhong;genius52;liu_jiaxu;HongMaoShuiGuai;zhengxinqian;weixin_43368449;jsez-li-bin;jgong5;wqh17101;w00609794;yydyzr;JieZhang679;sppedforcy;gcw_61YBRfIt;Jiong Gong;hw_whx;gongjiong;孔炳翔. Merged-by: ascend-robot. Description: Code is synced from develop to master; later development builds on master, and packaging is supported. See merge request: Ascend/msmodeling!330 | 15 days ago |
| | Change the default to vllm. Co-authored-by: tt0cool<xujintao8@h-partners.com>. Message auto-generated for no-merge-commit merge: !461 merge master into master. Created-by: tt0cool. Commit-by: tt0cool. Merged-by: ascend-robot. Description: **PR Type:** Other. **Motivation:** Switch the optimization-search tool to use vllm by default. **Modification:** Switch the optimization-search tool to use vllm by default. **Associated Test Results:** none provided. **Use cases (Optional):** none provided. **Checklist (Before PR):** [x] Bug fixes are fully covered by unit tests, and the case that causes the bug is added to the unit tests; [x] the modification is covered by complete unit tests; [x] all relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes; [x] code files contain no Chinese comments. See merge request: Ascend/msmodeling!461 | 20 hours ago |
| | fix(security): add model source safety checks. Co-authored-by: jia_ya_nan<jiayanan3@h-partners.com>. Message auto-generated for no-merge-commit merge: !385 merge fix/trust-remote-code-safety into master. Created-by: jia_ya_nan. Commit-by: jia_ya_nan. Merged-by: ascend-robot. Description: **PR Type:** Bugfix, Other. **Motivation:** Security hardening. **Modification:** Add permission checks for local paths; add risk warnings to the logs; remove legacy interfaces that are no longer maintained. **Associated Test Results:** none provided. **Use cases (Optional):** none provided. **Checklist (Before PR):** [x] Bug fixes are fully covered by unit tests, and the case that causes the bug is added to the unit tests; [x] the modification is covered by complete unit tests; [x] all relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes; [x] code files contain no Chinese comments. See merge request: Ascend/msmodeling!385 | 3 days ago |
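
The description of merge request !360 in the table above mentions adding TP/DP suffixes to chrome trace filenames so that parallel search candidates do not overwrite each other's traces. The listing does not show how the repository implements this, so the following is only a minimal sketch of the idea: the function name `suffixed_trace_path` and the `_tp{N}_dp{M}` suffix format are hypothetical, chosen for illustration.

```python
from pathlib import Path


def suffixed_trace_path(base: str, tp_size: int, dp_size: int) -> str:
    """Insert a _tp{N}_dp{M} suffix before the file extension.

    Hypothetical helper: the real naming scheme used by the
    throughput optimizer may differ.
    """
    path = Path(base)
    # Keep the directory and extension, change only the file stem.
    new_name = f"{path.stem}_tp{tp_size}_dp{dp_size}{path.suffix}"
    return str(path.with_name(new_name))


if __name__ == "__main__":
    # One trace file per (TP, DP) candidate instead of a single shared file.
    for tp, dp in [(8, 2), (16, 1)]:
        print(suffixed_trace_path("trace_decode.json", tp, dp))
```

With a scheme like this, a single `--chrome-trace trace_decode.json` argument would yield one distinct file per candidate, which is the property the merge request describes.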