| File | Last commit | Last updated |
|---|---|---|
feat(skills): add throughput-optimizer-explainer. Co-authored-by: lutean<lutean1@huawei.com>, gitcode-bot<noreply@gitcode.com>. Message auto-generated for no-merge-commit merge !413 (merge master into master). Created-by: lutean. Commit-by: lutean;gitcode-bot. Merged-by: ascend-robot. PR type: Feature. Motivation: add the throughput-optimizer-explainer skill, which analyzes and explains throughput-optimizer results. Triggers: the skill explains the output of python -m cli.inference.throughput_optimizer and is typically triggered when the user (1) asks whether throughput, TTFT, TPOT, or PD ratio is reasonable; (2) wants to compare different hardware, parallel strategies, or best rows; (3) wants to analyze Cube/Vec/Comm/Mem bottlenecks; (4) provides --dump-original-results, text_generate, --dump-op-bound-results, or a profiler trace; (5) wants to map the throughput_optimizer best row to a python -m cli.inference.text_generate verification command. Use cases: the core scenario is explaining optimizer results without going beyond the evidence: explain the best strategy in aggregation / disaggregation / PD ratio mode; grade the result as basically reasonable, partly explainable, suspicious, or insufficient evidence; make macro-level judgments from TTFT, TPOT, throughput, concurrency, batch, and parallel configuration; analyze the Cube, Vec, Comm, and Mem shares of Prefill / Decode from the phase breakdown; do simulated operator-level attribution from text_generate --dump-op-bound-results; make stronger operator/kernel-level judgments from a real profiler or chrome trace; when evidence is insufficient, generate the minimal necessary verification commands. Workflow: 1. Identify the optimizer mode: aggregation, disaggregation, or PD ratio. 2. Extract the comparable conditions: model, device, device count, input/output lengths, SLO, quantization, compile, prefix cache, MTP, search space, and so on. 3. Extract the best row / top candidates: throughput, TTFT, TPOT, concurrency, batch size, parallel strategy, PD ratio, QPS, breakdown. 4. Determine the evidence level first: macro_only, optimizer_phase_breakdown, text_generate_phase_breakdown, text_generate_op_bound, or profiler_trace. 5. Aggregation mode must be decomposed into the Prefill forward + Decode forward + scheduling formula; it must not be treated as a single forward. 6. Disaggregation mode maps directly to the Prefill or Decode phase. 7. If the breakdown is missing and bottleneck analysis is needed, generate a text_generate verification command; add --dump-op-bound-results when operator-level attribution is needed. 8. If op-bound output is available, look first at the top total-time operators, the dominant bound, and the memory/comm/mma/gp percentages. 9. When comparing hardware or strategies, the priority order is phase breakdown, op-bound, macro metrics; hardware spec ratios are only supporting evidence. 10. Give the reasonableness grade and the main judgments. 11. Finish with the minimal verification action. Key evidence rules: with macro output only, do not assert a specific operator or a Cube/Vec/Comm/Mem bottleneck. text_generate --dump-op-bound-results counts only as TensorCast simulated operator attribution, not as real profiler/kernel evidence. Conclusions about real runtime must be backed by a profiler or actual measurements. Scripts used: parse_optimizer_output.py takes raw optimizer output, dump tables, text_generate output, or op-bound output and emits structured JSON; it can parse mode, Best Throughput, TTFT, TPOT, PD Ratio, and Prefill/Decode QPS, and can extract pretty tables, percentage_breakdowns dump rows, Stats breakdowns, and op-bound operator tables. build_text_generate_commands.py generates text_generate verification commands from a normalized best row JSON; it supports --mode aggregation and --mode disaggregation; aggregation generates two commands (Prefill and Decode) and computes effective_input_length, prefill_batch_size, and the partial Prefill wave; disaggregation requires phase=prefill\|decode and generates the corresponding single-phase command; --include-op-bound appends --dump-op-bound-results. compare_phase_breakdowns.py compares the Cube/Vec/Comm/Mem breakdowns in two JSON files and outputs the left and right values, the difference delta_right_minus_left, and the ratio ratio_right_over_left; with --op-bound it compares two op-bound tables: bound distribution, differences in top operators, total time, and changes in the memory/comm/mma/gp percentages. Checklist: no items checked. See merge request: Ascend/msmodeling!413 | 2 days ago | |
fix(security): add model source safety checks. Co-authored-by: jia_ya_nan<jiayanan3@h-partners.com>. Message auto-generated for no-merge-commit merge !385 (merge fix/trust-remote-code-safety into master). Created-by: jia_ya_nan. Commit-by: jia_ya_nan. Merged-by: ascend-robot. PR type: Bugfix, Other. Motivation: security hardening. Modification: add permission checks for local paths; add risk warnings to the logs; remove the old, unmaintained interface. Checklist: all items checked. See merge request: Ascend/msmodeling!385 | 3 days ago | |
Change the default to vllm. Co-authored-by: tt0cool<xujintao8@h-partners.com>. Message auto-generated for no-merge-commit merge !461 (merge master into master). Created-by: tt0cool. Commit-by: tt0cool. Merged-by: ascend-robot. PR type: Other. Motivation: switch the optimizer tool's default to vllm. Modification: switch the optimizer tool's default to vllm. Checklist: all items checked. See merge request: Ascend/msmodeling!461 | 8 hours ago | |
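
The !413 commit message says compare_phase_breakdowns.py reports, for each of Cube/Vec/Comm/Mem, the left and right values plus delta_right_minus_left and ratio_right_over_left. The following is a minimal sketch of that comparison, not the script itself: the function name, the lowercase category keys, the flat dict input shape, and the handling of a zero left value are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Assumed category keys; the real JSON layout used by the script is not shown in the source.
CATEGORIES = ("cube", "vec", "comm", "mem")


def compare_breakdowns(
    left: Dict[str, float], right: Dict[str, float]
) -> Dict[str, Dict[str, Optional[float]]]:
    """Compare two phase breakdowns (hypothetical helper).

    Each input maps a category name to its share of phase time in percent.
    Missing categories are treated as 0.0 (an assumption).
    """
    result: Dict[str, Dict[str, Optional[float]]] = {}
    for key in CATEGORIES:
        left_value = float(left.get(key, 0.0))
        right_value = float(right.get(key, 0.0))
        result[key] = {
            "left": left_value,
            "right": right_value,
            "delta_right_minus_left": right_value - left_value,
            # The ratio is undefined when the left value is zero, so report None.
            "ratio_right_over_left": (right_value / left_value) if left_value else None,
        }
    return result


if __name__ == "__main__":
    # Made-up illustrative numbers, not measured results.
    left = {"cube": 40.0, "vec": 20.0, "comm": 10.0, "mem": 30.0}
    right = {"cube": 30.0, "vec": 20.0, "comm": 25.0, "mem": 25.0}
    for name, row in compare_breakdowns(left, right).items():
        print(name, row)
```

In line with the evidence rules in the commit message, a difference computed this way only describes simulated breakdowns; it would not by itself support a claim about real runtime behavior.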