| File | Last commit | Last updated |
|---|---|---|
chore(ci): adopt pre-commit and retire legacy lintrunner adapters

Co-authored-by: liujiawang<anonymousdev@163.com>

Message auto-generated for no-merge-commit merge: !176 merge pre-commit into develop

Created-by: AvadaKedavrua

Commit-by: liujiawang;AvadaKedavrua

Merged-by: ascend-robot

Description:

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

**PR Type**

- [ ] Feature
- [ ] Bugfix
- [x] Docs
- [x] CI/CD
- [ ] Refactor
- [ ] Perf
- [ ] Test-Cases
- [ ] Other

------

## Motivation

Continue the **pre-commit** migration: tighten **Pylint** so only high-signal messages run (`disable=all` + explicit enable list), fix real issues that remained under that profile, and translate hook/config comments to **English**.

------

## Configuration changes (tooling & comments only)

| Path | What changed |
|------|----------------|
| pre-commit/pyproject.toml | **Pylint:** `[tool.pylint."messages control"]` with `disable = ["all"]` and a short **allowlist** of message IDs (E0100, E0601–E0611, E0632, E1101, E1120, W0632, W1514). **Ruff:** unchanged behavior; comments translated to English. **Bandit:** comments translated; rule allowlist/skip lists unchanged. |
| .pre-commit-config.yaml | Comments translated to English; Bandit hook display name set to **bandit (Python security checks)**. Hook versions and args unchanged except for comment text. |

------

## Source code changes (application code)

| Area | Files | Purpose |
|------|--------|---------|
| serving_cast | communication.py, engine.py, instance.py, kv_cache_manager.py, load_gen.py, main.py, model_runner.py, request.py, serving.py, utils.py | Replace `from . import stime` with `import serving_cast.stime as stime` so Pylint resolves imports (fixes **E0611**). |
| serving_cast | stime.py | Singleton **salabim** Environment via `_get_sim_env()` so type checkers/Pylint see **sim.Environment** (fixes **E1101** on SimulationEnv). |
| serving_cast/service | base_throughput_optimizer.py | `__init__` defaults + `assert runner is not None` before `run_inference` (fixes **E1101** on base class). |
| tensor_cast | diffusers/diffusers_model.py, diffusers/diffusers_utils.py, runtime.py | Add **encoding="utf-8"** to `open()` / trace export (fixes **W1514**). |
| web_ui | callbacks.py | **refresh_optimizer_detail:** call `_optimizer_detail_view(rows, None, device)` and unpack five return values (fixes **E1120**). |

------

## Recent commits on pre-commit branch

- ci(pre-commit): fix pylint message selection with disable=all
- fix: resolve pylint findings in serving_cast, tensor_cast, and web_ui
- docs(pre-commit): translate comments to English and add all-files run log

------

## Checklist

- [x] Please ensure code files contain no Chinese comments.

See merge request: Ascend/msmodeling!176

Last updated: 1 month ago
feat(serving_cast): support chunked prefill modeling

Co-authored-by: jia_ya_nan<jiayanan3@h-partners.com>

Message auto-generated for no-merge-commit merge: !250 merge feat/chunked-prefill-impl into develop

Created-by: jia_ya_nan

Commit-by: jia_ya_nan

Merged-by: ascend-robot

Description:

# PR Template

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

**PR Type**

- [x] Feature
- [ ] Bugfix
- [ ] Docs
- [ ] CI/CD
- [ ] Refactor
- [ ] Perf
- [ ] Test-Cases
- [ ] Other

## 🔍 Motivation

**Please describe the motivation of this PR and the goal you want to achieve through this PR.**

Currently, `throughput_optimizer` uses `max_prefill_tokens` as the prefill token budget in mixed-deployment mode and requires the effective input length not to exceed it. When a long-context request's `effective_input_length` is larger than the token budget, the tool raises an error outright and cannot simulate the chunked prefill scenario that is common in real serving.

This PR adds chunked prefill modeling to msmodeling, so that the throughput optimizer can automatically split prefill into multiple chunks for estimation when prompts are long or the batch token budget is small, and can more reasonably model how running prefill and decode together affects TTFT, TPOT, and throughput.

------

## 📝 Modification

**Please briefly describe what modification is made in this PR.**

- Rename the CLI parameter `--max-prefill-tokens` to `--max-batched-tokens`, which expresses the token budget of a single prefill / mixed step.
- Add prefill chunk plan generation: when `effective_input_length > max_batched_tokens`, prefill is automatically split by `max_batched_tokens`.
- Add a default scheduling policy, `DecodeFirstWithSlack`, which supports decode-first scheduling and allows 15% slack so that decode tokens occupying the budget do not make a prefill chunk unschedulable.
- Add a lightweight chunked prefill time simulation in aggregated mode: requests that have finished prefill can enter decode early, instead of all requests having to finish prefill before decoding together.
- In PD-disaggregated mode, the prefill stage supports chunked prefill; the decode stage keeps its original logic.
- Refine the latency cache key so that it distinguishes different `query_len`, `seq_len`, and concurrency shapes.
- Add `effective_input_length`, `max_batched_tokens`, and `prefill_num_chunks` to the output to make it easier to analyze the impact of the chunked prefill configuration.
- Update Web UI parameter generation, form validation, related documentation, and unit tests.

------

## 📐 Associated Test Results

**Please provide the related test results, such as test reports, etc.**

Taking 32 requests of 32k tokens each as an example.

Without chunking, before the change:

`python -m cli.inference.throughput_optimizer Qwen/Qwen3-32B --device ATLAS_800_A2_280T_64G --quantize-linear-action DISABLED --input-length 32000 --output-length 1024 --tp-sizes 8 --compile --batch-range 32 32 --num-devices 8 --max-prefill-tokens 32000 --log-level info`

Without chunking, after the change:

`python -m cli.inference.throughput_optimizer Qwen/Qwen3-32B --device ATLAS_800_A2_280T_64G --quantize-linear-action DISABLED --input-length 32000 --output-length 1024 --tp-sizes 8 --compile --batch-range 32 32 --num-devices 8 --max-batched-tokens 32000 --log-level info`

The results are unchanged; the previous scheduling logic is not affected.

Chunk size 2000:

`python -m cli.inference.throughput_optimizer Qwen/Qwen3-32B --device ATLAS_800_A2_280T_64G --quantize-linear-action DISABLED --input-length 32000 --output-length 1024 --tp-sizes 8 --compile --batch-range 32 32 --num-devices 8 --max-batched-tokens 2000 --log-level info`

Chunk size 4000:

`python -m cli.inference.throughput_optimizer Qwen/Qwen3-32B --device ATLAS_800_A2_280T_64G --quantize-linear-action DISABLED --input-length 32000 --output-length 1024 --tp-sizes 8 --compile --batch-range 32 32 --num-devices 8 --max-batched-tokens 4000 --log-level info`

Chunk size 8000:

`python -m cli.inference.throughput_optimizer Qwen/Qwen3-32B --device ATLAS_800_A2_280T_64G --quantize-linear-action DISABLED --input-length 32000 --output-length 1024 --tp-sizes 8 --compile --batch-range 32 32 --num-devices 8 --max-batched-tokens 8000 --log-level info`

In terms of trend, the smaller the chunk size, the better the TPOT. At the same time, smaller chunks mean more scheduling steps in the prefill stage, so TTFT increases, as expected.

In addition, with chunked prefill enabled, prefill is computed multiple times, so the time cost grows linearly. Running all prefill chunks in parallel in a single pass could be considered, but it would consume a great deal of resources, so the performance improvement is planned for the next PR.

------

## 🌟 Use cases (Optional)

**If this PR introduces a new feature, it is better to list some use cases here and update the documentation.**

------

## ✅ Checklist

**Before PR**:

- [x] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests.
- [x] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
- [x] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes.
- [x] Please ensure code files contain no Chinese comments.

------

See merge request: Ascend/msmodeling!250

Last updated: 21 days ago
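The chunk-splitting rule from the chunked prefill merge request (!250) can be sketched as follows. This is a minimal illustration, assuming that splitting "by `max_batched_tokens`" means full-size chunks followed by one smaller remainder chunk. The function name `build_prefill_chunk_plan` is hypothetical and is not taken from the repository, and the sketch does not model the `DecodeFirstWithSlack` policy or its 15% slack.

```python
def build_prefill_chunk_plan(effective_input_length: int, max_batched_tokens: int) -> list[int]:
    """Return the token count of each prefill chunk (hypothetical helper).

    Assumption: chunks are filled greedily up to max_batched_tokens, and the
    last chunk carries whatever tokens remain.
    """
    if effective_input_length <= 0:
        raise ValueError("effective_input_length must be positive")
    if max_batched_tokens <= 0:
        raise ValueError("max_batched_tokens must be positive")

    # A prompt that fits in the budget needs no chunking at all.
    if effective_input_length <= max_batched_tokens:
        return [effective_input_length]

    full_chunks, remainder = divmod(effective_input_length, max_batched_tokens)
    plan = [max_batched_tokens] * full_chunks
    if remainder:
        plan.append(remainder)
    return plan


if __name__ == "__main__":
    # The budgets mirror the values used in the test commands of the merge request.
    for budget in (32000, 8000, 4000, 2000):
        plan = build_prefill_chunk_plan(32000, budget)
        print(budget, len(plan))
```

Under this assumption the number of chunks grows as the budget shrinks, which is consistent with the reported trend that smaller chunks add prefill scheduling steps and increase TTFT.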
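fix(serving_cast): #27–#31 empty results, load_gen, ModelRunner, logger, communication validation

Co-authored-by: welar<welar.ww@gmail.com>

Message auto-generated for no-merge-commit merge: !171 merge fix/27-31-serving-cast-stability into develop

Created-by: welar

Commit-by: welar

Merged-by: ascend-robot

Description:

## Motivation

- **#27**: After agg/disagg filtering the DataFrame may be empty, so `.loc[0]` raises `KeyError` directly and the throughput optimization CLI cannot finish.
- **#28**: `get_load_gen` branches on `load_gen_type`, but the exception message accesses `load_gen_config.type`, which easily raises `AttributeError` and masks the real invalid type.
- **#29**: `ModelRunner.__init__` reads `self.enable_interpolate` before it is assigned, so the multi-process path may raise `AttributeError`.
- **#30**: `get_logger` appends a `StreamHandler` on every call, causing duplicate log lines and extra overhead.
- **#31**: `num_bytes <= 0 and (not isinstance(num_bytes, int))` is always false for negative `int` values, so negative integers bypass validation.

## Self-verification

- Static reading: in `ModelRunner`, `enable_interpolate` is assigned before the mutual-exclusion check; the `get_load_gen` exception string uses `load_gen_type`.
- `_get_agg_disagg_final_out`: for an empty `sorted_summary_df`, returns a hint list and logs a warning, with no `.loc[0]`.
- `host2device_sync` / async / `device2device_*`: every non-positive int is rejected (including negative ints).
- `get_logger`: repeated calls with the same name do not attach duplicate handlers (logger-level flag).

Fixes #27. Fixes #28. Fixes #29. Fixes #30. Fixes #31.

See merge request: Ascend/msmodeling!171

Last updated: 1 month ago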
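Two of the fixes in merge request !171 are easy to illustrate in isolation. The sketch below reproduces the condition quoted in #31 and shows why a negative `int` slips through, next to one possible corrected form. It then shows a guard of the kind described for #30. The corrected condition, the function names, and the marker attribute `_sketch_handler_attached` are assumptions made for illustration; the merge request text does not show the actual code.

```python
import logging


def rejects_num_bytes_buggy(num_bytes) -> bool:
    """Condition quoted in #31: both clauses must hold, so an int never matches."""
    return num_bytes <= 0 and (not isinstance(num_bytes, int))


def rejects_num_bytes_fixed(num_bytes) -> bool:
    """Assumed corrected form: reject anything that is not a positive int."""
    return (not isinstance(num_bytes, int)) or num_bytes <= 0


def get_logger_sketch(name: str) -> logging.Logger:
    """Attach a StreamHandler only on the first call for a given logger name."""
    logger = logging.getLogger(name)
    # Hypothetical logger-level marker; the real flag name is not given in the source.
    if not getattr(logger, "_sketch_handler_attached", False):
        logger.addHandler(logging.StreamHandler())
        logger._sketch_handler_attached = True
    return logger


if __name__ == "__main__":
    print(rejects_num_bytes_buggy(-5))  # negative int is not rejected by the old check
    print(rejects_num_bytes_fixed(-5))  # negative int is rejected by the assumed fix
    first = get_logger_sketch("demo")
    second = get_logger_sketch("demo")
    print(first is second, len(first.handlers))
```

Note that `bool` is a subclass of `int` in Python, so a check based on `isinstance(num_bytes, int)` also accepts `True`; whether that matters depends on the callers.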
feat(tensor_cast): Organize serving_cast.main output structure

Co-authored-by: Elrond G<elrondgcn@gmail.com>

Message auto-generated for no-merge-commit merge: !226 merge feature/develop/serving_cast_output_file into develop

Created-by: elrond-g

Commit-by: Elrond G

Merged-by: ascend-robot

Description:

# PR Template

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

**PR Type**

- [x] Feature
- [ ] Bugfix
- [ ] Docs
- [ ] CI/CD
- [ ] Refactor
- [ ] Perf
- [ ] Test-Cases
- [ ] Other

## 🔍 Motivation

**Please describe the motivation of this PR and the goal you want to achieve through this PR.**

Enable structured interaction between CLI tools and other services.

------

## 📝 Modification

**Please briefly describe what modification is made in this PR.**

Add a "--output-json" parameter to the CLI tool tensor_cast.scripts.text_generate that saves the run result to the specified file.

------

## 📐 Associated Test Results

**Please provide the related test results, such as test reports, etc.**

`python -m serving_cast.main --instance_config_path serving_cast/example/instances.yaml --common_config_path serving_cast/example/common.yaml --output_json output_json.json`

------

## 🌟 Use cases (Optional)

**If this PR introduces a new feature, it is better to list some use cases here and update the documentation.**

- args: `--output_json`
- type: string (file_name)
- default: None
- desc: If set, write the benchmark summary (per-metric table and overall summary) as a structured JSON to this file path.

------

## ✅ Checklist

**Before PR**:

- [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests.
- [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
- [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes.
- [ ] Please ensure code files contain no Chinese comments.

------

See merge request: Ascend/msmodeling!226

Last updated: 15 days ago
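fix(serving_cast): #27–#31 empty results, load_gen, ModelRunner, logger, communication validation Co-authored-by: welar<welar.ww@gmail.com> # message auto-generated for no-merge-commit merge: !171 merge fix/27-31-serving-cast-stability into develop fix(serving_cast): #27–#31 empty results, load_gen, ModelRunner, logger, communication validation Created-by: welar Commit-by: welar Merged-by: ascend-robot Description: ## Motivation - **#27**: After agg/disagg filtering the DataFrame may be empty, so .loc[0] raises KeyError and the throughput-optimization CLI cannot finish. - **#28**: get_load_gen branches on load_gen_type, but the exception message accesses load_gen_config.type, which can raise AttributeError and mask the real invalid type. - **#29**: ModelRunner.__init__ reads self.enable_interpolate before it is assigned, which may raise AttributeError on the multiprocess path. - **#30**: get_logger appends a StreamHandler on every call, causing duplicate log lines and extra overhead. - **#31**: num_bytes <= 0 and (not isinstance(num_bytes, int)) is always false for negative ints, so negative integers bypass validation. ## Self-verification - Static reading: ModelRunner assigns enable_interpolate before the mutual-exclusion check; the get_load_gen exception string uses load_gen_type. - _get_agg_disagg_final_out: for an empty sorted_summary_df, returns a hint list and logs a warning, with no .loc[0]. - host2device_sync / async / device2device_*: every non-positive int is rejected (including negative ints). - get_logger: repeated calls with the same name do not attach duplicate handlers (logger-level flag). Fixes #27. Fixes #28. Fixes #29. Fixes #30. Fixes #31. See merge request: Ascend/msmodeling!171 | 1 month ago | |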
[bugfix] Add a default value when ttft-limits is not specified Co-authored-by: wendellX<wenxiaoping6@huawei.com> # message auto-generated for no-merge-commit merge: !290 merge bugfix_ttft_param into develop [bugfix] Add a default value when ttft-limits is not specified Created-by: wendellX Commit-by: wendellX Merged-by: ascend-robot Description: # PR Template Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers. **PR Type** - [ ] Feature - [ ] Bugfix - [ ] Docs - [ ] CI/CD - [ ] Refactor - [ ] Perf - [ ] Test-Cases - [ ] Other ## 🔍 Motivation **Please describe the motivation of this PR and the goal you want to achieve through this PR.** ------ ## 📝 Modification **Please briefly describe what modification is made in this PR.** ------ ## 📐 Associated Test Results **Please provide the related test results, such as test reports, etc.** ------ ## 🌟 Use cases (Optional) **If this PR introduces a new feature, it is better to list some use cases here and update the documentation.** ------ ## ✅ Checklist **Before PR**: - [ ] Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests. - [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness. - [ ] All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes. - [ ] Please ensure code files contain no Chinese comments. ------ See merge request: Ascend/msmodeling!290 | 21 days ago | |
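The validation bug described under **#31** in merge request !171 can be illustrated with a minimal sketch. The first predicate is quoted from the commit message; the second is only one plausible corrected form, and both function names are hypothetical, not taken from the repository.

```python
def is_rejected_buggy(num_bytes):
    # Predicate quoted in the commit message: for a negative int the
    # isinstance() half is False, so the whole expression is False.
    return num_bytes <= 0 and (not isinstance(num_bytes, int))


def is_rejected_fixed(num_bytes):
    # Assumed correction: reject anything that is not a positive int.
    return (not isinstance(num_bytes, int)) or num_bytes <= 0


for value in (-4, 0, 1024):
    print(value, is_rejected_buggy(value), is_rejected_fixed(value))
```

With the quoted predicate, `-4` and `0` are both accepted; with the assumed correction, only `1024` passes.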
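The duplicate-handler problem described under **#30** in merge request !171 can be sketched as follows. This is a minimal illustration of the "logger-level flag" idea, not the repository's code; the attribute name `_handler_attached` and the logger name are assumptions.

```python
import logging


def get_logger(name):
    logger = logging.getLogger(name)
    # Hypothetical flag stored on the logger object; the real attribute
    # name is not given in the commit message.
    if not getattr(logger, "_handler_attached", False):
        logger.addHandler(logging.StreamHandler())
        logger._handler_attached = True
    return logger


first = get_logger("serving_cast.demo")
second = get_logger("serving_cast.demo")
print(first is second, len(first.handlers))
```

Because `logging.getLogger` returns the same object for the same name, the flag persists across calls and the handler is attached only once.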