File | Last commit | Last updated
[Doc] add dynamic chunked pipeline parallel guide (#8728)
### What this PR does / why we need it?
This PR adds a comprehensive guide for the Dynamic Chunked Pipeline Parallel (CPP) feature in vLLM-Ascend. It includes an overview of the strategy, technical details on the quadratic latency model, and configuration instructions.
### Does this PR introduce _any_ user-facing change?
No, this is a documentation-only update.
### How was this patch tested?
Documentation changes were verified by reviewing the rendered markdown content.
- vLLM version: v0.19.0
- vLLM main: https://github.com/vllm-project/vllm/commit/6f786f2c506cb07f4566771fdc62e640e2c4a176
---------
Signed-off-by: Jingchun Gao <gaojingchun1@huawei.com>
Signed-off-by: wangyu <wy02300127@antgroup.com>
Co-authored-by: wangyu <wy02300127@antgroup.com>
Last updated: 1 month ago
[Doc][310p] Add the 310p guide (#8640)
### What this PR does / why we need it?
Add a detailed 310 deployment tutorial.
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
NA
---------
Signed-off-by: Tflowers-0129 <2906339855@qq.com>
Last updated: 8 days ago
[Doc] add Mixtral-8x7B-Instruct-v0.1 model docs and config (#8537)
### What this PR does / why we need it?
This PR improves the scheduler profiling behavior for Mixtral workloads by refining chunk handling logic. Previously, the profiling process could lead to inaccurate scheduling results under certain conditions. This change ensures more stable and consistent behavior.
- vLLM version: v0.19.0
- vLLM main: https://github.com/vllm-project/vllm/commit/6f786f2c506cb07f4566771fdc62e640e2c4a176
---------
Signed-off-by: lihaofei-2026 <haofei@isrc.iscas.ac.cn>
Last updated: 9 days ago