
This example shows how to use launch_online_dp.py to launch external data parallel (DP) vLLM servers. Follow the steps below.

Modify parameters in run_dp_template.sh

run_dp_template.sh is a template script that launches a single data parallel (DP) vLLM instance. launch_online_dp.py calls it from multiple threads and sets most of its configuration. The parameters you need to set manually are:

  1. The IP and socket_ifname of your machine. When running on multiple nodes, make sure the script on each node is set to the IP and socket_ifname of that node.
  2. vLLM serving parameters, including model_path and other configurations. Note that port, the DP-related parameters, and tp_size are set by launch_online_dp.py. All other vLLM parameters in this file are only examples, and you are free to modify them to suit your needs.

Run launch_online_dp.py with command-line arguments

The arguments that users can set are:

  1. --dp-size: global data parallel size, required
  2. --tp-size: tensor parallel size, default 1
  3. --dp-size-local: local data parallel size, defaults to the value of --dp-size
  4. --dp-rank-start: starting data parallel rank on this node, default 0
  5. --dp-address: IP address of the data parallel master node
  6. --dp-rpc-port: port of the data parallel master node, default 12345
  7. --vllm-start-port: starting port of the vLLM serving instances, default 9000
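
The argument list above can be summarized as a small argparse sketch. This is not the parser used by launch_online_dp.py; the function names build_parser and parse_args are hypothetical, and only the flag names and defaults come from the list. It assumes that --dp-size-local falls back to --dp-size after parsing and that --dp-address has no default.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the documented flags; the real script may differ.
    parser = argparse.ArgumentParser(description="Launch external DP vLLM servers")
    parser.add_argument("--dp-size", type=int, required=True,
                        help="global data parallel size")
    parser.add_argument("--tp-size", type=int, default=1,
                        help="tensor parallel size")
    parser.add_argument("--dp-size-local", type=int, default=None,
                        help="local data parallel size (falls back to --dp-size)")
    parser.add_argument("--dp-rank-start", type=int, default=0,
                        help="first data parallel rank on this node")
    # No default is documented for the master address, so none is assumed here.
    parser.add_argument("--dp-address", type=str, default=None,
                        help="IP address of the data parallel master node")
    parser.add_argument("--dp-rpc-port", type=int, default=12345,
                        help="port of the data parallel master node")
    parser.add_argument("--vllm-start-port", type=int, default=9000,
                        help="first port used by the vLLM serving instances")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # Assumption: an unset local size means every DP rank runs on this node.
    if args.dp_size_local is None:
        args.dp_size_local = args.dp_size
    return args
```

Calling parse_args(["--dp-size", "4"]) in this sketch yields tp_size 1, dp_size_local 4, dp_rank_start 0, dp_rpc_port 12345, and vllm_start_port 9000.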

An example of running external DP on a single node:

cd examples/external_online_dp
# Run DP4 TP4 on one node with 16 NPUs
python launch_online_dp.py --dp-size 4 --tp-size 4 --dp-size-local 4 --dp-rank-start 0 --dp-address x.x.x.x --dp-rpc-port 12342

An example of running external DP across two nodes:

cd examples/external_online_dp
# Run DP4 TP4 across two nodes with 8 NPUs each

# On node 0:
python launch_online_dp.py --dp-size 4 --tp-size 4 --dp-size-local 2 --dp-rank-start 0 --dp-address x.x.x.x --dp-rpc-port 12342

# On node 1:
python launch_online_dp.py --dp-size 4 --tp-size 4 --dp-size-local 2 --dp-rank-start 2 --dp-address x.x.x.x --dp-rpc-port 12342
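
To make the relationship between the flags in the two-node example concrete, here is a minimal sketch of how the DP ranks on each node could be derived. The names InstancePlan, plan_local_instances, and npus_needed are hypothetical and do not come from launch_online_dp.py. The sketch also assumes that each local instance listens on a consecutive port starting at --vllm-start-port, which the README does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstancePlan:
    dp_rank: int  # global data parallel rank of this instance
    port: int     # serving port (assumed consecutive from the start port)


def plan_local_instances(dp_size: int, dp_size_local: int, dp_rank_start: int,
                         vllm_start_port: int = 9000) -> list[InstancePlan]:
    """List the DP instances one node would launch (illustrative only)."""
    if dp_size_local < 1 or dp_rank_start < 0:
        raise ValueError("dp_size_local must be >= 1 and dp_rank_start >= 0")
    if dp_rank_start + dp_size_local > dp_size:
        raise ValueError("local ranks exceed the global data parallel size")
    return [
        InstancePlan(dp_rank=dp_rank_start + i, port=vllm_start_port + i)
        for i in range(dp_size_local)
    ]


def npus_needed(dp_size_local: int, tp_size: int) -> int:
    # Each local DP instance spans tp_size devices.
    return dp_size_local * tp_size


# Two-node DP4 TP4 layout from the example above.
node0 = plan_local_instances(dp_size=4, dp_size_local=2, dp_rank_start=0)
node1 = plan_local_instances(dp_size=4, dp_size_local=2, dp_rank_start=2)
```

Under these assumptions, node 0 holds ranks 0 and 1, node 1 holds ranks 2 and 3, and each node needs 2 × 4 = 8 NPUs, which matches the comment in the example.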

(Optional) Run dp_load_balance_proxy_server.py to load-balance requests across external DP servers

With external DP, load balancing across the DP instances happens outside vLLM, so you need to provide your own proxy server. An example proxy server that balances load based on request length is provided here. dp_load_balance_proxy_server.py accepts the following arguments:

  1. --port: port of proxy server, default 8000
  2. --host: host address of proxy server, default localhost
  3. --dp-hosts: host addresses of external dp servers
  4. --dp-ports: ports of external dp servers; the number of ports must match the number of hosts
  5. --max-retries: Max number of retries for HTTP requests, default 3

For example, if you have two external DP servers running at x.x.x.a:10001 and x.x.x.b:10002, you can start the proxy server with:

python dp_load_balance_proxy_server.py --host x.x.x.c --port 8000 --dp-hosts x.x.x.a x.x.x.b --dp-ports 10001 10002

The proxy then serves as the entry point for inference requests at x.x.x.c:8000 and balances incoming requests between the two external DP servers according to request length.
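
The idea of request-length-aware balancing can be sketched as follows. This is not the logic of dp_load_balance_proxy_server.py; the class LengthAwareBalancer and its methods are hypothetical, and the policy shown (send each request to the backend with the smallest total length of in-flight requests) is one plausible reading of "request-length-aware".

```python
class LengthAwareBalancer:
    """Pick the backend with the least outstanding request length (illustrative)."""

    def __init__(self, backends: list[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        # Outstanding load per backend, measured in request length units.
        self._load = {backend: 0 for backend in backends}

    def acquire(self, request_length: int) -> str:
        # Ties go to the backend listed first.
        backend = min(self._load, key=self._load.get)
        self._load[backend] += request_length
        return backend

    def release(self, backend: str, request_length: int) -> None:
        # Call when the request finishes so its load no longer counts.
        self._load[backend] = max(0, self._load[backend] - request_length)


balancer = LengthAwareBalancer(["x.x.x.a:10001", "x.x.x.b:10002"])
first = balancer.acquire(1000)   # long request
second = balancer.acquire(10)    # short request goes to the idle backend
third = balancer.acquire(10)     # still lighter than the backend holding 1000
```

A real proxy would also forward the HTTP request, retry on failure (compare --max-retries), and call release when the response completes; those parts are omitted here.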