Custom Plugin Development Guide
Introduction
The optimization tool supports custom plugins. A plugin can customize the search parameter configuration, the service framework, and the performance testing tool.
Custom Plugin Development Steps
The steps are as follows:
- Create your own Python project as a plugin.
- Custom content:
Custom Configuration
- Inherit the Settings class
Settings is implemented with pydantic-settings; attributes can be added or removed in the subclass. For example:

    from ..config.config import Settings


    class CusSettings(Settings):
        name: str = "vllm-inference-optimization"

- Register the settings initialization function
Add a registration function to your Python project that registers the Settings initialization:

    def register():
        from vllm_inference_optimization.settings import CusSettings
        from ..config.config import register_settings

        register_settings(lambda: CusSettings())

- Use the settings
Import get_settings to obtain the registered settings:

    from ..config.config import get_settings

    settings = get_settings()
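The registration flow above can be pictured as a small factory registry. The sketch below uses only the standard library and is not the tool's actual implementation: the Settings stand-in, its default name, and the caching behaviour of get_settings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Settings:
    """Stand-in for the tool's Settings base class (hypothetical default)."""
    name: str = "default"


@dataclass
class CusSettings(Settings):
    """Plugin settings that override `name`, as in the example above."""
    name: str = "vllm-inference-optimization"


# Module-level registry state (assumed design, not taken from the tool).
_factory: Callable[[], Settings] = Settings
_instance: Optional[Settings] = None


def register_settings(func: Optional[Callable[[], Settings]] = None) -> None:
    """Remember the factory and drop any cached instance."""
    global _factory, _instance
    if func is not None:
        _factory = func
    _instance = None


def get_settings() -> Settings:
    """Build the settings lazily on first use, then reuse the same object."""
    global _instance
    if _instance is None:
        _instance = _factory()
    return _instance


register_settings(lambda: CusSettings())
settings = get_settings()
print(settings.name)  # vllm-inference-optimization
```

Registering a factory rather than an instance defers construction until the settings are first requested, which is presumably why the guide passes a lambda.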
Custom Service Framework
- Inherit msserviceprofiler...optimizer.interfaces.simulator.SimulatorInterface and implement the base_url and data_field properties, the update_command method, and so on. The interface is defined as follows:
class ms_service_profiler...optimizer.interfaces.simulator.SimulatorInterface()
Bases: ABC
Operate the service framework. Used for service-related operations.
abstract property data_field: Tuple[OptimizerConfigField] | None
Get data field property
Returns: Optional[Tuple[OptimizerConfigField]]
abstract property setter data_field: Tuple[OptimizerConfigField] | None
Set data field property
Returns: None
abstract update_command() → None
Update the service startup command based on data_field before service startup. Update the self.command property.
Returns: None
update_config(params: Tuple[OptimizerConfigField] | None = None) → bool
Update the service configuration file or other configurations based on parameters. Modify the configuration file before service startup according to the passed parameter values to make the new configuration take effect.
Args:
params: List of tuning parameters, a tuple, defined by the value and config position of each element.
Returns: bool, returns update success or failure.
abstract stop()
Other preparations at runtime.
Returns: None
- Register the service framework in the init file:

    from ..optimizer.register import register_simulator

    register_simulator("vllm_infer", VllmSimulator)
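To make the interface above concrete, here is a minimal sketch of what a VllmSimulator subclass might look like. It is self-contained: SimulatorInterface and OptimizerConfigField are reduced stand-ins that mirror only the documented members, and the constructor argument, the base_url value, and the rule that turns a field name into a command-line flag are assumptions rather than the tool's behaviour.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class OptimizerConfigField:
    """Reduced stand-in for the real OptimizerConfigField."""
    name: str
    value: Union[int, float, bool, None] = None


class SimulatorInterface(ABC):
    """Stand-in that mirrors only the abstract members documented above."""

    @property
    @abstractmethod
    def data_field(self) -> Optional[Tuple[OptimizerConfigField, ...]]: ...

    @data_field.setter
    @abstractmethod
    def data_field(self, value) -> None: ...

    @abstractmethod
    def update_command(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...


class VllmSimulator(SimulatorInterface):
    def __init__(self, base_command: List[str]):
        self._base_command = list(base_command)
        self._data_field: Optional[Tuple[OptimizerConfigField, ...]] = None
        self.command = list(base_command)

    @property
    def base_url(self) -> str:
        return "http://127.0.0.1:8000"  # hypothetical address of the service

    @property
    def data_field(self):
        return self._data_field

    @data_field.setter
    def data_field(self, value) -> None:
        self._data_field = value

    def update_command(self) -> None:
        # Assumed mapping: field "max_num_seqs" becomes the flag "--max-num-seqs".
        command = list(self._base_command)
        for field in self._data_field or ():
            if field.value is not None:
                command += ["--" + field.name.replace("_", "-"), str(field.value)]
        self.command = command

    def stop(self) -> None:
        pass  # a real plugin would terminate the service process here


sim = VllmSimulator(["vllm", "serve", "my-model"])
sim.data_field = (OptimizerConfigField("max_num_seqs", 64),)
sim.update_command()
print(sim.command)  # ['vllm', 'serve', 'my-model', '--max-num-seqs', '64']
```

Rebuilding the command from the stored base command on every call keeps flags from accumulating across tuning iterations.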
Custom Performance Testing Benchmark
- Inherit msserviceprofiler...optimizer.benchmark.BenchmarkInterface and implement the data_field property, the get_performance_index method, and so on. For example:

    from typing import Optional

    # VllmBenchmarkConfig and VllmBenchmarkCommand are defined by the plugin itself.
    class VllmBenchMark(BenchmarkInterface):
        def __init__(self, config: Optional[VllmBenchmarkConfig] = None, *args, **kwargs):
            if config:
                self.config = config
            else:
                # Fall back to the registered settings and check they belong to this plugin.
                settings = get_settings()
                if settings.name != "vllm-inference-optimization":
                    raise ValueError("Settings is invalid.")
                self.config = settings.vllm_benchmark
            super().__init__(*args, **kwargs)
            self.command = VllmBenchmarkCommand(self.config.command).command

- Register the benchmark:

    from ..optimizer.register import register_benchmarks

    register_benchmarks("vllm_infer_benchmark", VllmBenchMark)
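The get_performance_index method is where a benchmark plugin turns raw measurements into a PerformanceIndex. The following sketch shows one plausible way to compute the five metrics. PerformanceIndex here is a dataclass stand-in with the documented field names, RequestRecord is a hypothetical per-request record, and reporting success_rate as a fraction between 0 and 1 is an assumption (scale it if the tool expects a percentage).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PerformanceIndex:
    """Stand-in with the same field names as the PerformanceIndex model."""
    generate_speed: Optional[float] = None
    time_to_first_token: Optional[float] = None
    time_per_output_token: Optional[float] = None
    success_rate: Optional[float] = None
    throughput: Optional[float] = None


@dataclass
class RequestRecord:
    """Hypothetical per-request measurement parsed from benchmark output."""
    ok: bool
    ttft_s: float
    latency_s: float
    output_tokens: int


def summarize(records: List[RequestRecord], wall_time_s: float) -> PerformanceIndex:
    done = [r for r in records if r.ok]
    if not done or wall_time_s <= 0:
        # Nothing usable: report a zero success rate only if requests were sent.
        return PerformanceIndex(success_rate=0.0 if records else None)
    total_tokens = sum(r.output_tokens for r in done)
    # TPOT: decode time spread over the tokens generated after the first one.
    tpots = [
        (r.latency_s - r.ttft_s) / (r.output_tokens - 1)
        for r in done
        if r.output_tokens > 1
    ]
    return PerformanceIndex(
        generate_speed=total_tokens / wall_time_s,            # output tokens per second
        time_to_first_token=sum(r.ttft_s for r in done) / len(done),
        time_per_output_token=sum(tpots) / len(tpots) if tpots else None,
        success_rate=len(done) / len(records),                # fraction, not percent
        throughput=len(done) / wall_time_s,                   # completed requests per second
    )


records = [
    RequestRecord(ok=True, ttft_s=0.2, latency_s=1.2, output_tokens=11),
    RequestRecord(ok=True, ttft_s=0.4, latency_s=2.4, output_tokens=21),
    RequestRecord(ok=False, ttft_s=0.0, latency_s=0.0, output_tokens=0),
]
print(summarize(records, wall_time_s=4.0))
```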
Setting Plugin Entry Points
Add the registration function for the custom content to the entry point group '..plugins'.
For example, the following registers the plugin by calling the register function of the vllm_inference_optimization module:
pyproject.toml

    [project.entry-points.'..plugins']
    vllm_inference_optimization = "vllm_inference_optimization:register"
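For context, a host application typically discovers such entry points through importlib.metadata and calls each advertised function. The sketch below shows that general mechanism; it is not taken from the optimization tool, and the group name "example.plugins" is a placeholder for the real group used in pyproject.toml.

```python
from importlib.metadata import entry_points
from typing import List


def load_plugins(group: str) -> List[str]:
    """Call every register function advertised under `group`; return the entry names."""
    eps = entry_points()
    # Python 3.10+ exposes .select(); older versions return a plain dict of lists.
    selected = eps.select(group=group) if hasattr(eps, "select") else eps.get(group, [])
    loaded = []
    for ep in selected:
        register = ep.load()  # imports the module and resolves the attribute after ':'
        register()
        loaded.append(ep.name)
    return loaded


# "example.plugins" is a placeholder group name used only for this sketch.
print(load_plugins("example.plugins"))
```

Because entry points are read from installed package metadata, the plugin must be installed (for example with pip) before its register function can be found.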
Using Plugins
Select the plugin-implemented modules through the optimization tool's command-line parameters. For example, suppose a service framework vllm_infer and a performance testing client vllm_infer_benchmark have just been registered.
First, check that the supported engines and benchmark tools include the newly registered ones:

    msserviceprofiler optimizer -h
    options:
      -h, --help            show this help message and exit
      -lb, --load_breakpoint
                            Continue from where the last optimization was aborted.
      --backup              Whether to back up data.
      -e {vllm,vllm_infer}, --engine {vllm, vllm_infer}
                            Specifies the engine to be used.
      -b {vllm_benchmark,vllm_infer_benchmark}, --benchmark {vllm_benchmark, vllm_infer_benchmark}
                            Specified benchmark to be used.

Then specify the plugin implementations when running the optimization:

    msserviceprofiler optimizer -e vllm_infer -b vllm_infer_benchmark
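The way newly registered names show up as -e and -b choices can be illustrated with a small argparse sketch. The registries below are plain dictionaries with placeholder values, and register_simulator and register_benchmarks are simplified stand-ins; how the real tool builds its parser is not documented here, so treat this as an assumption.

```python
import argparse
from typing import Dict

# Hypothetical registries pre-filled with the built-in names.
simulators: Dict[str, object] = {"vllm": object}
benchmarks: Dict[str, object] = {"vllm_benchmark": object}


def register_simulator(name: str, cls: object) -> None:
    simulators[name] = cls


def register_benchmarks(name: str, cls: object) -> None:
    benchmarks[name] = cls


def build_parser() -> argparse.ArgumentParser:
    # Choices are read from the registries, so plugins must register first.
    parser = argparse.ArgumentParser(prog="optimizer")
    parser.add_argument("-e", "--engine", choices=sorted(simulators))
    parser.add_argument("-b", "--benchmark", choices=sorted(benchmarks))
    return parser


register_simulator("vllm_infer", object)
register_benchmarks("vllm_infer_benchmark", object)
args = build_parser().parse_args(["-e", "vllm_infer", "-b", "vllm_infer_benchmark"])
print(args.engine, args.benchmark)  # vllm_infer vllm_infer_benchmark
```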
Common Data Structure Definitions
- ..config.config.OptimizerConfigField
class ..config.config.OptimizerConfigField(*, name: str = 'max_batch_size', config_position: str = 'BackendConfig.ScheduleConfig.maxBatchSize', min: float = 0.0, max: float = 100.0, dtype: str = 'float', value: int | float | bool | None = None, dtype_param: Any = None)
Bases: BaseModel
Structure definition of optimization parameters.
config_position: str
Position definition. Two forms are currently supported: BackendConfig.ScheduleConfig, which modifies parameters in mindieconfig.json, and env, which sets the parameter as an environment variable whose name is the name attribute.
dtype: str
Parameter type definition.
dtype_param: Any
Additional conversion parameters supplied when converting data to the specified data type.
max: float
Maximum value of the parameter, used as the upper limit of its variation range.
min: float
Minimum value of the parameter, used as the lower limit of its variation range.
name: str
Field name. When the field is set as an environment variable, the name is converted to all uppercase and used as the variable name. config_position determines how the field value is updated.
value: int | float | bool | None
Parameter value, for example the value written into the configuration file or assigned to the environment variable.
- ..config.config.PerformanceIndex
class ..config.config.PerformanceIndex(*, generate_speed: float | None = None, time_to_first_token: float | None = None, time_per_output_token: float | None = None, success_rate: float | None = None, throughput: float | None = None)
Bases: BaseModel
Performance metrics obtained by the benchmark.
generate_speed: float | None
Output throughput (token/s); recommended.
success_rate: float | None
Percentage of requests that return successfully; recommended.
throughput: float | None
Requests per second (QPS); recommended.
time_per_output_token: float | None
Time per output token (TPOT); recommended.
time_to_first_token: float | None
Time to first token (TTFT); recommended.
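As an illustration of how min, max, and dtype might work together, the sketch below clamps a candidate value into the allowed range and converts it to the declared type. OptimizerConfigField is a dataclass stand-in with a subset of the documented fields. Only the 'float' dtype appears in this guide, so the 'int' and 'bool' names, the rounding rule, and the 0.5 threshold are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Union

Number = Union[int, float, bool]


@dataclass
class OptimizerConfigField:
    """Stand-in carrying a subset of the documented fields."""
    name: str = "max_batch_size"
    min: float = 0.0
    max: float = 100.0
    dtype: str = "float"
    value: Optional[Number] = None


def assign(field: OptimizerConfigField, raw: float) -> OptimizerConfigField:
    """Clamp `raw` into [field.min, field.max], then convert it according to dtype."""
    clamped = min(max(raw, field.min), field.max)
    if field.dtype == "int":        # assumed dtype name
        field.value = int(round(clamped))
    elif field.dtype == "bool":     # assumed dtype name and threshold
        field.value = clamped >= 0.5
    else:                           # 'float' is the documented default
        field.value = float(clamped)
    return field


print(assign(OptimizerConfigField(dtype="int", max=64.0), 80.3).value)  # 64
print(assign(OptimizerConfigField(), -5.0).value)                       # 0.0
```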
Configuration Customization
..config.config.register_settings
..config.config.register_settings(func: Callable | None = None) → None
Register custom settings by providing a function that generates the settings object.
Args:
func: Function to generate settings.
Returns: None
Benchmark Interface
class ..optimizer.benchmark.BenchmarkInterface()
Bases: ABC
property num_prompts: Tuple[OptimizerConfigField] | None
Get the num_prompts property, the number of requests used for data retrieval.
Returns: Optional[Tuple[OptimizerConfigField]]
property setter num_prompts: Tuple[OptimizerConfigField] | None
Set the num_prompts property, the number of requests used for data retrieval.
Returns: None
property data_field: Tuple[OptimizerConfigField] | None
Get data field property
Returns: Optional[Tuple[OptimizerConfigField]]
abstract property setter data_field: Tuple[OptimizerConfigField] | None
Set data field property
Returns: None
abstract get_performance_index() → PerformanceIndex
Get performance metrics
Returns: Metrics data class
abstract stop()
Other preparations at runtime.
Returns: None
abstract update_command() → None
Update the service startup command based on data_field before service startup. Update the self.command property.
Returns: None
Simulator Interface
class ..optimizer.interfaces.simulator.SimulatorInterface()
Bases: ABC
Operate the service framework. Used for service-related operations.
abstract property data_field: Tuple[OptimizerConfigField] | None
Get data field property
Returns: Optional[Tuple[OptimizerConfigField]]
abstract property setter data_field: Tuple[OptimizerConfigField] | None
Set data field property
Returns: None
abstract update_command() → None
Update the service startup command based on data_field before service startup. Update the self.command property.
Returns: None
update_config(params: Tuple[OptimizerConfigField] | None = None) → bool
Update the service configuration file or other configurations based on parameters. Modify the configuration file before service startup according to the passed parameter values to make the new configuration take effect.
Args:
params: List of tuning parameters, a tuple, defined by the value and config position of each element.
Returns: bool, returns update success or failure.
abstract stop()
Other preparations at runtime.
Returns: None
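The update_config description suggests that each parameter is applied according to its config_position. A minimal sketch of that dispatch follows, assuming the configuration file has already been loaded into a dictionary and that environment variables are collected in a separate mapping. The function name apply_fields and the reduced OptimizerConfigField stand-in are hypothetical, and reading or writing the JSON file is left out.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class OptimizerConfigField:
    """Reduced stand-in for the real OptimizerConfigField."""
    name: str
    config_position: str
    value: Any = None


def apply_fields(
    params: Optional[Tuple[OptimizerConfigField, ...]],
    config: Dict[str, Any],
    env: Dict[str, str],
) -> bool:
    """Write each value either into `env` or into the nested `config` dict."""
    for field in params or ():
        if field.value is None:
            continue
        if field.config_position == "env":
            # Environment variables use the upper-cased field name.
            env[field.name.upper()] = str(field.value)
            continue
        # Otherwise treat the position as a dotted path into the config.
        *parents, leaf = field.config_position.split(".")
        node = config
        for key in parents:
            node = node.setdefault(key, {})
            if not isinstance(node, dict):
                return False  # the path runs through a non-dict value
        node[leaf] = field.value
    return True


config = {"BackendConfig": {"ScheduleConfig": {"maxBatchSize": 8}}}
env: Dict[str, str] = {}
params = (
    OptimizerConfigField("max_batch_size", "BackendConfig.ScheduleConfig.maxBatchSize", 32),
    OptimizerConfigField("omp_num_threads", "env", 4),
)
ok = apply_fields(params, config, env)
print(ok, config, env)
```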
Install Plugin
Set the entry point group to '..plugins'. For example:

    [project.entry-points.'..plugins']
    vllm_inference_optimization = "vllm_inference_optimization:register"

Before using plugin mode, install the plugin from the plugin directory (make sure the current path contains pyproject.toml):

    pip install -e .