| File | Last commit | Last updated |
|---|---|---|
| | fix(score): normalize PD group metrics across the candidate pool. Per-group queue and running maxima collapse to binary busy/idle signals in 1P1D topologies, so groups with different load become indistinguishable. Collect pool-wide limits once per cycle and thread them through PDGroup and prediction fallback scoring. | 15 days ago |
| | fix(epp): address remaining codecheck findings | 1 month ago |
| | fix(prediction): treat KVCacheUsagePercent as a 0-1 fraction, not a percent. vllm:kv_cache_usage_perc is stored raw as a [0,1] fraction, but the prediction feature extractor and the badness scorers divided it by 100 again. That compressed real 10-90% usage into ~0.001-0.009 and effectively removed KV utilization as a routing signal, in the prediction fallback and the kvaware/autokvaware/pdkvaware baselines alike, since they share score.AggregateBadness / PrefillBadness / DecodeBadness. Replace the /100 (and normalizePercent) on KVCacheUsagePercent with Clamp01 in the feature extractor and the score badness paths. NPU utilization fields keep /100 since the exporter reports 0-100. Add a regression test pinning the fraction semantics. Note: models must be retrained after this change, because serving now feeds the unscaled fraction and an old model would see a 100x feature-scale shift. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> | 19 days ago |
| | fix(score): normalize PD group metrics across the candidate pool. Per-group queue and running maxima collapse to binary busy/idle signals in 1P1D topologies, so groups with different load become indistinguishable. Collect pool-wide limits once per cycle and thread them through PDGroup and prediction fallback scoring. | 15 days ago |
| | fix(score): normalize PD group metrics across the candidate pool. Per-group queue and running maxima collapse to binary busy/idle signals in 1P1D topologies, so groups with different load become indistinguishable. Collect pool-wide limits once per cycle and thread them through PDGroup and prediction fallback scoring. | 15 days ago |