| File | Last commit message | Last updated |
|---|---|---|
| | feat: upgrade management and routing improvements Major features: - Add elastic management for frequency tracking, load tracking, and routing broadcasts. - Add routing service APIs and expertkit-transport-rs with gRPC, shared-memory, RDMA, and load-balancing strategies. - Integrate worker-side weight management, peer weight transfer, byte-indexed weight serving. - Expand model/runtime support for DeepSeek v2/v3, Mixtral, Qwen3-MoE, vLLM integration, and Aliyun/local validation configs. Important fixes: - Fix fallback path stalls, worker disconnect fallback, torch backend device-index recognition, and DeepSeek-v2-lite runnable support. - Improve recovery by batching recovery upserts, deduplicating recovery triggers, keeping routing stable. - Align controller/worker liveness checks, routing subscriptions. | 11 days ago |
| | feat: vLLM plugin support (#53) * feat: add config for ek-vllm plugin feat: integrate vLLM with ek framework and fix token output issues fix: add missing shared expert output summation in DeepSeek MoE forward pass * typo: unify env variables from EXPERTKIT to EK fix: correct vllm output * doc: Update vLLM plugin docs for latest configuration method * fix: Change the example model path from a local directory path to a Hugging Face model name. | 1 year ago |
| | feat: update proto for expertkit-vllm, need fix some bug | 1 year ago |
| | feat: vLLM plugin support (#53) * feat: add config for ek-vllm plugin feat: integrate vLLM with ek framework and fix token output issues fix: add missing shared expert output summation in DeepSeek MoE forward pass * typo: unify env variables from EXPERTKIT to EK fix: correct vllm output * doc: Update vLLM plugin docs for latest configuration method * fix: Change the example model path from a local directory path to a Hugging Face model name. | 1 year ago |
| | feat[integration]: impl plugin for vllm, need for test | 1 year ago |
| | feat: upgrade management and routing improvements Major features: - Add elastic management for frequency tracking, load tracking, and routing broadcasts. - Add routing service APIs and expertkit-transport-rs with gRPC, shared-memory, RDMA, and load-balancing strategies. - Integrate worker-side weight management, peer weight transfer, byte-indexed weight serving. - Expand model/runtime support for DeepSeek v2/v3, Mixtral, Qwen3-MoE, vLLM integration, and Aliyun/local validation configs. Important fixes: - Fix fallback path stalls, worker disconnect fallback, torch backend device-index recognition, and DeepSeek-v2-lite runnable support. - Improve recovery by batching recovery upserts, deduplicating recovery triggers, keeping routing stable. - Align controller/worker liveness checks, routing subscriptions. | 11 days ago |
| | feat: upgrade management and routing improvements Major features: - Add elastic management for frequency tracking, load tracking, and routing broadcasts. - Add routing service APIs and expertkit-transport-rs with gRPC, shared-memory, RDMA, and load-balancing strategies. - Integrate worker-side weight management, peer weight transfer, byte-indexed weight serving. - Expand model/runtime support for DeepSeek v2/v3, Mixtral, Qwen3-MoE, vLLM integration, and Aliyun/local validation configs. Important fixes: - Fix fallback path stalls, worker disconnect fallback, torch backend device-index recognition, and DeepSeek-v2-lite runnable support. - Improve recovery by batching recovery upserts, deduplicating recovery triggers, keeping routing stable. - Align controller/worker liveness checks, routing subscriptions. | 11 days ago |