| File | Last commit | Last updated |
|---|---|---|
| | perf: use ggml operators to optimize cpu ffn forwarding (#94) * perf: use ggml operators to optimize cpu ffn forwarding * perf: supports bf16 on ggml backend * chore: make clippy happy * chore: align the types * chore: tuning & fix serialization * fix: fix padding and context size * feat: allow dropping cache after loading expert backend * chore: statically link ggml * feat: allocating tensor data from rust side * feat: allow specifying computation backend * chore: clippy & format * chore: tuning * chore: delete unused feature flags * chore: remove ggml-cuda * fix: ggml-cpu.h includes ggml.h * perf: single thread for better throughput | 10 months ago |
| | chore: clippy and format (#77) * chore: clippy * chore: manual fixes to make clippy happy * chore: cargo fmt | 10 months ago |
| | fix: Make Rdma connection establishment stable (#107) * feat: change is_connect judge logic * feat: change connection establish logic * feat: simplify rdma connection establish procedure * feat: RdmaEndpointServer shutdown after connection established. * chore: make compiler happy * feat: add retry logic for rdma connection establish * chore: using url:Url for robust url parsing * feat: add proper disconnect logic for connection rebuild feat: RdmaEndpointServer looped for connection rebuild feat: add proper close logic when disconnect from controller feat: graceful shutdown for rdmaEndpointServer * chore: cargo fmt --all * chore: detailed prepared_qp build error * chore: remove unnecessary _ for some fields in RdmaQueue * chore: remove unused variables | 7 months ago |
| | chore: set worker parallel via env | 10 months ago |
| | chore: clippy and format (#77) * chore: clippy * chore: manual fixes to make clippy happy * chore: cargo fmt | 10 months ago |
| | chore: clippy and format (#77) * chore: clippy * chore: manual fixes to make clippy happy * chore: cargo fmt | 10 months ago |
| | dev: worker abstract | 1 year ago |
| | fix: Make Rdma connection establishment stable (#107) * feat: change is_connect judge logic * feat: change connection establish logic * feat: simplify rdma connection establish procedure * feat: RdmaEndpointServer shutdown after connection established. * chore: make compiler happy * feat: add retry logic for rdma connection establish * chore: using url:Url for robust url parsing * feat: add proper disconnect logic for connection rebuild feat: RdmaEndpointServer looped for connection rebuild feat: add proper close logic when disconnect from controller feat: graceful shutdown for rdmaEndpointServer * chore: cargo fmt --all * chore: detailed prepared_qp build error * chore: remove unnecessary _ for some fields in RdmaQueue * chore: remove unused variables | 7 months ago |
| | feat: RDMA Queue impl (#102) * perf: local shared memory for controller-worker communication * chore: clippy & format * feat: raii for shm queue * chore: enlarge queue * chore: remove verbose log * chore: clippy & format * perf: tuning * chore: gracefully terminate worker (shm version) * fix: ExpertRegistry route mistake after rebalance action (#1) * docs: add some comments for dispatcher logic * feat: add new config for Commuicate Backend (grpc && shm) feat: add uniform get_registry for both backend * feat: set node to deactivate in db when node exit, avoiding wrong info used by schedule * chore: clippy & format * chore: reorganize shmq module * feat: rdma implementation * feat: can successfully establish connection * test: example for rdmaQueue * feat: impl rdma queue into registry and state service * Merge branch 'testing' into perf/rdma * feat: rdma runable * refactor: change write logic from "read remote" to "write remote" feat: add sleep for controller side after rdma connection established. * feat: add some debug info * feat: add interface to RdmaBytes for real lenth feat: rdma will only send real data to the remote * clippy & format --------- Co-authored-by: Yip Coekjan <cn_yzr@qq.com> | 9 months ago |
| | dev: move ffn to ek-computation | 1 year ago |
| | fix: Make Rdma connection establishment stable (#107) * feat: change is_connect judge logic * feat: change connection establish logic * feat: simplify rdma connection establish procedure * feat: RdmaEndpointServer shutdown after connection established. * chore: make compiler happy * feat: add retry logic for rdma connection establish * chore: using url:Url for robust url parsing * feat: add proper disconnect logic for connection rebuild feat: RdmaEndpointServer looped for connection rebuild feat: add proper close logic when disconnect from controller feat: graceful shutdown for rdmaEndpointServer * chore: cargo fmt --all * chore: detailed prepared_qp build error * chore: remove unnecessary _ for some fields in RdmaQueue * chore: remove unused variables | 7 months ago |
| | feat: RDMA Queue impl (#102) * perf: local shared memory for controller-worker communication * chore: clippy & format * feat: raii for shm queue * chore: enlarge queue * chore: remove verbose log * chore: clippy & format * perf: tuning * chore: gracefully terminate worker (shm version) * fix: ExpertRegistry route mistake after rebalance action (#1) * docs: add some comments for dispatcher logic * feat: add new config for Commuicate Backend (grpc && shm) feat: add uniform get_registry for both backend * feat: set node to deactivate in db when node exit, avoiding wrong info used by schedule * chore: clippy & format * chore: reorganize shmq module * feat: rdma implementation * feat: can successfully establish connection * test: example for rdmaQueue * feat: impl rdma queue into registry and state service * Merge branch 'testing' into perf/rdma * feat: rdma runable * refactor: change write logic from "read remote" to "write remote" feat: add sleep for controller side after rdma connection established. * feat: add some debug info * feat: add interface to RdmaBytes for real lenth feat: rdma will only send real data to the remote * clippy & format --------- Co-authored-by: Yip Coekjan <cn_yzr@qq.com> | 9 months ago |
| | Basic apm: add support for Prometheus exporting, Vector, Clickhouse and Grafana (#44) * dev: update ds-tiny pre-export weight * feat: basic metric for controller * feat: new worker metrics and support compress * feat: instrument worker and support several metrics * dev: setup basic apm infrastructure | 1 year ago |
| | chore: clippy and format (#77) * chore: clippy * chore: manual fixes to make clippy happy * chore: cargo fmt | 10 months ago |
| | perf: use ggml operators to optimize cpu ffn forwarding (#94) * perf: use ggml operators to optimize cpu ffn forwarding * perf: supports bf16 on ggml backend * chore: make clippy happy * chore: align the types * chore: tuning & fix serialization * fix: fix padding and context size * feat: allow dropping cache after loading expert backend * chore: statically link ggml * feat: allocating tensor data from rust side * feat: allow specifying computation backend * chore: clippy & format * chore: tuning * chore: delete unused feature flags * chore: remove ggml-cuda * fix: ggml-cpu.h includes ggml.h * perf: single thread for better throughput | 10 months ago |