| File | Last commit | Last updated |
|---|---|---|
| | perf: use ggml operators to optimize cpu ffn forwarding (#94) * perf: use ggml operators to optimize cpu ffn forwarding * perf: supports bf16 on ggml backend * chore: make clippy happy * chore: align the types * chore: tuning & fix serialization * fix: fix padding and context size * feat: allow dropping cache after loading expert backend * chore: statically link ggml * feat: allocating tensor data from rust side * feat: allow specifying computation backend * chore: clippy & format * chore: tuning * chore: delete unused feature flags * chore: remove ggml-cuda * fix: ggml-cpu.h includes ggml.h * perf: single thread for better throughput | 10 months ago |
| | fix: Make Rdma connection establishment stable (#107) * feat: change is_connect judge logic * feat: change connection establish logic * feat: simplify rdma connection establish procedure * feat: RdmaEndpointServer shutdown after connection established. * chore: make compiler happy * feat: add retry logic for rdma connection establish * chore: using url:Url for robust url parsing * feat: add proper disconnect logic for connection rebuild feat: RdmaEndpointServer looped for connection rebuild feat: add proper close logic when disconnect from controller feat: graceful shutdown for rdmaEndpointServer * chore: cargo fmt --all * chore: detailed prepared_qp build error * chore: remove unnecessary _ for some fields in RdmaQueue * chore: remove unused variables | 7 months ago |
| | dev: add more case for doctor | 1 year ago |
| | Feat: W8A16 Dequant implemented in torch (#24) * feat: add torch install role * feat: w8a16 dequant in CPU * Revert "feat: add torch install role" This reverts commit e62592825e9a9549eabb66c5a72371d036f52554. * fix: padding case * dev: load scale from weight server | 1 year ago |
| | fix: Make Rdma connection establishment stable (#107) * feat: change is_connect judge logic * feat: change connection establish logic * feat: simplify rdma connection establish procedure * feat: RdmaEndpointServer shutdown after connection established. * chore: make compiler happy * feat: add retry logic for rdma connection establish * chore: using url:Url for robust url parsing * feat: add proper disconnect logic for connection rebuild feat: RdmaEndpointServer looped for connection rebuild feat: add proper close logic when disconnect from controller feat: graceful shutdown for rdmaEndpointServer * chore: cargo fmt --all * chore: detailed prepared_qp build error * chore: remove unnecessary _ for some fields in RdmaQueue * chore: remove unused variables | 7 months ago |
| | fix: Make Rdma connection establishment stable (#107) * feat: change is_connect judge logic * feat: change connection establish logic * feat: simplify rdma connection establish procedure * feat: RdmaEndpointServer shutdown after connection established. * chore: make compiler happy * feat: add retry logic for rdma connection establish * chore: using url:Url for robust url parsing * feat: add proper disconnect logic for connection rebuild feat: RdmaEndpointServer looped for connection rebuild feat: add proper close logic when disconnect from controller feat: graceful shutdown for rdmaEndpointServer * chore: cargo fmt --all * chore: detailed prepared_qp build error * chore: remove unnecessary _ for some fields in RdmaQueue * chore: remove unused variables | 7 months ago |
| | feat:gpu support in backend (#46) * feat:gpu support in backend * Merge branch 'dev' into feat/gpu-support * feat: read config from settings | 11 months ago |
| | dev: implement basic control service | 1 year ago |