| 【feat】feat multi block_size in cache management | 1 day ago |
| support deepseek v4 | 1 month ago |
| [fix]update: after the refactor, adjust the compile cache location and directories; correct the qwen_moe configuration parameters | 15 days ago |
| 【feat】feat multi block_size in cache management | 1 day ago |
| [fix] Resolved the issue that the torchrun command does not exist when the inference is started on the one-stop platform. | 14 days ago |
| [feat]support mxfp8 inference of GLM-5 on 950 platform | 1 day ago |
| refactor: support online multi-batch inference | 1 month ago |