| [Refactor] comm_manager refactor | 23 days ago |
| [feat] mooncake + hixl, import, support models lazy import | 25 days ago |
| [refactor] mc2 use independent communication group | 1 month ago |
| [refactor] add page attention cache management | 1 month ago |
| [refactor] MTP refactor and R1 adaptation framework | 2 months ago |
| [feat]support mxfp8 inference of GLM-5 on 950 platform | 2 days ago |
| refactor: support online multi-batch inference | 1 month ago |
| [feat]prefill profiling | 2 months ago |
| [feat]support mxfp8 inference of GLM-5 on 950 platform | 2 days ago |