

slime-ascend

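This project runs primarily on Ascend NPU environments. 👉 Click here for the detailed usage guide (Ascend, Chinese) / Click here for Ascend NPU Guide

The main branch of this project is based on slime v0.2.4, which we adapt to NPU and extend incrementally. For more information about the slime main branch or other versions, please visit slime.

The following is the documentation from the original slime repository.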


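slime is an LLM post-training framework for RL scaling, providing two core capabilities:

High-performance training: supports efficient training in various modes by connecting Megatron with SGLang;
Flexible data generation: enables arbitrary training data generation workflows through custom data generation interfaces and a server-based engine.

slime is the RL training framework behind GLM-5, GLM-4.7, GLM-4.6, and GLM-4.5. In addition, slime supports:

Qwen3 series (Qwen3Next, Qwen3MoE, Qwen3) and Qwen2.5 series;
DeepSeek V3 series (DeepSeek V3, V3.1, DeepSeek R1);
Llama 3.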

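Blogs

Our vision: slime: An SGLang-Native Post-Training Framework for RL Scaling
Our ideas on fully asynchronous agentic training: Agent-Oriented Design: An Asynchronous and Decoupled Framework for Agentic RL
v0.1.0 release note: slime v0.1.0: Redefining High-Performance RL Training Frameworks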

Architecture Overview

[Figure: slime architecture diagram (arch)]

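Module descriptions:

training (Megatron): runs the main training process, reads data from the Data Buffer, and synchronizes parameters to the rollout module after training;
rollout (SGLang + router): generates new data (including reward/verifier outputs) and stores it in the Data Buffer;
data buffer: a bridge module that manages prompt initialization, custom data, and rollout generation methods.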
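
The interaction between the three modules above can be pictured with a toy loop. This is only a minimal sketch for illustration: the class names `DataBuffer`, `Rollout`, and `Trainer`, their methods, and the placeholder reward are all hypothetical and are not slime's actual API, and no Megatron or SGLang call is involved.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    response: str
    reward: float


class DataBuffer:
    """Hypothetical bridge: holds prompts and the samples produced by rollout."""

    def __init__(self, prompts):
        self.prompts = list(prompts)
        self.samples = deque()

    def get_prompts(self, n):
        return self.prompts[:n]

    def put(self, samples):
        self.samples.extend(samples)

    def get_batch(self):
        # Hand over everything collected so far and empty the buffer.
        batch = list(self.samples)
        self.samples.clear()
        return batch


class Rollout:
    """Hypothetical stand-in for the rollout side (SGLang + router in slime)."""

    def __init__(self):
        self.weights_version = 0

    def generate(self, prompts):
        # Placeholder generation and reward; a real system would call an
        # inference engine and a reward model or verifier here.
        return [
            Sample(p, f"response-v{self.weights_version}", float(len(p) % 2))
            for p in prompts
        ]

    def update_weights(self, version):
        self.weights_version = version


class Trainer:
    """Hypothetical stand-in for the training side (Megatron in slime)."""

    def __init__(self):
        self.version = 0

    def train(self, batch):
        # Placeholder "training step": only bumps the weights version.
        self.version += 1
        return self.version


def run(num_steps, prompts):
    buffer, rollout, trainer = DataBuffer(prompts), Rollout(), Trainer()
    history = []
    for _ in range(num_steps):
        buffer.put(rollout.generate(buffer.get_prompts(2)))  # rollout -> buffer
        batch = buffer.get_batch()                            # buffer -> training
        new_version = trainer.train(batch)
        rollout.update_weights(new_version)                   # training -> rollout
        history.append((len(batch), new_version))
    return history


history = run(3, ["a", "bb", "ccc"])
```

Each iteration follows the flow described above: rollout writes samples into the buffer, training consumes them, and the updated weights are pushed back to rollout before the next round of generation.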

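Quick Start

For a complete quick start guide covering environment setup, data preparation, training launch, and key code analysis, please refer to:

Quick Start Guide

We also provide usage examples not covered in the quick start; see examples.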

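Arguments Walkthrough

Arguments fall into three categories:

Megatron arguments: slime reads all arguments defined by the Megatron found on PYTHONPATH. You can configure Megatron by passing arguments such as --tensor-model-parallel-size 2;
SGLang arguments: all arguments of the SGLang installed in the environment are supported. These must be prefixed with --sglang-; for example, --mem-fraction-static must be passed as --sglang-mem-fraction-static;
slime's own arguments: see slime/utils/arguments.py

For complete usage instructions, please refer to the Usage Documentation.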
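
The prefix convention for SGLang arguments can be illustrated with plain `argparse`. This is a rough sketch of the idea only, not how slime actually implements it: the helpers `add_prefixed_arguments` and `extract_prefixed` are hypothetical names introduced for this example, and the real parsing logic lives in slime/utils/arguments.py.

```python
import argparse


def add_prefixed_arguments(parser, prefix, specs):
    """Register each (flag, type, default) under --<prefix>-<flag>."""
    for flag, typ, default in specs:
        parser.add_argument(f"--{prefix}-{flag}", type=typ, default=default)


def extract_prefixed(namespace, prefix):
    """Collect parsed values whose name starts with <prefix>_ and strip the prefix."""
    key = prefix + "_"
    return {
        name[len(key):]: value
        for name, value in vars(namespace).items()
        if name.startswith(key)
    }


parser = argparse.ArgumentParser()
# A Megatron-style argument is passed under its own name.
parser.add_argument("--tensor-model-parallel-size", type=int, default=1)
# An SGLang-style argument is exposed with the "sglang" prefix.
add_prefixed_arguments(parser, "sglang", [("mem-fraction-static", float, None)])

args = parser.parse_args(
    ["--tensor-model-parallel-size", "2", "--sglang-mem-fraction-static", "0.8"]
)
# Strip the prefix again before handing the values to the inference engine.
sglang_kwargs = extract_prefixed(args, "sglang")
```

Keeping the engine's options under a dedicated prefix avoids name clashes between the training and inference argument sets while still letting one command line configure both.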

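Developer Guide

Contributions are welcome! If you have feature suggestions, performance tuning ideas, or feedback on your experience, feel free to submit an Issue or PR 😊

Use pre-commit to keep the style of committed code consistent: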

apt install pre-commit -y
pre-commit install

# Run pre-commit to enforce code style
pre-commit run --all-files --show-diff-on-failure --color=always

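For debugging tips, see the Debugging Guide

FAQ & Acknowledgements

For frequently asked questions, see the Q&A
Special thanks to the following projects and communities: SGLang, Megatron-LM, mbridge, OpenRLHF, veRL, Pai-Megatron-Patch, and others.
To cite slime, please use: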

@misc{slime_github,
  author       = {Zilin Zhu and Chengxing Xie and Xin Lv and slime Contributors},
  title        = {slime: An LLM post-training framework for RL Scaling},
  year         = {2025},
  howpublished = {\url{https://github.com/THUDM/slime}},
  note         = {GitHub repository. Corresponding author: Xin Lv},
  urldate      = {2025-06-19}
}
