Rollout Buffer

Overview

Rollout Buffer is an independent component for asynchronous agent trajectory generation. Its main function is to generate agent trajectories using the OpenAI-compatible LLM server that slime training launches.

Workflow

slime Training Process ←─── HTTP API ───→ Rollout Buffer
        ↓                                      ↓
   LLM Server ←─────── HTTP Requests ─────── Agent Framework
        ↓                                      ↓
   Model Response ──────────────────────→ Trajectory Generation

Each type of agent task should have its own independent Generator class, which is responsible for generating trajectories for that task type. Rollout Buffer automatically discovers and loads the available Generators.

Quick Start

Basic Usage Process

Copy Template: Copy base_generator.py and use it as a template
Modify Task Type: Change TASK_TYPE to your task name (it must not duplicate the TASK_TYPE of any other Generator)
Implement Core Function: Implement the run_rollout() function
Optional Customization: Override any of the five optional functions as needed
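
The steps above can be sketched as a minimal Generator file. This is only an illustration under stated assumptions: the file name, the `data` argument, the `"prompts"` key, and the returned record shape are hypothetical, and the actual `run_rollout()` signature should be taken from base_generator.py.

```python
# your_task_generator.py (hypothetical example file placed in generator/)

# Must be unique across all Generators.
TASK_TYPE = "your_task"


def run_rollout(data: dict):
    """Generate trajectories for this task type.

    The `data` argument, its "prompts" key, and the returned record shape
    are assumptions made for illustration, not the real interface.
    """
    prompts = data.get("prompts", [])
    # A real implementation would query the LLM server here.
    # This stub only returns one empty-response record per prompt.
    return [
        {"task_type": TASK_TYPE, "prompt": prompt, "response": ""}
        for prompt in prompts
    ]
```

Starting from a copy of base_generator.py is still the recommended path; the stub only shows the two names every Generator file has to provide.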

Generator files must end with _generator.py and be placed in the generator/ directory:

generator/
├── base_generator.py      # Math task implementation (default template)
└── your_task_generator.py # Your custom task

Each Generator file must define TASK_TYPE and run_rollout().
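
The naming rule and the TASK_TYPE / run_rollout() contract suggest how automatic loading could work. The following is a rough sketch, not the actual loader in buffer.py: `discover_generators` is a hypothetical helper, and the choices to skip incomplete files and to raise on a duplicate TASK_TYPE are assumptions.

```python
import importlib.util
from pathlib import Path


def discover_generators(generator_dir):
    """Map each TASK_TYPE to the module that defines it (hypothetical helper)."""
    registry = {}
    # Only files ending in _generator.py are considered.
    for path in sorted(Path(generator_dir).glob("*_generator.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        task_type = getattr(module, "TASK_TYPE", None)
        run_rollout = getattr(module, "run_rollout", None)
        if task_type is None or not callable(run_rollout):
            # Assumption: files that do not satisfy the contract are skipped.
            continue
        if task_type in registry:
            # Assumption: duplicate task names are treated as an error.
            raise ValueError(f"duplicate TASK_TYPE: {task_type!r}")
        registry[task_type] = module
    return registry
```

Calling `discover_generators("generator")` would return a dictionary keyed by task name, so a request for a given task type can be dispatched to that module's `run_rollout()`.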

Rollout Buffer also provides several customizable functions to meet the special needs of different tasks. If a Generator does not supply its own implementation of one of them, the system uses the default implementation (located in slime_plugins/rollout_buffer/default_func.py).
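
The fallback to default implementations can be pictured as a simple attribute lookup. This is a sketch under assumptions rather than the code in buffer.py: `resolve_function` and the function name `filter_item` are hypothetical, and `SimpleNamespace` objects stand in for a Generator module and for default_func.py.

```python
from types import SimpleNamespace


def resolve_function(generator_module, default_module, name):
    """Return the Generator's own `name` if it defines one, else the default."""
    custom = getattr(generator_module, name, None)
    if callable(custom):
        return custom
    return getattr(default_module, name)


# Stand-ins for default_func.py and two Generator modules.
# `filter_item` is a hypothetical function name used only for illustration.
defaults = SimpleNamespace(filter_item=lambda item: True)
custom_gen = SimpleNamespace(filter_item=lambda item: item.get("reward", 0) > 0)
plain_gen = SimpleNamespace()

keep_custom = resolve_function(custom_gen, defaults, "filter_item")
keep_default = resolve_function(plain_gen, defaults, "filter_item")
```

With this pattern a Generator only defines the functions it wants to change, and everything else behaves as in default_func.py.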

Example Script

First, follow the "Example: Qwen3-4B Model" guide to configure the environment, download the data, and convert the model checkpoints. Then run the following scripts:

cd slime_plugins/rollout_buffer
bash rollout_buffer_example.sh

# In a different terminal
python buffer.py