| [pytorch][feature]Add RL algorithms (PPO, GRPO, DAPO)
Co-authored-by: fh_188 <fenghui32@huawei.com>
# message auto-generated for no-merge-commit merge:
!3613 merge fenzhi2 into master
[pytorch][feature]Add RL algorithms (PPO, GRPO, DAPO)
Created-by: weixin_44917616
Commit-by: fh_188
Merged-by: ascend-robot
Description: Add RL algorithms (PPO, GRPO, DAPO)
See merge request: Ascend/MindSpeed-LLM!3613 | 6 months ago |