
Magistral Reinforcement Learning

Environment Installation

Environment setup and dependency installation

conda create -n mistral_verl python=3.11
conda activate mistral_verl

pip install torch_npu==2.9.0

source /home/cann/ascend-toolkit/set_env.sh
source /home/cann/nnal/atb/set_env.sh

# Install vllm (enter the clone before checking out the pinned commit)
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout d7de043d55d1dd629554467e23874097e1c48993
VLLM_TARGET_DEVICE=empty pip install -e .
cd ..

# Install vllm-ascend
git clone https://github.com/vllm-project/vllm-ascend
cd vllm-ascend
git checkout 52d4acfa51fb868823d1070b81cbd2d97e9e4696
pip install -e .
cd ..

# Install verl
git clone https://github.com/verl-project/verl.git
cd verl
git checkout 4424616d7dfe03cc564866dc5e99dfaba1daba2e
pip install -r requirements.txt
pip install -v -e .
cd ..


# Install third-party libraries
pip install qwen-vl-utils==0.0.11 mathruler viztracer uvloop==0.21.0 setuptools==80.9.0

# Uninstall triton (if installed)
pip uninstall triton

# Install triton-ascend
pip install triton-ascend==3.2.0rc4

# Make sure transformers is installed at version 4.57.6
pip install transformers==4.57.6

git clone --branch 26.0.0 https://gitcode.com/Ascend/MindSpeed-MM.git
cd MindSpeed-MM
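
Because vllm, vllm-ascend, and verl are each pinned to a specific commit, it can help to confirm that every clone really sits on its pinned commit before going further. The following is a minimal sketch, not part of any of these repositories: `check_commit` is a hypothetical helper name, and it assumes `git` is on the PATH and that the three clones share one parent directory.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a git clone is checked out at a pinned commit.
# check_commit is a hypothetical helper, not provided by vllm, vllm-ascend, or verl.

check_commit() {
    local repo_dir="$1" expected="$2" actual
    # "git rev-parse HEAD" prints the full SHA of the checked-out commit
    actual="$(git -C "$repo_dir" rev-parse HEAD 2>/dev/null)" || {
        echo "ERROR: cannot read HEAD in $repo_dir"
        return 2
    }
    if [ "$actual" = "$expected" ]; then
        echo "OK: $repo_dir at $expected"
    else
        echo "MISMATCH: $repo_dir at $actual, expected $expected"
        return 1
    fi
}

# Usage, run from the directory that contains the three clones:
# check_commit vllm        d7de043d55d1dd629554467e23874097e1c48993
# check_commit vllm-ascend 52d4acfa51fb868823d1070b81cbd2d97e9e4696
# check_commit verl        4424616d7dfe03cc564866dc5e99dfaba1daba2e
```

A `MISMATCH` line usually means `git checkout` was run outside the clone or was skipped.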

Code replacement: replace the file verl/utils/vllm/utils.py under the verl directory with MindSpeed-MM/examples/verl_examples/mistral/utils.py.
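
The replacement step can be scripted so that the original verl file is kept as a backup. This is only a sketch under assumptions: `replace_with_backup` is a hypothetical helper name, and the usage line assumes the verl and MindSpeed-MM clones sit side by side in the current directory. Since verl was installed with `pip install -e`, overwriting the file in the clone should be what the installed package picks up.

```shell
#!/usr/bin/env bash
# Sketch: overwrite one file with another, keeping a backup of the original.
# replace_with_backup is a hypothetical helper name.

replace_with_backup() {
    local src="$1" dst="$2"
    [ -f "$src" ] || { echo "ERROR: source not found: $src"; return 1; }
    [ -f "$dst" ] || { echo "ERROR: target not found: $dst"; return 1; }
    # Keep the first backup so a rerun does not replace it with the patched file
    [ -e "$dst.bak" ] || cp -p "$dst" "$dst.bak"
    cp "$src" "$dst"
    echo "replaced $dst (backup: $dst.bak)"
}

# Usage, run from the directory that contains both verl and MindSpeed-MM:
# replace_with_backup MindSpeed-MM/examples/verl_examples/mistral/utils.py \
#                     verl/verl/utils/vllm/utils.py
```

To undo the change, copy the `.bak` file back over the target.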

Weight Download

Dataset Download

Run

In examples/verl_examples/mistral/mistral_lora_grpo.sh, set the CANN path and the data.train_files, data.val_files, actor_rollout_ref.model.path, and default_local_dir parameters. Run command:

bash examples/verl_examples/mistral/mistral_lora_grpo.sh
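
A wrong input path in the script typically only surfaces after the job has started, so a quick existence check before launching may save time. The sketch below is an assumption-laden illustration: `check_paths` is a hypothetical helper, and every `/path/to/...` value is a placeholder for whatever was written into the script. `default_local_dir` is left out because it is an output location that may not exist yet.

```shell
#!/usr/bin/env bash
# Sketch: report which of the given paths do not exist.
# check_paths is a hypothetical helper; the paths in the usage lines are placeholders.

check_paths() {
    local missing=0 p
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "MISSING: $p"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every path exists
    [ "$missing" -eq 0 ]
}

# Usage with placeholder paths (substitute the values set in the script):
# check_paths /home/cann/ascend-toolkit/set_env.sh \
#             /path/to/train_data /path/to/val_data /path/to/model \
#   && bash examples/verl_examples/mistral/mistral_lora_grpo.sh
```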