MindSpeed-LLM/mindspeed_llm/features_manager/megatron_basic · Ascend/MindSpeed-LLM - AtomGit

ascend-robot: feat(pytorch): support deepseekv4_flash in mcore backend

File	Last commit	Last updated
__init__.py	!3075 [pytorch][refactor] refactor pipeline and te features (merge pull request !3075 from shengjy/refactor_pp_te)	9 months ago
megatron_basic.py	feat(pytorch): support deepseekv4_flash in mcore backend. Co-authored-by: dingzicha1997 <dingzilin@huawei.com>. Merge request Ascend/MindSpeed-LLM!4420 (geneva2 into master), created and committed by dingzicha1997, merged by ascend-robot.	1 month ago
model_basic.py	feat(pytorch): Support torch_dist and async_save. Co-authored-by: z__y <z4t155664@163.com>. Adds torch_dist format checkpoint saving and asynchronous checkpoint saving; asynchronous saving only supports the torch_dist format. No user-facing change to existing workflows or APIs. Tests verify that both save modes work and that async_save only works with the torch_dist format. Merge request Ascend/MindSpeed-LLM!4361 (async_save_pretrain_no_generate into master), created and committed by z__y, merged by ascend-robot.	1 month ago
requirements_basic.py	[pytorch][bugfix] update checkpoint loading for LoRA and QLoRA. Co-authored-by: yanzhixiao <yanzhixiao@h-partners.com>. Merge request Ascend/MindSpeed-LLM!3522 (lora-load-ckpt into master), created by yanzhixiao23, committed by yanzhixiao, merged by ascend-robot.	6 months ago
training_basic.py	feat(pytorch): Support torch_dist and async_save. Co-authored-by: z__y <z4t155664@163.com>. Adds torch_dist format checkpoint saving and asynchronous checkpoint saving; asynchronous saving only supports the torch_dist format. No user-facing change to existing workflows or APIs. Tests verify that both save modes work and that async_save only works with the torch_dist format. Merge request Ascend/MindSpeed-LLM!4361 (async_save_pretrain_no_generate into master), created and committed by z__y, merged by ascend-robot.	1 month ago
transformer_engine_basic.py	feat(torch): fp8 reuse quant w and hccl op mode set. Co-authored-by: Jia_Austin <dengjia6@huawei.com>. User-facing flags: --fp8-reuse-quantized-weight, --hccl-op-mode. fp8_reuse_quantized_weight is only valid when FP8 training is enabled. Merge request Ascend/MindSpeed-LLM!4374 (core_r0.12.1_adaptive_hccl_and_fp8_reuse into master), created and committed by Jia_Austin, merged by ascend-robot.	1 month ago