MindSpeed-LLM/examples/mcore/glm52 · Ascend/MindSpeed-LLM - AtomGit

ascend-robot feat: add GLM5.2 model sh and docs

File	Last commit	Last updated
ckpt_convert_glm52_hf2mcore.sh	feat: Add adaptation support for GLM5.2 model. Co-authored-by: LinShua <707894133@qq.com>. Message auto-generated for no-merge-commit merge !4610 (merge master_glm5 into master). Created-by: LinShua. Commit-by: LinShua. Merged-by: ascend-robot. Description: What this PR does / why we need it: Add adaptation support for the GLM5.2 model. 1. Revise and extend the model configuration mapping, add the indexer_types field, and optimize parameter mapping for MoE shared experts. 2. Add dedicated weight conversion scripts for GLM5.2 to support bidirectional checkpoint conversion between HF and Mcore formats, with preconfigured parallelism, MoE, and MLA parameters. 3. Implement the share indexer capability, which can be enabled via the arguments --index-topk-freq and --index-skip-topk-offset. 4. Integrate data preprocessing into the training scripts and provide supporting environment variable scripts. 5. The MLA function requires enabling the parameters --apply-rope-no-in-complex and --no-use-sparse-c8-indexer. No breaking changes to external APIs. Internal tests have been completed, with normal inference and precision aligned with vllm-ascend. Does this PR introduce any user-facing change? NA. How was this patch tested? See the PR. See merge request: Ascend/MindSpeed-LLM!4610	3 days ago
ckpt_convert_glm52_mcore2hf.sh	feat: Add adaptation support for GLM5.2 model (merge request Ascend/MindSpeed-LLM!4610)	3 days ago
data_convert_glm52_pretrain.sh	feat: Add adaptation support for GLM5.2 model (merge request Ascend/MindSpeed-LLM!4610)	3 days ago
generate_glm52_744b_A3_ptd.sh	feat: Add adaptation support for GLM5.2 model (merge request Ascend/MindSpeed-LLM!4610)	3 days ago
pretrain_glm52_744b_4k_A3_ptd.sh	feat: add GLM5.2 model sh and docs. Co-authored-by: LinShua <707894133@qq.com>. Message auto-generated for no-merge-commit merge !4619 (merge master_glm5_docs2 into master). Created-by: LinShua. Commit-by: LinShua. Merged-by: ascend-robot. Description: What this PR does / why we need it: Revise GLM5.2 and relevant documentation descriptions. Does this PR introduce any user-facing change? NA. How was this patch tested? NA. See merge request: Ascend/MindSpeed-LLM!4619	3 days ago
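
The commit message for merge request !4610 names four command-line arguments: --index-topk-freq and --index-skip-topk-offset for the share indexer, and --apply-rope-no-in-complex and --no-use-sparse-c8-indexer for MLA. As a rough sketch of how such arguments might be grouped in a launch script, the snippet below collects them into bash arrays and prints them. Only the flag names come from the commit message. The variable names, the numeric values, and the assumption that the two indexer flags take one integer each while the two MLA flags are plain switches are all illustrative guesses, not taken from the actual scripts in this directory.

```shell
#!/usr/bin/env bash
# Sketch only: flag names are from the commit message of !4610;
# every value and variable name below is a hypothetical placeholder.
set -euo pipefail

# Placeholder values (assumptions); the real scripts may use different ones.
INDEX_TOPK_FREQ="${INDEX_TOPK_FREQ:-4}"
INDEX_SKIP_TOPK_OFFSET="${INDEX_SKIP_TOPK_OFFSET:-1}"

# Share indexer options (assumed here to take one integer value each).
INDEXER_ARGS=(
    --index-topk-freq "${INDEX_TOPK_FREQ}"
    --index-skip-topk-offset "${INDEX_SKIP_TOPK_OFFSET}"
)

# MLA options (assumed here to be boolean switches with no value).
MLA_ARGS=(
    --apply-rope-no-in-complex
    --no-use-sparse-c8-indexer
)

# Print the combined argument list, one token per line,
# instead of launching any training or inference job.
printf '%s\n' "${INDEXER_ARGS[@]}" "${MLA_ARGS[@]}"
```

Keeping each feature's flags in its own array makes it easy to drop or swap a group when comparing against the shipped pretrain and generate scripts, which remain the authoritative source for the real values.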