File | Last commit | Last updated
[pytorch][feature] fsdp2 supports megatron data preprocessing and pretrain
Co-authored-by: guozhihua <g00679814@china.huawei.com>
# message auto-generated for no-merge-commit merge: !4113 merge fsdp2_megatron_dataset_master into master
Created-by: guozhihua2
Commit-by: guozhihua
Merged-by: ascend-robot
Description:
1. fsdp2 supports Megatron-style multi-dataset preprocessing of pretraining data.
2. fsdp2 supports pretraining, with pretraining accuracy aligned with Megatron.
See merge request: Ascend/MindSpeed-LLM!4113
3 months ago
feat(pytorch): Training in TND with GDN support
Co-authored-by: mhh111 <mahonghao1@huawei.com>
# message auto-generated for no-merge-commit merge: !4354 merge 0327 into master
Created-by: mhh111
Commit-by: mhh111
Merged-by: ascend-robot
Description: feat(pytorch): Training in TND with GDN support
See merge request: Ascend/MindSpeed-LLM!4354
1 month ago
style(pytorch): SCA compliance rectification
Co-authored-by: zhyebin01 <zhangyebin@h-partners.com>
# message auto-generated for no-merge-commit merge: !4355 merge master into master
Created-by: zhyebin01
Commit-by: zhyebin01
Merged-by: ascend-robot
Description:
## What this PR does / why we need it?
SCA compliance rectification
## Does this PR introduce any user-facing change?
No
## How was this patch tested?
Pipeline test passed
See merge request: Ascend/MindSpeed-LLM!4355
2 months ago
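[pytorch][model] add qwen3-32b in fsdp2
Co-authored-by: guozhihua2 <guozhihua2@huawei.com>
# message auto-generated for no-merge-commit merge: !4212 merge qwen3-32b-fsdp2-master into master
Created-by: guozhihua2
Commit-by: guozhihua2
Merged-by: ascend-robot
Description:
1. Add qwen3-32b pretraining and fine-tuning for the fsdp2 scenario.
2. Change the fsdp2 pretraining data-loading logic so that it uses the same data path as fine-tuning.
3. Remove the pretraining data-processing logic that skipped processing when bin and idx files already existed; the dataset is now processed on every run.
4. Remove the tokenizer-name-or-path and data_parallel_mode parameters, consistent with fine-tuning.
5. For pretraining, only data_manager_type and stage need to be changed, to mg and pt respectively.
See merge request: Ascend/MindSpeed-LLM!4212
3 months ago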
feature(pytorch): FSDP2 support hardware-adaptive execution
Co-authored-by: zhyebin01 <zhangyebin@h-partners.com>
# message auto-generated for no-merge-commit merge: !4343 merge fsdp2_gpu into master
Created-by: zhyebin01
Commit-by: zhyebin01
Merged-by: ascend-robot
Description:
## What this PR does / why we need it?
FSDP2 support hardware-adaptive execution
## Does this PR introduce any user-facing change?
No
## How was this patch tested?
Pipeline test passed
See merge request: Ascend/MindSpeed-LLM!4343
2 months ago