mup)

Add mup neox args to your config

# mup

"use-mup": true,

"save-base-shapes": false, # this only needs to be enabled once in order to generate the base-shapes-file on each rank

"base-shapes-file": "base-shapes", # load base shapes from this file

"coord-check": false, # generate coord check plots to verify mup's implementation in neox

# mup hp search

"mup-init-scale": 1.0,

"mup-attn-temp": 1.0,

"mup-output-temp": 1.0,

"mup-embedding-mult": 1.0,

"mup-rp-embedding-mult": 1.0,

Generate base shapes

Set use-mup to true
Set save-base-shapes to true
Run once. gpt-neox will instantiate a base model and a delta model, then save one file per rank named .. gpt-neox will exit immediately.
Set save-base-shapes to false
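
The flag changes in the steps above can be sketched as a small helper. This is only an illustration on a plain Python dict: `base_shape_phases` is a hypothetical name, not part of gpt-neox, and the dict stands in for whatever config file is actually used.

```python
import copy


def base_shape_phases(config):
    """Return (one-off base-shape config, config for all later runs)."""
    generate = copy.deepcopy(config)
    generate["use-mup"] = True
    generate["save-base-shapes"] = True  # one-off run: save base shapes, then exit

    train = copy.deepcopy(generate)
    train["save-base-shapes"] = False  # later runs load from base-shapes-file
    return generate, train


# Stand-in config using the keys from the section above.
config = {
    "use-mup": False,
    "save-base-shapes": False,
    "base-shapes-file": "base-shapes",
}
generate_cfg, train_cfg = base_shape_phases(config)
print(generate_cfg["save-base-shapes"], train_cfg["save-base-shapes"])
```

The input dict is left unmodified, so the same starting config can be reused for both phases.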

Generate coord check plots (optional)

Keep use-mup true
Set coord-check to true
Run once. gpt-neox will output JPG images similar to those shown at https://github.com/microsoft/mutransformers/blob/main/README.md#coord-check, then exit immediately.
Set coord-check to false
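
To give a feel for what a coord check looks at, the toy below measures the average absolute size of output coordinates of a single random linear layer at several widths. It is a sketch under stated assumptions, not how gpt-neox produces its plots: the layer, the 1/width weight variance, and the function name `mean_abs_output` are all chosen for illustration. With this scaling the statistic should stay roughly the same as width grows, which is the kind of width-independence a coord check is meant to confirm.

```python
import math
import random


def mean_abs_output(width, rng):
    """Mean |y_i| for y = W x, with x_j ~ N(0, 1) and W_ij ~ N(0, 1/width)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    std = 1.0 / math.sqrt(width)  # weight standard deviation (assumed scaling)
    total = 0.0
    for _ in range(width):
        # One output coordinate: a freshly sampled weight row dotted with x.
        y_i = sum(rng.gauss(0.0, std) * x_j for x_j in x)
        total += abs(y_i)
    return total / width


rng = random.Random(0)
stats = {width: mean_abs_output(width, rng) for width in (64, 128, 256)}
for width, value in stats.items():
    print(width, round(value, 3))
```

If the weights were instead drawn with a width-independent standard deviation, the same statistic would grow with width, which is the pattern a failing coord check would show.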

Tune mup hyperparameters and LR

The parameters listed under "mup hp search" correspond to Appendix F.4 of https://arxiv.org/pdf/2203.03466.pdf. Tune them together with the learning rate (LR) by random search, using the scaled-up config (tested with 6-7B.yml) but with hidden-size set to the value from the scaled-down config (small.yml).
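
A minimal sketch of such a random search is given below. Everything specific in it is an assumption made for illustration: the sampling ranges, the log-uniform distribution, the flat `"lr"` key (the real config may store the learning rate elsewhere), the placeholder widths, and the helper names. Only the five mup parameter names and `hidden-size` come from the config above.

```python
import copy
import math
import random

MUP_HPS = [
    "mup-init-scale",
    "mup-attn-temp",
    "mup-output-temp",
    "mup-embedding-mult",
    "mup-rp-embedding-mult",
]


def log_uniform(rng, low, high):
    """Sample a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_trial(scaled_up, small_hidden_size, rng):
    """Copy the scaled-up config, shrink its width, and draw random HPs."""
    trial = copy.deepcopy(scaled_up)
    trial["hidden-size"] = small_hidden_size  # width from the scaled-down config
    trial["lr"] = log_uniform(rng, 1e-4, 1e-2)  # hypothetical key and range
    for name in MUP_HPS:
        trial[name] = log_uniform(rng, 0.25, 4.0)  # hypothetical range
    return trial


rng = random.Random(0)
scaled_up = {"use-mup": True, "hidden-size": 4096}  # placeholder width
trials = [sample_trial(scaled_up, 768, rng) for _ in range(4)]  # placeholder width
for trial in trials:
    print(round(trial["lr"], 5), round(trial["mup-init-scale"], 3))
```

Each trial would then be trained briefly and ranked by validation loss; that part depends on the training setup and is left out here.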

Transfer

Once the best LR and the best mup HPs are set, restore hidden-size in the scaled-up config to its original value and run again.
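
The transfer step amounts to copying the tuned values into the scaled-up config while leaving its width alone. The sketch below assumes configs are plain dicts and that the LR sits under a flat `"lr"` key; both are simplifications, and `transfer` is a hypothetical helper rather than a gpt-neox function.

```python
import copy


def transfer(scaled_up, best_trial, tuned_keys):
    """Copy tuned values into the scaled-up config, keeping its own hidden-size."""
    final = copy.deepcopy(scaled_up)
    for key in tuned_keys:
        final[key] = best_trial[key]
    return final


tuned_keys = [
    "lr",  # hypothetical flat key for the learning rate
    "mup-init-scale",
    "mup-attn-temp",
    "mup-output-temp",
    "mup-embedding-mult",
    "mup-rp-embedding-mult",
]

# Placeholder values standing in for the best trial of the search.
best_trial = {"hidden-size": 768, "lr": 0.003, "mup-init-scale": 0.5,
              "mup-attn-temp": 2.0, "mup-output-temp": 1.0,
              "mup-embedding-mult": 4.0, "mup-rp-embedding-mult": 1.0}
scaled_up = {"use-mup": True, "hidden-size": 4096}  # placeholder width

final_cfg = transfer(scaled_up, best_trial, tuned_keys)
print(final_cfg["hidden-size"], final_cfg["lr"])
```

Because `hidden-size` is not in `tuned_keys`, the small width used during the search never leaks into the final config.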