Scaling-nmt for PyTorch

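Overview

Introduction

Scaling-NMT is a model architecture for neural machine translation (NMT) that aims to address scalability in NMT. In conventional NMT models, model size and computation grow as the dataset grows, which makes training and inference very slow. Scaling-NMT addresses this with a hierarchical architecture and a dynamic routing mechanism. The model is divided into multiple levels, each containing several sub-models. Each sub-model processes only part of the input and passes its result to the next sub-model. This hierarchical design makes the model easier to scale, because each sub-model is much smaller and cheaper to compute than the whole model.

Reference implementation: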

url=https://github.com/facebookresearch/fairseq/tree/v0.12.2/examples/scaling_nmt
commit_id=4a388e64cd646ed7d7ad8de8fae55df2b8eea91d

Implementation adapted for Ascend AI processors:

url=https://gitcode.com/ascend/ModelZoo-PyTorch.git
code_path=PyTorch/built-in/nlp

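Preparing the training environment

Preparing the environment

The PyTorch versions supported by this model and the known third-party dependencies are listed in the table below.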

Table 1 Supported versions

Torch_Version	Third-party dependency versions
PyTorch 1.5	-
PyTorch 1.8	torchvision==0.9.1
PyTorch 1.11	-

Environment setup guide.

See the guide 《Pytorch框架训练环境准备》 (PyTorch Framework Training Environment Preparation).

Install the dependencies.

pip uninstall fairseq
pip install -e ./
pip install -r 1.8_requirements.txt

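Preparing the dataset

Obtain the dataset.

Download the WMT'16 En-De dataset yourself. Create a wmt16_en_de_bpe32k folder in the root directory of the source package, upload the dataset to the wmt16_en_de_bpe32k directory, and extract it there.

Preprocess the data. Create a data-bin folder in the root directory of the source package, then run the following command.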

fairseq-preprocess \
 --source-lang en --target-lang de \
 --trainpref wmt16_en_de_bpe32k/train.tok.clean.bpe.32000 \
 --validpref wmt16_en_de_bpe32k/newstest2013.tok.bpe.32000 \
 --testpref wmt16_en_de_bpe32k/newstest2014.tok.bpe.32000 \
 --destdir data-bin/wmt16_en_de_bpe32k \
 --nwordssrc 32768 --nwordstgt 32768 \
 --joined-dictionary \
 --workers 20

After preprocessing, the dataset directory should look like this.

|——data-bin
    |——wmt16_en_de_bpe32k
        ├── dict.de.txt
        ├── dict.en.txt
        ├── preprocess.log
        ├── test.en-de.de.bin
        ├── test.en-de.de.idx
        ├── test.en-de.en.bin
        ├── test.en-de.en.idx
        ├── train.en-de.de.bin
        ├── train.en-de.de.idx
        ├── train.en-de.en.bin
        ├── train.en-de.en.idx
        ├── valid.en-de.de.bin
        ├── valid.en-de.de.idx
        ├── valid.en-de.en.bin
        └── valid.en-de.en.idx
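
One way to confirm that preprocessing produced the files listed above is a small shell check. This is a minimal sketch: `check_data_bin` is a hypothetical helper, not part of fairseq or this repository, and it only tests that the files exist, not that their contents are valid.

```shell
# Hypothetical helper: report every expected file that is missing from DIR.
# Returns 0 when the listing above is complete, 1 otherwise.
check_data_bin() {
    local dir="$1" missing=0 f split lang ext
    for f in dict.en.txt dict.de.txt; do
        [ -f "$dir/$f" ] || { echo "missing: $dir/$f"; missing=1; }
    done
    for split in train valid test; do
        for lang in en de; do
            for ext in bin idx; do
                f="$split.en-de.$lang.$ext"
                [ -f "$dir/$f" ] || { echo "missing: $dir/$f"; missing=1; }
            done
        done
    done
    return "$missing"
}
```

Calling `check_data_bin data-bin/wmt16_en_de_bpe32k` from the source root prints nothing when all 14 binarized files and dictionaries are in place (preprocess.log is not checked).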

Note: The training scripts for this dataset are provided only as a reference example.

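Starting training

Training the model

Change to the root directory of the extracted source package.
```
cd /${model_folder_name}
```

Run the training script.

This model supports single-node single-card training and single-node 8-card training.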

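Single-node, single-card training

Start single-card training:

bash ./test/train_performance_1p.sh --data_path=$data_path  # single-card performance

Single-node, 8-card training

Start 8-card training:

bash ./test/train_full_8p.sh --data_path=$data_path  # 8-card accuracy (full training)
bash ./test/train_performance_8p.sh --data_path=$data_path  # 8-card performance

The training script parameters are described below.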

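Common parameters:
--arch                                     //model architecture
--optimizer                                //optimizer
--adam-betas                               //Adam beta coefficients
--lr                                       //initial learning rate
--warmup-updates                           //number of warmup updates
--dropout                                  //dropout probability
--weight-decay                             //weight decay
--criterion                                //loss function
--max-tokens                               //maximum number of tokens per batch
--fp16                                     //use fp16
--keep-last-epochs                         //number of most recent epoch checkpoints to keep
--distributed-world-size                   //number of training cards
--device-id                                //card to train on
--max-update                               //maximum number of training updates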
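
To show how the flags above fit together, here is a hedged sketch of an 8-card `fairseq-train` invocation. The values are illustrative: most follow the upstream fairseq scaling_nmt example, `--keep-last-epochs 10` is an assumption, and the scripts under ./test may set different values or additional flags. The `run` wrapper and `DRY_RUN` variable are hypothetical; by default the command is only printed.

```shell
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
# The printed form is for inspection only; shell quoting is not reproduced.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Illustrative values only; check the scripts under ./test for the real ones.
run fairseq-train data-bin/wmt16_en_de_bpe32k \
    --arch transformer_vaswani_wmt_en_de_big \
    --optimizer adam --adam-betas '(0.9, 0.98)' \
    --lr 0.0005 --warmup-updates 4000 \
    --dropout 0.3 --weight-decay 0.0 \
    --criterion label_smoothed_cross_entropy \
    --max-tokens 3584 \
    --fp16 \
    --keep-last-epochs 10 \
    --distributed-world-size 8 \
    --max-update 300000
```

For a single-card run, `--distributed-world-size 1` together with `--device-id` would select one card instead.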

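After training finishes, the weight files are saved under the current path, and the model's training accuracy and performance information is printed.

Evaluating the model.

First, use the average_checkpoints.py script to average the last few checkpoints. Averaging the last 5-10 checkpoints usually works well, but you may need to adjust this number depending on how long you trained.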
```
python scripts/average_checkpoints.py \
   --inputs /path/to/checkpoints \
   --num-epoch-checkpoints 10 \
   --output checkpoint.avg10.pt
```
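
Before averaging, it can help to see which files `--num-epoch-checkpoints 10` is likely to pick up. The sketch below assumes fairseq's usual naming, where per-epoch checkpoints are called `checkpoint<N>.pt`; `last_epoch_checkpoints` is a hypothetical helper that approximates the selection and is not part of the script itself.

```shell
# Hypothetical helper: list the N highest-numbered epoch checkpoints in DIR.
last_epoch_checkpoints() {
    local dir="$1" n="$2" f
    for f in "$dir"/checkpoint*.pt; do
        f="${f##*/}"         # strip the directory part
        f="${f#checkpoint}"  # strip the "checkpoint" prefix
        f="${f%.pt}"         # strip the ".pt" suffix
        case "$f" in
            ''|*[!0-9]*) ;;  # skip checkpoint_best.pt, checkpoint_last.pt, etc.
            *) echo "$f" ;;  # keep pure epoch numbers
        esac
    done | sort -n | tail -n "$n" | sed 's/.*/checkpoint&.pt/'
}
```

For example, `last_epoch_checkpoints /path/to/checkpoints 10` prints up to ten file names in ascending epoch order.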

Next, generate translations using beam=4 and lenpen=0.6.

fairseq-generate \
   data-bin/wmt16_en_de_bpe32k \
   --path checkpoint.avg10.pt \
   --beam 4 --lenpen 0.6 --remove-bpe > gen.out

Finally, compute the BLEU score.

bash scripts/compound_split_bleu.sh gen.out

Training results

Table 2 Training results

Name	BLEU	WPS	max-update	MODE	Torch_Version
1P-Competitor V	-	22743.4	1000	fp16	1.8
8P-Competitor V	28.33	110441	300000	fp16	1.8
1P-NPU	-	17729.7	1000	fp16	1.8
8P-NPU	28.47	84627.5	300000	fp16	1.8
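
As a quick reading of Table 2, the WPS column can be turned into a relative throughput figure: NPU words per second as a percentage of the competitor's. The sketch below uses awk for the arithmetic; `rel_throughput` is a name made up for illustration.

```shell
# rel_throughput NPU_WPS REF_WPS
# Prints NPU throughput as a percentage of the reference, to one decimal place.
rel_throughput() {
    awk -v npu="$1" -v ref="$2" 'BEGIN { printf "%.1f\n", 100 * npu / ref }'
}

rel_throughput 17729.7 22743.4   # 1P rows: prints 78.0
rel_throughput 84627.5 110441    # 8P rows: prints 76.6
```

On these figures the NPU runs reach roughly 77-78% of the competitor's throughput, while the 8P BLEU score is slightly higher (28.47 versus 28.33).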

Release notes

Changes

2023.07.04: Initial release.

FAQ

None.