File | Last commit | Last updated
fix: fix atten_mask_shape error when using transformer_engine
Co-authored-by: Keilo_W <wangkaiyu11@h-partners.com>
# message auto-generated for no-merge-commit merge: !3293 merge master into master
Created-by: Keilo_W
Commit-by: Keilo_W
Merged-by: ascend-robot
Description: An atten_mask_shape error occurs when --attention-mask-type causal is used together with --transformer-impl transformer_engine. To avoid it, also enable the --use-flash-attn option.
See merge request: Ascend/MindSpeed!3293
Last updated: 2 months ago
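The flag constraint in the description above can be illustrated with a small pre-launch check. This is a minimal sketch, not MindSpeed's own argument validation: the three flag names come from the description, while the parser defaults and the helper names `build_parser` and `check_attention_args` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Toy parser exposing only the three flags named in the description."""
    parser = argparse.ArgumentParser()
    # Defaults here are assumptions for illustration, not MindSpeed's defaults.
    parser.add_argument("--attention-mask-type", default="causal")
    parser.add_argument("--transformer-impl", default="local")
    parser.add_argument("--use-flash-attn", action="store_true")
    return parser


def check_attention_args(args: argparse.Namespace) -> argparse.Namespace:
    """Reject the combination that the description says needs --use-flash-attn."""
    if (
        args.attention_mask_type == "causal"
        and args.transformer_impl == "transformer_engine"
        and not args.use_flash_attn
    ):
        raise ValueError(
            "--attention-mask-type causal with --transformer-impl "
            "transformer_engine also requires --use-flash-attn"
        )
    return args
```

Running such a check before launching a job would surface the misconfiguration as a readable message rather than a shape error deep inside the attention code.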
Add offline pad_data
Co-authored-by: wuweiqiang24 <wuweiqiang11@huawei.com>
# message auto-generated for no-merge-commit merge: !2938 merge revise_preprocess_data into master
Created-by: wuweiqiang24
Commit-by: wuweiqiang24
Merged-by: ascend-robot
Description: Adds offline preprocessing for packed datasets. Data can be padded in advance to a multiple of 2\*CP, which saves the time otherwise spent on padding when the CP feature is used online.
* Accuracy differs somewhat from the version without offline padding. ![2.png](https://raw.gitcode.com/user-images/assets/7404741/07e65a36-a1cd-4f79-ab62-832febdfa052/2.png '2.png')
* With Llama2-7b on a single node at 16k and GBS=8, performance improves by 4.8%. ![性能提升.png](https://raw.gitcode.com/user-images/assets/7404741/e38f0a9b-e0b2-498a-a42b-c8b59ec05e87/性能提升.png '性能提升.png')
See merge request: Ascend/MindSpeed!2938
Last updated: 6 months ago
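The idea of padding to a multiple of 2\*CP can be sketched in a few lines. This is an illustration of the arithmetic only, not the code added by the merge request: the function names `padded_length` and `pad_sequence` and the `pad_id` value are hypothetical.

```python
def padded_length(seq_len: int, cp_size: int) -> int:
    """Round seq_len up to the nearest multiple of 2 * cp_size."""
    multiple = 2 * cp_size
    return ((seq_len + multiple - 1) // multiple) * multiple


def pad_sequence(tokens: list[int], cp_size: int, pad_id: int = 0) -> list[int]:
    """Return a copy of tokens right-padded with pad_id (an assumed value)."""
    target = padded_length(len(tokens), cp_size)
    return tokens + [pad_id] * (target - len(tokens))
```

Doing this once during offline preprocessing means each sample already has a length divisible by 2\*CP when it is loaded, so no padding work is needed at training time.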