MindSpeed-Ops: a custom operator project for training workloads on the Ascend ecosystem

Provides Ascend-optimized implementations of custom operators for training workloads.

MindSpeed-Ops

Introduction

MindSpeed-Ops provides Ascend-optimized implementations of custom operators for training workloads.

Version information

| Software | Version |
| --- | --- |
| MindSpeed Ops branch | master |
| CANN | 9.0.0 |
| PyTorch | 2.7.1 |
| triton-ascend | 3.2.1 |
| Python | 3.10.x |
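The version table above can be checked against a local environment with a short script. The following is a minimal sketch, not part of MindSpeed-Ops: the distribution names `torch` and `triton-ascend` are assumptions about how the packages are published, and the CANN version is not covered because it is not installed through pip.

```python
import sys
from importlib import metadata

# Expected versions copied from the version table.
# The distribution names are assumptions; adjust them for your environment.
EXPECTED = {"torch": "2.7.1", "triton-ascend": "3.2.1"}
EXPECTED_PYTHON = (3, 10)


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches(installed, expected):
    """True if `installed` equals `expected` or only adds a suffix such as '+cpu'."""
    if installed is None:
        return False
    return (
        installed == expected
        or installed.startswith(expected + ".")
        or installed.startswith(expected + "+")
    )


def report():
    """Collect (name, installed, expected, ok) rows for Python and each package."""
    py_installed = "%d.%d" % sys.version_info[:2]
    py_expected = "%d.%d" % EXPECTED_PYTHON
    rows = [("python", py_installed, py_expected, py_installed == py_expected)]
    for name, expected in EXPECTED.items():
        found = installed_version(name)
        rows.append((name, found, expected, matches(found, expected)))
    return rows


if __name__ == "__main__":
    for name, found, expected, ok in report():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={found} expected={expected} [{status}]")
```

A mismatch reported here only means the environment differs from the table; whether a different version still works is not something this sketch can tell.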

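Installation

Install dependencies

Before installing MindSpeed-Ops, refer to the version compatibility table and install the matching Ascend software stack. The required software is listed below: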

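| Dependency | Installation guide |
| --- | --- |
| Ascend NPU driver | Driver and firmware installation guide |
| Ascend NPU firmware | Driver and firmware installation guide |
| Toolkit (development kit) | CANN software installation guide |
| Kernel (operator package) | CANN software installation guide |
| NNAL (Ascend Transformer Boost acceleration library) | CANN software installation guide |
| PyTorch | Ascend Extension for PyTorch configuration and installation |
| torch_npu plugin | Ascend Extension for PyTorch configuration and installation |
| apex | Ascend Extension for PyTorch configuration and installation |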

Get the MindSpeed-Ops source and install it

git clone https://gitcode.com/Ascend/MindSpeed-Ops.git

cd MindSpeed-Ops
# If mindspeed_ops is already installed, uninstall it first and then reinstall
pip install -e . --extra-index-url=https://triton-ascend.osinfra.cn/pypi/simple
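After the install command finishes, a quick way to confirm the environment is to check that the expected modules can be located. This is a minimal sketch under assumptions: the import name `mindspeed_ops` is taken from the uninstall note above, and `torch`, `torch_npu`, and `triton` are assumed to be the import names of the dependencies.

```python
import importlib.util

# Import names are assumptions; only "mindspeed_ops" appears in the text above.
MODULES_TO_CHECK = ["mindspeed_ops", "torch", "torch_npu", "triton"]


def is_importable(module_name):
    """Return True if the module can be located, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for a broken parent package or a module without a spec.
        return False


def missing_modules(names):
    """Return the subset of `names` that cannot be located."""
    return [name for name in names if not is_importable(name)]


if __name__ == "__main__":
    missing = missing_modules(MODULES_TO_CHECK)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Locating a module does not prove it loads correctly on the NPU; it only rules out a missing or misnamed install.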

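Supported operators

| Operator | Description | Released | triton | tilelang |
| --- | --- | --- | --- | --- |
| Add | Addition operator | | | |
| FusedCrossEntropyLoss | Custom fused operator | | | |
| RmsnormWithoutWeight | Custom fused operator | | | |
| Sinkhorn | Sinkhorn algorithm | | | |
| Causal_conv1d | causal_conv1d operator | | | |
| Wy_fast | wy_fast operator | | | |
| RmsNormGated | RmsNormGated operator | | | |
| chunk_kda_bwd_wy_dqkg_fused | KDA chunk backward fused operator | | | |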
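The operator table gives names only, not definitions. For orientation, the sketch below shows the commonly used mathematical meaning of two of those names in plain Python. It is an assumption about the semantics, not the MindSpeed-Ops API or its kernels: the function names, the `eps` default, and the one-dimensional list inputs are all hypothetical.

```python
import math


def rms_norm_without_weight(x, eps=1e-6):
    """Divide a vector by its root mean square; no learnable scale is applied."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]


def causal_conv1d(x, weight):
    """Causal 1-D convolution with zero left padding.

    out[t] = sum_j weight[j] * x[t - (K - 1) + j], so out[t] never reads
    any input position later than t.
    """
    k = len(weight)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j in range(k):
            idx = t - (k - 1) + j
            if idx >= 0:  # positions before the start count as zero
                acc += weight[j] * x[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    print(rms_norm_without_weight([3.0, 4.0]))
    print(causal_conv1d([1.0, 2.0, 3.0], [1.0, 1.0]))
```

A scalar reference like this is mainly useful as a ground truth when comparing the output of an optimized kernel on small inputs.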

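Branch maintenance policy

🛠️ MindSpeed-Ops release branches go through the following maintenance phases:

| Status | Duration | Description |
| --- | --- | --- |
| Planning 🕐 | 1-3 months | Features are planned |
| Development 🕔 | 3 months | Features are developed |
| Maintained 🕚 | 6-12 months | All resolved issues are merged and releases are published. The maintenance policy depends on the MindSpeed-Ops version type: regular versions are maintained for 6 months and long-term support versions for 12 months |
| Unmaintained 🕛 | 0-3 months | All resolved issues are merged; there are no dedicated maintainers and no releases |
| End of life (EOL) 🚫 | N/A | The branch no longer accepts any changes |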
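The 6-month and 12-month maintenance periods can be turned into a concrete date with a little calendar arithmetic. This is only an illustrative sketch: it assumes the period is counted from the release date, which the policy above does not state, and the names `regular` and `lts` are labels chosen for this example.

```python
import calendar
from datetime import date

# Maintenance periods from the policy table, in months.
MAINTENANCE_MONTHS = {"regular": 6, "lts": 12}


def add_months(d, months):
    """Return `d` shifted by `months`, clamping the day to the month's length."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def maintenance_end(release_date, kind):
    """End of the maintained phase, assuming it starts on the release date."""
    return add_months(release_date, MAINTENANCE_MONTHS[kind])


if __name__ == "__main__":
    released = date(2025, 8, 31)  # hypothetical release date
    print("regular:", maintenance_end(released, "regular"))
    print("lts:", maintenance_end(released, "lts"))
```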

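Operator development and merge guide

1. For operator merge requirements, see ops.md.

2. For operator development, the skills tool can be used as a reference.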

FAQ

| Symptom | Description |
| --- | --- |
| module 'triton.language' has no attribute 'extract_slice' | Issue description |
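Before digging into the linked issue description, it can help to confirm the symptom directly. The sketch below only reports whether `triton.language` exposes `extract_slice` in the current environment. It does not diagnose the cause; one thing worth checking, as an assumption rather than a confirmed explanation, is whether the installed Triton package is the triton-ascend build listed in the version table.

```python
import importlib


def has_attribute(module_name, attr):
    """Return True or False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Broad on purpose: a misconfigured install can fail in many ways on import.
        return None
    return hasattr(module, attr)


def describe(module_name, attr):
    """Build a one-line, human-readable summary of the check."""
    result = has_attribute(module_name, attr)
    if result is None:
        return f"{module_name} could not be imported"
    if result:
        return f"{module_name}.{attr} is available"
    return f"{module_name} has no attribute {attr!r}"


if __name__ == "__main__":
    print(describe("triton.language", "extract_slice"))
```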

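Disclaimer

To MindSpeed-Ops users

  1. All content provided by MindSpeed-Ops is for your non-commercial use only.
  2. The models and datasets referenced in MindSpeed-Ops test cases and example files are used by the platform solely for functional testing. Huawei does not provide any model weights or datasets. If you use this data for training, take particular care to comply with the licenses of the corresponding models and datasets. Huawei assumes no liability for any infringement dispute arising from your use of these models and datasets.
  3. If you find any problem while using MindSpeed-Ops (including but not limited to functional or compliance problems), submit an issue on GitCode and we will review and resolve it promptly.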

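License

For the operators in Ascend MindSpeed-Ops, if a License file exists in an operator's directory, that License applies. If no License exists in the directory, the operator is licensed under the Apache 2.0 license; the license text is available in the root directory of Ascend MindSpeed-Ops.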
