| add mhc
Co-authored-by: ElevenLiu<liuzhiwen@autokernel.cn>
# message auto-generated for no-merge-commit merge:
!1589 merge master into master
add mhc
Created-by: ElevenLiu
Commit-by: LiuEleven;ElevenLiu
Merged-by: cann-robot
Description: ## Description
An AscendC implementation of the mHC (Manifold-Constrained Hyper-Connections) operator for Ascend NPUs. The operator was generated by the 智子芯元 KernelCAT agent.
## Related Issue
Related issue: #763
## Testing
```bash
# C++
cd build && LD_LIBRARY_PATH=./lib:$LD_LIBRARY_PATH ./test_multi_dtype
# Python
LD_LIBRARY_PATH=./build/lib:$LD_LIBRARY_PATH python mhc_pre_ops.py
```
## Usage
```python
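import torch
import torch_npu  # registers the 'npu' device with PyTorch
import mhc_pre_ext

B, N, S, D = 2, 4, 128, 512  # illustrative sizes: batch, streams, seq, dim
x = torch.randn(B * N, S, D, device='npu')  # [batch*streams, seq, dim]
h = torch.randn(N, device='npu')            # [streams]
out = mhc_pre_ext.forward(x, h)             # [batch, seq, dim]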
```
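For checking the kernel output on a host without an NPU, a reference can be sketched in NumPy. This is only a sketch under assumptions not stated in this MR: that `mhc_pre` computes a weighted sum over the `N` streams with weights `h`, and that the first axis of `x` is laid out batch-major (row `b * N + n`). The function name `mhc_pre_reference` is hypothetical.

```python
import numpy as np

def mhc_pre_reference(x, h):
    """Hypothetical reference: weighted sum over streams (assumed semantics)."""
    n = h.shape[0]
    bn, s, d = x.shape
    assert bn % n == 0, "first axis of x must be a multiple of the stream count"
    # Assumption: rows are ordered batch-major, i.e. row index = b * N + n.
    x4 = x.reshape(bn // n, n, s, d)
    # Contract the stream axis against the per-stream weights.
    return np.einsum('bnsd,n->bsd', x4, h)

B, N, S, D = 2, 4, 8, 16  # small illustrative sizes
rng = np.random.default_rng(0)
x = rng.standard_normal((B * N, S, D)).astype(np.float32)
h = rng.standard_normal(N).astype(np.float32)
out = mhc_pre_reference(x, h)
print(out.shape)  # (2, 8, 16)
```

If the real layout is stream-major instead, the reshape would be `(n, bn // n, s, d)` with the einsum subscripts changed to `'nbsd,n->bsd'`.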
## Performance (vs. torch.einsum, Ascend 910B2)
| Operator | Speedup |
|----------|--------|
| mhc_pre | 24x ~ 52x |
| mhc_post | 2x ~ 5x |
| mhc_res | 24x ~ 50x |
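## Documentation updates
Updated README.md.
## Type labels
<!-- [x] means selected -->
- [ ] Bug fix
- [ ] New feature
- [ ] Performance optimization
- [x] Documentation update
- [x] Other (please describe): new open-source contributed operator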
See merge request: cann/ops-transformer!1589 | 3 months ago |