| [feature] mstuner_catlass supports arch Ascend950 and the kernel 43_ascend950_basic_matmul
Co-authored-by: gong-siwei<gongsiwei@huawei.com>
# message auto-generated for no-merge-commit merge:
!612 merge tuner_950 into master
[feature] mstuner_catlass supports arch Ascend950 and the kernel 43_ascend950_basic_matmul
Created-by: gong-siwei
Commit-by: gong-siwei
Merged-by: cann-robot
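Description: ## Description
<!-- Describe your changes here in detail, including the reason for the change and the approach taken. -->
The mstuner_catlass tuning tool now supports the Ascend950 generation, starting with the kernel 43_ascend950_basic_matmul.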
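## Related Issue
<!-- If this PR resolves a specific issue, provide the issue link here. -->
## Reason
<!-- Explain the purpose of this change and the problem it solves; this should match the type label. -->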
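## Testing
<!-- Describe the tests run to verify your change, including but not limited to level-2 smoke tests and operator generalization tests. -->
Verified in an Ascend950 environment as follows:
Build: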
```bash
$ bash scripts/build.sh -DCATLASS_ARCH=3510 -DCATLASS_LIBRARY_KERNELS=43_ascend950_basic_matmul mstuner_catlass
...
[100%] Linking ASC executable 55_ascend950_basic_matmul_notla
[100%] Built target 55_ascend950_basic_matmul_notla
-- Install configuration: "Debug"
-- Installing: /data/wangyixian/gsw/catlass_aml/output/bin/55_ascend950_basic_matmul_notla
[INFO]Target '55_ascend950_basic_matmul_notla' built successfully
```
Run:
```bash
$ ./output/bin/mstuner_catlass --m=256 --n=512 --k=1024 --device=0 --output=result.csv
...
================================
case_id : 353
task_duration(us) : 155.445
device_id : 0
operation : Gemm
description : catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_256x240x128_256x240x32_swizzle3x1
l0_tile_shape : 256x240x32
l1_tile_shape : 256x240x128
swizzle : swizzle3x1
m : 256
n : 512
k : 1024
A : fp32:row
B : fp32:row
C : fp32:row
================================
case_id : 354
task_duration(us) : 163.431
device_id : 0
operation : Gemm
description : catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_256x256x128_256x256x32_swizzle3x1
l0_tile_shape : 256x256x32
l1_tile_shape : 256x256x128
swizzle : swizzle3x1
m : 256
n : 512
k : 1024
A : fp32:row
B : fp32:row
C : fp32:row
================================
Top 10:
case_id,task_duration(us),device_id,operation,description,m,n,k,A,B,C
3,43.349,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x128_128x128x64_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
6,43.761,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x144_128x128x64_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
1,43.782,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x128_128x128x32_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
10,43.808,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x176_128x128x32_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
2,43.857,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x128_128x128x48_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
13,43.919,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x192_128x128x32_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
5,43.919,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x144_128x128x48_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
12,43.954,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x176_128x128x64_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
8,43.958,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x160_128x128x48_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
15,43.967,0,Gemm,catlass_gemm_43_ascend950_basic_matmul_fp32xRowMajor_fp32xRowMajor_fp32xRowMajor_128x128x192_128x128x64_swizzle3x1,256,512,1024,fp32:row,fp32:row,fp32:row
[INFO ] Save profile data to /data/wangyixian/gsw/catlass/result.csv success
```
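To re-rank the saved results later without rerunning the tuner, a small helper along these lines may be enough. This is only a sketch: `top_cases` is a hypothetical name, and it assumes `result.csv` uses the same layout as the "Top 10" listing above (one header row, `task_duration(us)` in the second column, no commas inside fields).

```shell
#!/usr/bin/env bash
# Sketch: print the header plus the N fastest cases from a tuner CSV.
# Assumptions (not confirmed by the tool's documentation):
#   - the first line is a header row
#   - column 2 is task_duration(us)
#   - no field contains an embedded comma
top_cases() {
    local csv="$1"
    local n="${2:-10}"
    # Keep the header as-is.
    head -n 1 "$csv"
    # Sort the data rows numerically by column 2 (ascending) and keep the first N.
    tail -n +2 "$csv" | LC_ALL=C sort -t, -k2,2n | head -n "$n"
}

# Example (hypothetical invocation): top_cases result.csv 3
```

Sorting on the duration column alone keeps the helper independent of the remaining columns, so it should still work if the tuner adds fields after the second one.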
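## Documentation Updates
<!-- If this PR includes documentation updates, note them here. For example: updated README.md. -->
## Type Labels
<!-- [x] means selected -->
- [ ] Bug fix
- [x] New feature
- [ ] Performance optimization
- [ ] Documentation update
- [ ] Other, please describe: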
See merge request: cann/catlass!612 | 13 days ago |