
AttentionWorkerScheduler

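Product support

| Product | Supported |
| --- | --- |
| Ascend 950PR/Ascend 950DT | |
| Atlas A3 training series products/Atlas A3 inference series products | |
| Atlas A2 training series products/Atlas A2 inference series products | |
| Atlas 200I/500 A2 inference products | × |
| Atlas inference series products | |
| Atlas training series products | |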

Function description

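  • Operator function: in deployments where Attention and FFN run separately, this is the data-scanning operator on the Attention side. It receives the output of the FFNToAttention operator and scans it repeatedly until the data is ready.

    Using this operator on its own is not recommended. Use it together with the FFNToAttention and AttentionWorkerCombine operators to form a complete workflow.

    1. Receive the data sent by the FFNToAttention operator. The data is stored in a ScheduleContext struct, which contains the CommonArea, ControlArea, AttentionArea, and FfnArea regions. This interface uses CommonArea (configuration such as session_num, micro_batch_num, micro_batch_size, and selected_expert_num), ControlArea (lets the upper-level controller signal whether the process should exit), and AttentionArea (manages the core data buffers and state needed during computation; its token_info_buf holds information about the input data).

    2. Read the flags stored in token_info_buf of ScheduleContext.AttentionArea to check whether the communicated data is ready.

    3. Once all data is ready, it can be consumed by the AttentionWorkerCombine operator.

  • Formula: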

$$
\text{Initialize:} \quad \text{ready\_count} = 0, \quad \text{flag\_num} = \text{micro\_batch\_size} \times \text{selected\_expert\_num}
$$

$$
\text{Check if run\_flag is 0:} \quad \text{if run\_flag} = 0, \quad \text{exit and log}
$$

$$
\text{Loop:} \quad \text{while run\_flag} \neq 0: \quad \text{ready\_count} = \sum_{i=1}^{\text{flag\_num}} \mathbf{1}_{\{ \text{flag}[i] = 1 \}}; \quad \text{if ready\_count} = \text{flag\_num}, \quad \text{break}
$$

$$
\text{Reset flags:} \quad \text{flag}[i] = 0 \quad \text{for} \quad i = 1, 2, \dots, \text{flag\_num}
$$

$$
\text{Set micro\_batch\_id:} \quad \text{micro\_batch\_id} = (\text{micro\_batch\_id} + 1) \% \text{micro\_batch\_num}
$$

Note: micro_batch_size, selected_expert_num, run_flag, and micro_batch_id are fields of the input ScheduleContext struct. The struct is described in the calling example.
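The steps in the formula can be simulated on the host to make the control flow easier to follow. The sketch below is not the operator's implementation: the names `scan_once`, `attention_worker_schedule`, `read_run_flag`, and `max_polls` are hypothetical, the flags are modeled as a plain Python list rather than the real token_info_buf layout, and it assumes that an exit caused by run_flag becoming 0 leaves the flags and micro_batch_id unchanged, which the formula does not state.

```python
from typing import Callable, List, Optional


def scan_once(flags: List[int], flag_num: int) -> int:
    """Count how many of the first flag_num flags are set to 1."""
    return sum(1 for f in flags[:flag_num] if f == 1)


def attention_worker_schedule(
    flags: List[int],
    micro_batch_size: int,
    selected_expert_num: int,
    micro_batch_id: int,
    micro_batch_num: int,
    read_run_flag: Callable[[], int],
    max_polls: Optional[int] = None,
) -> Optional[int]:
    """Poll the flags until all are ready, then reset them and advance micro_batch_id.

    Returns the new micro_batch_id, or None if run_flag is 0 (or the
    hypothetical max_polls budget runs out) before the data is ready.
    """
    flag_num = micro_batch_size * selected_expert_num
    polls = 0
    # read_run_flag stands in for reading run_flag from the context.
    while read_run_flag() != 0:
        if scan_once(flags, flag_num) == flag_num:
            # Reset the flags so the next micro-batch starts from a clean state.
            for i in range(flag_num):
                flags[i] = 0
            return (micro_batch_id + 1) % micro_batch_num
        polls += 1
        # max_polls is an assumption added so that the simulation terminates.
        if max_polls is not None and polls >= max_polls:
            return None
    return None
```

With micro_batch_size = 2 and selected_expert_num = 3, the function waits on 6 flags. Once all 6 are 1 it clears them and moves micro_batch_id from 0 to 1 when micro_batch_num is 2.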

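Parameter description

  • Parameters

    | Parameter | Input/Output | Description | Usage notes | Data type | Data format | Shape | Non-contiguous Tensor |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | scheduleContextRef | Input/Output | Data to be processed that the Attention side receives; carries the input scheduleContext information. See the calling example for the detailed structure. | Empty Tensors are not supported. | INT8 | ND | 1-D, shape fixed at (1024) | × |
    | workspaceSize | Output | Returns the workspace size that must be allocated on the Device side. | - | - | - | - | - |
    | executor | Output | Returns the op executor, which contains the operator's computation flow. | - | - | - | - | - |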
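The constraints on scheduleContextRef (1-D, 1024 one-byte elements, non-empty, contiguous) can be checked on the host before the buffer is handed to the interface. The following is a minimal sketch under that reading of the table. `make_schedule_context_buffer` and `check_schedule_context_buffer` are hypothetical helpers and not part of the aclnn API, and the check covers only the element width, not signedness or the ScheduleContext field layout.

```python
SCHEDULE_CONTEXT_BYTES = 1024  # shape (1024) of INT8 elements, per the parameter table


def make_schedule_context_buffer() -> bytearray:
    """Allocate a zero-filled host buffer sized for scheduleContextRef."""
    return bytearray(SCHEDULE_CONTEXT_BYTES)


def check_schedule_context_buffer(buf) -> None:
    """Raise ValueError if buf violates the documented tensor constraints."""
    view = memoryview(buf)
    if view.ndim != 1:
        raise ValueError("scheduleContextRef must be 1-D")
    if view.itemsize != 1:
        raise ValueError("scheduleContextRef elements must be 1 byte wide (INT8)")
    if view.nbytes == 0:
        raise ValueError("empty Tensors are not supported")
    if len(view) != SCHEDULE_CONTEXT_BYTES:
        raise ValueError(f"expected {SCHEDULE_CONTEXT_BYTES} elements, got {len(view)}")
    if not view.contiguous:
        raise ValueError("non-contiguous Tensors are not supported")
```

Calling `check_schedule_context_buffer(make_schedule_context_buffer())` returns without raising, while a buffer of any other length is rejected.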

Constraints

None.

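Calling instructions

| Calling method | Sample code | Description |
| --- | --- | --- |
| aclnn interface | test_aclnn_inplace_attention_worker_scheduler | Calls the AttentionWorkerScheduler operator through the aclnnInplaceAttentionWorkerScheduler interface. |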