File | Last commit | Last updated
[FEAT] [Inductor] Add support for sequence length bucketing in dynamic shapes

Co-authored-by: luqichao <luqichao1@huawei.com>

# message auto-generated for no-merge-commit merge:
!28555 merge v2.7.1_shape_2dim into v2.7.1

[FEAT] [Inductor] Add support for sequence length bucketing in dynamic shapes

Created-by: luqichao
Commit-by: luqichao
Merged-by: ascend-robot
Description: Full configuration reference after this change:

```
self.shape_options = {
    "enable_shape_handling": True,
    "shape_handling_configs": [
        {
            "type": "BATCHSIZE",   # dimension type to handle (Required)
            "dimensions": 0,       # index of this dimension in the tensor (BATCHSIZE defaults to 0) (Optional)
            "indices": [0, 1],     # indices of the tensors to handle (defaults to all tensors) (Optional)
            "value": 0.0,          # fill value used when padding, defaults to 0.0 (Optional)
            "gears": [],           # custom gear (bucket) sizes (Optional)
            "min_size": 1,         # minimum size (gear) of this dimension, defaults to 1 (Optional)
            "max_size": 1024,      # maximum size (gear) of this dimension, defaults to 1024 (Optional)
            "policy": "TIMES",     # strategy for generating gears from min_size and max_size; defaults to TIMES, which generates the integer powers of 2 within the range (Optional)
        },
        {
            "type": "SEQLEN",      # dimension type to handle (Required)
            "dimensions": [1, 1],  # index of this dimension in each tensor, one-to-one with indices; "dimension" is also accepted, meaning all tensors use the same dimension index, and takes precedence (SEQLEN defaults to 1) (Optional)
            "indices": [0, 1],     # indices of the tensors to handle (defaults to all tensors) (Optional)
            "value": 0.0,          # fill value used when padding, defaults to 0.0 (Optional)
            "gears": [],           # custom gear (bucket) sizes (Optional)
            "min_size": 1,         # minimum size (gear) of this dimension, defaults to 1 (Optional)
            "max_size": 1024,      # maximum size (gear) of this dimension, defaults to 1024 (Optional)
            "policy": "TIMES",     # strategy for generating gears from min_size and max_size; defaults to TIMES, which generates the integer powers of 2 within the range (Optional)
        }
    ],
    "shape_handling_dict": {
        "trans_pre_fn": transform_pre_fn,
        "trans_post_fn": transform_post_fn,
        "re_pre_fn": recover_pre_fn,
        "re_post_fn": recover_post_fn,
    }
}
```

`configs` may be omitted; by default only the batch size is bucketed.

See merge request: Ascend/pytorch!28555

4 months ago
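The `TIMES` policy described in the configuration comments (power-of-two gears between `min_size` and `max_size`) can be illustrated with a small standalone sketch. This is not the torch_npu implementation: the helper names `generate_gears`, `select_gear`, and `pad_to_gear` are hypothetical, plain Python lists stand in for tensors, and rounding up to the smallest gear that fits (and raising when no gear fits) is an assumption about how a gear would be chosen.

```python
from bisect import bisect_left


def generate_gears(min_size=1, max_size=1024):
    """Return the powers of two that fall inside [min_size, max_size]."""
    gears = []
    gear = 1
    while gear <= max_size:
        if gear >= min_size:
            gears.append(gear)
        gear *= 2
    return gears


def select_gear(size, gears):
    """Pick the smallest gear that can hold `size` (assumed selection rule)."""
    pos = bisect_left(gears, size)
    if pos == len(gears):
        # Assumption: sizes beyond the largest gear are rejected in this sketch.
        raise ValueError(f"size {size} exceeds the largest gear {gears[-1]}")
    return gears[pos]


def pad_to_gear(rows, gears, value=0.0):
    """Right-pad every row with `value` up to the gear chosen for the longest row."""
    target = select_gear(max(len(row) for row in rows), gears)
    return [list(row) + [value] * (target - len(row)) for row in rows]


gears = generate_gears(1, 1024)
padded = pad_to_gear([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], gears)
```

Here two rows of sequence length 3 are padded to the next gear, so a compiled graph would only need to be specialized for the gear sizes rather than for every distinct sequence length.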
【Inductor】add aot inductor for v2.7.1

Co-authored-by: Zichun Ye <zichun.ye@huawei.com>

# message auto-generated for no-merge-commit merge:
merge aoti_v27 into v2.7.1

【Inductor】add aot inductor for v2.7.1

Created-by: zichun_ye
Commit-by: Zichun Ye
Merged-by: ascend-robot
Description: <!-- Thanks for sending a pull request! -->

**What type of PR is this?**
> Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:
>
> /kind bug
> /kind task

/kind feature

**What does this PR do / why do we need it**:
- add aot inductor for v2.7.1
- update logic to generate cpp wrapper for v2.7.1
- add the whole cpp utils functions for aot inductor
- fix some bugs

**Which issue(s) this PR fixes**:
<!-- *Automatically closes linked issue when PR is merged. Usage: Fixes #<issue number>, or Fixes (paste link of issue). -->
Fixes #

**Special notes for your reviewers**:

See merge request: Ascend/pytorch!24853

8 months ago
[Inductor] add aoti support

Co-authored-by: zhuceHW <zhuce@huawei.com>

# message auto-generated for no-merge-commit merge:
!32868 merge v2.7.1-26.0.0 into v2.7.1-26.0.0

[Inductor] add aoti support

Created-by: zhucehw
Commit-by: zhuceHW
Merged-by: ascend-robot
Description:

# 【Merge source】
add aoti support
- [ ] Requirement
- [ ] Defect ticket
- [ ] Issue/work order
- [x] Refactoring/optimization
- [ ] Documentation update

# 【Changes】
1. Add ffts check, device guard, and dynamic shape support for AOTInductor; make CppWrapperNpu extend CppWrapperGpu; fall back when cpp_wrapper meets mm/bmm/gmm; add utils_npu.h, shim_npu.h, and shim_npu.cpp to csrc\inductor. AOTI now works for v2.7.1 on A2\A3\A5.
2. Refactor the triton heuristic logic; get_heuristic now returns a heuristic type as in the community version ('pointwise', 'reduction', etc.).
3. Add support for cpp_wrapper.

# 【Documentation changes】
Not applicable

# 【Interface changes】
Not applicable

# 【Functional verification】
ci passes

# 【CheckList】
> The PR submitter must self-check every item in the CheckList below; for items that pass or do not apply, change [ ] to [x]
- [x] Code comments are complete and error logs are recorded correctly
- [x] The implementation validates return values, null pointers, etc.
- [x] The PR title uses a correct type label, e.g. feat, fix, refactor, docs, test
- [x] The PR continuous integration (CI) pipeline passes with no code-check anomalies

See merge request: Ascend/pytorch!32868

1 month ago
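dvm: refactor ws alloc paths and reorder pybind_api functions

Co-authored-by: huangchengnuo <huangchengnuo1@huawei.com>

# message auto-generated for no-merge-commit merge:
!31627 merge fix_inductor_dvm_bug into v2.7.1

dvm: refactor ws alloc paths and reorder pybind_api functions

Created-by: SorryNaCN
Commit-by: huangchengnuo
Merged-by: ascend-robot
Description: <!-- PR description template updated: 20260203 -->

# 【Merge source】
> <font color="red">**If there is a community issue, link the issue**</font>\
> <font color="red">**Do not include internal process information (requirement links, defect tickets, internal issues, etc.)**</font>
- [ ] Requirement
- [ ] Defect ticket
- [ ] Issue/work order
- [x] Refactoring/optimization
- [ ] Documentation update

# 【Changes】
1. In torch_npu/csrc/inductor/dvm/pybind_api.cpp, add a stream-capture state check, IsCurrentStreamCapturing(), and introduce two workspace allocation/launch paths: LaunchV2 + AllocWorkspaceV2 (regular path) and LaunchV1 + AllocWorkspaceV1 (stream-capture path).
2. Add an if (!IsCurrentStreamCapturing()) ... else ... branch in TorchKernelPy::Call and GraphSplitKernelPy::Call so that the stream-capture scenario does not take the allocate_workspace path.
3. Split TorchKernelPy::Launch into LaunchV1/LaunchV2; split GraphSplitBase::Launch into LaunchV1/LaunchV2, and provide ExternalWsAllocator to support an external workspace pointer.
4. Add a fake allocator path in GraphSplitKernelPy::Setup to probe and record ws_size_ in advance.
5. Add a guard on the dynamic shape path: DynKernelPy::Call and DynGraphSplitKernelPy::Call gain TORCH_CHECK(!IsCurrentStreamCapturing(), ...), making explicit that dynamic shape runtime CodeGen/Infer does not support stream capture.
6. Update the interface declarations in the header torch_npu/csrc/inductor/dvm/pybind_api.h accordingly (LaunchV1/LaunchV2, SetWorkspaceSize, GraphSplitBase::Alloc, etc.).

# 【Documentation changes】
Not applicable.

# 【Interface changes】
Not applicable (no cross-repository or customer-visible interface changes).

# 【Functional verification】
1. The code change is limited to torch_npu/csrc/inductor/dvm/pybind_api.cpp and torch_npu/csrc/inductor/dvm/pybind_api.h.
2. A code-level diff review has been completed (aligned with the stream-capture check logic in the target commit).
3. A full local build/UT run was not completed (the build directory in the current environment lacks a usable Makefile, so cmake --build build --target torch_npu cannot run); the CI result is authoritative.

# 【CheckList】
> The PR submitter must self-check every item in the CheckList below; for items that pass or do not apply, change [ ] to [x]
- [x] Code comments are complete and error logs are recorded correctly
- [x] The implementation validates return values, null pointers, etc.
- [x] The PR title uses a correct type label, e.g. feat, fix, refactor, docs, test
- [ ] The PR continuous integration (CI) pipeline passes with no code-check anomalies

See merge request: Ascend/pytorch!31627

2 months ago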
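The branching described in the dvm commit can be pictured with a short conceptual sketch. The real change is C++ in pybind_api.cpp; the Python classes below (`KernelSketch`, `DynKernelSketch`) are hypothetical stand-ins that only model which path is selected, and the `is_capturing` callable is an assumed substitute for `IsCurrentStreamCapturing()`. What the dynamic-shape path does outside capture is not modeled beyond a placeholder label.

```python
class KernelSketch:
    """Hypothetical model of a static-shape kernel choosing a launch path."""

    def __init__(self, is_capturing):
        # Assumed stand-in for IsCurrentStreamCapturing(): returns True or False.
        self.is_capturing = is_capturing

    def call(self):
        if not self.is_capturing():
            # Regular path named in the commit: LaunchV2 + AllocWorkspaceV2.
            return ("LaunchV2", "AllocWorkspaceV2")
        # Stream-capture path named in the commit: LaunchV1 + AllocWorkspaceV1.
        return ("LaunchV1", "AllocWorkspaceV1")


class DynKernelSketch(KernelSketch):
    """Hypothetical model of the dynamic-shape guard (TORCH_CHECK in the C++ code)."""

    def call(self):
        if self.is_capturing():
            raise RuntimeError(
                "dynamic shape runtime CodeGen/Infer does not support stream capture"
            )
        # Placeholder label only; the actual launch sequence is not modeled here.
        return ("dynamic-shape launch",)


regular = KernelSketch(lambda: False).call()
captured = KernelSketch(lambda: True).call()
```

The point of the split is that the decision is made once per call from the capture state, and the dynamic-shape kernels fail fast instead of attempting code generation while a stream is being captured.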
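refactor: fix mlir compile

Co-authored-by: huangchengnuo <huangchengnuo1@huawei.com>

# message auto-generated for no-merge-commit merge:
!30363 merge fix_mlir_compile into v2.7.1

refactor: fix mlir compile

Created-by: SorryNaCN
Commit-by: huangchengnuo
Merged-by: ascend-robot
Description: <!-- PR description template updated: 20260203 -->

# 【Merge source】
> (If any) link the requirement document/issue
- [ ] Requirement
- [ ] Defect ticket
- [ ] Issue/work order
- [x] Refactoring/optimization
- [ ] Documentation update

# 【Changes】
1. Remove the Python-side dynamic build of the MLIR extension and the related packaging logic: clean up the pybind11 extension, the libcpp_common compilation, and the file copy paths in setup.py. (setup.py)
2. Move the MLIR bindings down into C++: add the torch_npu._C.mlir submodule, which provides the load_kernel_binary interface and replaces the former `_inductor/ascend_npu_ir/_C` binding implementation. (`torch_npu/csrc/inductor/mlir/mlir_bindings.cpp`)
3. Unify the location of the shared MLIR runtime code: move hacl_rt.h/cpp_common to torch_npu/csrc/inductor/mlir/, and complete the exports and msprof header includes. (torch_npu/csrc/inductor/mlir/cpp_common.{h,cpp})
4. Adapt the call path: mlir_compiler.py now uses torch_npu._C.mlir.load_kernel_binary, and the build_ext initialization logic is removed. (`torch_npu/_inductor/__init__.py`, `torch_npu/utils/_dynamo.py`, `torch_npu/_inductor/ascend_npu_ir/...`)
5. Update tests and build scripts: remove the public bindings dependency on build_ext and correct the CMake source file list. (`test/npu/test_public_bindings.py`, `torch_npu/csrc/inductor/CMakeLists.txt`)

# 【Documentation changes】
Not applicable

# 【Interface changes】
Not applicable (internal binding implementation migration; no cross-repository or external interface changes)

# 【Functional verification】
- Not run locally; pending CI verification (self-verification screenshots can be added if needed)

# 【CheckList】
- [x] Code comments are complete and error logs are recorded correctly
- [x] The implementation validates return values, null pointers, etc.
- [x] The PR title uses a correct type label, e.g. feat, fix, refactor, docs, test
- [x] The PR continuous integration (CI) pipeline passes with no code-check anomalies

See merge request: Ascend/pytorch!30363

3 months ago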