| feat(inductor): add grouped symbolic autotune
Co-authored-by: Xuan Peng <pengxuan9@huawei.com>
# message auto-generated for no-merge-commit merge:
!36593 merge feat/symbolic-grouped-autotune into v2.7.1
feat(inductor): add grouped symbolic autotune
Created-by: HinPeng
Commit-by: Xuan Peng
Merged-by: ascend-robot
Description: <!--
PR description template last updated: 20260203
-->
## Summary
This PR adds grouped symbolic autotuning for Inductor-Triton kernels.
Instead of autotuning a symbolic-shape kernel only against its size_hints (the current size), it groups shape ranges into buckets according to the schedule template and reuses the best kernel variant within each group. This slightly increases autotune overhead for symbolic kernels but preserves shape-aware kernel selection.
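The grouping idea in the summary can be illustrated with a minimal sketch. It is not the code in symbolic_grouping.py: the class name `GroupedAutotuneCache`, the bucket boundaries, the `(XBLOCK, num_warps)` config tuple, and the `benchmark` callback are all hypothetical, chosen only to show how one benchmark pass per bucket can be reused for every later shape that falls in the same bucket.

```python
import bisect
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical config shape: (XBLOCK, num_warps). Not the PR's real metadata.
Config = Tuple[int, int]


class GroupedAutotuneCache:
    """Caches the best config per shape bucket instead of per concrete size."""

    def __init__(
        self,
        boundaries: Sequence[int],
        candidates: Sequence[Config],
        benchmark: Callable[[Config, int], float],
    ) -> None:
        # Sorted bucket boundaries; bucket i covers [boundaries[i-1], boundaries[i]).
        self.boundaries: List[int] = sorted(boundaries)
        self.candidates: List[Config] = list(candidates)
        self.benchmark = benchmark
        self.best: Dict[int, Config] = {}

    def group_of(self, size: int) -> int:
        """Map a runtime size to its bucket index."""
        return bisect.bisect_right(self.boundaries, size)

    def select(self, size: int) -> Config:
        """Return the cached winner for the bucket, benchmarking on first use."""
        group = self.group_of(size)
        if group not in self.best:
            # Autotune once, using the first size seen in this bucket.
            self.best[group] = min(
                self.candidates, key=lambda cfg: self.benchmark(cfg, size)
            )
        return self.best[group]
```

With boundaries `[1024, 65536]`, sizes 512 and 700 share bucket 0, so the second call returns the cached config without re-benchmarking, while size 4096 lands in bucket 1 and triggers one more autotune pass.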
## What changed
- Added grouped symbolic autotune metadata and runtime planning in symbolic_grouping.py
- Extended split/tiling analysis to identify eligible symbolic kernels and build grouping features
- Updated Triton codegen to emit grouped kernel metadata and runtime block arguments
- Added grouped grid construction and runtime launcher selection in triton_heuristics.py
- Introduced rollout switches:
  - `INDUCTOR_ASCEND_SYMBOLIC_GROUP_AUTOTUNE`
  - `INDUCTOR_ASCEND_SYMBOLIC_GROUP_TEMPLATES`
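As a rough illustration of how two such switches might be consumed, the sketch below reads them from an environment mapping. The value formats are assumptions, not taken from the PR: it assumes `"1"` enables the feature and that the template switch is a comma-separated list. The helper `read_rollout_switches` and the constant `ALL_TEMPLATES` are hypothetical names.

```python
import os
from typing import FrozenSet, Mapping, Tuple

# Template names as listed in this PR's Notes section.
ALL_TEMPLATES: FrozenSet[str] = frozenset(
    {"pointwise", "reduction", "persistent_reduction"}
)


def read_rollout_switches(
    env: Mapping[str, str] = os.environ,
) -> Tuple[bool, FrozenSet[str]]:
    """Return (enabled, templates) from the two rollout switches.

    Assumed formats (not confirmed by the PR): "1" turns the feature on,
    and the template switch is a comma-separated list of template names.
    """
    enabled = env.get("INDUCTOR_ASCEND_SYMBOLIC_GROUP_AUTOTUNE", "0").strip() == "1"
    raw = env.get("INDUCTOR_ASCEND_SYMBOLIC_GROUP_TEMPLATES", "")
    requested = frozenset(name.strip() for name in raw.split(",") if name.strip())
    # An unset or empty list means "all supported templates"; unknown names are dropped.
    templates = (requested & ALL_TEMPLATES) if requested else ALL_TEMPLATES
    return enabled, templates
```

Passing the environment as a mapping keeps the helper easy to exercise in unit tests without mutating `os.environ`.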
## Notes
- The feature is gated behind config flags for controlled rollout
- Initial support targets pointwise, reduction, and persistent_reduction templates
- Unsupported grouped plans automatically fall back to the existing autotune path
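The fallback behaviour described in these notes could look roughly like the following. This is a sketch under stated assumptions, not the logic in triton_heuristics.py: `select_autotune_path`, the `build_plan` callback, and the convention that an unsupported plan is signalled by `None` are all hypothetical.

```python
from typing import Callable, Optional, Tuple

# Templates named in the Notes as the initial support targets.
SUPPORTED_TEMPLATES = frozenset({"pointwise", "reduction", "persistent_reduction"})


def select_autotune_path(
    template: str,
    enabled: bool,
    build_plan: Callable[[], Optional[dict]],
) -> Tuple[str, Optional[dict]]:
    """Pick the grouped path when possible, else the existing autotune path.

    `build_plan` is a hypothetical callback that returns a grouped plan,
    or None when the kernel cannot be grouped.
    """
    if not enabled or template not in SUPPORTED_TEMPLATES:
        return "default", None
    plan = build_plan()
    if plan is None:
        # Unsupported grouped plan: fall back without raising.
        return "default", None
    return "grouped", plan
```

Returning a tag alongside the plan keeps the fallback explicit at the call site, so the existing autotune path is reached through one code path regardless of why grouping was skipped.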
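# 【Merge source】
> <font color="red">**If there is a community issue, please link it**</font>\
> <font color="red">**Do not include internal process information (requirement links, defect tickets, internal issues, etc.)**</font>
- [ ] Requirement
- [ ] Defect ticket
- [ ] Issue/work order
- [ ] Refactoring/optimization
- [ ] Documentation update
# 【Modification plan】
> Describe the concrete implementation of the change and which components interact; items may be listed as 1, 2, 3, ...\
> For requirement or refactoring PRs, add a detailed design document (covering upstream/downstream component relationships, sequence diagrams, class diagrams, DFX capabilities, etc.)
# 【Documentation changes】
> Confirm whether documentation changes are involved. If so, reflect them in the PR and briefly describe the changes. If not, write "Not applicable"
# 【Interface changes】
> Confirm whether this involves cross-repository or customer-visible interface changes. If so, describe the interfaces and the corresponding changes in detail and reflect them in the documentation. If not, write "Not applicable"
# 【Functional verification】
> Describe the test scenarios and test methods. If this testing differs from regular unit testing, describe the test steps in detail\
> State whether the new/changed content is covered by new or adapted UT cases, and attach self-verification screenshots
# 【CheckList】
> The PR submitter must self-check every item below; for items that pass or are not applicable, change [ ] to [x]
- [ ] Code comments are complete and error logs are recorded correctly
- [ ] The code validates return values, null pointers, etc.
- [ ] The PR title uses a correct type tag, e.g. feat, fix, refactor, docs, test
- [ ] The PR continuous integration (CI) pipeline passes with no code-check issues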
See merge request: Ascend/pytorch!36593 | 6 days ago |