| [bugfix] aclgraph fix: clone static inputs in ACLGraph capture to prevent stale data_ptr precision issue Co-authored-by: hyh_hh <huyinghong1@huawei.com> # message auto-generated for no-merge-commit merge: !326 merge aclgraph into dev Created-by: hyh_hh Commit-by: hyh_hh Merged-by: ascend-robot Description: # Purpose ACLGraph backend precision fix and memory optimization: - Fix the stale data_ptr precision issue caused by static buffers referencing Dynamo example inputs during the ACLGraph capture phase - Add an aclgraph_lazy_capture mode that defers capture until the first inference and uses detach() instead of clone(), resolving OOM in models with many subgraphs (60+ blocks) - Add an aclgraph_max_entries setting that caps the number of entries per closure, preventing unbounded device-memory growth under dynamic input shapes # Test Plan 1. TI2V-5B + aclgraph_lazy_capture=False (default): verify that the output video has correct precision 2. qwen-image-edit2509 + aclgraph_lazy_capture=True: verify no OOM and correct precision # Test Report - TI2V-5B default configuration: precision correct ✅ - qwen-image-edit2509 lazy mode: no OOM, precision correct ✅ See merge request: Ascend/MindIE-SD!326 | 18 days ago |
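
The effect of capping entries per closure, as the `aclgraph_max_entries` description suggests, can be illustrated with a minimal pure-Python sketch. This is not the MindIE-SD implementation: `BoundedGraphCache`, `get_or_capture`, and `make_capture` are hypothetical names, the captured graph is stood in for by a string, and least-recently-used eviction is an assumption, since the commit message does not say which entry is dropped once the limit is reached.

```python
from collections import OrderedDict
from typing import Callable, Hashable, List, Tuple


class BoundedGraphCache:
    """Hypothetical per-closure cache of captured graphs, keyed by input shape."""

    def __init__(self, max_entries: int) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be >= 1")
        self.max_entries = max_entries
        self._entries: "OrderedDict[Hashable, object]" = OrderedDict()

    def get_or_capture(self, shape_key: Hashable, capture: Callable[[], object]) -> object:
        # Cache hit: reuse the entry and mark it as most recently used.
        if shape_key in self._entries:
            self._entries.move_to_end(shape_key)
            return self._entries[shape_key]
        # Cache miss: capture a new entry for this shape.
        entry = capture()
        self._entries[shape_key] = entry
        # Assumed policy: evict the least recently used entry when over the cap,
        # so memory held by entries stays bounded under dynamic shapes.
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
        return entry

    def __len__(self) -> int:
        return len(self._entries)

    def __contains__(self, shape_key: Hashable) -> bool:
        return shape_key in self._entries


# Record every (stand-in) capture so the number of captures is visible.
captured: List[Tuple[int, int]] = []


def make_capture(shape: Tuple[int, int]) -> Callable[[], object]:
    def capture() -> object:
        captured.append(shape)
        return f"graph for {shape}"
    return capture


cache = BoundedGraphCache(max_entries=2)
for shape in [(1, 16), (1, 32), (1, 16), (1, 64)]:
    cache.get_or_capture(shape, make_capture(shape))
```

In this run the second request for shape `(1, 16)` reuses the existing entry, and the arrival of `(1, 64)` evicts `(1, 32)`, so the cache never holds more than two entries. A real backend would also have to release the device buffers owned by an evicted entry, which this sketch does not model.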