| File | Last commit | Last updated |
|---|---|---|
| | [fix] fix the a5 msda bug Co-authored-by: huangyuan64<huangyuan64@huawei.com> # message auto-generated for no-merge-commit merge: !2041 merge msda into master [fix] fix the a5 msda bug Created-by: huangyuan64 Commit-by: huangyuan64 Merged-by: ascend-robot Description: ## What this PR does / why we need it? Change the boundary check condition from GT to GE, fixing the precision anomaly in the MSDA A5 operator when a coordinate is 0. ## Does this PR introduce any user-facing change? No. ## How was this patch tested? A5 ATK. See merge request: Ascend/DrivingSDK!2041 | 1 month ago |
| | change sync ascendc api to adapt to A5 Co-authored-by: yuansunshun<17301703748@163.com> # message auto-generated for no-merge-commit merge: !2017 merge bug into master change sync ascendc api to adapt to A5 Created-by: yuansunshun Commit-by: yuansunshun Merged-by: ascend-robot Description: ## What this PR does / why we need it? This PR migrates the AscendC synchronization API calls in 41 op_kernel files to adapt to the Ascend950 (A5) hardware. The deprecated APIs (pipe_barrier, set_flag, wait_flag) are replaced with their new type-safe equivalents (PipeBarrier<PIPE_X>(), SetFlag<HardEvent::X_Y>(), WaitFlag<HardEvent::X_Y>()), which are required by the newer AscendC SDK used on Ascend950. Affected operators include: add_relu, border_align, draw_gaussian_to_heatmap, dynamic_scatter, furthest_point_sampling, furthest_point_sampling_with_dist, gaussian, geometric_kernel_attention, graph_softmax, grid_sampler2d_v2, group_points, knn, max_pool2d, multi_scale_deformable_attn, points_in_box, scatter_add, scatter_mean, select_idx_with_mask, sparse_conv3d, subm_sparse_conv3d, unique_voxel, voxel_pooling_train, and others. Both standard and arch35-specific kernel files are updated where applicable. ## Does this PR introduce any user-facing change? No. This is a purely internal kernel-level API migration. The Python API, C++ interface, operator semantics, input/output formats, and supported data types are all unchanged. ## How was this patch tested? Unit tests for all affected operators were executed on Ascend950 hardware and passed. Each test compares NPU output against a CPU reference implementation using assertRtolEqual, covering multiple data types and input shapes. See merge request: Ascend/DrivingSDK!2017 | 1 month ago |
| | change sync ascendc api to adapt to A5 Co-authored-by: yuansunshun<17301703748@163.com> # message auto-generated for no-merge-commit merge: !2017 merge bug into master change sync ascendc api to adapt to A5 Created-by: yuansunshun Commit-by: yuansunshun Merged-by: ascend-robot Description: ## What this PR does / why we need it? This PR migrates the AscendC synchronization API calls in 41 op_kernel files to adapt to the Ascend950 (A5) hardware. The deprecated APIs (pipe_barrier, set_flag, wait_flag) are replaced with their new type-safe equivalents (PipeBarrier<PIPE_X>(), SetFlag<HardEvent::X_Y>(), WaitFlag<HardEvent::X_Y>()), which are required by the newer AscendC SDK used on Ascend950. Affected operators include: add_relu, border_align, draw_gaussian_to_heatmap, dynamic_scatter, furthest_point_sampling, furthest_point_sampling_with_dist, gaussian, geometric_kernel_attention, graph_softmax, grid_sampler2d_v2, group_points, knn, max_pool2d, multi_scale_deformable_attn, points_in_box, scatter_add, scatter_mean, select_idx_with_mask, sparse_conv3d, subm_sparse_conv3d, unique_voxel, voxel_pooling_train, and others. Both standard and arch35-specific kernel files are updated where applicable. ## Does this PR introduce any user-facing change? No. This is a purely internal kernel-level API migration. The Python API, C++ interface, operator semantics, input/output formats, and supported data types are all unchanged. ## How was this patch tested? Unit tests for all affected operators were executed on Ascend950 hardware and passed. Each test compares NPU output against a CPU reference implementation using assertRtolEqual, covering multiple data types and input shapes. See merge request: Ascend/DrivingSDK!2017 | 1 month ago |