| [Kernel] add custom op GmmSwigluQuantWeightNzTensorList (#3804)
### What this PR does / why we need it?
This PR adds support for custom CANN aclnn ops in vllm-ascend, so that
users can define and invoke their own operators.
Key changes include:
- Building and installing custom ops into the vllm-ascend-specified
directory
- Binding the aclnn op interface to the torch.ops._C_ascend module
- Enabling invocation of these ops within vllm-ascend
This PR includes a sample custom op:
aclnnGroupedMatmulSwigluQuantWeightNzTensorList, which is adapted from
the CANN operator
[aclnnGroupedMatmulSwigluQuantWeightNZ](https://www.hiascend.com/document/detail/zh/canncommercial/83RC1/API/aolapi/context/aclnnGroupedMatmulSwigluQuantWeightNZ.md).
Unlike the original operator, its input parameters weight and
weight_scale accept list[torch.Tensor] (i.e., at::TensorList).
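
As a rough illustration of the binding described above, the sketch below checks whether an op is reachable under torch.ops._C_ascend before calling it. The Python-side op name used here is hypothetical (this description only gives the aclnn name), and the sketch assumes the vllm-ascend extension library has already been loaded by whatever mechanism the project uses.

```python
import importlib

# Namespace named in this PR description: custom ops are bound under
# torch.ops._C_ascend.
ASCEND_OP_NAMESPACE = "_C_ascend"

# Hypothetical Python-side op name, for illustration only. The PR text gives
# the aclnn name (aclnnGroupedMatmulSwigluQuantWeightNzTensorList), not the
# name registered with torch.
SAMPLE_OP_NAME = "grouped_matmul_swiglu_quant_weight_nz_tensor_list"


def find_custom_op(op_name, namespace=ASCEND_OP_NAMESPACE):
    """Return torch.ops.<namespace>.<op_name>, or None if it is unavailable."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None  # PyTorch is not installed in this environment.
    try:
        # Looking up an unregistered op raises AttributeError on recent
        # PyTorch releases and RuntimeError on some older ones.
        return getattr(getattr(torch.ops, namespace), op_name)
    except (AttributeError, RuntimeError):
        return None


if __name__ == "__main__":
    op = find_custom_op(SAMPLE_OP_NAME)
    print("custom op available:", op is not None)
```

A caller could use the returned handle when it is not None and fall back to a non-custom code path otherwise. The op's argument list is not shown here because it is not specified in this description.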
### Does this PR introduce _any_ user-facing change?
No.
- vLLM version: v0.11.2
---------
Signed-off-by: QianChenxi <chenxi.qian.cq@outlook.com> | 5 months ago |