/kind <>
npu_grouped_matmul_swiglu_quant_v2
weightScale
[E, ceil(K/64), N, 2]
weightScale.dim2
npu_grouped_matmul_swiglu_quant_v2_meta
FP4_IN_INT8
output = [M, N / 2]
outputScale = [M, ceil((N / 2) / 64), 2]
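The shape relations listed above can be checked with a small pure-Python sketch. It is illustrative only and is not the code in `npu_grouped_matmul_swiglu_quant_v2_meta`: the helper names `ceil_div`, `weight_scale_shape` and `infer_output_shapes` are hypothetical, and reading `N` from dimension 2 of `weightScale` is an assumption based on the `weightScale.dim2` entry in this description.

```python
BLOCK = 64  # group size taken from the ceil(... / 64) terms in this description


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for non-negative a and positive b."""
    return (a + b - 1) // b


def weight_scale_shape(e: int, k: int, n: int) -> tuple:
    """weightScale = [E, ceil(K/64), N, 2]."""
    return (e, ceil_div(k, BLOCK), n, 2)


def infer_output_shapes(m: int, w_scale_shape: tuple) -> tuple:
    """Return (output, outputScale) shapes for the MXFP4 case.

    Assumption: N is read from weightScale dim 2, not from the weight tensor.
    """
    if len(w_scale_shape) != 4 or w_scale_shape[3] != 2:
        raise ValueError("expected weightScale of shape [E, ceil(K/64), N, 2]")
    n = w_scale_shape[2]
    if n % 2 != 0:
        raise ValueError("N must be even because SwiGLU halves the last dim")
    half_n = n // 2
    output = (m, half_n)                            # [M, N / 2]
    output_scale = (m, ceil_div(half_n, BLOCK), 2)  # [M, ceil((N / 2) / 64), 2]
    return output, output_scale


if __name__ == "__main__":
    ws = weight_scale_shape(e=4, k=256, n=512)
    print(ws)
    print(infer_output_shapes(16, ws))
```

The sketch uses integer ceiling division instead of `math.ceil` on a float so the result stays exact for large dimensions.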
GroupedMatmulSwigluQuantV2NpuOpapi.cpp
weight.size(2)
test/core_tests/test_fake_tensor.py
test/test_custom_ops/test_npu_grouped_matmul_swiglu_quant_v2.py
x_dtype/weight_dtype=torch_npu.float4_e2m1fn_x2
weight_scale_dtype/x_scale_dtype=torch_npu.float8_e8m0fnu
fix: grouped matmul swiglu quant v2 mxfp4 shape inference