
name: onnx2ms-runtime-operator-fix
description: Use when the user reports MindSpore Lite runtime errors or .ms model inference failures, or mentions "鸿蒙运行时报错", "ms模型报错", "推理报错", "推理失败", "运行时错误", "benchmark报错", "模型加载失败", "执行失败". Analyzes error logs, provides a diagnosis and suggestions, and can fix unsupported operators by replacing them with equivalent subgraphs.

ONNX to MS Runtime Operator Fix Skill

Analyzes and fixes runtime errors raised by MindSpore Lite (.ms) models during inference. It also detects operators that are unsupported on a specific device (CPU/NPU/GPU) and can replace them automatically.

When to Use

Trigger this skill when the user mentions any of the following phrases:

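  • "鸿蒙运行时报错" (HarmonyOS runtime error)
  • "ms模型报错" (.ms model error)
  • "推理报错" / "推理失败" (inference error / inference failed)
  • "运行时错误" (runtime error)
  • "benchmark报错" (benchmark error)
  • "模型加载失败" (model load failed)
  • "执行失败" (execution failed)
  • "NPU算子不支持" / "CPU算子不支持" (operator not supported on NPU / CPU)
  • "Split算子报错" / "Mod算子报错" (Split operator error / Mod operator error)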

Supported Error Types

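| Error Type | Pattern Keywords | Diagnosis | Suggestions |
|---|---|---|---|
| shape_mismatch | shape mismatch, dimension, size mismatch, expected shape | Input tensor dimensions do not match what the model expects | Check the input data shape; use the --inputShape parameter |
| dtype_error | dtype error, data type mismatch, type mismatch, float16, int8 | Input/output data type differs from the model definition | Re-convert with --inputDataType/--outputDataType |
| memory_error | OOM, out of memory, allocate failed, memory error | Inference failed because of insufficient memory | Enable INT8 quantization; reduce the batch size |
| unsupported_op_cpu | op not supported CPU, kernel not found CPU, Expand not supported | Operator is not implemented on the CPU device | Switch device, or fix with ohos_runtime_op_fix |
| unsupported_op_npu | op not supported NPU, kernel not found NPU, Split not supported | Operator is not implemented on the NPU device | Switch device, or fix with ohos_runtime_op_fix |
| unsupported_op_gpu | op not supported GPU, kernel not found GPU, OpenCL error | Operator is not implemented on the GPU device | Switch device, or fix with ohos_runtime_op_fix |
| unsupported_op_runtime | op not supported, unsupported operator (generic) | Operator is not implemented on the target device | Check the operator support list; try switching device |
| device_error | GPU not available, NPU error, OpenCL error, device failed | Target device is unavailable or has a driver problem | Check the device driver; switch to CPU |
| input_error | input error, input file not found, invalid input | Input data file is missing or malformed | Check the --inDataFile path |
| model_load_error | model load failed, model invalid, parse error, file corrupted | The .ms file is corrupted or malformed | Check file integrity; re-convert the model |
| quant_accuracy_error | accuracy threshold, cosine distance, precision loss | Accuracy falls below the threshold after INT8 quantization | Adjust the calibration dataset |
| negative_dimension | negative dimension, CheckTensorsInvalid, Resize() | Model contains dynamic shape dimensions | Specify the input dimensions with --inputShape |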
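
The keyword table suggests a simple substring-matching classifier. The following is a minimal sketch of that idea, not the skill's actual analyzer: the `ERROR_PATTERNS` dictionary is a hypothetical subset of the table, `match_errors` is an illustrative name, and the confidence formula (share of an error type's keywords found in the log) is an assumption based on the "pattern match ratio" note.

```python
from typing import Dict, List

# Hypothetical subset of the keyword table; the real patterns live in the skill's scripts.
ERROR_PATTERNS: Dict[str, List[str]] = {
    "shape_mismatch": ["shape mismatch", "size mismatch", "expected shape"],
    "memory_error": ["oom", "out of memory", "allocate failed", "memory error"],
    "device_error": ["gpu not available", "npu error", "opencl error", "device failed"],
}


def match_errors(error_log: str) -> List[dict]:
    """Return every error type that has at least one keyword present in the log."""
    log = error_log.lower()
    results = []
    for error_type, patterns in ERROR_PATTERNS.items():
        hits = [p for p in patterns if p in log]
        if hits:
            results.append({
                "error_type": error_type,
                "matched_patterns": hits,
                # Assumed definition: fraction of this type's keywords found in the log.
                "confidence": round(len(hits) / len(patterns), 2),
            })
    return results


if __name__ == "__main__":
    for match in match_errors("GPU not available, out of memory, allocate failed"):
        print(match["error_type"], match["matched_patterns"], match["confidence"])
```

Because the loop never stops at the first hit, a log that mentions several problems yields one entry per matching error type.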

Supported Operator Replacements

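| Operator | Device Issue | Replacement | Strategy | Description |
|---|---|---|---|---|
| Split | Not supported on NPU | Slices | split_to_slices | Replaces the Split operator with multiple Slice operators |
| Mod | Not supported on NPU | Sub+Mul+Div | mod_to_sub_mul_div | Replaces the Mod operator with an equivalent computation subgraph |
| Expand | Not supported on CPU | Tile | expand_to_tile | Replaces the Expand operator with a Tile operator |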
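
To make the three strategies concrete, here is a rough sketch of the arithmetic each one relies on, written in plain Python rather than as ONNX graph edits. The function names are hypothetical and this is not the code in runtime_op_replace.py. The Mod identity shown assumes floor division; whether the real subgraph inserts an explicit floor or relies on integer Div is not stated in this document.

```python
from typing import List, Tuple


def split_to_slice_ranges(split_sizes: List[int]) -> List[Tuple[int, int]]:
    """Turn Split sizes along one axis into (start, end) pairs, one per Slice."""
    ranges, start = [], 0
    for size in split_sizes:
        ranges.append((start, start + size))
        start += size
    return ranges


def mod_via_sub_mul_div(a: int, b: int) -> int:
    """Compute a mod b as a - b * floor(a / b), i.e. with Sub, Mul and a floored Div."""
    return a - b * (a // b)


def expand_to_tile_repeats(in_shape: List[int], target_shape: List[int]) -> List[int]:
    """Per-axis Tile repeats that reproduce a broadcast of in_shape against target_shape."""
    # Left-pad both shapes with 1s to a common rank; the input would need a
    # matching Reshape before Tile if its rank is smaller.
    rank = max(len(in_shape), len(target_shape))
    src = [1] * (rank - len(in_shape)) + list(in_shape)
    dst = [1] * (rank - len(target_shape)) + list(target_shape)
    repeats = []
    for s, d in zip(src, dst):
        if s == d or d == 1:
            repeats.append(1)   # axis already has its final size
        elif s == 1:
            repeats.append(d)   # size-1 axis is repeated d times
        else:
            raise ValueError(f"cannot broadcast dimension {s} to {d}")
    return repeats
```

For example, a Split with sizes `[2, 3, 5]` corresponds to three Slice ranges, which is consistent with one Split node expanding into several Slice nodes.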

Workflow

Case A: User Reports Runtime Error (Diagnosis Only)

  1. User provides error log describing the failure
  2. Call ohos_runtime_analyze with error_log parameter
  3. Script parses and matches all error types from log
  4. Detects device (CPU/NPU/GPU) and specific operator if applicable
  5. Return diagnosis and suggestions for each matched type
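
Step 4 can be sketched with two regular expressions. This is an illustrative guess at how device and operator detection might work, not the actual logic in error_parser.py; `KNOWN_OPERATORS` and `detect_device_and_operator` are hypothetical names. ASCII lookarounds are used instead of `\b` so that names directly followed by Chinese text (for example "Split算子") are still found.

```python
import re
from typing import Optional, Tuple

# Hypothetical operator vocabulary; a real analyzer would likely load it from configuration.
KNOWN_OPERATORS = ("Split", "Mod", "Expand")

# Match a device name that is not embedded in a longer ASCII word.
_DEVICE_RE = re.compile(r"(?<![A-Za-z])(CPU|GPU|NPU)(?![A-Za-z])", re.IGNORECASE)


def detect_device_and_operator(
    error_log: str, device_hint: Optional[str] = None
) -> Tuple[Optional[str], Optional[str]]:
    """Return (device, operator); either element is None when it cannot be detected."""
    device = device_hint.upper() if device_hint else None
    if device is None:
        found = _DEVICE_RE.search(error_log)
        device = found.group(1).upper() if found else None

    operator = None
    for op in KNOWN_OPERATORS:
        # Case-sensitive so that "Mod" does not match inside "model" or "Model".
        if re.search(rf"(?<![A-Za-z]){op}(?![A-Za-z])", error_log):
            operator = op
            break
    return device, operator
```

An explicit `device` argument takes precedence over whatever the log mentions, which mirrors the optional `device` parameter of ohos_runtime_analyze.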

Case B: NPU/CPU Unsupported Operator (Auto Fix)

  1. User reports: "模型在NPU上推理报错 Split算子不支持" (inference on the NPU fails because the Split operator is not supported)
  2. Call ohos_runtime_analyze with error_log and device="NPU"
  3. Detect: unsupported_op_npu with operator "Split"
  4. Check replacement config: Split has fix strategy (split_to_slices)
  5. Suggest: use ohos_runtime_op_fix to replace the Split operator
  6. Execute: ohos_runtime_op_fix(onnx_path, device="NPU", operators=["Split"])
  7. Generate fixed ONNX file
  8. Re-convert and re-run benchmark
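
Steps 3 to 6 amount to turning an analysis result into arguments for the fix tool. The sketch below assumes the result has the fields shown in the ohos_runtime_analyze output example (`matched_errors`, `operator`, `device`, `replacement.has_fix`, `detected_device`); `plan_op_fix` is a hypothetical helper, not part of the skill.

```python
from typing import Any, Dict, List, Optional


def plan_op_fix(analysis: Dict[str, Any], onnx_path: str) -> Optional[Dict[str, Any]]:
    """Build ohos_runtime_op_fix arguments, or return None when nothing is fixable."""
    device = analysis.get("detected_device")
    operators: List[str] = []
    for err in analysis.get("matched_errors", []):
        replacement = err.get("replacement") or {}
        op = err.get("operator")
        # Only operators with a configured strategy (has_fix) are worth passing on.
        if op and replacement.get("has_fix"):
            if op not in operators:
                operators.append(op)
            device = err.get("device") or device
    if not operators or device is None:
        return None
    return {"onnx_path": onnx_path, "device": device, "operators": operators}
```

Returning `None` when no operator has `has_fix` set keeps the diagnosis-only path (Case A) separate from the auto-fix path.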

Case C: Explicit Fix Request

  1. User calls ohos_runtime_op_fix with ONNX file path
  2. System detects unsupported operators for specified device
  3. Execute operator replacement using configured strategy
  4. Generate fixed ONNX file
  5. Suggest that the user re-run the conversion

Tools

ohos_runtime_analyze

Call: ohos_runtime_analyze
Arguments: {
    "error_log": "<error-log-string>",     // Required
    "ms_path": "<path-to-ms-file>",        // Optional, for context
    "device": "CPU|GPU|NPU"                 // Optional, target device
}

Output:

{
    "matched_errors": [
        {
            "error_type": "unsupported_op_npu",
            "diagnosis": "算子在NPU设备上未实现",
            "suggestions": [
                "使用 ohos_runtime_op_fix 工具替换 Split 算子",
                "替换策略: 将Split算子替换为多个Slice算子",
                ...
            ],
            "matched_patterns": ["op not supported NPU"],
            "confidence": 0.9,
            "operator": "Split",
            "device": "NPU",
            "replacement": {
                "operator": "Split",
                "strategy": "split_to_slices",
                "has_fix": true
            }
        }
    ],
    "total_matches": 1,
    "has_errors": true,
    "detected_device": "NPU",
    "detected_operator": "Split"
}

ohos_runtime_op_fix

Call: ohos_runtime_op_fix
Arguments: {
    "onnx_path": "<path-to-onnx-file>",
    "output_path": "<output-path>",        // Optional, defaults to *_runtime_fixed.onnx
    "device": "CPU|NPU|GPU",               // Required, target device
    "operators": ["Split", ...]            // Optional, auto-detect if not specified
}

Output:

{
    "success": true,
    "device": "NPU",
    "operators_replaced": ["Split"],
    "nodes_per_operator": {"Split": 3},
    "output_path": "model_runtime_fixed.onnx",
    "message": "成功替换 1 种算子"
}
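
The default for `output_path` is described only as `*_runtime_fixed.onnx`. One plausible reading, sketched below, is that the suffix is appended to the input file's stem and the file is written next to the input; the helper name `resolve_output_path` and the same-directory behavior are assumptions.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(onnx_path: str, output_path: Optional[str] = None) -> str:
    """Return the explicit output path, or derive <stem>_runtime_fixed.onnx from the input."""
    if output_path:
        return output_path
    src = Path(onnx_path)
    # Assumption: the fixed model is written alongside the original file.
    return str(src.with_name(f"{src.stem}_runtime_fixed.onnx"))
```

With this reading, `model.onnx` maps to `model_runtime_fixed.onnx`, which matches the `output_path` shown in the example output.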

Example Usage

Example 1: NPU Split Operator Error

User: "模型在NPU上推理报错 Split算子不支持" (inference on the NPU fails because the Split operator is not supported)

Call: ohos_runtime_analyze
Arguments: {
    "error_log": "op not supported: Split is not implemented for NPU device",
    "device": "NPU"
}

Result:
- error_type: unsupported_op_npu
- detected_operator: Split
- replacement: {strategy: "split_to_slices", has_fix: true}
- suggestions: use ohos_runtime_op_fix to replace the Split operator

Call: ohos_runtime_op_fix
Arguments: {
    "onnx_path": "model.onnx",
    "device": "NPU",
    "operators": ["Split"]
}

Result:
- 3 Split nodes replaced with Slices
- Output: model_runtime_fixed.onnx
- Suggest: re-run conversion

Example 2: CPU Expand Operator Error

User: "模型在CPU上报错 Expand算子不支持" (the model fails on the CPU because the Expand operator is not supported)

Call: ohos_runtime_analyze
Arguments: {
    "error_log": "kernel not found for op type Expand on CPU",
    "device": "CPU"
}

Result:
- error_type: unsupported_op_cpu
- detected_operator: Expand
- replacement: {strategy: "expand_to_tile", has_fix: true}

Call: ohos_runtime_op_fix
Arguments: {
    "onnx_path": "model.onnx",
    "device": "CPU"
}

Result:
- Expand nodes replaced with Tile
- Suggest: re-run conversion

Example 3: Multiple Errors

User: "运行时报错:GPU not available,而且 out of memory" (runtime error: GPU not available, and also out of memory)

Call: ohos_runtime_analyze
Arguments: {
    "error_log": "GPU not available, out of memory, allocate failed"
}

Result:
- matched_errors: [device_error, memory_error]
- total_matches: 2
- Each error type is returned with its own diagnosis and suggestions

File Locations

  • Analyzer Script: .opencode/skills/onnx2ms-runtime-operator-fix/scripts/runtime_error_analyzer.py
  • Operator Replace Script: .opencode/skills/onnx2ms-runtime-operator-fix/scripts/runtime_op_replace.py
  • Replacement Config: .opencode/skills/onnx2ms-runtime-operator-fix/config/operator_replacements.json
  • Error Examples: .opencode/skills/onnx2ms-runtime-operator-fix/examples/*.json
  • MCP Server: ohos_model_claw/mcp_server.py
  • Error Parser: ohos_model_claw/error_parser.py
  • Core Logic: ohos_model_claw/ohos_model_claw.py

Notes

  • Returns ALL matched error types (not just first match)
  • Device-specific patterns allow precise diagnosis (CPU/NPU/GPU)
  • Operator extraction from error logs enables targeted fixes
  • Replacement strategies are configured in operator_replacements.json
  • The confidence score indicates the pattern match ratio
  • Independent of onnx-converter-operator-fix (this skill handles runtime errors, not conversion errors)
  • Some operators may not have a replacement strategy (check the has_fix flag)

Relationship with Other Skills

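| Skill | Scope | Trigger | Fix Available |
|---|---|---|---|
| onnx-converter-operator-fix | Operator errors during conversion | ERROR logs during ONNX conversion | HardSigmoid/HardSwish/Swish/Mish |
| onnx2ms-runtime-operator-fix | Runtime inference errors | Errors reported during benchmark or inference | Split/Mod/Expand (device-specific) |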

This skill handles errors AFTER conversion with device-specific diagnosis and operator replacement, while onnx-converter-operator-fix handles errors DURING conversion.