
Inference Service

inference_service is the core AI execution package for IB-Robot. It provides a standardized framework for running end-to-end machine learning policies (such as ACT and pi0) on physical robots, with strict temporal alignment and zero-copy data passing to keep latency low.

Architecture: Composition over Inheritance

The inference pipeline is decoupled into three pure-Python core components (inference_service.core):

  1. TensorPreprocessor: Converts raw ROS 2 sensor data (images, joint states) into normalized PyTorch Tensors.
  2. PureInferenceEngine: A stateless, ROS-agnostic engine that runs the model forward pass on the selected backend (a GPU in the typical setup).
  3. TensorPostprocessor: Denormalizes output action tensors back into physical control commands.

By separating the core math from the ROS 2 transport layer, this package supports two distinct deployment modes, toggleable via a single YAML parameter.
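
As a rough illustration of what the pre- and postprocessing stages compute, the sketch below normalizes a joint-state vector with per-dimension mean and standard deviation and then inverts the transform. It uses plain Python lists rather than PyTorch tensors, and the function names, the mean/std scheme, and the statistics are illustrative assumptions, not the package's actual API.

```python
from typing import Sequence

EPS = 1e-8  # guards against division by zero for constant dimensions


def normalize(values: Sequence[float], mean: Sequence[float], std: Sequence[float]) -> list:
    """Map physical units to roughly zero-mean, unit-variance model inputs."""
    return [(v - m) / (s + EPS) for v, m, s in zip(values, mean, std)]


def denormalize(values: Sequence[float], mean: Sequence[float], std: Sequence[float]) -> list:
    """Inverse of normalize: map model outputs back to physical units."""
    return [v * (s + EPS) + m for v, m, s in zip(values, mean, std)]


# Hypothetical statistics for a 3-joint arm (radians).
mean = [0.0, 0.5, -0.2]
std = [1.0, 0.25, 0.5]

joint_state = [0.1, 0.75, 0.3]
model_input = normalize(joint_state, mean, std)   # what the engine would consume
recovered = denormalize(model_input, mean, std)   # round trip back to radians
```

Because both functions are pure and free of ROS types, the same code can run inside a node or in an offline test.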


🚀 Execution Modes

Mode A: Monolithic (Single-Machine Zero-Copy)

Best for: Robots equipped with high-performance onboard GPUs (e.g., RTX 4060).

In this mode, lerobot_policy_node.py instantiates an InferenceCoordinator that chains the Preprocessor, Engine, and Postprocessor together.

  • Data Flow: Sensor data stays entirely within the RAM/VRAM of the single process. Tensors are passed by reference.
  • Performance: Lowest latency of the two modes, with no serialization or deserialization between stages.
  • Config: execution_mode: "monolithic"
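
The chaining described above can be sketched as a small composition object. This is a minimal stand-in, not the real InferenceCoordinator: the class name `Coordinator`, its fields, and the toy stages are all hypothetical, and lists take the place of tensors. It only shows that the stages hand over the same in-memory object without copying.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Coordinator:
    """Hypothetical stand-in for InferenceCoordinator: three stages, one process."""
    preprocess: Callable[[Any], Any]
    infer: Callable[[Any], Any]
    postprocess: Callable[[Any], Any]

    def run(self, observation):
        batch = self.preprocess(observation)   # sensor data -> model input
        raw_action = self.infer(batch)         # model forward pass
        return self.postprocess(raw_action)    # model output -> command


seen_ids = []  # records which object each stage actually received


def toy_pre(obs):
    batch = {"state": obs["joints"]}  # no copy: reuses the same list object
    seen_ids.append(id(batch["state"]))
    return batch


def toy_infer(batch):
    seen_ids.append(id(batch["state"]))
    return [2.0 * x for x in batch["state"]]


coordinator = Coordinator(toy_pre, toy_infer, postprocess=lambda a: [round(x, 3) for x in a])
joints = [0.1, 0.2, 0.3]
command = coordinator.run({"joints": joints})
```

Both stages see the object identity of `joints`, which is the in-process analogue of passing tensors by reference.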

Mode B: Device-Edge-Cloud Synergy (Distributed)

Best for: Lightweight robots (Device) running on low-power CPUs (e.g., Raspberry Pi) paired with a high-end computation node (Edge) or tower server (Cloud) over a LAN.

To preserve compatibility with the pull-based action_dispatch system without clogging the network with 30fps video streams, the Device node acts as an Asynchronous Proxy.

  1. Device Node (lerobot_policy_node.py): Receives the action goal, reads the cameras on demand, runs the Preprocessor on CPU, and publishes the lightweight Tensor batch to /preprocessed/batch. The action callback then blocks on a threading.Event until the result arrives.
  2. Edge/Cloud Node (pure_inference_node.py): Subscribes to the batch, crunches the numbers on the GPU using PureInferenceEngine, and returns the raw action to /inference/action.
  3. Device Node: Wakes up, runs the Postprocessor, and completes the Action sequence.
  • Performance: Offloads the model forward pass to the remote machine. The Device sends only the frames needed for inference (e.g., at 20 Hz) instead of continuous 30 fps video streams, which greatly reduces network traffic.
  • Config: execution_mode: "distributed"
Device Machine (Robot / Sim)               GPU Machine (Edge/Cloud)
┌──────────────────────────────┐          ┌──────────────────────────┐
│  action_dispatcher_node      │          │                          │
│       ↓                      │          │  pure_inference_node     │
│  lerobot_policy_node (Proxy) │          │  ├─ Subscribe            │
│  ├─ TensorPreprocessor (CPU) │          │  │  /preprocessed/batch  │
│  ├─ threading.Event          │          │  ├─ PureInferenceEngine  │
│  └─ TensorPostprocessor(CPU) │          │  │  (GPU)                │
│       ↓ Pub        ↑ Sub     │          │  └─ Publish              │
│  /preprocessed  /inference   │          │     /inference/action    │
│  /batch         /action      │          │                          │
└──────────┬──────────┬────────┘          └───────┬──────────┬───────┘
           │          │ LAN (same ROS_DOMAIN_ID)  │          │
           └──────────┴───────────────────────────┴──────────┘
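
The wait-and-wake handshake on the Device side can be sketched with the standard library alone. The class `AsyncProxy`, its method names, and the timeout value are hypothetical; the ROS 2 publisher and subscription are replaced by plain callables, and a timer thread plays the role of the remote inference node.

```python
import threading


class AsyncProxy:
    """Hypothetical sketch of the Device-side wait/wake handshake."""

    def __init__(self, publish_batch, timeout_s=2.0):
        self._publish_batch = publish_batch   # stand-in for a ROS 2 publisher
        self._timeout_s = timeout_s           # assumed value, not from the package
        self._result_ready = threading.Event()
        self._result = None

    def on_remote_action(self, raw_action):
        """Stand-in for the /inference/action subscription callback."""
        self._result = raw_action
        self._result_ready.set()              # wake the waiting goal handler

    def handle_goal(self, batch):
        """Stand-in for the action callback: publish, then block until woken."""
        self._result_ready.clear()
        self._publish_batch(batch)
        if not self._result_ready.wait(self._timeout_s):
            raise TimeoutError("no action received from the inference node")
        return self._result


def fake_remote(batch):
    """Simulate the Edge/Cloud node: reply from another thread after 50 ms."""
    reply = [x + 1 for x in batch]
    threading.Timer(0.05, lambda: proxy.on_remote_action(reply)).start()


proxy = AsyncProxy(publish_batch=fake_remote)
action = proxy.handle_goal([1, 2, 3])
```

In a real rclpy node the subscription callback has to be able to run while the goal handler is blocked, which typically means a multi-threaded executor with separate callback groups; the timeout keeps a lost reply from stalling the action forever.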

⚙️ Configuration & Usage

The execution mode is set in your robot_config YAML files. You do not need to change launch files to switch modes on the device.

# src/robot_config/config/robots/your_robot.yaml
control_modes:
  model_inference:
    inference:
      enabled: true
      execution_mode: "distributed"  # Or "monolithic"
      model: so101_act
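
For illustration, a node could read and validate this setting roughly as follows. The helper `get_execution_mode` is hypothetical, the config is shown as an already-parsed mapping (for example the result of `yaml.safe_load`), and the assumption that `control_modes` sits at the top level of the file is taken from the snippet above.

```python
VALID_MODES = {"monolithic", "distributed"}


def get_execution_mode(robot_config: dict) -> str:
    """Return the execution mode from an already-parsed robot_config mapping."""
    inference = robot_config["control_modes"]["model_inference"]["inference"]
    mode = inference["execution_mode"]
    if mode not in VALID_MODES:
        raise ValueError(f"execution_mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    return mode


# Parsed equivalent of the YAML above.
config = {
    "control_modes": {
        "model_inference": {
            "inference": {
                "enabled": True,
                "execution_mode": "distributed",
                "model": "so101_act",
            }
        }
    }
}
mode = get_execution_mode(config)
```

Rejecting unknown values early makes a typo in the YAML fail at startup rather than silently selecting a mode.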

Launching

Both machines must share the same ROS_DOMAIN_ID and be on the same LAN.

Step 1 — On the Robot (Device):

The Device launches only the Edge proxy node (pre/post-processing), without loading GPU models:

export ROS_DOMAIN_ID=<your_domain_id>
ros2 launch robot_config robot.launch.py \
    robot_config:=so101_single_arm \
    control_mode:=model_inference \
    execution_mode:=distributed \
    use_sim:=true   # For simulation; omit for real hardware

Step 2 — On the GPU Server (Edge/Cloud):

export ROS_DOMAIN_ID=<same_domain_id_as_device>
ros2 launch inference_service cloud_inference.launch.py \
    policy_path:=/path/to/models/pretrained_model \
    device:=cuda

For models exported through the ATC/SVP toolchain, inference_service provides the OM wrappers migrated from the original LeRobot patches:

# Generic Ascend ACL .om backend
ros2 launch inference_service cloud_inference.launch.py \
    policy_path:=/path/to/pretrained_model \
    device:=ascend_om

# SD3403 worker binary protocol backend
ros2 launch inference_service cloud_inference.launch.py \
    policy_path:=/path/to/pretrained_model \
    device:=ascend_om_3403

device:=ascend_om resolves the .om model from om_model_path in policy_path/config.json, from ASCEND_OM_MODEL_PATH/OM_MODEL_PATH, or from common files under policy_path. device:=ascend_om_3403 additionally resolves the worker executable from cpp_executable, from SVP_WORKER_EXECUTABLE, or from common out/main layouts. Preprocessing, postprocessing, ROS topics, and distributed transport all continue to use the existing inference_service pipeline.
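
One way to picture the .om lookup is a fallback chain like the sketch below. Several things here are assumptions rather than documented behavior: the function name `resolve_om_model`, the precedence order, treating the uppercase names as environment variables, resolving relative paths against policy_path, and reading "common files" as any `*.om` directly under policy_path.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def resolve_om_model(policy_path: str, env: Optional[dict] = None) -> Optional[Path]:
    """Hypothetical lookup: config.json, then environment, then a scan of policy_path."""
    env = os.environ if env is None else env
    root = Path(policy_path)

    # 1. om_model_path declared in policy_path/config.json
    config_file = root / "config.json"
    if config_file.is_file():
        declared = json.loads(config_file.read_text()).get("om_model_path")
        if declared:
            candidate = Path(declared)
            # Assumption: relative paths are resolved against policy_path.
            return candidate if candidate.is_absolute() else root / candidate

    # 2. Assumption: these names are environment variables
    for name in ("ASCEND_OM_MODEL_PATH", "OM_MODEL_PATH"):
        if env.get(name):
            return Path(env[name])

    # 3. Assumption: "common files" means any *.om directly under policy_path
    matches = sorted(root.glob("*.om"))
    return matches[0] if matches else None


# Exercise the fallbacks against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model.om").write_bytes(b"")
    from_scan = resolve_om_model(tmp, env={})
    from_env = resolve_om_model(tmp, env={"OM_MODEL_PATH": "/models/act.om"})
    (Path(tmp) / "config.json").write_text(json.dumps({"om_model_path": "export/act.om"}))
    from_config = resolve_om_model(tmp, env={"OM_MODEL_PATH": "/models/act.om"})
    config_is_relative_to_policy = from_config == Path(tmp) / "export" / "act.om"
```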

For RK3588 / OpenHarmony boards running RKNN Lite, switch the cloud node to:

export ROS_DOMAIN_ID=<same_domain_id_as_device>
ros2 launch inference_service cloud_inference.launch.py \
    policy_path:=/path/to/models/pretrained_model \
    device:=rknn

device:=rknn still reads the LeRobot metadata in policy_path/config.json for preprocessing and postprocessing, and expects the RKNN artifact itself to live inside policy_path, with model.rknn as the default filename.

Scenario 2: Single-Machine Debug (Development)

Run both Edge + Cloud nodes on one machine by adding cloud_local:=true:

ros2 launch robot_config robot.launch.py \
    robot_config:=so101_single_arm \
    control_mode:=model_inference \
    execution_mode:=distributed \
    use_sim:=true \
    cloud_local:=true

Verifying Distributed Mode

# 1. Confirm both inference nodes are online
ros2 node list | grep -E 'act_inference|pure_inference'
# Expected:
#   /act_inference_node      ← Edge (pre/post-processing)
#   /pure_inference           ← Cloud (GPU inference)

# 2. Confirm distributed topics exist
ros2 topic list | grep -E 'preprocessed|inference/action'
# Expected:
#   /preprocessed/batch      ← Edge → Cloud
#   /inference/action         ← Cloud → Edge

# 3. Monitor inference frequency
ros2 topic hz /inference/action

Logging Reference

After launch, each node prints key lifecycle messages for quick status diagnosis:

| Node | Example Log | Meaning |
|------|-------------|---------|
| pure_inference | Waiting for preprocessed batches from edge node... | Cloud node ready, waiting for Edge data |
| pure_inference | ✓ First inference completed: latency=XXms | First inference succeeded, end-to-end link confirmed |
| pure_inference | [stats] count=XX, avg=XXms, last=XXms | Performance stats every 5 seconds |
| act_inference_node | ✓ First inference complete (distributed): total=XXms | Edge node completed full inference round-trip |
| action_dispatcher | ✓ First inference received: chunk=XX, latency=XXms | Dispatcher received first executable actions |
| action_dispatcher | [stats] inferences=XX, avg_latency=XXms, queue=XX, hold=XX | Dispatch stats every 5 seconds; hold = number of times the queue ran empty and the last frame was held |
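
If you want to track these numbers over time, the pure_inference stats line can be scraped from captured log output. The sketch below assumes the XX placeholders are plain integers or decimals and that the line has exactly the shape shown above; the helper name `parse_stats`, the log prefix, and the sample values are made up for illustration.

```python
import re

# Assumed shape of the pure_inference stats line; XX placeholders taken as numbers.
STATS_RE = re.compile(
    r"\[stats\] count=(?P<count>\d+), avg=(?P<avg>[\d.]+)ms, last=(?P<last>[\d.]+)ms"
)


def parse_stats(line: str):
    """Return count/avg/last from a stats log line, or None if it does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        "count": int(match["count"]),
        "avg_ms": float(match["avg"]),
        "last_ms": float(match["last"]),
    }


sample = "[pure_inference]: [stats] count=120, avg=41.7ms, last=39.2ms"  # made-up values
stats = parse_stats(sample)
```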

🧪 Testing

Because the core components are isolated from ROS, they can be validated entirely offline:

pytest src/inference_service/tests/