
Cluster Scheduling Component Infer Operator


Quick Reference

Infer Operator is maintained in the MindCluster Code Repository
Where to get help

Infer Operator

Infer Operator is a MindCluster cluster scheduling component deployed on management nodes. It is a Kubernetes Operator that deploys and manages multi-role collaborative inference tasks. Infer Operator defines three CRDs (InferServiceSet, InferService, and InstanceSet) and implements a controller for each of these resource types to reconcile the state of its instances.

Use Cases

MindCluster provides the Infer Operator component to launch inference services from an instance configuration and to scale inference instances manually.

Features

Creates inference instance Workloads and Services.
Supports manual scaling of inference instances.

Upstream and Downstream Dependencies

Infer Operator creates inference instance Workloads from the user-configured task YAML.
After the Workload controller creates the Pods, Volcano schedules them and makes the final resource selection.
If the Workload requests NPUs, Ascend Device Plugin obtains the NPU information and mounts the devices.

Supported Tags and Dockerfile Links

Tag Convention

Tags follow this format:

<version>-<os>

Field	Example	Description
`version`	`v26.1.0`	Infer Operator component version
`os`	`ubuntu22.04`	Infer Operator image operating system
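
As a small illustration of the tag convention, the sketch below composes a tag and splits it back into its fields. The helper names (make_tag, tag_version, tag_os) are hypothetical, and the split assumes the version part never contains a hyphen, as in v26.1.0.

```shell
# Compose a tag of the form <version>-<os>.
make_tag() { printf '%s-%s\n' "$1" "$2"; }

# Split a tag at its first hyphen.
# Assumption: the version field itself contains no hyphen (e.g. v26.1.0).
tag_version() { printf '%s\n' "${1%%-*}"; }   # text before the first hyphen
tag_os()      { printf '%s\n' "${1#*-}"; }    # text after the first hyphen

tag=$(make_tag v26.1.0 ubuntu22.04)
echo "tag=$tag version=$(tag_version "$tag") os=$(tag_os "$tag")"
```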

Infer Operator 26.1.0

Tag	Dockerfile	Image Content
`v26.1.0-ubuntu22.04`	Dockerfile.ubuntu	Infer Operator v26.1.0 image for Ubuntu 22.04
`v26.1.0-openeuler24.03`	Dockerfile.openeuler	Infer Operator v26.1.0 image for openEuler 24.03

Quick Start

Prerequisites

Software Dependencies

Software	Supported Versions	Installation Location	Description
Kubernetes	1.17.x to 1.34.x (1.19.x or later recommended)	All nodes	See Kubernetes Documentation
Volcano	See Volcano Kubernetes compatibility	Management nodes	Infer Operator depends on Volcano for resource scheduling
Ascend Device Plugin	Same version as Infer Operator	Compute nodes	Required when inference tasks use NPU
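
The Kubernetes version range in the table can be checked before deployment. The following is a minimal sketch, not part of Infer Operator: k8s_version_supported is a hypothetical helper that takes a version string such as v1.28.3 (for example, copied from the output of kubectl version) and tests it against the 1.17.x to 1.34.x range.

```shell
# Succeed if a Kubernetes version string lies in the 1.17.x to 1.34.x range.
k8s_version_supported() {
    v=${1#v}                  # drop an optional leading "v"
    major=${v%%.*}            # text before the first dot
    rest=${v#*.}              # text after the first dot
    minor=${rest%%.*}         # minor number, without the patch part
    minor=${minor%%[!0-9]*}   # strip any non-numeric suffix such as "+"
    [ "$major" = "1" ] && [ -n "$minor" ] && [ "$minor" -ge 17 ] && [ "$minor" -le 34 ]
}

for v in v1.16.4 v1.28.3; do
    if k8s_version_supported "$v"; then
        echo "$v: supported"
    else
        echo "$v: not supported"
    fi
done
```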

Hardware Requirements

Resource	Requirement
CPU	2 cores
Memory	2 GB

How to Build Locally

docker build --no-cache -t infer-operator:{tag} ./ -f Dockerfile.{os}

Note:

TARGETPLATFORM is a predefined build argument that Docker BuildKit sets to the target platform of the current build, for example linux/amd64 or linux/arm64.

BuildKit injects this variable automatically only when it is enabled. The variable is unavailable in older Docker versions and in environments where BuildKit is disabled by default. To enable BuildKit for the current shell session, run export DOCKER_BUILDKIT=1 before running the build command.
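
As a sketch of how the build command might be filled in, the snippet below derives the Dockerfile name from an image tag and prints the resulting command with BuildKit enabled. It assumes, based on the Dockerfile.ubuntu and Dockerfile.openeuler names in the tag table, that the Dockerfile suffix is the OS name without its version digits. dockerfile_for_tag is a hypothetical helper, and the command is only echoed, not executed.

```shell
# Derive the Dockerfile name from an image tag such as v26.1.0-ubuntu22.04.
# Assumption: the Dockerfile suffix is the OS name without its version digits.
dockerfile_for_tag() {
    os=${1#*-}              # e.g. ubuntu22.04
    family=${os%%[0-9]*}    # e.g. ubuntu
    printf 'Dockerfile.%s\n' "$family"
}

tag=v26.1.0-openeuler24.03
# Prefixing DOCKER_BUILDKIT=1 enables BuildKit for this one command only.
echo "DOCKER_BUILDKIT=1 docker build --no-cache -t infer-operator:$tag ./ -f $(dockerfile_for_tag "$tag")"
```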

Deploy Infer Operator

Pull the image

docker pull swr.cn-south-1.myhuaweicloud.com/ascendhub/infer-operator:{tag}

Retag the image

docker tag swr.cn-south-1.myhuaweicloud.com/ascendhub/infer-operator:{tag} infer-operator:{tag}

Start Infer Operator

Replace {tag} in the infer-operator-{version}.yaml file with the actual image tag.

kubectl apply -f infer-operator-{version}.yaml
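
The {tag} substitution can also be scripted with sed. The sketch below is illustrative only: render_manifest is a hypothetical helper, and the one-line demo file stands in for the real manifest, whose contents are not shown here.

```shell
# Replace every literal {tag} placeholder in a manifest with a concrete image tag.
# The result goes to a new file so the original manifest stays untouched.
render_manifest() {
    src=$1; tag=$2; dst=$3
    sed "s/{tag}/$tag/g" "$src" > "$dst"
}

# Demo on a stand-in file (assumed content, not the real manifest).
demo=$(mktemp)
printf 'image: infer-operator:{tag}\n' > "$demo"
render_manifest "$demo" v26.1.0-ubuntu22.04 "$demo.rendered"
cat "$demo.rendered"
```

The rendered file can then be passed to kubectl apply -f in place of the original manifest.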

Verify deployment

kubectl get pods -A | grep infer-operator
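
For scripted checks, the same listing can be filtered by status instead of being read by eye. The following is a rough sketch: infer_operator_running is a hypothetical helper that reads the output of kubectl get pods -A --no-headers on standard input and relies on the usual column order of that command (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Succeed only if at least one infer-operator Pod is listed on stdin and
# every such Pod reports STATUS "Running".
# Expected columns: NAMESPACE NAME READY STATUS RESTARTS AGE
infer_operator_running() {
    awk '$2 ~ /infer-operator/ { found = 1; if ($4 != "Running") bad = 1 }
         END { if (found && !bad) exit 0; exit 1 }'
}

# Usage against a live cluster:
#   kubectl get pods -A --no-headers | infer_operator_running && echo "Infer Operator is running"
```

A Running status does not by itself mean that all containers are ready, so the READY column is still worth checking.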

Supported Hardware

For the supported Ascend hardware models, see the official documentation: Supported Product Formats and OS List

License

View the license information for the Mind series software contained in these images.

As with all container images, pre-installed software packages (Python, system libraries, etc.) may be subject to their respective license agreements.