Cluster Scheduling Component Infer Operator
Quick Reference
- Infer Operator is maintained by MindCluster Code Repository
- Where to get help
Infer Operator
Infer Operator is a MindCluster cluster scheduling component deployed on management nodes. It is a Kubernetes Operator that deploys and manages multi-role collaborative inference tasks. Infer Operator defines three CRDs (InferServiceSet, InferService, and InstanceSet) and implements a controller for each of these resource types to reconcile the state of its instances.
Use Cases
MindCluster provides the Infer Operator component to launch inference services based on instance configuration and supports manual scaling of inference instances.
Features
- Creates inference instance Workloads and Services.
- Supports manual scaling of inference instances.
Upstream and Downstream Dependencies
- Infer Operator creates inference instance Workloads based on the user-configured task YAML.
- After the Workload Controller creates Pods, Volcano performs the final resource selection.
- If the Workload requests NPU cards, Ascend Device Plugin obtains NPU information and completes device mounting.
Supported Tags and Dockerfile Links
Tag Convention
Tags follow this format:
<version>-<os>
| Field | Example | Description |
|---|---|---|
| version | v26.1.0 | Infer Operator component version |
| os | ubuntu22.04 | Infer Operator image operating system |
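As a quick illustration of the tag format, the following sketch splits a tag into its two fields using POSIX shell parameter expansion. The function name parse_tag is hypothetical and is not part of Infer Operator. It assumes the version field itself contains no hyphen, which holds for the tags listed in this document.

```shell
# parse_tag: hypothetical helper that splits <version>-<os> at the first hyphen.
parse_tag() {
    tag=$1
    version=${tag%%-*}   # text before the first hyphen
    os=${tag#*-}         # text after the first hyphen
    # Reject tags with no hyphen or with an empty field.
    case $tag in
        *-*) ;;
        *) echo "invalid tag: $tag" >&2; return 1 ;;
    esac
    if [ -z "$version" ] || [ -z "$os" ]; then
        echo "invalid tag: $tag" >&2
        return 1
    fi
    printf 'version=%s os=%s\n' "$version" "$os"
}

parse_tag v26.1.0-ubuntu22.04   # prints: version=v26.1.0 os=ubuntu22.04
```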
Infer Operator 26.1.0
| Tag | Dockerfile | Image Content |
|---|---|---|
| v26.1.0-ubuntu22.04 | Dockerfile.ubuntu | Infer Operator v26.1.0 image for Ubuntu 22.04 |
| v26.1.0-openeuler24.03 | Dockerfile.openeuler | Infer Operator v26.1.0 image for openEuler 24.03 |
Quick Start
Prerequisites
Software Dependencies
| Software | Supported Versions | Installation Location | Description |
|---|---|---|---|
| Kubernetes | 1.17.x to 1.34.x (1.19.x or later recommended) | All nodes | See Kubernetes Documentation |
| Volcano | See Volcano Kubernetes compatibility | Management nodes | Infer Operator depends on Volcano for resource scheduling |
| Ascend Device Plugin | Same version as Infer Operator | Compute nodes | Required when inference tasks use NPU |
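The Kubernetes version range in the table can be checked before installation. The sketch below is one possible way to do this in POSIX shell. The function name k8s_minor_supported is hypothetical, and the bounds 17 and 34 are taken from the supported range listed above. It expects a version string such as v1.28.3, as reported for the cluster by kubectl.

```shell
# k8s_minor_supported: hypothetical helper, not part of Infer Operator.
# Succeeds (exit 0) when the version is 1.17.x through 1.34.x.
k8s_minor_supported() {
    ver=${1#v}                 # drop a leading "v": v1.28.3 -> 1.28.3
    major=${ver%%.*}           # 1
    rest=${ver#*.}             # 28.3
    minor=${rest%%.*}          # 28
    minor=${minor%%[!0-9]*}    # strip non-numeric suffixes such as "28+"
    [ "$major" = "1" ] && [ -n "$minor" ] &&
        [ "$minor" -ge 17 ] && [ "$minor" -le 34 ]
}

if k8s_minor_supported v1.28.3; then
    echo "supported"
else
    echo "unsupported"
fi
```

The check covers only the supported range. The recommendation to use 1.19.x or later is not enforced here.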
Hardware Requirements
| Resource | Requirement |
|---|---|
| CPU | 2 cores |
| Memory | 2 GB |
How to Build Locally
docker build --no-cache -t infer-operator:{tag} ./ -f Dockerfile.{os}
Note:
- TARGETPLATFORM is a predefined build argument provided by Docker BuildKit. It holds the target platform of the current build, for example linux/amd64 or linux/arm64.
- This variable is injected automatically only when BuildKit is enabled. It is unavailable in older Docker versions and in environments where BuildKit is disabled by default. To enable BuildKit for the current shell session, run export DOCKER_BUILDKIT=1 before running the build command.
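To make the BuildKit note concrete, here is a minimal sketch that assembles a build command with BuildKit enabled for that single invocation and an explicit target platform. The helper name print_build_cmd is hypothetical. The sketch assumes the provided Dockerfiles read TARGETPLATFORM, which BuildKit derives from the --platform option of docker build. The helper only prints the command so that it can be reviewed before it is run.

```shell
# print_build_cmd: hypothetical helper, not part of Infer Operator.
#   $1 image tag, e.g. v26.1.0-ubuntu22.04
#   $2 Dockerfile suffix, e.g. ubuntu or openeuler
#   $3 target platform, e.g. linux/amd64 or linux/arm64
print_build_cmd() {
    if [ "$#" -ne 3 ]; then
        echo "usage: print_build_cmd <tag> <dockerfile-os> <platform>" >&2
        return 1
    fi
    printf 'DOCKER_BUILDKIT=1 docker build --no-cache --platform %s -t infer-operator:%s ./ -f Dockerfile.%s\n' \
        "$3" "$1" "$2"
}

print_build_cmd v26.1.0-ubuntu22.04 ubuntu linux/arm64
```

Setting DOCKER_BUILDKIT=1 as a command prefix limits the setting to that one build, whereas export keeps it for the rest of the shell session.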
Deploy Infer Operator
- Pull the image
docker pull swr.cn-south-1.myhuaweicloud.com/ascendhub/infer-operator:{tag}
- Retag the image
docker tag swr.cn-south-1.myhuaweicloud.com/ascendhub/infer-operator:{tag} infer-operator:{tag}
- Start Infer Operator
Replace {tag} in the infer-operator-{version}.yaml file with the actual image tag.
kubectl apply -f infer-operator-{version}.yaml
- Verify deployment
kubectl get pods -A | grep infer-operator
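The "Start Infer Operator" step asks for {tag} to be replaced in the deployment YAML. One way to do that without editing the file by hand is sketched below. The helper name render_manifest is hypothetical. It assumes the placeholder appears literally as {tag} in the file and that the tag contains no / or & characters, which is true for tags in the <version>-<os> format.

```shell
# render_manifest: hypothetical helper, not part of Infer Operator.
# Prints the manifest with every literal {tag} replaced; the source file is left unchanged.
#   $1 path to the manifest file
#   $2 image tag, e.g. v26.1.0-ubuntu22.04
render_manifest() {
    if [ "$#" -ne 2 ] || [ ! -f "$1" ]; then
        echo "usage: render_manifest <manifest-file> <tag>" >&2
        return 1
    fi
    # In a basic regular expression, { and } match themselves.
    sed "s/{tag}/$2/g" "$1"
}
```

The rendered output can be inspected first and then piped to kubectl, for example `render_manifest infer-operator-{version}.yaml v26.1.0-ubuntu22.04 | kubectl apply -f -`, where {version} still stands for the actual file version.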
Supported Hardware
For descriptions of supported Ascend hardware models, please refer to the official documentation: Supported Product Formats and OS List
License
View the license information for the Mind series software contained in these images.
As with all container images, pre-installed software packages (Python, system libraries, etc.) may be subject to their respective license agreements.