When you deploy GPU computing jobs in an ACK managed cluster Pro, you can assign scheduling property labels to GPU nodes. These labels control exclusive, shared, and topology-aware scheduling, as well as card model scheduling, and help you optimize resource utilization and schedule applications precisely.
Scheduling labels
GPU scheduling labels identify GPU models and resource allocation policies. This enables fine-grained resource management and efficient scheduling.
| Scheduling feature | Label value | Scenarios |
| --- | --- | --- |
| Exclusive scheduling (default) | `ack.node.gpu.schedule: default` | High-performance jobs that require exclusive use of an entire GPU card, such as model training and HPC. |
| Shared scheduling | `ack.node.gpu.schedule: cgpu`, `core_mem`, `share`, or `mps` | Improves GPU utilization. Suitable for scenarios where multiple lightweight jobs run concurrently, such as multitenancy or inference workloads. |
| Multi-card shared scheduling | `ack.node.gpu.placement: binpack` or `spread` | Optimizes the resource allocation policy for multiple GPU cards on a single node after shared scheduling is enabled. |
| Topology-aware scheduling | `ack.node.gpu.schedule: topology` | Automatically assigns pods to the GPU combination with the optimal communication bandwidth based on the physical GPU topology within a single node. Suitable for jobs that are sensitive to inter-GPU communication latency. |
| Card model scheduling | `aliyun.accelerator/nvidia_name: <GPU model>`. Can be used together with labels that specify the video memory capacity and total number of GPU cards for a GPU job. | Schedules jobs to nodes with specified GPU models or avoids nodes with specified models. |
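For reference, these labels appear in the node's metadata. The following is a minimal sketch of what the labels might look like on a node that uses shared scheduling; the specific values are illustrative and depend on the policies you enable:

```yaml
# Excerpt from `kubectl get node <NODE_NAME> -o yaml` (illustrative values)
metadata:
  labels:
    ack.node.gpu.schedule: cgpu                            # resource allocation policy
    ack.node.gpu.placement: binpack                        # multi-card allocation policy (optional)
    aliyun.accelerator/nvidia_name: Tesla-V100-SXM2-32GB   # GPU model reported for the node
```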
Enable scheduling features
Exclusive scheduling
If a node has no GPU scheduling labels, exclusive scheduling is enabled by default. In this mode, the node allocates GPU resources to pods in units of a single GPU.
If another GPU scheduling feature was enabled on the node, removing its label does not restore exclusive scheduling. You must manually change the label value to `ack.node.gpu.schedule: default` to restore exclusive scheduling.
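For example, assuming <NODE_NAME> is the name of your node, the following command (also used in the topology-aware scheduling section below) resets the label:

```bash
# Overwrite the current scheduling label to restore exclusive (default) GPU scheduling
kubectl label node <NODE_NAME> ack.node.gpu.schedule=default --overwrite
```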
Shared scheduling
Shared scheduling is supported only in ACK managed cluster Pro. For more information, see Limits.
Install the `ack-ai-installer` shared scheduling component.
1. Log on to the ACK console. In the navigation pane on the left, click Clusters.
2. On the Clusters page, find the cluster that you want and click its name. In the left-side navigation pane, open the Cloud-native AI Suite page.
3. On the Cloud-native AI Suite page, click Deploy. On the Deploy Cloud-native AI Suite page, select Scheduling Policy Extension (Batch Scheduling, GPU Sharing, GPU Topology Awareness). For more information about how to set the computing power scheduling policy for cGPU, see Install and use the cGPU component.
4. Click Deploy Cloud-native AI Suite.
5. On the Cloud-native AI Suite page, find the installed shared GPU scheduling component `ack-ai-installer` in the component list.
Enable the shared scheduling feature.
1. On the Clusters page, click the name of the target cluster. In the navigation pane on the left, open the Node Pools page.
2. On the Node Pools page, click Create Node Pool, configure the node labels as described below, and then click Confirm. You can keep the default settings for the other configuration items. For more information about the scenarios for each node label, see Scheduling labels.
   - Configure basic shared scheduling: click the icon next to Node Labels, set the Key to `ack.node.gpu.schedule`, and set the value to one of `cgpu`, `core_mem`, `share`, or `mps` (the `mps` value requires the MPS Control Daemon component to be installed).
   - Configure multi-card shared scheduling: if a node has multiple GPUs, you can also configure multi-card shared scheduling to optimize resource allocation. Click the icon next to Node Labels, set the Key to `ack.node.gpu.placement`, and set the value to `binpack` or `spread`.
Verify that shared scheduling is enabled.

cgpu / share / mps

Replace <NODE_NAME> with the name of your target node and run the following command to verify that cgpu, share, or mps shared scheduling is enabled for the node pool.

```bash
kubectl get nodes <NODE_NAME> -o yaml | grep "aliyun.com/gpu-mem"
```

Expected output:

```
aliyun.com/gpu-mem: "60"
```
If the value of the `aliyun.com/gpu-mem` field is not 0, `cgpu`, `share`, or `mps` shared scheduling is enabled.

core_mem
Replace <NODE_NAME> with the name of your target node and run the following command to verify that core_mem shared scheduling is enabled for the node pool.

```bash
kubectl get nodes <NODE_NAME> -o yaml | grep -E 'aliyun\.com/gpu-core\.percentage|aliyun\.com/gpu-mem'
```

Expected output:

```
aliyun.com/gpu-core.percentage: "80"
aliyun.com/gpu-mem: "6"
```
If the values of the `aliyun.com/gpu-core.percentage` and `aliyun.com/gpu-mem` fields are not 0, `core_mem` shared scheduling is enabled.

binpack
Use the GPU resource query tool for shared GPU scheduling and run the following command to query the GPU resource allocation of the node:

```bash
kubectl inspect cgpu
```

Expected output:

```
NAME                      IPADDRESS     GPU0(Allocated/Total)  GPU1(Allocated/Total)  GPU2(Allocated/Total)  GPU3(Allocated/Total)  GPU Memory(GiB)
cn-shanghai.192.0.2.109   192.0.2.109   15/15                  9/15                   0/15                   0/15                   24/60
--------------------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster: 24/60 (40%)
```
The output shows that GPU0 is fully allocated (15/15) and GPU1 is partially allocated (9/15). This matches the strategy of filling one GPU before allocating resources to the next, which confirms that the `binpack` policy is in effect.

spread
Use the GPU resource query tool for shared GPU scheduling and run the following command to query the GPU resource allocation of the node:

```bash
kubectl inspect cgpu
```

Expected output:

```
NAME                      IPADDRESS     GPU0(Allocated/Total)  GPU1(Allocated/Total)  GPU2(Allocated/Total)  GPU3(Allocated/Total)  GPU Memory(GiB)
cn-shanghai.192.0.2.109   192.0.2.109   4/15                   4/15                   0/15                   4/15                   12/60
--------------------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster: 12/60 (20%)
```
The output shows that 4/15 of the resources are allocated to GPU0, 4/15 to GPU1, and 4/15 to GPU3. This confirms that the `spread` policy is in effect because the pods are distributed across different GPUs.
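After shared scheduling is enabled, workloads consume the shared resources shown above through their resource limits. The following is a minimal pod sketch; the pod name and image are illustrative assumptions, and the commented `aliyun.com/gpu-core.percentage` line applies only to `core_mem` nodes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-sample                         # hypothetical name for illustration
spec:
  containers:
  - name: cuda-container
    image: nvidia/cuda:11.8.0-base-ubuntu22.04   # any GPU-capable image
    command: ["sleep", "infinity"]
    resources:
      limits:
        aliyun.com/gpu-mem: 4                    # request 4 GiB of GPU memory on a shared GPU
        # On core_mem nodes you can also request a share of compute, for example:
        # aliyun.com/gpu-core.percentage: 30
```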
Topology-aware scheduling
Topology-aware scheduling is supported only in ACK managed cluster Pro. For more information, see System component version requirements.
Enable topology-aware scheduling.
Replace <NODE_NAME> with the name of your target node and run the following command to add a label to the node. This activates the topology-aware scheduling feature for the node.
```bash
kubectl label node <NODE_NAME> ack.node.gpu.schedule=topology
```
After you activate topology-aware scheduling for a node, it no longer supports scheduling for non-topology-aware GPU resources. You can run the `kubectl label node <NODE_NAME> ack.node.gpu.schedule=default --overwrite` command to change the label and restore exclusive scheduling.

Verify that topology-aware scheduling is enabled.
Replace <NODE_NAME> with the name of your target node and run the following command to verify that topology-aware scheduling is enabled for the node pool.

```bash
kubectl get nodes <NODE_NAME> -o yaml | grep aliyun.com/gpu
```
Expected output:

```
aliyun.com/gpu: "2"
```
If the value of the `aliyun.com/gpu` field is not 0, topology-aware scheduling is enabled.
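As an illustration of how the `aliyun.com/gpu` resource shown above is consumed, the following is a minimal pod sketch that requests two GPUs on a topology-aware node. The pod name and image are illustrative assumptions, and depending on your environment, topology-aware jobs may need to be submitted through a job framework that supports this feature:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-topology-sample                      # hypothetical name for illustration
spec:
  containers:
  - name: cuda-container
    image: nvidia/cuda:11.8.0-base-ubuntu22.04   # illustrative image
    command: ["sleep", "infinity"]
    resources:
      limits:
        aliyun.com/gpu: 2                        # request two GPUs exposed by the topology-aware node
```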
Card model scheduling
You can schedule a Job to nodes with a specified GPU model, or prevent it from being scheduled to nodes with a specific model.
Check the GPU model of the node.
Run the following command to query the GPU models of the nodes in the cluster. The GPU model name is shown in the NVIDIA_NAME column.

```bash
kubectl get nodes -L aliyun.accelerator/nvidia_name
```
The expected output is similar to the following:

```
NAME                        STATUS   ROLES    AGE   VERSION            NVIDIA_NAME
cn-shanghai.192.XX.XX.176   Ready    <none>   17d   v1.26.3-aliyun.1   Tesla-V100-SXM2-32GB
cn-shanghai.192.XX.XX.177   Ready    <none>   17d   v1.26.3-aliyun.1   Tesla-V100-SXM2-32GB
```
Enable card model scheduling.
On the Clusters page, find the cluster that you want and click its name. In the left-side navigation pane, open the Jobs page.
On the Jobs page, click Create From YAML. Use the following examples to create an application and enable the card model scheduling feature.
Specify a particular card model
Use the GPU model label to run your application on nodes with a specific GPU model.
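For example, the following Job sketch pins the workload to nodes with a specific GPU model through a nodeSelector. The Job name, image, and GPU request are illustrative assumptions; only the `aliyun.accelerator/nvidia_name` label comes from this topic:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-model-selector-sample                    # hypothetical name for illustration
spec:
  template:
    spec:
      nodeSelector:
        aliyun.accelerator/nvidia_name: "Tesla-V100-SXM2-32GB"
      containers:
      - name: main
        image: nvidia/cuda:11.8.0-base-ubuntu22.04   # illustrative image
        command: ["nvidia-smi"]
        resources:
          limits:
            nvidia.com/gpu: 1                        # request one whole GPU (illustrative)
      restartPolicy: Never
```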
Replace `Tesla-V100-SXM2-32GB` in the `aliyun.accelerator/nvidia_name: "Tesla-V100-SXM2-32GB"` line with the actual GPU model of your node. After the Job is created, you can open the pod list from the navigation pane on the left. In the pod list, you can see that the example pod is successfully scheduled to a matching node, which demonstrates flexible scheduling based on the GPU model label.

Exclude a particular card model
Use the GPU model label with node affinity and anti-affinity to prevent your application from running on certain GPU models.
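For example, the following Job sketch uses required node anti-affinity to keep the workload off nodes with a specific GPU model. The Job name, image, and GPU request are illustrative assumptions:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-model-exclude-sample                     # hypothetical name for illustration
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: aliyun.accelerator/nvidia_name
                operator: NotIn                      # avoid nodes with the listed GPU model
                values:
                - "Tesla-V100-SXM2-32GB"
      containers:
      - name: main
        image: nvidia/cuda:11.8.0-base-ubuntu22.04   # illustrative image
        command: ["nvidia-smi"]
        resources:
          limits:
            nvidia.com/gpu: 1                        # request one whole GPU (illustrative)
      restartPolicy: Never
```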
Replace `Tesla-V100-SXM2-32GB` in the `values: ["Tesla-V100-SXM2-32GB"]` list with the actual GPU model of your node. After the Job is created, the application is not scheduled to nodes that have the label key `aliyun.accelerator/nvidia_name` with the value `Tesla-V100-SXM2-32GB`. However, it can still be scheduled to GPU nodes with other GPU models.