Topic: AI | Level: Intermediate | ⏱ 30 minutes | Requires: Kubernetes 1.28+

Installing NVIDIA KAI Scheduler for AI Workloads

Deploy KAI Scheduler for optimized GPU resource allocation in Kubernetes AI/ML clusters with hierarchical queues and batch scheduling

By Luca Berton

The Problem

The standard Kubernetes scheduler isn't optimized for AI/ML workloads. It lacks features like gang scheduling, GPU sharing, hierarchical resource quotas, and topology-aware placement that are essential for efficiently running training and inference workloads at scale.

The Solution

Deploy NVIDIA KAI Scheduler, a robust scheduler designed for large-scale GPU clusters. It provides batch scheduling, hierarchical queues, GPU sharing, elastic workloads, and topology-aware scheduling optimized for AI workloads.

KAI Scheduler Architecture

flowchart TB
    subgraph cluster["☸️ KUBERNETES CLUSTER"]
        subgraph kai["🎯 KAI SCHEDULER"]
            SC["Scheduler<br/>Controller"]
            PG["PodGrouper"]
            AD["Admission<br/>Controller"]
            RR["Resource<br/>Reservation"]
        end
        
        subgraph queues["📊 HIERARCHICAL QUEUES"]
            RQ["Root Queue<br/>(Cluster Resources)"]
            TQ["Training Queue<br/>quota: 60%"]
            IQ["Inference Queue<br/>quota: 30%"]
            DQ["Dev Queue<br/>quota: 10%"]
        end
        
        subgraph nodes["🖥️ GPU NODES"]
            N1["Node 1<br/>8x A100"]
            N2["Node 2<br/>8x A100"]
            N3["Node 3<br/>8x H100"]
        end
        
        subgraph workloads["🚀 WORKLOADS"]
            W1["Training Job<br/>PodGroup"]
            W2["Inference<br/>Deployment"]
            W3["Interactive<br/>Notebook"]
        end
    end
    
    workloads --> AD
    AD --> SC
    SC --> PG
    PG --> queues
    queues --> nodes

Step 1: Verify Prerequisites

# Check Kubernetes version (the --short flag has been removed from recent kubectl releases)
kubectl version

# Verify GPU Operator is installed
kubectl get pods -n gpu-operator

# Check GPU nodes are available
kubectl get nodes -l nvidia.com/gpu.present=true

# Verify GPUs are discoverable
kubectl describe node <gpu-node> | grep nvidia.com/gpu
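
If you want to size the queue quotas in Step 3 against real capacity, an optional listing of allocatable GPUs per node can help; it relies only on the standard nvidia.com/gpu resource exposed by the GPU Operator:

# List allocatable GPUs per node (dots in the resource name are escaped for JSONPath)
kubectl get nodes -l nvidia.com/gpu.present=true \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'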

Step 2: Install KAI Scheduler

# Create namespace
kubectl create namespace kai-scheduler

# Install the Helm chart from the project's OCI registry on ghcr.io (recommended)
# Check https://github.com/NVIDIA/KAI-Scheduler/releases and adjust --version to the latest release
helm upgrade -i kai-scheduler \
  oci://ghcr.io/nvidia/kai-scheduler/kai-scheduler \
  -n kai-scheduler \
  --version v0.12.10

# Verify installation
kubectl get pods -n kai-scheduler

Expected output:

NAME                                    READY   STATUS    RESTARTS   AGE
kai-scheduler-0                         1/1     Running   0          60s
kai-scheduler-admission-xxx             1/1     Running   0          60s
kai-scheduler-podgrouper-xxx            1/1     Running   0          60s
kai-scheduler-resource-reservation-xxx  1/1     Running   0          60s
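
As an additional check, confirm that the Queue CRD used in the next step is registered. The CRD name below is derived from the plural "queues" and the scheduling.run.ai API group that the Step 3 manifests use; if your chart version names it differently, the grep fallback will show it:

# Queue CRD registered by the chart (name derived from the API group used in Step 3)
kubectl get crd queues.scheduling.run.ai

# Fallback: list any run.ai CRDs installed
kubectl get crd | grep run.ai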

Step 3: Configure Default Queues

# queues.yaml
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: root
spec:
  displayName: "Root Queue"
  resources:
    gpu:
      quota: -1  # Unlimited (cluster total)
      limit: -1
---
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: training
spec:
  displayName: "Training Queue"
  parentQueue: root
  resources:
    gpu:
      quota: 16      # Guaranteed GPUs
      limit: 24      # Max GPUs (over-quota)
      overQuotaWeight: 2  # Priority for over-quota
---
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: inference
spec:
  displayName: "Inference Queue"
  parentQueue: root
  resources:
    gpu:
      quota: 8
      limit: 16
      overQuotaWeight: 1
---
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: development
spec:
  displayName: "Development Queue"
  parentQueue: root
  resources:
    gpu:
      quota: 4
      limit: 8
      overQuotaWeight: 0.5

kubectl apply -f queues.yaml
kubectl get queues
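
For a quick view of the hierarchy you just created, custom columns against the same fields defined in queues.yaml work well (the field paths are taken directly from the Queue spec above):

# Show each queue with its parent, guaranteed quota, and limit
kubectl get queues -o custom-columns=NAME:.metadata.name,PARENT:.spec.parentQueue,QUOTA:.spec.resources.gpu.quota,LIMIT:.spec.resources.gpu.limit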

Step 4: Create a Workload Namespace

# workload-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ai-workloads
  labels:
    # Associate namespace with a queue
    runai/queue: training

kubectl apply -f workload-namespace.yaml
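
Optionally, confirm the association; the -L flag prints the runai/queue label as an extra column:

# The extra column should show "training" for ai-workloads
kubectl get namespace ai-workloads -L runai/queue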

Step 5: Submit a Simple GPU Workload

# simple-gpu-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-test
  namespace: ai-workloads
  labels:
    runai/queue: training
spec:
  template:
    metadata:
      labels:
        runai/queue: training
    spec:
      schedulerName: kai-scheduler
      restartPolicy: Never
      containers:
      - name: cuda-test
        image: nvcr.io/nvidia/cuda:12.0-base
        command: ["nvidia-smi"]
        resources:
          limits:
            nvidia.com/gpu: 1

kubectl apply -f simple-gpu-job.yaml
kubectl get pods -n ai-workloads -w
kubectl logs job/gpu-test -n ai-workloads
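
The same pattern extends to multi-pod training. The manifest below is a hypothetical four-worker Job: given the PodGrouper component described in the architecture above, KAI groups a workload's pods so they can be scheduled as a batch, but the exact gang-scheduling semantics (such as the minimum member count) depend on your KAI release, so treat this as a sketch rather than a verified recipe:

# multi-gpu-training-job.yaml (illustrative)
apiVersion: batch/v1
kind: Job
metadata:
  name: multi-gpu-training
  namespace: ai-workloads
  labels:
    runai/queue: training
spec:
  parallelism: 4       # four workers, one GPU each
  completions: 4
  template:
    metadata:
      labels:
        runai/queue: training
    spec:
      schedulerName: kai-scheduler
      restartPolicy: Never
      containers:
      - name: trainer
        image: nvcr.io/nvidia/cuda:12.0-base   # reusing the image from the single-GPU test
        command: ["nvidia-smi"]                # stand-in for your real training command
        resources:
          limits:
            nvidia.com/gpu: 1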

Step 6: Configure Priority Classes

# priority-classes.yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: kai-high-priority
value: 100
globalDefault: false
description: "High priority for production inference"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: kai-normal-priority
value: 50
globalDefault: true
description: "Normal priority for training jobs"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: kai-low-priority
value: 10
globalDefault: false
description: "Low priority for development/interactive"
preemptionPolicy: Never  # Won't preempt others

kubectl apply -f priority-classes.yaml
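
To put the classes to use, reference them from the workload spec. The following is a minimal sketch of a production inference Deployment combining the inference queue from Step 3 with kai-high-priority (all names are reused from the manifests above; note that ai-workloads defaults to the training queue via its namespace label, and the pod-level label here is assumed to take precedence, so verify that behavior on your KAI version):

# inference-deployment.yaml (illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-serving
  namespace: ai-workloads
  labels:
    runai/queue: inference
spec:
  replicas: 2
  selector:
    matchLabels:
      app: model-serving
  template:
    metadata:
      labels:
        app: model-serving
        runai/queue: inference   # assumed to override the namespace's default queue
    spec:
      schedulerName: kai-scheduler
      priorityClassName: kai-high-priority   # can preempt lower-priority workloads when GPUs are scarce
      containers:
      - name: server
        image: nvcr.io/nvidia/cuda:12.0-base  # placeholder; substitute your inference server image
        command: ["sleep", "infinity"]
        resources:
          limits:
            nvidia.com/gpu: 1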

Step 7: OpenShift Installation (Optional)

# For OpenShift with GPU Operator < v25.10.0
helm upgrade -i kai-scheduler \
  oci://ghcr.io/nvidia/kai-scheduler/kai-scheduler \
  -n kai-scheduler \
  --create-namespace \
  --version v0.12.10 \
  --set admission.gpuPodRuntimeClassName=null

Step 8: Verify Scheduler is Working

# Check scheduler logs
kubectl logs -n kai-scheduler -l app=kai-scheduler --tail=100

# View queue status
kubectl get queues -o wide

# Check scheduled pods
kubectl get pods -A -o wide --field-selector spec.schedulerName=kai-scheduler

# View scheduling events
kubectl get events -n ai-workloads --field-selector reason=Scheduled

KAI Scheduler Key Features

Batch Scheduling: gang scheduling, so all pods in a group are scheduled together or not at all
Hierarchical Queues: two-level queue hierarchy for organizational control
GPU Sharing: multiple workloads share GPUs efficiently (see the sketch after this list)
Bin Packing: minimize fragmentation by packing pods onto fewer nodes
Spread Scheduling: distribute pods for resilience and load balancing
Elastic Workloads: dynamic scaling within min/max bounds
DRA Support: Kubernetes ResourceClaims for vendor GPUs
Topology-Aware Scheduling: optimal placement for NVLink/NVSwitch systems
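
GPU sharing is requested per workload rather than per queue. The manifest below is a sketch that assumes the gpu-fraction pod annotation described in the upstream KAI Scheduler documentation for fractional GPU requests; the annotation name and whether sharing is enabled in your Helm values are assumptions to verify against your installed version:

# fractional-gpu-pod.yaml (illustrative; assumes GPU sharing is enabled)
apiVersion: v1
kind: Pod
metadata:
  name: half-gpu-notebook
  namespace: ai-workloads
  labels:
    runai/queue: development
  annotations:
    gpu-fraction: "0.5"        # assumed annotation: request half of one GPU
spec:
  schedulerName: kai-scheduler
  containers:
  - name: notebook
    image: nvcr.io/nvidia/cuda:12.0-base   # placeholder image
    command: ["sleep", "infinity"]
    # no nvidia.com/gpu limit here: with fractional sharing, the annotation is assumed
    # to replace the whole-GPU request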

Troubleshooting

Pods stuck in Pending

# Check pod events
kubectl describe pod <pod-name> -n <namespace>

# Check queue capacity
kubectl get queue training -o yaml

# View scheduler logs for the pod
kubectl logs -n kai-scheduler -l app=kai-scheduler | grep <pod-name>

Queue not found

# Ensure queue label is correct
kubectl get pod <pod-name> -o jsonpath='{.metadata.labels.runai/queue}'

# List available queues
kubectl get queues

# Check namespace queue association
kubectl get namespace <ns> -o jsonpath='{.metadata.labels.runai/queue}'

Best Practices

Dedicated workload namespace: never use the kai-scheduler namespace for workloads
Queue per team: create separate queues for different teams and projects
Set quotas: define guaranteed (quota) and maximum (limit) resources per queue
Use priority classes: differentiate production from development workloads
Monitor queue utilization: track quota usage for capacity planning (see the example after this list)
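
A lightweight way to start monitoring utilization is to list KAI-scheduled pods together with their queue assignment; the -L flag prints the runai/queue label used throughout this recipe as an extra column:

# All pods handled by kai-scheduler, with their queue label
kubectl get pods -A -L runai/queue --field-selector spec.schedulerName=kai-scheduler

# Compare against a queue's guaranteed quota
kubectl get queue training -o jsonpath='{.spec.resources.gpu.quota}'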

Summary

KAI Scheduler provides enterprise-grade GPU scheduling for Kubernetes AI workloads. With hierarchical queues, batch scheduling, and GPU sharing, it optimizes resource utilization while maintaining fairness across teams and workloads.


📘 Go Further with Kubernetes Recipes

Love this recipe? There's so much more! This is just one of 100+ hands-on recipes in our comprehensive Kubernetes Recipes book.

Inside the book, you'll master:

  • ✅ Production-ready deployment strategies
  • ✅ Advanced networking and security patterns
  • ✅ Observability, monitoring, and troubleshooting
  • ✅ Real-world best practices from industry experts

"The practical, recipe-based approach made complex Kubernetes concepts finally click for me."

👉 Get Your Copy Now and start building production-grade Kubernetes skills today!

#kai-scheduler #nvidia #gpu #scheduling #machine-learning #ai-workloads

Want More Kubernetes Recipes?

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.