ai • advanced • 20 minutes • K8s 1.28+

Volcano Job minAvailable Gang Schedule

Volcano batch scheduling with minAvailable gang scheduling on Kubernetes. Job configuration, queue policies, and AI training workload scheduling.

By Luca Berton • 5 min read

Quick Answer: Install Volcano for gang scheduling (all-or-nothing pod groups), fair-share queues, and job lifecycle management. Create Volcano Jobs with minAvailable to prevent partial starts of distributed training, and Queues with weights for fair GPU sharing across teams.

The Problem

Distributed training with 4 workers needs all 4 to start together. The default scheduler places pods one at a time, so workers 1-3 can start and sit idle, holding their GPUs, while worker 4 is still Pending. Gang scheduling ensures all workers start together or none do. Volcano also provides queue-based job management for multi-tenant GPU clusters.
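The difference between per-pod placement and all-or-nothing placement can be sketched in a few lines of Python. This is a toy model, not Volcano's implementation: the `Node` class, the node names, and both functions are hypothetical, and real schedulers consider far more than free GPU counts.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_gpus: int


def place_one_by_one(nodes, pods, gpus_per_pod):
    """Greedy per-pod placement: binds every pod that fits, even if its peers do not."""
    free = {n.name: n.free_gpus for n in nodes}
    placed = {}
    for pod in pods:
        for name in free:
            if free[name] >= gpus_per_pod:
                free[name] -= gpus_per_pod
                placed[pod] = name
                break
    return placed


def place_gang(nodes, pods, gpus_per_pod, min_available):
    """All-or-nothing: keep the placements only if at least min_available pods fit."""
    placed = place_one_by_one(nodes, pods, gpus_per_pod)
    return placed if len(placed) >= min_available else {}


# Three 8-GPU nodes (assumed cluster), four pods that each need 8 GPUs.
nodes = [Node("gpu-a", 8), Node("gpu-b", 8), Node("gpu-c", 8)]
pods = ["master-0", "worker-0", "worker-1", "worker-2"]

partial = place_one_by_one(nodes, pods, 8)
gang = place_gang(nodes, pods, 8, min_available=4)
print(len(partial), len(gang))  # prints "3 0"
```

With per-pod placement, three pods start and hold 24 GPUs while the fourth never arrives. With the gang check, nothing is bound until all four fit.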

The Solution

Install Volcano

kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/release-1.10/installer/volcano-development.yaml

Volcano Queue

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: training-queue
spec:
  weight: 3
  capability:
    nvidia.com/gpu: 32
    cpu: 128
    memory: 512Gi
  reclaimable: true
---
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: inference-queue
spec:
  weight: 5
  capability:
    nvidia.com/gpu: 16
  reclaimable: false
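To see how weights and capability interact, here is a simplified Python sketch of weighted fair sharing with caps. It is an illustration only: the function name and the 40-GPU cluster size are assumptions, and Volcano's proportion plugin also takes each queue's actual demand into account, which this sketch ignores.

```python
def deserved_share(total, queues):
    """Split `total` by weight, never exceeding a queue's cap.

    queues maps name -> (weight, cap). Capacity a capped queue cannot use
    is offered again to the queues that still have headroom.
    """
    deserved = {name: 0.0 for name in queues}
    active = set(queues)
    remaining = float(total)
    while active and remaining > 1e-9:
        total_weight = sum(queues[q][0] for q in active)
        handed_out = 0.0
        for q in sorted(active):
            weight, cap = queues[q]
            offer = remaining * weight / total_weight
            take = min(offer, cap - deserved[q])
            deserved[q] += take
            handed_out += take
            if cap - deserved[q] <= 1e-9:
                active.discard(q)  # queue reached its capability
        remaining -= handed_out
        if handed_out <= 1e-9:
            break
    return deserved


# Weights and GPU caps from the two Queue manifests; 40 GPUs is a made-up cluster size.
shares = deserved_share(40, {
    "training-queue": (3, 32),
    "inference-queue": (5, 16),
})
print(shares)
```

By weight alone, inference-queue would be offered 25 of the 40 GPUs, but its capability stops it at 16. The 9 GPUs it cannot take flow to training-queue, which ends at 24.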

Gang-Scheduled Training Job

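apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: distributed-llm-train
  annotations:
    # Read by the scheduler's sla plugin, which must be enabled in the
    # volcano-scheduler configuration
    sla-waiting-time: 30m
spec:
  schedulerName: volcano
  minAvailable: 4            # 1 master + 3 workers, all or nothing
  queue: training-queue
  policies:
    - event: PodEvicted
      action: RestartJob
    - event: PodFailed
      action: RestartJob
    - event: TaskCompleted
      action: CompleteJob
  plugins:
    # Job plugins only. gang and sla are scheduler plugins and are
    # configured in the scheduler, not in the Job spec.
    svc: []                  # headless Service and stable pod hostnames
    pytorch: ["--master=master", "--worker=worker", "--port=23456"]
  tasks:
    - replicas: 1
      name: master
      template:
        spec:
          schedulerName: volcano
          containers:
            - name: pytorch
              image: registry.example.com/training:1.0
              # MASTER_ADDR, MASTER_PORT, WORLD_SIZE, and RANK are injected
              # by the pytorch job plugin
              command: ["torchrun", "--nnodes=$(WORLD_SIZE)", "--node_rank=$(RANK)",
                        "--nproc_per_node=8", "--master_addr=$(MASTER_ADDR)",
                        "--master_port=$(MASTER_PORT)", "train.py"]
              resources:
                limits:
                  nvidia.com/gpu: 8
    - replicas: 3
      name: worker
      template:
        spec:
          schedulerName: volcano
          containers:
            - name: pytorch
              image: registry.example.com/training:1.0
              command: ["torchrun", "--nnodes=$(WORLD_SIZE)", "--node_rank=$(RANK)",
                        "--nproc_per_node=8", "--master_addr=$(MASTER_ADDR)",
                        "--master_port=$(MASTER_PORT)", "train.py"]
              resources:
                limits:
                  nvidia.com/gpu: 8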
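Before submitting a Job like this, it can help to check the arithmetic: minAvailable should match the total replica count, and the total GPU request should fit the queue's capability. The Python sketch below does that on a simplified dictionary. The `gang_summary` function and the flattened `gpus_per_pod` field are hypothetical conveniences, not part of the Volcano API.

```python
def gang_summary(job, queue_gpu_cap):
    """Summarize pod and GPU totals for a simplified job description."""
    tasks = job["tasks"]
    total_pods = sum(t["replicas"] for t in tasks)
    total_gpus = sum(t["replicas"] * t["gpus_per_pod"] for t in tasks)
    return {
        "total_pods": total_pods,
        "total_gpus": total_gpus,
        # True when no pod can start without all of its peers
        "full_gang": job["minAvailable"] == total_pods,
        # True when the whole gang fits inside the queue's GPU capability
        "fits_queue": total_gpus <= queue_gpu_cap,
    }


# Values taken from the manifests above: 1 master + 3 workers, 8 GPUs each,
# and a training-queue capability of 32 GPUs.
job = {
    "minAvailable": 4,
    "tasks": [
        {"name": "master", "replicas": 1, "gpus_per_pod": 8},
        {"name": "worker", "replicas": 3, "gpus_per_pod": 8},
    ],
}
summary = gang_summary(job, queue_gpu_cap=32)
print(summary)
```

Here the gang needs exactly 32 GPUs, the same as the training-queue capability, so the job can only start when the queue is otherwise empty.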

Volcano Plugins

| Plugin | Purpose |
|---|---|
| gang | All-or-nothing scheduling |
| sla | Waiting time limits, job deadlines |
| proportion | Fair-share queue allocation |
| binpack | Pack pods onto fewer nodes |
| nodeorder | Custom node scoring |
| tdm | Time-division multiplexing |

Monitor Queue Status

# Queue utilization
kubectl get queue -o wide

# Job status
kubectl get vcjob
kubectl describe vcjob distributed-llm-train
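For scripted checks, the same information can be read from JSON output. The Python sketch below lists jobs still in Pending. The sample document is hand-written and trimmed, and the field names (`status.state.phase`, `status.running`, `spec.minAvailable`) are assumed to match your Volcano version, so compare them against real `kubectl get vcjob -o json` output first.

```python
import json

# Hypothetical, trimmed output of: kubectl get vcjob -o json
SAMPLE = """
{"items": [
  {"metadata": {"name": "distributed-llm-train"},
   "spec": {"minAvailable": 4},
   "status": {"state": {"phase": "Pending"}, "pending": 4}},
  {"metadata": {"name": "eval-job"},
   "spec": {"minAvailable": 2},
   "status": {"state": {"phase": "Running"}, "running": 2}}
]}
"""


def pending_jobs(vcjob_list):
    """Return (name, running, minAvailable) for every job still in Pending."""
    result = []
    for item in vcjob_list.get("items", []):
        status = item.get("status", {})
        phase = status.get("state", {}).get("phase", "Unknown")
        if phase == "Pending":
            running = status.get("running", 0)
            needed = item.get("spec", {}).get("minAvailable", 0)
            result.append((item["metadata"]["name"], running, needed))
    return result


waiting = pending_jobs(json.loads(SAMPLE))
for name, running, needed in waiting:
    print(f"{name}: {running}/{needed} pods running")
# prints "distributed-llm-train: 0/4 pods running"
```

Against a live cluster, the sample string could be replaced by the output of the kubectl command, for example by reading it from standard input.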

graph TD
    subgraph Volcano Scheduler
        GANG[Gang Plugin<br/>All-or-nothing] 
        PROP[Proportion Plugin<br/>Fair sharing]
        BINPACK[Binpack Plugin<br/>Node consolidation]
        SLA_P[SLA Plugin<br/>Waiting timeout]
    end
    
    JOB[Volcano Job<br/>minAvailable: 4] --> GANG
    GANG -->|All 4 pods<br/>schedulable?| CHECK{Resources<br/>available?}
    CHECK -->|Yes| SCHEDULE[Schedule all 4<br/>simultaneously]
    CHECK -->|No| QUEUE[Queue job<br/>wait for resources]
    QUEUE -->|SLA waiting time 30m exceeded| PRIORITY[Job prioritized<br/>resources reserved]

Common Issues

Job stuck in Pending even though the queue has capacity

Volcano scheduler might not be running. Check: kubectl get pods -n volcano-system. Ensure schedulerName: volcano is set on all pod templates.

Gang scheduling deadlock: two jobs each hold a partial allocation

Volcano's gang plugin prevents this: it admits a job only when at least minAvailable of its pods can be scheduled. If you see partial allocation, check that minAvailable equals total replicas.
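The deadlock itself is easy to reproduce in a toy model. The Python sketch below hands out pod slots to two jobs, first one pod at a time and then gang-style. Both functions and the job names are hypothetical, and a slot stands in for one 8-GPU node.

```python
def interleaved(slots, jobs):
    """Hand out one pod slot at a time, round-robin across jobs."""
    got = {name: 0 for name in jobs}
    progressed = True
    while slots > 0 and progressed:
        progressed = False
        for name, need in jobs.items():
            if slots > 0 and got[name] < need:
                got[name] += 1
                slots -= 1
                progressed = True
    return got


def gang(slots, jobs):
    """Give a job all of its slots or none, in submission order."""
    got = {}
    for name, need in jobs.items():
        if need <= slots:
            got[name] = need
            slots -= need
        else:
            got[name] = 0
    return got


jobs = {"job-a": 4, "job-b": 4}  # each job needs 4 pods to make progress
print(interleaved(4, jobs))  # prints "{'job-a': 2, 'job-b': 2}"
print(gang(4, jobs))         # prints "{'job-a': 4, 'job-b': 0}"
```

In the interleaved case all four slots are taken, yet neither job has the four pods it needs, so both wait forever. In the gang case job-a runs to completion and job-b starts once the slots are free.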

Best Practices

  • minAvailable equals total pods for gang scheduling: prevents partial starts
  • SLA plugin with a waiting time: caps how long a job can stay Pending when resources are scarce
  • Queue weights for priority: higher weight = larger resource share
  • reclaimable: true on training queues, so inference can reclaim GPU resources
  • PodEvicted → RestartJob: automatic restart on preemption

Key Takeaways

  • Volcano provides gang scheduling: all pods start together or none do
  • Prevents the #1 distributed training problem: workers idling while waiting for peers
  • Queue-based resource management with fair-share proportional allocation
  • SLA plugin sets waiting time limits, so a job is not left Pending indefinitely
  • Integrates with PyTorch, TensorFlow, MPI, and Spark distributed workloads