Configuration • intermediate • 12 minutes • K8s 1.28+

Pod Priority Preemption Kubernetes

Configure PriorityClasses to ensure critical workloads get resources by preempting lower-priority pods. Understand preemption mechanics and safeguards.

By Luca Berton • 5 min read

Quick Answer: PriorityClasses assign numeric priorities to pods. When no node has room for a pending higher-priority pod, the scheduler can preempt (evict) lower-priority pods to make space, so critical workloads are scheduled ahead of less important ones.

The Problem

When cluster resources are exhausted:

  • Critical workloads (production APIs) stay Pending behind batch jobs
  • GPU workloads can't schedule because dev experiments hold the GPUs
  • No way to express "this pod is more important than that pod"
  • Manual intervention needed to free resources during incidents

The Solution

Define PriorityClasses

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-production
value: 1000000
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: "Critical production services that must always run"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: standard
value: 100000
globalDefault: true
description: "Default priority for regular workloads"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch-low
value: 10000
preemptionPolicy: Never
description: "Batch jobs that should not preempt other workloads"

Assign Priority to Pods

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-api
  template:
    metadata:
      labels:
        app: payment-api   # must match spec.selector.matchLabels
    spec:
      priorityClassName: critical-production
      containers:
        - name: api
          image: payment-api:2.0
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
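
When a pod from this Deployment is admitted, the priority admission controller resolves the class name into numeric fields on the pod spec. As a rough sketch, an excerpt of `kubectl get pod <name> -o yaml` might look like the following; the pod name is hypothetical and most fields are omitted:

```yaml
# Excerpt only: fields unrelated to priority are left out
apiVersion: v1
kind: Pod
metadata:
  name: payment-api-7d9c5b6f4-abcde        # hypothetical generated pod name
spec:
  priorityClassName: critical-production   # set in the pod template
  priority: 1000000                        # filled in from the PriorityClass value
  preemptionPolicy: PreemptLowerPriority   # filled in from the PriorityClass
```

The numeric priority is fixed at admission time, so deleting or recreating a PriorityClass later does not change pods that are already running.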

Non-Preempting Priority

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-no-preempt
value: 500000
preemptionPolicy: Never
description: "High priority in queue but won't evict others"

Preemption flow (Mermaid diagram source):

graph TD
    A[New Pod: priority 1000000] --> B{Resources Available?}
    B -->|Yes| C[Schedule Normally]
    B -->|No| D{Find Preemption Victims}
    D --> E{Lower Priority Pods?}
    E -->|Yes| F[Evict lowest priority pods]
    F --> G[Schedule new pod]
    E -->|No| H[Pod stays Pending]

    subgraph Priority Levels
        P1[1000000: critical-production]
        P2[500000: high-priority-no-preempt]
        P3[100000: standard default]
        P4[10000: batch-low]
    end
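
The flow above leaves traces on both pods that can be inspected while preemption is in progress. The excerpts below are a sketch of what to look for, assuming a cluster recent enough to report pod disruption conditions; the node name is hypothetical and all unrelated status fields are omitted:

```yaml
# Preemptor (excerpt): still Pending while its victims terminate gracefully
status:
  phase: Pending
  nominatedNodeName: worker-2        # hypothetical node the scheduler is freeing up
---
# Victim (excerpt): marked before it is deleted
status:
  conditions:
    - type: DisruptionTarget
      status: "True"
      reason: PreemptionByScheduler  # set by the scheduler when it preempts a pod
```

A nominated node is a hint rather than a reservation: if another node frees up first, the pending pod may be scheduled there instead.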

Common Issues

System pods getting preempted

System-critical pods use built-in priority classes:

kubectl get priorityclasses
# system-cluster-critical: 2000000000
# system-node-critical:    2000001000

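Custom PriorityClass values cannot exceed 1000000000 (1 billion); larger values are reserved for the built-in system classes, and the API server rejects them on user-defined classes. Keep custom priorities well below that ceiling so application pods never compete with system components.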

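Preemption cascades

Pod A preempts B, and the rescheduled B then preempts C. PodDisruptionBudgets (PDBs) limit the damage, although the scheduler honors them only on a best-effort basis:

# The scheduler tries to keep minAvailable replicas when picking victims (best effort, not guaranteed)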
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: standard-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      priority-tier: standard
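
The PDB above only protects pods that actually carry the priority-tier: standard label, and Kubernetes does not add that label automatically. A minimal sketch of a workload the PDB would match, assuming it lives in the same namespace as the PDB; the Deployment name and image are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend                # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
        priority-tier: standard     # matched by the standard-pdb selector
    spec:
      priorityClassName: standard
      containers:
        - name: web
          image: web-frontend:1.0   # hypothetical image
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
```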

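Batch jobs immediately preempted

Low-priority batch pods are among the first preemption victims when a higher-priority pod cannot be scheduled. Setting preemptionPolicy: Never on the batch class does not shield them from eviction; it only means batch pods wait in the queue without displacing others, while still being ordered by priority in the scheduling queue.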
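
Since batch pods in a low class will be preempted from time to time, it can help to make the Job tolerate it. The sketch below assumes a cluster where the Job pod failure policy feature is available (it is enabled by default in recent releases); the Job name and image are hypothetical. The Ignore rule keeps pods disrupted by preemption from counting against backoffLimit:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report               # hypothetical job name
spec:
  backoffLimit: 3
  podFailurePolicy:
    rules:
      - action: Ignore               # do not count disrupted pods as failures
        onPodConditions:
          - type: DisruptionTarget   # set on pods that are preempted or evicted
  template:
    spec:
      priorityClassName: batch-low
      restartPolicy: Never           # required when podFailurePolicy is used
      containers:
        - name: report
          image: report-runner:1.0   # hypothetical image
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
```

The Job controller then creates replacement pods after a preemption without exhausting the retry budget on disruptions the job did not cause.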

Best Practices

  • Define 3-5 priority levels (don't over-complicate)
  • Set one globalDefault: true class for workloads that don't specify priority
  • Use preemptionPolicy: Never for workloads that should wait, not evict
  • Keep priorities below 1000000000 (system classes use higher values)
  • Combine with ResourceQuota to prevent priority abuse per namespace
  • Document priority classes and their intended use cases
  • Monitor preemption events across all namespaces: kubectl get events -A --field-selector reason=Preempted
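
The ResourceQuota recommendation in the list above can be sketched with a quota scoped to a single PriorityClass. The namespace name and the limits here are illustrative assumptions, not recommended values:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: critical-production-quota
  namespace: payments                # hypothetical namespace
spec:
  hard:
    pods: "10"                       # at most 10 pods at this priority
    requests.cpu: "20"
    requests.memory: 40Gi
  scopeSelector:
    matchExpressions:
      - scopeName: PriorityClass     # quota applies only to pods using the listed classes
        operator: In
        values:
          - critical-production
```

This caps how much of the namespace's workload can claim critical-production. On its own it does not stop other namespaces that have no such quota from using the class; that takes additional admission configuration.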

Key Takeaways

  • PriorityClass value determines scheduling order and preemption eligibility
  • Higher-priority pods can evict lower-priority pods to get resources
  • preemptionPolicy: Never = high scheduling priority without evicting others
  • globalDefault: true applies to pods without explicit priorityClassName
  • System classes (2 billion range) are reserved; keep custom values below 1 billion
  • PDBs are honored only on a best-effort basis during preemption; if no other victims exist, the scheduler can still evict PDB-protected pods
  • Only one PriorityClass can be globalDefault: true

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.


This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
