How to Debug Pod Scheduling Failures
Troubleshoot pods stuck in Pending state due to scheduling issues. Learn to diagnose resource constraints, node affinity, taints, and topology spread.
π‘ Quick Answer: Pods stuck in
Pending= scheduling failure. Runkubectl describe pod <pod>and check Events. Common causes: insufficient CPU/memory (add nodes or reduce requests), untolerated taints (kubectl taint nodes <node> key:NoSchedule-), node selector mismatch, or PVC not bound. Check withkubectl get events --sort-by=.lastTimestamp.
Pods stuck in Pending state indicate scheduling failures. Learn to diagnose resource constraints, affinity rules, taints, tolerations, and other scheduling issues.
Identify the Problem
# Find pending pods
kubectl get pods --field-selector=status.phase=Pending -A
# Get scheduling failure reason
kubectl describe pod <pod-name>Look for scheduler events:
Events:
Type Reason Age Message
---- ------ ---- -------
Warning FailedScheduling 1m 0/3 nodes are available:
1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: },
2 node(s) didn't have free ports for the requested pod ports.Common Scheduling Failures
1. Insufficient Resources
# Check node resources
kubectl describe nodes | grep -A 5 "Allocated resources"
# Check available capacity
kubectl top nodes
# Detailed node capacity
kubectl get nodes -o custom-columns=\
NAME:.metadata.name,\
CPU:.status.capacity.cpu,\
MEMORY:.status.capacity.memory,\
PODS:.status.capacity.pods# Pod requesting too many resources
apiVersion: v1
kind: Pod
metadata:
name: resource-heavy
spec:
containers:
- name: app
image: nginx
resources:
requests:
memory: "64Gi" # Too high!
cpu: "32" # Too high!Fix: Reduce resource requests or add nodes
# Reasonable resources
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"2. Node Selector Mismatch
# Check node labels
kubectl get nodes --show-labels
# Find nodes with specific label
kubectl get nodes -l gpu=nvidia# Pod requires label that doesn't exist
apiVersion: v1
kind: Pod
spec:
nodeSelector:
gpu: nvidia # No nodes have this label!Fix: Add label to nodes or remove selector
# Add label to node
kubectl label nodes node-1 gpu=nvidia3. Node Affinity Issues
# Check pod's affinity requirements
apiVersion: v1
kind: Pod
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- us-east-1a # No nodes in this zone!Diagnose:
# Find nodes matching affinity
kubectl get nodes -l topology.kubernetes.io/zone=us-east-1a
# List all zones
kubectl get nodes -o jsonpath='{.items[*].metadata.labels.topology\.kubernetes\.io/zone}' | tr ' ' '\n' | sort -u4. Taints and Tolerations
# Check node taints
kubectl describe nodes | grep -A 3 Taints
# List all taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints# Pod needs toleration for tainted node
apiVersion: v1
kind: Pod
spec:
tolerations:
- key: "dedicated"
operator: "Equal"
value: "gpu"
effect: "NoSchedule"
# Tolerate control plane nodes
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"5. Pod Disruption Budget Blocking
# Check PDBs
kubectl get pdb -A
# Find blocking PDB
kubectl describe pdb <pdb-name>Fix: Ensure PDB allows some disruptions
6. Topology Spread Constraint Failures
# Strict constraint with insufficient domains
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: DoNotSchedule # Blocks if can't spread
minDomains: 5 # Requires 5 zones!Fix: Reduce minDomains or use ScheduleAnyway
7. Port Conflicts
# Using hostPort
apiVersion: v1
kind: Pod
spec:
containers:
- name: app
ports:
- containerPort: 80
hostPort: 80 # Only one pod per node can use this!Diagnose:
# Find pods using hostPort 80
kubectl get pods -A -o json | jq '.items[] | select(.spec.containers[].ports[]?.hostPort == 80) | .metadata.name'8. PersistentVolumeClaim Not Bound
# Check PVC status
kubectl get pvc
# Check PV availability
kubectl get pvFix: Create matching PV or use dynamic provisioning
9. Volume Node Affinity
# Check PV node affinity
kubectl describe pv <pv-name> | grep -A 5 "Node Affinity"Volume might be bound to a specific node/zone.
Debugging Commands
# Scheduler debug
kubectl get events --field-selector reason=FailedScheduling -A
# Check scheduler logs
kubectl logs -n kube-system -l component=kube-scheduler
# Simulate scheduling
kubectl get pod <pod-name> -o yaml | kubectl apply --dry-run=server -f -
# Force reschedule
kubectl delete pod <pod-name>Check Cluster Capacity
# Total cluster resources
kubectl describe nodes | grep -E "cpu|memory" | head -20
# Resource summary
kubectl top nodes
# Pods per node
kubectl get pods -A -o wide | awk '{print $8}' | sort | uniq -c | sort -rnPriority and Preemption
# High priority pod can preempt lower priority
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: high-priority
value: 1000000
globalDefault: false
preemptionPolicy: PreemptLowerPriority
---
apiVersion: v1
kind: Pod
spec:
priorityClassName: high-priorityQuick Troubleshooting Checklist
# 1. Get exact failure reason
kubectl describe pod <pod> | grep -A 10 "Events"
# 2. Check resource availability
kubectl describe nodes | grep -A 6 "Allocated resources"
# 3. Check for taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# 4. Check node labels for nodeSelector/affinity
kubectl get nodes --show-labels
# 5. Check PVCs if using storage
kubectl get pvc
# 6. Check PDBs if upgrading/draining
kubectl get pdb -ASummary
Pending pods result from scheduling constraints not being met. Check resource availability first, then verify node selectors, affinity rules, taints/tolerations, and topology constraints. Use kubectl describe pod to see the exact scheduling failure reason.
π Go Further with Kubernetes Recipes
Love this recipe? Thereβs so much more! This is just one of 100+ hands-on recipes in our comprehensive Kubernetes Recipes book.
Inside the book, youβll master:
- β Production-ready deployment strategies
- β Advanced networking and security patterns
- β Observability, monitoring, and troubleshooting
- β Real-world best practices from industry experts
βThe practical, recipe-based approach made complex Kubernetes concepts finally click for me.β
π Get Your Copy Now β Start building production-grade Kubernetes skills today!

Recommended
Kubernetes Recipes β The Complete Book100+ production-ready patterns with detailed explanations, best practices, and copy-paste YAML. Everything in one place.
Get the Book βLearn by Doing
CopyPasteLearn β Hands-on Cloud & DevOps CoursesMaster Kubernetes, Ansible, Terraform, and MLOps with interactive, copy-paste-run lessons. Start free.
Browse Courses β