Deployments beginner ⏱ 12 minutes K8s 1.28+

Kubernetes Node Drain Cordon Guide

Safely drain and cordon Kubernetes nodes for maintenance. Graceful pod eviction, PDB-aware drains, force drain, and maintenance window procedures.

By Luca Berton • 📖 5 min read

💡 Quick Answer: kubectl cordon <node> marks a node unschedulable, so no new pods land on it. kubectl drain <node> cordons the node and then evicts its pods in one step. For most maintenance, drain with --ignore-daemonsets --delete-emptydir-data, then run kubectl uncordon <node> once maintenance is complete.

The Problem

Node maintenance (OS upgrades, kernel patches, hardware replacement) requires moving workloads off nodes without downtime. Incorrectly draining nodes causes:

  • Application outages from simultaneous pod eviction
  • Stuck drains from PDB conflicts
  • Data loss from pods with emptyDir volumes
  • DaemonSet pods blocking the drain when --ignore-daemonsets is omitted

The Solution

Cordon (Mark Unschedulable)

# Prevent new pods from scheduling on the node
kubectl cordon worker-3

# Verify
kubectl get nodes
# NAME       STATUS                     ROLES    AGE
# worker-3   Ready,SchedulingDisabled   worker   90d

# Uncordon when done
kubectl uncordon worker-3
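
Cordoning only affects future scheduling; pods already running on the node stay where they are. A minimal sketch for checking what is still there, assuming bash and a helper name (`pods_on_node`) that is purely illustrative and not part of kubectl:

```shell
# pods_on_node is a hypothetical helper name, not a kubectl subcommand.
# It lists every pod currently bound to the given node, across all namespaces.
pods_on_node() {
  local node="$1"
  kubectl get pods --all-namespaces -o wide \
    --field-selector "spec.nodeName=${node}"
}

# Usage: pods_on_node worker-3
```

Running it before and after a drain gives a quick picture of what was moved and what (typically DaemonSet pods) remains.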

Drain (Evict + Cordon)

# Standard drain for maintenance
kubectl drain worker-3 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --grace-period=60 \
  --timeout=300s

# Dry run first
kubectl drain worker-3 --dry-run=client \
  --ignore-daemonsets \
  --delete-emptydir-data

Drain Flags

Flag                      Purpose
--ignore-daemonsets       Skip DaemonSet-managed pods (the DaemonSet controller would recreate them on the node anyway)
--delete-emptydir-data    Allow evicting pods with emptyDir volumes (their data is deleted with the pod)
--grace-period=N          Override each pod's termination grace period (seconds)
--timeout=DURATION        Give up if the drain takes longer than this (e.g. 300s; the default of 0 waits forever)
--force                   Delete pods not managed by a controller (bare pods)
--pod-selector=SELECTOR   Only evict pods matching the label selector
--disable-eviction        Use delete instead of the eviction API (bypasses PDBs)

Maintenance Window Script

#!/bin/bash
# Usage: pass the node name as the first argument
NODE="${1:?Usage: $0 <node-name>}"
echo "=== Starting maintenance on $NODE ==="

# Step 1: Cordon so no new pods are scheduled on the node
kubectl cordon "$NODE" || exit 1

# Step 2: Wait for in-flight requests to complete
sleep 30

# Step 3: Drain; a non-zero exit means some pods were not evicted
if ! kubectl drain "$NODE" \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --grace-period=120 \
  --timeout=600s; then
  echo "ERROR: Drain failed. Check PDB conflicts." >&2
  exit 1
fi

echo "=== Node $NODE drained. Perform maintenance. ==="
echo "=== Run: kubectl uncordon $NODE when complete ==="

Maintenance workflow (Mermaid diagram source):

graph LR
    A[Cordon Node] --> B[Wait for<br/>In-Flight]
    B --> C[Drain Node]
    C --> D[Maintenance]
    D --> E[Uncordon Node]
    E --> F[Verify Pods<br/>Scheduled]

    style D fill:#FF9800,color:white
    style F fill:#4CAF50,color:white
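
The last two stages of the workflow (uncordon, then verify) could be sketched as below. This is an illustration rather than part of the original script: `finish_maintenance` is a hypothetical helper name and the 120s wait is an assumed value to adjust for your nodes.

```shell
# finish_maintenance is a hypothetical helper for the final workflow stages:
# uncordon the node, wait until it reports Ready, then show pods on it.
finish_maintenance() {
  local node="$1"

  # Re-enable scheduling on the node.
  kubectl uncordon "$node" || return 1

  # Block until the node reports the Ready condition (120s is an assumed timeout).
  kubectl wait --for=condition=Ready "node/${node}" --timeout=120s || return 1

  # Show which pods, if any, are running on the node so far.
  kubectl get pods --all-namespaces -o wide \
    --field-selector "spec.nodeName=${node}"
}

# Usage: finish_maintenance worker-3
```

Pods evicted during the drain are not moved back automatically; the node only picks up pods that are scheduled after it is uncordoned, so an initially short list is expected.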

Common Issues

"Cannot evict pod": PDB violation

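A PodDisruptionBudget is blocking the eviction because removing the pod would take the workload below its budget. Wait for replacement pods to become Ready elsewhere (drain keeps retrying the eviction until --timeout expires), or use --disable-eviction as a last resort; it bypasses PDBs and may cause downtime.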
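
To see which budgets currently leave no room for an eviction, one option is to read `.status.disruptionsAllowed` from each PDB. The sketch below assumes bash and awk are available; `blocking_pdbs` is a hypothetical helper name.

```shell
# blocking_pdbs is a hypothetical helper name.
# It prints namespace/name for every PDB that currently allows zero disruptions,
# which are the budgets that would refuse an eviction right now.
blocking_pdbs() {
  kubectl get poddisruptionbudgets --all-namespaces --no-headers \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed' \
    | awk '$3 == 0 { print $1 "/" $2 }'
}

# Usage: blocking_pdbs
```

A PDB that shows up here before the drain even starts usually points to a workload that is already short of healthy replicas, which is worth fixing before evicting anything.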

"pod not managed by a controller"

Bare pods (not created by a Deployment, StatefulSet, or other controller) won't be rescheduled. Use --force to delete them, but understand that nothing will recreate them: they're gone permanently.
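
Before reaching for --force, it can help to list exactly which pods on the node have no owner. A rough sketch, assuming bash and awk; `bare_pods_on_node` is a hypothetical helper name, and the check relies on kubectl printing `<none>` for a missing `ownerReferences` entry in custom-columns output.

```shell
# bare_pods_on_node is a hypothetical helper name.
# It prints namespace/name for pods on the node that have no ownerReferences,
# i.e. pods that no controller will recreate after deletion.
bare_pods_on_node() {
  local node="$1"
  kubectl get pods --all-namespaces --no-headers \
    --field-selector "spec.nodeName=${node}" \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,OWNER:.metadata.ownerReferences[0].kind' \
    | awk '$3 == "<none>" { print $1 "/" $2 }'
}

# Usage: bare_pods_on_node worker-3
```

Anything in this list that still matters should be recreated under a controller, or have its manifest saved, before the drain.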

Drain takes forever

A pod has a long terminationGracePeriodSeconds or a slow preStop hook. Use --grace-period=30 to override the pod's own setting, or investigate the stuck pod.
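
One way to find the likely culprit is to rank the pods on the node by their configured grace period. A minimal sketch, assuming bash and a standard `sort`; `grace_periods_on_node` is a hypothetical helper name.

```shell
# grace_periods_on_node is a hypothetical helper name.
# It lists pods on the node with their terminationGracePeriodSeconds,
# longest first, so slow-to-terminate pods appear at the top.
grace_periods_on_node() {
  local node="$1"
  kubectl get pods --all-namespaces --no-headers \
    --field-selector "spec.nodeName=${node}" \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,GRACE:.spec.terminationGracePeriodSeconds' \
    | sort -k3,3nr
}

# Usage: grace_periods_on_node worker-3
```

This only shows the configured grace period; a preStop hook that hangs still has to be investigated on the pod itself.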

Best Practices

  • Always drain before maintenance: don't just power off nodes
  • Dry run first: --dry-run=client shows what would be evicted
  • Set PDBs on all production workloads: this prevents mass eviction
  • Drain one node at a time: maintain cluster capacity
  • Use --timeout: prevent infinite waits from stuck pods
  • Automate with scripts: cordon → drain → maintain → uncordon
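
As an illustration of the one-node-at-a-time practice, the loop below drains nodes sequentially and stops at the first failure. It is a sketch under assumptions: `drain_one_by_one` is a hypothetical helper name, the 600s timeout is an example value, and the maintenance step is left as a placeholder.

```shell
# drain_one_by_one is a hypothetical helper name.
# It drains the given nodes sequentially and stops at the first failure,
# so at most one node is out of service at any time.
drain_one_by_one() {
  local node
  for node in "$@"; do
    echo "Draining ${node}"
    if ! kubectl drain "$node" \
        --ignore-daemonsets \
        --delete-emptydir-data \
        --timeout=600s; then
      echo "Drain of ${node} failed; not touching further nodes" >&2
      return 1
    fi

    # Placeholder: perform the actual maintenance on the node here.

    # Restore scheduling before moving on to the next node.
    kubectl uncordon "$node" || return 1
  done
}

# Usage: drain_one_by_one worker-1 worker-2 worker-3
```

Stopping on the first failed drain leaves the failing node cordoned for inspection instead of continuing and reducing capacity further.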

Key Takeaways

  • cordon prevents new scheduling; drain evicts existing pods AND cordons
  • --ignore-daemonsets --delete-emptydir-data are needed for most real-world drains
  • PDBs can block drains by design, to protect availability
  • Always uncordon after maintenance to restore scheduling
  • Drain one node at a time during rolling maintenance windows
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
