Kubernetes Taints and Tolerations Guide
Use Kubernetes taints and tolerations to control pod scheduling. Dedicate nodes for GPU workloads, isolate teams, and prevent scheduling on specific nodes.
The Problem
By default, the scheduler may place a pod on any node that has enough resources for it. That makes it hard to reserve GPU nodes for GPU workloads, keep teams on separate nodes, or keep pods off specific nodes. Taints and tolerations give you that control.
The Solution
Add Taints to Nodes
# Taint a node (NoSchedule: pods won't be scheduled unless they tolerate it)
kubectl taint nodes gpu-node-1 nvidia.com/gpu=true:NoSchedule
# PreferNoSchedule: soft version, scheduler avoids but doesn't forbid
kubectl taint nodes expensive-node cost=high:PreferNoSchedule
# NoExecute: evict existing pods that don't tolerate
kubectl taint nodes maintenance-node maintenance=true:NoExecute
# Remove a taint
kubectl taint nodes gpu-node-1 nvidia.com/gpu=true:NoSchedule-
# View taints
kubectl describe node gpu-node-1 | grep -A5 Taints
Add Tolerations to Pods
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-training
spec:
  # apps/v1 requires a selector that matches the template labels;
  # the label app=gpu-training is illustrative
  selector:
    matchLabels:
      app: gpu-training
  template:
    metadata:
      labels:
        app: gpu-training
    spec:
      tolerations:
        # Exact match
        - key: "nvidia.com/gpu"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
        # Key exists (any value); an alternative to the exact match above
        - key: "nvidia.com/gpu"
          operator: "Exists"
          effect: "NoSchedule"
        # Tolerate NoExecute with timeout
        - key: "maintenance"
          operator: "Exists"
          effect: "NoExecute"
          tolerationSeconds: 3600  # Stay 1 hour then evict
      nodeSelector:
        nvidia.com/gpu: "true"  # Also select GPU nodes (nodes must carry this label)
      containers:
        - name: training
          image: training:v1
          resources:
            limits:
              nvidia.com/gpu: 1
Common Patterns
| Pattern | Taint | Tolerated by |
|---|---|---|
| GPU nodes | nvidia.com/gpu=true:NoSchedule | Only GPU workloads |
| Spot/preemptible | cloud.google.com/gke-spot=true:NoSchedule | Tolerant workloads |
| Control plane | node-role.kubernetes.io/control-plane:NoSchedule | System pods |
| Team isolation | team=frontend:NoSchedule | Frontend team pods |
| Maintenance | maintenance=true:NoExecute | Nothing (drains all pods) |
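As a sketch of the team isolation row, the Pod below tolerates the team=frontend:NoSchedule taint and also selects nodes by label. The pod name, the image, and the team=frontend node label are assumptions for illustration. The label has to be applied to the nodes separately, because tainting a node does not label it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend-web          # hypothetical name
spec:
  tolerations:
    # Matches the taint team=frontend:NoSchedule from the table
    - key: "team"
      operator: "Equal"
      value: "frontend"
      effect: "NoSchedule"
  nodeSelector:
    team: frontend            # assumes the nodes also carry the label team=frontend
  containers:
    - name: web
      image: nginx:1.27       # placeholder image
```

The toleration lets the pod onto the tainted nodes, and the nodeSelector keeps it off every other node. Without the nodeSelector, the pod could still be scheduled on any untainted node.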
graph TD
A[Pod without toleration] -->|Tries to schedule| B{Node tainted?}
B -->|Yes, NoSchedule| C[Rejected - not scheduled]
B -->|No| D[Scheduled normally]
E[Pod with matching toleration] -->|Tries to schedule| B
B -->|Yes, but tolerated| D
Frequently Asked Questions
What's the difference between taints/tolerations and node affinity?
Taints repel pods (opt-out model). Node affinity attracts pods (opt-in model). Use them together: taint GPU nodes to keep other pods off, and use nodeSelector or node affinity so that GPU pods land on GPU nodes.
Does adding a toleration guarantee scheduling on that node?
No. A toleration only permits scheduling on a tainted node; it doesn't attract the pod to it. Use nodeSelector or node affinity together with tolerations to ensure pods land on specific nodes.
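The combination described above can also be written with node affinity instead of nodeSelector. This is a minimal sketch: it assumes the GPU nodes carry the label nvidia.com/gpu=true (the same label the Deployment's nodeSelector relies on), and the pod name gpu-job is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job               # hypothetical name
spec:
  tolerations:
    # Permits scheduling on nodes tainted nvidia.com/gpu=<any value>:NoSchedule
    - key: "nvidia.com/gpu"
      operator: "Exists"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      # Hard requirement: only nodes labeled nvidia.com/gpu=true are eligible
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: nvidia.com/gpu
                operator: In
                values: ["true"]
  containers:
    - name: training
      image: training:v1
      resources:
        limits:
          nvidia.com/gpu: 1
```

The toleration alone would only make the tainted nodes eligible. The required node affinity is what restricts the pod to them.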
Best Practices
- Start simple: begin with a single NoSchedule taint and a matching toleration, and add complexity as needed
- Be consistent: follow one naming convention for taint keys across your cluster
- Document your choices: add annotations explaining why, not just what
- Monitor and iterate: review taints and tolerations regularly
Key Takeaways
- Taints repel pods from nodes; tolerations permit scheduling on tainted nodes but do not attract pods to them
- Start with the simplest approach that solves your problem
- Use kubectl explain and kubectl describe when unsure
- Practice in a test cluster before applying to production

