Resource Limits and Requests Guide
Configure CPU and memory requests and limits for Kubernetes pods. Covers the Guaranteed, Burstable, and BestEffort QoS classes and how to avoid OOMKills.
Quick Answer: Set requests equal to expected steady-state usage and limits to 2x requests for burstable workloads. For critical services, set requests == limits (Guaranteed QoS) so the pod is the last candidate for eviction or an OOMKill during node pressure. Never set CPU limits on latency-sensitive services: CPU throttling causes tail latency spikes.
The Problem
Pods crash with OOMKilled, or response times spike due to CPU throttling. Developers either over-provision (wasting 50%+ of cluster resources) or under-provision (causing instability). Understanding the difference between requests, limits, and QoS classes is fundamental.
The Solution
Requests vs Limits
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0  # placeholder image (assumption)
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: "1"
          memory: 512Mi

- Requests: Guaranteed minimum. Used for scheduling decisions.
- Limits: Maximum allowed. Exceeding the CPU limit throttles the container; exceeding the memory limit gets it OOMKilled.
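
To make the scheduling role of requests concrete, the fragment below annotates the same values with the arithmetic the scheduler effectively performs. The 4-core node is a hypothetical figure chosen for illustration, and the calculation ignores system pods and reserved capacity.

```yaml
# Sketch: requests (not limits) drive placement.
# Assumption: a hypothetical node with 4000m allocatable CPU and no other pods.
resources:
  requests:
    cpu: 250m        # 0.25 core; 4000m / 250m = 16 such pods fit by CPU
    memory: 256Mi    # counted against the node's allocatable memory
  limits:
    cpu: "1"         # 16 pods x 1 core = 16 cores of limits on a 4-core node
    memory: 512Mi    # limits may overcommit the node; requests may not
```

Because limits can add up to more than the node actually has, they describe a ceiling per container rather than capacity that is reserved for it.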
QoS Classes
| Class | Condition | Eviction Priority |
|---|---|---|
| Guaranteed | CPU and memory requests == limits for ALL containers | Last to evict |
| Burstable | At least one request or limit set, but not Guaranteed | Middle |
| BestEffort | No requests or limits at all | First to evict |
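
The "ALL containers" condition is easy to miss once a pod has a sidecar. The following is a minimal sketch of a pod that lands in Burstable even though its main container is configured for Guaranteed; the pod name, container names, and images are placeholders, not taken from the article.

```yaml
# Sketch: this pod is Burstable, not Guaranteed, because of the sidecar.
# Names and images are placeholders for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0
      resources:
        requests:
          cpu: 500m
          memory: 512Mi
        limits:
          cpu: 500m
          memory: 512Mi
    - name: log-shipper
      image: registry.example.com/log-shipper:1.0
      resources:
        requests:
          cpu: 50m
          memory: 64Mi
        limits:
          memory: 128Mi    # no CPU limit, and memory limit != request
```

After the pod is created, the assigned class is recorded in its `status.qosClass` field, which `kubectl get pod app-with-sidecar -o jsonpath='{.status.qosClass}'` prints.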
# Guaranteed QoS (highest priority)
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 500m
    memory: 512Mi

# Burstable QoS (most common)
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    memory: 512Mi
    # No CPU limit: prevents throttling

CPU Throttling vs Memory OOMKill
CPU (compressible):
  Pod exceeds CPU limit → throttled (slower, not killed)
  CFS quota mechanism: 100m = 10ms every 100ms
Memory (incompressible):
  Pod exceeds memory limit → OOMKilled immediately
  Container restarts (if restartPolicy allows)

graph TD
SCHED[Scheduler] -->|Uses requests| PLACE[Place pod on node<br/>with enough capacity]
subgraph Runtime Enforcement
CPU[CPU > limit] -->|Throttled| SLOW[Slower execution<br/>NOT killed]
MEM[Memory > limit] -->|OOMKilled| RESTART[Container restarted]
end
subgraph Node Pressure
PRESSURE[Node memory pressure] -->|Evict first| BE[BestEffort pods]
PRESSURE -->|Then| BURST[Burstable pods<br/>exceeding requests]
PRESSURE -->|Last| GUAR[Guaranteed pods]
end

Common Issues
OOMKilled but the container only uses 400Mi (limit is 512Mi)
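Check for child processes and memory-mapped files. kubectl top pod reports a sampled working-set figure and can miss short spikes, while the limit is enforced against everything charged to the container's cgroup, including page cache and tmpfs.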
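
One common source of "invisible" memory is a memory-backed emptyDir volume, which is tmpfs and therefore counts toward the memory limit. The manifest below is a sketch under assumed names, image, and sizes; it only illustrates how the volume and the limit interact.

```yaml
# Sketch: files written to a memory-backed emptyDir are charged
# to the container's memory limit.
# Names, image, and sizes are placeholders for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-scratch
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0
      resources:
        requests:
          memory: 256Mi
        limits:
          memory: 512Mi      # process memory + files in /scratch must fit here
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        medium: Memory       # tmpfs
        sizeLimit: 128Mi     # bounds the volume
# 400Mi of process memory + a full 128Mi volume = 528Mi > 512Mi limit
```

In a setup like this, either raise the memory limit to cover the volume or shrink `sizeLimit` so the two together stay below it.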
CPU throttling causing latency spikes
Remove CPU limits for latency-sensitive services. The CFS quota is enforced per 100ms period, so a burst that uses up the quota early in a period is throttled for the rest of it, even if average usage is low.
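
The arithmetic behind this is worth spelling out. The fragment below is an annotated sketch that assumes the default 100ms CFS period and a hypothetical process running 4 busy threads; the numbers are illustrative, not measurements.

```yaml
# Sketch: CFS quota arithmetic for a CPU limit.
# Assumptions: default 100ms CFS period, 4 threads busy in parallel.
resources:
  requests:
    cpu: 250m
  limits:
    cpu: 500m    # quota = 0.5 x 100ms = 50ms of CPU time per 100ms period
# 4 threads running in parallel spend the 50ms quota in 50 / 4 = 12.5ms
# of wall-clock time, then sit throttled for the remaining 87.5ms.
# If such a burst happens once per second, average usage is only about
# 50m, yet a request arriving in that gap can still wait up to 87.5ms.
```

Dropping the `limits.cpu` line while keeping the request removes the quota, so bursts are then bounded only by what the node has free.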
Best Practices
- Don't set CPU limits on latency-sensitive services: throttling causes p99 latency spikes
- Always set memory limits: prevents one pod from consuming all node memory
- Guaranteed QoS for databases and critical services: requests == limits
- Burstable for web services: requests for baseline, limits for peaks
- Monitor actual usage with VPA: right-size after 1 week of observation
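
For the VPA-based observation step, a recommendation-only object could look like the sketch below. It assumes the Vertical Pod Autoscaler components are installed in the cluster and that the workload is a Deployment named `app`; both are assumptions for illustration.

```yaml
# Sketch: VPA in recommendation-only mode.
# Assumes VPA is installed; the Deployment name "app" is a placeholder.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  updatePolicy:
    updateMode: "Off"    # compute recommendations only; pods are not changed
```

The recommendations show up in the object's status, for example via `kubectl describe vpa app-vpa`, and can be copied into the workload's requests once the observation period is over.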
Key Takeaways
- Requests are scheduling guarantees; limits are runtime enforcement
- CPU is throttled (compressible); memory is OOMKilled (incompressible)
- Guaranteed QoS (requests==limits) is last to be evicted during node pressure
- Don't set CPU limits on latency-sensitive services: throttling causes p99 spikes
- Always set memory limits: one pod without limits can take down a node
- Use VPA in Off mode to observe actual usage before setting values

