Configuration • beginner • ⏱ 15 minutes • K8s 1.28+

Resource Limits and Requests Guide

Configure CPU and memory requests and limits for Kubernetes pods. Guaranteed vs Burstable vs BestEffort QoS classes, OOMKill prevention.

By Luca Berton • 📖 5 min read

💡 Quick Answer: Set requests equal to expected steady-state usage and limits to 2x requests for burstable workloads. For critical services, set requests == limits (Guaranteed QoS) so the pod is last in line for eviction and OOM kills under node pressure. Avoid CPU limits on latency-sensitive services, because CPU throttling causes tail latency spikes.

The Problem

Pods crash with OOMKilled, or response times spike due to CPU throttling. Developers either over-provision (wasting 50%+ of cluster resources) or under-provision (causing instability). Understanding the difference between requests, limits, and QoS classes is fundamental.

The Solution

Requests vs Limits

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0  # placeholder; a container needs an image, substitute your own
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: "1"
          memory: 512Mi
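  • Requests: The amount reserved for the container. The scheduler uses requests, not actual usage, for placement decisions.
  • Limits: The maximum the container may use. CPU above the limit is throttled; memory above the limit gets the container OOMKilled.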
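To make "used for scheduling decisions" concrete, here is a much simplified sketch of the fit check, assuming a hypothetical node with 2000m CPU and 4096Mi of allocatable memory. The function name, the dictionary keys, and the node figures are illustrative assumptions. The real scheduler applies many more filters, but the core idea holds: it compares the sum of requests, not limits or live usage, against what the node can allocate.

```python
def fits(allocatable, scheduled_requests, new_request):
    """Return True if new_request fits on the node, judged by requests only."""
    for resource, capacity in allocatable.items():
        already_requested = sum(r.get(resource, 0) for r in scheduled_requests)
        if already_requested + new_request.get(resource, 0) > capacity:
            return False
    return True


# Hypothetical node and workloads (CPU in millicores, memory in Mi).
node = {"cpu_m": 2000, "memory_mi": 4096}
running = [
    {"cpu_m": 1000, "memory_mi": 2048},
    {"cpu_m": 500, "memory_mi": 1024},
]

# Requests from the pod manifest above: cpu 250m, memory 256Mi.
app = {"cpu_m": 250, "memory_mi": 256}
# A second, hypothetical pod that requests a full core.
big = {"cpu_m": 1000, "memory_mi": 512}

print(fits(node, running, app))  # 1750m of 2000m requested, so it fits
print(fits(node, running, big))  # 2500m would exceed 2000m, so it does not
```

Note that the limit of the app container (1 CPU, 512Mi) plays no part in the check, which is why a node can be overcommitted on limits while still being "full" on requests.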

QoS Classes

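| Class      | Condition                                              | Eviction Priority |
|------------|--------------------------------------------------------|-------------------|
| Guaranteed | requests == limits (CPU and memory) for ALL containers | Last to evict     |
| Burstable  | At least one request or limit set, but not Guaranteed  | Middle            |
| BestEffort | No requests or limits at all                           | First to evict    |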
# Guaranteed QoS (highest priority)
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 500m
    memory: 512Mi

# Burstable QoS (most common)
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    memory: 512Mi
    # No CPU limit: prevents throttling
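The classification rules can also be written as a short function. This is a sketch rather than the kubelet's implementation: `qos_class` is a hypothetical helper, it compares quantity strings literally (so "500m" and "0.5" would wrongly count as different), and it assumes that a missing request defaults to the limit, which is how Kubernetes treats a container that sets only limits.

```python
def qos_class(containers):
    """Classify a pod from a list of per-container `resources` dicts."""
    anything_set = False
    guaranteed = True
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            request = requests.get(name)
            limit = limits.get(name)
            if request is not None or limit is not None:
                anything_set = True
            # Guaranteed needs a limit for both resources, and any explicit
            # request must equal that limit (an omitted request defaults to it).
            if limit is None or (request is not None and request != limit):
                guaranteed = False
    if not anything_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


guaranteed_pod = [{
    "requests": {"cpu": "500m", "memory": "512Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}]
burstable_pod = [{
    "requests": {"cpu": "250m", "memory": "256Mi"},
    "limits": {"memory": "512Mi"},
}]

print(qos_class(guaranteed_pod))  # Guaranteed
print(qos_class(burstable_pod))   # Burstable
print(qos_class([{}]))            # BestEffort
```

One consequence worth noticing: a single container without a CPU limit is enough to drop the whole pod from Guaranteed to Burstable.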

CPU Throttling vs Memory OOMKill

CPU (compressible):
  Container exceeds CPU limit → throttled (slower, not killed)
  CFS quota mechanism: 100m = 10ms of CPU time per 100ms period

Memory (incompressible):
  Container exceeds memory limit → OOMKilled
  Container restarts (if restartPolicy allows)
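The "100m = 10ms every 100ms" arithmetic generalizes to any CPU limit. A minimal sketch, assuming the default CFS period of 100ms (the period is configurable on the kubelet, so treat the constant as an assumption); `cfs_quota_us` is a hypothetical helper name:

```python
CFS_PERIOD_US = 100_000  # assumed default CFS period: 100ms, in microseconds


def cfs_quota_us(limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time (microseconds) a container may consume per CFS period."""
    return limit_millicores * period_us // 1000


for limit in (100, 250, 1000, 2000):
    quota_ms = cfs_quota_us(limit) / 1000
    print(f"{limit}m -> {quota_ms:.0f}ms of CPU time per 100ms period")
```

A limit above 1000m yields a quota larger than the period, which only makes sense because the quota is summed across all cores the container's threads run on.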
graph TD
    SCHED[Scheduler] -->|Uses requests| PLACE[Place pod on node<br/>with enough capacity]
    
    subgraph Runtime Enforcement
        CPU[CPU > limit] -->|Throttled| SLOW[Slower execution<br/>NOT killed]
        MEM[Memory > limit] -->|OOMKilled| RESTART[Container restarted]
    end
    
    subgraph Node Pressure
        PRESSURE[Node memory pressure] -->|Evict first| BE[BestEffort pods]
        PRESSURE -->|Then| BURST[Burstable pods<br/>exceeding requests]
        PRESSURE -->|Last| GUAR[Guaranteed pods]
    end

Common Issues

OOMKilled but the container only uses 400Mi (limit is 512Mi)

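Check for child processes and memory-mapped files. kubectl top pod reports the container's working set, sampled at intervals, so it can miss short spikes and understate what the kernel charges against the limit. The charged total also includes page cache and tmpfs (for example, emptyDir volumes with medium: Memory).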
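To see where the gap comes from, it can help to break down the cgroup's own accounting. The sketch below assumes cgroup v2, where a container can usually read `/sys/fs/cgroup/memory.current` and `/sys/fs/cgroup/memory.stat`. It parses the `memory.stat` format and derives a working-set figure as total usage minus inactive file cache, which is, to my understanding, how the metrics pipeline computes it. The sample numbers are invented for illustration and are not measurements.

```python
MIB = 1024 * 1024


def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v2 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(current_bytes, stats):
    """Approximate the working set: charged memory minus inactive file cache."""
    return max(current_bytes - stats.get("inactive_file", 0), 0)


# Illustrative stand-ins for the contents of memory.current and memory.stat.
SAMPLE_CURRENT = 500 * MIB
SAMPLE_STAT = "\n".join([
    f"anon {300 * MIB}",           # heap and stack of all processes
    f"file {180 * MIB}",           # page cache, including tmpfs
    f"shmem {80 * MIB}",           # tmpfs and shared memory (subset of file)
    f"inactive_file {100 * MIB}",  # cache the kernel can reclaim first
])

stats = parse_memory_stat(SAMPLE_STAT)
print("charged against limit:", SAMPLE_CURRENT // MIB, "Mi")
print("working set:", working_set_bytes(SAMPLE_CURRENT, stats) // MIB, "Mi")
print("tmpfs/shared:", stats["shmem"] // MIB, "Mi")
```

With these sample values the working set comes out at 400Mi while 500Mi is charged to the cgroup, close to a 512Mi limit. The tmpfs share is the part to watch, since it cannot be reclaimed the way ordinary page cache can.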

CPU throttling causing latency spikes

Remove CPU limits for latency-sensitive services. The CFS quota is enforced per 100ms period: a burst that uses up the quota early in a period is throttled until the next period starts, even if average usage is low.
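A small model shows why a low average does not protect against throttling. It is a deliberately simplified sketch: one thread, a burst that starts exactly at a period boundary, a 100ms period, and no other work in the container. `wall_time_ms` is a hypothetical helper, not a Kubernetes or kernel API.

```python
def wall_time_ms(cpu_needed_ms, limit_millicores, period_ms=100):
    """Wall-clock time for a single-threaded burst under a CFS quota."""
    if cpu_needed_ms <= 0:
        return 0.0
    quota_ms = limit_millicores * period_ms / 1000
    if quota_ms <= 0:
        raise ValueError("limit_millicores must be positive")
    if quota_ms >= period_ms:
        # One thread cannot use more than a full period, so it never throttles.
        return float(cpu_needed_ms)
    full_periods, remainder = divmod(cpu_needed_ms, quota_ms)
    if remainder == 0:
        # The burst finishes as the quota of its last period runs out.
        return (full_periods - 1) * period_ms + quota_ms
    # Each exhausted period costs a full period of wall time.
    return full_periods * period_ms + remainder


# A request that needs 50ms of CPU in a container limited to 200m:
print(wall_time_ms(50, 200))   # 210.0
# The same request with a limit of one full core:
print(wall_time_ms(50, 1000))  # 50.0
```

With a 200m limit the burst runs for 20ms, waits 80ms, runs 20ms, waits again, then finishes: 210ms of wall time for 50ms of work. If such a request arrives once per second, average usage is only 50m, a quarter of the limit, yet every request is throttled.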

Best Practices

  • Don't set CPU limits on latency-sensitive services: throttling causes p99 latency spikes
  • Always set memory limits: this prevents one pod from consuming all node memory
  • Guaranteed QoS for databases and critical services: requests == limits
  • Burstable for web services: requests for baseline, limits for peaks
  • Monitor actual usage with VPA: right-size after 1 week of observation
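If you collect usage samples yourself, the right-sizing step might look like the sketch below. It is not VPA's algorithm. The helper names, the choice of the 90th percentile as "steady state", and the sample values are all assumptions made for illustration. The 2x limit factor follows the rule of thumb from the Quick Answer.

```python
def percentile_nearest_rank(samples, percentile):
    """Nearest-rank percentile for an integer percentile between 1 and 100."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (percentile * len(ordered) + 99) // 100  # integer ceiling
    return ordered[max(rank, 1) - 1]


def suggest_memory_mi(samples_mi, percentile=90, limit_factor=2):
    """Suggest a memory request (Mi) and a limit at limit_factor x the request."""
    request = percentile_nearest_rank(samples_mi, percentile)
    return {"request_mi": request, "limit_mi": request * limit_factor}


# Hypothetical working-set samples (Mi) gathered over the observation window.
observed = [210, 220, 230, 240, 250, 260, 270, 280, 290, 300]
print(suggest_memory_mi(observed))
```

A higher percentile gives a safer but more expensive request; a lower one packs nodes tighter at the cost of more pods running above their requests, which makes them earlier eviction candidates.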

Key Takeaways

  • Requests are scheduling guarantees; limits are runtime enforcement
  • CPU is throttled (compressible); memory is OOMKilled (incompressible)
  • Guaranteed QoS (requests==limits) is last to be evicted during node pressure
  • Don't set CPU limits on latency-sensitive services: throttling causes p99 spikes
  • Always set memory limits: one pod without limits can take down a node
  • Use VPA in Off mode to observe actual usage before setting values
