Troubleshooting • intermediate • ⏱ 15 minutes • K8s 1.28+

Fix OOMKilled Containers in Kubernetes

Debug and resolve OOMKilled container terminations. Understand memory limits, kernel OOM killer behavior, and right-sizing strategies for Kubernetes pods.

By Luca Berton • 📖 5 min read

💡 Quick Answer: OOMKilled (exit code 137) means your container exceeded its memory limit and the kernel killed it. Fix: increase resources.limits.memory, fix memory leaks, or use VPA to auto-right-size. Check actual usage with kubectl top pod before adjusting limits.

Key insight: There are TWO types of OOM: container limit OOM (cgroup) and node-level OOM (kernel). Container limit OOM kills only the container that exceeded its limit. Node-level OOM can kill processes in any pod on the node, including pods that are within their own limits.

Gotcha: JVM -Xmx must be LESS than the container memory limit. If you set -Xmx512m and the limit is 512Mi, the container is likely to be OOMKilled, because the JVM uses memory beyond the heap (metaspace, thread stacks, native allocations).

The Problem

Your container keeps getting killed with OOMKilled:

$ kubectl describe pod myapp-abc123 | grep -A5 "Last State"
    Last State:  Terminated
      Reason:    OOMKilled
      Exit Code: 137
      Started:   Thu, 02 Apr 2026 10:00:00 UTC
      Finished:  Thu, 02 Apr 2026 10:05:32 UTC
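
Exit code 137 follows the common 128 + signal number convention, so 137 maps to signal 9 (SIGKILL). The sketch below decodes an exit code that way. The function name is hypothetical, and a SIGKILL can come from sources other than the OOM killer, so treat Reason: OOMKilled as the confirming field.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name, e.g. 9 -> SIGKILL on Linux.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with application error code {code}"


print(describe_exit_code(137))
```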

The Solution

Step 1: Check Current Memory Usage

# Real-time memory usage (requires metrics-server)
kubectl top pod myapp-abc123
# NAME           CPU(cores)   MEMORY(bytes)
# myapp-abc123   50m          480Mi

# Check the memory limit
kubectl get pod myapp-abc123 -o jsonpath='{.spec.containers[0].resources.limits.memory}'
# 512Mi

If usage is close to the limit, the container is at risk of being OOMKilled on the next spike.
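
To put a number on "close to the limit", here is a minimal sketch that converts the two values above into bytes and computes the ratio. The helper names are hypothetical, and the parser assumes only the binary suffixes Ki, Mi, Gi, Ti or a plain byte count.

```python
# Binary suffixes used in Kubernetes memory quantities.
# Assumption: inputs use only these suffixes or plain bytes.
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as '512Mi' to bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


def usage_ratio(usage: str, limit: str) -> float:
    """Fraction of the memory limit currently in use."""
    return to_bytes(usage) / to_bytes(limit)


ratio = usage_ratio("480Mi", "512Mi")
print(f"{ratio:.1%} of limit")
```

With the values from this step the ratio is 0.9375, which leaves only 32Mi of headroom.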

Step 2: Determine OOM Type

# Container cgroup OOM (most common)
kubectl describe pod myapp-abc123 | grep -i oom
# Reason: OOMKilled

# Node-level OOM (check node events)
kubectl describe node worker-1 | grep -A3 "OOM"
# "System OOM encountered, victim process: myapp"
graph TD
    A[Container Memory Grows] --> B{Exceeds limit?}
    B -->|Yes| C[cgroup OOM killer]
    C --> D[Container killed, exit 137]
    D --> E[Pod restarts]
    B -->|No| F{Node memory pressure?}
    F -->|Yes| G[Kernel OOM killer]
    G --> H[Process with highest oom_score killed]
    F -->|No| I[Container runs normally]

Step 3: Increase Memory Limits

resources:
  requests:
    memory: "256Mi"   # Scheduler uses this for placement
  limits:
    memory: "1Gi"     # Hard ceiling: OOMKilled if exceeded

Right-sizing guidelines:

  • Set requests to the average memory usage
  • Set limits to the peak usage + 20% buffer
  • Use kubectl top pod over time to find the right values
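
The guidelines above can be sketched as a small calculation. This assumes you have already collected memory readings in Mi (the sample values here are made up) and that a 20% buffer suits your workload; the function name is hypothetical.

```python
import math
from statistics import mean


def right_size(samples_mib: list[float], buffer: float = 0.20) -> dict[str, str]:
    """Suggest requests (average) and limits (peak + buffer) from samples."""
    request = math.ceil(mean(samples_mib))
    # round() first so float noise cannot push ceil() up by one.
    limit = math.ceil(round(max(samples_mib) * (1 + buffer), 6))
    return {"requests": f"{request}Mi", "limits": f"{limit}Mi"}


# Hypothetical readings collected from kubectl top pod over time.
samples = [210, 230, 260, 400, 250]
print(right_size(samples))
```

For these samples the average is 270 and the peak is 400, so the suggestion is 270Mi for requests and 480Mi for limits.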

Step 4: Fix Memory Leaks

If memory grows continuously without leveling off, the app probably has a leak. To debug:

# Watch memory over time
watch -n5 "kubectl top pod myapp-abc123"

# Profile inside the container
kubectl exec -it myapp-abc123 -- /bin/sh
# For Go: curl localhost:6060/debug/pprof/heap > heap.prof
# For Java: jmap -dump:format=b,file=/tmp/heap.hprof 1
# For Python: pip install memory-profiler
# For Node: --inspect flag + Chrome DevTools
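
To tell steady growth from noise, you can fit a line through the readings you collected. The sketch below computes a least-squares slope and a rough time until the limit is reached. The readings are invented for illustration, and a linear trend is an assumption: real leaks may grow in steps or bursts.

```python
from typing import Optional


def growth_rate(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of (seconds, MiB) samples, in MiB per second."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den


def seconds_until_limit(samples: list[tuple[float, float]], limit_mib: float) -> Optional[float]:
    """Estimate seconds until the limit is hit, or None if usage is not growing."""
    rate = growth_rate(samples)
    if rate <= 0:
        return None
    last_mib = samples[-1][1]
    return (limit_mib - last_mib) / rate


# Hypothetical readings: (seconds since start, memory in MiB).
readings = [(0, 300), (60, 310), (120, 320), (180, 330)]
print(f"{growth_rate(readings) * 60:.1f} MiB/min")
print(seconds_until_limit(readings, limit_mib=512))
```

A slope that stays positive across a long window is a stronger leak signal than any single high reading.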

JVM-Specific Fix

containers:
  - name: myapp
    image: myapp:latest
    resources:
      limits:
        memory: "1Gi"
    env:
      # Option 1: set heap explicitly to 75% of the container limit
      - name: JAVA_OPTS
        value: "-Xms256m -Xmx768m -XX:+UseContainerSupport"
      # Option 2: let the JVM size the heap from the limit (Java 10+).
      # Use one option only; two env entries with the same name conflict.
      # - name: JAVA_OPTS
      #   value: "-XX:MaxRAMPercentage=75.0 -XX:+UseContainerSupport"

JVM memory breakdown:

  • Heap (-Xmx): ~75% of limit
  • Metaspace: ~100-200MB
  • Thread stacks: ~1MB per thread
  • Native memory, code cache, buffers: ~100-200MB
  • Total > Heap: always leave headroom
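
The breakdown above can be checked with simple arithmetic. This is a rough budgeting sketch, not a measurement: the function names are hypothetical, and the metaspace, thread, and native figures are estimates you would replace with values observed for your app.

```python
def max_heap_mib(limit_mib: int, heap_fraction: float = 0.75) -> int:
    """Heap size (MiB) for a given container limit and heap fraction."""
    return int(limit_mib * heap_fraction)


def fits_in_limit(
    limit_mib: int,
    heap_mib: int,
    metaspace_mib: int,
    threads: int,
    other_native_mib: int,
    stack_mib_per_thread: int = 1,
) -> bool:
    """True if the estimated JVM footprint stays within the container limit."""
    total = heap_mib + metaspace_mib + threads * stack_mib_per_thread + other_native_mib
    return total <= limit_mib


heap = max_heap_mib(1024)  # 768 MiB for a 1Gi limit
print(fits_in_limit(1024, heap, metaspace_mib=100, threads=50, other_native_mib=100))
print(fits_in_limit(512, 512, metaspace_mib=100, threads=50, other_native_mib=100))
```

The second call models the gotcha from the top of this recipe: a heap equal to the limit leaves nothing for non-heap memory.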

Use VPA for Auto-Right-Sizing

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"  # Or "Off" for recommendation-only
  resourcePolicy:
    containerPolicies:
      - containerName: myapp
        minAllowed:
          memory: "128Mi"
        maxAllowed:
          memory: "4Gi"
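
To make the effect of minAllowed and maxAllowed concrete, here is a minimal sketch of clamping a recommendation to those bounds. It only illustrates the bounds from the manifest above (128Mi and 4Gi); it is not the VPA recommender's algorithm, and the function name is hypothetical.

```python
def clamp_recommendation(recommended_mib: int, min_mib: int = 128, max_mib: int = 4096) -> int:
    """Keep a memory recommendation (MiB) within the allowed range."""
    return max(min_mib, min(recommended_mib, max_mib))


# Hypothetical raw recommendations in MiB.
for raw in (64, 900, 8192):
    print(raw, "->", clamp_recommendation(raw))
```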

Common Issues

OOMKilled but kubectl top shows low memory

The peak probably happened between polling intervals: kubectl top shows a recent sample, not the maximum. Use Prometheus with container_memory_working_set_bytes for historical data.
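
One way to look for the missed peak is a max_over_time query over that metric. The sketch below only builds the PromQL string. It assumes your setup exposes pod and container labels on the metric, and the helper name is hypothetical. The result is still bounded by your scrape interval, so very short spikes can go unrecorded.

```python
def peak_memory_query(pod: str, container: str, window: str = "1h") -> str:
    """Build a PromQL query for peak working-set memory over a window."""
    selector = (
        f'container_memory_working_set_bytes{{pod="{pod}", container="{container}"}}'
    )
    return f"max_over_time({selector}[{window}])"


print(peak_memory_query("myapp-abc123", "myapp"))
```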

OOMKilled immediately on startup

The memory limit is too low for the application to even start. Java apps commonly need 512Mi+ just for JVM initialization.

Node evicts pods but no OOMKilled

This is eviction, not OOM. The kubelet evicts pods when available node memory drops below the eviction threshold (--eviction-hard=memory.available<100Mi). Check with kubectl describe node | grep -A5 Conditions.

VPA and HPA conflict

Run VPA for memory only and HPA for CPU scaling to avoid conflicts:

resourcePolicy:
  containerPolicies:
    - containerName: myapp
      controlledResources: ["memory"]  # VPA manages memory only

Best Practices

  • Always set memory limits: without limits, one container can consume all node memory
  • JVM heap = 75% of container limit: leave room for non-heap memory
  • Use -XX:+UseContainerSupport (Java 10+, where it is enabled by default) for container-aware JVM
  • Monitor with Prometheus: container_memory_working_set_bytes is what the OOM killer uses
  • Set requests ≈ average, limits ≈ peak + 20% for right-sizing
  • Use VPA in recommendation mode first to understand actual usage patterns

Key Takeaways

  • OOMKilled = container exceeded its cgroup memory limit → exit code 137
  • Two types: container OOM (cgroup limit) and node OOM (kernel pressure)
  • JVM apps need -Xmx set to ~75% of container limit, not 100%
  • Use kubectl top pod and Prometheus for right-sizing decisions
  • VPA can auto-adjust memory limits; use with HPA by splitting resource control

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.