Kubernetes Vertical Pod Autoscaler VPA Guide
Deploy and configure the Vertical Pod Autoscaler (VPA) on Kubernetes. Auto-adjust CPU and memory requests based on actual usage, right-size
π‘ Quick Answer: The Vertical Pod Autoscaler (VPA) automatically adjusts pod CPU and memory requests/limits based on historical usage. Install with
./hack/vpa-up.shfrom the autoscaler repo, create a VPA resource targeting your Deployment, and setupdateMode: "Auto"for automatic adjustment or"Off"for recommendations only.
The Problem
- Developers guess resource requests β too low causes OOMKilled, too high wastes cluster capacity
- Manual right-sizing requires constant monitoring and adjustment
- Applicationsβ resource needs change over time (traffic patterns, data growth)
- Over-provisioned clusters waste 40-60% of allocated resources
- Under-provisioned pods get throttled (CPU) or killed (memory)
The Solution
Install VPA
# Clone the autoscaler repo
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
# Install VPA components (admission controller, recommender, updater)
./hack/vpa-up.sh
# Verify
kubectl get pods -n kube-system | grep vpa
# vpa-admission-controller-xxx 1/1 Running 0 1m
# vpa-recommender-xxx 1/1 Running 0 1m
# vpa-updater-xxx 1/1 Running 0 1mVPA in Recommendation Mode (Safe Start)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: my-app-vpa
namespace: production
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: my-app
updatePolicy:
updateMode: "Off" # Only recommend, don't change pods
resourcePolicy:
containerPolicies:
- containerName: "*"
minAllowed:
cpu: "50m"
memory: "64Mi"
maxAllowed:
cpu: "4"
memory: "8Gi"
controlledResources: ["cpu", "memory"]# Check recommendations
kubectl describe vpa my-app-vpa -n production
# Recommendation:
# Container Recommendations:
# Container Name: my-app
# Lower Bound: Cpu: 100m, Memory: 128Mi
# Target: Cpu: 250m, Memory: 384Mi β Use this
# Upper Bound: Cpu: 1000m, Memory: 1Gi
# Uncapped Target: Cpu: 250m, Memory: 384MiVPA in Auto Mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: my-app-vpa
namespace: production
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: my-app
updatePolicy:
updateMode: "Auto" # Automatically adjust + restart pods
minReplicas: 2 # Don't evict if fewer than 2 replicas
resourcePolicy:
containerPolicies:
- containerName: app
minAllowed:
cpu: "100m"
memory: "128Mi"
maxAllowed:
cpu: "2"
memory: "4Gi"
controlledResources: ["cpu", "memory"]
controlledValues: RequestsAndLimits # or RequestsOnlyUpdate Modes
Mode β Behavior
ββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Off β Only generates recommendations (no pod changes)
β Safe for production exploration
ββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Initial β Sets resources only at pod creation (no evictions)
β Good for Jobs and one-shot pods
ββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Auto β Evicts and recreates pods with new resource values
β Full automation (ensure PDB protects availability)
ββββββββββββ΄βββββββββββββββββββββββββββββββββββββββββββββββββββββββββVPA with Multiple Containers
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: multi-container-vpa
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: my-app
updatePolicy:
updateMode: "Auto"
resourcePolicy:
containerPolicies:
- containerName: app
minAllowed:
cpu: "200m"
memory: "256Mi"
maxAllowed:
cpu: "4"
memory: "8Gi"
- containerName: sidecar
mode: "Off" # Don't touch sidecar resources
- containerName: init-container
mode: "Off" # Don't touch init containersCommon Issues
VPA evicting pods too frequently
- Cause: Small resource changes trigger unnecessary restarts
- Fix: Set
minReplicas> 1; use PodDisruptionBudget; widen min/max range
VPA and HPA conflict
- Cause: Both trying to adjust the same deployment β VPA changes requests, HPA scales replicas
- Fix: Use VPA for CPU/memory right-sizing + HPA for replica scaling on custom metrics (not CPU)
βcannot use VPA with HPA on CPUβ
- Cause: HPA targets CPU utilization percentage β VPA changes the denominator
- Fix: Use HPA with custom/external metrics (not
cpu/memory) alongside VPA
Recommendations show 0 after VPA creation
- Cause: VPA needs 24-48h of metrics history to generate reliable recommendations
- Fix: Wait for data collection; ensure metrics-server is running
Best Practices
- Start with
Offmode β observe recommendations before enabling auto-adjustment - Set min/max bounds β prevent recommendations from going too low (OOM) or too high (waste)
- Use PodDisruptionBudgets β protect availability when VPA evicts pods
- Donβt combine VPA + HPA on CPU β they conflict; use custom metrics with HPA
controlledValues: RequestsOnlyβ let limits float with a ratio to requests- Monitor recommendation stability β erratic recommendations indicate variable workloads (use HPA instead)
Key Takeaways
- VPA auto-adjusts CPU/memory requests based on actual historical usage
- Install with
./hack/vpa-up.shfrom kubernetes/autoscaler repo - Three modes:
Off(recommend only),Initial(set at creation),Auto(evict + update) - Recommendations include lower bound, target, and upper bound
- VPA + HPA: safe only when HPA uses custom metrics (not CPU/memory)
- Set
minAllowed/maxAllowedto prevent extreme values Automode evicts pods β use PDB to maintain availability during restarts

Recommended
Kubernetes Recipes β The Complete Book100+ production-ready patterns with detailed explanations, best practices, and copy-paste YAML. Everything in one place.
Get the Book βLearn by Doing
CopyPasteLearn β Hands-on Cloud & DevOps CoursesMaster Kubernetes, Ansible, Terraform, and MLOps with interactive, copy-paste-run lessons. Start free.
Browse Courses βπ Deepen Your Skills β Hands-on Courses
Courses by CopyPasteLearn.com β Learn IT by Doing
