Vertical Pod Autoscaler Deep Dive
Configure VPA for automatic memory and CPU right-sizing in Kubernetes. Recommendation modes, update policies, VPA with HPA coexistence, and GPU workload tuning.
Quick Answer: Deploy VPA in Off mode first to get recommendations without changes, then switch to Auto for memory only (let HPA handle CPU scaling). Set minAllowed and maxAllowed to prevent extreme values. VPA and HPA can coexist if they control different resources.
The Problem
Developers guess resource requests/limits at deploy time, usually too high (wasting cluster capacity) or too low (causing OOM kills and throttling). VPA observes actual usage and adjusts requests automatically, but misconfiguration can cause pod restarts or conflict with HPA.
The Solution
Step 1: Recommendation Mode (Safe Start)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"
Check recommendations:
kubectl describe vpa my-app-vpa
# Recommendation:
# Container: app
# Lower Bound: Cpu: 25m, Memory: 128Mi
# Target: Cpu: 100m, Memory: 256Mi
# Upper Bound: Cpu: 500m, Memory: 1Gi
# Uncapped: Cpu: 100m, Memory: 256Mi
Step 2: Auto Mode with Bounds
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: app
      minAllowed:
        cpu: 50m
        memory: 128Mi
      maxAllowed:
        cpu: "4"
        memory: 8Gi
      controlledResources: ["cpu", "memory"]
VPA + HPA Coexistence
# VPA controls memory only
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: app
      controlledResources: ["memory"]
      minAllowed:
        memory: 128Mi
      maxAllowed:
        memory: 8Gi
---
# HPA controls CPU scaling (replica count)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
graph TD
    VPA[VPA<br/>controlledResources: memory] -->|Adjust memory<br/>requests/limits| POD[Pod]
    HPA[HPA<br/>metric: CPU utilization] -->|Scale replicas| DEPLOY[Deployment<br/>replicas: 2-10]
    DEPLOY --> POD
    VPA -.->|Does NOT control| CPU[CPU resources]
    HPA -.->|Does NOT control| MEM[Memory resources]
Update Modes
| Mode | Behavior | Use Case |
|---|---|---|
| Off | Recommendations only, no changes | Initial assessment |
| Initial | Set resources on pod creation only | Avoid mid-lifecycle restarts |
| Recreate | Evict and recreate pods to apply changes | Stateless services |
| Auto | Same as Recreate (in-place update planned) | Production workloads |
Common Issues
VPA keeps restarting pods
VPA evicts pods to apply new resource values (in-place resize is not yet GA). Use updateMode: Initial to only set resources at pod creation.
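A minimal sketch of the Initial-mode variant, reusing the my-app names from the manifests above. In this mode the recommendation is applied only when a pod is created, so running pods are not evicted and new values take effect at the next rollout or restart.

```yaml
# Sketch: same target as the earlier examples, but resources are
# set only at pod creation (running pods are left alone).
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Initial"
```

To pick up a fresh recommendation in this mode, restart the workload yourself, for example with kubectl rollout restart deployment/my-app.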
VPA and HPA fighting over CPU
Never let both VPA and HPA control the same resource. Use controlledResources: ["memory"] on VPA when HPA targets CPU.
VPA recommends extremely low values
Set minAllowed to prevent VPA from under-provisioning. Low recommendations are common for services with spiky traffic.
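One way to apply such a floor is sketched below. The numbers are illustrative assumptions, not recommendations; a reasonable starting point is the Lower Bound observed in Off mode. The containerName "*" entry serves as the default policy for containers that have no entry of their own, so sidecars get a floor as well.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    # Explicit floor for the main container (illustrative values).
    - containerName: app
      minAllowed:
        cpu: 100m
        memory: 256Mi
    # "*" is the default policy for any container not listed above.
    - containerName: "*"
      minAllowed:
        cpu: 25m
        memory: 64Mi
```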
Best Practices
- Start with Off mode: observe recommendations for a week before enabling Auto
- Set minAllowed and maxAllowed: prevent extreme right-sizing
- VPA for memory, HPA for CPU: the safe coexistence pattern
- Use Initial mode for StatefulSets: avoid evicting database pods
- Monitor VPA events: kubectl get events --field-selector reason=EvictedByVPA
Key Takeaways
- VPA right-sizes resource requests based on observed usage
- Start in Off mode to see recommendations without changes
- VPA + HPA coexistence: VPA controls memory, HPA controls CPU/replicas
- minAllowed/maxAllowed are essential guardrails
- VPA evicts pods to apply changes (no GA in-place resize yet)
- Initial mode is safest for stateful workloads

