What is the quick answer for VPA Example: Vertical Pod Autoscaler Setup?

Kubernetes VPA example with step-by-step setup. Configure Vertical Pod Autoscaler to automatically right-size pod CPU and memory requests and limits.

VPA Example: Vertical Pod Autoscaler Setup

💡 Quick Answer: VPA auto-adjusts pod CPU/memory requests based on actual usage. Install VPA, then create VerticalPodAutoscaler resource targeting your Deployment. Modes: Off (recommendations only), Auto (applies changes, restarts pods). Check recommendations: kubectl describe vpa <name>. Don’t use VPA + HPA on same CPU metric. VPA restarts pods to apply changes—use PodDisruptionBudget.

The Problem

You don’t know the right CPU and memory values for your pod resource requests, leading to either wasted resources (over-provisioning) or OOM kills (under-provisioning).

The Solution

Use Vertical Pod Autoscaler (VPA) to automatically analyze resource usage and recommend or apply optimal resource requests.

Step 1: Install VPA

Clone and install VPA:

git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler

# Install VPA components
./hack/vpa-up.sh

Or with Helm:

helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install vpa fairwinds-stable/vpa --namespace vpa --create-namespace

Verify installation:

kubectl get pods -n kube-system | grep vpa

Step 2: Create a Test Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hamster
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hamster
  template:
    metadata:
      labels:
        app: hamster
    spec:
      containers:
      - name: hamster
        image: registry.k8s.io/ubuntu-slim:0.1
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        command: ["/bin/sh"]
        args:
          - "-c"
          - "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"

Step 3: Create a VPA Resource

Recommendation Mode (Off)

Get recommendations without applying changes:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Off"  # Only recommend, don't apply

Auto Mode

Automatically update pod resources:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 50m
        memory: 50Mi
      maxAllowed:
        cpu: 2
        memory: 2Gi
      controlledResources: ["cpu", "memory"]

Step 4: View Recommendations

Check VPA status:

kubectl describe vpa hamster-vpa

Output shows recommendations:

Recommendation:
  Container Recommendations:
    Container Name:  hamster
    Lower Bound:
      Cpu:     25m
      Memory:  262144k
    Target:
      Cpu:     587m
      Memory:  262144k
    Uncapped Target:
      Cpu:     587m
      Memory:  262144k
    Upper Bound:
      Cpu:     1
      Memory:  500Mi

VPA Update Modes

Mode	Behavior
Off	VPA only provides recommendations
Initial	VPA sets resources only at pod creation
Recreate	VPA updates by evicting and recreating pods
Auto	Currently same as Recreate

Production Configuration

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"
    minReplicas: 2  # Don't evict if less than 2 replicas
  resourcePolicy:
    containerPolicies:
    - containerName: myapp
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 4
        memory: 8Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits
    - containerName: sidecar
      mode: "Off"  # Don't autoscale the sidecar

VPA with Resource Policies

Control what gets scaled:

resourcePolicy:
  containerPolicies:
  - containerName: app
    controlledResources: ["memory"]  # Only scale memory
    controlledValues: RequestsOnly   # Don't change limits

Combining VPA and HPA

⚠️ Warning: Don’t use VPA and HPA on the same resource (CPU/memory).

Safe combination:

# VPA controls memory
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myapp
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      controlledResources: ["memory"]  # Only memory
---
# HPA scales on CPU
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Monitoring VPA

Get all VPA resources:

kubectl get vpa -A

Watch recommendations:

kubectl get vpa hamster-vpa -o jsonpath='{.status.recommendation.containerRecommendations}' | jq

Goldilocks: VPA Dashboard

Install Goldilocks for a UI:

helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install goldilocks fairwinds-stable/goldilocks --namespace goldilocks --create-namespace

Enable for namespace:

kubectl label namespace production goldilocks.fairwinds.com/enabled=true

Best Practices

1. Start with Off Mode

Get recommendations first, then apply manually:

updateMode: "Off"

2. Set Min/Max Bounds

Prevent extreme values:

minAllowed:
  cpu: 50m
  memory: 64Mi
maxAllowed:
  cpu: 4
  memory: 8Gi

3. Use PodDisruptionBudget

Ensure availability during updates:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: myapp

4. Monitor OOMKilled Events

If pods still get OOMKilled, adjust maxAllowed.

Key Takeaways

VPA automatically right-sizes pod resources
Use Off mode to get recommendations first
Set min/max bounds to prevent extreme values
Don’t use VPA and HPA on the same metric
Combine with PDB for safe updates

📘 Go Further with Kubernetes Recipes

Love this recipe? There’s so much more! This is just one of 100+ hands-on recipes in our comprehensive Kubernetes Recipes book.

Inside the book, you’ll master:

✅ Production-ready deployment strategies
✅ Advanced networking and security patterns
✅ Observability, monitoring, and troubleshooting
✅ Real-world best practices from industry experts

“The practical, recipe-based approach made complex Kubernetes concepts finally click for me.”

👉 Get Your Copy Now — Start building production-grade Kubernetes skills today!

Frequently Asked Questions

What is VPA in Kubernetes?

VPA (Vertical Pod Autoscaler) automatically adjusts CPU and memory requests/limits for pods based on actual usage. Unlike HPA which adds more pods, VPA right-sizes existing pods to avoid over-provisioning (wasting resources) or under-provisioning (causing OOMKilled).

How do I install VPA?

git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

This deploys three components: Recommender (analyzes usage), Updater (evicts pods needing updates), and Admission Controller (sets resources on new pods).

Can I use VPA and HPA together?

Yes, but don’t target the same metrics. Use HPA for CPU-based horizontal scaling and VPA for memory right-sizing only. Set VPA’s controlledResources: ["memory"] to avoid conflicts with HPA’s CPU scaling.