Autoscaling • intermediate • 15 minutes • K8s 1.28+

Kubernetes Cost Optimization Strategies

Reduce Kubernetes cloud costs by 30-60 percent. Covers right-sizing, spot instances, cluster autoscaler tuning, resource quotas, and FinOps practices.

By Luca Berton • 5 min read

The Problem

Most Kubernetes clusters are 30-60% over-provisioned. Developers request more resources than needed, nodes sit idle during off-hours, and nobody reviews actual utilization.

The Solution

Step 1: Measure Current Waste

# Show the 20 pods using the most CPU right now
# (kubectl top reports actual usage, not requests, and needs metrics-server)
kubectl top pods -A --sort-by=cpu | head -20

# Find near-idle pods (under 10m CPU and 50Mi memory);
# check these against their requests first
kubectl top pods -A --no-headers | awk '{
  split($3,cpu,"m"); split($4,mem,"Mi");
  if (cpu[1]+0 < 10 && mem[1]+0 < 50) print $1"/"$2" - barely using resources"
}'

# Node-level utilization
kubectl top nodes
# If nodes are consistently <40% utilized, you are likely overpaying
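
kubectl top reports usage only, so spotting over-provisioning means putting requests next to it. The sketch below assumes you have already joined requests and usage into lines of `namespace pod requested_millicores used_millicores`. That input format, the function name, and the sample data are all hypothetical. It flags pods whose CPU usage is below a chosen percentage of their request.

```shell
#!/bin/sh
# flag_overprovisioned THRESHOLD_PERCENT
# Reads lines of: namespace pod requested_millicores used_millicores
# (a hypothetical pre-joined format) and prints pods whose CPU usage
# is below THRESHOLD_PERCENT of their request.
flag_overprovisioned() {
  awk -v threshold="$1" '
    $3 > 0 {
      pct = ($4 / $3) * 100
      if (pct < threshold)
        printf "%s/%s uses %.0f%% of requested CPU (%dm of %dm)\n", $1, $2, pct, $4, $3
    }'
}

# Sample data standing in for real cluster output (made-up values).
sample_usage() {
  cat <<'EOF'
prod api-7d9f 1000 120
prod worker-5c2a 500 410
staging web-1b3e 250 5
EOF
}

sample_usage | flag_overprovisioned 40
```

With the sample data, only the two pods using less than 40% of their request are listed. The same ratio can be computed for memory by feeding in memory columns instead.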

Step 2: Right-Size with VPA

# Requires the Vertical Pod Autoscaler components to be installed in the cluster
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Off"    # Start with recommendations only
  resourcePolicy:
    containerPolicies:
      - containerName: myapp
        controlledResources: ["cpu", "memory"]
        minAllowed:
          cpu: 10m
          memory: 32Mi
        maxAllowed:
          cpu: "2"
          memory: 4Gi

# Check VPA recommendations
kubectl describe vpa myapp-vpa
# Recommendation (abridged):
#   Container: myapp
#     Lower Bound:     10m CPU, 64Mi memory
#     Target:          50m CPU, 128Mi memory    ← Use this as the container's requests
#     Upper Bound:     200m CPU, 512Mi memory
#     Uncapped Target: 50m CPU, 128Mi memory
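
With updateMode "Off", the VPA only recommends; the target still has to be copied into the workload. One way to do that is sketched below. It assumes the recommendation for the container of interest is the first entry in the VPA status and that everything lives in the current namespace. The function names are hypothetical.

```shell
#!/bin/sh
# vpa_target VPA_NAME RESOURCE
# Prints the recommended target for "cpu" or "memory" from the first
# container recommendation in the VPA status.
vpa_target() {
  kubectl get vpa "$1" \
    -o jsonpath="{.status.recommendation.containerRecommendations[0].target.$2}"
}

# apply_vpa_target DEPLOYMENT VPA_NAME CONTAINER
# Copies the VPA target into the container's resource requests.
apply_vpa_target() {
  cpu="$(vpa_target "$2" cpu)"
  memory="$(vpa_target "$2" memory)"
  if [ -z "$cpu" ] || [ -z "$memory" ]; then
    echo "no recommendation available yet for $2" >&2
    return 1
  fi
  kubectl set resources deployment "$1" -c "$3" \
    --requests="cpu=$cpu,memory=$memory"
}

# Example (not run here): apply_vpa_target myapp myapp-vpa myapp
```

Changing requests this way edits the pod template, so the Deployment performs a rolling update. If the manifests live in Git, changing the values there is usually the better route, and this sketch is mainly useful for experiments.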

Step 3: Spot/Preemptible Instances

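# Node pool for spot instances (typically 60-70% cheaper than on-demand)
# Fragment: clusterName, selector, bootstrap, and version are omitted
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: spot-workers
spec:
  replicas: 10
  template:
    spec:
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2   # match your provider version
        kind: AWSMachineTemplate
        name: spot-template    # assumed to request spot capacity
---
# Schedule interruption-tolerant workloads onto spot nodes
# Fragment: selector, pod labels, and containers are omitted
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  template:
    spec:
      nodeSelector:
        # Assumed custom label set on the spot node pool; managed services
        # provide their own (for example eks.amazonaws.com/capacityType: SPOT).
        # node.kubernetes.io/instance-type holds the machine type, not "spot".
        lifecycle: spot
      tolerations:
        - key: kubernetes.io/spot    # must match the taint applied to your spot nodes
          operator: Exists
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: batch-processor    # assumes the pods carry this label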

Step 4: Scale Down Off-Hours

# CronJob to scale down at night
# Schedules follow the kube-controller-manager's time zone unless spec.timeZone is set
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-night
spec:
  schedule: "0 20 * * 1-5"    # 8PM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: scaler
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - |
                  kubectl scale deployment -n staging --all --replicas=0
                  kubectl scale deployment -n dev --all --replicas=0
          restartPolicy: OnFailure
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-morning
spec:
  schedule: "0 7 * * 1-5"     # 7AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: scaler
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - |
                  # Restores every Deployment to 1 replica, whatever its earlier count
                  kubectl scale deployment -n staging --all --replicas=1
                  kubectl scale deployment -n dev --all --replicas=1
          restartPolicy: OnFailure
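
The CronJobs above call kubectl from inside the cluster, so their pods need a service account that is allowed to scale Deployments in staging and dev. The default service account normally is not. The sketch below sets that up with standard kubectl create commands. The name deployment-scaler and the function are hypothetical, and the verb list is deliberately broad enough for kubectl scale.

```shell
#!/bin/sh
# grant_scaler_access SA_NAMESPACE TARGET_NAMESPACE...
# Creates a ServiceAccount in SA_NAMESPACE (where the CronJobs run), then a
# Role and RoleBinding in each target namespace that let it scale Deployments.
grant_scaler_access() {
  sa_ns="$1"
  shift
  kubectl create serviceaccount deployment-scaler -n "$sa_ns"
  for target_ns in "$@"; do
    kubectl create role deployment-scaler -n "$target_ns" \
      --verb=get,list,update,patch \
      --resource=deployments,deployments/scale
    kubectl create rolebinding deployment-scaler -n "$target_ns" \
      --role=deployment-scaler \
      --serviceaccount="$sa_ns:deployment-scaler"
  done
}

# Example (not run here): grant_scaler_access default staging dev
```

For the permissions to take effect, each CronJob's pod spec would also need serviceAccountName: deployment-scaler next to restartPolicy.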

Cost Optimization Checklist

Action                      Typical Savings           Effort
Right-size pods (VPA)       20-40%                    Low
Use spot instances          60-70% on those nodes     Medium
Scale down off-hours        30-50% on dev/staging     Low
Cluster autoscaler tuning   10-20%                    Low
Delete unused PVCs          5-10% on storage          Low
Use ARM instances           20-30%                    Medium

graph TD
    A[Current Spend] --> B{Audit utilization}
    B -->|Pods over-provisioned| C[VPA right-sizing: -30%]
    B -->|Nodes idle at night| D[Off-hours scaling: -40%]
    B -->|All on-demand instances| E[Spot instances: -60%]
    B -->|Unused PVCs/LBs| F[Cleanup: -10%]
    C --> G[Optimized Spend]
    D --> G
    E --> G
    F --> G
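
The percentages in the checklist and diagram apply to different slices of the bill, so they cannot simply be added up. The sketch below shows the arithmetic under two simplifying assumptions: a measure that covers only part of the spend saves its discount times that share, and measures that stack on the same spend each apply to what is left after the previous one. The function names and input numbers are illustrative, not measurements.

```shell
#!/bin/sh
# scoped_savings DISCOUNT_PCT SHARE_PCT
# Saving on the total bill when a measure with DISCOUNT_PCT applies to only
# SHARE_PCT of the spend (for example, spot on half of the nodes).
scoped_savings() {
  awk -v d="$1" -v s="$2" 'BEGIN { printf "%.1f\n", d * s / 100 }'
}

# combined_savings PCT...
# Each percentage is applied to the spend remaining after the previous one.
# Prints the overall saving as a percentage.
combined_savings() {
  printf '%s\n' "$@" | awk '
    BEGIN { remaining = 1 }
    { remaining *= (1 - $1 / 100) }
    END { printf "%.1f\n", (1 - remaining) * 100 }'
}

scoped_savings 60 50      # 60% spot discount on half the nodes
combined_savings 30 10    # 30% right-sizing, then 10% cleanup
```

So a 60% spot discount on half the fleet is a 30% saving overall, and 30% followed by 10% yields 37% rather than 40%.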

Best Practices

  • Start with observation: measure before optimizing
  • Automate: manual processes don't scale
  • Iterate: implement changes gradually and measure impact
  • Document: keep runbooks for your team

Key Takeaways

  • Measure requested versus actual usage before changing anything
  • Start with the low-effort wins: VPA right-sizing and off-hours scaling of dev/staging
  • Move interruption-tolerant workloads to spot nodes once the basics are in place
  • Monitor and measure the impact of every change
This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.

Luca Berton Ansible Pilot Ansible by Example Open Empower K8s Recipes Terraform Pilot CopyPasteLearn ProteinLens