Configuration | Intermediate | ⏱ 15 minutes | K8s 1.28+

How to Optimize Kubernetes Costs

Reduce cloud costs in Kubernetes clusters. Right-size resources, use spot instances, implement autoscaling, and monitor spending effectively.

By Luca Berton • 📖 6 min read

💡 Quick Answer: Key strategies: right-size resources (compare kubectl top vs requests), use spot/preemptible instances for fault-tolerant workloads, enable cluster autoscaler, and delete unused resources. Use tools like Kubecost or OpenCost to visualize spending per namespace/label.

Key command: kubectl top pods -A --sort-by=cpu identifies resource-hungry pods; VPA recommends better resource settings.

Gotcha: Over-provisioned requests waste money; under-provisioned limits cause OOMKills. Monitor and iterate.

Kubernetes cost optimization involves right-sizing workloads, leveraging spot instances, implementing autoscaling, and monitoring resource utilization to reduce cloud spending.

Analyze Current Usage

# Check resource requests vs actual usage
kubectl top pods -A

# Compare requests to usage
kubectl get pods -A -o custom-columns=\
'NAMESPACE:.metadata.namespace,NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'

# Find pods where any container is missing resource limits
kubectl get pods -A -o json | jq -r '.items[] | select(any(.spec.containers[]; .resources.limits == null)) | "\(.metadata.namespace)/\(.metadata.name)"'
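
To put a number on the gap between requests and reality, the `kubectl top` output can be summed with a little awk. A sketch using a hard-coded sample so the arithmetic is reproducible; in practice, pipe `kubectl top pods --no-headers` into the same filter:

```shell
# Hypothetical `kubectl top pods --no-headers` output, inlined here
# so the example runs without a cluster.
sample='myapp-1   120m   300Mi
myapp-2   80m    250Mi'

# Strip the trailing "m" from the CPU column and sum the millicores.
total_cpu=$(echo "$sample" | awk '{gsub(/m$/, "", $2); total += $2} END {print total "m"}')
echo "total CPU usage: $total_cpu"
```

Comparing that total against the summed requests from the custom-columns query above shows how much headroom you are paying for.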

Right-Size Resources

# Before: Over-provisioned
spec:
  containers:
    - name: app
      resources:
        requests:
          cpu: "2000m"      # Requesting 2 cores
          memory: "4Gi"     # Requesting 4GB
        limits:
          cpu: "4000m"
          memory: "8Gi"

# After: Right-sized based on actual usage
spec:
  containers:
    - name: app
      resources:
        requests:
          cpu: "250m"       # Actual avg usage + buffer
          memory: "512Mi"   # Actual avg usage + buffer
        limits:
          cpu: "500m"       # 2x request for bursting
          memory: "1Gi"
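
The buffer arithmetic behind those comments can be made explicit. A sketch with an illustrative usage figure: request = observed average plus ~30% headroom, CPU limit = 2x the request:

```shell
avg_cpu_m=190    # observed average CPU usage in millicores (illustrative)
buffer_pct=30    # safety headroom on top of the average

# Request covers the average plus buffer; limit allows 2x bursting.
request_m=$(( avg_cpu_m * (100 + buffer_pct) / 100 ))
limit_m=$(( request_m * 2 ))
echo "cpu request: ${request_m}m, cpu limit: ${limit_m}m"
```

These values round naturally to the 250m/500m used in the manifest above.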

Vertical Pod Autoscaler (VPA)

# vpa.yaml - Automatically right-size pods
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"  # Or "Off" for recommendations only
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: 2000m
          memory: 4Gi

# View VPA recommendations
kubectl describe vpa myapp-vpa

# Recommendation output shows:
# - Lower Bound: Minimum resources needed
# - Target: Optimal recommendation
# - Upper Bound: Maximum expected usage

Horizontal Pod Autoscaler (HPA)

# hpa.yaml - Scale replicas based on usage
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait before scaling down
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
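
CPU is rarely the only bottleneck; a second Resource entry can be added to the `metrics:` list above to also scale on memory. A sketch (the 80% threshold is illustrative):

```yaml
# Additional entry for the HPA metrics list: scale out when average
# memory utilization across pods exceeds 80% of requests.
- type: Resource
  resource:
    name: memory
    target:
      type: Utilization
      averageUtilization: 80
```

When multiple metrics are listed, the HPA takes the largest replica count any of them proposes.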

Use Spot/Preemptible Instances

# spot-node-pool.yaml
# Configure node pool for spot instances (cloud-specific)

# AWS EKS with Karpenter (NodePool API; the older v1alpha5
# Provisioner API has been removed in current Karpenter releases)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m5.large", "m5.xlarge", "m5a.large"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 100
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s

# Tolerate spot node taints (AKS spot taint shown; taint keys vary by cloud provider)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  template:
    spec:
      tolerations:
        - key: "kubernetes.azure.com/scalesetpriority"
          operator: "Equal"
          value: "spot"
          effect: "NoSchedule"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.azure.com/scalesetpriority
                    operator: In
                    values:
                      - spot
      containers:
        - name: processor
          image: batch:v1
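
Spot capacity can be reclaimed with only a couple of minutes' notice, so pairing spot workloads with a PodDisruptionBudget limits how many replicas vanish at once. A sketch, assuming the Deployment's pods carry an `app: batch-processor` label:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-processor-pdb
spec:
  minAvailable: 1            # Keep at least one replica during node churn
  selector:
    matchLabels:
      app: batch-processor   # Assumed label; match your Deployment's labels
```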

Cluster Autoscaler

# cluster-autoscaler deployment
# Scales nodes based on pending pods

# Key settings:
# --scale-down-enabled=true
# --scale-down-delay-after-add=10m
# --scale-down-unneeded-time=10m
# --scale-down-utilization-threshold=0.5

# Pods that prevent scale-down:
# - Pods with local storage
# - Pods with PodDisruptionBudget preventing eviction
# - Kube-system pods without PDB
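
Pods with local storage block scale-down by default unless explicitly marked evictable. A sketch of the pod-template fragment with the autoscaler's opt-in annotation:

```yaml
# Pod template metadata fragment: explicitly mark the pod as safe to
# evict so the cluster autoscaler can drain its node during scale-down.
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
```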

Resource Quotas per Namespace

# quota.yaml - Prevent over-provisioning
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "50"
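
Compute is only part of the bill; persistent storage is often a larger cost driver. A second quota capping storage for the same namespace (values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-storage-quota
  namespace: team-a
spec:
  hard:
    persistentvolumeclaims: "10"   # Cap the number of PVCs
    requests.storage: 500Gi        # Cap total requested storage
```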

Limit Ranges

# limit-range.yaml - Default and max limits
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-a
spec:
  limits:
    - type: Container
      default:
        cpu: "200m"
        memory: "256Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      max:
        cpu: "2"
        memory: "4Gi"

Schedule Non-Critical Workloads Off-Peak

# cronjob-off-peak.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: batch-job
spec:
  schedule: "0 2 * * *"  # Run at 2 AM (off-peak)
  jobTemplate:
    spec:
      template:
        spec:
          tolerations:
            - key: "spot"
              operator: "Exists"
          containers:
            - name: batch
              image: batch:v1

Cost Monitoring with Kubecost

# Install Kubecost
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace

# Access dashboard
kubectl port-forward -n kubecost deployment/kubecost-cost-analyzer 9090

# Key metrics:
# - Cost by namespace
# - Cost by deployment
# - Idle resources
# - Right-sizing recommendations

Prometheus Cost Queries

# Cost-related metrics
# CPU cost estimation (simplified)
sum(rate(container_cpu_usage_seconds_total{namespace!=""}[5m])) by (namespace)

# Memory cost estimation
sum(container_memory_working_set_bytes{namespace!=""}) by (namespace)

# Resource efficiency (usage vs request)
sum(rate(container_cpu_usage_seconds_total[5m])) / sum(kube_pod_container_resource_requests{resource="cpu"})
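
The efficiency query above is cluster-wide; breaking it down by namespace shows which teams over-request. A sketch, assuming kube-state-metrics is installed to provide the requests metric:

```
# CPU efficiency per namespace: values near 1.0 mean well-sized
# requests; values far below 1.0 indicate over-provisioning.
sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace)
  / sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace)
```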

Pod Priority for Cost Control

# priority-class.yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
preemptionPolicy: Never  # Low-priority batch jobs should not preempt others
description: "Low priority for batch jobs - can be preempted"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
preemptionPolicy: PreemptLowerPriority
description: "High priority for production services"
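
A PriorityClass does nothing until pods reference it. A sketch attaching the low-priority class to a batch pod spec (the image name is illustrative, matching the examples above):

```yaml
# Pod spec fragment: this pod yields capacity to higher-priority
# workloads when the cluster is under pressure.
spec:
  priorityClassName: low-priority
  containers:
    - name: batch
      image: batch:v1
```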

Cleanup Unused Resources

# Find unused PVCs
kubectl get pvc -A -o json | jq -r '.items[] | select(.status.phase == "Bound") | "\(.metadata.namespace)/\(.metadata.name)"' | while read -r pvc; do
  ns="${pvc%%/*}"
  name="${pvc#*/}"
  if ! kubectl get pods -n "$ns" -o json | grep -q "\"claimName\": \"$name\""; then
    echo "Unused PVC: $pvc"
  fi
done

# Find orphaned PVs
kubectl get pv -o json | jq -r '.items[] | select(.status.phase == "Released") | .metadata.name'

# Clean up completed jobs
kubectl delete jobs --field-selector status.successful=1 -A
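
Manual job cleanup can be avoided entirely with the TTL-after-finished controller: finished Jobs then delete themselves. A sketch:

```yaml
# ttl-job.yaml - Job deletes itself (and its pods) one hour after
# finishing, keeping the cluster free of completed-job clutter.
apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task
spec:
  ttlSecondsAfterFinished: 3600
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: task
          image: batch:v1   # Illustrative image
```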

Cost Optimization Checklist

1. Right-size workloads
   ☐ Set resource requests based on actual usage
   ☐ Use VPA for recommendations
   ☐ Review and adjust monthly

2. Autoscaling
   ☐ HPA for variable workloads
   ☐ Cluster autoscaler enabled
   ☐ Scale to zero for dev/test

3. Spot instances
   ☐ Use for stateless workloads
   ☐ Batch processing on spot
   ☐ Mix on-demand and spot

4. Resource governance
   ☐ Quotas per namespace/team
   ☐ LimitRanges for defaults
   ☐ Chargeback by namespace

5. Cleanup
   ☐ Delete unused resources
   ☐ TTL on completed jobs
   ☐ Review orphaned PVs/PVCs

Summary

Kubernetes cost optimization starts with right-sizing: use VPA recommendations and actual metrics to set appropriate requests and limits. Implement HPA for horizontal scaling and cluster autoscaler for node efficiency. Leverage spot instances for fault-tolerant workloads. Set resource quotas and limit ranges to prevent over-provisioning. Use tools like Kubecost for visibility and cleanup unused resources regularly. Continuous monitoring and adjustment are key to maintaining cost efficiency.


📘 Go Further with Kubernetes Recipes

Love this recipe? There's so much more! This is just one of 100+ hands-on recipes in our comprehensive Kubernetes Recipes book.

Inside the book, you'll master:

  • ✅ Production-ready deployment strategies
  • ✅ Advanced networking and security patterns
  • ✅ Observability, monitoring, and troubleshooting
  • ✅ Real-world best practices from industry experts

"The practical, recipe-based approach made complex Kubernetes concepts finally click for me."

👉 Get Your Copy Now: start building production-grade Kubernetes skills today!

#cost #optimization #resources #finops #efficiency
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.
