How to Optimize Kubernetes Costs
Kubernetes cost optimization involves right-sizing workloads, leveraging spot instances, implementing autoscaling, and monitoring resource utilization to reduce cloud spending.
Analyze Current Usage
# Check resource requests vs actual usage
kubectl top pods -A
# Compare requests to usage
kubectl get pods -A -o custom-columns=\
'NAMESPACE:.metadata.namespace,NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'
# Find pods without resource limits
kubectl get pods -A -o json | jq -r '.items[] | select(.spec.containers[].resources.limits == null) | "\(.metadata.namespace)/\(.metadata.name)"'
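kubectl top relies on the metrics-server add-on; if the commands above return errors, first confirm it is running (assuming the usual kube-system installation):
# Check that metrics-server is deployed and ready
kubectl get deployment metrics-server -n kube-system
# Node-level view of used capacity
kubectl top nodes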
Right-Size Resources
# Before: Over-provisioned
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: "2000m"    # Requesting 2 cores
        memory: "4Gi"   # Requesting 4GB
      limits:
        cpu: "4000m"
        memory: "8Gi"
# After: Right-sized based on actual usage
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: "250m"     # Actual avg usage + buffer
        memory: "512Mi" # Actual avg usage + buffer
      limits:
        cpu: "500m"     # 2x request for bursting
        memory: "1Gi"
Vertical Pod Autoscaler (VPA)
# vpa.yaml - Automatically right-size pods
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"  # Or "Off" for recommendations only
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
# View VPA recommendations
kubectl describe vpa myapp-vpa
# Recommendation output shows:
# - Lower Bound: Minimum resources needed
# - Target: Optimal recommendation
# - Upper Bound: Maximum expected usage
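The VPA controller is not part of a stock cluster. One common install path is the upstream script (a sketch assuming you clone the kubernetes/autoscaler repository):
# Install the VPA components (recommender, updater, admission controller)
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh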
Horizontal Pod Autoscaler (HPA)
# hpa.yaml - Scale replicas based on usage
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait before scaling down
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
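CPU is not always the constraint; autoscaling/v2 also accepts memory (and custom/external) metrics. A minimal sketch extending the same HPA's metrics list, where the 80% memory threshold is an assumption to tune per workload:
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80  # assumed threshold; adjust per workload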
Use Spot/Preemptible Instances
# spot-node-pool.yaml
# Configure node pool for spot instances (cloud-specific)
# AWS EKS with Karpenter
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: spot-provisioner
spec:
  requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot"]
  - key: node.kubernetes.io/instance-type
    operator: In
    values: ["m5.large", "m5.xlarge", "m5a.large"]
  limits:
    resources:
      cpu: 100
  ttlSecondsAfterEmpty: 30
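Once the provisioner is active, you can confirm which nodes actually landed on spot capacity via the label Karpenter applies:
# Show nodes with their capacity type (spot vs on-demand)
kubectl get nodes -L karpenter.sh/capacity-type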
# Tolerate spot node taints (example: Azure spot node pool)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  selector:
    matchLabels:
      app: batch-processor
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      tolerations:
      - key: "kubernetes.azure.com/scalesetpriority"
        operator: "Equal"
        value: "spot"
        effect: "NoSchedule"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.azure.com/scalesetpriority
                operator: In
                values:
                - spot
      containers:
      - name: processor
        image: batch:v1
Cluster Autoscaler
# cluster-autoscaler deployment
# Scales nodes based on pending pods
# Key settings:
#   --scale-down-enabled=true
#   --scale-down-delay-after-add=10m
#   --scale-down-unneeded-time=10m
#   --scale-down-utilization-threshold=0.5
# Pods that prevent scale-down:
# - Pods with local storage
# - Pods with a PodDisruptionBudget preventing eviction (see the example below)
# - kube-system pods without a PDB
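A PodDisruptionBudget protects availability during drains, but an overly strict one (e.g. maxUnavailable: 0) blocks scale-down entirely. A minimal sketch that still allows the autoscaler to evict pods; the app=myapp label is an assumption:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 1        # keep one replica up, but permit evictions
  selector:
    matchLabels:
      app: myapp         # assumed label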
Resource Quotas per Namespace
# quota.yaml - Prevent over-provisioning
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "50"
Limit Ranges
# limit-range.yaml - Default and max limits
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-a
spec:
  limits:
  - type: Container
    default:
      cpu: "200m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    max:
      cpu: "2"
      memory: "4Gi"
Schedule Non-Critical Workloads Off-Peak
# cronjob-off-peak.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: batch-job
spec:
  schedule: "0 2 * * *"  # Run at 2 AM (off-peak)
  jobTemplate:
    spec:
      template:
        spec:
          tolerations:
          - key: "spot"
            operator: "Exists"
          containers:
          - name: batch
            image: batch:v1
          restartPolicy: OnFailure  # Required for Job pods
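Completed Jobs and their pods otherwise accumulate; a TTL under jobTemplate.spec deletes them automatically. The one-hour retention below is an assumption:
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 3600  # assumed retention: delete the Job 1h after it finishes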
Cost Monitoring with Kubecost
# Install Kubecost
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace
# Access dashboard
kubectl port-forward -n kubecost deployment/kubecost-cost-analyzer 9090
# Key metrics:
# - Cost by namespace
# - Cost by deployment
# - Idle resources
# - Right-sizing recommendations
Prometheus Cost Queries
# Cost-related metrics
# CPU cost estimation (simplified)
sum(rate(container_cpu_usage_seconds_total{namespace!=""}[5m])) by (namespace)
# Memory cost estimation
sum(container_memory_working_set_bytes{namespace!=""}) by (namespace)
# Resource efficiency (usage vs request)
sum(rate(container_cpu_usage_seconds_total[5m])) / sum(kube_pod_container_resource_requests{resource="cpu"})
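The efficiency ratio is more actionable per namespace (values well below 1 mean over-requested capacity); this assumes kube-state-metrics is being scraped:
# CPU efficiency per namespace (usage / requests)
sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))
  / sum by (namespace) (kube_pod_container_resource_requests{resource="cpu"})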
Pod Priority for Cost Control
# priority-class.yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
preemptionPolicy: PreemptLowerPriority
description: "Low priority for batch jobs - can be preempted"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
preemptionPolicy: PreemptLowerPriority
description: "High priority for production services"
Cleanup Unused Resources
# Find unused PVCs
kubectl get pvc -A -o json | jq -r '.items[] | select(.status.phase == "Bound") | "\(.metadata.namespace)/\(.metadata.name)"' | while read pvc; do
  ns=$(echo $pvc | cut -d'/' -f1)
  name=$(echo $pvc | cut -d'/' -f2)
  if ! kubectl get pods -n $ns -o json | grep -q "\"claimName\": \"$name\""; then
    echo "Unused PVC: $pvc"
  fi
done
# Find orphaned PVs
kubectl get pv -o json | jq -r '.items[] | select(.status.phase == "Released") | .metadata.name'
# Clean up completed jobs
kubectl delete jobs --field-selector status.successful=1 -A
Cost Optimization Checklist
1. Right-size workloads
   □ Set resource requests based on actual usage
   □ Use VPA for recommendations
   □ Review and adjust monthly
2. Autoscaling
   □ HPA for variable workloads
   □ Cluster autoscaler enabled
   □ Scale to zero for dev/test
3. Spot instances
   □ Use for stateless workloads
   □ Batch processing on spot
   □ Mix on-demand and spot
4. Resource governance
   □ Quotas per namespace/team
   □ LimitRanges for defaults
   □ Chargeback by namespace
5. Cleanup
   □ Delete unused resources
   □ TTL on completed jobs
   □ Review orphaned PVs/PVCs
Summary
Kubernetes cost optimization starts with right-sizing: use VPA recommendations and actual metrics to set appropriate requests and limits. Implement HPA for horizontal scaling and the cluster autoscaler for node efficiency. Leverage spot instances for fault-tolerant workloads. Set resource quotas and limit ranges to prevent over-provisioning. Use tools like Kubecost for visibility, and clean up unused resources regularly. Continuous monitoring and adjustment are key to maintaining cost efficiency.
📘 Go Further with Kubernetes Recipes
Love this recipe? There’s so much more! This is just one of 100+ hands-on recipes in our comprehensive Kubernetes Recipes book.
Inside the book, you’ll master:
- ✅ Production-ready deployment strategies
- ✅ Advanced networking and security patterns
- ✅ Observability, monitoring, and troubleshooting
- ✅ Real-world best practices from industry experts
“The practical, recipe-based approach made complex Kubernetes concepts finally click for me.”
👉 Get Your Copy Now — Start building production-grade Kubernetes skills today!