Autoscaling | intermediate | 15 minutes | K8s 1.25+

Kubernetes Cluster Autoscaler Configuration

Configure Kubernetes Cluster Autoscaler: scale-down delay, node group settings, priority expander, GPU scaling, and cloud provider integration for EKS and GKE.

By Luca Berton | 5 min read

Quick Answer: Cluster Autoscaler adds nodes when pods are Pending (unschedulable) and removes nodes that have stayed underutilized for scale-down-unneeded-time (default 10m). After a scale-up it waits scale-down-delay-after-add (default 10m) before it considers scale-down again. Configure with --scale-down-utilization-threshold=0.5, --scale-down-unneeded-time=10m, and node group min/max sizes.

The Problem

  • Pods stuck in Pending because no node has enough resources
  • Paying for idle nodes during off-peak hours
  • GPU nodes sitting empty but too expensive to keep
  • Scale-down too aggressive (kills nodes during brief dips)
  • Scale-up too slow (pods wait minutes for new nodes)

The Solution

Core Configuration Flags

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
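spec:
  replicas: 1
  # A Deployment needs a selector that matches the pod template labels
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      # Assumption: a ServiceAccount with the autoscaler RBAC rules and cloud IAM access already exists
      serviceAccountName: cluster-autoscaler
      containers: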
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.30.1
          command:
            - ./cluster-autoscaler
            - --v=4
            - --cloud-provider=aws
            # Scale-up settings
            - --scan-interval=10s
            - --max-node-provision-time=15m
            # Scale-down settings
            - --scale-down-enabled=true
            - --scale-down-delay-after-add=10m
            - --scale-down-delay-after-delete=0s
            - --scale-down-delay-after-failure=3m
            - --scale-down-unneeded-time=10m
            - --scale-down-utilization-threshold=0.5
            # Node group discovery
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
            # Expander (how to choose which node group to scale)
            - --expander=least-waste
            # Safety
            - --skip-nodes-with-local-storage=false
            - --skip-nodes-with-system-pods=true
            - --balance-similar-node-groups=true
            - --max-graceful-termination-sec=600

Key Parameters Explained

| Parameter | Default | Description |
| --- | --- | --- |
| scale-down-delay-after-add | 10m | Wait before scale-down after adding a node |
| scale-down-unneeded-time | 10m | Node must be underutilized this long before removal |
| scale-down-utilization-threshold | 0.5 | Node utilization (requests / allocatable) below this = candidate for removal |
| scan-interval | 10s | How often the cluster is re-evaluated for scale-up and scale-down |
| max-node-provision-time | 15m | Max wait for node to become ready |
| expander | random | Strategy: random, most-pods, least-waste, priority |
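
To make the utilization threshold concrete, here is a minimal sketch of a workload whose requests would leave a node below the default 0.5 threshold. The Deployment name, image, and request sizes are hypothetical, and the arithmetic in the comments assumes an m5.xlarge-class node with roughly 3900m allocatable CPU and about 15Gi allocatable memory, ignoring DaemonSet overhead. Cluster Autoscaler works from pod requests, not from live CPU or memory usage.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                   # hypothetical workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27   # placeholder image
          resources:
            requests:
              cpu: 500m       # 3 replicas x 500m  = 1500m requested
              memory: 512Mi   # 3 replicas x 512Mi = 1536Mi requested
# If all three replicas share one node (assumed ~3900m CPU, ~15Gi memory allocatable):
#   CPU utilization    = 1500m / 3900m, about 0.38
#   memory utilization = 1536Mi / ~15Gi, about 0.10
# The higher of the two (about 0.38) is below 0.5, so the node becomes a
# scale-down candidate once it has stayed that way for scale-down-unneeded-time.
```

Pods without requests contribute nothing to this calculation, which is one reason a busy node can still look "underutilized" to the autoscaler.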

EKS Configuration

# EKS managed node group (auto-discovered)
eksctl create nodegroup \
  --cluster my-cluster \
  --name general \
  --node-type m5.xlarge \
  --nodes-min 2 \
  --nodes-max 20 \
  --asg-access

# GPU node group
eksctl create nodegroup \
  --cluster my-cluster \
  --name gpu-workers \
  --node-type p3.2xlarge \
  --nodes-min 0 \
  --nodes-max 8 \
  --asg-access

# EKS Helm values
autoDiscovery:
  clusterName: my-cluster
  tags:
    - k8s.io/cluster-autoscaler/enabled
    - k8s.io/cluster-autoscaler/my-cluster
awsRegion: us-east-1
extraArgs:
  scale-down-delay-after-add: 10m
  scale-down-utilization-threshold: "0.5"
  skip-nodes-with-local-storage: "false"
  expander: least-waste
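
The two eksctl commands can also be kept in version control as a config file. The following is a sketch assuming the eksctl.io/v1alpha5 ClusterConfig schema; the region is taken from the Helm values above, the file name nodegroups.yaml is hypothetical, and the taint on the GPU group is an addition that the CLI commands do not set.

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: us-east-1
managedNodeGroups:
  - name: general
    instanceType: m5.xlarge
    minSize: 2
    maxSize: 20
    desiredCapacity: 2
    iam:
      withAddonPolicies:
        autoScaler: true      # config-file counterpart of --asg-access
  - name: gpu-workers
    instanceType: p3.2xlarge
    minSize: 0                # allows scale-to-zero
    maxSize: 8
    desiredCapacity: 0
    iam:
      withAddonPolicies:
        autoScaler: true
    taints:                   # assumption: not part of the original CLI commands
      - key: nvidia.com/gpu
        value: "true"
        effect: NoSchedule
```

With such a file, the node groups would be created with eksctl create nodegroup --config-file=nodegroups.yaml.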

GKE Configuration

# GKE has built-in autoscaling (no separate deployment needed)
gcloud container clusters update my-cluster \
  --enable-autoscaling \
  --min-nodes=2 \
  --max-nodes=20 \
  --node-pool=default-pool

# Per-node-pool settings
gcloud container node-pools update gpu-pool \
  --cluster=my-cluster \
  --enable-autoscaling \
  --min-nodes=0 \
  --max-nodes=8 \
  --location-policy=ANY

# Autoscaling profile
gcloud container clusters update my-cluster \
  --autoscaling-profile=optimize-utilization  # More aggressive scale-down

Priority Expander (GPU workloads)

# Choose which node group scales up first
# Requires --expander=priority; the highest number wins
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    50:
      - .*gpu.*           # GPU nodes are expensive: lowest priority
    100:
      - .*general.*       # General purpose (on-demand): fallback when spot is unavailable
    200:
      - .*spot.*          # Highest priority: try spot instances first
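
The ConfigMap is only read when the priority expander is selected, while the Deployment earlier in this recipe runs with least-waste. A minimal sketch of the flag change, shown as a fragment of the container spec rather than a full manifest. Chaining least-waste as a tie-breaker relies on multi-expander support, which is documented for Cluster Autoscaler 1.23 and later; treat that part as an assumption to verify against your version.

```yaml
# Fragment of the cluster-autoscaler container spec (not a complete manifest)
command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  # priority reads the cluster-autoscaler-priority-expander ConfigMap in kube-system;
  # least-waste breaks ties between node groups that share the same priority
  - --expander=priority,least-waste
```

Node group names that match none of the regular expressions in the ConfigMap get no priority entry, so a catch-all pattern such as .* at a low priority is a common safeguard.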

Scale-to-Zero for GPU Nodes

# Node group with min=0 for expensive GPU instances
# Cluster Autoscaler removes the last GPU node once no GPU pods are running or pending
# Requires: GPU node group with proper taints

# Taint GPU nodes so only GPU workloads land there.
# On AWS, mirror the taint as an ASG tag so the autoscaler can model a node group that currently has 0 nodes:
# k8s.io/cluster-autoscaler/node-template/taint/nvidia.com/gpu=:NoSchedule
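
For the GPU group to scale up from zero, a pending pod has to both tolerate the taint and request a GPU. A minimal sketch of such a pod follows; the pod name and image are placeholders, and it assumes the NVIDIA device plugin exposes the nvidia.com/gpu resource on the GPU nodes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test          # hypothetical name
spec:
  restartPolicy: Never
  tolerations:
    - key: nvidia.com/gpu       # matches the taint on the GPU node group
      operator: Exists
      effect: NoSchedule
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # placeholder image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1     # only GPU nodes can satisfy this, so the GPU group scales up
```

Once the pod finishes and the node has stayed unneeded for scale-down-unneeded-time, the group can shrink back to zero.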

Annotations for Scale-Down Protection

# Prevent a specific node from being scaled down
kubectl annotate node worker-5 "cluster-autoscaler.kubernetes.io/scale-down-disabled=true"

# Safe-to-evict annotation on pods
# Prevents node scale-down if this pod is running
kubectl annotate pod important-job "cluster-autoscaler.kubernetes.io/safe-to-evict=false"
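
An annotation added with kubectl annotate applies only to the existing pod and is lost when a controller recreates it. A sketch of the declarative alternative, assuming important-job is run as a Job (the image and command are placeholders): the annotation goes into the pod template so every pod the Job creates carries it.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: important-job
spec:
  template:
    metadata:
      annotations:
        # Blocks scale-down of whichever node this pod runs on
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox:1.36                   # placeholder image
          command: ["sh", "-c", "sleep 3600"]   # placeholder workload
```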

Architecture

graph TD
    A[Pending Pods] -->|triggers| B[Cluster Autoscaler]
    B -->|evaluates| C{Which node group?}
    C -->|expander strategy| D[Scale Up Node Group]
    D -->|API call| E[Cloud Provider]
    E -->|provisions| F[New Node]
    F -->|joins cluster| G[Pods Scheduled]
    
    H[Underutilized Node] -->|threshold check| B
    B -->|scale-down-unneeded-time| I{Safe to remove?}
    I -->|Yes| J[Drain + Terminate]
    I -->|No - has local storage/system pods| K[Skip]

Common Issues

| Issue | Cause | Fix |
| --- | --- | --- |
| Pods pending but no scale-up | Node group at max | Increase --nodes-max |
| Scale-down not happening | scale-down-delay-after-add too long | Reduce to 5m for dev clusters |
| Wrong node type scales up | Random expander | Use least-waste or priority expander |
| Node takes >15min to join | Slow AMI/image pull | Increase max-node-provision-time |
| GPU nodes never scale to 0 | Pods that block eviction keep the node alive (for example kube-system pods without a PodDisruptionBudget); DaemonSet pods alone do not | Add a PodDisruptionBudget for the system pods, or annotate safely evictable pods with safe-to-evict=true |
| Flapping (scale up → down → up) | threshold too aggressive | Increase scale-down-unneeded-time to 15m |
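
Several of these fixes are plain flag changes. A sketch of how they might look as Helm extraArgs, reusing the values layout from the EKS section. The 5m and 15m durations come from the table; the 20m provisioning timeout is a hypothetical value, and in practice only the lines that match the observed problem would be applied.

```yaml
extraArgs:
  expander: least-waste             # avoids the random default picking an unsuitable node type
  scale-down-delay-after-add: 5m    # dev clusters: allow scale-down sooner after a scale-up
  scale-down-unneeded-time: 15m     # damps flapping: nodes must stay idle longer before removal
  max-node-provision-time: 20m      # hypothetical value for slow AMIs or image pulls
```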

Best Practices

  1. Set scale-down-delay-after-add to 10m: prevents immediate scale-down after scale-up
  2. Use least-waste expander: picks the node group that wastes least resources
  3. Scale GPU nodes to zero: taint + min=0 saves significant cost
  4. Set balance-similar-node-groups=true: spreads across AZs
  5. Use PodDisruptionBudgets: control how many pods can be evicted during scale-down
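
As an illustration of the last point, here is a minimal PodDisruptionBudget sketch. The name, label selector, and minAvailable value are hypothetical; the autoscaler respects the budget when it drains a node, so it will not evict pods if doing so would drop the workload below the stated minimum.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb          # hypothetical name
spec:
  minAvailable: 2        # keep at least 2 matching pods running during node drains
  selector:
    matchLabels:
      app: web           # hypothetical label; must match the workload's pod labels
```

A budget that can never be satisfied (for example minAvailable equal to the replica count) blocks scale-down of every node that hosts those pods.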

Key Takeaways

  • Scale-up triggers on Pending pods; scale-down triggers on underutilization (<50% by default)
  • scale-down-delay-after-add (10m default) prevents thrashing after node addition
  • Use priority expander for cost-optimized node group selection (spot → on-demand → GPU)
  • GKE has native autoscaling, and AKS offers a managed cluster autoscaler; EKS needs the Cluster Autoscaler deployment
  • Scale-to-zero GPU nodes with taints + min: 0 to avoid idle GPU costs