Autoscaling · Advanced · ⏱ 15 minutes · K8s 1.28+

Karpenter Node Autoscaling for Kubernetes

Replace Cluster Autoscaler with Karpenter for faster, smarter node provisioning. Right-sized instances, spot fallback, consolidation, and GPU-aware scaling.

By Luca Berton β€’ πŸ“– 5 min read


The Problem

Cluster Autoscaler scales predefined node groups; Karpenter provisions individual nodes. For each batch of pending pods it picks the optimal instance type, size, and purchase option (on-demand vs. spot), so nodes launch in seconds rather than minutes.
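For example, an unschedulable workload like the following is all Karpenter needs to act on; there is no node group to pre-create (names and image are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                     # illustrative name
spec:
  replicas: 10
  selector:
    matchLabels: {app: api}
  template:
    metadata:
      labels: {app: api}
    spec:
      containers:
        - name: api
          image: nginx:1.27     # placeholder image
          resources:
            requests:           # these requests alone drive node provisioning
              cpu: "2"
              memory: 4Gi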

The Solution

Step 1: Install Karpenter

# AWS EKS. Assumes the Karpenter controller IAM role, node instance
# profile, and interruption SQS queue already exist (created via the
# official "Getting Started" eksctl/CloudFormation steps).
export KARPENTER_VERSION="0.37.0"
export CLUSTER_NAME="my-cluster"

helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "$KARPENTER_VERSION" \
  --namespace kube-system \
  --set "settings.clusterName=$CLUSTER_NAME" \
  --set "settings.interruptionQueue=$CLUSTER_NAME" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi
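Before creating NodePools, confirm the controller is healthy. These are standard kubectl commands against a live cluster; the label selector matches the Helm chart's defaults:

```shell
# Controller pods should be Running
kubectl get pods -n kube-system -l app.kubernetes.io/name=karpenter

# Tail the controller logs to watch provisioning decisions
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -f
```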

Step 2: Create NodePool

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]       # Consider ARM for cost savings
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]    # Prefer spot, fallback to on-demand
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]          # Compute, general, memory optimized
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["5"]                     # Only gen 5+ instances
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 1000                               # Max 1000 vCPUs total
    memory: 4000Gi
  disruption:
    consolidationPolicy: WhenUnderutilized  # Auto-consolidate idle nodes
    expireAfter: 720h                       # Replace nodes every 30 days
---
# GPU NodePool β€” separate pool for GPU workloads
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: gpu
spec:
  template:
    spec:
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["p4d", "p5", "g5", "g6"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]            # GPUs: on-demand only
      taints:
        - key: nvidia.com/gpu
          value: "true"
          effect: NoSchedule
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: gpu-nodes
  limits:
    nvidia.com/gpu: 32                      # Max 32 GPUs total
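A workload targeting the GPU pool must tolerate its taint and request the GPU resource. A minimal sketch (pod name and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: cuda-job                    # illustrative name
spec:
  tolerations:
    - key: nvidia.com/gpu           # matches the taint on the gpu NodePool
      operator: Exists
      effect: NoSchedule
  containers:
    - name: train
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04  # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1         # triggers provisioning from the gpu pool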

Step 3: EC2NodeClass

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2023
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        encrypted: true

Karpenter vs Cluster Autoscaler

| Feature            | Cluster Autoscaler | Karpenter              |
|--------------------|--------------------|------------------------|
| Scaling unit       | Node group         | Individual node        |
| Instance selection | Fixed per group    | Dynamic per pod        |
| Provisioning speed | 2-5 minutes        | 30-60 seconds          |
| Spot handling      | Per node group     | Per node with fallback |
| Consolidation      | Limited            | Built-in               |
| GPU awareness      | Basic              | Advanced               |
| Multi-arch         | Separate groups    | Automatic              |
How a pending pod becomes a node:

graph TD
    A[Pending Pod: 4 CPU, 8Gi, GPU] --> B[Karpenter Controller]
    B --> C{Find optimal instance}
    C --> D[Check spot availability]
    D -->|Spot available| E[Launch p5.xlarge spot]
    D -->|No spot| F[Launch p5.xlarge on-demand]
    E --> G[Node ready in 30s]
    F --> G
    G --> H[Pod scheduled]
    I[Consolidation loop] -->|Node <50% utilized| J[Cordon + drain + terminate]
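Consolidation can evict pods mid-run. For jobs that must not be interrupted, Karpenter honors a per-pod opt-out annotation (pod name and image below are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: batch-job                         # illustrative name
  annotations:
    karpenter.sh/do-not-disrupt: "true"   # Karpenter will not voluntarily evict this pod
spec:
  containers:
    - name: job
      image: busybox:1.36                 # placeholder image
      command: ["sh", "-c", "sleep 3600"]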

Best Practices

  • Start small and iterate β€” don’t over-engineer on day one
  • Monitor and measure β€” you can’t improve what you don’t measure
  • Automate repetitive tasks β€” reduce human error and toil
  • Document your decisions β€” future you will thank present you

Key Takeaways

  • This is essential knowledge for production Kubernetes operations
  • Start with the simplest approach that solves your problem
  • Monitor the impact of every change you make
  • Share knowledge across your team with internal runbooks
#karpenter #autoscaling #nodes #spot #gpu #cost-optimization #kubernetes
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.


Want More Kubernetes Recipes?

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
