πŸ“šBook Signing at KubeCon EU 2026Meet us at Booking.com HQ (Mon 18:30-21:00) & vCluster booth #521 (Tue 24 Mar, 12:30-1:30pm) β€” free book giveaway!RSVP Booking.com Event
Deployments intermediate ⏱ 25 minutes K8s 1.28+

Cluster API on AWS: Provision EKS Clusters

Use Cluster API (CAPI) to provision and manage EKS clusters declaratively. Install clusterctl, configure CAPA provider, and automate cluster lifecycle on AWS.

By Luca Berton β€’ β€’ πŸ“– 5 min read

πŸ’‘ Quick Answer: Cluster API (CAPI) lets you manage Kubernetes clusters as Kubernetes resources. Install clusterctl on a management cluster, configure the AWS provider (CAPA), then kubectl apply a Cluster + AWSManagedControlPlane manifest to provision EKS clusters. Scale node groups by editing MachineDeployment replicas. Delete clusters with kubectl delete cluster.

The Problem

Managing multiple EKS clusters through the AWS console or Terraform becomes unwieldy at scale. Each cluster needs its own Terraform state, IAM roles, and upgrade pipeline. Cluster API treats clusters as Kubernetes-native resources β€” you manage them with kubectl, GitOps (ArgoCD/Flux), and standard K8s tooling.

flowchart TB
    MGMT["Management Cluster<br/>(runs CAPI controllers)"] -->|"reconciles"| C1["Workload Cluster 1<br/>(EKS us-east-1)"]
    MGMT -->|"reconciles"| C2["Workload Cluster 2<br/>(EKS eu-west-1)"]
    MGMT -->|"reconciles"| C3["Workload Cluster 3<br/>(EKS ap-south-1)"]
    
    subgraph CAPI["Cluster API Resources"]
        CLUSTER["Cluster"]
        CP["AWSManagedControlPlane"]
        MD["MachineDeployment"]
        MP["AWSManagedMachinePool"]
    end

The Solution

Install Cluster API

# Install clusterctl CLI
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/latest/download/clusterctl-linux-amd64 -o clusterctl
chmod +x clusterctl && sudo mv clusterctl /usr/local/bin/

# Initialize CAPI with AWS provider on management cluster
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=<your-key>
export AWS_SECRET_ACCESS_KEY=<your-secret>

# Create IAM resources for CAPA
clusterawsadm bootstrap iam create-cloudformation-stack

# Initialize providers
clusterctl init --infrastructure aws

# Verify controllers are running
kubectl get pods -n capa-system
kubectl get pods -n capi-system

Provision an EKS Cluster

# eks-cluster.yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: production-east
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    services:
      cidrBlocks: ["10.128.0.0/12"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta2
    kind: AWSManagedControlPlane
    name: production-east-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: AWSManagedCluster
    name: production-east
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta2
kind: AWSManagedControlPlane
metadata:
  name: production-east-control-plane
spec:
  region: us-east-1
  version: v1.31.0
  eksClusterName: production-east
  endpointAccess:
    public: true
    private: true
  logging:
    apiServer: true
    audit: true
    authenticator: true
    controllerManager: true
    scheduler: true
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSManagedCluster
metadata:
  name: production-east
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: production-east-pool-0
spec:
  clusterName: production-east
  replicas: 3
  template:
    spec:
      clusterName: production-east
      bootstrap:
        dataSecretName: ""
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSManagedMachinePool
        name: production-east-pool-0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSManagedMachinePool
metadata:
  name: production-east-pool-0
spec:
  scaling:
    minSize: 3
    maxSize: 10
  instanceType: m5.xlarge
  diskSize: 100
# Apply and watch provisioning
kubectl apply -f eks-cluster.yaml
kubectl get cluster -w

# Get kubeconfig for workload cluster
clusterctl get kubeconfig production-east > production-east.kubeconfig
kubectl --kubeconfig=production-east.kubeconfig get nodes

Cluster Lifecycle Operations

# Scale node pool
kubectl patch machinepool production-east-pool-0 \
  --type=merge -p '{"spec":{"replicas":5}}'

# Upgrade Kubernetes version
kubectl patch awsmanagedcontrolplane production-east-control-plane \
  --type=merge -p '{"spec":{"version":"v1.32.0"}}'

# Add GPU node pool
cat << EOF | kubectl apply -f -
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSManagedMachinePool
metadata:
  name: production-east-gpu
spec:
  scaling:
    minSize: 0
    maxSize: 4
  instanceType: g5.2xlarge
  diskSize: 200
  labels:
    nvidia.com/gpu.present: "true"
  taints:
    - key: nvidia.com/gpu
      value: "true"
      effect: NoSchedule
EOF

# Delete a workload cluster
kubectl delete cluster production-east
# CAPI deletes all AWS resources (EKS, node groups, VPC, etc.)

GitOps with Cluster API

# ArgoCD Application for fleet management
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: fleet-clusters
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/fleet-config
    path: clusters/
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      prune: false                   # Never auto-delete clusters
      selfHeal: true

Common Issues

IssueCauseFix
clusterctl init failsMissing AWS credentialsRun clusterawsadm bootstrap iam first
Cluster stuck in ProvisioningIAM role trust policy wrongCheck CAPA controller logs
Node pool not scalingScaling limits too lowIncrease maxSize in MachinePool
Upgrade failsAdd-ons incompatibleVerify CoreDNS/kube-proxy versions
Delete hangsFinalizers blockingCheck for stuck LoadBalancers in workload cluster

Best Practices

  • Use a dedicated management cluster β€” don’t run CAPI on a workload cluster
  • Pin Kubernetes versions β€” explicit versions prevent surprise upgrades
  • GitOps for cluster definitions β€” store manifests in Git, deploy with ArgoCD
  • Set prune: false in ArgoCD β€” prevent accidental cluster deletion
  • Label node pools by purpose β€” GPU, spot, on-demand, system
  • Enable all EKS logging β€” audit logs are essential for compliance

Key Takeaways

  • Cluster API manages clusters as Kubernetes resources with kubectl
  • CAPA provider provisions EKS clusters, node groups, and VPC networking
  • Scale, upgrade, and delete clusters with standard Kubernetes operations
  • Combine with GitOps for declarative multi-cluster fleet management
  • Management cluster pattern keeps control plane separate from workloads
#cluster-api #capi #aws #eks #infrastructure-as-code
Luca Berton
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.

Kubernetes Recipes book cover

Want More Kubernetes Recipes?

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.

Luca Berton Ansible Pilot Ansible by Example Open Empower K8s Recipes Terraform Pilot CopyPasteLearn ProteinLens