Multi-Cluster Fleet Management on Kubernetes
Manage multiple Kubernetes clusters with kubectl contexts, federation, GitOps fleet patterns, and tools like Rancher, ArgoCD, and Cluster API.
💡 Quick Answer: Manage multiple K8s clusters with: 1) kubectl contexts for manual switching, 2) ArgoCD ApplicationSets for GitOps fleet deployment, 3) Cluster API for lifecycle management, or 4) Rancher/OpenShift ACM for full platform management.
The Problem
Organizations run multiple clusters for:
- Environment separation (dev/staging/prod)
- Geographic distribution (multi-region)
- Workload isolation (PCI, GPU, general)
- Disaster recovery (active-active or active-passive)
Managing them individually doesn't scale.
The Solution
kubectl Context Management
# List all configured clusters
kubectl config get-contexts
# CURRENT   NAME      CLUSTER   AUTHINFO   NAMESPACE
# *         prod-us   prod-us   admin      default
#           prod-eu   prod-eu   admin      default
#           staging   staging   dev        default
# Switch context
kubectl config use-context prod-eu
# Run command against specific context
kubectl --context=staging get pods
# Merge kubeconfigs (this overwrites ~/.kube/config, so back it up first)
KUBECONFIG=~/.kube/prod.yaml:~/.kube/staging.yaml kubectl config view --flatten > ~/.kube/config
# Rename context for clarity
kubectl config rename-context kubernetes-admin@cluster prod-us-east
ArgoCD Multi-Cluster GitOps
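The ApplicationSet in this section selects clusters by an `env: production` label and reads `metadata.labels.region` for the overlay path, so every registered cluster needs both labels. A minimal sketch of a labeled registration, assuming an `argocd` CLI version that supports the `--label` flag on `cluster add`. `register_cluster` is a hypothetical wrapper name, and the label values in the example are illustrative.

```shell
# register_cluster CONTEXT ENV REGION
# Hypothetical wrapper: registers a kubeconfig context with ArgoCD and attaches
# the labels that the cluster generator selector and the path template rely on.
register_cluster() {
  context="$1"
  cluster_env="$2"
  region="$3"
  argocd cluster add "$context" --name "$context" \
    --label "env=${cluster_env}" \
    --label "region=${region}"
}

# Example (values are assumptions): register_cluster prod-eu production eu
```

A cluster registered without the `region` label would still match the selector, but its path template would have nothing to substitute, so it is worth labeling at registration time rather than afterwards.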
# Register clusters in ArgoCD
# argocd cluster add prod-eu --name prod-eu
# ApplicationSet for fleet-wide deployment
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: myapp-fleet
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            env: production
  template:
    metadata:
      name: "myapp-{{name}}"
    spec:
      project: default
      source:
        repoURL: https://github.com/org/k8s-manifests.git
        targetRevision: main
        path: "apps/myapp/overlays/{{metadata.labels.region}}"
      destination:
        server: "{{server}}"
        namespace: myapp
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
Cluster API (Lifecycle Management)
# Manage cluster lifecycle declaratively
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: prod-eu-west
  namespace: clusters
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    services:
      cidrBlocks: ["10.128.0.0/12"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: prod-eu-west-cp
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: prod-eu-west
Architecture Patterns
graph TD
A[Management Cluster] --> B[Cluster API]
A --> C[ArgoCD]
A --> D[Monitoring Hub]
B -->|Provisions| E[Prod US]
B -->|Provisions| F[Prod EU]
B -->|Provisions| G[Staging]
C -->|Deploys apps| E
C -->|Deploys apps| F
C -->|Deploys apps| G
D -->|Collects metrics| E
D -->|Collects metrics| F
D -->|Collects metrics| G
Fleet Management Tools Comparison
| Tool | Use Case | Approach |
|---|---|---|
| kubectl contexts | Small teams, <5 clusters | Manual context switching |
| ArgoCD ApplicationSets | GitOps fleet deployment | Pull-based, declarative |
| Flux + Cluster API | Full lifecycle + deploy | GitOps native |
| Rancher | Platform management | UI + API, full lifecycle |
| OpenShift ACM | Red Hat enterprise | Policy-driven governance |
| Loft/vCluster | Virtual clusters | Multi-tenancy on single cluster |
| Crossplane | Infrastructure as code | Compositions, cloud resources |
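For the small-fleet case in the first row of the table, manual context switching can be wrapped in a loop before moving to a GitOps tool. A minimal sketch, assuming every context in the active kubeconfig should receive the same command. `for_each_context` is a hypothetical helper name, not part of kubectl.

```shell
# for_each_context [kubectl args...]
# Hypothetical helper: runs the same kubectl arguments against every context
# in the active kubeconfig, printing a header line before each cluster's output.
for_each_context() {
  # `-o name` prints one context name per line, without the table header.
  for ctx in $(kubectl config get-contexts -o name); do
    echo "== ${ctx} =="
    kubectl --context="${ctx}" "$@"
  done
}

# Example: for_each_context get nodes
```

Passing `--context` on each call leaves the current context untouched, which avoids the side effects of repeated `kubectl config use-context` in a loop.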
Service Mesh Multi-Cluster
# Istio multi-cluster (shared trust domain)
# Link services across clusters
# Note: a complete ServiceEntry also needs spec-level `ports` (with a port named
# `http` to match the endpoint mapping below) and a `resolution` mode.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: remote-service
spec:
  hosts:
    - myservice.namespace.global
  location: MESH_INTERNAL
  endpoints:
    - address: 10.1.2.3 # Remote cluster gateway IP
      ports:
        http: 15443
Centralized Monitoring
# Thanos for multi-cluster Prometheus
# Each cluster runs Prometheus + Thanos Sidecar
# Central Thanos Query aggregates all clusters
# Or use Grafana Cloud / Datadog with cluster labels
# Add cluster label to all metrics (under `global:` in prometheus.yml):
# external_labels:
#   cluster: prod-us-east
#   region: us-east-1
Common Issues
| Issue | Cause | Fix |
|---|---|---|
| Wrong cluster deployed to | Context confusion | Pass --context explicitly, keep one kubeconfig per cluster, or deploy through ArgoCD |
| Certificate expired on remote | kubeconfig token stale | Refresh credentials or use OIDC |
| Network connectivity | Clusters not peered | Set up VPN/peering between VPCs |
| Config drift between clusters | Manual changes | Enforce GitOps: no manual kubectl apply |
| Inconsistent versions | Clusters at different K8s versions | Use Cluster API for version management |
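For the "wrong cluster deployed to" row, one lightweight safeguard is to check the current context before any manual change. A minimal sketch, assuming scripts know which context they are meant to target. `require_context` is a hypothetical guard name.

```shell
# require_context EXPECTED
# Hypothetical guard: succeeds only when the current kubectl context matches
# the expected one, so a script can stop before touching the wrong cluster.
require_context() {
  expected="$1"
  current="$(kubectl config current-context)"
  if [ "$current" != "$expected" ]; then
    echo "refusing to continue: current context is '${current}', expected '${expected}'" >&2
    return 1
  fi
}

# Example: require_context staging && kubectl apply -f app.yaml
```

This only protects commands that go through the guard. It complements a GitOps workflow and does not replace one.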
Best Practices
- Use a management cluster: single pane for fleet operations
- GitOps for all deployments: ArgoCD ApplicationSets across clusters
- Standardize cluster labels: env, region, tier for fleet targeting
- Separate kubeconfigs per cluster: avoid accidental cross-cluster commands
- Centralize observability: one Grafana for all clusters with cluster label
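The separate-kubeconfig practice can be sketched as a thin wrapper that forces every command to name its cluster. This assumes one file per cluster named `<cluster>.yaml` in a directory of your choosing. The `kc` function, the `KUBECONFIG_DIR` variable, and the default path are all hypothetical.

```shell
# kc CLUSTER [kubectl args...]
# Hypothetical wrapper: each cluster has its own kubeconfig file under
# $KUBECONFIG_DIR (assumed layout: <cluster>.yaml), so a command can only reach
# the cluster that was named explicitly on the command line.
KUBECONFIG_DIR="${KUBECONFIG_DIR:-$HOME/.kube/clusters}"

kc() {
  if [ "$#" -lt 1 ]; then
    echo "usage: kc CLUSTER [kubectl args...]" >&2
    return 2
  fi
  cluster="$1"
  shift
  config="${KUBECONFIG_DIR}/${cluster}.yaml"
  if [ ! -f "$config" ]; then
    echo "no kubeconfig for cluster '${cluster}' at ${config}" >&2
    return 1
  fi
  kubectl --kubeconfig="$config" "$@"
}

# Example: kc staging get pods
```

Because there is no merged config and no "current" context, a mistyped cluster name fails with an error instead of silently hitting whichever cluster was selected last.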
Key Takeaways
- Start with kubectl contexts, graduate to ArgoCD/Flux for automation
- Management cluster pattern: one cluster manages the fleet's lifecycle and deployments
- Cluster API provisions clusters declaratively; ArgoCD deploys apps declaratively
- Label clusters consistently for ApplicationSet generators
- Multi-cluster service mesh (Istio/Linkerd) enables cross-cluster service discovery

