Enterprise GitOps at Scale with Fleet Management
Manage hundreds of Kubernetes clusters with ArgoCD ApplicationSets, Flux multi-cluster, and fleet-wide policy enforcement using GitOps principles.
💡 Quick Answer: Use ArgoCD ApplicationSets with generators (Git, cluster, matrix) to declaratively manage applications across hundreds of clusters from a single Git repo. Combine with Kustomize overlays for environment-specific configuration and progressive rollouts.
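As a rough illustration of the Kustomize overlay idea mentioned above, a production overlay might look like the sketch below. The file location, the `prometheus` Deployment name, and the replica count are assumptions made for illustration; they are not taken from a real repository.

```yaml
# overlays/production/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Reuse the shared base instead of copying manifests per environment
  - ../../base
patches:
  # Assumed example: raise the replica count of a Deployment named
  # "prometheus" that is presumed to exist in the base
  - target:
      kind: Deployment
      name: prometheus
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 3
```

The overlay holds only the differences from the base, so a change to the base reaches every environment on the next sync.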
The Problem
Enterprise organizations manage tens to hundreds of Kubernetes clusters across regions, cloud providers, and environments. Manually deploying and updating applications on each cluster is error-prone and doesn't scale. You need a fleet management approach where Git is the single source of truth, changes propagate automatically, and drift is self-healed.
flowchart TB
    GIT["Git Repository<br/>(Single Source of Truth)"] -->|Sync| ARGO["ArgoCD Hub<br/>(Management Cluster)"]
    ARGO -->|ApplicationSet| DEV1["Dev Cluster (US)"]
    ARGO -->|ApplicationSet| STG1["Staging Cluster (US)"]
    ARGO -->|ApplicationSet| PROD1["Prod Cluster (US-East)"]
    ARGO -->|ApplicationSet| PROD2["Prod Cluster (US-West)"]
    ARGO -->|ApplicationSet| PROD3["Prod Cluster (EU)"]
    ARGO -->|ApplicationSet| EDGE1["Edge Cluster 1"]
    ARGO -->|ApplicationSet| EDGE2["Edge Cluster N"]

The Solution
Cluster Generator: Deploy to Every Production Cluster
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-monitoring
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            environment: production
  template:
    metadata:
      name: "monitoring-{{name}}"
    spec:
      project: platform
      source:
        repoURL: https://git.example.com/platform/monitoring.git
        targetRevision: main
        path: "overlays/{{metadata.labels.environment}}"
      destination:
        server: "{{server}}"
        namespace: monitoring
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
          - ServerSideApply=true

Matrix Generator: Environments × Applications
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: microservices-fleet
  namespace: argocd
spec:
  generators:
    - matrix:
        generators:
          # All production clusters
          - clusters:
              selector:
                matchLabels:
                  environment: production
          # All microservices in Git
          - git:
              repoURL: https://git.example.com/apps/microservices.git
              revision: main
              directories:
                - path: "services/*"
  template:
    metadata:
      name: "{{path.basename}}-{{name}}"
      labels:
        app: "{{path.basename}}"
        cluster: "{{name}}"
    spec:
      project: microservices
      source:
        repoURL: https://git.example.com/apps/microservices.git
        targetRevision: main
        path: "{{path}}/overlays/production"
      destination:
        server: "{{server}}"
        namespace: "{{path.basename}}"
      syncPolicy:
        automated:
          prune: true
          selfHeal: true

Progressive Rollout Across Clusters
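# RollingSync (progressive syncs) is an alpha feature; it must be enabled on
# the ApplicationSet controller before this strategy takes effect.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: progressive-rollout
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            environment: production
  strategy:
    type: RollingSync
    rollingSync:
      steps:
        # Step 1: Canary cluster (1 cluster)
        - matchExpressions:
            - key: rollout-tier
              operator: In
              values: ["canary"]
          maxUpdate: 1
        # Step 2: Early-adopter clusters; starts once the step 1 Applications
        # are Healthy. Set maxUpdate: 0 on a step to require a manual sync.
        - matchExpressions:
            - key: rollout-tier
              operator: In
              values: ["early"]
          maxUpdate: 2
        # Step 3: Remaining clusters
        - matchExpressions:
            - key: rollout-tier
              operator: In
              values: ["general"]
          maxUpdate: "25%"
  template:
    metadata:
      name: "app-{{name}}"
      labels:
        # Steps match labels on the generated Application, not on the cluster,
        # so copy the cluster's rollout-tier label onto each Application.
        rollout-tier: "{{metadata.labels.rollout-tier}}"
    spec:
      # Assumption: the original omits the required project field;
      # replace "default" with your own AppProject.
      project: default
      source:
        repoURL: https://git.example.com/apps/frontend.git
        targetRevision: main
        path: overlays/production
      destination:
        server: "{{server}}"
        namespace: frontend

Git Repository Structure for Fleet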
├── platform/                    # Platform-wide components
│   ├── monitoring/
│   │   ├── base/
│   │   │   ├── kustomization.yaml
│   │   │   ├── prometheus.yaml
│   │   │   └── grafana.yaml
│   │   └── overlays/
│   │       ├── production/
│   │       │   └── kustomization.yaml
│   │       └── staging/
│   │           └── kustomization.yaml
│   ├── logging/
│   ├── ingress/
│   └── security-policies/
├── services/                    # Application microservices
│   ├── api-gateway/
│   │   ├── base/
│   │   └── overlays/
│   ├── auth-service/
│   └── payment-service/
├── clusters/                    # Cluster-specific configs
│   ├── us-east-prod/
│   │   └── kustomization.yaml
│   ├── us-west-prod/
│   └── eu-prod/
└── policies/                    # Fleet-wide policies
    ├── network-policies/
    ├── resource-quotas/
    └── pod-security/

Fleet-Wide Policy Enforcement
# Deploy Kyverno policies to all clusters
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: security-policies
  namespace: argocd
spec:
  generators:
    - clusters: {}  # ALL clusters
  template:
    metadata:
      name: "policies-{{name}}"
    spec:
      project: platform
      source:
        repoURL: https://git.example.com/platform/policies.git
        targetRevision: main
        path: policies
      destination:
        server: "{{server}}"
        namespace: kyverno
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - ServerSideApply=true

Common Issues
| Issue | Cause | Fix |
|---|---|---|
| ApplicationSet generates too many apps | Broad cluster selector | Add label selectors to target specific cluster groups |
| Drift detected but self-heal fails | Resource conflicts with operators | Add ServerSideApply=true to sync options |
| Secret management across clusters | Secrets shouldn't be in Git | Use External Secrets Operator or Sealed Secrets per cluster |
| Slow sync across 100+ clusters | ArgoCD controller overloaded | Shard ArgoCD controllers, increase replicas |
| Rollout stuck on canary step | Canary cluster health check failing | Check Application health status, fix before proceeding |
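For the slow-sync row above, one way to shard the application controller is to raise the StatefulSet replica count and tell the controller how many shards exist. The following is a sketch of a patch, assuming a standard install in the `argocd` namespace; the replica count of 3 is an arbitrary example, not a recommendation.

```yaml
# Hypothetical Kustomize patch for the Argo CD application controller
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
  namespace: argocd
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: argocd-application-controller
          env:
            # Must match spec.replicas so clusters are spread across shards
            - name: ARGOCD_CONTROLLER_REPLICAS
              value: "3"
```

Each replica then reconciles only the clusters assigned to its shard, which spreads the load as the fleet grows.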
Best Practices
- One Git repo for platform, one per app team: platform team owns infra, app teams own their services
- Kustomize overlays for environments: base config + per-environment patches, never duplicate
- Progressive rollouts: canary → early adopters → general availability for production changes
- Self-heal everything: selfHeal: true ensures Git remains the source of truth
- Shard ArgoCD for scale: beyond 50 clusters, shard the application controller
- Label clusters consistently: environment, region, provider, rollout-tier for precise targeting
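The labeling practice above can be sketched as a declarative Argo CD cluster Secret, since the cluster generator selects on the labels of these Secrets. The cluster name, server URL, label values, and placeholder credentials below are all hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster-us-east-prod        # hypothetical name
  namespace: argocd
  labels:
    # Marks this Secret as an Argo CD cluster definition
    argocd.argoproj.io/secret-type: cluster
    # Labels that ApplicationSet cluster generators can select on
    environment: production
    region: us-east
    provider: aws
    rollout-tier: canary
type: Opaque
stringData:
  name: us-east-prod
  server: https://us-east-prod.example.com:6443   # placeholder URL
  config: |
    {
      "bearerToken": "<service-account-token>",
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64-encoded-ca-certificate>"
      }
    }
```

Because this Secret carries real credentials, it would normally be produced by a secret-management tool rather than committed to Git in plain form.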
Key Takeaways
- ApplicationSets with generators (cluster, git, matrix) declaratively manage fleet-wide deployments
- Progressive rollout strategies (canary → staged → full) reduce blast radius for changes
- Kustomize overlays + Git directory structure enable clean separation of concerns
- Fleet-wide policy enforcement via GitOps ensures consistent security posture across all clusters
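To make the policy takeaway concrete, here is a sketch of the kind of Kyverno policy that could live in the policies repository. The policy name and rule are illustrative assumptions, and the placement of the failure-action field varies between Kyverno versions, so check it against the version you run.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits     # hypothetical policy name
spec:
  # Reject non-compliant resources instead of only reporting them
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-container-limits
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "CPU and memory limits are required."
        pattern:
          spec:
            containers:
              # Every container must declare non-empty CPU and memory limits
              - resources:
                  limits:
                    cpu: "?*"
                    memory: "?*"
```

Once committed, an ApplicationSet that targets all clusters would apply this policy to every cluster it generates an Application for.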

