Kubernetes 1.36 Graceful Leader Transition
Configure graceful leader transitions in Kubernetes 1.36 control plane components. Eliminate brief outages during leader election failovers.
π‘ Quick Answer: Kubernetes 1.36 introduces Graceful Leader Transition (KEP-5366). When a control plane component steps down as leader (e.g., during rolling upgrade), it gracefully hands off leadership instead of forcing a timeout-based failover β reducing control plane gaps from 15+ seconds to near-zero.
The Problem
In HA Kubernetes clusters, control plane components (controller-manager, scheduler) use leader election. When the leader Pod restarts:
- Leader stops and releases the lease
- 15-second lease timeout before other replicas detect the vacancy
- New leader acquires the lease and starts processing
- During this gap: no new deployments reconciled, no Pods scheduled, no scaling actions
This 15-30 second gap happens during every rolling upgrade of the control plane.
The Solution
Graceful leader transition lets the outgoing leader notify a candidate directly, enabling instant handoff.
Enable Graceful Leader Transition
# kube-controller-manager configuration
apiVersion: kubecontrollermanager.config.k8s.io/v1alpha1
kind: KubeControllerManagerConfiguration
leaderElection:
leaderElect: true
leaseDuration: 15s
renewDeadline: 10s
retryPeriod: 2s
gracefulTransition: true # NEW in 1.36API Server Flag Configuration
# Controller Manager
kube-controller-manager \
--leader-elect=true \
--leader-elect-graceful-transition=true \
--feature-gates=GracefulLeaderTransition=true
# Scheduler
kube-scheduler \
--leader-elect=true \
--leader-elect-graceful-transition=true \
--feature-gates=GracefulLeaderTransition=truekubeadm Configuration
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
controllerManager:
extraArgs:
- name: leader-elect-graceful-transition
value: "true"
- name: feature-gates
value: "GracefulLeaderTransition=true"
scheduler:
extraArgs:
- name: leader-elect-graceful-transition
value: "true"
- name: feature-gates
value: "GracefulLeaderTransition=true"How It Works
Traditional failover:
Leader A stops β 15s timeout β Leader B detects β B acquires lease β B starts
Gap: ~15-20 seconds
Graceful transition:
Leader A signals "stepping down" β Leader B immediately acquires β B starts
Gap: ~0.5 secondsMonitor Leader Transitions
# Watch lease objects
kubectl get lease -n kube-system -w
# Check controller-manager leader
kubectl get lease kube-controller-manager -n kube-system \
-o jsonpath='{.spec.holderIdentity}'
# Check scheduler leader
kubectl get lease kube-scheduler -n kube-system \
-o jsonpath='{.spec.holderIdentity}'
# View transition events
kubectl get events -n kube-system --field-selector reason=LeaderTransitionVerify During Rolling Upgrade
# Start a watcher before upgrading
kubectl get events -n kube-system -w --field-selector reason=LeaderElection &
# Perform rolling upgrade
kubeadm upgrade apply v1.36.0
# Observe near-instant leader transitions instead of 15s gapsCommon Issues
Transition not working β still seeing 15s gaps
- Cause: Feature gate not enabled on all control plane replicas
- Fix: Enable
GracefulLeaderTransitionon ALL replicas, not just one
Leader election conflicts during mixed-version upgrade
- Cause: Old replicas donβt understand graceful transition signals
- Fix: Upgrade all control plane replicas before relying on graceful transition
Best Practices
- Enable on all replicas β graceful transition requires both outgoing and incoming leaders to support it
- Upgrade control plane components together β mixed versions fall back to timeout-based election
- Monitor lease transitions β verify gap reduction with metrics
- Keep lease timeouts as safety net β graceful transition is best-effort; timeouts are the fallback
- Test during maintenance windows β verify behavior before relying on it in production
Key Takeaways
- Graceful Leader Transition is available in Kubernetes 1.36 (KEP-5366)
- Reduces control plane failover gaps from 15+ seconds to sub-second
- Applies to controller-manager and scheduler leader election
- Critical for rolling upgrades in production HA clusters
- Falls back to timeout-based election if the feature isnβt enabled on both sides

Recommended
Kubernetes Recipes β The Complete Book100+ production-ready patterns with detailed explanations, best practices, and copy-paste YAML. Everything in one place.
Get the Book βLearn by Doing
CopyPasteLearn β Hands-on Cloud & DevOps CoursesMaster Kubernetes, Ansible, Terraform, and MLOps with interactive, copy-paste-run lessons. Start free.
Browse Courses βπ Deepen Your Skills β Hands-on Courses
Courses by CopyPasteLearn.com β Learn IT by Doing
