Deployments • intermediate • 15 minutes • K8s 1.28+

Kubernetes Graceful Shutdown and Pod Termination

Implement graceful shutdown for Kubernetes pods. Configure terminationGracePeriodSeconds, preStop hooks, SIGTERM handling, connection

By Luca Berton • 5 min read

Quick Answer: When Kubernetes terminates a pod, it runs the preStop hook (if any), sends SIGTERM to PID 1 in each container, and sends SIGKILL once terminationGracePeriodSeconds (default 30s, counted from the start of termination) has elapsed. For zero-downtime: 1) Handle SIGTERM in your app (stop accepting, drain connections), 2) Add a preStop hook with a short sleep (5-10s) to allow endpoint removal propagation, 3) Set the grace period longer than preStop plus drain time.

The Problem

  • Pods still receive requests during shutdown, causing 502/504 errors for clients
  • SIGTERM not handled: app killed abruptly, losing work in progress
  • Endpoint removal races with pod termination: traffic sent to dying pods
  • Rolling updates cause brief connection resets
  • Long-running requests (WebSocket, streaming) cut off prematurely

The Solution

Pod Termination Sequence

Time │ Event
─────┼────────────────────────────────────────────────────────
 0s  │ Pod marked for termination (grace period starts counting)
     │ ├── Pod removed from Service endpoints (async, in parallel)
     │ └── preStop hook starts (blocking; SIGTERM is held back until it finishes)
     │
 5s  │ preStop hook completes (e.g., sleep 5)
     │ SIGTERM sent to PID 1
     │ App starts graceful shutdown:
     │   ├── Stop accepting new connections
     │   ├── Drain in-flight requests
     │   └── Close database connections, flush buffers
     │
25s  │ App finishes graceful shutdown, exits 0
     │
30s  │ terminationGracePeriodSeconds expires (default 30s)
     │ SIGKILL sent (force kill if still running)
─────┴────────────────────────────────────────────────────────

Key insight: Endpoint removal is ASYNC, so traffic may still
arrive for a few seconds after the pod is marked for termination.
The preStop sleep delays SIGTERM, giving kube-proxy and ingress
controllers time to update their routing tables.
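The timeline above can be sketched as a small calculation. This is a simplified model, not how the kubelet is implemented: it assumes the grace period starts when the pod is marked for termination, SIGTERM follows the preStop hook, and the preStop hook is shorter than the grace period. The function name and parameters are hypothetical.

```python
def termination_timeline(grace_period=30, pre_stop=5, drain=20):
    """Return (seconds, event) pairs for one pod termination.

    Simplified model (an assumption of this sketch): the grace period
    counts from the moment the pod is marked for termination, and
    SIGTERM is sent only after the preStop hook has finished.
    """
    if pre_stop >= grace_period:
        raise ValueError("this sketch assumes preStop is shorter than the grace period")

    events = [(0, "pod marked Terminating; endpoint removal and preStop start")]
    events.append((pre_stop, "preStop done; SIGTERM sent"))

    exit_at = pre_stop + drain
    if exit_at <= grace_period:
        events.append((exit_at, "app exits cleanly"))
    else:
        # Drain would outlast the deadline, so the container is force-killed
        events.append((grace_period, "SIGKILL: drain cut short"))
    return events


for seconds, event in termination_timeline():
    print(f"{seconds:>3}s  {event}")
```

With the defaults the app exits at 25s, matching the table. Raising `pre_stop` to 10 and `drain` to 25 ends in SIGKILL at 30s, which is the case a longer grace period is meant to prevent.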
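
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  selector:
    matchLabels:
      app: api-server                      # Required: must match the template labels
  template:
    metadata:
      labels:
        app: api-server
    spec:
      terminationGracePeriodSeconds: 60    # Total time before SIGKILL (includes preStop)
      containers:
        - name: app
          image: registry.example.com/api:v2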
          ports:
            - containerPort: 8080
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
                # Sleep gives time for endpoint removal to propagate
                # Then SIGTERM triggers app's graceful shutdown
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
            # Readiness gates traffic while the pod is running; a terminating pod
            # is removed from Service endpoints regardless of probe results

Application SIGTERM Handling

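# Python (standard library; servers such as uvicorn and gunicorn handle SIGTERM themselves)
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

server = ThreadingHTTPServer(("", 8080), Handler)
server.daemon_threads = False  # Lets server_close() wait for in-flight request threads

def graceful_shutdown(signum, frame):
    print("SIGTERM received, shutting down gracefully...")
    # shutdown() blocks until serve_forever() returns, so call it from another thread
    threading.Thread(target=server.shutdown).start()

signal.signal(signal.SIGTERM, graceful_shutdown)
server.serve_forever()  # Returns once shutdown() has been called
server.server_close()   # Stop listening and wait for in-flight requests to finish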
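
// Go
package main

import (
    "context"
    "log"
    "net/http"
    "os/signal"
    "syscall"
    "time"
)

func main() {
    ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
    defer stop()

    server := &http.Server{Addr: ":8080"}
    go func() {
        // ErrServerClosed is the expected result after Shutdown; anything else is fatal
        if err := server.ListenAndServe(); err != http.ErrServerClosed {
            log.Fatalf("listen: %v", err)
        }
    }()

    <-ctx.Done()
    log.Println("Shutdown signal received, draining connections...")
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
    defer cancel()
    if err := server.Shutdown(shutdownCtx); err != nil {
        log.Printf("drain did not finish in time: %v", err)
    }
}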
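
// Node.js
const http = require('http');
const server = http.createServer((req, res) => res.end('ok\n')).listen(8080);

process.on('SIGTERM', () => {
  console.log('SIGTERM received, graceful shutdown...');
  // Stop accepting new connections; the callback fires once existing ones have ended
  server.close(() => {
    console.log('All connections drained');
    process.exit(0);
  });
  // Close idle keep-alive sockets so they don't hold the server open (Node.js 18.2+)
  server.closeIdleConnections();
  // Force exit after 20s if connections don't drain
  setTimeout(() => process.exit(1), 20000).unref();
});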

Zero-Downtime Rolling Update

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
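spec:
  replicas: 4
  selector:
    matchLabels:
      app: api-server       # Must match the template labels (and the PDB selector)
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0     # Never reduce below desired replicas
      maxSurge: 1           # Add 1 extra pod during update
  template:
    metadata:
      labels:
        app: api-server
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          image: registry.example.com/api:v2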
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server-pdb
spec:
  minAvailable: 3    # Always keep at least 3 pods running
  selector:
    matchLabels:
      app: api-server
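
To see what these settings mean for capacity, the arithmetic can be sketched in a few lines. This is a rough illustration, not controller code. It relies on the documented rule that percentage values round up for maxSurge and down for maxUnavailable. The helper names are hypothetical.

```python
def resolve(value, replicas, round_up):
    """Turn an int or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Ceiling division for maxSurge, floor division for maxUnavailable
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_unavailable, max_surge):
    """Return (minimum available pods, maximum total pods) during a rollout."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    surge = resolve(max_surge, replicas, round_up=True)
    return replicas - unavailable, replicas + surge


def pdb_allowed_disruptions(healthy_pods, min_available):
    """Voluntary evictions a PodDisruptionBudget with minAvailable still permits."""
    return max(0, healthy_pods - min_available)


print(rollout_bounds(4, 0, 1))        # settings from the Deployment above
print(pdb_allowed_disruptions(4, 3))  # settings from the PodDisruptionBudget above
```

For the manifests above this gives at least 4 available pods and at most 5 total during an update. The PodDisruptionBudget permits one voluntary eviction at a time while all 4 pods are healthy.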

Long-Running Connections (WebSocket/gRPC Streaming)

spec:
  terminationGracePeriodSeconds: 300    # 5 minutes for long connections
  containers:
    - name: websocket-server
      lifecycle:
        preStop:
          exec:
            command:
              - /bin/sh
              - -c
              - |
                # Signal app to stop accepting new connections
                curl -X POST localhost:8080/admin/drain
                # Wait for existing connections to close naturally
                sleep 30
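
The preStop hook above posts to /admin/drain, so the application needs an endpoint behind that path. The recipe does not show it. What follows is one possible shape using only the Python standard library, with a hypothetical handler class and a /ready path borrowed from the rolling-update example. A real WebSocket or gRPC server would wire the same flag into its own accept loop.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

draining = threading.Event()  # Set once the preStop hook asks the app to drain


class DrainAwareHandler(BaseHTTPRequestHandler):
    def _reply(self, status, body):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        if self.path == "/admin/drain":
            draining.set()
            self._reply(200, b"draining\n")
        else:
            self._reply(404, b"not found\n")

    def do_GET(self):
        if self.path == "/ready":
            # Report not-ready while draining so no new work is routed here
            if draining.is_set():
                self._reply(503, b"draining\n")
            else:
                self._reply(200, b"ready\n")
        else:
            self._reply(404, b"not found\n")

    def log_message(self, fmt, *args):
        pass  # Keep the sketch quiet


def start_server(host="", port=8080):
    """Start the server in a background thread and return it."""
    server = ThreadingHTTPServer((host, port), DrainAwareHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Existing connections are left alone here. The flag only tells the rest of the application to refuse new work, which is the behavior the preStop comment describes.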

Common Issues

502 errors during rolling update

  • Cause: Traffic sent to pod after SIGTERM but before endpoint removal propagates
  • Fix: Add preStop: sleep 5-10 to delay shutdown; set maxUnavailable: 0

Pod killed before finishing graceful shutdown

  • Cause: terminationGracePeriodSeconds too short for drain time
  • Fix: Increase grace period; grace period must be > preStop + drain time

SIGTERM not received by application

  • Cause: PID 1 is a shell script that doesn't forward signals, or the Dockerfile uses the shell form of CMD
  • Fix: Use exec form in the Dockerfile (CMD ["./app"] not CMD ./app), or launch the app with exec in the entrypoint script
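
When a wrapper process has to stay alive as PID 1 (for example, to do cleanup after the app exits), it must relay signals itself. A minimal sketch of that idea, assuming a Python entrypoint; `run_and_forward` is a hypothetical helper, not part of any library.

```python
import signal
import subprocess


def run_and_forward(cmd, signals=(signal.SIGTERM, signal.SIGINT)):
    """Run cmd as a child process and relay the given signals to it.

    Returns the child's exit code; a negative value -N means the child
    was terminated by signal N.
    """
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        # Relay the signal so the real application can start draining
        child.send_signal(signum)

    # Install the forwarding handlers, remembering the previous ones
    previous = {sig: signal.signal(sig, forward) for sig in signals}
    try:
        return child.wait()
    finally:
        for sig, handler in previous.items():
            signal.signal(sig, handler)
```

An entrypoint could call `run_and_forward(["./app"])` and exit with the returned code. If the wrapper has nothing to do after the app exits, `exec ./app` in a shell entrypoint is simpler, because the app then becomes PID 1 directly.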

App exits immediately on SIGTERM without draining

  • Cause: Application doesn't handle SIGTERM (the default action terminates the process immediately)
  • Fix: Add signal handler to drain connections before exiting
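
Draining requires knowing when the last in-flight request has finished. One way to track that is sketched below, assuming a threaded Python server. `InFlightTracker` is a hypothetical helper, and frameworks with built-in graceful shutdown make it unnecessary.

```python
import threading
from contextlib import contextmanager


class InFlightTracker:
    """Counts active requests so shutdown code can wait for them to finish."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    @contextmanager
    def request(self):
        """Wrap the handling of one request."""
        with self._cond:
            self._count += 1
        try:
            yield
        finally:
            with self._cond:
                self._count -= 1
                self._cond.notify_all()

    def wait_idle(self, timeout):
        """Block until no requests are active; False if the timeout expires first."""
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout=timeout)
```

Request handlers would run inside `with tracker.request():`. It is safer to call `tracker.wait_idle(20)` from the main shutdown path than from inside the signal handler, because the handler can interrupt code that already holds the lock. Have the handler set a flag or event that the main path waits on.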

Best Practices

  1. Always add a preStop sleep of 5-10s so endpoint removal can propagate
  2. Handle SIGTERM in your application: drain connections, flush buffers
  3. Set terminationGracePeriodSeconds > preStop + drain time to avoid SIGKILL
  4. Use maxUnavailable: 0 so capacity is never removed during updates
  5. Fail the readiness probe once draining starts, for load balancers that health-check pods directly (Kubernetes removes terminating pods from Service endpoints regardless of probe results)
  6. Use exec form in Dockerfile CMD so PID 1 receives signals
  7. Add a PodDisruptionBudget to protect against voluntary disruptions

Key Takeaways

  • Pod termination: mark terminating → remove endpoints (async) → preStop → SIGTERM → wait → SIGKILL
  • A preStop sleep of 5-10s delays SIGTERM until endpoint removal has propagated
  • Application must handle SIGTERM: stop accepting, drain in-flight, exit cleanly
  • terminationGracePeriodSeconds (default 30s) is the hard deadline before SIGKILL, and it includes preStop time
  • Zero-downtime: preStop hook + SIGTERM handler + maxUnavailable: 0 + readiness probe
  • Shell form Dockerfile CMD (CMD ./app) doesn't forward signals; use exec form
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
