Deployments beginner ⏱ 10 minutes K8s 1.28+

Kubernetes Liveness and Readiness Probes Guide

Configure Kubernetes liveness, readiness, and startup probes for health checks. HTTP, TCP, exec probes, timing parameters, and failure threshold tuning.

By Luca Berton • 5 min read

💡 Quick Answer: Kubernetes has three probes: Liveness (is the container alive? restart it if not), Readiness (can it serve traffic? remove it from Service endpoints if not), and Startup (has it finished starting? liveness and readiness checks are held off until it passes). Use httpGet for web apps, tcpSocket for databases, and exec for custom checks. Give slow-starting containers a startupProbe, or at least an initialDelaySeconds, to avoid premature restarts.

The Problem

Without health probes:

  • Dead containers keep running (liveness)
  • Traffic sent to pods that aren’t ready (readiness)
  • Slow-starting apps get killed before they’re up (startup)
  • Rolling updates proceed before new pods can serve
  • No automatic recovery from application deadlocks

The Solution

All Three Probes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: app
        image: myapp:v2
        ports:
        - containerPort: 8080
        
        # Startup probe: runs first; liveness and readiness are held off until it succeeds
        startupProbe:
          httpGet:
            path: /healthz
            port: 8080
          failureThreshold: 30      # 30 × 10s = 5 min to start
          periodSeconds: 10
        
        # Liveness probe: restart the container if this fails
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 0    # Startup probe handles delay
          periodSeconds: 15
          timeoutSeconds: 3
          failureThreshold: 3       # 3 failures → restart
        
        # Readiness probe: remove the pod from Service endpoints if this fails
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 0
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
          successThreshold: 1

Probe Types

# HTTP GET: most common for web apps
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    httpHeaders:
    - name: Accept
      value: application/json
  # Success: 200-399 status code

# TCP Socket: for databases, caches, non-HTTP services
livenessProbe:
  tcpSocket:
    port: 5432
  # Success: TCP connection established

# Exec: run a command inside the container
livenessProbe:
  exec:
    command:
    - sh
    - -c
    - pg_isready -U postgres
  # Success: exit code 0

# gRPC (stable in K8s 1.27+): for services that implement the gRPC Health Checking Protocol
livenessProbe:
  grpc:
    port: 50051
    service: health
  # Success: SERVING status

Timing Parameters

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15    # Wait before first probe (default: 0)
  periodSeconds: 10          # How often to probe (default: 10)
  timeoutSeconds: 3          # Timeout per probe (default: 1)
  failureThreshold: 3        # Failures before action (default: 3)
  successThreshold: 1        # Successes to be considered healthy (default: 1)
  # Must be 1 for liveness and startup probes; readiness may use >1

# Approximate worst-case time to failure = initialDelay + (period × failureThreshold)
# Example: 15 + (10 × 3) = 45 seconds before container restart
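The arithmetic above is easy to check for any probe. A minimal sketch, assuming a hypothetical Probe struct that carries only the three fields the formula uses:

```go
package main

import "fmt"

// Probe holds the manifest fields used by the formula above.
// The delay and period are in seconds; the threshold is a count.
type Probe struct {
	InitialDelaySeconds int
	PeriodSeconds       int
	FailureThreshold    int
}

// MaxSecondsToAction applies the article's formula:
// initialDelay + (period × failureThreshold).
func (p Probe) MaxSecondsToAction() int {
	return p.InitialDelaySeconds + p.PeriodSeconds*p.FailureThreshold
}

func main() {
	liveness := Probe{InitialDelaySeconds: 15, PeriodSeconds: 10, FailureThreshold: 3}
	startup := Probe{PeriodSeconds: 10, FailureThreshold: 30}

	fmt.Println(liveness.MaxSecondsToAction()) // 45
	fmt.Println(startup.MaxSecondsToAction())  // 300
}
```

The second value is the five-minute startup budget from the "All Three Probes" manifest (30 failures at a 10-second period).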

When to Use Each Probe

STARTUP PROBE:
  Purpose: Protect slow-starting containers
  On fail: Keep trying (up to failureThreshold), then the container is killed and restarted per restartPolicy
  Use for: Java apps, apps loading large models, database migrations
  
LIVENESS PROBE:
  Purpose: Detect deadlocked/broken containers
  On fail: Container RESTARTED
  Use for: Detecting deadlocks, infinite loops, unrecoverable errors
  Danger: Too aggressive = restart loops
  
READINESS PROBE:
  Purpose: Control traffic routing
  On fail: Pod removed from Service endpoints
  On pass: Pod added back to Service endpoints
  Use for: Warmup periods, dependency checks, graceful degradation
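How a readiness probe moves a pod out of and back into Service endpoints depends on consecutive results, not single ones. The sketch below is a rough model of that counting, assuming a hypothetical ReadinessTracker type; it is not how the kubelet is implemented.

```go
package main

import "fmt"

// ReadinessTracker counts consecutive probe results, as a rough model of how
// failureThreshold and successThreshold gate a state change.
type ReadinessTracker struct {
	FailureThreshold int
	SuccessThreshold int
	Ready            bool
	failures         int
	successes        int
}

// Observe records one probe result and returns the resulting readiness.
func (t *ReadinessTracker) Observe(ok bool) bool {
	if ok {
		t.successes++
		t.failures = 0
		if !t.Ready && t.successes >= t.SuccessThreshold {
			t.Ready = true // pod added back to Service endpoints
		}
	} else {
		t.failures++
		t.successes = 0
		if t.Ready && t.failures >= t.FailureThreshold {
			t.Ready = false // pod removed from Service endpoints
		}
	}
	return t.Ready
}

func main() {
	t := &ReadinessTracker{FailureThreshold: 3, SuccessThreshold: 1, Ready: true}
	for _, ok := range []bool{false, false, true, false, false, false, true} {
		fmt.Println(ok, "->", t.Observe(ok))
	}
}
```

In this run the two early failures are forgiven by the success that follows them; only the later run of three consecutive failures marks the pod not ready, and a single success brings it back.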

Common Patterns

# Pattern 1: Slow Java app (60s startup)
startupProbe:
  httpGet:
    path: /actuator/health
    port: 8080
  failureThreshold: 60     # 60 × 2s = 2 min max startup
  periodSeconds: 2
livenessProbe:
  httpGet:
    path: /actuator/health
    port: 8080
  periodSeconds: 30
  failureThreshold: 3

---
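# Pattern 2: Database dependency check
# (requires curl and pg_isready in the container image)
readinessProbe:
  exec:
    command:
    - sh
    - -c
    - |
      curl -sf http://localhost:8080/healthz && \
      pg_isready -h "$DB_HOST" -p 5432
  periodSeconds: 10
  timeoutSeconds: 5        # The 1s default is tight for two sequential checks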

---
# Pattern 3: File-based health (sidecar pattern)
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]
  # App creates /tmp/healthy when alive
  # Sidecar or app removes it on fatal error

Liveness vs Readiness Endpoints

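package main

import (
    "net/http"
    "sync/atomic"
)

var dbConnected, cacheWarmed atomic.Bool // set by the app once dependencies are usable

// /healthz: am I alive? (simple, fast check)
func healthz(w http.ResponseWriter, r *http.Request) {
    w.WriteHeader(http.StatusOK) // Always 200 unless deadlocked
}

// /ready: can I serve traffic? (check dependencies)
func ready(w http.ResponseWriter, r *http.Request) {
    if !dbConnected.Load() || !cacheWarmed.Load() {
        w.WriteHeader(http.StatusServiceUnavailable) // 503
        return
    }
    w.WriteHeader(http.StatusOK)
}

// Separate endpoints for liveness and readiness
func main() {
    http.HandleFunc("/healthz", healthz)
    http.HandleFunc("/ready", ready)
    http.ListenAndServe(":8080", nil)
}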

Common Issues

Container restart loop (CrashLoopBackOff)

Liveness probe too aggressive: it fails before the app has started, so the container is restarted repeatedly. Add a startupProbe or increase initialDelaySeconds.

Traffic sent to unready pods during deployment

No readiness probe is configured, so pods count as ready as soon as their containers start. Add one: rolling updates wait for new pods to become ready before proceeding.

Liveness probe passes but app is broken

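The health endpoint doesn’t check enough. Make it exercise the application’s own work path rather than only confirming that the process is running, and keep external dependency checks (DB connection, cache availability) in the readiness probe so that a dependency outage does not trigger restarts.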

Readiness probe never passes

A dependency (DB, external service) is down, so the pod stays not-ready. Check: kubectl describe pod → Events.

Best Practices

  • Always use readiness probes: prevents traffic to unready pods
  • Use startup probes for slow apps: don’t abuse initialDelaySeconds
  • Liveness ≠ readiness: different endpoints, different checks
  • Keep liveness probes simple: check if process is alive, not dependencies
  • Readiness probes check dependencies: database, cache, external services
  • Don’t make liveness probes too aggressive: causes unnecessary restarts
  • timeoutSeconds > 1 for network probes: avoid flapping on slow responses

Key Takeaways

  • Liveness: restart dead containers | Readiness: route traffic | Startup: protect slow starts
  • HTTP probes succeed on 200-399 status codes
  • Startup probe disables liveness/readiness until it passes
  • Keep liveness simple (is it alive?), readiness thorough (can it serve?)
  • Configure timing to match your application’s behavior
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.


This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
