Kubernetes Health Checks Best Practices
Design effective Kubernetes health checks with liveness, readiness, and startup probes. Avoid common anti-patterns like database checks in liveness probes.
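💡 Quick Answer: Use a startup probe to protect slow-starting containers, a liveness probe that checks only the app process itself, and a readiness probe on a separate endpoint to control traffic.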
The Problem
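Misconfigured probes cause real outages. A liveness probe that depends on a database restarts every pod when the database goes down, and a missing startup probe lets the liveness probe kill slow-starting apps before they finish booting. A reference with production-ready examples saves hours of trial and error.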
The Solution
The Three Probes
spec:
  containers:
    - name: app
      image: my-app:v1
      # Startup probe: wait for slow-starting apps
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30    # 30 × 10s = 5 min max startup
        periodSeconds: 10
      # Liveness probe: restart if dead
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 0  # Startup probe handles the wait
        periodSeconds: 10
        failureThreshold: 3     # 3 failures = restart
      # Readiness probe: remove from Service if not ready
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
Anti-Patterns to AVOID
| Anti-Pattern | Why It's Bad | Fix |
|---|---|---|
| Liveness checks database | DB down → all pods restart → thundering herd | Only check app process health |
| Same endpoint for liveness + readiness | Can't distinguish "dead" from "temporarily busy" | Separate /healthz and /ready |
| Too aggressive liveness (1s period, 1 failure) | Normal GC pauses trigger restarts | 10s period, 3 failures |
| No startup probe for slow apps | Liveness kills pod before it starts | Add startupProbe |
| Readiness checks external deps | External service down → all pods unready → no traffic | Only check local readiness |
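On the application side, the first two fixes in the table (a liveness endpoint that only checks the process, and a separate readiness endpoint) might look roughly like the sketch below. It uses only Python's standard library and is an illustration, not a reference implementation: `AppState`, its `ready` flag, `probe_response`, and `serve` are hypothetical names, and only the /healthz and /ready paths and port 8080 come from the manifest above.

```python
from dataclasses import dataclass
from http.server import BaseHTTPRequestHandler, HTTPServer


@dataclass
class AppState:
    """Hypothetical in-process state; set ready=True once warm-up finishes."""
    ready: bool = False


STATE = AppState()


def probe_response(path: str, state: AppState) -> tuple[int, str]:
    """Map a probe path to an HTTP status and body using only local state."""
    if path == "/healthz":
        # Liveness: if this code runs at all, the process is alive.
        # Deliberately no database or external-service checks here.
        return 200, "ok"
    if path == "/ready":
        # Readiness: report whether this instance can take traffic right now.
        return (200, "ready") if state.ready else (503, "not ready")
    return 404, "not found"


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = probe_response(self.path, STATE)
        payload = body.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args) -> None:
        pass  # keep frequent probe requests out of the logs


def serve(port: int = 8080) -> None:
    """Serve probe endpoints until interrupted; call this from your entry point."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Because an HTTP probe treats any status from 200 to 399 as success, the 503 from /ready takes the pod out of the Service without triggering a restart, while /healthz keeps passing as long as the process can answer at all.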
Probe Types
# HTTP GET (most common)
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    httpHeaders:
      - name: Accept
        value: application/json

# TCP Socket (databases, Redis)
livenessProbe:
  tcpSocket:
    port: 5432

# Exec command
livenessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]

# gRPC (stable since K8s 1.27)
livenessProbe:
  grpc:
    port: 50051
    service: health.v1.Health

graph TD
    A[Pod starts] --> B{Startup probe}
    B -->|Passes| C[Liveness + Readiness probes start]
    B -->|Fails after threshold| D[Kill and restart container]
    C --> E{Readiness probe}
    E -->|Pass| F[Added to Service endpoints]
    E -->|Fail| G[Removed from Service - no traffic]
    C --> H{Liveness probe}
    H -->|Pass| I[Pod stays running]
    H -->|Fail 3x| J[Container restarted]

Frequently Asked Questions
Do I need all three probes?
Readiness is almost always needed (controls traffic routing). Liveness is needed if your app can deadlock or hang. Startup is needed for slow-starting apps (Java, large model loading). Many apps only need readiness + liveness.
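To judge whether an app counts as slow-starting, it can help to compare its startup time with how long the liveness probe tolerates failures. The sketch below applies the same arithmetic as the manifest comment (30 × 10s = 5 min). The function names and the 90-second and 5-second startup times are illustrative assumptions, and the result is an approximation that ignores timeoutSeconds and kubelet scheduling jitter.

```python
def failure_budget_seconds(period_seconds: int, failure_threshold: int,
                           initial_delay_seconds: int = 0) -> int:
    """Approximate time a probe may keep failing before Kubernetes acts."""
    return initial_delay_seconds + period_seconds * failure_threshold


def needs_startup_probe(startup_seconds: float, liveness_budget: int) -> bool:
    """True if the app may still be booting when the liveness probe gives up."""
    return startup_seconds >= liveness_budget


# Values taken from the example manifest.
startup_budget = failure_budget_seconds(period_seconds=10, failure_threshold=30)
liveness_budget = failure_budget_seconds(period_seconds=10, failure_threshold=3)

print(startup_budget)   # 300 (seconds, i.e. the 5 min startup window)
print(liveness_budget)  # 30 (seconds of failed liveness checks before a restart)

# Hypothetical startup times: a 90s boot exceeds the liveness budget, a 5s boot does not.
print(needs_startup_probe(90, liveness_budget))  # True
print(needs_startup_probe(5, liveness_budget))   # False
```

If the check returns True, adding a startupProbe is usually safer than raising the liveness thresholds, because the liveness probe can then stay strict once the app is running.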
Best Practices
- Start with the simplest configuration that meets your needs
- Test changes in staging before production
- Use kubectl describe and events for troubleshooting
- Document your decisions for the team
Key Takeaways
- Liveness restarts a hung container, readiness controls Service traffic, and startup protects slow-starting apps
- Keep probes minimal: check only the app's own health, not databases or other external dependencies
- Monitor and iterate based on real-world behavior
- Automation reduces human error and improves consistency
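Several of the anti-patterns from the table earlier can be caught automatically before a manifest reaches a cluster. The following is a rough sketch of such a check. It assumes the container spec has already been parsed into a Python dict (for example by a YAML loader), and `lint_probes` and its 10-second tolerance cutoff are assumptions chosen for illustration, not an existing tool.

```python
def lint_probes(container: dict) -> list[str]:
    """Return warnings for probe anti-patterns in one parsed container spec."""
    warnings = []
    liveness = container.get("livenessProbe")
    readiness = container.get("readinessProbe")

    if readiness is None:
        warnings.append("no readinessProbe: traffic may reach an unready pod")

    if liveness and readiness:
        live_path = liveness.get("httpGet", {}).get("path")
        ready_path = readiness.get("httpGet", {}).get("path")
        if live_path is not None and live_path == ready_path:
            warnings.append("liveness and readiness share an endpoint")

    if liveness:
        # Kubernetes defaults when the fields are omitted.
        period = liveness.get("periodSeconds", 10)
        threshold = liveness.get("failureThreshold", 3)
        # Illustrative cutoff: under 10s of tolerance is flagged as too aggressive.
        if period * threshold < 10:
            warnings.append("liveness is too aggressive: short pauses cause restarts")

    return warnings


# Hypothetical container spec combining two anti-patterns from the table.
bad = {
    "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080},
                      "periodSeconds": 1, "failureThreshold": 1},
    "readinessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
}
for warning in lint_probes(bad):
    print(warning)
```

A check like this cannot tell whether an endpoint calls a database internally, so it complements code review of the handlers rather than replacing it.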

