Kubernetes Liveness and Readiness Probes Best Practices
Configure health checks for your Kubernetes pods using liveness and readiness probes. Learn the differences, when to use each, and avoid common pitfalls that cause cascading failures.
The Problem
Kubernetes needs to know if your application is healthy and ready to receive traffic. Without proper health checks, Kubernetes might send traffic to broken pods or fail to restart crashed applications.
The Solution
Configure three types of probes:
- Liveness Probe - Is the container alive? (Restart if not)
- Readiness Probe - Is the container ready for traffic? (Remove from service if not)
- Startup Probe - Has the container started? (Protect slow-starting containers)
Understanding the Difference
| Probe | Purpose | Failure Action |
|---|---|---|
| Liveness | Detect deadlocks/hangs | Restart container |
| Readiness | Detect temporary unavailability | Remove from Service endpoints |
| Startup | Wait for slow startup | Hold off liveness and readiness checks until it succeeds; restart container if it never does |
Basic HTTP Probe Example

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: my-app
              image: my-app:1.0
              ports:
                - containerPort: 8080
              # Liveness: Restart if this fails
              livenessProbe:
                httpGet:
                  path: /healthz
                  port: 8080
                initialDelaySeconds: 10
                periodSeconds: 10
                timeoutSeconds: 5
                failureThreshold: 3
              # Readiness: Remove from service if this fails
              readinessProbe:
                httpGet:
                  path: /ready
                  port: 8080
                initialDelaySeconds: 5
                periodSeconds: 5
                timeoutSeconds: 3
                failureThreshold: 3

Probe Types
HTTP Probe (Most Common)

    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
          - name: Authorization
            value: Bearer token

TCP Probe (For non-HTTP services)

    livenessProbe:
      tcpSocket:
        port: 3306

Command Probe (For custom checks)

    livenessProbe:
      exec:
        command:
          - cat
          - /tmp/healthy

gRPC Probe (Kubernetes 1.24+)

    livenessProbe:
      grpc:
        port: 50051
        service: my.health.Service

Probe Parameters Explained
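| Parameter | Default | Description |
|---|---|---|
| initialDelaySeconds | 0 | Seconds to wait after the container starts before the first probe |
| periodSeconds | 10 | Seconds between probes |
| timeoutSeconds | 1 | Seconds after which a probe attempt counts as failed |
| successThreshold | 1 | Consecutive successes needed to be considered healthy (must be 1 for liveness and startup probes) |
| failureThreshold | 3 | Consecutive failures needed to be considered unhealthy |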
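
These parameters combine into two numbers worth estimating before tuning: roughly how long a hung container can go unnoticed, and how long a startup probe will wait before giving up. The sketch below is a back-of-the-envelope calculation, not the kubelet's exact scheduling. The function names are made up for illustration, and the detection estimate assumes the failure begins just after a successful probe and that the last failing probe runs to its full timeout.

```javascript
// Rough probe timing estimates. The function names are illustrative only,
// and the kubelet's real scheduling (jitter, probe duration) is ignored.

// Defaults as listed in the parameter table.
const PROBE_DEFAULTS = {
  initialDelaySeconds: 0,
  periodSeconds: 10,
  timeoutSeconds: 1,
  successThreshold: 1,
  failureThreshold: 3,
};

// Approximate worst-case seconds between a container becoming unhealthy
// and the probe reaching failureThreshold consecutive failures.
function worstCaseDetectionSeconds(probe = {}) {
  const p = { ...PROBE_DEFAULTS, ...probe };
  return p.failureThreshold * p.periodSeconds + p.timeoutSeconds;
}

// Approximate seconds a startup probe allows before the container is restarted.
function startupBudgetSeconds(probe = {}) {
  const p = { ...PROBE_DEFAULTS, ...probe };
  return p.initialDelaySeconds + p.failureThreshold * p.periodSeconds;
}

console.log(worstCaseDetectionSeconds()); // 31 with the defaults
console.log(worstCaseDetectionSeconds({ timeoutSeconds: 5 })); // 35
console.log(startupBudgetSeconds({ failureThreshold: 30, periodSeconds: 10 })); // 300
```

Lowering `periodSeconds` or `failureThreshold` shortens detection time but makes the probe more sensitive to brief pauses, so treat these figures as a starting point rather than a target.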
Startup Probe for Slow Applications
For applications that take a long time to start (Java apps, ML models):

    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      # Allow up to 5 minutes for startup (30 * 10 seconds)
      failureThreshold: 30
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      # Once startup succeeds, use tighter liveness checks
      periodSeconds: 10
      failureThreshold: 3

Best Practices
✅ DO

    # Separate endpoints for liveness and readiness
    livenessProbe:
      httpGet:
        path: /healthz   # Only checks if app is alive
        port: 8080
    readinessProbe:
      httpGet:
        path: /ready     # Checks dependencies (DB, cache, etc.)
        port: 8080

❌ DON’T

    # Don't check external dependencies in liveness probe!
    livenessProbe:
      httpGet:
        path: /ready     # This checks DB connection
        port: 8080
    # If DB is slow, ALL pods restart = cascading failure!

Recommended Health Check Implementations
Liveness endpoint (/healthz):
- Return 200 if the process is running
- Don’t check external dependencies
- Fast response (< 100ms)
Readiness endpoint (/ready):
- Check database connections
- Check cache availability
- Check required external services
- Return 503 if not ready
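
A readiness endpoint that waits indefinitely on a slow dependency can itself exceed the probe's `timeoutSeconds`. One way to avoid that, sketched below, is to bound each dependency check with its own timeout and report 503 as soon as any check fails or times out. `withTimeout` and `checkReadiness` are hypothetical helpers written for this illustration, not part of any library, and the 1000 ms default is an assumption chosen to sit below the 3-second readiness probe timeout used earlier.

```javascript
// Sketch: run readiness checks with a per-check time limit.
// `withTimeout` and `checkReadiness` are hypothetical helper names.

// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, name) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${name} check timed out after ${ms}ms`)),
      ms
    );
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// `checks` maps a dependency name to a function returning a promise.
// Resolves to the HTTP status code and JSON body the endpoint should send.
async function checkReadiness(checks, timeoutMs = 1000) {
  const names = Object.keys(checks);
  const results = await Promise.allSettled(
    names.map((name) =>
      // Promise.resolve().then(...) also captures synchronous throws.
      withTimeout(Promise.resolve().then(() => checks[name]()), timeoutMs, name)
    )
  );
  const failures = {};
  results.forEach((result, i) => {
    if (result.status === 'rejected') {
      failures[names[i]] = String((result.reason && result.reason.message) || result.reason);
    }
  });
  const ready = Object.keys(failures).length === 0;
  return {
    statusCode: ready ? 200 : 503,
    body: { status: ready ? 'ready' : 'not ready', failures },
  };
}

// Usage with stand-in checks: one healthy dependency, one that never answers.
checkReadiness(
  {
    database: () => Promise.resolve(),
    cache: () => new Promise(() => {}), // simulates a hung connection
  },
  50
).then((result) => console.log(result.statusCode, result.body.failures));
```

In a real handler the stand-in functions would be replaced by the application's own database and cache pings, and the result's `statusCode` and `body` passed to the HTTP response.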
Complete Example: Node.js Application
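
    const express = require('express');
    const app = express();
    // `db` and `redis` are assumed to be the application's existing
    // client objects, each exposing a promise-returning ping()

    // healthz - for liveness (simple)
    app.get('/healthz', (req, res) => {
      res.status(200).json({ status: 'alive' });
    });

    // ready - for readiness (checks dependencies)
    app.get('/ready', async (req, res) => {
      try {
        // Check database
        await db.ping();
        // Check Redis
        await redis.ping();
        res.status(200).json({ status: 'ready' });
      } catch (error) {
        res.status(503).json({
          status: 'not ready',
          error: error.message
        });
      }
    });

    // Listen on the port the probes target
    app.listen(8080);

Common Mistakes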
1. Liveness checks external dependencies

    # BAD: If DB is slow, all pods restart
    livenessProbe:
      httpGet:
        path: /api/users   # Queries database

2. Timeout too short

    # BAD: Probe times out during GC pauses
    livenessProbe:
      httpGet:
        path: /healthz
      timeoutSeconds: 1   # Too short for Java apps

3. No startup probe for slow apps

    # BAD: App takes 60s to start, but liveness starts at 10s
    livenessProbe:
      initialDelaySeconds: 10   # App not ready yet = restart loop

4. Readiness probe too aggressive

    # BAD: Single failure removes from service
    readinessProbe:
      failureThreshold: 1   # Flaky during deployments

Debugging Probes
Check probe status:

    kubectl describe pod my-app-xxx
    # Look for the "Liveness" and "Readiness" lines under each container,
    # and for probe failures in the Events section

Check events for probe failures:

    kubectl get events --field-selector reason=Unhealthy

Test the endpoint manually (this assumes curl is available in the container image):

    kubectl exec -it my-app-xxx -- curl localhost:8080/healthz

Summary
You’ve learned how to:
- Configure liveness, readiness, and startup probes
- Choose the right probe type for your use case
- Implement health check endpoints correctly
- Avoid common pitfalls that cause cascading failures
Key takeaway: Keep liveness probes simple, and use readiness probes for dependency checks.