How to Configure Alertmanager for Kubernetes Alerts
Set up Alertmanager to route, group, and deliver Kubernetes alerts. Learn to configure Slack, PagerDuty, and email notifications.
The Problem
You have Prometheus collecting metrics, but you need to be notified when things go wrong instead of constantly watching dashboards.
The Solution
Configure Alertmanager to receive alerts from Prometheus and route them to the appropriate notification channels based on severity and team.
Understanding the Flow
Prometheus → Alerting Rules → Alertmanager → Notifications

Step 1: Create Alerting Rules
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kubernetes-alerts
  namespace: monitoring
  labels:
    release: prometheus  # Match your Prometheus Operator release
spec:
  groups:
    - name: kubernetes
      rules:
        - alert: PodCrashLooping
          expr: |
            rate(kube_pod_container_status_restarts_total[15m]) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is crash looping"
            description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is restarting frequently."
        - alert: PodNotReady
          expr: |
            kube_pod_status_ready{condition="false"} == 1
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is not ready"
            description: "Pod {{ $labels.pod }} has been not ready for more than 15 minutes."
        - alert: HighMemoryUsage
          expr: |
            (container_memory_usage_bytes / container_spec_memory_limit_bytes) > 0.9
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "High memory usage in {{ $labels.pod }}"
            description: "Container {{ $labels.container }} in pod {{ $labels.pod }} is using more than 90% of its memory limit."
        - alert: NodeNotReady
          expr: |
            kube_node_status_condition{condition="Ready",status="true"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Node {{ $labels.node }} is not ready"
            description: "Node {{ $labels.node }} has been not ready for more than 5 minutes."
Step 2: Configure Alertmanager

Alertmanager Config Secret
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-main
  namespace: monitoring
type: Opaque
stringData:
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
      slack_api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
    route:
      group_by: ['alertname', 'namespace']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      receiver: 'slack-notifications'
      routes:
        - match:
            severity: critical
          receiver: 'pagerduty-critical'
          continue: true
        - match:
            severity: critical
          receiver: 'slack-critical'
        - match:
            severity: warning
          receiver: 'slack-warnings'
        - match:
            team: database
          receiver: 'database-team'
    receivers:
      - name: 'slack-notifications'
        slack_configs:
          - channel: '#alerts'
            send_resolved: true
            title: '{{ .Status | toUpper }}: {{ .CommonAnnotations.summary }}'
            text: '{{ .CommonAnnotations.description }}'
      - name: 'slack-critical'
        slack_configs:
          - channel: '#alerts-critical'
            send_resolved: true
            color: '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}'
            title: '🚨 {{ .CommonAnnotations.summary }}'
            text: '{{ .CommonAnnotations.description }}'
      - name: 'slack-warnings'
        slack_configs:
          - channel: '#alerts-warning'
            send_resolved: true
      - name: 'pagerduty-critical'
        pagerduty_configs:
          - service_key: 'YOUR-PAGERDUTY-KEY'
            send_resolved: true
      - name: 'database-team'
        email_configs:
          - to: 'database-team@example.com'
            send_resolved: true
    inhibit_rules:
      - source_match:
          severity: 'critical'
        target_match:
          severity: 'warning'
        equal: ['alertname', 'namespace']
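Before applying the Secret, validate the configuration with amtool (bundled in the prom/alertmanager image); the local file names here are examples.

# Validate the Alertmanager config before applying it
amtool check-config alertmanager.yaml

# Then apply the Secret
kubectl apply -f alertmanager-secret.yaml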
Slack Configuration

Rich Slack Messages
receivers:
  - name: 'slack-detailed'
    slack_configs:
      - channel: '#kubernetes-alerts'
        send_resolved: true
        icon_url: https://avatars.githubusercontent.com/u/3380462
        title: '{{ template "slack.default.title" . }}'
        text: '{{ template "slack.default.text" . }}'
        actions:
          - type: button
            text: 'Runbook :notebook:'
            url: '{{ (index .Alerts 0).Annotations.runbook_url }}'
          - type: button
            text: 'Dashboard :chart_with_upwards_trend:'
            url: 'https://grafana.example.com/d/abc123'
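The slack.default.* templates are Alertmanager's built-ins. To use your own formatting, load custom template files via the top-level templates key; a minimal sketch, assuming the .tmpl files are mounted alongside the config:

# In alertmanager.yaml: load custom notification templates (path is an example)
templates:
  - '/etc/alertmanager/config/*.tmpl'

Inside a .tmpl file you define named templates with {{ define "slack.custom.title" }}...{{ end }} (the name slack.custom.title is a hypothetical example) and reference them as title: '{{ template "slack.custom.title" . }}'.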
PagerDuty Configuration

receivers:
  - name: 'pagerduty'
    pagerduty_configs:
      - service_key: '<your-integration-key>'
        send_resolved: true
        severity: '{{ if eq .Status "firing" }}critical{{ else }}info{{ end }}'
        description: '{{ .CommonAnnotations.summary }}'
        details:
          firing: '{{ template "pagerduty.default.instances" .Alerts.Firing }}'
          resolved: '{{ template "pagerduty.default.instances" .Alerts.Resolved }}'
          num_firing: '{{ .Alerts.Firing | len }}'
          num_resolved: '{{ .Alerts.Resolved | len }}'
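Note that service_key targets the PagerDuty Events API v1. For an Events API v2 integration, use routing_key instead; everything else stays the same:

receivers:
  - name: 'pagerduty-v2'
    pagerduty_configs:
      # Events API v2 integration key instead of the v1 service key
      - routing_key: '<your-events-api-v2-key>'
        send_resolved: true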
Email Configuration

global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager@example.com'
  smtp_auth_password: 'your-app-password'
receivers:
  - name: 'email'
    email_configs:
      - to: 'oncall@example.com'
        send_resolved: true
        headers:
          subject: '{{ .Status | toUpper }}: {{ .CommonAnnotations.summary }}'
        html: |
          <h2>{{ .Status | toUpper }}</h2>
          <p><strong>Alert:</strong> {{ .CommonAnnotations.summary }}</p>
          <p><strong>Description:</strong> {{ .CommonAnnotations.description }}</p>
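Avoid hard-coding the SMTP password in the config. Newer Alertmanager releases (0.25+) accept a file path instead, so you can mount the credential from a Secret; the path below is an assumption:

global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager@example.com'
  # Read the password from a file mounted from a Secret (path is an example)
  smtp_auth_password_file: '/etc/alertmanager/secrets/smtp-password'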
OpsGenie Configuration

receivers:
  - name: 'opsgenie'
    opsgenie_configs:
      - api_key: '<your-api-key>'
        message: '{{ .CommonAnnotations.summary }}'
        description: '{{ .CommonAnnotations.description }}'
        priority: '{{ if eq .CommonLabels.severity "critical" }}P1{{ else }}P3{{ end }}'

Routing Rules
Team-Based Routing
route:
  receiver: 'default'
  routes:
    - match:
        namespace: 'production'
      receiver: 'production-team'
    - match_re:
        namespace: 'team-.*'
      receiver: 'platform-team'
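You can dry-run the routing tree with amtool before shipping it; routes test prints the receiver a given label set would reach (config file name is an example):

# Visualize the routing tree
amtool config routes show --config.file=alertmanager.yaml

# Test which receiver a label set resolves to
amtool config routes test --config.file=alertmanager.yaml namespace=production severity=warning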
Time-Based Routing

route:
  receiver: 'default'
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty'
      active_time_intervals:
        - business-hours
time_intervals:
  - name: business-hours
    time_intervals:
      - weekdays: ['monday:friday']
        times:
          - start_time: '09:00'
            end_time: '17:00'
        location: 'America/New_York'
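Note that active_time_intervals requires Alertmanager 0.24+. The inverse, mute_time_intervals, suppresses a route during the named interval, which is handy for holding non-urgent noise overnight; a sketch, assuming an off-hours interval defined under time_intervals:

route:
  routes:
    - match:
        severity: warning
      receiver: 'slack-warnings'
      # Suppress warning notifications during the named interval
      mute_time_intervals:
        - off-hours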
Silences

Create a Silence via API
# Using amtool
amtool silence add alertname=PodNotReady --duration=2h --comment="Maintenance window"

# Via kubectl: port-forward and use the web UI
kubectl port-forward -n monitoring svc/alertmanager-main 9093:9093
# Then access http://localhost:9093

Programmatic Silence
apiVersion: batch/v1
kind: Job
metadata:
  name: create-silence
spec:
  template:
    spec:
      containers:
        - name: silence
          image: prom/alertmanager
          command:
            - amtool
            - silence
            - add
            - alertname=MaintenanceAlert
            - --alertmanager.url=http://alertmanager-main:9093
            - --duration=2h
            # amtool requires a comment by default
            - --comment=Scheduled maintenance window
      restartPolicy: Never
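amtool can also list and expire silences, which is useful when a maintenance window ends early; replace <silence-id> with the ID from the query output:

# List active silences
amtool silence query --alertmanager.url=http://localhost:9093

# Expire a silence early by ID
amtool silence expire <silence-id> --alertmanager.url=http://localhost:9093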
Testing Alerts

Fire a Test Alert
curl -X POST http://localhost:9093/api/v2/alerts \
  -H "Content-Type: application/json" \
  -d '[{
    "labels": {
      "alertname": "TestAlert",
      "severity": "warning"
    },
    "annotations": {
      "summary": "This is a test alert",
      "description": "Testing Alertmanager configuration"
    }
  }]'
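Confirm the test alert was accepted, then watch the receiver channel for the notification:

amtool alert query alertname=TestAlert --alertmanager.url=http://localhost:9093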
Best Practices

1. Don't Alert on Everything
Focus on actionable alerts that require human intervention.
2. Set Appropriate Severities
- Critical: Immediate action required
- Warning: Investigate soon
- Info: For dashboards only
3. Use Inhibit Rules
Prevent alert storms:
inhibit_rules:
  - source_match:
      alertname: 'NodeDown'
    target_match_re:
      alertname: 'Pod.*'
    equal: ['node']

4. Group Related Alerts
route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m

5. Include Runbook URLs
annotations:
  runbook_url: 'https://wiki.example.com/runbooks/pod-crash-looping'

Key Takeaways
- Define clear alerting rules with appropriate thresholds
- Route alerts based on severity and team ownership
- Use inhibit rules to prevent alert fatigue
- Include actionable information in alerts
- Test your alerting configuration regularly
📘 Go Further with Kubernetes Recipes
Love this recipe? There's so much more! This is just one of 100+ hands-on recipes in our comprehensive Kubernetes Recipes book.
Inside the book, you'll master:
- ✅ Production-ready deployment strategies
- ✅ Advanced networking and security patterns
- ✅ Observability, monitoring, and troubleshooting
- ✅ Real-world best practices from industry experts
"The practical, recipe-based approach made complex Kubernetes concepts finally click for me."
👉 Get Your Copy Now and start building production-grade Kubernetes skills today!