Autoscaling intermediate ⏱ 12 minutes K8s 1.28+

KEDA: Event-Driven Autoscaling for K8s

Scale Kubernetes workloads with KEDA based on events from Kafka, RabbitMQ, AWS SQS, Prometheus metrics, and cron schedules.

By Luca Berton • 📖 5 min read

💡 Quick Answer: KEDA extends HPA to scale on external events: Kafka lag, RabbitMQ queue depth, Prometheus metrics, cron schedules, and more, across 60+ supported sources. Install it with helm repo add kedacore https://kedacore.github.io/charts, then helm install keda kedacore/keda -n keda --create-namespace. Create a ScaledObject that points at your Deployment and a trigger source. KEDA scales from 0→N and back to 0 when idle. It works through native HPA: KEDA creates and manages the HPA that handles the 1→N range.

The Problem

Out of the box, HPA scales only on CPU and memory, but real workloads need to:

  • Scale based on Kafka consumer lag (messages piling up)
  • Scale based on queue depth (RabbitMQ, SQS, Azure Queue)
  • Scale to zero when there is no work (save costs)
  • Scale on custom Prometheus metrics
  • Scale on cron schedule (business hours only)

The Solution

Install KEDA

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda -n keda --create-namespace

# Verify
kubectl get pods -n keda
# keda-operator-xxx                     Running
# keda-operator-metrics-apiserver-xxx   Running
# keda-admission-webhooks-xxx           Running

Scale on Kafka Consumer Lag

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer
  namespace: production
spec:
  scaleTargetRef:
    name: kafka-consumer           # Deployment name
  pollingInterval: 15              # Check every 15s
  cooldownPeriod: 300              # Wait 5 min after the last active trigger before scaling to zero
  minReplicaCount: 0               # Scale to zero!
  maxReplicaCount: 50
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.default:9092
      consumerGroup: my-consumer-group
      topic: orders
      lagThreshold: "100"          # Target lag per replica: replicas ≈ ceil(total lag / 100)
      offsetResetPolicy: earliest
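
To make the effect of lagThreshold concrete, here is a rough Python sketch of the replica arithmetic. It assumes the usual HPA average-value rule (total metric divided by the per-replica target, rounded up) and KEDA's default behavior of not scaling a Kafka consumer beyond the topic's partition count. The function name and the partition count of 12 are hypothetical, and the real controller also applies tolerances and stabilization that this ignores.

```python
import math

def kafka_desired_replicas(total_lag: int, lag_threshold: int, partitions: int,
                           min_replicas: int = 0, max_replicas: int = 50) -> int:
    """Approximate replica count for a Kafka lag trigger (simplified model)."""
    if total_lag <= 0:
        # No lag: fall back to the floor (cooldownPeriod is ignored in this sketch)
        return min_replicas
    wanted = math.ceil(total_lag / lag_threshold)  # one replica per lag_threshold messages
    wanted = min(wanted, partitions)               # consumers beyond the partition count would idle
    return max(min_replicas, min(wanted, max_replicas))

print(kafka_desired_replicas(total_lag=950, lag_threshold=100, partitions=12))   # 10
print(kafka_desired_replicas(total_lag=5000, lag_threshold=100, partitions=12))  # 12
print(kafka_desired_replicas(total_lag=0, lag_threshold=100, partitions=12))     # 0
```

If the deployment never reaches maxReplicaCount under heavy lag, the partition count is a likely reason.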

Scale on RabbitMQ Queue Depth

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 0
  maxReplicaCount: 30
  triggers:
  - type: rabbitmq
    metadata:
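      host: amqp://guest:guest@rabbitmq.default:5672/   # Demo only; move credentials to a TriggerAuthentication
      queueName: orders
      mode: QueueLength            # Scale on message count (queueLength is deprecated)
      value: "50"                  # Target messages per replica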

Scale on Prometheus Metrics

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-scaler
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 1               # Always keep 1 running
  maxReplicaCount: 20
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring:9090
      query: sum(rate(http_requests_total{service="api"}[2m]))
      threshold: "100"             # Target RPS per replica: replicas ≈ ceil(total RPS / 100)

Scale on AWS SQS Queue

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-processor
spec:
  scaleTargetRef:
    name: sqs-processor
  minReplicaCount: 0
  maxReplicaCount: 100
  triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/orders
      queueLength: "10"            # Target messages per replica
      awsRegion: us-east-1
    authenticationRef:
      name: aws-credentials

---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-credentials
spec:
  secretTargetRef:
  - parameter: awsAccessKeyID
    name: aws-secret
    key: access-key
  - parameter: awsSecretAccessKey
    name: aws-secret
    key: secret-key

Cron-Based Scaling

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: business-hours-scaler
spec:
  scaleTargetRef:
    name: web-frontend
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
  # Scale up during business hours
  - type: cron
    metadata:
      timezone: America/New_York
      start: 0 8 * * 1-5           # 8 AM Mon-Fri
      end: 0 18 * * 1-5            # 6 PM Mon-Fri
      desiredReplicas: "10"
  # Combine with Prometheus for real-time adjustment
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring:9090
      query: sum(rate(http_requests_total{app="web"}[5m]))
      threshold: "200"

Multiple Triggers

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: multi-trigger
spec:
  scaleTargetRef:
    name: worker
  minReplicaCount: 0
  maxReplicaCount: 50
  triggers:
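  # Each trigger is evaluated independently; the one requesting the most replicas wins.
  # cpu and memory triggers require resource requests on the target containers.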
  - type: kafka
    metadata:
      bootstrapServers: kafka:9092
      consumerGroup: workers
      topic: tasks
      lagThreshold: "50"
  - type: cpu
    metricType: Utilization
    metadata:
      value: "70"
  - type: memory
    metricType: Utilization
    metadata:
      value: "80"
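
As a rough illustration of "highest wins", the sketch below computes a proposal per trigger and takes the maximum. It assumes the standard HPA formulas: average-value metrics divide the total by the per-replica target, while utilization metrics scale the current replica count by the ratio of observed to target utilization. Function names and sample numbers are hypothetical, and HPA tolerance and stabilization windows are left out.

```python
import math

def from_average_value(total: float, target: float) -> int:
    """External metrics such as Kafka lag: total divided by the per-replica target."""
    return math.ceil(total / target)

def from_utilization(current_replicas: int, current_pct: float, target_pct: float) -> int:
    """Resource metrics: scale the current replica count by the utilization ratio."""
    return math.ceil(current_replicas * current_pct / target_pct)

def combined_replicas(current_replicas: int, lag: int, cpu_pct: float, mem_pct: float,
                      min_replicas: int = 0, max_replicas: int = 50) -> int:
    proposals = [
        from_average_value(lag, 50),                       # kafka, lagThreshold "50"
        from_utilization(current_replicas, cpu_pct, 70),   # cpu, value "70"
        from_utilization(current_replicas, mem_pct, 80),   # memory, value "80"
    ]
    return max(min_replicas, min(max(proposals), max_replicas))

print(combined_replicas(current_replicas=4, lag=120, cpu_pct=35, mem_pct=40))  # 3: lag drives scaling
print(combined_replicas(current_replicas=4, lag=0, cpu_pct=95, mem_pct=40))    # 6: CPU drives scaling
```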

ScaledJob (For Jobs Instead of Deployments)

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: batch-processor
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: processor
          image: batch-processor:v1
          command: [python, process.py]
        restartPolicy: Never
  pollingInterval: 30
  maxReplicaCount: 20
  successfulJobsHistoryLimit: 10
  failedJobsHistoryLimit: 5
  triggers:
  - type: rabbitmq
    metadata:
      host: amqp://rabbitmq.default:5672/
      queueName: batch-tasks
      mode: QueueLength           # Scale on message count (queueLength is deprecated)
      value: "1"                  # One job per message
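
A ScaledJob creates new Jobs rather than resizing a Deployment, so the arithmetic differs slightly. The sketch below models what I understand to be the default scaling strategy: the queue-derived target is capped at maxReplicaCount, and already-running Jobs are subtracted. Treat it as an assumption to check against the KEDA documentation for your version; the function name is hypothetical.

```python
import math

def jobs_to_create(queue_length: int, per_job: int, running_jobs: int,
                   max_replica_count: int = 20) -> int:
    """Number of new Jobs to start on this polling cycle (simplified default strategy)."""
    max_scale = min(math.ceil(queue_length / per_job), max_replica_count)
    return max(0, max_scale - running_jobs)  # never negative

print(jobs_to_create(queue_length=7, per_job=1, running_jobs=2))    # 5
print(jobs_to_create(queue_length=100, per_job=1, running_jobs=5))  # 15: capped at 20 total
print(jobs_to_create(queue_length=3, per_job=1, running_jobs=5))    # 0
```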

Check Status

# List scaled objects
kubectl get scaledobjects -A
# NAME              SCALETARGET      MIN  MAX  TRIGGERS  READY  ACTIVE
# kafka-consumer    kafka-consumer   0    50   kafka     True   True

# Check HPA created by KEDA
kubectl get hpa -A
# KEDA creates and manages HPA resources automatically (named keda-hpa-<scaledobject-name>)

# View scaling events
kubectl describe scaledobject kafka-consumer -n production

Common Issues

Scale to zero not working

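minReplicaCount must be 0. Also check cooldownPeriod: KEDA waits this many seconds after the last trigger was active before scaling to zero. A ScaledObject that has only cpu or memory triggers cannot scale to zero.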
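
The two conditions can be read as a single check. The sketch below is a simplified model with hypothetical names, not KEDA's code: scale-to-zero is only possible when the floor is 0 and every trigger has been inactive for at least cooldownPeriod seconds.

```python
def may_scale_to_zero(min_replica_count: int, seconds_since_last_active: float,
                      cooldown_period: int = 300) -> bool:
    """True when the workload is allowed to drop to zero replicas."""
    return min_replica_count == 0 and seconds_since_last_active >= cooldown_period

print(may_scale_to_zero(0, seconds_since_last_active=120))  # False: still cooling down
print(may_scale_to_zero(0, seconds_since_last_active=301))  # True
print(may_scale_to_zero(1, seconds_since_last_active=900))  # False: floor is 1
```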

Trigger "error connecting"

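This usually means KEDA cannot authenticate to or reach the external system. Supply credentials through a TriggerAuthentication and check network connectivity from the keda namespace to the source (DNS, NetworkPolicy, firewall rules).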

Scaling too aggressively

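Increase pollingInterval to check the source less often, and raise the trigger threshold so each replica absorbs more load. cooldownPeriod only delays the final scale-down to zero; scale-down between 1 and N replicas follows the HPA's behavior, which a ScaledObject can tune under advanced.horizontalPodAutoscalerConfig.behavior.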

Best Practices

  • Scale to zero for batch workers and event processors to save costs
  • Keep minReplicaCount=1 for user-facing services to avoid cold starts
  • Combine triggers: cron for the baseline, metrics for bursts
  • Use TriggerAuthentication for secrets; don't embed credentials in the ScaledObject
  • Use ScaledJob for one-shot tasks and ScaledObject for long-running Deployments

Key Takeaways

  • KEDA scales on 60+ event sources (Kafka, RabbitMQ, SQS, Prometheus, cron)
  • Scales to zero and back, which native HPA does not do by default
  • Creates and manages HPA resources automatically
  • ScaledObject for Deployments, ScaledJob for batch Jobs
  • Builds on native HPA: KEDA handles activation (0↔1) and supplies external metrics, while the HPA it creates handles 1↔N scaling, including CPU/memory triggers