Autoscaling • intermediate • ⏱ 20 minutes • K8s 1.28+

KEDA vs HPA: Event-Driven Autoscaling Expla...

Compare KEDA and HPA for Kubernetes autoscaling. Scale on Kafka lag, Prometheus metrics, queue depth, cron, and custom events. KEDA vs HPA decision guide.

By Luca Berton • 6 min read

💡 Quick Answer: HPA scales on CPU and memory out of the box, and on custom or external metrics only when a metrics adapter serves them through the Kubernetes metrics APIs. KEDA builds on HPA and adds 60+ event sources: Kafka lag, RabbitMQ queue depth, Prometheus queries, cron schedules, Azure/AWS/GCP services, and more. KEDA also scales to zero. Use HPA for CPU/memory-based scaling; use KEDA when you need scale-to-zero or event-driven triggers.

The Problem

Out of the box, HPA only sees the resource metrics (CPU and memory) that metrics-server publishes through the Kubernetes Metrics API. To scale on Kafka consumer lag, SQS queue depth, or Prometheus query results, you need a custom or external metrics adapter, which has to be set up and maintained for each source. KEDA provides a unified autoscaling framework with 60+ built-in scalers, plus the ability to scale Deployments to zero replicas.

flowchart TB
    subgraph HPA["HPA (Built-in)"]
        M_API["Metrics API<br/>(metrics-server)"]
        M_CUSTOM["Custom Metrics API<br/>(Prometheus Adapter)"]
        H_CTRL["HPA Controller"]
    end
    
    subgraph KEDA_S["KEDA"]
        K_OP["KEDA Operator"]
        K_MA["KEDA Metrics Adapter"]
        K_SCALERS["60+ Scalers<br/>Kafka, Prometheus,<br/>Cron, RabbitMQ,<br/>AWS SQS, HTTP..."]
        K_SO["ScaledObject<br/>/ ScaledJob"]
    end
    
    M_API --> H_CTRL -->|"Scale 1→N"| DEPLOY["Deployment"]
    M_CUSTOM --> H_CTRL
    K_SCALERS --> K_OP --> K_MA -->|"Feeds HPA"| H_CTRL
    K_OP -->|"Scale 0→1"| DEPLOY

The Solution

Comparison Table

| Feature | HPA | KEDA |
| --- | --- | --- |
| Scale to zero | ❌ (min 1 by default) | ✅ |
| Scale from zero | ❌ | ✅ |
| CPU/Memory triggers | ✅ native | ✅ (wraps HPA) |
| Custom metrics | Requires adapter | 60+ built-in scalers |
| Kafka consumer lag | Needs Prometheus Adapter | Built-in scaler |
| Queue depth (RabbitMQ/SQS) | Needs custom adapter | Built-in scaler |
| Cron schedule | ❌ | ✅ |
| Prometheus queries | Needs adapter | Built-in scaler |
| HTTP request rate | Needs adapter | KEDA HTTP Add-on |
| Scale Jobs (batch) | ❌ | ✅ (ScaledJob) |
| Multiple triggers | Multiple metrics per HPA (autoscaling/v2), but anything beyond CPU/memory needs an adapter | Multiple triggers per ScaledObject |
| Cooldown control | behavior.scaleDown.stabilizationWindowSeconds | cooldownPeriod (scale to zero only) + pollingInterval |
| Installation | Built-in | Helm install |

Install KEDA

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace \
  --version 2.16.0

HPA Example (CPU-Based)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
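
For contrast, here is a minimal sketch of what the "requires adapter" path looks like with plain HPA. It assumes a custom metrics adapter (for example Prometheus Adapter) is already installed and exposes a per-pod metric; the metric name http_requests_per_second and the HPA name web-hpa-custom are hypothetical and not part of the original recipe.

```yaml
# Sketch only: HPA combining CPU with a per-pod custom metric.
# Assumes a custom metrics adapter already serves the hypothetical
# metric "http_requests_per_second" via the Custom Metrics API.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa-custom             # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
    # Resource metric served by metrics-server
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Per-pod custom metric served by the adapter
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
```

When several metrics are listed, HPA computes a replica count for each and uses the largest one. The manifest itself is short; the effort is in deploying and configuring the adapter behind it.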

KEDA: Scale on Kafka Consumer Lag

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer
spec:
  scaleTargetRef:
    name: kafka-consumer
  minReplicaCount: 0               # Scale to zero!
  maxReplicaCount: 50
  pollingInterval: 15
  cooldownPeriod: 300
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        consumerGroup: my-consumer-group
        topic: orders
        lagThreshold: "100"        # Target average lag per replica

KEDA: Scale on Prometheus Query

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-scaler
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 1
  maxReplicaCount: 30
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        query: |
          sum(rate(http_requests_total{service="api-server"}[2m]))
        threshold: "100"           # Scale at 100 req/s per pod
        activationThreshold: "5"   # Scaler is active only above 5 req/s (governs 0→1 when minReplicaCount is 0)
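
As a rough illustration of how the threshold above turns into a replica count, the HPA that KEDA creates applies the standard HPA formula. The sketch below assumes the trigger uses the default AverageValue metric type and a hypothetical query result of Q = 450 req/s; neither number comes from a real cluster.

```latex
% General HPA rule
\[
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
      \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
% With an AverageValue target, currentMetricValue = Q / currentReplicas,
% where Q is the query result, so currentReplicas cancels out:
\[
\text{desiredReplicas}
  = \left\lceil \frac{Q}{\text{threshold}} \right\rceil
  = \left\lceil \frac{450}{100} \right\rceil
  = 5
\]
```

The result is then clamped to the range between minReplicaCount and maxReplicaCount, and HPA skips changes when the ratio is within its tolerance (10% by default).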

KEDA: Scale on Cron Schedule

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: business-hours-scaler
spec:
  scaleTargetRef:
    name: web-app
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    # Scale up during business hours
    - type: cron
      metadata:
        timezone: "America/New_York"
        start: "0 8 * * 1-5"      # 8 AM weekdays
        end: "0 18 * * 1-5"       # 6 PM weekdays
        desiredReplicas: "10"
    # Minimum overnight
    - type: cron
      metadata:
        timezone: "America/New_York"
        start: "0 18 * * 1-5"
        end: "0 8 * * 2-6"
        desiredReplicas: "2"

KEDA: ScaledJob for Batch Work

# Scale Jobs (not Deployments) for batch processing
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: queue-processor
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: processor
            image: myorg/queue-processor:v2.0
            env:
              - name: QUEUE_URL
                value: "amqp://rabbitmq:5672"
        restartPolicy: Never
  pollingInterval: 10
  maxReplicaCount: 100
  triggers:
    - type: rabbitmq
      metadata:
        host: amqp://rabbitmq:5672
        queueName: tasks
        mode: QueueLength
        value: "5"                 # 1 Job per 5 messages

KEDA + HPA Together: Multiple Signals

# KEDA can combine multiple triggers (max wins)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: multi-trigger
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 2
  maxReplicaCount: 50
  triggers:
    # CPU-based (same as HPA)
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"
    # Plus: scale on request rate
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        query: sum(rate(http_requests_total{app="api-server"}[2m]))
        threshold: "50"
    # Plus: scale on queue depth
    - type: rabbitmq
      metadata:
        host: amqp://rabbitmq:5672
        queueName: async-tasks
        mode: QueueLength
        value: "10"
  # KEDA uses the HIGHEST replica count from all triggers

Decision Guide

Use HPA When:

  • You only need CPU/memory scaling
  • You already have metrics-server installed
  • You want zero dependencies beyond built-in K8s
  • Your workloads never need to scale to zero

Use KEDA When:

  • You need scale-to-zero (save costs during idle)
  • You scale on external events: Kafka, queues, Prometheus, cron
  • You need multiple scaling triggers on one workload
  • You want to scale Jobs (not just Deployments)
  • You manage event-driven microservices

Use Both When:

  • KEDA manages event triggers + scale-to-zero
  • HPA continues handling CPU/memory for other workloads
  • They coexist on the same cluster (KEDA creates HPAs internally)

Common Issues

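| Issue | Cause | Fix |
| --- | --- | --- |
| KEDA not scaling from 0 | Activation threshold not reached | Lower activationThreshold |
| Scale-down too aggressive | Short cooldown before scaling to zero, or default HPA scale-down behavior between replicas | Increase cooldownPeriod (applies to scale to zero); tune advanced.horizontalPodAutoscalerConfig.behavior for N→1 |
| Kafka scaler auth failure | Missing SASL credentials | Add authenticationRef with TriggerAuthentication |
| HPA and KEDA conflict | Both target same Deployment | Remove HPA; KEDA creates its own |
| Prometheus query returns NaN | Metric doesn't exist yet | Set activationThreshold and make the query return 0 for a missing series (for example, append: or vector(0)) |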
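
For the "Scale-down too aggressive" case, note that cooldownPeriod only governs the final step down to zero; scale-down between replicas is handled by the HPA that KEDA creates. A minimal sketch of tuning that HPA from the ScaledObject follows. It reuses the api-scaler example from this recipe, and the policy values (50% per 60 seconds) are illustrative assumptions rather than recommendations.

```yaml
# Sketch only: pass HPA scale-down behavior through the ScaledObject.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-scaler
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 1
  maxReplicaCount: 30
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          # Use the highest recommendation from the last 5 minutes
          stabilizationWindowSeconds: 300
          policies:
            # Illustrative: remove at most 50% of replicas per minute
            - type: Percent
              value: 50
              periodSeconds: 60
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        query: sum(rate(http_requests_total{service="api-server"}[2m]))
        threshold: "100"
```

The behavior block uses the same schema as the behavior field of an autoscaling/v2 HorizontalPodAutoscaler, so settings from an existing HPA can usually be carried over as they are.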

Best Practices

  • Start with KEDA if your workload is event-driven: retrofitting is harder than starting right
  • Use activationThreshold: it controls when KEDA scales from zero to the first replica
  • Set a reasonable cooldownPeriod: the 300s default prevents flapping to and from zero for most workloads
  • Use TriggerAuthentication: never put credentials in ScaledObject metadata
  • Monitor KEDA itself: KEDA exposes Prometheus metrics for its own health
  • Combine CPU + event triggers: this handles both traffic spikes and queue backlogs
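
To make the TriggerAuthentication advice concrete, here is a minimal sketch for the Kafka example from this recipe using SASL/PLAIN. The Secret name kafka-credentials, the TriggerAuthentication name kafka-auth, and the placeholder username and password are hypothetical; adapt the SASL mechanism and TLS settings to your brokers.

```yaml
# Sketch only: keep Kafka credentials in a Secret and reference them
# from the trigger instead of writing them into ScaledObject metadata.
apiVersion: v1
kind: Secret
metadata:
  name: kafka-credentials          # hypothetical name
type: Opaque
stringData:
  sasl: plaintext                  # SASL mechanism expected by the Kafka scaler
  username: my-user                # placeholder
  password: change-me              # placeholder
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-auth                 # hypothetical name
spec:
  secretTargetRef:
    - parameter: sasl
      name: kafka-credentials
      key: sasl
    - parameter: username
      name: kafka-credentials
      key: username
    - parameter: password
      name: kafka-credentials
      key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer
spec:
  scaleTargetRef:
    name: kafka-consumer
  minReplicaCount: 0
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        consumerGroup: my-consumer-group
        topic: orders
        lagThreshold: "100"
      authenticationRef:
        name: kafka-auth           # points at the TriggerAuthentication above
```

All three objects need to live in the same namespace as the workload being scaled.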

Key Takeaways

  • HPA is built-in and handles CPU/memory scaling well
  • KEDA extends HPA with 60+ event sources and scale-to-zero
  • KEDA creates HPAs internally, so they work together, not against each other
  • ScaledJob enables autoscaling batch Jobs (not just Deployments)
  • For event-driven workloads (queues, streams, cron), KEDA is the clear choice
  • Both are production-ready; KEDA has been a CNCF graduated project since 2023
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.
