Scaling Kubernetes Jobs with KEDA on EKS

The Problem

Kubernetes has a built-in Horizontal Pod Autoscaler (HPA), but out of the box it only scales on CPU and memory. That works fine for web servers handling HTTP traffic. It works poorly for workloads that process messages from a queue, respond to events from a stream, or run periodic batch jobs, because resource usage says little about how much work is waiting.

If you have a worker deployment pulling messages from Amazon SQS, the HPA has no idea how many messages are waiting. Your pods sit idle or get overwhelmed - there is no middle ground.

KEDA (Kubernetes Event-Driven Autoscaling) solves this. It works alongside the HPA, feeding it metrics from external event sources and handling the step to and from zero itself. Supported sources include SQS queue depth, Kafka lag, Prometheus metrics, cron schedules, and 60+ other triggers.

What We Are Building

In this post, we will set up KEDA on an EKS cluster and configure it to autoscale a worker deployment based on the number of messages in an SQS queue. When messages pile up, KEDA spins up more pods. When the queue is empty, it scales down to zero.

The stack:

Amazon EKS - Kubernetes cluster
KEDA 2.x - Event-driven autoscaler
Amazon SQS - Message queue trigger
IRSA - IAM Roles for Service Accounts (no hardcoded credentials)

Step 1: Install KEDA on EKS

The cleanest way to install KEDA is via Helm:

Terminal

helm repo add kedacore https://kedacore.github.io/charts
helm repo update

helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace \
  --version 2.16.0

Verify the installation:

Terminal

kubectl get pods -n keda

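# Expected output (pod name suffixes will differ):
# NAME                                               READY   STATUS
# keda-admission-webhooks-xxxxxxxxxx-xxxxx           1/1     Running
# keda-operator-xxxxxxxxxx-xxxxx                     1/1     Running
# keda-operator-metrics-apiserver-xxxxxxxxxx-xxxxx   1/1     Running

KEDA installs the operator (watches your ScaledObject resources) and the metrics API server (feeds external metrics to the HPA). Recent chart versions also deploy an admission webhook that validates ScaledObject resources before they are accepted.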

Step 2: Create the SQS Queue

If you do not already have a queue, create one:

Terminal

aws sqs create-queue \
  --queue-name order-processing \
  --region ap-southeast-2

Note the queue URL - you will need it in the KEDA trigger config:

TEXT

https://sqs.ap-southeast-2.amazonaws.com/123456789012/order-processing
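
The IAM policy in the next step refers to the queue by ARN rather than by URL. As a minimal sketch, assuming the standard URL layout https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>, the ARN can be derived with plain shell parameter expansion. The function name queue_url_to_arn is our own, not part of any AWS tool.

```shell
# Derive an SQS queue ARN from a standard-format queue URL.
# Assumes: https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>
queue_url_to_arn() {
  url="$1"
  rest="${url#https://sqs.}"   # <region>.amazonaws.com/<account-id>/<queue-name>
  region="${rest%%.*}"         # everything before the first dot
  path="${rest#*/}"            # <account-id>/<queue-name>
  account="${path%%/*}"
  queue="${path#*/}"
  printf 'arn:aws:sqs:%s:%s:%s\n' "$region" "$account" "$queue"
}

queue_url_to_arn "https://sqs.ap-southeast-2.amazonaws.com/123456789012/order-processing"
```

If the URL does not follow that layout, asking SQS directly is safer: aws sqs get-queue-attributes with --attribute-names QueueArn returns the ARN for a given queue URL.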

Step 3: Set Up IAM Role for Service Account (IRSA)

KEDA needs permission to read the SQS queue depth. On EKS, the correct way to do this is IRSA - no access keys, no secrets, just a Kubernetes service account mapped to an IAM role.

Create the IAM policy:

JSON

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:GetQueueAttributes",
        "sqs:GetQueueUrl"
      ],
      "Resource": "arn:aws:sqs:ap-southeast-2:123456789012:order-processing"
    }
  ]
}
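
The JSON above still has to be registered with IAM before eksctl can attach it. A minimal sketch, assuming the document is saved locally as keda-sqs-policy.json (a file name we are choosing here) and that the policy is called KedaSQSReadPolicy to match the ARN used in the next command. The wrapper function is hypothetical; the aws iam create-policy call inside it is the real CLI command.

```shell
# Register a policy document with IAM.
# Arguments: policy name, path to the JSON policy document.
create_sqs_read_policy() {
  aws iam create-policy \
    --policy-name "$1" \
    --policy-document "file://$2"
}

# Example call (needs AWS credentials with iam:CreatePolicy):
#   create_sqs_read_policy KedaSQSReadPolicy keda-sqs-policy.json
```

The command prints the new policy's ARN, which is the value to pass to --attach-policy-arn.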

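Create the role and associate it with a service account. IRSA needs an IAM OIDC provider on the cluster; if yours does not have one yet, eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve creates it.

Terminal

eksctl create iamserviceaccount \
  --name keda-sqs-sa \
  --namespace default \
  --cluster my-cluster \
  --region ap-southeast-2 \
  --attach-policy-arn arn:aws:iam::123456789012:policy/KedaSQSReadPolicy \
  --approve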
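
Before moving on, it is worth confirming that the service account actually carries the role annotation that IRSA depends on. A small sketch using kubectl's jsonpath output; the function name irsa_role_arn is our own.

```shell
# Print the IAM role ARN that IRSA attached to a service account.
# Arguments: service account name, namespace.
irsa_role_arn() {
  kubectl get serviceaccount "$1" --namespace "$2" \
    --output 'jsonpath={.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Example call (needs cluster access):
#   irsa_role_arn keda-sqs-sa default
```

An empty result means the annotation is missing, and anything that relies on this service account for AWS credentials will not get the role.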

Step 4: Deploy the Worker Application

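Here is a simple worker deployment that processes SQS messages. It runs as keda-sqs-sa, and the policy above only covers the reads KEDA makes, so also grant that role sqs:ReceiveMessage and sqs:DeleteMessage on the queue if the worker is to consume messages: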

YAML

# worker-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-worker
  namespace: default
spec:
  replicas: 0  # KEDA manages the replica count
  selector:
    matchLabels:
      app: order-worker
  template:
    metadata:
      labels:
        app: order-worker
    spec:
      serviceAccountName: keda-sqs-sa
      containers:
        - name: worker
          image: myregistry/order-worker:latest
          env:
            - name: SQS_QUEUE_URL
              value: "https://sqs.ap-southeast-2.amazonaws.com/123456789012/order-processing"
            - name: AWS_REGION
              value: "ap-southeast-2"
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi

Notice replicas: 0. KEDA will handle scaling from zero when messages arrive.

Terminal

kubectl apply -f worker-deployment.yaml

Step 5: Create the KEDA ScaledObject

This is the core piece. The ScaledObject tells KEDA what to scale and what trigger to use:

YAML

# keda-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-worker-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: order-worker
  pollingInterval: 15          # Check SQS every 15 seconds
  cooldownPeriod: 60           # Wait 60s after the last active trigger before scaling to zero
  minReplicaCount: 0           # Scale to zero when idle
  maxReplicaCount: 20          # Cap at 20 pods
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: "https://sqs.ap-southeast-2.amazonaws.com/123456789012/order-processing"
        queueLength: "5"       # 1 pod per 5 messages
        awsRegion: "ap-southeast-2"
      authenticationRef:
        name: keda-aws-auth

The queueLength field is the target number of messages per pod: if there are 25 messages in the queue, KEDA will request 5 pods (25 / 5 = 5). Partial batches round up, and the result never exceeds maxReplicaCount.
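
To make the arithmetic concrete, here is a rough model of the replica count in shell. It is a sketch of the ratio only: it ignores the HPA's tolerance and stabilisation behaviour, and it assumes the message count is whatever KEDA sees for the queue (which, depending on scaler settings, may include in-flight messages). The function name desired_replicas is our own.

```shell
# Rough model of the replica count for a queueLength target.
# Ignores HPA tolerance and stabilisation windows.
desired_replicas() {
  messages="$1"   # messages KEDA sees in the queue
  per_pod="$2"    # queueLength from the trigger metadata
  min="$3"        # minReplicaCount
  max="$4"        # maxReplicaCount

  # Integer ceiling division: a partial batch still needs a whole pod.
  want=$(( (messages + per_pod - 1) / per_pod ))

  if [ "$want" -lt "$min" ]; then want="$min"; fi
  if [ "$want" -gt "$max" ]; then want="$max"; fi
  echo "$want"
}

desired_replicas 25 5 0 20    # prints 5
desired_replicas 30 5 0 20    # prints 6
desired_replicas 500 5 0 20   # prints 20 (capped by maxReplicaCount)
```

The last call shows why maxReplicaCount matters: a backlog of 500 messages would otherwise ask for 100 pods.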

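Now create the authentication resource that maps to our IRSA service account. Recent KEDA releases deprecate the aws-eks provider in favour of aws, so check the authentication docs for your version before copying this: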

YAML

# keda-trigger-auth.yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-aws-auth
  namespace: default
spec:
  podIdentity:
    provider: aws-eks

Apply both:

Terminal

kubectl apply -f keda-trigger-auth.yaml
kubectl apply -f keda-scaledobject.yaml
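
Once both resources are applied, KEDA should generate an HPA for the ScaledObject. The sketch below lists both so you can confirm the wiring before sending any messages. It assumes KEDA's default HPA naming of keda-hpa-<scaledobject-name>; the function name show_scaler_state is our own.

```shell
# Show a ScaledObject and the HPA that KEDA generates for it.
# Arguments: ScaledObject name, namespace.
# Assumes the default generated HPA name: keda-hpa-<scaledobject-name>.
show_scaler_state() {
  kubectl get scaledobject "$1" --namespace "$2"
  kubectl get hpa "keda-hpa-$1" --namespace "$2"
}

# Example call (needs cluster access):
#   show_scaler_state order-worker-scaler default
```

If the ScaledObject does not report as ready, the trigger or its authentication is usually the cause.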

Step 6: Test It

Send a batch of messages to the queue:

Terminal

for i in $(seq 1 30); do
  aws sqs send-message \
    --queue-url https://sqs.ap-southeast-2.amazonaws.com/123456789012/order-processing \
    --message-body "{\"orderId\": \"order-$i\"}" \
    --region ap-southeast-2
done

Watch the pods scale up:

Terminal

kubectl get pods -w -l app=order-worker

# You should see pods going from 0 → 6 (30 messages / 5 per pod)

Once the messages are processed and the queue is empty, KEDA will scale back down to zero after the cooldown period.
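
While the test runs, it helps to watch the queue from the AWS side as well, so you can compare what SQS reports with the pod count. A minimal sketch that reads the two approximate counters (visible and in-flight messages); the function name queue_depth is our own.

```shell
# Print the approximate visible and in-flight message counts for a queue.
# Arguments: queue URL, AWS region.
queue_depth() {
  aws sqs get-queue-attributes \
    --queue-url "$1" \
    --region "$2" \
    --attribute-names ApproximateNumberOfMessages ApproximateNumberOfMessagesNotVisible \
    --query 'Attributes' \
    --output json
}

# Example call (needs AWS credentials with sqs:GetQueueAttributes):
#   queue_depth https://sqs.ap-southeast-2.amazonaws.com/123456789012/order-processing ap-southeast-2
```

Both values are approximate by design, so expect them to lag slightly behind what the workers are doing.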

Monitoring

KEDA exposes Prometheus metrics out of the box. The key ones to watch:

Terminal

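# Key KEDA Prometheus metrics to monitor (names vary between KEDA releases,
# so confirm them against the metrics endpoint of your installed version)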
#
# METRIC                          DESCRIPTION
# ─────────────────────────────── ──────────────────────────────────────────────
# keda_scaler_metrics_value       Current value of the trigger metric (queue depth)
# keda_scaled_object_errors       Errors in the scaling loop
# keda_resource_totals            Total ScaledObjects and ScaledJobs

# Quick check - confirm the external metrics API that KEDA serves is registered:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq .

If you are using the Prometheus operator, KEDA's Helm chart can create a ServiceMonitor automatically:

Terminal

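helm upgrade keda kedacore/keda \
  --namespace keda \
  --version 2.16.0 \
  --set prometheus.metricServer.enabled=true \
  --set prometheus.metricServer.serviceMonitor.enabled=true \
  --set prometheus.operator.enabled=true \
  --set prometheus.operator.serviceMonitor.enabled=true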

Common Pitfalls

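1. Forgetting IRSA setup - The KEDA operator, not the worker, is what polls SQS. If it cannot read the queue, nothing scales and the only evidence is in the ScaledObject status and the keda-operator logs, so check those first when scaling is not happening.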

2. Setting minReplicaCount to 1 - This keeps one pod running even when the queue is empty. If you want true scale-to-zero, set it to 0, but be aware that cold starts add latency for the first message.

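3. Too aggressive cooldownPeriod - This setting controls how long KEDA waits after the last active trigger before scaling to zero. Setting it to 0 causes rapid cycling between zero and running pods when messages arrive in bursts. 60-120 seconds is a good default.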

4. Not setting resource requests - Without CPU/memory requests, the scheduler treats new pods as needing almost nothing, so they rarely go Pending and the node autoscaler (Karpenter or Cluster Autoscaler) gets no signal to provision new nodes when KEDA scales the deployment up.
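
For the first pitfall in particular, two commands cover most of the diagnosis: describing the ScaledObject shows its conditions and recent events, and the operator logs show scaler errors such as denied AWS calls. A minimal sketch, assuming KEDA was installed into the keda namespace with the release name keda as in Step 1, so that the operator deployment is called keda-operator. The function name keda_debug is our own.

```shell
# Collect the usual evidence when a ScaledObject is not scaling.
# Arguments: ScaledObject name, namespace.
# Assumes the operator runs as deployment/keda-operator in namespace keda.
keda_debug() {
  kubectl describe scaledobject "$1" --namespace "$2"
  kubectl logs deployment/keda-operator --namespace keda --tail=50
}

# Example call (needs cluster access):
#   keda_debug order-worker-scaler default
```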

Summary

KEDA fills a real gap in Kubernetes autoscaling. If your workloads are event-driven - processing queues, responding to webhooks, running scheduled jobs - KEDA gives you scaling behaviour that the built-in HPA simply cannot provide.

The setup on EKS is straightforward: install via Helm, wire up IRSA for authentication, define a ScaledObject, and you are done. Your workers scale from zero to whatever you need, and back to zero when the work is finished.
