Adding Event-driven Autoscaling to your Kubernetes Cluster

Azure Kubernetes Service (AKS) comes with a Cluster Autoscaler (CA) that can automatically add nodes to the node pool when the cluster runs short of capacity (for example, when pods cannot be scheduled because the existing nodes lack free CPU or memory). KEDA works at the pod level and uses the Horizontal Pod Autoscaler (HPA) to dynamically add pods based on the configured scaler. CA and KEDA therefore go hand in hand when managing dynamic workloads on an AKS cluster, since they scale on different dimensions and based on different rules, as shown below:

Vertical and Horizontal scaling in Kubernetes

Overview

This diagram gives an overview of how KEDA scales an app based on the number of messages waiting on an Azure Service Bus topic:

Communication flow for KEDA scaling an app in AKS

The app is deployed together with a KEDA-backed ScaledObject. This ScaledObject supports minReplicaCount and maxReplicaCount, which define the range of concurrent pod replicas that can exist for the app. Furthermore, a scale trigger is defined inside the ScaledObject that sets the criteria and conditions for scaling up and down.

Although optional, the diagram above also uses Pod Identity. Similar to how secrets are fetched from Azure Key Vault inside containers, Pod Identity lets KEDA read directly from a resource such as an Azure Service Bus topic in order to scale the pods. No connection strings need to be passed around; KEDA authenticates through its AzureIdentityBinding instead.

Usage

The KEDA Helm Chart needs to be installed on the AKS cluster and configured to use an AzureIdentityBinding to access resources in Azure. The Azure AD identity behind that binding needs sufficient RBAC permissions to directly access the required resources in Azure.
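As a rough sketch of what that identity setup might look like with Azure AD Pod Identity, the two resources below bind a user-assigned managed identity to pods that carry a matching selector label. All names (keda-identity, keda-identity-binding, keda-identity-selector) and the bracketed placeholders are assumptions made for illustration, and the manifests are assumed to be applied in the namespace where KEDA runs.

```yaml
# Hypothetical example: names and placeholder values are not from the original setup.
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentity
metadata:
  name: keda-identity
spec:
  type: 0  # 0 = user-assigned managed identity
  resourceID: "/subscriptions/<SUBSCRIPTION_ID>/resourcegroups/<RESOURCE_GROUP>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<IDENTITY_NAME>"
  clientID: "<IDENTITY_CLIENT_ID>"
---
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentityBinding
metadata:
  name: keda-identity-binding
spec:
  azureIdentity: keda-identity      # refers to the AzureIdentity above
  selector: keda-identity-selector  # pods labelled aadpodidbinding=<selector> receive the identity
```

The KEDA operator pods then need the label aadpodidbinding: keda-identity-selector. The KEDA Helm chart exposes a podIdentity.activeDirectory.identity value intended for this purpose; check the chart version you install to confirm the exact key.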

The ScaledObject is defined as follows and is deployed along with the application. The name under scaleTargetRef must match the name of the application's deployment, and that deployment must live in the same Kubernetes namespace as the ScaledObject.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: msg-processor-scaler
spec:
  scaleTargetRef:
    name: msg-processor # must be in the same namespace as the ScaledObject
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: azure-servicebus
    metadata:
      namespace: SERVICE_BUS_NAMESPACE 
      topicName: SERVICE_BUS_TOPIC
      subscriptionName: SERVICE_BUS_TOPIC_SUBSCRIPTION
      messageCount: "5"
    authenticationRef:
      name: trigger-auth-service-bus-msgs

It defines the type of scale trigger we would like to use and the scaling criteria specified under the metadata object. Lots of different KEDA scalers are available out of the box, and details can be found in the KEDA documentation.

In the example above, we use an azure-servicebus scaler and would like to scale out if there are more than 5 unprocessed messages on the topic subscription SERVICE_BUS_TOPIC_SUBSCRIPTION of the SERVICE_BUS_TOPIC topic in the Service Bus namespace SERVICE_BUS_NAMESPACE. Note that namespace here refers to the Azure Service Bus namespace, not a Kubernetes namespace. Scaling will go up to a maximum of 10 concurrent replicas, as defined via maxReplicaCount, and there will always be at least 1 pod, as defined by minReplicaCount.

Since we are using Pod Identity, we also set the authenticationRef of the ScaledObject to trigger-auth-service-bus-msgs. This is a TriggerAuthentication resource that defines how KEDA should authenticate to get the metrics.

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: trigger-auth-service-bus-msgs
spec:
  podIdentity:
    provider: azure

In this case, we are telling KEDA to use Azure as a Pod Identity provider which uses Azure AD Pod Identity. Alternatively, a full connection string can also be used without specifying a TriggerAuthentication resource.
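For comparison, a connection-string variant of the same ScaledObject could look roughly like the sketch below. It assumes that the msg-processor container already exposes the connection string in an environment variable; the variable name SERVICE_BUS_CONNECTION_STRING is made up for this example. The connectionFromEnv parameter tells the azure-servicebus scaler which environment variable of the target deployment to read.

```yaml
# Sketch only: SERVICE_BUS_CONNECTION_STRING is a hypothetical env var name
# that must be defined on the msg-processor deployment's container.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: msg-processor-scaler
spec:
  scaleTargetRef:
    name: msg-processor
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: azure-servicebus
    metadata:
      topicName: SERVICE_BUS_TOPIC
      subscriptionName: SERVICE_BUS_TOPIC_SUBSCRIPTION
      messageCount: "5"
      connectionFromEnv: SERVICE_BUS_CONNECTION_STRING
    # no authenticationRef: the scaler reads the connection string from the env var
```

The trade-off is that the secret has to be present in the application's environment, whereas the Pod Identity approach keeps credentials out of the cluster configuration.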

Using a TriggerAuthentication lets you easily re-use this authentication resource. It also allows you to separate the permissions for KEDA from those of other resources inside your Kubernetes cluster by binding them to different Azure AD identities with different RBAC permissions.

Using alternative KEDA scalers

The example above shows how to configure KEDA for autoscaling using Azure Service Bus topics, but lots of other scalers are supported out of the box. More information can be found in the KEDA Documentation - Scalers.

Note: when adding additional triggers, also ensure that Pod Identity can read from these resources by adding the corresponding RBAC permissions.
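To illustrate, a ScaledObject with a second trigger might look like the following sketch. The queue name SERVICE_BUS_QUEUE is a placeholder invented for this example; both triggers re-use the same TriggerAuthentication, so the underlying identity would need read access to the topic subscription as well as the queue.

```yaml
# Sketch only: SERVICE_BUS_QUEUE is a hypothetical placeholder.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: msg-processor-scaler
spec:
  scaleTargetRef:
    name: msg-processor
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  # Trigger 1: messages waiting on a topic subscription
  - type: azure-servicebus
    metadata:
      namespace: SERVICE_BUS_NAMESPACE
      topicName: SERVICE_BUS_TOPIC
      subscriptionName: SERVICE_BUS_TOPIC_SUBSCRIPTION
      messageCount: "5"
    authenticationRef:
      name: trigger-auth-service-bus-msgs
  # Trigger 2: messages waiting on a queue in the same Service Bus namespace
  - type: azure-servicebus
    metadata:
      namespace: SERVICE_BUS_NAMESPACE
      queueName: SERVICE_BUS_QUEUE
      messageCount: "5"
    authenticationRef:
      name: trigger-auth-service-bus-msgs
```

With several triggers, the HPA evaluates each metric separately and scales to the highest replica count any of them asks for.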

Using the prebuilt Helm chart with KEDA support

If you have deployed a fully configured AKS cluster to Azure and you are also using the accompanying Umbrella Helm chart for easy deployment of your apps then you're in luck, because it also allows you to easily add KEDA support for the applications you deploy to the AKS cluster.

Most of the documentation to get started is available on GitHub, but here's a sample of how you would deploy the application described above using this Helm Chart.

helm-app:
  app:
    name: app-service-name
    container:
      image: hello-world:latest
      port: 80

  keda:
    enabled: true
    name: app-service-name-keda
    authRefName: auth-trigger-app-service-name
    scaleTargetRef: app-service-name 
    minReplicaCount: 1
    maxReplicaCount: 10
    triggers:
    - type: azure-servicebus
      metadata:
        topicName: sbtopic-app-example-service
        subscriptionName: sbsub-app-example-service
        namespace: servicebus-app-example-ns
        messageCount: 5

Using this Helm Chart, you can easily deploy your Service, Deployment, ScaledObject, TriggerAuthentication and, optionally, any other resources (e.g. ingress, secret store) your deployment requires.