Application Autoscaling

Objectives

  • Configure a horizontal pod autoscaler for an application.

Kubernetes can autoscale a deployment based on the current load on the application pods by means of a HorizontalPodAutoscaler (HPA) resource type.

A horizontal pod autoscaler resource uses performance metrics that the OpenShift Metrics subsystem collects. The Metrics subsystem comes preinstalled in OpenShift. To autoscale a deployment, you must specify resource requests for pods so that the horizontal pod autoscaler can calculate the percentage of usage.
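For example, before CPU-based autoscaling can compute a utilization percentage, the deployment must define CPU requests in its pod template. The following sketch shows a minimal hello deployment with illustrative request values; the image name and the request sizes are placeholders, not values from this course.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello
        # Placeholder image; substitute your application image.
        image: registry.example.com/hello:latest
        resources:
          requests:
            cpu: 100m       # The HPA computes CPU utilization against this request.
            memory: 128Mi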

The autoscaler works in a loop. Every 15 seconds by default, it performs the following steps:

  • The autoscaler retrieves the details of the metric for scaling from the HPA resource.

  • For each pod that the HPA resource targets, the autoscaler collects the metric from the Metrics subsystem.

  • For each targeted pod, the autoscaler computes the usage percentage from the collected metric and the pod resource requests.

  • The autoscaler computes the average usage and the average resource requests across all the targeted pods. It establishes a usage ratio from these values, and then uses the ratio for its scaling decision.
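As a simplified worked example with hypothetical numbers: suppose that the autoscaler targets 80% average CPU utilization, that each targeted pod requests 100 millicores of CPU, and that the two current pods use an average of 160 millicores each. The scaling decision then follows the standard Kubernetes formula, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization):

average utilization = 160m / 100m   = 160%
usage ratio         = 160% / 80%    = 2.0
desired replicas    = ceil(2 × 2.0) = 4    (capped at maxReplicas)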

The simplest way to create a horizontal pod autoscaler resource is by using the oc autoscale command, for example:

[user@host ~]$ oc autoscale deployment/hello --min 1 --max 10 --cpu-percent 80

The previous command creates a horizontal pod autoscaler resource that changes the number of replicas on the hello deployment to keep its pods under 80% of their total requested CPU usage.

The oc autoscale command creates a horizontal pod autoscaler resource by using the name of the deployment as an argument (hello in the previous example).

The maximum and minimum values for the horizontal pod autoscaler resource accommodate bursts of load and avoid overloading the OpenShift cluster. If the load on the application changes too quickly, then it might help to keep several spare pods to cope with sudden bursts of user requests. Conversely, too many pods can use up all cluster capacity and impact other applications that use the same OpenShift cluster.

To get information about horizontal pod autoscaler resources in the current project, use the oc get command. For example:

[user@host ~]$ oc get hpa
NAME    REFERENCE          TARGETS         MINPODS   MAXPODS   REPLICAS   ...
hello   Deployment/hello   <unknown>/80%   1         10        1          ...
scale   Deployment/scale   60%/80%         2         10        2          ...

Important

The horizontal pod autoscaler initially has a value of <unknown> in the TARGETS column. It might take up to five minutes before <unknown> changes to display a percentage for current usage.

A persistent value of <unknown> in the TARGETS column might indicate that the deployment does not define resource requests for the metric. The horizontal pod autoscaler does not scale these pods.

Pods that are created by using the oc create deployment command do not define resource requests. Using the OpenShift autoscaler might therefore require editing the deployment resources, creating custom YAML or JSON resource files for your application, or adding limit range resources to your project that define default resource requests.
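One way to add the missing requests to an existing deployment is the oc set resources command. The following example sets hypothetical CPU and memory request values on the hello deployment:

[user@host ~]$ oc set resources deployment/hello --requests=cpu=100m,memory=128Mi

After the updated pods start, the horizontal pod autoscaler can compute the usage percentage, and the TARGETS column eventually displays a value instead of <unknown>.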

In addition to the oc autoscale command, you can create a horizontal pod autoscaler resource from a YAML file, as in the following example:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello
spec:
  minReplicas: 1   1
  maxReplicas: 10  2
  metrics:
  - resource:
      name: cpu
      target:
        averageUtilization: 80  3
        type: Utilization
    type: Resource
  scaleTargetRef:  4
    apiVersion: apps/v1
    kind: Deployment
    name: hello

1

Minimum number of pods.

2

Maximum number of pods.

3

Target average CPU utilization for each pod, as a percentage of the requested CPU. If the average CPU utilization across the targeted pods is above this value, then the horizontal pod autoscaler starts new pods. If the average CPU utilization is below this value, then the horizontal pod autoscaler deletes pods.

4

Reference to the name of the deployment resource.

Use the oc apply -f hello-hpa.yaml command to create the resource from the file.
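For example, assuming that the manifest is saved as hello-hpa.yaml in the current directory:

[user@host ~]$ oc apply -f hello-hpa.yaml
[user@host ~]$ oc get hpa hello

The oc get hpa hello command verifies that the new resource exists in the current project.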

The preceding example creates a horizontal pod autoscaler resource that scales based on CPU usage. Alternatively, a horizontal pod autoscaler can scale based on memory usage by setting the resource name to memory, as in the following example:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - resource:
      name: memory
      target:
        averageUtilization: 80
...output omitted...

To create a horizontal pod autoscaler resource from the web console, navigate to Workloads → HorizontalPodAutoscalers. Click Create HorizontalPodAutoscaler and customize the YAML manifest.

Note

If the memory usage of each pod does not decrease as the number of replicas increases, then the application cannot be used with memory-based autoscaling.

References

For more information, refer to the Automatically Scaling Pods with the Horizontal Pod Autoscaler section in the Working with Pods chapter in the Red Hat OpenShift Container Platform 4.14 Nodes documentation at https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html-single/nodes/index#nodes-pods-autoscaling