Configure a horizontal pod autoscaler for an application.
Kubernetes can autoscale a deployment based on the current load on the application pods, by means of a HorizontalPodAutoscaler (HPA) resource type.
The autoscaler works in a loop. Every 15 seconds by default, it performs the following steps:
1. The autoscaler retrieves the details of the metric for scaling from the HPA resource.
2. For each pod that the HPA resource targets, the autoscaler collects the metric from the metrics subsystem.
3. For each targeted pod, the autoscaler computes the usage percentage from the collected metric and from the pod resource requests.
4. The autoscaler computes the average usage and the average resource requests across all the targeted pods. It establishes a usage ratio from these values, and then uses the ratio for its scaling decision.
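The steps above correspond to the standard Kubernetes scaling formula, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization). The following Python sketch illustrates the computation only; the real controller also applies tolerances, stabilization windows, and min/max replica bounds:

```python
import math

def desired_replicas(current_replicas, pod_usage_millicores,
                     pod_request_millicores, target_utilization):
    """Illustrative HPA-style scaling decision.

    pod_usage_millicores: collected CPU usage for each targeted pod.
    pod_request_millicores: CPU resource request of each targeted pod.
    target_utilization: target average utilization percentage (e.g. 80).
    """
    # Average usage and average resource requests across all targeted pods.
    avg_usage = sum(pod_usage_millicores) / len(pod_usage_millicores)
    avg_request = sum(pod_request_millicores) / len(pod_request_millicores)
    # Current average utilization as a percentage of the requests.
    current_utilization = 100 * avg_usage / avg_request
    # The usage ratio drives the scaling decision.
    ratio = current_utilization / target_utilization
    return math.ceil(current_replicas * ratio)

# Three pods, each requesting 500m CPU and using 450m on average, target 80%:
# utilization is 90%, ratio is 1.125, so the autoscaler requests 4 replicas.
print(desired_replicas(3, [450, 450, 450], [500, 500, 500], 80))
```

Note that without the resource requests in the denominator, no usage percentage can be computed, which is why deployments without resource requests cannot be autoscaled.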
The simplest way to create a horizontal pod autoscaler resource is by using the oc autoscale command, for example:
[user@host ~]$ oc autoscale deployment/hello --min 1 --max 10 --cpu-percent 80

To get information about horizontal pod autoscaler resources in the current project, use the oc get command.
For example:
[user@host ~]$ oc get hpa
NAME    REFERENCE          TARGETS         MINPODS   MAXPODS   REPLICAS   ...
hello   Deployment/hello   <unknown>/80%   1         10        1          ...
scale   Deployment/scale   60%/80%         2         10        2          ...

Important
The horizontal pod autoscaler initially has a value of <unknown> in the TARGETS column.
It might take up to five minutes before <unknown> changes to display a percentage for current usage.
A persistent value of <unknown> in the TARGETS column might indicate that the deployment does not define resource requests for the metric.
The horizontal pod autoscaler does not scale these pods.
Pods that are created by using the oc create deployment command do not define resource requests.
Using the OpenShift autoscaler might therefore require editing the deployment resources, creating custom YAML or JSON resource files for your application, or adding limit range resources to your project that define default resource requests.
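For example, a limit range such as the following gives every container in the project a default CPU request, so that the autoscaler can compute a usage percentage. This is a minimal sketch; the resource name and request value are illustrative:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-defaults          # illustrative name
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 250m               # applied when a container sets no CPU request
```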
In addition to the oc autoscale command, you can create a horizontal pod autoscaler resource from a file in the YAML format.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello

minReplicas: Minimum number of pods.
maxReplicas: Maximum number of pods.
averageUtilization: Ideal average CPU usage for each pod. If the global average CPU usage is above that value, then the horizontal pod autoscaler starts new pods. If the global average CPU usage is below that value, then the horizontal pod autoscaler deletes pods.
scaleTargetRef: Reference to the name of the deployment resource.
Use the oc apply -f hello-hpa.yaml command to create the resource from the file.
The preceding example creates a horizontal pod autoscaler resource that scales based on CPU usage.
Alternatively, it can scale based on memory usage by setting the resource name to memory, as in the following example:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: hello
spec:
minReplicas: 1
maxReplicas: 10
metrics:
- resource:
name: memory
target:
averageUtilization: 80
...output omitted...

To create a horizontal pod autoscaler resource from the web console, navigate the menus to the HorizontalPodAutoscaler creation page, and then customize the YAML manifest.
Note
If an application's overall memory usage grows as the number of replicas increases, instead of being distributed across the replicas, then adding replicas does not reduce per-pod memory usage, and the application cannot be used with memory-based autoscaling.
References
For more information, refer to the Automatically Scaling Pods with the Horizontal Pod Autoscaler section in the Working with Pods chapter in the Red Hat OpenShift Container Platform 4.14 Nodes documentation at https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html-single/nodes/index#nodes-pods-autoscaling