Describe how Kubernetes uses health probes during deployment, scaling, and failover of applications.
Health probes are an important part of maintaining a robust cluster. Probes enable the cluster to determine the status of an application by repeatedly probing it for a response.
Health probes affect a cluster's ability to perform the following tasks:
- Crash mitigation by automatically attempting to restart failing pods
- Failover and load balancing by sending requests only to healthy pods
- Monitoring by determining whether and when pods are failing
- Scaling by determining when a new replica is ready to receive requests
Application developers are expected to code health probe endpoints during application development. These endpoints determine the health and status of the application. For example, a data-driven application might report a successful health probe only if it can connect to the database.
Because the cluster calls them often, health probe endpoints should be quick to perform. Endpoints should not perform complicated database queries or many network calls.
Kubernetes provides the following types of probes: startup, readiness, and liveness. Depending on the application, you might configure one or more of these types.
A readiness probe determines whether the application is ready to serve requests. If the readiness probe fails, then Kubernetes prevents client traffic from reaching the application by removing the pod's IP address from the service resource.
Readiness probes help to detect temporary issues that might affect your applications. For example, the application might be temporarily unavailable when it starts, because it must establish initial network connections, load files in a cache, or perform initial tasks that take time to complete. The application might occasionally need to run long batch jobs, which make it temporarily unavailable to clients.
Kubernetes continues to run the probe even after the application fails it. If the probe succeeds again, then Kubernetes adds the pod's IP address back to the service resource, and requests are sent to the pod again.
In such cases, the readiness probe addresses a temporary issue and improves application availability.
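For example, the following container excerpt is a minimal sketch of a readiness probe that uses an HTTP GET test (the available test types are described later in this section). The /ready endpoint, port, and timing values are hypothetical:
readinessProbe:
  httpGet:
    path: /ready            # hypothetical endpoint that reports readiness
    port: 8080
  periodSeconds: 5
  failureThreshold: 3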
Like a readiness probe, a liveness probe is called throughout the lifetime of the application. Liveness probes determine whether the application container is in a healthy state. If an application fails its liveness probe enough times, then the cluster mitigates the issue, usually by restarting or re-creating the pod according to the pod's restart policy.
Unlike a startup probe, liveness probes continue to be called after the application's initial start process completes.
A startup probe determines when an application's startup is completed.
Unlike a liveness probe, a startup probe is not called again after it succeeds.
If the startup probe does not succeed after a configurable timeout, then the pod is restarted based on its restartPolicy value.
Consider adding a startup probe to applications with a long start time. By using a startup probe, the liveness probe can remain short and responsive.
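For example, the following container excerpt is a minimal sketch that pairs a startup probe with a liveness probe for a slow-starting application. The /health endpoint, port, and timing values are hypothetical:
startupProbe:
  httpGet:
    path: /health           # hypothetical endpoint
    port: 8080
  failureThreshold: 30      # allows up to 30 x 10 = 300 seconds to start
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  failureThreshold: 3       # short detection window after startup completes
  periodSeconds: 10
While the startup probe has not yet succeeded, the other probes are not run, so a slow start does not trigger restarts.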
When defining a probe, you must specify one of the following types of test to perform:
- HTTP GET: Each time that the probe runs, the cluster sends a request to the specified HTTP endpoint. The test is considered a success if the request responds with an HTTP response code between 200 and 399. Other responses cause the test to fail.
- Container command: Each time that the probe runs, the cluster runs the specified command in the container. If the command exits with a status code of 0, then the test succeeds. Other status codes cause the test to fail.
- TCP socket: Each time that the probe runs, the cluster attempts to open a socket to the container. The test succeeds only if the connection is established.
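For illustration, the following container excerpt is a sketch that uses one of each test type. Any probe type can use any test type; the container name, command, paths, and ports are hypothetical:
containers:
- name: example-app          # hypothetical container; other fields omitted
  readinessProbe:            # HTTP GET test
    httpGet:
      path: /healthz
      port: 8080
  livenessProbe:             # container command test
    exec:
      command:
      - cat
      - /tmp/healthy
  startupProbe:              # TCP socket test
    tcpSocket:
      port: 8080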
All the types of probes include timing variables. The periodSeconds variable defines how often the probe runs. The failureThreshold variable defines how many failed attempts are required before the probe itself fails.
For example, a probe with a failureThreshold of 3 and a periodSeconds of 5 must fail three consecutive times before the overall probe fails.
With this configuration, an issue can exist for at least 10 seconds after the first failed attempt before it is mitigated, because two further 5-second periods must elapse.
Lower values detect issues faster, but running probes too often can waste resources.
Consider this trade-off when setting probe timing values.
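The values from the preceding example map to probe fields as follows; the probe type, endpoint, and port are hypothetical:
livenessProbe:
  httpGet:
    path: /health            # hypothetical endpoint
    port: 8080
  periodSeconds: 5           # probe every 5 seconds
  failureThreshold: 3        # mitigate after 3 consecutive failures,
                             # at least 10 seconds after the first failure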
Because probes are defined in a pod template, you can add probes to workload resources such as deployments.
To add a probe to an existing deployment, update and apply the YAML file or use the oc edit command.
For example, the following YAML excerpt defines a deployment pod template with a probe:
apiVersion: apps/v1
kind: Deployment
...output omitted...
spec:
...output omitted...
  template:
    spec:
      containers:
      - name: web-server
...output omitted...
        livenessProbe:
          failureThreshold: 6
          periodSeconds: 10
          httpGet:
            path: /health
            port: 3000
The oc set probe command adds or modifies a probe on a deployment.
For example, the following command adds a readiness probe to a deployment called front-end:
[user@host ~]$ oc set probe deployment/front-end \
  --readiness \
  --failure-threshold 6 \
  --period-seconds 10 \
  --get-url http://:8080/healthz
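After the command runs, the pod template of the front-end deployment contains a readiness probe similar to the following excerpt. Default fields are omitted here, and the exact output can vary:
readinessProbe:
  failureThreshold: 6
  periodSeconds: 10
  httpGet:
    path: /healthz
    port: 8080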
To add or modify a probe on a deployment from the web console, navigate to the Workloads → Deployments menu and select a deployment.
Click Actions and then click Edit Health Checks.
Click Add Readiness Probe to specify the readiness type, the HTTP headers, the path, the port, and more.
Note
The set probe subcommand is exclusive to RHOCP and the oc command; it is not available with kubectl.
References
Configure Liveness, Readiness and Startup Probes
For more information about health probes, refer to the Monitoring Application Health by Using Health Checks chapter in the Red Hat OpenShift Container Platform 4.14 Building Applications documentation at https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html-single/building_applications/index#application-health