Application High Availability with Kubernetes
High availability (HA) is the goal of making applications more robust and resistant to runtime failures. Implementing HA techniques decreases the likelihood that an application is completely unavailable to users.
In general, HA can protect an application from failures in the following contexts:
- From itself, in the form of application bugs
- From its environment, such as networking issues
- From other applications that exhaust cluster resources
Additionally, HA practices can protect the cluster from applications, such as one with a memory leak.
At its core, cluster-level HA tooling mitigates worst-case scenarios. HA is not a substitute for fixing application-level issues, but augments developer mitigations. Although required for reliability, application security is a separate concern.
Applications must work with the cluster so that Kubernetes can best handle failure scenarios. Kubernetes expects the following behaviors from applications:
- Tolerates restarts
- Responds to health probes, such as the startup, readiness, and liveness probes
- Supports multiple simultaneous instances
- Has well-defined and well-behaved resource usage
- Operates with restricted privileges
Although the cluster can run applications that lack the preceding behaviors, applications with these behaviors make better use of the reliability and HA features that Kubernetes provides.
Most HTTP-based applications provide an endpoint to verify application health. The cluster can be configured to observe this endpoint and mitigate potential issues for the application.
The application is responsible for providing such an endpoint. Developers must decide how the application determines its state.
For example, if an application depends on a database connection, then the application might respond with a healthy status only when the database is reachable. However, not all applications that make database connections need such a check. This decision is at the discretion of the developers.
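As a rough illustration of the database example, the following sketch shows one way that an application might expose a health endpoint by using only the Python standard library. The `/healthz` path, the `db.example.internal` host, port `5432`, and the helper names are all hypothetical and chosen for illustration; a real application would use its own database client and its own definition of "healthy".

```python
import socket
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Tuple


def database_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the database can be opened.

    Assumption: a successful TCP connect is a good enough signal.
    A real check might run a lightweight query instead.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def health_status(db_ok: bool) -> Tuple[int, bytes]:
    """Map the dependency state to an HTTP status code and body."""
    if db_ok:
        return 200, b"ok\n"
    return 503, b"database unreachable\n"


class HealthHandler(BaseHTTPRequestHandler):
    # Hypothetical database location, for illustration only.
    db_host = "db.example.internal"
    db_port = 5432

    def do_GET(self):
        # Only the (hypothetical) /healthz path is served by this sketch.
        if self.path != "/healthz":
            self.send_error(404)
            return
        code, body = health_status(
            database_reachable(self.db_host, self.db_port)
        )
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress per-request logging so that frequent probes stay quiet.
        pass


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Create, but do not start, an HTTP server for the health endpoint."""
    return ThreadingHTTPServer(("", port), HealthHandler)
```

Calling `make_server().serve_forever()` would start the endpoint. Whether a failed database connection should make the application report an unhealthy status remains a design decision for the developers, as noted above.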
If an application pod crashes, then it cannot respond to requests. Depending on the configuration, the cluster can automatically restart the pod. If the application fails without crashing the pod, then the cluster can stop sending requests to that pod, but only when the appropriate health probes are configured.
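To make the role of probes more concrete, the following sketch models how a sequence of probe results might lead to an action. It assumes the Kubernetes convention that a probe is treated as failed only after a number of consecutive failures (the `failureThreshold` probe field, which defaults to 3). The function name is hypothetical, and the code is a simplified model, not the actual cluster implementation.

```python
from typing import Iterable, Optional


def first_action_index(
    results: Iterable[bool], failure_threshold: int = 3
) -> Optional[int]:
    """Return the index of the probe attempt at which the cluster would act.

    `results` holds one boolean per probe attempt (True means success).
    The cluster acts once `failure_threshold` consecutive attempts fail.
    Returns None if that never happens.
    """
    consecutive_failures = 0
    for index, succeeded in enumerate(results):
        # Any success resets the count of consecutive failures.
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= failure_threshold:
            return index
    return None


# Two failures followed by a success do not trigger an action;
# three failures in a row do.
attempts = [True, False, False, True, False, False, False]
print(first_action_index(attempts))
```

In this model, an isolated failed probe is tolerated. What the cluster does after the threshold is reached depends on the probe type: a failed liveness probe leads to a container restart, whereas a failed readiness probe stops traffic to the pod.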
Kubernetes uses the following HA techniques to improve application reliability:
- Restarting pods: When a restart policy is configured on a pod, the cluster restarts misbehaving instances of an application.
- Probing: By using health probes, the cluster knows when applications cannot respond to requests, and can automatically act to mitigate the issue.
- Horizontal scaling: When the application load changes, the cluster can scale the number of replicas to match the load.
These techniques are explored throughout this chapter.
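As a preview of the horizontal scaling technique, the following sketch computes a replica count from a measured load by using the ratio-based formula that the Kubernetes horizontal pod autoscaler documentation describes: the desired replica count is the current count multiplied by the ratio of the current metric to the target metric, rounded up. The function and parameter names are hypothetical, and the sketch omits details such as the tolerance band and stabilization behavior of the real autoscaler.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Estimate the replica count needed to bring the metric to its target.

    Simplified model: ceil(current_replicas * current_metric / target_metric),
    clamped to the [min_replicas, max_replicas] range.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Two replicas at 90% average CPU with a 60% target scale out.
print(desired_replicas(2, current_metric=90, target_metric=60))
# Four replicas at 30% average CPU with a 60% target scale in.
print(desired_replicas(4, current_metric=30, target_metric=60))
```

The clamp to a minimum and maximum reflects that an autoscaler is normally bounded, so that a spike in load cannot consume the entire cluster.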