Outcomes
Observe that memory resource requests allocate cluster node memory.
Explore how adjusting resource requests impacts the number of replicas that can be scheduled on a node.
As the student user on the workstation machine, use the lab command to prepare your system for this exercise.
This command ensures that the following conditions are true:
The reliability-requests project exists.
The resource files are available in the course directory.
The classroom registry has the registry.ocp4.example.com:8443/redhattraining/long-load:v1 container image.
The registry.ocp4.example.com:8443/redhattraining/long-load:v1 container image contains an application with utility endpoints.
These endpoints perform such tasks as crashing the process and toggling the server's health status.
[student@workstation ~]$ lab start reliability-requests
Instructions
As the admin user, deploy the long-load application by applying the long-load-deploy.yaml file in the reliability-requests project.
Log in as the admin user with the redhatocp password.
[student@workstation ~]$ oc login -u admin -p redhatocp \
  https://api.ocp4.example.com:6443
Login successful.
...output omitted...
Note
In general, use accounts with the least required privileges to perform a task. In the classroom environment, this account is the developer user. However, cluster administrator privileges are required to view the cluster node metrics in this exercise.
View the total memory request allocation for the node.
[student@workstation ~]$ oc describe node master01
...output omitted...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests       Limits
  --------           --------       ------
  cpu                3158m (42%)    980m (13%)
  memory             12667Mi (66%)  1250Mi (6%)
...output omitted...
The command output shows that the pods that are currently running on the node requested a total of 12667 MiB of memory. That value might be slightly different on your system.
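To read only the memory request total instead of scanning the full output, the summary table can be filtered. The following is a minimal sketch, not part of the course materials: the memory_requests function name is hypothetical, and it assumes that the output layout matches the sample in this exercise.

```shell
#!/bin/sh
# Hypothetical helper: print the memory "Requests" value from
# `oc describe node` output supplied on standard input.
# Assumes the "Allocated resources:" table layout shown in this exercise.
memory_requests() {
  awk '
    /^Allocated resources:/ { in_section = 1; next }  # start of the summary table
    in_section && $1 == "memory" { print $2; exit }   # second column is Requests
  '
}

# Example that uses the sample output from this exercise (no cluster needed):
memory_requests <<'EOF'
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests       Limits
  --------           --------       ------
  cpu                3158m (42%)    980m (13%)
  memory             12667Mi (66%)  1250Mi (6%)
EOF
```

With the sample input, the function prints 12667Mi. Against the classroom cluster, the same function could be used as `oc describe node master01 | memory_requests`.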
Important
Projects and objects from previous exercises can cause the memory usage from this exercise to mismatch the intended results. Delete any unrelated projects before continuing.
If you still experience issues, re-create your classroom environment and try this exercise again.
Select the reliability-requests project.
[student@workstation ~]$ oc project reliability-requests
Now using project "reliability-requests" on server "https://api.ocp4.example.com:6443".
Navigate to the ~/DO180/labs/reliability-requests directory. Create a deployment, service, and route by using the oc apply command and the long-load-deploy.yaml file.
[student@workstation ~]$ cd DO180/labs/reliability-requests
[student@workstation reliability-requests]$ oc apply -f long-load-deploy.yaml
deployment.apps/long-load created
service/long-load created
route.route.openshift.io/long-load created
Add a resource request to the pod definition and scale the deployment beyond the cluster's capacity.
Modify the long-load-deploy.yaml file by adding a resource request. The request allocates one gibibyte (1 GiB) to each of the application pods.
spec:
...output omitted...
  template:
...output omitted...
    spec:
      containers:
      - image: registry.ocp4.example.com:8443/redhattraining/long-load:v1
        resources:
          requests:
            memory: 1Gi
...output omitted...
Apply the YAML file to modify the deployment with the resource request.
[student@workstation reliability-requests]$ oc apply -f long-load-deploy.yaml
deployment.apps/long-load configured
service/long-load unchanged
route.route.openshift.io/long-load unchanged
Scale the deployment to have 10 replicas.
[student@workstation reliability-requests]$ oc scale deploy/long-load \
  --replicas 10
deployment.apps/long-load scaled
Observe that the cluster cannot schedule all pods on the single node. The pods with a Pending status cannot be scheduled.
[student@workstation reliability-requests]$ oc get pods
NAME                         READY   STATUS    RESTARTS   AGE
...output omitted...
long-load-86bb4b79f8-44zwd   0/1     Pending   0          58s
...output omitted...
Retrieve the cluster event log, and observe that insufficient memory is the cause of the failed scheduling.
[student@workstation reliability-requests]$ oc get events \
  --field-selector reason="FailedScheduling"
...output omitted...
pod/long-load-86bb4b79f8-44zwd   0/1 nodes are available: 1 Insufficient memory.
...output omitted...
Alternatively, view the events for a pending pod to see the reason. In the following command, replace the pod name with one of the pending pods in your classroom.
[student@workstation reliability-requests]$ oc describe \
  pod/long-load-86bb4b79f8-44zwd
...output omitted...
Events:
...output omitted...
0/1 nodes are available: 1 Insufficient memory.
...output omitted...
Observe that the node's requested memory usage is high.
[student@workstation reliability-requests]$ oc describe node master01
...output omitted...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests       Limits
  --------           --------       ------
  cpu                3158m (42%)    980m (13%)
  memory             18811Mi (99%)  1250Mi (6%)
...output omitted...
The command output shows that the pods from the long-load deployment requested most of the remaining memory from the node. However, not enough memory is available to accommodate the 10 replicas.
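The two memory totals indicate how many replicas the scheduler placed. The following sketch works through that arithmetic with the sample values from this exercise. The scheduled_replicas function name is hypothetical, and the calculation assumes that no other pods on the node changed their requests between the two outputs.

```shell
#!/bin/sh
# Values copied from the two `oc describe node master01` outputs in this exercise.
before_mib=12667    # requested memory before adding the 1 GiB request
after_mib=18811     # requested memory after scaling to 10 replicas
request_mib=1024    # 1Gi expressed in MiB
replicas=10

# Hypothetical helper: number of replicas that the increase in requests accounts for.
# Arguments: total before (MiB), total after (MiB), per-pod request (MiB).
scheduled_replicas() {
  echo $(( ($2 - $1) / $3 ))
}

scheduled=$(scheduled_replicas "$before_mib" "$after_mib" "$request_mib")
echo "scheduled: $scheduled"                    # prints "scheduled: 6"
echo "pending:   $(( replicas - scheduled ))"   # prints "pending:   4"
```

With these values, the increase of 6144 MiB corresponds to six pods of 1 GiB each, which leaves four of the 10 replicas in the Pending status. The numbers on your system might differ.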
Reduce the requested memory per pod so that the replicas can run on the node.
Manually set the resource request to 250Mi.
[student@workstation reliability-requests]$ oc set resources deploy/long-load \
  --requests memory=250Mi
deployment.apps/long-load resource requirements updated
Delete the pods so that they are re-created with the new resource request.
[student@workstation reliability-requests]$ oc delete pod -l app=long-load
pod "long-load-557b4d94f5-29brx" deleted
...output omitted...
Observe that all pods can start with the lowered memory request. Within a minute, the pods are marked as Ready and in a Running state, with no pods in a Pending status.
[student@workstation reliability-requests]$ oc get pods
NAME                         READY   STATUS    RESTARTS   AGE
long-load-557b4d94f5-68hbb   1/1     Running   0          3m14s
long-load-557b4d94f5-bfk7c   1/1     Running   0          3m21s
long-load-557b4d94f5-bnpzh   1/1     Running   0          3m21s
long-load-557b4d94f5-chtv9   1/1     Running   0          3m21s
long-load-557b4d94f5-drg2p   1/1     Running   0          3m14s
long-load-557b4d94f5-hwsz6   1/1     Running   0          3m12s
long-load-557b4d94f5-k5vqj   1/1     Running   0          3m21s
long-load-557b4d94f5-lgstq   1/1     Running   0          3m21s
long-load-557b4d94f5-r8hq4   1/1     Running   0          3m21s
long-load-557b4d94f5-xrg7c   1/1     Running   0          3m21s
Observe that the memory usage of the node is lower.
[student@workstation reliability-requests]$ oc describe node master01
...output omitted...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests       Limits
  --------           --------       ------
  cpu                3158m (42%)    980m (13%)
  memory             15167Mi (80%)  1250Mi (6%)
...output omitted...
Return to the /home/student/ directory.
[student@workstation reliability-requests]$ cd /home/student/
[student@workstation ~]$
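The final memory total can be checked against the values from earlier in the exercise. The following sketch is illustrative only: the projected_requests function name is hypothetical, and the baseline is the sample value from the first node output, so your numbers might differ.

```shell
#!/bin/sh
# Hypothetical helper: projected total of memory requests on the node, in MiB.
# Arguments: baseline total (MiB), replica count, per-pod request (MiB).
projected_requests() {
  echo $(( $1 + $2 * $3 ))
}

# Baseline of 12667 MiB from the first `oc describe node` output,
# with 10 replicas that request 250 MiB each.
projected_requests 12667 10 250    # prints 15167
```

The result, 15167 MiB, matches the memory request total in the last node output, which is consistent with all 10 replicas being scheduled.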