Kubernetes Pod and Service Networks

Objectives

  • Interconnect application pods inside the same cluster by using Kubernetes services.

The Software-defined Network

Kubernetes implements software-defined networking (SDN) to manage the network infrastructure of the cluster and user applications. The SDN is a virtual network that encompasses all cluster nodes. The virtual network enables communication between any container or pod inside the cluster. Processes that run in pods that Kubernetes manages can access the SDN. However, the SDN is not accessible from outside the cluster, nor to regular processes on cluster nodes. With the software-defined networking model, you can manage network services through the abstraction of several networking layers.

With the SDN, you can manage network traffic and network resources programmatically, so that teams in your organization can decide how to expose their applications. The SDN implementation creates a model that is compatible with traditional networking practices. It makes pods akin to virtual machines in terms of port allocation, IP address leasing, and reservation.

With the SDN design, you do not need to change how application components communicate with each other, which helps to containerize legacy applications. If your application is composed of many services that communicate over the TCP/UDP stack, then this approach still works, because containers in a pod use the same network stack.

The following diagram shows how all pods are connected to a shared network:

Figure 4.5: How the Kubernetes SDN manages the network

Because the SDN is built on open standards, vendors can offer their own solutions for features such as centralized management, dynamic routing, and tenant isolation.

Kubernetes Networking

Networking in Kubernetes provides a scalable means of communication between containers.

Kubernetes networking provides the following capabilities:

Kubernetes automatically assigns an IP address to every pod. However, pod IP addresses are unstable, because pods are ephemeral. Pods are constantly created and destroyed across the nodes in the cluster. For example, when you deploy a new version of your application, Kubernetes destroys the existing pods and then deploys new ones.

All containers within a pod share networking resources. The IP address and MAC address that are assigned to the pod are shared among all containers in the pod. Thus, all containers within a pod can reach each other's ports through the loopback address, localhost. Ports that are bound to localhost are available to all containers that run within the pod, but never to containers outside it.
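
For example, the following pod definition is a minimal sketch of this shared network namespace; the pod name, container names, image references, and port numbers are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: shared-network-pod
spec:
  containers:
  - name: web
    image: registry.example.com/myapp/web:latest    # illustrative image reference
    ports:
    - containerPort: 8080
  - name: cache
    image: registry.example.com/myapp/cache:latest  # illustrative image reference
    ports:
    - containerPort: 6379

Because both containers share the pod's IP address, the web container can reach the cache container at localhost:6379. If the cache process binds only to the loopback address, then the port remains reachable from the web container, but not from any other pod.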

By default, the pods can communicate with each other even if they run on different cluster nodes or belong to different Kubernetes namespaces. Every pod is assigned an IP address in a flat shared networking namespace that has full communication with other physical computers and containers across the network. All pods are assigned a unique IP address from a Classless Inter-Domain Routing (CIDR) range of host addresses. The shared address range places all pods in the same subnet.

Because all the pods are on the same subnet, pods on all nodes can communicate with pods on any other node without the aid of Network Address Translation (NAT). Kubernetes also provides a service subnet, which links the stable IP address of a service resource to a set of specified pods. The traffic is forwarded in a transparent way to the pods; an agent (depending on the network mode that you use) manages routing rules to route traffic to the pods that match the service resource selectors. Thus, pods can be treated much like Virtual Machines (VMs) or physical hosts from the perspective of port allocation, networking, naming, service discovery, load balancing, application configuration, and migration. Kubernetes implements this infrastructure by managing the SDN.
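
To observe the pod IP addresses that the SDN assigns from the pod subnet, you can add the -o wide option to the oc get pods command. The following output is abbreviated and illustrative; the pod names, IP addresses, and node names vary by cluster:

[user@host ~]$ oc get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP          NODE
front-end-6b45dfbc-x9zqp   1/1     Running   0          2m    10.8.0.86   node1
back-end-7c9f64c5-k2j8d    1/1     Running   0          2m    10.8.0.88   node2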

The following illustration gives further insight into how the infrastructure components work along with the pod and service subnets to enable network access between pods inside an OpenShift instance.

Figure 4.6: Network access between pods in a cluster

The shared networking namespace of pods enables a straightforward communication model. However, the dynamic nature of pods presents a problem. Pods can be added on the fly to handle increased traffic. Likewise, pods can be dynamically scaled down. If a pod fails, then Kubernetes automatically replaces the pod with a new one. These events change pod IP addresses.

Figure 4.7: Problem with direct access to pods

In the diagram, the Before side shows the Front-end container that is running in a pod with a 10.8.0.1 IP address. The container also refers to a Back-end container that is running in a pod with a 10.8.0.2 IP address. In this example, an event occurs that causes the Back-end container to fail. A pod can fail for many reasons. In response to the failure, Kubernetes creates a pod for the Back-end container that uses a new IP address of 10.8.0.4. From the After side of the diagram, the Front-end container now has an invalid reference to the Back-end container because of the IP address change. Kubernetes resolves this problem with service resources.

Using Services

Containers inside Kubernetes pods must not connect directly to each other's dynamic IP address. Instead, Kubernetes assigns a stable IP address to a service resource that is linked to a set of specified pods. The service then acts as a virtual network load balancer for the pods that are linked to the service.

If the pods are restarted, replicated, or rescheduled to different nodes, then the service endpoints are updated, thus providing scalability and fault tolerance for your applications. Unlike the IP addresses of pods, the IP addresses of services do not change.

Figure 4.8: Services resolve pod failure issues

In the diagram, the Before side shows that the Front-end container now holds a reference to the stable IP address of the Back-end service, instead of to the IP address of the pod that is running the Back-end container. When the Back-end container fails, Kubernetes creates a pod with the New back-end container to replace the failed pod. In response to the change, Kubernetes removes the failed pod from the service's host list, or service endpoints, and then adds the IP address of the New back-end container pod to the service endpoints. With the addition of the service, requests from the Front-end container to the Back-end container continue to work, because the service is dynamically updated with the IP address change. A service provides a permanent, static IP address for a group of pods that belong to the same deployment or replica set for an application. Until you delete the service, the assigned IP address does not change, and the cluster does not reuse it.

Most real-world applications do not run as a single pod. Applications need to scale horizontally. Multiple pods run the same containers to meet a growing user demand. A Deployment resource manages multiple pods that execute the same container. A service provides a single IP address for the whole set, and provides load-balancing for client requests among the member pods.
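
For example, you can scale a deployment to add replicas, and the service continues to balance client requests across the new set of pods. The deployment name in this command is illustrative:

[user@host ~]$ oc scale deployment/myapp --replicas=3

As the new pods become ready, they are added to the service endpoints, and clients that connect to the service IP address are unaffected by the change.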

With services, containers in one pod can open network connections to containers in another pod. The pods, which the service tracks, are not required to exist on the same compute node or in the same namespace or project. Because a service provides a stable IP address for other pods to use, a pod also does not need to discover the new IP address of another pod after a restart. The service provides a stable IP address to use, no matter which compute node runs the pod after each restart.

Figure 4.9: Service with pods on many nodes

The SERVICE object provides a stable IP address for the CLIENT container on NODE X to send a request to any one of the API containers.

Kubernetes uses labels on the pods to select the pods that are associated with a service. To include a pod in a service, the pod labels must include each of the selector fields of the service.

Figure 4.10: Service selector match to pod labels

In this example, the selector has a key-value pair of app: myapp. Thus, pods with a matching label of app: myapp are included in the set that is associated with the service. The selector attribute of a service identifies the set of pods that form the endpoints for the service. Each pod in the set is an endpoint for the service.
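
The following service definition is a minimal sketch of the selector match that the figure describes; the service name and port numbers are illustrative, and the app: myapp label matches the example:

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp        # pods with this label become endpoints of the service
  ports:
  - port: 8080        # port that the service listens on
    targetPort: 8080  # container port that receives the forwarded traffic

Any pod whose labels include app: myapp is added to the endpoints of this service, regardless of which node runs the pod.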

To create a service for a deployment, use the oc expose command:

[user@host ~]$ oc expose deployment/<deployment-name> [--selector <selector>] \
  [--port <port>] [--target-port <target-port>] [--protocol <protocol>] [--name <name>]

The oc expose command can use the --selector option to specify the label selector to use. When the command is used without the --selector option, the command applies a selector that matches the pods of the deployment's replication controller or replica set.

The --port option of the oc expose command specifies the port that the service listens on. This port is available only to pods within the cluster. If a port value is not provided, then the port is copied from the deployment configuration.

The --target-port option of the oc expose command specifies the name or number of the container port that the service uses to communicate with the pods.

The --protocol option determines the network protocol for the service. TCP is used by default.

The --name option of the oc expose command can explicitly name the service. If not specified, the service uses the same name that is provided for the deployment.
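
For example, a command of the following form creates a service for the db-pod deployment that is used in the later examples in this section; the command itself is illustrative:

[user@host ~]$ oc expose deployment/db-pod --port 3306

Because the --name option is omitted, the service is also named db-pod.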

To view the selector that a service uses, use the -o wide option with the oc get command.

[user@host ~]$ oc get service db-pod -o wide
NAME    TYPE        CLUSTER-IP      EXTERNAL-IP PORT(S)     AGE     SELECTOR
db-pod  ClusterIP   172.30.108.92   <none>      3306/TCP    108s    app=db-pod

In this example, db-pod is the name of the service. Pods must use the app=db-pod label to be included in the host list for the db-pod service. To see the endpoints that a service uses, use the oc get endpoints command.

[user@host ~]$ oc get endpoints
NAME     ENDPOINTS                       AGE
db-pod   10.8.0.86:3306,10.8.0.88:3306   27s

This example illustrates a service with two pods in the host list. The oc get endpoints command returns the service endpoints in the currently selected project. Add the name of the service to the command to show only the endpoints of a single service. Use the --namespace option to view the endpoints in a different namespace.
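
For example, the following command form lists only the endpoints of the db-pod service in the deploy-services namespace from the earlier examples:

[user@host ~]$ oc get endpoints db-pod --namespace deploy-services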

Use the oc describe deployment <deployment name> command to view the deployment selector.

[user@host ~]$ oc describe deployment db-pod
Name:                   db-pod
Namespace:              deploy-services
CreationTimestamp:      Wed, 18 Jan 2023 17:46:03 -0500
Labels:                 app=db-pod
Annotations:            deployment.kubernetes.io/revision: 2
Selector:               app=db-pod
...output omitted...

You can also view or parse the selector in the YAML or JSON output of the deployment resource, under the spec.selector.matchLabels object. In this example, the -o yaml option of the oc get command returns the selector label that the deployment uses.

[user@host ~]$ oc get deployment/<deployment_name> -o yaml
...output omitted...
  selector:
    matchLabels:
      app: db-pod
...output omitted...
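
Alternatively, you can extract only the selector labels by using a JSONPath expression. The deployment name in this command is a placeholder:

[user@host ~]$ oc get deployment/<deployment_name> -o jsonpath='{.spec.selector.matchLabels}'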

Kubernetes DNS for Service Discovery

Kubernetes uses an internal Domain Name System (DNS) server that the DNS operator deploys. The DNS operator creates a default cluster DNS name, and assigns DNS names to services that you define. The DNS operator implements the DNS API from the operator.openshift.io API group. The operator deploys CoreDNS, creates a service resource for CoreDNS, and then configures the kubelet to instruct pods to use the CoreDNS service IP address for name resolution. When a service does not have a cluster IP address, the DNS operator assigns the service a DNS record that resolves to the set of IP addresses of the pods behind the service.
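
To review the cluster DNS configuration that the DNS operator manages, you can describe the default DNS resource; the output varies by cluster and is omitted here:

[user@host ~]$ oc describe dns.operator/default
...output omitted...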

A pod discovers a service by using the internal DNS server, which is visible only to pods. Each service is dynamically assigned a Fully Qualified Domain Name (FQDN) that uses the following format:

SVC-NAME.PROJECT-NAME.svc.CLUSTER-DOMAIN

When a pod is created, Kubernetes provides the container with a /etc/resolv.conf file with similar contents to the following items:

[user@host ~]$ cat /etc/resolv.conf
search deploy-services.svc.cluster.local svc.cluster.local ...
nameserver 172.30.0.10
options ndots:5

In this example, deploy-services is the project name for the pod, and cluster.local is the cluster domain.

The nameserver directive provides the IP address of the Kubernetes internal DNS server. The options ndots directive specifies the number of dots that must appear in a name to qualify for an initial absolute query. Alternative hostname values are derived by appending values from the search directive to the name that is sent to the DNS server.

In the search directive in this example, the svc.cluster.local entry enables any pod to communicate with another pod in the same cluster by using the service name and project name:

SVC-NAME.PROJECT-NAME

The first entry in the search directive enables a pod to use the service name to specify another pod in the same project. In RHOCP, a project is also the namespace for the pod. The service name alone is sufficient for pods in the same RHOCP project:

SVC-NAME
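
For example, a pod in the deploy-services project can resolve the db-pod service from the earlier examples by using any of these name forms. The following session is illustrative; the pod name is a placeholder, and it assumes that the container image provides a shell and the getent utility:

[user@host ~]$ oc rsh <pod-name>
sh-4.4$ getent hosts db-pod.deploy-services.svc.cluster.local
172.30.108.92   db-pod.deploy-services.svc.cluster.local

The name resolves to the stable cluster IP address of the service, not to the IP address of any individual pod.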

Kubernetes Networking Drivers

Container Network Interface (CNI) plug-ins provide a common interface between the network provider and the container runtime. CNI defines the specifications for plug-ins that configure network interfaces inside containers. Plug-ins that are written to the specification enable different network providers to control the RHOCP cluster network.

Red Hat provides the following CNI plug-ins for a RHOCP cluster:

  • OVN-Kubernetes: The default plug-in for first-time installations of RHOCP, starting with RHOCP 4.10.

  • OpenShift SDN: An earlier plug-in from RHOCP 3.x; it is incompatible with some later features of RHOCP 4.x.

  • Kuryr: A plug-in for integration and performance on OpenStack deployments.

Certified CNI plug-ins from other vendors are also compatible with a RHOCP cluster.

The SDN uses CNI plug-ins to create Linux namespaces to partition the usage of resources and processes on physical and virtual hosts. With this implementation, containers inside pods can share network resources, such as devices, IP stacks, firewall rules, and routing tables. The SDN allocates a unique routable IP to each pod, so that you can access the pod from any other service in the same network.

In OpenShift 4.14, OVN-Kubernetes is the default network provider.

OVN-Kubernetes uses Open Virtual Network (OVN) to manage the cluster network. A cluster that uses the OVN-Kubernetes plug-in also runs Open vSwitch (OVS) on each node. OVN configures OVS on each node to implement the declared network configuration.
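
To confirm which network plug-in a cluster uses, you can query the networkType field of the cluster network configuration. The following output assumes the default OVN-Kubernetes plug-in:

[user@host ~]$ oc get network.config/cluster -o jsonpath='{.spec.networkType}'
OVNKubernetes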

The OpenShift Cluster Network Operator

RHOCP provides a Cluster Network Operator (CNO) that configures OpenShift cluster networking. The CNO is an OpenShift cluster operator that loads and configures Container Network Interface (CNI) plug-ins. As a cluster administrator, run the following command to observe the status of the CNO:

[user@host ~]$ oc get -n openshift-network-operator deployment/network-operator
NAME              READY   UP-TO-DATE  AVAILABLE   AGE
network-operator  1/1     1           1           41d

An administrator configures the cluster network operator at installation time. To see the configuration, use the following command:

[user@host ~]$ oc describe network.config/cluster
Name:         cluster
...output omitted...
Spec:
  Cluster Network:
    Cidr:         10.8.0.0/14    1
    Host Prefix:  23
  External IP:
    Policy:
  Network Type:  OVNKubernetes
  Service Network:
    172.30.0.0/16    2
...output omitted...

1. The Cluster Network CIDR defines the range of IPs for all pods in the cluster.

2. The Service Network CIDR defines the range of IPs for all services in the cluster.

References

For more information, refer to the About Kubernetes Pods and Services chapter in the Red Hat OpenShift Container Platform 4.14 Networking documentation at https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html-single/architecture/index#building-simple-container

For more information, refer to the Cluster Network Operator in OpenShift Container Platform chapter in the Red Hat OpenShift Container Platform 4.14 Networking documentation at https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html-single/networking/index#cluster-network-operator

For more information, refer to the About the OVN-Kubernetes Network Plug-in chapter in the Red Hat OpenShift Container Platform 4.14 Networking documentation at https://docs.redhat.com/en/documentation/openshift_container_platform/4.14/html-single/networking/index#about-ovn-kubernetes
