The document discusses Kubernetes networking and container networking interfaces (CNI). It provides an overview of the Kubelet and container runtime workflows for setting up pod networking using CNI plugins. Specific details are given on networking setup in containerd and CRI-O. A CNI plugin written in Bash is demonstrated. Container networking uses bridges, veth pairs, and CNI plugins to connect containers to networks. Performance implications of double tunneling with Kubernetes on OpenStack are also noted.
2. Victor Morales
• +18 yrs as a Software Engineer
• .NET, Java, Python, Go programmer
• OpenStack, OPNFV, ONAP and CNCF contributor
https://about.me/electrocucaracha
3. Main goal
Understand the container networking setup process during the creation of Pods in Kubernetes
References:
• https://www.altoros.com/blog/kubernetes-networking-writing-your-own-simple-cni-plug-in-with-bash/
• https://www.tkng.io/cni/
• https://sookocheff.com/post/kubernetes/understanding-kubernetes-networking-model/
4. Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications.
6. Kubelet workflow
Syncs the running pod into the desired pod.
1. Compute sandbox and container changes
2. Kill pod sandbox if necessary
3. Kill any containers that should not be running
4. Create sandbox if necessary
5. Create ephemeral containers
6. Create init containers
7. Create normal containers
https://github.com/kubernetes/kubernetes/blob/v1.24.2/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L800
RunKubelet() → Kubelet.RunOnce() → ContainerRuntime.SyncPod() → RuntimeService.RunPodSandbox()
10. https://github.com/cncf/artwork
CNI (Container Network Interface), a Cloud Native Computing Foundation project, consists of a specification and libraries for writing plugins to configure network interfaces in Linux containers, along with a number of supported plugins. CNI concerns itself only with network connectivity of containers and removing allocated resources when the container is deleted.
13. CNI plugin written in BASH
https://github.com/electrocucaracha/k8s-NetworkingDeepDive-demo/tree/master/bash
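To make the plugin contract concrete, here is a minimal skeleton of a shell CNI plugin. It is an illustrative sketch, not the code from the repository above. It relies on the CNI specification's convention that the runtime passes parameters in `CNI_*` environment variables (`CNI_COMMAND`, `CNI_CONTAINERID`, `CNI_NETNS`, `CNI_IFNAME`) and the network configuration as JSON on stdin. The function name `cni_plugin` and the advertised version list are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Skeleton of a CNI plugin entry point (sketch; cni_plugin is a hypothetical name).
cni_plugin() {
    case "${CNI_COMMAND:-}" in
    ADD)
        # A real plugin would wire an interface into "$CNI_NETNS", assign an
        # address, and print a result JSON on stdout. Here we only log.
        echo "ADD container=${CNI_CONTAINERID:-} netns=${CNI_NETNS:-} ifname=${CNI_IFNAME:-}" >&2
        ;;
    DEL)
        # Release whatever ADD allocated for this container.
        echo "DEL container=${CNI_CONTAINERID:-}" >&2
        ;;
    VERSION)
        # Advertise the spec versions this plugin claims to support (assumed list).
        printf '{"cniVersion":"0.4.0","supportedVersions":["0.3.0","0.3.1","0.4.0"]}\n'
        ;;
    *)
        echo "unsupported CNI_COMMAND: ${CNI_COMMAND:-<unset>}" >&2
        return 1
        ;;
    esac
}
```

In an installed plugin the script would end with a call such as `cni_plugin`, and the runtime would invoke it once per ADD or DEL with the environment already populated.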
14. Setup
Bridge
A bridge behaves like a network switch. It forwards packets between interfaces that are connected to it. It's usually used for forwarding packets on routers, on gateways, or between VMs and network namespaces on a host.
[Diagram: node k8s-control-plane with interfaces eth0, lo, cni0 and kube-ipvs0; addresses shown: 127.0.0.1/8, 172.80.1.2/24, 10.96.0.1/32, 10.244.0.1/24]
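A bridge like the one on this slide can be created with iproute2. The sketch below is a hedged illustration: the helper names `run` and `setup_bridge` are hypothetical, and pairing `cni0` with `10.244.0.1/24` is an assumption based on the values shown on the slide. Because the `ip` commands need root, the `run` wrapper prints them instead of executing them when `DRY_RUN=1`.

```shell
# Execute a command, or just print it when DRY_RUN=1 (the ip commands need root).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a bridge, bring it up, and give it the gateway address of the pod subnet.
setup_bridge() {
    bridge="$1"     # e.g. cni0
    gateway="$2"    # e.g. 10.244.0.1/24 (assumed pairing)
    run ip link add "$bridge" type bridge
    run ip link set "$bridge" up
    run ip addr add "$gateway" dev "$bridge"
}
```

Running `DRY_RUN=1; setup_bridge cni0 10.244.0.1/24` lists the three commands without touching the host.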
18. VETH
The VETH (virtual Ethernet) device is a local Ethernet tunnel. Devices are created in pairs; packets transmitted on one device in the pair are immediately received on the other device. These two devices can be imagined as being connected by a network cable.
[Diagram: veth pair tmpCEE8 and vethCEE8]
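The veth pair is what links a pod's network namespace to the host bridge. The following sketch shows one plausible sequence of iproute2 commands. It assumes that the `tmp...` device is the container end, which is renamed to `eth0` once inside the namespace, and that the namespace is reachable by name through `ip netns`. The helpers `run` and `connect_netns` are hypothetical, and `DRY_RUN=1` prints the commands rather than executing them, since they need root.

```shell
# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Create a veth pair, plug the host end into the bridge, and move the
# other end into the pod's network namespace as eth0.
connect_netns() {
    netns="$1"      # named network namespace of the pod sandbox
    host_if="$2"    # host end, e.g. vethCEE8
    tmp_if="$3"     # temporary name of the container end, e.g. tmpCEE8
    bridge="$4"     # e.g. cni0
    run ip link add "$tmp_if" type veth peer name "$host_if"
    run ip link set "$host_if" master "$bridge"
    run ip link set "$host_if" up
    run ip link set "$tmp_if" netns "$netns"
    run ip netns exec "$netns" ip link set "$tmp_if" name eth0
    run ip netns exec "$netns" ip link set eth0 up
}
```

A full ADD would additionally assign an IP address and a default route inside the namespace; that part is left out to keep the sketch focused on the pair itself.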
Kubernetes is an open source platform for managing modern distributed applications. Unlike traditional applications, distributed applications use multiple systems simultaneously and operate on the same network. In other words, distribution means that bits and bytes are moved from one process to another over a network. Multiple components are involved in creating and configuring networking in Kubernetes. In this talk, we aim to clarify this process by building a CNI plugin written as a Bash script, which can help users detect issues and troubleshoot them.
PLEG: Calls the container runtime interface to list the containers and sandboxes on this node, compares the result with the locally maintained pod cache, generates the corresponding PodLifecycleEvent, and sends it to the Kubelet syncLoop through eventChannel. A periodic task then syncs the pod until it reaches the state the user requested.
cAdvisor: A container monitoring tool integrated into the Kubelet that collects monitoring information for this node and its containers.
PodWorkers: Multiple pod handlers are registered to handle pods at different points in their lifecycle, including creation, update, and deletion.
oomWatcher: Listens for system OOM events. It registers a SystemOOM watch with the cAdvisor module and generates events for the OOM signals received through that watch.
containerGC: Cleans up unused containers on the node; the actual garbage collection is implemented by the container runtime.
imageGC: Reclaims images on the node. When usage of the local disk where images are stored reaches a certain threshold, image garbage collection is triggered and images not used by any pod are deleted.
Managers: The various managers that handle pod-related resources. Each manager has its own role, and they work together in syncLoop.
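The PLEG description above boils down to diffing two snapshots of container state. The sketch below models that idea in shell and awk. It is a simplification, not the Kubelet's Go implementation: the snapshot file format (one `<container-id> <state>` pair per line) and the function name `pleg_relist` are assumptions, and the event names follow the Kubelet's PLEG event types as far as I know them.

```shell
# Compare the cached snapshot ($1) with a fresh relist ($2) and print one
# event per container whose state differs. Both files hold "<id> <state>" lines.
pleg_relist() {
    awk '
        FILENAME == ARGV[1] { old[$1] = $2; next }   # cached states
        {
            seen[$1] = 1
            if (($1 in old) && old[$1] == $2) next   # unchanged, no event
            if ($2 == "running")     print "ContainerStarted", $1
            else if ($2 == "exited") print "ContainerDied", $1
            else                     print "ContainerChanged", $1
        }
        END {
            # Containers in the cache that the runtime no longer reports.
            for (id in old) if (!(id in seen)) print "ContainerRemoved", id
        }
    ' "$1" "$2"
}
```

In the Kubelet these events are sent on a channel to syncLoop; here they are simply printed, one per line.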
https://github.com/kubernetes/kubernetes/blob/v1.24.2/pkg/kubelet/kubelet.go#L2027-L2172
configCh: The producer of this channel is the PodConfig submodule in the kubeDeps object. This module listens for changes in pod information from file, HTTP, and the API server, and produces an event on this channel whenever the pod information from a source is updated.
plegCh: The producer of this channel is the PLEG submodule, which periodically queries the container runtime for the current state of all containers and produces an event on this channel when the state changes.
syncCh: Periodically syncs the latest saved pod state.
livenessManager.Updates(): When a health check finds that a pod is unavailable, the Kubelet automatically takes the appropriate action based on the pod's restartPolicy.
houseKeepingCh: Channel for housekeeping events, which trigger pod cleanup.
When an ADD event occurs in configCh, the loop triggers the HandlePodAdditions method of SyncHandler.
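The channel list above can be read as a dispatch table. The toy model below expresses it as a shell `case` statement. It is only a sketch of the routing idea, not how the Kubelet's Go `select` loop is written. `HandlePodAdditions` comes from the notes above; the other handler names are taken from the Kubelet's SyncHandler interface as I understand it and should be checked against the linked source. The function name `dispatch_event` is hypothetical.

```shell
# Map an event source (and, for configCh, the operation) to the name of
# the SyncHandler method that would react to it.
dispatch_event() {
    src="$1"
    op="${2:-}"
    case "$src" in
    configCh)
        case "$op" in
        ADD)    echo "HandlePodAdditions" ;;
        UPDATE) echo "HandlePodUpdates" ;;
        REMOVE) echo "HandlePodRemoves" ;;
        *)      echo "unhandled configCh operation: $op" >&2; return 1 ;;
        esac
        ;;
    plegCh|syncCh|livenessManager) echo "HandlePodSyncs" ;;
    houseKeepingCh)                echo "HandlePodCleanups" ;;
    *) echo "unknown event source: $src" >&2; return 1 ;;
    esac
}
```

For example, `dispatch_event configCh ADD` prints the handler named in the notes, `HandlePodAdditions`.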
https://github.com/kubernetes/kubernetes/blob/v1.24.2/pkg/kubelet/kubelet.go#L2212-L2250
https://github.com/kubernetes/kubernetes/blob/v1.24.2/pkg/kubelet/kubelet.go#L1455-L1737
https://github.com/kubernetes/kubernetes/blob/v1.24.2/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L702-L940