

KUBERNETES NETWORKING

Understanding the Security Implications of Kubernetes Networking
Francesco Minna and Balakrishnan Chandrasekaran | Vrije Universiteit Amsterdam
Agathe Blaise and Filippo Rebecchi | Thales SIX GTS France
Fabio Massacci | University of Trento and Vrije Universiteit Amsterdam

Container-orchestration software such as Kubernetes makes it easy to deploy and manage modern cloud applications based on microservices. Yet, its network abstractions pave the way for "unexpected attacks" if we approach cloud network security with the same mental model as traditional network security.

Microservices have become the template for cloud-native applications: easy to develop, deploy, debug, scale, and share. When an application is decomposed into independent microservices, ensuring that the services can communicate with one another in a secure way introduces new challenges, particularly when the decomposition results in many services. Even rudimentary cloud applications contain a few tens of microservices, and some of the largest (e.g., the Netflix and Uber platforms) contain hundreds or thousands of microservices, possibly running on several containers. Container-orchestration software such as Kubernetes (K8s)1 provides a simplified interface or model to address these challenges.

At the same time, abstractions make it easy to overlook security threats. For example, (in)secure practices concerning use of the K8s default configuration have been well studied.2,3 Security issues in software-defined networking (SDN) solutions used for managing cloud infrastructure have also been investigated.4,5 Nam et al.6 present an overview of security challenges in container networks and the limitations of common networking plug-ins.

In particular, the security implications of K8s networking components (e.g., how K8s configures connectivity between services and enforces network-security policies) are largely unexplored. Indeed, when we think about networking between microservices, we have a "mental model" of networking derived from physical networks, with switches and interfaces interconnected with physical cables—a model that we show significantly departs from reality.

As a result, when thinking about (cloud) network security, we may picture "digitally unbridgeable moats" that do not really exist. The correct analogy with traditional networking would be that once one is able to escalate within a switching device, one can start laying cables between different devices. The key takeaway is not that K8s is insecure, but that it is insecure to apply the "mental extension" of traditional network-security terminology to a different world.

A Playground for "Unexpected Attacks"
To understand the issues, consider some typical deployment scenarios in which a company wishes to use K8s.

A K8s-single-cluster consists of one master and one or more (i.e., a customizable number of) worker nodes. The applications aimed at the users are deployed on two clusters—"development" and "production"—both of which contain the same set of applications but with different levels of security (typically, more restricted for the production than for the development cluster). A network-security policy separates the nodes.
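The policy separating such environments is itself just a declarative K8s object. As a minimal sketch (the namespace name is an illustrative choice of ours, not taken from the testbed), the following Python builds the manifest for a default-deny-ingress NetworkPolicy, the kind of rule that would restrict a production namespace; it mirrors the YAML a user would feed to kubectl:

```python
# Illustrative sketch: a "default deny all ingress" NetworkPolicy manifest
# for a hypothetical "prod" namespace, built as a plain dictionary.
# Applying the equivalent YAML (e.g., `kubectl apply -f policy.yaml`) only
# has an effect if the cluster's CNI plug-in actually enforces policies.

def default_deny_ingress(namespace: str) -> dict:
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            "podSelector": {},           # empty selector: applies to every pod
            "policyTypes": ["Ingress"],  # no ingress rules listed => deny all
        },
    }

policy = default_deny_ingress("prod")
```

As the rest of the article argues, such an object is only a request: whether any packet is actually dropped depends entirely on the network plug-in.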
A K8s-multicluster setup consists of two clusters composed of one master and at least one worker node each. To add a layer of security, one cluster can be the "development" and the other can be the "production," each deployed in different network subnets not meant to access one another.

A K8s-custom-multicluster is a fully customizable setup, which allows the user to specify both the number of master nodes and worker nodes to be used, where the etcd database containing the cluster information should be deployed (within the master nodes or as an external cluster for high availability), and other segregation information available through Linux namespaces. It is also possible to specify, for each K8s component, the release version to be installed (this setup is suitable to replicate production-like environments).

(Digital Object Identifier 10.1109/MSEC.2021.3094726. Date of current version: 27 July 2021. This work is licensed under a Creative Commons Attribution 4.0 License; see https://creativecommons.org/licenses/by/4.0/deed.ast. September/October 2021; copublished by the IEEE Computer and Reliability Societies.)

Figure 1 provides an overview of the multicluster setup and its main components. Different application services, possibly segregated by security policies at the operating system level, are typically present even in a single-cluster setup:

■ longhorn: providing distributed storage
■ nfs server: providing persistent storage
■ development: three applications—Wordpress (with MariaDB), Nginx, and Guestbook (with a Redis leader, Redis follower, and front end)
■ production: the same applications as for development.

We provide a practical testbed,7 built on Vagrant for reproducibility reasons, where the above scenarios can be replicated through containers and virtual machines (VMs). VMs are created and deployed from a host machine in a private network, not accessible from the internet. VMs can, however, reach the internet via a network address translator (NAT).

[Figure 1 diagram: a development cluster (master node with etcd at VM 172.16.2.10; worker nodes 1-3 at VMs 172.16.2.11-13 on private network 172.16.2.0/24) and a production cluster (master node with etcd at VM 172.16.3.10; worker nodes 1-3 at VMs 172.16.3.11-13 on private network 172.16.3.0/24), both reaching the internet through NAT from the host machine. Testbed namespaces: kube-system (by default: DNS, scheduler, kube-proxy, Calico, etc.), default (by default, for objects created without a namespace), longhorn, nfs-server, dev, and prod, spread across the worker nodes.]

Figure 1. An overview of the multicluster components.

The "Unexpected Threats" Model
In these testbed scenarios, we consider sample attack scenarios from either external or internal attackers. We assume that all attacks start by compromising a pod in some way (the Initial Access tactic of the MITRE ATT&CK framework, as adapted by Microsoft for K8s8).

With the traditional mental model of network security, such attacks should remain confined to the initially compromised pod: network-security measures are in place, and additional exploits would be needed to move around. Yet, exploiting the connectivity of K8s components, the cluster may still be compromised. We will return to these attacks with more precise details in Table 1 after describing the network functionality.

■ FirewallHole (bypassing security barriers of an overlay network): An attacker launches a SYN flood denial-of-service (DoS) attack against a service, bypassing an (apparent) firewall by mimicking the encapsulation of the plug-in in charge of networking that at each node mimics the existence of an overlay network.
■ Hit&Spread [container shell through a remote code execution (RCE) vulnerability in the web application]: An attacker can exploit an RCE in a web application to get a reverse shell on a container and then access sensitive information, laterally move within the cluster, and escalate privileges.
■ Replace&Propagate (supply chain attack through a malicious container image): An attacker can deceive developers into deploying a malicious container, which then contacts a

command-and-control server and hijacks the whole cluster among a set of processes. Specifically, a network namespace
including services running on other containers. is a copy of the network stack, including network interfaces,
routing and firewall rules, which can be assigned to each
process or container. The longhorn, nfs server, dev, and
A Primer on Containers and Kubernetes prod namespaces shown in Figure 1 implement a similar
To explain why these attacks are feasible, some back- resources isolation at a K8s cluster level.
ground material on containers and key components of Deployment and management of containers is typi-
K8s is useful. cally automated with orchestration engines such as K8s,
A container emulates the operating system layers to Docker Swarm, and AWS ECS. In this article we focus only
offer a virtualized and self-contained environment with on K8s, the most widely used orchestration software.9
its own subprocesses and resources. Container isolation An application running on K8s is deployed within
in a typical Linux environment is implemented through a cluster, a set of machines (either virtual or physical)
namespaces, which allow a kernel to partition resources for running containerized applications. As shown in
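The isolation idea behind a network namespace can be sketched with a toy model (plain Python of ours, not a kernel API; the fields are drastic simplifications): each namespace owns its own interfaces, routes, and firewall rules, and every process or container assigned to the same namespace sees the very same copy:

```python
# Toy model of a Linux network namespace: each namespace owns its own
# interfaces, routing table, and firewall rules. This illustrates the
# sharing semantics only; it does not touch real kernel objects.
from dataclasses import dataclass, field

@dataclass
class NetNamespace:
    interfaces: dict = field(default_factory=lambda: {"lo": "127.0.0.1"})
    routes: list = field(default_factory=list)
    firewall_rules: list = field(default_factory=list)

# Two containers placed in the same pod share one namespace object...
pod_netns = NetNamespace()
container_a_view = pod_netns
container_b_view = pod_netns

# ...so a change made through one container is visible to the other.
container_a_view.interfaces["eth0"] = "10.0.1.2"
```

This shared-object view is exactly why, as discussed later, a compromised container immediately gains network access to its pod neighbors.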

Table 1. Details of the example attack scenarios in a K8s cluster.

FirewallHole
Attack scenario:
• The target is a web application (front end, database, and back-end server); firewall policies allow only the back end to send packets to the database, and pods are without the NET_RAW capability (i.e., no source IP address spoofing).
• The attacker wants to run a SYN flood DoS attack on the database by crafting User Datagram Protocol (UDP) packets to mimic Virtual Extensible LAN (VXLAN) encapsulation.
• The attacker, mimicking the Flannel VXLAN encapsulation (i.e., UDP packets with a VXLAN header, destination IP set to the node's IP, and destination port set to 8472—the default VXLAN UDP port), can bypass the firewall and send packets from the front-end pod to the database.
Alternative steps or scenarios:
• Deploying a malicious CNI plug-in, which could allow malicious requests and enable MITM attacks.
• K8s objects dynamically created and located in a CIDR not covered by the firewall.
• CVE-2020-10749, a vulnerability found in affected container networking implementations allowing malicious containers in a cluster to perform MITM attacks.
• A CNI plug-in that does not handle network policies (e.g., Flannel), or network policies not defined by the user.

Hit&Spread
Attack scenario:
• Consider a web application containing a remote code execution (RCE) vulnerability in the code or a third-party dependency that allows obtaining a reverse shell in a container.
• The attacker can communicate with the API server via kubectl with tokens and certificates mounted on the compromised pod.
• The attacker can also perform malicious actions like mounting the host's file system on new containers, accessing other pods in the cluster, asking the API server to modify containers, or intercepting network traffic.
Alternative steps or scenarios:
• Transport Layer Security (TLS) authentication disabled for any component on the master node: API server, controller-manager, scheduler, and etcd server.
• Interaction with the cloud provider: obtaining the node's credentials from the metadata API, gaining K8s authentication tokens from cloud storage buckets, modifying or creating compute instances, and modifying or duplicating storage.
• Exploiting users with a large set of permissions (e.g., for accessing secrets or creating pods or deployments).
• Secrets management: accessing secrets stored as environment variables or in other insecure ways.

Replace&Propagate
Attack scenario:
• Consider the deployment of a malicious image controlled by the attacker and able to open a reverse shell or communicate with a command-and-control server.
• The attacker gets a reverse shell on the malicious container and, similarly as before, can install custom scripts or malicious programs, access other pods and secrets, intercept network traffic, and escape to the node.
Alternative steps or scenarios:
• The attacker can deceive developers into deploying the malicious image either by sharing it on public registries (e.g., Docker Hub) with misleading names (i.e., typosquatting attacks), by gaining access to a repository and directly modifying the source code, or by exploiting a registry vulnerability to hijack the images (e.g., CVE-2019-16097).
• The attacker gains access to the cluster through a compromised container image, which developers can also reuse as a base image for other containers, enlarging the attack surface.
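The FirewallHole row hinges on forging the overlay encapsulation itself, which is easy because a VXLAN header is only 8 bytes (RFC 7348): a flags byte with the I bit set (0x08), reserved fields, and a 24-bit VXLAN network identifier (VNI). A sketch with the Python standard library (the VNI value is illustrative; as the table notes, Flannel listens on UDP port 8472 by default):

```python
import struct

VXLAN_PORT_FLANNEL = 8472  # Flannel's default VXLAN UDP port

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header (RFC 7348): a first 32-bit word with
    the I flag set (0x08000000), then the 24-bit VNI shifted left by 8
    so it occupies the top three bytes of the second word."""
    assert 0 <= vni < 2**24
    return struct.pack("!II", 0x08000000, vni << 8)

header = vxlan_header(1)  # b'\x08\x00\x00\x00\x00\x00\x01\x00'
```

Prepending this header to an inner frame inside a UDP datagram addressed to the node's IP on port 8472 is all the "tunnel" amounts to; there is no cryptographic protection that would let the receiver distinguish forged encapsulation from the plug-in's own.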



As shown in Figure 1, in every cluster there is (at least) one master node and several worker nodes. Master nodes have the task of managing the whole cluster (i.e., K8s objects and worker nodes) and keeping it at the desired state, scheduling the application containers on the worker nodes, which are the computing units. To provide high availability, both the master and worker nodes can be replicated, either physically or virtually (e.g., through VMs).

The main components of a worker node are:

■ pod: the smallest deployable object, containing at least one container; a pod (or the containers running within it) is attacked and initially compromised in all our scenarios
■ kubelet: managing and checking running pods
■ kube-proxy: implementing NAT for new services; this is the component implementing the network policies through iptables rules retrieved from the etcd datastore available on the master node
■ container runtime: the container engine that runs containers.

Instead, the main components of a master node are:

■ API server: the REST API control manager that controls the whole cluster; K8s users can interact with the cluster through kubectl, a command-line tool, or the web dashboard, by sending commands to the API server
■ controller-manager: controller loops on cluster objects
■ scheduler: scheduling pods on worker nodes
■ etcd: a key-value distributed database storing cluster configurations; a faulty network analogy would be a dynamic host configuration protocol server database. More properly, it is an identity database for workers and pods. Should it fail or be compromised, there is no longer a proper distinction between pods, and network policies cannot be retrieved anymore.

By default, the K8s network among worker nodes and pods is completely flat: to provide network segmentation and to restrict the communication between different objects, K8s allows defining network policies. A network policy (which is actually a misnomer, as will become apparent) allows specifying how a pod is allowed to talk to other networking components, such as other pods, services, and so on. Such a policy is not enforced by K8s itself, but by a network plug-in: a container network interface (CNI) aiming to connect a container engine to a network, providing connectivity specifications for the running containers. Kumar and Trivedi10 provide an extensive performance comparison of common CNI plug-ins. By default, all policies are stored in the etcd database and retrieved by the plug-in agent running on each node. How these policies are enforced depends on the plug-in (e.g., through iptables rules or admission controllers). In fact, creating a network policy without a CNI plug-in will have no effect on the cluster traffic.

Contrary to common belief, a CNI plug-in is not a K8s component, it is not bound to K8s in any way, and it does not depend on K8s. In a K8s cluster, a CNI simply acts as middleware between pods and the container engine being used. Specifically, the kubelet contacts the CNI plug-in, providing a JavaScript Object Notation (JSON) config file containing the network specifications that a worker node should use (e.g., the network subnet) with its pods. This has strong implications for the way networking is implemented using CNIs: a security policy enforced by a CNI is only enforced if a K8s component queries the appropriate CNI for policy and interface mappings and does what it is told.

Kubernetes Networking: Bottom-Up
Within a K8s cluster, every CNI plug-in must guarantee the following properties:

■ a container (and pod) can communicate with any other container (and pod) on any worker node without using NAT
■ a worker node can communicate with any pod on any worker node without NAT
■ each pod is assigned a unique IP address across the entire cluster (i.e., an IP-per-pod model).

In this section, we elucidate various communication scenarios between the key K8s entities (shown in Figure 2) and highlight security issues relevant to each scenario.

Container-to-Container Networking
The simplest scenario consists of communication between containers within the same pod, which is represented by the green line in Figure 2. Containers within the same pod share the same network namespace (abbreviated, henceforth, as netns). They share, hence, the same (virtual) network stack (i.e., network interfaces, routing table, and so on), and they can communicate over localhost. Thus, a compromised container has (network) access to the other containers running in the same pod. The CNI plug-in, invoked by the kubelet and in charge of setting up network interfaces, does not disallow (or even monitor) communication over localhost.

Pod-to-Pod Networking
Moving up one layer, pods can "talk" to each other. We distinguish between two cases: two pods communicate within the same worker node (yellow line in Figure 2), or they are on different nodes (purple line in Figure 2).
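Under the hood, the distinction between the two cases reduces to per-node pod subnets tracked by the CNI plug-in. A sketch with Python's ipaddress module (node names and CIDR values are hypothetical, loosely echoing the testbed's addressing) of how that same-node versus cross-node decision looks:

```python
# Illustrative CNI route table: each worker node owns one pod subnet.
# Deciding "same node vs. different node" is a subnet-membership lookup.
import ipaddress

POD_SUBNETS = {  # hypothetical per-node pod CIDRs
    "worker-1": ipaddress.ip_network("10.0.1.0/24"),
    "worker-2": ipaddress.ip_network("10.0.2.0/24"),
}

def node_for_pod(pod_ip: str) -> str:
    ip = ipaddress.ip_address(pod_ip)
    for node, subnet in POD_SUBNETS.items():
        if ip in subnet:
            return node
    raise ValueError(f"no route for {pod_ip}")

def same_node(src_ip: str, dst_ip: str) -> bool:
    # True: traffic crosses only the local virtual bridge (cbr0);
    # False: it must be routed (or encapsulated) toward another node.
    return node_for_pod(src_ip) == node_for_pod(dst_ip)
```

Real plug-ins install the outcome of this lookup as routing rules in each node's netns rather than computing it per packet, but the mapping they maintain is the same.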

[Figure 2 diagram: the master node runs kubectl/API server, etcd, controller-manager, and scheduler; each worker node runs a kubelet, kube-proxy, a container runtime, and a CNI plug-in with DNAT. Pods (e.g., containers on ports 80, 53, and 22 behind "Service 1," with addresses such as 10.0.1.2 and 10.0.1.3) attach through veth pairs to the cbr0 bridge (10.0.1.1) inside their own pod netns; nodes (eth0, 172.16.2.11) are joined by a switch on the 172.16.0.0/16 virtual network, and a client reaches services through a network load balancer. The legend distinguishes control flow, physical and virtual links, and container-to-container, pod-to-pod (same and different nodes), and pod-to-service traffic.]

Figure 2. K8s "real" network and possible interactions between different services in a single cluster. This figure shows a more accurate representation of the actual network architecture of a single K8s-managed cluster from the top part of Figure 1. The key takeaway from a security standpoint is that all isolation boundaries (i.e., the namespace abstractions and network-security policies) are implemented in software; these boundaries disappear as soon as one escapes to the right namespace. For example, if an attacker breaks out of the pod netns of "Service 1" to the surrounding worker-node netns, the cluster is laid bare and, thus, network-security policies no longer apply (see the "A Primer on Containers and Kubernetes" section). Alternatively, a compromised service could ignore the CNI plug-in and "talk" directly to any reachable interface. Even in the presence of some physical boundaries, an attacker could rewire the logical control flow (through the kubelet on top) and provide backdoor access to workers that appear separated.
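The DNAT boxes in Figure 2 are where kube-proxy rewrites a service's virtual IP into a concrete pod IP. A simplified sketch of that rewriting (the service VIP and backend addresses are invented for illustration; the real kube-proxy compiles the equivalent logic into iptables or IPVS rules rather than running Python):

```python
import random

# Hypothetical ClusterIP service: one stable virtual IP in front of
# several pod IPs (the EndpointSlice contents, in K8s terms).
SERVICES = {
    "10.96.0.10": ["10.0.1.2", "10.0.1.3", "10.0.2.4"],
}

def dnat(dst_ip: str) -> str:
    """Rewrite a service VIP to one backend pod IP, picked uniformly at
    random, mimicking kube-proxy's probabilistic DNAT rules."""
    backends = SERVICES.get(dst_ip)
    if backends is None:
        return dst_ip  # not a service address: leave the packet untouched
    return random.choice(backends)
```

Because this translation is just more software in the node's netns, an attacker who reaches that namespace can observe or rewrite the mapping like any other rule.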
Nodes may also not be part of the same subnet (e.g., when nodes are in different datacenters or clouds), in which case they usually use an overlay network. The CNI plug-in tracks which pods are on which subnets, and on which nodes, and updates the routing rules in the network namespace of each node such that pod-to-pod traffic can be forwarded through the right node. Connectivity between nodes is, however, not managed by K8s, and we omit the concerned scenarios, since they are beyond the scope of this article.

Pod-to-pod communication on the same node is implemented via virtual Ethernet devices (veth pairs in Figure 2) and a bridge (cbr0 in the illustration). Therefore, multiple pods running on the same worker node can exchange network packets via the virtual bridge. When a container is compromised, CNI plug-ins using a bridge become vulnerable to common L2 network attacks [such as Address Resolution Protocol (ARP) and Domain Name System (DNS) spoofing]. Other plug-ins, instead of the bridge, use a virtual router in each node or IP-in-IP encapsulation to avoid such problems.

Pod-to-Service Networking
A K8s service is an abstract way to expose an application running on a set of pods. All pods used by an application share a common label that K8s uses for grouping the pods. K8s also uses labels to automatically keep track of newly instantiated pods and maintains a list of pod IP addresses associated with each service in an EndpointSlice resource. K8s supports three different types of services:

■ The ClusterIP service assigns the concerned application a cluster-wide unique virtual IP address, only reachable from within the cluster.
■ The NodePort service assigns the service to a static port on every node in the cluster. It can be accessed from outside the cluster using the node's IP address and the statically assigned port number. K8s also routes requests to NodePort services to a ClusterIP service (to load balance traffic across the pods).
■ In the LoadBalancer case, K8s exposes the service through a cloud provider's load balancer (red line in Figure 2). Requests arriving at the cloud provider's load balancer are subsequently routed to a NodePort service, which in turn routes them to a ClusterIP service.

A Summary of Network-Security Implications
In this section, we highlight a list of network-security issues that may arise within a K8s cluster and that every K8s user and developer should keep in mind.

Pod netns Held by a Pause Container
The pod netns is held by a special container, called a pause container. Every container scheduled on a pod will share the netns with the pause container. Thus, escaping from the pod netns means escaping from the pause container's netns, ending up in the host netns (the pause container is not shown in Figure 2, but the same can be thought of as escaping from the pod square). An attacker who is able to get on the host netns can potentially see network interfaces, routing rules, and other pods' netns; if the attacker has privileged access, the worker-node netns is fully compromised.

CNI Plug-Ins Jeopardy
CNI plug-ins run as (privileged) programs on worker nodes. Subverting these objects automatically results in privileged access to the worker nodes, compromising the whole network. Also, an attacker can compromise the network interfaces or other components of the CNI plug-in itself. Layer 2 plug-ins that use the Linux bridge may be susceptible to man-in-the-middle (MITM) attacks (e.g., ARP spoofing and DNS spoofing); routing daemons of layer 3 plug-ins (e.g., CVE-2021-26928, affecting Border Gateway Protocol) and eBPF (e.g., CVE-2021-31440) may also be vulnerable.

Software Isolation of Resources
By default, the K8s network is flat. K8s isolates resources in this flat architecture through network policies, while also introducing new security implications. Within a cluster, network policies are enforced by the CNI plug-in and not by K8s itself. Subverting the plug-in may result in invalidating all policies. The policies are also usually stored in the CNI plug-in's datastore (e.g., etcd): compromising this database will result in another point of failure.

Network Policies Limitations
K8s base network policies that do not depend on the particular CNI plug-in do not support logs and drop/block options. There is no support for fully qualified domain name filtering in network rules, limiting the security

options available. Furthermore, network policies offer protection for layer 3 network controls between pod IP addresses, but attacks over trusted IP addresses can only be detected with layer 7 network filtering, which requires additional components. Finally, to the best of our knowledge, there is no methodology or tool yet to automatically compare network policies with the business logic of applications, other than manually verifying them.

Multitenant K8s Clusters
CVE-2020-8554, affecting multitenant K8s clusters, is not fully patched in K8s yet. An attacker who has permission to create or edit services and pods can intercept network traffic from other pods or nodes by creating a ClusterIP service with an arbitrary IP address to which traffic is forwarded. Some plug-ins offer a few countermeasures, but an attacker might still be able to succeed.

Dynamic Nature of K8s Objects
It is a common practice to segment a network by assigning subnets and separating them with firewalls in between. This approach cannot cope with the ephemeral nature of K8s objects: network rules based on IP addresses are not very effective, since the IP addresses of resources may keep changing; a classic firewall should at least define rules based on classless interdomain routing (CIDR) ranges and not specific addresses. Referring to Figure 2, a firewall has the same network visibility as the switch: it can be used to monitor ingress traffic, but it has no internal visibility over intrazone traffic.

Virtual Network Infrastructure
As we explained in the "Kubernetes Networking: Bottom-Up" section, the K8s network infrastructure is all virtual (i.e., software-defined via veth pairs or the Linux bridge), with no physical interfaces or cables connecting the different components. The attack surface and security issues of SDN have been widely studied (e.g., Dabbagh et al.4 and Yoon et al.5), but, to the best of our knowledge, similar studies on the K8s network do not exist. As an example, the K8s master node, by default, is not replicated, which makes it a single point of failure, affecting other components, like the API server, the controller-manager, the scheduler, and the etcd database. The database is also not replicated by default: if it becomes unavailable, it may not be possible to retrieve network policies or other settings. An outage or attack on a single-master-node cluster would not stop the cluster from working, but the cluster itself would become unmanageable (i.e., it would be impossible to change configurations or create new objects).

Distributed Tracing Not Embedded
By default, K8s does not allow distributed tracing of resource usage or networking requests. Keeping track of these events, such as network traffic, system calls, and CPU and memory usage, is useful both to identify attacks and to improve the overall performance of the cluster. As an example, sophisticated attacks consist of several steps (e.g., malicious network traffic, CPU overload, and mounting sensitive directories): resource tracing may allow detecting these steps or identifying attack patterns. As of today, correlation of data from different sources remains complex and has to be done with external tools.

No Audit of the Level of Security of Policies
K8s does not automatically audit the security level of policies in a cluster and the potential risks and vulnerabilities that may result from them. In particular, authentication and authorization such as role-based access control (RBAC) and service accounts, secrets management, network policies, pod security policies, general policies handling the use of namespaces, and security options should be analyzed before deploying the cluster and exposing its services to the outside. As an example, sensitive files to audit include the configuration files (/etc/K8s) of both the master and worker nodes and user-defined policies.

Mapping Attacks and Defenses
The ATT&CK (adversarial tactics, techniques, and common knowledge) framework, created by MITRE in 2013, describes common techniques used by attackers to gain access into a system as well as their behavior (e.g., lateral movement and privilege escalation) following the intrusion, based on real-world observations of attacks. MITRE also recently published the D3FEND framework for security defenses. The ATT&CK framework has also been specialized to address security threats relevant to containers.11 Albeit MITRE has not yet released a more specialized K8s-related matrix, most K8s attack techniques can be mapped to the MITRE framework. Indeed, Microsoft published in April 2020 a K8s threat matrix8 based on the structure of MITRE's ATT&CK framework that has been widely adopted to study and secure K8s deployments. Given prior collaborations between Microsoft and MITRE, and the overlap between the Microsoft and the MITRE ATT&CK matrices, we suppose that the Microsoft matrix will be included in the MITRE ATT&CK framework in the near future. We opted, hence, to choose the Microsoft matrix to describe the attack scenarios, as shown in Figure 3. In Table 2, we also summarize the security issues, propose solutions for hardening K8s deployments, and link them to the MITRE D3FEND framework.

In this article, we analyzed the Kubernetes networking infrastructure, highlighting the key low-level abstractions, and offered a glimpse into the security implications


[Figure 3 diagram: the Microsoft K8s threat matrix, with columns for the tactics Initial Access, Execution, Persistence, Privilege Escalation, Defense Evasion, Credential Access, Discovery, Lateral Movement, Collection, and Impact. Techniques include using cloud credentials, compromised images in the registry, kubeconfig file, application vulnerability, and exposed sensitive interfaces (Initial Access); exec into container, bash/cmd inside container, new container, application exploit (RCE), and SSH server running inside container (Execution); backdoor container, writable hostPath mount, Kubernetes cronjob, and malicious admission controller (Persistence); privileged container, cluster-admin binding, hostPath mount, and access cloud resources (Privilege Escalation); clear container logs, delete K8s events, pod/container name similarity, and connect from proxy server (Defense Evasion); list K8s secrets, mount service principal, access container service account, applications credentials in configuration files, access managed identity, and malicious admission controller (Credential Access); access the K8s API server, access kubelet API, network mapping, access Kubernetes dashboard, and instance metadata API (Discovery); access cloud resources, container service account, cluster internal networking, applications credentials in config files, writable volume mounts on the host, CoreDNS poisoning, and ARP poisoning and IP spoofing (Lateral Movement); images from a private registry and sidecar injection (Collection); data destruction, resource hijacking, and denial of service (Impact). The legend highlights the cells touched by Scenarios 1, 2, and 3 from Table 1.]

Figure 3. Mapping the example attack scenarios from Table 1 to the Microsoft K8s threat matrix8 adapted from MITRE's ATT&CK framework.
54
Table 2. Main security challenges and mapping possible solutions in K8s networking to MITRE’s D&FEND Framework.

Domain Goal Key Challenge Tools (Control Knobs) MITRE D&FEND


Protect the traffic through Automate the checking • Built-in K8s Service Accounts—manage authentication and • Message authentication

IEEE Security & Privacy


Traffic security
KUBERNETES NETWORKING

mutual authentication and enforcement of such configuration of containers to access the K8s API server. • Message encryption
both at the control- and practices both at cluster • Built-in K8s Secrets—store encrypted confidential • Disk encryption (e.g., for the
data-planes. and service levels so as information within a nonpublic centralized repository. datastore cluster)
to reduce human errors • (Centralized) Secret management to store and grant access • Certificate analysis
and the time to deploy to secrets.
correct security policies. • Service mesh solutions—use mutual TLS for transport
authentication, simplifying the management of keys,
certificates and configurations.
Domain: Access control
Goal: Enforce fine-grained user- and application-access policies to clusters by providing least-privilege principles.
Key Challenge: Provide administrators the ability to define roles and manage control with greater granularity than is allowed today.
Tools (Control Knobs):
• Built-in K8s RBAC—regulate permissions to access K8s resources (from pods to namespaces).
• Service mesh solutions—include a decentralized authorization framework for communications among containers.
• Identity provider tools—allow centralized management of roles and access policies, including reporting and auditing features.
• Explicit rules inside clusters—restrict access between pods and allow communications only between permitted services.
MITRE D3FEND: Mandatory access control (at network level); authentication event thresholding; authorization event thresholding; resource access pattern analysis.
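As an illustration of the built-in RBAC knob above, this hypothetical Role grants read-only access to pods in a single namespace and binds it to one service account, following the least-privilege principle (namespace and subject names are assumptions):

```yaml
# Role: namespaced, least-privilege permission set (read-only on pods).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: shop                  # illustrative namespace
  name: pod-reader
rules:
  - apiGroups: [""]                # "" denotes the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
# RoleBinding: attaches the Role to a single service account only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: shop
  name: read-pods
subjects:
  - kind: ServiceAccount
    name: monitoring-sa            # illustrative subject
    namespace: shop
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Using a namespaced Role rather than a cluster-wide ClusterRole keeps the blast radius of a compromised subject confined to one namespace.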
Domain: Traffic segmentation
Goal: Achieve defense-in-depth, limiting the lateral movements of a malicious actor.
Key Challenge: Provide, in a declarative way, the expected model of interaction between the deployed services and then enforce its configuration without breaking complex but desired network paths.
Tools (Control Knobs):
• NetworkPolicy API—provides firewall-like declarative rules. It presents high complexity and a few limitations (such as the pod-level granularity of policies).
• Namespaces—isolate the K8s API resource environments from each other. They do not have any effect on the isolation of network and cluster resources.
• Service mesh solutions—work at the application layer and can help to segment traffic, though they cannot inspect the traffic.
• Virtual network rules inside clusters—explicit virtual network rules useful to segment traffic between pods.
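The pod-level granularity of the NetworkPolicy API mentioned above can be seen in a minimal default-deny setup (namespace, labels, and port are illustrative):

```yaml
# Default-deny: selects every pod in the namespace, allows no ingress.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: shop                  # illustrative namespace
spec:
  podSelector: {}                  # empty selector = all pods
  policyTypes: ["Ingress"]
---
# Explicit allow: only pods labeled app=frontend may reach backend pods
# on TCP 8080; rules cannot be more fine-grained than pod and port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: shop
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Note that NetworkPolicy objects are only enforced when the cluster's CNI plugin implements them; on a plugin without policy support they are silently ignored.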

Table 2. Main security challenges and mapping possible solutions in K8s networking to MITRE's D3FEND framework. (continued)

Domain: Network visibility
Goal: Provide situation awareness at runtime to discover anomalous and suspicious behaviors.
Key Challenge: Develop techniques, such as model learning and fingerprinting at runtime, that represent promising yet not fully investigated approaches to identify drift behaviors.
Tools (Control Knobs):
• Built-in K8s audit logger—audit the activities generated by users, by applications using the K8s API, and by the control plane itself.
• Built-in K8s metrics-server—fetch individual container usage statistics.
• Open source monitoring and logging engines—aggregate and process logs and metrics from different sources, providing a centralized repository for correlating events and detecting problems.
• Runtime security engines—detect unexpected application behavior and alert on threats at runtime by monitoring system calls and comparing them against security rules.
MITRE D3FEND: Administrative network activity analysis; connection attempt analysis; DNS traffic analysis; network traffic community deviation; protocol metadata anomaly detection.
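The built-in audit logger in the row above is driven by a policy file passed to the API server via the --audit-policy-file flag. The fragment below is a minimal sketch (the choice of resources and levels is illustrative) that records who accessed Secrets without logging their contents, and full request bodies for pod changes:

```yaml
# Minimal audit policy sketch. Rules are evaluated top to bottom and the
# first matching rule determines the audit level for a request.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log who touched Secrets, but not the secret payloads themselves.
  - level: Metadata
    resources:
      - group: ""                  # core API group
        resources: ["secrets"]
  # Log full request and response bodies for pod mutations.
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods"]
    verbs: ["create", "update", "patch", "delete"]
  # Drop everything else to keep log volume manageable.
  - level: None
```

The resulting timestamped log is also the raw evidence used by the compliance and audit controls later in this table.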
Domain: Automated remediation
Goal: Automated triage and handling of security alerts, incidents, and policy violations.
Key Challenge: Extend deployment automation to respond to security alerts (e.g., intentionally and proactively stopping and then restarting containers).
Tools (Control Knobs):
• Open source monitoring and logging engines—provide alerts and warnings of runtime events.
• Complex event processing—identify and analyze cause-and-effect relationships in real time, connected with the automated actions to be performed.
• Automation engines and modules—define the playbooks managing how security events are handled and connect to K8s nodes.
MITRE D3FEND: Administrative network activity analysis; resource access pattern analysis; system call analysis.
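One narrow but fully built-in form of the "stop then restart" remediation mentioned above can be expressed with a liveness probe; the health endpoint, image, and thresholds below are illustrative assumptions:

```yaml
# The kubelet kills and restarts the container once the probe fails
# failureThreshold times in a row: a declarative stop-then-restart loop.
apiVersion: v1
kind: Pod
metadata:
  name: payment-service            # illustrative name
spec:
  restartPolicy: Always            # restart the container after it is killed
  containers:
    - name: app
      image: registry.example.com/payment:1.0   # illustrative image
      livenessProbe:
        httpGet:
          path: /healthz           # illustrative health endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 5
        failureThreshold: 3
```

Richer remediation (quarantining a pod, rotating credentials, reapplying policies) still requires the external automation engines and playbooks listed in the table.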
Domain: Compliance & audit
Goal: Follow industry best practices, standards, and internal security policies.
Key Challenge: Extract evidence to demonstrate compliance and risk indicators, and efficiently automate such checks and the evidence-collection process.
Tools (Control Knobs):
• Benchmarks, best-practice recommendations, and certification schemes—a set of recommendations for configuring K8s in a secure way, including the certification of the cybersecurity of deployed services.
• K8s audit logs—provide timestamped evidence on the state of the K8s cluster and related user interactions.
• Open source monitoring and logging engines—provide indicators on runtime events.
MITRE D3FEND: Software update; disk encryption (e.g., for the datastore cluster); administrative network activity analysis; resource access pattern analysis; system call analysis.

of these abstractions. Understanding the design choices in implementing these abstractions7 as well as their ramifications for security is a key first step toward securing a K8s (or any container-based) platform. We present a number of open challenges for the security community and hope that this article spurs the community to address them.

Acknowledgments
We thank the reviewers and IEEE Security & Privacy's Editor in Chief Sean Peisert for their comments that greatly helped to improve this article. Any remaining error is our fault. This work has received funding from the European Union under the H2020 grant 952647 (AssureMOSS).

References
1. Kubernetes. Accessed: June 15, 2021. [Online]. Available: https://kubernetes.io/
2. M. S. Islam Shamim, F. Ahamed Bhuiyan, and A. Rahman, "XI Commandments of Kubernetes security: A systematization of knowledge related to Kubernetes security practices," in Proc. IEEE Secure Development (SecDev 2020), 2020, pp. 58–64.
3. D. D'Silva and D. D. Ambawade, "Building a zero trust architecture using Kubernetes," in Proc. 6th Int. Conf. Convergence Technol. (I2CT), 2021, pp. 1–8. doi: 10.1109/I2CT51068.2021.9418203.
4. M. Dabbagh, B. Hamdaoui, G. Mohsen, and R. Ammar, "Software-defined networking security: Pros and cons," IEEE Commun. Mag., vol. 53, no. 6, pp. 48–54, 2015. doi: 10.1109/MCOM.2015.7120048.
5. C. Yoon et al., "Flow wars: Systemizing the attack surface and defenses in software-defined networks," IEEE/ACM Trans. Netw., vol. 25, no. 6, pp. 3514–3530, 2017. doi: 10.1109/TNET.2017.2748159.
6. J. Nam, S. Lee, H. Seo, P. Porras, V. Yegneswaran, and S. Shin, "BASTION: A security enforcement network stack for container networks," in Proc. USENIX Annu. Tech. Conf. (ATC 2020), 2020, pp. 81–95.
7. AssureMOSS Kubernetes Security Testbed. Accessed: July 19, 2021. [Online]. Available: https://github.com/assuremoss/Kubernetes-testbed
8. "Secure containerized environments with updated threat matrix for Kubernetes," Microsoft, Mar. 23, 2021. Accessed: June 15, 2021. [Online]. Available: https://www.microsoft.com/security/blog/2021/03/23/secure-containerized-environments-with-updated-threat-matrix-for-kubernetes/
9. "Kubernetes documentation," Kubernetes. Accessed: June 15, 2021. [Online]. Available: https://kubernetes.io/docs/home/
10. R. Kumar and M. C. Trivedi, "Networking analysis and performance comparison of Kubernetes CNI plugins," in Proc. ICAS 2019 Adv. Intell. Syst. Comput., vol. 1158, pp. 99–109, 2021.
11. "Containers matrix," MITRE. Accessed: June 28, 2021. [Online]. Available: https://attack.mitre.org/matrices/enterprise/containers/

Francesco Minna is a Ph.D. candidate at Vrije Universiteit Amsterdam, 1081 HV, The Netherlands. His research interests include cloud security and dynamic risk analysis for free and open source software. Minna received a double master's in cybersecurity from the University of Trento and the University of Rennes. Contact him at f.minna@vu.nl.

Balakrishnan Chandrasekaran is a tenure-track assistant professor at the Vrije Universiteit Amsterdam, 1081 HV, The Netherlands. His research focuses on the performance and security aspects of networked systems. Chandrasekaran received a Ph.D. from Duke University. Contact him at b.chandrasekaran@vu.nl.

Agathe Blaise is a research engineer at Thales, Gennevilliers, 92230, France. Her research interests focus on cybersecurity, data analysis applied to networks, and programmable networks. Blaise received a Ph.D. in computer science from Sorbonne University. Contact her at agathe.blaise@thalesgroup.com.

Filippo Rebecchi is a research engineer at Thales, Gennevilliers, 92230, France. His current research interests are in software-defined networking, cybersecurity, and next-generation mobile networking. Rebecchi received a Ph.D. from Pierre & Marie Curie University. Contact him at filippo.rebecchi@thalesgroup.com.

Fabio Massacci is a professor at the University of Trento, Trento, 38123, Italy, and Vrije Universiteit, Amsterdam, 1081 HV, The Netherlands. Massacci received a Ph.D. in computing from the University of Rome "La Sapienza." He received the IEEE Requirements Engineering Conference Ten Year Most Influential Paper Award on security in sociotechnical systems. He participates in the FIRST special interest group on the Common Vulnerability Scoring System and the European pilot CyberSec4Europe on the governance of cybersecurity. He coordinates the European AssureMOSS project. He is a Member of IEEE. Contact him at fabio.massacci@ieee.org.

56 IEEE Security & Privacy September/October 2021
