Architecture Analysis, Hands-On Evaluation / Demo
Cilium Multi-Cluster Networking & Service Mesh
Sanjeev Rampal
1
● Cilium Overview
● Cilium Cluster Mesh
● Cilium Service Mesh
● Some aspects of other Cilium features (eBPF data path, load balancing
optimizations, policy)
● Relation to other K8s community/ RH projects
● Demos
● Summary/ Takeaways
Agenda
What we will discuss today
2
Cilium Overview
3
Cilium Overview & Architecture
Source:
Cilium.io & other Cilium/ Isovalent material
Entirely based on kernel networking + a Cilium eBPF data plane for added performance & functionality
Implements (an illustrative configuration sketch follows this slide):
● Kubernetes CNI with full-featured IPv4/IPv6 support on Linux & Windows, overlay & BGP modes (i.e. direct-routing vs tunneling vs hybrid modes)
● Kubernetes Network Policy + Cilium Network Policy (advanced L4/L7 policies, global network policy)
● K8s E-W service load balancing (ClusterIP)
● K8s N-S load balancing (NodePort, LoadBalancer, Ingress, Gateway API resources)
● Cilium host firewall, egress gateway, kube-proxy replacement
● Cilium Service Mesh (L7 + L4 traffic management, mTLS; Istio-like but without sidecar proxies)
● Multi-cluster support (Cluster Mesh, service mesh, multi-cluster policy, multi-cluster load balancing)
Cilium Overall Overview
4
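Many of the features above are toggled at install time. A minimal, illustrative Helm values sketch for the Cilium chart (key names are assumptions based on recent releases, not taken from the slides; verify against the chart version in use):

# Illustrative values for the Cilium Helm chart (assumed keys, not authoritative)
kubeProxyReplacement: strict      # eBPF service load balancing, no kube-proxy
tunnel: vxlan                     # tunneling mode; "disabled" for direct/BGP routing
ingressController:
  enabled: true                   # Cilium Ingress support
gatewayAPI:
  enabled: true                   # Gateway API implementation
encryption:
  enabled: true
  type: wireguard                 # or "ipsec"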
Cilium Datapath - Pod-to-pod case
Diagram: Cilium.io (eBPF tc-switching datapath vs eBPF socket-switching datapath)
Ref: Cilium data path
5
Diagram: Borkmann, Isovalent (two nodes with redis and nginx pods behind lxc0/eth0; XDP/BPF, tc/BPF and sock/BPF hook points)
Cilium E-W and N-S LB w/o kube-proxy
North-South (external traffic to svc IP:port):
- Backends can be local or remote
- Performs DNAT, and DSR/SNAT/Hybrid when remote
- Same code compilable for XDP and tc/BPF
- Hairpin to remote backends at the XDP layer; local backends handled via tc ingress
East-West (internal traffic to svc IP:port):
- Backends can be local or remote
- No packet-based NAT needed due to the connect(), sendmsg(), recvmsg() hooks
- No intermediate hops as with kube-proxy
- Exposes services to all local addresses and loopback 127.0.0.1/::1
- Blocks other applications from port reuse in the post-bind() hook
Main principle: operating as close as possible to the socket for E-W and as close as possible to the driver for N-S.
6
Cilium Multi-Cluster Mesh
7
● Multi-cluster networking analogous to “Submariner Mesh” or the “Kubernetes Multi-Cluster Services API”, but with significant differences
● Requires Pod IP and Service IP uniqueness and direct routability (no NAT) across the mesh
● This is not Kubernetes Federation: clusters are still separately provisioned, but with coupled networking; up to 256 clusters (possibly more in future) in a cluster mesh (a minimal configuration sketch follows this slide)
● Separate control plane/etcd for cross-cluster information sharing (e.g. pod IPs)
● Multi-cluster policy and identity at this layer, multi-cluster load balancing (N-S, E-W)
● Usable for multi-cluster with or without Cilium Service Mesh
● Encryption options: IPSec and WireGuard differences (per-node tunnel vs per-worker)
● Relation to the K8s MCS API, Submariner and other community projects; compare the MCS API's two-resource model (ClusterSetIPs/ClusterIPs) vs Cilium's single service + global annotation
● Note: the recently announced Cilium Mesh builds on this further
Cilium Cluster Mesh
8
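As referenced in the list above, a minimal per-cluster configuration sketch for Cluster Mesh (Helm values; key names are assumptions based on recent Cilium charts, and cluster names, IDs and CIDRs are hypothetical):

# Illustrative per-cluster values for Cilium Cluster Mesh (assumed keys)
cluster:
  name: cluster1          # unique name per cluster in the mesh
  id: 1                   # unique id (1-255) per cluster
clustermesh:
  useAPIServer: true      # deploy the clustermesh-apiserver (mesh control plane + etcd)
ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList:
      - "10.1.0.0/16"     # non-overlapping pod CIDR per cluster (no NAT across the mesh)
encryption:
  enabled: true
  type: wireguard         # or "ipsec"

With unique cluster names/IDs and non-overlapping pod CIDRs in place, the clusters are joined using the cilium CLI (cilium clustermesh enable / connect, as shown on the demo topology slide).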
Cilium Multi-Cluster Mesh - Control plane
● 2 or more (up to 256) independently provisioned k8s clusters, all running Cilium CNI, coupled in a “cluster mesh” (akin to a “submariner mesh”)
● MCM control plane: a separate control plane with its own etcd datastore for the multi-cluster mesh itself, running as pods within the k8s clusters
● The Cilium operator mirrors global k8s services, associated endpoints and related network policy info into the MCM etcd
● A k8s Service is marked “Global” explicitly via Cilium annotations
○ Example: service.cilium.io/global: "true"
Diagram: Cilium.io
9
Cluster Mesh Architecture
10
Diagram:
Cilium.io
Multi-Cluster Services & Network Policies
○ Relevant annotations (a Service manifest sketch follows this slide):
■ service.cilium.io/global: "true" (/ “false”) Mark this local service as a “Global” (or not)
■ service.cilium.io/shared: "true" (/ “false”) Mark this local service as “Shared” (or not) within Global
■ service.cilium.io/affinity: "local|remote|none" Global service endpoint load balancing affinity/ preference
○ Note: Global services also have to adhere to namespace sameness rules
○ Multi-Cluster Network Policies
■ Exactly same API and implementation as single cluster network policies (both K8s network policy and
Cilium proprietary L4 and L7 Network policies)
■ Network policy labels/selectors are reflected in the multi-cluster mesh control plane, so they can have global significance (plus optional per-cluster qualification within policy selectors)
11
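As noted above, a sketch of a Service carrying these annotations (service name, namespace, labels and ports are hypothetical; the annotation keys are the ones listed on this slide):

apiVersion: v1
kind: Service
metadata:
  name: rebel-base                         # hypothetical; must use the same name/namespace in every cluster
  namespace: default
  annotations:
    service.cilium.io/global: "true"       # expose as a global (multi-cluster) service
    service.cilium.io/shared: "true"       # share this cluster's backends with the mesh
    service.cilium.io/affinity: "local"    # prefer local endpoints, fail over to remote
spec:
  selector:
    name: rebel-base
  ports:
    - port: 80
      targetPort: 80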
Multi-Cluster Network Policy Example
# Sample Cilium multi-cluster network policy augmented with cluster selectors
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "allow-cross-cluster"
spec:
  description: "Allow x-wing in cluster1 to contact rebel-base in cluster2"
  endpointSelector:
    matchLabels:
      name: x-wing
      io.cilium.k8s.policy.cluster: cluster1
  egress:
    - toEndpoints:
        - matchLabels:
            name: rebel-base
            io.cilium.k8s.policy.cluster: cluster2
12
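The policy above is the egress side, applied in cluster1. If cluster2 enforces ingress policy on rebel-base, a matching ingress rule would also be needed there; a sketch of such a counterpart (this manifest is an assumption, not shown on the slides):

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "allow-from-cluster1"
spec:
  description: "Allow rebel-base in cluster2 to accept traffic from x-wing in cluster1"
  endpointSelector:
    matchLabels:
      name: rebel-base
      io.cilium.k8s.policy.cluster: cluster2
  ingress:
    - fromEndpoints:
        - matchLabels:
            name: x-wing
            io.cilium.k8s.policy.cluster: cluster1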
Cilium Cluster Mesh Demos
13
Demo Topology
Diagram: two clusters (cluster-id 1 and cluster-id 2) joined in a clustermesh, each hosting backends for services S1 and S2.

cilium install --cluster-id 1 …
cilium clustermesh enable
cilium clustermesh connect --context c1 --destination-context c2

Global service annotations (a manifest sketch follows this slide):
io.cilium/global-service="true"
io.cilium/shared-service="false"
io.cilium/service-affinity=local

Example demo application: S1 & S2 are each a global (multi-cluster) service with 2 backend pods in each cluster of the clustermesh.
14
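For reference, a sketch of how the demo's S1 service might be declared in each cluster using the legacy annotation form shown above (service and label names are hypothetical; newer releases use the service.cilium.io/* keys from the earlier slide):

apiVersion: v1
kind: Service
metadata:
  name: s1                                 # hypothetical name for demo service S1
  annotations:
    io.cilium/global-service: "true"       # member of the global S1 service
    io.cilium/shared-service: "false"      # do not share this cluster's backends with remote clusters
    io.cilium/service-affinity: "local"    # prefer local backends when healthy
spec:
  selector:
    app: s1
  ports:
    - port: 80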
Demos
● Demo 1:
○ Cilium ClusterMesh intro and setup
○ Link to demo recording
● Demo 2:
○ Multi-cluster E-W Services & Load balancing
○ Multi-cluster network policy
○ Link to demo recording
● Demo 3:
○ N-S load balancing, Gateway API
○ Single and multi-cluster
○ Link to demo recording
15
Demo Topology
Diagram: two clusters (cluster-id 1 and cluster-id 2) in a clustermesh, with a gateway (GTWY) in front of services S1 and S2.
N-S load balancing using Cilium Ingress or the Cilium Gateway API; this becomes multi-cluster ingress when combined with Cilium ClusterMesh. Multiple modes are possible (the demo topology shows just one mode; a resource sketch follows this slide).
16
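As mentioned above, a sketch of the Gateway API resources involved in this mode (assuming Cilium's Gateway API support is enabled and its GatewayClass is named "cilium"; the route path and service names are hypothetical):

apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: demo-gateway
spec:
  gatewayClassName: cilium            # Cilium-provided GatewayClass (assumed name)
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: s1-route
spec:
  parentRefs:
    - name: demo-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /s1
      backendRefs:
        - name: s1                    # global service; with ClusterMesh its backends may live in either cluster
          port: 80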
Background: Multi-Cluster Ingress LB modes/scenarios
Diagram: three scenarios composed of per-cluster services (Svc 1, Svc 2, Svc 3), K8s Gateway API gateways (GW) and E-W gateways:
- Single gateway, on-cluster LB, multi-network
- Multi-gateway, on-cluster LB, single-network
- Single gateway, off-cluster (e.g. public cloud) LB, single-network, external GLB class
Multi-cluster services can be combined with BGP, DNS and public cloud anycast to yield a variety of multi-cluster L4 and L7 ingress solutions for various use cases, including RH Hybrid Cloud Gateway.
Related refs: RH-ET blog post on this topic
Diagram: Borkmann, Isovalent. Cilium E-W and N-S LB w/o kube-proxy (repeat of slide 6, for reference).
Additional refs: Cilium kube-proxy replacement and related enhancements
18
eBPF intercepts for NodePort service
Diagram: kernel receive path (Rx buffer, XDP, alloc_skb, TC ingress, raw/conntrack/mangle/nat prerouting, socket lookup, routing, forward/input chains, mangle/nat postrouting, delivery via veth into the container namespace and socket Rx buffer/app), annotated with the points where eBPF intercepts NodePort service traffic.
Diagram: NodePort example. External clients a.a.a.a and b.b.b.b reach NodePort service 172.18.0.2:31000, which load-balances to backends 10.1.2.2:80 and 10.1.2.4:80 behind nodes 172.18.0.2 and 172.18.0.3 (port 31000 open on each node).
Cilium N-S enhancements (an illustrative values sketch follows):
- Direct Server Return (DSR) and Hybrid modes (in addition to SNAT mode)
- Source IP preservation
- XDP acceleration
- 4-to-6 NAT, Maglev hashing
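These enhancements map to load-balancer settings in Cilium's kube-proxy replacement; an illustrative Helm values sketch (key names are assumptions based on recent Cilium charts, not taken from the slides):

# Illustrative N-S load-balancer tuning (assumed keys; verify against the release in use)
kubeProxyReplacement: strict
loadBalancer:
  mode: dsr              # "snat" (default), "dsr" or "hybrid"
  algorithm: maglev      # Maglev consistent hashing for backend selection
  acceleration: native   # XDP acceleration on supported NIC drivers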
Cilium Service Mesh
21
● Option 1: Use Cilium only for L3/L4 networking; use Istio control and data planes for the L7 service mesh
● Option 2: Preferred long-term direction, but not fully ready yet
○ Use Cilium as a single solution for all Kubernetes networking, including
■ CNI plugin
■ Multi-cluster networking
■ L4 and L7 service mesh
■ All networking functions incl. load balancing (N-S and E-W), network policy, ingress and egress, Gateway API implementation
■ 1) Data plane: Cilium. 2) Control plane: K8s native (Gateway API) + Envoy config CRD + Cilium APIs
● CSM uses a “sidecar-less” model, in contrast with the Istio/Linkerd per-pod sidecar model
● The Istio community is developing Istio's “Ambient mode” in response to CSM's sidecar-less mode
● Side note: the state of multi-cluster in upstream native K8s APIs (independent of specific service meshes) is incomplete; early draft proposals exist to initiate standards
Cilium Service Mesh (CSM)
22
Cilium’s Design Philosophy for Service Mesh
● A single networking plugin can serve all networking needs (basic CNI, service load
balancing, network policy, ingress, multi-cluster networking & service mesh
functions at both L4 and L7 layers). This results in a better integrated
architecture, improved user experience and lower resource consumption and
control plane complexity than using multiple separate projects to serve as CNI
plugin, service mesh plugin, ingress, gateway API plugin, multi-cluster networking
plugin etc.
○ Cilium already has both L4 and L7 network policy and load balancing for its CNI plugin; reuse it for the service mesh and augment where needed, rather than creating separate functions
○ Cilium already has L4 traffic encryption & zero-trust networking functions; reuse them for the service mesh
○ Extensions to Kubernetes APIs like the Gateway API are already beginning to address service mesh functions without the need for mesh-specific APIs like those of Istio & Linkerd
○ Service meshes and gateways are moving into the kernel and infra layers in any case (e.g. Istio Ambient)
23
Diagram: Liz Rice: Learning eBPF
● Conventional sidecar-based model => poor latency due to suboptimal packet processing with multiple user-space <-> kernel-space context switches
● Lowered reliability due to the disconnect between sidecar proxy readiness and app readiness, and server-side-initiated connections
● High resource consumption (a proxy per pod adds up)
Issues with Sidecar proxies
24
L4 + L7 data paths
Diagram: cilium.io (L4 data path vs L7 data path)
25
eBPF vs Proxy function split
26
Diagram: cilium.io
Diagram: services S1/S2 across nodes, a gateway (GTWY), per-node Envoy inside the Cilium agent, and an IPSec tunnel carrying L7 mesh traffic.
● One Cilium-agent Envoy per node, used as the L7 proxy (N-S and E-W service load balancing & policy)
● Cilium kernel eBPF used for L4 pod and service load balancing and policy
● A single Envoy instance wrapped inside the Cilium agent for all L7 functions
● Special mTLS + optional IPSec & WireGuard “on the wire”
27
Diagram: cilium.io
Cilium's Alternate Architecture for Service Mesh mutual-TLS
● Conventional session-based TLS: usable only by applications running over TCP and HTTP; fine-grained sessions, but with a performance impact
● Network-based encrypted tunnels (IPSec, WireGuard etc.): usable by any app protocol; coarse-grained, higher performance
● Cilium mTLS: combines the two (Cilium-proprietary solution)
● Not fully GA yet (Cilium 1.14?)
28
Cilium Service Mesh Demos
29
Demo
● Cilium Service Mesh
○ L4 Cilium mesh proxy with IPSec encryption
○ <<< Demo recording Link to be added here>>>
○ L7 Cilium proxy without IPSec encryption
30
Diagram: services S1/S2, Cilium L4 eBPF data path, IPSec tunnel carrying L4 mesh traffic.
● Current release (1.13.x): IPSec encryption limited to the L4 mesh traffic proxy, policy & load balancing
31
Cilium Service Mesh
Diagram: services S1/S2, Cilium L7 proxy (Envoy) carrying L7 service mesh traffic.
● In the current release (1.13.x), Cilium L7 proxy functions are supported without IPSec/encryption support
● L7 traffic management via the EnvoyConfig CRD; in future via Gateway API & extensions
Demo Topology: Google gRPC microservices demo app
32
Extra: Notes on some additional Cilium Features
33
L7 Network Policy examples
● Cilium CNI already supports L4 and L7 network policy
● Hence L7 network policy can also be used for the service mesh (a policy sketch follows this slide)
● HTTP(S), gRPC, Cassandra, Redis, TLS SNI
34
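As referenced above, a sketch of an L7-aware CiliumNetworkPolicy using an HTTP rule (labels, port and path are hypothetical):

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "l7-allow-get-only"
spec:
  endpointSelector:
    matchLabels:
      app: my-api                   # hypothetical backend workload
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend           # hypothetical client workload
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/public/.*"  # only GETs on this path prefix are allowed at L7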
Big TCP support
35
Diagram: cilium.io
“Meta” devices
36
Diagram: cilium.io
BBR Algorithm for advanced congestion control
37
Diagram: cilium.io
Thank you
Twitter: @sr2357
Github: @srampal
38
