Cloud DC 2.3: EVPN-VXLAN for QFX10K
QFX10K
Prabakaran A
Kumaraguru Radhakrishnan
• ECMP
• ESI
• MAC+IP Route
• VMTO
• EVPN-VXLAN Type5
• EVPN-VXLAN Troubleshooting
• Virtual Extensible LAN (VXLAN) defines a 24-bit LAN segment identifier, the
VXLAN Network Identifier (VNI), which provides segmentation at cloud scale
(about 16 million segments).
• VXLAN also enables migration of virtual machines between servers across
Layer 3 networks.
• VXLAN provides an architecture that customers can use to expand their cloud
deployments with repeatable pods in different Layer 2 domains.
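[Figure: VXLAN packet format; field widths in bits]
Outer MAC header: DEST MAC (48), SRC MAC (48), VLAN (32, optional), ETH TYPE 0x0800 (16)
Outer IP header: IP HDR DATA (72), PROTO: UDP (8), HDR CKSUM (16), SRC IP: MY VTEP (32), DST IP: DEST VTEP (32)
UDP header: SOURCE PORT (16), VXLAN PORT (16), UDP LENGTH (16), CHKSUM 0x0000 (16)
VXLAN header: FLAGS RRRR1RRR (8), RESERVED (24), VNI (24), RESERVED (8)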
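The 8-byte VXLAN header in the figure can be sketched in a few lines of Python. This is only an illustration of the bit layout (8 flag bits, 24 reserved bits, 24-bit VNI, 8 reserved bits); the function names `pack_vxlan_header` and `unpack_vni` are hypothetical and are not part of any Juniper tooling.

```python
import struct

# The single set bit in the RRRR1RRR flags octet marks the VNI as valid.
VXLAN_FLAG_VNI_VALID = 0x08


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VNI (illustrative helper)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


header = pack_vxlan_header(100)
print(header.hex())        # 0800000000006400
print(unpack_vni(header))  # 100
print(2**24)               # 16777216 possible VNIs
```

The last line shows where the "cloud scale" figure comes from: a 24-bit identifier allows 16,777,216 segments, compared with 4,096 for a 12-bit VLAN ID.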
• Controller Based
1. Contrail with EVPN
2. Contrail with OVSDB
3. VMware NSX
• An EVPN control plane allows more scalable VXLAN overlay network designs
suitable for private and public clouds.
• The MP-BGP EVPN control plane introduces a set of features that reduce or
eliminate traffic flooding in the overlay network and enable optimal forwarding
for both east-west and north-south traffic.
[Figure: ARP resolution over the VXLAN overlay. Spine-1 and Spine-2 host VRF-1
with IRB.1 (1.0.1.253, VMAC1) and IRB.2 (2.0.1.253, VMAC2). Leaf-1, Leaf-2, and
Leaf-3 are the VTEPs for VM1 (M1, IP1 1.0.1.1, VLAN 10), VM2 (M2, IP2 1.0.1.2,
VLAN 10), and VM3 (M3, IP3 2.0.1.1, VLAN 20); the VNIDs shown are 100 and 200.
ARP request: inner L2 ARP (EtherType 0x0806, destination FF:FF:FF:FF:FF:FF,
TIP 1.0.1.253) from M1, carried in VXLAN (VNI=100) over UDP with outer
DMAC = S1MAC, SMAC = L1MAC2, SIP = L1IP, DIP = S1IP.
ARP response: inner L2 ARP response carried in VXLAN (VNI=100) over UDP with
outer SIP = S1IP, DIP = L1IP.
MAC tables (Vlan, MAC, Interface) shown in the figure:
Leaf-1: bd1 VMAC1 esi.xxxx; bd2 VMAC2 esi.xxxx; bd1 M1 xe-0/0/0
Remote leaf: bd1 VMAC1 esi.xxxx; bd2 VMAC2 esi.xxxx; bd1 M1 vtep.xxx
Spines: bd1 M1 vtep.zzz; bd1 M1 vtep.yyyy]
[Figure: test topology.
Spine layer: four QFX10002 switches, Lo0:100.0.0.11, Lo0:100.0.0.12,
Lo0:100.0.0.13, Lo0:100.0.0.14; IRB-vlan100: 10.1.100.1, IRB-vlan101:
10.1.101.1, IRB-vlan108: 10.1.108.1.
Leaf layer: four QFX5100 switches, Lo0:100.0.0.21, Lo0:100.0.0.22,
Lo0:100.0.0.23, Lo0:100.0.0.24, plus a QFX5100 acting as an L2 switch.
Test ports: STC-8/3, STC-1/3, STC-8/4, STC-1/4, STC-8/5, STC-4/6, STC-8/6,
STC-4/8.]
EBGP Underlay

[Figure: three-layer IP Clos with one AS per device, grouped into POD1 and POD2.
Fabric layer: Fabric-1 AS-60001 Lo0:100.0.0.1, Fabric-2 AS-60002 Lo0:100.0.0.2,
Fabric-3 AS-60003 Lo0:100.0.0.3, Fabric-4 AS-60004 Lo0:100.0.0.4
Spine layer: Spine-1 AS-60011 Lo0:100.0.0.11, Spine-2 AS-60012 Lo0:100.0.0.12,
Spine-3 AS-60013 Lo0:100.0.0.13, Spine-4 AS-60014 Lo0:100.0.0.14
Leaf layer: LEAF-1 AS-60021 Lo0:100.0.0.21, LEAF-2 AS-60022 Lo0:100.0.0.22,
LEAF-3 AS-60023 Lo0:100.0.0.23, LEAF-4 AS-60024 Lo0:100.0.0.24]

Spine policy (the route filter matches the Spine-1 loopback):
set policy-options policy-statement bgp-ipclos-in term loopbacks then accept
set policy-options policy-statement bgp-ipclos-out term loopback from protocol direct
set policy-options policy-statement bgp-ipclos-out term loopback from route-filter 100.0.0.11/32 orlonger
set policy-options policy-statement bgp-ipclos-out term loopback then community add MYCOMMUNITY
set policy-options policy-statement bgp-ipclos-out term loopback then next-hop self
set policy-options policy-statement bgp-ipclos-out term loopback then accept
set policy-options policy-statement bgp-ipclos-out term as-path from as-path asPathLength2
set policy-options policy-statement bgp-ipclos-out term as-path from community MYCOMMUNITY
set policy-options policy-statement bgp-ipclos-out term as-path then reject
set policy-options community MYCOMMUNITY members target:12345:111
set policy-options as-path asPathLength2 ".{2,}"

Leaf policy (the route filter matches the LEAF-1 loopback):
set policy-options policy-statement bgp-ipclos-in term loopbacks from route-filter 100.0.0.0/16 orlonger
set policy-options policy-statement bgp-ipclos-in term loopbacks then accept
set policy-options policy-statement bgp-ipclos-out term loopback from protocol direct
set policy-options policy-statement bgp-ipclos-out term loopback from route-filter 100.0.0.21/32 exact
set policy-options policy-statement bgp-ipclos-out term loopback then next-hop self
set policy-options policy-statement bgp-ipclos-out term loopback then accept
set policy-options policy-statement bgp-ipclos-out term reject then reject
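To make the intent of the spine export policy easier to follow, here is a rough Python model of how bgp-ipclos-out evaluates a route on Spine-1. It is a simplification, not Junos behavior in full: the `Route` class, the `spine_export` function, and the "default" return value (standing for fall-through to the protocol's default policy) are all illustrative assumptions. The one Junos detail it relies on is that in an AS-path regular expression "." stands for one AS number, so ".{2,}" matches any path with two or more ASNs.

```python
from dataclasses import dataclass, field
from ipaddress import ip_network

MYCOMMUNITY = "target:12345:111"


@dataclass
class Route:
    prefix: str
    protocol: str  # "direct" or "bgp"
    as_path: list = field(default_factory=list)
    communities: set = field(default_factory=set)


def matches_as_path_length2(as_path):
    # Junos AS-path regex ".{2,}": two or more AS numbers in the path.
    return len(as_path) >= 2


def spine_export(route, loopback="100.0.0.11/32"):
    """Simplified model of bgp-ipclos-out; returns the terminating action."""
    net = ip_network(route.prefix)
    # term loopback: direct route covered by the loopback route-filter (orlonger).
    if route.protocol == "direct" and net.subnet_of(ip_network(loopback)):
        route.communities.add(MYCOMMUNITY)
        return "accept"
    # term as-path: tagged route that already carries two or more ASNs.
    if matches_as_path_length2(route.as_path) and MYCOMMUNITY in route.communities:
        return "reject"
    return "default"  # no term matched; the default BGP policy applies


own = Route("100.0.0.11/32", "direct")
tagged = Route("100.0.0.12/32", "bgp", [60021, 60012], {MYCOMMUNITY})
leaf = Route("100.0.0.21/32", "bgp", [60021])
print(spine_export(own))     # accept
print(spine_export(tagged))  # reject
print(spine_export(leaf))    # default
```

In this model a spine tags its own loopback with MYCOMMUNITY when advertising it, and refuses to re-advertise any tagged route that has already crossed two or more ASes.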
[Figure: EBGP underlay topology with POD1 and POD2 (LEAF-1 AS-60021, LEAF-2
AS-60022, LEAF-3 AS-60023, LEAF-4 AS-60024, TOR, STC test ports), with traffic
flows 1.1, 1.2, and 1.3 marked.]
1. Traffic Flow
1.1 Hosts on same leaf
1.2 Hosts on different leaf but same POD
1.3 Hosts on different POD
Copyright © 2014 Juniper Networks, Inc.
Inter VLAN Traffic - Type2 Tunnel
[Figure: Fabric-1 to Fabric-4, spines, and leaves in POD1 and POD2 (LEAF-1
AS-60021, LEAF-2 AS-60022, LEAF-3 AS-60023, LEAF-4 AS-60024, TOR, STC test
ports), with traffic flows 1.1, 1.2, and 1.3 marked.]
1. Traffic Flow
1.1 Inter VLAN on same leaf
1.2 Inter VLAN on different leaf but same POD
1.3 Inter VLAN on different POD
ESI

[Figure: the EBGP underlay topology (Fabric-1 to Fabric-4; Spine-1 AS-60011 to
Spine-4 AS-60014; LEAF-1 AS-60021 to LEAF-4 AS-60024; POD1 and POD2), with the
TOR dual-homed over an AE bundle to LEAF-3 and LEAF-4.]

Multihomed leaf, member link et-0/0/53:
set interfaces et-0/0/53 ether-options 802.3ad ae0
set interfaces ae0 esi 00:01:01:01:01:01:01:01:01:01
set interfaces ae0 esi all-active
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 aggregated-ether-options lacp system-id 00:00:00:01:01:01
set interfaces ae0 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae0 unit 0 family ethernet-switching vlan members 100-108

Multihomed leaf, member link et-0/0/52:
set interfaces et-0/0/52 ether-options 802.3ad ae0
set interfaces ae0 esi 00:01:01:01:01:01:01:01:01:01
set interfaces ae0 esi all-active
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 aggregated-ether-options lacp system-id 00:00:00:01:01:01
set interfaces ae0 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae0 unit 0 family ethernet-switching vlan members 100-108

TOR:
set interfaces xe-0/0/48:0 unit 0 family ethernet-switching interface-mode trunk
set interfaces xe-0/0/48:0 unit 0 family ethernet-switching vlan members 100-108
set interfaces et-0/0/52 ether-options 802.3ad ae0
set interfaces et-0/0/53 ether-options 802.3ad ae0
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae0 unit 0 family ethernet-switching vlan members 100-108
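For all-active multihoming, the leaves that share a LAG toward the same device need the same ESI and the same LACP system ID, so that the attached device sees a single LACP partner. A small consistency check over the `set` commands can catch a mismatch before commit. The sketch below is an illustrative helper only: `lag_identity` is a hypothetical name, and the parser understands just the three statements used on this slide.

```python
import re

# An ESI is written as 10 colon-separated octets.
ESI_RE = re.compile(r"(?:[0-9a-f]{2}:){9}[0-9a-f]{2}", re.IGNORECASE)


def lag_identity(config, ae="ae0"):
    """Collect the ESI, ESI mode, and LACP system ID configured on one AE."""
    found = {"esi": None, "mode": None, "system_id": None}
    for line in config.splitlines():
        words = line.split()
        if words[:3] != ["set", "interfaces", ae]:
            continue
        rest = words[3:]
        if len(rest) == 2 and rest[0] == "esi":
            # "esi <10 octets>" sets the identifier; "esi all-active" sets the mode.
            key = "esi" if ESI_RE.fullmatch(rest[1]) else "mode"
            found[key] = rest[1].lower()
        elif len(rest) == 4 and rest[:3] == ["aggregated-ether-options", "lacp", "system-id"]:
            found["system_id"] = rest[3].lower()
    return found


LEAF_A = """\
set interfaces et-0/0/52 ether-options 802.3ad ae0
set interfaces ae0 esi 00:01:01:01:01:01:01:01:01:01
set interfaces ae0 esi all-active
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 aggregated-ether-options lacp system-id 00:00:00:01:01:01
"""
# The second leaf differs only in its member link.
LEAF_B = LEAF_A.replace("et-0/0/52", "et-0/0/53")
# A deliberately broken variant with a different LACP system ID.
LEAF_BAD = LEAF_A.replace("system-id 00:00:00:01:01:01", "system-id 00:00:00:02:02:02")

print(lag_identity(LEAF_A) == lag_identity(LEAF_B))    # True
print(lag_identity(LEAF_A) == lag_identity(LEAF_BAD))  # False
```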
[Figure: Type 5 test topology. Virtual gateways VGW: 10.10.10.1 and VGW:
10.10.10.2; iBGP with family inet-vpn and LDP/RSVP between the routing nodes;
a VRF on the QFX10008 and QFX5100 switches (cloud-bms-sw01, cloud-bms-sw04,
cloud-bms-sw05, cloud-bms-elite-sw04; Lo0:100.0.0.21, Lo0:100.0.0.22,
Lo0:100.0.0.23, Lo0:100.0.0.24); test ports STC-1/5, STC-1/6, STC-8/3, STC-1/3,
STC-8/4, STC-1/4, STC-8/5, STC-4/6, STC-4/8, STC-8/6.]

10.1.46.0/24 *[BGP/170] 00:44:14, localpref 100, from 100.0.0.13
                AS path: I, validation-state: unverified
              > to 172.16.0.55 via et-0/0/22.0, label-switched-path rr1-to-elit3
              [BGP/170] 5d 04:00:03, localpref 100, from 100.100.255.4
                AS path: I, validation-state: unverified
              > to 172.16.0.57 via et-0/0/32.0, label-switched-path rr1-to-u8

root@cloud-bms-ultimat-sw01# run show arp no-resolve vpn VRF_10
MAC Address        Address      Interface          Flags
de:ad:be:00:2e:40  10.1.46.40   irb.46 [ae203.0]   none
de:ad:be:00:2f:40  10.1.47.40   irb.47 [ae203.0]   permanent remote

root@cloud-bms-elite-sw03# run show arp no-resolve vpn VRF_10
MAC Address        Address      Interface          Flags
de:ad:be:00:2e:40  10.1.46.40   irb.46 [ae203.0]   permanent remote
de:ad:be:00:2f:40  10.1.47.40   irb.47 [ae203.0]   none
• The L2 instance should be up, and VXLAN should be correctly associated with the L2 bridging instance so
that VXLAN tunnels form.
• L2 MAC learning should occur on the local L2 interfaces as well as over the VTEP interface.
• The IRB interface should be up, and ARP should be resolved for L3 traffic coming from the Internet over
IRB to VTEP endpoints, or vice versa.
[Figure: troubleshooting topology, 10G links.
QFX10002 cloud-bms-elite-sw01 Lo0:100.0.0.11
QFX10002 cloud-bms-elite-sw02 Lo0:100.0.0.12
QFX10002 cloud-bms-elite-sw03 Lo0:100.0.0.13
QFX10008 cloud-bms-ultimat-sw01 Lo0:100.100.255.4
IRB-vlan1: 10.1.1.1, IRB-vlan2: 10.1.2.1
QFX5100 switches cloud-bms-sw05, cloud-bms-sw01, cloud-bms-sw04, labeled
Lo0:100.0.0.22, Lo0:100.0.0.22, Lo0:100.0.0.23 in the figure.]
MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static
SE - statistics enabled, NM - non configured MAC, R - remote PE MAC, O - ovsdb MAC)
Ethernet switching table : 4 entries, 4 learned
Routing instance : default-switch
Vlan    MAC                 MAC      Logical        Active
name    address             flags    interface      source
bd1     00:00:5e:00:02:01   DR,SD    esi.92140      05:00:00:00:00:00:00:03:e9:00
bd1     88:a2:5e:cb:a8:80   DR       esi.92303      00:04:04:04:04:04:04:04:04:04
bd1     88:a2:5e:cc:48:80   D        et-0/0/67.0
bd1     88:a2:5e:cc:d9:80   D        vtep.32769     100.0.0.12
2:100.100.255.4:5000::1001::88:a2:5e:cb:a8:80/304
*[BGP/170] 00:40:28, localpref 100, from 100.0.0.31
AS path: I, validation-state: unverified
> to 172.16.0.50 via et-0/0/6.0
2:100.100.255.4:5000::1001::88:a2:5e:cb:a8:80::10.1.1.25/304
*[BGP/170] 00:40:28, localpref 100, from 100.0.0.31
AS path: I, validation-state: unverified
> to 172.16.0.50 via et-0/0/6.0
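When reading many of these entries it can help to split the EVPN Type 2 prefix into its parts. The sketch below assumes the display format seen in the two routes above, that is `2:<route distinguisher>::<tag>::<MAC>[::<IP>]/<length>`. The function name `parse_type2` and the dictionary keys are illustrative, the tag field (1001 here) is treated as an opaque number, and IPv6 host addresses containing "::" are not handled.

```python
def parse_type2(prefix: str) -> dict:
    """Split an EVPN Type 2 (MAC or MAC+IP) prefix string into its fields."""
    body, _, length = prefix.rpartition("/")
    fields = body.split("::")
    # The first field is "<route type>:<route distinguisher>".
    route_type, _, rd = fields[0].partition(":")
    if route_type != "2" or len(fields) not in (3, 4):
        raise ValueError("not an EVPN Type 2 prefix")
    return {
        "rd": rd,
        "tag": int(fields[1]),
        "mac": fields[2],
        # MAC-only routes carry no IP field.
        "ip": fields[3] if len(fields) == 4 else None,
        "length": int(length),
    }


mac_only = parse_type2("2:100.100.255.4:5000::1001::88:a2:5e:cb:a8:80/304")
mac_ip = parse_type2("2:100.100.255.4:5000::1001::88:a2:5e:cb:a8:80::10.1.1.25/304")
print(mac_only["rd"], mac_only["mac"], mac_only["ip"])
# 100.100.255.4:5000 88:a2:5e:cb:a8:80 None
print(mac_ip["ip"])
# 10.1.1.25
```

Both routes carry the same MAC address; the second is the MAC+IP form, which also binds the host address 10.1.1.25 to that MAC.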
….
------------------------------------
L2 descriptor
==============
des-size des-size
============================================================
des-size des-size
Public No No (0 )3 2 1 0 0xb9e8
Flabel : 197064 Segment table index: : 48[4] Page Table index : 140[567] desc start addr:: 47592[1]
L2ald
set protocols l2-learning traceoptions file l2ald.log
set protocols l2-learning traceoptions file size 1g
set protocols l2-learning traceoptions level all
set protocols l2-learning traceoptions flag all
RPD
set protocols evpn traceoptions file evpn.log
set protocols evpn traceoptions file size 1g
set protocols evpn traceoptions flag all
Kernel
rtsockmon -rt
QFX5100 Yes NS NS NS NS
QFX10K Yes Yes Yes Yes Yes
NS – Not Supported
[Figure: VXLAN encapsulation]
OUTER MAC | OUTER IP | OUTER UDP | VXLAN Header | Original L2 Frame | FCS
If each PE has the same MAC address configured on the IRB interface, there is no
need to dynamically synchronize the IRB MACs through the control plane by
advertising them with the default gateway extended community.
With Knob:
Leaf-1>show vlans bd1
• QFX10K :
- ECMP across VTEP works
- ECMP underlay to each VTEP endpoint works
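Both ECMP cases above come down to per-flow load balancing: every packet of a flow takes the same path, while different flows spread across the available next hops. The sketch below illustrates the idea with a CRC32 hash over the 5-tuple. This is an assumption made for illustration; the actual hash function and the fields used by QFX hardware are not described in these slides, and the next-hop names are just labels.

```python
import zlib

# Illustrative next-hop labels; in the underlay these would be the spine uplinks.
NEXT_HOPS = ["spine-1", "spine-2", "spine-3", "spine-4"]


def flow_hash(src_ip, dst_ip, proto, src_port, dst_port):
    """Deterministic hash of a 5-tuple (CRC32 chosen only for illustration)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key)


def pick_next_hop(flow, next_hops):
    """Map a flow onto one of the equal-cost next hops."""
    return next_hops[flow_hash(*flow) % len(next_hops)]


flow = ("1.0.1.1", "2.0.1.1", 6, 49152, 80)
# The same flow always maps to the same next hop, so packets stay in order.
print(pick_next_hop(flow, NEXT_HOPS) == pick_next_hop(flow, NEXT_HOPS))  # True

# Different source ports yield different hashes, spreading flows across paths.
used = {pick_next_hop(("1.0.1.1", "2.0.1.1", 6, port, 80), NEXT_HOPS)
        for port in range(49152, 50152)}
print(sorted(used))
```

For VXLAN traffic the outer UDP source port is typically derived from a hash of the inner frame, which is what gives the underlay enough entropy to balance flows between the same pair of VTEPs.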