BRKDCT 3378
[Figure: FabricPath/BGP and VXLAN/EVPN fabrics, each connected to the MAN/WAN]
Data Centre Fabric Properties
Extended Namespace
Scalable Layer-2 Domains
Integrated Route and Bridge
Multi-Tenancy
Hybrid Overlays
Inter-Pod connectivity
Overlay Based Data Centre Fabrics
Desirable Attributes:
• Mobility
• Segmentation
• Scale
• Automated & Programmable
• Abstracted consumption models
• Full Cross Sectional Bandwidth
• Layer-2 + Layer-3 Connectivity
• Physical + Virtual
Overlay Based Data Centre: Edge Devices
Network Overlays Host Overlays Hybrid Overlays
Overview
Introducing VXLAN
• Traditionally, the VLAN ID is expressed over 12 bits (802.1Q tag)
• This limits the maximum number of segments in a Data Centre to 4096 VLANs
• VXLAN leverages the VNI field with 24 bits
• The VXLAN Network Identifier (VNI) is carried in the VXLAN header
Classical Ethernet Frame: DMAC | SMAC | 802.1Q | Etype | Data (Payload) | CRC/FCS
802.1Q tag: VLAN ID = 12 bits
TPID = Tag Protocol Identifier, TCI = Tag Control Information, PCP = Priority Code Point,
CFI = Canonical Format Indicator, VID = VLAN Identifier
VXLAN header: Flags (8 bits) | Reserved (24 bits) | VNI (24 bits) | Reserved (8 bits)
[Figure: Classical Ethernet frame and Cisco DFA frame compared]
MAC-in-IP Encapsulation (field sizes in bits)
Underlay
  Outer MAC Header (14 Bytes, 4 Bytes Optional)
    Dest. MAC Address: 48
    Src. MAC Address: 48
    VLAN Type 0x8100: 16 (optional)
    VLAN ID Tag: 16 (optional)
    Ether Type 0x0800: 16
  Outer IP Header (20 Bytes)
    IP Header Misc. Data: 72
    Protocol 0x11 (UDP): 8
    Header Checksum: 16
    Source IP: 32
    Dest. IP: 32
  UDP Header (8 Bytes)
Overlay
  VXLAN Header (8 Bytes)
    VXLAN Flags RRRRIRRR: 8
    Reserved: 24
    VNI: 24 (allows for 16M possible Segments)
    Reserved: 8
  Original Layer-2 Frame
50 (54) Bytes of Overhead
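The 8-byte VXLAN header layout above can be sketched in a few lines of Python. This is an illustration only: the function names are made up for this example, and the only assumption beyond the slide is that the "I" bit in RRRRIRRR corresponds to the value 0x08 in the flags byte.

```python
import struct

VXLAN_FLAG_I = 0x08  # "RRRRIRRR": only the I bit (VNI is valid) is set


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte header: Flags(8) Reserved(24) VNI(24) Reserved(8)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    word1 = VXLAN_FLAG_I << 24  # flags in the top byte, 24 reserved bits below
    word2 = vni << 8            # VNI in the top 24 bits, 8 reserved bits below
    return struct.pack("!II", word1, word2)


def unpack_vni(header: bytes) -> int:
    """Return the VNI from an 8-byte VXLAN header."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_I:
        raise ValueError("I flag not set, VNI is not valid")
    return word2 >> 8


header = pack_vxlan_header(30001)
print(len(header), unpack_vni(header))  # 8 30001
print(2**12, 2**24)                     # 4096 16777216
```

The last line contrasts the 12-bit VLAN namespace with the 24-bit VNI namespace.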
Understanding Overlay Technologies
Overlay Services
• Layer 2
• Layer 3
• Layer 2 and Layer 3
Tunnel Encapsulation
Underlay Transport Network
[Figure: Edge Devices (VTEPs) connect Local LAN Segments across the Underlay Transport Network]
Control-Plane: EVPN MP-BGP - RFC 7432
[Figure: Overlay (VXLAN), Integrated Route/Bridge, BGP (EVPN)]
Agenda
Underlay
• Avoid Fragmentation by adjusting the IP Network's MTU
• VXLAN adds 50 (54) Bytes of Overhead
*Cisco Nexus 5600/6000 switches only support 9192 Bytes for Layer-3 Traffic
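As a rough check of the MTU headroom, the sketch below adds the 50 (54) bytes of overhead to a host MTU. It follows the deck's accounting (outer MAC + outer IP + UDP + VXLAN headers); the function names and the example MTU values are assumptions chosen for illustration.

```python
OUTER_MAC = 14    # outer Ethernet header
OUTER_DOT1Q = 4   # optional 802.1Q tag
OUTER_IP = 20     # outer IPv4 header
OUTER_UDP = 8     # UDP header
VXLAN_HDR = 8     # VXLAN header


def vxlan_overhead(with_dot1q: bool = False) -> int:
    """Return the encapsulation overhead in bytes: 50, or 54 with the tag."""
    overhead = OUTER_MAC + OUTER_IP + OUTER_UDP + VXLAN_HDR
    return overhead + (OUTER_DOT1Q if with_dot1q else 0)


def required_underlay_mtu(host_mtu: int, with_dot1q: bool = False) -> int:
    """Smallest underlay MTU that avoids fragmenting a full-size host packet."""
    return host_mtu + vxlan_overhead(with_dot1q)


print(vxlan_overhead(), vxlan_overhead(with_dot1q=True))  # 50 54
print(required_underlay_mtu(1500))                        # 1550
print(required_underlay_mtu(9000, with_dot1q=True))       # 9054
```

A 9000-byte host MTU therefore still fits below the 9192-byte limit quoted in the footnote.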
Building your IP Network – Interface Principles (1)
• Know your IP addressing and IP scale requirements
• Best to use individual Aggregates for the Underlay
  • Unicast Routing p2p** Links
  • Unicast Routing Loopbacks
  • VTEP (NVE) Loopback
  • Multicast Routing Loopback (RP*)
• IPv4 only (today)
Example addressing (from the figure):
  p2p Links: 10.1.1.1/30 (p2p Agg: 10.1.1.0/24)
  Routing Loopback: 10.10.10.101/32, 10.10.10.203/32 (RID Agg: 10.10.10.0/24)
  VTEP Loopback: 10.200.200.101/32 (VTEP Agg: 10.200.200.0/24)
  Rendezvous-Point Loopback: 10.254.254.1 (RP Agg: 10.254.254.0/24)
*RP: Rendezvous-Point (Multicast)
**p2p: Point-to-Point
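The aggregate plan can be sanity-checked with Python's standard `ipaddress` module. The aggregates and sample addresses are the ones shown on the slide; the `classify` helper and its role names are made up for this sketch.

```python
import ipaddress

# One aggregate per Underlay function, as listed on the slide.
AGGREGATES = {
    "p2p": ipaddress.ip_network("10.1.1.0/24"),
    "rid": ipaddress.ip_network("10.10.10.0/24"),
    "vtep": ipaddress.ip_network("10.200.200.0/24"),
    "rp": ipaddress.ip_network("10.254.254.0/24"),
}


def classify(address: str) -> str:
    """Return the Underlay role whose aggregate contains the address."""
    ip = ipaddress.ip_interface(address).ip
    for role, aggregate in AGGREGATES.items():
        if ip in aggregate:
            return role
    return "unknown"


for addr in ("10.1.1.1/30", "10.10.10.101/32", "10.200.200.101/32", "10.254.254.1"):
    print(addr, "->", classify(addr))
```

Keeping each function in its own aggregate makes this kind of check, and later route summarisation or filtering, straightforward.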
Building your IP Network – Interface Principles (2)
• Routed Ports/Interfaces
  • Layer-3 Interfaces between Spine and Leaf (no switchport)
• For each Point-to-Point (P2P) connection, minimum /31 required
• Alternatively, use IP Unnumbered (/32)
768 IP Addresses required == /22 Prefix
*RID: Router ID; Unicast Routing Loopback
IP Unnumbered – Simplifying the Math
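The address math can be sketched as follows. The 4-Spine / 40-Leaf fabric size is a hypothetical example and is not the topology behind the 768-address figure above; only the final line reuses that figure, to show that 768 addresses do need a /22.

```python
def p2p_addresses(spines: int, leafs: int, addresses_per_link: int) -> int:
    """Addresses consumed by Spine-Leaf links (every Leaf connects to every Spine)."""
    return spines * leafs * addresses_per_link


def covering_prefix_length(address_count: int) -> int:
    """Longest IPv4 prefix length whose block holds address_count addresses."""
    if address_count < 1:
        raise ValueError("need at least one address")
    host_bits = (address_count - 1).bit_length()
    return 32 - host_bits


SPINES, LEAFS = 4, 40  # hypothetical fabric size

print(p2p_addresses(SPINES, LEAFS, 2))  # /31 per link: 320 addresses
print(p2p_addresses(SPINES, LEAFS, 4))  # /30 per link: 640 addresses
print(SPINES + LEAFS)                   # IP Unnumbered: 44 loopbacks, no link addresses
print(covering_prefix_length(768))      # 22
```

With IP Unnumbered the links borrow the loopback address, so the link term disappears and only one /32 per device remains.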
Multicast Mode IGMP v2/v3 PIM ASM PIM BiDir PIM ASM / PIM BiDir PIM ASM PIM BiDir PIM ASM / PIM BiDir
[Figure: VTEPs V1, V2 and V3 with iBGP Peering to the BGP Route-Reflectors (RR)]
Multiprotocol BGP (MP-BGP) Primer
• Cisco's VXLAN/EVPN does provide automated Route Distinguisher (RD)
• Automatic uses Type 1 format
  • 4-byte IP Address (Router ID)
  • 2-byte Value (VRF ID)
VRF Info (one block per VTEP in the figure):
  Name: VRF-A, RD: 3:10.0.0.1 (auto), Imp Route-Target 65500:50000 (auto), Exp Route-Target 65500:50000 (auto)
  Name: VRF-A, RD: 15:10.0.0.2 (auto), Imp Route-Target 65500:50000 (auto), Exp Route-Target 65500:50000 (auto)
  Name: VRF-A, RD: 62:10.0.0.3 (auto), Imp Route-Target 65500:50000 (auto), Exp Route-Target 65500:50000 (auto)
vrf context VRF-A
  vni 50000
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
  address-family ipv6 unicast
    route-target both auto
    route-target both auto evpn
[Figure: VTEPs V1, V2 and V3 with iBGP Peering to the BGP Route-Reflectors (RR)]
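A Type 1 Route Distinguisher is an 8-byte value: a 2-byte type field, a 4-byte IP address and a 2-byte assigned number (RFC 4364). The sketch below shows that layout with the Router ID and VRF ID as inputs; the function names are illustrative and are not part of any switch API.

```python
import socket
import struct


def auto_rd(router_id: str, vrf_id: int) -> str:
    """Render a Type 1 RD in the usual 'IP:number' notation."""
    return f"{router_id}:{vrf_id}"


def encode_rd_type1(router_id: str, vrf_id: int) -> bytes:
    """Encode the RD as 8 bytes: type (2) + IPv4 address (4) + number (2)."""
    if not 0 <= vrf_id < 2**16:
        raise ValueError("assigned number must fit in 2 bytes")
    return struct.pack("!H4sH", 1, socket.inet_aton(router_id), vrf_id)


print(auto_rd("10.0.0.1", 3))                # 10.0.0.1:3
print(encode_rd_type1("10.0.0.1", 3).hex())  # 00010a0000010003
```

Because the Router ID differs per VTEP, each VTEP ends up with a unique RD for the same VRF without any manual input.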
Multiprotocol BGP (MP-BGP) Primer
• VPN Segmentation for tenant routing (Multi-Tenancy)
• Selectively distribute VPN routes - Route Target (RT)
  • 8-byte field of VRF parameter
  • unique value to define the import/export rules for VPN prefix
BGP Advertisement
  VPN-EVPN: RD:[MAC_A][IP_A]
  BGP Next-Hop: V1
  Route Target: 65500:50000
  Label (L3VNI): 50000
[Figure: Host A (MAC_A / IP_A) attaches to V1; V1 advertises MAC_A / IP_A as Route-Type2 via the BGP Route-Reflectors (RR) to V2 and V3 over iBGP Peering]
Multiprotocol BGP (MP-BGP) Primer
• Cisco's VXLAN/EVPN does provide automated Route Target (RT)
• 8-byte Route Target (2 x 4-byte)
  • ASN : VNI
BGP Advertisement
  VPN-EVPN: RD:[MAC_A][IP_A]
  BGP Next-Hop: V1
  Route Target: 65500:50000
  Label (L3VNI): 50000
[Figure: Host A (MAC_A / IP_A) attached to V1; Route-Reflectors (RR)]
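The automated Route Target and the import rule can be sketched as below. The ASN : VNI derivation is from the slide; the `is_imported` helper is a simplified model of Route Target matching written for this example, not switch behaviour in full.

```python
def auto_rt(asn: int, vni: int) -> str:
    """Derive the automatic Route Target as ASN:VNI."""
    return f"{asn}:{vni}"


def is_imported(route_targets: list[str], import_targets: list[str]) -> bool:
    """A route is imported when it carries at least one RT the VRF imports."""
    return bool(set(route_targets) & set(import_targets))


vrf_a_import = [auto_rt(65500, 50000)]

print(auto_rt(65500, 50000))                       # 65500:50000
print(is_imported(["65500:50000"], vrf_a_import))  # True
print(is_imported(["65500:50002"], vrf_a_import))  # False
```

Since every VTEP in the same AS derives the same value from the same VNI, import and export Route Targets line up without being configured by hand.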
Overlay with Optimised Routing
EVPN Control Plane -- Host and Subnet Route Distribution
BGP Update
• Host-MAC
• Host-IP
• Internal IP Subnet
• External Prefixes
Route-Reflectors (RR) deployed for scaling purposes (iBGP)
[Figure: BGP Adjacencies between the Leaf and Border VTEPs and the Route-Reflectors on the Spine]
Host Advertisement
• Host Attaches
Route Type | MAC, IP | L2VNI (“VLAN”) | L3VNI (“VRF”) | NH | Encap | Seq

BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 10.0.0.1:32868
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56a3.c2bb]:[32]:[192.168.1.73]/272, version 4
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is locked
  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    10.0.0.1 (metric 3) from 10.0.0.111 (10.0.0.111)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 30001 50001
      Extcommunity: RT:65501:30001 RT:65501:50001 ENCAP:8 Router MAC:5087.89d4.5495
      Originator: 10.0.0.1 Cluster list: 10.0.0.111

Callouts:
• Remote VTEP IP Address: 10.0.0.1
• Received label: L2VNI 30001, L3VNI 50001
• Route Target: L2VNI (VLAN): RT:65501:30001
• Route Target: L3VNI (VRF): RT:65501:50001
• Overlay Encapsulation: 8 - VXLAN
• Router MAC of Remote VTEP: 5087.89d4.5495
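The bracketed Route-Type 2 entry in the output above can be split into its fields with a few lines of Python. The field names are an assumption about how to read the brackets (route type, ESI, Ethernet tag, MAC length, MAC, IP length, IP), and `parse_type2` is a helper made up for this sketch.

```python
import re


def parse_type2(entry: str) -> dict:
    """Split a '[2]:[..]:..../len' EVPN entry into named fields."""
    body, _, prefix_len = entry.rpartition("/")
    fields = re.findall(r"\[([^\]]*)\]", body)
    if len(fields) != 7 or fields[0] != "2":
        raise ValueError("not a Route-Type 2 entry")
    return {
        "route_type": int(fields[0]),
        "esi": fields[1],
        "eth_tag": fields[2],
        "mac_len": int(fields[3]),
        "mac": fields[4],
        "ip_len": int(fields[5]),
        "ip": fields[6],
        "prefix_len": int(prefix_len),
    }


route = parse_type2("[2]:[0]:[0]:[48]:[0050.56a3.c2bb]:[32]:[192.168.1.73]/272")
print(route["mac"], route["ip"])  # 0050.56a3.c2bb 192.168.1.73
```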
Protocol Learning & Distribution
Step 1: local learning
V1: MAC, IP | L2VNI | L3VNI | NH
    MAC_A, IP_A | 30001 | 50001 | local
V2: MAC, IP | L2VNI | L3VNI | NH
    MAC_B, IP_B | 30001 | 50001 | local
V3: Host C (MAC_C / IP_C), Host Y (MAC_Y / IP_Y)
Protocol Learning & Distribution
Step 2: advertisement to the Route-Reflectors (RR)
V1: MAC, IP | L2VNI | L3VNI | NH
    MAC_A, IP_A | 30001 | 50001 | local
V2: MAC, IP | L2VNI | L3VNI | NH
    MAC_B, IP_B | 30001 | 50001 | local
V3: Host C (MAC_C / IP_C), Host Y (MAC_Y / IP_Y)
Protocol Learning & Distribution
Step 3: remote entries installed
V1: MAC, IP | L2VNI | L3VNI | NH
    MAC_A, IP_A | 30001 | 50001 | local
    MAC_B, IP_B | 30001 | 50001 | IP_V2
    MAC_C, IP_C | 30001 | 50001 | IP_V3
    MAC_Y, IP_Y | 30002 | 50001 | IP_V3
V2: MAC, IP | L2VNI | L3VNI | NH
    MAC_B, IP_B | 30001 | 50001 | local
    MAC_A, IP_A | 30001 | 50001 | IP_V1
    MAC_C, IP_C | 30001 | 50001 | IP_V3
    MAC_Y, IP_Y | 30002 | 50001 | IP_V3
V3: MAC, IP | L2VNI | L3VNI | NH
    Host C (MAC_C / IP_C), Host Y (MAC_Y / IP_Y)
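The learning and distribution steps can be modelled as a toy simulation. This is a simplification: it assumes every VTEP imports every advertisement, and the attachment of Host A to V1 and Host B to V2 is taken from the tables above. All names are illustrative.

```python
from collections import namedtuple

Host = namedtuple("Host", "name l2vni l3vni")

# Step 1: each VTEP learns its locally attached hosts.
LOCAL_HOSTS = {
    "V1": [Host("A", 30001, 50001)],
    "V2": [Host("B", 30001, 50001)],
    "V3": [Host("C", 30001, 50001), Host("Y", 30002, 50001)],
}


def build_tables(local_hosts: dict) -> dict:
    """Steps 2 and 3: advertise local hosts and install them on every VTEP."""
    advertisements = [
        (vtep, host) for vtep, hosts in local_hosts.items() for host in hosts
    ]
    tables = {}
    for vtep in local_hosts:
        tables[vtep] = {
            host.name: (
                host.l2vni,
                host.l3vni,
                "local" if origin == vtep else f"IP_{origin}",
            )
            for origin, host in advertisements
        }
    return tables


tables = build_tables(LOCAL_HOSTS)
for name, entry in tables["V1"].items():
    print(f"MAC_{name}, IP_{name}", *entry)
```

The output for V1 lists Host A as local and Hosts B, C and Y behind their remote VTEP addresses, which is the shape of the step 3 table.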
Subnet Route Advertisement
• IP Prefix Redistribution
• When several VTEPs advertise the same Prefix, Equal Cost Multipath (ECMP) will apply
Route Type | MAC, IP | L3VNI (“VRF”) | NH | Encap
5 | Subnet_A/24 | 50001 | IP_V1 | 8:VXLAN
5 | Subnet_A/24 | 50001 | IP_V2 | 8:VXLAN
5 | Subnet_A/24 | 50001 | IP_V3 | 8:VXLAN

BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 10.0.0.1:3
BGP routing table entry for [5]:[0]:[0]:[24]:[192.168.2.0]:[0.0.0.0]/224, version 3
Paths: (1 available, best #1)
Flags: (0x000002) on xmit-list, is not in l2rib/evpn, is locked
  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    10.0.0.1 (metric 3) from 10.0.0.111 (10.0.0.111)
      Origin incomplete, MED 0, localpref 100, weight 0
      Received label 50001
      Extcommunity: RT:65501:50001 ENCAP:8 Router MAC:5087.89d4.5495
      Originator: 10.0.0.1 Cluster list: 10.0.0.111

Callout: Received label 50001 is the L3VNI
[Figure: Host A (MAC_A / IP_A) and Host B (MAC_B / IP_B) attached to different VTEPs, with V3 as a third VTEP]
MAC, IP | VNI | NH
MAC_A, IP_A | 30001 | IP_V1
1. ARP Request for IP_B sent from Host A (Src MAC: MAC_A, Dst MAC: FF:FF:FF:FF:FF:FF)
[Figure: bridged packet walk from Host A (MAC_A / IP_A) on V1 to Host B (MAC_B / IP_B) on V2, steps 3 and 4; a Virtual Switch is also shown]
MAC, IP | VNI | NH
MAC_A, IP_A | 30001 | IP_V1
Encapsulated frame | leaving V1 | arriving at V2
Underlay DMAC | hop-by-hop | MAC_V2
Underlay SIP | IP_V1 | IP_V1
Underlay DIP | IP_V2 | IP_V2
UDP
VXLAN VNID | 30001 | 30001
Overlay (original frame)
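[Figure: routed packet walk from Host A (MAC_A / IP_A) on V1 to Host F (MAC_F, IP_F) on V2, steps 1 to 4]
Table at V1: MAC, IP | L2VNI | NH | L3VNI
  MAC_A, IP_A | 30001 | Local | 50001
  MAC_F, IP_F | 30005 | IP_V2 | 50001
Table at V2: MAC, IP | L2VNI | NH | L3VNI
  MAC_A, IP_A | 30001 | Local | 50001
  MAC_F, IP_F | 30005 | E1/4 | 50001
Original frame | Host A to V1 | V2 to Host F
SMAC | MAC_A | MAC_GW
DMAC | MAC_GW | MAC_F
SIP | IP_A | IP_A
DIP | IP_F | IP_F
Encapsulated frame | leaving V1 | arriving at V2
Underlay SMAC | MAC_V1 | hop-by-hop
Underlay DMAC | hop-by-hop | MAC_V2
Underlay SIP | IP_V1 | IP_V1
Underlay DIP | IP_V2 | IP_V2
UDP
VXLAN VNID | 50001 | 50001
Overlay (original frame)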
[Figure: routed packet walk from Host A (MAC_A / IP_A) to Host F (MAC_F, IP_F), continued]
MAC_A, IP_A | 30000 | Local | 50001
Encapsulated frame | leaving V1 | arriving at V2
Underlay DMAC | hop-by-hop | MAC_V2
Underlay SIP | IP_V1 | IP_V1
Underlay DIP | IP_V2 | IP_V2
UDP
VXLAN VNID | 50001 | 50001
Overlay (original frame)
Anycast routing methodology
• datagrams sent from a single sender to the topologically nearest node
• group of potential receivers, all identified by the same destination address
Distributed IP Anycast Gateway
• Distributed Inter-VXLAN Routing at Access Layer (Leaf)
• Distributed state - Smaller ARP tables
• Only locally attached End-Points (Servers)
[Figure: every Leaf hosts SVI 100 and SVI 200; Route-Reflectors (RR) on the Spine]
SVI 100, Gateway IP: 192.168.1.1, Gateway MAC: AG:AG:AG:AG:AG:AG
SVI 200, Gateway IP: 10.10.10.1, Gateway MAC: AG:AG:AG:AG:AG:AG
Distributed IP Anycast Gateway
SVI 100, Gateway IP: 192.168.1.1
SVI 200, Gateway IP: 10.10.10.1
Host1: MAC: AA:AA:AA:AA:AA:AA, IP: 192.168.1.11, VLAN 100, VXLAN VNI 30001
Host2: MAC: BB:BB:BB:BB:BB:BB, IP: 10.10.10.22, VLAN 200, VXLAN VNI 30002
Host3: MAC: CC:CC:CC:CC:CC:CC, IP: 192.168.1.33, VLAN 100, VXLAN VNI 30001
[Figure: traffic within VLAN 100 / VNI 30001 is bridged; traffic between SVI 100 and SVI 200 is routed at the Leaf]
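Whether a Leaf bridges or routes a packet follows from the subnets behind the two SVIs. The sketch below assumes /24 masks for both gateways, which the slide does not state; the host and gateway addresses are the ones shown.

```python
import ipaddress

# Gateway addresses from the slide; the /24 masks are an assumption.
GATEWAYS = {
    "SVI 100": ipaddress.ip_interface("192.168.1.1/24"),
    "SVI 200": ipaddress.ip_interface("10.10.10.1/24"),
}


def forwarding(src_ip: str, dst_ip: str) -> str:
    """Return 'bridge' for same-subnet traffic, 'route' otherwise."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for gateway in GATEWAYS.values():
        if src in gateway.network:
            return "bridge" if dst in gateway.network else "route"
    raise ValueError(f"{src_ip} is not behind a known SVI")


print(forwarding("192.168.1.11", "192.168.1.33"))  # Host1 -> Host3: bridge
print(forwarding("192.168.1.11", "10.10.10.22"))   # Host1 -> Host2: route
```

With the anycast gateway, the routed case is handled on the Leaf the source host is attached to, because every Leaf owns the same gateway IP.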
Integrated Routing and Bridging (IRB)
VXLAN/EVPN based overlays follow two slightly different Integrated Routing and Bridging (IRB) modes
• Optimal for Consistency
• Every VLAN/VNI Everywhere
• Sub-Optimal for Scale
Scoped Configuration
• Logical Configuration (VLAN, VRF, VNI) scoped to Leafs with respective connected End-Points
• Optimal for Scale
• Consistency with End-Points
• Configuration Consistency depends on End-Points
[Figure: Leafs carrying different subsets of SVI 100, SVI 200 and SVI 300]
Asymmetric IRB
• Similar to today's Inter-VLAN routing
• Requires a consistent configuration of VLAN and L2VNI across all Switches
• Post-routed traffic will leverage the destination Layer 2 Segment (L2VNI), same as for bridged traffic
Asymmetric IRB
[Figure: two Leafs, each with SVI 200 and SVI 300, connected via L2VNI 30001 and L2VNI 30002]
context
• Routed traffic uses transit VNI (L3VNI), while bridged traffic uses L2VNI
Symmetric IRB
[Figure: two Leafs, each with SVI 200 and SVI 300, connected via L3VNI 50001 (VRF)]
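The difference between the two IRB modes comes down to which VNI is placed in the VXLAN header. A minimal sketch, using the VNI values from the figures; the function name and mode strings are made up for this example.

```python
def vni_on_wire(mode: str, src_l2vni: int, dst_l2vni: int, l3vni: int) -> int:
    """Pick the VNI carried in the VXLAN header between two Leafs."""
    if src_l2vni == dst_l2vni:
        return dst_l2vni  # bridged traffic always uses the L2VNI
    if mode == "asymmetric":
        return dst_l2vni  # post-routed traffic uses the destination L2VNI
    if mode == "symmetric":
        return l3vni      # routed traffic uses the transit VNI of the VRF
    raise ValueError(f"unknown IRB mode: {mode}")


print(vni_on_wire("asymmetric", 30001, 30002, 50001))  # 30002
print(vni_on_wire("symmetric", 30001, 30002, 50001))   # 50001
print(vni_on_wire("symmetric", 30001, 30001, 50001))   # 30001
```

This is why Asymmetric IRB needs every destination L2VNI on the ingress Leaf, while Symmetric IRB only needs the L3VNI of the VRF.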
[Figure: bridged traffic in VLAN 100 between Host1 and Host3 across the Spine (RR)]
Host1: MAC: AA:AA:AA:AA:AA:AA, IP: 192.168.1.11, VLAN 100, VXLAN VNI 30001
Host3: MAC: CC:CC:CC:CC:CC:CC, IP: 192.168.1.33, VLAN 100, VXLAN VNI 30001
Layer-2 Multi-Tenancy – Bridge Domains
The Bridge Domain is the Layer-2 Segment from Host to Host
1) The Ethernet Segment (VLAN), between Host and Switch
2) The Hardware Resources (Bridge Domain) within the Switch
Host1: MAC: AA:AA:AA:AA:AA:AA, IP: 192.168.1.11, VLAN 100, VXLAN VNI 30001
Host3: MAC: CC:CC:CC:CC:CC:CC, IP: 192.168.1.33, VLAN 100, VXLAN VNI 30001
[Figure: Host1 and Host3 in VLAN 100 on different Leafs, each with a Bridge Domain, joined by the VXLAN Overlay (VNI 30001)]
VLAN-to-VNI Mapping
[Figure: VXLAN Overlay (VNI 30001) mapped per Leaf to VLAN 100 / VLAN 100, then VLAN 100 / VLAN 200, then VLAN 100 / VLAN 200 / VLAN 300]
Leaf#1
vlan 2500
  vn-segment 30001
Leaf#1
bridge-domain 100
  member vni 30001
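The figure suggests that the VLAN ID is only significant on the Leaf, while the VNI identifies the segment fabric-wide. A small sketch of that idea; the Leaf names other than Leaf#1 and the exact VLAN-per-Leaf assignment are assumptions for illustration.

```python
# Per-Leaf VLAN-to-VNI mapping; different VLAN IDs can map to the same VNI.
LEAF_VLAN_TO_VNI = {
    "Leaf#1": {100: 30001},
    "Leaf#2": {200: 30001},  # hypothetical second Leaf
    "Leaf#3": {300: 30001},  # hypothetical third Leaf
}


def same_segment(leaf_a: str, vlan_a: int, leaf_b: str, vlan_b: int) -> bool:
    """Two ports share a Layer-2 segment when their VLANs map to the same VNI."""
    vni_a = LEAF_VLAN_TO_VNI[leaf_a].get(vlan_a)
    vni_b = LEAF_VLAN_TO_VNI[leaf_b].get(vlan_b)
    return vni_a is not None and vni_a == vni_b


print(same_segment("Leaf#1", 100, "Leaf#2", 200))  # True
print(same_segment("Leaf#1", 100, "Leaf#3", 100))  # False
```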
[Figure: Host1, IP: 192.168.1.11 (VRF-A), VLAN 100, SVI 100; Host2, IP: 10.10.10.22 (VRF-B), VLAN 200]
Layer-3 Multi-Tenancy – VRF-VNI or L3VNI
The Routing Domain is the VRF owning multiple Subnets across multiple Switches
In VXLAN EVPN, the Routing Domain consists of three Components
1) The Routing Domains (VRF), local to the Switch
2) The Routing Domain (L3VNI) between the Switches
3) Multi-Protocol BGP with EVPN Address-Family
Routing Domains: VRF-A (VNI 50001), VRF-B (VNI 50002)
Host1: IP: 192.168.1.11 (VRF-A), VLAN 100, SVI 100
Host2: IP: 10.10.10.22 (VRF-B), VLAN 200, SVI 200
Host3: IP: 172.16.1.33 (VRF-B), VLAN 300, SVI 300
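The relationship between hosts, VRFs and L3VNIs can be sketched as a lookup. The host-to-VRF and VRF-to-VNI values are from the slide; `routed_vni` is a helper made up for this example and ignores route leaking between VRFs.

```python
VRF_L3VNI = {"VRF-A": 50001, "VRF-B": 50002}
HOST_VRF = {"Host1": "VRF-A", "Host2": "VRF-B", "Host3": "VRF-B"}


def routed_vni(src_host: str, dst_host: str):
    """Return the L3VNI for routed traffic, or None across Routing Domains."""
    src_vrf = HOST_VRF[src_host]
    dst_vrf = HOST_VRF[dst_host]
    if src_vrf != dst_vrf:
        return None  # different tenants: no path inside the overlay
    return VRF_L3VNI[src_vrf]


print(routed_vni("Host2", "Host3"))  # 50002
print(routed_vni("Host1", "Host2"))  # None
```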
[Figure: Leafs with SVI 100, SVI 200, SVI 300 and SVI 400; two devices connected over Ethernet sub-interfaces (VLAN 1001, VLAN 1002)]
interface eth1/10.1002
  encapsulation dot1q 1002
  vrf member VRF-B
  ip address 10.2.2.1/24
  ip router ospf 100 area 0.0.0.0

interface eth1/10.1002
  encapsulation dot1q 1002
  vrf member VRF-B
  ip address 10.2.2.2/24
  ip router ospf 100 area 0.0.0.0
Integrated Route & Bridge + Multi-Tenancy
VRF-A (VNI 50001)
SVI 100, Gateway IP: 192.168.1.1 (VRF-A)
SVI 200, Gateway IP: 10.10.10.1 (VRF-A)
Host1: MAC: AA:AA:AA:AA:AA:AA, IP: 192.168.1.11 (VRF-A), VLAN 100, VXLAN VNI 30001
Host2: MAC: BB:BB:BB:BB:BB:BB, IP: 10.10.10.22 (VRF-A), VLAN 200, VXLAN VNI 30002
Host3: MAC: CC:CC:CC:CC:CC:CC, IP: 192.168.1.33 (VRF-A), VLAN 100, VXLAN VNI 30001
[Figure: Leafs with SVI 100 and SVI 200 below the Spine (RR)]
Agenda
evolves!
Story #1: Scalable Data Centre Fabric
[Figure: Border Leaf and Leafs labelled AS65555, AS65504, AS65503, AS65503 and AS65502]
• End-to-End Data-Plane encapsulation
[Figure: VXLAN Encapsulation; EVPN Control-Plane Domain 2]
Inter-Fabric Connectivity (Option 2)
• Multiple BGP-EVPN Control-Plane Domains
• Normalisation via Ethernet (MPLS, VRF-lite & IEEE 802.1Q Trunk) at the Border
• Separate Data-Plane (DP) encapsulation per Domain
• Multicast / Ingress Replication
[Figure: EVPN Control-Plane Domain 1 and Domain 2, each with VXLAN Encapsulation, joined by DCI Encapsulation]
Inter-Fabric Connectivity (Option 3 / Option 4)
• Multiple BGP-EVPN Control-Plane Domains
• Integrated Hand-Off with Data-Plane separation
  • Option 3 – L3 DCI
    • L3-LISP, MPLS, EVPN
  • Option 4 – L2 DCI
    • OTV, L2-LISP, EVPN
• Separate Data-Plane (DP) encapsulation per Domain
• Multicast / Ingress Replication
[Figure: EVPN Control-Plane Domain 1 and Domain 2, each with VXLAN Encapsulation, joined by DCI Encapsulation]
Inter-Fabric Connectivity
Property | Option 1 | Option 2 | Option 3/4
Underlay Control Plane | Unified Underlay Domain | Separated Underlay Domains | Separated Underlay Domains
Broadcast Suppression/Limit (DCI) | no | yes | yes
Layer-2 Loop Prevention | Loop mitigation (Edge Protection) | VPC at Border | Loop mitigation (At DCI)
Fabric Management & Automation
How to Achieve Data Centre Automation
• Simplify
  • Do not start with the most difficult task (start with the low-hanging fruit)
• Standardise
  • Find common Denominators and create Templates
• Automate repetitive Tasks
  • Use Templates for Simple Tasks and use Automation (e.g. create VLAN, SVI, VRF)
• Abstract
  • Take a step back and look at the WHOLE
  • Cisco ACI
Anatomy of Data Centre Automation
[Figure: VMM, Chef, OpenStack, Puppet, NX-API and Ansible sit above the API of the IP Fabric / Network Infrastructure]
Fabric Management & Operations
• Element management: Hardware Management, Health Status, Inventory
• Day-0: Configuration (POAP)
• Day-1: Configuration and Configuration Management and Underlay Management
• Day-2: Visibility, Configuration increments, compare changes. Automated Configuration, Compute Integration, Troubleshooting
Device Auto-Configuration (POAP)
Day 0, Day 0.5 and Day 1
1. Easy way to unbox and rack the device without entering any base CLI
configuration. Just rack, power, and plug into the management network.
2. Provides a standard and consistent configuration across all of the data centre
network devices.
3. Provides standard and consistent images to deploy to all of the data centre
devices.
Recommended Reading
Using TRILL, FabricPath, and VXLAN:
Designing Massively Scalable Data
Centres (MSDC) with Overlays
• Sanjay K. Hooda
• Shyam Kapadia
• Padmanabhan Krishnan
ISBN-10: 1-58714-393-3
ISBN-13: 978-1-58714-393-9
Recommended Viewing
Cisco Programmable Fabric Using
VXLAN with BGP EVPN LiveLessons
• David Jansen
• Lukas Krattiger
ISBN-10: 0-13-427229-3
ISBN-13: 978-0-13-427229-0
Q&A
Complete Your Online Session Evaluation
Give us your feedback and receive a
Cisco 2016 T-Shirt by completing the
Overall Event Survey and 5 Session
Evaluations.
– Directly from your mobile device on the Cisco Live
Mobile App
– By visiting the Cisco Live Mobile Site
http://showcase.genie-connect.com/ciscolivemelbourne2016/
– Visit any Cisco Live Internet Station located
throughout the venue
T-Shirts can be collected Friday 11 March at Registration
Learn online with Cisco Live!
Visit us online after the conference for full access to session videos and presentations.
www.CiscoLiveAPAC.com
Thank you