BRKSPG 2206
Samer Salam
Principal Engineer, Cisco
“Flat != Easy”
Norman Finn
Cisco Fellow and IEEE 802.1 Veteran
BRKSPG-2206 © 2014 Cisco and/or its affiliates. All rights reserved. Cisco Public 4
Scaling Layer-2 Networks to Millions of End-Points?
• Addressing: Scalability, Mobility, Lookup
• Optimal Forwarding: Routing vs. Bridging, Full Network Bandwidth Use
• Efficient Interconnection of Ethernet Networks
Addressing Aspects
“How to Avoid Keeping a Host-Route for Every Host on Every Network
Element While Maintaining Mobility and Ease of Use?”
Requirement: Efficient Addressing
Approaches:
• Hierarchical addressing schemes
• Location-dependent addressing
• Control Plane Learning (where feasible)
Addresses – Location and Identity
• “Identity” addresses (“who”)
– MAC addresses typically represent an Identity
– The manufacturer’s MAC address issued to a physical Ethernet interface is a “who” address; it identifies a
station regardless of what network, or where in that network, the station is attached. Virtual Machines (VM)
typically re-use the server-assigned MAC addresses
* Locally administered MAC addresses: low-order two bits of the first byte are “10”; globally administered manufacturers’ addresses: those bits are “00”.
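The footnote above can be checked mechanically. The following is a minimal Python sketch that tests the two low-order bits of the first byte; the function names are illustrative and not taken from the slides.

```python
def parse_mac(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return octets


def is_locally_administered(mac: str) -> bool:
    """U/L bit (0x02 of the first byte) set means locally administered."""
    return bool(parse_mac(mac)[0] & 0x02)


def is_group_address(mac: str) -> bool:
    """I/G bit (0x01 of the first byte) set means a group (multicast/broadcast) address."""
    return bool(parse_mac(mac)[0] & 0x01)
```

A unicast, locally administered address therefore has the bit pattern "10" in those two positions, and a manufacturer-assigned unicast address has "00".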
MAC addresses: “Identity” vs. “Location”
• “Identity addresses”
– Switches need to learn destination hosts’ MAC addresses
– If a host moves, only the switches (rather than the hosts) need to update their forwarding tables
• “Location addresses”
– Reduce the size of the (L2) forwarding table
– Hosts change addresses when they move: Requires notification of every host.
• Approaches to combine the two worlds (i.e. namespaces)
– “Map ‘n Encap”: [FabricPath], [TRILL], [OTV], [LISP], [802.1Qbp]
– Translate: [PortLand], [MOOSE]
Identity and Location Addresses:
“Map ‘n Encap” Approaches
• Identity Addresses: Kept to the Edge
– Endpoint Identifier (EID) [LISP]
– MAC-Address [OTV], [TRILL], [PBB-EVPN]
802.1Q Data Planes Recap…
Growing service instance scale and address hierarchy (frame fields listed from outermost to innermost):
• Ethernet (“flat” connectivity): DA, SA, Ethertype, Payload
• 802.1Q, Ethernet VLAN (standard approved 1998): DA, SA, Q-TAG (C-VID), Ethertype, Payload
• 802.1ad, Provider Bridges (standard approved 2005): DA, SA, S-TAG (S-VID), C-TAG (C-VID), Ethertype, Payload
• 802.1ah, Provider Backbone Bridges (standard approved 2008): B-DA, B-SA, B-TAG (B-VID), I-TAG (I-SID), DA, SA, S-TAG (S-VID), C-TAG (C-VID), Ethertype, Payload
Combining “location” and “identity”
Example: Cisco FabricPath Frame Format
16-Byte MAC-in-MAC Header
• Outer address fields: Endnode ID (5:0), U/L, I/G, Endnode ID (7:6), RSVD, OOO/DL, Switch ID, Sub Switch ID, Port ID
• FabricPath tag fields: Etype, Ftag, TTL
• TTL: decremented at each hop to prevent frames looping infinitely
Combining “location” and “identity”
Example: FabricPath
Network-based, L2-in-L2, C-MAC Data Plane Learning, Switch-ID Control Plane distribution (ISIS)
[Figure: MAC1 sends a frame to MAC2. The ingress Edge Device S100 encapsulates the frame with outer addresses S100 → S200; the egress Edge Device S200 de-capsulates it.]
1. Layer 2 lookup on the destination MAC. MAC 2 is reachable through S200.
2. The Edge Device encapsulates the frame.
3. The transport delivers the packet to the Edge Device on the other site.
4. The Edge Device receives and de-capsulates the packet.
5. Layer 2 lookup on the original frame. MAC 2 is a local MAC.
6. The frame is delivered to the destination.
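The six steps can be sketched in a few lines of Python. This is an illustrative model of the generic “Map ‘n Encap” pattern only, not FabricPath code; the class names, the `remote` and `local` tables, and the port name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_mac: str
    payload: bytes


@dataclass(frozen=True)
class Encapsulated:
    dst_switch: str   # location address of the egress edge device
    src_switch: str   # location address of the ingress edge device
    inner: Frame      # original frame, carried unchanged


class EdgeDevice:
    def __init__(self, switch_id):
        self.switch_id = switch_id
        self.local = {}    # identity MAC -> local port
        self.remote = {}   # identity MAC -> location (remote switch ID)

    def ingress(self, frame):
        """Steps 1-2: look up the destination MAC, then encapsulate."""
        location = self.remote.get(frame.dst_mac)
        if location is None:
            return None    # unknown unicast; flooding is not modelled here
        return Encapsulated(location, self.switch_id, frame)

    def egress(self, packet):
        """Steps 4-6: de-capsulate, look up the inner MAC, pick the local port."""
        if packet.dst_switch != self.switch_id:
            raise ValueError("packet is not addressed to this edge device")
        return self.local[packet.inner.dst_mac], packet.inner
```

Only the edge devices hold identity (MAC) state; the transport in step 3 needs nothing but the switch IDs.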
Combining “location” and “identity”
Example: PBB-EVPN
Network-based, L2-in-L2-over-MPLS, B-MAC Control Plane Distribution (BGP)
[Figure: MAC1 sends a frame to MAC2. PE1 and PE2 each perform EVPN and MAC-in-MAC; between them the frame carries MPLS labels and outer addresses B-MAC1 → B-MAC2.]
• Learned table for attachment circuit on PE1: MAC1 → Int Eth 1
• Learned table for attachment circuit on PE2: MAC2 → Int Eth 2
• Table between the PEs (labelled BGP in the figure): MAC1 → B-MAC1, MAC2 → B-MAC2
Combining “location” and “identity”
Example: OTV
Network-based, L2-in-L3, Distributed (ISIS based) Control Plane Learning
[Figure: MAC1 sends a frame to MAC2. The ingress Edge Device encapsulates the frame with outer IP addresses 11.0.0.1 → 12.0.0.2; the egress Edge Device de-capsulates it. ISIS runs between the Edge Devices.]
1. Layer 2 lookup on the destination MAC. MAC 2 is reachable through IP 12.0.0.2.
2. The Edge Device encapsulates the frame.
3. The transport delivers the packet to the Edge Device on the other site.
4. The Edge Device receives and de-capsulates the packet.
5. Layer 2 lookup on the original frame. MAC 2 is a local MAC.
6. The frame is delivered to the destination.
Combining “location” and “identity”
Example: LISP
Network-based, L3-in-L3, Distributed Control Plane Learning (MR/MS/ALT)
[Figure: End-system 2.0.0.2 sends a packet to 3.0.0.3. The Ingress Tunnel Router (ITR) consults its Map Cache and encapsulates the packet with outer addresses 11.0.0.1 → 12.0.0.2; the Egress Tunnel Router (ETR) de-capsulates it. The Mapping Service consists of Map Resolver (MR), Map Server (MS) and Alternate Topology (ALT).]
End-System IDs: EID 2.0.0.2 (source), EID 3.0.0.3 (destination)
Mapping Service
EID-Prefix      Locator(s)
2.0.0.0/24      11.0.0.1
3.0.0.0/24      12.0.0.2
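The mapping table above behaves like a longest-prefix lookup from EID to locator. A minimal Python sketch using the two entries from the slide follows; the function name and the dictionary layout are assumptions for illustration and do not reflect how a LISP implementation stores its map cache.

```python
import ipaddress

# EID-prefix -> locator, taken from the mapping table on the slide
MAPPINGS = {
    ipaddress.ip_network("2.0.0.0/24"): "11.0.0.1",
    ipaddress.ip_network("3.0.0.0/24"): "12.0.0.2",
}


def lookup_rloc(eid, mappings=MAPPINGS):
    """Return the locator of the longest EID-prefix covering eid, or None."""
    address = ipaddress.ip_address(eid)
    matches = [net for net in mappings if address in net]
    if not matches:
        return None   # a real ITR would query the mapping service here
    best = max(matches, key=lambda net: net.prefixlen)
    return mappings[best]
```

For the packet in the figure, the ITR would resolve destination EID 3.0.0.3 to locator 12.0.0.2 and use that as the outer destination address.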
Example: LISP/ OTV/ VXLAN
Observation
• LISP applies to Layer-2 and Layer-3
– For L2, End-System Identifier (EID) = MAC-Address
• Layer-2 LISP/ OTV/ VXLAN: Layer 2 (L2) LISP Encapsulation Format
– Outer source: Source-site Overlay Interface Address
– Outer destination: Destination-site Overlay Interface (or Multicast) Address
– For VXLAN: “VXLAN Network Identifier (VNI)”
• See also
– http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan
– http://tools.ietf.org/html/draft-smith-lisp-layer2
Combining “Location” and “Identity”
In case you don’t like “Map ‘n Encap”… How about Translation?
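[Figure: Two switches own the location prefixes “02:22:22/24” and “02:33:33/24”. Hosts behind the first switch appear as “02:22:22:00:00:01” to “02:22:22:00:00:03”, hosts behind the second as “02:33:33:00:00:01” to “02:33:33:00:00:03”. A host with the manufacturer MAC 00:1E:13:9B:8E:10 is translated to “02:22:22:00:00:04”.]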
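The translation idea can be sketched as a per-switch table that rewrites identity MACs into location MACs under a 24-bit switch prefix. The Python below is a simplified illustration, not the [PortLand] or [MOOSE] algorithm; the class name and the sequential host-ID allocation are assumptions.

```python
class TranslatingSwitch:
    """Rewrites identity MACs to location MACs under a 24-bit switch prefix."""

    def __init__(self, prefix):
        self.prefix = prefix.lower()   # e.g. "02:22:22"
        self.to_location = {}          # identity MAC -> location MAC
        self.to_identity = {}          # location MAC -> identity MAC

    def location_for(self, identity_mac):
        """Allocate (or return) the location MAC for a locally attached host."""
        identity_mac = identity_mac.lower()
        if identity_mac not in self.to_location:
            host_id = len(self.to_location) + 1   # assumption: sequential host IDs
            suffix = ":".join(f"{b:02x}" for b in host_id.to_bytes(3, "big"))
            location = f"{self.prefix}:{suffix}"
            self.to_location[identity_mac] = location
            self.to_identity[location] = identity_mac
        return self.to_location[identity_mac]

    def identity_for(self, location_mac):
        """Reverse translation applied before delivering to the host."""
        return self.to_identity[location_mac.lower()]
```

Because every location MAC starts with the switch prefix, core switches only need one forwarding entry per edge switch rather than one per host.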
A Note on MAC-NAT and Mobility
Location-dependent addresses require notification of every switch when a host moves
• Concept
– If a host moves, it is allocated a new MAC address by its new switch
– Gratuitous ARP sent by new home switch
A Note on MAC learning
If you believe you need MAC learning, consider learning only if you have to
Conversational Learning
Learning only the MAC addresses required
[Figure: Switches S11 and S12 connect an STP domain to the L2 Fabric. Each switch holds a MAC table (MAC → IF) with local entries such as A → 2/1, B → 2/1 and C → 3/1, and remote entries pointing at S11 or S12.]
• Challenge: ALL MACs need to be learned on EVERY switch
• Large L2 domains and virtualization present challenges to MAC table scalability
• Local MAC: Source-MAC learning only happens for traffic received on CE Ports
• Remote MAC: Source-MAC for traffic received on Core-facing ports is only learned if the Destination-MAC is already known as Local
• Example: FabricPath Implementation
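The two learning rules can be expressed as a small Python model. It is a sketch of the rules as stated on the slide, not of the FabricPath implementation; the class and method names are hypothetical.

```python
class ConversationalEdge:
    def __init__(self):
        self.local = {}    # MAC -> CE port
        self.remote = {}   # MAC -> remote switch ID

    def receive_on_ce_port(self, src_mac, port):
        """Local MACs: always learn the source of traffic received on CE ports."""
        self.local[src_mac] = port

    def receive_from_core(self, src_mac, dst_mac, src_switch):
        """Remote MACs: learn the source only if the destination is a known local MAC."""
        if dst_mac in self.local:
            self.remote[src_mac] = src_switch
            return True
        return False
```

An edge switch therefore holds remote entries only for hosts that actually converse with its own local hosts, which keeps the MAC table small.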
Controlling Broadcast Traffic
Servers/Caches for Applications using Broadcast
• Example: ARP/ND
– Edge devices (e.g. with OTV) maintain an ARP cache, which is populated by snooping ARP replies.
– Initial ARP requests are broadcast to all sites, but subsequent ARP requests are answered from the cache: the edge device sends the ARP reply on behalf of the remote server (IP A).
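A minimal Python sketch of such a snooping cache follows. The class and the `'reply'`/`'flood'` return values are assumptions for illustration; they do not reflect the OTV implementation.

```python
class ArpProxy:
    def __init__(self):
        self.cache = {}   # IP address -> MAC address

    def snoop_reply(self, ip, mac):
        """Populate the cache from an ARP reply seen in transit."""
        self.cache[ip] = mac

    def handle_request(self, target_ip):
        """Return ('reply', mac) on a cache hit, else ('flood', None)."""
        mac = self.cache.get(target_ip)
        if mac is None:
            return ("flood", None)   # initial request: broadcast to all sites
        return ("reply", mac)        # answered locally on behalf of the remote host
```

Only the first request for a given IP address crosses the overlay as a broadcast; later requests stay within the local site.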
Optimal Forwarding
“How to Leverage the Entire Network Topology for Packet Forwarding
– and Approach Full Cross-Sectional Bandwidth?”
Towards a new Layer-2 Control Protocol
Why?
Optimal Forwarding
IETF TRILL and Cisco FabricPath
TRILL
IETF Approach to Shortest Path Bridging
• TRILL
(TRansparent Interconnect of Lots of Links)
– http://www.ietf.org/html.charters/trill-charter.html
• Main areas addressed by TRILL:
– Provide Shortest Path and
Equal Cost Multi-Pathing for traffic
– Be Plug-n-Play
TRILL Basics
IEEE Bridge
RBridge
TRILL
Principles of Operation
[Figure: A .1Q frame from A to E crosses IEEE bridges and RBridges. Between RBridges the frame carries an RBridge header and an outer MAC header that is rewritten at each hop.]
• Frames are encapsulated with the RBridge addresses and further encapsulated with originating RBridge and next hop RBridge MAC address
– Header fields differ from 802.1ah
– Headers are swapped hop by hop (similar to routing)
• RBridges learn what MAC addresses are on their edge ports using general dataplane learning and MAY advertise them to other RBridges
– Remote mac-address-to-rbridge binding learning: hardware or control plane
• Unknown unicast/multicast/broadcast frames are flooded along pre-calculated distribution tree(s)
TRILL Forwarding
• RBridges use ISIS for discovery and to synchronize Link State Databases
• TRILL uses these Link State Databases to
– Compute pairwise bidirectional paths for unicast (per node and/or per VLAN) between all RBridges
– For multicast, calculate distribution trees rooted at (potentially) every RBridge; trees are given an rbridge-id/nickname as well
• TRILL adds to standard IS-IS
– Ships in the night with other protocols using ISIS
– TRILL Hellos
• Find out whether nodes are on a LAN or P2P link
• Designated Rbridge (DRB) Election
• Root-Bridge-IDs
– See also: RFC 6165 (Extensions to IS-IS for Layer-2 Systems)
TRILL—Ethernet Data Encapsulation
Frame layout (outermost first):
• Outer Ethernet Header (link specific)
• Ethertype = TRILL
• TRILL Header: V | R | M | Opt-Length | Hop Count | Egress (RB2) Nickname | Ingress (RB1) Nickname
• Inner Ethernet Header
TRILL Header fields:
– V: Version
– M: Multi-destination; indicates if the frame is to be delivered to a single or multiple end stations
– Opt-Length: >0 if an Option field is present
– Hop Count (Hop Limit): Similar to TTL
– RBridge Nickname: Not the MAC address of the RBridge, but a TRILL ID for the RBridge (Egress Nickname used differently if M = 1)
[Figure: example TRILL network of RBridges RB1 to RB9 (MAC addresses MR1 to MR8), with IEEE bridges B1 to B7 and hosts M1 and M2 in two 802.1Q clouds.]
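The TRILL header fields listed on this slide fit into six bytes. The Python sketch below packs and unpacks them, assuming the bit widths from RFC 6325 as recalled here (V: 2, R: 2, M: 1, Op-Length: 5, Hop Count: 6, followed by two 16-bit nicknames); the function names are illustrative, and the widths should be checked against the RFC before relying on them.

```python
import struct


def pack_trill_header(egress, ingress, hop_count, multi_dest=False,
                      version=0, opt_length=0):
    """Pack the 6-byte TRILL header that follows Ethertype = TRILL."""
    if not (0 <= version < 4 and 0 <= opt_length < 32 and 0 <= hop_count < 64):
        raise ValueError("field out of range")
    # V (2 bits) | R (2 bits, left zero) | M (1 bit) | Op-Length (5 bits) | Hop Count (6 bits)
    first = (version << 14) | (int(multi_dest) << 11) | (opt_length << 6) | hop_count
    return struct.pack("!HHH", first, egress, ingress)


def unpack_trill_header(data):
    """Inverse of pack_trill_header; returns the fields as a dict."""
    first, egress, ingress = struct.unpack("!HHH", data[:6])
    return {
        "version": first >> 14,
        "multi_dest": bool((first >> 11) & 0x1),
        "opt_length": (first >> 6) & 0x1F,
        "hop_count": first & 0x3F,
        "egress": egress,
        "ingress": ingress,
    }
```

With `multi_dest=True` the egress nickname field identifies the distribution tree instead of a single egress RBridge, matching the note that the Egress Nickname is used differently if M = 1.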
Packet Flow — Multicast/Broadcast/Unknown Unicast
• Ingress RBridge (RB1): Perform MAC lookup on G1. Encapsulate in TRILL header, set M bit and tree id (egress rbridge id) & forward to all-RB’s mcast address
• Transit RBridge: Perform lookup on egress rbridge-id to determine distribution tree
• Egress RBridge: Decapsulate TRILL header. Perform MAC lookup on G1 to determine egress ports
Headers on the wire:
• Outer header: Outer MAC DA = All-RB-MCAST (or MR9), Outer MAC SA (MR1 on the first hop, MR5 later), Outer VLAN. Changes hop-to-hop (MACs, VLAN, TTL)
• TRILL header: Etype = TRILL, V/M/R, TTL, M=1, Egress RB-ID = RB9, Ingress RB-ID = RB1
• Inner header: Inner MAC DA = G1, Inner MAC SA = M1, Inner VLAN, Payload. Unchanged from ingress to egress
[Figure: TRILL network of RBridges RB1 to RB9 between two 802.1Q clouds; source M1 sends to group G1.]
TRILL Benefits
• Shortest path delivery of unicast
• Layer 2 multi-pathing (ECMP) of unicast
• Optimal multicast delivery over shared trees
– Load-balancing over multiple trees.
– Per-VLAN/c-group pruning of trees via IGMP/PIM snooping.
Cisco FabricPath in a Nutshell
Similarities with TRILL
– “MAC-in-MAC”-like encapsulation,
includes TTL
– ISIS based control plane (unicast and multicast)
• No MAC learning in the Fabric,
Forwarding based on “Switch IDs”
– ECMP for Multi-Path Load Balancing
– Multiple-Topologies
– Conversational Learning at the Edges
– Interworking with STP-based Ethernet Access Domains
Data Plane Operation
Forwarding Decision Based on ‘FabricPath Routing Table’
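• FabricPath header is imposed by ingress switch
• Only switch addresses are used to make “routing” decisions
• Single MAC address lookup at the edge
[Figure: host A behind S11 sends to host B behind S42 across spine switches S1 to S4. The frame A → B is carried with FabricPath header S11 → S42; the FabricPath Routing Table maps Switch → IF (entry for S42), and the edge MAC table holds B → S42.]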
FabricPath Forwarding: Unknown Unicast
[Figure: hosts A, B and C attach to edge switches S11, S12 and S42 of an L2 Fabric with spine switches S1 to S4 and links L1 to L12 (legend: FabricPath Port, CE Port). Labels in the figure: Encap, Decap, HIT, MISS, LEARN, Don’t LEARN. MAC table entries shown: A → 1/1, C → 3/1, C → S42.]
FabricPath Forwarding: Known Unicast
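[Figure: known unicast between A (S11, port 1/1) and C (S42, port 3/1) across spine switches S1 to S4 and links L1 to L12 (legend: FabricPath Port, CE Port). S11: MAC lookup HIT (C → S42), Encap; its switch table maps S42 → L1, L2, L3, L4. A spine switch table maps S42 → L12. S42: Decap, MAC lookup HIT (C → 3/1); its MAC table also holds A → S11.]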
Multicast with FabricPath
Forwarding through distinct ‘Trees’
• Several ‘Trees’ are rooted in key locations inside the fabric
• All Switches in L2 Fabric share the same view for each ‘Tree’
• Multicast traffic load-balanced across these ‘Trees’
• Ingress switch for FabricPath decides which “tree” to be used and adds the tree number in the header
[Figure: L2 Fabric with the roots for Tree #1 to Tree #4; hosts A and C at the edge.]
Migration to FabricPath
vPC+ Helps Integrating CE Devices
• Allows inserting non-FabricPath capable devices in the network, with Active/Active redundancy
[Figure: non-FabricPath capable (Classical Ethernet) devices attach via vPC+ to two Active/Active switches; L3 Routing connects to the Layer 3 Network above.]
Multiple Topologies
Topology 0
Topology 1
Topology 2
STP Interaction
FabricPath Port
CE Port
Optimal Forwarding
IEEE Shortest Path Bridging
802.1aq — Shortest Path Bridging
Motivation
802.1aq—Shortest Path Tree per Bridge
Original concept
Each bridge is the “root” of a separate shortest path tree instance
Bridge G is the root of the green tree
Bridge E is the root of the blue tree
Both trees are active AND symmetric at all times
Needed in Ethernet to have congruent multicast and unicast
[Figure: the topology of bridges A to G drawn three times, marking the root and the blocked ports of each shortest path tree.]
IEEE 802.1aq Variants
• Shortest Path Bridging MAC (SPBM) targets PBB networks where all addresses
are managed
• Shortest Path Bridging VID (SPBV) is applicable in customer, enterprise or
storage area networks
SPB
SPBV SPBM
P802.1aq Shortest Path Bridging
Provider Bridging and Provider Backbone Bridging
*”Private VLANs” Leverage the Same Concept: One Service Instance (VLAN) Leverages Two VIDs (Upstream and Downstream)
IS-IS Control Plane
• Neighbor and Topology discovery
– Shortest Path Tree computation (unicast and multicast)
– Each bridge builds view of the physical topology of the SPT Region
– Control plane learning instead of dataplane learning
• Service discovery
– I-SID registrations are included into a new TLV
Path Congruency and Symmetry
[Figure: two copies of a six-node topology (nodes 1 to 6) with link costs, one marking the unicast path and one the multicast path.]
Neighbor Handshake Mechanism
(Draft 4.4 802.1aq, Clause 13)
• Ensure that bridges with different views on the network topology do not exchange frames
– Local agreements: two-way agreement, three-way handshake (Proposal, Agreement, Agreement)
• When a Topology Change occurs, a bridge determines the set of multicast trees where the distance to the root has changed
– Remove the state for those trees and advertise a digest of the LSP database (CRC, cryptographic hash function) of the new topology database to the peers of the bridge
– On receiving a matching digest from a peer, a bridge can be sure that the neighbor has done the same and that the updated multicast state can be installed on the interface facing the peer
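The digest comparison can be illustrated with a short Python sketch. It assumes the LSP database is modelled as a dictionary of system ID → {neighbor: cost} and uses SHA-256 as a stand-in hash; the actual digest format of 802.1aq is not reproduced here, and both function names are hypothetical.

```python
import hashlib


def topology_digest(lsp_database):
    """Digest over a canonical (sorted) rendering of the LSP database."""
    hasher = hashlib.sha256()
    for system_id in sorted(lsp_database):
        hasher.update(system_id.encode())
        hasher.update(b"\x00")
        for neighbor, cost in sorted(lsp_database[system_id].items()):
            hasher.update(f"{neighbor}={cost};".encode())
        hasher.update(b"\x01")
    return hasher.hexdigest()


def may_install_multicast_state(local_db, peer_digest):
    """Install state towards a peer only when both sides agree on the topology."""
    return topology_digest(local_db) == peer_digest
```

Sorting before hashing matters: two bridges that hold the same database must produce the same digest regardless of the order in which they received the LSPs.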
Optimal Forwarding
Multi-Path Forwarding
Multi-Path Load Balancing
Approaches
• “Classical” ECMP
– [OTV], [TRILL]
• Equal Cost Trees
– [802.1aq]
• Ethernet ECMP
– [802.1Qbp]
Classic Equal Cost Multi Path (ECMP)
• Pre-Requisite
– Link-Layer Routing Protocol which can compute two or more equal cost shortest paths
between two nodes
• ECMP distributes the traffic per hop among the equal-cost paths
– Packet-based (in round-robin fashion): Can cause out-of-order packets
– Flow-based using hashing e.g. source and destination addresses (and potentially
additional header fields):
Effectiveness depends on the number and distribution of flows (according to the hash
function)
• ECMP is leveraged by [TRILL], [FabricPath]
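The difference between the two distribution modes can be shown with a minimal Python sketch. CRC32 is used only as a convenient stand-in for a hardware hash, and the function names and link names are illustrative.

```python
import zlib


def pick_next_hop(next_hops, src, dst, extra=()):
    """Flow-based ECMP: identical header fields always map to the same path."""
    key = "|".join([src, dst, *map(str, extra)]).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


def round_robin(next_hops):
    """Packet-based ECMP: cycles through the paths; may reorder a flow's packets."""
    index = 0
    while True:
        yield next_hops[index % len(next_hops)]
        index += 1
```

With flow-based hashing a single large flow always stays on one link, which is why the effectiveness depends on the number and distribution of flows.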
Equal Cost Trees (ECT)
Optimizations for 802.1aq Shortest Path Bridging
802.1aq Equal Cost Trees (ECT)
• ECT provides up to 16 symmetrical paths (by XORing the Node-ID with one of
16 predefined masks)
• Per-Hop (TRILL) vs. Global (ECT) Traffic Hashing
– Unlike TRILL, ECT can only identify a maximum of 16 different paths between any source and destination pair
– Can lead to situations where certain links are not utilized at all (depending on the hash function used)
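The XOR-mask tie-breaking can be sketched as follows. This is a simplified Python illustration of the idea, assuming small integer node IDs and example mask values; it is not the exact ECT algorithm of 802.1aq, and the function names are hypothetical.

```python
def path_id(path, mask):
    """Direction-independent key: the sorted node IDs after XOR with the mask."""
    return tuple(sorted(node ^ mask for node in path))


def select_tree_path(equal_cost_paths, mask):
    """Among equal-cost paths, pick the one with the lowest masked path ID."""
    return min(equal_cost_paths, key=lambda p: path_id(p, mask))
```

Because the key is built from a sorted set of node IDs, every bridge picks the same path in both directions, which gives the symmetry mentioned above; a different mask can make a different equal-cost path win.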
802.1Qbp ECMP
Motivation and Requirements
• Support of per-hop ECMP
• Support of TTL for loop mitigation
• Support of Flow-id
– To avoid deep packet inspection in the core
– To provide proactive service-level monitoring
• Flexible n-tuple hash algorithm for flow-identification
– Any edge node can choose any set of n-tuples and any hash algorithm to derive a flow
id
• Support proactive service-level monitoring
– For a given flow-id, the path for that flow through the network must be deterministic
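The split of work between edge and core can be sketched in Python. The hash choice (SHA-256 truncated to 16 bits) and both function names are assumptions; the requirements above deliberately leave the n-tuple and the hash algorithm to the edge node.

```python
import hashlib


def derive_flow_id(*fields):
    """Edge node: hash any chosen n-tuple down to a 16-bit flow ID."""
    digest = hashlib.sha256("|".join(map(str, fields)).encode()).digest()
    return int.from_bytes(digest[:2], "big")


def core_next_hop(flow_id, next_hops):
    """Core node: choose among equal-cost next hops from the flow ID alone."""
    return next_hops[flow_id % len(next_hops)]
```

The core never parses customer headers, and because the choice depends only on the flow ID, the path of a given flow is deterministic and can be monitored proactively.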
Recap:
Existing PBB/802.1ah Frame Format
Frame layout (outermost first): B-DA, B-SA, B-TAG (B-VID), I-TAG, DA, SA, S-TAG (S-VID), C-TAG (C-VID), Payload
I-TAG (EtherType = I-Tag) fields:
• PCP: Priority Code Point – 3 bits
• DE: Discard Eligible – 1 bit
• UCA: Use Customer Address – 1 bit
• Rsv1: Reserved1 – 1 bit
• Rsv2: Reserved2 – 2 bits
• I-SID: Service ID – 24 bits (drawn as part 1 with 8 bits and part 2 with 16 bits)
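The field widths above add up to 32 bits, which can be packed as a single word. A minimal Python sketch, assuming the fields appear in the order listed (PCP first, I-SID last) and leaving the reserved bits at zero; the function names are illustrative.

```python
import struct


def pack_itag(isid, pcp=0, de=0, uca=0):
    """Pack the 32 bits that follow EtherType = I-Tag."""
    if not (0 <= isid < 1 << 24 and 0 <= pcp < 8 and de in (0, 1) and uca in (0, 1)):
        raise ValueError("field out of range")
    # PCP (3) | DE (1) | UCA (1) | Rsv1 (1) | Rsv2 (2) | I-SID (24); reserved bits stay zero
    word = (pcp << 29) | (de << 28) | (uca << 27) | isid
    return struct.pack("!I", word)


def unpack_itag(data):
    """Inverse of pack_itag; returns the fields as a dict."""
    (word,) = struct.unpack("!I", data[:4])
    return {"pcp": word >> 29, "de": (word >> 28) & 1,
            "uca": (word >> 27) & 1, "isid": word & 0xFFFFFF}
```

The 24-bit I-SID is what raises the service instance scale from the 4096 values of a 12-bit VID to roughly 16 million.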
Proposed 802.1Qbp/ECMP Frame Format
Frame layout (outermost first): B-DA, B-SA, F-TAG (EtherType = F-Tag), I-TAG (I-SID), DA, SA, S-TAG (S-VID), C-TAG (C-VID)
• F-Tag Fields:
– PCP: Priority Code Point – 3 bits (copied from B-Tag)
– DE: Discard Eligible – 1 bit (copied from B-Tag)
– Rsv1: Reserved – 6 bits
– TTL: Hop Count – 6 bits
– Flow-ID: 16 bits
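The proposed F-Tag fields also total 32 bits. The Python sketch below packs them in the order listed on the slide (an assumption, since the slide shows a proposal and not a final standard) and models the per-hop TTL handling used for loop mitigation; the function names are hypothetical.

```python
import struct


def pack_ftag(flow_id, ttl, pcp=0, de=0):
    """Pack the proposed F-Tag: PCP (3) | DE (1) | Rsv (6) | TTL (6) | Flow-ID (16)."""
    if not (0 <= flow_id < 1 << 16 and 0 <= ttl < 64 and 0 <= pcp < 8 and de in (0, 1)):
        raise ValueError("field out of range")
    word = (pcp << 29) | (de << 28) | (ttl << 16) | flow_id   # reserved bits stay zero
    return struct.pack("!I", word)


def forward_ftag(data):
    """Decrement the TTL at a hop; return None when the frame must be dropped."""
    (word,) = struct.unpack("!I", data[:4])
    ttl = (word >> 16) & 0x3F
    if ttl <= 1:
        return None
    return struct.pack("!I", word - (1 << 16))
```

A 6-bit TTL bounds the lifetime of a looping frame to at most 63 hops.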
Interconnecting Ethernet Domains
“How to Connect Ethernet Domains Across a WAN in an Efficient Way?”
Interconnecting Ethernet Domains
The “Legacy Approach”:
Virtual Private LAN Services (VPLS)
(Almost) Emulating a Bridge: Flooding, Forwarding
• Forwarding
[Figure: CE and U-PE B attach via Ethernet UNIs; N-PE 2 and N-PE 4 each apply Split-Horizon.]
Scaling VPLS: Hierarchical VPLS
Flavors
• “H-VPLS” with Ethernet Access
– Access: 802.1ad Provider Bridges (Ethernet Bridges)
– VPLS Core: Full Mesh of PW
• “H-VPLS” with MPLS to the Edge
– Access: H&S PW Overlay, with an Active PW and a Standby PW from U-PE1 to N-PE1 and N-PE2
– VPLS Core: Full Mesh of PW
Scaling VPLS Further
Combining H-VPLS and PBB (802.1ah)*
[Figure: network segments from edge to edge: 802.1Q/802.1ad, VPLS w/ 802.1ah, VPLS, VPLS w/ 802.1ah, 802.1Q/802.1ad.]
FAT PW Encapsulation
[Figure: Original Pseudo Wire Encapsulation compared with FAT Pseudo Wire Encapsulation.]
FAT PW Architecture
FAT PW Capability Exchange
[Figure: CE1 and CE2 attach via ACs to PE1 and PE2; traffic flows cross the core via P routers (P1, P3, P4, P5), with Flow Labels added to the payload.]
Interconnecting Ethernet Domains
The “New” Approach: Ethernet VPN (EVPN)
Towards “EVPN”
Requirements
Towards “EVPN”
Solve Additional Challenges of Current VPLS for All-Active Redundancy
BGP MPLS Based Ethernet VPNs
Main Principles
• Leverage similarities with L3VPN as much as possible
– Remote learning: Distribution of Customer MAC-Addresses using BGP between PEs
– When multiple PEs/MESs announce the same C-MAC, hash to pick one PE
• Further segmentation using Ethernet-TAGs, e.g. to identify VLAN(s)
Operational Principles
• Autodiscovery
– Autodiscovery similar to current VPLS
• Multicast/Broadcast
– Distribute the MAC-SA via BGP (if new) to all other PEs and send the frame over multicast tunnel
– Far-end PE forwards the frame over local ACs (no learning)
– If a PE receives a frame with unknown MAC DA, discard the frame (or optionally forward it)
• Known Unicast
– Forward using MP2P label associated with VPLS instance (VSI)
New BGP E-VPN NLRI
Route Types
Route Usage
Operation
Loop prevention for BUM for multi-homed segments
• Designated Forwarder Election per draft-ietf-l2vpn-evpn (per ESI and Ethernet TAG)
– Active/Active support for ESI attachments via LAG
• Ethernet A-D Route per Ethernet Segment incl. ESI MPLS label
EVPN Operations Example (1/4)
M1 communicates with M2 (e.g. ARP) - Broadcast
[Figure: M1 (C-MAC1) attaches via AGG2 on ESI=1; M2 (C-MAC2) sits behind ESI=3. PE1 to PE4 connect the aggregation switches AGG1 to AGG6 (ESI=1, ESI=2, ESI=3) and exchange routes via BGP.]
iBGP L2-NLRI
• next-hop: MES1
• <C-MAC1, Label 100>
• PE1 sends this message over all its local ACs that are not blocked (for mcast/bcast) & sends it over the MP2MP LSP (of that EVI)
– Only a single AC per ESI (multi-homed ID) can be a designated forwarder (DF) to send (but not receive) mcast/bcast messages to the customer site
– Any AC in the group (per ESI) can receive mcast/bcast messages
• PE2 receives the message but drops it at its AGG2-PE2 AC, even though this AC is a DF for ESI=1, because the ESI of the frame matches the ESI of the AC
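The two filters described above (designated forwarder and ESI match) combine into one egress check. A minimal Python sketch follows, assuming the frame's ESI is known at the receiving PE (the slides signal it with the ESI MPLS label); the class, the function and the second AC name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttachmentCircuit:
    name: str
    esi: int
    is_df: bool   # designated forwarder for its Ethernet segment


def bum_egress_acs(acs, frame_esi):
    """ACs that may transmit a BUM frame received from the MPLS core.

    An AC forwards only if it is the DF for its segment and the frame did
    not originate on the same Ethernet segment (split horizon).
    """
    return [ac.name for ac in acs if ac.is_df and ac.esi != frame_esi]
```

For PE2 in the example, the AGG2-PE2 AC is a DF for ESI=1, yet a frame that entered the network on ESI=1 is filtered there, which prevents it from looping back to the multi-homed site.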
EVPN Operations Example (3/4)
Reply from M2 to M1 (Unicast)
[Figure: topology with M1 (ESI=1), M2 (ESI=3), aggregation switches AGG1 to AGG6 and PE1 to PE4.]
iBGP L2-NLRI
• next-hop: MES4
• <C-MAC2, Label 100>
EVPN Operations Example (4/4)
Reply from M2 to M1 (Unicast)
[Figure: topology with M1 (ESI=1), M2 (ESI=3), aggregation switches AGG1 to AGG6 and PE1 to PE4.]
• Since PE4 already knows that M1 sits behind PE1, it forwards the frame to PE1
– If PE4 has two BGP ECMP paths for M1 (e.g., both PE1 & PE2 advertised M1), then it uses a hash based on the L2/L3/L4 header to decide which of the two PEs to forward the frame to
• Upon receiving the frame, PE1 does a MAC lookup and forwards the frame to the AGG2-PE1 AC
Towards PBB-EVPN
Design Goals
• MAC Advertisement Route Scalability
– To support millions of C-MAC addresses (millions of VMs)
• C-MAC Mobility with “location” MAC addresses
• C-MAC Address Conversational Learning
• Interworking with TRILL & 802.1aq/.1Qbp networks w/ C-MAC Transparency
– To avoid learning of C-MACs by DC WAN Edge PE
• Per Site/Segment Policy (rather than per network)
• Avoiding C-MAC flushing upon link, port, or node failure for multi-homed devices
• Avoid transient loop for known unicast when doing egress MAC lookup
PBB-EVPN Solution Overview
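[Figure: Site 1 is multi-homed (MHN1) to PE1 and PE2 and is represented by B-MAC1; Site 2 attaches via PE4 across the PBB EVPN core. Tables shown: C-MAC → B-MAC (e.g. M1 → B-MAC1) and B-MAC → next hop (B-MAC1 → PE1, B-MAC1 → PE2).]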
• Upon an AC or PE failure, adjust the BGP path list accordingly for a given B-MAC, i.e. no C-MAC withdraw is needed
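The benefit of the B-MAC indirection can be seen in a small Python model. It is a sketch under the assumption that a remote PE keeps one C-MAC → B-MAC table and one B-MAC → next-hop list; the class and method names are hypothetical.

```python
class PbbEvpnPe:
    def __init__(self):
        self.cmac_to_bmac = {}   # C-MAC -> B-MAC binding
        self.bmac_paths = {}     # B-MAC -> list of next-hop PEs (from BGP)

    def next_hops(self, cmac):
        """Resolve a C-MAC to its candidate next-hop PEs via the B-MAC."""
        return self.bmac_paths.get(self.cmac_to_bmac.get(cmac), [])

    def withdraw(self, bmac, pe):
        """A PE or AC failure only edits the path list of the affected B-MAC."""
        self.bmac_paths[bmac] = [p for p in self.bmac_paths.get(bmac, []) if p != pe]
```

A single update to the B-MAC path list re-routes every C-MAC behind that B-MAC, so the number of BGP updates after a failure does not grow with the number of hosts.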
PBB-EVPN
Review and Solution Context
[Figure: TRILL and FabricPath domains interconnected via TRILL/FP-EVPN.]
• Addressing: Scalability, Mobility, Lookup
– “Map ‘n Encap”: Location and Identity addresses
– Avoid Flooding, control plane learning, broadcast “proxies”
• Optimal Forwarding *: Multi-Pathing, Optimal Network Topology Use
– ISIS Control Plane: Unicast & Multicast
– ECMP and enhancements
– Multiple Topologies
• Efficient Interconnection of Ethernet Networks
– ISIS/BGP Control Plane
– Control Plane learning
– Native IP and MPLS transport
– Active-Active Multi-Homing
– Efficient Multicast
*: 3x3x3 Cube World Record Holder: Feliks Zemdegs, Melbourne Winter Open 2011: 5.66s:
http://www.youtube.com/watch?v=3v_Km6cv6DU
References
• [FabricPath]: FabricPath http://www.cisco.com/en/US/prod/switches/ps9441/fabric_path_promo.html
• [PORTLAND]: PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric http://ccr.sigcomm.org/online/?q=node/503
• [MONSOON]: Towards a Next Generation Data Center Architecture: Scalability and Commoditization
http://research.microsoft.com/apps/pubs/default.aspx?id=79348