Cisco - From Spanning Tree to L2 Multipath
Cisco Confidential 1 © 2010 Cisco and/or its affiliates. All rights reserved.
From Spanning Tree to L2 Multipath
Jaromír Pilař
Consulting Systems Engineer
© 2011 Cisco and/or its affiliates. All rights reserved. Cisco Public 2
Agenda
L2 challenges and limitations
Spanning tree protocol - traditional approach
Multichassis Etherchannel
"Routing" at L2 - Fabricpath and TRILL
© 2010 Cisco and/or its affiliates. All rights reserved. Cisco Public BRKDCT-2049 3
Spanning Tree Protocol
Traditional approach: L2 Requires a Tree
Branches of trees never interconnect (no loop)
Spanning Tree Protocol (STP) typically used to build this tree
Tree topology implies:
Wasted bandwidth → increased oversubscription
Sub-optimal paths
Conservative convergence (timer-based) → failure catastrophic (fails open)
[Figure: 11 physical links, 5 logical links; switches S1, S2, S3]
What is Spanning Tree? Why do we need it?
A redundant connection kills a bridged network:
• No TTL at Layer 2
• A single packet can take the whole bandwidth
However, we still want to keep parallel links for redundancy
What is Spanning Tree? Why do we need it?
• Spanning Tree is a Layer 2 algorithm originally designed by Radia Perlman while working for DEC in 1985.
• Adopted into IEEE 802.1D 1990 with updates in 1998 and 2004
• This protocol provides the following:
Loop-free network
Keeps the redundancy in case of failure
Operates in a plug & play fashion
Spanning Tree Timers and reconfiguration time
• Hello_time:
time between two BPDUs
• Forward_delay:
duration of Listening and Learning stages
• Max_age:
For ports receiving BPDUs, time before the device sending BPDUs
is considered lost
• Given the following configurable parameters:
Hello time (default: 2 s, allowed range 1 - 10 s)
Max Age (default: 20 s, allowed range 6 - 40 s)
Forward Delay (default: 15 s, allowed range 4 - 30 s)
… the convergence time in the worst case is given by formula:
Max Age + (2 * Forward delay) = 50 s
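The worst-case formula above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the range checks are assumptions built from the defaults and allowed ranges listed on the slide, not part of any Cisco tool.

```python
# Illustrative helper (hypothetical name); values and ranges come from the slide.
def worst_case_convergence(max_age=20, forward_delay=15):
    """Return Max Age + 2 * Forward Delay, in seconds."""
    if not 6 <= max_age <= 40:
        raise ValueError("Max Age must be within 6-40 s")
    if not 4 <= forward_delay <= 30:
        raise ValueError("Forward Delay must be within 4-30 s")
    return max_age + 2 * forward_delay


print(worst_case_convergence())      # defaults: 20 + 2 * 15 = 50
print(worst_case_convergence(6, 4))  # minimum values of both ranges: 6 + 2 * 4 = 14
```

Even with both timers at the bottom of their ranges, classic STP still needs well over ten seconds in the worst case, which is the motivation for the faster mechanisms discussed next.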
How to reduce the convergence time
Cisco solution:
• BackboneFast
• UplinkFast
• PortFast
[Figure: tree of bridges with Bridge 1 as ROOT, then Bridge 2 and Bridge 3, then Bridge 4, Bridge 7, Bridge 5, Bridge 6]
IEEE solution: 802.1w/RSTP (Rapid Spanning Tree Protocol)
Optimizing L2 Convergence
PVST+, Rapid PVST+ or MST
Rapid-PVST+ greatly improves restoration times for any VLAN that requires topology convergence due to link up
Rapid-PVST+ also greatly improves convergence time over BackboneFast for any indirect link failures
PVST+
Traditional spanning tree implementation
Rapid PVST+
Scales to large size (~10,000 logical ports)
Easy to implement, proven, scales
MST
Permits very large scale STP implementations (~30,000 logical ports)
Not as flexible as rapid PVST+
Layer 2 Hardening
Place the root where you want it
Root primary/secondary macro
The root bridge should stay where you put it
RootGuard
LoopGuard
UplinkFast
UDLD
Only end-station traffic should be seen on an edge port
BPDU Guard
RootGuard
PortFast
[Figure labels: BPDU Guard or RootGuard, PortFast, RootGuard, STP Root, LoopGuard]
Spanning Tree Should Behave the Way You Expect
Multichassis Etherchannel
Feature Overview How does it help with STP? (1 of 2)
Before
STP blocks redundant uplinks
VLAN based load balancing
Loop Resolution relies on STP
Protocol Failure
After
No blocked uplinks
Lower oversubscription
EtherChannel load balancing (hash)
Loop Free Topology
[Figure labels: Primary Root, Secondary Root]
Feature Overview How does it help with STP? (2 of 2)
Reuse existing infrastructure
• Build Loop-Free Networks
Virtual Switching System (VSS)
VSS (Physical View) and VSS (Logical View)
[Figure: access switches, ToR switches, or blades, with attached servers, connect over 10GE using 802.3ad or PAgP; Spanning Tree is labeled in the figure]
Simplifies operational manageability via a single point of management and elimination of STP, FHRP, etc.
Doubles bandwidth utilization with active-active Multichassis EtherChannel (802.3ad/PAgP) and reduces latency
Minimizes traffic disruption from switch or uplink failure with deterministic subsecond stateful and graceful recovery (SSO/NSF)
Catalyst 6500 Virtual Switching System Overview
Virtual Switching System Architecture Virtual Switch Link (VSL)
The Virtual Switch Link joins the two physical switches together and provides the mechanism to keep both chassis in sync
A Virtual Switch Link bundle can consist of up to 8 x 10GE links
All traffic traversing the VSL is encapsulated with a 32-byte “Virtual Switch Header” containing ingress and egress switchport indexes, class of service (CoS), VLAN number, and other important information from the Layer 2 and Layer 3 headers
The control plane uses the VSL for CPU-to-CPU communication, while the data plane uses the VSL to extend the internal chassis fabric to the remote chassis
[Figure: Virtual Switch Active and Virtual Switch Standby joined by the Virtual Switch Link; frame layout on the VSL: VS Header, L2 Hdr, L3 Hdr, Data, CRC]
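To get a feel for what the 32-byte Virtual Switch Header costs on the VSL, the following sketch computes the share of VSL bytes that carry the original frame. It is a rough approximation under stated assumptions: only the header size comes from the slide, the function name is hypothetical, and preamble and inter-frame gap are ignored.

```python
VSH_BYTES = 32  # Virtual Switch Header size stated on the slide


def vsl_efficiency(frame_bytes):
    """Fraction of bytes on the VSL that belong to the original frame."""
    return frame_bytes / (frame_bytes + VSH_BYTES)


# Compare a minimum-size and a maximum-size untagged Ethernet frame.
for size in (64, 1518):
    print(size, round(vsl_efficiency(size), 3))
```

The relative overhead matters mostly for small frames, which is one more reason to keep traffic local to a chassis where possible.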
Virtual Switching System Unified Control Plane
One supervisor in each chassis, with inter-chassis Stateful Switchover (SSO) in which one supervisor is ACTIVE and the other is in HOT_STANDBY mode
Active/Standby supervisors run in synchronized mode (boot-env, running-configuration, protocol state, and line card status are synchronized)
Active supervisor manages the control plane functions such as protocols (routing, EtherChannel, SNMP, telnet, etc.) and hardware control (Online Insertion Removal, port management)
[Figure: Active Supervisor (SF, RP, PFC) and Standby HOT Supervisor (SF, RP, PFC), each chassis with CFC or DFC line cards, joined by the VSL with SSO synchronization]
Virtual Switching System Dual Active Forwarding Planes
Both forwarding planes are active
Standby supervisor and all line cards, including DFCs, are actively forwarding
VSS# show switch virtual redundancy
My Switch Id = 1
Peer Switch Id = 2
<snip>
Switch 1 Slot 5 Processor Information :
-----------------------------------------------
Current Software state = ACTIVE
<snip>
Fabric State = ACTIVE
Control Plane State = ACTIVE
Switch 2 Slot 5 Processor Information :
-----------------------------------------------
Current Software state = STANDBY HOT (switchover target)
<snip>
Fabric State = ACTIVE
Control Plane State = STANDBY
[Figure: Switch1 and Switch2, each labeled Data Plane Active]
Virtual Switching System Architecture Virtual Switch Domain
A Virtual Switch Domain ID is allocated during the conversion process and represents the logical grouping of the two physical chassis within a VSS. It is possible to have multiple VS Domains throughout the network…
Use a UNIQUE VSS Domain-ID for each VSS Domain throughout the network. Various protocols use Domain-IDs to uniquely identify each pair.
[Figure: VSS Domain 10, VSS Domain 20, VSS Domain 30]
Virtual Switching System Architecture Multichassis EtherChannel (MEC)
Prior to the Virtual Switching System, Etherchannels were restricted to reside
within the same physical switch. In a Virtual Switching environment, the two
physical switches form a single logical network entity - therefore
Etherchannels can now be extended across the two physical chassis
Both LACP and PAgP EtherChannel protocols and manual ON mode are supported…
[Figure: regular EtherChannel on a single standalone chassis compared with Multichassis EtherChannel across 2 VSS-enabled chassis]
Virtual Switching System Architecture EtherChannel Hash for MEC
EtherChannel hashing algorithms are modified in VSS to always favor locally attached interfaces
Blue traffic destined for the server will result in Link 1 in the MEC link bundle being chosen as the destination path…
Orange traffic destined for the server will result in Link 2 in the MEC link bundle being chosen as the destination path…
[Figure labels: Link 1, Link 2]
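The local-preference behavior described above can be sketched as follows. This is a simplified model, not the actual Catalyst 6500 hash: the function name, the CRC32 hash, and the flow tuple are assumptions chosen for illustration. The point is only that the hash runs over the members attached to the ingress chassis, and remote members are used only when no local member exists.

```python
import zlib


def pick_mec_link(flow, links, ingress_chassis):
    """Pick an egress link for a flow, preferring links on the ingress chassis.

    flow: tuple of strings identifying the flow (hypothetical hash input).
    links: list of (link_name, chassis_id) members of the MEC bundle.
    """
    local = [name for name, chassis in links if chassis == ingress_chassis]
    # Fall back to every bundle member only when no local link is available.
    candidates = local or [name for name, _ in links]
    digest = zlib.crc32("|".join(flow).encode())
    return candidates[digest % len(candidates)]


bundle = [("Link 1", 1), ("Link 2", 2)]
flow = ("10.0.0.1", "10.0.0.2")
print(pick_mec_link(flow, bundle, ingress_chassis=1))  # Link 1
print(pick_mec_link(flow, bundle, ingress_chassis=2))  # Link 2
```

Because traffic entering either chassis leaves on that chassis's own member link, it does not need to cross the VSL in normal operation.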
High Availability: Dual-Active Detection
[Figure labels: Active, Switch1, Switch2, VSL]
If the entire VSL bundle should happen to go down, the Virtual Switching
System Domain will enter a Dual Active scenario where both switches
transition to Active state and share the same network configuration (IP
addresses, MAC address, Router IDs, etc…) potentially causing
communication problems through the network…
3 Step Process
1. Dual-Active detection (using one or more of three available methods: ePAgP, VSLP Fast Hello, IP BFD)
2. Recovery Period: further network disruption is avoided by disabling the previous VSS active switch's interfaces connected to neighboring devices.
3. Dual-Active Restoration: when the VSL is restored, the switch that had all its interfaces brought down in the previous step reloads to boot in a preferred standby state
[Figure labels: Active, Recovery, Standby]
VSS Redundant Supervisor Support
A Supervisor failure event will down the affected chassis decreasing the VSS bandwidth by 50%
Certain devices may only single-attach to the VSS for various reasons
Service Modules/Servers
Geographic separation of VSS chassis
Costs $$
Supervisor failure events therefore require manual intervention for recovery of the affected chassis
Uplinks are not active when the Supervisor is in ROMMON mode
Nondeterministic outage time
Relies on manual process to install and convert the new Supervisor with current VSS configuration
Why Redundant Supervisors Are Needed
Virtual Switching System (VSS) Quad-Sup: Control Plane
Redundant supervisors fully boot Cisco IOS to RPR-WARM redundancy mode
[Figure: Switch-1 and Switch-2 joined by the VSL; supervisor labels: SSO Active, SSO Hot-Standby, RPR-Warm, RPR-Warm, STANDBY COLD]
Virtual Switching System (VSS) Quad-Sup: Data Plane
From the data plane perspective, the RPR-Warm supervisor operates similarly to a DFC-enabled line card. Forwarding tables are in sync and the data plane is active for module uplinks
[Figure: Switch-1 and Switch-2 joined by the VSL; all four supervisors labeled Active; label: STANDBY COLD]
Virtual Switching System (VSS) Active Supervisor Hardware Failure
1. Active VSS supervisor incurs a hardware failure
[Figure: Switch-1 (SSO Active, RPR-Warm) and Switch-2 (SSO Hot Standby, RPR-Warm) joined by the VSL; graph of Available Bandwidth (50%, 100%) against Duration for SW1 and SW2, with the SSO event at step 1; legend: line cards active; label: STANDBY COLD]
Virtual Switching System (VSS) Active Supervisor Hardware Failure
1. SSO failover to the hot-standby supervisor in switch-2
2. Switch-1 reloads and comes back online.
3. 50% bandwidth is available during switch-1 reload
[Figure: Switch-1 (RPR-Warm) and Switch-2 (SSO Active) joined by the VSL; graph of Available Bandwidth (50%, 100%) against Duration for SW1 and SW2 at steps 1 and 2; legend: line cards active, SSO = SSO Switchover, R = Reload; label: STANDBY COLD]
Virtual Switching System (VSS) Active Supervisor Hardware Failure
1. Switch-1 comes online
2. Previous RPR warm supervisor resumes SSO hot standby state
3. The failed supervisor boots up in RPR warm mode.
4. 100% bandwidth is available leveraging both switches
[Figure: Switch-1 (SSO Hot Standby, RPR Warm) and Switch-2 (SSO Active, RPR Warm) joined by the VSL; graph of Available Bandwidth (50%, 100%) against Duration for SW1 and SW2 at steps 1 to 3; legend: line cards active, R = Reload; label: STANDBY COLD]
VSS Software Upgrade: Full Image Upgrade Bandwidth Availability Graph
The following graphs illustrate the aggregate bandwidth available to the VSS
[Figure: two graphs of available bandwidth (50%, 100%) across upgrade steps 1 to 5 for SW1 and SW2. First graph: Fast Software Upgrade bandwidth availability until 12.2(33)SXI. Second graph: Enhanced Fast Software Upgrade bandwidth availability, 12.2(33)SXI and after]
With EFSU, a minimum of 50% bandwidth is available throughout the software upgrade process
At step 3 during RPR switchover, bandwidth will be dropped to 0% for 1-2 minutes
Virtual Switching System Enterprise Campus
A Virtual Switching System-enabled Enterprise Campus network
takes on multiple benefits including simplified management &
administration, facilitating greater high availability, while maintaining
a flexible and scalable architecture…
[Figure layers: Access, L2/L3 Distribution, L3 Core]
No FHRPs
No Looped topology
Policy Management
Reduced routing neighbors, Minimal L3 reconvergence
Multiple active uplinks per VLAN, No STP convergence
Virtual Switching System Data Center
A Virtual Switching System-enabled Data Center allows for maximum scalability, so bandwidth can be added when required, while still providing a larger Layer 2 hierarchical architecture free of reliance on Spanning Tree…
[Figure layers: L2/L3 Core, L2 Distribution, L2 Access]
Dual-Homed Servers, Single active uplink per VLAN (PVST), Fast L2 convergence
Dual Active Uplinks, Fast L2 convergence, minimized L2 Control Plane, Scalable
Single router node, Fast L2 convergence, Scalable architecture
Virtual Portchannel (vPC)
Feature Overview vPC Definition
Allow a single device to use a port channel across two upstream switches
Eliminate STP blocked ports and use all available uplink bandwidth
Dual-homed servers operate in active-active mode
Provide fast convergence upon link/device failure
Reduce CAPEX and OPEX
Available on all current and future generation cards
Logical Topology without vPC
Logical Topology with vPC
Feature Overview: vPC Terminology
vPC Domain - pair of vPC switches
vPC peer - a vPC switch, one of the pair
vPC member port - one of the set of ports that form a vPC
vPC - the combined port channel between the vPC peers and the downstream device
vPC peer-link - link used to synchronize state between vPC peer devices; must be 10GbE
vPC peer-keepalive link - the keepalive link between vPC peer devices (backup to the vPC peer-link)
[Figure labels: vPC Domain, vPC peer, vPC peer-link, vPC Peer-keepalive link, CFS protocol, vPC, vPC member port]
vPC on the N7k: Single-Sided vPC
[Figure: N7k01 and N7k02 with N5k01 and N5k02; interface labels 2/1, 2/2, 2/9, 2/10; port channel label Po51,2; root marked; logical equivalent shown with Root]
vPC on the N7k: Double-Sided vPC
[Figure: N7k01 and N7k02 with N5k01 and N5k02; interface labels 2/1, 2/2, 2/9, 2/10; port channels Po10 and Po51; Peer Link; primary and secondary roles; root and regular STP priority marked; logical equivalent shown with Root]
Attaching to a vPC Domain: IEEE 802.3ad and LACP
Definition:
Port-channel for devices dual-attached to the vPC pair
Provides local load balancing for port-channel members
STANDARD 802.3ad port channel
Access Device Requirements
STANDARD 802.3ad capability
LACP or static port-channels
Recommendations:
Use LACP when available for graceful failover and misconfiguration protection
[Figure labels: vPC member port, regular port-channel port]
Attaching to a vPC Domain: Dual Homed vs. Single Attached
1. Dual Attached
2. Attached via VDC/Secondary Switch
3. Secondary ISL Port-Channel
4. Single Attached to vPC Device
[Figure: the four attachment options shown against the Primary vPC (P) and Secondary vPC (S) peers, with Orphan Ports marked]
Layer 3 and vPC Designs
Use L3 links to hook up routers and peer with a vPC domain
Don’t use an L2 port channel to attach routers to a vPC domain unless you statically route to the HSRP address
If both routed and bridged traffic are required, use individual L3 links for routed traffic and an L2 port-channel for bridged traffic
[Figure labels: Router, Switch, 7k1, 7k2, Po1, Po2, L3 ECMP, P, Routing Protocol Peer, Dynamic Peering Relationship]
Spanning Tree Recommendations STP Interoperability
STP Uses:
Loop detection (failsafe to vPC)
Non-vPC attached device
Loop management on vPC addition/removal
Requirements:
Needs to remain enabled, but doesn’t dictate vPC
member port state
Logical ports still count
Best Practices:
Make sure all switches in your Layer 2 domain are running Rapid-PVST or MST (the IOS default is non-rapid PVST+) to avoid slow STP convergence (30+ secs)
Remember to configure portfast (edge port-type) on host-facing interfaces to avoid slow STP convergence (30+ secs)
[Figure caption: STP is running to manage loops outside of vPC’s direct domain, or before initial vPC configuration]
HSRP with vPC: FHRP Active/Active
Support for all FHRP protocols in Active/Active mode with vPC
No additional configuration required
Standby device communicates with the vPC manager process to determine if the vPC peer is the “Active” HSRP/VRRP peer
General HSRP best practices still apply
When running active/active, aggressive timers can be relaxed (i.e. 2-router vPC case)
[Figure: L3/L2 boundary; HSRP/VRRP “Active”: active for shared L3 MAC; HSRP/VRRP “Standby”: active for shared L3 MAC]
Feature Overview: vPC and VSS Comparison
Functionality                      VSS (Virtual Switching System)   vPC (Virtual Port Channel)
Multi-Chassis Port Channel         ✓                                ✓
Loop-free Topology                 ✓                                ✓
STP as a “fail-safe” protocol      ✓                                ✓
Control Plane                      Single Logical Node              Two Independent Nodes, both active
Support for Layer 3 port-channels  ✓                                ✗
Control Plane Protocols            Single instance                  Instances per Node
10GE ports in the Channel          8                                16
Device Configuration               Combined Configs                 Common Configs (w/ consistency checker)
Non Disruptive ISSU Support        ✗                                ✓
Layer 2 Multipath ... and what if a tree is not necessary?
Next step in model evolution - FabricPath Layer 2 Multipathing
Finally removes Spanning Tree Protocol from the network after several evolutionary intermediate steps (STP+, VSS, vPC)
• Integrates legacy devices via vPC+
• Increase bandwidth of L2 networks via multiple active links
• L3 multipathing is common in IP networks, similar principles and protocols applied to L2
• Cisco FabricPath - available for Nexus 7000 and for Nexus 5500
• Transparent Interconnection of Lots of Links (TRILL)
• Extensions to well-known protocols (IS-IS)
• Simple configuration
© 2010 Cisco and/or its affiliates. All rights reserved. Cisco Public BRKDCT-2049 48
FabricPath Introduction
FabricPath IS-IS
FabricPath IS-IS replaces STP as control-plane protocol in FabricPath network
Introduces link-state protocol with support for ECMP for Layer 2 forwarding
Exchanges reachability of Switch IDs and builds forwarding trees
Improves failure detection, network reconvergence, and high availability
Minimal IS-IS knowledge required –no user configuration by default
Maintains plug-and-play nature of Layer 2
[Figure: STP domains exchanging STP BPDUs attached to a FabricPath domain running FabricPath IS-IS]
Why IS-IS?
A few key reasons:
Has no IP dependency – no need for IP reachability in order to form adjacency between devices
Easily extensible – Using custom TLVs, IS-IS devices can exchange information about virtually anything
Provides SPF routing – Excellent topology building and reconvergence characteristics
FabricPath and Classic Ethernet (CE) Interfaces
STP FabricPath
Classic Ethernet (CE) Interface
Interfaces connected to existing NICs and
traditional network devices
Send/receive traffic in 802.3 Ethernet frame format
Participate in STP domain
Forwarding based on MAC table
FabricPath Interface
Interfaces connected to another FabricPath device
Send/receive traffic with FabricPath header
No spanning tree!!!
No MAC learning
Exchange topology info through L2 ISIS adjacency
Forwarding based on ‘Switch ID Table’
[Figure: frame formats: Ethernet on CE interfaces; Ethernet with FabricPath Header on FabricPath interfaces. Legend: FabricPath interface, CE interface]
Basic FabricPath Data Plane Operation
Ingress FabricPath switch determines destination Switch ID and imposes FabricPath header
Destination Switch ID used to make routing decisions through FabricPath core
No MAC learning or lookups required inside core
Egress FabricPath switch removes FabricPath header and forwards to CE
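The four steps above can be sketched in Python. This is a toy model, not the real FabricPath frame format: the `Frame` class, its field names, and the `ingress`/`egress` helpers are hypothetical, and the header is reduced to the destination and source Switch IDs shown on the slide (S10 as ingress, S20 as egress).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    dmac: str
    smac: str
    payload: bytes
    dsid: Optional[int] = None  # destination Switch ID (FabricPath header)
    ssid: Optional[int] = None  # source Switch ID (FabricPath header)


def ingress(frame, local_sid, mac_to_sid):
    """Ingress edge switch: find the destination Switch ID, impose the header."""
    return Frame(frame.dmac, frame.smac, frame.payload,
                 dsid=mac_to_sid[frame.dmac], ssid=local_sid)


def egress(frame):
    """Egress edge switch: strip the header before forwarding to CE."""
    return Frame(frame.dmac, frame.smac, frame.payload)


original = Frame(dmac="B", smac="A", payload=b"data")
in_core = ingress(original, local_sid=10, mac_to_sid={"B": 20})
print(in_core.dsid, in_core.ssid)   # 20 10
print(egress(in_core) == original)  # True
```

Core switches only ever look at `dsid`, which is why no MAC learning or MAC lookups are needed inside the core.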
[Figure: host MAC A reaches ingress FabricPath switch S10 through an STP domain; host MAC B sits behind egress FabricPath switch S20 through an STP domain. On the CE side the frame is DMAC→B, SMAC→A, Payload. Inside the FabricPath core the same frame carries a FabricPath header with DSID→20, SSID→10. Legend: FabricPath interface, CE interface]
FabricPath MAC Table
Edge switches maintain both MAC address table and Switch ID table
Ingress switch uses MAC table to determine destination Switch ID
Egress switch uses MAC table (optionally) to determine output switchport
Local MACs point to switchports
Remote MACs point to Switch IDs
[Figure: switches S10, S20, S30, S40 above S100, S101, S200 in the FabricPath domain; hosts MAC A, MAC B, MAC C, MAC D]
FabricPath MAC Table on S100
MAC  IF/SID
A    e1/1
B    e1/2
C    S101
D    S200
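As a rough illustration of how an edge switch uses this table, the sketch below copies the S100 entries from the slide and classifies a lookup result as local (a switchport) or remote (a Switch ID). The helper names and the rule that Switch IDs start with "S" are assumptions made for this example only.

```python
# MAC table on S100, copied from the slide.
MAC_TABLE_S100 = {"A": "e1/1", "B": "e1/2", "C": "S101", "D": "S200"}


def classify(entry):
    """Assumption for this sketch: Switch IDs start with 'S', ports with 'e'."""
    return "remote" if entry.startswith("S") else "local"


def lookup(dmac):
    """Return (kind, target) for a destination MAC on S100."""
    entry = MAC_TABLE_S100.get(dmac)
    if entry is None:
        return ("unknown", None)  # no entry for this destination MAC
    return (classify(entry), entry)


print(lookup("A"))  # ('local', 'e1/1')
print(lookup("D"))  # ('remote', 'S200')
```

A remote result gives the destination Switch ID to place in the FabricPath header; a local result is forwarded directly out of the switchport.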
FabricPath Routing Table
FabricPath IS-IS builds and manages Switch ID (routing) table
All FabricPath-enabled switches automatically assigned Switch ID (no user configuration required)
Algorithm computes shortest (best) paths to each Switch ID based on link metrics
Equal-cost paths supported between FabricPath switches
[Figure: switches S10, S20, S30, S40 above S100, S101, S200 in the FabricPath domain; S100 uplinks labeled L1, L2, L3, L4]
FabricPath Routing Table on S100
Switch  IF
S10     L1
S20     L2
S30     L3
S40     L4
S101    L1, L2, L3, L4
…       …
S200    L1, L2, L3, L4
One ‘best’ path to S10 (via L1)
Four equal-cost paths to S101
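A minimal sketch of equal-cost multipath selection over this table follows. The routing entries are copied from the slide; the hash function (CRC32 over a flow key string) and the function name are assumptions, since the slides do not specify how a flow is mapped to one of the equal-cost links.

```python
import zlib

# Switch ID (routing) table on S100, copied from the slide.
ROUTES_S100 = {
    "S10": ["L1"], "S20": ["L2"], "S30": ["L3"], "S40": ["L4"],
    "S101": ["L1", "L2", "L3", "L4"],
    "S200": ["L1", "L2", "L3", "L4"],
}


def next_hop(dest_switch, flow_key):
    """Pick one of the equal-cost links; the same flow always gets the same link."""
    paths = ROUTES_S100[dest_switch]
    return paths[zlib.crc32(flow_key.encode()) % len(paths)]


print(next_hop("S10", "A->X"))   # L1, the single best path
print(next_hop("S101", "A->C"))  # one of L1, L2, L3, L4
```

Hashing per flow rather than per frame keeps frames of one conversation on one path, so they are not reordered.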
Conversational MAC Learning
[Figure: edge switches S100, S200, S300 around the FabricPath core; hosts MAC A, MAC B, MAC C]
FabricPath MAC Table on S100
MAC  IF/SID
A    e1/1 (local)
B    S200 (remote)
FabricPath MAC Table on S200
MAC  IF/SID
A    S100 (remote)
B    e12/1 (local)
C    S300 (remote)
FabricPath MAC Table on S300
MAC  IF/SID
B    S200 (remote)
C    e7/10 (local)
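The tables above follow from the learning rules stated in the backup slides: learn directly connected MACs unconditionally, do not learn from flood frames, and learn a remote MAC only if the destination MAC is already known. The sketch below models those rules for the A-to-B conversation. The `EdgeSwitch` class is hypothetical, and reading "known" as "known as a local MAC" is an assumption of this sketch.

```python
class EdgeSwitch:
    """Toy FabricPath edge switch with conversational MAC learning."""

    def __init__(self, name):
        self.name = name
        self.mac_table = {}

    def receive_from_ce(self, smac, port):
        # Directly connected hosts are learned unconditionally.
        self.mac_table[smac] = f"{port} (local)"

    def receive_from_fabric(self, smac, dmac, ssid):
        # Learn the remote source only if the destination is a known local
        # MAC, i.e. this switch takes part in the conversation. Broadcast
        # (flood) frames never match, so nothing is learned from them.
        if self.mac_table.get(dmac, "").endswith("(local)"):
            self.mac_table[smac] = f"{ssid} (remote)"


s100, s300 = EdgeSwitch("S100"), EdgeSwitch("S300")
s100.receive_from_ce("A", "e1/1")             # A sends a broadcast ARP request
s300.receive_from_fabric("A", "FF", "S100")   # flood frame: S300 learns nothing
s100.receive_from_fabric("B", "A", "S200")    # B's unicast reply reaches S100
print(s100.mac_table)
print(s300.mac_table)
```

S100 ends up with the same two entries as on the slide, while S300, which is not part of the A-to-B conversation, never spends table space on A.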
FabricPath Multidestination Trees
Multidestination traffic constrained to loop-free trees touching all FabricPath switches
Root switch assigned for each multidestination tree in FabricPath domain
Loop-free tree built from each Root and assigned a network-wide identifier (Ftag)
Support for multiple multidestination trees provides multipathing for multi-destination traffic
Two trees supported in NX-OS release 5.1
[Figure: switches S10, S20, S30, S40 above S100, S101, S200 in the FabricPath domain. Logical Tree 1: root S10, with S100, S101, S200, S20, S30, S40 below it. Logical Tree 2: root S40, with S100, S101, S200, S10, S20, S30 below it]
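To make the idea of a loop-free tree per root concrete, the sketch below builds one tree from each root with a breadth-first search. This is a stand-in technique, not the computation FabricPath IS-IS actually performs, and the topology (every edge switch connected to every spine switch) is an assumption read off the figure. Roots S10 and S40 and the two Ftag values follow the slide.

```python
from collections import deque

SPINES = ["S10", "S20", "S30", "S40"]
EDGES = ["S100", "S101", "S200"]
# Assumed topology: every edge switch connects to every spine switch.
ADJ = {s: list(EDGES) for s in SPINES}
ADJ.update({e: list(SPINES) for e in EDGES})


def build_tree(root):
    """Return {switch: parent} for a loop-free tree that touches all switches."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in ADJ[node]:
            if neighbor not in parent:  # first visit wins, so no loops form
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


TREES = {1: build_tree("S10"), 2: build_tree("S40")}  # keyed by Ftag
for ftag, tree in TREES.items():
    print(ftag, tree)
```

Each tree reaches all seven switches using six links, and the two trees use different links near their roots, which is what allows multidestination traffic to be spread across them.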
Introducing VPC+
VPC+ allows dual-homed connections from edge ports into FabricPath domain with active/active forwarding
CE switch, Layer 3 router, dual-homed server, etc.
VPC+ requires F1 modules with FabricPath enabled in the VDC
Peer-link and all VPC+ connections must be to F1 ports
VPC+ creates “virtual” FabricPath switch for each VPC+-attached device to allow load-balancing within FabricPath domain
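[Figure: physical and logical views. Physical: Host A attaches through po3 to VPC+ peers S1 and S2 on F1 ports; S3 reaches them over L1 and L2. Logical: virtual switch S4 appears behind S1 and S2, and S3 holds the entry Host A→S4→L1,L2. Legend: FabricPath, CE]
Virtual “Switch 4” becomes next-hop for Host A in FabricPath domain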
VPC+ Physical Topology
[Figure: switches S10, S20, S30, S40 above S100 and S200 in the FabricPath domain; hosts MAC A, MAC B, MAC C]
Peer link and PKA required
Peer link runs as FabricPath core port
VPCs configured as normal
No requirements for attached devices other than channel support
VLANs must be FabricPath VLANs
VPC+ Logical Topology
[Figure: switches S10, S20, S30, S40 above S100 and S200 in the FabricPath domain, with virtual switch S1000; hosts MAC A, MAC B, MAC C]
Virtual switch introduced
VPC+ and Active/Active HSRP
With VPC+ and SVIs in mixed-chassis, HSRP Hellos sent with VPC+ virtual switch ID
FabricPath edge switches learn HSRP MAC as reached through virtual switch
Traffic destined to HSRP MAC can leverage ECMP if available
Either VPC+ peer can route traffic destined to HSRP MAC
[Figure: SVIs labeled HSRP Active and HSRP Standby; switches S10, S20, S30, S40, S100, S200 and virtual switch S1000 in the FabricPath domain; hosts MAC A, MAC B, MAC C; po1, po2, interface 1/30. HSRP Hello frame: DSID→MC, SSID→1000, DMAC→0002, SMAC→HSRP, Payload]
FabricPath & Standards
IETF standard for Layer 2 multipathing
Driven by multiple vendors, including Cisco
RFC ready for standardization
FabricPath capable hardware is also TRILL capable
http://datatracker.ietf.org/wg/trill/
What Is the Relationship between FabricPath and TRILL?
a set of Layer 2 multipathing technologies
FabricPath initial release runs in a Native mode that is Cisco-specific, using proprietary encapsulation and control-plane elements
Nexus 7000 F1 I/O modules and Nexus 5500 HW are capable of running both FabricPath and TRILL modes
FabricPath & TRILL Feature Summary
FS-link is a superset of TRILL
Feature                                FabricPath           TRILL
Frame routing (ECMP, TTL, RPFC etc…)   Yes                  Yes
vPC+                                   Yes                  No
FHRP active/active                     Yes                  No
Multiple topologies                    Yes                  No
Conversational learning                Yes                  No
Inter-switch links                     Point-to-point only  Point-to-point OR shared
Base protocol specification is now a proposed IETF standard (March 2010)
Control plane specification will become a proposed standard within months
Conclusion
L2 domain control protocol evolution
STP is still the most commonly used protocol, and over time it has been enhanced and improved in many areas
Solutions based on MEC remove some STP limitations but do not remove STP itself completely from the network
L2 multipath protocols using a different forwarding approach are emerging
Co-existence of both approaches is expected to last a long time
Thank you.
Backup slides
VSL Bandwidth Sizing & Considerations
The VSL is an EtherChannel that can include up to eight links
VSL bandwidth should be greater than or equal to the largest bandwidth connection to a single attached device (downlink)
Consider the bandwidth on a per VSS chassis basis
Consider the bandwidth for any Service Modules and SPAN sessions
Distribute the VSL interfaces across multiple modules for added resiliency
Include at least one VSL interface from the Supervisor module for faster VSL bring-up during reloads
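The first sizing rule above lends itself to a quick check. The sketch below is only a back-of-the-envelope helper with a hypothetical name; it covers the rule that VSL bandwidth should be at least the largest single-device downlink, and leaves service modules and SPAN sessions for the reader to add on top.

```python
def vsl_is_adequate(vsl_links, link_gbps, largest_downlink_gbps):
    """True if the VSL bundle is at least as large as the biggest downlink."""
    if not 1 <= vsl_links <= 8:
        raise ValueError("a VSL bundle can include 1 to 8 links")
    return vsl_links * link_gbps >= largest_downlink_gbps


print(vsl_is_adequate(2, 10, 20))  # True: 2 x 10GE covers a 20 Gbps downlink
print(vsl_is_adequate(1, 10, 20))  # False: a single 10GE VSL link is too small
```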
Putting It All Together – Host A to Host B (1) Broadcast ARP Request
[Figure: switches S10, S20, S30, S40 above S100, S101, S200 in the FabricPath domain, joined by links L1 to L12; Root for Tree 1 and Root for Tree 2 marked; hosts MAC A and MAC B; arrows labeled Broadcast → and Ftag →]
Frame from Host A: DMAC→FF, SMAC→A, Payload
Frame inside FabricPath: DSID→FF, Ftag→1, SSID→100, DMAC→FF, SMAC→A, Payload
Multidestination Trees on Switch 100
Tree  IF
1     L1,L2,L3,L4
2     L4
Multidestination Trees on Switch 10
Tree  IF
1     L1,L5,L9
2     L9
Multidestination Trees on Switch 200
Tree  IF
1     L9
2     L9,L10,L11,L12
FabricPath MAC Table on S100
MAC  IF/SID
A    e1/1 (local)
FabricPath MAC Table on S200
MAC  IF/SID
(no entries shown)
Learn MACs of directly-connected devices unconditionally
Don’t learn MACs in flood frames
Putting It All Together – Host A to Host B (2) Unicast ARP Reply
[Figure: switches S10, S20, S30, S40 above S100, S101, S200 in the FabricPath domain, joined by links L1 to L12; hosts MAC A and MAC B; arrows labeled A →, Unknown → and Ftag →]
Frame from Host B: DMAC→A, SMAC→B, Payload
Frame inside FabricPath: DSID→MC1, Ftag→1, SSID→200, DMAC→A, SMAC→B, Payload
Multidestination Trees on Switch 100
Tree  IF
1     L1,L2,L3,L4
2     L4
Multidestination Trees on Switch 10
Tree  IF
1     L1,L5,L9
2     L9
Multidestination Trees on Switch 200
Tree  IF
1     L9
2     L9,L10,L11,L12
FabricPath MAC Table on S200
MAC  IF/SID
B    e12/2 (local)
FabricPath MAC Table on S100 (first shown)
MAC  IF/SID
A    e1/1 (local)
FabricPath MAC Table on S100 (second shown)
MAC  IF/SID
A    e1/1 (local)
B    S200 (remote)
If DMAC is known, then learn remote MAC
Putting It All Together – Host A to Host B (3) Unicast Data
[Figure: switches S10, S20, S30, S40 above S100, S101, S200 in the FabricPath domain, joined by links L1 to L12; hosts MAC A and MAC B; arrows labeled B →, S200 → and Hash]
Frame from Host A: DMAC→B, SMAC→A, Payload
Frame inside FabricPath: DSID→200, Ftag→1, SSID→100, DMAC→B, SMAC→A, Payload
FabricPath MAC Table on S100
MAC  IF/SID
A    e1/1 (local)
B    S200 (remote)
FabricPath Routing Table on S100
Switch  IF
S10     L1
S20     L2
S30     L3
S40     L4
S101    L1, L2, L3, L4
…       …
S200    L1, L2, L3, L4
FabricPath Routing Table on S30
Switch  IF
…       …
S200    L11
FabricPath Routing Table on S30
Switch  IF
…       …
S200    –
FabricPath MAC Table on S200 (first shown)
MAC  IF/SID
B    e12/2 (local)
FabricPath MAC Table on S200 (second shown)
MAC  IF/SID
A    S100 (remote)
B    e12/2 (local)