TRANSCRIPT
© 2006 Cisco Systems, Inc. All rights reserved. Cisco Confidential
Data Center Networking Architecture
Before We Get Started: Put cell phones into silent mode
Intermediate level session focused on data center front end architecture
This session is based upon the Data Center Infrastructure Design Guide 2.5.
http://www.cisco.com/application/pdf/en/us/guest/netsol/ns107/c649/ccmigration_09186a008073377d.pdf
Additional Cisco Validated Designs (CVDs) can be found at: http://www.cisco.com/go/cvd
Enterprise Data Center:
http://www.cisco.com/en/US/solutions/ns340/ns414/ns742/ns741/networking_solutions_products_genericcontent0900aecd80601e1d.html#datacenter
Agenda
Data Center Infrastructure
Core Layer Design
Aggregation Layer Design
Access Layer Design
Density and Scalability Implications
Scaling Bandwidth and Density
Spanning Tree Design and Scalability
Increasing HA in the DC
Data Center Evolution
Data Center 1.0: Mainframe, Centralized
Data Center 2.0: Client-Server and Distributed Computing, Decentralized
Data Center 3.0: Service Oriented and Web 2.0 Based, Virtualized
(Chart axes: IT Relevance and Control vs. Application Architecture Evolution)
Consolidate, Virtualize, Automate
Application Centric Architecture: Two Sides of the Same Coin
Data Center Architecture Overview: Layers of the Enterprise Multi-Tier Model
Multi-tier application architecture logically overlaid on the network
Layer 2 and Layer 3 access topologies
Dual- and single-attached servers, mainframes, and blade chassis
Multiple aggregation modules
L2 adjacency requirements
Stateful services for security and load balancing
Blade Chassis w/ Integrated Switch
L3 Access
Blade Chassis w/ Pass Thru
Mainframe w/ OSA
L2 w/ Clustering and NIC Teaming
Enterprise Core
DC Aggregation/Distribution
DC Access
DC Core
Core Layer Design
Core Layer Design: Requirements
Is a separate DC core layer required? Consider:
10GigE port density
Administrative domains
Anticipated future requirements
Key core characteristics:
10GE scalability
Distributed forwarding architecture
Advanced link load balancing
Scalable IP multicast support
Scaling
Campus Core
DC Core
Aggregation
Campus Distribution
Campus Access Layer
Server Farm Access Layer
Scaling
Core Layer Design: L2/L3 Characteristics
Layer 3 core:
Equal cost multi-path (ECMP) load balancing
EIGRP/OSPF for fast convergence
L2 extension through the core is not recommended
CEF* hashing algorithm:
Default hash is on L3 IP addresses only
L3 + L4 port hash will improve load distribution
CORE1(config)# mls ip cef load full simple
Leverages automatic source port randomization in the client TCP stack
Campus Core
DC Core
Aggregation
WebServers
ApplicationServers
DatabaseServers
Access
L3L2
CEF HashApplied to Packets on Equal Cost Routes
*CEF = Cisco Express Forwarding
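The gain from adding L4 ports to the hash can be sketched with a toy model. Python's built-in `hash()` stands in for the actual CEF hash here, so this illustrates the idea, not Cisco's algorithm:

```python
import random
from collections import Counter

def ecmp_link(flow, n_links, use_l4=False):
    # Toy stand-in for the CEF hash: pick an equal-cost link per flow.
    key = flow if use_l4 else flow[:2]   # L3-only hashes on src/dst IP alone
    return hash(key) % n_links

# Many flows between one client proxy and one VIP, random source ports
random.seed(1)
flows = [("10.1.1.1", "10.20.5.10", random.randint(1024, 65535), 80)
         for _ in range(1000)]

l3_only = Counter(ecmp_link(f, 2) for f in flows)
l3_l4 = Counter(ecmp_link(f, 2, use_l4=True) for f in flows)
# L3-only polarizes all 1000 flows onto a single link; L3+L4 uses both
```

This is why the full hash benefits from the client TCP stack's source port randomization: every new connection gets a fresh hash key.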
Core Layer Design: Routing Protocol Design: OSPF
Isolate the DC network with a dedicated OSPF area
A Not So Stubby Area (NSSA) helps limit LSA propagation but permits route redistribution (RHI)
Advertise default into the NSSA; summarize routes out
Use "auto-cost reference-bandwidth" to support 10G links
Loopback interfaces simplify troubleshooting (neighbor ID)
Use passive-interface default; enable peering only on L3 links
Use authentication: more secure and avoids undesired adjacencies
Interface hello/dead timers = 1/3
Campus Core
DC Core
Aggregation
WebServers
ApplicationServers
DatabaseServers
Access
DC Subnets(Summarized)
Default Default
L3 vlan-ospf
Area 0 / NSSA
L0=10.10.3.3 L0=10.10.4.4
L0=10.10.1.1 L0=10.10.2.2
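The reason "auto-cost reference-bandwidth" matters at 10G follows from the OSPF cost formula (cost = reference bandwidth / interface bandwidth, minimum 1); a quick sketch:

```python
def ospf_cost(link_mbps, reference_mbps=100):
    # OSPF interface cost = reference bandwidth / link bandwidth (minimum 1)
    return max(1, reference_mbps // link_mbps)

# With the default 100 Mbps reference, FE, GE, and 10GE all cost 1:
default_costs = [ospf_cost(bw) for bw in (100, 1000, 10000)]
# "auto-cost reference-bandwidth 10000" restores the distinction:
tuned_costs = [ospf_cost(bw, reference_mbps=10000) for bw in (100, 1000, 10000)]
```

Without the tuned reference bandwidth, SPF cannot prefer a 10GE path over a GE path.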
Core Layer Design: Routing Protocol Design: EIGRP
Use summarization and a default route to isolate the DC from network events
Advertise default into the DC with an interface command on the core:
ip summary-address eigrp 20 0.0.0.0 0.0.0.0 200
A cost of 200 is required to prefer the Internet default route over the local Null0 default route installed by summary-address
If other default routes exist (from the Internet edge, for example), distribute lists may be needed to filter them out
Use passive-interface default
Summarize DC subnets toward the core with an interface command on the aggregation switches:
ip summary-address eigrp 20 10.20.0.0 255.255.0.0
Campus Core
DC Core
Aggregation
WebServers
ApplicationServers
DatabaseServers
DC Subnets(Summarized)
L3 vlan-eigrp
Default Default
Access
Aggregation Layer Design
Aggregation Layer Design: Spanning Tree Design
Rapid PVST+ (802.1w) or MST (802.1s): choose .1w or .1s based on the scale of logical and virtual ports required
Rapid PVST+ is recommended as the best replacement for 802.1d
Fast converging: inherits Cisco enhancements (UplinkFast, BackboneFast)
Combined with RootGuard, BPDU Guard, LoopGuard, and UDLD, achieves STP stability
Access layer uplink failures: ~300 ms to 2 sec
Most flexible design options
UDLD global mode enables UDLD only on fiber ports; it must be enabled manually on copper ports
Root Primary, HSRP Primary, Active Context
Root Secondary, HSRP Secondary, Standby Context
Core
RootGuard / LoopGuard / BPDU Guard / UDLD Global
Aggregation Layer Design: Integrated Services
Services: firewall, load balancing, SSL encryption/decryption
L4-L7 services integrated in the Cisco Catalyst® 6500
Server load balancing, firewall, and SSL services may be deployed in:
Active-standby pairs (ACE, FWSM 2.x)
Active-active pairs (ACE, FWSM 3.1)
Integrated blades optimize rack space, cabling, and management, providing flexibility and economies of scale
Influences many aspects of overall design
Aggregation Layer Design: Active-Standby Service Design
Active-standby services:
Application Control Engine
Firewall Services Module
SSL Module
Underutilizes:
Access layer uplinks
Service modules
Aggregation switch fabrics
Advantages:
Widely deployed
Predictable traffic patterns
Easier to configure and manage
Consistent performance under failure conditions
Core
Root Primary, HSRP Primary, Active Context
Root Secondary, HSRP Secondary, Standby Context
Aggregation Layer Design: Active-Active Service Design
VLAN 6: Root Primary, HSRP Primary, Active Context
Application Control Engine (ACE): active-standby distribution per context
Firewall Services Module (3.x): two active-standby groups permit distribution of contexts across two FWSMs
Permits uplink load balancing while services are applied
Increases overall service performance
VLAN 5: Root Primary, HSRP Primary, Active Context
VLAN 6: Root Secondary, HSRP Secondary, Standby Context
vlan6 vlan6
VLAN 5: Root Secondary, HSRP Secondary, Standby Context
Core
vlan5 vlan6 vlan6 vlan5
Tech Tip: Virtual contexts are key to active/active designs
Aggregation Layer Design: Establishing Inbound Path Preference
vlan6 vlan6
vlan5 vlan6 vlan6 vlan5
Core
Use the Route Health Injection (RHI) feature of ACE
Aligns the advertised route of the VIP with the active context on ACE, FWSM, and SSL service modules
Avoids unnecessary use of the inter-switch link and asymmetrical flows
Introduce a route-map on the RHI-injected route to set the desired metric
1. ACE probes the real servers behind the VIP to determine health
2. If healthy, ACE installs a host route to the VIP on the local MSFC
3. EIGRP/OSPF propagates the RHI VIP route to the core to attract traffic
4. If context failover occurs, RHI and route preference follow
Establish route preference for service-enabled applications
Aggregation Layer Design: Scaling the Aggregation Layer
Aggregation modules provide:
Spanning tree scaling
HSRP scaling
Access layer density
Application services scaling (SLB/firewall)
Fault domain sizing
Core layer provides inter-aggregation-module transport:
Inter-agg module transport in the multi-tier model
Low latency distributed forwarding (use DFCs)
Forwarding rates in the hundreds of thousands of packets per second
Campus Core
DC Core
Server Farm Access Layer
Scaling
Aggregation Module 1
Aggregation Module 2
Aggregation Module 3
VRF-Green
VRF-Blue
VRF-Red
Agg2Agg1
802.1Q Trunks
VLANs Isolate Contexts on Access
Alternate primary contexts on Agg1 and Agg2 to achieve an active-active design
Aggregation Layer Design: Using Virtual Routing and Forwarding (VRF)
Enables virtualization/partitioning of network resources (MSFC, ACE, FWSM)
Permits use of application services with multiple access topologies
Maps well to path isolation MAN/WAN designs such as MPLS or Multi-VRF (VRF-Lite)
Security policy management and deployment by user group/VRF
Firewall and SLB contexts for Green, Blue, and Red
MPLS or other Core
DC Core
Access Layer Design
Access Layer Design: Defining Layer 2 Access
The access layer connects servers and hosts to the network
Proprietary server protocols require Layer 2 adjacency
L2 topologies consist of looped and loop-free models:
Looped: VLANs are extended across the inter-switch link trunk
Loop-free: VLANs are not extended across the inter-switch link trunk
L3 routing is typically performed in the aggregation layer
Stateful services at the aggregation layer can be provided across the L2 access (FW, SLB, SSL, etc.)
802.1Q Trunks
L3 Agg2
DC Core
Primary Root, Primary HSRP
Active Services
Secondary Root, Secondary HSRP, Standby Services
L2 / Agg1 / Inter-Switch Link
Access Layer Design: Establish a Deterministic Model
802.1Q Trunks
L3 Agg2
DC Core
Primary Root, Primary HSRP
Active Services
L2 / Agg1
Align active components in the traffic path on a common aggregation switch:
Primary STP root:
Agg-1(config)# spanning-tree vlan 1-10 root primary
Primary HSRP (outbound):
standby 1 priority X
Active service modules/contexts
Path preference (inbound): use RHI and a route-map:
route-map preferred-path
 match ip address x.x.x.x
 set metric -30
(also see the Establishing Inbound Path Preference slide)
Path Pref
L3+L4 Hash
Def gwy
Access Layer Design: Balancing VLANs on Uplinks: L2 Looped Access
Distributes load across uplinks
STP blocks the uplink path for VLANs toward the secondary root switch
If active/standby service modules are used, consider inter-switch link utilization:
Multiple service modules may be distributed across both agg switches to achieve balance
Consider the bandwidth of the inter-switch link in a failure situation
If active/active service modules are used (ACE and FWSM 3.1):
Balance contexts and HSRP across agg switches
Consider establishing inbound path preference
802.1q Trunks
Blue-Primary Root, Blue-Primary HSRP
Red-Secondary Root, Red-Secondary HSRP
Red-Primary Root, Red-Primary HSRP
Blue-Secondary Root, Blue-Secondary HSRP
L3 Agg2
L2
Agg1 / L3+L4 Hash / DC Core
Def gwy Def gwy
Access Layer Design: Looped Design Model
VLANs are extended between aggregation switches, creating the looped topology
Spanning tree is used to prevent actual loops (Rapid PVST+, MST)
A redundant path exists through a second path that is blocking
Two looped topology designs: triangle and square
VLANs may be load balanced across access layer uplinks
Inter-switch link utilization must be considered, as this link may be used to reach active services
Secondary STP Root, Secondary HSRP, Standby Services
Primary STP Root, Primary HSRP
Active Services
(Diagram port states: F = forwarding, B = blocking)
Looped Triangle Looped Square
L3 L2
Inter-Switch Link
.1Q Trunk
Access Layer Design: L2 Looped Topologies
L2 Triangle / L2 Square
Looped Square Access:
Supports VLAN extension/L2 adjacency across the access layer
Resiliency achieved with dual homing and STP
Quick convergence with 802.1w/s
Supports stateful services at the aggregation layer
Active-active uplinks align well to ACE/FWSM
Achieves a higher density access layer, optimizing 10GE aggregation layer density
Looped Triangle Access:
Supports VLAN extension/L2 adjacency across the access layer
Resiliency achieved with dual homing and STP
Quick convergence with 802.1w/s
Supports stateful services at the aggregation layer
Proven and widely used
VLAN 10
VLAN 20
VLAN 10
Row 2, Cabinet 3 Row 9, Cabinet 8
L3L2
VLAN 10
VLAN 20
VLAN 10
Row 2, Cabinet 3 Row 9, Cabinet 8
Access Layer Design: Benefits of L2 Looped Design
Services like firewalls and load balancing can easily be deployed at the aggregation layer and shared across multiple access layer switches
VLANs are primarily contained between pairs of access switches, but VLANs may be extended to different access switches to support:
NIC teaming
Clustering (L2 adjacency)
Administrative reasons
Geographical challenges
Row A, Cab 7 Row K, Cab 3
L3 L2
DC Core
Access Layer Design: Loop-Free Design
An alternative to the looped design
Two loop-free models: U and inverted U
Benefit: spanning tree is enabled, but no port is blocking, so all links are forwarding
Benefit: less chance of a loop condition due to configuration errors or other anomalies
The L2/L3 boundary varies by the loop-free model used: U or inverted U
Implications to consider for service modules, L2 adjacency, and single-attached servers
Loop-Free U / Loop-Free Inverted U
L3
L2
DC Core
L3
L2
Access Layer Design: Drawbacks of Layer 2 Looped Design
Main drawback: if frame looping occurs, the network may become unmanageable due to infinite replication of frames
802.1w Rapid PVST+, combined with STP-related features and best practices, improves stability and helps prevent loop conditions:
UDLD
LoopGuard
RootGuard
BPDU Guard
Limit domain size
Stay under the STP watermarks for logical and virtual ports
3/2 3/2
3/1 3/1Switch 1 Switch 2
DST MAC 0000.0000.4444
0000.0000.3333
DST MAC 0000.0000.4444
Access Layer Design Loop-Free Topologies
L2 Loop-Free Inverted U / L2 Loop-Free U
L3L2
VLAN 10
VLAN 20
L3L2
VLAN 10
VLAN 20
VLAN 10
Loop-Free Inverted U Access:
Supports VLAN extension
No STP blocking; all uplinks active
An access switch uplink failure black-holes single-attached servers
Supports all service module implementations
Single-attached servers are prone to isolation during failures
Loop-Free U Access:
VLANs contained in switch pairs (no extension outside the switch pair)
No STP blocking; all uplinks active
Autostate implications for certain service modules (CSM)
ACE supports autostate and per-context failover
Potential for split subnets during failure
Access Layer Design Loop-Free U Design and Service Modules (1)
HSRP
The SVI for interface VLAN 10 goes to the down state if the uplink is the only interface in VLAN 10
MSFC on Agg1
L3L2
VLAN 20, 21
Agg1, Active Services
Agg2, Standby Services
VLAN 10
HSRP Secondary on Agg2 Takes Over as def-gwy, But How to Reach Active Service Modules?
VLAN 11
If the uplink connecting access and aggregation goes down, the VLAN interface on the MSFC goes down as well, due to the way autostate works
CSM and FWSM have implications, as autostate is not conveyed (leaving a black hole)
Tracking and monitoring features may be used to fail over service modules based on uplink failure, but would you want a service module failover for one access switch uplink failure?
Loop-free L2 access is not recommended with active-standby service module implementations (see the ACE slide next)
Access Layer Design Loop-Free U Design and Service Modules (2)
HSRP
L3L2
VLAN 20, 21 / VLAN 10, 11
HSRP Secondary on Agg2 Takes over as def-gwy
With ACE:
ACE Context 1 / ACE Context 1
Per-context failover with autostate
If the uplink to Agg1 fails, ACE can switch over to Agg2 (in under 1 sec)
Requires ACE autostate on the access trunk side for failover
May be combined with FWSM 3.1 for an active-active design
Access Layer Design Drawbacks of Loop-Free Inverted-U Design
DC Core
Single-attached servers are black-holed if an access switch uplink fails
Distributed EtherChannel® can reduce the chance of black-holing
NIC teaming improves resiliency in this design
Inter-switch link scaling needs to be considered when using active-standby service modules
L3L2
Virtual Switching System 1440: Network System Virtualization
Aggregation/Access
Server Access
Features:
Network system virtualization
Inter-chassis Stateful Switchover (SSO)
Multi-Chassis EtherChannel (MEC)
Benefits of VSS:
Increased operational efficiency via a simplified network
Boosts non-stop communication
Scales system bandwidth capacity to 1.4 Tbps
Virtual Switching System: Data Center
Virtual Switching in the data center increases bandwidth scalability while still providing a Layer 2 hierarchical architecture without relying on spanning tree
Data Center VSS design guide: Summer 2008
L2/L3 Core
L2 Aggregation*
L2 Access
Single router node, Fast convergence, Scalable architecture
Dual Active Uplinks, Fast L2 convergence, minimized L2 Control Plane, Scalable
Dual-Homed Servers, Single active uplink per VLAN (PVST), Fast L2 convergence
*Service Module support in August 2008.
Access Layer Design: Comparing Looped, Loop-Free, and VSS

| Design | Uplink VLANs on Agg Switch in Blocking or Standby State | VLAN Extension Supported Across Access Layer | Service Module Black-Holing on Uplink Failure (3,4,5) | Single-Attached Server Black-Holing on Uplink Failure (1,2) | Access Switch Density per Agg Module | Must Consider Inter-Switch Link Scaling |
| Looped Triangle | - | + | + | + | - | + |
| Looped Square | - | + | + | + | + | + |
| Loop-Free U | + | + | - | - | + | + |
| Loop-Free Inverted U | - | + | + | +/- | + | + |
| VSS (6) | + | + | + | + | + | + |

1. Use of Distributed EtherChannel greatly reduces the chance of a black-holing condition
2. NIC teaming can eliminate the black-holing condition
3. When service modules are used and active service modules are aligned to Agg1
4. The ACE module permits L2 loop-free access with per-context switchover on uplink failure
5. Applies when using CSM or FWSM in an active/standby arrangement
6. DC VSS design guide to be released in Summer '08
Access Layer Design:
L3 Design Model
Access Layer Design: Defining Layer 3 Access
DC Core
DC Aggregation
DC Access
L3 access switches connect to aggregation with routed interfaces
All uplinks are active (ECMP), no spanning tree blocking occurs
.1Q trunks between pairs of L3 access switches to support L2 adjacency server requirements
Convergence time is usually better than Spanning Tree
Provides isolation/shelter for hosts affected by broadcasts
L3L2
DC Core
Access Layer Design: Need L3 for Multicast Sources?
Multicast sources on L2 access work well with IGMP snooping
IGMP snooping at the access switch automatically limits multicast flows to interfaces with registered clients in the VLAN
Use L3 when IGMP snooping is not available or when particular L3 administrative functions are required
DC Core
DC Aggregation
DC Access L3L2
DC Core
L3 Access with Multicast Sources
Access Layer Design: Benefits of L3 Access
Minimizes broadcast domains, attaining a high level of stability
Meets server stability requirements; can isolate particular application environments
Creates smaller failure domains, increasing stability
All uplinks are active paths, no blocking (up to the ECMP maximum)
Fast uplink convergence: failover and fallback, with no ARP table to rebuild on aggregation switches
DC Core
DC Aggregation
DC Access L3L2
DC Core
Access Layer Design: Drawbacks of Layer 3 Design
L2 adjacency is limited to access pairs (clustering and NIC teaming limited)
IP address space management is more difficult (small subnets)
If migrating to L3 access, IP address changes may be difficult on servers (and may break applications)
Normally requires services to be deployed at the access layer pair to maintain L2 adjacency with servers and provide stateful failover
DC Core
DC Aggregation
DC Access L3L2
DC Core
Access Layer Design: L2 or L3? What Are My Requirements?
Difficulties in managing loops
Staff skill set; time to resolution
Convergence properties
NIC teaming; adjacency
HA clustering; L2 adjacency
Ability to extend VLANs
Specific application requirements
Broadcast domain sizing
Oversubscription requirements
Link utilization on uplinks
Service module support/placement
Rapid PVST+ or MST
OSPF, EIGRP
Aggregation
Access
The Choice of One Design Versus the Other: Layer 2 vs. Layer 3
L3L2
L3L2
Aggregation
Access
Access Layer Design: BladeServers
Blade Server Requirements: Connectivity Options
Using Integrated Ethernet Switches / Using Pass-Through Modules
Blade Server Chassis Blade Server Chassis
Aggregation Layer
Interface 1Interface 2
External L2 Switches
Integrated L2 Switches
Tech Tip: Avoid a hierarchy of L2 switches. Traffic patterns are difficult to predict during failure conditions.
Blade Server Requirements: Trunk Failover Feature
Switch takes down server interfaces if corresponding uplink fails, forcing NIC teaming failover
Solves NIC teaming limitations; prevents black-holing of traffic
Achieves maximum bandwidth utilization:
No blocking by STP, but STP is enabled for loop protection
Can distribute trunk failover groups across switches
Dependent upon the NIC feature set for NIC Teaming/failover
Aggregation
Interface 1Interface 2
Blade Server Chassis
Integrated L2switches
Cisco Virtual Blade Switch (VBS): Overview of Concept and Benefits
Management Simplification
– Operational simplification:
• Single switch per rack to manage
• True plug-n-play of switches
– Design simplification:
• Sharing uplinks helps reduce cables
• Reduction in the number of logical nodes in the L2/L3 network helps improve network convergence
– Operational consistency:
• Familiar IOS CLI, MIBs, and management tools like CiscoWorks
• Consistent end-to-end features and functionality
Performance & Scalability
– Up to 160G configurable bandwidth out of the rack
– Rack switch allows servers to double bandwidth with no additional cost
Virtual Blade Switch
Traditional Blade Switch
NEW
Cisco Catalyst Virtual Blade Switch: Topology Highlighting Key Benefits
Aggregation Layer
With Catalyst 6500 VSS, all links are utilized
Local traffic doesn't go to the distribution switch
Access Layer (Virtual Blade Switch)
Single switch/node (for spanning tree, Layer 3, or management)
Greater server bandwidth via active-active server connectivity
Mix-n-match GE and 10GE switches
Higher resiliency with EtherChannel
Cisco Catalyst Virtual Blade Switch: Multiple Deployment Options
Common scenario: single Virtual Blade Switch
Separate rings: separate VBS; more resilient
4-NIC server scenario: more server bandwidth (VMware); creates smaller rings
* Design guide recommendations have not been established yet.
Density and Scalability Implications in the Data Center
Density and Scalability Implications: Modular, Top of Rack, and Blade Access Switching Models
Where are the issues?
Cabling
Power
Cooling
Spanning tree scalability
Management
Oversubscription
Sparing
Redundancy
The right solution is usually based on business requirements
Hybrid implementations can and do exist
Density and Scalability Implications: Server Farm Cabinet Layout
N racks per row
Considerations:
How many interfaces per server?
Top of rack, blade switches, or end of row?
Separate switch for the management network?
Cabling: overhead, under floor, patch systems?
Cooling capacity?
Power distribution/redundancy?
~30-40 1RU servers per rack
Density and Scalability Implications: Density: How Many NICs to Plan For?
Front End Interface
Back End Interface
Backup Network
OOB Management
Storage HBA or GE NIC
Three to four NICs per server are common:
Front end or public interface
Back end or private interface
Storage interface (GE, FC)
Backup interface
Integrated Lights-Out (iLO) for OOB management
May require more than two TOR switches per rack:
30 servers @ 4 ports = 120 ports required in a single cabinet (3 x 48-port 1RU switches)
May need hard limits on cabling capacity
Avoid cross-cabinet and other cabling nightmares
Cabling remains in cabinets
Single Rack-2 Switches Dual Rack-2 Switches
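The port arithmetic above can be sketched directly (48-port 1RU switches assumed, matching the example):

```python
import math

def cabinet_ports(servers, nics_per_server, ports_per_switch=48):
    # Total server-facing ports needed, and the 1RU switches to supply them
    ports = servers * nics_per_server
    switches = math.ceil(ports / ports_per_switch)
    return ports, switches

ports, switches = cabinet_ports(30, 4)   # 120 ports -> 3 x 48-port switches
```

Running the same function with two NICs per server shows why the NIC count, not the server count, usually drives TOR switch sizing.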
Density and Scalability Implications: Cabinet Design with Top of Rack Switching
Servers Connect to a Top of Rack (TOR) Switch
Minimizes the number of cables to run from each cabinet/rack
If NIC teaming is supported, two TOR switches are required
Will two TOR switches provide enough port density?
Cooling requirements usually do not permit a full rack of servers
Redundant switch power supplies are an option
No redundant switch processors
GEC or 10GE uplink considerations
Cabling remains in cabinets
Single Rack-2 Switches Dual Rack-2 Switches
Density and Scalability Implications: Top of Rack (TOR) Switching Model
Pro: Efficient cabling
Pro: Improved cooling
Con: Number of devices/management
Con: Spanning tree load
Access
Aggregation
Cabinet 1, Cabinet 2, … Cabinet 25
With ~1,000 servers in 25 cabinets = 50 switches
4 uplinks per cabinet = 100 uplinks total to the agg layer
GEC or 10GE Uplinks?
DC Core
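The switch and uplink counts follow from simple multiplication; a sketch (40 servers per cabinet is assumed here to match the ~1,000 servers / 25 cabinets example):

```python
def tor_rollup(servers, servers_per_cabinet=40,
               switches_per_cabinet=2, uplinks_per_switch=2):
    # Cabinets, TOR switches to manage, and uplinks into the agg layer
    cabinets = servers // servers_per_cabinet
    switches = cabinets * switches_per_cabinet
    uplinks = switches * uplinks_per_switch
    return cabinets, switches, uplinks

cabinets, switches, uplinks = tor_rollup(1000)   # 25 cabinets, 50 switches, 100 uplinks
```

The uplink total is what the aggregation layer must terminate, which is why TOR models push 10GE density requirements upward.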
Density and Scalability Implications: Cabinet Design with Modular Access Switches
Servers Connect Directly to a Modular Switch
Cable bulk can be difficult to manage and can block cool air flow
Switch placement at the end of a row or within a row
Minimizes cabling to aggregation
Reduces the number of uplinks/aggregation ports
Redundant switch power and processors are options
GEC or 10GE uplink considerations
NEBS considerations
Cables route under raised floor or in overhead trays
Density and Scalability Implications: Access Network Topology with Modular Switches
Pro: Fewer devices/management
Con: Cabling challenges
Con: Cooling challenges
2 Uplinks per 6509 = 16 Uplinks to Agg Layer
Aggregation
Access
DC Core
GEC or 10GE Uplinks?
With ~1,000 servers and 9-slot access switches = 8 switches; ~8 access switches to manage
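The same rollup for the modular model (the ~125-servers-per-chassis figure is an assumption chosen to match the ~1,000 servers / 8 switches example on this slide):

```python
import math

def modular_access(servers, servers_per_chassis=125, uplinks_per_switch=2):
    # Chassis count for the server farm and the resulting agg-layer uplinks;
    # servers_per_chassis is an assumed figure, not a fixed 6509 limit.
    switches = math.ceil(servers / servers_per_chassis)
    return switches, switches * uplinks_per_switch

switches, uplinks = modular_access(1000)   # 8 switches, 16 uplinks
```

Compared with the TOR rollup (50 switches, 100 uplinks for the same server count), this is the management and uplink tradeoff in numbers.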
Density and Scalability Implications: Modular, Top of Rack, and Blade Comparison
Aggregation
Top of Rack & Blade Switch
Access
Fewer Uplinks
More Uplinks
Fewer I/O Cables
More I/O Cables
Higher STP Proc
Lower STP Proc
Modular Access
Scaling Bandwidth and Density
Scaling B/W with GEC and 10GE: Optimizing EtherChannel Utilization
The ideal distribution is the top-right graph
The bottom-left graph is more typical
Analyze the traffic flows in and out of the server farm:
IP addresses (how many?)
L4 port numbers (randomized?)
Default L3 hash may not be optimal for GEC: L4 hash may improve
10 GigE gives you effectively the full bandwidth without hash implications
agg(config)# port-channel load balance src-dst-port
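The 10GE-versus-GEC point reduces to one observation: a hash places each flow on exactly one member link. A sketch:

```python
def max_single_flow_gbps(member_links_gbps):
    # EtherChannel hashing pins a flow to one member link, so no single
    # flow can exceed the fastest member's capacity.
    return max(member_links_gbps)

gec_cap = max_single_flow_gbps([1, 1, 1, 1])   # 4x1GE channel: 1 Gb/s per flow
tengig_cap = max_single_flow_gbps([10])        # one 10GE link: 10 Gb/s per flow
```

Aggregate capacity is similar in both cases; the difference shows up for any individual flow (a backup stream, a vMotion-style transfer) that must fit on one hash bucket.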
Scaling B/W with GEC and 10GE: Migrating Access Layer Uplinks to 10GE
How do I migrate from GEC to 10GE uplinks?
How do I increase 10GE port density at the agg layer?
Is there a way to regain slots used by service modules?
Aggregation
Access Pair 1 … …
DC Core
Scaling B/W with GEC and 10GE: Consolidate to ACE
Consider consolidating multiple service modules onto the ACE module:
SLB
Firewall
SSL
4/8/16G fabric connected
Active-active designs
Higher CPS + concurrent CPS
Single TCP termination, lower latency
Firewall feature gap needs consideration
DC Core
Access
Aggregation
Scaling B/W with GEC and 10GE: Service Layer Switch
Move certain services out of the aggregation layer
Ideal for CSM and SSL modules
Opens slots in the agg layer for 10GE ports
Use separate links for FT paths
Extend only the necessary L2 VLANs to service switches via .1Q trunks (GEC/10GE)
Aggregation
DC Core
Service Switch2
(Redundant)
Service Switch1
Access
Increasing Throughput with Virtual Switching System 1440
VSS combines a pair of Catalyst 6500s into a single logical switch
Eliminates spanning tree blocking ports
EtherChannel load balancing
Service module support in 12.2(33)SXI, Summer '08 (FWSM, ACE, NAM)
Spanning tree remains as a protection mechanism against rogue switches and cabling errors
DC Core
Service Switch2
(Redundant)
Service Switch1
Access
Scaling with 10GE Density: Nexus 7000
The Nexus 7000 provides high density 10GE scalability
Virtual Device Contexts (VDCs) logically partition a single chassis
Multi-Chassis EtherChannel support (future)
NX-OS: a purpose-built data center operating system
Unified Fabric capable
DC Core
Service Switch2
Service Switch1
Access
Aggregation
Nexus7000
Nexus7000
Spanning Tree Scalability
Spanning Tree Scalability
Common Questions
How many VLANs can I support in a single aggregation module?
Can a “VLAN anywhere” model be supported?
How many access switches can I support in each aggregation module?
What is the maximum number of logical ports?
Are there STP hardware restrictions?
DC Core
Aggregation
…Access Pair 1 …
Spanning Tree Scalability
Spanning Tree Protocols Used in the DC
Rapid PVST+ (802.1w)
Most common in the data center today
Scales to large size, ~10,000 logical ports
Coupled with UDLD, LoopGuard, RootGuard and BPDU Guard, provides a strong, stable L2 design solution
Easy to implement, proven, scales
MST (802.1s)
Permits very large-scale STP deployments, ~30,000 logical ports
Not as flexible as Rapid PVST+
Service module implications (FWSM transparent mode)
More common in service providers and ASPs
This CVD focuses on the use of Rapid PVST+.
Spanning Tree Scalability
Spanning Tree Protocol Scaling
                                      MST               RPVST+                       PVST+
Total Active STP Logical Interfaces   50,000 total      10,000 total                 13,000 total
Total Virtual Ports per Line Card     6,000¹ per        1,800¹ per 6700 module;      1,800¹ per
                                      switching module  1,200 for earlier modules    switching module

¹ 10 Mbps, 10/100 Mbps, and 100 Mbps switching modules support a maximum of 1,200 logical interfaces per module
http://www.cisco.com/univercd/cc/td/doc/product/lan/cat6000/122sx/ol_4164.htm#wp26366
Spanning Tree Scalability
Spanning Tree Protocol Scaling
Number of total STP active logical interfaces = (active VLANs summed across all trunks on the switch) + number of non-trunking interfaces on the switch
In this example, aggregation 1 will have:
10 + 20 + 30 = 60 STP active logical interfaces
AGG1# sh spann summ tot
Switch is in rapid-pvst mode
Root bridge for: VLAN0010, VLAN0020, VLAN0030
EtherChannel misconfig guard is enabled
Extended system ID is enabled
Portfast Default is disabled
PortFast BPDU Guard Default is disabled
Portfast BPDU Filter Default is disabled
Loopguard Default is enabled
UplinkFast is disabled
BackboneFast is disabled
Pathcost method used is long
Name       Blocking  Listening  Learning  Forwarding  STP Active
---------  --------  ---------  --------  ----------  ----------
30 VLANs   0         0          0         60          60
AGG1#
STP Active Column = STP Total Active Logical Interfaces
30 VLANs
20 VLANs, 10 VLANs
DC Core
Te7/3, Te7/4
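The formula above can be sketched as a small calculation (Python; the function name is mine, for illustration only):

```python
def stp_active_logical_ports(vlans_per_trunk, non_trunk_ports=0):
    """Total STP active logical interfaces on a switch:
    (active VLANs summed across all trunks) + non-trunking interfaces."""
    return sum(vlans_per_trunk) + non_trunk_ports

# Slide example: three trunks carrying 10, 20 and 30 VLANs, no access ports.
print(stp_active_logical_ports([10, 20, 30]))  # 60
```

This matches the STP Active column of `sh spann summ tot` on the slide.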
Spanning Tree Scalability
Spanning Tree Protocol Scaling
120 VLANs system wide
No manual pruning performed on trunks
1RU access layer environment
45 access switches each connected with 4GEC
Dual-homed loop topology: (120 × 45 access links) + 120 instances on the link to Agg2 = 5,400 + 120 = 5,520
This is under the maximum recommendation of 10,000 when using Rapid PVST+
Example: Calculating Total Active Logical Ports
Core
..…
Aggregation 2: Secondary Root
Aggregation 1: Primary Root
Layer 3 / Layer 2
Access 1 Access 45
Spanning Tree Scalability
Spanning Tree Protocol Scaling
AGG1# sh vlan virtual-port slot 7
Slot 7
Port   Virtual-ports
-------------------------
Te7/1  30
Te7/2  30
Te7/3  10
Te7/4  20
Total virtual ports: 90
AGG1#
For line card x: sum over all trunks of (active VLANs on the trunk × the number of ports in the port-channel, if one is used)
10 + 20 + (30 × 2) = 90 virtual ports on line card 7
EtherChannel
30 VLANs
20 VLANs, 10 VLANs
Te7/1
Te7/2
DC Core
Te7/3, Te7/4
NOTE: VPs Are Calculated per Port in Channel Groups
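The per-line-card rule, including the note that virtual ports are counted per physical port in a channel group, can be sketched as (Python; function name is mine):

```python
def virtual_ports_per_linecard(trunks):
    """Virtual ports on one line card: for each trunk, multiply its
    active VLANs by the number of physical ports in the channel group
    (VPs are counted per port in an EtherChannel)."""
    return sum(vlans * member_ports for vlans, member_ports in trunks)

# Slide example on line card 7: Te7/3 (10 VLANs), Te7/4 (20 VLANs),
# and a 2-port EtherChannel Te7/1-2 carrying 30 VLANs.
print(virtual_ports_per_linecard([(10, 1), (20, 1), (30, 2)]))  # 90
```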
Spanning Tree Scalability
Spanning Tree Protocol Scaling
Example: Calculating Virtual Ports per Line Card
120 VLANs system-wide
No manual pruning performed on trunks
12 access switches, each connected with 4-port GEC across a 6700 line card
This is above the recommended watermark
EtherChannels to Access Layer
Acc2
6748 Line Card
Acc1 …. Acc12
Maximum number of VLANs per port = 37 (1,800 / 48 = 37.5)
(120 × 48 access links) = 5,760 virtual ports
Core
..…
Aggregation 2: Secondary Root
Aggregation 1: Primary Root
Layer 3 / Layer 2
Access 1 Access 12
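The over-watermark condition in this example is a one-line check (Python sketch; the constant name is mine, the 1,800 figure is the per-card recommendation from the earlier table):

```python
WATERMARK_6700 = 1800   # recommended virtual-port limit per 6700-series card

vlans = 120
ports = 12 * 4          # 12 access switches x 4-port GEC on one 6748
virtual_ports = vlans * ports
print(virtual_ports)             # 5760, well above the 1800 watermark
print(WATERMARK_6700 // ports)   # 37 VLANs per port keeps the card within limits
```

Pruning the trunks down to 37 VLANs per port is what the slide's "maximum number of VLANs per port" figure refers to.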
Spanning Tree Scalability
Why STP Watermarks Are Important
If exceeded, performance is unpredictable
Larger impact when an interface flaps or is shut/no shut
Small networks may not see a problem
Large networks will usually see problems
Convergence time will be affected
Pruning off unneeded VLANs will reduce active logical and virtual port instances
Watermarks Are Not Hard Limits, But:
Spanning Tree Scalability
Design Guidelines
Add aggregation modules to scale, dividing up the STP domain
Maximum of ~500 HSRP instances on Sup720 (depends on other CPU-driven processes)
If logical/virtual ports are near upper limits, perform:
– Manual pruning on trunks
– Add aggregation modules
– Use MST if necessary
AggregationModule 1
Access
DC Core
Increasing HA in the Data Center
Increasing HA in the Data Center
Server High Availability
1. Server network adapter
2. Port on a multi-port server adapter
3. Network media (server access)
4. Network media (uplink)
5. Access switch port
6. Access switch module
7. Access switch
With Data Center HA Recommendations
Without Data Center HA Recommendations
L2 / L3
These Network Failure Issues Can Be Addressed by Deployment of Dual Attached Servers Using Network Adapter Teaming Software
Common Points of Failure
Increasing HA in the Data Center
Common NIC Teaming Configurations
IP = 10.2.1.14, MAC = 0007.e910.ce0f
On Failover: Src MAC Eth1 = Src MAC Eth0
IP Address Eth1 = IP Address Eth0
Eth0: Active, Eth1: Standby
AFT (Adapter Fault Tolerance)
Heartbeats
On Failover: Src MAC Eth1 = Src MAC Eth0, IP Address Eth1 = IP Address Eth0
Eth0: Active, Eth1: Standby
SFT (Switch Fault Tolerance)
One Port Receives, All Ports Transmit
Incorporates Fault Tolerance
One IP Address and Multiple MAC Addresses
Eth0: Active, Eth1-X: Active
ALB (Adaptive Load Balancing)
Heartbeats
IP = 10.2.1.14, MAC = 0007.e910.ce0f
IP = 10.2.1.14, MAC = 0007.e910.ce0e
Default GW 10.2.1.1 HSRP
Heartbeats
IP = 10.2.1.14, MAC = 0007.e910.ce0f
Default GW 10.2.1.1 HSRP
Default GW 10.2.1.1 HSRP
Note: NIC manufacturer drivers are changing and may operate differently. Also, server OSes have started integrating NIC teaming drivers, which may operate differently.
Increasing HA in the Data Center
Server Attachment: Multiple NICs
You Can Bundle Multiple Links to Achieve Higher Throughput Between
L2 / L3
EtherChannels
All Links Active: Load Balancing
Only One Link Active: Fault Tolerant Mode
EtherChannel Hash May Not Permit Full Utilization for Certain Applications (Backup Example)
Servers and Clients
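The backup caveat can be illustrated with a toy model of the EtherChannel hash (Python; the hash function here is purely illustrative, real platforms use a hardware hash of configurable L2-L4 fields):

```python
# Toy model: a flow's 5-tuple deterministically selects one member link.
def member_link(src_ip, dst_ip, src_port, dst_port, n_links):
    return hash((src_ip, dst_ip, src_port, dst_port)) % n_links

# A single backup flow always hashes to the same member link, so it can
# never use more than one link's worth of the bundle's bandwidth.
flow = ("10.2.1.14", "10.2.1.50", 49152, 9000)  # hypothetical flow
links = {member_link(*flow, n_links=4) for _ in range(1000)}
print(len(links))  # 1 - every packet of this flow lands on one link
```

Load balancing across the bundle only materializes across many distinct flows.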
Increasing HA in the Data Center
Failover: What Is the Time to Beat?
The overall failover time is the combination of convergence of the L2, L3, and L4 components
Stateful devices can replicate connection information and typically fail over within 3-5 sec
EtherChannels < 1sec
STP converges in ~1 sec (802.1w)
HSRP can be tuned to <1s
Where does TCP break? Microsoft, Linux, AIX, etc.
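The "time to beat" idea can be sketched numerically (Python; numbers come from the slide, and the assumption that the layers converge in parallel, so the outage is bounded by the slowest component, is a simplification of mine):

```python
# Per-layer convergence times from the slide (seconds).
convergence_sec = {
    "L2 STP 802.1w": 1.0,
    "L3 HSRP (tuned)": 1.0,
    "L4 stateful service module": 5.0,
}
WINDOWS_TCP_TOLERANCE_SEC = 9.0  # XP/2003 Server TCP stack per the slide

# Assumption: layers converge in parallel, so outage ~ slowest layer.
outage = max(convergence_sec.values())
print(outage < WINDOWS_TCP_TOLERANCE_SEC)  # True: sessions survive
```

As long as the slowest component stays under the client TCP stack's tolerance, established sessions ride through the failover.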
Failover time scale: L2 convergence, L3 convergence, L4 convergence ~5 s; Microsoft XP/2003 Server TCP stack tolerance ~9 s; Linux and others tolerate a longer outage
Increasing HA in the Data Center
Failover Time Comparison
STP (802.1w): ~1 sec
OSPF/EIGRP: sub-second
ACE module with Autostate: ~1 sec
HSRP: ~3 sec (using 1/3 timers)
FWSM module: ~3 sec
CSM module: ~5 sec
WinXP/2003 Server TCP stack: ~9 sec
Failover time comparison: OSPF/EIGRP sub-second; Spanning Tree ~1 s; ACE ~1 s; HSRP ~3 s (may be tuned to less); Firewall Service Module ~3 s; Content Service Module ~5 s; TCP stack tolerance ~9 s
Increasing HA in the Data Center
Non-Stop Forwarding / Stateful Switchover
NSF/SSO is a redundancy mechanism for supervisor failover
SSO synchronizes Layer 2 protocol state and hardware L2/L3 tables (MAC, FIB, adjacency table), plus ACL and QoS tables
SSO synchronizes state for: trunks, interfaces, EtherChannels, port security, SPAN/RSPAN, STP, UDLD, VTP
NSF with EIGRP, OSPF, IS-IS, BGP makes it possible to have no route flapping during the recovery
Aggressive EIGRP/OSPF timers do not work in an NSF/SSO environment
Increasing HA in the Data Center
NSF/SSO in the Data Center
Redundant supervisors with SSO in the access layer:
Improves availability for single attached servers
Redundant Supervisors with SSO in the aggregation layer:
Consider in the primary agg layer switch
Prevents service module switchover (up to ~5 sec depending on module)
SSO switchover time less than 2 sec
Requires 12.2(18)SXD3 or higher
Possible implications:
HSRP state between agg switches is not tracked and will show switchover until the control plane recovers
IGP timers cannot be aggressive (tradeoff)
Increasing HA in the Data Center
Best Practices: STP, HSRP, Other
Rapid PVST+, UDLD Global
Spanning Tree Pathcost Method=Long
Agg1: STP Primary Root
HSRP Primary, HSRP Preempt and Delay
Dual Sup with NSF+SSO
Agg2: STP Secondary Root
HSRP Secondary, HSRP Preempt and Delay
Single Sup
Rapid PVST+: maximum number of STP active logical ports = 8,000; virtual ports per line card = 1,500
Blade Chassis with Integrated Switch
FT
RootGuard, LoopGuard, PortFast + BPDU Guard, UDLD Global
LACP + L4 Hash, Distributed EtherChannel
Min-Links
L3 + L4 CEF Hash; LACP + L4 Port Hash
Dist EtherChannel for FT and Data VLANs
Data
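The best-practice watermarks on this slide (8,000 active logical ports, 1,500 virtual ports per line card for Rapid PVST+) lend themselves to a simple design check (Python sketch; the helper and constant names are mine):

```python
MAX_ACTIVE_LOGICAL_PORTS = 8000     # best-practice watermark, Rapid PVST+
MAX_VIRTUAL_PORTS_PER_CARD = 1500   # best-practice watermark per line card

def within_watermarks(active_logical, virtual_ports_per_card):
    """True when a design stays inside both best-practice watermarks."""
    return (active_logical <= MAX_ACTIVE_LOGICAL_PORTS and
            max(virtual_ports_per_card) <= MAX_VIRTUAL_PORTS_PER_CARD)

print(within_watermarks(5520, [90]))    # True: the earlier 45-switch example
# Hypothetical mix: same logical-port count but a card carrying 5,760 VPs,
# as in the 6748 example, fails the per-card check.
print(within_watermarks(5520, [5760]))  # False
```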
Q and A
Recommended Reading
Continue your Networkers at Cisco Live learning experience with further reading from Cisco Press
Check the Recommended Reading flyer for suggested books
Visit www.cisco.com/go/cvd for Design Guides and Whitepapers about all of these subjects and more
Available Onsite at the Cisco Company Store