TRANSCRIPT
CCSACI-2552
NetApp’s Global Engineering Cloud: The Journey to Nexus 9k and ACI
Michael McKee - Senior Network Architect
Michael Torelli – Senior Network Architect
Agenda
• Introduction
• FlexPod with Nexus 9K
• Network Design
• Nexus 9000 NX-OS Design
• Nexus 9000 ACI Design
• Conclusion
• Questions
NetApp is a leading vendor of innovative storage and data management solutions that help organizations around the world store, manage, protect, and retain one of their most precious assets: their data.
Year founded: 1992
Number of employees: 12,000
Revenues: $6.3 billion (FY2014)
Member of S&P 500
Member of Nasdaq 100
FORTUNE 500 Company
Stock symbol: NTAP
Worldwide offices: 150
NetApp Products
FAS - Fabric Attached Storage
EF-Series
EF560 - HA Pair, All Flash 2U
SS3030 – Controller 2U
AltaVault
FlashRay
FlashRay 6U
MetroCluster
OnCommand Insight
OnCommand
StorageGrid
DATA
• 9 R&D labs
• 4100+ user base
• We are Customer Zero
• Drive Innovation:
• Leveraging newly emerging capability from our products and partners
NetApp Engineering
Engineering Data Center Services
NetApp Engineering
By the numbers
Engineering DC
• Hypervisors: 3
• ESeries & ONTAP Controllers: 50 / 450
• Datacenter Storage: 33PB
• Active Self-Service VMs: ~75K
• cDOT Adoption: 75%
• Weekly VM turnover: 10K
Workloads
• Mins Self-Service VM: 8
• Lines of Product Code: 32M
• Mins Incremental Build: 7
• Monthly Bisect Runs: 23K
• Products: 40+
• Monthly Presub Smokes: 6K
Users
• 4,100 NetApp Dev & QA Engineers
Tech Stack
Engineering Labs
• VMs Served: 275K
• Lines Test Automation Code: 36M
• Testbed Storage: 300PB
• Networking: IPv4/v6, 1G-40G
• Storage Controllers (Virtual/Physical): 15K / 10K
• Test cases per month: 220K
NetApp Engineering
Global Engineering Cloud
Massively scalable shared virtual data center infrastructure
Global Engineering Cloud
NetApp + ACI + OpenStack integrations
Hybrid cloud service
Cloud based test-beds
Converge resources into improved services
Hypervisor agnostic
Cisco UCS
ACI - Nexus 9K, 7K
(Nexus 9508, 93128, 9396, 7018, 7010)
NetApp cDOT, E-Series, NPS,
Cloud ONTAP
Self-Service Portal & API
[Diagram: three stacks, each labeled Hypervisor, VM VM VM, SVM, and Clustered Data ONTAP]
FlexPod Data Center
Cisco Nexus 9000 NX-OS Design
Cisco UCS
C220 M3 C-Series Server(s)
Cisco UCS
5108 B-Series Blade Chassis
2208XP Chassis FEX Modules
B200 M3 B-Series Blade(s)
Cisco UCS
6248UP Fabric
Interconnects
Cisco Nexus
9396PX Switches
NetApp FAS8040
Storage Controllers
w/ HA Backplane
Controller 1
Controller 2
Cisco Nexus 5596
Cluster Interconnects
NetApp
DS2246 Disk Shelves
[Diagram link labels: Po, vPC, ifgrp. Legend: 10GbE only, SAS only]
FlexPod Data Center
Cisco Nexus 9000 ACI Design
Unified Computing
System
UCS 6248 Fabric
Interconnects,
Nexus 2232 Fabric Extender
& UCS C and B Series
Servers
APIC 1-3
FAS Controller
8000
Nexus 5596
Cluster Interconnects
Nexus 9396
vPC vPC
vPCvPC
x8
Nexus 9508
Nexus 9508
Spine Switches
Legend
10GbE only
Converged
Interconnects
Standby Link
SFO Interconnect
40GbE only
Global Dynamic Lab
• Operational since June 2009
• 2168 racks with 12kW/rack
• Cold room containment / hot aisle
• First Energy Star certified Data Center
• Operational as of May, 2014
• 2235 racks capable 12kW/rack
• 36 rows with 63 racks per row
• Built on GDL design with many improvements
– Al Lawlis Sr. Director, NetApp
PUE 1.14
PUE 1.2
External
Connectivity
NetApp Engineering Design
Nexus 7000/5000/2000
[Diagram labels: GDL1; Core; CS01, CS02; DS01, DS03, DS04, DS05, DS07; vPC links]
• Architecture based on Nexus 7K/5K/2K
• Each VLAN is isolated to a specific distribution block
• 10G to the Top-of-Rack (ToR)
• Traditional Ethernet, unified I/O and Fibre Channel
• Nexus 7K VDC and Nexus 5K leveraged for FCoE/FC
RTP Data Center
GDL2 Network Requirements
• Design that will last the next ~10 years
• Based on FlexPod Architecture
• Programmability, Automation, Orchestration
• Reduce the failure domain size from ~500 racks to Top-of-Rack (ToR)
• Scale and density previously seen only at the distribution layer are now required at the Top-of-Rack
• Adding 40G/100G capability
• User/Group isolation on shared infrastructure
R & D Data Center Design Principles
NetApp Engineering Design
Nexus 9000 GDL2 and OTV
[Diagram labels: GDL2; DS11, DS12, DS13, DS14; DI03, DI04; GDL1; Core; CS01, CS02; DS01, DS03, DS04, DS05, DS07; DI01, DI02; Future NX-OS/ACI; vPC links]
Benefits:
• Based on same NX-OS as Nexus 7000/5000/2000
• OTV on Nexus 7004s to provide layer 2 connectivity
• 40G to Top-of-Rack (ToR)
• Migration path to ACI
vPC vPC 40G L2
Layer 2
40G L3
Layer 3
External
Connectivity
RTP Data Center
GDL1 Network Refresh
• Used an inventory script to count copper vs. fiber ports (used/total)
• Created a detailed step-by-step procedure for the migration and tested the procedures in the Cisco Solution Validation Services lab
• Converted configurations from the existing 7Ks to the new 9Ks, accounting for updated best practices and port conversions
• Staged Nexus 9000 switches with the appropriate configurations and software
• Pre-labeled the fiber for the new ports to speed up the swap out
• Goal was zero downtime
Planning for the Migrations
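The inventory step in the planning list could be sketched roughly as follows. The slides do not show the actual script, so everything here is an assumption: the `Port` record, its field names, and the sample data are hypothetical, and the sketch assumes the port data has already been exported from the switches.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Port(NamedTuple):
    switch: str   # hypothetical switch name
    media: str    # "copper" or "fiber"
    in_use: bool  # True if the port is connected


def tally_ports(ports: Iterable[Port]) -> dict:
    """Return {media: (used, total)} across all ports."""
    total = Counter()
    used = Counter()
    for port in ports:
        total[port.media] += 1
        if port.in_use:
            used[port.media] += 1
    return {media: (used[media], total[media]) for media in total}


# Made-up sample inventory, for illustration only.
sample = [
    Port("DS03", "fiber", True),
    Port("DS03", "fiber", False),
    Port("DS03", "copper", True),
    Port("DS04", "fiber", True),
]

for media, (n_used, n_total) in sorted(tally_ports(sample).items()):
    print(f"{media}: {n_used}/{n_total} used")
```

A used/total count per media type is enough to size the optics and cabling order for the replacement switches before any hardware is staged.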
External
Connectivity
GDL1 Network Refresh
Core
• “shutdown” interfaces on CS02
• Verify traffic converges to CS01
• Remove CS02 from rack
• Rack new CS02
• “no shut” interfaces on CS02
• Verify traffic converges
• “shutdown” interfaces on CS01
• Verify traffic converges to CS02
• Remove CS01 from rack
• Rack new CS01
• “no shut” interfaces on CS01
• Verify traffic converges
Core/Distribution Steps
[Diagram labels: CS01, CS02 (vPC pair); DS03, DS04 (vPC pair); vPC links]
Distribution
• “shutdown” interfaces on DS04
• Verify traffic converges to DS03
• Remove DS04 from rack
• “Break vPC”
• Rack new DS04
• “no shut” northbound/southbound interfaces on DS04
• Verify DS04 is root bridge and HSRP active
• “shutdown” interfaces on DS03
• Verify traffic converges to DS04
• Remove DS03 from rack
• Rack new DS03
• “no shut” northbound/southbound/east-west L3 interfaces on DS03
• Configure vPC
• “no shut” vPC peer-link
• Add members to the vPC port-channels
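The core procedure is the same six steps applied to each switch of a redundant pair in turn. A minimal sketch of generating that runbook is below. The function name is hypothetical and the slides do not say any such tool was used. The distribution procedure adds vPC, root bridge, and HSRP steps that this sketch does not model.

```python
from typing import List


def rolling_swap_steps(first: str, second: str) -> List[str]:
    """Build the ordered steps for replacing a redundant pair one switch at a time.

    `first` is swapped while `second` carries traffic, then the roles reverse.
    """
    steps: List[str] = []
    for target, survivor in ((first, second), (second, first)):
        steps += [
            f"shutdown interfaces on {target}",
            f"verify traffic converges to {survivor}",
            f"remove {target} from rack",
            f"rack new {target}",
            f"no shut interfaces on {target}",
            "verify traffic converges",
        ]
    return steps


for number, step in enumerate(rolling_swap_steps("CS02", "CS01"), start=1):
    print(f"{number:2d}. {step}")
```

Generating the list from one template keeps the two halves of the swap symmetric, which makes a skipped verification step easier to spot in review.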
DS03 DS04
External
Connectivity
GDL1 Network Refresh
Core swap-out went well
• Routing protocols converged as expected
Challenges with the first distribution pair
• Bridge Assurance
• “lacp suspend-individual”
• spanning-tree guard root
• OSPF summaries
Lessons learned
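One way to act on these lessons is to scan each converted configuration for lines touching the four problem areas so a person reviews them before cutover. The sketch below is an assumption, not the presenters' tooling: the regular expressions are illustrative guesses at relevant NX-OS configuration lines and are not a complete set, and the sample configuration is made up.

```python
import re
from typing import Dict, List

# Patterns for the four areas the slide lists as challenges (illustrative only).
REVIEW_PATTERNS: Dict[str, str] = {
    "Bridge Assurance": r"^\s*spanning-tree port type network\b",
    "LACP suspend-individual": r"^\s*(no )?lacp suspend-individual\b",
    "Root guard": r"^\s*spanning-tree guard root\b",
    "OSPF summaries": r"^\s*(area \S+ range|summary-address)\b",
}


def flag_for_review(config_text: str) -> Dict[str, List[int]]:
    """Map each topic to the 1-based line numbers where it appears."""
    hits: Dict[str, List[int]] = {topic: [] for topic in REVIEW_PATTERNS}
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        for topic, pattern in REVIEW_PATTERNS.items():
            if re.search(pattern, line):
                hits[topic].append(lineno)
    return hits


# Made-up configuration fragment, for illustration only.
sample = """\
interface port-channel10
  spanning-tree port type network
  no lacp suspend-individual
interface Ethernet1/1
  spanning-tree guard root
router ospf 1
  area 0.0.0.1 range 10.1.0.0/16
"""

for topic, lines in flag_for_review(sample).items():
    print(f"{topic}: lines {lines}")
```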
vPC vPC
CS02CS01
vPC Pair
vPC
GDL1 Network Refresh
Total of 1100 racks to be refreshed
Leveraging Power on Auto Provisioning (PoAP) and Python scripting to move the configurations from the old ToRs to the new 9Ks
Scheduling downtime with the users in each half-row is a logistical challenge
Top-of-Rack
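Moving a ToR configuration to a new switch model mostly means renaming interfaces. A minimal sketch of that translation is below, under stated assumptions: the port names in `PORT_MAP` are hypothetical, the slides do not show the real scripts, and a production version would need to handle far more than interface headers.

```python
import re
from typing import Dict

# Hypothetical old-port -> new-port mapping for one rack; a real mapping would
# come from the site's cabling records.
PORT_MAP: Dict[str, str] = {
    "Ethernet101/1/1": "Ethernet1/1",
    "Ethernet101/1/2": "Ethernet1/2",
}

INTERFACE_LINE = re.compile(r"^interface (\S+)$")


def translate_config(old_config: str, port_map: Dict[str, str]) -> str:
    """Rewrite `interface X` headers using port_map; other lines pass through.

    Interfaces with no mapping are kept but preceded by a review marker.
    """
    out = []
    for line in old_config.splitlines():
        match = INTERFACE_LINE.match(line)
        if match:
            old_name = match.group(1)
            if old_name in port_map:
                line = f"interface {port_map[old_name]}"
            else:
                out.append(f"! REVIEW: no mapping for {old_name}")
        out.append(line)
    return "\n".join(out)


old = (
    "interface Ethernet101/1/1\n"
    "  switchport access vlan 100\n"
    "interface Ethernet101/1/9\n"
    "  switchport access vlan 200"
)
print(translate_config(old, PORT_MAP))
```

Marking unmapped interfaces instead of dropping them keeps the output complete, so nothing disappears silently across 1100 racks.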
DS02DS01
Nexus 9372PX
Nexus 2248TP-E
Nexus 9372TX
Nexus 9332PQ
vPC Pair
NetApp Engineering Design
Nexus 9000 GDL1
[Diagram labels: GDL2; DS11, DS12; DI03, DI04; GDL1; Core; DS01, DS05; DI01, DI02; vPC links]
Benefits
• Based on same NX-OS as Nexus 7000/5000/2000
• 40G Core which allows 40G to GDL2
• 40G to Top-of-Rack (ToR)
• Migration path to ACI
[Diagram labels: DS03, DS04, DS07, DS08; CS01, CS02; DS13, DS14; Future NX-OS/ACI; vPC links; External Connectivity; RTP Data Center. Legend: 40G L2, Layer 2, 40G L3, Layer 3]
[Diagram labels: GDL2; DI03, DI04; GDL1; Core; DS01, DS05; DI01, DI02; DS03, DS04, DS07, DS08; CS01, CS02; vPC links. Legend: 40G L2, Layer 2, 40G L3, Layer 3]
NetApp Engineering Design
GDL2 9508 to 9516
Benefits
• Consistent infrastructure
• 9516s required for the 500 ToRs per DB
Lessons Learned
• vPC auto-recovery timer
• Nexus 9000 BU and Cisco Advanced Services have created a forklift upgrade procedure
[Diagram labels: DS11, DS12, DS13, DS14; Future NX-OS/ACI; External Connectivity; RTP Data Center]
[Diagram labels: GDL2; DI03, DI04; GDL1; Core; DS01, DS05; DI01, DI02; DS03, DS04, DS07, DS08; CS01, CS02; vPC links. Legend: 40G L2, Layer 2, 40G L3, Layer 3]
NetApp Engineering Design
GDL2 Core
Benefits
• Inter-DB traffic local to GDL2
• Minimize 40G LR links between GDL1/GDL2
Lessons Learned
• Went according to plan
[Diagram labels: DS11, DS12, DS13, DS14; Future NX-OS/ACI; Core; CS03, CS04; External Connectivity; RTP Data Center]
Nexus 9000 NX-OS
• Leveraging Power on Auto Provisioning (PoAP) has cut the time to deploy ToRs from 4 hrs to 30 min (8x faster)
• Hot patching combined with Python scripts lets us deploy fixes
• Bi-di optics leveraged for 40G links
• Built-in programmability of the platform
• Easy transition for Ops staff
Benefits
NetApp custom PoAP portal
[Diagram labels: Fabric Spine Nodes; Fabric Leaf Nodes; Border Leaf Nodes; Existing GDL2 Infrastructure; DS11, DS12; vPC; External Routed Access for IPv4; L3 Boundary for IPv6]
• The topology consists of a FlexPod architecture with two spines and nine leafs, running in production since Feb ’15
• NetApp Release 8.3 cDOT is being used to provide the NFS/CIFS data stores for the 11,000 virtual machines
Current version is 1.03f
[Diagram labels: APICs attached via vPC; 11,000 Virtual Machines; NFS/CIFS; Test Filers; Infrastructure. Legend: Layer 3, Layer 2, Fabric]
NetApp Engineering Design
Current Deployment – GDL2
ACI Defining Terms
Tenant - Logical separator for a customer, TG, BU, group, etc. Separates traffic, administration, visibility, etc.
Private Network - Equivalent to a VRF. Separates routing instances and can be used as an administrative separation.
Bridge Domain - Not quite a VLAN; a container for subnets that can be used to define an L2 boundary.
Application Network Profile - Logical representation of an application and its interdependencies in the network fabric.
End-Point Group (EPG) - Container for objects requiring the same policy treatment, e.g. app tiers or services.
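The containment relationships among these terms can be sketched as plain data classes. This is a simplified illustration and not the APIC object model: the class and field names are hypothetical, the subnet is a documentation address, and the sketch assumes an EPG only references bridge domains in its own tenant.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BridgeDomain:
    name: str
    private_network: str            # name of the VRF-like private network
    subnets: List[str] = field(default_factory=list)


@dataclass
class EndPointGroup:
    name: str
    bridge_domain: str              # name of the bridge domain it attaches to


@dataclass
class AppProfile:
    name: str
    epgs: List[EndPointGroup] = field(default_factory=list)


@dataclass
class Tenant:
    name: str
    private_networks: List[str] = field(default_factory=list)
    bridge_domains: List[BridgeDomain] = field(default_factory=list)
    app_profiles: List[AppProfile] = field(default_factory=list)

    def dangling_epgs(self) -> List[str]:
        """Names of EPGs that reference a bridge domain this tenant lacks."""
        known = {bd.name for bd in self.bridge_domains}
        return [
            epg.name
            for profile in self.app_profiles
            for epg in profile.epgs
            if epg.bridge_domain not in known
        ]


tenant = Tenant(
    name="Tenant A",
    private_networks=["Private Network A"],
    bridge_domains=[BridgeDomain("BD-A", "Private Network A", ["192.0.2.1/24"])],
    app_profiles=[
        AppProfile("App Profile A", [
            EndPointGroup("EPG1", "BD-A"),
            EndPointGroup("EPG2", "BD-missing"),
        ]),
    ],
)
print(tenant.dangling_epgs())
```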
[Diagram: example hierarchy with Tenant A and Tenant B; Private Network A and Private Network B; bridge domains containing Subnet A and Subnet B; EPG1 through EPG4; App Profile A and App Profile B]
ACI Policy Defining Terms
Contract - Definition of policy. Defines how an EPG communicates with other EPGs.
Subject - Used to build definitions of communication between EPGs. Contains a filter, an action, and an optional label.
Filter - Identifier for a subject, i.e. the traffic you want to take action on. Required within a subject.
Action - Action to be taken on the filtered traffic within a subject. Required within a subject.
Label - Optional advanced identifier. When used, labels allow for more complex definitions of relationships within the policy model.
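How a contract, its subjects, and their filters and actions fit together can be illustrated with a small evaluation sketch. It is a toy model under stated assumptions: the class names, the single protocol/port filter, the first-match rule, and the default of "drop" when nothing matches are all choices made for this example, and the EPG names and NFS port are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Filter:
    protocol: str          # e.g. "tcp"
    port: int              # destination port


@dataclass(frozen=True)
class Subject:
    name: str
    traffic_filter: Filter
    action: str            # e.g. "permit", "drop", "redirect"
    label: Optional[str] = None


@dataclass
class Contract:
    name: str
    provider_epg: str
    consumer_epg: str
    subjects: List[Subject] = field(default_factory=list)

    def action_for(self, src_epg: str, dst_epg: str,
                   protocol: str, port: int) -> str:
        """Return the first matching subject's action, else "drop"."""
        if (src_epg, dst_epg) != (self.consumer_epg, self.provider_epg):
            return "drop"
        for subject in self.subjects:
            if subject.traffic_filter == Filter(protocol, port):
                return subject.action
        return "drop"


contract = Contract(
    name="clients-to-filers",
    provider_epg="Filers",
    consumer_epg="Clients",
    subjects=[Subject("nfs", Filter("tcp", 2049), "permit")],
)
print(contract.action_for("Clients", "Filers", "tcp", 2049))
print(contract.action_for("Clients", "Filers", "tcp", 22))
```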
[Diagram: a contract between EPG1 and EPG2 containing several subjects; each subject has a filter (in/out, port, etc.), an action (drop, mark, redirect, etc.), and an optional label]
Routed Outside Global Routed Outside Global
NetApp Engineering Design
ACI Specifics
Tenant GEC2
Private Network default
Tenant Common
Private Network default
External
Connectivity via
OSPF
Bridge Domain
GEC1_UCS08_
Guest_1
Subnet A
Tenant GEC1
Private Network default
EPG Guest_1 EPG Guest_2
DHCP
Server
Bridge Domain
GEC2_UCS59_
Guest_2
Subnet D
Bridge Domain
GEC1_UCS08_
Guest_2
Subnet B
• Tenants share an OSPF process running in the Common tenant
• DHCP relay to an external DHCP server
• Contracts: provided at the private network and consumed at the EPG
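Because the fabric is programmable, a tenant layout like this one can be expressed as a JSON document for the APIC REST API. The sketch below only builds and prints such a body; it does not contact an APIC. The class names (`fvTenant`, `fvAp`, `fvAEPg`, `fvRsBd`) follow the APIC object model as I understand it and should be checked against the reference for your release. The helper function is hypothetical, and pairing tenant GEC1 with App Profile UCS08, EPG Guest_1, and bridge domain GEC1_UCS08_Guest_1 is my reading of the diagram labels.

```python
import json


def tenant_payload(tenant: str, app_profile: str, epg: str,
                   bridge_domain: str) -> dict:
    """Build an APIC-style JSON body for one tenant with one EPG.

    The EPG is bound to `bridge_domain` by name only; where that bridge
    domain is defined is outside this sketch.
    """
    return {
        "fvTenant": {
            "attributes": {"name": tenant},
            "children": [
                {
                    "fvAp": {
                        "attributes": {"name": app_profile},
                        "children": [
                            {
                                "fvAEPg": {
                                    "attributes": {"name": epg},
                                    "children": [
                                        {"fvRsBd": {"attributes": {
                                            "tnFvBDName": bridge_domain}}}
                                    ],
                                }
                            }
                        ],
                    }
                }
            ],
        }
    }


body = tenant_payload("GEC1", "UCS08", "Guest_1", "GEC1_UCS08_Guest_1")
print(json.dumps(body, indent=2))
```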
Routed Outside Global
Bridge Domain
GEC2_UCS59_
Guest_1
Subnet C
Bridge Domain
GEC_UCS08_
Guest_1
Subnet A
Bridge Domain
GEC_UCS08_
Guest_2
Subnet A
Shared From
Common Tenant
Bridge Domain
GEC2_UCS59_
Guest_2
Subnet D
Bridge Domain
GEC2_UCS59_
Guest_1
Subnet C
Shared From
Common Tenant
App Profile UCS08
EPG Guest_1 EPG Guest_2
App Profile UCS59
Virtual Private Test Beds
Vision
ANP-Test Bed 1
EPG - Clients
Clients
Filers
EPG - Filers
EPG - TestTools
Test Tools
ANP-Test Bed 3
EPG - Clients
Clients
Filers
EPG - Filers
EPG - TestTools
Test Tools
ANP-Test Bed 2
EPG - Clients
Clients
Filers
EPG - Filers
EPG - TestTools
Test Tools
Tenant - Virtual Private Test Beds
LDAP
EPG - LDAP
NIS
EPG - NIS
DHCP
EPG - DHCP
DNS
EPG - DNS
ANP-Common Services
Tenant - Common
NetApp Engineering Design
ACI Zone Based Design
[Diagram labels: Fabric Building Spine Nodes; Fabric Zone Spine Nodes; Fabric Leaf Nodes; APICs; PODs with vPC; WR; IT; GDL1; GDL2. Legend: Layer 3, Layer 2, Fabric]
Benefits:
• ‘Any Layer 2 segment anywhere’
• Single Fabric for the Campus
• Centralized Policy Management
NetApp Engineering Design
Long Term Plan - Site
[Diagram labels, shown twice: APICs; PODs with vPC; IT; WR. Legend: Layer 3, Layer 2, Fabric]
NetApp Engineering Design
• Scale – endpoints per leaf and total number of leafs
• FC/FCoE support on the Leaf
• Native IPv6 support in the ACI Fabric, shipping in the 1.1 release
• In-service software upgrade for the Leaf
• Spine support for 9516, shipping in the 1.1 release
• Transit routing for route peering with service appliances, shipping in the 1.1 release.
ACI Challenges
vPC vPC
UCS08
Fabric
Leaf Nodes
Conclusion
• Programmability and automation enable us to remove ourselves from the users’ workflows
• We leverage and influence the FlexPod reference architecture
• Migrations from 7K/5K/2K to 9K can be done with zero downtime
• ACI in its current offering is missing a few key features, which prevents us from wide-scale deployment
• We still have to tackle the NX-OS to ACI migrations
• NetApp Engineering is committed to making ACI successful in our environment
Key Takeaways
References
• vPC auto Recovery -> http://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/7-x/interfaces/configuration/guide/b_Cisco_Nexus_9000_Series_NX-OS_Interfaces_Configuration_Guide_7x/configuring_vpcs.html#concept_09474B25A6434B6792CEE6322B5EF7F7