CCSACI-2552

NetApp’s Global Engineering Cloud: The Journey to Nexus 9k and ACI

Michael McKee - Senior Network Architect

Michael Torelli – Senior Network Architect

Agenda

• Introduction

• FlexPod with Nexus 9K

• Network Design

• Nexus 9000 NX-OS Design

• Nexus 9000 ACI Design

• Conclusion

• Questions

Introduction

NetApp is a leading vendor of innovative storage and data management solutions that help organizations around the world store, manage, protect, and retain one of their most precious assets: their data.

Year founded: 1992

Number of employees: 12,000

Revenues: $6.3 billion (FY2014)

Member of S&P 500

Member of Nasdaq 100

FORTUNE 500 Company

Stock symbol: NTAP

Worldwide offices: 150

NetApp Products

FAS - Fabric Attached Storage

EF-Series

EF560 - HA Pair, All Flash 2U

SS3030 – Controller 2U

AltaVault

FlashRay

FlashRay 6U

MetroCluster

OnCommand Insight

OnCommand

StorageGrid

DATA

• 9 R&D labs

• 4100+ user base

• We are Customer Zero

• Drive Innovation:

• Leveraging newly emerging capability from our products and partners

NetApp Engineering: Engineering Data Center Services

NetApp Engineering: By the numbers

Engineering DC

Hypervisors: 3
E-Series & ONTAP Controllers: 50 / 450
Datacenter Storage: 33PB
Active Self-Service VMs: ~75K
cDOT Adoption: 75%
Weekly VM turnover: 10K

Workloads

Mins Self-Service VM: 8
Lines of Product Code: 32M
Mins Incremental Build: 7
Monthly Bisect Runs: 23K
Products: 40+
Monthly Presub Smokes: 6K

Users

NetApp Dev & QA Engineers: 4,100

Tech Stack

Engineering Labs

VMs Served: 275K
Lines Test Automation Code: 36M
Testbed Storage: 300PB
Networking: IPv4/v6, 1G-40G
Storage Controllers (Virtual/Physical): 15K / 10K
Test cases per month: 220K

NetApp Engineering: Global Engineering Cloud

Massively scalable shared virtual data center infrastructure

Global Engineering Cloud

NetApp + ACI + OpenStack integrations

Hybrid cloud service

Cloud based test-beds

Converge resources into improved services

Hypervisor agnostic

Cisco UCS

ACI - Nexus 9K, 7K (Nexus 9508, 93128, 9396, 7018, 7010)

NetApp cDOT, E-Series, NPS, Cloud ONTAP

Self-Service Portal & API

[Diagram: three stacks, each showing a hypervisor with VMs above an SVM on Clustered Data ONTAP]

FlexPod with Nexus 9000

FlexPod Data Center: Cisco Nexus 9000 NX-OS Design

• Cisco UCS C220 M3 C-Series Server(s)

• Cisco UCS 5108 B-Series Blade Chassis, 2208XP Chassis FEX Modules, B200 M3 B-Series Blade(s)

• Cisco UCS 6248UP Fabric Interconnects

• Cisco Nexus 9396PX Switches

• NetApp FAS8040 Storage Controllers w/ HA Backplane (Controller 1, Controller 2)

• Cisco Nexus 5596 Cluster Interconnects

• NetApp DS2246 Disk Shelves

[Diagram: the components are linked by port channels (Po), vPCs, and ifgrps. Legend: 10GbE only; SAS only]

FlexPod Data Center: Cisco Nexus 9000 ACI Design

• Unified Computing System: UCS 6248 Fabric Interconnects, Nexus 2232 Fabric Extender & UCS C and B Series Servers

• APIC 1-3

• FAS Controller 8000

• Nexus 5596 Cluster Interconnects

• Nexus 9396

• Nexus 9508 Spine Switches

[Diagram: the components are linked by vPCs; one link group is marked x8. Legend labels: 10GbE only, Converged, Interconnects, Standby Link, SFO Interconnect, 40GbE only]

NetApp Engineering Facilities - RTP

Global Dynamic Lab

• Operational since June 2009

• 2168 racks with 12kW/rack

• Cold room containment / hot aisle

• First Energy Star certified Data Center

• Operational as of May, 2014

• 2235 racks capable 12kW/rack

• 36 rows with 63 racks per row

• Built on GDL design with many improvements

– Al Lawlis Sr. Director, NetApp

PUE 1.14

PUE 1.2

Nexus 7000/5000/2000 NX-OS Design

GDL1

NetApp Engineering Design: Nexus 7000/5000/2000

[Diagram: GDL1. Labels: External Connectivity; core switches CS01 and CS02; distribution switches DS01, DS03, DS04, DS05, and DS07; vPC links]

• Architecture based on Nexus 7K/5K/2K

• Each VLAN is isolated to a specific distribution block

• 10G to the Top-of-Rack (ToR)

• Traditional Ethernet, unified I/O and Fibre Channel

• Nexus 7K VDC and Nexus 5K leveraged for FCoE/FC

RTP Data Center

GDL2 Network Requirements

• Design that will last the next ~10 years

• Based on FlexPod Architecture

• Programmability, Automation, Orchestration

• Reduce the failure domain size from ~500 racks to Top-of-Rack (ToR)

• Scale and density previously seen only at the distribution layer are now required at the Top-of-Rack (ToR)

• Adding 40G/100G capability

• User/Group isolation on shared infrastructure

R & D Data Center Design Principles

Nexus 9000 NX-OS Design: GDL2

NetApp Engineering Design: Nexus 9000 GDL2 and OTV

[Diagram: GDL2 (DS11, DS12, DS13, DS14, DI03, DI04) and GDL1 (core CS01 and CS02; DS01, DS03, DS04, DS05, DS07; DI01, DI02) linked by vPCs, with a block marked "Future NX-OS/ACI"]

Benefits:

• Based on same NX-OS as Nexus 7000/5000/2000

• OTV on Nexus 7004s to provide layer 2 connectivity

• 40G to Top-of-Rack (ToR)

• Migration path to ACI

[Diagram legend: 40G L2 = Layer 2; 40G L3 = Layer 3. Other labels: External Connectivity, RTP Data Center]

Nexus 9000 NX-OS Design: GDL1

GDL1 Network Refresh

• Used an inventory script to count copper vs. fiber ports (used/total)

• Created a detailed step-by-step procedure for the migration and tested the procedures in the Cisco Solution Validation Services lab

• Converted configurations from the existing 7Ks to the new 9Ks, accounting for updated best practices and port conversions

• Staged Nexus 9000 switches with the appropriate configurations and software

• Pre-labeled the fiber for the new ports to speed up the swap out

• Goal was zero downtime

Planning for the Migrations
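The slides do not show the inventory script itself. As a minimal sketch of the copper vs. fiber used/total tally it describes, assuming a hypothetical `Port` record format and made-up sample data:

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Port(NamedTuple):
    """One switch port as it might appear in an inventory export (hypothetical format)."""
    switch: str
    name: str
    media: str      # "copper" or "fiber"
    in_use: bool


def summarize_ports(ports: Iterable[Port]) -> dict[str, tuple[int, int]]:
    """Return {media: (used, total)} across all ports."""
    total = Counter()
    used = Counter()
    for port in ports:
        total[port.media] += 1
        if port.in_use:
            used[port.media] += 1
    return {media: (used[media], total[media]) for media in sorted(total)}


# Illustrative data only; not taken from the GDL1 inventory.
sample = [
    Port("DS03", "Eth1/1", "fiber", True),
    Port("DS03", "Eth1/2", "fiber", False),
    Port("DS03", "Eth2/1", "copper", True),
    Port("DS04", "Eth2/1", "copper", True),
]

for media, (used_count, total_count) in summarize_ports(sample).items():
    print(f"{media}: {used_count}/{total_count} ports in use")
```

A tally like this helps size the copper and fiber port counts needed on the replacement switches before configurations are converted.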

GDL1 Network Refresh

Core

• “shutdown” interfaces on CS02

• Verify traffic converges to CS01

• Remove CS02 from rack

• Rack new CS02

• “no shut” interfaces on CS02

• Verify traffic converges

• “shutdown” interfaces on CS01

• Verify traffic converges to CS02

• Remove CS01 from rack

• Rack new CS01

• “no shut” interfaces on CS01

• Verify traffic converges

Core/Distribution Steps

[Diagram: core vPC pair CS01/CS02 above distribution vPC pair DS03/DS04, linked by vPCs]

Distribution

• “shutdown” interfaces on DS04

• Verify traffic converges to DS03

• Remove DS04 from rack

• “Break vPC”

• Rack new DS04

• “no shut” northbound/southbound interfaces on DS04

• Verify DS04 is root bridge and HSRP active

• “shutdown” interfaces on DS03

• Verify traffic converges to DS04

• Remove DS03 from rack

• Rack new DS03

• “no shut” northbound/southbound/east-west L3 interfaces on DS03

• Configure vPC

• “no shut” vPC peer-link

• Add members to the vPC port-channels
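Both sequences follow the same rolling pattern: drain one switch of a redundant pair, replace it, restore it, then repeat for its peer. The sketch below generates the core sequence as an ordered checklist. The function name is hypothetical, and the distribution-specific steps (breaking and re-forming the vPC, checking root bridge and HSRP state) are deliberately not modeled.

```python
def rolling_swap_steps(first: str, second: str) -> list[str]:
    """Ordered steps for replacing a redundant pair one switch at a time.

    `first` is swapped while `second` carries traffic, then the roles reverse.
    """
    steps = []
    for target, survivor in ((first, second), (second, first)):
        steps += [
            f'"shutdown" interfaces on {target}',
            f"Verify traffic converges to {survivor}",
            f"Remove {target} from rack",
            f"Rack new {target}",
            f'"no shut" interfaces on {target}',
            "Verify traffic converges",
        ]
    return steps


for number, step in enumerate(rolling_swap_steps("CS02", "CS01"), start=1):
    print(f"{number:2d}. {step}")
```

Generating the checklist from one template keeps the two halves of the procedure symmetrical, which matters when the goal is zero downtime.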

GDL1 Network Refresh

Core swap-out went well

• Routing protocols converged as expected

Challenges with the first distribution pair

• Bridge Assurance

• “lacp suspend-individual”

• spanning-tree guard root

• OSPF summaries

Lessons learned
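The slides name the four problem areas but not what went wrong in each. One way to prepare for them is to locate every related line in a configuration before it is converted. The sketch below does that. The regular expressions are assumptions about how each topic appears in an NX-OS running-config, and the sample configuration is invented.

```python
import re

# Patterns are assumptions about how each topic shows up in an NX-OS
# running-config; adjust them to the configurations actually in use.
CHECKS = {
    "Bridge Assurance": re.compile(r"^\s*spanning-tree port type network\b"),
    "lacp suspend-individual": re.compile(r"^\s*(no )?lacp suspend-individual\b"),
    "spanning-tree guard root": re.compile(r"^\s*spanning-tree guard root\b"),
    "OSPF summaries": re.compile(r"^\s*area \S+ range\b"),
}


def audit_config(config_text: str) -> dict[str, list[int]]:
    """Map each topic to the 1-based line numbers where it appears."""
    hits = {topic: [] for topic in CHECKS}
    for line_number, line in enumerate(config_text.splitlines(), start=1):
        for topic, pattern in CHECKS.items():
            if pattern.search(line):
                hits[topic].append(line_number)
    return hits


# Hypothetical configuration fragment for illustration.
sample_config = """\
interface port-channel10
  spanning-tree port type network
  no lacp suspend-individual
interface Ethernet1/1
  spanning-tree guard root
router ospf 1
  area 0.0.0.1 range 10.1.0.0/16
"""

for topic, lines in audit_config(sample_config).items():
    print(f"{topic}: lines {lines}")
```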

GDL1 Network Refresh

Total of 1100 racks to be refreshed

Leveraging Power on Auto Provisioning (PoAP) and Python scripting to get the configurations from the old ToRs to the new 9Ks

Logistical challenge to work with the users in the half-rows to schedule the downtime
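The conversion scripts are not shown in the slides. A minimal sketch of one piece of that work, renaming interfaces when a configuration moves from an old ToR to a new 9K, might look like the following. The port map, the function name, and the sample configuration are all hypothetical.

```python
import re

# Hypothetical port map: old ToR interface -> new Nexus 9000 interface.
PORT_MAP = {
    "Ethernet1/1": "Ethernet1/49",
    "Ethernet1/2": "Ethernet1/50",
    "Ethernet101/1/1": "Ethernet1/1",
}

INTERFACE_LINE = re.compile(r"^(interface\s+)(\S+)\s*$")


def convert_config(old_config: str, port_map: dict[str, str]) -> tuple[str, list[str]]:
    """Rename interfaces per port_map in a single pass.

    Returns the converted text and the Ethernet interfaces that had no mapping,
    so they can be reviewed by hand instead of being silently carried over.
    """
    converted, unmapped = [], []
    for line in old_config.splitlines():
        match = INTERFACE_LINE.match(line)
        if match:
            old_name = match.group(2)
            if old_name in port_map:
                line = match.group(1) + port_map[old_name]
            elif old_name.startswith("Ethernet"):
                unmapped.append(old_name)
        converted.append(line)
    return "\n".join(converted), unmapped


# Invented configuration fragment for illustration.
old = """\
interface Ethernet1/1
  description uplink to DS01
interface Ethernet101/1/1
  switchport access vlan 100
interface Ethernet1/3
  shutdown
"""

new_text, unmapped = convert_config(old, PORT_MAP)
print(new_text)
print("Needs review:", unmapped)
```

Each line is rewritten at most once, so a port that maps onto a name that is itself a key in the map (here `Ethernet101/1/1` to `Ethernet1/1`) is not renamed a second time.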

Top-of-Rack

[Diagram: distribution vPC pair DS01/DS02 above Top-of-Rack devices: Nexus 9372PX, Nexus 9372TX, Nexus 9332PQ, and Nexus 2248TP-E]

NetApp Engineering Design: Nexus 9000 GDL1

[Diagram: GDL2 (DS11, DS12, DI03, DI04) and the GDL1 core with DS01, DS05, DI01, and DI02, linked by vPCs]

Benefits

• Based on same NX-OS as Nexus 7000/5000/2000

• 40G Core which allows 40G to GDL2

• 40G to Top-of-Rack (ToR)

• Migration path to ACI

[Diagram, continued: DS03, DS04, DS07, DS08; core switches CS01 and CS02; DS13 and DS14 marked "Future NX-OS/ACI"; External Connectivity; RTP Data Center. Legend: 40G L2 = Layer 2; 40G L3 = Layer 3]

NetApp Engineering Design: GDL2 9508 to 9516

[Diagram: GDL2 (DI03, DI04) and GDL1 (core CS01 and CS02; DS01, DS03, DS04, DS05, DS07, DS08; DI01, DI02) linked by vPCs. Legend: 40G L2 = Layer 2; 40G L3 = Layer 3]

Benefits

• Consistent infrastructure

• 9516s are required to support 500 ToRs per distribution block (DB)

Lessons Learned

• vPC auto-recovery timer

• Nexus 9000 BU and Cisco Advanced Services have created a forklift upgrade procedure

[Diagram, continued: DS11, DS12, DS13, DS14; block marked "Future NX-OS/ACI"; External Connectivity; RTP Data Center]

NetApp Engineering Design: GDL2 Core

[Diagram: GDL2 (DI03, DI04) and GDL1 (core CS01 and CS02; DS01, DS03, DS04, DS05, DS07, DS08; DI01, DI02) linked by vPCs. Legend: 40G L2 = Layer 2; 40G L3 = Layer 3]

Benefits

• Inter-DB traffic local to GDL2

• Minimize 40G LR links between GDL1/GDL2

Lessons Learned

• Went according to plan

[Diagram, continued: DS11, DS12, DS13, DS14; block marked "Future NX-OS/ACI"; core switches CS03 and CS04; External Connectivity; RTP Data Center]

Nexus 9000 NX-OS

• Leveraging Power on Auto Provisioning (PoAP) has cut the time to deploy a ToR from 4 hours to 30 minutes (8x faster)

• Hot patching combined with Python scripts lets us deploy fixes

• Bi-di optics leveraged for 40G links

• Built-in programmability of the platform

• Easy transition for Ops staff

Benefits

NetApp custom PoAP portal
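The deployment-time improvement quoted in the PoAP bullet (4 hours down to 30 minutes) can be expressed either as a speedup ratio or as a percentage of time saved. A short worked calculation, with hypothetical helper names:

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """How many times faster the new process is."""
    return before_minutes / after_minutes


def time_saved_percent(before_minutes: float, after_minutes: float) -> float:
    """Share of the original time that is no longer needed."""
    return 100.0 * (before_minutes - after_minutes) / before_minutes


before, after = 4 * 60, 30  # 4 hours and 30 minutes, both in minutes

print(f"{speedup(before, after):.0f}x faster")                    # 8x faster
print(f"{time_saved_percent(before, after):.1f}% less time")      # 87.5% less time
```

The two figures describe the same change: an 8x speedup corresponds to an 87.5% reduction in deployment time.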

Nexus 9000 ACI Design

NetApp Engineering Design: Current Deployment – GDL2

[Diagram labels: Fabric Spine Nodes; Fabric Leaf Nodes; Border Leaf Nodes; three APICs attached via vPC; Existing GDL2 Infrastructure (DS11, DS12); vPC External Routed Access for IPv4; L3 Boundary for IPv6; NFS/CIFS; Test Filers; Infrastructure; 11,000 Virtual Machines. Legend: Layer 3, Layer 2, Fabric]

• The topology consists of a FlexPod architecture with two spines and 9 leafs, running in production since Feb ‘15

• NetApp cDOT Release 8.3 provides the NFS/CIFS data stores for the 11,000 virtual machines

• Current version is 1.03f

ACI Defining Terms

Tenant - Logical separator for a customer, TG, BU, group, etc.; separates traffic, administration, visibility, etc.

Private Network - Equivalent to a VRF; separates routing instances and can be used as an administrative separation.

Bridge Domain - Not quite a VLAN; simply a container for subnets that can be used to define an L2 boundary.

Application Network Profile - Logical representation of an application and its interdependencies in the network fabric.

End-Point Group (EPG) - Container for objects requiring the same policy treatment, e.g. app tiers or services.
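To make the containment relationships between these terms concrete, here is a minimal sketch using plain Python dataclasses. The class and field names are illustrative and are not the APIC object model. Tying each EPG to a single bridge domain is a modeling choice of this sketch, and the subnets use a documentation address range.

```python
from dataclasses import dataclass, field


@dataclass
class BridgeDomain:
    """Container for subnets; can define an L2 boundary."""
    name: str
    subnets: list[str] = field(default_factory=list)


@dataclass
class PrivateNetwork:
    """Equivalent to a VRF: a separate routing instance."""
    name: str
    bridge_domains: list[BridgeDomain] = field(default_factory=list)


@dataclass
class EndPointGroup:
    """Objects that require the same policy treatment."""
    name: str
    bridge_domain: BridgeDomain  # one bridge domain per EPG (assumption of this sketch)


@dataclass
class ApplicationNetworkProfile:
    """An application expressed as a set of EPGs."""
    name: str
    epgs: list[EndPointGroup] = field(default_factory=list)


@dataclass
class Tenant:
    """Top-level logical separator."""
    name: str
    private_networks: list[PrivateNetwork] = field(default_factory=list)
    profiles: list[ApplicationNetworkProfile] = field(default_factory=list)

    def subnets(self) -> list[str]:
        """All subnets reachable through this tenant's bridge domains."""
        return [
            subnet
            for network in self.private_networks
            for domain in network.bridge_domains
            for subnet in domain.subnets
        ]


# Illustrative names and addresses only.
bd_a = BridgeDomain("BD-A", ["192.0.2.0/25"])
bd_b = BridgeDomain("BD-B", ["192.0.2.128/25"])
tenant = Tenant(
    name="Tenant A",
    private_networks=[PrivateNetwork("Private Network A", [bd_a, bd_b])],
    profiles=[
        ApplicationNetworkProfile(
            "App Profile A",
            [EndPointGroup("EPG1", bd_a), EndPointGroup("EPG2", bd_b)],
        )
    ],
)
print(tenant.subnets())
```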

[Diagram: Tenant A and Tenant B containing Private Network A and Private Network B, bridge domains with Subnet A and Subnet B, EPG1 through EPG4, and App Profile A and App Profile B]

ACI Policy Defining Terms

Contract - Definition of policy. Defines how an EPG communicates with other EPGs.

Subject - Used to build definitions of communication between EPGs. Contains: filter, action, and optional label.

Filter - Identifier for a subject, i.e. the traffic you want to take action on. Required within a subject.

Action - Action to be taken on the filtered traffic within a subject. Required within a subject.

Label - Optional advanced identifier. When used, labels allow for more complex definitions of relationships within the policy model.
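The relationship between contract, subject, filter, and action can be sketched as a small lookup: a contract holds subjects, and the first subject whose filter matches the traffic decides the action. This is an illustrative model, not how the fabric evaluates policy. The "permit" action name, the first-match rule, and the default of "drop" for unmatched traffic are all assumptions of the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Filter:
    """Identifies the traffic a subject applies to."""
    protocol: str
    port: int

    def matches(self, protocol: str, port: int) -> bool:
        return self.protocol == protocol and self.port == port


@dataclass(frozen=True)
class Subject:
    """Filter plus action, with an optional label."""
    name: str
    filter: Filter
    action: str                  # e.g. "permit", "drop", "mark", "redirect"
    label: Optional[str] = None


@dataclass
class Contract:
    """Policy describing how one EPG may talk to another."""
    name: str
    subjects: list[Subject] = field(default_factory=list)

    def action_for(self, protocol: str, port: int, default: str = "drop") -> str:
        """Return the action of the first matching subject, else the default."""
        for subject in self.subjects:
            if subject.filter.matches(protocol, port):
                return subject.action
        return default


# Illustrative contract between two EPGs.
contract = Contract(
    "clients-to-filers",
    [
        Subject("nfs", Filter("tcp", 2049), "permit"),
        Subject("web", Filter("tcp", 80), "redirect", label="inspect"),
    ],
)

print(contract.action_for("tcp", 2049))
print(contract.action_for("udp", 53))
```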

[Diagram: a contract contains several subjects. Each subject has a filter (in/out, port, etc.), an action (drop, mark, redirect, etc.), and an optional label. The contract sits between EPG1 and EPG2]

NetApp Engineering Design: ACI Specifics

• Tenants sharing an OSPF process running in the Common tenant

• DHCP relay to an external DHCP server

• Contracts: providing at the private network and consuming at the EPG

[Diagram labels:

Tenant Common: Private Network default; Routed Outside Global; External Connectivity via OSPF; DHCP Server; bridge domains GEC_UCS08_Guest_1 (Subnet A), GEC_UCS08_Guest_2 (Subnet A), GEC2_UCS59_Guest_1 (Subnet C), GEC2_UCS59_Guest_2 (Subnet D)

Tenant GEC1: Private Network default; Routed Outside Global; App Profile UCS08 with EPG Guest_1 and EPG Guest_2; bridge domains GEC1_UCS08_Guest_1 (Subnet A) and GEC1_UCS08_Guest_2 (Subnet B), marked "Shared From Common Tenant"

Tenant GEC2: Private Network default; Routed Outside Global; App Profile UCS59 with EPG Guest_1 and EPG Guest_2; bridge domains GEC2_UCS59_Guest_1 (Subnet C) and GEC2_UCS59_Guest_2 (Subnet D), marked "Shared From Common Tenant"]

Virtual Private Test Beds: Vision

Tenant - Virtual Private Test Beds

• ANP-Test Bed 1: EPG - Clients, EPG - Filers, EPG - TestTools

• ANP-Test Bed 2: EPG - Clients, EPG - Filers, EPG - TestTools

• ANP-Test Bed 3: EPG - Clients, EPG - Filers, EPG - TestTools

Tenant - Common

• ANP-Common Services: EPG - LDAP, EPG - NIS, EPG - DHCP, EPG - DNS
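Because every test bed repeats the same three EPGs, the layout lends itself to being generated rather than hand-built. The sketch below builds that structure as plain Python dictionaries. The dictionary layout and function names are hypothetical and are not the APIC REST schema.

```python
import json

TEST_BED_EPGS = ["Clients", "Filers", "TestTools"]
COMMON_SERVICE_EPGS = ["LDAP", "NIS", "DHCP", "DNS"]


def app_profile(name: str, epg_names: list[str]) -> dict:
    """One application network profile with its EPGs."""
    return {"name": name, "epgs": [f"EPG - {epg}" for epg in epg_names]}


def build_tenants(test_bed_count: int) -> list[dict]:
    """Describe the test-bed tenant and the shared Common tenant."""
    return [
        {
            "tenant": "Virtual Private Test Beds",
            "profiles": [
                app_profile(f"ANP-Test Bed {number}", TEST_BED_EPGS)
                for number in range(1, test_bed_count + 1)
            ],
        },
        {
            "tenant": "Common",
            "profiles": [app_profile("ANP-Common Services", COMMON_SERVICE_EPGS)],
        },
    ]


tenants = build_tenants(3)
print(json.dumps(tenants, indent=2))
```

Adding a fourth test bed then becomes a change to one argument rather than a new set of manually created objects.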

NetApp Engineering Design: ACI Zone Based Design

[Diagram labels: Fabric Building Spine Nodes; Fabric Zone Spine Nodes; Fabric Leaf Nodes; APICs; PODs attached via vPC; zones WR, IT, GDL1, GDL2. Legend: Layer 3, Layer 2, Fabric]

Benefits:

• ‘Any Layer 2 segment anywhere’

• Single Fabric for the Campus

• Centralized Policy Management

NetApp Engineering Design: Long Term Plan - Site

[Diagram: two copies of the zone-based fabric, each with APICs, PODs attached via vPC, and zones labeled IT and WR. Legend: Layer 3, Layer 2, Fabric]

NetApp Engineering Design

• Scale – endpoints per leaf and total number of leafs

• FC/FCoE support on the Leaf

• Native IPv6 support in the ACI Fabric, shipping in the 1.1 release

• In-service software upgrade for the Leaf

• Spine support for 9516, shipping in the 1.1 release

• Transit routing for route peering with service appliances, shipping in the 1.1 release.

ACI Challenges

Conclusion

• Programmability and automation enable us to remove ourselves from users’ workflows

• We leverage and influence the FlexPod reference architecture

• Migrations from 7K/5K/2K to 9K can be done with zero downtime

• ACI in its current offering is missing a few key features, which prevents us from wide-scale deployment

• We still have to tackle the NX-OS to ACI migrations

• NetApp Engineering is committed to making ACI successful in our environment

Key Takeaways

References

• vPC auto-recovery: http://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/7-x/interfaces/configuration/guide/b_Cisco_Nexus_9000_Series_NX-OS_Interfaces_Configuration_Guide_7x/configuring_vpcs.html#concept_09474B25A6434B6792CEE6322B5EF7F7

Questions?

Complete Your Online Session Evaluation

Don’t forget: Cisco Live sessions will be available for viewing on-demand after the event at CiscoLive.com/Online

• Give us your feedback to be entered into a Daily Survey Drawing. A daily winner will receive a $750 Amazon gift card.

• Complete your session surveys through the Cisco Live mobile app or your computer on Cisco Live Connect.

Thank you
