Upgrading UCS Fabric Interconnects While Minimizing Downtime


Upload: cisco-data-center-sdn

Post on 27-Jun-2015



© 2011 Cisco and/or its affiliates. All rights reserved. 1

Upgrading UCS Fabric Interconnects While Minimizing Downtime

Steve Sharman

Consulting Systems Engineer – UCS Application and Solutions Architecture Team

Upgrading Fabric Interconnects is a Complex Process

• We are never dealing with the “Simple Case”: Like-for-Like, no change in uplinks (number/type), no additional chassis. Most likely we will be adding uplinks, adding/modifying port channels, etc.

• We are never upgrading just the Fabric Interconnects: Gen 2 hardware includes new IO Modules (IOMs). Higher port density means more chassis, more blades.

• The process will disrupt both data planes: Chassis must be re-acknowledged on both sides. Hardware must be replaced, dropping each side during the re-cabling/configuration step.

• Outage windows need to be scheduled and risk assessed. The activity is most likely going to occur on a weekend or late at night.

UCS+VMware Enables a Seamless Upgrade

• UCS - Stateless Computing

  • SAN Booting

  • Service Profile mobility

  • Separation of “Logical” objects from Physical

• Host/Application Virtualization

  • vMotion enables non-disruptive workload mobility

Gen 2 Hardware Upgrade with No Downtime

1. Install and configure the new Gen 2 Hardware as a separate system

• Set up Ethernet Port Channels, configure FC Trunking and Port Channels

• Verify that all northbound connectivity is working – (VLANs configured etc.)

2. On original system

• Backup/Export Logical Configuration (preserve Identities)

3. New Platform

• Import the Logical Configuration – “replace”

• Clear SP states

• Install a few test servers to verify Ethernet and SAN connectivity and catch any missing VLANs, etc.

4. You now have a “Duplicate” system with upgraded HW, new FW, and verified connectivity.
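The "catch any missing VLANs" check in step 3 can be partly scripted before the test servers go in. The sketch below is only an illustration and does not call any UCS Manager API: it assumes you have already extracted, by whatever means, the VLAN names referenced by the imported Service Profiles and the VLAN names defined on the new system. The function name and the sample VLAN names are hypothetical.

```python
def missing_vlans(required, configured):
    """Return VLAN names the Service Profiles need but the new system lacks."""
    return sorted(set(required) - set(configured))


# Hypothetical sample data. Real values would come from the exported logical
# configuration and from the new Fabric Interconnects.
required_by_profiles = {"mgmt", "vmotion", "prod-100", "prod-200"}
configured_on_new_system = {"mgmt", "vmotion", "prod-100"}

gaps = missing_vlans(required_by_profiles, configured_on_new_system)
if gaps:
    print("Missing VLANs on new system:", ", ".join(gaps))
else:
    print("All required VLANs are present.")
```

A name-level comparison like this only narrows the search. The test servers are still needed to confirm that the VLANs actually pass traffic northbound.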

Migrate Workloads to the New Platform

Basic Process

• On Source:

1. “Evacuate” guests from the chosen ESX server

2. Put the host in Maintenance mode and shut it down

3. Disassociate Service Profile from blade (you could delete it at this point)

4. Verify the blade is completely shut down and its FC initiators are logged out of the fabric

• On Target:

1. Associate Service Profile to blade – server will boot

2. Verify ESX host is booted and online – remove from maintenance mode

3. Migrate guests to host.

• Repeat this process until the first chassis is evacuated

• Then shut down the chassis, upgrade IOMs, add it to the new Fabric Interconnects, and repeat until complete.
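The per-host sequence above can be written down as a generated runbook, so that each change window works from the same ordered checklist. This is a minimal sketch under stated assumptions: it only produces text and does not drive vCenter or UCS Manager. The `Host` class, the `chassis_runbook` function, and all host, profile, and chassis names are hypothetical.

```python
from dataclasses import dataclass

# Step wording follows the source/target lists on the slide.
SOURCE_STEPS = (
    "evacuate guests from {host}",
    "put {host} in maintenance mode and shut it down",
    "disassociate Service Profile {profile} from the source blade",
    "verify the source blade is shut down and FC initiators are logged out",
)
TARGET_STEPS = (
    "associate Service Profile {profile} to the target blade (server boots)",
    "verify {host} is online and remove it from maintenance mode",
    "migrate guests to {host}",
)


@dataclass(frozen=True)
class Host:
    name: str     # ESX host name (hypothetical)
    profile: str  # Service Profile name (hypothetical)


def chassis_runbook(chassis, hosts):
    """Return the ordered steps to move one chassis to the new system."""
    steps = []
    for host in hosts:  # one host at a time, source steps before target steps
        for template in SOURCE_STEPS + TARGET_STEPS:
            steps.append(template.format(host=host.name, profile=host.profile))
    steps.append(f"shut down {chassis}, upgrade its IOMs, "
                 "and add it to the new Fabric Interconnects")
    return steps


example_hosts = [Host("esx01", "sp-esx01"), Host("esx02", "sp-esx02")]
for number, step in enumerate(chassis_runbook("chassis-1", example_hosts), start=1):
    print(f"{number:2d}. {step}")
```

Keeping the verification step as its own line item is deliberate. Because the imported configuration preserves identities, associating the Service Profile on the target before the source blade has left the fabric would presumably put the same identities on both systems at once.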

Use Case Example – Major Upgrade From Gen 1

• Major Data Center Upgrade –

• UCS – Gen 1 to Gen 2

• MDS – 8 Gb FC

• Nexus – Retiring Cat6K, replacing with 7K core

• Fully committed 6120 implementation, all ports full.

• UCS fully loaded with blades, all running production applications supporting critical lines of business.

• All ESX servers are SAN booted, environment 100% virtualized.

• Upgrade of UCS included:

• Doubling the Chassis count

• Upgrading to E5 Processors and increasing memory densities

• Implementing vPC to the 7K and FC Port Channels to the MDS

Use Case Example - Continued

• This was considered a very “High Risk” upgrade with many application stakeholders.

• The original plan, using the documented upgrade procedure, was going to require a full-day outage on a weekend.

• After this option was presented, a new plan was developed that implemented the process outlined above.

Result:

• 40 Servers, 2 Fabric Interconnects, and 10 IO Modules were upgraded with *no* application downtime.

• It did take 2 weeks to complete the upgrade. But all of the work was done during regularly scheduled change windows and the Technical lead didn’t miss his son’s football game on Saturday :)

Thank you.