SQLCAT: SQL Server HA and DR Design Patterns, Architectures, and Best Practices Using Microsoft SQL Server 2012 AlwaysOn Sanjay Mishra Program Manager Microsoft Corporation DBI316

SQLCAT: SQL Server HA and DR Design Patterns, Architectures, and Best Practices Using Microsoft SQL Server 2012 AlwaysOn

Sanjay Mishra, Program Manager, Microsoft Corporation

DBI316

Setting the Stage

Assumed prerequisites for this presentation: basic knowledge of
• AlwaysOn Failover Cluster Instances (FCI)
• AlwaysOn Availability Groups (AG)

Definitions, for the purpose of this presentation:
• High Availability (Local HA): availability within a data center
• Disaster Recovery (DR): availability across data centers

Setting the Stage

AlwaysOn ≠ Availability Groups

AlwaysOn = { SQL Server Failover Cluster Instances, Availability Groups }

Availability Groups ≠ Database Mirroring

SQL Server 2012 AlwaysOn HA+DR Design Patterns

Pattern 1: Multi-site Failover Cluster Instance (FCI) for HA and DR
• Solution characteristics: Shared Storage solution (note 1)
• Corresponding pre-SQL Server 2012 solution: Multi-site FCI using stretch VLAN

Pattern 2: Availability Group for HA and DR
• Solution characteristics: Non-Shared Storage solution
• Corresponding pre-SQL Server 2012 solution: Database Mirroring for Local HA and Log Shipping for DR

Pattern 3: Failover Cluster Instance for local HA + Availability Group for DR
• Solution characteristics: Combined Shared Storage and Non-Shared Storage
• Corresponding pre-SQL Server 2012 solution: Failover Cluster Instance for Local HA and Database Mirroring for DR

Note 1: Masked by storage replication

Slight variations of these design patterns are occasionally observed as well.

Wednesday, June 13, 10:15 AM – 11:30 AM
SQLCAT: HA/DR Customer Panel - Microsoft SQL Server 2012 AlwaysOn Deployment Considerations
N 320E
• Michael Steineke, Edgenet, Inc.
• David P. Smith, ServiceU Corporation
• Ayad Shammout, CareGroup Healthcare Systems
• Wolfgang Kutschera, bwin party
• Thomas Grohser, Hedge fund in Connecticut

Multi-site Failover Cluster Instance for HA and DR

[Diagram: one Windows Server Failover Cluster spanning the Primary Site (Node 1, Node 2) and the DR Site (Node 3, Node 4), with storage replication between the sites. A single SQL-FCI runs across the four nodes: one active, three passive.]

Pattern 1: Multi-site Failover Cluster Instance (FCI) for HA and DR
(http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012-alwayson_3a00_-multisite-failover-cluster-instance.aspx)

Solution characteristics:
• Shared Storage solution (note 1)
• Instance Level HA (automatic)
• Instance Level DR (automatic; note 2)
• Uses storage replication

Corresponding pre-SQL Server 2012 solution: Multi-site FCI using stretch VLAN

Note 1: Masked by storage replication
Note 2: Consider 3rd data center

Multi-site Failover Cluster Instance: Key Elements

A single SQL Server failover cluster instance (FCI) provides HA as well as DR, spanning multiple sites (usually multiple subnets as well).

Key components:

Storage
• Storage-level replication
• Cluster enabler, provided by the storage vendor
• Work with your storage vendor to get the appropriate software and best practices

Network
• Multi-subnet support in SQL Server configuration and engine (a key improvement in SQL Server 2012)
• IP address OR dependency is set within SQL Server setup
• The SQL Server engine skips binding to any IP addresses that are not online at start-up
• RegisterAllProvidersIP for the Network Name improves application failover time

Multi-site Failover Cluster Instance: Deployment Considerations

Storage Validation
• The storage validation check requirement is relaxed because of the make-up of multi-site storage infrastructure (but you still get the pop-up!)
• A multi-site FCI solution does not need to pass the storage validation tests to be supported: http://support.microsoft.com/kb/943984

Appropriate Quorum Model
• Validation suggests “Node and Disk Majority”, which can be ignored
• Consider “Node and File Share Majority” or “Node Majority”, based on the number of nodes

Multi-site Failover Cluster Instance: Deployment Considerations

TEMPDB on Local Disk
• Not specific to “multi-site” FCIs, but has some great positive side effects for “multi-site” scenarios
• Enables use of local storage for TEMPDB
• Can use solid state storage to improve performance of TEMPDB-heavy workloads
• Saves money on storage replication licensing
• Reduces cross-data center storage replication traffic

Availability Groups for HA and DR

Pattern 2: Availability Group for HA and DR

Solution characteristics:
• Non-Shared Storage solution
• (Group of) Database Level HA (automatic)
• (Group of) Database Level DR (manual; note 3)
• DR replica can be Active Secondary

Corresponding pre-SQL Server 2012 solution: Database Mirroring for Local HA and Log Shipping for DR

Note 3: DR is manual if HA is chosen to be automatic. Consider a 3rd data center if automatic DR is needed.

[Diagram: a single Windows Server Failover Cluster (WSFC) crossing two data centers and hosting one Availability Group. The Primary Data Center holds the SQL Server primary and a synchronous SQL Server secondary; the Disaster Recovery Data Center holds an asynchronous SQL Server secondary.]

Availability Groups for HA and DR: Deployment Considerations

Prerequisites: see “Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)”

Unit of Failover
• Group of databases, not the instance
• Consider Contained Databases for containing logins for failover
• For jobs and other objects outside the database, simple customization is needed

Considerations for Replacing Log Shipping
• No delayed apply on the secondary
• Removing log shipping means the regular log backup job is removed
• Need to re-establish periodic log backups (essential for truncating the log)

New tools for monitoring and alerting
• AlwaysOn Dashboard
• New DMVs
• System Center Operations Manager

Availability Groups for HA and DR: Quorum Considerations

Quorum is managed by the WSFC, irrespective of the number of SQL Server instances, nodes, or availability groups.

Important goal: design so that unavailability of the DR site (or of the node at the DR site), or loss of network connectivity between sites, does not impact the quorum of the WSFC.

Two steps:
• Node votes: first decide which nodes should have a vote
• Quorum model: then choose the appropriate quorum model

Availability Groups for HA and DR: Quorum Considerations

Node Votes
• By default, every node has a vote, which may not be ideal for the HA / DR goals
• Windows Server hotfix: http://support.microsoft.com/kb/2494036
• Guidelines: http://msdn.microsoft.com/en-us/library/hh270280.aspx#RecommendedAdjustmentstoQuorumVoting

For the example topology discussed here, this means:
• 1 vote to each node in the primary data center
• 0 votes to the node in the disaster recovery data center
• Total: 2 votes in the Windows cluster, which is not ideal! A “majority”-based quorum model needs an odd number of votes.

Since this is a purely non-shared storage solution, there are two possible quorum models: Node and File Share Majority, or Node Majority.
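The vote arithmetic on this slide can be checked with a short sketch. It is an illustration only, assuming the usual majority rule (the cluster keeps quorum while more than half of all configured votes are reachable). The function name `has_quorum` is hypothetical and is not part of any Windows or SQL Server API.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Majority rule: more than half of all configured votes must be online."""
    return votes_online > total_votes // 2


# Example topology from the slide: two voting nodes in the primary data
# center and zero votes for the DR node, giving 2 votes in total.
two_votes_one_node_down = has_quorum(total_votes=2, votes_online=1)

# Adding a third vote (a file share witness or an extra node) lets the
# cluster survive the loss of any single voter.
three_votes_one_voter_down = has_quorum(total_votes=3, votes_online=2)

print(two_votes_one_node_down)     # False: losing one node loses quorum
print(three_votes_one_voter_down)  # True
```

With an even number of votes, a single failure already breaks the majority, which is why the slide asks for an odd total.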

Quorum Model and Node Votes: Node and File Share Majority

Use the “Node and File Share Majority” quorum model with a protected file share witness.

Note: The file share witness always has 1 vote.

[Diagram: a single Windows Server Failover Cluster (WSFC) crossing two data centers and hosting one Availability Group. The Primary Data Center holds the SQL Server primary and a synchronous SQL Server secondary; the Disaster Recovery Data Center holds an asynchronous SQL Server secondary. A file share witness is added to the cluster.]

Quorum Model and Node Votes: Node Majority

[Diagram: a single Windows Server Failover Cluster (WSFC) crossing two data centers and hosting one Availability Group. The Primary Data Center holds the SQL Server primary, a synchronous SQL Server secondary, and an additional server for the Node Majority quorum model; the Disaster Recovery Data Center holds an asynchronous SQL Server secondary.]

Add an additional voting node to the WSFC in the primary data center, and then use the “Node Majority” quorum model.
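The two quorum options can be compared against the design goal (losing the DR site must not cost the WSFC its quorum) with a small simulation. This is a sketch under stated assumptions: the names `Voter` and `quorum_after_site_loss` are hypothetical, the node names are placeholders, and the file share witness is assumed to sit at the primary site, which the slides do not state.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Voter:
    name: str
    site: str   # "primary" or "dr"
    votes: int  # node vote (0 or 1), or 1 for a file share witness


def quorum_after_site_loss(voters: Sequence[Voter], lost_site: str) -> bool:
    """Return True if the voters outside lost_site still hold a majority."""
    total = sum(v.votes for v in voters)
    surviving = sum(v.votes for v in voters if v.site != lost_site)
    return surviving > total // 2


# Option 1: Node and File Share Majority (witness location is an assumption).
node_and_file_share = [
    Voter("primary-node-1", "primary", 1),
    Voter("primary-node-2", "primary", 1),
    Voter("dr-node", "dr", 0),
    Voter("file-share-witness", "primary", 1),
]

# Option 2: Node Majority with an additional voting node at the primary site.
node_majority = [
    Voter("primary-node-1", "primary", 1),
    Voter("primary-node-2", "primary", 1),
    Voter("additional-node", "primary", 1),
    Voter("dr-node", "dr", 0),
]

for label, voters in (("file share", node_and_file_share), ("node majority", node_majority)):
    print(label, "survives DR site loss:", quorum_after_site_loss(voters, "dr"))
    print(label, "survives primary site loss:", quorum_after_site_loss(voters, "primary"))
```

In both options the DR site carries no votes, so its loss leaves quorum intact, while loss of the primary site leaves no surviving votes. That is consistent with DR failover being a manual step in this pattern.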

Quorum Model and Node Votes: How to Set / View

To view the quorum model:
• Windows Failover Cluster Manager GUI
• PowerShell
• Cluster.exe
• SQL Server DMVs
• AlwaysOn Dashboard in SSMS

To change the quorum model:
• Windows Failover Cluster Manager GUI
• PowerShell
• Cluster.exe

To view node votes:
• PowerShell
• Cluster.exe
• SQL Server DMVs
• AlwaysOn Dashboard in SSMS

To change node votes:
• PowerShell
• Cluster.exe
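As a rough illustration of the PowerShell and DMV routes listed above, the sketch below collects a few commands as Python strings so they can be pasted into a runbook. It is not taken from the presentation. The DMV and cmdlet names are the documented SQL Server 2012 and FailoverClusters module names as best recalled, so verify them against your environment; the helper function names, the node name, and the share path are hypothetical.

```python
# T-SQL against the SQL Server 2012 AlwaysOn DMVs (view only).
VIEW_QUORUM_MODEL_SQL = (
    "SELECT cluster_name, quorum_type_desc, quorum_state_desc "
    "FROM sys.dm_hadr_cluster;"
)
VIEW_NODE_VOTES_SQL = (
    "SELECT member_name, member_type_desc, member_state_desc, "
    "number_of_quorum_votes FROM sys.dm_hadr_cluster_members;"
)

# PowerShell (FailoverClusters module), view only.
VIEW_QUORUM_MODEL_PS = "Get-ClusterQuorum"
VIEW_NODE_VOTES_PS = "Get-ClusterNode | Format-Table Name, NodeWeight, State"


def change_quorum_to_file_share_ps(share_path: str) -> str:
    """Build the PowerShell command that switches to Node and File Share Majority."""
    return f'Set-ClusterQuorum -NodeAndFileShareMajority "{share_path}"'


def change_node_vote_ps(node_name: str, votes: int) -> str:
    """Build the PowerShell command that sets a node's vote (NodeWeight) to 0 or 1."""
    if votes not in (0, 1):
        raise ValueError("a node vote must be 0 or 1")
    return f'(Get-ClusterNode "{node_name}").NodeWeight = {votes}'


# Hypothetical names, for illustration only.
print(change_quorum_to_file_share_ps(r"\\fileserver\witness"))
print(change_node_vote_ps("DRNODE1", 0))
```

Changing NodeWeight depends on the node-vote hotfix referenced on the Node Votes slides (http://support.microsoft.com/kb/2494036).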

Recovering from a Disaster

Disaster = the primary site is down. A manual process is involved to bring the database service online at the DR site:
• Force quorum on the secondary in the DR site
• Force failover of the availability group, accepting possible data loss (for availability groups the T-SQL option is FORCE_FAILOVER_ALLOW_DATA_LOSS; FORCE_SERVICE_ALLOW_DATA_LOSS is the database mirroring equivalent)
• Adjust the quorum model and/or node votes
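The recovery sequence can be written down as an ordered checklist. The sketch below is a hedged illustration rather than the presenter's procedure: `dr_recovery_steps` and all node and group names are hypothetical, and it assumes the cluster service on the DR node is stopped before quorum is forced. The T-SQL uses the availability group form of forced failover, `FORCE_FAILOVER_ALLOW_DATA_LOSS`; `-FixQuorum` is the Windows Server 2008 R2 era switch of `Start-ClusterNode` (later versions also accept `-ForceQuorum`).

```python
from typing import List, Sequence, Tuple


def dr_recovery_steps(
    dr_node: str, ag_name: str, primary_nodes: Sequence[str]
) -> List[Tuple[str, str]]:
    """Return (where to run, command) pairs in the order given on the slide."""
    steps = [
        # 1. Force quorum on the surviving node at the DR site.
        ("PowerShell on the DR node",
         f'Start-ClusterNode -Name "{dr_node}" -FixQuorum'),
        # 2. Force failover of the availability group, accepting data loss.
        ("T-SQL on the DR replica",
         f"ALTER AVAILABILITY GROUP [{ag_name}] FORCE_FAILOVER_ALLOW_DATA_LOSS;"),
        # 3. Adjust node votes: the DR node now needs a vote.
        ("PowerShell on the DR node",
         f'(Get-ClusterNode "{dr_node}").NodeWeight = 1'),
    ]
    # The unreachable primary-site nodes should no longer count toward quorum.
    for node in primary_nodes:
        steps.append(("PowerShell on the DR node",
                      f'(Get-ClusterNode "{node}").NodeWeight = 0'))
    return steps


# Hypothetical names, for illustration only.
for where, command in dr_recovery_steps("DRNODE1", "AG1", ["PRIMNODE1", "PRIMNODE2"]):
    print(f"{where}: {command}")
```

Keeping the steps as data makes the order explicit: quorum first, then the forced failover, then the vote adjustments.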

Migration from DBM+LS to AG: Planning and Key Considerations

Hardware: new hardware, or reuse existing hardware?

Windows Clustering: involve the Windows system administration team and the networking team
• Quorum considerations across multiple data centers
• Cluster network communication across multiple data centers

Stages: migrate the whole configuration at once, or migrate the DR afterwards?

Application connection string change

Special Case: Automatic Failover for DR (Use of 3rd Data Center)

[Diagram: a Windows Server Failover Cluster hosting one Availability Group, with the SQL Server primary in the Primary Data Center and a synchronous SQL Server secondary in the Disaster Recovery Data Center. The file share witness is placed in a 3rd data center.]

Failover Cluster Instance for HA, and Availability Group for DR

Failover Cluster Instance + Database Mirroring

[Diagram: a SQL_FCI in the Primary Data Center hosting the principal database and a SQL_FCI in the Disaster Recovery Data Center hosting the mirror database, connected by Database Mirroring.]

FCI for HA + AG for DR

Pattern 3: Failover Cluster Instance for local HA + Availability Group for DR

Solution characteristics:
• Combined Shared Storage and Non-Shared Storage
• Instance Level HA (automatic)
• (Group of) Database Level DR (manual)
• DR replica can be Active Secondary
• Asymmetric storage is the key to this solution

Corresponding pre-SQL Server 2012 solution: Failover Cluster Instance for Local HA and Database Mirroring for DR

[Diagram: a single Windows Server Failover Cluster spanning both data centers. SQLFCIPRIMARY in the Primary Data Center hosts the primary database(s); SQLFCIDR in the Disaster Recovery Data Center hosts the secondary database(s); an Availability Group connects the two.]

FCI for HA + AG for DR: Deployment Considerations

Prerequisites

Windows Server service packs / QFEs:
• Asymmetric storage: Windows Server 2008 with http://support.microsoft.com/kb/976097, or Windows Server 2008 R2 SP1
• Node votes: http://support.microsoft.com/kb/2494036
• Validate disk test QFE: http://support.microsoft.com/kb/2531907

See also “Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)”.

Different units of failover for HA and DR
• Instance-level failover for local HA (FCI)
• Group of databases (AG) for DR

AG Failover Mode
• In an FCI+AG configuration, the FCI provides automatic failover and the AG provides manual failover

FCI for HA + AG for DR: Deployment Considerations

Asymmetric Storage
• Key concept behind this architecture
• New Windows Server Failover Clustering capability introduced in Windows Server 2008 R2 SP1 and in Windows Server 2008 with QFE
• Symmetric storage = a cluster disk that is shared between all the WSFC nodes
• Asymmetric storage = a cluster disk that is shared between a subset of nodes

Instance Naming
• Each FCI within the WSFC needs to have a different instance name

Database File Paths
• (Recommended) Use identical drive letters for the disks for each FCI
• (Recommended) Use identical file paths for data and log files for each FCI

FCI for HA + AG for DR: Quorum Considerations

Quorum is managed by the WSFC, irrespective of the number of SQL Server instances (FCI or standalone), nodes, or availability groups.

Important goal: design so that unavailability of the DR site, or loss of network connectivity between sites, does not impact the quorum of the WSFC.

Two steps:
• Node votes: first decide which nodes should have a vote
• Quorum model: then choose the appropriate quorum model

FCI for HA + AG for DR: Quorum Considerations

Node Votes
• By default, every node has a vote, which may not be ideal for the HA / DR goals
• Windows Server hotfix: http://support.microsoft.com/kb/2494036
• Guidelines: http://msdn.microsoft.com/en-us/library/hh270280.aspx#RecommendedAdjustmentstoQuorumVoting

For the example topology discussed here, this means:
• 1 vote to each node in the primary data center
• 0 votes to each node in the disaster recovery data center
• Total: 2 votes in the Windows cluster, which is not ideal! A “majority”-based quorum model needs an odd number of votes.

Quorum models:
• Pick one of the “majority”-based quorum models with an odd number of votes: Node and File Share Majority, Node Majority, or Node and (asymmetric) Disk Majority
• Or pick (asymmetric) Disk Only (special case!), where votes don’t matter
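The difference between the majority-based models and the Disk Only special case can be sketched as a small rule table. This is a simplified model under stated assumptions, not the WSFC implementation: the witness is treated as one extra vote in the two witness-based majority models, and Disk Only is modelled as "at least one node is up and can reach the disk". The enum and function names are hypothetical.

```python
from enum import Enum


class QuorumModel(Enum):
    NODE_MAJORITY = "Node Majority"
    NODE_AND_FILE_SHARE_MAJORITY = "Node and File Share Majority"
    NODE_AND_DISK_MAJORITY = "Node and (asymmetric) Disk Majority"
    DISK_ONLY = "(asymmetric) Disk Only"


def has_quorum(
    model: QuorumModel,
    node_votes_total: int,
    node_votes_online: int,
    nodes_online: int,
    witness_online: bool = False,
) -> bool:
    """Simplified quorum check for the models named on the slide."""
    if model is QuorumModel.DISK_ONLY:
        # Votes do not matter: a running node that reaches the disk is enough.
        return witness_online and nodes_online >= 1
    has_witness = model in (
        QuorumModel.NODE_AND_FILE_SHARE_MAJORITY,
        QuorumModel.NODE_AND_DISK_MAJORITY,
    )
    total = node_votes_total + (1 if has_witness else 0)
    online = node_votes_online + (1 if has_witness and witness_online else 0)
    return online > total // 2


# Example topology: two primary-site nodes with 1 vote each, DR nodes with 0 votes.
# One primary node fails; the witness (where the model has one) stays reachable.
print(has_quorum(QuorumModel.NODE_MAJORITY, 2, 1, nodes_online=3))
print(has_quorum(QuorumModel.NODE_AND_FILE_SHARE_MAJORITY, 2, 1, 3, witness_online=True))
print(has_quorum(QuorumModel.DISK_ONLY, 2, 0, nodes_online=1, witness_online=True))
```

The first call shows why two node votes alone are fragile; the second shows the witness restoring an odd total; the third shows Disk Only holding quorum even with no node votes online.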

Quorum Model and Node Votes: Example: Node and File Share Majority

Note: The file share witness always has 1 vote.

[Diagram: a single Windows Server Failover Cluster spanning both data centers. SQLFCIPRIMARY in the Primary Data Center hosts the primary database(s); SQLFCIDR in the Disaster Recovery Data Center hosts the secondary database(s); an Availability Group connects the two. A file share witness is added to the cluster.]

Quorum Model and Node Votes: How to Set / View

To view the quorum model:
• Windows Failover Cluster Manager GUI
• PowerShell
• Cluster.exe
• SQL Server DMVs
• AlwaysOn Dashboard in SSMS

To change the quorum model:
• Windows Failover Cluster Manager GUI
• PowerShell
• Cluster.exe

To view node votes:
• PowerShell
• Cluster.exe
• SQL Server DMVs
• AlwaysOn Dashboard in SSMS

To change node votes:
• PowerShell
• Cluster.exe

Note: Only cluster.exe can be used to set the quorum model to “Node and (asymmetric) Disk Majority” or “(asymmetric) Disk Only”.

Recovering from a Disaster

Disaster = the primary site is down. A manual process is involved to bring the database service online at the DR site:
• Force quorum on the secondary in the DR site
• Force failover of the Availability Group, accepting possible data loss (for availability groups the T-SQL option is FORCE_FAILOVER_ALLOW_DATA_LOSS; FORCE_SERVICE_ALLOW_DATA_LOSS is the database mirroring equivalent)
• Adjust the quorum model and/or node votes
• Rethink the quorum model: is another file share needed at the DR site?

Migration from FCI+DBM to FCI+AG: Planning and Key Considerations

Hardware: new hardware, or reuse existing hardware?

Windows Clustering
• Quorum considerations across multiple data centers
• Cluster network communication across multiple data centers

Stages: migrate the whole configuration at once, or migrate the DR afterwards?

Secondary (DR site) needs re-seeding
• Uninstall the existing SQL FCI
• Destroy the existing WSFC at the DR site
• Re-install the SQL FCI after joining the DR nodes to the primary data center WSFC
• Back up from the primary, and restore on the secondary

Application connection string change

Summary

SQL Server 2012 AlwaysOn HA+DR Design Patterns

Pattern 1: Multi-site Failover Cluster Instance (FCI) for HA and DR

Solution characteristics:
• Shared Storage solution (note 1)
• Instance Level HA (automatic)
• Instance Level DR (automatic; note 2)
• Uses storage replication
• Doesn’t require database to be in FULL recovery model

Corresponding pre-SQL Server 2012 solution: Multi-site FCI using stretch VLAN

Pattern 2: Availability Group for HA and DR

Solution characteristics:
• Non-Shared Storage solution
• (Group of) Database Level HA (automatic)
• (Group of) Database Level DR (manual; note 3)
• DR replica can be Active Secondary
• Requires database to be in FULL recovery model

Corresponding pre-SQL Server 2012 solution: Database Mirroring for Local HA and Log Shipping for DR

Pattern 3: Failover Cluster Instance for local HA + Availability Group for DR

Solution characteristics:
• Combined Shared Storage and Non-Shared Storage
• Instance Level HA (automatic)
• (Group of) Database Level DR (manual)
• DR replica can be Active Secondary
• Requires database to be in FULL recovery model
• Asymmetric storage is the key to this solution

Corresponding pre-SQL Server 2012 solution: Failover Cluster Instance for Local HA and Database Mirroring for DR

Note 1: Masked by storage replication
Note 2: Consider 3rd data center
Note 3: DR is manual if HA is chosen to be automatic. Consider a 3rd data center if automatic DR is needed.

Slight variations of these design patterns are occasionally observed as well.

Sanjay Mishra

[email protected] www.sqlcat.com

@sqlcat

© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.