SQLCAT: SQL Server HA and DR Design Patterns, Architectures, and Best Practices Using Microsoft SQL Server 2012 AlwaysOn
Sanjay Mishra, Program Manager, Microsoft Corporation
DBI316
Setting the Stage
Assumed prerequisites for this presentation: basic knowledge of
• AlwaysOn Failover Cluster Instances (FCI)
• AlwaysOn Availability Groups (AG)
Definitions, for the purpose of this presentation:
• High Availability (Local HA): availability within a data center
• Disaster Recovery (DR): availability across data centers
Setting the Stage
AlwaysOn ≠ Availability Groups
AlwaysOn = { SQL Server Failover Cluster Instances, Availability Groups }
Availability Groups ≠ Database Mirroring
SQL Server 2012 AlwaysOn HA+DR Design Patterns
Each design pattern is listed with its solution characteristics and the corresponding pre-SQL Server 2012 solution.
1. Multi-site Failover Cluster Instance (FCI) for HA and DR
• Solution characteristics: shared storage solution (masked by storage replication)
• Corresponding pre-SQL Server 2012 solution: multi-site FCI using stretch VLAN
2. Availability Group for HA and DR
• Solution characteristics: non-shared storage solution
• Corresponding pre-SQL Server 2012 solution: Database Mirroring for local HA and Log Shipping for DR
3. Failover Cluster Instance for local HA + Availability Group for DR
• Solution characteristics: combined shared storage and non-shared storage
• Corresponding pre-SQL Server 2012 solution: Failover Cluster Instance for local HA and Database Mirroring for DR
Slight variations of these design patterns are occasionally observed as well.
Related session: Wednesday, June 13, 10:15 AM – 11:30 AM, N 320E
SQLCAT: HA/DR Customer Panel - Microsoft SQL Server 2012 AlwaysOn Deployment Considerations
• Michael Steineke, Edgenet, Inc.
• David P. Smith, ServiceU Corporation
• Ayad Shammout, CareGroup Healthcare Systems
• Wolfgang Kutschera, bwin party
• Thomas Grohser, Hedge fund in Connecticut
Multi-site Failover Cluster Instance for HA and DR
Diagram: a single Windows Server Failover Cluster spans the Primary Site (Node 1, Node 2) and the DR Site (Node 3, Node 4), with storage replication between the sites. The SQL-FCI is active on one node and passive on the other three.
SQL Server 2012 AlwaysOn HA+DR solution 1: Multi-site Failover Cluster Instance (FCI) for HA and DR
(http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012-alwayson_3a00_-multisite-failover-cluster-instance.aspx)
Solution characteristics:
• Shared storage solution (masked by storage replication)
• Instance-level HA (automatic)
• Instance-level DR (automatic; consider a 3rd data center)
• Uses storage replication
Corresponding pre-SQL Server 2012 solution: multi-site FCI using stretch VLAN
Multi-site Failover Cluster Instance: Key Elements
A single SQL Server failover cluster instance (FCI) that provides HA as well as DR and spans multiple sites (usually multiple subnets as well).
Key components:
• Storage
  • Storage-level replication
  • Cluster enabler, provided by the storage vendor
  • Work with your storage vendor to get the appropriate software and best practices
• Network
  • Multi-subnet support in SQL Server configuration and engine (a key improvement in SQL Server 2012)
  • IP address OR dependency is set within SQL Server setup
  • The SQL engine skips binding to any IPs that are not online at start-up
  • RegisterAllProvidersIP for the network name improves application failover time
Multi-site Failover Cluster Instance: Deployment Considerations
• Storage validation
  • The storage validation check requirement is relaxed because of the make-up of multi-site storage infrastructure (but you still get the pop-up!)
  • A multi-site FCI solution does not need to pass the storage validation tests to be supported: http://support.microsoft.com/kb/943984
• Appropriate quorum model
  • Validation suggests “Node and Disk Majority”, which can be ignored
  • Consider “Node and File Share Majority” or “Node Majority”, based on the number of nodes
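The choice between “Node and File Share Majority” and “Node Majority” based on the number of nodes can be sketched as a small helper. This is a minimal Python sketch, assuming the common rule of thumb that an odd number of voting nodes suits Node Majority and an even number suits Node and File Share Majority (the witness supplies the tie-breaking vote). The function name `suggest_quorum_model` is hypothetical and is not part of any Microsoft tool.

```python
def suggest_quorum_model(voting_nodes: int) -> str:
    """Suggest a majority-based quorum model (illustrative rule of thumb only)."""
    if voting_nodes < 1:
        raise ValueError("a cluster needs at least one voting node")
    # Odd node count: the votes cannot split into two equal halves.
    if voting_nodes % 2 == 1:
        return "Node Majority"
    # Even node count: a file share witness adds the tie-breaking vote.
    return "Node and File Share Majority"


if __name__ == "__main__":
    for n in (3, 4):
        print(n, "voting nodes ->", suggest_quorum_model(n))
```

For the four-node example cluster (two nodes per site), the helper points to Node and File Share Majority; the real decision should still follow the storage vendor's and Microsoft's guidance for the specific deployment.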
Multi-site Failover Cluster Instance: Deployment Considerations
• TEMPDB on local disk
  • Not specific to multi-site FCIs, but has some great positive side effects for multi-site scenarios
  • Enables use of local storage for TEMPDB
  • Can use solid state storage to improve performance of TEMPDB-heavy workloads
  • Saves money on storage replication licensing
  • Reduces cross-data center storage replication traffic
Availability Groups for HA and DR
Availability Groups for HA and DR
SQL Server 2012 AlwaysOn HA+DR solution 2: Availability Group for HA and DR
Solution characteristics:
• Non-shared storage solution
• (Group of) database-level HA (automatic)
• (Group of) database-level DR (manual). DR is manual if HA is chosen to be automatic; consider a 3rd data center if automatic DR is needed.
• DR replica can be an Active Secondary
Corresponding pre-SQL Server 2012 solution: Database Mirroring for local HA and Log Shipping for DR
Diagram: a single Windows Server Failover Cluster (WSFC) crosses two data centers. The Primary Data Center hosts the SQL Server primary and a synchronous SQL Server secondary; the Disaster Recovery Data Center hosts an asynchronous SQL Server secondary. All three belong to one Availability Group.
Availability Groups for HA and DR: Deployment Considerations
• Prerequisites: see “Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)”
• Unit of failover
  • Group of databases, not the instance
  • Consider a contained database for containing logins for failover
  • For jobs and other objects outside the database, simple customization is needed
• Considerations for replacing log shipping
  • No delayed apply on the secondary
  • Removing log shipping means the regular log backup job is removed
  • Need to re-establish a periodic log backup (essential for truncating the log)
• New tools for monitoring and alerting
  • AlwaysOn Dashboard
  • New DMVs
  • System Center Operations Manager
Availability Groups for HA and DR: Quorum Considerations
• Quorum is managed by the WSFC, irrespective of the number of SQL Server instances, nodes, or availability groups
• Important goal: design to ensure that unavailability of the DR site (or the node at the DR site), or loss of network connectivity between sites, does not impact the quorum of the WSFC
• Two steps:
  • Node votes: first decide which nodes should have a vote
  • Quorum model: then choose the appropriate quorum model
Availability Groups for HA and DR: Quorum Considerations
• Node votes
  • By default, every node has a vote, which may not be ideal for the HA / DR goals
  • Windows Server hotfix: http://support.microsoft.com/kb/2494036
  • Guidelines: http://msdn.microsoft.com/en-us/library/hh270280.aspx#RecommendedAdjustmentstoQuorumVoting
• For the example topology discussed here, this means:
  • 1 vote to each node in the primary data center
  • 0 votes to the node in the disaster recovery data center
  • Total: 2 votes in the Windows cluster, which is not ideal. A “majority”-based quorum model needs an odd number of votes.
• Since this is a purely non-shared storage solution, there are two possible quorum models: Node and File Share Majority, or Node Majority
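Why a total of 2 votes is "not ideal" follows from simple majority arithmetic. The sketch below is a minimal Python illustration, assuming quorum means a strict majority of all configured votes; the function names are hypothetical.

```python
def majority_threshold(total_votes: int) -> int:
    """Votes needed for a strict majority of total_votes."""
    return total_votes // 2 + 1


def tolerated_vote_losses(total_votes: int) -> int:
    """How many votes can be lost while a majority still remains."""
    return total_votes - majority_threshold(total_votes)


if __name__ == "__main__":
    for total in (2, 3):
        print(
            f"{total} votes: need {majority_threshold(total)}, "
            f"can lose {tolerated_vote_losses(total)}"
        )
```

With 2 votes, both are needed, so losing either primary-site node costs the cluster its quorum. Adding a third vote (a file share witness or an extra node) lets the cluster survive the loss of one vote.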
Quorum Model and Node Votes: Node and File Share Majority
Use the “Node and File Share Majority” quorum model with a protected file share witness.
Note: The file share witness always has 1 vote.
Diagram: a single WSFC crossing two data centers, with the SQL Server primary and a synchronous secondary in the Primary Data Center, an asynchronous secondary in the Disaster Recovery Data Center, all in one Availability Group, plus a file share witness.
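The effect of this vote assignment can be checked with a small simulation. This is a minimal Python sketch under stated assumptions: the node names are hypothetical, the witness is assumed to sit in the primary data center (the slides only say it is protected), and quorum is modelled as a strict majority of configured votes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voter:
    name: str
    site: str
    votes: int


# Example topology; names and witness placement are assumptions.
VOTERS = [
    Voter("Node1", "primary", 1),    # primary replica
    Voter("Node2", "primary", 1),    # synchronous secondary
    Voter("Node3", "dr", 0),         # asynchronous secondary, vote removed
    Voter("Witness", "primary", 1),  # file share witness always has 1 vote
]


def has_quorum(voters, down_names=(), down_sites=()):
    """True if the reachable voters hold a strict majority of all votes."""
    total = sum(v.votes for v in voters)
    online = sum(
        v.votes
        for v in voters
        if v.name not in down_names and v.site not in down_sites
    )
    return online > total / 2


if __name__ == "__main__":
    print("DR site down:        ", has_quorum(VOTERS, down_sites={"dr"}))
    print("One primary node down:", has_quorum(VOTERS, down_names={"Node2"}))
    print("Primary site down:   ", has_quorum(VOTERS, down_sites={"primary"}))
```

Under these assumptions, losing the DR site or a single primary-site node leaves quorum intact, while losing the whole primary site does not, which is why bringing the DR site online is a manual process.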
Quorum Model and Node Votes: Node Majority
Add an additional voting node to the WSFC in the primary data center, and then use the “Node Majority” quorum model.
Diagram: a single WSFC crossing two data centers, with the SQL Server primary and a synchronous secondary in the Primary Data Center, an asynchronous secondary in the Disaster Recovery Data Center, all in one Availability Group, plus an additional server for the Node Majority quorum model.
Quorum Model and Node Votes: How to set / view
• To view the quorum model: Windows Failover Cluster Manager GUI, PowerShell, Cluster.exe, SQL Server DMVs, AlwaysOn Dashboard in SSMS
• To change the quorum model: Windows Failover Cluster Manager GUI, PowerShell, Cluster.exe
• To view node votes: PowerShell, Cluster.exe, SQL Server DMVs, AlwaysOn Dashboard in SSMS
• To change node votes: PowerShell, Cluster.exe
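As a rough cheat sheet for the PowerShell and DMV options listed above, the sketch below collects example commands as plain strings in Python. Nothing is executed. The PowerShell cmdlets assume the FailoverClusters module is loaded, the `NodeWeight` property assumes the node-vote hotfix (KB 2494036) is installed, and the `<...>` placeholders and the `commands_for` helper are hypothetical.

```python
# Illustrative command strings only; nothing here is executed.
# "<...>" placeholders are hypothetical and must be replaced.
QUORUM_COMMANDS = {
    "view quorum model (PowerShell)": "Get-ClusterQuorum",
    "view quorum model (DMV)": (
        "SELECT cluster_name, quorum_type_desc, quorum_state_desc "
        "FROM sys.dm_hadr_cluster;"
    ),
    "change quorum model (PowerShell)": (
        r"Set-ClusterQuorum -NodeAndFileShareMajority \\<server>\<share>"
    ),
    "view node votes (PowerShell)": "Get-ClusterNode | Format-Table Name, NodeWeight",
    "view node votes (DMV)": (
        "SELECT member_name, member_type_desc, number_of_quorum_votes "
        "FROM sys.dm_hadr_cluster_members;"
    ),
    "change node votes (PowerShell)": '(Get-ClusterNode "<node>").NodeWeight = 0',
}


def commands_for(task: str) -> dict:
    """Return the entries whose label starts with the given task name."""
    return {k: v for k, v in QUORUM_COMMANDS.items() if k.startswith(task)}


if __name__ == "__main__":
    for label, command in commands_for("view node votes").items():
        print(f"{label}: {command}")
```

Treat the strings as starting points to verify against the product documentation for the Windows Server and SQL Server versions in use.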
Recovering from a Disaster
• Disaster = primary site is down
• A manual process is involved to bring the database service online at the DR site:
  • Force quorum on the secondary in the DR site
  • Execute a forced failover of the availability group, accepting possible data loss (FORCE_FAILOVER_ALLOW_DATA_LOSS)
  • Adjust the quorum model and/or node votes
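The manual recovery sequence can be written down as an ordered runbook. The sketch below is a hedged Python illustration that only renders text. The forced-failover step is shown with the availability group syntax `FORCE_FAILOVER_ALLOW_DATA_LOSS`, the PowerShell examples assume the FailoverClusters module, and the `<...>` placeholders, the `Step` type and the `render` helper are hypothetical.

```python
from typing import NamedTuple


class Step(NamedTuple):
    order: int
    where: str
    action: str
    example: str


# "<...>" values are placeholders; example commands are not executed here.
DR_RUNBOOK = [
    Step(1, "DR node, PowerShell", "Force quorum",
         'Start-ClusterNode -Name "<dr-node>" -FixQuorum'),
    Step(2, "DR replica, T-SQL", "Force failover, accepting possible data loss",
         "ALTER AVAILABILITY GROUP [<ag-name>] FORCE_FAILOVER_ALLOW_DATA_LOSS;"),
    Step(3, "DR node, PowerShell", "Adjust node votes",
         '(Get-ClusterNode "<dr-node>").NodeWeight = 1'),
]


def render(runbook):
    """Render the steps in order as numbered lines."""
    return "\n".join(
        f"{s.order}. [{s.where}] {s.action}: {s.example}" for s in sorted(runbook)
    )


if __name__ == "__main__":
    print(render(DR_RUNBOOK))
```

Keeping the order explicit matters: the availability group cannot be failed over until the cluster at the DR site is running with forced quorum.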
Migration: From DBM+LS (Database Mirroring + Log Shipping) to AG, Planning and Key Considerations
• Hardware: new hardware, or reuse existing hardware?
• Windows clustering: involve the Windows system administration team and the networking team
  • Quorum considerations across multiple data centers
  • Cluster network communication across multiple data centers
• Stages: migrate the whole configuration at once, or migrate the DR afterwards?
• Application connection string change
Special Case: Automatic Failover for DR, Use of 3rd Data Center
Diagram: a single Windows Server Failover Cluster with the SQL Server primary in the Primary Data Center and a synchronous SQL Server secondary in the Disaster Recovery Data Center, both in one Availability Group, and a file share witness in a 3rd data center.
Failover Cluster Instance for HA, and Availability Group for DR
Failover Cluster Instance + Database Mirroring
Diagram: a SQL_FCI in the Primary Data Center hosts the principal database, and a separate SQL_FCI in the Disaster Recovery Data Center hosts the mirror database. Database Mirroring connects the two.
FCI for HA + AG for DR
SQL Server 2012 AlwaysOn HA+DR solution 3: Failover Cluster Instance for local HA + Availability Group for DR
Solution characteristics:
• Combined shared storage and non-shared storage
• Instance-level HA (automatic)
• (Group of) database-level DR (manual)
• DR replica can be an Active Secondary
• Asymmetric storage is the key to this solution
Corresponding pre-SQL Server 2012 solution: Failover Cluster Instance for local HA and Database Mirroring for DR
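Diagram: a single Windows Server Failover Cluster spans both data centers. SQLFCIPRIMARY in the Primary Data Center hosts the primary database(s), and SQLFCIDR in the Disaster Recovery Data Center hosts the secondary database(s). An Availability Group connects them.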
FCI for HA + AG for DR: Deployment Considerations
• Prerequisites
  • Windows Server service packs / QFEs:
    • Asymmetric storage: Windows Server 2008 with http://support.microsoft.com/kb/976097, or Windows Server 2008 R2 SP1
    • Node votes: http://support.microsoft.com/kb/2494036
    • Validate disk test QFE: http://support.microsoft.com/kb/2531907
  • “Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)”
• Different units of failover for HA and DR
  • Instance-level failover for local HA (FCI)
  • Group of databases (AG) for DR
• AG failover mode
  • In an FCI+AG configuration, the FCI provides automatic failover, and the AG provides manual failover
FCI for HA + AG for DR: Deployment Considerations
• Asymmetric storage
  • Key concept behind this architecture
  • New Windows Server Failover Clustering capability introduced in Windows Server 2008 R2 SP1 and in Windows Server 2008 with QFE
  • Symmetric storage = a cluster disk that is shared between all the WSFC nodes
  • Asymmetric storage = a cluster disk that is shared between a subset of nodes
• Instance naming
  • Each FCI within the WSFC needs to have a different instance name
• Database file paths
  • (Recommended) Use identical drive letters for the disks for each FCI
  • (Recommended) Use identical file paths for data and log files for each FCI
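The symmetric versus asymmetric distinction defined above reduces to a set comparison. This is a minimal Python sketch; the node names and the `classify_cluster_disk` function are hypothetical and only model the definition, not any Windows API.

```python
def classify_cluster_disk(disk_nodes, cluster_nodes):
    """Classify a cluster disk by which WSFC nodes share it."""
    disk_nodes, cluster_nodes = set(disk_nodes), set(cluster_nodes)
    if not disk_nodes:
        raise ValueError("a cluster disk must be visible to at least one node")
    if not disk_nodes <= cluster_nodes:
        raise ValueError("disk is visible to nodes outside the WSFC")
    # Shared by every node: symmetric. Shared by a subset: asymmetric.
    return "symmetric" if disk_nodes == cluster_nodes else "asymmetric"


if __name__ == "__main__":
    wsfc = {"P1", "P2", "D1", "D2"}  # two nodes per data center (assumed)
    print("SQLFCIPRIMARY disk:", classify_cluster_disk({"P1", "P2"}, wsfc))
    print("SQLFCIDR disk:     ", classify_cluster_disk({"D1", "D2"}, wsfc))
```

In the FCI+AG pattern each FCI's disks are visible only to the nodes in its own data center, so every shared disk in the single WSFC is asymmetric.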
FCI for HA + AG for DR: Quorum Considerations
• Quorum is managed by the WSFC, irrespective of the number of SQL Server instances (FCI or standalone), nodes, or availability groups
• Important goal: design to ensure that unavailability of the DR site, or loss of network connectivity between sites, does not impact the quorum of the WSFC
• Two steps:
  • Node votes: first decide which nodes should have a vote
  • Quorum model: then choose the appropriate quorum model
FCI for HA + AG for DR: Quorum Considerations
• Node votes
  • By default, every node has a vote, which may not be ideal for the HA / DR goals
  • Windows Server hotfix: http://support.microsoft.com/kb/2494036
  • Guidelines: http://msdn.microsoft.com/en-us/library/hh270280.aspx#RecommendedAdjustmentstoQuorumVoting
• For the example topology discussed here, this means:
  • 1 vote to each node in the primary data center
  • 0 votes to each node in the disaster recovery data center
  • Total: 2 votes in the Windows cluster, which is not ideal. A “majority”-based quorum model needs an odd number of votes.
• Quorum models
  • Pick one of the “majority”-based quorum models with an odd number of votes: Node and File Share Majority, Node Majority, or Node and (asymmetric) Disk Majority
  • Or pick (asymmetric) Disk Only (special case!), where votes don’t matter
Quorum Model and Node Votes: Example, Node and File Share Majority
Note: The file share witness always has 1 vote.
Diagram: a single Windows Server Failover Cluster spans both data centers. SQLFCIPRIMARY in the Primary Data Center hosts the primary database(s), SQLFCIDR in the Disaster Recovery Data Center hosts the secondary database(s), an Availability Group connects them, and a file share acts as the witness.
Quorum Model and Node Votes: How to set / view
• To view the quorum model: Windows Failover Cluster Manager GUI, PowerShell, Cluster.exe, SQL Server DMVs, AlwaysOn Dashboard in SSMS
• To change the quorum model: Windows Failover Cluster Manager GUI, PowerShell, Cluster.exe
• To view node votes: PowerShell, Cluster.exe, SQL Server DMVs, AlwaysOn Dashboard in SSMS
• To change node votes: PowerShell, Cluster.exe
Note: Only cluster.exe can be used to set the quorum model to “Node and (asymmetric) Disk Majority” or “(asymmetric) Disk Only”.
Recovering from a Disaster
• Disaster = primary site is down
• A manual process is involved to bring the database service online at the DR site:
  • Force quorum on the secondary in the DR site
  • Execute a forced failover of the availability group, accepting possible data loss (FORCE_FAILOVER_ALLOW_DATA_LOSS)
  • Adjust the quorum model and/or node votes
• Rethink the quorum model: is another file share needed at the DR site?
Migration: From FCI+DBM to FCI+AG, Planning and Key Considerations
• Hardware: new hardware, or reuse existing hardware?
• Windows clustering
  • Quorum considerations across multiple data centers
  • Cluster network communication across multiple data centers
• Stages: migrate the whole configuration at once, or migrate the DR afterwards?
• Secondary (DR site) needs re-seeding
  • Uninstall the existing SQL FCI
  • Destroy the existing WSFC at the DR site
  • Re-install the SQL FCI after joining the DR nodes to the primary data center WSFC
  • Back up from the primary, and restore on the secondary
• Application connection string change
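One plausible shape of the application connection string change is sketched below in Python. It assumes the application used a database mirroring style string with a `Failover Partner` keyword and will instead connect through an availability group listener with `MultiSubnetFailover=True`. The listener name, server names, database name and the helper function are all hypothetical, and the slides do not prescribe this exact change.

```python
def to_listener_connection_string(conn_str: str, listener: str) -> str:
    """Point a mirroring-style connection string at an AG listener (sketch)."""
    parts = {}
    for item in conn_str.split(";"):
        if item.strip():
            key, _, value = item.partition("=")
            parts[key.strip()] = value.strip()
    # "Failover Partner" is a database mirroring keyword; drop it.
    parts = {k: v for k, v in parts.items() if k.lower() != "failover partner"}
    # Reuse whichever server keyword the original string used.
    server_keys = [k for k in parts if k.lower() in ("server", "data source")]
    parts[server_keys[0] if server_keys else "Server"] = listener
    # Ask the client to try the listener's IP addresses in parallel.
    parts["MultiSubnetFailover"] = "True"
    return ";".join(f"{k}={v}" for k, v in parts.items())


if __name__ == "__main__":
    old = "Server=SQLFCIPRIMARY;Failover Partner=SQLFCIDR;Database=Sales;Integrated Security=SSPI"
    print(to_listener_connection_string(old, "ag-listener"))
```

Whether `MultiSubnetFailover` is available depends on the client driver version, so the resulting string should be validated against the driver the application actually uses.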
Summary
SQL Server 2012 AlwaysOn HA+DR Design Patterns
Each design pattern is listed with its solution characteristics and the corresponding pre-SQL Server 2012 solution.
1. Multi-site Failover Cluster Instance (FCI) for HA and DR
• Shared storage solution (masked by storage replication)
• Instance-level HA (automatic)
• Instance-level DR (automatic; consider a 3rd data center)
• Uses storage replication
• Doesn’t require the database to be in the FULL recovery model
• Corresponding pre-SQL Server 2012 solution: multi-site FCI using stretch VLAN
2. Availability Group for HA and DR
• Non-shared storage solution
• (Group of) database-level HA (automatic)
• (Group of) database-level DR (manual). DR is manual if HA is chosen to be automatic; consider a 3rd data center if automatic DR is needed.
• DR replica can be an Active Secondary
• Requires the database to be in the FULL recovery model
• Corresponding pre-SQL Server 2012 solution: Database Mirroring for local HA and Log Shipping for DR
3. Failover Cluster Instance for local HA + Availability Group for DR
• Combined shared storage and non-shared storage
• Instance-level HA (automatic)
• (Group of) database-level DR (manual)
• DR replica can be an Active Secondary
• Requires the database to be in the FULL recovery model
• Asymmetric storage is the key to this solution
• Corresponding pre-SQL Server 2012 solution: Failover Cluster Instance for local HA and Database Mirroring for DR
Slight variations of these design patterns are occasionally observed as well.
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to
be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS
PRESENTATION.