Scott Schnoll Principal Technical Writer Microsoft Corporation UNC313

Upload: miles-manning

Post on 23-Dec-2015

TRANSCRIPT

Page 1: Scott Schnoll Principal Technical Writer Microsoft Corporation UNC313
Page 2

High Availability in

Scott Schnoll
Principal Technical Writer
Microsoft Corporation
UNC313

Page 3

Agenda

• Exchange 2010 High Availability Vision/Goals
• Exchange 2010 High Availability Features
• Exchange 2010 High Availability Deep Dive
• Deploying Exchange 2010 High Availability Features
• Transitioning to Exchange 2010 High Availability
• End-to-End Availability Improvements
• High Availability Design Examples

Page 4

Exchange 2010 High Availability Vision/Goals

Page 5

Exchange 2010 High Availability Vision and Goals

Vision: Deliver a fast, easy-to-deploy and operate, economical solution that can provide messaging service continuity for all customers

Goals

• Deliver a solution for high availability and site resilience that is native to Exchange
• Enable less expensive and less complex storage
• Simplify administration and reduce support costs
• Increase end-to-end availability
• Support Exchange Server 2010 Online

Page 6

Exchange 2010 High Availability Solution

• Unified technology for high availability and site resilience
• New framework for creating highly available mailboxes
• Evolution of continuous replication technology
• Can be deployed on a range of storage options
• Native to Exchange; not bolted onto the side

Page 7

Exchange Server 2003

[Diagram labels: Outlook, OWA, ActiveSync, or Outlook Anywhere clients; Front End Server; NodeA (active) and NodeB (passive); Standby Cluster; San Jose; Dallas; DB1 to DB6]

• Third-party data replication needed for site resilience
• Complex site resilience and recovery
• Clustering knowledge required
• Failover at Mailbox server level
• Clustered Mailbox Server had to be created manually

Page 8

Exchange Server 2007

[Diagram labels: Outlook, OWA, ActiveSync, or Outlook Anywhere clients; Client Access Server; NodeA (active) and NodeB (passive); CCR; SCR; Standby Cluster; San Jose; Dallas; DB1 to DB6]

• No GUI to manage SCR
• Complex activation for remote server / datacenter
• Clustering knowledge required
• Failover at Mailbox server level
• Clustered Mailbox Server can’t co-exist with other roles

Page 9

Exchange Server 2010

[Diagram labels: Client; Client Access Server; Mailbox Server 1 to Mailbox Server 6; San Jose; Dallas; copies of DB1 to DB5]

• Failover managed by/with Exchange
• Database level failover
• Easy to extend across sites
• All clients connect via CAS servers

Page 10

Exchange 2010 High Availability Features

Page 11

Exchange 2010 High Availability Feature Names

• Mailbox Resiliency – Name of Unified High Availability and Site Resilience Solution
• Database Availability Group – A group of up to sixteen mailbox servers that host a set of replicated databases
• Mailbox Database Copy – A mailbox database (.edb file and logs) that is either active or passive
• Database Mobility – The ability of a single mailbox database to be replicated to and mounted on other mailbox servers

Page 12

Exchange 2010 High Availability Feature Names

• RPC Client Access service – A Client Access server feature that provides a MAPI endpoint for Outlook clients
• Shadow Redundancy – A transport feature that provides redundancy for messages for the entire time they are in transit
• Incremental Deployment – The ability to deploy high availability/site resilience after Exchange is installed
• Exchange Third Party Replication API – An Exchange-provided API that enables use of third-party replication for a DAG in lieu of continuous replication

Page 13

Exchange 2010 High Availability Terminology

• High Availability – Solution must provide data availability, service availability, and automatic recovery from failures
• Disaster Recovery – Process used to manually recover from a failure
• Site Resilience – Disaster recovery solution used for recovery from site failure
• *over – Short for switchover/failover; a switchover is a manual activation of one or more databases; a failover is an automatic activation of one or more databases after a failure

Page 14

Exchange 2010 *overs

• Within a datacenter
  • Database or server *overs
• Datacenter level: switchover
• Between datacenters
  • Database or server *overs
• Assumptions:
  • Each datacenter is a separate Active Directory site
  • Each datacenter has live, active messaging services
  • Standby datacenter must be active to support single database *over

Page 15

Exchange 2007 Concepts Brought Forward

• Extensible Storage Engine (ESE)
• Databases and log files
• Continuous Replication
• Log shipping and replay
• Database seeding
• Store service/Replication service
• Database health and status monitoring
• Divergence
• Automatic database mount behavior
• Concepts of quorum and witness
• Concepts of *overs

Page 16

Exchange 2010 Deprecated Concepts

• Storage groups
• Databases identified by the server on which they live
• Server names as part of database names
• Clustered Mailbox Servers
• Pre-installing a Windows Failover Cluster
• Running setup in Clustered Mode
• Moving a CMS network identity between servers
• Shared storage
• Two HA copy limits
• Private and public networks

Page 17

Exchange 2010 High Availability Vision/Goals

Page 18

Exchange 2010 HA Fundamentals

• Database Availability Group
• Server
• Database
• Database Copy
• Active Manager
• RPC Client Access

[Diagram labels: DAG; SVR; AM; copy; DB; RPC CAS]

Page 19

Database Availability Group (DAG)

• Base component of high availability and site resilience
• A group of up to 16 servers that host a set of replicated databases
• “Wraps” a Windows Failover Cluster
  • Manages membership (DAG member = node)
  • Provides heartbeat of DAG member servers
  • Active Manager stores data in cluster database
• Defines a boundary for:
  • Mailbox database replication
  • Database and server *overs
  • Active Manager

Page 20

Active Manager

• Exchange component that manages *overs
• Runs on every server in the DAG
• Selects best available copy on failovers
• Is the definitive source of information on where a database is active
  • Stores this information in cluster database
  • Provides this information to other Exchange components (e.g., RPC Client Access and Hub Transport)
• Two Active Manager roles: PAM and SAM

Page 21

Active Manager

• Primary Active Manager (PAM)
  • Runs on the node that owns the cluster group
  • Gets topology change notifications
  • Reacts to server failures
  • Selects the best database copy on *overs
• Standby Active Manager (SAM)
  • Runs on every other node in the DAG
  • Responds to queries about which server hosts the active copy of the mailbox database
• Both roles are necessary for automatic recovery
  • If Replication service is stopped, automatic recovery will not happen

Page 22

Active Manager: Selection of Active Database Copy

Active Manager selects the “best” copy to become active when existing active fails

• Ignores servers that are unreachable or on which activation is temporarily or regularly blocked
• Sorts copies by currency to minimize data loss
• Breaks ties during sort based on Activation Preference
• Selects from sorted list based on copy status of each copy

Page 23

Active Manager: Selection of Active Database Copy

Active Manager selects the “best” copy to become active when existing active fails

1. Catalog Healthy; Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource; CopyQueueLength < 10; ReplayQueueLength < 50
2. Catalog Crawling; Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource; CopyQueueLength < 10; ReplayQueueLength < 50
3. Catalog Healthy; Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource; ReplayQueueLength < 50
4. Catalog Crawling; Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource; ReplayQueueLength < 50
5. Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource; ReplayQueueLength < 50
6. Catalog Healthy; Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource; CopyQueueLength < 10
7. Catalog Crawling; Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource; CopyQueueLength < 10
8. Catalog Healthy; Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource
9. Catalog Crawling; Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource
10. Copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource
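The selection steps and the ten criteria sets above can be sketched in Python. This is a simplified model under stated assumptions, not Exchange code: the DatabaseCopy class, its field names, and select_best_copy are hypothetical, and "currency" is approximated here by copy queue length.

```python
from dataclasses import dataclass
from typing import List, Optional

# Copy statuses that every criteria set on the slide accepts.
ACTIVATABLE_STATUSES = {
    "Healthy",
    "DisconnectedAndHealthy",
    "DisconnectedAndResynchronizing",
    "SeedingSource",
}

# One tuple per criteria set, in slide order:
# (required catalog state, CopyQueueLength limit, ReplayQueueLength limit).
# None means the criteria set does not check that value.
CRITERIA_SETS = [
    ("Healthy", 10, 50),
    ("Crawling", 10, 50),
    ("Healthy", None, 50),
    ("Crawling", None, 50),
    (None, None, 50),
    ("Healthy", 10, None),
    ("Crawling", 10, None),
    ("Healthy", None, None),
    ("Crawling", None, None),
    (None, None, None),
]


@dataclass
class DatabaseCopy:
    """Hypothetical record describing one passive copy of a database."""
    server: str
    status: str
    catalog: str
    copy_queue_length: int
    replay_queue_length: int
    activation_preference: int
    reachable: bool = True
    activation_blocked: bool = False


def meets(candidate: DatabaseCopy, criteria) -> bool:
    """Return True if the copy satisfies one criteria set."""
    catalog, max_cql, max_rql = criteria
    if candidate.status not in ACTIVATABLE_STATUSES:
        return False
    if catalog is not None and candidate.catalog != catalog:
        return False
    if max_cql is not None and candidate.copy_queue_length >= max_cql:
        return False
    if max_rql is not None and candidate.replay_queue_length >= max_rql:
        return False
    return True


def select_best_copy(copies: List[DatabaseCopy]) -> Optional[DatabaseCopy]:
    """Pick a copy to activate, or None if no copy qualifies."""
    # Ignore unreachable servers and servers where activation is blocked.
    candidates = [c for c in copies if c.reachable and not c.activation_blocked]
    # Sort by currency (assumed: shortest copy queue first),
    # breaking ties on activation preference (lower number preferred).
    candidates.sort(key=lambda c: (c.copy_queue_length, c.activation_preference))
    # Try the strictest criteria set first, then relax step by step.
    for criteria in CRITERIA_SETS:
        for candidate in candidates:
            if meets(candidate, criteria):
                return candidate
    return None
```

In this model, a copy with a healthy catalog can win over a more current copy whose catalog is still crawling, because the criteria sets are tried in order before the sort position is considered.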

Page 24

Example: Database Failover

• Database failure occurs
• Failure item is raised
• Active Manager moves active database
• Database copy is restored
• Similar flow within and across datacenters

[Diagram labels: DAG; Mailbox Server 1 to Mailbox Server 5; copies of DB1 to DB5]

Page 25

Example: Server Failover

• Server failure occurs
• Cluster notification of node down
• Active Manager moves active databases
• Server is restored
• Cluster notification of node up
• Database copies resynchronize with active databases
• Similar flow within and across datacenters

[Diagram labels: DAG; Mailbox Server 1 to Mailbox Server 5; copies of DB1 to DB5]

Page 26

DAG Lifecycle

• DAG is created initially as empty object in Active Directory
  • Continuous replication or 3rd party replication using Third Party Replication mode
• When first mailbox server is added to a DAG
  • A Windows failover cluster is formed with a Node Majority quorum using the name of the DAG
  • The server is added to the DAG object in Active Directory
  • A cluster name object (CNO) for the DAG is created in the built-in Computers container
  • One or more IP addresses are assigned to the DAG
  • The name and IP address(es) of the DAG are registered in DNS
  • The cluster database for the DAG is updated with info on configured databases, including if they are locally active (which they should be)

Page 27

DAG Lifecycle

• When second and subsequent Mailbox server is added to a DAG
  • The server is joined to cluster for the DAG
  • The quorum model is automatically adjusted
    • Node Majority - DAGs with odd number of members
    • Node and File Share Majority - DAGs with even number of members
    • File share witness cluster resource, directory, and share are automatically created by Exchange when needed
  • The server is added to the DAG object in Active Directory
  • The cluster database for the DAG is updated with info on configured databases, including if they are locally active (which they should be)
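The quorum adjustment rule above reduces to a parity check on the member count. A minimal Python sketch follows; the function name quorum_model is hypothetical and only mirrors the rule as stated on the slide.

```python
def quorum_model(member_count: int) -> str:
    """Return the quorum model the slide describes for a DAG of this size."""
    if member_count < 1:
        raise ValueError("a DAG with a cluster has at least one member")
    if member_count % 2 == 1:
        # Odd number of members: the members alone can form a majority.
        return "Node Majority"
    # Even number of members: the file share witness supplies the tie-breaking vote.
    return "Node and File Share Majority"
```

For example, adding a second server to a one-member DAG moves the model from Node Majority to Node and File Share Majority, and adding a third moves it back.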

Page 28

DAG Lifecycle

• After servers have been added to a DAG
  • Configure the DAG
    • Network Encryption
    • Network Compression
  • Configure DAG networks
    • Network subnets
    • Enable/disable MAPI traffic/replication
  • Create mailbox database copies
    • Seeding is performed automatically
  • Monitor health and status of database copies
  • Perform switchovers as needed

Page 29

DAG Lifecycle

• Before you can remove a server from a DAG, you must first remove all replicated databases from the server
• When a server is removed from a DAG:
  • The server is evicted from the cluster
  • The cluster quorum is adjusted as needed
  • The server is removed from the DAG object in Active Directory
• Before you can remove a DAG, you must first remove all servers from the DAG

Page 30

Exchange 2010 High Availability Vision/Goals

Page 31

Deploying Exchange 2010 HA Features

Legacy Deployment Steps (CCR/SCC)

1. Prepare hardware, install proper OS, and update
   Extra for SCC: configure storage
2. Build Windows Failover Cluster
   Extra for SCC: configure storage
3. Configure cluster quorum, file share witness, and public and private networks
4. Run Setup in Custom mode and install clustered mailbox server
5. Configure clustered mailbox server
   Extra for SCC: configure disk resource dependencies
6. Test *overs

Exchange 2010 Incremental Deployment

1. Prepare hardware, install proper OS, and update
2. Run Setup and install Mailbox role
3. Create a DAG and replicate databases
4. Test *overs

Page 32

Exchange 2010 Incremental Deployment

Create a DAG
    New-DatabaseAvailabilityGroup -Name DAG1 -FileShareWitnessShare \\EXHUB1\DAG1FSW -FileShareWitnessDirectory C:\DAG1FSW

Add first Mailbox Server to DAG
    Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EXMBX1 -DatabaseAvailabilityGroupIpAddresses 10.0.0.8

Add second and subsequent Mailbox Server
    Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EXMBX2
    Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EXMBX2 -DatabaseAvailabilityGroupIpAddresses 10.0.0.8,10.0.1.8

Add Mailbox Database Copy
    Add-MailboxDatabaseCopy -Identity MBXDB1 -MailboxServer EXMBX3

Extend as needed

Page 33

demo

• Creating a database availability group
• Adding servers to a database availability group
• Add mailbox database copy
• Database switchover

Page 34

Transitioning to Exchange 2010 High Availability

Page 35

Transition Steps

• Verify that you meet requirements for Exchange 2010
• Deploy Exchange 2010
• Use Exchange 2010 mailbox move features to migrate

Unsupported Transitions

• In-place upgrade to Exchange 2010 from any previous version of Exchange
• Using database portability between Exchange 2010 and non-Exchange 2010 databases
• Backup and restore of earlier versions of Exchange databases on Exchange 2010
• Using continuous replication between Exchange 2010 and Exchange 2007

Page 36

Exchange 2010 End-to-End Availability Improvements

Page 37

Exchange 2010 End-to-End Availability Improvements

Online Move Mailbox

• Supported between Exchange 2010 databases, and between Exchange 2007 SP2 and Exchange 2010 databases
• User can access their mailbox while move is in progress
• Move is performed asynchronously by a new service called the Microsoft Exchange Mailbox Replication Service (MRS), which runs on Client Access servers

[Diagram labels: E-Mail Client; Client Access Server; Mailbox Server 1; Mailbox Server 2]

Page 38

Exchange 2010 End-to-End Availability Improvements

RPC Client Access service

• A new service that establishes an RPC endpoint for client access on the CAS role to replace the existing RPC endpoint on the Mailbox role
  • New RPC endpoint entirely rewritten in managed code
  • Refactored common business logic from Exchange 2007 that overlaps with what is needed by the RPC endpoint
  • Cmdlets, performance counters, etc. to manage and monitor
• Does not replace RPC endpoint for public folder databases; Outlook clients log on directly to the public folder store to access public folder databases

Page 39

Exchange 2010 End-to-End Availability Improvements

Shadow Redundancy

• Servers keep “shadow copies” of items until they are delivered to the next hop
• Also helps simplify Hub and Edge Transport Server upgrades and maintenance

[Diagram labels: Mailbox Server; Hub Transport; Edge Transport; Edge Transport; X]

Page 40

Exchange 2010 End-to-End Availability Improvements

Transport Dumpster Improvements

• Gets feedback from replication pipeline to let it know when to delete items
  • Once something has been delivered, and the logs for the message are replicated, transport dumpster can delete the message
  • Replay is not required for deleting items from dumpster; only data in dumpster is data that has not yet been replicated
• Responds to requests for redelivery after lossy failover both within its Active Directory site and across Active Directory sites (old site and new site)

Page 41

Exchange 2010 End-to-End Availability Improvements

• Exchange 2010 HA
• E-mail Archive
• Hold Policy

• Site/Server/Disk failure
• Archiving/Compliance
• Recover deleted items

Using 3 or more database copies enables you to use replication for your backups

[Diagram labels: Database Availability Group; Mailbox Server 1 to Mailbox Server 3, each with DB1, DB2, DB3; X]

Page 42

Exchange Server 2010 High Availability Design Examples

Page 43

High Availability Design Example: Branch/Small Office Design

• Member servers of DAG can host other server roles
• 2-server DAGs should use RAID
• 8 processor cores recommended with a maximum of 64GB RAM
• UM role not recommended for co-location

[Diagram labels: Hardware Load Balancer; two servers, each labeled Client Access, Hub Transport, Mailbox; DB1, DB2, DB3]

Page 44

High Availability Design Example: Double Resilience – Maintenance + DB Failure

• Single Site
• 3 Nodes
• 3 HA Copies
• JBOD -> 3 physical Copies
• 2 servers out -> manual activation of server 3
• In 3 server DAG, quorum is lost
• DAGs with more servers sustain more failures – greater resiliency

[Diagram labels: CAS NLB Farm; AD: Dublin; Database Availability Group; Mailbox Server 1 to Mailbox Server 3, each with DB1 to DB6; X]
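The quorum loss in the 3-server example follows from simple majority arithmetic. The Python sketch below is a simplified vote count under stated assumptions, not a model of the Windows cluster service: the function name has_quorum and its parameters are hypothetical, and it assumes a file share witness vote exists only when the member count is even.

```python
def has_quorum(member_count: int, failed_members: int,
               witness_available: bool = True) -> bool:
    """Return True if the surviving votes still form a strict majority."""
    if not 0 <= failed_members <= member_count:
        raise ValueError("failed_members must be between 0 and member_count")
    # Assumption: only DAGs with an even member count use the witness vote.
    uses_witness = member_count % 2 == 0
    total_votes = member_count + (1 if uses_witness else 0)
    votes_up = member_count - failed_members
    if uses_witness and witness_available:
        votes_up += 1
    # Quorum requires more than half of all configured votes.
    return votes_up > total_votes // 2
```

With three members, has_quorum(3, 1) holds but has_quorum(3, 2) does not, which matches the slide. With four members and a reachable witness, two member failures still leave three of five votes.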

Page 45

High Availability Design Example: Double Node/Disk Failure Resilience

• Single Site
• 4 Nodes
• 3 HA Copies
• JBOD -> 3 physical Copies

• Upgrade server 1
• Server 2 fails
• Server 1 upgrade is done
• 2 active copies die

[Diagram labels: CAS NLB Farm; AD: Dublin; Database Availability Group (DAG); Mailbox Server 1 to Mailbox Server 4; copies of DB1 to DB8; X]

Page 46

Key Takeaways

• Greater end-to-end availability with Mailbox Resiliency
• Unified framework for high availability and site resilience
• Faster and easier to deploy with Incremental Deployment
• Reduced TCO with core ESE architecture changes and more storage options
• Supports large mailboxes for less money

Page 47

question & answer

Page 48

Resources

• www.microsoft.com/teched: Sessions On-Demand & Community
• http://microsoft.com/technet: Resources for IT Professionals
• http://microsoft.com/msdn: Resources for Developers
• www.microsoft.com/learning: Microsoft Certification & Training Resources

Page 49

Related Content

Breakout Sessions (session codes and titles)
• UNC316 - Microsoft Exchange Server 2010 Architecture
• UNC321 - Storage in Microsoft Exchange Server 2010

Interactive Theater Sessions (session codes and titles)
• UNC02-TLC - Designing Microsoft Exchange Server 2010 High Availability Solutions

Hands-on Labs (session codes and titles)
• UNC12-HOL - Microsoft Exchange Server 2010 High Availability and Storage Scenarios

Page 50

Call to Action

Learn More!
• Related Content at TechEd on “Related Content” Slide
• Attend in-person or consume post-event at TechEd Online
• Check out online learning/training resources
  • http://technet.microsoft.com/exchange/2010
  • http://technet.microsoft.com/office/ocs

Try It Out!
• Download the Exchange Server 2010 Beta Evaluation
  • http://www.microsoft.com/exchange/2010/try-it
• Get a 5-Day Trial of Office Communications Server 2007 R2
  • https://r2.uctrial.com/

Page 51

Complete an evaluation on CommNet and enter to win!

Page 52

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.