
Upload: ginger-thomas

Post on 02-Jan-2016


1

Data Guard

2

Reasons for Deployment

- Site failures
  - Power failure
  - Air conditioning failure
  - Flooding
  - Fire
  - Storm damage
  - Hurricane
  - Earthquake
  - Terrorism
  - Sabotage
  - Plane crash
- Planned maintenance
- HUMAN ERROR

3

Standby Database

[Diagram: the primary database (primary instance and database, Site 1) sends redo to the standby database (standby instance and database, Site 2).]

4

Physical Standby

- Technology introduced in Oracle 7.2
- Marketed as Data Guard in Oracle 8.1.7 and above
- Standby is an identical copy of the primary database
- Redo changes
  - transported from primary to standby
  - applied on standby (Redo Apply)
- Can switch operations to standby
  - Planned (switchover / switchback)
  - Unplanned (failover)
- Failover time dependent on various factors
  - Rate of redo generation / size of redo logs
  - Redo transport / apply configuration

5

Logical Standby

- Introduced in Oracle 9.2
- Subset of database objects
- Redo copied from primary to standby
- Changes converted into logical change records (LCR)
- Logical change records applied on standby (SQL Apply)
- Standby database can be opened for updates
  - Can modify propagated objects
  - Can create new indexes for propagated objects
- May need larger system for logical standby
  - LCR apply can be less efficient than redo apply
  - Array updates on primary become single-row updates on standby
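The cost of SQL Apply for array updates can be illustrated with a small sketch. It is not Oracle code: `LogicalChangeRecord` and `to_lcrs` are hypothetical names, and the table and column are made up. It only shows how one multi-row statement on the primary fans out into one row-level change per affected row on the standby.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalChangeRecord:
    """Hypothetical stand-in for an LCR: one row-level change."""
    table: str
    row_id: int
    column: str
    new_value: object


def to_lcrs(table, row_ids, column, new_value):
    """Expand one array (multi-row) update into one LCR per affected row."""
    return [LogicalChangeRecord(table, rid, column, new_value) for rid in row_ids]


# One array update touching 1,000 rows on the primary (illustrative values)...
lcrs = to_lcrs("orders", range(1000), "status", "SHIPPED")
# ...turns into 1,000 single-row changes to apply on the standby.
print(len(lcrs))
```

The apply work grows with the number of rows touched rather than the number of statements, which is one reason the logical standby may need a larger system.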

6

Protection Modes

- Three protection modes:
  - Maximum protection - zero data loss
    - Redo synchronously transported to standby database
    - Redo must be written to at least one standby before transactions on primary can be committed
    - Processing on primary is suspended if no standby is available
  - Maximum availability - minimal data loss
    - Similar to maximum protection mode
    - If no standby database is available processing continues on primary
  - Maximum performance (default)
    - Redo asynchronously shipped to standby database
    - If no standby database is available processing continues on primary
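The differences between the three modes come down to two questions: is redo shipped synchronously, and does the primary keep running when no standby can be reached? The sketch below encodes just those two rules as stated on the slide. The enum and function names are illustrative and are not an Oracle API.

```python
from enum import Enum


class ProtectionMode(Enum):
    MAXIMUM_PROTECTION = "maximum protection"
    MAXIMUM_AVAILABILITY = "maximum availability"
    MAXIMUM_PERFORMANCE = "maximum performance"  # the default mode


def redo_transport(mode):
    """Return how redo is shipped to the standby in the given mode."""
    if mode is ProtectionMode.MAXIMUM_PERFORMANCE:
        return "asynchronous"
    return "synchronous"


def primary_continues(mode, standby_available):
    """Return True if the primary keeps processing transactions."""
    if standby_available:
        return True
    # Only maximum protection suspends the primary when no standby is reachable.
    return mode is not ProtectionMode.MAXIMUM_PROTECTION


for mode in ProtectionMode:
    print(mode.value, redo_transport(mode), primary_continues(mode, False))
```

Maximum availability behaves like maximum protection while a standby is reachable and like maximum performance when none is, which is why it offers minimal rather than zero data loss.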

7

Redo Log Shipping

- ARCH background process
  - Copies completed redo log files to standby
- LGWR background process - modes are:
  - ASYNC - asynchronous
    - Oracle 10.1 and below
      - redo written by LGWR to dedicated area in SGA
      - read from SGA by LNSn background process
    - Oracle 10.2 and above
      - redo written by LGWR to local disk
      - read from disk by LNSn background process
  - SYNC - synchronous
    - Redo written to standby by LGWR - modes are:
      - AFFIRM - wait for confirmation redo written to disk
      - NOAFFIRM - do not wait
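These transport choices are expressed as attributes of a `LOG_ARCHIVE_DEST_n` parameter. The helper below is a minimal sketch that assembles such an attribute string from the options listed above. The function name and the service name `stby` are hypothetical, and the helper follows the slide in pairing AFFIRM / NOAFFIRM only with SYNC. Attribute syntax has changed between releases, so treat the output as illustrative and check the reference for the release in use.

```python
def archive_dest_attributes(service, process="ARCH", sync=False, affirm=True):
    """Build an attribute string for a remote archive destination.

    service -- net service name of the standby (hypothetical in the example)
    process -- "ARCH" or "LGWR", the process that ships the redo
    sync    -- LGWR only: SYNC when True, ASYNC when False
    affirm  -- SYNC only: AFFIRM when True, NOAFFIRM when False
    """
    if process not in ("ARCH", "LGWR"):
        raise ValueError("process must be 'ARCH' or 'LGWR'")
    parts = [f"SERVICE={service}", process]
    if process == "LGWR":
        if sync:
            parts += ["SYNC", "AFFIRM" if affirm else "NOAFFIRM"]
        else:
            parts.append("ASYNC")
    return " ".join(parts)


print(archive_dest_attributes("stby"))                     # ARCH shipping
print(archive_dest_attributes("stby", "LGWR"))             # LGWR ASYNC
print(archive_dest_attributes("stby", "LGWR", sync=True))  # LGWR SYNC AFFIRM
```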

8

ARCH Redo Transmission

[Diagram: the Primary Database side shows LGWR, the Online Redo Log, ARC0, ARC1 and Archived Redo Logs, with destinations LOG_ARCHIVE_DEST_1 and LOG_ARCHIVE_DEST_2. The Standby Database side shows RFS, the Standby Redo Log, ARCn, Archived Redo Logs, MRP/LSP and the Standby Database.]

9

LGWR (ASYNC) Redo Transmission

[Diagram: the Primary Database side shows LGWR, the Online Redo Log, LNSn, ARCn and Archived Redo Logs (LOG_ARCHIVE_DEST_1). The Standby Database side shows RFS, the Standby Redo Log, ARCn, Archived Redo Logs, MRP/LSP and the Standby Database.]

10

LGWR (SYNC) Redo Transmission

[Diagram: the Primary Database side shows LGWR, LNSn, the Online Redo Log, ARCn and Archived Redo Logs (LOG_ARCHIVE_DEST_1). The Standby Database side shows RFS, the Standby Redo Log, ARCn, Archived Redo Logs, MRP/LSP and the Standby Database.]

11

Role Transitions

- There are two types of role transition
  - Switchover
    - Planned failover to standby database
    - Original primary becomes new standby
    - Original standby becomes new primary
    - No data loss
    - Can switchback at any time
  - Failover
    - Unplanned failover to standby database
    - Original standby becomes new primary
    - Original primary may need to be rebuilt
    - Possible data loss
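The two transitions differ mainly in what happens to the original primary. The toy model below is a sketch with hypothetical names (`Site`, `switchover`, `failover`), not Data Guard commands, and it tracks only the role each site ends up with.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    role: str  # "primary", "standby" or "unavailable"


def switchover(primary, standby):
    """Planned transition: both sites are healthy, so the roles are swapped."""
    primary.role, standby.role = "standby", "primary"


def failover(primary, standby):
    """Unplanned transition: the standby takes over and the old primary is lost."""
    primary.role = "unavailable"  # may need to be rebuilt before it can rejoin
    standby.role = "primary"


site1, site2 = Site("Site1", "primary"), Site("Site2", "standby")
switchover(site1, site2)
print(site1.role, site2.role)  # roles exchanged
switchover(site2, site1)       # switchback: the same operation in reverse
print(site1.role, site2.role)  # back to the starting roles
failover(site1, site2)
print(site1.role, site2.role)  # Site1 is out of service, Site2 is primary
```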

12

Switchover

[Diagram: Before switchover, the primary database (primary instance, Site1) sends redo to the standby database (physical standby instance, Site2). After switchover, the primary database (primary instance) is at Site2 and sends redo to the standby database (physical standby instance) at Site1.]

13

Failover

[Diagram: Before failover, the primary database (primary instance, Site1) sends redo to the standby database (physical standby instance, Site2). After failover, the primary at Site1 is unavailable and the database at Site2 is the primary database (primary instance).]

14

Read-Only Mode

- Physical standby database can be opened in read-only mode
  - (Managed) Recovery must be suspended
- Reports can use temporary tablespaces
  - Sorts
  - Temporary tables
- Reports cannot modify permanent objects
- Failover times may be affected
  - Suspended redo must be applied

15

Delayed Redo Application

- Delay in redo application can be configured
- Redo is transported immediately
  - Provides protection against site failure
- Redo is not applied immediately
  - Provides protection against human error
  - Increases potential failover times
- In Oracle 10.1 and above flashback database can be used as an alternative to delayed redo application
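The trade-off can be sketched with a toy timeline. This is not Oracle code: `split_redo` is a hypothetical helper, times are plain minutes, and the changes are made-up strings. Redo that has arrived but is still inside the delay window is safe on the standby site yet not applied, so an erroneous change can still be caught, at the price of more redo to apply at failover.

```python
def split_redo(received, now, delay):
    """Split received redo into (applied, pending) for a configured delay.

    received -- list of (arrival_minute, change) pairs, already on the standby
    now      -- current time in minutes
    delay    -- configured apply delay in minutes
    """
    applied = [change for arrived, change in received if arrived + delay <= now]
    pending = [change for arrived, change in received if arrived + delay > now]
    return applied, pending


# Illustrative timeline: an accidental DROP reaches the standby at minute 100.
received = [(0, "INSERT row"), (50, "UPDATE row"), (100, "DROP TABLE (mistake)")]
applied, pending = split_redo(received, now=120, delay=60)
print(applied)  # ['INSERT row', 'UPDATE row']
print(pending)  # ['DROP TABLE (mistake)']
```

At minute 120 the mistaken change is on the standby site but still pending, leaving 40 minutes to stop apply before it takes effect.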

16

Data Guard Broker

- Introduced in Oracle 9.2
- Stable in Oracle 10.2 and above
- Managed using DGMGRL utility
- Contains Data Guard configuration
- Additional layer of complexity
- Used by Enterprise Manager to manage standby
- Mandatory for some new functionality e.g.
  - Fast Start Failover

17

Fast Start Failover

[Diagram: the primary database (Node 1, Site1), the standby database (Node 2, Site2) and the Observer at Site3.]

18

Fast Start Failover

- Detects failure of primary database
- Automatically fails over to nominated standby database
- Requirements include
  - Flashback logging must be configured
  - DGMGRL must be used
  - Observer process running in third independent site
    - Highly available in Oracle 11.1 and above
  - MAXIMUM AVAILABILITY protection mode
    - Standby database archive log destination must be configured as LGWR SYNC
  - MAXIMUM PERFORMANCE protection mode
    - Oracle 11.1 and above
- Primary database can potentially be reinstated automatically
  - Using flashback logs
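The requirements above can be read as a checklist. The function below is a sketch of that checklist only: the dictionary keys and the function name are hypothetical, nothing here queries a real database or the broker, and it covers only the rules the slide lists.

```python
def fast_start_failover_problems(config, oracle_version):
    """Return the slide's requirements that a described configuration misses.

    config         -- dict describing the setup (hypothetical keys, see below)
    oracle_version -- (major, minor) tuple, e.g. (10, 2) or (11, 1)
    """
    problems = []
    if not config.get("flashback_logging"):
        problems.append("flashback logging is not configured")
    if not config.get("managed_by_dgmgrl"):
        problems.append("configuration is not managed with DGMGRL")
    if not config.get("observer_running"):
        problems.append("no observer process is running")

    mode = config.get("protection_mode")
    if mode == "MAXIMUM AVAILABILITY":
        if config.get("transport") != "LGWR SYNC":
            problems.append("standby destination is not LGWR SYNC")
    elif mode == "MAXIMUM PERFORMANCE":
        if oracle_version < (11, 1):
            problems.append("MAXIMUM PERFORMANCE needs Oracle 11.1 or above")
    else:
        problems.append(f"protection mode {mode!r} is not one the slide lists")
    return problems


setup = {
    "flashback_logging": True,
    "managed_by_dgmgrl": True,
    "observer_running": True,
    "protection_mode": "MAXIMUM PERFORMANCE",
    "transport": "LGWR ASYNC",
}
print(fast_start_failover_problems(setup, (10, 2)))
print(fast_start_failover_problems(setup, (11, 1)))
```

The same setup fails the check on 10.2 and passes on 11.1, because only the later release accepts MAXIMUM PERFORMANCE for this feature.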

19

Fast Start Failover

- Advantages
  - No interconnect network required between sites
  - No storage network required between sites
  - RAC licences not required if each site is a single-instance
- Disadvantages
  - Active / Passive
  - Requires Enterprise Edition licence
  - Remaining infrastructure must also failover
    - Network
    - Application tier
    - Clients

20

Oracle 11g New Features

- Snapshot Standby
  - Standby can be converted to snapshot standby
  - Can be opened in read-write mode (for testing)
    - Redo transport continues
    - Redo apply delayed
  - Standby can subsequently be converted back to physical standby
- Active Data Guard
  - Separately licensed option
  - Updates applied to primary
  - Changes can be read immediately on standby databases
  - Standby database can be opened in read-only mode
    - Redo can continue to be applied
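The snapshot standby life cycle can be sketched as a small state machine. The class and method names are hypothetical and the model is deliberately simplified: it assumes that converting back discards the test changes and then applies the redo that accumulated in the meantime.

```python
class Standby:
    """Toy model of a standby that can be a physical or a snapshot standby."""

    def __init__(self):
        self.mode = "physical"
        self.applied = []       # redo already applied
        self.pending = []       # redo received but not yet applied
        self.test_changes = []  # local writes made while a snapshot standby

    def receive(self, redo):
        """Redo transport continues in both modes; apply only when physical."""
        if self.mode == "physical":
            self.applied.append(redo)
        else:
            self.pending.append(redo)

    def convert_to_snapshot(self):
        self.mode = "snapshot"

    def write(self, change):
        """Read-write access is only available as a snapshot standby."""
        if self.mode != "snapshot":
            raise RuntimeError("standby is not open read-write")
        self.test_changes.append(change)

    def convert_to_physical(self):
        """Drop the test changes, then catch up on the delayed redo."""
        self.test_changes.clear()
        self.applied.extend(self.pending)
        self.pending.clear()
        self.mode = "physical"


standby = Standby()
standby.receive("redo 1")
standby.convert_to_snapshot()
standby.write("test update")
standby.receive("redo 2")   # transported, but apply is delayed
print(standby.applied, standby.pending)
standby.convert_to_physical()
print(standby.applied, standby.pending, standby.test_changes)
```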

21

Licensing

- Standby database nodes must be fully licensed
  - Same metric as primary (named user, CPU etc)
- Standard Edition
  - Cannot use Data Guard
  - Use user-defined scripts to transport redo
  - Use Automatic Recovery to apply redo
  - Manually resolve archive log gaps
- Enterprise Edition
  - Use Managed Recovery to apply redo
  - Use Fetch Archive Log (FAL) to resolve archive log gaps
  - Additional licenses required for Active Data Guard

22

Alternatives

- Standard Edition
  - Manual log shipping using scripts
- SAN level replication technologies
  - NetApp SnapMirror, MetroCluster
  - EMC SRDF, MirrorView
  - HP StorageWorks
- Redo log replication technologies
  - Quest SharePlex

23

The Reality

24

The Reality

- Many sites run physical standbys
  - Well proven technology
  - Spare capacity on standby often used for development or testing during normal operations
- Relatively few sites run a logical standby
  - Streams is much more popular
- Many sites enable flashback logging
  - In both development and production environments
- Very few using Automatic Failover
- Very few sites working with Oracle 11g yet
  - Consequently none using Active Data Guard

25

The Reality

- Failover times
  - Normally dependent on management decisions
    - Usually some investigation before failover
  - Time to failover database is minimal (5-10 minutes)
  - Time to failover infrastructure can be hours
    - Network configuration
    - DNS
    - Application / web servers
    - Clients
  - Failover SLAs often up to 48 hours
- Rebuild times
  - Can take minutes using flashback logging
  - Can take much longer depending on reason for failover