Oracle Data Guard in Oracle Database 10g Release 2

Business Continuity for the Enterprise

An Oracle White Paper November 2006


Executive Overview
Impact Of Disasters
    High Availability Challenges
    Oracle Data Guard
    Oracle Database High Availability Solutions
Overview Of Oracle Data Guard
    What is Oracle Data Guard?
    Data Guard Functionality
    New Features in Oracle Database 10g Release 2
    Benefits Of Data Guard
Data Guard Process Architecture
Key Technology Components
    Data Guard Configuration
    Protection Modes and Redo Transport
        Maximum Protection
        Maximum Availability
        Maximum Performance
        Redo Transport Enhancements in Oracle Database 10g Release 2
    Redo Apply And SQL Apply
        Physical Standby Database – Redo Apply
        Logical Standby Database – SQL Apply
    Enterprise Manager & Data Guard Broker
        Management Enhancements in Oracle Database 10g Release 2
    Role Transitions – Switchover and Failover
        Switchover
        Types of Failover
        Manual Failover
        Fast-Start Failover
        Restoring Old Primary As A New Standby
        Role Transition Events
    Handling Communication Failures
    Protection from Data Corruptions Caused by Human Errors
    Rolling Database Upgrades
    Cascaded Redo Log Destinations
Data Guard And RAC
Maximum Availability Architecture
Data Guard And Remote Mirroring Solutions
Data Guard Customers
Conclusion
References


EXECUTIVE OVERVIEW

Business continuity and disaster recovery are top priorities for the senior management of most global enterprises. Economic fluctuations, rapid changes in market trends, and competitive pressures require today's global enterprise to operate in a 24x7 environment and to deal swiftly and efficiently with unforeseen business interruptions. Oracle Data Guard is one of the most effective solutions available today to protect the core asset of any enterprise, its data, and to keep that data available on a 24x7 basis despite disasters and other outages. This paper discusses Data Guard technology in Oracle Database 10g, and demonstrates how it is a key factor in the business continuity infrastructure of any enterprise.

IMPACT OF DISASTERS

With the proliferation of e-business, an enterprise today operates in an extremely complex and highly networked global economy, and is more susceptible to interruptions than in the past. The cost of interruptions, or downtime, varies across industries and can be as much as millions of dollars an hour. While that number is staggering, the reasons are quite obvious. The Internet has brought millions of customers directly to electronic storefronts. Critical and interdependent business matters such as customer relationships, competitive advantages, legal obligations, industry reputation and shareholder confidence matter even more now because of their increased vulnerability to business disruptions and downtime.

High Availability Challenges

Downtime that affects a business could be either unplanned or planned. Unplanned downtime may be due to hardware or system failures, data/storage failures, human errors, computer viruses, software glitches, natural disasters and malicious acts. A business may also have to undergo planned downtimes because of scheduled maintenance such as system upgrades.

Data is one of the most critical company assets. Continuous access to data is essential to avoid costly disruption to business processes.

A company designing its business continuity strategy must create a business continuity plan (BCP) that can effectively deal with these challenges. One of the critical requirements of the BCP is that it must protect business data, because data is one of the most critical company assets, whether it is payroll/employee information, customer records, valuable research, financial records, or historical information. Lost data is not easily replaced, and rebuilding or regenerating it is likely to be extremely expensive, if not impossible, critically affecting the company’s ability to stay in business.


Oracle Data Guard

Oracle Data Guard provides an extensive set of data protection and disaster recovery (DR) capabilities to sustain business continuity when confronted with disasters, human errors and corruptions that can adversely affect Oracle databases. This whitepaper provides an architectural and technology overview of the Data Guard feature of Oracle Database 10g Release 2. For additional details on Data Guard, please refer to Oracle Data Guard documentation (ref. [1]).

Oracle Database High Availability Solutions

Data Guard is one component of Oracle Database’s High Availability (HA) solution stack. The Oracle database comes with an integrated set of HA capabilities that help organizations ensure business continuity by minimizing the various kinds of planned and unplanned downtime that can affect their businesses. The following diagram shows the various HA features available with Oracle Database 10g. For further details on each of these features, please refer to [2] and [3].

[Diagram not reproduced. It relates unplanned downtime (system failures, data failures) and planned downtime (system changes, data changes) to the following features: Grid Clusters, Automatic Storage Management, Flashback, RMAN & Flash Recovery Area, H.A.R.D., Data Guard, Online Reconfiguration, Rolling Upgrades, and Online Redefinition.]

Fig. 1: Integrated High Availability Features of Oracle Database 10g


OVERVIEW OF ORACLE DATA GUARD

What is Oracle Data Guard?

Oracle Data Guard is the management, monitoring, and automation software infrastructure that creates, maintains, and monitors one or more standby databases to protect enterprise data from failures, disasters, errors, and data corruptions.


Data Guard maintains standby databases as transactionally consistent copies of the production database. These standby databases can be located at remote disaster recovery sites thousands of miles away from the production data center, or they may be located in the same city, same campus, or even in the same building. If the production database becomes unavailable because of a planned or an unplanned outage, Data Guard can switch any standby database to the production role, thus minimizing the downtime associated with the outage, and preventing any data loss. Available as a feature of the Enterprise Edition of Oracle Database, Data Guard can be used in combination with other Oracle High Availability solutions such as Real Application Clusters (RAC) and Recovery Manager (RMAN), to provide a high level of data protection and data availability that is unprecedented in the industry. The following diagram presents an overview of Data Guard.

[Diagram not reproduced. It shows a primary site and a standby site, each with clients, a primary database and a standby database, and labels for data changes and the Data Guard Broker.]

Fig. 2: Overview of Oracle Data Guard Architecture


Data Guard Functionality

A Data Guard configuration consists of a production database, also known as the primary database, and up to nine standby databases, which are transactionally consistent copies of the primary database. Data Guard maintains this transactional consistency using redo data. As transactions occur in the primary database, redo data is generated and written to the local redo log files. With Data Guard, this redo data is also transferred to the standby sites and applied to the standby databases, keeping them synchronized with the primary database. Data Guard allows the administrator to choose whether this redo data is sent synchronously or asynchronously to a standby site.

“We are very impressed by Data Guard 10g performance. We proved that Zero Data Loss, Disaster Recovery protection for an application workload of 1 million business transactions/hour is achievable in our system and network environment.” – Manohar Malayanur, Manager, Database Systems Management, Fannie Mae

The underlying technologies for standby databases are Data Guard Redo Apply (physical standby database), and Data Guard SQL Apply (logical standby database). A physical standby database has on-disk database structures that are identical to the primary database on a block-for-block basis, and is updated using Oracle media recovery. A logical standby database is an independent database that contains the same data as the primary database. It is updated using SQL statements, and has the advantage that it can be used concurrently for recovery and for other tasks such as reporting and queries.

Data Guard enables role transitions between the primary database and a chosen standby database, reducing overall downtime during planned outages and unplanned failures.

The primary and standby databases, as well as their various interactions, may be managed by using SQL*Plus. Data Guard also offers a distributed management framework called the Data Guard Broker, which automates and centralizes the creation, maintenance, and monitoring of a Data Guard configuration. For easier manageability, administrators may use either Oracle Enterprise Manager or the Broker’s own specialized command-line interface (DGMGRL) to take advantage of the Broker’s management capabilities. The following diagram shows the Data Guard components.


[Diagram not reproduced. Its labels are: Production Database; Network; Broker; Sync or Async Redo Shipping; Physical Standby Database (Redo Apply, Backup); Logical Standby Database (SQL Apply, Transform Redo to SQL, Open for Reports, Additional Indexes & MVs).]

Fig. 3: Data Guard Architectural Components

New Features in Oracle Database 10g Release 2

Following is a summary of the new Data Guard features available in Oracle Database 10g Release 2. These are discussed in detail in subsequent sections.

• Fast-Start Failover – This capability allows Data Guard to automatically and quickly fail over to a previously chosen, synchronized standby database if the primary database is lost, without requiring any manual steps to invoke the failover, and without incurring any data loss. Following a fast-start failover, once the old primary database is repaired, Data Guard automatically reinstates it as a standby database, which restores high availability to the Data Guard configuration.

• Improved Redo Transmission – Several enhancements have been made in the redo transmission architecture to make sure redo data generated on the primary database can be transmitted as quickly and efficiently as possible to the standby database(s).

• Easy conversion of a physical standby database to a reporting database – A physical standby database can be activated as a primary database, opened read/write for reporting purposes, and then flashed back to a point in the past to be easily converted back to a physical standby database. At this point, Data Guard automatically synchronizes the standby database with the primary database. This allows the physical standby database to be utilized for read/write reporting and cloning activities.

• Automatic deletion of applied archived redo log files in logical standby databases – Archived logs, once they are applied on the logical standby database, are automatically deleted, reducing storage consumption on the logical standby and improving Data Guard manageability. Physical standby databases have already had this functionality since Oracle Database 10g Release 1, with Flash Recovery Area [4].


• Fine-grained monitoring of Data Guard configurations – Oracle Enterprise Manager has been enhanced to provide granular, up-to-date monitoring of Data Guard configurations, so that administrators can make informed and timely decisions about managing the configuration.
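The conversion of a physical standby database to a read/write reporting database and back, listed among the new features above, might be sketched roughly as follows. This is a simplified outline rather than the complete documented procedure. It assumes that a flash recovery area is configured on the standby so that a guaranteed restore point can be created, and that the primary ships redo to this standby through LOG_ARCHIVE_DEST_2. The restore point name before_reporting is hypothetical. STARTUP is a SQL*Plus command, not a SQL statement.

```sql
-- On the physical standby: stop Redo Apply and mark the point to return to.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
CREATE RESTORE POINT before_reporting GUARANTEE FLASHBACK DATABASE;

-- On the primary: archive the current log, then stop shipping redo
-- to this standby (destination 2 is an assumption).
ALTER SYSTEM ARCHIVE LOG CURRENT;
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;

-- On the standby: activate it and open it read/write.
ALTER DATABASE ACTIVATE STANDBY DATABASE;
STARTUP MOUNT FORCE;
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;
ALTER DATABASE OPEN;

-- ... run read/write reporting or cloning work here ...

-- On the standby: discard the changes and turn it back into a physical standby.
STARTUP MOUNT FORCE;
FLASHBACK DATABASE TO RESTORE POINT before_reporting;
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
STARTUP MOUNT FORCE;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;

-- On the primary: resume shipping redo so the standby can resynchronize.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;
```

While the database is open read/write it does not apply redo, so it offers no disaster protection for the duration of the reporting work.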

Benefits Of Data Guard

Data Guard offers the following benefits:

• Disaster recovery and high availability – Data Guard provides an efficient and comprehensive disaster recovery and high availability solution. Automatic failover and easy-to-manage switchover capabilities allow quick role reversals between primary and standby databases, minimizing the downtime of the primary database for planned and unplanned outages.

“Data Guard automates disaster-recovery procedures and reduces Fidelity's exposure to data loss by an order of magnitude compared to previous approaches.” – Jonathan Schapiro, Vice President, Data Architecture & Services, Fidelity Investments

• Complete data protection – A standby database also provides an effective safeguard against data corruptions and user errors. Storage level physical corruptions on the primary database do not propagate to the standby database. Similarly, logical corruptions or user errors that cause the primary database to be permanently damaged can be resolved. Finally, the redo data is validated at the time it is received at the standby database and further when applied to the standby database.

• Efficient utilization of system resources – A physical standby database can be used for backups and read-only reporting, thereby reducing the primary database workload and saving valuable CPU and I/O cycles. In Oracle Database 10g Release 2, a physical standby database can also be easily converted back and forth between being a physical standby database and an open read/write database. A logical standby database allows its tables to be simultaneously available for read-only access while they are updated from the primary database. A logical standby database also allows users to perform data manipulation operations on tables that are not updated from the primary database. Finally, additional indexes and materialized views can be created in the logical standby database for better reporting performance.

• Flexibility in data protection to balance availability against performance requirements – Data Guard offers the Maximum Protection, Maximum Availability and Maximum Performance modes to help enterprises balance data protection against system performance requirements.

• Protection from communication failures – If network connectivity is lost between the primary and one or more standby databases, redo data cannot be sent from the primary to those standby databases. Once connectivity is re-established, the missing redo data is automatically detected by Data Guard and the necessary archive logs are automatically transmitted to the standby databases. The standby databases are resynchronized with the primary database, with no manual intervention by the administrator.

• Centralized and simple management – Data Guard Broker automates the management and monitoring tasks across the multiple databases in a Data Guard configuration. Administrators may use either Oracle Enterprise Manager or the Broker’s own specialized command-line interface (DGMGRL) to take advantage of this integrated management framework.

• Integrated with Oracle database – Data Guard is available as an integrated feature of the Oracle Database (Enterprise Edition) at no extra cost.


DATA GUARD PROCESS ARCHITECTURE

As shown in the following figure, Data Guard uses several processes of the Oracle database instance to achieve the automation necessary for disaster recovery and high availability.

[Diagram not reproduced. On the primary database side its labels are: Transactions, LGWR, LNS (sync, async), ARCH (arch), Online Redo Logs, Archived Redo Logs, and FAL. Oracle Net connects the two sides. On the physical/logical standby database side its labels are: RFS, Standby Redo Logs, ARCH, Archived Redo Logs, MRP/LSP, Transform Redo to SQL for SQL Apply, and Backup/Reports.]

Fig. 4: Data Guard Process Architecture in Oracle Database 10g Release 2

On the primary database, Data Guard uses the Log Writer (LGWR) process or multiple Archiver (ARCH) processes to collect transaction redo data. To ensure isolation from network disruptions, the Log Writer process uses specialized background processes, called LogWriter Network Server (LNS) processes, to synchronously or asynchronously transmit the redo data to the standby database. The Archiver processes transmit the redo data to the standby database directly. The primary database also has the Fetch Archive Log (FAL) process, which provides a client-server mechanism for transmitting archived logs to the standby after a communication loss between the primary and standby(s), for automatic gap resolution and resynchronization.

On the standby database, Data Guard uses one or more Remote File Server (RFS) processes to receive redo data from the primary database, the Managed Recovery Process (MRP) to apply redo data to the physical standby database, and the Logical Standby Process (LSP) to apply SQL-translated redo data to the logical standby database.

If the Data Guard Broker is enabled, Data Guard also uses the Data Guard Broker Monitor (DMON) process to manage and monitor the primary and standby databases as a unified configuration.
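As an illustration of the processes described above, the following queries are one way an administrator might observe them on a physical standby database. This is a minimal sketch: it assumes a SQL*Plus session connected to the standby with sufficient privileges, and the choice of columns is illustrative.

```sql
-- Run on the physical standby (mounted or open).
-- Lists Data Guard related processes, such as RFS and MRP0,
-- together with the redo thread and log sequence each is working on.
SELECT process, status, thread#, sequence#, block#
  FROM v$managed_standby;

-- Run on the physical standby.
-- Reports a range of archived logs that is missing on the standby.
-- No rows returned means no gap is currently detected.
SELECT thread#, low_sequence#, high_sequence#
  FROM v$archive_gap;
```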


KEY TECHNOLOGY COMPONENTS

Data Guard Configuration

A Data Guard configuration consists of one primary database and up to nine standby databases. The primary and standby databases can run on a single node or in a RAC environment. The standby databases are connected to the primary database over standard TCP/IP-based networks (e.g. a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN)) using Oracle Net Services. There are no restrictions on where the databases are located, provided that they can communicate with each other. However, for disaster recovery, it is recommended that the standby databases be hosted at sites that are geographically separated from the primary site. Data Guard requires the operating system architecture on the primary and standby systems to be the same. Thus if the primary database is running the Linux operating system on an Intel architecture, all its standby databases must also be running Linux on Intel – they cannot be Windows systems, for example. In addition, the same release of Oracle Database Enterprise Edition must be installed on the primary database and all standby databases, except during rolling database upgrades using logical standby databases (ref. section “Rolling Database Upgrades” for details of this capability).
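The platform and release requirements above could be checked with queries along the following lines. This is a sketch only, assuming a SQL*Plus session on each database in the configuration. Comparing the output across the primary and its standbys is a manual step.

```sql
-- Run on the primary and on each standby, then compare the results.
-- database_role distinguishes the primary from its standbys;
-- platform_name should be the same on every database.
SELECT name, db_unique_name, database_role, open_mode, platform_name
  FROM v$database;

-- The release shown in the banner should match on all databases,
-- except during a rolling upgrade with a logical standby.
SELECT banner
  FROM v$version;
```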

Protection Modes and Redo Transport

Data Guard provides three high-level modes of data protection to balance cost, availability, performance, and transaction protection. These modes can be set easily using any of the available management interfaces, e.g. using the following simple SQL statement on the primary database:

ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE {PROTECTION | AVAILABILITY | PERFORMANCE};
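After changing the mode with the statement above, the result can be confirmed from the primary database. The query below is a small sketch, assuming a SQL*Plus session with access to V$DATABASE.

```sql
-- protection_mode is the mode that has been configured.
-- protection_level is the level currently in effect, which can differ
-- from the configured mode, for example while a standby is unreachable.
SELECT protection_mode, protection_level
  FROM v$database;
```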

To determine the appropriate data protection mode, enterprises need to weigh their business requirements for data protection against user demands for system response time. The following table outlines the suitability of each mode from a risk of data loss perspective.

Protection Mode        Risk of Data Loss in the Event of a Disaster   Redo Transport Mechanism
Maximum Protection     Zero data loss                                 Synchronous (LGWR SYNC)
Maximum Availability   Zero data loss                                 Synchronous (LGWR SYNC)
Maximum Performance    Minimal data loss, usually a few seconds       Asynchronous (LGWR ASYNC) or ARCH

Data Guard has no inherent distance limitation regarding where the standby databases may be located with respect to the primary database.


The following sections describe these protection modes in more detail.

Maximum Protection

Maximum Protection mode offers the highest level of data protection for the primary database, ensuring a comprehensive zero-data loss disaster recovery solution. When operating in Maximum Protection mode, redo records are synchronously transmitted by the LGWR process (through the LNS process) from the primary database to the standby database(s), and a transaction is not committed on the primary database until it has been confirmed that the transaction data is available on disk on at least one standby server. It is strongly recommended that this mode be configured with at least two standby databases. If the last participating standby database becomes unavailable, processing stops on the primary database. This ensures that no transactions are lost should the primary database fail after it loses contact with all of its standby databases. Because of the synchronous nature of redo transmission, this Maximum Protection mode can potentially impact primary database response time. This impact can be minimized by configuring a low latency network with sufficient bandwidth for peak transaction load. Stock exchanges, currency exchanges, and financial institutions are examples of businesses that may require this Maximum Protection mode.
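The synchronous transport that this mode relies on is configured on a redo destination of the primary database. The following is a minimal sketch, assuming a hypothetical Oracle Net service name boston for the standby and the use of destination number 2. A production configuration would normally carry further attributes that are omitted here.

```sql
-- Run on the primary database.
-- LGWR SYNC: redo is shipped synchronously by the log writer (via LNS).
-- AFFIRM: the standby acknowledges only after the redo is written to disk.
-- "boston" is a hypothetical Oracle Net service name for the standby.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR SYNC AFFIRM' SCOPE=BOTH;
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE SCOPE=BOTH;
```

With at least two standby databases, as recommended above, a second destination would be defined in the same way.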

Maximum Availability

Maximum Availability mode offers the next highest level of data availability for the primary database. As with the Maximum Protection mode, redo data is synchronously transmitted by LGWR from the primary database to the standby database, and the transaction is not committed on the primary database until it has been confirmed that the transaction data is available on disk on at least one standby server. However, in this mode, unlike the Maximum Protection mode, if the last participating standby database becomes unavailable, e.g. because of network connectivity problems, processing continues on the primary database. The standby database may temporarily fall behind the primary database, but when it is available again, the databases will automatically synchronize with no data loss, using accumulated archived logs on the primary. Because of synchronous redo transmission, this protection mode can potentially impact response time and throughput. This impact can be minimized by configuring a low latency network with sufficient bandwidth for peak transaction load. The Maximum Availability mode is suitable for businesses that want the assurance of zero data loss protection, but do not want the production database to be impacted by network/standby server failures. These businesses will accept the possibility of data loss should a second failure subsequently affect the production database before the initial network/standby failure is resolved.


Maximum Performance

Maximum Performance mode is the default protection mode. It offers slightly less primary database data protection, but higher performance, than Maximum Availability mode. In this mode, as the primary database processes transactions, redo data is asynchronously shipped to the standby database by the LGWR process. Alternatively, the Archiver process(es) (ARCH) on the primary database may also be configured to transmit the redo data in this mode. In any case, the commit operation of the primary database does not wait for the standby to acknowledge receipt before completing the write on the primary. If any standby destination becomes unavailable, processing continues on the primary database and there is little or no effect on performance. In the case of a failure of the primary database, redo data which had not yet been sent to the standby database is lost. However, if the network has sufficient throughput to keep up with peaks in redo traffic and the LGWR process is used to transmit redo to the standby server, the number of lost transactions will be very small or zero. The Maximum Performance mode should be used when availability and performance on the primary database are more important than the risk of losing a small amount of data. This mode is also suitable for Data Guard deployments over a WAN, where the inherent latencies of the network may limit the suitability of synchronous redo transmission.
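The two transport options for this mode might be configured on the primary database as sketched below. The service name boston and the destination number 2 are hypothetical, and other destination attributes are omitted for brevity.

```sql
-- Run on the primary database.
-- Option 1: asynchronous shipping by the log writer (via LNS).
-- NOAFFIRM: commits do not wait for the standby to write redo to disk.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC NOAFFIRM' SCOPE=BOTH;

-- Option 2 (alternative to option 1): shipping by the archiver processes,
-- which sends redo only after an online redo log has been archived.
-- ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=boston ARCH' SCOPE=BOTH;
```

Because ARCH ships redo only after a log switch, it generally leaves more redo exposed to loss than LGWR ASYNC, consistent with the paragraph above.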

Redo Transport Enhancements in Oracle Database 10g Release 2

There have been some key enhancements in the redo transport architecture in Oracle Database 10g Release 2. During asynchronous redo transmission using the Log Writer process (LGWR ASYNC), the network server (LNSn) process serving each standby database transmits redo data out of the online redo log files on the primary database, instead of using an ASYNC buffer, as is the case in Oracle Database 10g Release 1 or earlier. This allows asynchronous redo transmission using LGWR not to be constrained by the size of the ASYNC buffer. The maximum network transmission size is also increased from 1 MB to 10 MB thereby greatly improving network utilization under heavy load. For ARCH-based transport, it is now possible to have multiple Archiver processes transmit redo data in parallel from a single archived redo log file to a standby destination. This is specified through the new MAX_CONNECTIONS attribute on the LOG_ARCHIVE_DEST_n parameter. The allowable maximum number of ARCH processes (specified through the LOG_ARCHIVE_MAX_PROCESSES initialization parameter) has also been increased from 10 to 30. These enhancements enable the expedient transfer of archived redo data to standby destinations during high peak activity, e.g. during batch uploads.
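The ARCH-related settings mentioned above could be applied as in the following sketch. The values 8 and 4, the service name boston, and the destination number 2 are illustrative assumptions, not recommendations.

```sql
-- Run on the primary database.
-- Allow up to 8 archiver processes (the 10g Release 2 maximum is 30).
ALTER SYSTEM SET LOG_ARCHIVE_MAX_PROCESSES=8 SCOPE=BOTH;

-- Let up to 4 archiver processes send a single archived redo log
-- to this destination in parallel.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=boston ARCH MAX_CONNECTIONS=4' SCOPE=BOTH;
```

MAX_CONNECTIONS only helps if enough archiver processes are available, so the two settings are usually considered together.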

Redo Apply And SQL Apply

A standby database is initially created from a backup copy of the primary database. Once the standby database is created, Data Guard automatically maintains it as a transactionally consistent copy of the primary database by transmitting primary database redo data to the standby system and then applying the redo data to the standby database. Data Guard provides two methods to apply this redo data to the standby database and keep it transactionally consistent with the primary. These methods correspond to the two types of standby databases supported by Data Guard:

• Redo Apply, used for physical standby databases • SQL Apply, used for logical standby databases

Note that as Fig. 4 indicates, there is no distinction between these two types of standby databases as far as redo data transmission from the primary is concerned. Once this redo data reaches the standby server, it is how the redo data is applied on the standby database that distinguishes these two types of standby databases.

Physical Standby Database – Redo Apply

A physical standby database is kept synchronized with the primary database by applying the redo data received from the primary using Oracle media recovery. It is physically identical to the primary database on a block-for-block basis, and thus, the database schemas, including indexes, are the same.

How Redo Apply Works

Data Guard Redo Apply uses a specialized process, called the Managed Recovery Process (MRP). MRP reads incoming redo data directly from Standby Redo Logs (SRLs) as they are written by the RFS process, and then applies them to the physical standby database. MRP is started on the physical standby database by mounting the database and using the following command:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

MRP may also transparently switch to reading from a standby archived log if the SRL is archived before MRP could complete reading of the SRL (a situation which may occur when the primary database has a very high redo generation rate). MRP can be run in parallel for the best performance of Data Guard Redo Apply. In releases prior to Oracle Database 10g, this required the use of a PARALLEL clause in the above RECOVER MANAGED STANDBY DATABASE command. In Oracle Database 10g, MRP can automatically determine the optimal number of parallel recovery processes at the time it starts (without requiring the PARALLEL clause), and this number is based on the number of CPUs available on the standby server [5].
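One way to confirm that MRP is running and to see which log it is applying is to query the V$MANAGED_STANDBY view on the physical standby. This is a minimal monitoring sketch rather than a complete procedure; the column selection is only one reasonable choice.

```sql
-- Run on the physical standby.
-- Rows whose PROCESS starts with MRP describe Redo Apply;
-- RFS rows describe the processes receiving redo from the primary.
SELECT process, status, thread#, sequence#, block#
  FROM v$managed_standby
 ORDER BY process;
```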

The physical standby database can be opened read-only, and queries can be run against the physical standby database at that time. The physical standby database cannot run recovery at the same time it is opened read-only. Redo data that are shipped to the standby while it is opened read-only are accumulated at the standby site, and are not applied. However, recovery operations can be resumed on the physical standby at any time, and the accumulated redo data will automatically get applied. This allows the physical standby database to run in a sequence that could involve running in recovery for a while, then being opened read-only to run reports, and then returning to running recovery to apply outstanding redo data.

To open the physical standby read-only, recovery needs to be canceled on the standby using the following command:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

and then the database can be opened read-only:

ALTER DATABASE OPEN;

To change the standby database back to being a physical standby database performing redo apply, active user sessions need to be terminated and MRP restarted with the ALTER DATABASE RECOVER MANAGED STANDBY DATABASE command.

Redo Apply Enhancements in Oracle Database 10g Release 2

In Oracle Database 10g Release 2, using a combination of Data Guard and Flashback Database, a physical standby database can be opened temporarily in read/write mode for development, reporting, or testing purposes, and then flashed back to a point in the past to be reverted to a physical standby database. After the database is flashed back, Data Guard automatically synchronizes the standby database with the primary database, without the need to re-create the physical standby database from a backup copy of the primary database. Through SQL*Plus, the commands to do this are as simple as the following:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

SQL> CREATE RESTORE POINT t1 GUARANTEE FLASHBACK DATABASE;

SQL> ALTER DATABASE ACTIVATE STANDBY DATABASE;

<Now do read/write reporting>

SQL> FLASHBACK DATABASE TO RESTORE POINT t1;

SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;

Note that during the time that the physical standby database is opened read/write, it will not receive or apply redo data from the primary database and cannot provide disaster protection. To ensure continuous disaster protection in this case, a second standby database is required.

Benefits of Physical Standby

A physical standby database provides the following benefits:

• Disaster recovery and high availability – A physical standby database enables a robust and efficient disaster recovery and high availability solution. Easy-to-manage switchover and failover capabilities allow simple role reversals between primary and physical standby databases, minimizing the downtime of the primary database for planned and unplanned outages.

“The things that impress me the most about Data Guard are its manageability, reliability, and ease of use. It is amazing how easily we could implement a solid Disaster Recovery / High Availability solution with Oracle Data Guard without requiring additional resources to support it.” – Darl Kuhn, Staff Engineer, Database Services, Sun Services Global Engineering, Sun Microsystems

• Data protection – Using a physical standby database, Data Guard can ensure no data loss, even in the face of unforeseen disasters. A physical standby database supports all datatypes, and DDL and DML operations that the primary can support. It also provides a safeguard against data corruptions and user errors. Storage level physical corruptions on the primary database do not propagate to the standby database. Similarly, logical corruptions or user errors that cause the primary database to be permanently damaged can be resolved. Finally, the redo data is validated when it is received at and applied to the standby database.

• Reduction in primary database workload – The physical standby database can be opened read-only for reporting and queries, and easily converted back-and-forth between being a read-write active database and a physical standby database. Besides, using Recovery Manager (RMAN), the physical standby database can be utilized to create backups for the production database, thereby saving valuable CPU and I/O cycles from the production system. RMAN can perform this backup while the physical standby database is performing recovery, or when it is opened read-only.

• Performance – The Redo Apply technology used by the physical standby database applies changes using low-level recovery mechanisms that bypass all SQL-level code layers. This makes Redo Apply a highly efficient mechanism for maintaining transactionally consistent copies of the primary database.

Logical Standby Database – SQL Apply

A logical standby database contains the same logical information as the primary database, although the physical organization and structure of the data can be different. The SQL Apply technology keeps the logical standby database synchronized with the primary database by transforming the redo data received from the primary database into SQL statements and then executing the SQL statements on the standby database. This makes it possible for the logical standby database to be accessed for queries and reporting purposes at the same time the SQL is being applied to it. Because the logical standby database is updated using SQL statements, it remains open in read-write mode, and the tables that are being updated from the primary database can be used simultaneously for other tasks such as reporting, summations, and queries. These tasks can also be optimized by creating additional indexes and materialized views on the maintained tables. A logical standby database can host multiple database schemas, and users can perform normal data manipulation operations on tables in schemas that are not updated from the primary database. A logical standby database has some restrictions on datatypes, types of tables, and types of DDL and DML operations. Please refer to [1] for a list of these unsupported datatypes and storage attributes.
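Before creating a logical standby, it can help to check which primary tables fall under these restrictions. The following query is a hedged sketch that assumes the DBA_LOGSTDBY_UNSUPPORTED dictionary view is available in the release in use; it is run on the primary database.

```sql
-- Run on the primary database.
-- Lists tables that contain columns SQL Apply cannot maintain,
-- together with the offending column and its datatype.
SELECT owner, table_name, column_name, data_type
  FROM dba_logstdby_unsupported
 ORDER BY owner, table_name, column_name;
```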

How SQL Apply Works

SQL Apply uses a collection of parallel execution servers and background processes that perform the task of applying changes from the primary database to the logical standby database. The following diagram shows the flow of information and the role that each process performs.

[Figure: two stages. Log Mining: redo data from the primary database → Reader → redo records → Preparer → LCRs (logical change records not grouped into transactions, held in the shared pool) → Builder → transaction groups. Apply Processing: Analyzer → transactions sorted in dependency order → Coordinator → transactions to be applied → Applier → datafiles.]

Fig. 5: Data Guard SQL Apply Process Architecture

Following is a summary of the functionalities of each of the processes:

• The Reader process reads incoming redo records from standby redo logs, or from standby archive logs.

• The Preparer processes convert the block changes into table changes, or logical change records (LCRs). At this point, the LCRs do not represent specific transactions.

• The Builder process assembles completed transactions from the individual LCRs.

• The Analyzer process examines the completed transactions, identifying dependencies between the different transactions.

• The Coordinator process (also known as the Logical Standby Process, or LSP), assigns transactions to the apply processes, monitors dependencies between transactions, and authorizes the commit of changes to the logical standby database.

• The Applier processes apply the LCRs for the assigned transaction to the database and commit the transactions when instructed to do so by the Coordinator.

These various SQL Apply processes can be started by entering this simple SQL command on the logical standby database:

ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;
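Once SQL Apply is started, the processes listed above can be observed from the logical standby itself. The query below is a minimal sketch that assumes the V$LOGSTDBY_PROCESS view of Oracle Database 10g Release 2; the column selection is illustrative.

```sql
-- Run on the logical standby while SQL Apply is active.
-- TYPE identifies the role of each process
-- (coordinator, reader, preparer, builder, analyzer, applier).
SELECT sid, serial#, type, status
  FROM v$logstdby_process
 ORDER BY type;
```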

Since the logical standby database can be open read/write while SQL Apply is running, synchronous redo transport to the logical standby and application of redo data directly from SRLs makes it possible to use the logical standby database as a real-time reporting solution. This demonstrates yet another way Data Guard may be leveraged for additional business use beyond disaster protection.

Automatic Archived Log Deletion in Oracle Database 10g Release 2

The logical standby database in Oracle Database 10g Release 2 automatically deletes archived redo log files that have been applied by SQL Apply. This enhances space management on the logical standby database. The DBA_LOGMNR_PURGED_LOG view displays the archived redo log files that are no longer needed, and hence are candidates for deletion. If administrators want control over this deletion process, they may override the default auto-deletion behavior by setting the logical standby parameter LOG_AUTO_DELETE to FALSE.
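The override described above might look like the following sketch on the logical standby. It assumes that SQL Apply must be stopped before the parameter is changed and that DBMS_LOGSTDBY.PURGE_SESSION is used to refresh the list of files that are no longer needed; both points should be verified against the PL/SQL package reference for the release in use.

```sql
-- Run on the logical standby.
ALTER DATABASE STOP LOGICAL STANDBY APPLY;

-- Turn off automatic deletion of applied archived redo log files.
EXECUTE DBMS_LOGSTDBY.APPLY_SET('LOG_AUTO_DELETE', 'FALSE');

ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;

-- Later: identify files SQL Apply no longer needs, then list them
-- so that they can be removed with operating system commands.
EXECUTE DBMS_LOGSTDBY.PURGE_SESSION;
SELECT * FROM dba_logmnr_purged_log;
```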

Benefits of Logical Standby

A logical standby database provides disaster recovery, high availability, and data protection benefits similar to those of a physical standby database. It also provides the following specialized benefits:

• Efficient use of standby hardware resources – A logical standby database can be used for other business purposes in addition to disaster recovery requirements. It can host additional database schemas beyond the ones that are protected in a Data Guard configuration, and users can perform DDL or DML operations on those schemas any time. Because the logical standby tables that are protected by Data Guard can be stored in a different physical layout than on the primary database, additional indexes and materialized views can be created to improve query performance and suit specific business requirements.

• Reduction in primary database workload – A logical standby database can remain open at the same time its tables are updated from the primary database, and those tables are simultaneously available for read access. This enables the logical standby database to be used concurrently for data protection and reporting, thereby off-loading the primary database from those reporting and query tasks, and saving valuable CPU and I/O cycles.

Enterprise Manager & Data Guard Broker

Data Guard Broker is a distributed management framework that automates and centralizes the creation, maintenance, and monitoring of Data Guard configurations. All management operations can be performed either through Oracle Enterprise Manager, which uses the Broker, or through the Broker’s specialized command-line interface (DGMGRL). The following screenshot shows the Data Guard home page of the Enterprise Manager.

Fig. 6: Data Guard Configuration through Oracle Enterprise Manager

The following list describes some of the operations that the Broker automates and simplifies:

• Creating and enabling Data Guard configurations, which include a primary database and up to nine standby (physical or logical) databases – all or a mix of these databases may be RAC clusters.

• Managing an entire Data Guard configuration from any site in the configuration.

• Implementing switchover or failover operations (including Fast-Start Failover) that involve complex role changes across all systems in the configuration.

• Monitoring apply rates, capturing diagnostic information, and detecting problems quickly with centralized monitoring, and event notification.

The Broker's easy-to-use interfaces and centralized management and monitoring of the Data Guard configuration make Data Guard an enhanced high availability and data protection solution for the enterprise.

Management Enhancements in Oracle Database 10g Release 2

In Oracle Database 10g Release 2, Data Guard offers various enhanced views to monitor the run-time performance of the Data Guard configuration in a granular fashion. These can be accessed through SQL*Plus, DGMGRL or Enterprise Manager. Enterprise Manager offers the additional benefit of providing historical trend analysis on the Data Guard metrics that it monitors – for example, how the metric’s performance has been in the last 24 hrs, or last 5 days, etc. Also, through Enterprise Manager, it is possible to set up notification-alarms such that administrators may be notified in case the metric crosses the configured threshold value. Some of the Data Guard metrics monitored by Enterprise Manager are:

• Estimated Failover Time – The approximate number of seconds it would take to fail over to this standby database.

• Apply Lag – Shows how far the standby is behind the primary.

• Redo Apply Rate – The rate at which redo is applied on the standby.

• Redo Generation Rate – The rate at which redo is generated on the primary.

• Transport Lag – The approximate number of seconds of redo not yet available on this standby database. This may be because the redo has not yet been shipped or there may be a gap.

• Data Guard Status – Shows the status of each database in the Data Guard configuration.

• Fast-Start Failover Occurred – When fast-start failover is enabled, this metric will generate a critical alert on the new primary database (old standby) if a fast-start failover occurs (ref. section “Fast-Start Failover” for details on fast-start failover).

• Fast-Start Failover Time – When fast-start failover is enabled, this metric will generate a critical alert on the new primary database (old standby) if fast-start failover occurs, indicating the time stamp of the occurrence.
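Some of these metrics, such as apply lag and transport lag, can also be read directly through SQL*Plus. The following is a hedged sketch that assumes the V$DATAGUARD_STATS view of Oracle Database 10g Release 2 and is run on the standby database.

```sql
-- Run on the standby database.
-- Each row is one statistic, for example 'apply lag' or 'transport lag';
-- TIME_COMPUTED shows when the value was last calculated.
SELECT name, value, time_computed
  FROM v$dataguard_stats;
```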

Role Transitions – Switchover and Failover

Data Guard offers two easy-to-use methods to handle planned and unplanned outages of the production site. The goal of both these methods is to transition a designated standby database to a production role as quickly as possible. These methods are called switchover and failover, corresponding respectively to how planned and unplanned outages are handled.

Switchover

A switchover is typically used to reduce primary database downtime during planned outages, such as operating system or hardware upgrades, or rolling upgrades of the Oracle database software and patch sets. A switchover may also be used by a business to test disaster recovery preparedness. A switchover operation requires all user sessions to be disconnected from the primary database. Following that, the primary database is transitioned to the standby role, after which the standby database is transitioned to the primary role. A switchover is initiated by the administrator using the Oracle Enterprise Manager GUI interface, the Data Guard Broker’s command line interface, or directly through SQL. For example – the following single Data Guard Broker CLI (DGMGRL) command initiates and completes the switchover to the standby database “StandbyChicago”:

DGMGRL> SWITCHOVER TO StandbyChicago;

As seen above, the administrator needs to specify a target standby database at the time of the switchover. Once initiated, Data Guard automates the actual role transition processes. No data is lost in the process.

If the switchover target is a logical standby database, a switchover operation does not require a restart of the old or new primary databases following a switchover. Also, there is no need to shut down and restart any other logical standby databases that are in the Data Guard configuration – they will continue to function normally after a switchover completes. All existing physical standby databases, however, are rendered unable to participate in the Data Guard configuration after the switchover, and must be recreated from a backup of the new primary database, in order to serve as its physical standby.

If the switchover target is a physical standby database, a switchover operation requires the old primary database to be restarted and mounted (as a new standby) following the switchover. In Oracle Database 10g Release 2, the old standby database does not need to be restarted if it has never been open read-only while being a physical standby. Otherwise, it needs to be shut down and restarted to be opened read/write as the new primary database. There is no need to shut down and restart other standby databases in the configuration – whether physical or logical.

At times, the term switchback may also be used within the scope of Data Guard role management. A switchback operation is simply a subsequent switchover operation to return databases to their original roles.
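For configurations managed directly through SQL rather than the Broker, a switchover to a physical standby might be sketched as follows. This is an illustrative outline under the assumption of a single primary and a single physical standby; the restart steps described above still apply and are only noted in comments.

```sql
-- On the primary: confirm that a switchover is possible
-- (for example TO STANDBY or SESSIONS ACTIVE).
SELECT switchover_status FROM v$database;

-- On the primary: disconnect sessions and move to the standby role.
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
-- The old primary is then restarted and mounted as a standby.

-- On the target standby: confirm it is ready, then take the primary role.
SELECT switchover_status FROM v$database;
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
```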

Types of Failover

Failover is the operation of bringing one of the standby databases online as the new primary database when an unplanned catastrophic failure occurs on the primary database, and there is no possibility of recovering the primary database in a timely manner. For Data Guard in Oracle Database 10g Release 2, there are two kinds of failover operations – manual failover, and fast-start failover. A manual failover is initiated by the administrator using the Oracle Enterprise Manager GUI interface, the Data Guard Broker’s command line interface, or directly through SQL*Plus. With fast-start failover, Data Guard automatically fails over to a previously designated standby database when the primary database is unavailable, provided there is no chance of data loss at the time of the failover. Fast-start failover allows the administrator to increase availability with no need for manual intervention, thereby reducing management costs as well as downtime. Manual failover gives administrators more control over the failover process – for example, an administrator may want to invoke a manual failover even where there is a chance of data loss.

Manual Failover

A manual failover operation is initiated on the standby database that will assume the primary role. For example – the following single Data Guard Broker CLI (DGMGRL) command initiates and completes the failover to the standby database “StandbyChicago”:

DGMGRL> FAILOVER TO StandbyChicago;

If the failover target is a logical standby database, a failover operation does not require a restart of the new primary database following a failover. Other logical standby databases in the Data Guard configuration will have to be flashed back and synchronized, or recreated from a backup of the new primary. Other physical standby databases in that configuration will no longer be compatible, and hence they need to be recreated from a backup of the new primary.

If the failover target is a physical standby database, this standby database (i.e. the new primary database) does not need to be restarted if it has never been open read-only since the last time it was started. Otherwise, it needs to be shut down and restarted to be opened read/write as the new primary database. There is no need to shut down and restart other standby databases in the configuration – whether physical or logical, provided they were not “transactionally ahead” of the target standby at the time of the failover. Otherwise they will have to be flashed back and synchronized, or recreated from a backup of the new primary.

The manual failover operation ensures zero data loss if Data Guard was being run in the Maximum Protection mode or Maximum Availability mode and the target standby database was synchronized at the time of the failover. In Maximum Performance mode, if there were still some redo data in the primary database that had not yet been sent to the standby at the time of failover, that data may be lost. The following figure shows the result of a failover operation from a primary database in San Francisco to a physical standby database in Boston.

Fig. 7: Failover to a Standby Database
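When the Broker is not used, a manual failover to a physical standby can be performed through SQL*Plus. The statements below are a hedged sketch of the general shape of that procedure; the exact clauses depend on the release and on whether standby redo logs are configured, so they should be checked against the Data Guard documentation before use.

```sql
-- Run on the target physical standby.
-- Apply all redo that has been received from the failed primary.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;

-- Convert the standby to the primary role.
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

-- Open the new primary (restart first if it had been opened read-only).
ALTER DATABASE OPEN;
```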

Fast-Start Failover

Fast-start failover allows Data Guard to automatically fail over to a previously chosen, synchronized standby database in the event of loss of the primary database, without requiring any manual steps to invoke the failover. Further, upon return of the failed primary, it will automatically be reinstated into the configuration as a standby of the new primary database. Fast-start failover can be used only in a Data Guard Broker configuration and can be configured only through DGMGRL or Enterprise Manager.

Fast-start failover automatically fails over to a designated standby database without requiring any manual intervention.

The fast-start failover configuration is monitored by a separate Observer process, which is a lightweight process integrated in the DGMGRL client-side component. It should be run on a different computer from the primary or standby databases. It continuously monitors the fast-start failover environment to ensure the primary database is available. If both the Observer and the standby database lose connectivity to the primary database, the Observer attempts to reconnect to the primary database for a configurable amount of time before initiating a fast-start failover.

Prerequisites & Administration

Fast-start failover requires Data Guard Broker to be enabled for the configuration. It also requires Flashback Database to be enabled and Flash Recovery Area to be configured in the primary and the designated standby database. Configuring Flashback Database and Flash Recovery Area is recommended for other standby databases in the configuration as well. It also requires Data Guard protection mode to be Maximum Availability, thereby ensuring that a failover can occur without any data loss. The Observer process needs network connectivity between the Observer machine and the primary and standby servers to be able to monitor the configuration.

To enable fast-start failover, a standby database first needs to be designated as the target for the failover. This can be done using the Broker property, FastStartFailoverTarget, or through Enterprise Manager. Secondly, administrators need to configure how long the Observer should try to reconnect to the current primary database before initiating a fast-start failover. This can be done by specifying the time (in seconds) through the FastStartFailoverThreshold Broker property, or through Enterprise Manager. Finally, fast-start failover needs to be enabled for the configuration, which can be done through Enterprise Manager, or through DGMGRL:

DGMGRL> ENABLE FAST_START FAILOVER;

The Observer process can now be started at the Observer machine, either using Enterprise Manager, or with the simple DGMGRL command, and it will start monitoring the fast-start failover configuration:

DGMGRL> START OBSERVER;
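The database-side prerequisites listed above can be checked from SQL*Plus before fast-start failover is enabled. This is a minimal sketch, run on both the primary and the designated standby; it only reports the current settings and changes nothing.

```sql
-- FLASHBACK_ON should report YES and PROTECTION_MODE should report
-- MAXIMUM AVAILABILITY for fast-start failover.
SELECT flashback_on, protection_mode, protection_level
  FROM v$database;

-- SQL*Plus command: a non-empty value indicates that a
-- Flash Recovery Area location is configured.
SHOW PARAMETER db_recovery_file_dest
```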

The following diagram shows the relationships between the primary database, target standby database, and the Observer during fast-start failover:

Fig. 8: Interaction among the Observer, Primary Database & Standby Database

• Before Fast-Start Failover: Data Guard is operating in a steady state, with the primary database transmitting redo data to the target standby database and the Observer monitoring the state of the entire configuration.

• Fast-Start Failover Ensues: Disaster strikes the primary database and its network connections to both the Observer and the target standby database are lost. Upon detecting the break in communication, the Observer attempts to reestablish a connection with the primary database for the amount of time defined by the FastStartFailoverThreshold property before initiating a fast-start failover. If the Observer is unable to regain a connection to the primary database within the specified time, and the target standby database is ready for fast-start failover, then fast-start failover ensues.

• After Fast-Start Failover: The fast-start failover has completed and the target standby database is running in the primary database role. After the former primary database has been repaired, the Observer reestablishes its connection to that database and reinstates it as a new standby database. The new primary database starts transmitting redo data to the new standby database.

The status of the fast-start failover configuration can be monitored any time through Enterprise Manager or through the FS_FAILOVER_STATUS column in the V$DATABASE view. For example, if the primary database, while in Maximum Availability mode, gets disconnected from the standby database but remains connected to the Observer, the standby will no longer be synchronized with the primary. This will be reflected by the value “UNSYNCHRONIZED” in the FS_FAILOVER_STATUS column. Since fast-start failover always ensures no data loss, fast-start failover will not be possible in this case and a manual failover will be required. Fast-start failover has been designed to ensure that out of the three fast-start failover members – the primary, the standby and the Observer, at least two members agree to major state transitions, thus avoiding conditions such as split-brain scenarios in which there may be two divergent primary databases serving production workload. The simple, yet elegant architecture of fast-start failover makes it an excellent candidate to be used in lights-out high availability situations where data protection is also important.
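The status check described above might be written as the following query. FS_FAILOVER_STATUS is the column named in the text; the other FS_FAILOVER_* columns are included on the assumption that they are present in V$DATABASE in Oracle Database 10g Release 2.

```sql
-- Run on the primary or the target standby.
-- FS_FAILOVER_STATUS shows, for example, SYNCHRONIZED or UNSYNCHRONIZED.
SELECT fs_failover_status,
       fs_failover_current_target,
       fs_failover_threshold,
       fs_failover_observer_present,
       fs_failover_observer_host
  FROM v$database;
```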

Restoring Old Primary As A New Standby

Following a failover, the administrator will likely need to restore the old primary as a new standby in the Data Guard configuration, to resume data protection capabilities of the configuration. In Oracle9i, the only way to do this is to recreate the original primary as a new standby from a backup of the new primary. This time-consuming process is avoided in Oracle Database 10g Release 1, if the Flashback Database feature of Oracle Database 10g was enabled on the original primary database prior to the failover, and the primary database is not a RAC database. This is accomplished simply by flashing back the old primary database to the point in time when the failover occurred (given by the STANDBY_BECAME_PRIMARY_SCN column in the V$DATABASE view), starting the database as a standby database and then letting Data Guard automatically synchronize the new standby database with the new primary database.

Oracle Database 10g Release 2 adds support for the above process to Data Guard configurations that include RAC and automates it even further. If fast-start failover is enabled, as soon as the old primary database is repaired, brought up and network connectivity to the Observer is restored, the Observer, using Flashback Database transparently, automatically reinstates it as a new standby database in the configuration without requiring any manual intervention. If fast-start failover is not enabled, the old primary database may be reinstated as a new standby through Enterprise Manager or with this simple DGMGRL command:

DGMGRL> REINSTATE DATABASE boston_old_primary;

Role Transition Events

In Oracle Database 10g Release 2, Data Guard role changes cause events to be posted, both to help the administrator automate post-role-change tasks and to notify applications when a database they are connected to is no longer operating in the primary (production) role. Specifically, a system event, DB_ROLE_CHANGE, and a FAN (Fast Application Notification, ref. [6]) event, DB_DOWN, are posted. Together these two new events help client applications fail over to the designated standby database.

The DB_ROLE_CHANGE system event is fired when the database opens for the first time after the role transition, regardless of its new role, i.e., regardless of whether the role change caused it to open for the first time as a primary database, as a logical standby, or as a physical standby in read-only mode. Administrators can write a trigger that executes on this event to manage post-role-change tasks, e.g. starting one or more services on the new primary, adding temporary tablespaces, etc.

The DB_DOWN event is posted in a failover situation, when the failover operation is coordinated through the Data Guard Broker. It is posted by the database that is operating in the new primary database role, on behalf of the old primary. This FAN event notifies subscribing OCI clients that the database that used to operate in the primary role is now down or unavailable. Applications can therefore connect to the new primary soon after receiving the notification instead of waiting for a TCP timeout period, as they would otherwise have to after a site failure.

Handling Communication Failures

Data Guard can smoothly handle network connectivity problems that temporarily disconnect the standby (physical or logical) database from the primary database. The exact behavior is dictated by the Data Guard protection mode. When the standby database becomes unavailable and this standby database is the last available standby in the Maximum Protection mode, the primary database will be shut down. For all other cases, transactions are captured locally at the primary database. When connectivity to the standby is re-established, the accumulated archive logs are automatically shipped and applied to the standby, until the standby has resynchronized with the primary. This process does not require any administrative intervention. Oracle recommends that network capacity be sufficient to handle such resynchronizations if network outages are common in the vicinity of the primary site.
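The behavior described above can be sketched as a short Python model. It is an illustration only: `primary_reaction` and `ArchiveLogBacklog` are hypothetical names, the protection modes are plain strings, and the log sequence numbers are invented.

```python
from collections import deque
from typing import Deque, List


def primary_reaction(protection_mode: str, reachable_standbys: int) -> str:
    """Return what the primary does when standby connectivity changes (toy model)."""
    if protection_mode == "MAXIMUM PROTECTION" and reachable_standbys == 0:
        # The last available standby is gone: the primary shuts down.
        return "SHUT DOWN"
    # All other cases: keep running and capture transactions locally.
    return "CONTINUE"


class ArchiveLogBacklog:
    """Archive log sequence numbers accumulated while the standby is unreachable."""

    def __init__(self) -> None:
        self._pending: Deque[int] = deque()

    def archive_locally(self, sequence: int) -> None:
        self._pending.append(sequence)

    def resynchronize(self) -> List[int]:
        """Return the backlog in shipping order and empty it."""
        shipped = list(self._pending)
        self._pending.clear()
        return shipped


backlog = ArchiveLogBacklog()
if primary_reaction("MAXIMUM AVAILABILITY", reachable_standbys=0) == "CONTINUE":
    for seq in (101, 102, 103):  # hypothetical log sequence numbers
        backlog.archive_locally(seq)
print(backlog.resynchronize())
```

The longer the outage, the larger the backlog shipped on reconnection, which is why network capacity matters for resynchronization.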

Protection from Data Corruptions Caused by Human Errors

When a primary database is open and active, and transactions are in progress, redo data is generated and transmitted to standby sites. Because human error is the leading cause of system downtime, this redo data may contain critical logical user errors, such as the dropping of an important table, or other logical data corruptions that have already corrupted the primary database. Data Guard provides several easy-to-use means to guard against such user errors. The administrator may use the Flashback Database feature of Oracle Database 10g on both the primary and standby databases to quickly revert the databases to an earlier point in time and back out such user errors. Alternatively, if the administrator decides to fail over to a standby database but those user errors have already been applied to it, the administrator may simply flash back the standby database to a safe point in time (assuming Flashback Database was already enabled on the standby database). Finally, the administrator has the added option to delay the application of redo data on standby databases by a configurable amount of time, which provides a window of protection from such user errors or corruptions. Irrespective of the option chosen, the apply process on the standby database always revalidates the log records to prevent physical redo data corruptions from being applied to the standby database.
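The window of protection offered by delayed apply is simple time arithmetic, sketched below in Python. The function name `standby_still_clean`, the four-hour delay and the timestamps are all hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def standby_still_clean(error_time: datetime, noticed_time: datetime,
                        apply_delay: timedelta) -> bool:
    """True if redo generated at error_time is still inside the delay window,
    i.e. the erroneous change has not yet become eligible for apply."""
    return noticed_time - error_time < apply_delay


delay = timedelta(hours=4)               # hypothetical configured apply delay
dropped = datetime(2006, 11, 1, 10, 0)   # hypothetical time a table was dropped

# Noticed after 90 minutes: the standby has not applied the drop yet.
print(standby_still_clean(dropped, datetime(2006, 11, 1, 11, 30), delay))
# Noticed after 5 hours: the window has passed, so Flashback Database is needed.
print(standby_still_clean(dropped, datetime(2006, 11, 1, 15, 0), delay))
```

A longer delay widens the protection window but also means more redo remains to be applied before the standby is current at failover time.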

Rolling Database Upgrades

Oracle Database 10g supports database software upgrades for major release and patchset upgrades (from Oracle Database 10g onwards) in a rolling fashion – with near zero database downtime, by using Data Guard SQL Apply. The steps involve upgrading the logical standby database to the next release, running in a mixed mode to test and validate the upgrade, doing a role reversal by switching over to the upgraded database, and then finally upgrading the old primary database. While running in a mixed mode for testing purposes, the upgrade can be aborted and the software downgraded, without data loss. For additional data protection during these steps, a second standby database may be used. The following diagram shows the sequence of events in a rolling upgrade process.

[Figure: four-panel diagram of a rolling upgrade between databases A and B, labeled Major Release Upgrades, Patch Set Upgrades, and Cluster Software & Hardware Upgrades. Panel 1: initial SQL Apply configuration, with clients on A, redo flowing to B, and both at version X. Panel 2: upgrade node B to X+1 while logs queue. Panel 3: run in mixed mode to test, with A at X and B at X+1. Panel 4: switch over to B and upgrade A, so that both run X+1.]

Fig. 9: Rolling Database Upgrades Using SQL Apply

By supporting rolling upgrades with minimal downtime, Data Guard reduces the large maintenance windows typical of many administrative tasks, and enables the 24x7 operation of the business.
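The upgrade sequence described in this section can be traced step by step with a small Python sketch. The function `rolling_upgrade_plan`, the database names A and B, and the integer version numbers are illustrative assumptions. The code only tracks roles and versions and performs no actual upgrade.

```python
from typing import List, Tuple

# (step, role of A, version of A, role of B, version of B)
Snapshot = Tuple[str, str, int, str, int]


def rolling_upgrade_plan(x: int) -> List[Snapshot]:
    """Trace roles and versions of databases A and B through a rolling upgrade."""
    role_a, ver_a = "primary", x
    role_b, ver_b = "logical standby", x
    plan = [("initial SQL Apply configuration", role_a, ver_a, role_b, ver_b)]

    ver_b = x + 1  # upgrade the logical standby first
    plan.append(("upgrade standby B", role_a, ver_a, role_b, ver_b))

    # Mixed mode: A is still at X and B is at X+1; the upgrade can be aborted here.
    plan.append(("run in mixed mode to test", role_a, ver_a, role_b, ver_b))

    role_a, role_b = role_b, role_a  # switchover: B becomes the primary
    plan.append(("switch over to B", role_a, ver_a, role_b, ver_b))

    ver_a = x + 1  # finally upgrade the old primary
    plan.append(("upgrade old primary A", role_a, ver_a, role_b, ver_b))
    return plan


for step in rolling_upgrade_plan(10):
    print(step)
```

At every step exactly one database holds the primary role, which is why the only interruption clients see is the switchover itself.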

Cascaded Redo Log Destinations

Data Guard provides many flexible configuration options. Using Cascaded Redo Log Destinations, a standby database receives its redo data from another standby database and not from the original primary database. Since the primary database sends redo data to only a subset of the standby databases, this feature reduces the load on the primary system, and also reduces network traffic and the use of valuable network resources at the primary site. This is also valuable in cases where multiple standby databases are deployed to serve a variety of disaster recovery and reporting requirements.
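The effect of cascading on the primary can be seen by counting outbound redo streams in a toy topology. The Python sketch below uses hypothetical database names and helper functions, assumes the topology contains no cycles, and models only who ships redo to whom.

```python
from typing import Dict, List


def redo_streams_from(source: str, topology: Dict[str, List[str]]) -> int:
    """Number of destinations that `source` ships redo to directly."""
    return len(topology.get(source, []))


def all_standbys(source: str, topology: Dict[str, List[str]]) -> List[str]:
    """Every database that receives redo from `source`, directly or via a cascade."""
    reached: List[str] = []
    for target in topology.get(source, []):
        reached.append(target)
        reached.extend(all_standbys(target, topology))
    return reached


# Hypothetical configurations with four standbys each.
direct = {"primary": ["stby1", "stby2", "stby3", "stby4"]}
cascaded = {"primary": ["stby1", "stby2"], "stby1": ["stby3"], "stby2": ["stby4"]}

print(redo_streams_from("primary", direct))    # streams leaving the primary
print(redo_streams_from("primary", cascaded))  # fewer streams with cascading
print(sorted(all_standbys("primary", cascaded)))
```

Both layouts protect the same four standbys, but in the cascaded layout the primary ships redo to only two of them.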

DATA GUARD AND RAC

Data Guard and RAC are complementary to each other. RAC addresses system or instance failures. It provides rapid and automatic recovery from failures that do not affect data – such as node failures, or instance crashes. It also provides increased scalability for an application. Data Guard, on the other hand, provides data protection through the use of transactionally consistent primary and standby databases, which neither share disks nor run in lock step. This enables recovery from site disasters or data corruptions. Data Guard is also natively integrated with RAC – e.g., some or all of the primary/standby (physical or logical) databases can be RAC databases, and they can be managed using Enterprise Manager or the Broker’s command-line interface, or directly using SQL. Customers should use a combination of Data Guard and RAC to maximize both high availability and disaster recovery benefits.

MAXIMUM AVAILABILITY ARCHITECTURE

As more system capabilities become available, IT managers, architects and administrators often find it difficult to integrate a suitable set of features into one unified high availability (HA) solution that fits all of their business requirements. Oracle Maximum Availability Architecture (MAA) is Oracle's best-practices blueprint, based on proven Oracle high availability technologies, recommendations and real-world customer deployment experiences. The goal of MAA is to remove the complexity in designing the optimal high availability architecture and to maximize systems availability.

MAA provides the following benefits:

• MAA reduces the implementation costs for a highly available Oracle system by providing detailed configuration guidelines. The results of performance-impact studies for different configurations are highlighted to ensure that the chosen highly available architecture can perform and scale according to business needs.

• MAA provides best practices and recovery steps to eliminate or minimize downtime that could occur because of scheduled and unscheduled outages such as human errors, system faults and crashes, maintenance, data failures, corruptions, and disasters.

• MAA gives the ability to control the length of time to recover from an outage and the amount of acceptable data loss under disaster conditions thus allowing mean time to recovery (MTTR) to be tailored to specific business requirements.

Data Guard is an essential component of MAA, and the MAA guidelines include best practice recommendations (ref. [7]) on various Data Guard configuration aspects, such as configurations involving both RAC and Data Guard, redo data transport mechanisms, switchover/failover, media recovery, SQL Apply configuration, network configuration, etc. Customers interested in Data Guard implementations are strongly encouraged to refer to these MAA best practice guidelines.

DATA GUARD AND REMOTE MIRRORING SOLUTIONS

Remote Mirroring solutions are often seen as a way to offer simple and complete data protection. There are two kinds of remote mirroring solutions: (a) host-based replication, and (b) storage array-based mirroring.

In Host-based Replication solutions, specialized file system drivers or volume manager components in the primary server intercept local writes, package them in logical messages, and synchronously or asynchronously send them over IP to remote (or secondary) hosts. Such solutions need to maintain specialized logs to keep track of write-ordering. The data volumes on the secondary server cannot be used (even for read-only access) while replication is in progress.

In Storage Array-based Mirroring solutions, storage array controllers at the primary site mirror changed disk I/O blocks to a similar storage array at the secondary site. These changes are sent using protocols such as ESCON, FICON and Fibre Channel, although in some recent versions iSCSI and IP-based transport are also supported. The mirroring over the appropriate communication links is controlled by specialized link adapters loaded with appropriate firmware. As I/Os occur at the primary server, data is written to the cache of the source array, and placed in a queue. The link adapter takes the first entry of the queue and moves it across the link to the mirrored array.

While such solutions are used in data centers, when it comes to protecting the Oracle database, Data Guard is inherently much more efficient, less expensive, and better optimized for data protection than remote mirroring solutions. Customers do not need to buy or integrate a remote mirroring solution with Data Guard to protect an Oracle database. Following is a summary of the benefits of Data Guard compared to a remote mirroring solution. For a detailed analysis on this topic, refer to [8].

• Better network efficiency

With Data Guard, only the redo data needs to be sent to the remote site. However, if a remote mirroring solution is used for data protection, then the database files, the online logs, the archive logs and the control file must be mirrored. This means that remote mirroring will send each change at least three times to the remote site. Further, database writes happen a lot more often than log writes because each log write typically contains many changes (known as group commit). This means that the network bandwidth needed for a solution based on database redo shipping is considerably less than that of a remote mirroring solution. Even more importantly, it means far fewer network round trips. Remote mirroring can be very useful for non-database files, but for database data the combination of better protection and lower cost provides compelling reasons to use Data Guard. An internal analysis of Oracle's corporate e-mail systems, as shown in the following graph, demonstrated that 7 times more data was transmitted over the network and 27 times more I/O operations were performed using a remote mirroring solution, compared to using Data Guard.

[Figure: bar chart comparing Data Guard and remote mirroring on network I/Os and network bandwidth; horizontal axis from 0 to 30.]

Fig. 10: Network Performance: Data Guard vs. Remote Mirroring

• Better suited for WANs

Remote-mirroring solutions based on storage systems often have a distance limitation due to the underlying communication technology (Fibre, ESCON) used by the storage systems. This distance can be extended by using specialized devices from third party vendors. These devices convert ESCON/fibre to the appropriate IP, ATM or SONET networks. The problem is that with each such device, latency is introduced in the system, impacting the production database performance, and making such a configuration unsuitable for synchronous transport necessary for the zero data loss capability. This problem may be mitigated by introducing intermediate storage boxes in the communication path, but that only adds to the overall cost. The other solution is to use variations of synchronous transmission – however, depending on the remote mirroring solution, anything other than synchronous transmission of data may not preserve write-ordering across all mirrored volumes that the database resides on. This means such configurations cannot guarantee data consistency at all times, making them unsuitable as a data protection / disaster recovery solution for OLTP data.

Since Data Guard transmits only redo data to the standby sites, using a standard IP network, preserves transactional consistency across all the protection modes (i.e. whether using synchronous or asynchronous transport), and does not need expensive interim storage boxes, it is a much better DR and data protection solution for a WAN.

• Better resilience and data protection

Data Guard processes are aware of data formats as they read and write information from the primary database. In addition, Data Guard is integrated with the Flashback Database feature, and also allows the application of changes to be delayed. These capabilities prevent many human errors and data corruptions from propagating to, or affecting, the standby database. Remote mirroring does not have this advantage: any inadvertent drop of a critical table will be instantly propagated to, and adversely affect, the remote copy of the database files.

• Higher ROI

Using Data Guard, the standby database can be opened for reporting while changes are still propagating. That is not always the case for a remote mirroring solution. In addition, Data Guard is an out-of-the-box integrated feature of the Oracle database, whereas remote mirroring solutions are extra-cost purchases that require complex integration with the database. Many of these remote mirroring solutions are also proprietary and can be used only with storage systems from the vendor that makes them (on both the primary and secondary sites). Data Guard, on the other hand, does not force any lock-in with a particular storage solution for the primary and standby sites.
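The network efficiency comparison in the list above reduces to simple multiplication, sketched here in Python. The multipliers come from the text (at least three copies of each change; 7 times the data and 27 times the I/O operations in Oracle's internal e-mail system analysis). The function name and the daily redo figures are hypothetical, and applying the measured ratios to any other workload is an assumption.

```python
MIN_COPIES_PER_CHANGE = 3      # lower bound stated in the text
MEASURED_BANDWIDTH_RATIO = 7   # from the internal e-mail system analysis
MEASURED_IO_RATIO = 27         # from the same analysis


def mirroring_estimate(redo_gb_per_day: float, redo_ios_per_day: int) -> dict:
    """Scale a Data Guard redo workload to a rough remote mirroring equivalent."""
    return {
        "min_gb_per_day": MIN_COPIES_PER_CHANGE * redo_gb_per_day,
        "measured_gb_per_day": MEASURED_BANDWIDTH_RATIO * redo_gb_per_day,
        "measured_ios_per_day": MEASURED_IO_RATIO * redo_ios_per_day,
    }


# Hypothetical workload: 50 GB of redo and one million network I/Os per day.
print(mirroring_estimate(50.0, 1_000_000))
```

Such an estimate is useful only as a first-order sizing aid for WAN links. Actual ratios depend on the workload and on the mirroring product.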

DATA GUARD CUSTOMERS

Data Guard, which has been available since Oracle version 7, is deployed for mission-critical applications at major global customer sites. A focused list of customers who have implemented Data Guard as their high availability and disaster recovery solution, along with detailed implementation case studies, is available at [9]. Some of the customer success stories profiled are:

Customer | Case Study

o Ohio Savings Bank | Oracle Database 10g – Maximum Availability Architecture & Zero Data Loss

o Oracle Global IT | Oracle E-Business Suite with Data Guard over a WAN

o ADT Security Services | Using Data Guard SQL Apply Across a Wide Area Network

o Amadeus | Using Data Guard for Disaster Recovery & Rolling Database Upgrades

o e-Rewards Market Research | High Availability & near real-time synchronization of Data Warehouse and OLTP databases using MAA and SQL Apply

o Fannie Mae | Data Guard 10g Release 2 - very high performance using LGWR SYNC and LGWR ASYNC redo transport services

o First American Real Estate Solutions | Using Oracle9i Data Guard and Planning for Data Guard new features in Oracle Database 10g

o Kemira GrowHow Ltd, UK | Replacing Outsourced Disaster Recovery Services with Oracle Data Guard

o NeuStar | Synchronous Zero Data Loss Protection with Production and Standby Databases Separated by 300 Miles

o PPL | Automatic failover for mission critical outage management system using Data Guard 10g Release 2

o Shire Pharmaceuticals | Trans-Atlantic deployment (UK to US) using Data Guard 10g Release 2

o Swedish Post | Extending the DR system using reporting capabilities of Data Guard SQL Apply

o VP Bank | Using Data Guard SQL Apply to deploy content outside the corporate firewall

These success stories about Data Guard in action at some of the best names in various industries around the world are a tribute to Data Guard’s comprehensive capabilities in the area of business continuity.

CONCLUSION

Oracle Data Guard is a comprehensive data protection, disaster recovery and high availability solution for the enterprise. It offers a flexible and easy-to-manage framework that addresses both planned and unplanned outages. Physical and logical standby databases complement each other and can be maintained simultaneously, providing high-value data protection, while offloading overhead from primary databases. The various data protection modes provide flexibility to adapt to various protection, performance and infrastructure requirements. The Data Guard Broker in combination with Oracle Enterprise Manager provides an easy-to-use configuration and management framework.

“We needed to consider the safe-keeping of our data, but we also needed to look at cost. Oracle Data Guard provides everything for a high availability solution at a lower cost than other alternatives.” – Ann Collins, Technical Director, First American Real Estate Solutions

A modern global enterprise cannot provide mission-critical service to its customers without the kind of technology discussed in this paper. It has to be complete, integrated, easy-to-manage, serve multiple purposes and protect all enterprise data. At the same time, such data protection and disaster recovery technology should not be expensive, nor unduly impact performance, and should enable businesses to extract additional value out of their DR investments. Oracle Data Guard is the only solution available today that meets all these needs.

REFERENCES

1. Oracle Data Guard Concepts and Administration – 10g Release 2, http://www.oracle.com/technology/documentation/database10g.html

2. Oracle High Availability Architecture and Best Practices – 10g Release 2, http://www.oracle.com/technology/documentation/database10g.html

3. Overview: Oracle Database High Availability, http://www.oracle.com/technology/deploy/availability

4. Oracle Database Backup and Recovery Basics – 10g Release 2, http://www.oracle.com/technology/documentation/database10g.html

5. Oracle Database 10g Best Practices: Data Guard Redo Apply and Media Recovery, http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm

6. Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide – 10g Release 2, http://www.oracle.com/technology/documentation/database10g.html

7. Oracle Maximum Availability Architecture, http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm

8. The Right Choice for Disaster Recovery: Data Guard, Stretch Clusters or Remote Mirroring, http://www.oracle.com/technology/deploy/availability/techlisting.html

9. Oracle High Availability Case Studies, http://www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html

Oracle Data Guard in Oracle Database 10g Release 2 – Business Continuity for the Enterprise November 2006 Author: Ashish Ray Contributing Author: Joseph Meeks Oracle Corporation World Headquarters 500 Oracle Parkway Redwood Shores, CA 94065 U.S.A. Worldwide Inquiries: Phone: +1.650.506.7000 Fax: +1.650.506.7200 www.oracle.com Copyright © 2005, Oracle. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.