Ashish Prabhu, Douglas Utzig, High Availability Systems Group, Server Technologies, Oracle Corporation


Post on 13-Jan-2016


DESCRIPTION

Ashish Prabhu and Douglas Utzig, High Availability Systems Group, Server Technologies, Oracle Corporation. Maximum Availability Architecture: Oracle's Recipe For Building An Unbreakable System. Agenda: Achieving High Availability, Maximum Availability Architecture (MAA), Overview, MAA Components.

TRANSCRIPT


Ashish Prabhu
Douglas Utzig
High Availability Systems Group
Server Technologies
Oracle Corporation


Maximum Availability Architecture
Oracle's Recipe For Building An Unbreakable System


Agenda

Achieving High Availability

Maximum Availability Architecture (MAA)
– Overview
– MAA Components
– Performance Considerations
– MAA Test Lab

Q & A


High Availability is …


Causes of Downtime

Maintenance & Continuous Operations

Scheduled Outages

Inadequate System Design, Testing & Process

Unscheduled Outages

Data Center Disasters

Human Error

System Faults and Crashes

Data and Media Failures


High Availability Goal

Design and validate the best, integrated High Availability solution

– Unbreakable Architecture
  Handle all outages at all tiers
– Best Practices
  Cookbook for prevention, avoidance, mitigation, and recovery
  Configuration, operational, outage solutions, restore fault tolerance
– Complete out-of-the-box high availability
  Tested and validated solution

Unbreakable Architecture + Best Practices = Maximum Availability


Maximum Availability Architecture

Best Oracle High Availability Architecture
– Blueprint for Database and Oracle9iAS
– Guidelines for hardware and non-Oracle software, but platform, OS, storage, network, … independent
– Evolves with new Oracle versions and features

Best Practices
– Configuration and operational
– Outages and detailed solutions
– Restoring fault tolerance after an outage


Maximum Availability Architecture

[Architecture diagram. Labels: WAN Traffic Manager, Dedicated Network, Primary Site, Secondary Site, Oracle9iAS, RAC, Data Guard]


Secondary Site

Secondary Site is a Mirror of the Primary Site
– Resolve unscheduled outages quickly and easily
– Allow site-wide scheduled outages

Same Service Levels
– Predictable performance and response time
– Site transparency

Consistent Procedures and Processes
– Reduces administrative complexity


Highly Available Database
Real Application Clusters

Fast Failover
– Protection from local site system failures
– Faster than cold cluster failover solution
– Fast-start fault recovery (instance failure MTTR)

Availability and Accessibility
– Allows for scheduled outages
  Add and remove nodes transparently
– Transparent Application Failover (TAF) provides uninterrupted service


Highly Available Database
Real Application Clusters

Higher Scalability
– All system resources from all nodes are leveraged
– Cache fusion eliminates need to partition data or modify the application – fully application transparent
– Connection load balancing distributes connection requests from application tier

Manageability
– Provides a single image of the database to manage
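
The idea of connection load balancing across cluster nodes can be illustrated with a small sketch. This is not how Oracle implements it; it assumes a simple "fewest active connections" policy, and the node names and function names are hypothetical.

```python
from typing import Dict, List


def pick_node(active_connections: Dict[str, int]) -> str:
    """Pick the node with the fewest active connections.

    Assumed policy for illustration only; ties go to the
    alphabetically first node name.
    """
    return min(sorted(active_connections), key=lambda n: active_connections[n])


def distribute(nodes: List[str], requests: int) -> Dict[str, int]:
    """Assign each incoming connection request to the least-loaded node."""
    counts = {node: 0 for node in nodes}
    for _ in range(requests):
        counts[pick_node(counts)] += 1
    return counts


# Hypothetical three-node cluster receiving ten connection requests.
print(distribute(["rac1", "rac2", "rac3"], 10))
```

Because the node list is just an input, adding or removing a node only changes where new requests land, which mirrors the slide's point that the application does not need to be partitioned per node.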


Highly Available Database
Oracle Data Guard

Data Protection
– Protection from site failures, data failures, human errors, and corruptions
  Protection modes balance availability with performance
  Apply delay prevents user error propagation
– Greater protection, performance, and manageability compared to remote mirroring solution
– Offload processing from primary database system

Role Management
– Switchover operation for scheduled outages
– Failover operation for unscheduled outages
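
Why an apply delay limits user error propagation can be shown with simple timing arithmetic. The sketch below is a simplification: it ignores redo transport lag and assumes that apply is stopped the moment the error is detected. The function name and sample times are hypothetical.

```python
def standby_is_still_clean(error_time_min: float,
                           detected_time_min: float,
                           apply_delay_min: float) -> bool:
    """Return True if the erroneous change has not yet been applied on the
    standby at the moment the error is detected.

    Simplified model: a change made at time t on the primary is applied
    on the standby no earlier than t + apply_delay_min.
    """
    earliest_apply_time = error_time_min + apply_delay_min
    return detected_time_min < earliest_apply_time


# Hypothetical scenario: a table is dropped at minute 0 and noticed at minute 30.
print(standby_is_still_clean(0, 30, apply_delay_min=60))  # delay longer than detection time
print(standby_is_still_clean(0, 30, apply_delay_min=0))   # no delay configured
```

In this model the delay only helps when it is longer than the time it takes to notice the error, which is the trade-off to weigh against how far behind the standby is allowed to run.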


Highly Available Application
Oracle9iAS

Availability
– Oracle9iAS J2EE (OC4J) and Web Cache clustering for protection against system outages
– Automatic monitor and restart of failed processes
– Application state preserved through failures
– Add and remove nodes transparently

Scalability
– Hardware network load balancer distributes client requests to Web Cache
– Web Cache clustering for distributed caching and load balancing across multiple OC4J instances


Highly Available Application
Oracle9iAS

[Tier diagram. Labels: Clients, Load Balancer, Application Server Tier, Web Cache, OC4J Clusters, Database Tier]


Network Infrastructure

Wide Area Traffic Manager to direct client traffic to proper site

Network load balancer to distribute incoming requests

Dedicated, fast link between sites
– Influences production database performance

Redundant components and paths
– Network paths to the site and within the site


Best Practices

Configuration
– Detailed recommendations for Oracle software
  Features to use, parameters to set
– Guidelines for hardware and other software

Operational
– Technical – e.g. Switchover and failover procedures
– Logistical – e.g. Change management considerations
– Emphasis on outages
  Outages to monitor
  Detailed steps to resolve outages
  How to restore fault tolerance


Best Practices

[Process diagram. Labels: Detect Outage, Configuration, Monitor for Outage, Restore Fault Tolerance, Resolve Outage, Operational. Components covered: Database, Oracle9iAS, OS, Storage, Network]


HA and Performance

Combining high availability and performance
– Secondary site with identical configuration as primary site
– Network bandwidth and latency between sites
– Data Guard protection mode
– Instance recovery time


Network Bandwidth / Latency

Network bandwidth and latency between sites influence commit response time

Longer network latency will increase response time
– Remote write = network round trip time + local write I/O time at secondary site

Network bandwidth should be greater than maximum redo generation rate
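
The two rules on this slide can be worked through numerically. The sketch below is an illustration only: the function names and every sample figure (round trip times, write time, redo rate, link speed) are assumptions, not measurements from the presentation.

```python
def remote_write_ms(network_rtt_ms: float, standby_write_ms: float) -> float:
    """Remote write time as stated on the slide:
    network round trip time + local write I/O time at the secondary site."""
    return network_rtt_ms + standby_write_ms


def link_has_headroom(link_mbit_per_s: float, peak_redo_mbyte_per_s: float) -> bool:
    """True when link bandwidth exceeds the maximum redo generation rate."""
    peak_redo_mbit_per_s = peak_redo_mbyte_per_s * 8  # convert bytes to bits
    return link_mbit_per_s > peak_redo_mbit_per_s


# Hypothetical figures for illustration only.
nearby_site = remote_write_ms(network_rtt_ms=2.0, standby_write_ms=5.0)
distant_site = remote_write_ms(network_rtt_ms=60.0, standby_write_ms=5.0)
print(nearby_site, distant_site)  # 7.0 65.0

print(link_has_headroom(link_mbit_per_s=100.0, peak_redo_mbyte_per_s=4.0))   # True
print(link_has_headroom(link_mbit_per_s=100.0, peak_redo_mbyte_per_s=20.0))  # False
```

With these assumed numbers, moving the secondary site farther away raises the remote write time from 7 ms to 65 ms even though the disk write at the secondary site is unchanged, so the round trip dominates.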


Database Protection Modes

Balance performance with level of protection from human error, data failures, and disasters

Maximum Protection and Maximum Availability modes
– No-data-loss protection, but can have a performance impact on production service levels

Maximum Performance mode
– Data loss possible, but less impact on production service levels
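
The trade-off described on this slide can be written down as a small lookup table. The sketch below only encodes what the slide states; the class names, the `relative_impact` labels, and the `modes_meeting` helper are hypothetical and are not part of any Oracle interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class ProtectionMode(Enum):
    MAXIMUM_PROTECTION = "Maximum Protection"
    MAXIMUM_AVAILABILITY = "Maximum Availability"
    MAXIMUM_PERFORMANCE = "Maximum Performance"


@dataclass(frozen=True)
class Tradeoff:
    no_data_loss: bool     # as stated on the slide
    relative_impact: str   # "higher" or "lower" impact on production service levels


# Encodes the slide's two statements; nothing beyond them.
TRADEOFFS: Dict[ProtectionMode, Tradeoff] = {
    ProtectionMode.MAXIMUM_PROTECTION: Tradeoff(True, "higher"),
    ProtectionMode.MAXIMUM_AVAILABILITY: Tradeoff(True, "higher"),
    ProtectionMode.MAXIMUM_PERFORMANCE: Tradeoff(False, "lower"),
}


def modes_meeting(require_no_data_loss: bool) -> List[ProtectionMode]:
    """List the modes compatible with a no-data-loss requirement."""
    return [mode for mode, t in TRADEOFFS.items()
            if t.no_data_loss or not require_no_data_loss]


print([m.value for m in modes_meeting(require_no_data_loss=True)])
```

Starting from the business requirement (is any data loss acceptable?) and then checking the performance cost follows the order in which the slide frames the decision.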


Instance Recovery Time

Balance performance with level of protection from system faults and crashes

Short instance recovery times can be achieved with negligible impact on performance
– Provided sufficient I/O capacity exists to handle additional data block writes generated

Fast-start checkpointing makes instance recovery time-bounded and predictable
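
The link between bounded recovery time and additional data block writes can be seen in a toy model. This is not Oracle's checkpointing algorithm: it assumes recovery work is proportional to the backlog of changed blocks not yet written to disk, and that a cap on that backlog is enforced by issuing extra writes. All names and rates are hypothetical.

```python
from typing import Optional, Tuple


def simulate_checkpointing(dirty_per_sec: int,
                           baseline_writes_per_sec: int,
                           max_backlog: Optional[int],
                           seconds: int) -> Tuple[int, int]:
    """Toy model returning (peak_backlog, total_extra_writes).

    backlog     = changed blocks not yet written (work left for recovery)
    max_backlog = cap enforced by extra writes; None means no cap
    """
    backlog = 0
    peak = 0
    extra_writes = 0
    for _ in range(seconds):
        backlog += dirty_per_sec
        backlog -= min(backlog, baseline_writes_per_sec)
        if max_backlog is not None and backlog > max_backlog:
            # Extra writes are issued to bring the backlog back to the cap.
            extra_writes += backlog - max_backlog
            backlog = max_backlog
        peak = max(peak, backlog)
    return peak, extra_writes


# Hypothetical workload: 100 blocks dirtied per second, 80 written per second.
print(simulate_checkpointing(100, 80, max_backlog=None, seconds=60))  # (1200, 0)
print(simulate_checkpointing(100, 80, max_backlog=200, seconds=60))   # (200, 1000)
```

Without a cap the backlog, and with it the recovery work, grows for as long as the workload runs. With a cap the recovery work is bounded, at the price of extra writes that the I/O subsystem must be able to absorb.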


Instance Recovery Time

[Chart: writes/sec and tps, on a scale from 0 to 900, for the settings disabled, 300, 180, and 90]


MAA Test Lab
Oracle, Sun, HP, EMC, F5

[Architecture diagram. Labels: WAN Traffic Manager, Dedicated Network, Primary Site, Secondary Site, Oracle9iAS, RAC, Data Guard. Vendors shown: F5 Networks, EMC, Hewlett-Packard, Sun Microsystems]


Maximum Availability Architecture

Best Oracle High Availability Architecture
– What to use

Best Practices
– How to build it
– How to manage it
– How to fix it


MAA Information Sources

Oracle Technology Network
– High Availability Collateral section
  Maximum Availability Architecture - Overview
  Maximum Availability Architecture - The Details
  http://otn.oracle.com/deploy/availability/techlisting.html

Oracle Consulting
– Advanced Technologies Solutions (ATS) Group
  http://otn.oracle.com/consulting/9iServices/content.html


Next Steps
Sessions by Oracle Database Development

Monday

RAC: The Present, The Future, but not Science Fiction
Mon, 1pm -- Moscone Room 103

Running Your Applications on Oracle Real Application Clusters
Mon, 11am -- Moscone Room 134

Real Customers, Real Application Clusters, Real Results
Mon, 4pm -- Moscone Room 134

Deploying A Highly Manageable Oracle Real Application Clusters Database
Mon, 5:30pm -- Moscone Room 134

Tuesday

Breaking All the Rules with The Unbreakable Database
Tue, 11am -- Moscone Room 103

Oracle’s Recipe For Building An Unbreakable System
Tue, 1pm -- Moscone Room 134

Bullet-Proof Data Protection with Oracle Data Guard
Tue, 4pm -- Moscone Room 134

For More Info On Oracle HA Go To http://otn.oracle.com/deploy/availability/


Next Steps
Sessions by Oracle Database Development

Wednesday

Getting Under The Hood With Data Guard SQL Apply
Wed, 8:30am -- Moscone Room 134

LogMiner, Flashback Query and Online Redefinition: Power Tools For DBAs
Wed, 11am -- Moscone Room 134

Are You Using The Best To Protect Your Enterprise Data?
Wed, 4pm -- Moscone Room 252

Oracle LogMiner - Not Just An Error Recovery Tool
Wed, 5:30pm -- Moscone Room 102


Database HA Demos All Four Days In The Oracle Demo Campground
– Real Application Clusters
– Data Guard
– Backup & Recovery with Recovery Manager
– LogMiner, Flashback Query and Online Redefinition


Next Steps
Sessions by Oracle Database Development

Showcase Presentation/Demo

11:00 AM -- Database High Availability: Data Guard
11:30 AM -- Database High Availability: Backup & Recovery and Recovery Manager
12:00 PM -- Database High Availability: Online Reorg, Flashback Query and LogMiner

11:00 AM -- Real Application Clusters: Scalability
11:30 AM -- Real Application Clusters: High Availability
12:00 PM -- Real Application Clusters: CFS on Linux

11:00 AM -- Real Application Clusters: Scalability
11:30 AM -- Real Application Clusters: High Availability
12:30 PM -- Database High Availability: Data Guard

Monday

Tuesday

Wednesday



Q & A
Questions and Answers
