RAC - The Savior of DBA


Upload: nikhil-kumar

Post on 16-Apr-2017


TRANSCRIPT

Page 1: RAC - The Savior of DBA

Real Application Cluster
RAC - The Savior of DBA
Presenter: Nikhil Kumar

Page 2: RAC - The Savior of DBA

2

WHO AM I?

Nikhil Kumar (DBA Manager)
6 years of experience in Oracle Databases and Apps.
Oracle Certified Professional, Oracle 9i and 11g.
Worked on mission-critical telecom (Vodafone and IDEA), financial ERP, manufacturing, and government domains.

Page 3: RAC - The Savior of DBA

3

AGENDA

1. Introduction to Real Application Cluster

2. RAC - Savior of DBA

3. Oracle Clusterware (Platform on Platform)

4. RAC Startup sequence

5. RAC Architecture

6. RAC Components

7. Single Instance on RAC

8. Node Eviction

9. Important log directories in RAC

10. Tips to monitor and improve the RAC environment

Page 4: RAC - The Savior of DBA

4

I AM RAC - Real Application Cluster

RAC allows multiple computers to run the Oracle RDBMS simultaneously while accessing a single database, thus providing clustering: multiple instances for a single database.

A set of interconnected servers (nodes) acting as a single server, transparent to users.

High availability, scalability, and ease of administration.

Nodes should be identical to satisfy the cluster environment requirements (OS, users, network segment, etc.).

CRS manages node addition and removal (from one node up to 100 nodes).

Instances on different nodes write to the same physical database.

RAC architecture enables users and applications to benefit from the processing power of multiple machines.

Page 5: RAC - The Savior of DBA

5

The control files, data files, redo log files, temp files, and SPFILE reside on shared storage.

Every instance has its own redo log files and undo segments.

Every instance has its own set of background processes at both the clusterware and database level.

All caches (DB buffer cache, data dictionary cache, and library cache) are synchronized by Cache Fusion, and resources are managed globally.

Backup and recovery can be performed from any node of the cluster.

Users can connect to any node.

I AM RAC Cont…

Page 6: RAC - The Savior of DBA

6

Project detail: a mission-critical OLTP database (telecom or banking) with a database availability SLA of 99.99% or 100%.

Maintenance requirements:
OS patching or a scheduled bounce of the OS.
Database maintenance patches (CPU or PSU).
Static database parameter changes (due to a bug or a requirement arising over time).
Hardware upgrade or replacement.
Hard disk failure, power failure, or system failure.
Prevention of a single point of failure.

Can we fulfill these requirements with a single-instance database?

RAC - The Savior of DBA

Page 7: RAC - The Savior of DBA

Oracle Clusterware (Platform on Platform)

Oracle Clusterware is the infrastructure that provides the platform for the Oracle database to run in shared (active-active) mode. It acts as a platform on top of the OS platform to provide database availability in shared mode.

Page 8: RAC - The Savior of DBA

Oracle Clusterware Cont..

Benefits of Clusterware:

Oracle Clusterware detects issues and evicts the problematic node to resolve them.
It restarts services if they stop for any reason.
It adds additional nodes as the business requires (pay as you grow).
It registers services or single-instance databases for automatic management.
It can fail required services over to a surviving node to provide high availability.
It minimizes planned and unplanned downtime (24x7 database availability).
Automatic load balancing, scale-out, and speed-up.

Page 9: RAC - The Savior of DBA

9

INIT spawns init.ohasd (with respawn), which in turn starts the OHASD process. This daemon spawns four agent processes.

Level 1: OHASD spawns:

cssdagent - Agent responsible for spawning CSSD.

orarootagent - Agent responsible for managing all root owned ohasd resources.

oraagent - Agent responsible for managing all oracle owned ohasd resources.

cssdmonitor - Monitors CSSD and node health (along with the cssdagent).

Level 2: OHASD orarootagent spawns:

CRSD - Primary daemon responsible for managing cluster resources.

CTSSD - Cluster Time Synchronization Services Daemon

Diskmon

ACFS (ASM Cluster File System) Drivers 

Level 2: OHASD oraagent spawns:

MDNSD - Used for DNS lookup

GIPCD - Used for inter-process and inter-node communication

GPNPD - Grid Plug & Play Profile Daemon

EVMD - Event Monitor Daemon

ASM - Resource for monitoring ASM instances

Level 3: CRSD spawns:

orarootagent - Agent responsible for managing all root owned crsd resources.

oraagent - Agent responsible for managing all oracle owned crsd resources.

Level 4: CRSD orarootagent spawns:

Network resource - To monitor the public network

SCAN VIP(s) - Single Client Access Name virtual IPs

Node VIPs - One per node

ACFS Registry - For mounting the ASM Cluster File System

GNS VIP (optional) - VIP for GNS

Level 4: CRSD oraagent spawns:

ASM Resource - ASM instance(s) resource

Diskgroup - Used for managing/monitoring ASM diskgroups.  

DB Resource - Used for monitoring and managing the DB and instances

SCAN Listener - Listener for single client access name, listening on SCAN VIP

Listener - Node listener listening on the Node VIP

Services - Used for monitoring and managing services

ONS - Oracle Notification Service

eONS - Enhanced Oracle Notification Service

GSD - For 9i backward compatibility

GNS (optional) - Grid Naming Service - Performs name resolution

For more information, please refer to 11gR2 Clusterware and Grid Home - What You Need to Know (Doc ID 1053147.1).

RAC Startup Sequence
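The startup levels above can be condensed into a small tree. This is an illustrative Python sketch (not Oracle code), and only the main daemons from the slide are included:

```python
# Condensed sketch of the clusterware startup hierarchy described above.
# Summarized: not every daemon/resource from the slide is listed.
SPAWN_TREE = {
    "OHASD": {
        "cssdagent": {"CSSD": {}},
        "cssdmonitor": {},
        "orarootagent": {
            "CRSD": {"orarootagent": {}, "oraagent": {}},
            "CTSSD": {},
            "Diskmon": {},
        },
        "oraagent": {"MDNSD": {}, "GIPCD": {}, "GPNPD": {}, "EVMD": {}, "ASM": {}},
    }
}

def print_tree(tree, depth=0):
    """Print each daemon indented under the process that spawns it."""
    for name, children in sorted(tree.items()):
        print("  " * depth + name)
        print_tree(children, depth + 1)

print_tree(SPAWN_TREE)
```

Walking the tree top-down mirrors what you see when the stack comes up: the agents at each level are responsible for everything nested beneath them.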

Page 10: RAC - The Savior of DBA

RAC Architecture

Page 11: RAC - The Savior of DBA

11

Clusterware Components

Oracle Clusterware high-availability components:

1. OCR
2. Voting Disk

Oracle Clusterware network configuration:

3. SCAN: SCAN stands for "Single Client Access Name". It is a new and mandatory feature introduced in RAC 11gR2. It provides a single name through which clients access the database running on the cluster; the client tnsnames.ora configuration does not need to change when nodes are added to or removed from the cluster.

Network requirement for SCAN: SCAN requires a single name that resolves to three IP addresses registered in DNS and returned on a round-robin basis.

The SCAN name is supplied during the grid installation interview phase. After installation, three SCAN listeners are created, and they can relocate between nodes according to load. SCAN listeners run from the grid home and depend directly on the SCAN VIPs.

For more information, see Grid Infrastructure Single Client Access Name (SCAN) Explained (Doc ID 887522.1).
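Because the SCAN name resolves to three IPs served round-robin by DNS, client connections naturally spread across the SCAN listeners. A toy Python simulation of that behavior (the IP addresses are made up, and this is not Oracle code):

```python
# Toy illustration of DNS round robin over the three SCAN IPs.
from itertools import cycle

SCAN_IPS = ["192.168.10.101", "192.168.10.102", "192.168.10.103"]

def dns_round_robin(ips):
    """Yield the address list rotated one position per lookup,
    the way a round-robin DNS server answers queries."""
    rotation = cycle(range(len(ips)))
    while True:
        start = next(rotation)
        yield ips[start:] + ips[:start]

resolver = dns_round_robin(SCAN_IPS)
first_answers = [next(resolver)[0] for _ in range(6)]
print(first_answers)
# each SCAN IP is offered first twice across the six lookups
```

Since most clients connect to the first address in the DNS answer, rotating the list is what distributes new connections across the three SCAN listeners.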

Page 12: RAC - The Savior of DBA

12

Clusterware Components Cont..

Traditional entry in the TNSNAMES.ORA file:

New SCAN-based entry in tnsnames.ora:

SCAN parameter defined in the database:

Parameter defined at the database level:
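The original slide showed screenshots for the entries above. As a hedged substitute, here is what the two tnsnames.ora styles typically look like; the host names, port, and service name are hypothetical:

```
# Hypothetical hosts/service. Pre-SCAN entry: every node VIP listed per client.
ORCL =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
      (LOAD_BALANCE = yes))
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = orcl)))

# SCAN-based entry: one name, no per-node hosts to maintain on the client.
ORCL_SCAN =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = orcl)))
```

On the database side, the SCAN address is typically set in the REMOTE_LISTENER parameter, while each instance's node VIP address goes in LOCAL_LISTENER.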

Page 13: RAC - The Savior of DBA

13

Clusterware Components Cont..

2. VIP: a virtual IP address used mainly for connection failover in case of node eviction. A VIP can float between nodes. The VIP address is set in the LOCAL_LISTENER parameter.

3. Public IP: the public IP address of the physical machine (node).

4. Private IP: the private IP address used for internal communication between the interconnected nodes, which are connected through a high-speed switch.

Interconnect: the cluster interconnect is a very important private network used for inter-node communication: heartbeats (network pings) and the memory channel between the nodes. Network pings are performed by the CSSD service.

It supports Cache Fusion. Interconnect wait events will appear in the AWR report if a low-speed switch is used (a 100 Gb/s switch is recommended).

Page 14: RAC - The Savior of DBA

14

The process of fusing the buffer caches of more than one instance to satisfy block requests from other nodes is called Cache Fusion. Cache Fusion uses a high-speed IPC interconnect to provide cache-to-cache transfers of data blocks between instances in a cluster. This data block shipping eliminates disk I/O and optimizes read/write concurrency.

Processes involved in Cache Fusion:

Global Cache Service (GCS): the process responsible for transferring blocks from one node to another.
Global Enqueue Service (GES): holds information about locks on the buffers; it also performs distributed deadlock detection.
Global Resource Directory (GRD): present on each instance of the cluster. It keeps a list of buffers and the node on which each is mastered.

If an object is used very frequently by one specific node, that node becomes the master of the object, and the same information is propagated to the GRD on all nodes. When the same block is later requested, the GRD is read to locate the master node, so the block can be fetched quickly.

Cache Fusion
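The GRD mastering behavior described above can be sketched as a toy model. This is illustrative Python only (not Oracle code); the remastering threshold is a made-up number:

```python
# Toy Global Resource Directory: tracks which instance masters each block
# and remasters a block to a node that accesses it heavily.
from collections import Counter

class ToyGRD:
    REMASTER_THRESHOLD = 3  # accesses from one node before remastering (made up)

    def __init__(self, nodes):
        self.nodes = nodes
        self.master = {}          # block_id -> mastering node
        self.access = Counter()   # (block_id, node) -> access count

    def request_block(self, block_id, node):
        # First touch: assign an initial master by hashing the block id.
        if block_id not in self.master:
            self.master[block_id] = self.nodes[hash(block_id) % len(self.nodes)]
        self.access[(block_id, node)] += 1
        # Dynamic remastering: a node that accesses the block often becomes master.
        if self.access[(block_id, node)] >= self.REMASTER_THRESHOLD:
            self.master[block_id] = node
        return self.master[block_id]

grd = ToyGRD(["node1", "node2"])
for _ in range(3):
    owner = grd.request_block("block42", "node2")
print(owner)  # node2 - after repeated access, node2 masters block42
```

The real GRD also tracks lock modes and past images, but the lookup-then-remaster idea is the same: requests consult the directory to find the master instead of going to disk.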

Page 15: RAC - The Savior of DBA

15

Oracle Cluster Registry (OCR)

The Oracle Cluster Registry stores configuration information for Oracle Clusterware and RAC database resources (database, VIP, listener, disk group, SCAN, and other services). The OCR is created at clusterware installation time. Automatic backups of the voting disk and OCR are kept together in a single file, and they are taken every four hours, at the end of each day, and at the end of each week.

The OCR is managed mainly by three command-line utilities:
a) ocrcheck
b) ocrconfig
c) ocrdump


Page 16: RAC - The Savior of DBA

16

The voting disk manages information about node membership. Each voting disk must be accessible by all nodes in the cluster. Its primary purpose is to help in situations where private network communication fails.

Clusterware has two types of heartbeat, both monitored by the CSSD service in two-way communication:

1. Network heartbeat - over the private interconnect
2. Disk heartbeat - voting-disk-based communication

Each node in the cluster is pinged every second.

Voting Disk
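The two heartbeats above can be sketched in a few lines. This is a toy model, not Oracle code; the timeouts assume the defaults commonly cited for 11g on Linux (CSS misscount 30s for the network heartbeat, disktimeout 200s for the disk heartbeat), so verify them for your version:

```python
# Toy sketch of CSSD-style heartbeat checking with assumed 11g defaults.
MISSCOUNT_S = 30     # seconds without a network heartbeat before eviction
DISKTIMEOUT_S = 200  # seconds without a disk (voting disk) heartbeat

def node_healthy(net_hb_age_s, disk_hb_age_s):
    """A node stays in the cluster only while BOTH heartbeats are fresh."""
    return net_hb_age_s <= MISSCOUNT_S and disk_hb_age_s <= DISKTIMEOUT_S

print(node_healthy(1, 1))    # True  - pinged within the last second
print(node_healthy(45, 1))   # False - network heartbeat missed too long
```

The one-second ping interval mentioned above is what keeps these "heartbeat age" values near zero on a healthy node.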

Page 17: RAC - The Savior of DBA

17

Voting Disk Cont…

Some facts about the voting disk:

The voting disk contains two types of data:
Static: node membership information in the cluster.
Dynamic: disk heartbeat logging.
Voting disks can be added or replaced dynamically.
Since 11.2.0.1 the voting disk has its own identity, so backing it up with the "dd" command is no longer supported.

The voting disk and OCR can be kept in the same disk group or in different disk groups.
You must have a root or sudo privileged account to manage them.

Page 18: RAC - The Savior of DBA

18

Node Eviction

Node eviction is a RAC feature that prevents the whole cluster from hanging or going down. If a node fails to pass its heartbeat to the other nodes or to the voting disk, that node is evicted from the cluster. This is done to avoid a split-brain condition. It is one of the important RAC mechanisms for preserving database consistency, so that no node can write to the database files independently.

Which node gets evicted: voting-disk and heartbeat communication is used to determine the node. Once it is determined which node needs to be evicted, CSSD is requested to kill itself to evict the node from the cluster. If CSSD is hung or not responding, OCSSDMONITOR takes over and kills it to evict the node.

Which node survives: in a two-node cluster, the instance with the lowest node number survives. In a cluster of more than two nodes, the biggest subcluster survives: the subcluster that has access to the maximum number of voting disks in the cluster.

See also MOS note 1050693.1 - Troubleshooting 11.2 Clusterware Node Evictions.
See also MOS note 1549954.1 - RAC Node Eviction Troubleshooting Tool.
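The survival rule above can be captured in a short sketch. Illustrative Python only (not Oracle code); real clusterware also weighs voting-disk access, which is simplified away here:

```python
# Toy model of the node-survival rule: the largest subcluster wins,
# with ties broken by the lowest node number (covers the 2-node case).
def surviving_subcluster(subclusters):
    """subclusters: lists of node numbers that can still talk to each
    other after an interconnect failure splits the cluster."""
    return max(subclusters, key=lambda sc: (len(sc), -min(sc)))

# 2-node cluster splits into two single-node subclusters:
print(surviving_subcluster([[1], [2]]))           # [1] - lowest node number
# 5-node cluster splits 2 vs 3:
print(surviving_subcluster([[1, 2], [3, 4, 5]]))  # [3, 4, 5] - biggest subcluster
```

Breaking ties toward the lowest node number is what makes the two-node case ("node 1 survives") fall out of the same rule as the n-node case.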

Page 19: RAC - The Savior of DBA

19

Re-Bootless Node Eviction

Prior to 11gR2, clusterware node eviction meant a reboot of the problematic node. From 11gR2, node eviction means a graceful restart of the clusterware stack instead of a system/node reboot for failures such as a missed interconnect heartbeat. All clusterware and database processes (mainly I/O processes) on the problematic node are killed. Once all processes have stopped successfully, the OHASD service restarts the cluster stack.

Scenarios in which a full node reboot is still required:

If the kill of an I/O process cannot be verified as successful, the node is rebooted.
If the CSSD service is killed during this process, a reboot is performed.
If the clusterware stack shutdown takes too long, a reboot is performed.

Page 20: RAC - The Savior of DBA

20

Single Instance on RAC

Suppose you currently run a small business and expect it to grow in the near future; you think "I don't need a RAC database today, but in two months I may need to migrate to RAC."

Migration factors:
Migration complications moving from a single-instance machine to a RAC machine.
Performance issues.
Extra resource utilization.

What should I do? The answer is that you can run a single instance on Oracle Clusterware. When the business grows in terms of users and transactions and you find the single-instance database is hitting performance limits, you can convert it from a non-cluster database to a cluster database using the following tools:

1. the rconfig utility
2. DBCA
3. EM Grid Control

All of the above tools can be used to convert a non-clusterware database into a clusterware database.

Page 21: RAC - The Savior of DBA

21

Important Log Locations

Clusterware daemon logs are all under <GRID_HOME>/log/<nodename>. Structure under <GRID_HOME>/log/<nodename>:

alert<NODENAME>.log - look here first for most clusterware issues.
./admin:
./agent:
./agent/crsd:
./agent/crsd/oraagent_oracle:
./agent/crsd/ora_oc4j_type_oracle:
./agent/crsd/orarootagent_root:
./agent/ohasd:
./agent/ohasd/oraagent_oracle:
./agent/ohasd/oracssdagent_root:
./agent/ohasd/oracssdmonitor_root:
./agent/ohasd/orarootagent_root:
./client:
./crsd:
./cssd:

./ctssd:

./diskmon:

./evmd:

./gipcd:

./gnsd:

./gpnpd:

./mdnsd:

./ohasd:

./racg:

./racg/racgeut:

./racg/racgevtf:

./racg/racgmain:

./srvm:

The cfgtoollogs directories under <GRID_HOME> and $ORACLE_BASE contain other important log files, specifically for rootcrs.pl and configuration assistants such as ASMCA.

ASM logs live under $ORACLE_BASE/diag/asm/+asm/<ASM Instance Name>/trace

The diagcollection.pl script under <GRID_HOME>/bin can be used to automatically collect the important files for support. Run it as the root user.

Important log directories on RAC

Page 22: RAC - The Savior of DBA

22

Tips to monitor and improve the RAC environment

There are many tools that can help you monitor and improve your RAC configuration.

1. ORAchk Tool: proactively scans for the most impactful problems across the various layers of your stack. It performs checks at the OS, cluster, database, and network level and provides solutions accordingly.

It ships with the clusterware binaries, but it is recommended to download the latest version from Oracle Support. Also refer to ORAchk - Oracle Configuration Audit Tool (Doc ID 1268927.2).

2. OSWatcher and/or CHM (Cluster Health Monitor): these tools monitor operating-system and clusterware-level processes and record them in logs according to your retention policy. Also refer to OSWatcher (Includes: [Video]) (Doc ID 301137.1).

3. RAC AWR and ADDM reports: RAC comes with a few additional AWR report scripts.

awrgrpt: gives you detailed information about all nodes of the cluster in a single report.

awrgdrpt: gives you a difference report between two AWR reports, to help with performance issues.

addmrpt: gives you recommendations to improve SQL queries, the database, and cluster components.


Page 23: RAC - The Savior of DBA

Q/A

Page 24: RAC - The Savior of DBA

Thank You