04/08/23
DB2 Recovery Solutions & More
Bill Arledge
DB2 Data Management Analyst
BMC Software
04/08/23 ©2006 BMC Software2
When Availability is Critical, Recovery is Crucial!
Unplanned downtime is an unfortunate fact of life...
Up to 80% of all unplanned downtime is caused by software or human error*
Up to 70% of recovery is “think time”!
*Source: Gartner, “Aftermath: Disaster Recovery”, Vic Wheatman, September 21, 2001
[Chart: share of recovery time by phase: Detect 20%, Diagnose 20%, Build 30%, Recover 30%]
Recovery is a Real Challenge
› Cost of Downtime varies
  • By Industry
  • By Business Cycle
› Staff Productivity and Expertise pressures
  • Harder to get and keep good technicians
  • Recovery is a ‘part time’ job, skills may wane
  • A lot of hours can go into DR test ‘preparations’
› Planned downtime (backups) pressures
  • Consistent Copies/Dumps require outage
  • Even a brief outage may impact business
› Unplanned outages happen at painful times
Recovery Elements
[Diagram: APPLICATION OUTAGE timeline: FAILURE, ANALYSIS, CREATE RECOVERY JCL, EXECUTE RECOVERY. Other labels: RECOVERY MANAGEMENT, FAST UTILITIES]
Let’s think out of the BOX!
– Who says the only way to recover to a PiT is forward recovery?
  • What if you could ‘avoid’ recovery for some objects in an application?
  • What if you could exploit storage technology instead of tape copy?
  • What if you could go ‘backwards’ through the log?
  • What if you want to make the bad SQL just go away by ‘undoing’ it?
– This presentation will show how to use BMC Software to reduce or eliminate the downtime for Backup, Recovery, Replication, and Batch Restart
[Diagram: forward recovery timeline: Image Copy (Recovery started), 23 Hours of Good Transactions, Recovery Point, 1 Hr of Bad Transactions; forward recovery must Apply 23 Hours of Log]
Recovery Management for DB2
Solving Business Problems with Innovation and Automation
› Solution Integration
  – Building on core components to leverage BMC technology
› Backups – High availability techniques for necessary process
  – COPY PLUS Value Proposition
  – Snapshot Copy (Software, Hardware, Instant Snapshot)
  – Hybrid Copy Technique – mix and match for effective backup and recovery
  – Cabinet Copy – dramatic reduction in elapsed and CPU time
  – Encrypted Image copy – secure offsite tape storage
  – Online Consistent Copy – clean copy with NO Quiesce
› Application Recovery – speed and automation for an infrequent event
  – Recovery Management interface
  – Innovative forward and PIT recovery techniques – Index automation, Backout, Backup & Recovery Avoidance, Timestamp Recovery, Log accumulation
  – Creative uses for the DB2 Log data – Reporting, UNDO, Migration
› Disaster Recovery
  – Local site preparation automation
  – Remote site System and Application Recovery automation
  – DR Reporting, Estimation, Simulation
  – Remote Recovery and Replication
Recovery Management for DB2
› Top ROI Features
  – Large Application Group (e.g. SAP) Backup and Recovery Automation*
  – Snapshot Copy
    • Software, Hardware*, Instant Snapshot*, Instant Restore*
  – Online Consistent Copy*
  – Index Copy and Recover Automation*
  – Physical Backout*
  – Backup and Recovery Avoidance*
  – Log Manipulation
    • Extract & store by filter, Reports (including Data change Audit), Logical UNDO
  – Dropped Object Recovery*
  – Disaster Recovery and Coordinated Recovery automation*
  – Disaster Recovery Reporting*
  – Recovery Time Estimation*
  – Recovery Simulation*
  – Timestamp Recovery*
  – DB2 Version 8 Exploitation (more later!)
BMC Recovery Management for DB2 Solving Business Problems with Innovation and Automation
Integrating the capabilities of mature, patented functions.
Recovery Management for DB2
Intelligent * Integrated * Automated * Optimized
Components:
• SNAPSHOT UPGRADE FEATURE®
• RECOVER PLUS for DB2
• COPY PLUS for DB2
• Log Master™ for DB2
• RECOVERY MANAGER for DB2
Capabilities:
• Job Generation and Management
• High Speed Backup Technology
• Log Analysis Redo/Undo
• Fast, Dependable Recoveries
The Sum is greater than the parts
Solution Exclusives
•Recovery Simulation
•Recovery Estimation
•Disaster Recovery Tracking and Reporting
•Backout to Forward Recovery Automation
•DR Mirror Management
•Online Consistent Copy
•Inflight Recovery (Recover to ANY Timestamp)
•Encrypted Image Copy
•Cabinet Copy
•DB2 9 Support
COPY PLUS for DB2
[Chart: COPY COMPARISON, IBM 8.1 vs. BMC, Elapsed Time and CPU Time in hours:minutes:seconds. Data labels as extracted: 1:29:55, 1:05:40, 0:02:06, 0:00:47]
DB2 OBJECT:
• 6 Part Table space
• Average Row Length 1012 Secondary Indexes
• 60 Million Rows
• 8,644,513 Pages/~33GB
BMC OPTIONS USED:
Options Maxtasks 6
Output Tcopy Unit CARTVTS, STACK YES, Shrlevel Change, Indexes Yes, Resetmod no, Group Yes
IBM OPTIONS USED:
Shrlevel Change
Copy DDN(Tape1)
Parallel (6)
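The page count and the ~33GB figure on this slide can be cross-checked with a little arithmetic. The sketch below assumes 4 KB pages; the slide does not state the page size, so that value is an assumption.

```python
# Assumption: 4 KB pages (the slide does not state the page size).
PAGE_SIZE_BYTES = 4 * 1024
pages = 8_644_513  # page count quoted on the slide

size_bytes = pages * PAGE_SIZE_BYTES
size_gib = size_bytes / 1024**3

# Roughly 33 GiB, which is consistent with the "~33GB" on the slide.
print(f"{size_gib:.1f} GiB")
```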
3 Types of Snapshot from BMC
– Software snapshot
  • Brief outage required for ‘clean’ copy, known to the database
  • Makes a DBMS image copy, typically tape
  • Restore from tape or disk
  • Exploits processor cache
– Hardware snapshot
  • Uses volume mirrors or data set snaps as source for DBMS copies
  • Brief outage required for ‘clean’ copy, known to the database
  • Restore from tape or disk
– Instant Snapshot
  • Uses hardware data set snaps to create disk-resident backup
  • Note – NOT a DBMS formatted image copy (but can copy image copy)
  • Brief outage required for ‘clean’ copy, known to the database
  • NO OUTAGE for ‘fuzzy’ copy, known to the database
Instant Restore from Disk – in SECONDS!!
Hybrid Copy Illustration with STACK CABINET
[Diagram: BMC COPY backs up A Few Large IMS/DB2/VSAM Databases as Instant Snapshots and Many Small IMS/DB2/VSAM Databases into a Single disk Copy dataset; BMC RECOVER uses these copies and the IMS/DB2 LOGS to produce the Recovered Databases]
Hybrid Copy DB2 Example with STACK CABINET
› Part cabinet copy, part Instant Snapshot Copy
› One BMC Copy statement:
  COPY TABLESPACE DB.* SHRLEVEL CHANGE STACK CABINET
  – OUTSIZE parm drives large objects to DSSNAP, smaller to single cabinet copy dataset
• Generated copy for 1828 data sets
  – 198 Instant Snapshots, 1630 regular copies to cabinet copy dataset
• Less than 17 Minutes elapsed time (NO OUTAGE)
• Very little CPU time
• Recovery time for entire application – less than 1 hour
  – Without BMC, recovery time with DSNUTILB over 360 minutes – 6 HOURS!!
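The size-based routing that the OUTSIZE parm performs can be pictured with a small sketch. This is not COPY PLUS code: the `DataSet` type, the function name, the megabyte unit, and the "at or above the threshold" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str      # hypothetical data set name
    size_mb: int   # hypothetical size in megabytes


def plan_hybrid_copy(data_sets, outsize_mb):
    """Split data sets into (instant_snapshots, cabinet_copies) by size.

    Assumption: objects at or above the threshold go to an Instant
    Snapshot, everything smaller goes to the single cabinet copy data set.
    """
    snapshots = [d for d in data_sets if d.size_mb >= outsize_mb]
    cabinet = [d for d in data_sets if d.size_mb < outsize_mb]
    return snapshots, cabinet


# Hypothetical application with one large and two small table spaces.
spaces = [DataSet("DB.TSBIG", 40_000), DataSet("DB.TS1", 12), DataSet("DB.TS2", 3)]
snaps, cabinet = plan_hybrid_copy(spaces, outsize_mb=1_000)
print([d.name for d in snaps], [d.name for d in cabinet])
```

In the run reported on the slide, the same kind of split produced 198 Instant Snapshots and 1630 cabinet copies out of 1828 data sets.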
Encrypted Image Copies
› Satisfies need for SOX compliance to protect financial and customer information
› Encrypted Copies using DES (64bit) or AES (128bit) algorithms
› KEYDSNAME is created at installation with restricted access
  – Holds key, timestamp, optional algorithm identifier, optional comment
› Requires BMC® Recovery Management for DB2 Solution
[Diagram: COPY DB.TS ENCHIPER YES reads the clear data (Joe Blogs 123 45 6789) and, using the keydsn, writes Encrypted Output copies ($je Lb*(1C18 bo 3(7V); RECOVER DB.TS uses the keydsn to restore the clear data (Joe Blogs 123 45 6789)]
Online Consistent Copy
– Used for migration of a consistent set of data
  • Test Database creation
  • Data Warehouse population
– No outage required
– Very fast, exploits intelligent storage technology
– Copy contains only committed data – uncommitted data excluded
– Supports copying a group of spaces at the same point of consistency
– Supports multi-dataset non-partitioned spaces
– Note: CAN be used as input to recovery requiring log apply
Online Consistent Copy
[Diagram: BMC Copy issues a Snap Request through BMC Snapshot; the MVS Operating System has the Storage Device snap the data set of DB2 ‘A’. Log Records are applied to the snapped data set to bring it to a consistent point, and the result is registered as an ‘Online Consistent Copy’.]
• Migrate data to another DB2 (DB2 ‘B’) with RECOVER TOCOPY WITH OBIDXLAT
• RECOVER TOCOPY WITH NO LOGAPPLY
• OCC may be input to UNLOAD PLUS
• OCC can be created with normal copy process
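The idea of turning a fuzzy snap into a copy that holds only committed data can be sketched in a few lines. This is a simplified model, not the BMC implementation: it assumes the snap already reflects every logged change up to the snap point, and that each log record carries a before-image. The record layout and function name are hypothetical.

```python
def consistent_copy(snapshot, log_records, committed_urids):
    """Return a copy of the snapshot with uncommitted changes backed out.

    snapshot:        dict of row key -> value (the fuzzy image)
    log_records:     list of (urid, key, before_value) in log order;
                     before_value is None when the row did not exist
    committed_urids: set of unit-of-recovery ids that committed
    """
    rows = dict(snapshot)  # never modify the snapped image itself
    # Walk the log backwards so the oldest before-image wins.
    for urid, key, before in reversed(log_records):
        if urid in committed_urids:
            continue  # committed work stays in the copy
        if before is None:
            rows.pop(key, None)  # undo an uncommitted insert
        else:
            rows[key] = before   # undo an uncommitted update or delete
    return rows


# Hypothetical example: URID 10 committed, URID 11 was still in flight.
snap = {"a": 1, "b": 99, "c": 5}
log = [(10, "a", 0), (11, "b", 2), (11, "c", None)]
clean = consistent_copy(snap, log, committed_urids={10})
print(clean)
```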
DB2 Recovery Manager - Overview
– ISPF application with DB2 repository tables
  • Access DB2 Recovery Resources and …
    – Group objects for recovery
    – Validate recoverability of objects
    – Specify/Generate recovery jobs
  • Most processes available in Batch
[Diagram: Recovery Manager and its Repository read the DB2 Recovery Resources and generate Backup Jobs, Recovery Jobs, and Disaster Recovery jobs]
DB2 Recovery Resources:
• ICF Catalog
• SYSLGRNG/X
• SYSCOPY / Image Copies
• Active Logs
• Archive Logs
• BSDS
• DB2 Catalog
  – Tablespace
  – Indexes
  – RI structures
Application Recovery - 101
– Without Recovery Manager
  • Determine what objects need to be recovered.
    – By Plan, Volume, Database
    – What objects are included in the above
  • Where do I need to recover to?
    – Image Copy (which one)
    – Quiesce Point
    – Current
  • Build the process …
    – One recovery job or multiple
    – Log being used in recovery
    – Recover Indexes or Rebuild
    – Recover Plus Backout an option?
    – Do all spaces actually need to be recovered
  • Run the jobs
    – Sit and wait … hopefully no abends.
Application Backup and Recovery
Complete Subsystem-wide Backup
[Diagram: ARMBGPS produces Generated Balanced Jobs, which go To Scheduler as Scheduled Jobs and run in INITIATORS to produce Copies]
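One simple way to picture "Generated Balanced Jobs" is a greedy split of objects across a fixed number of jobs by size. The slide does not say how ARMBGPS balances work, so the algorithm, the function name, and the sizes below are assumptions used only to illustrate the idea.

```python
import heapq


def balance_jobs(object_sizes, job_count):
    """Greedy balancing: largest object first, always into the lightest job.

    object_sizes: dict of object name -> size (any consistent unit)
    Returns a list of job_count lists of object names.
    """
    heap = [(0, j) for j in range(job_count)]  # (total size so far, job index)
    heapq.heapify(heap)
    jobs = [[] for _ in range(job_count)]
    for name, size in sorted(object_sizes.items(), key=lambda kv: -kv[1]):
        total, j = heapq.heappop(heap)          # lightest job so far
        jobs[j].append(name)
        heapq.heappush(heap, (total + size, j))
    return jobs


# Hypothetical table spaces and sizes, spread over two jobs.
jobs = balance_jobs({"TS1": 100, "TS2": 60, "TS3": 50, "TS4": 10}, job_count=2)
print(jobs)
```

With these sizes each job ends up with a total of 110, so both initiators finish at about the same time.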
RECOVER Group
RECOVER groups TO Current, Timestamp or RBA
[Diagram: Generated jobs are SUBMITted to INITIATORS, which run the Recoveries against the DB2 Subsystem]
RECOVER PLUS – Fast Forward Recovery
– Page built in memory, only written once
– Simultaneous COPY
– Simultaneous key extract
– INDEX work-area can use memory to reduce I/O
[Diagram: LOG INPUT from Active Logs and Archive Logs goes through LOG SORT; MERGE combines the sorted log with the Full and Incremental Copies to rebuild the Table Space and write new Copies; extracted keys go through a Work Dataset and KEY SORT to INDEX BUILD, which rebuilds the Index Space]
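The log sort and merge flow above can be sketched as follows. This is a toy model rather than RECOVER PLUS internals: pages are small dicts, log records are tuples, and every logged page is assumed to be present in the image copy. All names are hypothetical.

```python
def recover_pages(copy_pages, log_records):
    """Merge sorted log changes into image copy pages, one write per page.

    copy_pages:  list of (page_no, {slot: value}) sorted by page_no
    log_records: list of (lsn, page_no, slot, value) in log order
    Assumption: every page that appears in the log is also in the copy.
    """
    # LOG SORT: order by page, then by log sequence within the page,
    # so the log arrives in the same sequence as the copy.
    sorted_log = sorted(log_records, key=lambda r: (r[1], r[0]))
    i = 0
    written = []
    for page_no, contents in copy_pages:
        page = dict(contents)  # build the page in memory
        while i < len(sorted_log) and sorted_log[i][1] == page_no:
            _lsn, _page, slot, value = sorted_log[i]
            page[slot] = value  # apply every change for this page
            i += 1
        written.append((page_no, page))  # the page is written exactly once
    return written


# Hypothetical copy and log: page 1 is updated twice, page 2 once.
copy = [(1, {"a": 1}), (2, {"b": 2})]
log = [(30, 2, "b", 9), (10, 1, "a", 5), (20, 1, "a", 7)]
result = recover_pages(copy, log)
print(result)
```

Applying the log in raw log order would touch page 1 twice; sorting first lets each page be completed in memory and written a single time.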
Another Recovery Resource
R+/CHANGE ACCUM Job:
• Extract tablespace UNDO/REDO records for specified objects
• Sort into merge sequence (same sequence as copy output)
• Output: Change Accumulation File
R+/CHANGE ACCUM:
• Preprocesses/sorts log data to optimize log apply
• Allows all other RECOVER PLUS options
• No availability outage
• No Tablespace STOPS, Locks, or Drains
• Can be used in lieu of frequent copies
Index Copy and Recovery Automation
› Some Indexes are better Recovered than Rebuilt
  • Non-partitioned Indexes can have LARGE record counts
  • Rebuild requires scan of all PARTs
  • INDEXES can be COPIED and RECOVER can apply Log Records
› BMC can help
  • Automatically COPY Indexes based on size-threshold
  • Index copies can be Incremental Copies
  • Automatically RECOVER copied Indexes, REBUILD uncopied indexes
    – User does not have to specify recovery type – we decide
› This can DRAMATICALLY REDUCE recovery time!!
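The two automatic decisions described above (which indexes to copy, and whether to RECOVER or REBUILD each one) reduce to simple rules. The sketch below uses hypothetical index names, a page-count threshold, and function names of its own; the product's actual threshold semantics are not given on the slide.

```python
def plan_index_copies(index_pages, threshold_pages):
    """Pick the indexes worth copying: those at or above the size threshold."""
    return {name for name, pages in index_pages.items() if pages >= threshold_pages}


def plan_index_recovery(index_names, copied):
    """RECOVER an index when a copy exists, otherwise REBUILD it."""
    return {name: ("RECOVER" if name in copied else "REBUILD") for name in index_names}


# Hypothetical indexes: one large non-partitioned index and one small index.
sizes = {"IX_BIG_NPI": 500_000, "IX_SMALL": 120}
copied = plan_index_copies(sizes, threshold_pages=10_000)
plan = plan_index_recovery(sizes, copied)
print(plan)
```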
Point in Time Recovery – Physical Backout
The fastest way to get the database to the point prior to the application error is to back out one hour of log records, rather than restoring the image copy and reapplying 23 hours of log records.
Should BACKOUT fail, automatically do normal Forward Recovery
[Diagram: timeline: Image Copy (forward recovery would start here), 23 Hours of Good Transactions, Recovery Point, 1 Hr of Bad Transactions; Backout 1 Hr of Log returns to the Recovery Point]
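The choice between backout and forward recovery, including the automatic fallback, can be sketched like this. Comparing hours of log is a simplification chosen for illustration, and the function names and callbacks are hypothetical stand-ins for the real utilities.

```python
def choose_direction(hours_copy_to_point, hours_point_to_now):
    """Pick the direction that has to process less log."""
    return "BACKOUT" if hours_point_to_now < hours_copy_to_point else "FORWARD"


def recover_to_point(hours_copy_to_point, hours_point_to_now, run_backout, run_forward):
    """Try backout when it is cheaper; fall back to forward recovery on failure."""
    if choose_direction(hours_copy_to_point, hours_point_to_now) == "BACKOUT":
        try:
            run_backout()
            return "BACKOUT"
        except RuntimeError:
            pass  # "Should BACKOUT fail, automatically do normal Forward Recovery"
    run_forward()
    return "FORWARD"


# The slide's scenario: 23 hours of good log, then 1 hour of bad transactions.
print(choose_direction(hours_copy_to_point=23, hours_point_to_now=1))
```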
Doing Nothing Smarter With XUNCHANGED
› The Fastest Recovery is the one that can be AVOIDED.
› How do We Know?
  – SYSIBM.SYSLGRNX tracks Open for Update ranges for all objects
  – Recovery to a Point in Time is usually an ‘application’ event
    • But not all objects in an application get updated every transaction
  – BMC Recovery Management for DB2 solution can…
    • Read SYSIBM.SYSLGRNX to figure out “what has changed”
    • Issue GENJCL BACKUP XUNCHANGED syntax
    • Issue GENJCL RECOVER XUNCHANGED syntax
  – Only application objects that have changed since the designated Point in Time will be recovered – a sometimes dramatic impact
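The "what has changed" test can be pictured as a filter over open-for-update ranges. The sketch below models those ranges as plain (start, stop) RBA pairs in a dict; it does not read SYSIBM.SYSLGRNX, and the layout and names are assumptions for illustration.

```python
def changed_since(update_ranges, pit_rba):
    """Return the objects that were open for update after the point in time.

    update_ranges: dict of object name -> list of (start_rba, stop_rba);
                   stop_rba is None while the object is still open for update
    An object needs recovery if any range ends after pit_rba or is still open.
    """
    changed = set()
    for obj, ranges in update_ranges.items():
        for _start, stop in ranges:
            if stop is None or stop > pit_rba:
                changed.add(obj)
                break
    return changed


# Hypothetical application of three table spaces, recovering to RBA 1000.
ranges = {
    "TS1": [(100, 200)],              # last updated long before the PIT
    "TS2": [(100, 200), (900, 1100)], # updated across the PIT
    "TS3": [(950, None)],             # still open for update
}
to_recover = changed_since(ranges, pit_rba=1000)
print(sorted(to_recover))
```

Here TS1 is excluded from the recovery, which is the avoidance the slide describes.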
TIMESTAMP RECOVERY
No QUIESCE required (Forward Flavor)
OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00..
RECOVER TABLESPACE EMP.PAYROLL
[Diagram: log timeline: Image copy at 000000000100, Quiesce at 000000000900, Bad Update at 000000001000, log position 000000001200; units of recovery UR1 and UR2; Pit Range marked by PIT_RBA and START_RBA]
TIMESTAMP RECOVERY
No QUIESCE required (Backout Flavor)
OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00..
RECOVER TABLESPACE EMP.PAYROLL
[Diagram: log timeline: Quiesce at 000000000900, Bad Update at 000000001000, log position 000000001200; units of recovery UR1 and UR2; Pit Range marked by PIT_RBA and START_RBA]
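Recovering to an arbitrary timestamp without a quiesce means dealing with units of recovery that were still in flight at that point. The sketch below is one reading of the two diagrams, not documented behavior: it treats PIT_RBA as the log point for the requested timestamp and START_RBA as the beginning of the earliest in-flight unit of recovery. The RBA values and names are hypothetical.

```python
def inflight_at(units_of_recovery, pit_rba):
    """Units of recovery that had started but not ended at pit_rba.

    units_of_recovery: dict of name -> (begin_rba, end_rba);
                       end_rba is None if the unit never completed
    """
    return {
        name
        for name, (begin, end) in units_of_recovery.items()
        if begin <= pit_rba and (end is None or end > pit_rba)
    }


def pit_range(units_of_recovery, pit_rba):
    """Assumed meaning of the diagram's Pit Range: (START_RBA, PIT_RBA)."""
    inflight = inflight_at(units_of_recovery, pit_rba)
    start_rba = min((units_of_recovery[n][0] for n in inflight), default=pit_rba)
    return start_rba, pit_rba


# Hypothetical units of recovery around a requested point at RBA 1000.
urs = {"UR1": (950, 1100), "UR2": (400, 800)}
print(inflight_at(urs, 1000), pit_range(urs, 1000))
```

Changes made by the in-flight units inside that range are the ones that must not appear in the recovered table space.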
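Mining the DB2 Log Data - Log Master
[Diagram: a Batch Log Scan reads the Member BSDS, Active Logs, and Archive Logs of each DB2 member, supported by the Repository and the On-line Interface, and builds the Logical Log. From it the Report Writer produces Reports, the SQL Generator produces DML, the DDL Generator produces DDL, and the Load Generator produces a Load File. High Speed Apply or an SQL Processor applies the SQL, and the Load Utility loads the Load File, into the target TABLE.]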
Log Master
› Allows logical ‘UNDO’ of application transactions via SQL
› Prevents potential data integrity problems by identifying when UNDO processing will affect updates that were performed later
› Provides Data Migration to DB2 or DS databases
› Displays statistics on log activity, including analysis of data capture changes impact
› Comprehensive reporting
  • Log Information – Audit, Summary, Detail
  • Performance – Commit, Rollback, Image Copy, Data Capture
  • Backout Integrity
  • Miscellaneous – Open Transaction, Quiet Point
› Support for recovery of Dropped Objects
› High speed SQL apply feature
  • Conflict Resolution
Log Master Output – Reports
› Miscellaneous Reports
  – Quiet Point
    • Find Physical Quiet Points for filtered objects
    • Optionally insert a QUIESCE into SYSIBM.SYSCOPY for RECOVER purposes
    • DURATION – Added in V310… only report on quiet points greater than or equal to the specified duration
  – Open Transaction
    • What URIDs were still active at the TO point
Log Master Output – Reports
› Information Reports
  – DETAIL
    • All Column data presented
  – AUDIT
    • Index Key Value presented
    • Only reports changed columns for updates
  – SUMMARY
    • No Data, Just INSERT, UPDATE, DELETE counts
    • CSV or SDF Format for Spread Sheet loading/analysis
  – SUMMARY ALL ACTIVITY
    • Avoids much of Log Master overhead
    • Not URID boundary aware
    • Includes Compensation Record counts as well
    • CSV or SDF Format also
UNDO - take away only the bad data
– LOGMASTER for DB2 can apply UNDO SQL to get rid of bad transactions. Database remains online for optimal e-vailability.
[Diagram: timeline: Good Transaction 1, Bad Transaction, Good Transaction 2; Generate UNDO SQL, then Apply UNDO SQL to UNDO Bad Transactions]
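Generating UNDO SQL from log data amounts to inverting each change made by the bad transaction, newest change first. The sketch below works on a hypothetical, already-decoded record layout (dicts with `urid`, `op`, `table`, `key`, `before`); it is not Log Master's format or output.

```python
def literal(value):
    """Render a Python value as a simple SQL literal."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def where(key):
    return " AND ".join(f"{col} = {literal(val)}" for col, val in key.items())


def undo_statement(rec):
    """Invert one logged change: INSERT<->DELETE, UPDATE restores the before-image."""
    table, op = rec["table"], rec["op"]
    if op == "INSERT":
        return f"DELETE FROM {table} WHERE {where(rec['key'])}"
    if op == "DELETE":
        cols = ", ".join(rec["before"])
        vals = ", ".join(literal(v) for v in rec["before"].values())
        return f"INSERT INTO {table} ({cols}) VALUES ({vals})"
    if op == "UPDATE":
        sets = ", ".join(f"{c} = {literal(v)}" for c, v in rec["before"].items())
        return f"UPDATE {table} SET {sets} WHERE {where(rec['key'])}"
    raise ValueError(f"unsupported operation: {op}")


def generate_undo(records, bad_urids):
    """UNDO SQL for the bad units of recovery, in reverse log order."""
    return [undo_statement(r) for r in reversed(records) if r["urid"] in bad_urids]


# Hypothetical log: URID 7 is the bad transaction, URID 8 is good.
records = [
    {"urid": 7, "op": "INSERT", "table": "EMP", "key": {"ID": 1}, "before": None},
    {"urid": 7, "op": "UPDATE", "table": "EMP", "key": {"ID": 2}, "before": {"SALARY": 100}},
    {"urid": 8, "op": "UPDATE", "table": "EMP", "key": {"ID": 3}, "before": {"SALARY": 50}},
]
for stmt in generate_undo(records, bad_urids={7}):
    print(stmt)
```

The sketch does not check whether a later good transaction touched the same rows; the Backout Integrity reporting mentioned earlier in the deck addresses that case.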
REDO - re-apply ONLY the good data
– Customers can perform a point in time recovery and then re-apply good transactions using REDO SQL. Database is briefly offline to recover to consistency, then back online.
1. Generate REDO SQL
2. Point-in-time recovery to a quiet point prior to the bad transaction.
3. Apply REDO SQL
Be sure to generate the REDO SQL BEFORE the RECOVER TO PIT!!
[Diagram: timeline: Recovery started, Good Transaction 1, Bad Transaction, Good Transaction 2; REDO Good Transaction 2]
Automated Drop Recovery
Generates JCL and outputs to automate Drop Recovery
• UNDO DDL to recreate the dropped object
• Syntax for recovery and object ID translation
• DB2 commands to rebind application plans that were invalidated when the object was dropped
• Drop Recovery Report
[Diagram: the process is initiated from the online interface. Log Master Technology scans DB2 Log Records and recreates dropped objects in the DB2 Subsystem. Recovery Plus Technology drives recovery using the copy and log from the Dropped Object, with OBID Translation, and applies log to the point of DROP. Log Master Technology then runs the post recovery SQL and Rebind.]
DB2 Data Migration
– Don’t replicate entire files, just migrate the changes!!!
[Diagram: DB2 LOG timeline with RBA 1000, RBA 2000, RBA 3000, RBA 4000. Three successive Log Master BATCH PGM runs each write a Logical Log(+1), used as input to the LOAD utility or Apply SQL process for the target TABLEs. The REPOSITORY records the Migrated RBA range and In-flight URIDs.]
• Run 1: Migrated 1000 - 2000 less inflight URID 1988
• Run 2: Migrated 2000 - 3000 plus inflight URID 1988
• Run 3: Migrated 3000 - 4000
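The way an in-flight unit of recovery is held back from one run and picked up by the next can be sketched as follows. The record layout, the function name, and the half-open range convention are assumptions made for illustration; they are not Log Master's repository format.

```python
def migrate_range(log, start_rba, end_rba, carried_urids):
    """Migrate committed changes in [start_rba, end_rba) plus carried-over URIDs.

    log: list of dicts with 'rba', 'urid', and 'kind' ('CHANGE' or 'COMMIT')
    carried_urids: URIDs left in flight by the previous run
    Returns (migrated_change_records, urids_still_in_flight).
    """
    committed = {
        r["urid"] for r in log
        if r["kind"] == "COMMIT" and start_rba <= r["rba"] < end_rba
    }
    candidates = [
        r for r in log
        if r["kind"] == "CHANGE" and r["rba"] < end_rba
        and (r["rba"] >= start_rba or r["urid"] in carried_urids)
    ]
    migrated = [r for r in candidates if r["urid"] in committed]
    inflight = {r["urid"] for r in candidates} - committed
    return migrated, inflight


# Hypothetical log matching the slide: URID 1988 starts before RBA 2000
# and commits after it.
log = [
    {"rba": 1500, "urid": 1400, "kind": "CHANGE"},
    {"rba": 1600, "urid": 1400, "kind": "COMMIT"},
    {"rba": 1990, "urid": 1988, "kind": "CHANGE"},
    {"rba": 2100, "urid": 1988, "kind": "CHANGE"},
    {"rba": 2200, "urid": 1988, "kind": "COMMIT"},
]
run1, carried = migrate_range(log, 1000, 2000, set())
run2, left = migrate_range(log, 2000, 3000, carried)
print([r["rba"] for r in run1], carried, [r["rba"] for r in run2], left)
```

Run 1 migrates the change at RBA 1500 and records URID 1988 as in flight; run 2 migrates both of URID 1988's changes once its commit falls inside the range.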
Recover Plus Output Options
More Than Just Recovery
[Diagram: on PROD (DB21), LOGS and the INPUT IMAGE COPY feed RECOVER PLUS, which can write an OUTPUT IMAGE COPY (OUTCOPY ONLY). On TEST (DB22), the options shown are INCOPY with OBIDXLAT, INDEP OUTSPACE with OBIDXLAT, and DROPRECOVERY with INCOPY and OBIDXLAT.]
OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00..
RECOVER TABLESPACE EMP.PAYROLL
Disaster Recovery
– Options from weekly dumps to offsite logging
  • Dumps - Simple, cheap, maximum data loss
    – Weekly dumps means several days data loss
  • Remote Mirror - Complex, expensive, no data loss
    – Disk, Network, Software, Facilities, Operations
  • Compromise - Periodic vaulting of Copies & Logs
    – Daily or hourly log shipment will minimize data loss
[Chart: Cost and Complexity plotted against Data Loss and Outage Time]
Disaster Recovery
Preparation
› Generation of recovery JCL for DB2 system tables and BMC tables.
› Grouping and generation of recovery JCL based on application or other criteria.
› Recovery Simulation
  – Test and validate your recovery locally.
  – Can provide input into DR planning.
› DR Mirroring
  – Monitors and reports existence/persistence of remote mirror volumes
  – Mirroring support reflected in JCL Generation.
› Pick tape lists for recovery copies and logs
Recovery Management for DB2 DR support (Sysprog and DBA)
– Offsite log recovery without complexity
  • Less data loss than weekly dumps
  • Automated process, easy to implement
– Dialog driven generation of ARM Utilities
[Diagram: local site preparation flow, elements in extracted order:]
• TMS Pull
• ICF Dump
• Application Remote Copies (Full/Incremental, Change/Reference)
• ARMBLOG (Switch Active Log)
• Truck
• ARMBARC (copy log)
• ARMBSRR (Gen System)
• ARMBGEN (Gen Apps JCL)
• DB2 Catalog/Directory & BMC RM Repository Remote Full Copies (Change or Reference)
Disaster Recovery
At the recovery site.
› Generates necessary JCL to restart DB2 subsystem.
› Supports mirrored subsystems based on installation mirroring configuration.
› Recovers system resources and critical BMC repositories (if not mirrored)
› Generates JCL for recovery of application data based on load balancing and job synchronization
› Automatically collects statistical data on actual recoveries.
› Archives recovery statistics for updating history repositories at local site.
Remote Site Execution
– Remote site DB2 startup is easy
  • Assuming all the tapes made it!!
[Diagram: remote site sequence after the Truck arrives:]
• ICF Restore
• TMS Restore
• ARMBSRR Job 1 (CLI, VSAM Allocates, Initialize Actives)
• ARMBSRR Job 2 (Cat/Dir/RM Recoveries)
• Application Dataset Restores
• Application Database Recoveries
• Business Resumption (as of last night’s ARCHIVE LOG point)
Recovery Estimation / Simulation
Estimation
› Predict Recovery Times
› Based on history maintained for largest tablespaces
› Current tablespace attributes and artifacts
Simulation
› Validates recovery artifacts through actual execution of the recovery at local site without disruption to ‘real’ data.
› Provides input to decisions on DR planning.
Reporting
› Online and batch reporting available to view recovery statistics of actual, estimated and simulation runs.
BMC Software Recovery Value
[Diagram: Return On Investment plotted against Innovation, with these labels:]
• Physical Backout, Timestamp Recovery, Transaction Recovery, Auto Disaster Recovery, E-Net Remote Replication
• Additional Daily Benefits: Migration, Replication, Audit Reports, Encrypt Copies
• Business Continuity: Assess and Improve Recoverability of Data
• B&R Options: Multi-Vendor Hardware Support, Snapshot Copy, Rapid Recovery, Index B&R Intelligence, Hybrid Copy, Drop Recovery, Recovery Groups, Recovery Avoidance
BMC Software Recovery Management Value Proposition
› Reduce or eliminate planned and unplanned outages
  – Improve Application Availability
    • Perform backup while applications are online
    • Perform ‘logical’ recoveries while applications are online
    • Perform ‘physical’ recovery with only log input
› Manage Complexity
  – Automate complicated backup and recovery processes
    • Leverage investment in intelligent storage devices
    • Prepare for local and disaster recovery scenarios
    • Validate recoverability and recovery assets
› Increase staff and resource efficiencies
  • Simplifies the recovery process for the DBAs
  • Assures successful and consistent recovery
  • Utilizes computing resources efficiently