7/29/2019 Achieving Rapid Data Recovery for IBM AIX Environments - Executive Overview of Double-Take Availability for AIX
Achieving Rapid Data Recovery for IBM AIX Environments
An Executive Overview of Double-Take Availability for AIX
WHITE PAPER
visionsolutions.com
Introduction
Planning for recovery is a requirement in businesses of all sizes. In implementing an
operational plan that ensures that both data and applications can be recovered, IT
personnel are generally confronted with several challenges:
How can I ensure my applications and data are recoverable without impacting
business operations?
Do I have data protection approaches available to me that meet my recovery point
and recovery time objectives?
Can I afford to implement a comprehensive plan that covers both my local and
remote (disaster) recovery requirements?
Are there cost-effective alternatives that meet my requirements?
Business requirements are not the only mandates that may be driving the evolution of
your recovery plan. Various industry-specific regulatory mandates, including
Sarbanes-Oxley, HIPAA and SEC, specify requirements for data retention and recoverability.
In meeting these requirements, businesses have to deal with a variety of risks to data:
inadvertently deleted files or records (operator error), viruses or hackers that can cause
data corruption or deletion, and natural disasters that may put much more than just
your data at risk. Distributed or branch offices may also have ease of use requirements
that may not apply to larger, more centralized businesses.
Do you have a plan that meets your recovery requirements to your satisfaction across
these areas?
Issues with Legacy Recovery Technologies
If you're like most businesses, you're using some form of data protection today, probably
tape-based backup. Periodically, someone shuts applications down to perform a backup to
tape. Depending on the volume of data that is being copied, this may take several hours and
requires manual intervention to set up the backup job, run it, confirm that it occurred, and then
return the application to operation. The backup copy may be kept locally in case data needs to
be recovered in the near term, and eventually (after several weeks) it may be moved to an offsite
location for archival storage purposes. The reason to make and keep copies of your data is so
that, if some failure or catastrophe deletes or destroys data, you have a
clean copy safely tucked away to use for recovery purposes.
Tape is used for backup and archive because it is very inexpensive, but it is an old technology
that has been available almost since the dawn of computing. There are several issues with
tape-based backup:
Tape-based backup is a time-intensive process that is potentially disruptive to your
applications; this issue is commonly referred to as the backup window problem.
Because of its impact on applications and resources, tape-based backups are usually not
taken more than once a day, and often only once every several days, meaning that there
are very few tape-based recovery points available for use over the course of a week; this
is problematic because your data is changing very frequently (on the order of seconds or
minutes) and the fewer points in time you have a copy of (for recovery purposes), the more
data loss on average occurs for a given recovery; this issue is commonly referred to as the
recovery point objective (RPO) problem.
Once it is clear that a recovery needs to occur, it takes time to perform the recovery (e.g.,
finding the right tape, transporting it if it is offsite, restoring it to disk, restarting the application
on top of the data, etc.); this issue is commonly referred to as the recovery time objective
(RTO) problem.
As a storage medium for backup, tape is not entirely reliable; in fact, leading analyst groups
such as the Gartner Group, the Enterprise Strategy Group and the Taneja Group state that
as many as 1 in 4 backup tapes suffer from some sort of problem that precludes performing
a recovery.
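The RPO point above can be made concrete with a little arithmetic. The following is a minimal sketch in Python, not a figure from this paper: it assumes a failure is equally likely at any moment between two consecutive recovery points, so the average loss is half the interval and the worst case is the full interval. The function name and the sample intervals are illustrative.

```python
from typing import Tuple


def data_loss_window_hours(recovery_point_interval_hours: float) -> Tuple[float, float]:
    """Return (average, worst-case) hours of changes lost on recovery.

    Assumption (not from the white paper): a failure is equally likely at
    any moment between two consecutive recovery points.
    """
    if recovery_point_interval_hours < 0:
        raise ValueError("interval must be non-negative")
    average = recovery_point_interval_hours / 2.0
    worst_case = recovery_point_interval_hours
    return average, worst_case


if __name__ == "__main__":
    # Illustrative schedules only; the one-minute case stands in for a
    # near-continuous stream of recovery points.
    schedules = [
        ("daily tape backup", 24.0),
        ("tape backup every three days", 72.0),
        ("recovery point every minute", 1.0 / 60.0),
    ]
    for label, interval in schedules:
        average, worst_case = data_loss_window_hours(interval)
        print(f"{label}: average {average:.2f} h, worst case {worst_case:.2f} h")
```

Under that assumption, a daily backup loses 12 hours of changes on average and up to 24 hours in the worst case, which is why the number of available recovery points matters so much.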
Transporting tapes to offsite facilities for archival purposes also has inherent risk. Widely
publicized tape losses during physical transport (by truck) have hit large companies like Bank of
America, Citigroup Inc., ChoicePoint Inc. and LexisNexis and resulted in the theft of hundreds of
thousands of company records. Replication of data across secure IP-based networks is a much
faster, easier and safer way to transport data to offsite locations for archival storage purposes.
If you are driven by either business or regulatory requirements to deploy a disaster recovery
solution, a pure tape-based data protection strategy can subject you to undue risk.
The Proven Solution
Double-Take Availability for AIX, from Vision Solutions, is designed to resolve the following
common problems:
Backup window: Data is continuously and transparently copied from designated servers
throughout the day as changes occur, so you never again have to concern yourself with
backup windows.
Recovery Point Objective: Using a technology called continuous data protection (CDP),
Double-Take Availability will allow you to retroactively pick any previous point and generate a
readable, writable snapshot of what the data looked like at the selected point; this effectively
presents you with all possible recovery points to minimize data loss on recovery in a way
that tape, with its limited number of recovery points, never can.
Recovery Time Objective: Double-Take Availability restores directly from disk, providing
you with fast, reliable restores in a way that tape cannot. And your ability to pick the optimal
recovery point to minimize data loss means that you will spend less time restoring the
entire application environment; this effectively shortens the downtime associated with data
recovery and hence the impact and cost of an outage.
Redundant Application Server: The backup server provides a manual failover target that
will allow a critical application to be rapidly restarted with access to current data (to allow
processing to continue if the primary server for some reason cannot be restarted).
Remote replication: Double-Take Availability includes the ability to replicate data across IP
networks so you can migrate your aged data to a remote facility without exposing it to the
risks associated with the physical transport of tape-based media.
Double-Take Availability is already in use with referenceable customers across different vertical
markets, including financial services and healthcare. It runs on IBM Power Systems servers with
AIX 5.3 and above and is applicable to any application running on AIX. Applications for which
Double-Take Availability is a good fit meet the following profile:
A 7 x 24 application environment with a small to non-existent backup window.
Critical applications (from a business point of view) that have high rates of data change
(where fewer recovery points translates to significant amounts of lost data on recovery).
Applications with stringent recovery time requirements that are not currently being met with
existing data protection technologies.
"By 2011, some form of CDP will be deployed in 80 percent of the Fortune 2000."
Gartner
How Does the Double-Take Availability Solution Work?
A dedicated backup System p server is established, and can protect one or more applications
running on other IBM AIX production servers that are connected to it locally via an IP-based
network. Double-Take Availability mirrors production server writes to designated Protected
Storage (the storage where the application you want to protect resides) to the backup server
over IP. These writes are stored in the Recovery Storage that is directly attached to the backup
server. Double-Take Availability runs continuously in the background and does not noticeably
impact the performance of the Protected Application.
At any time, an administrator can go into the management interface of the backup server,
running on a separate Windows-based PC, and generate a historical view (a snapshot of
the data at any chosen previous point in time). These historical views can then be
presented to any other server on the network for the purposes of recovery or to perform any
type of off-host processing. These historical views are fully read/write capable, which means
that they can support off-host processing tasks like data analysis, testing, development
or backup, all without imposing any impact whatsoever on the Protected Application. A
historical view can be presented back to the production server as well, but note that there is
another option with respect to the production server, called Production Restore, which uses
differencing technology to modify the Protected Storage to look like the historical view selected
on the backup server.
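The idea behind historical views and Production Restore can be sketched with a toy write journal. This is an illustration only, not the product's implementation: the class `WriteJournal`, the function `production_restore_plan`, and the block-number model of storage are all hypothetical names and simplifications introduced here.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple


class WriteJournal:
    """Toy continuous-data-protection journal (hypothetical, illustrative).

    Every mirrored write is appended with its timestamp. A historical view
    is rebuilt by replaying the writes up to the chosen point in time.
    """

    def __init__(self) -> None:
        self._times: List[float] = []
        self._writes: List[Tuple[int, bytes]] = []

    def record(self, timestamp: float, block: int, data: bytes) -> None:
        """Append one write; writes must arrive in time order."""
        if self._times and timestamp < self._times[-1]:
            raise ValueError("writes must be recorded in time order")
        self._times.append(timestamp)
        self._writes.append((block, data))

    def historical_view(self, point_in_time: float) -> Dict[int, bytes]:
        """Return a new, writable copy of all blocks as of point_in_time."""
        upto = bisect_right(self._times, point_in_time)
        view: Dict[int, bytes] = {}
        for block, data in self._writes[:upto]:
            view[block] = data  # later writes to a block replace earlier ones
        return view


def production_restore_plan(
    current: Dict[int, bytes], view: Dict[int, bytes]
) -> Tuple[Dict[int, bytes], List[int]]:
    """Differencing sketch: only the blocks that differ need to change.

    Returns (blocks to rewrite, blocks to remove) so that `current`
    ends up identical to `view`.
    """
    to_rewrite = {b: d for b, d in view.items() if current.get(b) != d}
    to_remove = sorted(b for b in current if b not in view)
    return to_rewrite, to_remove


if __name__ == "__main__":
    journal = WriteJournal()
    journal.record(1.0, 0, b"order-1")
    journal.record(2.0, 1, b"order-2")
    journal.record(3.0, 0, b"corrupted")  # the unwanted change
    print(journal.historical_view(2.5))
```

In this sketch, a view taken at time 2.5 contains the two good writes and none of the later corruption. Because each view is a separate copy, writing to it for testing or analysis leaves the journal untouched, which mirrors the read/write, zero-impact property described above.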
Figure 1. Double-Take RecoverNow mirrors data to a local backup server, which can then retroactively
present snapshots for recovery, analysis, testing or development purposes with no impact
on the production server(s).
[Diagram labels: Protected Server (AIX 5) with Data Tap and Protected Storage; Ethernet (IP) network
(LAN); IBM Server with Recovery Storage; Snapshot presentation; Production restore; Manual failover
target provides server-level redundancy.]
Double-Take Availability also supports asynchronous replication. This allows you to replicate
the continuous data stream or selected historical views to a remote facility, as long as it is
connected to the primary facility by an IP-based network. Replication of the continuous data
stream provides full any-point-in-time recovery capabilities at the remote site. This configuration
is optimal for disaster recovery capability, since historical views created at the remote site can
only be presented to servers at the remote site that are on the same LAN as the remote backup
server. Replication represents a much faster, much more secure way to get your data to an
offsite storage facility. To use this feature, you will need to purchase another System p server
running AIX and the backup server software license at the remote site.
[Diagram: a local site and a remote site linked by IP-based replication over a WAN. Labeled
components: Protected Server, IBM Server, Backup Server, Recovery Server. Snapshots created by the
IBM Server can be presented to a Backup Server for any type of off-host processing, like backup, or
to a Recovery Server to recover a production application. Tapes can then be transported off site
for remote storage.]
[Chart: Data Reference Patterns. Retrieval activity and amount of data (0% to 100%) plotted against
days since creation (1, 3, 7, 15, 30, 60, 90), across three storage tiers: Online (ms), Nearline (sec)
and Archival/deletion (sec/mins). Annotations: references decline as data ages; data is being kept
for longer periods of time; the percent of total data deleted is declining; TCO encourages migration
of less active data. Source: Harison Information Strategies]
The Correlation Between Data Age and Possibility of
Re-Use/Restore
It has been proven over time that most data recovery requests are for relatively recent data,
and that there is a direct correlation between the age of data and the possibility that it would
be required for restore purposes. Most restore requests are driven by issues such as an
inadvertently deleted file or data corruption that is introduced by a virus or a hacker. Typically
these problems are discovered within several hours or at most a few days from when they first
occur, resulting in restore requests for more recent data.
In general, the only time you may need to restore data that has already been archived would
be in the event of a disaster that physically destroys computer equipment and facilities, such as
an earthquake or a tornado. While it pays to be prepared against these occurrences, they are
very rare. The slope of the red line in Figure 3 varies by company type, but it reflects the general
relationship in all industries between the age of data and the chance that it would need to be
restored.
Another key factor to note is that as data ages, it becomes less important to support the ability
to restore to any point in time. Note the inflection point in the red line in Figure 3 that occurs
around Day 3. Restore requests for data drop off significantly after that point. This might suggest
that you would want to manage roughly 3 days' worth of your most recent data with
Double-Take Availability, migrating it to less flexible but less expensive media locally thereafter
for several weeks, and then eventually storing it in an off-site facility after about 30 days. This
3 day window is referred to as the optimized recovery window.
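The age-based placement suggested above can be written down as a small policy function. This is a sketch under stated assumptions: the tier names and the function `storage_tier` are made up for illustration, and the 3-day and 30-day defaults simply restate the example thresholds from the text, which the paper itself says vary by company.

```python
def storage_tier(
    age_days: float,
    optimized_window_days: float = 3.0,
    offsite_after_days: float = 30.0,
) -> str:
    """Pick a storage tier for data of a given age (illustrative policy).

    - Within the optimized recovery window: keep under CDP so any point in
      time can be restored.
    - After that, up to the offsite threshold: less flexible, cheaper local
      media.
    - Beyond the threshold: off-site archival storage.
    """
    if age_days < 0:
        raise ValueError("age must be non-negative")
    if optimized_window_days > offsite_after_days:
        raise ValueError("recovery window cannot exceed the offsite threshold")
    if age_days <= optimized_window_days:
        return "cdp"
    if age_days <= offsite_after_days:
        return "local-media"
    return "offsite-archive"


if __name__ == "__main__":
    for age in (1, 3, 7, 15, 30, 60, 90):
        print(f"day {age:>2}: {storage_tier(age)}")
```

A site whose restore requests tail off later can widen the window, for example `storage_tier(5, optimized_window_days=7)`, without changing the rest of the policy.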
Two Sample Use Cases
Using Double-Take Availability to Provide Zero Impact Data Protection and Rapid
Local Recovery
In this scenario, we assume the customer wants to solve the rapid recovery problem at the local
level. They have chosen, however, not to replicate and will continue to migrate data to tape for
physical shipment to an offsite location.
The customer is running an Oracle database as an order entry system on an IBM Power
Systems server with AIX and 600 GB of internal storage. This server will become the production
server. Ten percent of the data changes on a monthly basis, and the overall rate of data growth is
forecast at 30% per year. Based on past experience, the customer knows that restore requests
tend to drop off significantly after seven days. The customer currently does daily incremental
backups and weekly full backups using a 100 Mbit Ethernet LAN. Incremental backups take
roughly 90-120 minutes per day, while the full backup takes between ten and fifteen hours using
a small tape cartridge autoloader.
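As a rough plausibility check on the full-backup duration quoted above, the time to move 600 GB over a 100 Mbit/s link can be estimated directly. This is a back-of-the-envelope sketch, assuming the LAN is the bottleneck, decimal units (1 GB = 10^9 bytes), and no compression; the function name and the `efficiency` parameter are illustrative and not from the source.

```python
def transfer_hours(data_gb: float, link_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Estimate hours needed to move data_gb over a network link.

    efficiency is the usable fraction of the nominal line rate (1.0 means
    the full nominal rate, which real networks rarely sustain).
    """
    if data_gb < 0:
        raise ValueError("data size must be non-negative")
    if link_mbit_per_s <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = data_gb * 1e9 * 8
    seconds = bits / (link_mbit_per_s * 1e6 * efficiency)
    return seconds / 3600.0


if __name__ == "__main__":
    print(f"600 GB at 100 Mbit/s: {transfer_hours(600, 100):.1f} hours")
```

At the nominal line rate this comes to about 13.3 hours, which sits inside the ten to fifteen hour range given for the weekly full backup.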
To install the Double-Take Availability solution, the customer purchases a second IBM Power
Systems server on AIX to act as the backup server. Based on the rate of data change and
forecast database growth, 1.5 TB of Recovery Storage is housed internally to the backup
server. This backup server is attached to the same LAN as the production server. Double-Take
Availability is installed on the production server, while the relevant storage which underlies the
Oracle application is designated as the Protected Storage. An optimized recovery window of
seven days is configured on the backup server. An initial synchronization between the production
server and the backup server is performed while the production server continues to run (it
is run as a background process) so that database access is not impacted. Once the initial
synchronization is complete, continuous data protection is enabled.
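The paper does not say how the 1.5 TB Recovery Storage figure was derived. The sketch below is one hypothetical way to sanity-check such a number from the inputs given in this scenario (600 GB protected, 10% monthly change, 30% annual growth, seven-day window). The sizing model, the function name and the three-year planning horizon are all assumptions made here, not the vendor's method.

```python
def recovery_storage_gb(
    protected_gb: float,
    monthly_change_rate: float,
    annual_growth_rate: float,
    window_days: float,
    planning_years: float,
) -> float:
    """Rough Recovery Storage estimate (hypothetical sizing model).

    Model: one full replica of the protected data at its forecast size,
    plus the changed data retained for the optimized recovery window.
    """
    if min(protected_gb, monthly_change_rate, window_days, planning_years) < 0:
        raise ValueError("inputs must be non-negative")
    if annual_growth_rate <= -1:
        raise ValueError("growth rate must be greater than -100%")
    future_gb = protected_gb * (1 + annual_growth_rate) ** planning_years
    # Spread the monthly change evenly over a 30-day month.
    journal_gb = future_gb * monthly_change_rate * (window_days / 30.0)
    return future_gb + journal_gb


if __name__ == "__main__":
    estimate = recovery_storage_gb(600, 0.10, 0.30, 7, 3)
    print(f"Estimated Recovery Storage after 3 years: {estimate:.0f} GB")
```

With a three-year horizon the estimate is roughly 1,349 GB, so 1.5 TB would leave some headroom under this model. A different horizon or a burstier change rate would shift the result, so treat it as a starting point for a sizing conversation rather than a rule.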
To take advantage of the capabilities of their newly implemented Double-Take Availability
solution, the customer makes some changes to their data protection processes. With seven
days of data included in the optimized recovery window, the customer no longer needs to
perform daily incrementals. Any restore requirement during that seven day period is performed
instantaneously from disk and without the need to build up a restore image from multiple
incremental backups, thus cutting recovery time to minutes.
A weekly tape backup is still desirable to prepare for the eventual archiving of data offsite, but
the Oracle application no longer needs to be shut down to perform backups. Once a week, a
historical view is created by the backup server, which then uses it to perform a tape backup. The
customer continues to use its existing tape backup software to perform this backup.
Double-Take Availability is compatible with all backup software packages for the purposes of
historical view presentation for off-host backup. These tapes are kept onsite for two weeks, and
then sent to an offsite facility for archival storage.
Implemented in this way, Double-Take Availability for AIX provides the following benefits:
Backups to tape are now completely decoupled from the production application so they can
now be scheduled to occur when it is convenient for the administrator, without concern for
impact on business processes.
Backups are only taken once a week now (instead of daily), taking less administrative time.
Restores within the optimized recovery window occur rapidly and reliably from disk,
completely resolving tape media integrity issues for near term restores.
Data loss on recovery is minimized because the administrator now has access to the
optimal recovery point to minimize data loss for every conceivable failure scenario (this is the
RPO issue).
Recovery time is shortened in several ways:
no restore from tape to disk is required (the application can just be started right up on
the selected historical view).
a recovery point never needs to be built up from incrementals so there is less
administrative overhead associated with recovery (the selected point is just immediately
presented from disk).
there is less time spent preparing the application for production use again after the
recovery because the best recovery point to resolve the problem can be selected (e.g.,
if the problem is a file deletion or data corruption problem, the point right before that
event occurred can be chosen).
Recovery time is considerably shortened in the event of a problem with the production
server: The Protected Application is simply started on the backup server, using the latest,
current copy of the production data (the latest historical view). It can continue to run there
until such time as the production server can be repaired and restarted.
In addition to these benefits, there is another advantage that did not exist with the previous
tape-based approach. Patched and upgraded applications can be tested against current
production data in a manner completely decoupled from the production environment. A
historical view of the current data state is created and presented to a staging server (also on
the LAN) where the patched or upgraded application can be tested. Once the administrator is
satisfied with the stability of the new environment, it can be deployed in production. Double-Take
Availability makes it easy to create these historical views for testing purposes, ensuring more
reliable patch and upgrade processes against production environments.
Archiving to Tape with a Multi-Site Double-Take Availability Configuration
In this scenario, we assume the customer wants to solve three problems (backup window, RPO
and RTO) but they also want to migrate their archival data to a remote facility with minimal risk.
For the purposes of this example, we'll assume they are running an IBM DB2 UDB database on
an IBM Power Systems server with AIX.
In addition to their production server, the customer purchases a local backup server with an
appropriate amount of storage, and the Double-Take Availability software licenses. Then, to
enable the remote replication capability, the customer purchases another IBM Power Systems
server, to be located at the remote site, running the same operating system.
The customer wants to take the weekly full tape backup from disk at the remote site for
archival storage. Both the local and remote backup servers are connected via an IP network.
With this configuration, the only change to their former backup processes is that they now keep
no tape at all at the local (production) site, only at the remote site.
Once a week, a historical view that represents the full backup is created on the remote-site
backup server. The remote backup server then backs up the data to tape. Data
that is already archived can be restored from tape to disk on the remote backup server, and
then replicated back to the local (production site) backup server. At that point, the view can be
manipulated for any recovery or off-host processing purposes in the same manner as any locally
created view.
This solution provides the following benefits:
All of the benefits of the local configuration example accrue here, including removal of the
backup window, minimized data loss and much more rapid, reliable recoveries (due to rapid
restores direct from disk and to the availability of the backup server as a manual failover
platform).
The additional advantages that accrue with the remote configuration include a fast, easy and
secure way to migrate data from a local site to a remote site without incurring any of the risk
associated with physical transport, and a fast, easy and secure way to get that data back to
a local site on those rare occasions when a recovery from older data is required.
Tape-only backups are no longer a feasible data protection strategy in today's
business environment.
15300 Barranca Parkway
Irvine, CA 92618
800-957-4511
801-799-0300
visionsolutions.com
Copyright 2010, Vision Solutions, Inc. All rights reserved. IBM and Power Systems are
trademarks of International Business Machines Corporation. Windows is a registered trademark
of Microsoft Corporation. Linux is a registered trademark of Linus Torvalds.
Recovery Time Comparisons
When downtime costs you money, a rapid
recovery capability presents a quantifiable return
on investment opportunity. Because Double-Take Availability offers a much
faster and easier way to perform data recovery
than tape, savings accrue not
only in the area of downtime but also in
administrative time and expense. As shown in
Figure 4 below, Double-Take Availability can
shorten recovery times by hours and even days
in some cases.
Summary
Any business that is experiencing rapid growth
or consolidation is very likely using a suboptimal
data recovery solution built around tape-based
backup. This type of legacy solution potentially
interrupts business processes, due to the
requirement for a backup window, subjects the business to potentially significant data loss
when recoveries are required, and is time consuming and labor intensive for both data protection
operations and recoveries.
Double-Take Availability for AIX is a proven solution to the data recovery problem that is in use
at a variety of referenceable accounts today. Double-Take Availability leverages CDP technology
to support instantaneous recoveries from disk, resulting in minimal data loss (due to its ability
to present all possible recovery points), rapid, reliable recovery (due to its ability to restore
immediately from disk), all while not imposing any downtime on production applications (zero
impact data protection).
Because Double-Take Availability ensures that data on the backup server is always current,
it can be relied upon as a manual failover platform that allows application processing to be
rapidly restarted in the event of a catastrophic production server failure. In addition, Double-Take
Availability supports asynchronous replication that will allow businesses to establish
cost-effective and secure multi-site disaster recovery strategies that support rapid recovery, even
from archived data. Double-Take Availability runs on IBM Power Systems servers with AIX and
is applicable to any AIX application, but is applied most often for use with business- or
mission-critical applications such as enterprise databases or file systems.
[Figure 4. Recovery Time for 1 TB of Data. Bar chart, vertical axis in hours (0 to 20). Categories:
Double-Take RecoverNow Recovery, Recovery from Local Copy, Recovery from Offsite Copy, Recovery from
Tape. Legend steps: Review & Roll back from Historical View; Apply Roll Back to Production Server;
Rebuild Volumes; Resynchronize Volumes; Restore Data Files from Tape; Apply Archive Logs. Bar labels
as extracted: 3 Hrs, 20 Min, 9 Hrs, 17 Hrs.]