Gartner Best Practices in Business Continuity Planning Report
TRANSCRIPT
• Human error/operations risk
• Content/application links to third parties
• Outsourced service providers
• Performance/capacity
• Security incidents
• Planned/unplanned downtime
New E-Commerce Risks
• IT and business process management are integrated —
no longer solo views
• Production costs increase — no separate budget for BCP
• Risk identification and management take on a matrix
management focus, e.g., technology, financial, trading,
operations
• Problems are public — IT and business problem
management must be integrated; root cause analysis
• Only as strong as your weakest link — good
application/bad operations
• Contingency plans become critical when automation isn’t
there — every component of the business process now
must have a plan
E-Commerce BC: New Rules/New Realities
Disaster Recovery
  Objective: Mission-critical applications
  Focus: Site or component outage (external)
  Deliverable: Disaster recovery plan
  Sample event(s): Fire at the data center; critical server failure
  Sample solution: Recovery site in a different location
Business Recovery
  Objective: Mission-critical business processing (workspace)
  Focus: Site outage (external)
  Deliverable: Business recovery plan
  Sample event(s): Electrical outage in the building
  Sample solution: Recovery site in a different power grid
Business Resumption
  Objective: Business process workarounds
  Focus: Application outage (internal)
  Deliverable: Alternate processing plan
  Sample event(s): Credit authorization system down
  Sample solution: Manual procedure
Contingency Planning
  Objective: External event
  Focus: External behavior forcing change to internal
  Deliverable: Business contingency plan
  Sample event(s): Main supplier cannot ship due to its own problem
  Sample solution: 25% backup of vital products; backup supplier
Crisis Management
BC Components
Creating Business Continuity Plans
Business Impact Analysis
Risk Analysis
Recovery Strategy
Group Plans
and Procedures
Business Continuity Planning Initiation
Risk Reduction
Implement Standby Facilities
Create Planning Organization
Testing
PROCESS
Change Management Education Testing Review
Policy, Scope, Resources, Organization
Ongoing
Process
Project
Awareness Programs Fiduciary Responsibility
BIA & Risk Assessment
Catalysts
Obtaining Management Commitment
Security Incident Detection & Response
Prevention/Planning
Detection
Incident Response
Investigation
Evidence
Legal Action
Business Req.
• Identify technology and business continuity risks from a business perspective
– BIA/risk analysis, RTO/RPO
• Ensure complete cost estimate
• Ensure appropriately protected end product
System Architecture
• Assess risks of new technology products
• Identify secure infrastructure requirements
• Identify secure administrative requirements
• Establish security responsibilities and service-level regulations
• Identify BC/DR strategies
• Establish security test strategy
System Design
• Translate security architecture to detailed security infrastructure design
• Develop security baselines for new technologies/products
• Develop detailed security admin. design
• Develop detailed BCP/DR design/strategy
• Develop draft SLAs
• Develop security test plan
Construct
• Build/code security infrastructure environment and processes
• Build/code security admin. environment, roles/profiles and processes
• Build BCP/DR environment, plans and processes
• Build/code security test plan, processes, scripts and test environment
Test
• Train secure administrative, operations, business unit, staff...
• Identify security noncompliance issues
• Identify new security exposures
• Test BCP/DR plans to ensure that RTO/RPO is attainable
Implement
• Turn over secure application infrastructure to production
• Implement secure administrative roles/profiles
• Implement business continuity/DR environment
Post Implement
• Identify changes to tested env.
• Finalize secure admin. env. and processes
• Finalize security infrastructure environment and processes
• Finalize BCP/DR env., plans and processes
• Assess SLA accuracy
• Finalize risk acceptance with business
• Ensure that info. security policies are current
Project Life Cycle
Business Process Owner
Architecture and Standards
Application and Tech Design
Business Continuity
Operations Architecture and Design
IT Operations: Problem, Change, Performance, DR
Risk Management (Financial, Technology, Operations)
Information
Security
Recovery/continuity strategy/
design
IT Recovery management
E-Biz Project Manager
Business Manager
Risk Manager
Business Continuity Mgr.
Audit
IT
Information Security
Business Operations
Legal/Compliance
HR / Public Relations
E-Biz Recovery Team
Business continuity
strategy/design
Audit — Financial and EDP
OSPs/
Business Partners
E-Commerce BC — Integrated Processes
Rules and tools
Security Incident
identification/response
design
Problem Identification
and Impact Assessment
Problem Status/
Communication
Problem
Prevention and
Planning
Problem
Resolution
Root Cause Analysis
Problem Management Life Cycle
Problem Mgmt Team
Business Process Owner
Customer/Partner
Relationship Owner
Risk Management
Business Continuity
Information Security
IT Technical Support
IT Applications Support
Vendors/OSPs/Third Parties
Legal/Compliance
Public Relations
BCP Phase
Accounts Payable
Accounts Receivable
Cash Mgmt.
R&D
Prod. Eng.
Order Fulfillment
Impact Analysis
Risk Analysis
Strategy
Resources
Committed
Last Tested
Change Mgmt.
Last Major Review
Workable Solution
Audit
Location, Business Process or Department
Management Reporting is Critical
Too Much Testing and Reporting Is Never Enough
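The reporting matrix above (BCP phase by business process or department) could be tracked with a small status record per unit. The following is only a minimal sketch: the `PlanStatus` fields, the `needs_attention` helper and the 365-day test interval are illustrative assumptions, not part of the original slide.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PlanStatus:
    """One column of the reporting matrix (hypothetical field names)."""
    unit: str                      # e.g. "Accounts Payable"
    impact_analysis_done: bool
    resources_committed: bool
    last_tested: Optional[date]    # None if the plan was never tested


def needs_attention(status: PlanStatus, today: date,
                    max_test_age_days: int = 365) -> List[str]:
    """Return the reasons a unit should be flagged in the management report.

    The 365-day default is an assumed policy, not a value from the slide.
    """
    reasons = []
    if not status.impact_analysis_done:
        reasons.append("impact analysis missing")
    if not status.resources_committed:
        reasons.append("resources not committed")
    if status.last_tested is None:
        reasons.append("never tested")
    elif (today - status.last_tested).days > max_test_age_days:
        reasons.append("test overdue")
    return reasons
```

An empty list means the unit has nothing to report; anything else is a candidate line item for the management report.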
Know your downtime costs per hour, day, two days...
Revenue
• Direct loss
• Compensatory payments
• Lost future revenue
• Billing losses
• Investment losses
Productivity
• Number of employees impacted × hours out × burdened hourly rate
Damaged Reputation
• Customers
• Suppliers
• Financial markets
• Banks
• Business partners
• ...
Financial Performance
• Revenue recognition
• Cash flow
• Lost discounts (A/P)
• Payment guarantees
• Credit rating
• Stock price
Other Expenses
• Temporary employees, equipment rental, overtime costs, extra shipping costs, travel expenses...
What Is Your Cost of Downtime?
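The productivity figure on this slide is a simple product (employees impacted × hours out × burdened hourly rate), and the revenue and other-expense categories can be added to it for a rough total. The sketch below is illustrative only: the function names, the flat per-hour revenue loss and the sample figures are assumptions, and reputation or financial-performance effects are not modeled.

```python
def productivity_loss(employees_impacted: int, hours_out: float,
                      burdened_hourly_rate: float) -> float:
    """Productivity cost as stated on the slide:
    employees impacted x hours out x burdened hourly rate."""
    return employees_impacted * hours_out * burdened_hourly_rate


def downtime_cost(hours_out: float, revenue_loss_per_hour: float,
                  employees_impacted: int, burdened_hourly_rate: float,
                  other_expenses: float = 0.0) -> float:
    """Rough total for an outage.

    Assumes revenue is lost at a constant hourly rate, which is a
    simplification introduced here, not a claim from the slide.
    """
    revenue = revenue_loss_per_hour * hours_out
    productivity = productivity_loss(employees_impacted, hours_out,
                                     burdened_hourly_rate)
    return revenue + productivity + other_expenses


# Placeholder figures for an 8-hour outage (not taken from the source).
example = downtime_cost(hours_out=8, revenue_loss_per_hour=10_000,
                        employees_impacted=200, burdened_hourly_rate=50,
                        other_expenses=5_000)
```

Running the same calculation for one hour, one day and two days gives the per-period figures the slide asks for.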
[Figure: cost versus disaster recovery time]
Recovery times shown: 24 hours, 48 hours, 72 hours, Minutes, 12 hrs.
Recovery techniques:
• Standard Recovery
• Elec. Vaulting
• Electronic Journaling
• Shadowing
• Mirroring
• Hot Standby or Load-Balanced
Data movement:
• Database and/or file and/or object backup
• Log/journal transfer (continuous or periodic)
• Database and/or file and/or object replication
• Assumes mirroring or shadowing plus a complete application environment
Relative cost:
• net $, host $, disk $, tape $
• net $, tape $
• net $-$$+, host $$+, disk $$$$+
• net $$$+, host $$+, disk $$$$+
• net $$$+, host $$$+, disk $$$$+, appl. $+
Applying High Availability to Disaster Recovery
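The trade-off in the figure above (faster recovery costs more) can be framed as picking the least expensive technique whose recovery time still meets the required RTO. This is a minimal sketch under stated assumptions: `RecoveryOption`, `cheapest_option_meeting_rto` and every number in the sample list are hypothetical placeholders, not values read from the slide.

```python
from typing import NamedTuple, Optional, Sequence


class RecoveryOption(NamedTuple):
    name: str
    recovery_hours: float   # time the technique needs to restore service
    relative_cost: int      # unitless ranking; higher means more expensive


def cheapest_option_meeting_rto(options: Sequence[RecoveryOption],
                                rto_hours: float) -> Optional[RecoveryOption]:
    """Return the lowest-cost option that recovers within the RTO, or None."""
    candidates = [o for o in options if o.recovery_hours <= rto_hours]
    if not candidates:
        return None
    return min(candidates, key=lambda o: o.relative_cost)


# Placeholder figures for illustration only (not taken from the source).
SAMPLE_OPTIONS = [
    RecoveryOption("Standard Recovery", recovery_hours=72, relative_cost=1),
    RecoveryOption("Electronic Journaling", recovery_hours=24, relative_cost=3),
    RecoveryOption("Mirroring", recovery_hours=0.5, relative_cost=8),
]

choice = cheapest_option_meeting_rto(SAMPLE_OPTIONS, rto_hours=24)
```

In practice the recovery times and costs would come from the business impact analysis and vendor quotes rather than from fixed constants.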
Standby or Active
Geographic Load Balancer
Site Load Balancer
Web Server Clusters
Application Server Clusters
Database Clusters
Database Replication
Transaction Replication
Designing E-Commerce Applications for No Single-Point-of-Failure
Host-based
Disk-based
Replication Methods: Examples
• Disk-to-disk mirroring: EMC SRDF, Compaq DRM, IBM PPRC and XRC, HDS HARC and HRC
• Log-based DBMS replication: Quest Shareplex, Oracle Standby Database, ENET RRDF, SQL Server 2000
• Server-based block or file replication: Legato Octopus, NSI Doubletake, Veritas SRVM
• Application-based replication: Typically implemented with message-queuing middleware
Data Replication for Continuous Availability
Emerging Technologies/Services
• Capacity on demand/emergency back-up
• Wide-area clusters
– HP Continental Clusters
– IBM Geographically Dispersed Parallel Sysplex
• Cascading data replication
[Figure: cascading data replication. Labels: Operational Site (host, disks); High Bandwidth (fiber); Metropolitan/Regional Recovery Facility (host, disks); Tape Backup/Archival; Primary Recovery Site (host, disks)]
[Figure: recovery service offerings, 2000 versus 2004. Labels: Quick Ship; Warm Site and Mobile Recovery; High-Availability-Based Service; Load-Balanced (2+ Sites)]
Disaster Recovery: Market Dynamics
External (dedicated)
External (shared)
Internal
• You have an alternative facility (50 km distant)
• BC vendors have insufficient capacity
• BC is a recognized and respected discipline
• You cannot economically benefit from syndication
• You do not have an alternate facility
• You desire multisite continuous availability or hot standby support
• RTOs/RPOs are very short
• You want to focus on core competencies
• Getting management sign-off for dedicated capital is difficult
• Experience of supporting an invocation is important
• Your planning scenarios include loss of technical staff
Resource Internally or Externally
• Comdisco Recovery Services and Web Availability Services
• IBM Business Continuity Recovery Services and Outsourcing Services
• SunGard Recovery Services and E-Sourcing
• Professional services
• Planning software
• Hot/warm/cold standby
• Mobile/static facilities
• Mainframe/midrange/desktop
• Quick ship
Business Continuity and Internet Services
• Peripherals
• Networks
• Work area
• Specialized ancillary services
such as check processing and
data recovery
What’s new: full-service Web hosting with BC “designed in,” multisite infrastructures for continuous availability, Web site and network “throttling” for performance
North American Business Continuity Market
Full-Service Providers
Cost
Always use competitive
tendering, even at renewal
Keep contracts to three years
Unbundle contract costs
Understand upgrade costs
Specify test time and additional
fees
Declaration fees are negotiable
For unsyndicated equipment,
check cost of self-acquisition
Annual cap fees
Contract Terms
Include early-termination conditions
Miscellaneous
Understand the right of access: “first come, first served” or shared
Check syndication levels, risk
exposures and exclusion zones
Touch the equipment. Visit the
recovery center
Agree to a buy-out schedule
Specify occupancy/comm. fees
Negotiating a Favorable BC Contract: Balance Risk With Economies of Scale