Chapter-8 Business Continuity Planning & Disaster Recovery Planning

Upload: nrpradhan

Post on 05-Apr-2018

  • 8/2/2019 Chapter-8 Business Continuity Planning & Disaster Recovery Planning

    1/17

    Chapter-8 BCP and DRP

    Management involvement:

    Management commitment and involvement are always needed for any major program, and developing a disaster recovery plan is no exception. Better commitment leads to greater funding and support. All the other choices come after management commitment.

    The senior manager of a business unit or division should have ownership of its business continuity plan because of his or her broad role and responsibility in the organization. The parties mentioned in the other choices do not have the same authority and power to make things happen.

    Continuity planning involves more than planning for a move off-site after a disaster destroys a data center. It also addresses how to keep an organization's critical functions operating in the event of disruptions, both large and small. This broader perspective on continuity planning is based on the distribution of computer use and support throughout an organization. The goal is to sustain business operations.

    A well-documented, well-rehearsed, well-coordinated disaster recovery plan allows businesses to focus on surprises and survival. In today's environment, a LAN failure can be as catastrophic as a natural disaster, such as a tornado. Insurance does not cover every loss. Choices (b), (c), and (d) are misconceptions. What is important is to focus on the major unexpected events and implement modifications to the plan as necessary to reclaim control over the business. The key is to ensure survival in the long run.

    Silence is guilt, especially during a disaster. How a company appears to respond to a disaster can be as important as the response itself. If the response is kept in secrecy, the press will assume there is some reason for secrecy. The company should take time to explain to the press what happened and what the response is. A corporate communications professional should be consulted instead of a lawyer due to the specialized knowledge of the former. A spokesperson should be selected to contact the media, issue an initial statement, provide background information, and describe action plans, which are essential to minimize the damage. The company lawyers may add restrictions to ensure that everything is done accordingly, which may not work well in an emergency.

    Management approval is the cornerstone for a successful contingency plan, be it for funding or support. An independent audit and a security review of the plan can validate the soundness of the proposed contingency strategy. Similarly, a legal review can provide assurance that the plans comply with government regulations and that liabilities and exposures are being adequately addressed.

    Mitigation is a long-term activity aimed at eliminating or reducing the probability of an emergency or a disaster occurring. It requires "up-front" money and commitment from management. Choice (b) is incorrect because preparedness is a readiness to respond to undesirable events. It ensures effective response and minimizes damage. Choice (c) is incorrect because response is the first phase after the onset of an emergency. It enhances recovery operations. Choice (d) is incorrect because recovery involves both short- and long-term restoration of vital systems to normal operations.

    The degree of loss caused by a disaster or disruption is directly related to the length of time the disruption affects business operations. Management commitment is always needed. Choice (c) is incorrect because adequate and clear documentation is needed for people to know how to minimize disasters. Choice (d) is incorrect because more resources may be needed to minimize disasters.

    The Board of Directors and senior management are responsible for establishing policies, procedures, and responsibilities for organization-wide contingency planning. The organization's contingency plan should address all critical services and operations that are provided by internal departments and external sources. The Chief Information Officer (choice a) and the Disaster Recovery Manager (choice b) are secondarily responsible for establishing organization-wide contingency planning. These employees execute what the Board of Directors and senior management planned for. The Audit Director (choice d) is responsible for reviewing the adequacy of the plan and issuing a report to the Board of Directors. He or she is not responsible for developing the plan.

    The contingency plan should be a coordinated effort with the objectives of minimizing disruptions of service to the organization, employees, and customers (choices a and b); minimizing financial losses; and ensuring a timely resumption of operations (choice d) in the event of a disaster. Minimizing financial losses on outside contracts is the least important focus at this point (choice c).

    Business impact analysis:

    The purpose of business impact analysis (BIA) is to identify critical functions, resources, and vital records necessary for an organization to continue its critical functions. In this process, the BIA uses both quantitative and qualitative tools. Choices (a), (c), and (d) are examples that use qualitative tools. Anecdotal records constitute a description or narrative of a specific situation or condition.

    The risk analysis is usually part of the business impact analysis. It estimates both the functional and financial impact of a risk occurrence to the organization and identifies the costs to reduce the risks to an acceptable level through the establishment of effective controls.

    BIA is the process of identifying an organization's exposure to the sudden loss of selected business functions and/or the supporting resources (threats) and analyzing the potential disruptive impact of those exposures (risks) on key business functions and critical business operations. The BIA usually establishes a cost (impact) associated with the disruption lasting varying lengths of time.
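
The idea of attaching a cost to disruptions of varying length can be sketched in a few lines of Python. This is only an illustration: the business functions, the hourly loss figures, and the assumption that loss grows linearly with outage time are all hypothetical and do not come from the text.

```python
# Hypothetical hourly loss rates per business function (illustrative figures only).
HOURLY_LOSS = {"order entry": 5_000, "payroll": 800}


def disruption_cost(hourly_loss, hours):
    """Estimated cost of an outage, assuming a constant loss per hour."""
    return hourly_loss * hours


def impact_table(hourly_losses, durations):
    """Map each function to its estimated cost for every outage duration."""
    return {
        name: {hours: disruption_cost(rate, hours) for hours in durations}
        for name, rate in hourly_losses.items()
    }


# Estimated impact of 1-hour, 8-hour, and 24-hour disruptions.
table = impact_table(HOURLY_LOSS, [1, 8, 24])
for name, costs in table.items():
    print(name, costs)
```

A real BIA would rarely use a flat hourly rate, since losses often accelerate as an outage drags on, but the table shape (function by duration) stays the same.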

    The business impact analysis examines business process composition and priorities, business or operating cycles, service levels, and, most importantly, the business process dependency on mission-critical information systems.

    Physical and environmental controls help prevent contingencies. Although many of the other controls, such as logical access controls, also prevent contingencies, the major threats that a contingency plan addresses are physical and environmental threats, such as fires, loss of power, plumbing breaks, or natural disasters.

    The airwaves are not secure and a mobile telephone switching office can be lost during a disaster. The cellular company may need a diverse route from the cell site to another mobile switching office.

    Contingency planning integrates and acts on the results of the business impact analysis. The output of this process is a business continuity plan consisting of a set of contingency plans, with a single plan for each core business process and infrastructure component. Each contingency plan should provide a description of the resources, staff roles, procedures, and timetables needed for its implementation.

    Physical and environmental controls help prevent contingencies. Although many other controls, such as logical access controls, also prevent contingencies, the major threats that a contingency plan addresses are physical and environmental threats, such as fires, loss of power, plumbing breaks, or natural disasters. Logical access controls can address both the software and hardware threats.

    The first step is to consider possible threats including natural (e.g., fires, floods, earthquakes), technical (e.g., hardware/software failure, power disruption, communications interference), and human (e.g., riots, strikes, disgruntled employees, sabotage).

    The second step is to assess impacts from loss of information and services from both internal and external sources. This includes financial condition, competitive position, customer confidence, legal/regulatory requirements, and cost analysis to minimize exposure.

    The third step is to evaluate critical needs. This evaluation should also consider timeframes in which a specific function becomes critical. This includes functional operations, key personnel, information, processing systems, documentation, vital records, and policies and procedures. The final step is to establish priorities for recovery based on critical needs.
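
The last two steps (evaluating the timeframe in which each function becomes critical, then setting recovery priorities) can be sketched as a simple ordering. The function names and timeframes below are hypothetical, and sorting purely by time-to-critical is an assumed simplification; a real plan would weigh other factors as well.

```python
from dataclasses import dataclass


@dataclass
class BusinessFunction:
    name: str
    hours_until_critical: float  # timeframe after which the loss becomes critical


def recovery_priorities(functions):
    """Order function names so the one that becomes critical soonest comes first."""
    ordered = sorted(functions, key=lambda f: f.hours_until_critical)
    return [f.name for f in ordered]


# Hypothetical inventory of functions and their critical timeframes.
functions = [
    BusinessFunction("payroll", 72),
    BusinessFunction("order entry", 4),
    BusinessFunction("archiving", 720),
]

print(recovery_priorities(functions))
```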

    Redundancy:

    A single point of failure occurs when there is no redundancy in data, equipment, facilities, systems, and programs. A failure of a component or element may disable the entire system.
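
One way to make this concrete is to scan an inventory for components that exist only once. The sketch below assumes a hypothetical inventory that maps each component to its number of independent instances; both the component names and the "fewer than two instances" rule are illustrative assumptions.

```python
def single_points_of_failure(inventory):
    """Return the components that have no redundant instance.

    `inventory` maps a component name to the number of independent
    instances available; a count of one or less means no redundancy.
    """
    return sorted(name for name, count in inventory.items() if count <= 1)


# Hypothetical inventory for illustration.
inventory = {"file server": 2, "core router": 1, "UPS": 1, "tape library": 3}
print(single_points_of_failure(inventory))
```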

    Testing:

    Management will not allow stopping of normal production operations for testing a disaster recovery plan. Some businesses operate on a 24x7 schedule and losing several hours of production time is tantamount to another disaster, financially or otherwise.

    The purpose of frequent disaster recovery tests is to ensure recoverability. Review of test results should show that the tests conducted met all planned objectives using files recovered from the backup copies only. This is because of the "no backup, no recovery" principle. Recovery from backup also shows that the backup schedule has been followed regularly. Storing files at a secondary location (off-site) is preferable to the primary location (on-site) because it ensures continuity of business operations if the primary location is destroyed or inaccessible.

    Checklist testing ensures that all the items on the checklists have been reviewed and considered. During structured walk-through testing, the team members meet and walk through the specific steps of each component of the disaster recovery process and find gaps and overlaps. Simulation testing simulates a disaster during nonbusiness hours so normal operations will not be interrupted. Full-interruption testing is not recommended since it activates the total disaster recovery plan. This test is costly and disruptive to normal operations and requires senior management's special approval.

    A parallel test can be performed in conjunction with the checklist test or simulation test. All reports produced at the alternate site should agree with those reports produced at the primary site. A checklist can be used to make sure that all steps are performed. The other three choices do not work well with parallel tests.
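
The requirement that reports from the two sites agree can be checked mechanically, for example by comparing a digest of each report. The following is a minimal sketch that uses Python's standard `hashlib`; the report names and contents are hypothetical, and comparing byte-for-byte digests is an assumed definition of "agree".

```python
import hashlib


def digest(report_bytes):
    """SHA-256 hex digest of a report's contents."""
    return hashlib.sha256(report_bytes).hexdigest()


def mismatched_reports(primary, alternate):
    """Return report names that differ between sites or exist at only one site.

    Both arguments map a report name to its contents as bytes.
    """
    names = set(primary) | set(alternate)
    return sorted(
        name
        for name in names
        if name not in primary
        or name not in alternate
        or digest(primary[name]) != digest(alternate[name])
    )


# Hypothetical reports produced during a parallel test.
primary = {"ledger": b"total=100", "payroll": b"net=42"}
alternate = {"ledger": b"total=100", "payroll": b"net=41"}
print(mismatched_reports(primary, alternate))
```

An empty result would indicate that every report produced at the alternate site matched its primary-site counterpart.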

    Management is interested in finding out what worked (successful) and what did not (unsuccessful) after a recovery from a disaster. The idea is to learn from experience.

    Housing computers in a fire-resistant area is an example of a physically oriented disaster prevention category while the other three choices are examples of procedure-oriented activities. Procedure-oriented actions relate to tasks performed on a day-to-day, month-to-month, or annual basis or otherwise performed regularly. Housing computers in a fire-resistant area with a noncombustible or charged sprinkler area is not regular work. It is part of a computer center building construction plan that happens once in a great while.

    In the case of contingency planning, a test should be used to improve the plan. If organizations do not use this approach, flaws in the plan may remain hidden or uncorrected.

    Full-interruption testing, as the name implies, disrupts normal operations and should be approached with caution.

    The purpose of end-to-end testing is to verify that a defined set of interrelated systems, which collectively support an organizational core business area or function, interoperate as intended in an operational environment. Generally, end-to-end testing is conducted when one major system in the end-to-end chain is modified or replaced, and attention is rightfully focused on the changed or new system. The boundaries on end-to-end tests are not fixed or predetermined but rather vary depending on a given business area's system dependencies (internal and external) and the criticality to the mission of the organization. Full-scale testing is costly and disruptive, while end-to-end testing is least costly. Pilot and parallel testing are not appropriate here.

    Prior to a test drill, the backup facility vendor needs a variety of planning information such as the CPU to be used, including the model number, number of tapes and disks required, operating systems software version, peripherals required with device numbers, and telecommunications needs including modems to establish connection to the telephone company and the test computer. Usually the time window is short and time management is very important considering the many customers the vendor may have. The key information is test time (with starting and ending time frames) so that the number of hours required is known in advance. This helps the vendor to plan computer capacity and resource levels and allocate them among customers competing for the same time slot.

    Disaster recovery plans should not be tested in actual use. That is, a real disaster should not have to occur before the plan's weaknesses are revealed. At that point, the plan's weakness is the organization's disaster. The other three choices are valid approaches to testing. For example, simulation can be used to test different disaster scenarios, the plan can be tested in some locations or departments prior to launching all-out testing, and unannounced testing is recommended with management's permission.

    In stable IT environments, disaster recovery plans should be tested quarterly or semiannually. In dynamic environments where system and network configurations and application systems often change, more frequent testing may be required. The auditor's recommendations are suggestions only. A cost-benefit analysis should be performed. Choice (c) is incorrect because budget allowances should not dictate the frequency of disaster recovery plan testing. Testing should be done in the absence of budgeted amounts if the risk is high. Choice (d) is incorrect because it is too risky to leave disaster recovery plan testing to management's discretion. When the business is down, management may opt to postpone the testing to save money, which is not good for the overall business.

    Alternate sites:

    A warm site has telecommunications ready to be utilized but does not have computers. A cold site is an empty building for housing computer processors later but equipped with environmental controls (e.g., heat, air conditioning) in place. A hot site is a fully equipped building ready to operate quickly. A redundant site is configured exactly like the primary site.

    A cold site is an environmentally protected computer room equipped with air conditioning, wiring, and humidity control for continued processing when the equipment is shipped to the location. The cold site is the least expensive type of backup site, but the most difficult and expensive to test.

    All vendors, regardless of their size, need written contracts for all customers, whether commercial or governmental. Nothing should be taken for granted, and all agreements should be in writing to avoid misunderstandings and performance problems.

    An annualized cost is obtained by multiplying the annual frequency by the expected dollar amount of cost. The product should be a small figure.

    Hot sites are fully equipped computer centers. Some have fire protection and warning devices, telecommunications lines, intrusion detection systems, and physical security. These centers are equipped with computer hardware that is compatible with that of a large number of subscribing organizations. This type of facility is intended to serve an organization that has sustained total destruction and cannot defer computer services. The other choices do not have this kind of support.

    Cold sites do not have equipment so full-scale testing cannot be done until the equipment is installed. Adequate time may not be allowed in reciprocal agreements due to time pressures and scheduling conflicts between the two parties. Full-scale testing is possible with shared contingency centers and hot sites. Shared contingency centers are essentially the same as dedicated contingency centers. The difference lies in the fact that membership is formed by a group of similar organizations which use, or could use, identical hardware.

    Reciprocal agreements do not require nearly as much advance funding as do commercial facilities. They are inexpensive compared to other choices. However, cost alone should not be the overriding factor when making backup facility decisions.

    A dedicated second site eliminates the threat of competition for time and space with other businesses. These benefits coupled with the ever-growing demands of today's data and telecommunications networks have paved the way for a new breed of intelligent buildings that can serve as both primary and contingency site locations. These intelligent buildings employ triple disaster avoidance systems covering power, telecommunications, life support (water and sanitation), and 24-hour security systems. Hot, cold, and warm sites are operated and managed by commercial organizations, while the intelligent site is operated by the user organization.

    A reciprocal agreement is an agreement that allows two organizations to back each other up. While this approach often sounds desirable, contingency planning experts note that this alternative has the greatest chance of failure due to problems in keeping agreements and plans up-to-date as systems and personnel change. A hot site (choice a) is incorrect because it is a building already equipped with processing capability and other services, which is kept up-to-date by commercial vendors. A cold site (choice b) is incorrect because it is a building for housing processors that can be easily adapted for use. A redundant site (choice d) is incorrect because it is a site equipped and configured exactly like the primary site.

    Mutual agreements, also called reciprocal agreements, are least costly. It does not cost any out-of-pocket money to enter into a mutual agreement, just a word. Mutual agreements are not reliable and may not prove workable when needed.

    The other three choices are more expensive when compared to mutual agreements. Shared facilities (choice b) include hot/cold/warm sites and cost money to subscribe. Service bureaus (choice c) also charge money when their facilities are used. If companies own duplicate facilities (choice d) it costs money for the building, equipment, and staff.

    There could be a problem of establishing priorities in resource sharing when simultaneous disasters are declared by several of the hot site's subscribers. Disaster recovery planners need to know what they are getting for their hot site subscription payments and what is being promised to others who also subscribe to their sites. Resource sharing is common among commercial backup facility vendors. Some vendors have their operations reviewed by an independent public accounting firm. Requesting a copy of the external auditor's report will provide an objective understanding of the vendor's resource sharing policies and practices. The other choices are not objective and effective compared to the external auditor's report.

    A cold site is a fully prepared computer room that includes data communications; building security monitoring systems; heat, air-conditioning, and humidity controls; raised floors; and electrical power, not CPU and other computer equipment. In the event of a disaster, the computer vendor delivers the required CPU hardware and peripheral equipment to the empty shell facility.

    A hot site backup is most costly because it is fully equipped and ready to operate. On the other hand, a mutual backup site agreement is least costly. In the hot site backup, fully equipped commercial computer facilities are used in case of a disaster. A mutual backup site agreement is least costly. However, mutual agreements are not reliable and may not prove workable when needed. Choice (c) is incorrect because a cold site backup is not as expensive as hot site backup. However, it is more expensive than a mutual backup site agreement. Choice (d) is incorrect because off-site archival storage of data is not as expensive as a hot site backup. An off-site storage place could be owned by the same organization wanting to process the data.

    The backup or alternate processing installation should be a reasonable distance away from the primary installation. Ideally, the backup installation should be far enough away to be on a different electric power grid or free from the same natural disaster (e.g., earthquake, hurricane) but close enough to be reached quickly.

    The backup site does not need to have a "mantrap" system due to the short duration of recovery. Choice (b) is incorrect because the backup site need not have security guards. Choice (d) is incorrect because the backup site need not be a service bureau. Choices (a), (b), and (d) are minor concerns.

    Plan document:

    The plan document contains only the why, what, when, where, and who, not how. The "how" deals with detailed procedures and information required to carry out the actions identified and assigned to a specific recovery team. This information should not be in the formal plan as it is too detailed and should be included in the detail reference materials as an appendix to the plan.

    The "why" describes the need for recovery, the "what" describes the critical processes and resource requirements, the "when" deals with critical time frames, the "where" describes recovery strategy, and the "who" indicates the recovery team members and support organizations. Keeping the "how" information in the plan document confuses people, making it hard to understand and creating a maintenance nightmare.

    Backup:

    It is a fact that there is no recovery without a backup. A procedure is linked to a policy. There is no protection without security controls. "No backup, no recovery" is applicable to a contingency plan.

    The first step toward protecting data is a comprehensive inventory of all servers, workstations, applications, and user data throughout the organization. Once a comprehensive study of this type is completed, various backup, access, storage, availability, and retention strategies can be evaluated in order to determine which strategy best fits the needs of an organization.

    It is true that during a disaster not all application systems have to be supported while the LAN is out of service. Some LAN applications may be handled manually (choice a), some as stand-alone PC tasks (choice d), while others need to be supported off-site (choice c). While these duties are clearly defined, it is not so clear which users must secure and back up their own data. It is important to communicate to users that they must secure and back up their own data until normal LAN operations are resumed. This is often a missing link in developing a LAN methodology for contingency planning.

    Normally, the primary contingency strategy for applications and data is regular backup and secure off-site storage. Important decisions to be addressed include how often the backup is performed (choice a), how often it is stored off-site (choice b), and how it is transported to storage, to an alternate processing site, or to support the resumption of normal operations (choice d). How often the backup is used is not relevant because it is hoped that it may never have to be used.

    The location of the electronic vault will vary based on the disaster recovery and contingency planning alternative chosen. The electronic vault device could be located in a site designed specifically to house the electronic vault. It could be located in a hot/cold site, or it could be located in an alternate computing site. Storing backup data in the same location as the primary computer is risky and not advised because both the original and backup files can be destroyed in a disaster (choice a). It is beneficial to situate the electronic vault device as part of a commercial off-site storage facility or backup computer site.

    The electronic vault impacts the number and frequency of data backups. It alters the way application systems are designed and operated in terms of file design and backup schedules. For example, only changes in data files since the last backup need to be transmitted (i.e., incremental file backup), and the changes can be transmitted every hour or instantly.
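
The incremental idea (transmit only what changed since the last backup) can be sketched as a filter over file modification times. The file names and timestamps below are hypothetical, and using a last-modified timestamp as the change indicator is an assumption; real vaulting products may track changes at the record or block level instead.

```python
def changed_since(files, last_backup_time):
    """Return the paths modified after the last backup.

    `files` maps a path to its last-modified timestamp (any comparable
    unit, e.g. seconds since the epoch).
    """
    return sorted(
        path for path, modified in files.items() if modified > last_backup_time
    )


# Hypothetical catalogue of data files and modification timestamps.
files = {"accounts.dat": 1500, "orders.dat": 900, "customers.dat": 1200}
to_transmit = changed_since(files, last_backup_time=1000)
print(to_transmit)
```

Only the files returned here would be sent to the vault in the next cycle; running the same filter hourly corresponds to the "every hour" schedule mentioned above.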

    The availability of electronic vaulting increases the speed with which information can be retrieved. Traditionally, backup information is stored locally (on-site) and in an off-site vault because of the long retrieval time should it be required. Electronic vaulting eliminates on-site storage of data. An optical disk is a good storage medium for electronic vaulting due to its large capacity and quick retrievability.

    Depending on the size and sophistication of the computing environment, the electronic vault storage media consist of a combination of mass storage, optical disk, magnetic disk, and tape/cartridge library. For example, mass storage can be used to store magnetic disk files, and optical disks can be used to store paper documents.

    For some organizations, time becomes money. Increased system reliability improves the likelihood that all the information required is available at the electronic vault. If data can be retrieved immediately from the off-site storage, less is required in the computer center (choice c). It reduces retrieval time from hours to minutes (choice d). Since electronic vaulting eliminates tapes and tapes are a hindrance to automated operations, electronic vaulting supports automation (choice a).

    Daily backups taken to off-site storage facilities can minimize the damage. The whole company can suffer when disaster strikes. There is no room for complacency. Even hot/warm/cold sites and mutual agreements (choices a through c) require backups to continue with business operations. "No backup, no recovery" should be practiced.

    Microcomputer software and file backups need not always be kept at an off-site location. Depending on the importance of the information, storage of backup diskettes in another part of the building may be sufficient protection. However, this decision should be based on risk assessment rather than on ease of access. All backup diskettes should be adequately labeled to identify owner, use, and retention period. The storage location, whether on-site or off-site, should be environmentally controlled and secure, with procedural provisions for restricting physical access to authorized personnel.

    Hardware backup is the first step in contingency planning. All computer installations must include formal arrangements for alternative processing capability in the event their data center or any portion of the work environment becomes disabled. These plans can take several forms and involve the use of another data center. In addition, hardware manufacturers and software vendors can be helpful in locating an alternate processing site and in some cases will be able to provide backup equipment under emergency conditions. The more common plans are service bureaus, reciprocal arrangements, and hot sites. After hardware is backed up, operating systems software is backed up next, followed by applications software backup and documentation.

    Planning:

    The data center should be constructed in such a way as to minimize exposure to fire, water damage, heat, or smoke from adjoining areas. Other considerations include raised floors, sprinklers, or fire detection and extinguishing systems and furniture made of noncombustible materials. All these considerations should be taken into account in a cost-effective manner at the time the data (computer) center is originally built. Add-ons will not only be disruptive but also costly.

    There are three basic cost elements associated with alternate processing

    remote computing.

    A contingency plan should consider three things: incident response, backup operations, and recovery. The purpose of incident response (choice a) is to mitigate the potentially serious effects of a severe LAN security-related problem. It requires not only the capability to react to incidents but also the resources to alert and inform the users if necessary.

    Backup operation (choice c) plans are prepared to ensure that essential tasks can be completed subsequent to disruption of the LAN environment and can continue until the LAN is sufficiently restored.

    Recovery plans (choice d) are made to permit smooth, rapid restoration of the LAN environment following interruption of LAN usage. Supporting documents should be developed and maintained that will minimize the time required for recovery. Priority should be given to those applications and services that are deemed critical to the functioning of the organization. Backup operation procedures should ensure that these critical services and applications are available to users.

    Since there are many types of disasters that can occur, it is not practical to consider all such disasters. Doing so is cost prohibitive. Hence, disaster recovery planning exercises should focus on major types of disasters that occur frequently. One approach is to perform risk analysis to determine the annual loss expectancy, which is calculated from the frequency of occurrence of a possible loss multiplied by the expected dollar loss per occurrence.
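
The annual loss expectancy calculation described above is a single multiplication, and ranking scenarios by the result shows where planning effort might go first. The scenarios, frequencies, and dollar figures in this sketch are hypothetical and chosen only to illustrate the arithmetic.

```python
def annual_loss_expectancy(annual_frequency, loss_per_occurrence):
    """Frequency of occurrence per year times expected dollar loss per occurrence."""
    return annual_frequency * loss_per_occurrence


# Hypothetical scenarios: (occurrences per year, expected loss per occurrence).
scenarios = {
    "LAN failure": (4, 5_000),
    "fire": (0.5, 60_000),
    "tornado": (0.125, 80_000),
}

# Rank scenarios from highest to lowest annual loss expectancy.
ranked = sorted(
    scenarios,
    key=lambda name: annual_loss_expectancy(*scenarios[name]),
    reverse=True,
)
for name in ranked:
    print(name, annual_loss_expectancy(*scenarios[name]))
```

Note that a frequent but cheap event (the LAN failure here) can outrank a rare but expensive one, which is why both factors enter the calculation.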

    The mix and composition of the disaster recovery team is important as it requires appropriate and competent people to develop, test, and maintain the plan. For example, a representative from each affected area of the organization should be a part of the plan development team. This mix of people provides a broader perspective of the organization.

    Usually, telephone service is taken for granted by the recovery team members. Consequently, it is not addressed in the planning stage. However, alternate phone services should be explored. The other three choices are usually considered due to familiarity and vendor presence.

    Planning Process:

    The correct sequence to follow to handle disasters is to plan, test, respond, recover, and continue.

    Both underwriters and management are concerned about risk reduction, availability of specific insurance coverage, and its total cost. A good disaster recovery plan addresses these concerns. However, a good plan is not a guarantee for lower insurance rates in all circumstances. Insurance rates are determined based on averages obtained from loss experience, geography, management judgment, the health of the economy, and a host of other factors. Total cost of insurance depends on the specific type of coverage obtained. It could be difficult or expensive to obtain insurance in the absence of a disaster recovery plan. Insurance provides a certain level of comfort in reducing risks but it does not provide the means to ensure continuity of business operations.

    The business continuity planning process should safeguard an organization's ability to provide a minimum acceptable level of outputs and services in the event of failures of internal and external mission-critical information systems and services. The planning process should link risk management and risk mitigation efforts to operate the organization's core business processes.

    The information systems security officer should ensure that the existing contingency and disaster recovery plans are updated and incorporated into the business continuity plan. The officer should examine the worst-case scenario to ensure that a feasible backup strategy can be successfully implemented.

    It is important to ensure that individuals responsible for the various business continuity and contingency planning activities are held accountable for the successful completion of individual tasks and that the core business process owners are responsible and accountable for meeting the milestones for the development and testing of contingency plans for their core business processes.

    Contingency planning involves more than planning for a move off-site after a disaster destroys a data center. It also addresses how to keep an organization's critical functions operating in the event of disruptions, both large and small. This broader perspective on contingency planning is based on the distribution of computer support throughout an organization. The correct sequence of steps is as follows:

    - Identifying the mission- or business-critical functions
    - Identifying the resources that support the critical functions
    - Anticipating potential contingencies or disasters
    - Selecting contingency planning strategies

    A top-down approach to contingency planning includes the following:

    - Conduct impact analysis
    - Plan design
    - Plan development
    - Plan testing
    - Plan maintenance

    The top-down approach involves senior management, line management, IS management, IS auditors, and end users. The bottom-up approach (choice a) is not recommended for the first-time development of the plan. It is suggested for the maintenance of the plan. Calling other companies in the same industry (choice b) or calling a commercial backup service provider (choice c) first requires a top-down plan developed by the company.

    The health and safety of employees and the general public should be the first concern during a disaster situation. The second concern should be to minimize the disaster's economic impact on the organization in terms of revenues and sales. The third concern should be to limit or contain the disaster. The fourth concern should be to reduce physical damage to property, equipment, and data.

    Most disaster recovery plans focus on data processing functions, not other functions within the organization. The IS management may assume that functional users will be responsible for their areas. With increased automation of business functions, a certain amount of coordination and planning is required between the IS management and the functional user management. Roles and responsibilities of team members are often defined (choice a), threats and vulnerabilities are analyzed (choice b), and impacts are analyzed (choice d), although they may not be documented.

    The most important benefit of a comprehensive disaster recovery plan is to provide continuity of operations followed by protection of assets, increased security, and reduced insurance costs. Assets can be acquired if the business is operating and profitable. There is no such thing as 100% security. Self-insurance can be assumed by the company.

    If the business can survive without its telecommunications network for the period of time needed to restore the network, the expense of network backup may not be necessary. On the other hand, if the network backup is essential, it is worth the cost. It is a management call. Relying on hard-copy reports is not a viable solution for most situations. Choice (b) is incorrect because backup for telecommunications goes beyond storing parent-generation magnetic tapes off-site. It deals with network lines, nodes, equipment, telephone carriers, circuits, etc. Choice (c) is incorrect because backup for telecommunications goes beyond storing current-generation transaction files off-site. It deals with network lines, nodes, equipment, telephone carriers, circuits, etc.

    The reality is that contingency plans require contingencies. Problems or delays should be anticipated and planned for. Fall-back or alternative solutions need to be planned out in advance in case the original plan does not work for whatever reason. Contingency plan documentation (choice a) is important but it is not critical when compared to choice (d). It is important that contingency plans be communicated (choice b) but it is not critical when compared to choice (d). It is important that contingency plans be understood (choice c) but it is not critical when compared to choice (d).

    The key word is to "maintain" an existing plan. Knowledge obtained from testing the plan is useful in refining the plan (bottom-up approach). The changes from the management and business conditions and their impact should be considered (top-down approach) when updating the plan. Therefore, a combination of top-down and bottom-up approaches is very useful to maintain a disaster recovery and contingency plan.

    Regulatory requirements dictate the length of time a particular record or document must be retained by an organization to support its business activities. Insurance requirements are dictated by regulatory requirements. Auditors review compliance with such requirements. The accounting department, similar to other departments in the organization, must also comply with regulatory requirements. Prior to records retention, each organization must identify what records and documents are vital to its operations.

    Preparation:

    Restarting critical applications at the alternate facility is the greatest concern. There are several support details that can affect the speed at which critical tasks can be restarted. Therefore, it will be helpful if the alternate facility can assist in finding housing for personnel when the alternate facility is located out-of-state. The other matters for consideration include installing equipment, operating the hardware, and loading software, master files, and databases.

    There are several considerations that should be reflected in the backup site location. The optimum facility location is (1) close enough to allow the backup function to become operational quickly, (2) unlikely to be affected by the same contingency, (3) close enough to serve its users, and (4) convenient to airports, major highways, or train stations when located out of town.

    Recovery:

    Human resource policies and procedures impact employees involved in the response to a disaster. Specifically, they include extended work hours, overtime pay, compensatory time, living costs, employee evacuation, medical treatment, notifying families of injured or missing employees, emergency food, and cash during recovery. The scope covers the pre-disaster plan, emergency response during recovery, and post-recovery issues. The major reason for ignoring the human resource issues is that they encompass many items requiring extensive planning and coordination, which take a significant amount of time and effort.

    The goal is to capture all data points necessary to restart a process without loss of any product in the work-in-progress status. The recovery team should recover all applications to the actual point of the interruption.

    Documenting the recovery plan should be done first and be available to use during a recovery. The amount of time in developing the plan has no bearing on the recovery from a disaster. On the other hand, the amount of time spent on the other three choices and the degree of perfection attained in those choices will definitely help in reducing the recovery time after a disaster strikes. The more time spent on these three choices the better the quality of the plan.

    The selection of a contingency planning strategy should be based on practical considerations, including feasibility and cost. Risk assessment can be used to help estimate the cost of options to decide an optimal strategy. Whether the strategy is on-site or off-site, a contingency planning strategy normally consists of emergency response, recovery, resumption, and implementation.

    In emergency response, it is important to document the initial actions taken to protect lives and limit damage. In recovery, the steps that will be taken to continue support for critical functions should be planned. In resumption, what is required to return to normal operations should be determined. The relationship between recovery and resumption is important. The longer it takes to resume normal operations, the longer the organization will have to operate in the recovery mode. In implementation, it is necessary to make appropriate preparations, document the procedures, and train employees. Emergency response and implementation do not have the same relationship as recovery and resumption do.

    Incorporation of recovery requirements into system design will provide automatic backup and recovery procedures. This helps to prepare for disasters in a timely manner. Choice (a) is incorrect because training every employee in emergency procedures does not guarantee that they will respond to a disaster in an optimal manner when needed. Choice (b) is incorrect even though conducting fire drills every month is a good practice. Disaster recovery goes beyond fire drills. Choice (c) is incorrect because it is not necessary to train all IT staff in file rotation procedures. Only key people need to be trained.

    Emergency response:

    Emergency response procedures are those procedures initiated immediately after an emergency occurs in order to (1) protect life, (2) protect property, and (3) minimize the impact of the emergency (loss control). Maximizing profits can be practiced during non-emergency times but not during an emergency.

    Reporting:

    The post-incident review after a disaster has occurred should focus on what happened, what should have happened, and what should happen next, but not on who caused it. Blaming people will not solve the problem.

    Hidden costs are not insurable expenses and include (1) unemployment compensation premiums resulting from layoffs in the work force, (2) increases in advertising expenditures necessary to rebuild the volume of business, (3) cost of training new and old employees, and (4) increased cost of production due to decline in overall operational efficiency. Generally, traditional accounting systems are not set up to accumulate and report the hidden costs. Opportunity costs are not insurable expenses. They are costs of foregone choices, and accounting systems do not capture these types of costs. Both direct and variable costs are insurable expenses and are captured by accounting systems.

    Costs:

    Manual tape processing has the tendency to cause problems at restore time. Multiple copies of files exist on different tapes. Finding the right tape to restore can become a nightmare, unless the software product has automated indexing and labeling features. Restoring files is costly due to the considerable human intervention required, causing delays. Until the software is available to automate the file restoration process, costs continue to be higher than the other choices. Backing up refers to a duplicate copy of a data set that is held in storage in case the original data are lost or damaged. Archiving refers to the process of moving infrequently accessed data to less accessible and lower cost storage media. Journaling applications post a copy of each transaction to both the local and remote storage sites when applicable.

    Media:

    These qualities cannot be generated quickly during a crisis. They take a long time to develop and maintain, long before a disaster occurs. On the other hand, media relationships require a proactive approach during a disaster. This includes distributing an information kit to the media at a moment's notice. The background information about the company in the kit must be regularly reviewed and updated. When disaster strikes, it is important to get the company information out early. By presenting relevant information to the media, more time is available to manage the actual day-to-day aspects of crisis communications during the disaster.

    Disaster:

    The main hazards caused by hurricanes most often involve the loss of power, flooding, and the inability to access facilities. Businesses may also be impacted by structural damage. Hurricanes are the only events that give warnings before the disaster strikes. Excessive rains lead to floods. Earthquakes do not give warnings.

    The first thing is to declare the disaster as soon as the warning sign is known. Protecting the business site is instrumental in continuing or restoring operations in the event of a hurricane. Ways to do this include an uninterruptible power supply (batteries and generators), a backup water source, and a supply of gasoline-powered pumps to keep the lower levels of the facility clear of flood waters. Boarding up windows and doors is good to protect buildings from high-speed flying debris.

    Power and air-conditioning requirements need to be determined in advance to reduce the installation time frames. This includes diesel power generators, fuel, and other associated equipment. Media communications include keeping in touch with radio, television, and newspaper firms. The call tree list should be kept current all the time so that the employee and vendor notification process can begin as soon as the disaster strikes. This list includes primary and secondary employee names and phone numbers as well as escalation levels.

    Focus:

    Discretionary expense means management can decide whether to spend money on a particular item. When revenues or profits fall, management can cut the discretionary expenses to reach their targeted profit goals. These may include advertisement, training, and disaster recovery plans. Cutting the disaster recovery expense may not be a good choice since no one knows when a disaster might strike an organization. Then, it is too late to do anything.

    The focus of disaster recovery planning should be on protecting the organization against the consequences of a disaster, not on the probability that it may or may not happen.

    Communications:

    A call tree diagram shows who to contact when a required person is not available or not responding. The call tree shows the successive levels of people to contact if no response is received from the lower level of the tree. It shows the backup people when the primary person is not available. A decision tree diagram will show all the choices available with their outcomes to make a decision. An event tree diagram can be used in project management, and a parse tree diagram can be used in estimating probabilities and nature of states in software engineering.
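
    The escalation behavior of a call tree can be sketched as a simple lookup that moves to the next level when no response is received. This is a minimal illustration, not a notification system: the role, the contact names, and the is_reachable callback are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical call tree: for each role, contacts are listed in escalation order
# (primary first, then backup, then the next escalation level).
CALL_TREE: Dict[str, List[str]] = {
    "network recovery": [
        "A. Rivera (primary)",
        "B. Chen (backup)",
        "C. Okafor (escalation)",
    ],
}


def first_reachable(
    role: str,
    call_tree: Dict[str, List[str]],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Walk the escalation list for a role and return the first contact who responds."""
    for contact in call_tree.get(role, []):
        if is_reachable(contact):
            return contact
    return None  # nobody responded; the caller has to escalate outside the tree


# Example: the primary does not answer, so the backup is selected.
unavailable = {"A. Rivera (primary)"}
print(first_reachable("network recovery", CALL_TREE, lambda c: c not in unavailable))
```

    Keeping the tree as plain data makes it easy to review and update, which matters because the list is only useful if it is kept current.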

    Applications:

    It is important to define applications into certain categories to establish processing priority. For example, the time for recovery of applications in category I is 72 hours after disaster declaration (high priority). The time frame for recovery of category IV applications is three months after disaster declaration (low priority).

    Preparation:

    The decision to activate a disaster recovery plan is made after damage assessment and evaluation is completed. A list of equipment, software, forms, and supplies needed to operate contingency category I (high priority) applications should be available to use as a damage assessment checklist.

    Interdependencies:

    The primary objectives of information systems security include integrity, availability, and confidentiality. Contingency plans support system availability by restoring or recovering from a disaster as quickly as possible. The other three choices support security management practices.

    It is important to develop and implement a strategy for validating the business continuity plan within the time that remains. A typical strategy defines a minimum number of individual and joint exercises that combine training and testing. There are several common techniques that can be employed, including reviews, rehearsals, and quality assurance reviews. Rehearsals include test drills and team member role plays.

    Triggers:

    It is important to document triggers for activating contingency plans. The information needed to define the implementation triggers for contingency plans is the deployment schedule for each contingency plan and the implementation schedule for the replaced mission-critical systems.

    Tasks:

    Data backup is the responsibility of system owners. The other three choices are the responsibility of contingency recovery and response teams.

    Processing requirements are the responsibility of system owners.

    Policies:

    A comprehensive disaster recovery plan is separate from but complementary to the security policy document. Both items go hand in hand.

    LAN Planning:

    Many physical problems in LANs are related to cables (choice a) since they can be broken or twisted. Servers (choice b) can be physically damaged due to disk head crash or power irregularities such as over- or under-voltage conditions. Uninterruptible power supply (choice c) provides power redundancy and protection to servers and workstations. Servers can be disk duplexed for redundancy. Redundant topologies such as star, mesh, or ring can provide a duplicate path should a main cable link fail. Hubs require physical controls such as lock and key since they are stored in wiring closets, although they can also benefit from redundancy, which can be very expensive. Given the choices, it is preferable to have redundant facilities for cables, servers, and power supplies.

    Controls:

    As a part of the preventive control category, fail soft is a continuity control. It is the selective termination of affected nonessential processing when a hardware or software failure is detected in a computer system. A computer system continues to function because of its resilience. Choice (b) is incorrect because accuracy controls include data editing and validation routines. Choice (c) is incorrect because completeness control looks for the presence of all the required values or elements. Choice (d) is incorrect because consistency controls ensure repeatability of certain transactions with the same attributes.
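
    The fail-soft idea of selectively terminating affected nonessential processing can be sketched as a filter over running jobs. This is only an illustration under assumed names: the Job fields and the sample jobs are hypothetical, and a real system would detect failures and stop processes rather than sort a list.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Job:
    name: str
    essential: bool  # must keep running for the business to operate
    affected: bool   # depends on the component where the failure was detected


def fail_soft(jobs: List[Job]) -> Tuple[List[Job], List[Job]]:
    """Return (kept, terminated): stop only jobs that are affected and nonessential."""
    kept: List[Job] = []
    terminated: List[Job] = []
    for job in jobs:
        if job.affected and not job.essential:
            terminated.append(job)
        else:
            kept.append(job)
    return kept, terminated


# Hypothetical workload at the moment a failure is detected.
jobs = [
    Job("order entry", essential=True, affected=True),
    Job("monthly report", essential=False, affected=True),
    Job("payroll", essential=True, affected=False),
    Job("archive cleanup", essential=False, affected=False),
]

kept, terminated = fail_soft(jobs)
print("still running:", [j.name for j in kept])
print("terminated:", [j.name for j in terminated])
```

    Essential work keeps running even when it is affected, which is the sense in which the system continues to function in a degraded state.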

    Storage media has nothing to do with information availability. Data will be stored somewhere on some media. It is not a decision criterion.

    Management's goal is to gather useful information and to make it available to authorized users. System backup and recovery procedures (choice a) and alternate computer equipment and facilities (choice d) will help ensure that the recovery is as timely as possible. Both physical and logical access controls become important (choice c). System failures and other interruptions are common.

    Mean-time-to-repair (MTTR) is the amount of time it takes to resume normal operation. It is expressed in minutes or hours taken to repair computer equipment. The smaller the MTTR for equipment the more reliable it is. Mean-time-between-failures (MTBF) is the average length of time the hardware is functional. MTBF is expressed as the average number of hours or days between failures. The higher the MTBF the better reliability a system has. Redundant computer hardware (choice b) and backup computer facilities (choice c) are incorrect because they are examples of system availability controls. They also address contingencies in case of a computer disaster (choice d).
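
    The two measures can be computed from an outage log as simple averages. The sketch below uses hypothetical figures; it also combines them with the commonly used steady-state relation availability = MTBF / (MTBF + MTTR), which is a standard reliability formula and not something stated in this chapter.

```python
def mtbf(total_operating_hours: float, failure_count: int) -> float:
    """Mean time between failures: average functional hours per failure."""
    if failure_count <= 0:
        raise ValueError("at least one failure is needed to compute MTBF")
    return total_operating_hours / failure_count


def mttr(total_repair_hours: float, repair_count: int) -> float:
    """Mean time to repair: average hours needed to resume normal operation."""
    if repair_count <= 0:
        raise ValueError("at least one repair is needed to compute MTTR")
    return total_repair_hours / repair_count


def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Assumed standard relation (not from the chapter): MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical log: 990 functional hours, 5 failures, 10 hours of repair in total.
server_mtbf = mtbf(990, 5)   # 198 hours between failures
server_mttr = mttr(10, 5)    # 2 hours per repair
print(server_mtbf, server_mttr, steady_state_availability(server_mtbf, server_mttr))
```

    A higher MTBF or a lower MTTR both push the resulting fraction toward 1.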

    Tests:

    Drills give disaster recovery team members the opportunity to think through their tasks without the pressure of being measured or graded. Exercises should periodically be conducted unannounced to more closely simulate the pressure of a real disaster.

    The other three choices do not demonstrate the ability to respond when needed. A written plan is no good if it is not tested. There are several types of testing: reviews, analyses, and simulations of disasters. Drills and exercises are examples of simulation.

    Skills:

    The disaster recovery coordinator does not need the highly technical skills of a programmer, systems analyst, hardware specialist, or network administrator. However, the coordinator should be able to communicate with technical staff and adequately interpret what they say in order to communicate it effectively and clearly to nontechnical users and management.

    Applications:

    Since application systems are designed to provide data and information to end users, end users are in a better position to assess the value or usefulness of the system to their business operations. Input from the other three parties (application programmers, computer operators, and auditors) is important but not as important as that of end users. The view of those other parties is limited.

    An application system priority analysis should be performed to determine the business criticality for each computer application. A priority code should be assigned to each production application system that is critical to the survival of the organization. The priority code tells people how soon the application should be processed when the backup computer facility is ready. This will help in restoring the computer system following a disaster and facilitate the development of a recovery schedule.
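
    Turning priority codes into a recovery schedule amounts to ordering the applications by their code. The sketch below assumes a numeric code where 1 is the most critical; that numbering and the application names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Application:
    name: str
    priority_code: int  # assumed convention: 1 = most critical


def recovery_schedule(apps: List[Application]) -> List[Application]:
    """Order applications for restart at the backup facility, most critical first."""
    # Ties on priority are broken by name so the schedule is deterministic.
    return sorted(apps, key=lambda app: (app.priority_code, app.name))


# Hypothetical production applications with assigned priority codes.
apps = [
    Application("general ledger", 2),
    Application("order processing", 1),
    Application("employee newsletter", 4),
    Application("billing", 1),
]

for position, app in enumerate(recovery_schedule(apps), start=1):
    print(position, app.name, f"(priority {app.priority_code})")
```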

    Off-Site:

    Commercial off-site storage facilities are used to store data and program files on magnetic media and system-related documentation, among other things. To reduce expenses, users of commercial off-site storage facilities often share their room or area with other users. This may sound good but be aware of security and access control problems and issues. Usually, fire protection (choice a), temperature and humidity controls (choice b), and physical security (choice d) are reasonably adequate. They are the basic controls needed to operate.

    The selection of an off-site storage vendor is an important process that should be done with proper care. The selection of the right backup storage facility vendor is critical in terms of the vendor's performance and capabilities. The number of employees the vendor has is less important compared to other criteria.

    Media management and environmental factors (choice a), management reputation and site physical security (choice c), and transportation capabilities (choice d) are very important because they can make a big difference in vendor selection. For example, if transportation capabilities are less than desirable, the entire decision could change. It would be risky if there were no log maintained to record the movement and storage of media.

    Besides vendor fees, there are other items of information that the disaster recovery coordinator should obtain in writing from the off-site storage vendor. Requiring that the same person deliver and pick up the media all the time is not practical or necessary if other controls are effective. The other three choices are very important.

    The difficult aspect of the disaster recovery plan is keeping it up-to-date with all the changes that occur. Depending on how frequently the organization's systems and procedures change, a review of the off-site vendor and backup computer vendor facilities should be conducted once a quarter or semiannually. Generally, the review does not include whether the vendor has enough computer capacity to serve, which is a long-term question.

    The process of transmitting backup information directly to an off-site storage vault is called electronic vaulting. It uses telephone lines and networks to transmit the data. This process can be reversed as well. Electronic vaulting takes less time than the other three choices even though they are also acceptable mechanisms. Choices (a), (b), and (d) are incorrect because special courier, regular courier and special messenger are not as fast as electronic vaulting.

    Goals:

    Data integrity and availability are two important elements of reliable computing. Data integrity is the concept of being able to ensure that data can be maintained in an unimpaired condition and is not subject to unauthorized modification, whether intentional or inadvertent. Products such as backup software, anti-virus software, and disk repair utility programs help protect data integrity in personal computers (PCs) and workstations. Availability is the property that a given resource will be usable during a given time period. PCs and servers are becoming an integral part of complex networks with thousands of hardware and software components (e.g., hubs, routers, bridges, databases, directory services) and the complex nature of client/server networks drives the demand for availability. System availability is increased when system downtime or outages are decreased and when fault-tolerant hardware and software are used.

    Data security, privacy, and confidentiality are incorrect because they deal with ensuring that data is disclosed only to authorized individuals and have nothing to do with reliable computing. Modularity deals with the breaking down of a large system into small modules. Portability deals with the ability of application software source code and data to be transported without significant modification to more than one type of computer platform or more than one type of operating system. Portability has nothing to do with reliable computing. Feasibility deals with the degree to which the requirements can be implemented under existing constraints.

    Disaster recovery plans protect against the economic and intrinsic losses (e.g., lost sales, lost profits) suffered by a company while insurance policies protect against the physical and tangible losses (e.g., buildings, inventory, and equipment).

    System availability is expressed as the ratio of the number of hours the system is available to the users during a given period to the scheduled hours of operation. Overall hours of operation also include sufficient time for scheduled maintenance activities. Scheduled time is the hours of operation, and available time is the time during which the computer system is available to the users.
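
    The availability rate defined above is a single division of available hours by scheduled hours. The sketch below uses hypothetical monthly figures; the function name and the input checks are illustrative choices.

```python
def availability_rate(available_hours: float, scheduled_hours: float) -> float:
    """Hours the system was available to users divided by scheduled hours of operation."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled hours must be positive")
    if not 0 <= available_hours <= scheduled_hours:
        raise ValueError("available hours must lie between 0 and scheduled hours")
    return available_hours / scheduled_hours


# Hypothetical month: 200 scheduled hours of operation, 20 of them lost to outages.
scheduled = 200.0
available = scheduled - 20.0
print(f"availability: {availability_rate(available, scheduled):.1%}")
```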

    Insurance:

    According to insurance industry estimates, every dollar of insured loss is accompanied by three dollars of uninsured economic loss. This suggests that companies are insured for only about one quarter of the potential consequences of a disaster (one dollar in every four) and that insurance truly is a disaster recovery plan of last resort.
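
    Read literally, one insured dollar accompanied by three uninsured dollars means the insured loss is one third of the uninsured loss and one quarter of the total loss. The short sketch below works through that arithmetic; the function name is an illustrative choice and the only input is the ratio quoted above.

```python
def insured_share(insured_loss: float, uninsured_loss: float) -> float:
    """Fraction of the total loss (insured plus uninsured) that insurance covers."""
    total = insured_loss + uninsured_loss
    if total <= 0:
        raise ValueError("total loss must be positive")
    return insured_loss / total


# The quoted ratio: 1 dollar insured for every 3 dollars uninsured.
share = insured_share(1.0, 3.0)
print(f"insured share of total loss: {share:.0%}")
```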

    The major purpose of insurance is to transfer risk to others. Without proper planning, a company might be over-insured, which costs money. This can easily happen when hardware is leased and the lease agreement has built-in coverage for recovery and restoration services. A company can be doubly insured on its hardware platforms with its own separate coverage, causing an excessive insurance amount.

    The other choices do not deal with insurance, per se. For example, commercial software with problems (choice b) can easily be replaced for a nominal fee from the software vendor. Competitive bids from underwriters (choice c) can reduce the premium amount. Data storage media problems (choice d) can be fixed with replacement and/or reproduction coverage.

    A traditional computer data processing insurance policy covers equipment, buildings, and storage media recreation. It does not provide coverage for the consequences of the loss of computer equipment or its inaccessibility. The coverage is focused on the repair and replacement of the computer equipment. What is needed is a policy that not only replaces the damaged equipment but also covers the cost of alternative processing while the equipment is unavailable.

    Since insurance reduces or eliminates risk, the best insurance is the one commensurate with the most common types of risks to which a company is exposed. Choice (a) is incorrect because a basic policy covers specific named perils including fire, lightning, windstorm, etc. Choice (b) is incorrect because a broad policy covers additional perils such as roof collapse and volcanic action. Choice (c) is incorrect because a special all risk policy covers everything except specific exclusions named in the policy.

    An effective insurance recovery program does not alter, substitute, or eliminate the need for a comprehensive disaster recovery plan but rather complements it. This is because both have separate but useful purposes.

