
RAID
From Wikipedia, the free encyclopedia


RAID, an acronym for Redundant Array of Independent Disks (formerly Redundant Array of Inexpensive Disks), is a technology that provides increased storage functions and reliability through redundancy, combining multiple disk drive components into a logical unit where all drives in the array are interdependent. This concept was first defined by David A. Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987 as Redundant Arrays of Inexpensive Disks.[1] Marketers representing industry RAID manufacturers later attempted to reinvent the term to describe a redundant array of independent disks, as a means of dissociating a low-cost expectation from RAID technology.[2]

RAID is now used as an umbrella term for computer data storage schemes that can divide and replicate data among multiple disk drives. The schemes or architectures are named by the word RAID followed by a number (e.g., RAID 0, RAID 1). The various designs of RAID systems involve two key goals: increased data reliability and increased input/output performance. When multiple physical disks are set up to use RAID technology, they are said to be in a RAID array.[3] This array distributes data across multiple disks, but the array is addressed by the operating system as a single disk. RAID can be set up to serve several different purposes.

Contents

1 Standard levels
2 Nested (hybrid) RAID
3 RAID Parity
4 RAID 10 versus RAID 5 in Relational Databases
5 New RAID classification
6 Non-standard levels
7 Data backup
8 Implementations
  8.1 Software-based RAID
  8.2 Hardware-based RAID
  8.3 Firmware/driver-based RAID
  8.4 Network-attached storage
  8.5 Hot spares
9 Reliability terms
10 Problems with RAID
  10.1 Correlated failures
  10.2 Atomicity
  10.3 Write cache reliability
  10.4 Equipment compatibility
  10.5 Data recovery in the event of a failed array
  10.6 Drive error recovery algorithms
  10.7 Increasing recovery time
  10.8 Operator skills, correct operation
  10.9 Other problems and viruses
11 History
12 Vinum
13 Software RAID vs. Hardware RAID
14 Non-RAID drive architectures
15 See also
16 References
17 Further reading
18 External links

Standard levels
Main article: Standard RAID levels

A number of standard schemes have evolved which are referred to as levels. There were five RAID levels originally conceived, but many more variations have evolved, notably several nested levels and many non-standard levels (mostly proprietary).

Following is a brief textual summary of the most commonly used RAID levels.[4]

RAID 0 (block-level striping without parity or mirroring) provides improved performance and additional storage but no redundancy or fault tolerance (making it not true RAID, according to the acronym's definition). However, because of the similarities to RAID (especially the need for a controller to distribute data across multiple disks), simple stripe sets are normally referred to as RAID 0. Any disk failure destroys the array, and the likelihood of failure increases with more disks in the array (at a minimum, catastrophic data loss is twice as likely compared to single drives without RAID). A single disk failure destroys the entire array because when data is written to a RAID 0 volume, the data is broken into fragments called blocks. The number of blocks is dictated by the stripe size, which is a configuration parameter of the array. The blocks are written to their respective disks simultaneously on the same sector. This allows smaller sections of the entire chunk of data to be read off the drive in parallel, increasing bandwidth. RAID 0 does not implement error checking, so any error is uncorrectable. More disks in the array means higher bandwidth, but greater risk of data loss. (A short sketch after these level summaries illustrates the block layout.)

In RAID 1 (mirroring without parity or striping), data is written identically to multiple disks (a "mirrored set"). Although many implementations create sets of two disks, sets may contain three or more disks. The array provides fault tolerance from disk errors or failures and continues to operate as long as at least one drive in the mirrored set is functioning. With appropriate operating system support, read performance can increase, with only a minimal write performance penalty. Using RAID 1 with a separate controller for each disk is sometimes called duplexing.

In RAID 2 (bit-level striping with dedicated Hamming-code parity), all disk spindle rotation is synchronized, and data is striped such that each sequential bit is on a different disk. Hamming-code parity is calculated across corresponding bits on disks and stored on one or more parity disks. Extremely high data transfer rates are possible.

In RAID 3 (byte-level striping with dedicated parity), all disk spindle rotation is synchronized, and data is striped such that each sequential byte is on a different disk. Parity is calculated across corresponding bytes on disks and stored on a dedicated parity disk. Very high data transfer rates are possible.

RAID 4 (block-level striping with dedicated parity) is identical to RAID 5 (see below), but confines all parity data to a single disk, which can create a performance bottleneck. In this setup, files can be distributed between multiple disks. Each disk operates independently which allows I/O requests to be performed in parallel, though data transfer speeds can suffer due to the type of parity. The error detection is achieved through dedicated parity and is stored in a separate, single disk unit.

RAID 5 (block-level striping with distributed parity) distributes parity along with the data and requires all drives but one to be present to operate; drive failure requires replacement, but the array is not destroyed by a single drive failure. Upon drive failure, any subsequent reads can be calculated from the distributed parity such that the drive failure is masked from the end user. The array will have data loss in the event of a second drive failure and is vulnerable until the data that was on the failed drive is rebuilt onto a replacement drive. A single drive failure in the set will result in reduced performance of the entire set until the failed drive has been replaced and rebuilt.

RAID 6 (block-level striping with double distributed parity) provides fault tolerance from two drive failures; the array continues to operate with up to two failed drives. This makes larger RAID groups more practical, especially for high-availability systems. This becomes increasingly important as large-capacity drives lengthen the time needed to recover from the failure of a single drive. Single-parity RAID levels are as vulnerable to data loss as a RAID 0 array until the failed drive is replaced and its data rebuilt; the larger the drive, the longer the rebuild will take. Double parity gives time to rebuild the array without the data being at risk if a single additional drive fails before the rebuild is complete.
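To make the striping described under RAID 0 concrete, here is a minimal Python sketch of a round-robin block layout. It is an illustration of the concept only, not any controller's actual algorithm; the function name stripe_raid0 and its parameters are invented for this example.

# Minimal sketch of a RAID 0 layout: the payload is cut into
# stripe-size blocks which are dealt round-robin to the disks.
def stripe_raid0(data: bytes, num_disks: int, stripe_size: int):
    disks = [bytearray() for _ in range(num_disks)]
    blocks = [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].extend(block)  # round-robin placement
    return [bytes(d) for d in disks]

payload = bytes(range(16))  # 16 bytes of sample data
for i, d in enumerate(stripe_raid0(payload, num_disks=2, stripe_size=4)):
    print(f"disk {i}: {d.hex()}")
# disk 0 gets blocks 0 and 2, disk 1 gets blocks 1 and 3: reading both disks
# in parallel increases bandwidth, but losing either disk destroys the payload.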

The following table provides an overview of the most important parameters of standard RAID levels. Space efficiency is given as an equation in terms of the number of drives, n, which results in a value between 0 and 1, representing the fraction of the sum of the drives' capacities that is available for use. For example, if three drives are arranged in RAID 3, this gives a space efficiency of 1 − (1/3) = 2/3. If their individual capacities are 250 GB each, for a total of 750 GB over the three, the usable capacity under RAID 3 for data storage is 500 GB.

RAID 0: Block-level striping without parity or mirroring. Minimum disks: 2. Space efficiency: 1. Fault tolerance: 0 (none). Read benefit: nX. Write benefit: nX.

RAID 1: Mirroring without parity or striping. Minimum disks: 2. Space efficiency: 1/n. Fault tolerance: n−1 disks. Read benefit: nX. Write benefit: 1X.

RAID 2: Bit-level striping with dedicated Hamming-code parity. Minimum disks: 3. Space efficiency: 1 − (1/n)·log2(n−1). Fault tolerance: 1 disk (when the corrupt disk is found by the recover-record code).

RAID 3: Byte-level striping with dedicated parity. Minimum disks: 3. Space efficiency: 1 − 1/n. Fault tolerance: 1 disk.

RAID 4: Block-level striping with dedicated parity. Minimum disks: 3. Space efficiency: 1 − 1/n. Fault tolerance: 1 disk.

RAID 5: Block-level striping with distributed parity. Minimum disks: 3. Space efficiency: 1 − 1/n. Fault tolerance: 1 disk. Read benefit: (n−1)X. Write benefit: variable.

RAID 6: Block-level striping with double distributed parity. Minimum disks: 4. Space efficiency: 1 − 2/n. Fault tolerance: 2 disks.
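The space-efficiency column above can be written as a small calculator. The following Python sketch is a hypothetical helper (the function name is invented) that encodes those formulas and reproduces the worked example from the text:

def space_efficiency(level: int, n: int) -> float:
    # Fraction of total raw capacity usable at a given RAID level with
    # n drives, per the table above. RAID 2 is omitted for simplicity.
    if level == 0:
        return 1.0
    if level == 1:
        return 1.0 / n
    if level in (3, 4, 5):
        return 1.0 - 1.0 / n
    if level == 6:
        return 1.0 - 2.0 / n
    raise ValueError("level not covered by the table above")

# The example from the text: three 250 GB drives in RAID 3.
n, drive_gb = 3, 250
print(space_efficiency(3, n) * n * drive_gb)  # 500.0 GB usable of 750 GB raw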

Nested (hybrid) RAID
Main article: Nested RAID levels

In what was originally termed hybrid RAID,[5] many storage controllers allow RAID levels to be nested. The elements of a RAID may be either individual disks or RAIDs themselves. Nesting more than two deep is unusual.

As there is no basic RAID level numbered larger than 9, nested RAIDs are usually described unambiguously by concatenating the numbers indicating the RAID levels, sometimes with a "+" between them. The order of the digits in a nested RAID designation is the order in which the nested array is built: for RAID 1+0, pairs of drives are first combined into two or more RAID 1 arrays (mirrors), and the resulting RAID 1 arrays are then combined into a RAID 0 array (stripes). It is also possible to combine stripes into mirrors (RAID 0+1). The final step is known as the top array. When the top array is RAID 0 (such as in RAID 10 and RAID 50), most vendors omit the "+", though RAID 5+0 is clearer.

RAID 0+1: striped sets in a mirrored set (minimum four disks; even number of disks) provides fault tolerance and improved performance but increases complexity.

The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. The array continues to operate with one or more drives failed in the same mirror set, but if drives fail on both sides of the mirror, the data on the RAID system is lost.

RAID 1+0: mirrored sets in a striped set (minimum two disks but more commonly four disks to take advantage of speed benefits; even number of disks) provides fault tolerance and improved performance but increases complexity.

The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives. In a failed disk situation, RAID 1+0 performs better because all the remaining disks continue to be used. The array can sustain multiple drive losses so long as no mirror loses all its drives.

RAID 5+1: mirrored striped set with distributed parity (some manufacturers label this as RAID 53).

Whether an array runs as RAID 0+1 or RAID 1+0 in practice is often determined by the evolution of the storage system. A RAID controller might support upgrading a RAID 1 array to a RAID 1+0 array on the fly, but require a lengthy offline rebuild to upgrade from RAID 1 to RAID 0+1. With nested arrays, sometimes the path of least disruption prevails over achieving the preferred configuration.
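The failure-tolerance difference between RAID 0+1 and RAID 1+0 can be checked exhaustively. The Python sketch below models four disks with a fixed pairing (an assumption for illustration; real controllers choose their own layout) and enumerates every two-disk failure:

from itertools import combinations

disks = {0, 1, 2, 3}
pairs = [{0, 1}, {2, 3}]  # mirror pairs for 1+0; stripe halves for 0+1

def survives_1plus0(failed):
    # RAID 1+0 survives as long as no mirror pair has lost both members.
    return all(not pair <= failed for pair in pairs)

def survives_0plus1(failed):
    # RAID 0+1 survives only if at least one whole striped side is untouched.
    return any(not (pair & failed) for pair in pairs)

for failed in map(set, combinations(disks, 2)):
    print(sorted(failed), "1+0:", survives_1plus0(failed),
          "0+1:", survives_0plus1(failed))
# RAID 1+0 survives four of the six two-disk failures; RAID 0+1 survives
# only the two cases where both failures land on the same striped side.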

RAID Parity

Many RAID levels employ an error protection scheme called "parity". Parity calculation is a widely used method in information technology to provide fault tolerance in a given set of data. But how does it work?

It is actually very simple. In Boolean logic, there is an operation called "exclusive or", or XOR for short, meaning "one or the other, but not both". For example:

0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0

The XOR operator is central to how parity data is created and used within an array; it is used both for the protection of data and for the recovery of missing data.

Let's suppose, for the sake of simplicity, that we have a simple RAID made up of six hard disks (four for data, one for parity, and one for use as a hot spare), where each drive is capable of holding just a single byte worth of storage. This is how our initial RAID configuration would look, keeping in mind that no data has yet been written to it:

Drive #1: -------- (Data)
Drive #2: -------- (Data)
Drive #3: -------- (Data)
Drive #4: -------- (Data)
Drive #5: -------- (Hot Spare)
Drive #6: -------- (Parity)

Now, let's write some random bits to each of our four data drives.

Drive #1: 00101010 (Data)
Drive #2: 10001110 (Data)
Drive #3: 11110111 (Data)
Drive #4: 10110101 (Data)
Drive #5: -------- (Hot Spare)
Drive #6: -------- (Parity)

Every time we write anything to our data drives, we need to calculate parity so that we can recover from a disk failure. To calculate the parity for this RAID, we simply take the XOR of each drive's data. The resulting value is our parity data.

00101010 XOR  (Drive 1 byte)
10001110 XOR  (Drive 2 byte)
11110111 XOR  (Drive 3 byte)
10110101      (Drive 4 byte)
==============  (apply XOR bit-wise down the columns)
11100110      (this is the value of the parity byte)

We now know that "11100110" is our parity data. We can write that data to our dedicated parity drive:

Drive #1: 00101010 (Data)
Drive #2: 10001110 (Data)
Drive #3: 11110111 (Data)
Drive #4: 10110101 (Data)
Drive #5: -------- (Hot Spare)
Drive #6: 11100110 (Parity)

Now, let's suppose one of those drives has failed. You can pick any; for this example, let's say that Drive #3 has failed. In order to know what the contents of Drive #3 were, we perform the same XOR calculation against all the remaining drives, substituting our parity value (11100110) in place of the missing/dead drive:

00101010 XOR  (Drive 1 byte)
10001110 XOR  (Drive 2 byte)
11100110 XOR  (Parity byte, in place of the failed Drive 3 byte)
10110101      (Drive 4 byte)
==============  (apply XOR bit-wise down the columns)
11110111      (this is the value of the failed Drive 3 byte)

With the complete contents of Drive #3 now successfully recovered, the data is written to the hot spare, and the RAID can continue operating as it had before.

Drive #1: 00101010 (Data)
Drive #2: 10001110 (Data)
Drive #3: --Dead-- (Data)
Drive #4: 10110101 (Data)
Drive #5: 11110111 (Hot Spare)
Drive #6: 11100110 (Parity)

Normally, someone at this point will replace the dead drive with a working one of the same size. When this happens, the hot spare's contents are then automatically copied to it by the array controller, allowing the hot spare to return to its original purpose as an emergency standby drive. The resulting array is identical to its pre-failure state:

Drive #1: 00101010 (Data)
Drive #2: 10001110 (Data)
Drive #3: 11110111 (Data)
Drive #4: 10110101 (Data)
Drive #5: -------- (Hot Spare)
Drive #6: 11100110 (Parity)

This same basic XOR principle applies to parity within RAID groups regardless of capacity or number of drives. As long as there are enough drives present to allow for an XOR calculation to take place, parity can be used to recover data from any single drive failure. (A minimum of three drives must be present in order for parity to be used for fault tolerance, since the XOR operator requires two operands, and a place to store the result.)
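The walkthrough above translates directly into a few lines of Python. This is a minimal sketch using the same four data bytes; it computes the parity byte and then rebuilds the failed drive's byte from the survivors:

from functools import reduce

data = [0b00101010, 0b10001110, 0b11110111, 0b10110101]  # drives 1-4
parity = reduce(lambda a, b: a ^ b, data)
assert parity == 0b11100110  # matches the parity byte computed above

# Lose drive 3 (index 2) and rebuild its byte from the survivors plus parity.
failed = 2
survivors = [byte for i, byte in enumerate(data) if i != failed]
rebuilt = reduce(lambda a, b: a ^ b, survivors + [parity])
assert rebuilt == data[failed]
print(f"recovered drive {failed + 1} byte: {rebuilt:08b}")  # 11110111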

RAID 10 versus RAID 5 in Relational Databases

A common myth (and one which serves to illustrate the mechanics of proper RAID implementation) is that in all deployments, RAID 10 is inherently better for relational databases than RAID 5, due to RAID 5's need to recalculate and redistribute parity data on a per-write basis.[2]

While this may have been a hurdle in past RAID 5 implementations, the task of parity recalculation and redistribution within modern SAN appliances is performed as a back-end process transparent to the host, not as an in-line process which competes with existing I/O (i.e., the RAID controller handles this as a housekeeping task to be performed during a particular spindle's idle timeslices, so as not to disrupt any pending I/O from the host). The "write penalty" inherent to RAID 5 has been effectively masked over the past ten years by a combination of improved controller design, larger amounts of cache, and faster hard disks. The effect of a write penalty when using RAID 5 is mostly a concern when the workload has a high amount of random writes (such as in some databases), while in other workloads modern RAID 5 systems can be on par with RAID 10 performance.[3]

In the vast majority of enterprise-level SAN hardware, any writes which are generated by the host are simply acknowledged immediately and destaged to disk on the back end when the controller sees fit to do so. From the host's perspective, an individual write to a RAID 10 volume is no faster than an individual write to a RAID 5 volume; a difference between the two only becomes apparent when the write cache at the SAN controller level is overwhelmed, and the SAN appliance must reject or gate further write requests in order to allow write buffers on the controller to destage to disk. While rare, this generally indicates poor performance management on the part of the SAN administrator, not a shortcoming of RAID 5 or RAID 10. SAN appliances generally service multiple hosts which compete with one another both for controller cache and for spindle time. This contention is largely masked, in that the controller is generally intelligent and adaptive enough to maximize read cache hit ratios while also maximizing the process of destaging data from write cache.

The choice of RAID 10 versus RAID 5 for the purposes of housing a relational database will depend upon a number of factors (spindle availability, cost, business risk, etc.), but, from a performance standpoint, it depends mostly on the type of I/O the database can expect to see. For databases that are expected to be exclusively or strongly read-biased, RAID 10 is often chosen because it offers a slight speed improvement over RAID 5 on sustained reads. If a database is expected to be strongly write-biased, RAID 5 becomes the more attractive option, since RAID 5 does not suffer from the same write handicap inherent in RAID 10; all spindles in a RAID 5 can be utilized to write simultaneously, whereas only half the members of a RAID 10 can be used.[4] However, for reasons similar to what has eliminated the "write penalty" in RAID 5, the reduced ability of a RAID 10 to handle sustained writes has been largely masked by improvements in controller cache efficiency and disk throughput.

What causes RAID 5 to be slightly slower than RAID 10 on sustained reads is the fact that RAID 5 has parity data interleaved within normal data. For every read pass in RAID 5, there is a probability that a read head may need to traverse a region of parity data. The cumulative effect of this is a slight performance drop compared to RAID 10, which does not use parity, and therefore will never encounter a circumstance where data underneath a head is of no use. For the vast majority of situations, however, most relational databases housed on RAID 10 perform equally well in RAID 5. The strengths and weaknesses of each type only become an issue in atypical deployments, or deployments on overcommitted or outdated hardware.[5]

There are, however, considerations other than performance which must be taken into account. RAID 5 and other non-mirror-based arrays offer a lower degree of resiliency than RAID 10 by virtue of RAID 10's mirroring strategy. In a RAID 10, I/O can continue even in spite of multiple drive failures. By comparison, in a RAID 5 array, any simultaneous failure of more than one drive will render the array itself unusable, by virtue of parity recalculation being impossible to perform. For many, particularly in mission-critical environments with enough capital to spend, RAID 10 becomes the favorite as it provides the lowest level of risk.[6]

Additionally, the time required to rebuild data onto a hot spare in a RAID 10 is significantly less than in RAID 5: all of the remaining spindles in a RAID 5 must participate in the rebuild, whereas in RAID 10 only half need to. In modern RAID 10 implementations, all drives generally participate in the rebuilding process as well, but only half are required, allowing greater degraded-state throughput than RAID 5 and overall faster rebuild times.[7]

Again, modern SAN design largely masks any performance hit while the RAID array is in a degraded state, by virtue of being able to perform rebuild operations either in-band or out-of-band with respect to existing I/O traffic. Given the rare nature of drive failures in general, and the exceedingly low probability of multiple concurrent drive failures occurring within the same RAID array, the choice of RAID 5 over RAID 10 often comes down to the preference of the storage administrator, particularly when weighed against other factors such as cost, throughput requirements, and physical spindle availability.[8]

In short, the choice of RAID 5 versus RAID 10 involves a complicated mixture of factors. There is no one-size-fits-all solution, as the choice of one over the other must be dictated by everything from the I/O characteristics of the database, to business risk, to worst case degraded-state throughput, to the number and type of disks present in the array itself. Over the course of the life of a database, you may even see situations where RAID 5 is initially favored, but RAID 10 slowly becomes the better choice, and vice versa.

New RAID classification

In 1996, the RAID Advisory Board introduced an improved classification of RAID systems.[citation needed] It divides RAID into three types: failure-resistant disk systems (which protect against data loss due to disk failure), failure-tolerant disk systems (which protect against loss of data access due to failure of any single component), and disaster-tolerant disk systems (which consist of two or more independent zones, either of which provides access to stored data).

The original "Berkeley" RAID classifications are still kept as an important historical reference point and also to recognize that RAID Levels 0-6 successfully define all known data mapping and protection schemes for disk. Unfortunately, the original classification caused some confusion due to assumption that higher RAID levels imply higher redundancy and performance. This confusion was exploited by RAID system manufacturers, and gave birth to the products with such names as RAID-7, RAID-10, RAID-30, RAID-S, etc. The new system describes the data availability characteristics of the RAID system rather than the details of its implementation.

The next list provides criteria for all three classes of RAID:


- Failure-resistant disk systems (FRDS) (meets a minimum of criteria 1–6):

1. Protection against data loss and loss of access to data due to disk drive failure
2. Reconstruction of failed drive content to a replacement drive
3. Protection against data loss due to a "write hole"
4. Protection against data loss due to host and host I/O bus failure
5. Protection against data loss due to replaceable unit failure
6. Replaceable unit monitoring and failure indication

- Failure-tolerant disk systems (FTDS) (meets a minimum of criteria 7–15):

7. Disk automatic swap and hot swap
8. Protection against data loss due to cache failure
9. Protection against data loss due to external power failure
10. Protection against data loss due to a temperature out of operating range
11. Replaceable unit and environmental failure warning
12. Protection against loss of access to data due to device channel failure
13. Protection against loss of access to data due to controller module failure
14. Protection against loss of access to data due to cache failure
15. Protection against loss of access to data due to power supply failure

- Disaster-tolerant disk systems (DTDS) (meets a minimum of criteria 16–21):

16. Protection against loss of access to data due to host and host I/O bus failure
17. Protection against loss of access to data due to external power failure
18. Protection against loss of access to data due to component replacement
19. Protection against loss of data and loss of access to data due to multiple disk failure
20. Protection against loss of access to data due to zone failure
21. Long-distance protection against loss of data due to zone failure

Non-standard levels
Main article: Non-standard RAID levels

Many configurations other than the basic numbered RAID levels are possible, and many companies, organizations, and groups have created their own non-standard configurations, in many cases designed to meet the specialised needs of a small niche group. Most of these non-standard RAID levels are proprietary.

Storage Computer Corporation, now defunct, used to call a cached version of RAID 3 and 4 "RAID 7".

EMC Corporation used to offer RAID S as an alternative to RAID 5 on their Symmetrix systems. Their latest generations of Symmetrix, the DMX and the V-Max series, do not support RAID S (instead they support RAID 1, RAID 5, and RAID 6).


The ZFS filesystem, available in Solaris, OpenSolaris and FreeBSD, offers RAID-Z, which solves RAID 5's write hole problem.

Hewlett-Packard's Advanced Data Guarding (ADG) is a form of RAID 6.

NetApp's Data ONTAP uses RAID-DP (also referred to as "double", "dual", or "diagonal" parity), a form of RAID 6 which, unlike many RAID 6 implementations, does not use distributed parity as in RAID 5. Instead, two unique parity disks with separate parity calculations are used. This is a modification of RAID 4 with an extra parity disk.

Accusys Triple Parity (RAID TP) implements three independent parities by extending RAID 6 algorithms on its FC-SATA and SCSI-SATA RAID controllers to tolerate three-disk failure.

Linux MD RAID10 (RAID 10) implements a general RAID driver that defaults to a standard RAID 1 with two drives and a standard RAID 1+0 with four drives, but can have any number of drives, including odd numbers. MD RAID 10 can run striped and mirrored even with only two drives, using the f2 layout (mirroring with striped reads, giving the read performance of RAID 0; normal Linux software RAID 1 does not stripe reads, but can read in parallel).[6][7]

Infrant (now part of Netgear) X-RAID offers dynamic expansion of a RAID 5 volume without having to back up or restore the existing content: add larger drives one at a time, let the array resync, then add the next drive until all drives are installed. The resulting volume capacity is increased without user downtime. (This is also possible in Linux using the mdadm utility, and has been possible in EMC Clariion and HP MSA arrays for several years.) The newer X-RAID2, found on x86 ReadyNAS (models with Intel CPUs), offers dynamic expansion of a RAID 5 or RAID 6 volume without having to back up or restore the existing content (X-RAID2 dual redundancy is not available on all x86 ReadyNAS models). A major advantage over X-RAID is that with X-RAID2 you do not need to replace all the disks to get extra space; you only need to replace two disks when using single redundancy, or four disks when using dual redundancy, to get more redundant space.

BeyondRAID, created by Data Robotics and used in the Drobo series of products, implements both mirroring and striping simultaneously or individually, depending on disk and data context. It offers expandability without reconfiguration, the ability to mix and match drive sizes, and the ability to reorder disks. It supports NTFS, HFS+, FAT32, and ext3 file systems.[8] It also uses thin provisioning to allow single volumes up to 16 TB, depending on host operating system support.

Hewlett-Packard's EVA series arrays implement vRAID: vRAID-0, vRAID-1, vRAID-5, and vRAID-6. The EVA allows drives to be placed in groups (called Disk Groups) that form a pool of data blocks on top of which the RAID level is implemented. Any Disk Group may have "virtual disks" or LUNs of any vRAID type, including mixing vRAID types in the same Disk Group, a unique feature. vRAID levels are more closely aligned to nested RAID levels: vRAID-1 is actually a RAID 1+0 (or RAID 10), vRAID-5 is actually a RAID 5+0 (or RAID 50), etc. Also, drives may be added on-the-fly to an existing Disk Group, and the existing virtual disks' data is redistributed evenly over all the drives, thereby allowing dynamic performance and capacity growth.

IBM (among others) has implemented RAID 1E (Level 1 Enhanced). With an even number of disks it is similar to a RAID 10 array but, unlike RAID 10, it can also be implemented with an odd number of drives. In either case, the total available disk space is n/2. It requires a minimum of three drives.

Hadoop has a RAID system that generates a parity file by XOR-ing a stripe of blocks in a single HDFS file. More details can be found in [9].

Data backup

A RAID system used as a main system disk is not intended as a replacement for backing up data. In parity configurations it will provide a backup-like feature to protect from catastrophic data loss caused by physical damage or errors on a single drive. Many other features of backup systems cannot be provided by RAID arrays alone. The most notable is the ability to restore an earlier version of data, which is needed to protect against software errors that write unwanted data to the disk, and to recover from user error or malicious deletion. RAID can also be overwhelmed by a catastrophic failure that exceeds its recovery capacity and, of course, the entire array is at risk of physical damage by fire, natural disaster, or human forces. RAID is also vulnerable to controller failure, since it is not always possible to migrate a RAID to a new controller without data loss.[10]

RAID drives can serve as excellent backup drives when employed as removable backup devices to main storage, and particularly when located offsite from the main systems. However, the use of RAID as the only storage solution does not replace backups.

Implementations


The distribution of data across multiple drives can be managed either by dedicated hardware or by software. When done in software, the software may be part of the operating system, or part of the firmware and drivers supplied with the card.

Software-based RAID

Software implementations are now provided by many operating systems. A software layer sits above the (generally block-based) disk device drivers and provides an abstraction layer between the logical drives (RAIDs) and physical drives. The most common levels are RAID 0 (striping across multiple drives for increased space and performance) and RAID 1 (mirroring two drives), followed by RAID 1+0, RAID 0+1, and RAID 5 (data striping with parity). New filesystems like btrfs may replace traditional software RAID by providing striping and redundancy at the filesystem object level.

Apple's Mac OS X Server[11] and Mac OS X[12] support RAID 0, RAID 1, and RAID 1+0.

FreeBSD supports RAID 0, RAID 1, RAID 3, and RAID 5, and all layerings of the above, via GEOM modules[13][14] and ccd,[15] as well as RAID 0, RAID 1, RAID-Z, and RAID-Z2 (similar to RAID 5 and RAID 6 respectively), plus nested combinations of those, via ZFS.

Linux supports RAID 0, RAID 1, RAID 4, RAID 5, RAID 6 and all layerings of the above, as well as "RAID10" (see above).[16][17] Certain reshaping/resizing/expanding operations are also supported.[18]

Microsoft's server operating systems support RAID 0, RAID 1, and RAID 5. Some Microsoft desktop operating systems also support RAID; for example, Windows XP Professional supports RAID level 0, in addition to spanning multiple disks, but only when using dynamic disks and volumes. Windows XP supports RAID 0, 1, and 5 with a simple file patch.[19] RAID functionality in Windows is slower than hardware RAID, but allows a RAID array to be moved to another machine with no compatibility issues.

NetBSD supports RAID 0, RAID 1, RAID 4 and RAID 5 (and any nested combination of those like 1+0) via its software implementation, named RAIDframe.

OpenBSD aims to support RAID 0, RAID 1, RAID 4 and RAID 5 via its software implementation softraid.

Solaris ZFS supports ZFS equivalents of RAID 0, RAID 1, RAID 5 (RAID Z), RAID 6 (RAID Z2), and a triple-parity version, RAID Z3, plus any nested combination of those like 1+0. Note that RAID Z/Z2/Z3 solve the RAID 5/6 write hole problem and are therefore particularly suited to software implementation, without the need for battery-backed cache (or similar) support. The boot filesystem is limited to RAID 1.

Solaris SVM supports RAID 1 for the boot filesystem, and adds RAID 0 and RAID 5 support (and various nested combinations) for data drives.

Linux and Windows FlexRAID is a snapshot RAID implementation.

HP's OpenVMS provides a form of RAID 1 called "volume shadowing", which can mirror data locally and at remote cluster systems.

Software RAID has advantages and disadvantages compared to hardware RAID. The software must run on a host server attached to storage, and the server's processor must dedicate processing time to run the RAID software. The additional processing capacity required for RAID 0 and RAID 1 is low, but parity-based arrays require more complex data processing during write or integrity-checking operations. As the rate of data processing increases with the number of disks in the array, so does the processing requirement. Furthermore, all the buses between the processor and the disk controller must carry the extra data required by RAID, which may cause congestion.


Over the history of hard disk drives, the increase in speed of commodity CPUs has been consistently greater than the increase in speed of hard disk drive throughput.[20] Thus, over time, for a given number of hard disk drives, the percentage of host CPU time required to saturate them has been dropping. For example, the Linux software md RAID subsystem is capable of calculating parity information at 6 GB/s (100% usage of a single core on a 2.1 GHz Intel "Core2" CPU as of Linux v2.6.26). A three-drive RAID 5 array using hard disks capable of sustaining a write of 100 MB/s will require parity to be calculated at the rate of 200 MB/s. This requires the resources of just over 3% of a single CPU core during write operations (parity does not need to be calculated for read operations on a RAID 5 array, unless a drive has failed).
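That arithmetic can be checked directly. The figures below are the ones quoted in the paragraph above, not independent measurements:

# Two data drives' worth of parity input per full-stripe write on a
# three-drive RAID 5, at 100 MB/s per drive:
parity_rate_needed = 2 * 100   # MB/s
cpu_parity_rate = 6 * 1000     # MB/s: the quoted 6 GB/s for one core
print(f"{parity_rate_needed / cpu_parity_rate:.1%} of one core")  # 3.3%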

Software RAID implementations may employ more sophisticated algorithms than hardware RAID implementations (for instance with respect to disk scheduling and command queueing), and thus may be capable of increased performance.

Another concern with operating system-based RAID is the boot process. It can be difficult or impossible to set up the boot process such that it can fall back to another drive if the usual boot drive fails, and such systems can require manual intervention to make the machine bootable again after a failure. There are exceptions, such as the LILO bootloader for Linux, the loader for FreeBSD,[21] and some configurations of the GRUB bootloader, which natively understand RAID 1 and can load a kernel. If the BIOS recognizes a broken first disk and refers bootstrapping to the next disk, such a system will come up without intervention, but the BIOS might or might not do that as intended. A hardware RAID controller typically has explicit programming to decide that a disk is broken and fall through to the next disk.

Hardware RAID controllers can also carry battery-powered cache memory. For data safety in modern systems, the user of software RAID might need to turn off the write-back cache on the disk (though some drives have their own battery or capacitors on the write-back cache, a UPS, or implement atomicity in various ways). Turning off the write cache has a performance penalty that can, depending on the workload and how well command queuing in the disk system is supported, be significant. The battery-backed cache on a RAID controller is one solution for a safe write-back cache.

Finally, operating system-based RAID usually uses formats specific to the operating system in question, so it cannot generally be used for partitions that are shared between operating systems as part of a multi-boot setup. However, this allows RAID disks to be moved from one computer to a computer with an operating system or file system of the same type, which can be more difficult when using hardware RAID (e.g. #1: when one computer uses a hardware RAID controller from one manufacturer and another computer uses a controller from a different manufacturer, drives typically cannot be interchanged. e.g. #2: if the hardware controller 'dies' before the disks do, data may become unrecoverable unless a hardware controller of the same type is obtained, unlike with firmware-based or software-based RAID).


Most operating system-based implementations allow RAIDs to be created from partitions rather than entire physical drives. For instance, an administrator could divide an odd number of disks into two partitions per disk, mirror partitions across disks, and stripe a volume across the mirrored partitions to emulate IBM's RAID 1E configuration. Using partitions in this way also allows mixing reliability levels on the same set of disks. For example, one could have a very robust RAID 1 partition for important files, and a less robust RAID 5 or RAID 0 partition for less important data. (Some BIOS-based controllers offer similar features, e.g. Intel Matrix RAID.) Using two partitions on the same drive in the same RAID is, however, dangerous. (e.g. #1: having all partitions of a RAID 1 on the same drive will, obviously, make all the data inaccessible if the single drive fails. e.g. #2: in a RAID 5 array composed of four drives of 250 + 250 + 250 + 500 GB, with the 500 GB drive split into two 250 GB partitions, a failure of this drive will remove two partitions from the array, causing the loss of all data held on the array.)

Hardware-based RAID

Hardware RAID controllers use different, proprietary disk layouts, so it is not usually possible to span controllers from different manufacturers. They do not require processor resources, the BIOS can boot from them, and tighter integration with the device driver may offer better error handling.

A hardware implementation of RAID requires at least a special-purpose RAID controller. On a desktop system this may be a PCI expansion card, PCI-e expansion card or built into the motherboard. Controllers supporting most types of drive may be used – IDE/ATA, SATA, SCSI, SSA, Fibre Channel, sometimes even a combination. The controller and disks may be in a stand-alone disk enclosure, rather than inside a computer. The enclosure may be directly attached to a computer, or connected via SAN. The controller hardware handles the management of the drives, and performs any parity calculations required by the chosen RAID level.

Most hardware implementations provide a read/write cache, which, depending on the I/O workload, will improve performance. In most systems the write cache is non-volatile (i.e. battery-protected), so pending writes are not lost on a power failure.

Hardware implementations provide guaranteed performance, add no overhead to the local CPU complex and can support many operating systems, as the controller simply presents a logical disk to the operating system.

Hardware implementations also typically support hot swapping, allowing failed drives to be replaced while the system is running.

However, inexpensive hardware RAID controllers can be slower than software RAID, because the dedicated CPU on the controller card is not as fast as the CPU in the computer/server. More expensive RAID controllers have faster CPUs capable of higher throughput and do not exhibit this slowness.


Firmware/driver-based RAID

Operating system-based RAID doesn't always protect the boot process and is generally impractical on desktop versions of Windows (as described above). Hardware RAID controllers are expensive and proprietary. To fill this gap, cheap "RAID controllers" were introduced that do not contain a RAID controller chip, but simply a standard disk controller chip with special firmware and drivers. During early stage bootup the RAID is implemented by the firmware; when a protected-mode operating system kernel such as Linux or a modern version of Microsoft Windows is loaded the drivers take over.

These controllers are described by their manufacturers as RAID controllers, and it is rarely made clear to purchasers that the burden of RAID processing is borne by the host computer's central processing unit, not the RAID controller itself, thus introducing the aforementioned CPU overhead from which hardware controllers do not suffer. Firmware controllers can often only use certain types of hard drives in their RAID arrays (e.g. SATA for Intel Matrix RAID), as there is neither SCSI nor PATA support in modern Intel ICH southbridges; however, motherboard makers implement RAID controllers outside of the southbridge on some motherboards. Before their introduction, a "RAID controller" implied that the controller did the processing, and the new type has become known by some as "fake RAID", even though the RAID itself is implemented correctly. Adaptec calls them "HostRAID". Various Linux distributions will refuse to work with "fake RAID".[9]

Network-attached storage

Main article: Network-attached storage

While not directly associated with RAID, Network-attached storage (NAS) is an enclosure containing disk drives and the equipment necessary to make them available over a computer network, usually Ethernet. The enclosure is basically a dedicated computer in its own right, designed to operate over the network without screen or keyboard. It contains one or more disk drives; multiple drives may be configured as a RAID.

Hot spares

Both hardware and software RAIDs with redundancy may support the use of hot spare drives: a drive physically installed in the array which is inactive until an active drive fails, at which point the system automatically replaces the failed drive with the spare and rebuilds the array with the spare drive included. This reduces the mean time to recovery (MTTR), though it does not eliminate it completely. Subsequent additional failure(s) in the same RAID redundancy group before the array is fully rebuilt can result in loss of the data; rebuilding can take several hours, especially on busy systems.

Rapid replacement of failed drives is important, as the drives of an array will all have had the same amount of use and may tend to fail at about the same time rather than randomly.[citation needed] RAID 6 without a spare uses the same number of drives as RAID 5 with a hot spare and protects data against simultaneous failure of up to two drives, but requires a more advanced RAID controller. Further, a hot spare can be shared by multiple RAID sets.

Reliability terms

Failure rate

Two different kinds of failure rates are applicable to RAID systems. Logical failure is defined as the loss of a single drive and its rate is equal to the sum of individual drives' failure rates. System failure is defined as loss of data and its rate will depend on the type of RAID. For RAID 0 this is equal to the logical failure rate, as there is no redundancy. For other types of RAID, it will be less than the logical failure rate, potentially approaching zero, and its exact value will depend on the type of RAID, the number of drives employed, and the vigilance and alacrity of its human administrators.

Mean time to data loss (MTTDL)
In this context, the average time before a loss of data in a given array.[22] Mean time to data loss of a given RAID may be higher or lower than that of its constituent hard drives, depending upon what type of RAID is employed. The referenced report assumes times to data loss are exponentially distributed; this means 63.2% of all data loss will occur between time 0 and the MTTDL (a sketch after these definitions illustrates a common approximation).

Mean time to recovery (MTTR)
In arrays that include redundancy for reliability, this is the time following a failure to restore an array to its normal failure-tolerant mode of operation. This includes time to replace a failed disk mechanism as well as time to rebuild the array (i.e., to replicate data for redundancy).

Unrecoverable bit error rate (UBE)
This is the rate at which a disk drive will be unable to recover data after application of cyclic redundancy check (CRC) codes and multiple retries.

Write cache reliability
Some RAID systems use RAM write cache to increase performance. A power failure can result in data loss unless this sort of disk buffer is supplemented with a battery to ensure that the buffer has enough time to write from RAM back to disk.

Atomic write failure
Also known by various terms such as torn writes, torn pages, incomplete writes, interrupted writes, non-transactional, etc.
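As a rough illustration of the MTTDL definition above, the sketch below applies the commonly cited independent-failure approximation for a single RAID 5 group, MTTDL ≈ MTTF² / (N·(N−1)·MTTR). The drive count, MTTF, and MTTR are assumed values, and real-world correlated failures make the result optimistic:

import math

mttf_hours = 1_000_000  # assumed per-drive MTTF
mttr_hours = 24         # assumed time to replace and rebuild a drive
n = 5                   # drives in the RAID 5 group

mttdl = mttf_hours ** 2 / (n * (n - 1) * mttr_hours)
print(f"approximate MTTDL: {mttdl:.2e} hours")

# With exponentially distributed times to data loss, the fraction of
# arrays losing data before the MTTDL elapses is 1 - e^-1:
print(f"{1 - math.exp(-1):.1%}")  # 63.2%, as quoted above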

Problems with RAID

Correlated failures

The theory behind the error correction in RAID assumes that failures of drives are independent. Given this assumption, it is possible to calculate how often drives can fail and to arrange the array to make data loss arbitrarily improbable.


In practice, the drives are often the same age, with similar wear. Since many drive failures are due to mechanical issues, which are more likely on older drives, this violates those assumptions, and failures are in fact statistically correlated. In practice, then, the chance of a second failure before the first has been recovered is not nearly as unlikely as might be supposed, and data loss can occur at significant rates.[23]

A common misconception is that "server-grade" drives fail less frequently than consumer-grade drives. Two independent studies, one by Carnegie Mellon University and the other by Google, have shown that the "grade" of the drive does not relate to failure rates.[24][25]

Atomicity

This is a little-understood and rarely mentioned failure mode for redundant storage systems that do not utilize transactional features. Database researcher Jim Gray wrote "Update in Place is a Poison Apple"[26] during the early days of relational database commercialization. However, this warning largely went unheeded and fell by the wayside upon the advent of RAID, which many software engineers mistook as solving all data storage integrity and reliability problems. Many software programs update a storage object "in-place"; that is, they write a new version of the object onto the same disk addresses as the old version of the object. While the software may also log some delta information elsewhere, it expects the storage to present "atomic write semantics", meaning that the write of the data either occurred in its entirety or did not occur at all.

However, very few storage systems provide support for atomic writes, and even fewer specify their rate of failure in providing this semantic. Note that during the act of writing an object, a RAID storage device will usually be writing all redundant copies of the object in parallel, although overlapped or staggered writes are more common when a single RAID processor is responsible for multiple drives. Hence an error that occurs during the process of writing may leave the redundant copies in different states, and furthermore may leave the copies in neither the old nor the new state. The little known failure mode is that delta logging relies on the original data being either in the old or the new state so as to enable backing out the logical change, yet few storage systems provide an atomic write semantic on a RAID disk.

While the battery-backed write cache may partially solve the problem, it is applicable only to a power failure scenario.

Since transactional support is not universally present in hardware RAID, many operating systems include transactional support to protect against data loss during an interrupted write. Novell Netware, starting with version 3.x, included a transaction tracking system. Microsoft introduced transaction tracking via the journaling feature in NTFS. Ext4 has journaling with checksums; ext3 has journaling without checksums but an "append-only" option, or ext3COW (Copy on Write). If the journal itself in a filesystem is corrupted, though, this can be problematic. The journaling in the NetApp WAFL file system gives atomicity by never updating the data in place, as does ZFS. An alternative method to journaling is soft updates, which are used in some BSD-derived systems' implementations of UFS.

An unrecoverable bit error can present as a sector read failure. Some RAID implementations protect against this failure mode by remapping the bad sector, using the redundant data to retrieve a good copy of the data, and rewriting that good data to the newly mapped replacement sector. The UBE (unrecoverable bit error) rate is typically specified at 1 bit in 10^15 for enterprise-class disk drives (SCSI, FC, SAS), and 1 bit in 10^14 for desktop-class disk drives (IDE/ATA/PATA, SATA). Increasing disk capacities and large RAID 5 redundancy groups have led to an increasing inability to successfully rebuild a RAID group after a disk failure, because an unrecoverable sector is found on the remaining drives. Double protection schemes such as RAID 6 attempt to address this issue, but suffer from a very high write penalty.
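The scale of this rebuild problem can be estimated with a short calculation. The Python sketch below assumes independent bit errors at the UBE rates quoted above; the array size is an arbitrary example:

def rebuild_ure_probability(surviving_drives, drive_bytes, ube_per_bit):
    # Probability of at least one unrecoverable read error while reading
    # every bit of every surviving drive during a rebuild.
    bits_read = surviving_drives * drive_bytes * 8
    return 1 - (1 - ube_per_bit) ** bits_read

tb = 10 ** 12
# Six surviving 2 TB desktop-class drives (1 error per 10^14 bits):
print(f"{rebuild_ure_probability(6, 2 * tb, 1e-14):.0%}")  # roughly 62%
# The same array with enterprise-class drives (1 per 10^15 bits):
print(f"{rebuild_ure_probability(6, 2 * tb, 1e-15):.0%}")  # roughly 9%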

Write cache reliability

The disk system can acknowledge the write operation as soon as the data is in the cache, not waiting for the data to be physically written. This typically occurs in old, non-journaled systems such as FAT32, or if the Linux/Unix "writeback" option is chosen without any protections like the "soft updates" option (to promote I/O speed whilst trading-away data reliability). A power outage or system hang such as a BSOD can mean a significant loss of any data queued in such a cache.

Often a battery protects the write cache, mostly solving the problem. If a write fails because of power failure, the controller may complete the pending writes as soon as it is restarted. This solution still has potential failure cases: the battery may have worn out, the power may be off for too long, the disks could be moved to another controller, or the controller itself could fail. Some disk systems provide the capability of testing the battery periodically; however, this leaves the system without a fully charged battery for several hours.

An additional concern about write cache reliability exists, specifically regarding devices equipped with a write-back cache—a caching system which reports the data as written as soon as it is written to cache, as opposed to the non-volatile medium.[27] The safer cache technique is write-through, which reports transactions as written when they are written to the non-volatile medium.

Equipment compatibility

The methods used to store data by various RAID controllers are not necessarily compatible, so it may not be possible to read a RAID array on different hardware, with the exception of RAID 1, which is typically represented as plain identical copies of the original data on each disk. Consequently, a non-disk hardware failure may require the use of identical hardware to recover the data, and furthermore an identical configuration has to be reassembled without triggering a rebuild and overwriting the data. Software RAID, however, such as that implemented in the Linux kernel, alleviates this concern, as the setup is not hardware dependent, but runs on ordinary disk controllers, and allows the reassembly of an array. Additionally, individual RAID 1 disks (software, and most hardware implementations) can be read like normal disks when removed from the array, so no RAID system is required to retrieve the data. Inexperienced data recovery firms typically have a difficult time recovering data from RAID drives, with the exception of RAID 1 drives with conventional data structure.

[edit] Data recovery in the event of a failed array

With larger disk capacities, the odds of a disk failure during a rebuild are not negligible. In that event, the difficulty of extracting data from a failed array must be considered. Only RAID 1 stores all data on each disk, and, although it may depend on the controller, some RAID 1 disks can be read as a single conventional disk. This means a dropped RAID 1 disk, although damaged, can often be recovered reasonably easily using a software recovery program; if the damage is more severe, data can often be recovered by professional data recovery specialists. RAID 5 and other striped or distributed arrays present much more formidable obstacles to data recovery in the event the array fails.

[edit] Drive error recovery algorithms


Many modern drives have internal error recovery algorithms that can take upwards of a minute to recover and re-map data that the drive fails to read easily. Many RAID controllers, however, will drop a non-responsive drive after about 8 seconds. The array can thus drop a good drive simply because it was not given enough time to complete its internal error recovery procedure, leaving the rest of the array vulnerable. So-called enterprise-class drives limit the error recovery time and prevent this problem, but desktop drives can be quite risky for this reason. A fix specific to Western Digital drives was once available: a utility called WDTLER.exe could enable TLER (time-limited error recovery) on a Western Digital desktop drive, limiting the error recovery time to 7 seconds so that the drive would not be dropped from the array. As of October 2009, Western Digital has locked out this feature in desktop drives such as the Caviar Black.[28] Western Digital enterprise-class drives ship from the factory with TLER enabled, to prevent them from being dropped from RAID arrays. Similar technologies are used by Seagate, Samsung, and Hitachi.

As of late 2010, support for configuring ATA Error Recovery Control has been added to the smartmontools program, which now allows many desktop-class hard drives to be configured for use on a RAID controller.[28]
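
As a minimal sketch (assuming a drive that supports SCT ERC, root privileges, and the example device path /dev/sda), the setting can be applied with smartmontools' smartctl, whose "-l scterc,READTIME,WRITETIME" option takes times in tenths of a second; note that many drives forget the setting on power cycle, so it must be reapplied at each boot. The Python wrapper below simply invokes that command:

import subprocess

def set_erc(device, deciseconds=70):
    # Limit read/write error recovery to, e.g., 70 ds = 7 seconds so a
    # RAID controller does not drop the drive as non-responsive.
    subprocess.run(
        ["smartctl", "-l", f"scterc,{deciseconds},{deciseconds}", device],
        check=True,
    )

set_erc("/dev/sda")  # example device path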

[edit] Increasing recovery time


Drive capacity has grown at a much faster rate than transfer speed, and error rates have fallen only a little in comparison. Therefore, larger-capacity drives may take hours, if not days, to rebuild. Rebuild speed is further limited when the entire array remains in operation at reduced capacity.[29] Given a RAID array with only one disk of redundancy (RAID 3, 4, and 5), a second failure during the rebuild would cause complete failure of the array. Even though individual drives' mean time between failures (MTBF) has increased over time, this increase has not kept pace with the drives' increased storage capacity. The time to rebuild the array after a single disk failure, as well as the chance of a second failure during a rebuild, have increased over time.[30]
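
A rough lower bound on rebuild time follows from the fact that every byte of the replacement drive must be written. The Python sketch below uses illustrative numbers only; real rebuilds are slower when the array stays in service:

def min_rebuild_hours(capacity_bytes, rate_bytes_per_s):
    # The rebuild can be no faster than capacity / sustained rate.
    return capacity_bytes / rate_bytes_per_s / 3600

# A 2 TB drive at a sustained 100 MB/s:
print(f"{min_rebuild_hours(2e12, 100e6):.1f} hours")  # ~5.6 hours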

[edit] Operator skills, correct operation

In order to provide the desired protection against physical drive failure, a RAID array must be properly set up and maintained by an operator with sufficient knowledge of the chosen RAID configuration, array controller (hardware or software), failure detection and recovery. Unskilled handling of the array at any stage may exacerbate the consequences of a failure, and result in downtime and full or partial loss of data that might otherwise be recoverable.

In particular, the array must be monitored, and any failures detected and dealt with promptly. Failure to do so leaves the array running in a degraded state, vulnerable to further failures; eventually enough failures may accumulate that the entire array becomes inoperable, resulting in data loss and downtime. In that case, any protection the array provides merely delays the loss.

The operator must know how to detect failures or verify healthy state of the array, identify which drive failed, have replacement drives available, and know how to replace a drive and initiate a rebuild of the array.

[edit] Other problems and viruses

While RAID may protect against physical drive failure, the data is still exposed to destruction by operators, software, hardware, and viruses. Many studies[31] cite operator error as the most common source of malfunction, such as a server operator replacing the wrong disk in a faulty RAID array and disabling the system (even if only temporarily) in the process.[32] Most well-designed systems therefore include separate backup systems that hold copies of the data but do not allow much interaction with it; most copy the data and remove the copy from the computer for safe storage.

[edit] History

Norman Ken Ouchi at IBM was awarded a 1978 U.S. patent 4,092,732[33] titled "System for recovering data stored in failed memory unit." The claims for this patent describe what would later be termed RAID 5 with full stripe writes. This 1978 patent also mentions that disk mirroring or duplexing (what would later be termed RAID 1) and protection with dedicated parity (what would later be termed RAID 4) were prior art at that time.

The term RAID was first defined by David A. Patterson, Garth A. Gibson and Randy Katz at the University of California, Berkeley, in 1987. They studied the possibility of using two or more drives to appear as a single device to the host system and published a paper: "A Case for Redundant Arrays of Inexpensive Disks (RAID)" in June 1988 at the SIGMOD conference.[1]

This specification suggested a number of prototype RAID levels, or combinations of drives. Each had theoretical advantages and disadvantages. Over the years, different implementations of the RAID concept have appeared. Most differ substantially from the original idealized RAID levels, but the numbered names have remained. This can be confusing, since one implementation of RAID 5, for example, can differ substantially from another. RAID 3 and RAID 4 are often confused and even used interchangeably.

One of the early uses of RAID 0 and 1 was the Crosfield Electronics Studio 9500 page layout system based on the Python workstation. The Python workstation was a Crosfield managed international development using PERQ 3B electronics, benchMark Technology's Viper display system and Crosfield's own RAID and fibre-optic network controllers. RAID 0 was particularly important to these workstations as it dramatically sped up image manipulation for the pre-press markets. Volume production started in Peterborough, England in early 1987.

[edit] Vinum

Vinum is a logical volume manager, also called software RAID, that implements the RAID 0, RAID 1, and RAID 5 models, both individually and in combination, as well as JBOD. Vinum is part of the base distribution of the FreeBSD operating system, and versions exist for NetBSD, OpenBSD, and DragonFly BSD. The Vinum source code is currently maintained in the FreeBSD source tree. Vinum is invoked as "gvinum" on FreeBSD version 5.4 and up.

[edit] Software RAID vs. Hardware RAID

The distribution of data across multiple disks can be managed either by dedicated hardware or by software. Additionally, there are hybrid RAIDs that are partly software- and partly hardware-based solutions.

With a software implementation, the operating system manages the disks of the array through the normal drive controller (ATA, SATA, SCSI, Fibre Channel, etc.). With present CPU speeds, software RAID can be faster than hardware RAID.
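
As a hedged example of a software implementation, a two-disk RAID 1 mirror can be created with mdadm, the userspace tool for the Linux kernel's md driver. The device names below are placeholders, the command destroys existing data on the named partitions, and it must be run as root; the Python sketch simply invokes the tool:

import subprocess

# Create /dev/md0 as a mirror of two example partitions.
subprocess.run(
    ["mdadm", "--create", "/dev/md0", "--level=1",
     "--raid-devices=2", "/dev/sdb1", "/dev/sdc1"],
    check=True,
)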

A hardware implementation of RAID requires, at a minimum, a special-purpose RAID controller. On a desktop system, this may be a PCI expansion card or a capability built into the motherboard. In larger RAIDs, the controller and disks are usually housed in an external multi-bay enclosure. The controller handles the management of the disks and performs the parity calculations needed by many RAID levels. This option tends to provide better performance and makes operating system support easier.

Hardware implementations also typically support hot swapping, allowing failed drives to be replaced while the system is running. In rare cases, hardware controllers themselves have failed, which can result in data loss. Hybrid RAIDs have become very popular with the introduction of inexpensive hardware RAID controllers: the hardware is a normal disk controller with no RAID features, but a boot-time application allows users to set up arrays that are configured via the BIOS. The operating system then needs specialized RAID drivers that make the array appear as a single block device. Since these controllers actually do all calculations in software, not hardware, they are often called "fakeraids". Unlike software RAID, these "fakeraids" typically cannot span multiple controllers.

Example configuration: a simple Vinum configuration to mirror drive "enterprise" to drive "excelsior" (RAID 1):

drive enterprise device /dev/da1s1d
drive excelsior device /dev/da2s1d
volume mirror
  plex org concat
    sd length 512m drive enterprise
  plex org concat
    sd length 512m drive excelsior
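
Assuming this configuration were saved to a file (for example, mirror.conf, a name chosen here purely for illustration), the array would typically be created by passing the file to the tool, e.g. "gvinum create mirror.conf", after which the new mirror volume can be used like any other disk device.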

[edit] Non-RAID drive architectures

Main article: Non-RAID drive architectures

Non-RAID drive architectures also exist, and are often referred to, like RAID, by standard acronyms, several of them tongue-in-cheek. A single drive is referred to as a SLED (Single Large Expensive Drive), by contrast with RAID, while an array of drives without any additional control (accessed simply as independent drives) is referred to as a JBOD (Just a Bunch Of Disks). Simple concatenation is referred to as SPAN, or sometimes as JBOD, though the latter usage is proscribed in careful use because of the alternative meaning just cited.