TRANSCRIPT
RAID Systems, Ver. 2.0
Jan 09, 2005, Syam

RAID Primer – Redundant Array of Inexpensive Disks
(the letters are variously expanded: random, real-time, redundant; array, assembly; interconnected, independent, inter-related; devices, drives; etc.)
Physical Drive: an actual hard disk
Physical Array: one or more physical drives
Logical Array: formed by splitting or combining physical arrays
Logical Drive: one or more logical arrays
Drive Layout
[Diagram: physical drives form Physical Arrays 0 and 1; these are split/combined into Logical Arrays 0, 1, and 2, which together form a Logical Drive]
Why RAID
- CPU and memory performance vs. disk I/O
- Disk reliability: MTBF (Mean Time Between Failures) of an array = MTBF of a single drive / # of drives => fault tolerance is needed
- Use multiple small, inexpensive disks in an array, yielding performance that exceeds that of a Single Large Expensive Disk
RAID Benefits
- Higher data security
- Fault tolerance
- Improved availability
- Increased, integrated capacity
- Improved performance

RAID Costs
- Planning and design
- Hardware
- Software
- Setup and training
- Maintenance
RAID Tradeoffs
[Diagram: the RAID tradeoff triangle]
RAID Misconceptions
- Blanket statements
- The "invulnerability complex"
- RAID 0 is really "AID 0", not RAID 0: it has no redundancy
Hardware vs. Software RAID
Hardware RAID
- Hardware manages the RAID independently of the host and presents the host a single drive per array
- The controller operates simultaneously with the system
- Highly fault tolerant
Software RAID
- Software manages the RAID; it lives in host memory and consumes CPU cycles
- The array is only functional when the array software is loaded (if the array is down, how does the array software load?)
Hardware vs. Software RAID

                    Hardware RAID        Software RAID
Implementation      Dedicated hardware   Software kernel
Automatic Failover  Yes                  Yes
Hot Swap            Yes                  No
Cost                Hundreds of dollars  None
CPU Impact          Negligible           Typically 5-15%
RAID Concepts
Mirroring – one method of data redundancy
- Data is written simultaneously to two hard disks
- 100% redundancy protects against the failure of either disk
[Diagram: the same "My DATA" stored on both disks]
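The mirroring idea above can be sketched in a few lines of Python; the in-memory "disks" and function names are illustrative, not from the slides.

```python
# Minimal mirroring sketch: two hypothetical in-memory "disks".
# Every write goes to both, so either disk alone can serve all reads.
disk_a: dict[int, bytes] = {}
disk_b: dict[int, bytes] = {}

def mirrored_write(block: int, data: bytes) -> None:
    disk_a[block] = data  # written simultaneously to both disks
    disk_b[block] = data

mirrored_write(0, b"My DATA")

# If disk_a fails, the identical copy on disk_b still answers reads.
print(disk_b[0])  # b'My DATA'
```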
RAID Concepts
Striping – disks used in parallel
- Each drive is partitioned into stripes, from one sector (512 bytes) to several MB
- The partition size is referred to as the striping unit
- Pieces of files are stored on multiple disks
- Files can be broken up into bytes or blocks
[Diagram: Drive 0, Drive 1, and Drive 2, each divided into striping units]
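A quick sketch of how striping maps a byte offset to a drive and a stripe; the drive count and stripe size here are illustrative choices, not values from the slides.

```python
# Striping sketch: logical blocks are dealt round-robin across the drives,
# so consecutive blocks can be transferred in parallel.
N_DRIVES = 3
STRIPE_UNIT = 512  # bytes; real striping units range from one sector to several MB

def locate(byte_offset: int) -> tuple[int, int]:
    """Return (drive, stripe index on that drive) for a logical byte offset."""
    block = byte_offset // STRIPE_UNIT
    return block % N_DRIVES, block // N_DRIVES

print(locate(0))     # (0, 0) -> Drive 0
print(locate(512))   # (1, 0) -> Drive 1
print(locate(1536))  # (0, 1) -> back on Drive 0, next stripe down
```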
RAID Concepts
Parity – another method of data redundancy
- Take N pieces of data, calculate one additional piece, and store the N+1 pieces on separate drives
- If any one of the N+1 pieces of data is lost, it can be recreated from the other N

RAID Concepts: Parity Example
D1 = 10100101
D2 = 11110000
D3 = 00111100
D4 = 10111001
Parity = D1 XOR D2 XOR D3 XOR D4
       = ((10100101 XOR 11110000) XOR 00111100) XOR 10111001
       = 11010000
Now five pieces of data are stored on five separate disks.
Assume D3 becomes corrupt; it can be restored by:
D3 = D1 XOR D2 XOR D4 XOR Parity
   = ((10100101 XOR 11110000) XOR 10111001) XOR 11010000
   = 00111100
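The worked example above can be checked directly with Python's XOR operator:

```python
# The four data bytes from the example, written as binary literals.
d1, d2, d3, d4 = 0b10100101, 0b11110000, 0b00111100, 0b10111001

# Parity is the XOR of all the data pieces.
parity = d1 ^ d2 ^ d3 ^ d4
print(f"{parity:08b}")  # 11010000

# If D3 is lost, XOR-ing the survivors with the parity recreates it.
restored = d1 ^ d2 ^ d4 ^ parity
print(f"{restored:08b}")  # 00111100
```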
RAID Degraded Operation and Rebuilding
Operation in a degraded state:
- Two-drive mirror => performance equals that of a single drive
- Striped array with parity => performance suffers while regenerating the lost information
Rebuilding:
- Two-drive mirror => copy the entire good drive to the replacement drive
- Striped array with parity => must determine new parity information
RAID Degraded Operation and Rebuilding
- RAID can continue to operate during a rebuild
- Hardware RAID rebuilds faster than software RAID
- Automatic rebuild: the controller detects the failed drive and automatically rebuilds onto the replacement
- Manual rebuild: the administrator initiates the rebuild; it can be run at off-peak times
RAID Reliability
Component reliability vs. system reliability: system reliability is a function of the reliability of the components.

1/MTBF_system = 1/MTBF_1 + 1/MTBF_2 + ... + 1/MTBF_N

For N identical components: MTBF_system = MTBF_component / N
RAID Reliability
Example: a RAID with 4 drives, each with an MTBF of 500,000 hours, plus one 300,000-hour component (presumably the controller):

1/MTBF = 1/500,000 + 1/500,000 + 1/500,000 + 1/500,000 + 1/300,000
MTBF ≈ 88,235 hours

Reliability decreased from 500,000 to 88,235 hours => an 82% decrease.
"RAID reliability" usually refers to RAID with fault tolerance.
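The slide's arithmetic can be reproduced in a few lines; the reading of the fifth 300,000-hour term as the controller is an assumption.

```python
def system_mtbf(component_mtbfs):
    """MTBF of components in series: failure rates (1/MTBF) simply add."""
    return 1 / sum(1 / m for m in component_mtbfs)

# Four drives at 500,000 hours plus one 300,000-hour component
# (assumed here to be the controller).
mtbf = system_mtbf([500_000] * 4 + [300_000])
print(round(mtbf))  # 88235, matching the slide's 88,235 hours
```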
RAID Fault Tolerance
- Ability of a RAID system to withstand the loss of some hardware without loss of data or availability
- When a fault occurs, the array enters a degraded state; the failed drive must be replaced and the array rebuilt
RAID Availability
- Ability for users to access data
- Depends on: hardware reliability, fault tolerance, hot swapping, automatic rebuilding, service
RAID Backups
- Most RAID levels use striping
- Possible threats to data integrity: unexpected hard disk failure, failures of support hardware, physical damage, software problems, viruses, human error
RAID Levels
- The 1988 paper defined RAID levels 1-5
- Now there are single RAID levels 0-7 and multiple (nested) RAID levels
JBOD – Just a Bunch Of Disks
- Spans multiple physical drives into one logical drive
- No fault tolerance; not a RAID
RAID 0 - Striping
- Disk striping with no parity
- Analogy: writing an essay with 3 hands instead of 1
- Increased I/O
- Not a valid RAID implementation due to lack of fault tolerance
RAID 0 - Striping
- Supported by all hardware controllers
- Supported by most software
- Minimum of two hard disks
- Array Capacity = Smallest Drive Size * # of Drives
- Storage Efficiency = 100% of drives
- Fault Tolerance: None
- Availability: Lowest of all RAID levels; a failure takes the array down until it is rebuilt and restored
- Degradation and Rebuilding: Not applicable
RAID 0 - Striping
- Random Read Performance: Very Good, increases with larger stripe size
- Random Write Performance: Very Good, increases with larger stripe size
- Sequential Read Performance: Very Good to Great
- Sequential Write Performance: Very Good
- Cost: Lowest of all RAID levels
- Special Considerations: Daily backups
- Uses: Non-critical data, hobbyists, high-end gaming
RAID 1 - Mirroring
- 100% data redundancy
- No I/O speed increase
- When a drive fails, the other operates until the failed drive is replaced
RAID 1 - Mirroring
- Supported by all hardware controllers
- Supported by most software
- Exactly two hard disks
- Array Capacity = Smallest Drive Size
- Storage Efficiency = 50% of drives
- Fault Tolerance: Very Good
- Availability: Very Good; most controllers allow a hot spare and automatic rebuilding
- Degradation and Rebuilding: Slight degradation of reads, writes improve; rebuilding is relatively fast
RAID 1 - Mirroring
- Random Read Performance: Good
- Random Write Performance: Good
- Sequential Read Performance: Fair
- Sequential Write Performance: Good
- Cost: Relatively high
- Special Considerations: Size limitation
- Uses: High fault tolerance without high capacity; small databases, accounting and financial data
RAID 2 – Memory-Style ECC
- Introduces parity
- Same principle as ECC memory: bit-level striping with Hamming code
- Not used today due to cost and complexity
RAID 2 - Memory-Style ECC
- Special hardware controller required
- Typically 10 data disks and 4 ECC disks
- Array Capacity = 10 * Data Disk Size
- Storage Efficiency = 71% of drives
- Fault Tolerance: Fair
- Availability: Very Good; "on the fly" error correction
- Degradation and Rebuilding: In theory, little degradation
RAID 2 - Memory-Style ECC
- Random Read Performance: Fair
- Random Write Performance: Poor
- Sequential Read Performance: Very Good
- Sequential Write Performance: Fair to Good
- Cost: Very expensive
- Special Considerations / Uses: Not used in modern systems
RAID 3 – Bit-Interleaved Parity
- Byte-level striping with a dedicated parity disk
- Read requests hit all data disks
- Write requests hit all data disks and the parity disk
- Great for high bandwidth, but not high I/O rates
- The parity disk can be a bottleneck
RAID 3 - Bit-Interleaved Parity
- Medium to high-end hardware controller required
- Minimum of three hard disks
- Array Capacity = Smallest Drive Size * (# Drives - 1)
- Storage Efficiency = (# Drives - 1) / # Drives
- Fault Tolerance: Good
- Availability: Very Good; hot swapping and automatic rebuild
- Degradation and Rebuilding: Relatively little degradation, but rebuilds can take many hours
RAID 3 - Bit-Interleaved Parity
- Random Read Performance: Good
- Random Write Performance: Poor
- Sequential Read Performance: Very Good
- Sequential Write Performance: Fair to Good
- Cost: Moderate
- Special Considerations: Not as popular as other levels
- Uses: Large files with high transfer performance; multimedia, publishing
RAID 4 – Block-Interleaved Parity
- Block-level striping with a dedicated parity disk
- Write requests use read-modify-write
- e.g., with four disks (3 data, 1 parity), a small write costs 4 disk I/Os: write the new data to disk 0 (1), read the old data from disks 1 and 2 (2), and write the new parity information (1)
- The parity disk can become a bottleneck
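The slide derives the new parity by re-reading the other data disks (reconstruct-write); the common read-modify-write shortcut instead reads only the old data and old parity. A sketch showing both yield the same parity (all values illustrative):

```python
# Three data blocks (disks 0-2) and their dedicated parity block (disk 3).
old_data = [0b10100101, 0b11110000, 0b00111100]
old_parity = old_data[0] ^ old_data[1] ^ old_data[2]

new_d0 = 0b01010101  # small write: replace the block on disk 0

# Reconstruct-write, as on the slide: re-read disks 1 and 2.
parity_reconstruct = new_d0 ^ old_data[1] ^ old_data[2]

# Read-modify-write: XOR the old data out of the parity, XOR the new data in.
parity_rmw = old_parity ^ old_data[0] ^ new_d0

print(parity_reconstruct == parity_rmw)  # True
```

Either path costs 4 disk I/Os on this 4-disk array, but read-modify-write stays at 4 I/Os no matter how many data disks the array has.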
RAID 4 – Block-Interleaved Parity
- Medium to high-end hardware controller required
- Minimum of three hard disks
- Array Capacity = Smallest Drive Size * (# Drives - 1)
- Storage Efficiency = (# Drives - 1) / # Drives
- Fault Tolerance: Good
- Availability: Very Good; hot swapping and automatic rebuild
- Degradation and Rebuilding: Moderate degradation if a drive fails, with a potentially lengthy rebuild
RAID 4 – Block-Interleaved Parity
Random Read Performance: Very Good Random Write Performance: Poor to Fair Sequential Read Performance: Good to Very
Good Sequential Write Performance: Fair to Good Cost: Moderate Special Considerations: Performance
depends on stripe size Uses: Not as common as level 3 or 5
Large Files with High transfer performance
RAID 4 – Block-Interleaved Parity
RAID 5 – Block-Interleaved Distributed Parity
- One of the most popular levels
- Eliminates the parity-disk bottleneck by distributing parity information across the array
- More efficient with small read and large write requests
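One common way the distribution is done is the "left-symmetric" layout, where the parity block moves one disk to the left on each successive stripe. This mapping is a sketch of that convention, not necessarily what any given controller uses:

```python
N_DISKS = 4

def parity_disk(stripe: int) -> int:
    """Left-symmetric RAID 5: parity rotates one disk left per stripe."""
    return (N_DISKS - 1 - stripe) % N_DISKS

# Parity lands on a different disk each stripe,
# so no single disk becomes the write bottleneck.
print([parity_disk(s) for s in range(6)])  # [3, 2, 1, 0, 3, 2]
```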
RAID 5 – Block-Interleaved Distributed Parity
- Moderately high-end hardware controller required
- Supported by some software solutions
- Minimum of three hard disks
- Array Capacity = Smallest Drive Size * (# Drives - 1)
- Storage Efficiency = (# Drives - 1) / # Drives
- Fault Tolerance: Good
- Availability: Good to Very Good; hot swapping and automatic rebuild
- Degradation and Rebuilding: Can be substantial due to distributed parity
RAID 5 – Block-Interleaved Distributed Parity
- Random Read Performance: Very Good to Great
- Random Write Performance: Fair
- Sequential Read Performance: Good to Very Good
- Sequential Write Performance: Fair to Good
- Cost: Moderate
- Special Considerations: Software RAID can greatly affect performance due to parity calculations
- Uses: Seen as the middle of the RAID tradeoff triangle
RAID 6 – P+Q Redundancy
- Block-level striping with dual distributed parity
- Adds 2D parity information
- Can survive up to two simultaneous disk failures
- High data fault tolerance for mission-critical applications
RAID 6 – P+Q Redundancy
- Special hardware controller required
- Typically a minimum of 4 hard disks
- Array Capacity = Smallest Drive Size * (# Drives - 2)
- Storage Efficiency = (# Drives - 2) / # Drives
- Fault Tolerance: Very Good to Great
- Availability: Great
- Degradation and Rebuilding: Can be substantial due to dual distributed parity
RAID 6 – P+Q Redundancy
- Random Read Performance: Very Good to Great
- Random Write Performance: Poor
- Sequential Read Performance: Good to Very Good
- Sequential Write Performance: Fair
- Cost: High
- Special Considerations: Tends to be used in proprietary systems
- Uses: Where RAID 5 plus extra fault tolerance is needed
Multiple RAID Levels
- Combine two single levels to obtain improved performance
- Most common: levels 01 and 10
- RAID level X+Y ≠ Y+X
- Usually not much impact on capacity
- More impact on fault tolerance
Multiple RAID Levels: RAID 01 vs. 10
RAID 01
- Stripe drives 1 and 2 (RAID 0) to form Stripe A; stripe drives 3 and 4 to form Stripe B
- Mirror the two sets; if drive 2 fails, all of Stripe A is lost
RAID 10
- Mirror drives 1 and 2 (RAID 1) as Mirror A; mirror drives 3 and 4 as Mirror B
- Stripe across A and B; if drive 2 fails, drive 1 still maintains the stripe
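The difference can be checked exhaustively for the 4-drive case described above (drives numbered 0-3 here instead of 1-4; the survival rules are a sketch of each layout's failure condition):

```python
from itertools import combinations

# Drives 0,1 form one set (Stripe/Mirror A); drives 2,3 the other (B).
def raid10_survives(failed: set) -> bool:
    # RAID 10 dies only if both drives of the same mirror pair fail.
    return not ({0, 1} <= failed or {2, 3} <= failed)

def raid01_survives(failed: set) -> bool:
    # RAID 01 needs at least one whole stripe set to remain untouched.
    return not (failed & {0, 1}) or not (failed & {2, 3})

pairs = [set(c) for c in combinations(range(4), 2)]
print(sum(raid10_survives(p) for p in pairs), "of", len(pairs))  # 4 of 6
print(sum(raid01_survives(p) for p in pairs), "of", len(pairs))  # 2 of 6
```

RAID 10 survives 4 of the 6 possible two-drive failures, RAID 01 only 2, which is why mirroring first and striping second is preferred.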
RAID 10 – Mirrored Stripe
- Mirroring and striping without parity
- Most common of the multiple levels
- Large arrays with high performance and high fault tolerance
RAID 10 - Mirrored Stripe
- Supported by most hardware controllers
- Even number of drives, minimum of 4 hard disks
- Array Capacity = Smallest Drive Size * # Drives / 2
- Storage Efficiency = 50%
- Fault Tolerance: Very Good to Great
- Availability: Great
- Degradation and Rebuilding: Relatively little
RAID 10 - Mirrored Stripe
- Random Read Performance: Very Good to Great
- Random Write Performance: Good to Very Good
- Sequential Read Performance: Very Good to Great
- Sequential Write Performance: Good to Very Good
- Cost: High
- Special Considerations: Low storage efficiency
- Uses: High performance and reliability; enterprise servers
RAID Comparison
(S = smallest drive size, N = # of drives; for RAID 05/50, N5 = drives per RAID 5 set, N0 = # of striped sets)

Level   # of Disks     Capacity       Efficiency   Fault Tolerance      Cost
0       2,3,4,...      S*N            100%         None                 $
1       2              S*N/2          50%          Very Good            $$
2       many           varies, large  ~70-80%      Fair                 $$$$$
3       3,4,5,...      S*(N-1)        (N-1)/N      Good                 $$
4       3,4,5,...      S*(N-1)        (N-1)/N      Good                 $$
5       3,4,5,...      S*(N-1)        (N-1)/N      Good                 $$
6       4,5,6,...      S*(N-2)        (N-2)/N      Very Good to Great   $$$
01/10   4,6,8,...      S*N/2          50%          Very Good to Great   $$$
05/50   6,8,9,10,...   S*N0*(N5-1)    (N5-1)/N5    -                    $$$$
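The capacity column of the table can be summarized as a small helper; the level set and formulas come straight from the slides, while the function itself is illustrative.

```python
def array_capacity(level: int, s: int, n: int) -> int:
    """Usable capacity: s = smallest drive size, n = number of drives."""
    if level == 0:
        return s * n          # pure striping, no redundancy
    if level in (1, 10):
        return s * n // 2     # mirroring halves the capacity
    if level in (3, 4, 5):
        return s * (n - 1)    # one drive's worth of parity
    if level == 6:
        return s * (n - 2)    # two drives' worth of parity
    raise ValueError(f"level {level} not covered here")

# Six 500 GB drives: RAID 0 keeps it all, RAID 5 gives up one drive, RAID 6 two.
print(array_capacity(0, 500, 6), array_capacity(5, 500, 6), array_capacity(6, 500, 6))
# 3000 2500 2000
```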