Bridging the Information Gap in Storage Protocol Stacks
Timothy E. Denehy, Andrea C. Arpaci-Dusseau,
and Remzi H. Arpaci-Dusseau
University of Wisconsin, Madison
2 of 32
State of Affairs
[Diagram]
• File System: Namespace, Files, Metadata, Layout, Free Space
• Interface: Block Based, Read/Write
• Storage System: Parallelism, Redundancy
3 of 32
• Information gap may cause problems
  – Poor performance
    • Partial stripe write operations
  – Duplicated functionality
    • Logging in file system and storage system
  – Reduced functionality
    • Storage system lacks knowledge of files
• Time to re-examine the division of labor
Problem
4 of 32
• Enhance the storage interface
  – Expose performance and failure information
• Use information to provide new functionality
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
Our Approach
• Informed LFS (I·LFS)
• Exposed RAID (ERAID)
5 of 32
Outline
• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion
6 of 32
• Backwards compatibility
  – Block-based interface
  – Linear, concatenated address space
• Expose information to the file system above
  – Regions
  – Performance
  – Failure
• Allow file system to utilize semantic knowledge
ERAID Goals
7 of 32
• Region
  – Contiguous portion of the address space
• Regions can be added to expand the address space
• Region composition
  – RAID: One region for all disks
  – Exposed: Separate regions for each disk
  – Hybrid
ERAID Regions
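The concatenated address space described above can be sketched as a lookup from a linear block number to the region that contains it; the struct and function names here are illustrative assumptions, not ERAID's actual code.

```c
#include <stddef.h>

/* One contiguous region of the ERAID address space (fields are
 * assumptions for illustration, not ERAID's real layout). */
struct region {
    size_t start;    /* first block of the region */
    size_t nblocks;  /* region length in blocks */
};

/* Map a linear block number to the index of the region containing it,
 * or return -1 if the block lies beyond the current address space. */
static int region_lookup(const struct region *r, size_t nregions, size_t blk)
{
    for (size_t i = 0; i < nregions; i++)
        if (blk >= r[i].start && blk < r[i].start + r[i].nblocks)
            return (int)i;
    return -1;
}
```

With the "Exposed" composition (one region per disk), a lookup like this lets the file system direct a write to a specific disk, and expansion is just appending another region entry to the table.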
8 of 32
• Exposed on a per-region basis
• Queue length and throughput
• Reveals
  – Static disk heterogeneity
  – Dynamic performance and load fluctuations
ERAID Performance Information
9 of 32
• Exposed on a per-region basis
• Number of tolerable failures
• Reveals
  – Static differences in failure characteristics
  – Dynamic failures to the file system above
ERAID Failure Information
10 of 32
Outline
• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion
11 of 32
• Log-structured file system
  – Transforms all writes into large sequential writes
  – All data and metadata are written to a log
  – Log is a collection of segments
  – Segment table describes each segment
  – Cleaner process produces empty segments
• Why use LFS for an informed file system?
  – Write-anywhere design provides flexibility
  – Ideas applicable to other file systems
I·LFS Overview
12 of 32
• Goals
  – Improve performance, functionality, and manageability
  – Minimize system complexity
• Exploits ERAID information to provide
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
I·LFS Overview
13 of 32
• NetBSD 1.5
• 1 GHz Intel Pentium III Xeon
• 128 MB RAM
• Four fast disks
  – Seagate Cheetah 36XL, 21.6 MB/s
• Four slow disks
  – Seagate Barracuda 4XL, 7.5 MB/s
I·LFS Experimental Platform
14 of 32
I·LFS Baseline Performance
• Four slow disks: 30 MB/s
• Four fast disks: 80 MB/s
15 of 32
Outline
• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion
16 of 32
• Goal: Expand storage incrementally
  – Capacity
  – Performance
• Ideal: Instant disk addition
  – Minimize downtime
  – Simplify administration
• I·LFS supports on-line addition of new disks
I·LFS On-line Expansion
17 of 32
• ERAID: Expandable address space
• Expansion is equivalent to adding empty segments
• Start with an oversized segment table
• Activate new portion of segment table
I·LFS On-line Expansion Details
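The oversized-table trick above can be sketched as follows: the segment table is allocated larger than the initial disk set, and adding a disk simply activates the next slice of entries as empty, writable segments. Names and sizes here are assumptions for illustration, not the I·LFS implementation.

```c
#include <stddef.h>

enum seg_state { SEG_INACTIVE = 0, SEG_EMPTY, SEG_LIVE };

#define MAX_SEGS 4096   /* table oversized at creation for future growth */

/* Oversized segment table: entries past `nactive` exist but are not
 * yet backed by any disk. */
struct seg_table {
    enum seg_state state[MAX_SEGS];
    size_t nactive;
};

/* Adding a disk activates the next slice of the table as empty,
 * immediately writable segments; no existing data is moved, which is
 * why expansion is effectively instant. Returns the new active count. */
static size_t expand(struct seg_table *t, size_t new_segs)
{
    if (new_segs > MAX_SEGS - t->nactive)
        new_segs = MAX_SEGS - t->nactive;
    for (size_t i = 0; i < new_segs; i++)
        t->state[t->nactive + i] = SEG_EMPTY;
    t->nactive += new_segs;
    return t->nactive;
}
```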
18 of 32
I·LFS On-line Expansion Experiment
• I·LFS immediately takes advantage of each extra disk
19 of 32
• Goal: Perform well on heterogeneous storage
  – Static performance differences
  – Dynamic performance fluctuations
• Ideal: Maximize throughput of the storage system
• I·LFS writes data proportionate to performance
I·LFS Dynamic Parallelism
20 of 32
• ERAID: Dynamic performance information
• Most file system routines are not changed
  – Aware of only the ERAID linear address space
  – Reduces file system complexity
• Segment selection routine
  – Aware of ERAID regions and performance
  – Chooses next segment based on current performance
I·LFS Dynamic Parallelism Details
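One way to realize a performance-aware segment selector like the one described above: score each region by its reported throughput discounted by queue length, and place the next segment in the best-scoring region. This is a sketch under assumed names, not the actual I·LFS routine.

```c
#include <stddef.h>

/* Per-region performance information as exposed by ERAID
 * (field names are assumptions for illustration). */
struct region_perf {
    double throughput;  /* recent MB/s observed for this region */
    int    queue_len;   /* requests currently outstanding */
};

/* Choose the region for the next segment: highest recent throughput
 * per queued request, so faster and less-loaded disks receive
 * proportionally more data. */
static size_t select_region(const struct region_perf *r, size_t n)
{
    size_t best = 0;
    double best_score = -1.0;
    for (size_t i = 0; i < n; i++) {
        double score = r[i].throughput / (1.0 + (double)r[i].queue_len);
        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```

Because only this one routine looks at regions, the rest of the file system can keep seeing a plain linear address space, which is the complexity argument the slides make.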
21 of 32
I·LFS Static Parallelism Experiment
• Simple striping limited by the rate of the slowest disk
• I·LFS provides the full throughput of the system
22 of 32
I·LFS Dynamic Parallelism Experiment
• I·LFS adjusts to the performance fluctuation
23 of 32
• Goal: Offer new redundancy options to users
• Ideal: Range of mechanisms and granularities
• I·LFS provides mirrored per-file redundancy
I·LFS Flexible Redundancy
24 of 32
• ERAID: Region failure characteristics
• Use separate files for redundancy
  – Even inode N for original files
  – Odd inode N+1 for redundant files
  – Original and redundant data in different sets of regions
• Flexible data placement within the regions
• Use recursive vnode operations for redundant files
  – Leverage existing routines to reduce complexity
I·LFS Flexible Redundancy Details
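The even/odd pairing above means either member of a mirrored pair can be found from the other with simple arithmetic. A minimal sketch (type and function names are illustrative, not from the I·LFS source):

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ilfs_ino;

/* Original files get even inode numbers; their mirrors get the next
 * odd number. */
static bool ino_is_original(ilfs_ino ino) { return (ino & 1) == 0; }

/* Map an inode to its partner: original -> mirror, mirror -> original. */
static ilfs_ino ino_mirror(ilfs_ino ino)
{
    return ino_is_original(ino) ? ino + 1 : ino - 1;
}
```

Since the mirror is an ordinary file, existing vnode operations can simply be invoked recursively on it, which is how the design keeps added complexity low.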
25 of 32
I·LFS Flexible Redundancy Experiment
• I·LFS provides a throughput and reliability tradeoff
26 of 32
• Goal: Avoid replication performance penalty
• Ideal: Replicate data immediately before failure
• I·LFS offers redundancy with delayed replication
• Avoids replication penalty for short-lived files
I·LFS Lazy Redundancy
27 of 32
• ERAID: Region failure characteristics
• Segments needing replication are flagged
• Cleaner acts as replicator
  – Locates flagged segments
  – Checks data liveness and lifetime
  – Generates redundant copies of files
I·LFS Lazy Redundancy
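The replicator pass described above might look like the following sketch: scan for flagged segments and copy only data that is still live and has survived a lifetime threshold, so short-lived files die before they are ever mirrored. Structure and field names are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Per-segment replication state (illustrative, not I·LFS's layout). */
struct segment {
    bool needs_replica;  /* flagged at write time */
    bool live;           /* still referenced (cleaner's liveness check) */
    long age_secs;       /* time since the data was written */
};

/* One replicator pass: copy flagged segments whose data is still live
 * and older than the threshold; dead data is unflagged without any
 * copy, and young data is left for a later pass. Returns the number
 * of segments replicated. */
static int replicate_pass(struct segment *segs, size_t n, long threshold)
{
    int copied = 0;
    for (size_t i = 0; i < n; i++) {
        if (!segs[i].needs_replica)
            continue;
        if (!segs[i].live) {             /* file already deleted */
            segs[i].needs_replica = false;
            continue;
        }
        if (segs[i].age_secs < threshold)
            continue;                    /* too young; revisit later */
        /* ...generate the redundant copy via the per-file mirror path... */
        segs[i].needs_replica = false;
        copied++;
    }
    return copied;
}
```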
28 of 32
I·LFS Lazy Redundancy Experiment
• I·LFS avoids performance penalty for short-lived files
29 of 32
Outline
• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion
30 of 32
Comparison with Traditional Systems
• On-line expansion
  – Yes
• Dynamic parallelism (heterogeneous storage)
  – Yes, but with duplicated functionality
• Flexible redundancy
  – No, the storage system is not aware of file composition
• Lazy redundancy
  – No, the storage system is not aware of file deletions
31 of 32
Conclusion
• Introduced ERAID and I·LFS
• Extra information enables new functionality
  – Difficult or impossible in traditional systems
• Minimal complexity
  – 19% increase in code size
• Time to re-examine the division of labor
32 of 32
Questions?
http://www.cs.wisc.edu/wind/