Bridging the Information Gap in Storage Protocol Stacks
Timothy E. Denehy, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau
University of Wisconsin, Madison

TRANSCRIPT

Page 1: Bridging the Information Gap in Storage Protocol Stacks

Bridging the Information Gap in Storage Protocol Stacks

Timothy E. Denehy, Andrea C. Arpaci-Dusseau,

and Remzi H. Arpaci-Dusseau

University of Wisconsin, Madison

Page 2

State of Affairs

• File System: Namespace, Files, Metadata, Layout, Free Space
• Interface: Block Based, Read/Write
• Storage System: Parallelism, Redundancy

Page 3

Problem

• Information gap may cause problems
  – Poor performance
    • Partial stripe write operations
  – Duplicated functionality
    • Logging in file system and storage system
  – Reduced functionality
    • Storage system lacks knowledge of files
• Time to re-examine the division of labor

Page 4

Our Approach

• Enhance the storage interface
  – Expose performance and failure information
• Use information to provide new functionality
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy

[Diagram: Informed LFS (I·LFS) running above Exposed RAID (ERAID)]

Page 5

Outline

• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion

Page 6

ERAID Goals

• Backwards compatibility
  – Block-based interface
  – Linear, concatenated address space
• Expose information to the file system above
  – Regions
  – Performance
  – Failure
• Allow file system to utilize semantic knowledge

Page 7

ERAID Regions

• Region
  – Contiguous portion of the address space
• Regions can be added to expand the address space
• Region composition
  – RAID: One region for all disks
  – Exposed: Separate regions for each disk
  – Hybrid
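The region abstraction above can be made concrete with a small sketch. This is a hypothetical illustration, not the paper's code: each region is a contiguous slice of the concatenated address space, and a logical block resolves to a (region, offset) pair.

```python
# Hypothetical sketch of ERAID's concatenated address space: each
# region is a contiguous run of blocks, and a logical block number
# is resolved to (region, offset within that region).

class Region:
    def __init__(self, name, nblocks):
        self.name = name          # e.g. a single disk, or a RAID set
        self.nblocks = nblocks    # size of this contiguous slice

class ERAID:
    def __init__(self):
        self.regions = []

    def add_region(self, region):
        # Appending a region extends the linear address space,
        # which is how on-line expansion grows the volume.
        self.regions.append(region)

    def resolve(self, block):
        # Walk the concatenation to find the owning region.
        for r in self.regions:
            if block < r.nblocks:
                return r.name, block
            block -= r.nblocks
        raise ValueError("block beyond end of address space")

e = ERAID()
e.add_region(Region("disk0", 100))
e.add_region(Region("disk1", 100))
print(e.resolve(150))   # ('disk1', 50)
```

Because the interface stays block-based and linear, a file system that ignores regions still works unchanged on top of it.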

Page 8

ERAID Performance Information

• Exposed on a per-region basis
• Queue length and throughput
• Reveals
  – Static disk heterogeneity
  – Dynamic performance and load fluctuations
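One plausible shape for this per-region information is an instantaneous queue length plus a smoothed throughput estimate. The class and smoothing policy below are assumptions for illustration, not the paper's implementation:

```python
# Hypothetical sketch of the per-region performance information
# ERAID could expose: outstanding-request count plus an
# exponentially weighted moving average (EWMA) of throughput.

class RegionStats:
    def __init__(self, alpha=0.5):
        self.queue_len = 0        # outstanding requests
        self.throughput = 0.0     # smoothed MB/s estimate
        self.alpha = alpha        # EWMA smoothing factor (assumed)

    def issue(self):
        self.queue_len += 1

    def complete(self, mb, seconds):
        self.queue_len -= 1
        sample = mb / seconds
        # EWMA tracks dynamic load fluctuations while damping noise.
        self.throughput = (self.alpha * sample
                           + (1 - self.alpha) * self.throughput)

s = RegionStats()
s.issue()
s.complete(21.6, 1.0)   # a Cheetah-sized transfer rate
s.issue()
s.complete(21.6, 1.0)
print(round(s.throughput, 2))   # 16.2
```

The queue length captures momentary load; the smoothed throughput captures both static heterogeneity (fast vs. slow disks) and slower drifts.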

Page 9

ERAID Failure Information

• Exposed on a per-region basis
• Number of tolerable failures
• Reveals
  – Static differences in failure characteristics
  – Dynamic failures to file system above

[Diagram: ERAID with a RAID-1 region]

Page 10

Outline

• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion

Page 11

I·LFS Overview

• Log-structured file system
  – Transforms all writes into large sequential writes
  – All data and metadata are written to a log
  – Log is a collection of segments
  – Segment table describes each segment
  – Cleaner process produces empty segments
• Why use LFS for an informed file system?
  – Write-anywhere design provides flexibility
  – Ideas applicable to other file systems
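The log/segment structure named above can be sketched in a few lines. This is a toy illustration under assumed names and a toy segment size, not I·LFS code:

```python
# Hypothetical sketch of LFS structures: the log is a collection of
# fixed-size segments, and a segment table records each segment's
# state (the cleaner refills the EMPTY pool).

SEG_BLOCKS = 4                      # toy segment size

class LFS:
    def __init__(self, nsegs):
        # Segment table: one entry per segment.
        self.seg_state = ["EMPTY"] * nsegs
        self.cur = None             # segment currently being filled
        self.used = 0

    def write_block(self, data):
        # All writes, data and metadata alike, append to the log.
        if self.cur is None or self.used == SEG_BLOCKS:
            self.cur = self.seg_state.index("EMPTY")
            self.seg_state[self.cur] = "DIRTY"
            self.used = 0
        self.used += 1
        return (self.cur, self.used - 1)   # (segment, block) address

fs = LFS(nsegs=3)
addrs = [fs.write_block(b"x") for _ in range(5)]
print(addrs)   # [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0)]
```

The write-anywhere property matters for what follows: because the next segment can live in any region, placement decisions can be made late, with current information.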

Page 12

I·LFS Overview

• Goals
  – Improve performance, functionality, and manageability
  – Minimize system complexity
• Exploits ERAID information to provide
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy

Page 13

I·LFS Experimental Platform

• NetBSD 1.5
• 1 GHz Intel Pentium III Xeon
• 128 MB RAM
• Four fast disks
  – Seagate Cheetah 36XL, 21.6 MB/s
• Four slow disks
  – Seagate Barracuda 4XL, 7.5 MB/s

Page 14

I·LFS Baseline Performance

• Four slow disks: 30 MB/s
• Four fast disks: 80 MB/s

Page 15

Outline

• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion

Page 16

I·LFS On-line Expansion

• Goal: Expand storage incrementally
  – Capacity
  – Performance
• Ideal: Instant disk addition
  – Minimize downtime
  – Simplify administration
• I·LFS supports on-line addition of new disks

Page 17

I·LFS On-line Expansion Details

• ERAID: Expandable address space
• Expansion is equivalent to adding empty segments
• Start with an oversized segment table
• Activate new portion of segment table
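The oversized-table trick can be sketched as follows. The names and scale are hypothetical; the point is that expansion reduces to flipping table entries from unavailable to empty, with no data movement:

```python
# Hypothetical sketch of on-line expansion: the segment table is
# created oversized, with the tail marked unusable; adding a disk
# activates the slice of entries that maps onto the new region.

TABLE_MAX = 8                        # oversized table, toy scale

class SegTable:
    def __init__(self, active):
        self.state = ["UNAVAILABLE"] * TABLE_MAX
        for i in range(active):
            self.state[i] = "EMPTY"
        self.active = active

    def expand(self, new_segments):
        # On-line expansion: flip the next entries to EMPTY so the
        # allocator can use them immediately, with no downtime.
        for i in range(self.active, self.active + new_segments):
            self.state[i] = "EMPTY"
        self.active += new_segments

t = SegTable(active=4)
t.expand(2)                          # a new disk worth 2 segments
print(t.state.count("EMPTY"))        # 6
```

Since empty segments are exactly what the allocator consumes, the new disk is usable the moment its entries are activated.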

Page 18

I·LFS On-line Expansion Experiment

• I·LFS immediately takes advantage of each extra disk

Page 19

I·LFS Dynamic Parallelism

• Goal: Perform well on heterogeneous storage
  – Static performance differences
  – Dynamic performance fluctuations
• Ideal: Maximize throughput of the storage system
• I·LFS writes data proportionate to performance

Page 20

I·LFS Dynamic Parallelism Details

• ERAID: Dynamic performance information
• Most file system routines are not changed
  – Aware of only the ERAID linear address space
  – Reduces file system complexity
• Segment selection routine
  – Aware of ERAID regions and performance
  – Chooses next segment based on current performance
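The policy can be sketched as a single scoring function. The score below is an assumption for illustration (the paper does not give this formula): favor high throughput, penalize a long queue, and steer the next segment to the winner, so faster disks naturally receive proportionally more data.

```python
# Hypothetical sketch of performance-aware segment selection: only
# this routine knows about regions; the rest of the file system sees
# the linear address space.

def pick_region(stats):
    # stats: {region: (throughput_mb_s, queue_len)}
    # Assumed score: smoothed throughput discounted by current load.
    def score(r):
        tput, qlen = stats[r]
        return tput / (1 + qlen)
    return max(stats, key=score)

stats = {
    "fast-disk": (21.6, 1),   # busy but fast (Cheetah-like)
    "slow-disk": (7.5, 0),    # idle but slow (Barracuda-like)
}
print(pick_region(stats))     # fast-disk
```

Because the choice is re-evaluated at every segment boundary, the policy adapts to dynamic fluctuations, not just static heterogeneity.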

Page 21

I·LFS Static Parallelism Experiment

• Simple striping limited by the rate of the slowest disk
• I·LFS provides the full throughput of the system

Page 22

I·LFS Dynamic Parallelism Experiment

• I·LFS adjusts to the performance fluctuation

Page 23

I·LFS Flexible Redundancy

• Goal: Offer new redundancy options to users
• Ideal: Range of mechanisms and granularities
• I·LFS provides mirrored per-file redundancy

Page 24

I·LFS Flexible Redundancy Details

• ERAID: Region failure characteristics
• Use separate files for redundancy
  – Even inode N for original files
  – Odd inode N+1 for redundant files
  – Original and redundant data in different sets of regions
• Flexible data placement within the regions
• Use recursive vnode operations for redundant files
  – Leverage existing routines to reduce complexity
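The even/odd pairing can be sketched directly. The helper names and region sets are hypothetical; the invariant is the one stated on the slide: an original at even inode N has its mirror at odd inode N+1, placed in a disjoint set of regions.

```python
# Hypothetical sketch of per-file mirroring via inode pairing:
# originals live at even inode numbers, their redundant copies at
# the next odd number, in a different failure domain.

def shadow_inode(ino):
    assert ino % 2 == 0, "originals live at even inode numbers"
    return ino + 1

def is_original(ino):
    return ino % 2 == 0

def place_copies(ino, regions_a, regions_b):
    # Write each copy to a disjoint set of ERAID regions, so one
    # region failure cannot take out both copies.
    return {ino: regions_a, shadow_inode(ino): regions_b}

print(place_copies(42, ["disk0", "disk1"], ["disk2", "disk3"]))
```

Implementing the mirror as an ordinary file is what lets the existing vnode operations be reused recursively, keeping the added complexity small.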

Page 25

I·LFS Flexible Redundancy Experiment

• I·LFS provides a throughput and reliability tradeoff

Page 26

I·LFS Lazy Redundancy

• Goal: Avoid replication performance penalty
• Ideal: Replicate data immediately before failure
• I·LFS offers redundancy with delayed replication
• Avoids replication penalty for short-lived files

Page 27

I·LFS Lazy Redundancy

• ERAID: Region failure characteristics
• Segments needing replication are flagged
• Cleaner acts as replicator
  – Locates flagged segments
  – Checks data liveness and lifetime
  – Generates redundant copies of files
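The cleaner's replicator pass can be sketched as below. The data layout and the lifetime cutoff are assumptions for illustration; the logic follows the slide: scan flagged segments, skip dead blocks and files that died young, and copy only the survivors.

```python
# Hypothetical sketch of the cleaner doubling as a lazy replicator.

LIFETIME_THRESHOLD = 30.0   # seconds; assumed cutoff for "short-lived"

def replicate_pass(segments, now):
    copied = []
    for seg in segments:
        if not seg["flagged"]:
            continue
        for blk in seg["blocks"]:
            # Liveness check: deleted data needs no redundant copy.
            if not blk["live"]:
                continue
            # Lifetime check: files that are still young may yet be
            # deleted, so delaying their copy can avoid work entirely.
            if now - blk["created"] < LIFETIME_THRESHOLD:
                continue
            copied.append(blk["file"])
        seg["flagged"] = False
    return copied

segs = [{"flagged": True, "blocks": [
    {"file": "a", "live": True,  "created": 0.0},    # old, live: copy
    {"file": "b", "live": False, "created": 0.0},    # deleted: skip
    {"file": "c", "live": True,  "created": 95.0},   # young: skip
]}]
print(replicate_pass(segs, now=100.0))   # ['a']
```

Short-lived files thus die before they are ever replicated, which is exactly where the replication penalty is avoided.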

Page 28

I·LFS Lazy Redundancy Experiment

• I·LFS avoids performance penalty for short-lived files

Page 29

Outline

• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion

Page 30

Comparison with Traditional Systems

• On-line expansion
  – Yes
• Dynamic parallelism (heterogeneous storage)
  – Yes, but with duplicated functionality
• Flexible redundancy
  – No, the storage system is not aware of file composition
• Lazy redundancy
  – No, the storage system is not aware of file deletions

Page 31

Conclusion

• Introduced ERAID and I·LFS
• Extra information enables new functionality
  – Difficult or impossible in traditional systems
• Minimal complexity
  – 19% increase in code size
• Time to re-examine the division of labor

Page 32

Questions?

http://www.cs.wisc.edu/wind/