![Page 1: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/1.jpg)
ZFS: The last word in File Systems - IS IT?
Swaminathan Sundararaman, Sriram Subramanian
![Page 2: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/2.jpg)
ZFS: Zettabyte File System
The last word in file systems
"We've rethought everything and rearchitected it," - Jeff Bonwick, Sun distinguished engineer and chief architect of ZFS. "We've thrown away 20 years of old technology that was based on assumptions no longer true today."
![Page 3: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/3.jpg)
Our Goal
To uncover interesting policies of ZFS
Focus on:
How ZFS automatically chooses multiple block sizes to match workload
Policy and performance analysis of ZFS during synchronous workloads
![Page 4: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/4.jpg)
Methodology
Semantic Block Analysis [Prabhakaran et al. '05]
[Figure: layered stack with labels Workload, Application, OS, File System, Pseudo Device Driver (Block Inference), Disk]
![Page 5: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/5.jpg)
Preliminary Results
Naïve block allocation policy: does not work well for random workloads
Dynamically merges small block writes: suffers from Read-Modify-Write for some workloads
Poor ZFS Intent Log block allocation policy
Dynamically changes the block writing mechanism based on workload (under investigation)
![Page 6: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/6.jpg)
Outline
Infrastructure
Block Classification Strategy
Policies: Block Allocation, Dynamic Block Resizing, ZFS Intent Log (ZIL)
Conclusion
![Page 7: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/7.jpg)
Infrastructure
Pseudo Device Driver
Implemented a block driver using the Layered Driver Interface (LDI)
Ioctls to control collection of statistics
Issue: Solaris did not allow us to issue ioctls to pseudo block drivers
Solution: indirection. Wrote a dummy character driver and redirected the ioctl requests to our block device
![Page 8: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/8.jpg)
Infrastructure (Contd.)
Selective classification
Log files for Offline block analysis
Negligible performance overheads
Asynchronously written to the log file
![Page 9: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/9.jpg)
Block Classification Strategy
Uber blocks: 1024-byte blocks, identified by their magic flag
Data blocks: identified by a special pattern
Pattern repeated at every 512-byte offset
Individual data blocks identified by sequentially increasing numbers
![Page 10: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/10.jpg)
Block Classification Strategy
ZIL blocks: identified by their magic flag
Meta-data blocks: the rest of the blocks
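The classification rules above can be sketched in a few lines of Python. This is only an illustration of the strategy, not the driver's actual code: the marker values `UBER_MAGIC`, `ZIL_MAGIC`, and `DATA_PATTERN` are hypothetical placeholders (not the real ZFS on-disk magic numbers), and the position of the ZIL magic inside a block is assumed, since the slides do not state it.

```python
import struct

SECTOR = 512
# Hypothetical marker values chosen for illustration only; they are NOT the
# real ZFS on-disk magic numbers.
UBER_MAGIC = b"UBERMAGC"
ZIL_MAGIC = b"ZILMAGIC"
DATA_PATTERN = b"SBA-DATA"


def make_data_block(seq, size):
    """Build a workload data block: every 512-byte sector starts with the
    pattern followed by the block's sequence number."""
    header = DATA_PATTERN + struct.pack("<Q", seq)
    sector = header + b"\0" * (SECTOR - len(header))
    return sector * (size // SECTOR)


def classify(block):
    """Return (kind, seq) for one block seen below the file system."""
    # Uber blocks: 1024 bytes, recognized by their magic flag.
    if len(block) == 1024 and block.startswith(UBER_MAGIC):
        return ("uber", None)
    # ZIL blocks: recognized by their magic flag (position assumed unknown,
    # so the whole block is searched).
    if ZIL_MAGIC in block:
        return ("zil", None)
    # Data blocks: the pattern repeats at every 512-byte offset.
    sectors = [block[i:i + SECTOR] for i in range(0, len(block), SECTOR)]
    if sectors and all(s.startswith(DATA_PATTERN) for s in sectors):
        (seq,) = struct.unpack_from("<Q", block, len(DATA_PATTERN))
        return ("data", seq)
    # Everything else is treated as metadata.
    return ("metadata", None)
```

For example, `classify(make_data_block(7, 4096))` reports a data block with sequence number 7, while a block of zeros falls through to the metadata case.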
![Page 11: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/11.jpg)
Sequential Write of 1GB file
Block size: 4K
ZFS caches small block writes
Large sequential 128K block writes
[Figure: block sizes written to disk; y-axis: block size in KB, 0 to 144]
![Page 12: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/12.jpg)
Random writes inside 4GB file
Block size: 4K
Large 128K block write for every small 4K write
[Figure: block sizes written to disk; x-axis: 0 to 11 (unlabeled); y-axis: block size in KB, 0 to 144]
![Page 13: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/13.jpg)
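Random Writes of 4K blocks
[Figure: x-axis 0 to 9; two y-axes, 0 to 100 and 0 to 90; legend text: Expected, ZFS, Offset, Block Size]
Data labels from the figure:

| Offset (KB) | Block Size (KB) |
|---|---|
| 36 | 40 |
| 36 | 40 |
| 20 | 40 |
| 84 | 88 |
| 0 | 88 |
| 20 | 88 |
| 52 | 88 |
| 16 | 88 |
| 4 | 88 |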
![Page 14: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/14.jpg)
Random Writes of 512 bytes
[Figure: x-axis 0 to 8; left y-axis: block size in KB, 0 to 160; right y-axis: offset in KB, 0 to 160]
Data labels from the figure:

| Offset (KB) | Block Size (KB) |
|---|---|
| 0 | 0.5 |
| 16 | 16.5 |
| 64 | 64.5 |
| 32 | 32.5 |
| 150 | 128 |
| 128 | 128 |
| 127 | 127.5 |
![Page 15: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/15.jpg)
Inference
Block Allocation
Purely based on file offsets
Block size is set to 128K for offsets >= 128K
Block size is a multiple of 512 bytes for offsets < 128K
NOT based on dynamic workload characteristics
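The inferred rule can be written as a small function. This is a sketch of the inference only, not ZFS code: it assumes a single write into an otherwise empty file, and the function name and rounding detail are illustrative. It agrees with labels such as a 4K write at offset 36K giving a 40K block and a 512-byte write at offset 127K giving a 127.5K block.

```python
KB = 1024
SECTOR = 512
MAX_BLOCK = 128 * KB


def inferred_block_size(offset, length):
    """Block size (bytes) the inference predicts for one write.

    Assumption: the write lands in an otherwise empty file, so the block
    must cover everything from byte 0 up to the end of the write.
    """
    if offset >= MAX_BLOCK:
        # Beyond 128K the block size is fixed at 128K.
        return MAX_BLOCK
    end = offset + length
    # Below 128K the block is a multiple of 512 bytes, capped at 128K.
    rounded = -(-end // SECTOR) * SECTOR
    return min(rounded, MAX_BLOCK)
```

Note that the function takes no workload history as input, which is the point of the slide: only the offset (plus the write length) decides the block size.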
![Page 16: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/16.jpg)
Small Sequential Writes of 4K
Write 4K blocks
Sleep 10 sec
Write next block
[Figure: block size in KB (y-axis, 0 to 144) versus file size in KB (x-axis, 0 to 224); series: ZFS, Ideal]
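The write, sleep, write loop can be reproduced with a short script. This is a minimal sketch of such a workload generator, not the authors' tool: the function name, the fill bytes, and the choice to call only `flush()` (rather than force a synchronous write) are assumptions.

```python
import os
import tempfile
import time


def small_sequential_writes(path, count, block_size=4096, pause=10.0):
    """Append `count` blocks of `block_size` bytes, pausing after each one.

    Returns the final file size in bytes.
    """
    with open(path, "wb") as f:
        for seq in range(count):
            # Fill each block with a byte derived from its sequence number
            # so that blocks can be told apart later.
            f.write(bytes([seq % 256]) * block_size)
            # Hand the data to the OS; the pause then leaves the file
            # system free to write it out on its own schedule.
            f.flush()
            time.sleep(pause)
    return os.path.getsize(path)


if __name__ == "__main__":
    # Short demo on a temporary file with no pause between writes.
    demo_path = os.path.join(tempfile.mkdtemp(), "seq.dat")
    print(small_sequential_writes(demo_path, count=3, pause=0))
```

Running it against a file on the file system under test with the default 10-second pause mirrors the three steps listed on the slide.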
![Page 17: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/17.jpg)
Small Seq. Writes of 32KBytes
[Figure: block size in KB (y-axis, 0 to 144) versus file size in KB (x-axis, 0 to 320); series: ZFS, Ideal]
![Page 18: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/18.jpg)
Unmount after every write
[Figure: x-axis 0 to 180; y-axis: block size in KB, 0 to 140; series: Append Data, Read from Disk]
![Page 19: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/19.jpg)
Dynamic Resizing of Blocks
Applies while file size < 128K
Appending data to small files is inefficient
If data is not in memory
Small append converted to Read-Modify-Write
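A rough cost model makes the Read-Modify-Write penalty concrete. This is a simplified sketch under stated assumptions, not a measurement: it assumes a file smaller than 128K is stored as one block rounded up to 512 bytes, that an uncached append must read that whole block back, and that the grown block is then rewritten in full. The function names are illustrative.

```python
KB = 1024
SECTOR = 512
MAX_BLOCK = 128 * KB


def round_up(n):
    """Round n up to the next multiple of 512 bytes."""
    return -(-n // SECTOR) * SECTOR


def append_io(file_size, append_len, cached):
    """Return (bytes_read, bytes_written) for one append to a small file."""
    new_size = file_size + append_len
    if new_size > MAX_BLOCK:
        raise ValueError("model only covers files that fit in one block")
    # The old block must be read back only if it is not already in memory.
    bytes_read = 0 if cached or file_size == 0 else round_up(file_size)
    # The whole resized block is written, not just the appended bytes.
    bytes_written = round_up(new_size)
    return bytes_read, bytes_written
```

Under this model, appending 4K to an uncached 60K file reads 60K and writes 64K, so a 4K append costs 124K of disk traffic.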
![Page 20: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/20.jpg)
COW in ZFS
Copy-on-write design makes most disk writes sequential
Multiple block sizes, automatically chosen to match workload
![Page 21: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/21.jpg)
ZIL Block Chaining
![Page 22: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/22.jpg)
ZIL Block Allocation
[Figure: x-axis 1 to 11; y-axis: block size in KB, 0 to 40; series: 1024, 3072, 16K, 32K, 64K]
![Page 23: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/23.jpg)
ZIL Block Allocation 33K
[Figure: x-axis 1 to 12; y-axis: block size in KB, 0 to 40; series: 33K]
![Page 24: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/24.jpg)
Conclusions
Block Allocation: purely based on file offsets, NOT based on dynamic workload characteristics
Dynamic Resizing of Blocks: applies while file size < 128K; appending data to small files is inefficient
ZFS Intent Log: internal fragmentation, bad block allocation policy, block chaining mechanism
![Page 25: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/25.jpg)
Conclusion
ZFS: The last word in file systems?
Might be the latest word, definitely not the last word!
![Page 26: ZFS: The last word in File Systems - IS IT ? Swaminathan Sundararaman Sriram Subramanian](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf951a28abf838c90adc/html5/thumbnails/26.jpg)
Questions ?