The Direct Access File System (DAFS)
Matt DeBergalis, Peter Corbett, Steve Kleiman, Arthur Lent, Dave Noveck, Tom Talpey, Mark Wittle
Network Appliance, Inc.
Usenix FAST ’03
Tom Talpey
Outline
• DAFS
• DAT / RDMA
• DAFS API
• Benchmark results
DAFS – Direct Access File System
• File access protocol, based on NFSv4 and RDMA, designed specifically for high-performance data center file sharing (local sharing)
• Low latency, high throughput, and low overhead
• Semantics for clustered file sharing environment
DAFS Design Points
• Designed for high performance
  – Minimize client-side overhead
  – Base protocol: remote DMA, flow control
  – Operations: batch I/O, cache hints, chaining
• Direct application access to transport resources
  – Transfers file data directly to application buffers
  – Bypasses operating system overhead
  – File semantics
• Improved semantics to enable local file sharing
  – Superset of CIFS, NFSv3, NFSv4 (and local file systems!)
  – Consistent high-speed locking
  – Graceful client and server failover, cluster fencing
• http://www.dafscollaborative.org
DAFS Protocol
• Session-based
• Strong authentication
• Message format optimized
• Multiple data transfer models
• Batch I/O
• Cache hints
• Chaining
DAFS Protocol Enhanced Semantics
• Rich locking
• Cluster fencing
• Shared key reservations
• Exactly-once failure semantics
• Append mode, Create-unlinked, Delete-on-last-close
DAT – Direct Access Transport
• Common requirements and an abstraction of services for RDMA (Remote Direct Memory Access)
  – Portable, high-performance transport underpinning for DAFS and applications
  – Defines communications endpoints, transfer semantics, memory description, signalling, etc.
• Transfer models:
  – Send (like traditional network flow)
  – RDMA Write (write directly to advertised peer memory)
  – RDMA Read (read from advertised peer memory)
• Transport independent
  – 1 Gb/s VI/IP, 10 Gb/s InfiniBand, future RDMA over IP
• http://www.datcollaborative.org
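The three transfer models can be illustrated with a toy in-process model. This is only a sketch under stated assumptions: the `Endpoint` class and all of its method names are hypothetical and are not the DAT API, and Python bytearrays stand in for registered memory.

```python
class Endpoint:
    """Toy stand-in for one side of a connection (hypothetical, not the DAT API)."""

    def __init__(self, name):
        self.name = name
        self.peer = None
        self.inbox = []      # messages delivered by Send
        self.regions = {}    # handle -> bytearray advertised to the peer

    def connect(self, other):
        self.peer, other.peer = other, self

    def advertise(self, handle, size):
        """Register a buffer so the peer may target it with RDMA operations."""
        self.regions[handle] = bytearray(size)
        return handle

    def send(self, message):
        """Send: the message is queued at the peer, which must process it itself."""
        self.peer.inbox.append(bytes(message))

    def rdma_write(self, handle, offset, data):
        """RDMA Write: place data directly into memory the peer advertised."""
        region = self.peer.regions[handle]
        if offset + len(data) > len(region):
            raise ValueError("write exceeds advertised region")
        region[offset:offset + len(data)] = data

    def rdma_read(self, handle, offset, length):
        """RDMA Read: fetch data directly from memory the peer advertised."""
        region = self.peer.regions[handle]
        if offset + length > len(region):
            raise ValueError("read exceeds advertised region")
        return bytes(region[offset:offset + length])


client, server = Endpoint("client"), Endpoint("server")
client.connect(server)

buf = client.advertise("app-buffer", 16)
client.send(b"please fill app-buffer")        # Send: ordinary message flow
server.rdma_write(buf, 0, b"file contents")   # data lands in client memory
```

In this model only `send` requires the receiver to handle a message; the two RDMA operations touch the advertised buffer without involving the peer's code, which is the property the later read and write slides rely on.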
DAFS Inline Read
[Diagram: the client (application buffer, send descriptor, receive descriptor) sends a READ_INLINE request to the server (server buffer, send descriptor, receive descriptor), and the server answers with a REPLY. The steps are numbered 1 to 3.]
DAFS Direct Read
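[Diagram: the client (application buffer, send descriptor, receive descriptor) sends a READ_DIRECT request to the server (server buffer, send descriptor, receive descriptor). The diagram shows an RDMA Write and a REPLY from the server. The steps are numbered 1 to 3.]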
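The two read paths can be contrasted with a small simulation. It is a sketch of one reading of the diagrams, assuming that an inline read returns the data inside the REPLY (so the client copies it into the application buffer) while a direct read has the server place the data with an RDMA Write and return only a status. The function names and the tuple message format are hypothetical.

```python
def read_inline(server_file, offset, length, app_buffer):
    """Inline read: the file data travels inside the REPLY message."""
    request = ("READ_INLINE", offset, length)       # 1: client sends the request
    _, off, n = request
    reply = ("REPLY", server_file[off:off + n])     # 2: server replies with the data
    data = reply[1]
    app_buffer[:len(data)] = data                   # 3: client copies out of its receive buffer
    return {"client_copies": 1, "bytes": len(data)}


def read_direct(server_file, offset, length, app_buffer):
    """Direct read: the server places the data, the REPLY carries only status."""
    request = ("READ_DIRECT", offset, length)       # 1: request describes the target buffer
    _, off, n = request
    data = server_file[off:off + n]
    app_buffer[:len(data)] = data                   # 2: models the RDMA Write (done by the NIC, not client code)
    reply = ("REPLY", len(data))                    # 3: status-only reply
    return {"client_copies": 0, "bytes": reply[1]}


file_data = bytes(range(64))
inline_buf, direct_buf = bytearray(8), bytearray(8)
inline_stats = read_inline(file_data, 4, 8, inline_buf)
direct_stats = read_direct(file_data, 4, 8, direct_buf)
```

Both calls leave the same bytes in the application buffer; the only difference the model records is whether the client had to copy them.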
DAFS Inline Write
[Diagram: the client (application buffer, send descriptor, receive descriptor) sends a WRITE_INLINE request to the server (server buffer, send descriptor, receive descriptor), and the server answers with a REPLY. The steps are numbered 1 to 3.]
DAFS Direct Write
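[Diagram: the client (application buffer, send descriptor, receive descriptor) sends a WRITE_DIRECT request to the server (server buffer, send descriptor, receive descriptor). The diagram shows an RDMA Read and a REPLY from the server. The steps are numbered 1 to 3.]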
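The write paths mirror the read paths and can be traced the same way. This sketch assumes that an inline write carries the payload in the request, while a direct write sends only a description of the application buffer and lets the server pull the data with an RDMA Read. The function names and trace strings are hypothetical.

```python
def write_inline(app_buffer, server_buffer):
    """Inline write: the payload rides inside the WRITE_INLINE request."""
    request = ("WRITE_INLINE", bytes(app_buffer))
    trace = ["client: Send WRITE_INLINE with data"]
    server_buffer[:len(request[1])] = request[1]
    trace.append("server: copy payload out of the request")
    trace.append("server: Send REPLY")
    return trace


def write_direct(app_buffer, server_buffer):
    """Direct write: the request only describes the buffer to be read."""
    request = ("WRITE_DIRECT", len(app_buffer))
    trace = ["client: Send WRITE_DIRECT describing the buffer"]
    server_buffer[:request[1]] = app_buffer   # models the server-initiated RDMA Read
    trace.append("server: RDMA Read from the application buffer")
    trace.append("server: Send REPLY")
    return trace


inline_target, direct_target = bytearray(4), bytearray(4)
inline_trace = write_inline(bytearray(b"abcd"), inline_target)
direct_trace = write_direct(bytearray(b"abcd"), direct_target)
```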
DAFS-enabled Applications
[Diagram: three client implementation styles shown side by side, each drawn across user space, OS kernel, and hardware (H/W) layers.]

Raw Device Adapter
Stack: application (unchanged) and its buffers, disk I/O syscalls, device driver, DAFS library, DAT provider library, NIC driver, RDMA NIC
• Kernel-level plug-in
• Looks like raw disk
• App uses standard disk I/O calls
• Very limited access to DAFS features
• Performance similar to direct-attached disk

Kernel File System
Stack: application (unchanged) and its buffers, file I/O syscalls, file system, DAFS library, DAT provider library, NIC driver, RDMA NIC
• Kernel-level plug-in
• Peer to local FS
• App uses standard file I/O semantics
• Limited access to DAFS features
• Performance similar to local FS

User Library
Stack: application (modified) and its buffers, DAFS API calls, DAFS library, DAT provider library, NIC driver, RDMA NIC
• User-level library
• Best performance
• Full application access to DAFS semantics
• Paper focuses on this style
DAFS API
• File based: exports DAFS semantics
• Designed for highest application performance
• Lowest client CPU requirements of any I/O system
• Rich semantics that meet or exceed local file system capabilities
• Portable and consistent interface and semantics across platforms
  – No need for different mount options, caching policies, client-side SCSI commands, etc.
  – DAFS API interface is completely specified in an open standard document, not in OS-specific documentation
• Operating system avoidance
The DAFS API
• Why a new API?
  – Backward compatibility with POSIX is fruitless
    • File descriptor sharing, signals, fork()/exec()
  – Performance
    • RDMA (memory registration), completion groups
  – New semantics
    • Batch I/O, cache hints, named attributes, open with key, delete on last close
  – Portability
    • OS independence and semantic consistency
Key DAFS API Features
• Asynchronous
  – High performance interfaces support native asynchronous file I/O
  – Many I/Os can be issued and awaited concurrently
• Memory registration
  – Efficiently prewires application data buffers, permitting RDMA (direct data placement)
• Extended semantics
  – Batch I/O, delete on last close, open with key, cluster fencing, locking primitives
• Flexible completion model
  – Completion groups segregate related I/O
  – Applications can wait on specific requests, any of a set, or any number of a set
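The three ways of waiting described under the flexible completion model can be sketched with Python's standard `concurrent.futures` module. This is an analogy, not the DAFS API: threads stand in for asynchronous I/O, and `fake_read`, `wait_for_n`, and the `groups` dictionary are hypothetical names introduced for illustration.

```python
import time
from concurrent.futures import (
    FIRST_COMPLETED,
    ThreadPoolExecutor,
    as_completed,
    wait,
)


def fake_read(block, delay):
    """Stand-in for an asynchronous file read (hypothetical)."""
    time.sleep(delay)
    return block


def wait_for_n(futures, n):
    """Return the first n futures of the set to complete."""
    done = []
    for fut in as_completed(futures):
        done.append(fut)
        if len(done) == n:
            break
    return done


with ThreadPoolExecutor(max_workers=4) as pool:
    # Two groups keep unrelated I/O apart, in the spirit of completion groups.
    groups = {
        "data": [pool.submit(fake_read, i, 0.01 * (i + 1)) for i in range(4)],
        "log": [pool.submit(fake_read, "log-0", 0.01)],
    }
    specific = groups["log"][0].result()                                # one specific request
    first_done, _ = wait(groups["data"], return_when=FIRST_COMPLETED)   # any of a set
    two_done = wait_for_n(groups["data"], 2)                            # any number of a set
```

Keeping the log request in its own group means a wait on the data group is never satisfied by an unrelated completion.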
Key DAFS API Features
• Batch I/O
  – Essentially free I/O: amortizes costs of I/O issue over many requests
  – Asynchronous notification of any number of completions
  – Scatter/gather file regions and memory regions independently
  – Support for high-latency operations
  – Cache hints
• Security and authentication
  – Credentials for multiple users
  – Varying levels of client authentication: none, default, plaintext password, HOSTKEY, Kerberos V, GSS-API
• Abstraction
  – Server discovery, transient failure and recovery, failover, multipathing
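Independent scatter/gather of file regions and memory regions, as listed under batch I/O, means the two lists need not line up element by element; only their total lengths must agree. The sketch below models that semantics only. `batch_read` and its tuple formats are hypothetical, and the intermediate byte string is a simplification that a real implementation would avoid.

```python
def batch_read(file_data, file_regions, memory_regions):
    """Gather (offset, length) file regions and scatter them into
    (buffer, offset, length) memory regions, in order."""
    for off, length in file_regions:
        if off + length > len(file_data):
            raise ValueError("file region extends past end of file")
    total_file = sum(length for _, length in file_regions)
    total_mem = sum(length for _, _, length in memory_regions)
    if total_file != total_mem:
        raise ValueError("file and memory regions must cover the same number of bytes")

    # Treat the file regions as one logical byte stream.
    stream = b"".join(file_data[off:off + length] for off, length in file_regions)

    pos = 0
    for buf, off, length in memory_regions:
        buf[off:off + length] = stream[pos:pos + length]
        pos += length
    return pos


file_data = b"0123456789abcdef"
first, second = bytearray(6), bytearray(2)
moved = batch_read(
    file_data,
    [(0, 4), (8, 4)],                   # two 4-byte file regions
    [(first, 0, 6), (second, 0, 2)],    # a 6-byte and a 2-byte memory region
)
```

Here two file regions of 4 bytes each fill memory regions of 6 and 2 bytes, which shows that the boundaries on each side are chosen independently.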
Benchmarks
• Microbenchmarks to measure throughput and cost per operation of DAFS versus traditional network I/O
• Application benchmark to demonstrate value of modifying application to use DAFS API
Benchmark Configuration
• User-space DAFS library, VI provider
• NetApp F840 Server, fully cached workload
  – Adapters (GbE):
    • Intel PRO/1000
    • Emulex GN9000 VI/TCP
  – NFSv3/UDP, DAFS
• Sun 280R client
  – Adapters:
    • Sun “Gem 2.0”
    • Emulex GN9000 VI/TCP
• Point-to-point connections
Microbenchmarks
• Measures read performance
• NFS kernel versus DAFS user
• Asynchronous and synchronous
• Throughput versus blocksize
• Throughput versus CPU time
• DAFS advantages are evident:
  – Increased throughput
  – Constant overhead per operation
Microbenchmark Results
Application (GNU gzip)
• Demonstrates benefit of user I/O parallelism
• Read, compress, write 550MB file
• Gzip modified to use DAFS API
  – Memory preregistration, asynchronous read and write
• 16KB blocksize
• 1 CPU, 1 process: DAFS advantage
• 2 CPUs, 2 processes: DAFS 2x speedup
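The idea behind the modified gzip, overlapping reads and writes with compression, can be sketched with Python's standard library. This is not the benchmark code: threads stand in for asynchronous DAFS I/O, in-memory streams stand in for files, memory preregistration is not modeled, and `compress_overlapped` is a hypothetical name.

```python
import gzip
import io
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16 * 1024  # 16 KB block size, as in the benchmark


def compress_overlapped(src, dst):
    """Compress src to dst in gzip format, keeping one read and one write in flight."""
    comp = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip container
    with ThreadPoolExecutor(max_workers=1) as reader, \
         ThreadPoolExecutor(max_workers=1) as writer:
        pending_read = reader.submit(src.read, BLOCK)
        pending_write = None
        while True:
            block = pending_read.result()
            if not block:
                break
            pending_read = reader.submit(src.read, BLOCK)   # prefetch the next block
            out = comp.compress(block)                      # CPU work overlaps the I/O
            if pending_write is not None:
                pending_write.result()                      # surface any write error
            pending_write = writer.submit(dst.write, out)
        if pending_write is not None:
            pending_write.result()
        dst.write(comp.flush())                             # trailer and remaining data


data = bytes(range(256)) * 300
src, dst = io.BytesIO(data), io.BytesIO()
compress_overlapped(src, dst)
restored = gzip.decompress(dst.getvalue())
```

At most one read and one write are outstanding at any time, so output order is preserved while the compressor runs concurrently with both.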
GNU gzip Runtimes
Conclusion
• DAFS protocol enables high-performance local file sharing
• DAFS API leverages benefit of user space I/O
• The combination yields significant performance gains for I/O intensive applications