direct-fuse: removing the middleman for high-performance … · 2018-06-13 · direct-fuse:...

Direct-FUSE: Removing the Middleman for High-Performance FUSE File System Support Yue Zhu*, Teng Wang*, Kathryn Mohror + , Adam Moody + , Kento Sato + , Muhib Khan*, Weikuan Yu* Florida State University* Lawrence Livermore National Laboratory +



Direct-FUSE: Removing the Middleman for High-Performance FUSE File System Support

Yue Zhu*, Teng Wang*, Kathryn Mohror+, Adam Moody+, Kento Sato+, Muhib Khan*, Weikuan Yu*

Florida State University*
Lawrence Livermore National Laboratory+


ROSS’18 S-2

Outline

• Background & Motivation
• Design
• Performance Evaluation
• Conclusion


Introduction

• High-performance computing (HPC) systems need efficient file systems to support large-scale scientific applications
  - Different file systems are used for different kinds of data in a single job
  - Both kernel- and user-level file systems can be used by applications
  - Because kernel-level file systems are complex to develop and raise reliability and portability issues, user-level file systems are increasingly used for special-purpose I/O workloads
• Filesystem in Userspace (FUSE)
  - A software interface for Unix-like operating systems
  - It allows non-privileged users to create their own file systems without modifying kernel code
  - The user-defined file system runs as a separate process in user space
  - Examples: SSHFS, GlusterFS client, FusionFS (BigData'14)


How does FUSE Work?

• Execution path of a function call
  - Send the request to the user-level file system process
    o Application program → VFS → FUSE kernel module → user-level file system process
  - Return the data back to the application program
    o User-level file system process → FUSE kernel module → VFS → application program

[Figure: FUSE architecture. Labels: Application Program and User Level File System (user space); Virtual File System (VFS), FUSE, Page Cache, and Ext4 (kernel space); Storage Device.]


FUSE File System vs. Native File System

Metric                        FUSE File System   Native File System
# User-kernel mode switches   4                  2
# Context switches            2                  0
# Memory copies               2                  1

[Figure: FUSE architecture. Labels: Application Program and User Level File System (user space); Virtual File System (VFS), FUSE, Page Cache, and Ext4 (kernel space); Storage Device.]
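The mode-switch and context-switch counts in the comparison above can be reproduced with a small model of the two execution paths. This is a simplified sketch, not code from the talk: the `Hop` type, the hop lists, and the process names `app` and `daemon` are illustrative assumptions, and memory copies are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    component: str  # label from the architecture diagram
    space: str      # "user" or "kernel"
    process: str    # process on whose behalf this hop runs (assumed names)


def count_crossings(path):
    """Return (mode_switches, context_switches) for a round-trip path."""
    pairs = list(zip(path, path[1:]))
    mode_switches = sum(a.space != b.space for a, b in pairs)
    context_switches = sum(a.process != b.process for a, b in pairs)
    return mode_switches, context_switches


# Round trip through FUSE: the request leaves the application, passes the
# kernel, is served by the user-level file system process, and returns.
fuse_path = [
    Hop("application program", "user", "app"),
    Hop("VFS / FUSE kernel module", "kernel", "app"),
    Hop("user-level file system", "user", "daemon"),
    Hop("FUSE kernel module / VFS", "kernel", "daemon"),
    Hop("application program", "user", "app"),
]

# Round trip through a native kernel file system such as Ext4.
native_path = [
    Hop("application program", "user", "app"),
    Hop("VFS / Ext4", "kernel", "app"),
    Hop("application program", "user", "app"),
]

if __name__ == "__main__":
    print("FUSE  :", count_crossings(fuse_path))
    print("native:", count_crossings(native_path))
```

Under these assumptions the model yields 4 mode switches and 2 context switches for the FUSE path, and 2 and 0 for the native path, matching the table.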


Number of Context Switches & I/O Bandwidth

• The complexity added to the FUSE file system execution path degrades I/O bandwidth
  - tmpfs: a file system that stores data in volatile memory
  - FUSE-tmpfs: a FUSE file system deployed on top of tmpfs
  - The dd micro-benchmark and the perf system profiling tool are used to gather the I/O bandwidth and the number of context switches
  - Experiment method: continually issue 1000 writes

Block Size (KB)   Write Bandwidth                    # Context Switches
                  FUSE-tmpfs (MB/s)   tmpfs (GB/s)   FUSE-tmpfs   tmpfs
4                 163                 1.3            1012         7
16                372                 1.6            1012         7
64                519                 1.7            1012         7
128               549                 2.0            1012         7
256               569                 2.4            2012         7
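Because the two bandwidth columns use different units, the gap is easier to see as a ratio. The sketch below derives the tmpfs-to-FUSE-tmpfs bandwidth ratio and the context-switch ratio from the table values. The conversion factor of 1000 MB per GB is an assumption (the slides do not say whether decimal or binary units are meant), and the names `ROWS` and `slowdown` are illustrative.

```python
# (block size KB, FUSE-tmpfs MB/s, tmpfs GB/s, FUSE-tmpfs ctx switches, tmpfs ctx switches)
ROWS = [
    (4, 163, 1.3, 1012, 7),
    (16, 372, 1.6, 1012, 7),
    (64, 519, 1.7, 1012, 7),
    (128, 549, 2.0, 1012, 7),
    (256, 569, 2.4, 2012, 7),
]

MB_PER_GB = 1000  # assumption: decimal units


def slowdown(fuse_mb_s, tmpfs_gb_s, mb_per_gb=MB_PER_GB):
    """How many times faster native tmpfs writes are than FUSE-tmpfs writes."""
    return tmpfs_gb_s * mb_per_gb / fuse_mb_s


if __name__ == "__main__":
    for block_kb, fuse_bw, tmpfs_bw, fuse_cs, tmpfs_cs in ROWS:
        print(
            f"{block_kb:>4} KB: bandwidth ratio {slowdown(fuse_bw, tmpfs_bw):5.2f}x, "
            f"context-switch ratio {fuse_cs / tmpfs_cs:6.1f}x"
        )
```

With these inputs the bandwidth ratio is largest at the 4 KB block size, where the fixed per-request cost of the FUSE path is amortized over the least data.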


Breakdown of Metadata & Data Latency

• The actual file system operations (i.e., metadata or data operations) account for only a small share of the total execution time
  - Tests are run on tmpfs and FUSE-tmpfs
  - Real Operation (in metadata operations): the time spent performing the operation itself
  - Data Movement: the actual time spent writing within a complete write function call
  - Overhead: the cost besides the above two, e.g., the time of context switches

[Fig. 1. Time Expense in Metadata Operations: latency (ns, axis 0 to 250) of Create and Close on tmpfs and FUSE-tmpfs, split into Real Operation and Overhead. Percentage labels in the chart: 11.18%, 2.17%.]

[Fig. 2. Time Expense in Data Operations: latency (ns, axis 0 to 600) versus transfer size (1, 4, 16, 64, 128, 256 KB), split into Data Movement and Overhead. Percentage labels in the chart, in extraction order: 34.8%, 33.7%, 37.86%, 15.82%, 10.08%, 38.12%.]


Existing Solution and Our Approach

• How to reduce the overheads from FUSE?
  - Build an independent user-space library to avoid going through the kernel (e.g., IndexFS (SC'14), FusionFS)
  - However, this approach cannot support multiple FUSE libraries with distinct file paths and file descriptors
• We propose Direct-FUSE to support multiple backend I/O services for an application
  - We adapted libsysio for this purpose in Direct-FUSE
    o libsysio (developed by the Scalability team at Sandia National Laboratories) provides:
      * POSIX-like file I/O and name space support for remote file systems from an application's user-level address space


Outline

• Background & Motivation
• Design
• Performance Evaluation
• Conclusion


The Overview of Direct-FUSE

• Direct-FUSE mainly consists of three components
  1. Adapted-libsysio
     o Intercepts file paths and file descriptors to identify the backend service
     o Simplifies the metadata and data execution paths of the original libsysio
  2. lightweight-libfuse (not the real libfuse)
     o Abstracts the file system operations of the backend services into unified APIs
  3. Backend services
     o Provide the defined file system operations (e.g., FusionFS)

[Figure: Direct-FUSE architecture. Labels: Application Program; Direct-FUSE (Adapted-libsysio, lightweight-libfuse); Backend Services (FUSE-Ext4, FusionFS Client, …); Ext4; FusionFS Server, ….]


Path and File Descriptor Operations

• To facilitate the interception of file system operations for multiple backends, the operations are divided into two categories:
  1. File path operations
     i. Intercept the prefix and path (e.g., sshfs:/sshfs/test.txt) and return the mount information
     ii. Look up the corresponding inode based on the mount information, and redirect to the defined operations
  2. File descriptor operations
     i. Find the open-file record for the given file descriptor
        * The open-file record contains pointers to the inode, the current stream position, etc.
     ii. Redirect to the defined operations based on the inode info in the open-file record
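The two categories above can be illustrated with a toy dispatcher: path operations pick a backend from the prefix, and descriptor operations pick it from an open-file record. This is a minimal sketch under stated assumptions, not the Direct-FUSE or libsysio implementation. `MemBackend`, `OpenFile`, and `Dispatcher` are hypothetical names, the "inode" is simply the path string, and the backend keeps data in memory.

```python
from dataclasses import dataclass


class MemBackend:
    """Hypothetical in-memory backend standing in for a FUSE file system."""

    def __init__(self):
        self.files = {}  # path -> bytearray

    def open(self, path):
        self.files.setdefault(path, bytearray())
        return path  # in this sketch the "inode" is just the path

    def write(self, inode, offset, data):
        buf = self.files[inode]
        if len(buf) < offset:  # pad with zeros when writing past the end
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)

    def read(self, inode, offset, size):
        return bytes(self.files[inode][offset:offset + size])


@dataclass
class OpenFile:
    """Open-file record: backend, inode, and current stream position."""
    backend: MemBackend
    inode: str
    pos: int = 0


class Dispatcher:
    def __init__(self):
        self.mounts = {}      # prefix -> backend ("mount information")
        self.open_files = {}  # file descriptor -> OpenFile
        self.next_fd = 3      # 0, 1, 2 are conventionally stdio

    def mount(self, prefix, backend):
        self.mounts[prefix] = backend

    def open(self, name):
        # File path operation: split "prefix:path", find the mount,
        # then let that backend resolve the inode.
        prefix, _, path = name.partition(":")
        backend = self.mounts[prefix]  # KeyError for an unknown prefix
        fd = self.next_fd
        self.next_fd += 1
        self.open_files[fd] = OpenFile(backend, backend.open(path))
        return fd

    def write(self, fd, data):
        # File descriptor operation: the record alone identifies the backend.
        rec = self.open_files[fd]
        written = rec.backend.write(rec.inode, rec.pos, data)
        rec.pos += written
        return written

    def read(self, fd, size):
        rec = self.open_files[fd]
        data = rec.backend.read(rec.inode, rec.pos, size)
        rec.pos += len(data)
        return data

    def seek(self, fd, pos):
        self.open_files[fd].pos = pos


if __name__ == "__main__":
    d = Dispatcher()
    d.mount("sshfs", MemBackend())
    fd = d.open("sshfs:/sshfs/test.txt")
    d.write(fd, b"hello")
    d.seek(fd, 0)
    print(d.read(fd, 5))
```

Keeping the backend reference inside the open-file record means descriptor-based calls never need to re-parse a path, which is why the two categories are handled separately.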


Requirements for New Backends

• Interact with the FUSE high-level APIs
• Separated as an independent user-space library
  - The library contains the FUSE file system operations, the initialization function, and the unmount function
  - If a backend passes specialized data to the FUSE module via fuse_mount(), that data has to be made global so that later file system operations can access it
• Implemented in C/C++, or binary compatible with C/C++
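The shape of such a backend library can be sketched as follows. Real backends are C/C++ libraries built against the FUSE high-level API. This Python outline only illustrates the three required pieces (operations, initialization, unmount) and the "globalized" private data. Every name in it (`backend_init`, `backend_unmount`, `OPERATIONS`, `_state`) is hypothetical.

```python
# Module-level state: stands in for the private data that a backend would
# otherwise pass through the mount call and that must stay reachable by
# every later file system operation.
_state = None


def backend_init(root):
    """Initialization function: set up the backend's global state."""
    global _state
    _state = {"root": root, "files": {}}


def backend_unmount():
    """Unmount function: release the backend's global state."""
    global _state
    _state = None


def is_mounted():
    return _state is not None


def op_create(path):
    _state["files"][path] = b""
    return 0  # 0 signals success, mirroring the C convention


def op_getattr(path):
    if path not in _state["files"]:
        return None  # a C backend would return a negative errno instead
    return {"size": len(_state["files"][path])}


# Operations table: the analogue of the callback structure that a FUSE
# file system fills in, here reduced to two entries.
OPERATIONS = {"create": op_create, "getattr": op_getattr}

if __name__ == "__main__":
    backend_init("/tmp/demo")
    OPERATIONS["create"]("/a")
    print(OPERATIONS["getattr"]("/a"))
    backend_unmount()
```

The operations read `_state` directly rather than receiving it as an argument, which is the point of the requirement: once the kernel module is bypassed, nothing else carries the mount-time data to each call.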


Outline

• Background and Challenges
• Design
• Performance Evaluation
• Conclusion


Experimental Methodology

• We compare the bandwidth of Direct-FUSE with a local FUSE file system and the native file system, on disk and in memory, using IOzone
  - Disk
    o Ext4-fuse: FUSE file system overlying Ext4
    o Ext4-direct: Ext4-fuse bypassing the FUSE kernel module
    o Ext4-native: original Ext4 on disk
  - Memory
    o tmpfs-fuse, tmpfs-direct, and tmpfs-native are analogous to the three disk tests
• We also compare the I/O bandwidth of a distributed FUSE file system with Direct-FUSE
  - FusionFS: a distributed file system that supports metadata- and write-intensive operations


Sequential Write Bandwidth

• Direct-FUSE achieves bandwidth comparable to the native file system
  - Ext4-direct outperforms Ext4-fuse by 16.5% on average
  - tmpfs-direct outperforms tmpfs-fuse by at least 2.15x

[Figure: sequential write bandwidth (MB/s, log scale from 1 to 10000) versus write transfer size (4, 16, 64, 256, 1024 KB) for Ext4-fuse, Ext4-direct, Ext4-native, tmpfs-fuse, tmpfs-direct, and tmpfs-native.]


Sequential Read Bandwidth

• As with sequential writes, the read bandwidth of Direct-FUSE is comparable to the native file system
  - Ext4-direct outperforms Ext4-fuse by 2.5% on average
  - tmpfs-direct outperforms tmpfs-fuse by at least 2.26x

[Figure: sequential read bandwidth (MB/s, log scale from 1 to 10000) versus read transfer size (4, 16, 64, 256, 1024 KB) for Ext4-fuse, Ext4-direct, Ext4-native, tmpfs-fuse, tmpfs-direct, and tmpfs-native.]


Distributed I/O Bandwidth

• Direct-FUSE outperforms FusionFS in write bandwidth and shows comparable read bandwidth
  - Writes benefit more from bypassing the FUSE kernel module
• Direct-FUSE delivers scalability similar to the original FusionFS

[Figure, two panels (Write, Read): bandwidth (MB/s, log scale from 1 to 10000) versus number of nodes (1, 2, 4, 8, 16) for fusionfs and direct-fusionfs.]


Overhead Analysis

• The dummy read/write takes less than 3% of the complete I/O function time in Direct-FUSE, even when the I/O size is very small
  - Dummy write/read: no actual data movement; the call returns as soon as it reaches the backend service
  - Real write/read: the actual Direct-FUSE read and write I/O calls

[Figure, two panels: latency (ns, log scale from 1 to 10000) versus transfer size (1 B, 4 B, 16 B, 64 B, 256 B, 1 KB); left panel compares dummy write with real write, right panel compares dummy read with real read.]


Conclusions

• We have revealed and analyzed the context switch counts and time overheads in FUSE metadata and data operations
• We have designed and implemented Direct-FUSE, which avoids crossing the kernel boundary and supports multiple FUSE backends simultaneously
• Our experimental results indicate that Direct-FUSE achieves significant performance improvements over the original FUSE file systems


Sponsors of This Research