FAST '15. Authors: Changman Lee, Dongho Sim, Jooyoung Hwang, and Sangyeun Cho, Samsung Electronics Co. F2FS: A New File System for Flash Storage. Presented by YouJune Go, 2015. 08. 18. System Software Laboratory, Department of CSE @POSTECH


FAST '15
Authors: Changman Lee, Dongho Sim, Jooyoung Hwang, and Sangyeun Cho, Samsung Electronics Co.

F2FS: A New File System for Flash Storage

Presented by YouJune Go

2015. 08. 18.
System Software Laboratory
Department of CSE @POSTECH


Introduction

• NAND flash memory has been widely used in various devices.
• Server systems have started using flash devices as their primary storage.
• However, flash memory has several limitations:
  • erase-before-write, sequential writes within an erased block, and limited write cycles per erase block.
• Random writes are bad for flash storage devices:
  • Free space fragmentation.
  • Sustained random write performance degrades.
  • Lifetime reduction.
• Sequential-write-oriented file systems:
  • Log-structured file systems, copy-on-write file systems.


Key Design Considerations

• Flash-friendly on-disk layout.
• Cost-effective index structure.
• Multi-head logging.
• Adaptive logging.
• Fsync acceleration with roll-forward recovery.


Flash-friendly On-disk Layout

• Flash awareness
  • All file system metadata are located together for locality.
  • File system cleaning is done in units of a section (the FTL's GC unit).
• Cleaning cost reduction
  • Multi-head logging for hot/cold data separation.


On-disk layout in detail

• Superblock (SB)
  • Partition information and F2FS parameters (not changeable).
• Checkpoint (CP)
  • File system status, bitmaps for valid NAT/SIT sets, orphan inode list, and summary entries of active segments.
• Segment Information Table (SIT)
  • Per-segment valid block counts and validity bitmaps for the blocks in the Main area.
• Node Address Table (NAT)
  • Block address table used to locate all the node blocks stored in the Main area.
• Segment Summary Area (SSA)
  • Summary entries recording the owner of every block in the Main area (parent inode number and its node/data offset).
• Main Area: each block is typed as node or data.
  • A node block stores an inode or indices of data blocks (not data).
  • A data block contains either directory or user file data.


LFS index structure

• Update propagation issue: the wandering tree problem.
• One log.

[Figure: LFS index structure. Labels: SB, CP, inode map, segment usage, segment summary, inode for directory, inode for regular file, directory data, file data, indirect pointer block, direct pointer block.]


F2FS index structure

• Restrained update propagation: node address translation method.
• Multi-head log.

[Figure: F2FS index structure. Labels: SB, CP, NAT, segment usage, segment summary, inode for directory, inode for regular file, directory data, file data, indirect pointer block, direct pointer block.]
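The difference between the two index structures can be sketched by listing which blocks have to be rewritten when a single data block of a large file is updated. This is a simplified model, not F2FS code: the function names are hypothetical, and the file is assumed to reach its data through an inode, an indirect pointer block, and a direct pointer block.

```python
# Hypothetical model: blocks rewritten out-of-place when one data block of a
# large file is updated. Names and structure are illustrative assumptions.

def lfs_blocks_rewritten(pointer_chain):
    """Classic LFS: each block on the path stores the physical address of its
    child, so relocating the data block forces every ancestor to be rewritten,
    and the inode map entry changes as well (the wandering tree problem)."""
    written = ["file data"]
    for block in reversed(pointer_chain):  # direct -> indirect -> inode
        written.append(block)
    written.append("inode map")
    return written


def f2fs_blocks_rewritten(pointer_chain):
    """F2FS: node blocks refer to each other by node ID and the NAT translates
    node ID -> physical address. Only the direct node block that holds the
    data index is rewritten; its new address becomes a NAT entry update."""
    direct_node = pointer_chain[-1]
    return ["file data", direct_node, "NAT entry"]


chain = ["inode", "indirect pointer block", "direct pointer block"]
lfs = lfs_blocks_rewritten(chain)
f2fs = f2fs_blocks_rewritten(chain)
print(lfs)
print(f2fs)
```

In this model the LFS update touches every level of the tree, while the F2FS update stops at the direct node block because the NAT absorbs the address change.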


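File look-up operation example

• Assuming a file “/dir/file”

NAT (inode → location): root → 100, dir → 200, file → 300

(1) Read the root inode at location 100, search for the dentry named "dir", and get its inode number.
(2) Read the dir inode at location 200, search for the dentry named "file", and get its inode number.
(3) Read the file inode at location 300, and consequently get the actual data of the file.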
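The look-up walk for “/dir/file” can be sketched with dictionaries standing in for the NAT and for node blocks. The inode numbers (1, 2, 3), the block contents, and the function names are assumptions made for illustration; only the locations 100, 200, and 300 come from the slide. As a further simplification, directory entries are stored directly in the inode here.

```python
# Hypothetical in-memory stand-ins for on-disk structures.
ROOT_INO = 1  # assumed inode number for the root directory

# NAT: inode number -> block address (locations taken from the slide).
NAT = {1: 100, 2: 200, 3: 300}

# Block address -> node block contents (assumed, simplified layout).
BLOCKS = {
    100: {"dentries": {"dir": 2}},    # root inode
    200: {"dentries": {"file": 3}},   # dir inode
    300: {"data": b"file contents"},  # file inode
}


def read_inode(ino):
    """Translate an inode number through the NAT, then read the block."""
    return BLOCKS[NAT[ino]]


def lookup(path):
    """Resolve a path one component at a time, starting at the root."""
    ino = ROOT_INO
    for name in (part for part in path.split("/") if part):
        ino = read_inode(ino)["dentries"][name]
    return read_inode(ino)


print(lookup("/dir/file")["data"])
```

Every step goes through the NAT, so relocating an inode block only requires changing one NAT entry rather than the dentry that refers to it.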


Multi-head logging

• Data temperature classification (hotter > colder).
  • Node > Data
  • Direct node > Indirect node
  • Directory > User file
• Separation of multi-head logs in NAND flash.
  • Zone-aware log allocation for set-associative FTL mapping.
  • Multi-stream interface.
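The temperature ordering above can be sketched as a small classifier that assigns each block to one of six logs. The hot/warm/cold assignment follows the classification described in the F2FS paper; the enum, the function name, and its parameters are hypothetical, and user-specified cold data and multimedia files are left out for brevity.

```python
from enum import Enum


class Temp(Enum):
    HOT = 0
    WARM = 1
    COLD = 2


def classify(is_node, is_direct=True, is_dir=False, moved_by_cleaning=False):
    """Return (block type, temperature) for a block about to be logged.

    Assumed rules: direct nodes are hotter than indirect nodes, and
    directory blocks are hotter than regular-file blocks.
    """
    if is_node:
        if not is_direct:
            return ("node", Temp.COLD)  # indirect node block
        return ("node", Temp.HOT if is_dir else Temp.WARM)
    if is_dir:
        return ("data", Temp.HOT)       # directory entry block
    if moved_by_cleaning:
        return ("data", Temp.COLD)      # survived cleaning, likely long-lived
    return ("data", Temp.WARM)          # ordinary user data


print(classify(is_node=True, is_dir=True))
print(classify(is_node=False, moved_by_cleaning=True))
```

Each distinct (type, temperature) pair would map to its own active log, so blocks with similar expected lifetimes end up in the same segments.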


Cleaning

• Cleaning is the process of reclaiming scattered and invalidated blocks to produce free segments for further logging.
  • Occurs when storage capacity has filled up.
  • Done in units of a section.
• Triggered as a foreground or a background process.
• Cleaning procedure
  (1) Victim selection: pick a victim section by referencing the Segment Information Table (SIT).
      Greedy algorithm for foreground cleaning.
      Cost-benefit algorithm for background cleaning.
  (2) Valid block check: identify the valid blocks in the victim and load their parent index structures, which are found through the Segment Summary Area.
  (3) Migration: move valid blocks to other space in the Main area.
  (4) Mark the victim section as “pre-free”.
      Pre-free sections are freed after the next checkpoint is made.
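The two victim-selection policies can be sketched as follows. The greedy policy picks the section with the fewest valid blocks. For cost-benefit, this sketch assumes the classic LFS score (1 - u) * age / (1 + u), where u is the section utilization; the exact formula F2FS uses is not given in the slides, and the `Section` type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Section:
    sec_id: int
    valid_blocks: int
    total_blocks: int
    age: float  # assumed: time since the section was last modified


def greedy_victim(sections):
    """Foreground cleaning: minimize migration work right now."""
    return min(sections, key=lambda s: s.valid_blocks)


def cost_benefit_victim(sections):
    """Background cleaning: weigh reclaimable space and age against the cost
    of reading and rewriting the valid blocks (assumed classic LFS score)."""
    def score(s):
        u = s.valid_blocks / s.total_blocks
        return (1 - u) * s.age / (1 + u)
    return max(sections, key=score)


sections = [
    Section(0, valid_blocks=100, total_blocks=512, age=1.0),
    Section(1, valid_blocks=200, total_blocks=512, age=50.0),
    Section(2, valid_blocks=500, total_blocks=512, age=100.0),
]
print(greedy_victim(sections).sec_id)
print(cost_benefit_victim(sections).sec_id)
```

With these sample values the two policies disagree: greedy takes the emptiest section, while cost-benefit prefers an older section whose remaining valid blocks are less likely to be invalidated soon.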


Foreground and Background Cleaning

• Foreground cleaning
  • Identifies valid blocks quickly by using the validity bitmaps in the SIT.
  • After identification, F2FS retrieves the parent node blocks containing their indices from the SSA information and moves the valid blocks to free logs.
• Background cleaning
  • Does not issue actual I/O to migrate valid blocks. Instead, F2FS loads the blocks into the page cache and marks them as dirty (lazy migration).
  • Background cleaning does not kick in while foreground cleaning is in progress.


Adaptive Logging

• To reduce cleaning cost under highly aged conditions, F2FS changes its write policy dynamically.
  • Normal logging (append logging, logging to clean segments)
    Needs cleaning operations if there is no free segment.
    Cleaning causes mostly random reads and sequential writes.
  • Threaded logging (logging to dirty segments)
    Reuses invalid blocks in dirty segments.
    Needs no cleaning.
    Causes random writes.
• F2FS switches between normal logging and threaded logging depending on a predefined value k (the ratio of clean segments).
  • If k > 5%, normal logging is used. Otherwise, threaded logging is used.

[Figure: threaded logging writes data into the invalid blocks of a segment.]
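The switching rule can be written as a single comparison. This is a minimal sketch of the policy as the slide states it; the function and constant names are hypothetical, and integer arithmetic is used so that the 5% boundary is exact.

```python
CLEAN_THRESHOLD_PERCENT = 5  # the predefined k from the slide


def choose_logging_mode(clean_segments, total_segments):
    """Return "normal" when more than 5% of segments are clean, else "threaded"."""
    if clean_segments * 100 > CLEAN_THRESHOLD_PERCENT * total_segments:
        return "normal"    # append to clean segments; cleaning may be needed later
    return "threaded"      # reuse invalid blocks in dirty segments; no cleaning


print(choose_logging_mode(clean_segments=200, total_segments=1000))
print(choose_logging_mode(clean_segments=30, total_segments=1000))
```

The trade-off is that threaded logging avoids cleaning at the price of random writes, so it is reserved for the case where clean segments are nearly exhausted.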


Sudden Power Off Recovery

• Checkpointing
  • Implemented to provide a consistent recovery point after a sudden power-off or system crash.
  • On events like sync, umount, and foreground cleaning, F2FS triggers the checkpoint procedure:
    (1) All dirty node and dentry blocks in the page cache are flushed.
    (2) It suspends ordinary writing activities.
    (3) File system metadata such as NAT, SIT, and SSA are written to their dedicated areas on the disk.
    (4) Finally, F2FS writes a checkpoint pack, consisting of the following information:
        Header and footer
        NAT and SIT bitmaps
        NAT and SIT journals
        Summary blocks of active segments
        Orphan blocks


Roll-back Recovery

• After a sudden power-off, F2FS rolls back to the latest consistent checkpoint.
  • Maintains shadow copies of the checkpoint, NAT, and SIT blocks.
  • Recovers the latest checkpoint.
  • Keeps NAT/SIT journals in the checkpoint to avoid NAT and SIT writes.

[Figure: on-disk areas SB, CP, NAT, SIT, SSA, and Main area; the CP and NAT/SIT areas each hold two copies (0 and 1); NAT/SIT journaling.]
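Choosing the checkpoint to roll back to can be sketched as picking the newest intact copy out of the two shadow copies. The validity rule used here (a pack is intact when its header and footer carry the same version) is an assumption based on the header and footer listed in the checkpoint pack; the `CheckpointPack` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckpointPack:
    header_version: int
    footer_version: int

    def is_valid(self):
        # Assumed rule: a pack interrupted by power loss has a footer that
        # does not match its header.
        return self.header_version == self.footer_version


def pick_checkpoint(cp0, cp1) -> Optional[CheckpointPack]:
    """Return the most recent valid pack, or None if both are damaged."""
    valid = [cp for cp in (cp0, cp1) if cp.is_valid()]
    if not valid:
        return None
    return max(valid, key=lambda cp: cp.header_version)


older = CheckpointPack(header_version=7, footer_version=7)
torn = CheckpointPack(header_version=8, footer_version=6)  # write interrupted
chosen = pick_checkpoint(older, torn)
print(chosen)
```

Because a new pack is written to the copy that is not currently in use, an interrupted checkpoint never damages the last consistent one.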


Roll-forward Recovery

• Fsync handling
  • On fsync, a checkpoint is not necessary.
  • Only the file's data blocks and their direct node blocks are written, with an fsync mark.
• Roll-forward recovery procedure
  Let N denote the log position of the last stable checkpoint.
  (1) F2FS collects the direct node blocks that have the special flag and are located in N+n (n: the number of blocks updated since the last checkpoint).
  (2) It then loads the most recently written node blocks, named N-n, into the page cache.
  (3) It compares the data indices between N-n and N+n.
  (4) Finally, if F2FS detects different data indices, it refreshes the cached node blocks with the new indices stored in N+n and marks them as dirty.
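The four recovery steps can be sketched over plain dictionaries and lists. The data layout is an assumption: the checkpointed state (N-n) maps a node ID to its list of data indices, and the log after the checkpoint (N+n) is a list of (node ID, data indices, fsync mark) tuples. The function name is hypothetical.

```python
def roll_forward(checkpointed_nodes, log_after_checkpoint):
    """Replay fsync-marked direct node blocks written after the checkpoint.

    checkpointed_nodes: node ID -> data indices as of the checkpoint (N-n).
    log_after_checkpoint: (node ID, data indices, has_fsync_mark) in log order (N+n).
    Returns the refreshed page cache and the set of node IDs marked dirty.
    """
    page_cache = {}
    dirty = set()
    for nid, new_indices, has_fsync_mark in log_after_checkpoint:
        if not has_fsync_mark:  # (1) collect only blocks with the special flag
            continue
        # (2) load the checkpointed version of the node into the page cache
        cached = page_cache.setdefault(nid, list(checkpointed_nodes.get(nid, [])))
        # (3) compare data indices, (4) refresh and mark dirty on a difference
        if cached != new_indices:
            page_cache[nid] = list(new_indices)
            dirty.add(nid)
    return page_cache, dirty


checkpointed = {7: [10, 11, 12]}
log = [(7, [10, 40, 12], True), (8, [99], False)]
cache, dirty = roll_forward(checkpointed, log)
print(cache, dirty)
```

Node 8 is ignored because it carries no fsync mark, so only data that an application explicitly synced is recovered beyond the checkpoint.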


Evaluation

• Experimental setup
  • Mobile and server systems.
  • Performance comparison with ext4, btrfs, and nilfs2.
  • Values in parentheses are seq-rd, seq-wr, rand-rd, and rand-wr in MB/s.

Source: the paper


Mobile Benchmark

• In F2FS, more than 90% of writes are sequential.
• F2FS reduces the write amount per fsync by using roll-forward recovery.
• BTRFS and NILFS2 performed worse than ext4.
  • BTRFS: heavy indexing overheads; NILFS2: periodic data flush.
  • For Iozone-RW, BTRFS and NILFS2 write 15% and 41% more I/Os than ext4, respectively.
  • For Iozone-RR, BTRFS has 50% more I/Os than the other file systems.

Source: the paper


Server Benchmark

• The performance gain of F2FS over ext4 is larger on the SATA SSD than on the PCIe SSD.
  • Varmail: 2.5x on the SATA SSD and 1.8x on the PCIe SSD.
  • Oltp: 16% on the SATA SSD and 13% on the PCIe SSD.
• Discard size matters on the SATA SSD due to interface overhead.
  • When using a small discard size (256KB) for F2FS, fileserver performance is degraded by 18%.

Source: the paper


Multi-head Logging

• Using more logs gives better hot and cold data separation.
  • 2 logs: node, data.
  • 4 logs: hot node, warm/cold node, hot data, warm/cold data.

Source: the paper


Adaptive Logging Performance

• Adaptive logging gives graceful performance degradation under highly aged volume conditions.
• Fileserver test on SATA SSD (94% util)
  • Sustained performance improvement: 2x/3x over ext4/btrfs.
• Iozone test on eMMC (100% util)
  • Sustained performance is similar to ext4.

Source: the paper


Conclusion

• F2FS features
  • Flash-friendly on-disk layout: aligns the FS GC unit with the FTL GC unit.
  • Cost-effective index structure: restrains write propagation.
  • Multi-head logging: cleaning cost reduction.
  • Adaptive logging: graceful performance degradation in aged conditions.
  • Roll-forward recovery: fsync acceleration.
• F2FS shows performance gains over other Linux file systems.
  • 3.1x (iozone) and 2x (SQLite) speedup over ext4.
  • 2.5x (SATA SSD) and 1.8x (PCIe SSD) speedup over ext4 (varmail).


Thank you, Q&A


BACKUP SLIDES
