local file systems update

24
Local file systems update Red Hat Luk´ s Czerner February 23, 2013

Upload: lukas-czerner

Post on 29-Nov-2014

765 views

Category:

Technology


2 download

DESCRIPTION

The slides gives sn overview on what is currently cooking in local Linux file systems and what has been done in the recent past. With btrfs getting stabilized, xfs gaining more traction, ext4 improvements, new storage capabilities and file system requirements we are in the exciting new era where it might be hard to keep on track with the recent development. This talk should get you a picture on where are we heading to, get you familiar with the new features and capabilities and give you an idea how to use them correctly.

TRANSCRIPT

Page 1: Local file systems update

Local file systems update

Red Hat

Lukas Czerner

February 23, 2013

Page 2: Local file systems update

Copyright © 2013 Lukas Czerner, Red Hat.Permission is granted to copy, distribute and/or modify thisdocument under the terms of the GNU Free DocumentationLicense, Version 1.3 or any later version published by the FreeSoftware Foundation; with no Invariant Sections, no Front-CoverTexts, and no Back-Cover Texts. A copy of the license is includedin the COPYING file.

Page 3: Local file systems update

Agenda

1 Linux file systems overview

2 Challenges we’re facing today

3 Xfs

4 Ext4

5 Btrfs

6 Questions ?

Page 4: Local file systems update

Part I

Linux kernel file systems overview

Page 5: Local file systems update

File systems in Linux kernel

Linux kernel has a number of file systemsCluster, network, localSpecial purpose file systemsVirtual file systems

Close interaction with other Linux kernel subsystemsMemory ManagementBlock layerVFS - virtual file system switch

Optional stackable device driversdevice mappermdraid

Page 6: Local file systems update

Applications (Processes)

direct I/O(O_DIRECT)

VFS

PageCache

nvme

hooked in Device Drivers(hook in similar likestacked devices like

mdraid/device mapper do)

iomemory-vslwith module option

LVMBlock I/O Layer

optional stackable devices on topof “normal” block devices – work on bios

mdraid devicemapper

drbd ...

I/O Scheduler

maps bios to requests

deadlinecfq noop

request-baseddevice mapper targets

dm-multipath

SCSI mid layer virtio_blk iomemory-vsl

Physical devices

HDD SSD DVDdrive

MicronPCIe Card

Fusion-ioPCIe Card

LSIRAID

AdaptecRAID

QlogicHBA

EmulexHBA

...

anonymous pages(malloc)

read

(2)

write

(2)

open

(2)

stat(2

)

chm

od(2

)

...

BIOs (Block I/O)

sysfs(transport attributes)

/dev/vd*

SCSI upper layer

/dev/sda .../dev/sdb/dev/fio*

SCSI low layermegaraid sas aacraid qla2xxx ...libata

ahci ata_piix ...

lpfc

Transport Classesscsi_transport_fc

scsi_transport_sas

scsi_transport_...

lvm

/dev/fio*

/dev/nvme#n#

mtip32xx/dev/rssd*

The Linux I/O Stack Diagram (version 0.1, 2012-03-06)http://www.thomas-krenn.com/en/oss/linux-io-stack-diagram.htmlCreated by Werner Fischer and Georg SchönbergerLicense: CC-BY-SA 3.0, see http://creativecommons.org/licenses/by-sa/3.0/

block based FSext2 ext3

btrfs

ext4

xfs ifs

iso9660 ...

NFS coda

Network FS

gfs ocfs

smbfs ...

pseudo FS specialpurpose FSproc sysfs

futexfs

usbfs ...

tmpfs ramfs

devtmpfspipefs

network

nvmedevice

The Linux I/O Stack Diagramversion 0.1, 2012-03-06

outlines the Linux I/O stack as of Kernel version 3.3

Page 7: Local file systems update

Most active local file systems

File system Commits Developers Active developersExt4 648 112 13

Ext3 105 43 2

Xfs 650 61 8

Btrfs 1302 114 21

Page 8: Local file systems update

Number of lines of code

10000

20000

30000

40000

50000

60000

70000

80000

01/01/05

01/01/06

01/01/07

01/01/08

01/01/09

01/01/10

01/01/11

01/01/12

01/01/13

01/01/14

Line

s of

cod

e

Ext3Btrfs

XfsExt4

Page 9: Local file systems update

Part II

Challenges we’re facing today

Page 10: Local file systems update

Scalability

Common hardware storage capacity increasesYou can buy single 4TB drive for a reasonable priceBigger file system and file size

Common hardware computing power and parallelismincreases

More processes/threads accessing the file systemLocking issues

I/O stack designed for high latency low IOPSProblems solved in networking subsystem

Page 11: Local file systems update

Reliability

Scalability and Reliability are closely coupled problems

Being able to fix your file systemIn reasonable timeWith reasonable memory requirements

Detect errors before your application doesMetadata checksummingMetadata should be self describingOnline file system scrub

Page 12: Local file systems update

New types of storage

Non-volatile memoryWear levelling more-or-less solved in firmwareBlock layer has it’s IOPS limitationsWe can expect bigger erase blocks

Thinly provisioned storageLying to users to get more from expensive storageFilesystems can throw away most of it’s locality optimizationCut down performanceDevice mapper dm-thinp target

Hierarchical storageHide inexpensive slow storage behind expensive fast storagePerformance depends on working set sizeImprove performanceDevice mapper dm-cache target, bcache

Page 13: Local file systems update

Maintainability issues

More file systems with different use casesMultiple set of incompatible user space applicationsDifferent set of features and defaultsEach file system have different management requirements

Requirements from different types of storageSSDThin provisioningBigger sector sizes

Deeper storage technology stackmdraiddevice mappermultipath

Having a centralized management tool is incredibly useful

Having a central source of information is a must

System Storage Manager http://storagemanager.sf.net

Page 14: Local file systems update

Part III

What’s new in xfs

Page 15: Local file systems update

Scalability improvements

Delayed loggingImpressive improvements in metadata modificationperformanceSingle threaded workload still slower then ext4, but not muchWith more threads scales much better than ext4On-disk format change

XFS scales well up to hundreds of terabytesAllocation scalabilityFree space indexing

Locking optimization

Pretty much the best choice for beefy configurations with lotsof storage

Page 16: Local file systems update

Reliability improvements

Metadata checksummingCRC to detect errorsMetadata verification as it is written to or read from diskOn-disk format change

Future workReverse mapping allocation treeOnline transparent error correctionOnline metadata scrub

Page 17: Local file systems update

Part IV

What’s new in ext4

Page 18: Local file systems update

Scalability improvements

Based on very old architectureFree space tracked in bitmaps on diskStatic metadata positionsLimited size of allocation groupsLimited file size limit (16TB)Advantages are resilient on-disk format and backwards andforward compatibility

Some improvements with bigalloc featureGroup number of blocs into clustersCluster is now the smallest allocation unitTrade-off between performance and space utilization efficiency

Extent status tree for tracking delayed extentsNo longer need to scan page cache to find delalloc blocks

Scalability is very much limited by design, on-disk format andbackwards compatibility

Page 19: Local file systems update

Reliability improvements

Better memory utilization of user space toolsNo longer stores whole bitmaps - converted to extentsBiggest advantage for e2fsck

Faster file system creationInode table initialization postponed to kernelHuge time saver when creating bigger file systems

Metadata checksummingCRC to detect errorsNot enabled by default

Page 20: Local file systems update

Part V

What’s new in btrfs

Page 21: Local file systems update

Getting stabilized

Performance improvements is not where the focus isright now

Design specific performance problemsOptimization needed in future

Still under heavy development

Not all features are yet ready or even implemented

File system stabilization takes a long time

Page 22: Local file systems update

Reliability in btrfs

Userspace tools not in very good shape

Fsck utility still not fully finished

Neither kernel nor userspace handles errors gracefully

Very good design to build on

Metadata and data checksummingBack referenceOnline filesystem scrub

Page 23: Local file systems update

Resources

Linux Weekly News http://lwn.net

Kernel mailing lists http://vger.kernel.org

linux-fsdevellinux-ext4linux-btrfslinux-xfs

Linux Kernel code http://kernel.org

Linux IO stack diagramhttp://www.thomas-krenn.com/en/oss/linuxiostack-diagram.html

Page 24: Local file systems update

The end.Thanks for listening.