The Evolution of File Systems
Thomas Rivera, Hitachi Data Systems Craig Harmer, April 2011
The Evolution of File Systems © 2012 Storage Networking Industry Association. All Rights Reserved. 2
SNIA Legal Notice
The material contained in this tutorial is copyrighted by the SNIA. Member companies and individuals may use this material in presentations and literature under the following conditions:
Any slide or slides used must be reproduced without modification.
The SNIA must be acknowledged as source of any material used in the body of any document containing material from these presentations.
This presentation is a project of the SNIA Education Committee. Neither the Author nor the Presenter is an attorney and nothing in this presentation is intended to be nor should be construed as legal advice or opinion. If you need legal advice or legal opinion please contact an attorney. The information presented herein represents the Author's personal opinion and current understanding of the issues involved. The Author, the Presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information. NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.
Abstract
The File Systems Evolution
Over time, additional file systems appeared, focusing on specialized requirements such as data sharing, remote file access, distributed file access, parallel file access, HPC, archiving, security, etc.
Due to the dramatic growth of unstructured data, files as the basic units for data containers are morphing into file objects, providing more semantics and feature-rich capabilities for content processing.
This presentation will:
Categorize and explain the basic principles of currently available file system architectures (e.g. Local, Shared, SAN, Clustered, Network, Distributed, Parallel, etc.)
Explain technologies like Scale-Out NAS, NAS Aggregation, NAS Virtualization, NAS Clustering, Global Namespace, Parallel NFS
Review new file system architectures being developed
Related Tutorials
Check out SNIA Tutorial:
Using File Server Protocols for Block-based Storage Workloads
Check out SNIA Tutorial:
Understanding Enterprise NAS
Check out SNIA Tutorial:
pNFS and NFS V4.2
Why File Systems Have Evolved
Scale: Megabytes → Petabytes
Requirements: high availability, data sharing, remote access, performance, archiving, others…
[Figure: file system types plotted against time. Types shown: Local File System, Shared File System, SAN File System, Cluster File System, Network File System, Distributed File System, Parallel File System, Object File System, ...]
(Not a strict timeline; new capabilities are generally incremental)
Where File Systems Live
[Figure: where the file system sits in the operating system]
User space: user applications and libraries (ls, mv, rm, cp, ...)
System calls: open(), close(), read(), write(), ioctl(), mmap(), ...
Kernel space: process management, memory management, scheduler, IPC, VFS, file system, data cache* and segmap cache, buffers, volume manager, device drivers, DMA
Machine-dependent code
Hardware
*The data cache can be bypassed by using direct I/O.
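As a rough illustration of the system calls listed above, the following sketch writes a file through write() and reads it back through mmap(). It relies only on Python's standard os, mmap and tempfile modules. The helper name write_then_map is hypothetical, and the payload is assumed to be non-empty, since an empty file cannot be memory-mapped.

```python
import mmap
import os
import tempfile


def write_then_map(payload: bytes) -> bytes:
    """Write payload with write(), then read it back through a memory mapping."""
    fd, path = tempfile.mkstemp()      # open() a scratch file; returns a descriptor
    try:
        os.write(fd, payload)          # write() system call
        os.fsync(fd)                   # push cached data down to the device
        # mmap() maps the file into the process address space (read-only here).
        with mmap.mmap(fd, len(payload), access=mmap.ACCESS_READ) as view:
            return bytes(view[:])
    finally:
        os.close(fd)                   # close() system call
        os.unlink(path)                # remove the scratch file
```

Both paths end up at the same file system code in the kernel; they differ only in how the application reaches the data.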
What File Systems Do (UNIX example)
[Figure: UNIX inode layout]
File locators (“inodes”): inodes numbered 0 through 19 in the diagram
Inode file attributes: file owner, file type, permissions, last access, size, # of links, ...
Data locators (pointers): direct 0 through direct 9, single indirect, double indirect, triple indirect
Data (blocks): the data blocks referenced by the pointers
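The pointer scheme above (ten direct pointers followed by single, double and triple indirect pointers) can be sketched as a lookup that reports which kind of pointer covers a given logical block of a file. This is a minimal sketch, not the code of any particular UNIX file system: the function name locate_block is hypothetical, and the number of pointers per indirect block is a parameter because it depends on block and pointer sizes that the slide does not state.

```python
from typing import Tuple

NUM_DIRECT = 10  # direct 0 .. direct 9, as in the inode diagram


def locate_block(block_index: int, ptrs_per_block: int) -> Tuple[str, int]:
    """Return (pointer kind, index within that kind's range) for a logical block."""
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    if block_index < NUM_DIRECT:
        return ("direct", block_index)
    remaining = block_index - NUM_DIRECT
    kinds = ("single indirect", "double indirect", "triple indirect")
    for depth, kind in enumerate(kinds, start=1):
        span = ptrs_per_block ** depth   # blocks reachable through this level
        if remaining < span:
            return (kind, remaining)
        remaining -= span
    raise ValueError("block index lies beyond the triple indirect range")
```

With an assumed 1024 pointers per indirect block, logical block 10 is the first block reached through the single indirect pointer, and block 1034 is the first reached through the double indirect pointer.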
A File System Taxonomy
File Systems:
Local File System
Shared File System
SAN File System
Cluster File System
Network File System
Distributed File System
Distributed Parallel File System
Local File System
The file system is co-located in the server with the application.
[Figure: a single server containing the application and its local file system]
Local File System
Separate “islands” of data
Limitation: no data sharing
[Figure: four servers, each with its own application and local file system]
One Way to Share Data: Scale-Up
Vertical scaling
Another Way to Share Data: Scale-Out
Horizontal scaling over a storage network
Shared Device: a multi-LUN device shared among clients. Each client has exclusive access to a dedicated LUN.
Shared Data: a physical device shared among clients. Clients access LUNs concurrently.
Shared Device ≠ Shared Data
Data Access with Shared/Global File System
Separate logical and physical placement
Metadata server (MDS)
File access is a three-step transaction:
Step 1: Request access (client to metadata server)
Step 2: Metadata delivery (metadata server to client)
Step 3: Data access (client reads or writes the data directly)
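The three-step transaction can be sketched in a few lines. This is an in-memory toy under stated assumptions, not the protocol of any real product: the class names MetadataServer and DataServer, the method names, and the layout format (a list of data-server id and block id pairs) are all hypothetical.

```python
class MetadataServer:
    """Holds only file layouts (metadata), never file contents."""

    def __init__(self):
        self._layouts = {}  # file name -> ordered list of (data server id, block id)

    def record_layout(self, name, layout):
        self._layouts[name] = list(layout)

    def request_access(self, name):
        # Step 1 (request access) arrives here;
        # the return value is step 2 (metadata delivery).
        return list(self._layouts[name])


class DataServer:
    """Holds raw blocks keyed by block id."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(name, mds, data_servers):
    layout = mds.request_access(name)  # steps 1 and 2
    # Step 3: the client fetches the blocks itself, bypassing the metadata server.
    return b"".join(data_servers[sid].read_block(bid) for sid, bid in layout)
```

The point of the split is that the metadata server only answers the question "where is the data?"; the bulk data never passes through it.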
Shared/Global File System Asymmetric (“SAN File System”)
One active metadata server
Typically homogeneous (scaling limited by metadata server capacity)
Inter-node distance limited by storage network capability
[Figure: application servers (e.g. web servers) on a client network, connected through a storage network to shared data via data servers, with one active and one passive metadata server]
Shared/Global File System Symmetric (“Cluster File System”)
Metadata server in each node
Typically homogeneous (scaling limited by internal communication, e.g., distributed locking)
Inter-node distance limited by storage network capability
[Figure: application servers (e.g. web servers) on a client network, connected through a storage network to shared data; every node pairs a data server with an active metadata server]
Network File Systems (aka Proxy File Systems)
Enables sharing of files located on a file server among one or more client computers using a network protocol
Network protocol: e.g. NFS, CIFS, AFP, WebDAV, FTP, HTTP, ...
[Figure: several clients, each with an application and a network file system, connected by the network protocol to a file system server with a local file system]
Network File System “Stack” (Example: Sun’s NFS)
Client stack: Application, NFS Client, RPC/XDR, TCP/IP, Ethernet NIC
LAN
Server stack: Ethernet NIC, TCP/IP, RPC/XDR, NFS Server, File System, Volume Mgr, SCSI Driver, SCSI HBA
SAN
Storage: SCSI Port, Data
Wide Area Network File Systems
Consolidation eases:
Management
Administration
Cost
Compliance
Global file sharing and collaboration
Location consolidation and optimization
But: WAN performance is low compared to LAN/SAN performance
[Figure: an NFS client stack (Application, NFS Client, RPC/XDR, TCP/IP, Ethernet NIC) connected over a WAN to an NFS server stack (Ethernet NIC, TCP/IP, RPC/XDR, NFS Server, File System, Volume Mgr, SCSI Driver, SCSI HBA), which reaches its data through a SCSI port over a SAN]
Improving Wide Area File System Performance
Application-specific optimizations: email, document management, SQL, ...
Protocol-specific optimizations: HTTP, NFS, CIFS, WebDAV, FTP, TCP/IP, ...
Transport acceleration: TCP accelerators
Intelligent caching: read-ahead, deferred write, coherency, ...
Data compression: algorithms, file-aware differencing, data aggregation, I/O clustering, chunk-based de-duplication, cross-protocol data reduction, ...
[Figure: several NFS/CIFS clients on a LAN reach an NFS server on a remote LAN through a pair of compression engines, one at each end of the WAN link; the server reaches its data over a SAN]
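Of the techniques listed above, chunk-based de-duplication is easy to show in miniature: data is cut into chunks, each chunk is identified by a hash, and only chunks not seen before need to be stored or sent across the WAN. The sketch below assumes fixed-size chunks and SHA-256 digests purely for simplicity; the function names are hypothetical, and real products may chunk and hash quite differently.

```python
import hashlib


def dedup_chunks(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep one copy of each distinct chunk."""
    store = {}    # digest -> chunk bytes (unique chunks only)
    recipe = []   # ordered digests needed to rebuild the original data
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # store the chunk only the first time
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)
```

For the 16-byte input b"abcdabcdabcdxyz!" with 4-byte chunks, the recipe has four entries but the store holds only two distinct chunks.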
Distributed File System (DFS)
A network file system with files distributed among multiple file servers
Not a parallel file system
[Figure: a client (application and file system) uses a network protocol to reach three file system servers holding /a, /b and /c; the client view is a single file system with root / containing /a, /b and /c]
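The single client view over several servers amounts to a prefix lookup: the top-level directory of a path decides which server holds the file. The sketch below is illustrative only; the server names in EXPORTS and the function name resolve are hypothetical, and the mapping of /a, /b and /c to three servers simply mirrors the diagram.

```python
import posixpath

# Hypothetical mapping from namespace prefix to the file server that holds it.
EXPORTS = {"/a": "server1", "/b": "server2", "/c": "server3"}


def resolve(path: str, exports=EXPORTS):
    """Return (server, normalized path) for a path in the single client view."""
    path = posixpath.normpath(path)
    for prefix, server in exports.items():
        if path == prefix or path.startswith(prefix + "/"):
            return server, path
    raise FileNotFoundError(path)
```

Each whole file lives on exactly one server here, which is why this arrangement is distributed but not parallel.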
Distributed Parallel File System
Aggregation of storage servers: RAIN + RAID (aka Network RAID)
Global Namespace
Segments of files distributed across storage nodes
Enables parallel I/O to individual files (aka file striping)
[Figure: clients use a network protocol to reach five file servers; a single file is spread across the servers]
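File striping can be illustrated with the arithmetic that maps a byte offset in a file to a storage node. The sketch assumes simple round-robin placement of fixed-size stripe units, which is only one possible layout; the function name locate_stripe and its parameters are hypothetical.

```python
def locate_stripe(offset: int, stripe_size: int, num_servers: int):
    """Map a byte offset in a file to (server index, byte offset on that server)."""
    stripe_index = offset // stripe_size            # which stripe unit of the file
    server = stripe_index % num_servers             # round-robin placement
    units_before = stripe_index // num_servers      # earlier units on the same server
    local_offset = units_before * stripe_size + offset % stripe_size
    return server, local_offset
```

With an assumed 64 KiB stripe unit and five servers, consecutive 64 KiB ranges of one file land on servers 0, 1, 2, 3, 4 and then wrap around to server 0, so a large sequential read can be served by all five servers at once.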
NAS Aggregation
In-Band Solution
Sometimes called “NAS Router”
[Figure: an IP network leads to a NAS router that presents a global namespace over three file servers, each attached to its data through a SAN]
NAS Virtualization - Out-of-Band
Individual files / file segments pinned to file servers
Files can be distributed and/or replicated for parallel access
Files can be striped for intra-file parallel access
Clients must locate the right file server
e.g. NFSv4.1 (pNFS), Microsoft’s DFS
[Figure: four clients and a metadata server (MDS) providing the global namespace, connected over an IP network to file servers holding distributed files (File_A through File_H), striped files (File_K_1 through File_K_4) and replicated files (File_A’, File_B’, File_B’’, File_C’)]
NAS Virtualization – NFSv4.1 pNFS
In-Band NAS: application servers with NFSv4 clients reach the data over IP through a NAS appliance
Out-of-Band NAS: application servers with NFSv4.1 clients with pNFS talk over IP to a NAS appliance with NFSv4.1 pNFS extensions and reach the data over the SAN
Data path decoupled from control and metadata path
Storage Protocols:
Block: FCP, iSCSI, SRP, SAS
File: NFSv4.1
Object: OSD
Toward “Storage Grids” via NAS
Two variants of clustered data services (NFS, CIFS, HTTP, FTP, WebDAV), reached by clients over IP through a VIP address:
Classic filers (NFS and CIFS data services on a local file system): each file pinned to a single server...
Cluster (parallel) file system: all nodes serve all files...
Cloud: The New Grid
A NAS cluster is effectively a storage cloud
[Figure: groups of clients surrounding a storage cloud made up of file servers]
Data Segmentation
[Figure: matrix with axes Structured / Unstructured and Fixed Data / Dynamic Data, containing the following groups]
Media production, eCAD, mCAD, Office docs
Media-archive, DAM, Broadcast, Medical imaging, Media-Internet
Transactional systems, ERP, CRM
BI, Data warehousing, Scientific, Transaction archive
The New Reality of Data Segmentation
[Figure: matrix with axes Structured / Unstructured and Fixed Data / Dynamic Data, with a Semi Structured* category added, containing the following groups]
Media production, eCAD, mCAD, Office docs
Media-archive, DAM, Broadcast, medical imaging, Media-Internet
Transactional systems, ERP, CRM
BI, data warehousing, scientific, transaction archive
*Semi-structured data contains dynamic metadata defined by users and/or applications
Traditional Files
Data
Metadata: owner, permissions, type, last modification, ...
Semi-Structured Data
Object ID
Data
Metadata: owner, permissions, type, last modification, ...
Attributes: user/application defined
Policies: e.g., replication
Methods: e.g., encryption
The File Object Model
[Figure: an inode whose entries map each name to an object ID (OID), pointing to objects in place of data blocks]
Store: data in, OID returned
Retrieve: OID in, data returned
Each object carries:
Object ID
Data
Metadata: owner, permissions, type, last modification, ...
Attributes: user/application defined
Policies: e.g., replication
Methods: e.g., encryption
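The store and retrieve operations of the file object model can be sketched as a tiny in-memory object store with a separate name-to-OID mapping. This is a hedged illustration rather than any vendor's interface: the class ObjectStore, its methods, and the use of random UUIDs as object IDs are all assumptions made for the example.

```python
import uuid


class ObjectStore:
    """Toy object store: data goes in, an object ID (OID) comes back."""

    def __init__(self):
        self._objects = {}  # OID -> {"data": bytes, "metadata": dict}

    def store(self, data: bytes, **metadata) -> str:
        oid = uuid.uuid4().hex  # assumption: random IDs stand in for real OIDs
        self._objects[oid] = {"data": bytes(data), "metadata": dict(metadata)}
        return oid

    def retrieve(self, oid: str) -> bytes:
        return self._objects[oid]["data"]

    def metadata(self, oid: str) -> dict:
        return dict(self._objects[oid]["metadata"])


def demo() -> bytes:
    store = ObjectStore()
    namespace = {}  # name -> OID, the role the inode entries play in the diagram
    namespace["notes.txt"] = store.store(b"hello", owner="alice", policy="replicate")
    return store.retrieve(namespace["notes.txt"])
```

The name lookup and the data lookup are deliberately separate: the namespace only knows OIDs, and the store only knows objects.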
Managing File Objects
File objects can be managed like records in a relational database with user data as Binary Large Objects (BLOBs)
[Figure: a database schema in which each record is a file object with Object ID, Data, Metadata, Attributes, Policies and Methods]
Managing File Objects (Cont.)
[Figure: file object records, each with Object ID, Data, Metadata, Attributes, Policies and Methods]
Indexes
Constraints/relationships
Object search
Full text search
Join operations
Virtual views
SQL-like requests
Cursors
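To make the database analogy concrete, the sketch below keeps file objects as rows in an in-memory SQLite database, with the user data stored as a BLOB and an index on one metadata column. It uses Python's standard sqlite3 module; the table name file_objects, its columns, and the helper functions are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog with one row per file object."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_objects ("
        "oid TEXT PRIMARY KEY, owner TEXT, policy TEXT, data BLOB)"
    )
    conn.execute("CREATE INDEX idx_owner ON file_objects(owner)")  # index on metadata
    return conn


def add_object(conn, oid, owner, policy, data: bytes) -> None:
    conn.execute(
        "INSERT INTO file_objects VALUES (?, ?, ?, ?)", (oid, owner, policy, data)
    )


def find_by_owner(conn, owner):
    """Object search expressed as an SQL request; iterating the cursor yields rows."""
    cursor = conn.execute(
        "SELECT oid FROM file_objects WHERE owner = ? ORDER BY oid", (owner,)
    )
    return [row[0] for row in cursor]
```

Searching by owner, policy or any other attribute then becomes an ordinary query instead of a walk over a directory tree.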
Data Serving Hierarchy: 3 Levels of Abstraction
An application may interface with the storage subsystem at any of three layers:
Block – highest performance and very little metadata
File – high performance and some metadata
Object – medium performance and rich metadata
[Figure: an application on top of the Object, File and Block layers of a data server platform, with many-to-one mappings between layers]
Attribution & Feedback
Please send any questions or comments regarding this SNIA Tutorial to [email protected]
The SNIA Education Committee would like to thank the following individuals for their contributions to this Tutorial.
Authorship History
Original Author: Christian Bandulet
Updates:
Thomas Rivera, September 2012
Paul Massiglia, Spring 2012
Craig Harmer, April 2011
Additional Contributors
Craig Harmer
Paul Massiglia
Joseph White
Thomas Rivera
Christian Bandulet
Appendix
Reference Material
www.wikipedia.org
ADFS – Acorn's Advanced Disc filing system, successor to DFS
BFS – the Be File System used on BeOS
EFS – Encrypted filesystem, An extension of NTFS
EFS (IRIX) – an older block filing system under IRIX
Ext – Extended filesystem, designed for Linux system
Ext2 – Second extended filesystem, designed for Linux systems
Ext3 – Name for the journalled form of ext2
FAT – Used on DOS and Microsoft Windows, 12, 16 and 32 bit table depths
FFS (Amiga) – Fast File System, used on Amiga systems. This FS has evolved over time. Now counts FFS1, FFS Intl, FFS DCache, FFS2
FFS – Fast File System, used on *BSD systems
Fossil – Plan 9 from Bell Labs snapshot archival file system
Files-11 – OpenVMS filesystem
GCR – Group Code Recording, a floppy disk data encoding format used by the Apple II and Commodore Business Machines in the 5¼" disk drives for their 8-bit computers
HFS – Hierarchical File System, used on older Mac OS systems
HFS Plus – Updated version of HFS used on newer Mac OS systems
HPFS – High Performance Filesystem, used on OS/2
ISO 9660 – Used on CD-ROM and DVD-ROM discs (Rock Ridge and Joliet are extensions to this)
JFS – IBM Journaling Filesystem, provided in Linux, OS/2, and AIX
LFS – 4.4BSD implementation of a log-structured file system
MFS – Macintosh File System, used on early Mac OS systems
Minix file system – Used on Minix systems
NTFS – Used on Windows NT, Windows 2000, Windows XP and Windows Server 2003 systems
NSS – Novell Storage Services. This is a new 64-bit journaling filesystem using a balanced tree algorithm. Used in NetWare versions 5.0-up and recently ported to Linux.
OFS – Old File System, on Amiga. Nice for floppies, but fairly useless on hard drives
PFS – and PFS2, PFS3, etc. Technically interesting filesystem available for the Amiga, performs very well under a lot of circumstances. Very simple and elegant
ReiserFS – Filesystem that uses journaling
Reiser4 – Filesystem that uses journaling, newest version of ReiserFS
SFS – Smart File System, journaled file system available for the Amiga platforms
UDF – Packet based filesystem for WORM/RW media such as CD-RW and DVD.
UFS – Unix Filesystem, used on older BSD systems
UFS2 – Unix Filesystem, used on newer BSD systems
UMSDOS – FAT filesystem extended to store permissions and metadata, used for Linux
VxFS – Veritas file system, first commercial journaling file system; HP-UX, Solaris, Linux, AIX
VSAM
WAFL – Used on Network Appliance systems
XFS – Used on SGI IRIX and Linux systems
ZFS – Used on Solaris
SAM QFS (Oracle)
9P The Plan 9 and Inferno distributed file system
AFS (Andrew File System)
AppleShare
Arla (file system)
Coda
CXFS (Clustered XFS) a distributed networked file system designed by Silicon Graphics (SGI) specifically to be used in a SAN
Distributed File System (DCE)
Distributed File System (Microsoft)
Freenet
Global File System (GFS)
Google File System (GFS)
IBRIX Fusion™
InterMezzo
Isilon OneFS™
Lustre (Oracle)
NFS
OpenAFS
Server message block (SMB) (aka Common Internet File System (CIFS) or Samba file system)
Xsan (a storage area network (SAN) filesystem from Apple Computer, Inc.)
archfs (archive)
cdfs (reading and writing of CDs)
cfs (caching)
Davfs2 (WebDAV)
Devfs
ftpfs (ftp access)
fuse (filesystem in userspace, like lufs but better maintained)
GPFS an IBM cluster file system
JFFS/JFFS2 (filesystems designed specifically for flash devices)
LUFS ( replace ftpfs, ftp ssh ... access)
nntpfs (netnews)
OCFS (Oracle Cluster File System)