Fred Kuhns ( ) CS523S: Operating Systems
File System Interface and Implementations
Fred Kuhns
CS523 – Operating Systems
FS Framework in UNIX
• Provides persistent storage
• Facilities for managing data
  - file: abstraction for a data container; supports sequential and random access
  - file system: permits organizing, manipulating and accessing files
• User interface specifies behavior and semantics of relevant system calls
  - Interface exports abstractions: files, directories, file descriptors and different file systems
Kernel, Files and Directories
• kernel provides control operations to name, organize and control access to files but it does not interpret contents
• Running programs have an associated current working directory. Permits use of relative pathnames. Otherwise complete pathnames are required.
• File viewed as a collection of bytes
  - Applications requiring more structure must define and implement it themselves
Kernel, Files and Directories
• Files and directories form a hierarchical, tree-structured name space.
  - The tree forms a directed acyclic graph
• Directory entry for a file is known as a hard link.
  - Files may also have symbolic links
• File may have one or more links
• POSIX defines library routines {opendir(), readdir(), rewinddir(), closedir()}
  struct dirent {
      ino_t d_ino;
      char  d_name[NAME_MAX + 1];
  };
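As a minimal sketch of how the POSIX directory routines listed above fit together, the following C function walks one directory and prints the inode number and name of each entry. The function name count_entries is a hypothetical helper, not part of POSIX.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: print and count the entries of one directory.
 * Returns the number of entries read, or -1 if the directory cannot
 * be opened. */
int count_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int n = 0;
    struct dirent *de;
    while ((de = readdir(dir)) != NULL) {
        /* Each entry is a hard link: a name bound to an inode number. */
        printf("%lu\t%s\n", (unsigned long)de->d_ino, de->d_name);
        n++;
    }
    closedir(dir);
    return n;
}
```

Calling count_entries("/usr/local/bin") would list that directory; the kernel never interprets the file contents, only the name-to-inode bindings.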
File and Directory Organization
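[Figure: directory tree rooted at /, with top-level entries bin, etc, dev, usr and vmunix, and etc, local, bin, sh and bash at lower levels. It illustrates the pathname /usr/local/bin/bash and (hard) links.]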
File Attributes
• Type – directory, regular file, FIFO, symbolic link, special
• Reference count – number of hard links {link(), unlink()}
• Size in bytes
• Device id – device the file resides on
• Inode number - one inode per file; inodes are unique within a disk partition (device id)
• Ownership - user and group id {chown()}
• Access modes - permissions and modes {chmod()}: {read, write, execute} for {owner, group or other}
• Timestamps – three different timestamps: last access, last modify, last attribute change {utime()}
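The attributes above are what the stat() family of calls returns. The following is a small sketch that prints them for one path; the helper names file_type and print_attributes are illustrative only, and lstat() is used so that a symbolic link is reported as itself rather than as its target.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Map the type bits of st_mode onto the categories from the slide. */
const char *file_type(mode_t mode)
{
    if (S_ISDIR(mode))  return "directory";
    if (S_ISREG(mode))  return "regular file";
    if (S_ISFIFO(mode)) return "FIFO";
    if (S_ISLNK(mode))  return "symbolic link";
    return "special";
}

/* Print the attributes of `path`; returns 0 on success, -1 on error. */
int print_attributes(const char *path)
{
    struct stat st;
    if (lstat(path, &st) != 0)
        return -1;

    printf("type:       %s\n", file_type(st.st_mode));
    printf("hard links: %lu\n", (unsigned long)st.st_nlink);
    printf("size:       %lld bytes\n", (long long)st.st_size);
    printf("device id:  %lu\n", (unsigned long)st.st_dev);
    printf("inode:      %lu\n", (unsigned long)st.st_ino);
    printf("owner:      uid %lu, gid %lu\n",
           (unsigned long)st.st_uid, (unsigned long)st.st_gid);
    printf("mode:       %04o\n", (unsigned)(st.st_mode & 07777));
    printf("atime/mtime/ctime: %lld %lld %lld\n",
           (long long)st.st_atime, (long long)st.st_mtime,
           (long long)st.st_ctime);
    return 0;
}
```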
Permissions and Modes
• Three mode flags = {suid, sgid and sticky}
  suid
    - File: if set and executable, then the process runs with the file owner's id as its effective user id
    - Directory: not used
  sgid
    - File: if set and executable, then set the effective group id. If sgid is set but the file is not executable, then mandatory file/record locking
    - Directory: if set, then new files inherit the group of the directory, otherwise the group of the creator
  sticky
    - File: if set and the file is executable, then keep a copy of the program in the swap area
    - Directory: if set and the directory is writable, then remove/rename is allowed only if EUID = owner of file/directory or if the process has write permission for the file. Otherwise any process with write permission to the directory may remove or rename.
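To make the three mode flags concrete, here is a sketch that renders a mode word in the familiar ls -l style, where suid, sgid and sticky replace the corresponding execute position. The function name mode_string is an illustrative choice; the S_ISUID, S_ISGID and S_ISVTX constants are the standard ones from <sys/stat.h>.

```c
#define _XOPEN_SOURCE 700
#include <sys/stat.h>
#include <sys/types.h>

/* Write a 9-character permission string (plus terminating NUL) for
 * `mode` into `out`. A lowercase s/t means the flag is set together
 * with execute permission; an uppercase S/T means it is set without. */
void mode_string(mode_t mode, char out[10])
{
    const char rwx[] = "rwxrwxrwx";

    /* Bit 0400 is owner-read; shifting right walks through the nine
     * {read, write, execute} x {owner, group, other} bits. */
    for (int i = 0; i < 9; i++)
        out[i] = (mode & (0400 >> i)) ? rwx[i] : '-';

    if (mode & S_ISUID) out[2] = (mode & S_IXUSR) ? 's' : 'S';
    if (mode & S_ISGID) out[5] = (mode & S_IXGRP) ? 's' : 'S';
    if (mode & S_ISVTX) out[8] = (mode & S_IXOTH) ? 't' : 'T';
    out[9] = '\0';
}
```

For example, mode 04755 (a typical suid program) renders as rwsr-xr-x, and 01777 (a typical shared temporary directory) as rwxrwxrwt. The uppercase S for sgid without group execute corresponds to the mandatory locking case on the slide.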
User View of Files
• File Descriptors (open, dup, dup2, fork)
  - All I/O is through file descriptors
  - references the open file object
  - per process object
  - file descriptors may be dup'ed {dup(), dup2()}, copied on fork {fork()} or passed to an unrelated process {see ioctl() or sendmsg(), recvmsg()}, permitting multiple descriptors to reference one object.
• File Object - holds context
  - created by an open() system call
  - stores file offset
  - reference to vnode
• vnode - abstract representation of a file
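The sharing described above can be observed from user space: two descriptors created with dup() reference one open file object, so they share a single offset. The sketch below uses a temporary file; the helper name shared_offset_after_read and the /tmp location are assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Read `nbytes` (at most 10) through one descriptor, then report the
 * offset seen through a dup'ed descriptor. Returns -1 on error. */
long shared_offset_after_read(size_t nbytes)
{
    char path[] = "/tmp/fdobjXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* file lives until the last close */

    const char data[] = "0123456789";
    if (write(fd, data, sizeof data - 1) != (ssize_t)(sizeof data - 1) ||
        lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }

    int fd2 = dup(fd);            /* second descriptor, same file object */
    if (fd2 < 0) {
        close(fd);
        return -1;
    }

    char buf[16];
    if (nbytes > sizeof data - 1)
        nbytes = sizeof data - 1;
    ssize_t got = read(fd, buf, nbytes);      /* advances shared offset */
    off_t pos = lseek(fd2, 0, SEEK_CUR);      /* query it through fd2   */

    close(fd);
    close(fd2);
    return got < 0 ? -1 : (long)pos;
}
```

Had the file been opened a second time with open() instead of dup(), a separate file object with its own offset would have been created, even though both would reference the same vnode.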
How it works
[Figure: the per-process file descriptor table (entries 0 to 5, each a uf_ofile pointer) references open file objects {*f_vnode, f_offset, f_count, ...}; each open file object references a vnode/vfs in-memory representation of a file.]
fd = open(path, oflag, mode); lseek(), read(), write() affect offset
File Systems
• File hierarchy composed of one or more File Systems
• One File System is designated the Root File System
• Attached to mount points
• File can not span multiple File Systems
• Resides on one logical disk
Logical Disks
• Viewed as a linear sequence of fixed-size, randomly accessible blocks.
  - device driver maps FS blocks to the underlying storage device
  - created using the newfs or mkfs utilities
• A file system must reside in a logical disk; however, a logical disk need not contain a file system (for example the swap device).
• Typically a logical disk corresponds to a partition of a physical disk. However, a logical disk may:
  - map to multiple physical disks
  - be mirrored on several physical disks
  - be striped across multiple disks or use other RAID techniques
File Abstraction
• Abstracts different types of I/O objects, for example directories, symbolic links, disks, terminals, printers, and pseudodevices (memory, pipes, sockets etc).
• Control interface includes fstat, ioctl, fcntl
• Symbolic links: file contains a pathname to the linked file/directory. {lstat(), symlink(), readlink()}
• Pipe and FIFO files:
  - FIFO created using mknod(), lives in the file system name space
  - Pipe created using pipe(), persists as long as opened for reading or writing.
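As a small illustration of the file abstraction covering pipes, the sketch below creates a pipe, confirms through fstat() that the descriptor is reported as a FIFO-type file, and moves a short message through it with ordinary read() and write(). The helper name pipe_roundtrip is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if a fresh pipe is reported as a FIFO by fstat() and a
 * message written to one end is read back intact from the other,
 * 0 if either check fails, -1 if the pipe cannot be created. */
int pipe_roundtrip(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    struct stat st;
    int is_fifo = fstat(fds[0], &st) == 0 && S_ISFIFO(st.st_mode);

    const char msg[] = "hello";
    char buf[sizeof msg] = {0};
    int ok = write(fds[1], msg, sizeof msg) == (ssize_t)sizeof msg
          && read(fds[0], buf, sizeof buf) == (ssize_t)sizeof buf
          && strcmp(buf, msg) == 0;

    /* The pipe has no name; it disappears once both ends are closed. */
    close(fds[0]);
    close(fds[1]);
    return is_fifo && ok;
}
```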
OO Style Interfaces
Abstract base class
  Struct interface_t {
    // Common functions: open(), close()
    // Common data: type, count
    // Pure virtual functions: *ops (null pointer)
    // Private data: *data (null pointer)
  }
Instance of derived class
  Struct interface_t { open(), close(), type, count, *ops, *data }
    *ops  -> { my_read(), my_write(), my_init(), my_open(), … }
    *data -> { device_no, free_list, lock, … }
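The base class and derived instance above can be approximated in C with a table of function pointers. The following is a sketch under assumptions: the names struct interface, interface_ops, my_private, iface_open and iface_read are invented for illustration and are not taken from any kernel source.

```c
#include <stddef.h>

struct interface;

/* "Pure virtual" functions: every derived implementation supplies these. */
struct interface_ops {
    int (*open)(struct interface *self);
    int (*read)(struct interface *self, char *buf, size_t len);
};

/* Abstract base: common data plus two pointers the derived class fills. */
struct interface {
    int type;
    int count;                         /* reference count */
    const struct interface_ops *ops;   /* implementation-specific code */
    void *data;                        /* implementation-specific data */
};

/* Hypothetical derived class: a device that yields a constant byte. */
struct my_private {
    int  device_no;
    char fill;
};

static int my_open(struct interface *self)
{
    self->count++;
    return 0;
}

static int my_read(struct interface *self, char *buf, size_t len)
{
    const struct my_private *p = self->data;
    for (size_t i = 0; i < len; i++)
        buf[i] = p->fill;
    return (int)len;
}

static const struct interface_ops my_ops = { my_open, my_read };

/* Common entry points: callers never see which implementation runs. */
int iface_open(struct interface *i) { return i->ops->open(i); }
int iface_read(struct interface *i, char *buf, size_t len)
{
    return i->ops->read(i, buf, len);
}
```

The caller only ever holds a struct interface pointer, which is the property the vnode interface relies on.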
Overview
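[Figure, example from Solaris: system calls enter the vnode interface, which dispatches to the file system implementations /proc, PCFS, HSFS, tmpfs, swapfs, UFS, RFS and NFS. Beneath them are the backing objects: anonymous memory, process address space, disk, cdrom and diskette.]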
Vfs/Vnode Framework
• Concurrently support multiple file system types
• transparent interoperation of different file systems within one file hierarchy
• enable file sharing over network
• abstract interface allowing easy integration of new file systems by vendors
Objectives
• Operation performed on behalf of current process
• Support serialized access, i.e. locking
• must be stateless
• must be reentrant
• encourage use of global resources (cache, buffer)
• support client server architectures
• use dynamic storage allocation
Vnode/vfs interface
• Define abstract interfaces
• vfs: Fundamental abstraction representing a file system to the kernel
  - Contains pointers to file system (vfs) dependent operations such as mount, unmount.
• vnode: Fundamental abstraction representing a file in the kernel
  - defines interface to the file, pointer to file system specific routines
  - Reference counted
  - accessed in two ways: 1) I/O related system calls 2) pathname traversal
vfs Overview
[Figure: rootvfs heads a linked list of Struct vfs { *vfs_next, *vfs_vnodecovered, *vfs_ops, *vfs_data, … }, one for the root file system and one for the file system mounted on /usr. Each vfs points to its Struct vfsops { *vfs_mount, *vfs_root, … } and to its private data. Three instances of Struct vnode { *v_vfsp, *v_vfsmountedhere, … } are shown: / (root), /usr, and / (mounted fs). The /usr vnode and the mounted vfs are linked through v_vfsmountedhere and vfs_vnodecovered.]
Mounting a FS
• mount(spec, dir, flags, type, dataptr, datalen);
• SVR5 uses a global virtual file system switch table (vfssw)
• allocate and initialize private data
• initialize vfs struct
• initialize root vnode in memory (VFS_ROOT)
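A toy model of the linkage that mount establishes is sketched below, reusing the field names vfs_next, vfs_vnodecovered, vfs_data, v_vfsp and v_vfsmountedhere. Everything else is an assumption made for illustration: the root field stands in for what VFS_ROOT would return, toy_mount and cross_mount are hypothetical names, and refusing to mount on a directory that is already covered is a simplification.

```c
#include <stddef.h>

struct vnode;

struct vfs {
    struct vfs   *vfs_next;          /* next mounted file system        */
    struct vnode *vfs_vnodecovered;  /* directory this fs is mounted on */
    void         *vfs_data;          /* file system private data        */
    struct vnode *root;              /* hypothetical: result of VFS_ROOT */
};

struct vnode {
    struct vfs *v_vfsp;              /* file system this vnode lives in */
    struct vfs *v_vfsmountedhere;    /* fs mounted on this dir, or NULL */
};

/* Attach `fs` to mount point `dir` and append it to the list headed by
 * `rootvfs`. Returns 0 on success, -1 if `dir` is already covered. */
int toy_mount(struct vfs *rootvfs, struct vfs *fs, struct vnode *dir)
{
    if (dir->v_vfsmountedhere != NULL)
        return -1;

    fs->vfs_vnodecovered = dir;
    dir->v_vfsmountedhere = fs;

    fs->vfs_next = NULL;
    struct vfs *tail = rootvfs;
    while (tail->vfs_next != NULL)
        tail = tail->vfs_next;
    tail->vfs_next = fs;
    return 0;
}

/* When traversal reaches a covered directory, continue at the root
 * vnode of the file system mounted there. */
struct vnode *cross_mount(struct vnode *vp)
{
    while (vp->v_vfsmountedhere != NULL)
        vp = vp->v_vfsmountedhere->root;
    return vp;
}
```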
Pathname traversal
• Verify vnode is dir or stop
• invoke VOP_LOOKUP (ufs_lookup())
• if found, return pointer to vnode (locked)
• else not found and last component, return success and vnode of parent directory (locked)
• not found, release directory, repeat loop
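The component-by-component loop can be sketched over a toy in-memory tree. This is a simplified illustration, not the kernel algorithm: struct tnode, toy_lookup and toy_namei are hypothetical names, toy_lookup merely stands in for VOP_LOOKUP, and locking, mount points, symbolic links and the "return parent directory for a missing last component" case are all left out.

```c
#include <stddef.h>
#include <string.h>

enum vtype { VREG, VDIR };

struct tnode {                  /* toy stand-in for a vnode */
    enum vtype type;
    const char *name;
    struct tnode *child;        /* first entry, if this is a directory */
    struct tnode *sibling;      /* next entry in the same directory    */
};

/* Stand-in for VOP_LOOKUP: find the entry `name[0..len)` in `dir`. */
static struct tnode *toy_lookup(struct tnode *dir, const char *name,
                                size_t len)
{
    for (struct tnode *n = dir->child; n != NULL; n = n->sibling)
        if (strlen(n->name) == len && strncmp(n->name, name, len) == 0)
            return n;
    return NULL;
}

/* Resolve `path` one component at a time, starting at `start`.
 * Returns the node found, or NULL if any component is missing or a
 * non-directory appears before the last component. */
struct tnode *toy_namei(struct tnode *start, const char *path)
{
    struct tnode *vp = start;

    while (*path != '\0') {
        while (*path == '/')            /* skip separators */
            path++;
        if (*path == '\0')
            break;
        if (vp->type != VDIR)           /* verify vnode is a dir or stop */
            return NULL;

        size_t len = strcspn(path, "/");
        vp = toy_lookup(vp, path, len); /* per-directory lookup */
        if (vp == NULL)
            return NULL;
        path += len;                    /* repeat loop for next component */
    }
    return vp;
}
```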
Local File Systems
• S5fs - System V file system. Based on the original implementation.
• FFS/UFS - BSD developed filesystem with optimized disk usage algorithms
S5fs - Disk layout
• Viewed as a linear array of blocks
• Typical disk block size 512, 1024, 2048 bytes
• Physical block number is the block’s index
• disk uses cylinder, track and sector
• first few blocks are the boot area, which is followed by the inode list (fixed size)
Disk Layout
[Figure: disk geometry showing track, cylinder, sector, heads and platters, annotated with rotational speed and disk seek time.]
S5fs disk layout
[ boot area | superblock | inode list | data ]
Boot area - code to initialize and bootstrap the system
Superblock - metadata for the file system: size of FS, size of inode list, number of free blocks/inodes, free block/inode list
inode list - linear array of 64-byte inode structs
s5fs - some details
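directory
[Figure: a directory is an array of entries, each a 2-byte inode number followed by a 14-byte name; example name shown: myfile123.]
On-disk inode (field sizes in bytes)
  di_mode (2)
  di_nlinks (2)
  di_uid (2)
  di_gid (2)
  di_size (4)
  di_addr (39)
  di_gen (1)
  di_atime (4)
  di_mtime (4)
  di_ctime (4)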
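The field sizes above add up to the 64-byte inode, and a directory entry to 16 bytes. The sketch below expresses both as C structs and checks the sizes at compile time. The struct names and the helper s5_block_addr are illustrative; treating di_addr as 13 block addresses of 3 bytes each matches the 13 entries (0 to 12) used for locating data blocks, but the byte order used when decoding an address is purely an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* On-disk inode, using the field names and sizes from the slide. */
struct s5_dinode {
    uint16_t di_mode;
    uint16_t di_nlinks;
    uint16_t di_uid;
    uint16_t di_gid;
    uint32_t di_size;
    uint8_t  di_addr[39];    /* 13 block addresses, 3 bytes each */
    uint8_t  di_gen;
    uint32_t di_atime;
    uint32_t di_mtime;
    uint32_t di_ctime;
};

/* Directory entry: 2-byte inode number plus 14-byte name. A name of
 * exactly 14 characters is not NUL-terminated. */
struct s5_dirent {
    uint16_t d_ino;
    char     d_name[14];
};

static_assert(sizeof(struct s5_dinode) == 64, "inode should be 64 bytes");
static_assert(sizeof(struct s5_dirent) == 16, "dir entry should be 16 bytes");

/* Decode the i-th (0..12) block address from di_addr.
 * ASSUMPTION: least significant byte first. */
uint32_t s5_block_addr(const struct s5_dinode *ip, int i)
{
    const uint8_t *p = &ip->di_addr[3 * i];
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 | (uint32_t)p[2] << 16;
}
```

The 14-byte name field is the source of the file name size limit listed among the problems with s5fs.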
Locating file data blocks
[Figure: the 13 di_addr entries, assuming 1024-byte blocks]
• 0-9 - direct blocks
• 10 - indirect: 256 blocks
• 11 - double indirect: 65,536 blocks
• 12 - triple indirect: 16,777,216 blocks
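The block counts above follow from 256 addresses per 1024-byte indirect block: 256, 256 × 256 = 65,536 and 256 × 256 × 256 = 16,777,216. A small sketch of the resulting mapping from a logical block number to the level of indirection needed is given below; the function and constant names are illustrative.

```c
#include <stdint.h>

enum {
    NDIRECT        = 10,   /* di_addr entries 0-9 */
    PTRS_PER_BLOCK = 256   /* addresses held by one 1024-byte block */
};

/* Level of indirection needed to reach logical block `lbn` of a file:
 * 0 = direct, 1 = indirect, 2 = double, 3 = triple, -1 = out of range. */
int s5_indirection_level(uint64_t lbn)
{
    const uint64_t n1 = PTRS_PER_BLOCK;
    const uint64_t n2 = n1 * PTRS_PER_BLOCK;
    const uint64_t n3 = n2 * PTRS_PER_BLOCK;

    if (lbn < NDIRECT) return 0;
    lbn -= NDIRECT;
    if (lbn < n1) return 1;
    lbn -= n1;
    if (lbn < n2) return 2;
    lbn -= n2;
    if (lbn < n3) return 3;
    return -1;
}

/* Largest number of data blocks addressable through one inode. */
uint64_t s5_max_blocks(void)
{
    const uint64_t n1 = PTRS_PER_BLOCK;
    return NDIRECT + n1 + n1 * n1 + n1 * n1 * n1;
}
```

Small files are reached with no extra disk reads, while a block deep in a very large file costs up to three additional reads for the indirect blocks.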
S5fs Kernel Implementation
• In-Core Inodes - also include vnode, device id, inode number, flags
• Inode lookup uses a hash queue based on inode number (may also use device number)
• kernel locks inode for reading/writing
• Read/Write use a buffer cache or VM
Problems with s5fs
• Superblock
• on-disk inodes
• Disk block allocation
• file name size
Fast File System - FFS
• Disk partition divided into cylinder groups
• superblocks restructured and replicated across partition
  - Constant information
  - cylinder group summary info such as free inodes and free blocks
• support block fragments
• Long file names
• new disk block allocation strategy
FFS Allocation strategy
• Goal: Collocate similar data/info.
• file inodes located in same cyl group as dir.
• new dirs created in different cyl groups.
• Place file data blocks/inode in same cyl group - for size < 48K
• allocate sequential blocks at a rotationally optimal position.
• Choose cyl group with “best” free count
Is FFS/UFS Better?
• Measurements have shown substantial performance benefits over s5fs
• FFS however, is sub-optimal when the disk is nearly full. Thus 10% is always kept free.
• Modern disks however, no longer match the underlying assumptions of FFS
Buffer Cache
[Figure: buffer cache with hash queues keyed by (device, inode) and a free list kept in LRU order.]
Other Limitations of s5fs and FFS
• Performance - hardware designs and modern architectures have redefined the computing environment
• Crash Recovery - do you like waiting for fsck?
• Security - do we need more than just 7 bits?
• File Size limitations
Performance Issues
• FFS has a target rotational delay
  - read/write entire track
  - many disks have built-in caches
• Due to FS Caching, most I/O operations are writes.
• Synchronous writes of metadata
• Disk head seeks are expensive
Sun-FFS (cluster)
• Sets rotational delay to 0
• read clustering
• write clustering
Log-Structured FS
• Entire disk dedicated to log
• writes to tail of log file
• garbage collection daemon
• Dir and Inode structures retained
• Issue is locating inodes
• writes a segment at a time
Log-structured FS
• Requires a large cache for read efficiency
• Write efficiency is obtained since the system is always writing to the end of the log file. Why does this help?
• How does performance compare to Sun-FFS?
• What about crash recovery?
4.4BSD Portal FS
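[Figure: a user process opens /p/<path>; the portal file system passes <path> over sockets to the portal daemon, and a file descriptor (fd) is passed back to the user process.]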
Review of vnode/vfs
• Provides a general purpose interface
• allows multiple file systems to be used simultaneously in a system
• OO Interface - although limited: no inheritance, fixed interfaces
  - How can we improve on this?
Stackable Filesystems
• For a given mount point, many file systems are now possible
[Figure: two applications; one accesses /local, served by UFS, and the other accesses /mylocal, served by MyFS stacked on UFS.]