Upload: dimitri-winwood

Post on 31-Mar-2015

Fred Kuhns ( ) CS523S: Operating Systems

File System Interface and Implementations

Fred Kuhns

CS523 – Operating Systems

FS Framework in UNIX

• Provides persistent storage

• Facilities for managing data

  - file: abstraction for a data container; supports sequential and random access
  - file system: permits organizing, manipulating and accessing files

• User interface specifies behavior and semantics of relevant system calls

  - Interface exports abstractions: files, directories, file descriptors and different file systems

Kernel, Files and Directories

• kernel provides control operations to name, organize and control access to files but it does not interpret contents

• Running programs have an associated current working directory. Permits use of relative pathnames. Otherwise complete pathnames are required.

• File viewed as a collection of bytes

  - Applications requiring more structure must define and implement it themselves

Kernel, Files and Directories

• Files and directories form a hierarchical, tree-structured name space

  - the tree forms a directed acyclic graph

• Directory entry for a file is known as a hard link

  - Files may also have symbolic links

• File may have one or more links

• POSIX defines library routines {opendir(), readdir(), rewinddir(), closedir()}

    struct dirent {
        ino_t d_ino;
        char  d_name[NAME_MAX + 1];
    };
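
The directory routines above can be exercised with a short loop. This is a minimal sketch, not part of the original slides: the function name `count_dir_entries` and its `verbose` flag are made up for illustration, and it assumes a POSIX system.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Count the entries in a directory, optionally printing inode and name.
 * Returns -1 if the directory cannot be opened. */
int count_dir_entries(const char *path, int verbose)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (verbose)
            printf("%lu\t%s\n", (unsigned long)entry->d_ino, entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling `count_dir_entries(".", 1)` lists the current working directory; each line printed is one hard link (name plus inode number).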

File and Directory Organization

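[Figure: directory tree rooted at /, with top-level entries bin, etc, dev, usr and vmunix; lower levels include etc, local, bin, sh and bash, giving the example pathname /usr/local/bin/bash. Edges are (hard) links.]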

File Attributes

• Type – directory, regular file, FIFO, symbolic link, special

• Reference count – number of hard links {link(), unlink()}

• Size in bytes

• Device id – device the file resides on

• Inode number – one inode per file; inodes are unique within a disk partition (device id)

• Ownership – user and group id {chown()}

• Access modes – permissions and modes {chmod()}

  - {read, write, execute} for {owner, group or other}

• Timestamps – three different timestamps: last access, last modify, last attribute change {utime()}
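
Most of these attributes can be read back with `lstat()`. The sketch below is illustrative only: `file_type_name` and `print_attributes` are hypothetical helper names, and the output format is arbitrary.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Map the type bits of st_mode to the categories listed on the slide. */
const char *file_type_name(mode_t mode)
{
    if (S_ISDIR(mode))  return "directory";
    if (S_ISREG(mode))  return "regular file";
    if (S_ISFIFO(mode)) return "FIFO";
    if (S_ISLNK(mode))  return "symbolic link";
    if (S_ISCHR(mode) || S_ISBLK(mode)) return "special";
    return "other";
}

/* Print the attributes of path without following a final symbolic link.
 * Returns 0 on success, -1 if the file cannot be examined. */
int print_attributes(const char *path)
{
    assert(path != NULL);

    struct stat st;
    if (lstat(path, &st) != 0)
        return -1;

    printf("type:   %s\n", file_type_name(st.st_mode));
    printf("links:  %lu\n", (unsigned long)st.st_nlink);
    printf("size:   %lld bytes\n", (long long)st.st_size);
    printf("device: %lu\n", (unsigned long)st.st_dev);
    printf("inode:  %lu\n", (unsigned long)st.st_ino);
    printf("owner:  uid %lu, gid %lu\n",
           (unsigned long)st.st_uid, (unsigned long)st.st_gid);
    printf("mode:   %04o\n", (unsigned)(st.st_mode & 07777));
    printf("times:  access %lld, modify %lld, change %lld\n",
           (long long)st.st_atime, (long long)st.st_mtime,
           (long long)st.st_ctime);
    return 0;
}
```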

Permissions and Modes

• Three mode flags = {suid, sgid and sticky}

suid

  - File: if set and executable, the effective user id is set to the file's owner when the file is executed
  - Directory: not used

sgid

  - File: if set and executable, the effective group id is set to the file's group. If sgid is set but the file is not executable, mandatory file/record locking is enabled
  - Directory: if set, new files inherit the group of the directory; otherwise the group of the creator

sticky

  - File: if set and executable, keep a copy of the program in the swap area
  - Directory: if set and the directory is writable, a process may remove/rename an entry only if its EUID = owner of the file/directory or it has write permission for the file. Otherwise any process with write permission to the directory may remove or rename.

User View of Files

• File descriptors (open, dup, dup2, fork)

  - All I/O is through file descriptors
  - A descriptor references the open file object
  - Per-process object
  - File descriptors may be dup'ed {dup(), dup2()}, copied on fork {fork()}, or passed to an unrelated process {see ioctl() or sendmsg(), recvmsg()}, permitting multiple descriptors to reference one object

• File object - holds context

  - created by an open() system call
  - stores file offset
  - reference to vnode

• vnode - abstract representation of a file
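
Because the file offset lives in the open file object rather than in the descriptor, two descriptors created by `dup()` see the same offset. The sketch below demonstrates this; the function name and the scratch-file template under `/tmp` are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Write len bytes through one descriptor, then report the offset seen
 * through a dup()'ed descriptor. Returns -1 on error. */
long offset_seen_by_dup(const char *data, size_t len)
{
    assert(data != NULL);

    char path[] = "/tmp/fd-demo-XXXXXX";   /* assumed writable scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* the file persists until its last descriptor is closed */

    int fd2 = dup(fd);   /* second descriptor, same open file object */
    long result = -1;
    if (fd2 >= 0 && write(fd, data, len) == (ssize_t)len)
        result = (long)lseek(fd2, 0, SEEK_CUR);   /* query offset via fd2 */

    if (fd2 >= 0)
        close(fd2);
    close(fd);
    return result;
}
```

After writing 5 bytes through the first descriptor, the second descriptor reports offset 5. Two independent `open()` calls on the same path would instead create two open file objects with separate offsets.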

How it works

[Figure: the per-process file descriptor table (entries 0 to 5, each a uf_ofile pointer) references open file objects {*f_vnode, f_offset, f_count, ...}; each open file object points to a vnode/vfs, the in-memory representation of the file.]

fd = open(path, oflag, mode); lseek(), read(), write() affect offset

File Systems

• File hierarchy composed of one or more File Systems

• One File System is designated the Root File System

• File systems are attached at mount points

• A file cannot span multiple file systems

• Each file system resides on one logical disk

Logical Disks

• Viewed as a linear sequence of fixed-size, randomly accessible blocks

  - device driver maps FS blocks to the underlying storage device
  - created using the newfs or mkfs utilities

• A file system must reside in a logical disk; however, a logical disk need not contain a file system (for example, the swap device).

• Typically a logical disk corresponds to a partition of a physical disk. However, a logical disk may:

  - map to multiple physical disks
  - be mirrored on several physical disks
  - be striped across multiple disks or use other RAID techniques

File Abstraction

• Abstracts different types of I/O objects

  - for example directories, symbolic links, disks, terminals, printers, and pseudodevices (memory, pipes, sockets, etc.)

• Control interface includes fstat, ioctl, fcntl

• Symbolic links: file contains a pathname to the linked file/directory. {lstat(), symlink(), readlink()}

• Pipe and FIFO files:

  - FIFO created using mknod(), lives in the file system name space
  - Pipe created using pipe(), persists as long as opened for reading or writing
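
An unnamed pipe is reached only through its two descriptors and goes through the same read()/write() interface as any other file. A minimal sketch follows; `pipe_roundtrip` is a hypothetical name, and the message is assumed to be small (below PIPE_BUF) so the write cannot block.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <unistd.h>

/* Send len bytes through an unnamed pipe and read them back into out.
 * Returns the number of bytes read, or -1 on error. */
long pipe_roundtrip(const char *msg, char *out, size_t len)
{
    assert(msg != NULL && out != NULL);

    int fds[2];   /* fds[0] is the read end, fds[1] the write end */
    if (pipe(fds) != 0)
        return -1;

    long n = -1;
    if (write(fds[1], msg, len) == (ssize_t)len)
        n = (long)read(fds[0], out, len);

    /* Once both ends are closed the pipe no longer exists. */
    close(fds[0]);
    close(fds[1]);
    return n;
}
```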

OO Style Interfaces

Abstract base class:

    struct interface_t {
        // Common functions: open(), close()
        // Common data: type, count
        // Pure virtual functions: *ops (null pointer)
        // Private data: *data (null pointer)
    }

Instance of derived class:

    struct interface_t { open(), close(); type, count; *ops; *data }
        *ops  -> {my_read(), my_write(), my_init(), my_open(), ...}
        *data -> {device_no, free_list, lock, ...}
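
In plain C this pattern is usually written as a struct holding a pointer to a table of function pointers plus an opaque private-data pointer. The sketch below is one possible rendering; the `cell_*` names and the two-operation table are invented for the example and are not taken from any kernel.

```c
#include <assert.h>
#include <stddef.h>

struct interface;

/* "Pure virtual" functions: each derived type supplies its own table. */
struct interface_ops {
    int (*read)(struct interface *self);
    int (*write)(struct interface *self, int value);
};

/* Abstract base: common data plus pointers filled in by the derived type. */
struct interface {
    int type;
    int count;
    const struct interface_ops *ops;   /* NULL until a derived type sets it */
    void *data;                        /* private data, NULL until set */
};

/* Hypothetical derived type whose private data is a single integer. */
struct cell_data {
    int value;
};

int cell_read(struct interface *self)
{
    return ((struct cell_data *)self->data)->value;
}

int cell_write(struct interface *self, int value)
{
    ((struct cell_data *)self->data)->value = value;
    return 0;
}

const struct interface_ops cell_ops = { cell_read, cell_write };

/* Bind the base object to the derived type's operations and data. */
void cell_init(struct interface *obj, struct cell_data *priv)
{
    assert(obj != NULL && priv != NULL);
    obj->type = 1;
    obj->count = 1;
    obj->ops = &cell_ops;
    obj->data = priv;
}
```

Callers only ever write `obj->ops->read(obj)`, so a different derived type can be substituted without changing them.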

Overview

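[Figure (example from Solaris): system calls sit above the vnode interface, below which are the file systems /proc, PCFS, HSFS, tmpfs, swapfs, UFS, RFS and NFS; underneath are anonymous memory, process address space, disk, cdrom and diskette.]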

Vfs/Vnode Framework

• Concurrently support multiple file system types

• Transparent interoperation of different file systems within one file hierarchy

  - enable file sharing over network
  - abstract interface allowing easy integration of new file systems by vendors

Objectives

• Operation performed on behalf of current process

• Support serialized access, i.e. locking

• must be stateless

• must be reentrant

• encourage use of global resources (cache, buffer)

• support client server architectures

• use dynamic storage allocation

Vnode/vfs interface

• Define abstract interfaces

• vfs: fundamental abstraction representing a file system to the kernel

  - contains pointers to file system (vfs) dependent operations such as mount, unmount

• vnode: fundamental abstraction representing a file in the kernel

  - defines interface to the file, pointer to file system specific routines
  - reference counted
  - accessed in two ways: 1) I/O related system calls 2) pathname traversal

vfs Overview

[Figure: rootvfs heads a linked list of struct vfs {*vfs_next, *vfs_vnodecovered, *vfs_ops, *vfs_data, ...}, one for / (root) and one for the file system mounted on /usr. Each struct vfs points to a struct vfsops {*vfs_mount, *vfs_root, ...} and to private data. Three struct vnode {*v_vfsp, *v_vfsmountedhere, ...} are shown: / (root), /usr, and / of the mounted fs.]

Mounting a FS

• mount(spec, dir, flags, type, dataptr, datalen);

• SVR5 uses a global virtual file system switch table (vfssw)

• allocate and initialize private data

• initialize vfs struct

• initialize root vnode in memory (VFS_ROOT)
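
The role of the switch table can be sketched in user space: look up the file system type by name, then let the type-specific routine set up private data and the root. Everything below (`toy_vfs`, `toy_vfssw`, `toy_mount`, the single "ufs" entry) is a simplified stand-in invented for illustration and does not reproduce the real kernel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct vfs; not the real kernel layout. */
struct toy_vfs {
    const char *fs_type;
    void *vfs_data;     /* private, per-file-system data */
    int root_ready;     /* set once the root vnode would be initialized */
};

/* One switch-table entry per file system type. */
struct toy_vfssw {
    const char *name;
    int (*mount)(struct toy_vfs *vfsp);
};

/* Type-specific mount routine for a pretend "ufs". */
int toy_ufs_mount(struct toy_vfs *vfsp)
{
    static int ufs_private;          /* stands in for allocated private data */
    vfsp->vfs_data = &ufs_private;
    vfsp->root_ready = 1;            /* where VFS_ROOT would set up the root */
    return 0;
}

const struct toy_vfssw vfssw[] = {
    { "ufs", toy_ufs_mount },
};

/* Find the type in the switch table and run its mount routine.
 * Returns -1 for an unknown type. */
int toy_mount(const char *type, struct toy_vfs *vfsp)
{
    assert(type != NULL && vfsp != NULL);

    for (size_t i = 0; i < sizeof vfssw / sizeof vfssw[0]; i++) {
        if (strcmp(vfssw[i].name, type) == 0) {
            vfsp->fs_type = vfssw[i].name;   /* initialize the vfs struct */
            vfsp->vfs_data = NULL;
            vfsp->root_ready = 0;
            return vfssw[i].mount(vfsp);
        }
    }
    return -1;
}
```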

Pathname traversal

• Verify vnode is a directory, or stop

• Invoke VOP_LOOKUP (ufs_lookup())

• If found, return pointer to vnode (locked)

• Else, if not found and last component, return success and vnode of parent directory (locked)

• not found, release directory, repeat loop
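
The component-by-component loop can be sketched over an in-memory tree. This toy version omits locking, mount points and the "return parent for a missing last component" case; `toy_vnode`, `toy_lookup` (standing in for VOP_LOOKUP) and `toy_namei` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 4

struct toy_vnode {
    const char *name;
    int is_dir;
    struct toy_vnode *children[MAX_CHILDREN];   /* NULL-terminated if short */
};

/* Stand-in for VOP_LOOKUP: search one directory for one component. */
struct toy_vnode *toy_lookup(struct toy_vnode *dir, const char *name, size_t len)
{
    for (int i = 0; i < MAX_CHILDREN && dir->children[i] != NULL; i++) {
        const char *child = dir->children[i]->name;
        if (strlen(child) == len && strncmp(child, name, len) == 0)
            return dir->children[i];
    }
    return NULL;
}

/* Resolve an absolute path one component at a time, starting at root.
 * Returns NULL if a component is missing or a non-directory is searched. */
struct toy_vnode *toy_namei(struct toy_vnode *root, const char *path)
{
    assert(root != NULL && path != NULL);

    struct toy_vnode *vp = root;
    while (*path != '\0') {
        while (*path == '/')                 /* skip separators */
            path++;
        if (*path == '\0')
            break;
        size_t len = strcspn(path, "/");     /* length of this component */
        if (!vp->is_dir)                     /* verify vnode is a directory */
            return NULL;
        vp = toy_lookup(vp, path, len);
        if (vp == NULL)
            return NULL;
        path += len;
    }
    return vp;
}
```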

Local File Systems

• S5fs - System V file system. Based on the original implementation.

• FFS/UFS - BSD developed filesystem with optimized disk usage algorithms

S5fs - Disk layout

• Viewed as a linear array of blocks

• Typical disk block size 512, 1024, 2048 bytes

• Physical block number is the block’s index

• disk uses cylinder, track and sector

• first few blocks are the boot area, which is followed by the inode list (fixed size)

Disk Layout

[Figure: disk geometry showing track, cylinder, sector, heads and platters; annotated with rotational speed and disk seek time.]

S5fs disk layout

[ boot area | superblock | inode list | data ]

Boot area - code to initialize and bootstrap the system

Superblock - metadata for the file system: size of FS, size of inode list, number of free blocks/inodes, free block/inode list

Inode list - linear array of 64-byte inode structs

s5fs - some details

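[Figure: a directory shown as a table of entries, each a 2-byte inode number followed by a 14-byte name; the example entries include ".", "..", an empty name and myfile.]

On-disk inode (field sizes in bytes):

  di_mode (2)
  di_nlinks (2)
  di_uid (2)
  di_gid (2)
  di_size (4)
  di_addr (39)
  di_gen (1)
  di_atime (4)
  di_mtime (4)
  di_ctime (4)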

Locating file data blocks

Inode block-address entries (assume 1024-byte blocks):

  0-9 - direct
  10 - indirect: 256 blocks
  11 - double indirect: 65,536 blocks
  12 - triple indirect: 16,777,216 blocks
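
The block counts follow from 256 addresses per 1024-byte indirect block (which implies 4-byte addresses inside indirect blocks): 256, 256² = 65,536 and 256³ = 16,777,216. The sketch below maps a logical block number to the level of indirection needed, taking entries 0 to 9 as direct blocks (an assumption read off the figure); the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define NDIRECT       10u    /* direct entries 0..9 */
#define ADDRS_PER_BLK 256u   /* addresses held by one 1024-byte indirect block */

/* Return 0 for a direct block, 1, 2 or 3 for single, double or triple
 * indirect, and -1 if the logical block number is beyond the maximum. */
int indirection_level(uint64_t lbn)
{
    if (lbn < NDIRECT)
        return 0;
    lbn -= NDIRECT;

    uint64_t span = ADDRS_PER_BLK;   /* blocks reachable at the current level */
    for (int level = 1; level <= 3; level++) {
        if (lbn < span)
            return level;
        lbn -= span;
        span *= ADDRS_PER_BLK;
    }
    return -1;
}

/* Total number of data blocks addressable from one inode. */
uint64_t max_file_blocks(void)
{
    return NDIRECT + 256ull + 65536ull + 16777216ull;
}
```

Under these assumptions a file can address 16,843,018 blocks, a little over 16 GiB of 1024-byte blocks, although the 4-byte di_size field limits the usable size well below that.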

S5fs Kernel Implementation

• In-Core Inodes - also include vnode, device id, inode number, flags

• Inode lookup uses a hash queue based on inode number (may also use device number)

• kernel locks inode for reading/writing

• Read/Write use a buffer cache or VM

Problems with s5fs

• Superblock

• on-disk inodes

• Disk block allocation

• file name size

Fast File System - FFS

• Disk partition divided into cylinder groups

• Superblocks restructured and replicated across partition

  - constant information
  - cylinder group summary info such as free inodes and free blocks

• Support block fragments

• Long file names

• New disk block allocation strategy

FFS Allocation strategy

• Goal: Collocate similar data/info.

• file inodes located in same cyl group as dir.

• new dirs created in different cyl groups.

• Place file data blocks/inode in same cyl group - for size < 48K

• allocate sequential blocks at a rotationally optimal position.

• Choose cyl group with “best” free count

Is FFS/UFS Better?

• Measurements have shown substantial performance benefits over s5fs

• FFS, however, is sub-optimal when the disk is nearly full. Thus 10% is always kept free.

• Modern disks, however, no longer match the underlying assumptions of FFS

Buffer Cache

[Figure: buffer cache with buffers reached through a hash on (device, inode) and linked on a free list kept in LRU order.]

Other Limitations of s5fs and FFS

• Performance - hardware designs and modern architectures have redefined the computing environment

• Crash Recovery - do you like waiting for fsck()?

• Security - do we need more than just 7 bits?

• File Size limitations

Performance Issues

• FFS has a target rotational delay

  - read/write entire track
  - many disks have built-in caches

• Due to FS Caching, most I/O operations are writes.

• Synchronous writes of metadata

• Disk head seeks are expensive

Sun-FFS (cluster)

• Sets rotational delay to 0

• read clustering

• write clustering

Log-Structured FS

• Entire disk dedicated to log

• writes to tail of log file

• garbage collection daemon

• Dir and Inode structures retained

• Issue is locating inodes

• writes a segment at a time

Log-structured FS

• Requires a large cache for read efficiency

• Write efficiency is obtained since the system is always writing to the end of the log file. Why does this help?

• How does performance compare to Sun-FFS?

• What about crash recovery?

4.4BSD Portal FS

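[Figure: a user process opens /p/<path>; the portal file system passes <path> over sockets to the portal daemon, and a file descriptor (fd) is passed back to the user process.]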

Review of vnode/vfs

• Provides a general purpose interface

• allows multiple file systems to be used simultaneously in a system

• OO interface - although limited: no inheritance, fixed interfaces

  - How can we improve on this?

Stackable Filesystems

• For a given mount point, there may now be many file systems

[Figure: applications accessing /local and /mylocal, with MyFS stacked on top of UFS.]