File Systems (cse.poly.edu/jsterling/cs3224/Slides/CS3224 - 4. Files.pdf)

Post on 09-Nov-2020

File Systems

Chapter 4

1

2

What do we need to know?

•  How are files viewed on different OS’s?
•  What is a file system from the programmer’s viewpoint?
   –  You mostly know this, but we’ll review the main points.
•  How are file systems put together?
   –  How is the disk laid out for directories? For files? What kind of memory structures are needed?
•  What do some real file systems look like?
   –  CP/M, MS-DOS (FAT-12/16/32), NTFS, NFS, ext2, …
•  What directions are file systems going?

3

Long-term Information Storage

1.  Must store large amounts of data

2.  Information stored must survive the termination of the process using it

3.  Multiple processes must be able to access the information concurrently

4

File Naming Issues

•  Character Set
•  Length
•  Extensions

5

File Naming

Typical file extensions.

6

File Structure

There are lots of file types. Here are three:
–  byte sequence, record sequence, tree

7

Sample Files

(a) An executable file (b) An archive

8

File Access

•  Sequential access
   –  read all bytes/records from the beginning
   –  cannot jump around, could rewind or back up
   –  convenient when medium was mag tape
•  Random access
   –  bytes/records read in any order
   –  essential for database systems
   –  read can be …
      •  move file marker (seek), then read or …
      •  read and then move file marker
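The difference between the two access styles can be seen with the Unix-style calls that Python exposes in its os module. This is only a sketch: the record size and the record contents are made up for illustration.

```python
import os
import tempfile

RECORD_SIZE = 4  # assumed fixed record size, for illustration only

# Scratch file holding ten 4-byte records: b"r000", b"r001", ...
fd, path = tempfile.mkstemp()
os.write(fd, b"".join(b"r%03d" % i for i in range(10)))

# Sequential access: rewind, then read records in order from the beginning.
os.lseek(fd, 0, os.SEEK_SET)
first = os.read(fd, RECORD_SIZE)
second = os.read(fd, RECORD_SIZE)

# Random access: move the file marker (seek), then read.
os.lseek(fd, 7 * RECORD_SIZE, os.SEEK_SET)
seventh = os.read(fd, RECORD_SIZE)

# The read itself advanced the marker past the record just read.
marker = os.lseek(fd, 0, os.SEEK_CUR)

os.close(fd)
os.remove(path)
```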

9

File Attributes

Possible file attributes

10

File Operations

1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
9. Get attributes
10. Set attributes
11. Rename

11

An Example Program Using Unix File System Calls (1/2)

12

An Example Program Using File System Calls (2/2)
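The program shown on these two slides is not part of this transcript. As a stand-in, here is a minimal sketch of a file copy built on the open / read / write / close calls listed earlier, using Python's os wrappers. The function name copy_file and the buffer size are assumptions, not taken from the slides.

```python
import os
import tempfile

BUF_SIZE = 4096  # assumed buffer size


def copy_file(src, dst):
    """Copy src to dst using open/read/write/close style calls."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(in_fd, BUF_SIZE)
                if not chunk:          # empty read means end of file
                    break
                view = memoryview(chunk)
                while view:            # write() may accept fewer bytes than offered
                    written = os.write(out_fd, view)
                    view = view[written:]
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)


with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "in.bin")
    dst = os.path.join(d, "out.bin")
    with open(src, "wb") as f:
        f.write(b"abc" * 5000)
    copy_file(src, dst)
    with open(dst, "rb") as f:
        copied = f.read()
```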

13

Memory-Mapped Files

(a) Segmented process before mapping files into its address space

(b) Process after mapping existing file abc into one segment and creating a new segment for xyz
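A small sketch of the idea, using Python's mmap module on a scratch file (the file contents are invented): once the file is mapped, it is read and written like ordinary memory, and the changes end up in the file.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"hello, file abc")

# Map the whole file (length 0 means "entire file") into the address space.
with mmap.mmap(fd, 0) as region:
    head = bytes(region[:5])      # read through memory, no read() call
    region[0:5] = b"HELLO"        # write through memory, no write() call
    region.flush()                # push the change back to the file

# Read the file the ordinary way to confirm the mapped write reached it.
os.lseek(fd, 0, os.SEEK_SET)
on_disk = os.read(fd, 15)
os.close(fd)
os.remove(path)
```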

14

Directories

Single-Level Directory Systems

•  A single level directory system
   –  contains 4 files
   –  owned by 3 different people, A, B, and C

15

Two-level Directory Systems

Letters indicate owners of the directories and files

16

Hierarchical Directory Systems

A hierarchical directory system

17

Directory Operations

1. Create
2. Delete
3. Opendir
4. Closedir
5. Readdir
6. Rename
7. Link
8. Unlink

18

File System Implementation

A possible file system layout

19

Implementing Files (1)

(a) Contiguous allocation of disk space for 7 files (b) State of the disk after files D and E have been removed

20

Implementing Files (2)

Storing a file as a linked list of disk blocks

21

Implementing Files (3)

File Allocation Table (FAT) uses a linked list in memory
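A sketch of how a file's blocks are found by following the chain in the table. The table contents, the starting blocks and the END_OF_CHAIN marker are hypothetical values chosen for illustration.

```python
END_OF_CHAIN = -1  # hypothetical end-of-file marker


def file_blocks(fat, first_block):
    """Follow the linked list in the table, starting at the file's first block."""
    blocks = []
    block = first_block
    while block != END_OF_CHAIN:
        blocks.append(block)
        block = fat[block]   # table entry gives the next block of the file
    return blocks


# Two files share one table: one starts at block 4, the other at block 6.
fat = {4: 7, 7: 2, 2: 10, 10: 12, 12: END_OF_CHAIN,
       6: 3, 3: 11, 11: 14, 14: END_OF_CHAIN}

file_a = file_blocks(fat, 4)
file_b = file_blocks(fat, 6)
```

The directory entry only needs to store the first block number; random access is cheap because the whole chain lives in memory.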

22

Implementing Files (4)

Combination of Direct and Indirect Block Pointers Note: This is a simplified version of Unix i-node
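To make the scheme concrete, the sketch below maps a logical block number to the pointer level that holds it. The parameters (10 direct pointers, 256 pointers per indirect block) are assumptions for illustration, not values from the slide.

```python
def locate_block(n, num_direct=10, ptrs_per_block=256):
    """Return (level, index) saying where logical block n of a file is recorded."""
    if n < num_direct:
        return ("direct", n)
    n -= num_direct
    if n < ptrs_per_block:
        return ("single indirect", n)
    n -= ptrs_per_block
    if n < ptrs_per_block ** 2:
        # (slot in the double indirect block, slot in the single indirect block)
        return ("double indirect", divmod(n, ptrs_per_block))
    n -= ptrs_per_block ** 2
    if n < ptrs_per_block ** 3:
        outer, rest = divmod(n, ptrs_per_block ** 2)
        middle, inner = divmod(rest, ptrs_per_block)
        return ("triple indirect", (outer, middle, inner))
    raise ValueError("block number beyond the maximum file size")


# Largest file, in blocks, that this assumed layout can describe.
max_blocks = 10 + 256 + 256 ** 2 + 256 ** 3
```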

23

The UNIX V7 File System

A UNIX i-node

24

Implementing Directories (1)

(a) A simple directory: fixed-size entries, with disk addresses and attributes in the directory entry

(b) Directory in which each entry just refers to an i-node

25

Implementing Directories (2)

•  Two ways of handling long file names in directory
   –  (a) In-line
   –  (b) In a heap

26

Linking (1)

File system containing a file that is “shared” between two directories

27

Links (2)

(a) Situation prior to linking (b) After the link is created (c) After the original owner removes the file

28

Disk Space Management (1)

•  Dark line (left hand scale) gives data rate of a disk
•  Dotted line (right hand scale) gives disk space efficiency
•  All files here are 2KB
•  Horizontal axis: block size
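The space-efficiency curve can be reproduced with a short calculation: a file occupies a whole number of blocks, so efficiency is the file size divided by the space actually allocated. The block sizes tried below are illustrative; the data-rate curve depends on disk parameters not given here, so it is left out.

```python
import math


def space_efficiency(file_size, block_size):
    """Fraction of the allocated space that holds file data."""
    blocks = math.ceil(file_size / block_size)
    return file_size / (blocks * block_size)


FILE_SIZE = 2 * 1024  # all files are 2KB, as on the slide

efficiency = {bs: space_efficiency(FILE_SIZE, bs)
              for bs in (512, 1024, 2048, 4096, 8192)}
```

Up to a 2KB block nothing is wasted; beyond that, each doubling of the block size halves the efficiency.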

29

Disk Space Management (2)

(a) Storing the free list on a linked list (b) A bit map

30

File System Checking

•  Possible results while running fsck
   (a) consistent
   (b) missing block
   (c) duplicate block in free list
   (d) duplicate data block
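A rough sketch of the block check behind these four cases: count how often each block appears in files and how often in the free list, then compare. The function name and data layout are hypothetical, and the final "both in use and free" case is an extra that the slide does not list.

```python
from collections import Counter


def check_blocks(num_blocks, files, free_list):
    """Report inconsistent blocks; an empty result means case (a), consistent."""
    in_use = Counter(b for blocks in files.values() for b in blocks)
    free = Counter(free_list)
    problems = {}
    for b in range(num_blocks):
        if in_use[b] == 0 and free[b] == 0:
            problems[b] = "missing block"
        elif free[b] > 1:
            problems[b] = "duplicate block in free list"
        elif in_use[b] > 1:
            problems[b] = "duplicate data block"
        elif in_use[b] and free[b]:
            problems[b] = "in use and free"  # not on the slide
    return problems


problems = check_blocks(
    num_blocks=6,
    files={"a": [0, 1], "b": [1, 2]},   # block 1 is claimed by two files
    free_list=[3, 3, 5],                # block 3 is listed twice, block 4 nowhere
)
```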

31

File System Performance (1)

The block cache data structures

32

File System Writes

•  Unix
   –  “Critical Blocks” are written immediately
   –  Data blocks are written periodically or when the block is removed from the block cache
•  MSDOS
   –  Uses “Write-through cache”. All writes are immediate.

33

Read Ahead

•  When block N is requested, the file system can issue a read for block N+1 also.
•  What if the file is not being read sequentially?
   –  Initially assume it is, but monitor disk access and set a flag to non-sequential if needed. This can be used to disable read ahead.
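The policy can be sketched as a small class that records which blocks it fetched. The class and attribute names are invented for illustration, and as a simplification the flag is never switched back to sequential.

```python
class ReadAheadFile:
    """Toy model: tracks disk reads issued for one open file."""

    def __init__(self):
        self.sequential = True    # initially assume sequential access
        self.last_block = None
        self.cache = set()
        self.disk_reads = []

    def _fetch(self, n):
        self.disk_reads.append(n)
        self.cache.add(n)

    def read_block(self, n):
        # Monitor the access pattern; a jump sets the non-sequential flag.
        if self.last_block is not None and n != self.last_block + 1:
            self.sequential = False
        self.last_block = n
        if n not in self.cache:
            self._fetch(n)
        # Read ahead only while the file still looks sequential.
        if self.sequential and n + 1 not in self.cache:
            self._fetch(n + 1)


f = ReadAheadFile()
for block in (0, 1, 2):
    f.read_block(block)
reads_while_sequential = list(f.disk_reads)
f.read_block(10)   # a jump: read ahead is disabled from here on
```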

34

File System Performance (2)

•  I-nodes placed at the start of the disk
•  Disk divided into cylinder groups
   –  each with its own blocks and i-nodes

35

Log-Structured File Systems

•  With CPUs faster, memory larger
   –  disk caches can also be larger
   –  increasing number of read requests can come from cache
   –  thus, most disk accesses will be writes
•  LFS strategy structures entire disk as a log
   –  have all writes initially buffered in memory
   –  periodically write these to the end of the disk log
   –  when file opened, locate i-node, then find blocks

36

Journaling

•  What happens when you remove a file?
   –  Remove the directory entry
   –  Release the i-node
   –  Free the disk blocks
•  What happens if there is a crash after the first or second steps?
•  How can you minimize the damage?
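One way to picture the answer: write a journal entry describing all three steps before doing any of them, and make each step safe to repeat, so the entry can simply be replayed after a crash. The sketch below models this in memory; the data layout and function names are hypothetical and no real file system's on-disk format is implied.

```python
def log_remove(fs, journal, name):
    """Record everything needed to finish the removal before starting it."""
    inode = fs["directory"][name]
    entry = (name, inode, tuple(fs["inodes"][inode]))
    journal.append(entry)
    return entry


def apply_remove(fs, entry, crash_after=None):
    """Carry out the three steps; each one is harmless if repeated."""
    name, inode, blocks = entry
    fs["directory"].pop(name, None)      # 1. remove the directory entry
    if crash_after == 1:
        return
    fs["inodes"].pop(inode, None)        # 2. release the i-node
    if crash_after == 2:
        return
    fs["free_blocks"].update(blocks)     # 3. free the disk blocks


def recover(fs, journal):
    """After a reboot, replay every entry still in the journal."""
    while journal:
        apply_remove(fs, journal[0])
        journal.pop(0)


fs = {"directory": {"mbox": 7}, "inodes": {7: [20, 21]}, "free_blocks": set()}
journal = []
entry = log_remove(fs, journal, "mbox")
apply_remove(fs, entry, crash_after=1)   # simulated crash after the first step
state_after_crash = (dict(fs["directory"]), dict(fs["inodes"]), set(fs["free_blocks"]))
recover(fs, journal)
```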

37

The CP/M File System (1)

Memory layout of CP/M

38

The CP/M File System (2)

The CP/M directory entry format

39

File Allocation Table (FAT) Partition Layout

•  Partition layout:
   –  Boot block
   –  FAT
   –  FAT copy
   –  Root directory
      •  In FAT-12 and FAT-16, preassigned enough space for 256 directory entries
   –  Other directories and files

40

FAT Table Sizes

•  FAT-12
   –  2^12 clusters
   –  Cluster size: 512 bytes to 8KB
   –  Partition size up to 32MB (4K clusters * 8KB / cluster)
   –  Windows default for volumes < 16MB, such as floppies
•  FAT-16
   –  2^16 clusters
   –  Cluster size: 512 bytes to 64KB
   –  Partition size up to 4GB (64K clusters * 64KB / cluster)
•  FAT-32
   –  2^28 clusters
   –  Cluster size: 512 bytes to 32KB
   –  Partition size in principle up to 8TB, but Windows will only create FAT-32 partitions up to 32GB.
•  Note that all FAT systems reserve the first two and last sixteen clusters in a partition, so actual partition sizes are slightly smaller than listed above.
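The partition limits follow directly from the number of clusters times the largest cluster size, as this short check shows. Reserved clusters are ignored here, so the results are the nominal upper bounds.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB


def max_partition_bytes(index_bits, max_cluster_bytes):
    """Nominal limit: 2^index_bits clusters, each of the largest cluster size."""
    return (2 ** index_bits) * max_cluster_bytes


fat12 = max_partition_bytes(12, 8 * KB)
fat16 = max_partition_bytes(16, 64 * KB)
fat32 = max_partition_bytes(28, 32 * KB)
```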

41

Directory Entries

Directory entry used in Windows:

Original MS-DOS directory entry:

42

The Windows 98 File System

An example of how a long name is stored in Windows 98

43

UNIX File System

Disk layout in classical UNIX systems

44

The UNIX File System

A UNIX V7 directory entry (old)

45

The UNIX V7 File System

A UNIX i-node

46

UNIX File System

Directory entry fields.

Attributes in the i-node

47

The UNIX File System

Path Names

48

The UNIX File System

Some important directories found in most UNIX systems

49

Pathnames

•  Absolute Pathname
   –  Begins at the root directory: /
   –  Contains all sub-directory names separated by slashes
   –  Filename
   –  ~jsterling
      •  ~ is recognized as a short-cut for the absolute path to the home directory.
•  Relative Pathname
   –  Does not begin with the root directory.
   –  Starts in the current working directory. Sometimes the current working directory is shown explicitly using: .
   –  May move up the file tree using .. to refer to a parent directory.

50

The UNIX File System

The steps in looking up /usr/ast/mbox
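The lookup can be sketched as a walk over directories, where each directory maps a name to an i-node number. The i-node numbers and the ROOT_INODE constant below are made up for illustration, and reading the i-node to find the directory's data blocks is collapsed into a single dictionary lookup.

```python
ROOT_INODE = 1  # hypothetical i-node number of the root directory


def lookup(path, directories, root=ROOT_INODE):
    """Resolve an absolute path one component at a time, starting at the root."""
    inode = root
    for component in filter(None, path.split("/")):
        # Search the current directory for the component to get the next i-node.
        inode = directories[inode][component]
    return inode


# directory i-node -> {name: i-node}; all numbers are illustrative
directories = {
    1: {"usr": 6, "bin": 4},
    6: {"ast": 26, "dick": 30},
    26: {"mbox": 60},
}

mbox_inode = lookup("/usr/ast/mbox", directories)
```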

51

The UNIX File System

(a) Before linking. (b) After linking

Note that hard links
•  must refer to other files in the same file system.
•  are not permitted to refer to directories.
•  are indistinguishable from the original file name.

Note that soft links (aka symbolic links)
•  contain only a pathname.
•  result in a “dangling pointer” if the original filename is deleted.
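These properties can be observed on a Unix system with a few calls from Python's os module. The file names are arbitrary and everything happens inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "original.txt")
    hard = os.path.join(d, "hard.txt")
    soft = os.path.join(d, "soft.txt")

    with open(original, "w") as f:
        f.write("data")

    os.link(original, hard)      # hard link: a second name for the same i-node
    os.symlink(original, soft)   # soft link: a small file holding a pathname

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink

    os.remove(original)          # delete the original file name

    with open(hard) as f:        # the data is still reachable via the hard link
        hard_contents = f.read()
    # The soft link still exists, but the pathname it holds no longer resolves.
    soft_dangling = os.path.islink(soft) and not os.path.exists(soft)
```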

52

The UNIX File System

(a) Separate file systems, before mounting. (b) After mounting

53

UNIX File System (3)

The relation between the file descriptor table, the open file description and the i-node

54

The Linux File System

•  Super block:
   –  # of i-nodes, # of blocks, etc.
•  Group Descriptor:
   –  # of free i-nodes, # of free blocks, # of directories
•  Bitmaps:
   –  Each is one block long

55

Record Locking in Unix

•  Can lock any range of bytes in a file
•  Can be multiple locks overlapping on a file
•  Locks can be
   –  Exclusive (write): No other process can have a lock on the range.
   –  Shared (read): Other shared locks may exist on the same bytes.
•  A failed lock can be made to block or not (choice of system call).
•  A process’s locks are released when
   a)  The process terminates
   b)  The process closes the file. Even if the process had it open more than once simultaneously!
•  Locks are not inherited across fork.
•  Locks can carry across exec.
•  Deadlock possibility:
   –  Competing locks could in principle result in deadlock,
   –  but this is prevented in Unix.
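A sketch of byte-range locking using Python's fcntl.lockf on a Unix system. The parent locks bytes 0 to 9 exclusively, and a forked child, which does not inherit the lock, tries non-blocking locks on the same and on a different range. The helper name child_can_lock is invented for this example.

```python
import fcntl
import os
import tempfile


def child_can_lock(fd, start, length):
    """Fork a child and report whether it could lock the given byte range."""
    pid = os.fork()
    if pid == 0:  # child process
        try:
            # LOCK_NB makes a failed lock return an error instead of blocking.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start)
            os._exit(0)   # lock acquired (released when the child exits)
        except OSError:
            os._exit(1)   # range is locked by another process
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


with tempfile.TemporaryFile() as f:
    f.write(b"x" * 100)
    f.flush()
    fd = f.fileno()

    fcntl.lockf(fd, fcntl.LOCK_EX, 10, 0)        # parent locks bytes 0..9
    locked_range_free = child_can_lock(fd, 0, 10)
    other_range_free = child_can_lock(fd, 50, 10)

    fcntl.lockf(fd, fcntl.LOCK_UN, 10, 0)        # parent releases its lock
    free_after_unlock = child_can_lock(fd, 0, 10)
```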

56

System Calls for File Management

•  s is an error code
•  fd is a file descriptor
•  position is a file offset

57

The stat System Call

•  Mode: includes type and protection
•  Inode #
•  Device
•  Link count
•  Owner’s ID
•  Group ID
•  File Size in bytes
•  Access Time
•  Modification Time
•  Status change Time
•  Blocksize
•  Block count (512 byte blocks)

Note that not all fields are stored in the i-nodes, themselves.

From Solaris Man Page
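Python's os.stat wraps this call on Unix systems, and the fields listed above appear as attributes of the result. A short sketch on a scratch file:

```python
import os
import stat
import tempfile

with tempfile.NamedTemporaryFile() as f:
    f.write(b"hello")
    f.flush()
    info = os.stat(f.name)

    fields = {
        "mode": stat.filemode(info.st_mode),   # type and protection
        "inode": info.st_ino,
        "device": info.st_dev,
        "link count": info.st_nlink,
        "owner": info.st_uid,
        "group": info.st_gid,
        "size": info.st_size,
        "access time": info.st_atime,
        "modification time": info.st_mtime,
        "status change time": info.st_ctime,
        "blocksize": info.st_blksize,
        "block count": info.st_blocks,
    }
    is_regular = stat.S_ISREG(info.st_mode)
```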

58

System Calls for Directory Management

•  s is an error code
•  dir identifies a directory stream
•  dirent is a directory entry

59

UNIX File System (4)

•  A BSD directory with three files
•  The same directory after the file voluminous has been removed

60

Unix File Protection

•  Divide the world into categories (owner / group / world) and specify read / write / execute access for each.
•  Add read/write permissions for the owner/user and the group with:
   –  chmod ug+rw
   –  u for user/owner; g for group; o for other/world.
   –  + means add and – means remove.
   –  r for read; w for write; x for execute.
•  Adding / removing files requires write permission on their directory!
•  Setuid
   –  Executable files may have their setuid bit set to indicate that they execute with the permission of their owner.
   –  An example is the program to change the password, which requires write access to the password file.
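The effect of chmod ug+rw can be reproduced from a program by OR-ing the corresponding permission bits into the file's current mode. This sketch uses Python's os.chmod and the constants from the stat module on a scratch file that starts out read-only for its owner.

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

os.chmod(path, 0o400)                       # start as r-------- (owner read only)
mode = stat.S_IMODE(os.stat(path).st_mode)  # current permission bits

# Equivalent of "chmod ug+rw": add read and write for user and group.
ug_rw = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IWGRP
os.chmod(path, mode | ug_rw)

new_mode = stat.S_IMODE(os.stat(path).st_mode)
as_text = stat.filemode(os.stat(path).st_mode)

os.remove(path)
```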

61

NTFS Goals

•  File / disk security
•  Disk quotas
•  File compression
•  Encryption

62

NTFS Features

•  64-bit cluster indices
•  Logging of metadata
•  Multiple data streams
•  Unicode-based names
•  Hard links
•  Sparse Files
•  Dynamic bad-cluster remapping

63

NTFS Master File Table (MFT)

•  Can be anywhere. Boot sector says where
•  Contains up to 2^48 records
•  First 16 records reserved for metadata, describing
   –  MFT itself
   –  copy of the MFT
   –  logging file
   –  Root directory
   –  Bitmap of free/used blocks
   –  Bootstrap code
   –  Bad blocks
   –  Quotas, etc.

64

NTFS MFT Record

•  Describes one file or directory
•  Has a length of 1KB
•  Header
   –  Magic number
   –  Sequence number
   –  Reference count
   –  Size
   –  Etc.
•  Attribute / value pairs
   –  E.g. name, additional MFT records, if directory then how the entries are stored.
•  May need more than one to describe a file.

65

NTFS MFT Data Attributes

•  Data Attribute keeps track of runs of consecutive blocks
•  There may be holes.
•  Each entry consists of a header and a sequence of runs.
•  If file is small, data may be stored in MFT Record
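A sketch of how a run list translates a block offset within the file into a disk block. Runs are modelled as (first disk block, length) pairs with None standing for a hole; this representation and the numbers are assumptions for illustration, not the NTFS on-disk encoding.

```python
def logical_to_disk(runs, logical_block):
    """Map a block offset within the file to a disk block, or None for a hole."""
    offset = logical_block
    for start, length in runs:
        if offset < length:
            return None if start is None else start + offset
        offset -= length          # skip past this run
    raise IndexError("block is past the end of the file")


# 4 blocks at disk block 20, a 2-block hole, then 3 blocks at disk block 64
runs = [(20, 4), (None, 2), (64, 3)]

mapping = [logical_to_disk(runs, n) for n in range(9)]
```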

66

NTFS Small Directory

•  A small directory is an MFT record containing directory entries.

•  The entries contain the file’s name and the MFT index of the file (plus some additional flags).

•  Note: Large directories are stored as B+ trees.

67

NTFS File Name Lookup

•  The file name above is C:\maria\web.htm
•  The filename lookup first prepends \?? to the name.
•  The name “\??\C:” is a symbolic link to an object that points to the root directory of the C: drive
•  From there the filename lookup is similar to the process in Unix, except that MFT index numbers replace the i-node numbers.

68

NTFS File Protection

•  Access Control Lists
   –  Discretionary (DACL) says who has what permissions
   –  System (SACL) says whose actions get audited.

69

Network File System (NFS)

•  Goal: to allow file systems across a network to appear as one logical whole
•  Client – Server Model
•  Server “exports” one or more directories
•  Client 1 is replacing its /bin directory
•  Client 1 is also mounting the projects directory as /usr/ast/work.
•  Client 2 is mounting the projects directory as /mnt.

70

NFS Protocols

•  Mounting
   –  Client sends pathname to server, requesting mount permission.
   –  If the directory is listed as exported then server returns a file handle.
   –  Remote directories might be mounted at boot time or automounted.
•  Directory and file access
   –  Most Unix system calls supported.
   –  Stateless connection.
      •  Server does not support open
      •  Instead Client sends a lookup which returns a handle.
      •  Each read has to say where to start reading from.
      •  File locks not supported.
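The stateless design can be modelled in a few lines: the server offers lookup and read but keeps no table of open files, so the client must remember the current offset and send it with every read. The class names are invented and this is a toy model, not the NFS wire protocol.

```python
class Server:
    """Keeps no per-client state: no open-file table, no file positions."""

    def __init__(self, exported):
        # Handles are derived from the exported names, not from any "open" call.
        self._names = {name: h for h, name in enumerate(sorted(exported))}
        self._data = {self._names[name]: data for name, data in exported.items()}

    def lookup(self, name):
        return self._names[name]

    def read(self, handle, offset, count):
        # Every request says where to start reading from.
        return self._data[handle][offset:offset + count]


class ClientFile:
    """The client, not the server, tracks the file position."""

    def __init__(self, server, name):
        self.server = server
        self.handle = server.lookup(name)
        self.offset = 0

    def read(self, count):
        data = self.server.read(self.handle, self.offset, count)
        self.offset += len(data)
        return data


server = Server({"mbox": b"0123456789"})
f = ClientFile(server, "mbox")
first = f.read(4)
second = f.read(4)
```

Because each request is self-contained, a server that crashes and restarts can answer the next read without any recovery of client sessions.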

71

NFS Implementation

•  System call layer
   –  Open, read, close, etc.
•  Virtual File System Layer
   –  Maintain table of open files, using v-nodes
   –  v-node may point to a standard i-node or to an NFS Client r-node.
•  NFS Client Code
   –  On mount, create r-node for NFS directory.
   –  On open, issue lookup and create r-node.
   –  Transfers in 8KB chunks.
   –  Client does read-ahead.
   –  Client caching used for efficiency
      •  Block discarded after 3 sec for data and 30 sec. for directories.
      •  Writes synced after 30 seconds.

72

Example File Systems

CD-ROM File Systems

•  ISO9660
   –  Designed for lowest common denominator in 1988 (i.e., MS-DOS).
   –  Header contains 16 blocks that are up for grabs (e.g., bootstrap block)
   –  Multi-byte numeric fields in directories are presented twice, once in big endian and once in little endian.
   –  File names were 8.3 + version
•  Rock Ridge extension allowed various Unix features, such as
   –  file protections,
   –  longer names,
   –  more time stamps,
   –  arbitrary depth hierarchies
•  Joliet extension (from Microsoft) added Unicode.
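The "presented twice" layout for numeric fields can be illustrated with Python's struct module. The sketch assumes a 32-bit field stored as the little-endian copy followed by the big-endian copy; the function names are invented for this example.

```python
import struct


def encode_both_endian_32(value):
    """Store a 32-bit number twice: little-endian copy, then big-endian copy."""
    return struct.pack("<I", value) + struct.pack(">I", value)


def decode_both_endian_32(field):
    """Read both copies and check that they agree."""
    (little,) = struct.unpack("<I", field[:4])
    (big,) = struct.unpack(">I", field[4:8])
    if little != big:
        raise ValueError("the two copies of the field disagree")
    return little


field = encode_both_endian_32(0x01020304)
value = decode_both_endian_32(field)
```

A machine of either byte order can read its native copy directly and ignore the other.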