Chapter 10: File-System Interface

Silberschatz, Galvin and Gagne ©2005, Operating System Concepts

Chapter 10.1 File Concept

Access Methods

Chapter 10.2 Directory Structure

File-System Mounting

File Sharing

Protection


Storage Management

New block – File Systems, a.k.a. “Storage Management”

An operating system is often described as a program that manages processes, processors, memory, and storage.

Listing these, operating systems control and manage:

Processes (both user and system)

Processors (the CPUs)

Memory (primary, cache, …)

Storage (data, programs, directories used for access, etc.)


Storage Management - more

Disk storage – the primary medium for online storage. Contains files – collections of related items defined by the file creator.

Files are normally grouped into directories for ease of use and reference, and directories are organized in a variety of structures.

Disk access – sometimes a character at a time; often a block at a time. Sometimes sequential; sometimes random.

Some file systems are dedicated; some are shared.

Some support asynchronous data transfer; others synchronous.

File systems differ greatly in speed, depending on the many parameters cited above.

This chapter: the File System Interface.


Objectives of this Chapter:

To explain the function of file systems

To describe the interfaces to file systems

To discuss file-system design tradeoffs, including access methods, file sharing, file locking, and directory structures

To explore file-system protection


File Concept

A file system consists of two parts:

Files – the actual storage of data on a medium. Stored on a sequential or some kind of direct-access storage device.

Directory structure – organizes the information needed for access: size, location, logical record length, block size, format, ownership, security, paths to files / directories, etc.

A file may be defined as a contiguous logical address space, which the operating system maps onto some kind of physical device. Note: ‘logical’ does not mean ‘physical.’ The blocks that hold the file need not be contiguous on the device.

Almost all secondary storage devices are non-volatile (data remains when power is removed):

Magnetic tapes

Magnetic disks

Optical disks (CDs / DVDs)

Jump drives

… and others


File Concept

To a user, a file is the smallest allocation of logical secondary storage. All data is written to a ‘file.’

Data may be numeric, alphabetic, alphanumeric, or binary.

Can be free form (text).

Can be rigidly formatted – records. Fixed-length records; variable-length records: Bright Lights application?

Generally, a file is a sequence of bits, bytes, lines, or records whose meaning is defined by the creator of the file and by how it is used. “One man’s program is another man’s data.”


File Concept (continued)

Data files – many forms and structures. Differentiate between a file’s organization and how it may be accessed: they are not the same.

Program files:

Source programs

Object files – may not be directly executable; may be understandable by a ‘linker.’

Executable files – may be ready for the loader to bring into memory.

Much of the distinction between program files and data files comes down simply to how they are used!


File Attributes

Name – typically the only information kept in human-readable form. Usually independent of the process and system that created it, save for possible extensions or types, such as .doc or .ppt. But names are often constrained by the operational environment. In a commercial (non-academic) environment, each position often means something very important. Example: NIHP00 – source program ‘N’; system code ‘IH’; subsystem ‘P’; programs within the subsystem: 00, 01, ….

Identifier – unique tag (number) that identifies the file within the file system.

Type – needed for systems that support different types: .c, .java, .cpp, .exe, .dll, .dat, .wpd, .doc, .xls, .css, …. The type also determines which ‘process’ is brought up to process the file.

Location – pointer to the file’s location on the device.

Size – current file size, generally in bytes or blocks, especially blocks.

Protection – controls who can read, write, and execute.

Time, date, and user identification – data for protection, security, and usage monitoring. Maybe date last accessed; OPR; security.


File Operations

A file is an “abstract data type.” This means it has:

Data, which will be unique to its implementation (realization – how it is organized; use – how it is accessed and processed), and

File operations that can be performed on the data, which depend on how the file is implemented: accessed sequentially, randomly, etc.

Let’s look at the six basic operations that can be performed on most files.


Typical File Operations

Create – Space must be allocated for the file, and an entry for it is added to the disk directory; data can then be loaded onto the storage device.

Write – A system call supplies the name of the file and the data to be written. A pointer must be available to “point” to the place where the next ‘item’ is to be written; the pointer is updated after each write.

Read – Another system call specifies the file name and the location in memory where the data read is to be placed and, using a pointer, locates the data to be read. The pointer is updated to point to the ‘next’ item to be read.

The pointer used for read and write is called the ‘current file-position pointer.’


Typical File Operations – more

Reposition within file – Moves the file pointer to a specific position / record in the file. Really, this is a file seek.

Delete – Using the directory, release the file’s space for reuse and clear the directory entry referring to this file.

Truncate – Often used when recreating a file. Deletes the contents of the file but keeps its attributes. The only attribute that changes is the file length: it is reset to zero and the file’s space is released.
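The six basic operations can be tried out from Java. The following is a minimal sketch, assuming `java.io.RandomAccessFile` as the file interface; the class name `BasicFileOps`, the helper `run()`, and the temporary file are illustrative choices, not part of the original notes.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BasicFileOps {

    /** Runs the basic operations on a temporary file and returns a short log. */
    static String run() {
        try {
            // Create: adds a directory entry for a new, empty file.
            Path path = Files.createTempFile("fileops", ".dat");
            StringBuilder log = new StringBuilder();

            try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) {
                // Write: data goes at the current file-position pointer, which then advances.
                raf.write("ABCDEFGH".getBytes(StandardCharsets.US_ASCII));
                log.append("pointer after write: ").append(raf.getFilePointer()).append('\n');

                // Reposition (seek): move the pointer without transferring data.
                raf.seek(2);

                // Read: data comes from the pointer position, which advances again.
                byte[] buf = new byte[3];
                raf.readFully(buf);
                log.append("read: ").append(new String(buf, StandardCharsets.US_ASCII)).append('\n');
                log.append("pointer after read: ").append(raf.getFilePointer()).append('\n');

                // Truncate: the length becomes zero, but the file itself remains.
                raf.setLength(0);
                log.append("length after truncate: ").append(raf.length()).append('\n');
            }

            // Delete: removes the directory entry.
            Files.delete(path);
            log.append("exists after delete: ").append(Files.exists(path));
            return log.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note how a single pointer serves both read and write here, which matches the ‘current file-position pointer’ described above.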


Typical File Operations

Other operations include:

Append data to the end of a file

Rename a file

Copy a file

Other file utilities: get the length of a file, get its attributes, etc.

Many OS utilities, such as printing files, allocating space, …

Some systems open a file implicitly at first reference; others require an explicit open() system call (or a library call such as fopen()).

Some systems automatically close files when the program terminates; others expect an explicit close(). My take: always close your files. Keep things clean.

open() usually validates the desired mode (read, write, append, …), permissions, and more.

Then, open() typically returns a pointer to the entry in the open-file table.


File Operations – The Process Itself

In a multiprogramming environment, there is usually a process-table entry (PCB) for each running process. Most of these entries contain a current file pointer for each opened file.

Interestingly, there is often a system-wide open-file table too, which contains a list of open files for all running processes.

Honeywell – UNISYS PAT Table overflow (peripheral allocation table)….


Open File Tables

So there’s an entry in a per-process table and an entry in a system-wide table.

The system-wide table contains additional information, including an ‘open count.’

When a file is opened for a process, an entry in the open-file table for that process points to the entry in the system-wide table.

The system-wide table also keeps track of how many processes have the same file open, should more than a single process be accessing the file. open() increases this count.

close() decreases this count. When the open count reaches zero, the file’s entry is removed from the system-wide table.
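The two-level bookkeeping can be modeled in a few lines. This is only a sketch of the idea, not how any particular operating system implements it; the class `OpenFileTables` and its nested `SystemEntry` and `ProcessEntry` types are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class OpenFileTables {

    /** System-wide entry: information shared by all openers, plus the open count. */
    static final class SystemEntry {
        final String name;
        int openCount;
        SystemEntry(String name) { this.name = name; }
    }

    /** Per-process entry: a private file pointer and a reference to the shared entry. */
    static final class ProcessEntry {
        final SystemEntry shared;
        long filePointer = 0;
        ProcessEntry(SystemEntry shared) { this.shared = shared; }
    }

    private final Map<String, SystemEntry> systemTable = new HashMap<>();

    /** open(): find or create the system-wide entry and increase its open count. */
    ProcessEntry open(String name) {
        SystemEntry entry = systemTable.computeIfAbsent(name, SystemEntry::new);
        entry.openCount++;
        return new ProcessEntry(entry);
    }

    /** close(): decrease the count; at zero, drop the system-wide entry. */
    void close(ProcessEntry p) {
        SystemEntry entry = p.shared;
        if (--entry.openCount == 0) {
            systemTable.remove(entry.name);
        }
    }

    int openCount(String name) {
        SystemEntry entry = systemTable.get(name);
        return entry == null ? 0 : entry.openCount;
    }

    public static void main(String[] args) {
        OpenFileTables tables = new OpenFileTables();
        ProcessEntry a = tables.open("payroll.dat"); // opened by "process" A
        ProcessEntry b = tables.open("payroll.dat"); // opened by "process" B
        System.out.println("open count: " + tables.openCount("payroll.dat"));
        tables.close(a);
        tables.close(b);
        System.out.println("open count after closes: " + tables.openCount("payroll.dat"));
    }
}
```

Each `ProcessEntry` carries its own file pointer, so two processes reading the same file do not disturb each other's position.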


Open File Basic Information

Data needed to manage open files:

File pointer – pointer to the last read/write location, kept per process that has the file open as a current file-position pointer. Note: this is needed on systems that do not include a file offset as part of the read() and write() operations.

File-open count – counter of the number of times a file is open, so that its entry can be removed from the open-file table when the last process closes it.

Disk location of the file – cache of data-access information.

Access rights – per-process access-mode information. Each process opens a file in some kind of access mode.


Open File Locking

Provided by some operating systems and file systems.

Particularly useful for files that can be accessed by multiple applications at the same time.

Mediates access to a file:

Shared locks – used for reading. Several processes can hold a shared lock concurrently.

Exclusive locks – needed for writing. Only one process at a time can get the exclusive lock.

Some OSs only provide for exclusive locking – which makes sense.


File Locking Example – Java API

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;

    public class LockingExample {
        // Values for the third argument of FileChannel.lock(position, size, shared)
        public static final boolean EXCLUSIVE = false;
        public static final boolean SHARED = true;

        public static void main(String[] args) throws IOException {
            FileLock sharedLock = null;
            FileLock exclusiveLock = null;
            RandomAccessFile raf = null;
            try {
                raf = new RandomAccessFile("file.txt", "rw");
                // get the channel for the file
                FileChannel ch = raf.getChannel();
                long half = raf.length() / 2;

                // this locks the first half of the file - exclusive
                exclusiveLock = ch.lock(0, half, EXCLUSIVE);
                /** Now modify the data . . . Needs exclusive access! */
                // release the lock
                exclusiveLock.release();

                // this locks the second half of the file - shared
                // (the second argument is the size of the region, not its end position)
                sharedLock = ch.lock(half, raf.length() - half, SHARED);
                /** Now read the data . . . */
                // release the lock
                sharedLock.release();
            } catch (IOException ioe) {
                System.err.println(ioe);
            } finally {
                // release() has no effect on a lock that was already released
                if (exclusiveLock != null)
                    exclusiveLock.release();
                if (sharedLock != null)
                    sharedLock.release();
                // closing the file also closes its channel
                if (raf != null)
                    raf.close();
            }
        }
    }


File Types – Name, Extension

We mentioned several file types earlier. Here are more samples.

A common approach to implementing file types is to include the file type as part of the name: name.extension.

The file type tells the operating system which operations can be performed on the file; e.g., .com, .exe, and .bat files can be executed.

.com and .exe files are binary executable files; a .bat file is ASCII text and consists of a series of commands to the operating system.

Certain applications expect files sent to them to be of a certain type, as in .c, .java, or .doc.


File Types – Name, Extension - more

We are very familiar with file types, as we use them all the time.

“These” notes have the extension .ppt, for PowerPoint.

When I open this file by double-clicking on an icon or hot link representing the file, the specific application (PowerPoint) is automatically invoked.

Windows has default associations of file types to applications.

Some OSs don’t require an extension and treat an extension only as a ‘hint.’


File Structure

File types indicate the internal structure of the file.

Files have the structures expected by the programs that process them.

Typically, there is information (often up front in the file) that the processing program needs in order to properly load, process, or display the file in question. It might include where the program is to be loaded, key words, the location of the first instruction, external symbols, and more.

For any file type the operating system supports, it needs some code to recognize and support that file type.

But new applications may require information structured in ways not supported by the operating system, and problems may occur (book).

This presents some interesting problems.


File Structure – not recognizable formats…

We may develop an application that creates a file type not compatible with the file types supported by the operating system.

So, what to do?

Some operating systems support a very limited set of structures and interpret files very simply as, say, a sequence of 8-bit bytes.

So, ‘something’ must interpret these. The OS allows these files but does not support them directly. Thus, each application must include code to interpret such an input file…

Can you think of any? They are all around us! If you are a Java person, look at all the various I/O options available! They are all a bit different.


Internal File Structure

Most systems have well-defined block sizes.

These usually depend on the organization of the disk: sector size or some derivative of track size.

We always read and write in blocks – physical records.

For a specific file, all blocks are usually the same size, and each block holds a whole number of ‘logical records.’ That number is called the ‘blocking factor’: BF = 100 means one hundred logical records per physical record (block).

Discuss.

Why do we read/write ‘blocks’ in lieu of logical records?

Discuss.


Internal File Structure

Some operating systems define files as simply streams of data bytes.

Here, each byte is individually addressable by its offset from the front (or end) of the file. Logical record size = 1 byte.

But the system packs and unpacks these bytes into physical disk blocks of, say, 512 bytes per block.

So three things determine the number of logical records in a physical block (record):

the length of a logical record (a read() operation),

the physical block size (determined by sector size or track length), and

the packing technique.
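The relationship between record length, block size, and records per block is simple integer arithmetic. The sketch below assumes the simplest packing technique, in which records never span a block boundary; the class `Blocking` and its method names are illustrative only.

```java
public class Blocking {

    /** Logical records that fit in one block when records do not span blocks. */
    static int blockingFactor(int blockSize, int recordLength) {
        if (recordLength <= 0 || recordLength > blockSize) {
            throw new IllegalArgumentException("record must fit in one block");
        }
        return blockSize / recordLength;
    }

    /** Bytes left unused at the end of every full block. */
    static int unusedBytesPerBlock(int blockSize, int recordLength) {
        return blockSize - blockingFactor(blockSize, recordLength) * recordLength;
    }

    /** Blocks needed to hold the given number of records (rounded up). */
    static long blocksNeeded(long records, int blockSize, int recordLength) {
        int bf = blockingFactor(blockSize, recordLength);
        return (records + bf - 1) / bf;
    }

    public static void main(String[] args) {
        int blockSize = 512;     // assumed block (sector) size in bytes
        int recordLength = 100;  // assumed logical record length in bytes
        System.out.println("blocking factor: " + blockingFactor(blockSize, recordLength));
        System.out.println("unused bytes per block: " + unusedBytesPerBlock(blockSize, recordLength));
        System.out.println("blocks for 1000 records: " + blocksNeeded(1000, blockSize, recordLength));
    }
}
```

With 100-byte records in 512-byte blocks, five records fit per block and 12 bytes per block go unused; a byte-stream file is the special case where the record length is 1.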


Internal File Structure

Files, nonetheless, are considered a series of blocks (whatever their size), and all physical I/O takes place in blocks, even when the program issues logical read() and write() calls.

The ‘first’ read() or write() does not transfer just one logical record. It typically reads from IRG to IRG (IBG to IBG), that is, from one inter-record (inter-block) gap to the next, or from sector boundary to sector boundary. Much more later. Example: a 512-byte sector contains five 100-character records.

Subsequent read() or write() operations typically just move a pointer to the next logical record in the block, which sits in a buffer in the process’s address space. Thus only the physical read of a block results in a physical disk access.

Naturally, there is likely to be some internal fragmentation in the last block allocated to a file.

Data in a file can be accessed in several, but restricted, ways, often dependent upon the file’s ‘organization.’
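The difference between logical reads and physical reads can be made visible by counting them. This is a toy model, assuming a byte array stands in for the disk and that records do not span blocks; `BlockedReader` and its fields are hypothetical names.

```java
import java.util.Arrays;

public class BlockedReader {
    private final byte[] disk;          // stands in for the storage device
    private final int blockSize;
    private final int recordLength;
    private final int recordsPerBlock;
    private final byte[] buffer;        // one block held in memory
    private int bufferedBlock = -1;     // which block the buffer holds, -1 = none
    private int nextRecord = 0;         // logical position in the file
    int physicalReads = 0;              // number of simulated disk accesses

    BlockedReader(byte[] disk, int blockSize, int recordLength) {
        this.disk = disk;
        this.blockSize = blockSize;
        this.recordLength = recordLength;
        this.recordsPerBlock = blockSize / recordLength;
        this.buffer = new byte[blockSize];
    }

    /** Logical read: touches the "disk" only when a new block is needed. */
    byte[] readNext() {
        int block = nextRecord / recordsPerBlock;
        int slot = nextRecord % recordsPerBlock;
        if (block != bufferedBlock) {
            if ((block + 1) * blockSize > disk.length) {
                throw new IllegalStateException("end of file");
            }
            System.arraycopy(disk, block * blockSize, buffer, 0, blockSize);
            bufferedBlock = block;
            physicalReads++;
        }
        nextRecord++;
        return Arrays.copyOfRange(buffer, slot * recordLength, (slot + 1) * recordLength);
    }

    public static void main(String[] args) {
        byte[] disk = new byte[2 * 512];             // two 512-byte blocks
        BlockedReader reader = new BlockedReader(disk, 512, 100);
        for (int i = 0; i < 7; i++) {
            reader.readNext();                       // seven logical reads
        }
        System.out.println("logical reads: 7, physical reads: " + reader.physicalReads);
    }
}
```

Seven logical reads here cost only two block transfers, because the first five records are all served from the first buffered block.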


Access Methods

Sequential access of a sequential file organization is the simplest form.

Information is processed in order – one logical record after another.

Operations are typically some form of read() or write():

read next – reads a record and advances the file pointer.

write next – appends to the end of the file and advances to the new end of file; the file pointer moves as ‘writes’ occur.

reset – some files can be reset to a certain position.

Others….


Access Methods

Direct Access – organization.

The file typically consists of fixed-length logical records.

The file is viewed as a numbered sequence of blocks (records).

Access may be random; sometimes sequential.

When a record must be retrieved, a ‘key’ of some sort is developed for the logical record; from this key a block address is computed, and the block (containing the logical record) is read.

Blocks are stored according to some kind of key (like SSAN, account number, and others), and the computation of the disk address can be done by a variety of algorithms.

Typically we can read a block randomly, given its disk address.


Access Methods

There are many ways that direct access can be effected.

Some direct-access approaches allow the programmer to compute a CCTTRR number.

Others require the application to compute a ‘relative record number,’ starting with record 0, the first record in the file.

IBM uses VSAM – Virtual Storage Access Method. Terms:

ESDS – Entry Sequenced Data Set – for sequential files.

KSDS – Key Sequenced Data Set – for indexed sequential files. KSDS uses a primary key such as SSAN or account number. These keys are then mapped into physical disk addresses.

RRDS – Relative Record Data Set. Here we compute algorithmically a relative record number – an integer.

More later.
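With fixed-length records, a relative record number translates directly into a byte offset: offset = record number × record length. The following is a sketch of that idea using `java.io.RandomAccessFile`; the class `RelativeRecordFile`, the 16-byte record length, and the sample data are assumptions made for illustration, and this is not VSAM.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class RelativeRecordFile {
    static final int RECORD_LENGTH = 16; // assumed fixed record length in bytes

    /** Byte offset of a record, counting from record 0. */
    static long offsetOf(long relativeRecordNumber) {
        return relativeRecordNumber * RECORD_LENGTH;
    }

    /** Writes one space-padded (or cut-off) record at its computed position. */
    static void writeRecord(RandomAccessFile f, long rrn, String text) throws IOException {
        byte[] record = new byte[RECORD_LENGTH];
        Arrays.fill(record, (byte) ' ');
        byte[] src = text.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(src, 0, record, 0, Math.min(src.length, RECORD_LENGTH));
        f.seek(offsetOf(rrn));
        f.write(record);
    }

    /** Reads one record directly, without reading the records before it. */
    static String readRecord(RandomAccessFile f, long rrn) throws IOException {
        byte[] record = new byte[RECORD_LENGTH];
        f.seek(offsetOf(rrn));
        f.readFully(record);
        return new String(record, StandardCharsets.US_ASCII).trim();
    }

    /** Writes three records, then fetches record 2 and record 0 in that order. */
    static String demo() {
        try {
            Path path = Files.createTempFile("relative", ".dat");
            try (RandomAccessFile f = new RandomAccessFile(path.toFile(), "rw")) {
                writeRecord(f, 0, "alpha");
                writeRecord(f, 1, "beta");
                writeRecord(f, 2, "gamma");
                return readRecord(f, 2) + "," + readRecord(f, 0);
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The records are fetched out of order with one seek each, which is what distinguishes direct access from reading through the file sequentially.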


Sequential-access File

[Figure: sequential-access file, with cp = current position]

Here’s a visual for, perhaps, a tape drive. For sequential files, access is always sequential, as shown in the figure.


Simulation of Sequential Access on a Direct-access File

Some, but not all, direct-access file types permit sequential processing.

File organizations that permit both sequential and random access support random queries for retrievals as well as sequential processing for other requirements, such as reports.

Indexed sequential files (ahead) support both random and sequential access.

Direct-access files normally only support random access. (more ahead)
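The simulation named in the slide title works by keeping a current position, cp, and expressing the sequential operations in terms of direct-access reads and writes of block cp. A minimal sketch follows, assuming an in-memory list stands in for the numbered blocks; `SequentialOverDirect` and its methods are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

public class SequentialOverDirect {
    private final List<String> blocks = new ArrayList<>(); // block n lives at index n
    private int cp = 0;                                     // current position

    // Direct-access primitives: the caller names the block number.

    String read(int n) {
        return blocks.get(n);
    }

    void write(int n, String data) {
        if (n == blocks.size()) {
            blocks.add(data);      // writing just past the end extends the file
        } else {
            blocks.set(n, data);   // otherwise overwrite (throws if n is out of range)
        }
    }

    // Sequential operations, simulated with cp.

    void reset() {
        cp = 0;
    }

    String readNext() {
        String data = read(cp);
        cp = cp + 1;
        return data;
    }

    void writeNext(String data) {
        write(cp, data);
        cp = cp + 1;
    }

    public static void main(String[] args) {
        SequentialOverDirect file = new SequentialOverDirect();
        file.writeNext("record A");
        file.writeNext("record B");
        file.reset();
        System.out.println(file.readNext());
        System.out.println(file.readNext());
        System.out.println(file.read(0)); // direct access still works
    }
}
```

The reverse simulation, direct access on top of a purely sequential device, is far more costly, since reaching block n means passing over everything before it.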


Example of Index Organization and Random Access

This organization requires an index, which contains pointers to the various blocks. Access requires a search of the index followed by the retrieval of a record from the file.

Logical records are contained within a block, and blocks are what is read and written. So, when a block is read from disk, this is followed by a sequential read of the logical records within the block to see whether the specific desired record lies within the block.

Typically the index (above) holds keys (primary keys) and block numbers. The key shown for each block is the highest key in that block. So we are not certain that the desired logical record is actually in the block until the block is retrieved and searched.


Example of Index Organization and Random Access

An indexed sequential file is sorted (ordered) on some index or primary key, like name (above) or account number (the key must be unique).

Then an index of primary keys and disk addresses (kept in memory when the file is active) is used to locate a logical record.

(Actually, disk addresses point to a block where the desired logical record ‘may’ be.)

Multiple keys may be used to search the file for a desired record. Example: the file must be ordered on a unique primary key, such as account number. But we may also retrieve on a unique or non-unique secondary key, such as name (non-unique) or phone number (unique).


Example of Index Organization and Random Access

For very large files, we may have levels of indexes (coarse index and fine index; or index sets, sequence sets, data sets (IBM)).

One or more of these indices may be kept in primary memory to reduce I/Os when attempting to access a record.

The indices are searched via a binary search; the retrieved block is searched sequentially for the desired logical record.
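The two-step lookup (binary search of the index, then a sequential scan of one block) can be sketched as follows. The sketch assumes a single-level index that stores the highest key of each block, with integer keys and in-memory arrays standing in for disk blocks; `IndexedLookup` and its sample data are made up for illustration.

```java
public class IndexedLookup {

    /**
     * Binary search of the index: returns the number of the first block whose
     * highest key is >= key, or -1 if the key is larger than every key on file.
     */
    static int findBlock(int[] highKeys, int key) {
        int lo = 0;
        int hi = highKeys.length - 1;
        int answer = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (highKeys[mid] >= key) {
                answer = mid;      // candidate block; look for an earlier one
                hi = mid - 1;
            } else {
                lo = mid + 1;
            }
        }
        return answer;
    }

    /** Full lookup: index search, then a sequential scan of the chosen block. */
    static boolean contains(int[][] blocks, int[] highKeys, int key) {
        int block = findBlock(highKeys, key);
        if (block < 0) {
            return false;
        }
        for (int k : blocks[block]) {   // stands in for one block read plus scan
            if (k == key) {
                return true;
            }
        }
        return false;                   // the block was the right place, but the key is absent
    }

    public static void main(String[] args) {
        int[][] blocks = { {3, 7, 12}, {15, 20, 31}, {40, 42, 58} };
        int[] highKeys = { 12, 31, 58 };   // highest key in each block
        System.out.println("block for key 20: " + findBlock(highKeys, 20));
        System.out.println("20 on file? " + contains(blocks, highKeys, 20));
        System.out.println("21 on file? " + contains(blocks, highKeys, 21));
    }
}
```

Key 21 shows why the index only says where a record ‘may’ be: the index selects the second block, and only the scan reveals that the key is missing.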


Example of Relative Files

Relative files are another kind of direct-access file, one for which sequential access is rarely used.

Due to the way the records in the file are created, sequential access, though possible, makes little sense. This is because we typically use a field within a logical record and, based on that field, computationally determine (usually, ‘hash’) a relative record number – an integer – that gives the ‘relative’ displacement of the logical record from the beginning of the file.

In a relative file, the key to an individual record is usually computed and is an integer, such as 3, 25, 65, 234, etc., and is not related to the order in which the record is added to the file.
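One common way to compute a relative record number is to take a numeric key modulo the number of record slots. This is a sketch under that assumption; the notes do not say which hash is used, and the class `RelativeHash`, the 97-slot file, and the sample keys are hypothetical.

```java
public class RelativeHash {

    /** Maps a numeric key (e.g. an account number) to a slot in 0..slots-1. */
    static int relativeRecordNumber(long key, int slots) {
        return (int) Math.floorMod(key, (long) slots);
    }

    /** Displacement of that slot from the beginning of the file, in bytes. */
    static long byteOffset(int relativeRecordNumber, int recordLength) {
        return (long) relativeRecordNumber * recordLength;
    }

    public static void main(String[] args) {
        int slots = 97;          // assumed number of record slots in the file
        int recordLength = 100;  // assumed fixed record length in bytes
        long key = 123456789L;   // made-up account number
        int rrn = relativeRecordNumber(key, slots);
        System.out.println("relative record number: " + rrn);
        System.out.println("byte offset: " + byteOffset(rrn, recordLength));
        // Two different keys can hash to the same slot (a collision):
        System.out.println("key 136 maps to: " + relativeRecordNumber(136, slots));
    }
}
```

Because the slot depends only on the key, neighboring slots hold unrelated records, which is why reading such a file front to back is rarely meaningful. A real implementation also needs a policy for collisions, which this sketch leaves out.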

Again, there are other direct-access file types besides indexed sequential and relative files.

End of Chapter 10.1
