January 13, 2000. Files – Chapter 2: Basic File Processing Operations


Upload: marjorie-nash

Post on 17-Dec-2015


January 13, 2000

Files – Chapter 2

Basic File Processing Operations


Outline

• Physical versus Logical Files
• Opening and Closing Files
• Reading, Writing and Seeking
• Special Characters in Files
• The Unix Directory Structure
• Physical Devices and Logical Files
• Unix File System Commands


Physical versus Logical Files

• Physical File: A collection of bytes stored on a disk or tape.

• Logical File: A “channel” (like a telephone line) that hides the details of the file’s location and physical format from the program.

• When a program wants to use a particular file, “data”, the operating system must find the physical file called “data” and make the hookup by assigning a logical file to it. This logical file has a logical name which is what is used inside the program.


Opening Files

• Once we have a logical file identifier hooked up to a physical file or device, we need to declare what we intend to do with the file:

• Open an existing file
• Create a new file

That makes the file ready for use by the program. We are positioned at the beginning of the file and are ready to read or write.


Opening Files in UNIX/C

• The UNIX system function open( ) is used to open an existing file or create a new file.

fd = open(filename, flags, [pmode]);

– fd: the file descriptor -- the logical file name. The fd is an integer. If there is an error in the attempt to open the file, fd is negative (-1).

– filename: the physical file name. The filename argument can be a pathname.


– flags: an integer argument that controls the operation of the open function. The value of flags is set by performing a bitwise OR of the following values:

• O_APPEND: Append every write operation to the end of the file.
• O_CREAT: Create the file if it does not already exist.
• O_EXCL: Return an error if O_CREAT is specified and the file already exists.
• O_RDONLY: Open a file for reading only.
• O_RDWR: Open a file for reading and writing.
• O_TRUNC: Truncate an existing file to a length of 0, destroying its contents.
• O_WRONLY: Open a file for writing only.
• and many others for synchronization.


Opening Files in UNIX/C (cont’d)

– pmode: An integer argument to specify the protection mode.

• If O_CREAT is specified, pmode is required.

• In UNIX, the pmode is a three-digit octal number that indicates how the file can be used by the owner (1st digit), by members of the owner’s group (2nd digit), and by everyone else (3rd digit). r: read permission, w: write permission, e: execute permission (shown as x by ls -l).

pmode = 0751:

owner   group   world
r w e   r w e   r w e
1 1 1   1 0 1   0 0 1

• File protection is tied more to the operating system than to a specific language.
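The owner/group/world breakdown can be checked mechanically. The sketch below is only an illustration: the function name mode_to_string is hypothetical, and it prints the execute bit as x (the letter ls -l uses) rather than the e used on the slide.

```c
#include <assert.h>

/* Hypothetical helper: fill out[0..9] with the rwx form of the low nine
   permission bits of mode, e.g. 0751 becomes "rwxr-x--x". */
void mode_to_string(unsigned mode, char out[10])
{
    const char letters[] = "rwxrwxrwx";

    assert(out != 0);
    for (int i = 0; i < 9; i++) {
        unsigned bit = 1u << (8 - i);        /* leftmost letter is bit 8 (0400) */
        out[i] = (mode & bit) ? letters[i] : '-';
    }
    out[9] = '\0';
}
```

Calling mode_to_string(0751, buf) fills buf with the nine permission letters for the example mode above: all three for the owner, read and execute for the group, execute only for the world.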


– Examples:

fd = open(filename, O_RDWR | O_CREAT, 0751);

fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0751);

fd = open(filename, O_RDWR | O_CREAT | O_EXCL, 0751);
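A minimal sketch of how the second and third calls behave, wrapped in functions so the return value can be checked. The function names are hypothetical; the flags and the 0751 mode are the ones from the examples.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create path for reading and writing, but refuse to touch an existing file.
   Returns the descriptor, or -1 with errno set (EEXIST if the file exists). */
int create_exclusive(const char *path)
{
    assert(path != 0);
    return open(path, O_RDWR | O_CREAT | O_EXCL, 0751);
}

/* Open path for reading and writing, creating it if necessary and
   discarding any previous contents. Returns the descriptor or -1. */
int create_or_truncate(const char *path)
{
    assert(path != 0);
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0751);
}
```

A caller would test the result before using it, for example `if (fd < 0) perror(path);`. The permissions actually given to a new file are pmode with the bits of the process umask cleared.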


Closing Files

• Makes the logical file name available for another physical file (it’s like hanging up the telephone after a call).

• Ensures that everything has been written to the file [since data is written to a buffer prior to the file].

• Files are usually closed automatically by the operating system when the program terminates (unless the program is abnormally interrupted).


Reading

• Read(Source_file, Destination_addr, Size)

• Source_file = location the program reads from, i.e., its logical file name

• Destination_addr = first address of the memory block where we want to store the data.

• Size = how much information is being brought in from the file (byte count).


Writing

• Write(Destination_file, Source_addr, Size)

• Destination_file = the logical file name where the data will be written.

• Source_addr = first address of the memory block where the data to be written is stored.

• Size = the number of bytes to be written.
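In UNIX/C the generic Read and Write above correspond to the system calls read(fd, addr, size) and write(fd, addr, size). The following is a sketch of a copy loop built on them; the name copy_fd and the 4096-byte buffer are arbitrary choices for illustration.

```c
#include <assert.h>
#include <unistd.h>

/* Copy everything readable from descriptor in to descriptor out.
   Returns the number of bytes copied, or -1 on a read or write error. */
long copy_fd(int in, int out)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    assert(in >= 0 && out >= 0);
    while ((n = read(in, buf, sizeof buf)) > 0) {   /* 0 means end of file */
        ssize_t done = 0;
        while (done < n) {          /* write may accept fewer bytes than asked */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

Both calls return the number of bytes actually transferred, which may be smaller than Size, so the loop keeps going until the whole block has been written.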

Seeking

• A program does not necessarily have to read through a file sequentially: It can jump to specific locations in the file or to the end of file so as to append to it.

• The action of moving directly to a certain position in a file is often called seeking.

• Seek(Source_file, Offset)
– Source_file = the logical file name in which the seek will occur
– Offset = the number of positions in the file the pointer is to be moved from the start of the file.


• The seek function in UNIX/C: lseek( )

pos = lseek(fd, byte_offset, origin)

– pos: a long integer value returned by lseek( ) equal to the number of bytes from the beginning of the file to the file pointer after it has been moved.
– fd: the file descriptor.
– byte_offset: the number of bytes to move from some origin in the file. The byte_offset is a long integer and can be a negative value.
– origin: a value that specifies the starting position from which the byte_offset is to be taken. The values of origin:
• SEEK_SET: lseek( ) from the beginning of the file;
• SEEK_CUR: lseek( ) from the current position;
• SEEK_END: lseek( ) from the end of the file.
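A short sketch of the three origins in use. The helper names file_size, byte_at and size_of are hypothetical; note that on current systems lseek takes and returns the type off_t rather than a plain long.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Size of the open file in bytes, found by seeking to its end.
   The current position is saved and restored. Returns -1 on error. */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);    /* offset 0 from "here" = where are we? */
    if (here == (off_t)-1)
        return -1;
    off_t size = lseek(fd, 0, SEEK_END);    /* position of the end = file length */
    if (size == (off_t)-1 || lseek(fd, here, SEEK_SET) == (off_t)-1)
        return -1;
    return size;
}

/* Move byte_offset bytes from origin, then read one byte there.
   Returns the byte (0..255), or -1 at end of file or on error. */
int byte_at(int fd, off_t byte_offset, int origin)
{
    unsigned char c;
    if (lseek(fd, byte_offset, origin) == (off_t)-1)
        return -1;
    return read(fd, &c, 1) == 1 ? c : -1;
}

/* Convenience wrapper: size of the file named by path, or -1. */
off_t size_of(const char *path)
{
    assert(path != 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    off_t size = file_size(fd);
    close(fd);
    return size;
}
```

For instance, byte_at(fd, -1, SEEK_END) reads the last byte of a non-empty file, which is where a negative byte_offset is useful.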


C/C++ streams

• In C/C++, a file (and other devices like the keyboard) is a stream of data.
• There are two sets of I/O operations.
– C streams in stdio.h
– C++ stream classes in iostream.h and fstream.h
• Comparison between UNIX/C operations and C/C++ streams
– both support a complete set of file operations

UNIX/C
• Available mostly on UNIX (also in Microsoft Visual C++)
• Fast
• Low level

C/C++ Streams
• Standard C/C++ features, available on almost all operating systems
• Provide structured I/O


C Streams

• Three standard streams: stdin, stdout, and stderr.
• Opening file
fopen(const char *filename, const char *mode)
• Closing file
fclose(FILE *fp)
• Reading file
fread(void *buf, size_t size, size_t num, FILE *fp) // read num items of size bytes into buf from fp
fgetc(FILE *fp) // return the next character from fp
fgets(char *buf, int size, FILE *fp) // read a line, or up to size-1 characters, into buf from fp
fscanf(FILE *fp, const char *format, …) // read and format data from fp


C Streams (Cont.)

• Writing file
fwrite(const void *buf, size_t size, size_t num, FILE *fp) // write num items of size bytes from buf to fp
fputc(int ch, FILE *fp) // write the character ch to fp
fputs(const char *buf, FILE *fp) // write the string in buf to fp
fprintf(FILE *fp, const char *format, …) // write formatted data to fp
• Seeking file
fseek(FILE *fp, long offset, int origin)
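As an illustration of fwrite, fseek and fread working together, the sketch below treats a stream as a sequence of fixed-size records. The record layout and the names append_records and read_record are assumptions made for the example, not part of the slides.

```c
#include <assert.h>
#include <stdio.h>

/* Append count records of recsize bytes each to the end of the stream.
   Returns the number of records written. */
size_t append_records(FILE *fp, const void *recs, size_t recsize, size_t count)
{
    assert(fp != NULL && recs != NULL);
    if (fseek(fp, 0L, SEEK_END) != 0)
        return 0;
    return fwrite(recs, recsize, count, fp);
}

/* Read record number index (counting from 0) into buf.
   Returns 1 on success, 0 if the record does not exist or on error. */
int read_record(FILE *fp, long index, size_t recsize, void *buf)
{
    assert(fp != NULL && buf != NULL);
    if (fseek(fp, index * (long)recsize, SEEK_SET) != 0)   /* jump straight to the record */
        return 0;
    return fread(buf, recsize, 1, fp) == 1;
}
```

The stream should be opened in a binary update mode such as "w+b" or "r+b" so that the same FILE pointer can be written and then read. The fseek call between the two operations is what makes switching direction legal.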


• C++ handles file I/O by creating objects of the stream classes.

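• Standard stream objects: cin, cout, cerr, clog
• Stream classes:
in file iostream.h: ios, istream, ostream, iostream
in file fstream.h: ifstream, ofstream, fstream

Class hierarchy: ios is the base class of istream and ostream; ifstream derives from istream, ofstream from ostream, and iostream from both istream and ostream; fstream derives from iostream.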


• Opening file
– constructor
– member function open
• Closing file
– destructor
– member function close
• Reading file
– overloaded extraction operator >>
– many others: read, get, getline
• Writing file
– overloaded insertion operator <<
– many others: write, put
• Seeking file
– seekg: set the read/get pointer
– seekp: set the write/put pointer


The LIST Program

• A simple file processing program: LIST
– Display a prompt for the name of the input file.
– Read the user’s response from the keyboard into a variable called filename.
– Open the file for input.
– While there are still characters to be read from the input file,
• read a character from the file and,
• write the character to the terminal screen.
– Close the input file.


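/* read characters from a file and write them to the terminal screen */

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char c;
    int fd;                 /* file descriptor */
    char filename[20];

    printf("Enter the name of the file: ");                 /* step 1 */
    if (fgets(filename, sizeof filename, stdin) == NULL)    /* step 2 (gets is unsafe) */
        return 1;
    filename[strcspn(filename, "\n")] = '\0';   /* drop the newline kept by fgets */
    fd = open(filename, O_RDONLY);                          /* step 3 */
    if (fd < 0) {
        perror(filename);
        return 1;
    }

    while (read(fd, &c, 1) > 0)     /* step 4a: 0 is end of file, -1 is an error */
        putchar(c);                 /* step 4b: write(stdout, &c, 1) does not work,
                                       because stdout is a FILE *, not a descriptor */

    close(fd);                                              /* step 5 */
    return 0;
}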


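// listc.cpp
// program using C streams to read characters from a file
// and write them to the terminal screen
#include <stdio.h>
#include <string.h>

int main(void)
{
    char ch;
    FILE *file;             // file pointer (stream)
    char filename[20];

    printf("Enter the name of the file: ");                 // Step 1
    if (fgets(filename, sizeof filename, stdin) == NULL)    // Step 2 (gets is unsafe)
        return 1;
    filename[strcspn(filename, "\n")] = '\0';   // drop the newline kept by fgets
    file = fopen(filename, "r");                            // Step 3
    if (file == NULL) {
        perror(filename);
        return 1;
    }
    while (fread(&ch, 1, 1, file) != 0)                     // Step 4a
        fwrite(&ch, 1, 1, stdout);                          // Step 4b
    fclose(file);                                           // Step 5
    return 0;
}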


// listcpp.cpp DO THIS ONE...
// list contents of file using C++ stream classes
#include <fstream>      // <fstream.h> on pre-standard compilers
#include <iostream>
using namespace std;

int main()
{
    char ch;
    fstream file;           // declare fstream unattached
    char filename[20];

    cout << "Enter the name of the file: "      // Step 1
         << flush;                              // force output
    cin.width(sizeof filename);                 // never read more than the buffer holds
    cin >> filename;                            // Step 2
    file.open(filename, ios::in);               // Step 3
    file.unsetf(ios::skipws);                   // include white space in read
    while (1) {
        file >> ch;                             // Step 4a
        if (file.fail()) break;
        cout << ch;                             // Step 4b
    }
    file.close();                               // Step 5
    return 0;
}


Detecting End-of-File

• In UNIX/C
– read returns 0
• Using C streams
– fread returns fewer items than requested (0 if nothing could be read); fgetc returns EOF
– feof returns true
• Using C++ stream classes
– fail returns true
– eof returns true
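The two C-level tests can be sketched as byte counters. The function names are hypothetical; the point is only which return value signals end of file at each level.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* UNIX/C level: read returns 0 at end of file and -1 on error. */
long count_bytes_fd(int fd)
{
    char c;
    long n = 0;
    ssize_t got;

    while ((got = read(fd, &c, 1)) > 0)
        n++;
    return got == 0 ? n : -1;
}

/* C stream level: fgetc returns EOF both at end of file and on error;
   feof tells the two cases apart once the loop has stopped. */
long count_bytes_stream(FILE *fp)
{
    long n = 0;

    assert(fp != NULL);
    while (fgetc(fp) != EOF)
        n++;
    return feof(fp) ? n : -1;
}
```

Note that feof only becomes true after a read has already hit the end, so it is checked after the loop rather than used as the loop condition.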


Special Characters in Files I

• Sometimes the operating system attempts to make “regular” users’ lives easier by automatically adding or deleting characters for them.

• These modifications, however, make the life of programmers building sophisticated file structures (YOU) more complicated!


Special Characters in Files II: Examples

• Control-Z is added at the end of all files (MS-DOS). This is to signal an end-of-file.

• <Carriage-Return> + <Line-Feed> are added to the end of each line (again, MS-DOS).

• <Carriage-Return> is removed and replaced by a character count on each line of text (VMS)


The Unix Directory Structure I

• In any computer system, there are many files (hundreds or thousands). These files need to be organized using some method. In Unix, this is called the File System.

• The Unix File System is a tree-structured organization of directories, with the root of the tree represented by the character “/”.

• Each directory can contain regular files or other directories.
• The file name stored in a Unix directory corresponds to its physical name.


The Unix Directory Structure II

• Any file can be uniquely identified by giving its absolute pathname. E.g., /usr6/mydir/addr. (see the next slide)

• The directory you are in is called your current directory.
• You can refer to a file by the path relative to the current directory.
• “.” stands for the current directory and “..” stands for the parent directory.

[Figure: sample Unix directory tree]

Physical Devices and Logical Files

• Unix has a very general view of what a file is: it corresponds to a sequence of bytes with no worries about where the bytes are stored or where they come from.

• Magnetic disks or tapes can be thought of as files and so can the keyboard and the console.

• No matter what the physical form of a Unix file (real file or device), it is represented in the same way in Unix: by an integer.
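Because every file or device is reached through an integer descriptor, one routine can write to any of them. A minimal sketch, with the hypothetical name say:

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Write a C string to whatever the descriptor refers to: a disk file,
   a pipe, or the console. Returns the byte count, or -1 on error. */
ssize_t say(int fd, const char *text)
{
    assert(text != NULL);
    return write(fd, text, strlen(text));
}
```

say(STDOUT_FILENO, "hello\n") writes to the console, while the same call with a descriptor returned by open( ) writes to a disk file. STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO are the descriptors 0, 1 and 2 that every program starts with.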


Stdout, Stdin, Stderr

• Stdout --> Console

fwrite(&ch, 1, 1, stdout);

• Stdin --> Keyboard

fread(&ch, 1, 1, stdin);

• Stderr --> Standard Error (again, Console)

[When the compiler detects an error, the error message is written in this file]


I/O Redirection and Pipes

• < filename [redirect stdin to “filename”]

• > filename [redirect stdout to “filename”]

E.g., a.out < my-input > my-output

• program1 | program2 [take any stdout output from program1 and use it in place of any stdin input to program2]

E.g., list | sort
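Redirection and pipes work because a program simply reads stdin and writes stdout without knowing what they are attached to. The following filter is a sketch of that idea; the name upcase and the choice of transformation are arbitrary.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Copy the stream in to the stream out, converting lowercase letters
   to uppercase. Returns the number of characters copied. */
long upcase(FILE *in, FILE *out)
{
    int ch;
    long n = 0;

    assert(in != NULL && out != NULL);
    while ((ch = fgetc(in)) != EOF) {
        fputc(toupper(ch), out);
        n++;
    }
    return n;
}
```

A main( ) that only calls upcase(stdin, stdout) could then be run unchanged as a.out < my-input > my-output, or placed at the end of a pipe such as list | a.out.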


Unix System Commands

• cat filenames --> Print the content of the named text files.
• tail filename --> Print the last 10 lines of the text file.
• cp file1 file2 --> Copy file1 to file2.
• mv file1 file2 --> Move (rename) file1 to file2.
• rm filenames --> Remove (delete) the named files.
• chmod mode filename --> Change the protection mode on the named file.
• ls --> List the contents of the directory.
• mkdir name --> Create a directory with the given name.
• rmdir name --> Remove the named directory.