Computer Network Programming

Post on 21-Dec-2015


Course Information

• Class Hours:
  – Section 1: Mon 13:40-15:30 (EB268), Wed 8:40-9:30 (EB267)
  – Section 2: Wed 16:40-17:30 (EB262), Fri 10:40-12:30 (EB267)

• Office Hours: Any time when I am in the office

• Textbooks:
  – W. Richard Stevens, Unix Network Programming, Volume 1, Networking APIs: Sockets and XTI, Second Edition, Prentice Hall PTR, 1998.
  – W. Richard Stevens, Unix Network Programming, Volume 2, Interprocess Communications, Second Edition, Prentice Hall PTR, 1998.

• Some papers (whitepapers) and documents will be distributed on topics that are not covered by the textbooks.

• Prerequisites: No course is set as a formal prerequisite. However, the following are requirements for taking this course:
  – Data Structures and Algorithms course (required)
  – Fluency in C programming (required). If you know C++, C is very easy to learn as well.
  – Operating Systems course, which may be taken in parallel (recommended but not necessary)
  – Computer Networks course (recommended but not necessary)

• Hardware and Software Requirements: Students need access to Unix machines (Solaris). Students may either work from the console of the Unix machines or connect to them from PCs (Windows) using telnet, Xceed or Xwin.

• This course teaches network programming in the Unix operating system.

• Projects and homework assignments will be done on Unix machines (hosts).

• Course homepage: http://www.cs.bilkent.edu.tr/~korpe/cs424.html
  – Please visit the homepage regularly. Announcements will also be posted there.

Grading Policy (tentative)

• Midterm: 20% (there will be one midterm exam)
• Final: 20%
• Project(s): 30%
• Homework: 30%

• Homework will include programming exercises.

• Project(s) will be large and will include writing programs of substantial size.

Topics that will be covered (tentative)

• Overview of Unix Programming Environment

• Unix Programming Tools: compilers, debuggers, utilities, revision control system.

• Introduction to Computer networking and TCP/IP protocol suite

• Overview of TCP, IP, IPv6, Ethernet, PPP, ARP protocols

• Debugging and Networking Tools for Network Programming, Troubleshooting

• Introduction to Sockets, TCP Sockets

• I/O Multiplexing, Socket Options


• UDP Sockets, Name and Address conversions, DNS

• Daemon Processes, Advanced IO Functions

• Unix Domain Protocols, Non-blocking IO

• Routing Sockets, Broadcasting, Multicasting

• Advanced UDP Sockets, Signal driven I/O

• Threads

• IP Options, Raw Sockets, Data-link access

• Interprocess communication, Pipes and FIFOs

• Message queues, shared memory, semaphores


• Distributed File Systems: NFS and AFS

• RPC, Pseudo-terminals

If time permits

• Implementation of Networking sub-system in Unix OS.

• Network Management and SNMP

• Introduction to Mobile and Wireless Networking

• Mobile IP and Bluetooth

Overview of the Unix Programming Environment

Logging into a Unix System

• Log in to the system:
  – type your user name at the login prompt
  – type your password at the password prompt

• When you log in successfully, you are placed in a directory in the file system called your HOME directory
  – ex: van:/home1/csstu/korpe$

Logging into the System

aspendos{korpe}:> telnet van.ug.bcc.bilkent.edu.tr
Trying 139.179.11.19...
Connected to van.ug.bcc.bilkent.edu.tr.
Escape character is '^]'.

UNIX(r) System V Release 4.0 (van)

login: korpe
Password:
Last login: Thu Feb 7 11:31:54 from aspendos.cs.bilk
Sun Microsystems Inc. SunOS 5.5 Generic November 1995

van:/home1/csstu/korpe$

Basic commands

• Listing the files in a directory: use the ls command
  – ls -l gives more information about each file

• Displaying the content of a file: use the cat or more commands

• Copying a file: cp command
  – cp file1 file2 copies file1 into file2.

• Renaming (moving) a file: mv command
  – mv file1 file2 renames file1 to file2.

• rm removes a file from the system

• wc displays the number of lines, words, and characters in a file

• cd changes the directory
  – cd .. changes to the parent directory
  – cd directory_name changes the current directory to directory_name
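The commands above can be tried safely in a scratch directory. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the file names file1, file2 and file3 are placeholders chosen for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'first line\nsecond line\n' > file1   # create a two-line file
cp file1 file2          # copy: file1 and file2 now both exist
mv file2 file3          # rename: file2 is gone, file3 takes its place
ls -l                   # long listing of file1 and file3
cat file3               # print the copied contents

line_count=$(wc -l < file3)   # number of lines in file3
rm file1                      # remove file1; only file3 remains
remaining=$(ls)
echo "lines: $line_count, files left: $remaining"

cd ..                   # move up to the parent directory
```

Note that rm does not ask for confirmation by default, which is one reason to practise in a temporary directory.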

Unix File Hierarchy

[Figure: tree diagram of the Unix file hierarchy. Node labels: / (root); bin, dev, etc, usr, tmp, unix, kernel, sbin, var, vmunix; home, bin, local; you, mike, paul; tmp, data.dat, program.c, junk.]

A path: /home/you/data.dat

Pathnames

• You can use full pathnames for the files in the commands or relative pathnames:
  – Example: assume the current directory is “you”
  – full path: cp /home/you/data.dat /home/mike
  – relative path: cp data.dat ../mike/

• Use the Unix manual pages to obtain more information about the commands.

• Ex: man cp gives information about the cp command

• man -k subject lists the commands related to subject

man -k

van:/home1/csstu/korpe$ man -k copy
/usr/openwin/man/windex: No such file or directory
/usr/local/SUNWspro/man/windex: No such file or directory
/usr/dt/man/windex: No such file or directory
/usr/man/windex: No such file or directory
cp cp (1) - copy files
cp cp (1) - copy files
cp cp (l) - copy files
cpio cpio (l) - copy files to and from archives
dd dd (1) - convert a file while copying it
dd dd (1) - convert a file while copying it
dd dd (l) - convert a file while copying it
fcat fcatcmd (1) - copy files in the FSP database to stdout
fcatcmd fcatcmd (1) - copy files in the FSP database to stdout
install install (1) - copy files and set their attributes
install install (1) - copy files and set their attributes
install install (l) - copy files and set their attributes
tiffcp tiffcp (l) - copy (and possibly convert) a .SM TIFF file

Important directories in the File Hierarchy

`/bin' Executable (binary) programs. On most systems this is a separate directory from /usr/bin. In SunOS, this is a pointer (link) to /usr/bin.

`/etc' Miscellaneous programs and configuration files. This directory has become very messy over the history of UNIX and has become a dumping ground for almost anything. Recent versions of Unix have begun to tidy up this directory by creating subdirectories such as `/etc/mail'.

`/usr' This contains the main meat of UNIX. This is where application software lives, together with all of the basic libraries used by the OS.

`/usr/bin' More executables from the OS.


`/usr/local' This is where users' custom software is normally added.

`/sbin' A special area for statically linked system binaries. They are placed here to distinguish commands used solely by the system administrator from user commands and so that they lie on the system root partition where they are guaranteed to be accessible during booting.

`/dev, /devices' A place where all the `logical devices' are collected. These are called `device nodes' in unix and are created by mknod. Logical devices are UNIX's official entry points for writing to devices. For instance, /dev/console is a route to the system console, while /dev/kmem is a route for reading kernel memory. Device nodes enable devices to be treated as though they were files.


`/home' (Called /users on some systems.) Each user has a separate login directory where files can be kept. These are normally stored under /home by some convention decided by the system administrator.

`/var' /var/spool and /var/adm etc are used for holding queues for spooling and system log files.

`/vmunix' This is the program code for the unix kernel. (kernel is the core which implements basic operating system services).

`/kernel' On newer systems the kernel is built up from a number of modules which are placed in this directory.

Input-Output redirection

• “ls -l > filename” lists the files into a file called filename

• “cat f1 f2 f3 > tmp” concatenates the files into tmp

• “sort < temp” sorts the strings in the file temp and displays the sorted output (input received from temp)
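The redirection operators above can be sketched as follows. This assumes a POSIX shell; the file names f1, f2, tmp and listing are placeholders for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'pear\napple\n' > f1      # ">" creates (or truncates) f1
printf 'cherry\n'      > f2
cat f1 f2 > tmp                  # concatenate f1 and f2 into tmp
ls -l > listing                  # the directory listing goes into a file
sorted=$(sort < tmp)             # sort reads its input from tmp
printf '%s\n' "$sorted"
```

Because ">" overwrites an existing file, the related operator ">>" is the one to use when output should be appended instead.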

Pipes

• You can redirect the output of a program to another program as input.

• Examples:
  – ls | wc -l
  – who | grep mary
  – who | grep mary | wc -l
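A pipeline like the last example can be tried without depending on who is actually logged in. In this sketch the function fake_who is a hypothetical stand-in that prints fixed lines in roughly the shape of who output.

```shell
# Stand-in for "who": one line per login session (invented data).
fake_who() {
    printf 'mary   pts/1\njohn   pts/2\nmary   pts/3\n'
}

fake_who | grep mary                            # keep only mary's lines
mary_sessions=$(fake_who | grep mary | wc -l)   # then count them
echo "mary has" $mary_sessions "sessions"
```

Each stage reads the previous stage's standard output as its standard input, so no temporary file is needed.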

Process

• “ps” shows the programs (processes) that you are currently running in this session

• “ps -ef” shows detailed information about every process on the system

• “kill -9 process_number” kills (terminates) a process; the process cannot catch or ignore this signal
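The sketch below starts a background process, inspects it, and terminates it. It uses kill -TERM rather than kill -9 so the process gets a chance to clean up; that choice is a suggestion here, not something stated on the slide. It assumes a POSIX shell, and the fallback message covers minimal systems where ps is not installed.

```shell
sleep 30 &                 # start a long-running process in the background
pid=$!                     # $! holds the PID of the last background process
ps -p "$pid" || echo "ps could not show PID $pid"

kill -TERM "$pid"          # ask the process to terminate (signal 15)
status=0
wait "$pid" || status=$?   # collect its exit status
echo "exit status: $status"
```

A status above 128 indicates that the process was ended by a signal. If a process ignores TERM, kill -9 is the last resort.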

Tailoring the Environment and Shell

• You can bring the system closer to your personal taste.

• The shell is the interpreter that executes the commands that you type.
  – It provides you with a command prompt (like $) where you type your commands.

• There are different kinds of shells that you can choose from

Different Shells

bash The Bourne Again shell, an improved sh.

csh The standard C-shell.

ksh The Korn shell, an improved sh.

sh The original Bourne shell.

tcsh An improved C-shell.

- csh and tcsh are easy to use
- in order to execute a shell just type its name.

Environment Variables

• Environment variables are variables that the shell keeps. They are used to configure your working environment, so that you can tailor it to your needs and taste.

• Any program that you run can read these variables to find out the configuration of the environment.

Some important environment variables

PATH The search path for shell commands (bash)

TERM The terminal type (bash and csh)

DISPLAY X11 - the name of your display

LD_LIBRARY_PATH Path to search for object and shared libraries

HOSTNAME Name of this UNIX host

PRINTER Default printer (lpr)

HOME The path to your home directory (bash)

PS1 The default prompt for bash

path The search path for shell commands (csh)

term The terminal type (csh)

prompt The default prompt for csh

home The path to your home directory (csh)

Setting the Shell Variables

• In csh and tcsh, the setenv command with no arguments displays the current values of the environment variables

• You can set the value of an environment variable using the setenv command

• Example:
  – setenv PATH /opt/SUNWspro/bin
    » sets the PATH variable
  – setenv PATH ${PATH}:/usr/local/teTeX/bin
    » adds one more path to the PATH variable
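setenv belongs to csh and tcsh. In Bourne-style shells (sh, ksh, bash) the same effect is obtained with an assignment followed by export. The sketch below reuses the path from the example above; EDITOR_CHOICE is a hypothetical variable introduced only to show that child processes inherit exported values.

```shell
# Bourne-style equivalent of: setenv PATH ${PATH}:/usr/local/teTeX/bin
PATH="${PATH}:/usr/local/teTeX/bin"   # append one more directory
export PATH                           # make it visible to child processes

EDITOR_CHOICE=vi                      # hypothetical variable, illustration only
export EDITOR_CHOICE
child_view=$(sh -c 'echo "$EDITOR_CHOICE"')   # a child process reads it
echo "child sees: $child_view"
echo "PATH is now: $PATH"
```

Without export, the variable would stay local to the current shell and the child would print an empty line.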

Shell Configuration Files

• You can add the configuration commands to special files such as .profile and .cshrc, so that your environment is configured using the commands in those files when you log in to the system.

• .profile and .cshrc are hidden files (use ls -al to see them)

• .cshrc is used by the C-shell, .profile by the Bourne shell, .bashrc by bash, etc.

Wildcards

• Sometimes you want to be able to refer to several files in one go while executing a command

• You use wildcards for this

• The wildcard symbols are
  – `?' Match a single character. e.g. ls /etc/rc.????
  – `*' Match any number of characters. e.g. ls /etc/rc.*
  – `[...]' Match any character in a list enclosed by these brackets. e.g. ls [abc].C

Here are some examples and explanations.

`/etc/rc.????' Match all files in /etc whose first three characters are rc. and are 7 characters long.

`*.c' Match all files ending in `.c' i.e. all C programs.

`*.[Cc]' Match all files ending in `.c' or `.C' i.e. all C and C++ programs.

`*.[a-z]' Match any file ending in .a, .b, .c, ... up to .z.
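The patterns above can be checked against a handful of empty files. This is a minimal sketch, assuming a POSIX shell; every file name is invented for the illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch main.c util.c Shape.C notes.txt a.C b.C d.C

c_files=$(echo *.c)       # names ending in .c
both=$(echo *.[Cc])       # names ending in .c or .C
single=$(echo ?.C)        # exactly one character before .C
listed=$(echo [abc].C)    # one of a, b or c before .C (d.C is excluded)

echo "*.c     -> $c_files"
echo "*.[Cc]  -> $both"
echo "?.C     -> $single"
echo "[abc].C -> $listed"
```

The shell expands each pattern before the command runs, so echo simply receives the list of matching names.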

Regular Expressions

The wildcards belong to the shell. They are used for matching filenames. UNIX has a more general and widely used mechanism for matching strings: regular expressions.

# Print all lines which DON'T begin with #
egrep '(^[^#])' /etc/rc

# Print all lines beginning with e, f or g.
egrep '(^[efg])' /etc/rc

# Print all lines beginning with uppercase
egrep '(^[A-Z])' /etc/rc

# Print all lines NOT beginning with uppercase
egrep '(^[^A-Z])' /etc/rc

# Print all lines containing !, * or &
egrep '([\!\*\&])' /etc/rc

(“egrep pattern filename” is a utility that searches a file and prints the lines matching the pattern)

How to construct regular expressions

Regular expressions are made up of the following `atoms'.

`.' Match any single character except the end of line.

`^' Match the beginning of a line as the first character.

`$' Match end of line as last character.

`[..]' Match any character in the list between the square brackets.(see below).

`*' Match zero or more occurrences of the preceding expression.

`+' Match one or more occurrences of the preceding expression.

`?' Match zero or one occurrence of the preceding expression.
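The atoms above can be exercised on a small sample file. This sketch uses grep -E, which accepts the same extended syntax as egrep. The file rc.sample and its contents are invented stand-ins for /etc/rc, and LC_ALL=C is set as a precaution so that ranges such as [A-Z] behave predictably.

```shell
LC_ALL=C
export LC_ALL
workdir=$(mktemp -d)
sample="$workdir/rc.sample"   # hypothetical stand-in for /etc/rc
printf '%s\n' '# startup script' 'echo Booting' 'Fsck -p' 'mount -a' 'exit 0' > "$sample"

grep -E '^[^#]' "$sample"                      # print lines not starting with #
no_comment=$(grep -E -c '^[^#]' "$sample")     # -c counts matching lines
starts_efg=$(grep -E -c '^[efg]' "$sample")    # lines starting with e, f or g
upper=$(grep -E -c '^[A-Z]' "$sample")         # lines starting with uppercase
ends_digit=$(grep -E -c '[0-9]$' "$sample")    # lines ending in a digit
bo_plus=$(grep -E -c 'Bo+t' "$sample")         # B, one or more o, then t
echo "$no_comment $starts_efg $upper $ends_digit $bo_plus"
```

Quoting each pattern in single quotes keeps the shell from treating characters such as * and ? as wildcards before grep sees them.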

File Permissions

• chmod – Change file access mode.

• chown, chgrp – Change the owner and the group of the file.

• use ls -al to find the current permissions of a file:

Example:

aspendos{korpe}:> ls -al .profile
-rw-r--r-- 1 korpe staff 144 Apr 1 1997 .profile

The structure of permissions is: - rwx rwx rwx
  – first rwx: permissions for the owner
  – second rwx: permissions for the group
  – third rwx: permissions for the others

Think of rwx as a binary number where 1 corresponds to permission granted and 0 corresponds to permission denied.

rwx corresponds to 111 = 7
r-- corresponds to 100 = 4

For example:

• to obtain the permission setting rwxr-xr-x for a file, we execute the command: chmod 755 filename

• chmod a+w filename changes the mode of the file so that it is writable by everyone (a plain +w is filtered through your umask, so it may not grant write permission to the group and others)

• chmod a+x filename changes the mode of the file so that it is executable by everyone

• chmod ug+w filename changes the mode of the file to writable for the user and group.

• chmod uo+x filename changes the mode of the file to executable for the user and others
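The octal and symbolic forms can be compared on a scratch file. This is a minimal sketch, assuming a POSIX shell; script.sh is a placeholder name, and cut keeps only the ten permission characters from the ls -l output.

```shell
workdir=$(mktemp -d)
file="$workdir/script.sh"
touch "$file"

chmod 755 "$file"                         # rwx r-x r-x
mode_755=$(ls -l "$file" | cut -c1-10)
chmod 644 "$file"                         # rw- r-- r--
chmod ug+w "$file"                        # add write for user and group
mode_ugw=$(ls -l "$file" | cut -c1-10)
chmod uo+x "$file"                        # add execute for user and others
mode_uox=$(ls -l "$file" | cut -c1-10)

# Each octal digit is the sum of r=4, w=2, x=1.
owner_digit=$((4 + 2 + 1))                # rwx
group_digit=$((4 + 0 + 1))                # r-x
echo "$mode_755 $mode_ugw $mode_uox $owner_digit$group_digit$group_digit"
```

The octal form sets all nine bits at once, while the symbolic form changes only the bits that are named and leaves the rest alone.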

Text editors that you can use

ed An ancient line-editor.

vi Visual interface to ed. This is the only "standard" UNIX text editor supplied by vendors.

emacs The most powerful UNIX editor. A fully configurable, user programmable editor which works under X11 and on tty-terminals.

xemacs A pretty version of emacs for X11 windows.

pico A tty-terminal only editor, comes as part of the PINE mail package.

xedit A test X11-only editor supplied with X-windows.

textedit A simple X11-only editor supplied by Sun Microsystems.

Running a process in the background

• Type & after the command name
  – example: my_program &

• use the bg and fg commands to move a job to the background or bring it back to the foreground.

• use the jobs command to display the current jobs

Obtaining System Information

uname gives the name of the Operating System

uname -r gives the version of the Operating System

hostname gives the name of the host.

Unix Programming Tools

Tools

• editors: emacs, xemacs, vi, pico

• compilers: gcc, cc, CC, g++

• linker/loader: ld

• archive library builders: ar

• debuggers: gdb, xxgdb, dbx

• utilities: make, autoconf, purify, gprof, truss

• source code management: rcs, cvs, sccs

Compiling

• gcc
  – gcc -o hello hello.c
    » compiles and links the program hello.c and generates an executable called hello (if you don't give a name, the executable is called a.out)
  – gcc -c hello.c
    » only compiles the program and produces an object file (hello.o)
  – gcc -Wall
    » enables a broad set of commonly useful warnings
  – gcc -llibraryname
    » links with the library libraryname

Linking

• ld
  – the linker ld links together the object code files and library files and produces an executable program.
  – Linking refers to the process in which a symbol referenced in one module of your program is connected with its definition in another module (object file or library).
  – ld -Ldirectory_name
    » searches the directory called directory_name for the specified libraries.
  – ld -lx
    » searches for a library libx.so or libx.a
    » .so: shared object library
    » .a: archive library

Linker

• Static Linking
  » Under static linking, copies of the archive library object files that satisfy still unresolved external references in your program are incorporated in your executable at link time. External references in your program are connected with their definitions -- assigned addresses in memory -- when the executable is created.

• Dynamic Linking
  » Under dynamic linking, the contents of a shared object are mapped into the virtual address space of your process at run time. External references in your program are connected with their definitions when the program is executed.

Creating a Library

• Static library for linking: libsomething.a
  – create .o files: gcc -c helper.c
  – ar rlv libsomething.a *.o
  – ranlib libsomething.a
  – use the library as gcc -lsomething
    » searches in /usr/lib, etc.

• Dynamic library
  – gcc -shared -fPIC helper.c -o libhelper.so
  – use in the same way as above; LD_LIBRARY_PATH tells the run-time linker where to look for it
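The static-library steps can be walked through end to end as follows. This is only a sketch under stated assumptions: it expects a C compiler reachable as cc and the ar tool, and it skips itself if either is missing. The sources helper.c and main.c are invented for the illustration, and it uses ar rv rather than the slide's ar rlv because recent GNU versions of ar interpret the l modifier differently.

```shell
if command -v cc >/dev/null 2>&1 && command -v ar >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    cd "$workdir" || exit 1

    # helper.c and main.c are hypothetical sources for illustration.
    printf 'int helper(void) { return 42; }\n' > helper.c
    printf '#include <stdio.h>\nint helper(void);\nint main(void) { printf("%%d\\n", helper()); return 0; }\n' > main.c

    cc -c helper.c                      # compile only: produces helper.o
    ar rv libsomething.a helper.o       # put the object file into an archive
    cc -o main main.c -L. -lsomething   # -L. searches this directory for libsomething.a
    lib_result=$(./main)                # run the statically linked program
else
    lib_result="skipped (no C compiler or ar found)"
fi
echo "$lib_result"
```

The -lsomething option must come after main.c on the command line, because the linker resolves references against libraries in the order it meets them.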

ldd and nm

• List dynamic dependencies of an executable

$ ldd a.out
libc.so.1 => /usr/lib/libc.so.1
libdl.so.1 => /usr/lib/libdl.so.1

• Content of an archive or executable

$ nm [-g] a.out
0004155b R charmap
00030f10 t cleanfree
0002e304 W close
……..

Debugging

• Use gdb or xxgdb

• must compile your programs with -g option to get the symbol table

• gdb a.out or gdb a.out core

• gdb a.out 1234 attaches to process 1234

• source commands
  – list main.c:12: show source
  – p(print) x: show variable x
  – where: where am I in the stack

gdb: execution

• run arg: run program
• call f(a,b): call function in program
• step N: step N times into functions
• next N: step N times over functions
• up N: select stack frame that called current one
• down: select stack frame called by current one

gdb: break points

• break main.c:12: set break point
• break foo: set break point at function
• clear main.c:12: delete break point
• info break: show breakpoints
• delete 1: delete break point 1
• display x: display variable at each stop

make Utility

• make is used for compiling large projects that consist of many source files

• maintains dependency graphs:
  – source <--- object
  – latex <--- postscript

• based on modification times of files

target ... : dependency ...
TAB command
TAB command
...

How to use make

• Create a file called Makefile in the directory where the source code (program) resides
  – edit the Makefile so that dependencies are defined for the code

• type make
  – it will automatically read the Makefile and will compile the source code.

• A Makefile consists of dependencies of the form:

target ... : dependency ...
TAB command (or rule)
TAB command

Content of Makefile

• The target is the thing we want to build. The dependencies are like subroutines to be executed first if they do not exist. Finally, the command (or rule) is executed once all of the dependencies exist; it takes the dependencies and turns them into the target. There are two important things to remember:

The file names must start on the first character of a line.

There must be a TAB character at the beginning of every rule or action. If there are spaces instead of tabs, or no tab at all, `make' will signal an error. This bizarre feature can cause a lot of confusion.
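The sketch below shows how make decides what to rebuild from modification times, using plain text files so that no compiler is needed. It assumes make is installed and skips itself otherwise; a.txt, b.txt and all.txt are invented names. printf is used to write the Makefile so that the mandatory TAB is explicit as \t.

```shell
if command -v make >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    cd "$workdir" || exit 1
    echo "part one" > a.txt
    echo "part two" > b.txt

    # One rule: all.txt depends on a.txt and b.txt; \t is the required TAB.
    printf 'all.txt: a.txt b.txt\n\tcat a.txt b.txt > $@\n' > Makefile

    make                        # all.txt is missing, so the command runs
    make                        # nothing changed, so nothing is rebuilt
    sleep 1                     # make the next timestamp clearly later
    echo "part one, edited" > a.txt
    make                        # a.txt is now newer than all.txt: rebuild
    make_result=$(head -n 1 all.txt)
else
    make_result="skipped (make not found)"
fi
echo "$make_result"
```

Replacing \t with spaces in the printf line is a quick way to see the error that the second point above warns about.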

Example

Our example program consists of two source files: main.c and other.c

It uses a library called libdb which resides in the directory /usr/local/lib

Our aim is to build a program called database.

#
# Simple Makefile for `database'
#
# First define a macro
OBJ = main.o other.o
CC = gcc
CFLAGS = -I/usr/local/include
LDFLAGS = -L/usr/local/lib -ldb
INSTALLDIR = /usr/local/bin
#
# Rules start here. Note that the $@ variable becomes the name of the
# executable file. In this case it is the target name, database
#
database: ${OBJ}
	${CC} -o $@ ${OBJ} ${LDFLAGS}
#
# If a header file changes, normally we need to recompile everything.
# There is no way that make can know this unless we write a rule which
# forces it to rebuild all .o files if the header file changes...
#
${OBJ}: ${HEADERS}
#
# As well as special rules for special files we can also define a
# "suffix rule". This is a rule which tells us how to build all files
# of a certain type. Here is a rule to get .o files from .c files.
# The $< variable is like $? but is only used in suffix rules.
#
.c.o:
	${CC} -c ${CFLAGS} $<

#######################################################################
# Clean up
#######################################################################
#
# Make can also perform ordinary shell command jobs
#
clean:
	rm -f ${OBJ}
	rm -f y.tab.c lex.yy.c y.tab.h
	rm -f y.tab lex.yy
	rm -f *% *~ *.o
	rm -f mconfig.tab.c mconfig.tab.h a.out
	rm -f man.dvi man.aux man.log man.toc
	rm -f cfengine.tar.gz cfengine.tar cfengine.tar.Z
	rm -f cfengine

install: database
	cp database ${INSTALLDIR}/database

How to invoke the Makefile

make
make database
make clean
make install

Make uses some special variables: $@ $? $<

Makefile Special Variables

$@ This evaluates to the current target, i.e. the name of the object you are currently trying to build. It is normal to use this as the final name of the program when compiling.

$? This is used only outside of suffix rules and means the names of all the dependencies that are newer than the current target, i.e. the files which must be processed again in order to rebuild it. Because $? lists only the out-of-date dependencies, a link command is usually safer with the object files listed explicitly, as in the database Makefile.

target: file1.o file2.o
TAB cc -o $@ $?

$< This is only used in suffix rules. It has the same meaning as `$?' but only in suffix rules. It stands for the prerequisite, or the file which must be compiled in order to make a given object.

Source Code Management and Revision Control

• Large scale programs are developed by many engineers

• a single shared database of source code (program files) is used by many people

• access to the source code files needs to be synchronized (only one user at a time should be able to modify a source file)

• We need to store many versions (revisions) of program files at different stages of project development

RCS

• rcs: source code management and revision control system in Unix. Others exist: SCCS, CVS

• manages multiple revisions of files.

• rcs automates the storing, retrieval, logging, identification and merging of revisions

• Revisions are stored in a file called the RCS file.

• The RCS files are usually stored in a directory called RCS

• Example: Our program directory foo contains a file called main.c. All the versions of main.c are stored in a file called main.c,v in the directory RCS. main.c is the currently used version of the file and it may or may not be stored in the foo directory.

/foo
  main.c
  /RCS
    main.c,v

Functions of RCS

• Store and retrieve multiple revisions of text
  – Revisions can be retrieved using revision numbers, symbolic names, authors, or dates.

• Maintain a complete history of changes
  – RCS logs all the changes to the file together with the author who made them, the date, and an explanation message.

• Resolve access conflicts
  – when more than one user wants to access the file, RCS alerts the users and prevents the file from being corrupted.

• Maintain a tree of revisions
  – RCS can maintain separate lines of development for each module. It stores a tree structure that represents the ancestral relationships among revisions.

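[Figure: tree of revisions. Top of tree path: 1.1, 1.2, 1.3, 1.4. Branch from 1.2: 1.2.1.1, 1.2.1.2. Further branches: 1.2.1.1.1.1 and 1.2.1.2.1.1, 1.2.1.2.1.2. Arrows are labelled "branch" and "merge".]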

RCS Functions

• Merge revisions and resolve conflicts
  – Two different lines of development can be merged. If the revisions being merged affect the same section of the code, RCS alerts the user.

• Control Releases and Configurations
  – Revisions can be assigned symbolic names (tagging) and marked as stable, released, experimental, etc.

• Automatically identify each revision with name, revision number, creation time and author.

• Minimize secondary storage.
  – Only the difference (delta) between revisions is stored in the RCS file.

Example

• Assume you have a file f.c (working file) that you want to store in RCS

• create a directory RCS: mkdir RCS

• use the ci command (check-in) to store the file in RCS
  – ci f.c (stores f.c in RCS/f.c,v and assigns it revision number 1.1)
  – f.c is deleted from the working directory

• use the co command (check-out) to retrieve the latest revision from the RCS file.
  – co f.c

RCS commands

• Use co -l to lock the retrieved (checked-out) file so that you can make changes to it.

• After modifying the file you can store (check in) the file, and RCS will assign it a new revision number, 1.2

• You can check out a specific revision of the file using the command
  – co -rrevision_number file (e.g. co -r1.2 foo.c)

• You can check in (store) the file with a revision number of your choice using the command: ci -rrevision_number file
  – example: ci -r1.2.1.1 f.c (creates a branch)

• RCS can put an automatic identification string into your file. To achieve this, put the string $Id$ at the beginning of the file. When you check the file in and out, RCS will replace this string with an identification string of the form: $Id: filename revision date time author state $

• With such a string at the beginning of your working file, you will know which revision of the file you are currently working with.

Other RCS commands

• ident shows information about the working file
  – Example:

aspendos{korpe}:> ident main.c
main.c
$Id: main.c,v 1.2 2002/02/06 21:47:25 korpe Exp $

• rcsdiff shows the difference between two revisions of a file.
  – rcsdiff -r1.1 -r1.2 main.c (shows the difference between revisions 1.1 and 1.2)
  – rcsdiff -r1.2 main.c (shows the difference between the working file main.c and its 1.2 revision)

• rcsmerge incorporates the changes between two revisions of a file into the corresponding working file.
  – rcsmerge -r1.2 -r1.2.1.1 f.c (version of the working file is 1.3)

[Figure: f.c with revisions 1.1, 1.2 and branch revision 1.2.1.1, plus the working file 1.3. The delta is added to the current working file, which has revision 1.3, producing a new revision.]

RCS Commands

• rlog prints log messages and other information about RCS files.

• rcs creates new RCS files or changes the attributes of the existing ones

• rcs -l1.2 f.c locks the revision 1.2 of f.c so that we can check it in as revision 1.3.

• rcs -urevision filename removes the lock.

• rcs -nname[:rev] assigns a symbolic name to the version rev of the file. (this is called tagging and name is called a tag)

Other Tools

• Memory
  – purify: memory leak detector

• Performance
  – prof, gprof: profile performance
  – truss: trace system calls

gprof

  – Profiling = execution profile of a call graph
  – periodic CPU sampling

gcc -pg myprog.c -o myprog
./myprog
gprof myprog gmon.out

Running the instrumented program writes the profile data into gmon.out, which gprof then reads.

truss

• shows an execution trace of system calls
• does not show stdio calls
• -p: attach to existing process id
• -f: follow children
• -u libc: also follow user libraries

$ truss -u libc -d test

Use top to show memory utilization.

Purify

• Checks for
  – memory leaks
  – access to free'd memory
  – open file descriptors

• Example:
  – purify -cache-dir=/tmp/purify gcc -g test.c
  – a.out