eseaica1 · 2000-02-23 · chapter 2. pr ogramming with sockets do not study them. there exist also...

Réseaux Informatiques I

Jean-Yves Le Boudec

ICA

École Polytechnique Fédérale de Lausanne

You shall love networks
The names of layered models
Will no longer daunt you
TCP/IP, ISO

Far from the torments of novices
You will be in good hands
You will know the protocols
Their frames and their services

To the language of Molière
You will have to be unfaithful
And sometimes prefer that
Of James Fenimore Cooper

JYLB

Cooper (James Fenimore) 1789-1851. American novelist. His novels depict the conflict between civilization and primitive culture. The Last of the Mohicans, 1826. The Deerslayer, 1841.

Contents

1 Introduction to Computer Networking

2 Programming With Sockets
  2.1 Objective
  2.2 References
  2.3 General Concepts
    2.3.1 What is the socket library ?
    2.3.2 What is a socket ?
    2.3.3 Creating a socket
  2.4 Addresses
    2.4.1 Data structures for addresses
    2.4.2 Host and Network order
    2.4.3 Textual representation of addresses
  2.5 Names
  2.6 Using a UDP socket
    2.6.1 Binding a socket
    2.6.2 Writing to and reading from a UDP socket
    2.6.3 Closing a socket
    2.6.4 Example: UDP Server and Client
  2.7 TCP sockets
    2.7.1 Differences with UDP sockets
    2.7.2 Server side
    2.7.3 Client side
    2.7.4 Example: A simple TCP client server pair
  2.8 Socket Options
    2.8.1 Socket Options
    2.8.2 How do I make a socket non-blocking ?
  2.9 Concurrency
    2.9.1 Select
    2.9.2 Fork
    2.9.3 Interprocess Communication
    2.9.4 Zombie
    2.9.5 Example: a UDP parallel server
    2.9.6 Example: a TCP parallel server
  2.10 Unix programming miscellanies
    2.10.1 How do I use man pages ?
    2.10.2 Handling errors in system and library calls

3 The MAC layer

4 The Internet Protocol

5 The Transport Layer

6 The Application Layer

7 Mixed Architectures

Chapter 1

Introduction to Computer Networking


Put copy of slides here.

Chapter 2

Programming With Sockets

2.1 Objective

After studying this chapter you should be able to

• write a C program under Unix which uses the transport layer of TCP/IP, by means of the sockets interface

• write a parallel UDP server using fork()

This chapter is a first introduction to low-level programming in C. It is not a complete course on inter-process communication (see the operating systems and distributed systems lectures).

2.2 References

FAQs are available by FTP from the news.answers archives at rtfm.mit.edu and its many mirror sites

worldwide. See also http://www.faqs.org.

1. Socket FAQ: ftp://rtfm.mit.edu/pub/usenet/news.answers/unix-faq/socket

2. Fork and programming with Unix processes:

ftp://rtfm.mit.edu/pub/usenet/news.answers/programmer/faq

3. Secure Unix programming:

ftp://rtfm.mit.edu/pub/usenet/news.answers/sup/securefaq.html

4. Comer and Stevens, TCP/IP, Vol III: Client Server Programming and Applications, 1993,

Prentice-Hall

2.3 General Concepts

2.3.1 What is the socket library ?

The socket library is an application programming interface (API) which provides access to the transport layer and, with some restrictions (raw sockets), to the network layer. There exist other APIs; we do not study them.

Architectures other than TCP/IP also exist. The socket API has been designed as a general mechanism which applies to other architectures as well. We will study only the part of the socket API which is used for TCP/IP.

2.3.2 What is a socket ?

In our context, a socket means a communication end-point. A TCP/IP socket is identified by

• the IP address of a communication interface

• a port number

Every socket has a family, a type and a protocol. The family corresponds to what we call an architecture. There are two families for us: IPv4 and IPv6. We will see only IPv4 in this chapter. The type corresponds to the service. There are three types of sockets:

• UDP sockets: type = datagram; protocol = UDP

• TCP sockets: type = stream; protocol = TCP

• raw sockets: type = raw; protocol = IP or ICMP

For architectures other than TCP/IP, there might be other combinations.

Unix treats a socket as a file. In reality, a socket corresponds to a receive buffer and a send buffer in the operating system kernel.

Sockets can be used to communicate:

• between different machines

• between different processes on the same machine. In that case, the IP address is the loopback address 127.0.0.1

2.3.3 Creating a socket

A program creates a socket with the socket() library function.

#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);

where

• family is AF_INET or AF_INET6

• type is SOCK_DGRAM, SOCK_STREAM, or SOCK_RAW

• protocol is 0, which selects the default protocol for the given family and type

The call returns a (small) integer (the "socket descriptor") which can be used later to reference the socket, or -1 if there is an error. The socket descriptor is in the same space as a file descriptor. A process has at least three open files: 0 (input); 1 (output); 2 (error). A process that accesses no disk file and opens one socket is likely to receive a socket descriptor equal to 3. File access functions such as read and write can be performed on sockets too. There is a limit on the maximum number of files, including sockets, that a process can access (use the csh command limit -h, or ulimit -n in sh and bash, to explore this).
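As a small illustration of the call described above, the following sketch creates one socket and reports its descriptor. The helper name new_socket_descriptor is hypothetical (it is not part of the socket library), and the descriptor is closed again immediately, so the value is only meant to show that it is a small non-negative integer.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
   Creates an IPv4 socket of the given type (SOCK_DGRAM or SOCK_STREAM),
   closes it again, and returns the descriptor value it had,
   or -1 if the socket could not be created. */
int new_socket_descriptor(int type)
{
    int sd = socket(AF_INET, type, 0);   /* protocol 0: default for this type */
    if (sd < 0)
        return -1;
    close(sd);                           /* release the descriptor */
    return sd;
}
```

In a process that has only its three standard files open, new_socket_descriptor(SOCK_DGRAM) would typically return 3, but the exact value depends on what else the process has opened.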

2.4 Addresses

2.4.1 Data structures for addresses

We must be able to represent IP addresses and port numbers. There is a generic data structure for addresses, independent of any architecture:

struct sockaddr {
short sa_family; /* for us: AF_INET */
char sa_data[14]; /* 14 bytes of data representing the address */
};

An IPv4 socket data structure contains both an IPv4 address and a port:

#include <netinet/in.h>

struct in_addr {

u_long s_addr;

};

struct sockaddr_in { /* IPv4 address + port */

short sin_family; /* to be set to AF_INET */

u_short sin_port;

struct in_addr sin_addr;

char sin_zero[8]; /* unused */

};

When we use socket calls, a pointer to an IPv4 socket data structure can be cast to a pointer to struct sockaddr because both structures have the same length (this is why there are 8 bytes of stuffing).
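The remark about lengths can be checked with a short sketch. The two helper names below are hypothetical; the second one copies an IPv4 structure into a generic one to show that the family field sits at the same place in both. This assumes a system where both structures are 16 bytes, which is the usual case.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the IPv4 address structure
   has exactly the size of the generic address structure. */
int same_size_as_generic(void)
{
    return sizeof(struct sockaddr_in) == sizeof(struct sockaddr);
}

/* Hypothetical helper: fills an IPv4 structure, copies its bytes into a
   generic structure, and returns the family read from the generic one. */
int family_after_copy(void)
{
    struct sockaddr_in ipv4;
    struct sockaddr generic;

    memset(&ipv4, 0, sizeof(ipv4));       /* also clears sin_zero */
    ipv4.sin_family = AF_INET;
    memcpy(&generic, &ipv4, sizeof(generic));
    return generic.sa_family;
}
```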

An IPv6 socket data structure contains an IPv6 address, a flow identifier and a port:

struct in_addr6 {

u_long s6_addr[4];

};

struct sockaddr_in6 {


short sin6_family; /* AF_INET6 */

u_short sin6_port;

u_long sin6_flowlabel;

struct in_addr6 sin6_addr;

};

A generic socket address can contain 14 bytes. For IPv6, this is not large enough. As a result, the

generic socket functions cannot all be used with IPv6. We concentrate on IPv4 in this chapter.

2.4.2 Host and Network order

An integer other than char is stored on several bytes. Intel processors store the bytes in memory in order of increasing weight (least significant byte first); many other processors, such as Motorola's, do the opposite. Thus the 32-bit integer which is written 12 34 56 78 in hexadecimal is stored in memory as

78 56 34 12 Intel
12 34 56 78 Motorola

Note that the bits inside a byte are stored the same way by all machines.

As a networking convention, all multi-byte integers are transmitted most significant byte first. We thus say that the network order is

12 34 56 78 Network Order

Now when a host sends an IP packet, it writes integers such as the IP address and the port number in the packet. They have to be written in network order. For performance reasons, these values are simply copied from the socket address data structure into the packet. Assume for example that we are on a Linux PC (with an Intel processor). The port field in the socket address data structure has to be written in network order, not in host order. Of course, your program should be portable to any processor.

When writing socket programs in C, remember the following: every protocol field which is written in a data structure must be in network order.

This is implemented by using the following library functions.

u_long htonl(u_long hostlong);

u_short htons(u_short hostshort);

u_long ntohl(u_long netlong);

u_short ntohs(u_short netshort);

For example, if you wish to set a port number to 1234 in a socket data structure, you should write

struct sockaddr_in monAddr;

...

monAddr.sin_port = htons(1234);

On a Motorola machine, those functions do nothing. On Intel machines, they simply reverse the byte order.
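The effect of the conversion functions can be observed by looking at the bytes in memory after the call. The sketch below is only an illustration; the helper names are hypothetical, and the fixed-width types from stdint.h are used in place of u_short and u_long.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stores in out[0..1] the bytes of htons(port)
   in the order in which they sit in memory. */
void port_bytes_in_memory(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);        /* host order -> network order */
    memcpy(out, &net, sizeof(net));
}

/* Hypothetical helper: same idea for a 32-bit value and htonl(). */
void long_bytes_in_memory(uint32_t value, unsigned char out[4])
{
    uint32_t net = htonl(value);
    memcpy(out, &net, sizeof(net));
}
```

Whatever the processor, port_bytes_in_memory(0x1234, b) leaves 12 34 (hexadecimal) in b, and long_bytes_in_memory(0x12345678, w) leaves 12 34 56 78 in w, which is the network order.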


2.4.3 Textual representation of addresses

An IP address is stored as a 32-bit integer. We often need to output or input an address as a character

string, in dotted decimal format. The following address format translation functions are useful:

u_long inet_addr (char* adresseAscii);

char *inet_ntoa(struct in_addr adresse);

For example:

char* monAdresse = "128.178.156.23";

struct in_addr myAddress;

myAddress.s_addr = inet_addr (monAdresse);

As you can expect, inet_addr writes the result of its computation in network order, not in host order, so you will not have to convert it again before writing it into a socket data structure.
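A short sketch of the two translation functions, using the address from the example above. The helper names are hypothetical. The first helper shows that the first byte in memory is the first number of the dotted notation (network order); the second converts to binary and back.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Hypothetical helper: returns the first byte, as stored in memory,
   of the binary form of a dotted decimal address. */
int first_byte_in_memory(const char *dotted)
{
    struct in_addr a;
    unsigned char bytes[4];

    a.s_addr = inet_addr(dotted);      /* result is in network order */
    memcpy(bytes, &a.s_addr, 4);
    return bytes[0];
}

/* Hypothetical helper: returns 1 if text -> binary -> text
   gives back the same string. */
int round_trip_ok(const char *dotted)
{
    struct in_addr a;

    a.s_addr = inet_addr(dotted);
    return strcmp(inet_ntoa(a), dotted) == 0;
}
```

Note that inet_ntoa returns a pointer to a static buffer, so the string should be used or copied before the next call.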

2.5 Names

DNS names can be mapped to IP addresses and vice-versa, by means of the following function.

#include <netdb.h>

struct hostent {

char * h_name; /* host name */

char **h_aliases;

int h_addrtype; /* eg IP */

int h_length; /* 4 for IPv4 */

char **h_addr_list; /* ends with NULL */

};

struct hostent *gethostbyname(char* nom);

The only difficulty is (maybe) the definition of struct hostent. Remember here that in C and C++, you declare objects as you use them. For example, if h is of the type struct hostent, then **(h.h_addr_list) has the type char. Thus *(h.h_addr_list) is a pointer to a char, and finally h.h_addr_list is a pointer to a pointer to char. The definition of h_addr_list is illustrated in Figure 2.1. It shows that the field h_addr_list points to a list of IP addresses. In C, a list is represented with pointers. The thing to which h_addr_list points is the pointer to the first address. Remember also that *(h.h_addr_list+3) is more usually written h.h_addr_list[3].

In most cases, when gethostbyname is called, a request is sent to a DNS server. Textual representations of IP addresses are also accepted in place of a DNS name. In such a case, the DNS is not looked up. Instead, gethostbyname does an automatic conversion using inet_addr.

Look up the man pages on your machine to learn about similar functions such as gethostbyaddr,

getservbyname and getservbyport.


Figure 2.1: The list of addresses in struct hostent. (Diagram labels: h.h_addr_list, *h.h_addr_list, **h.h_addr_list, *(*h.h_addr_list+4), *(h.h_addr_list+3).)
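The pointer structure of h_addr_list can be walked as in the following sketch. The helper name count_addresses is hypothetical. It resolves a name, counts the entries of the NULL-terminated list, and copies the dotted decimal form of the first address into a caller-supplied buffer.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <string.h>

/* Hypothetical helper: returns the number of IPv4 addresses found for
   name, or -1 if the name cannot be resolved. The text of the first
   address is copied into first (a buffer of len bytes, len > 0). */
int count_addresses(const char *name, char *first, size_t len)
{
    struct hostent *h = gethostbyname(name);
    int n;

    first[0] = '\0';
    if (h == NULL || h->h_addrtype != AF_INET)
        return -1;
    for (n = 0; h->h_addr_list[n] != NULL; n++) {   /* list ends with NULL */
        if (n == 0) {
            struct in_addr a;
            memcpy(&a, h->h_addr_list[0], sizeof(a));   /* 4 bytes for IPv4 */
            strncpy(first, inet_ntoa(a), len - 1);
            first[len - 1] = '\0';
        }
    }
    return n;
}
```

Called with a textual address such as "127.0.0.1", the function needs no DNS server and reports one address.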

2.6 Using a UDP socket

2.6.1 Binding a socket

When a socket is created with the socket call, it is not yet usable. It must be associated with an interface and a port number. This is called "binding" and is performed with the bind function:

int bind(int sd, struct sockaddr* adresse, int longueur);
/* returns 0 if success, -1 if error */

On TCP/IP hosts, there are always at least two interfaces: the loopback interface (used for communication between processes or threads inside one host) and the "regular" interface (modem or Ethernet in most cases). The default is obtained by setting the address field to INADDR_ANY: the socket then receives traffic arriving on any local interface, and the operating system chooses the outgoing interface for you. This also covers systems with many non-loopback interfaces (for example, a router).

Port assignment is different. A server is expected to use a predefined port number. This is done by setting the port field in the address structure, then calling bind. If the port is already in use, bind fails and returns -1. Standard servers (email server, HTTP server) use standard port numbers, below 1024. These can be bound only by a process with root privilege.

Client programs do not care which port number they obtain. In such a case, the port field is set to 0. The port is assigned by the operating system when bind is called.

When calling bind(sd, ad, longAd), you should always set longAd to the size of the address structure that ad points to (for example, sizeof(struct sockaddr_in)).
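The client-style binding (port field set to 0) can be sketched as follows. The helper name bind_any_port is hypothetical, and getsockname(), which is not discussed in this chapter, is used only to read back the port that the operating system assigned.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: binds a UDP socket to INADDR_ANY with port 0 and
   returns the port chosen by the operating system (in host order),
   or -1 if something fails. The socket is closed before returning. */
int bind_any_port(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int port = -1;
    int sd = socket(AF_INET, SOCK_DGRAM, 0);

    if (sd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* default interface */
    addr.sin_port = htons(0);                   /* let the OS choose */
    if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        getsockname(sd, (struct sockaddr *)&addr, &len) == 0)
        port = ntohs(addr.sin_port);            /* back to host order */
    close(sd);
    return port;
}
```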


2.6.2 Writing to and reading from a UDP socket

The function pair sendto(), recvfrom() can be used to send data to any destination, or receive from any source, over one single socket. The remote system is one of the function arguments.

int sendto (int sd, char* buf, int nbytes, int flags,
struct sockaddr* adrDest, int longAdr);
int recvfrom (int sd, char* buf, int nbytes, int flags,
struct sockaddr* adrSrce, int* longAdr);

The socket is specified by sd; this defines the local address and port number.

sendto is used to send a datagram of length nbytes, designated by the pointer buf. The destination address is specified by adrDest; longAdr must be equal to the size of the structure that adrDest points to.

recvfrom() is similar, except that the remote system is not known when calling recvfrom(). Instead, the remote address and port number are written by recvfrom() into adrSrce. Note that the address length is a pointer to an integer, since this parameter is changed by the function (call by reference, not by value).

The value of flags is normally 0. The value MSG_PEEK allows you to look at the data without removing it from the socket buffer.

If one UDP socket is used to communicate with one single remote socket, then we can omit the remote address and port, after connecting the socket:

int connect (int sd, struct sockaddr* adrDest, int longAdr);

This specifies the remote address. If the connect call succeeds (returned value is not negative), we say that the socket is connected. Note that the socket is still a UDP socket, and UDP is a connectionless protocol. On a connected socket, the function pair send(), recv() can be used to send data to or receive data from the other end:

int send (int sd, char* buf, int nBytes, int flags);
int recv (int sd, char* buf, int nBytes, int flags);

Both send and sendto return the number of bytes accepted by the socket, or -1 if there is an error. We know from another chapter that UDP is unreliable. Thus this number of bytes cannot be assumed to have been safely delivered.

recv and recvfrom return the number of bytes actually received.

By default, recv and recvfrom are blocking: if no data is available in the socket buffer, then the calling process is suspended until some data arrives (see how to change this behaviour below).

The system calls read and write can also be used. See the man pages.
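The calls described in this section can be tried inside a single process over the loopback interface. The sketch below is only an illustration: the helper name peek_then_read is hypothetical, and getsockname() is used to learn the port assigned to the receiving socket. One socket is connected and uses send(); the other receives the same datagram twice, first with MSG_PEEK and then normally.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: sends "bonjour" over loopback UDP, looks at it
   with MSG_PEEK, then reads it. Returns the datagram length if both
   reads saw the same number of bytes, -1 otherwise. */
int peek_then_read(char *out, int len)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int rx, tx, peeked = -1, got = -1;

    rx = socket(AF_INET, SOCK_DGRAM, 0);    /* plays the server */
    tx = socket(AF_INET, SOCK_DGRAM, 0);    /* plays the client */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (rx >= 0 && tx >= 0 &&
        bind(rx, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        getsockname(rx, (struct sockaddr *)&addr, &alen) == 0 &&
        connect(tx, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        send(tx, "bonjour", 8, 0) == 8) {   /* 7 letters + final '\0' */
        /* MSG_PEEK leaves the datagram in the socket buffer */
        peeked = (int)recvfrom(rx, out, len, MSG_PEEK, NULL, NULL);
        got = (int)recvfrom(rx, out, len, 0, NULL, NULL);
    }
    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return (got >= 0 && peeked == got) ? got : -1;
}
```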

2.6.3 Closing a socket

A socket uses operating system resources, therefore it should be closed when it is no longer used. Other, more exotic closing modes exist; see the references.

int close(int sd);

Caution: if several processes are using the same socket (see Section 2.9.2), then calling close on a socket does not actually close it; the operating system simply decrements a count of the number of processes that access the socket. The socket is closed only when all processes that have access to it have called close (either explicitly, or implicitly by exiting).

2.6.4 Example: UDP Server and Client

The example below is a simple client/server pair, used as follows.

% ./udpClient <destAddr> bonjour les amis

%

% ./udpServer &

This causes udpClient to send the three words "bonjour les amis" to a UDP server listening on port SERVER_PORT at address destAddr (a valid DNS name or IP address in dotted decimal notation). The server simply displays on its local output all messages received, with their source address. udpServer is started in detached mode from the command line on the server machine. It runs until you kill the process. The design of both programs is illustrated in Figure 2.2. Note that with these toy programs messages can be lost: there is no guarantee that the server will display anything at all. In a real program, you will have to fix this and implement your own reliability solution.

Figure 2.2: Design of a simple UDP client and server. (Client: socket(), bind(), sendto(), close(). Server: socket(), bind(), recvfrom().)

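/* inet.h */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <unistd.h> /* close */
#include <stdlib.h> /* exit */
#include <string.h> /* memcpy, strlen */
#include <stdio.h>
#define SERVER_PORT 1500
#define MAX_MSG 80
#define MAX_FILE 2048
#define TERM_CHAR '$'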


/*************************************************/

/* udpClient.c */

/*************************************************/

#include "inet.h"

int main(int argc, char *argv[]){

int sd, rc, i; // socket descrip.; ret code

struct sockaddr_in cliAddr, servAddr;

struct hostent *h;

// check command line arguments

if (argc < 3) {

printf("usage: %s <server> <data1>...<dataN>\n",

argv[0]);

exit(1);

}

// resolve server name, print result

// populate address and port

h = gethostbyname(argv[1]);

if (h == NULL){

printf("%s: unknown host '%s'\n", argv[0], argv[1]);

exit(1);

}

printf("%s: trying to send to '%s' (address: %s )\n",

argv[0],

h->h_name,

inet_ntoa(*(struct in_addr *)h->h_addr_list[0]));

servAddr.sin_family = h->h_addrtype;

memcpy((char *) &servAddr.sin_addr.s_addr,h->h_addr_list[0],

h->h_length);

servAddr.sin_port = htons (SERVER_PORT);

// create socket

sd = socket(AF_INET,SOCK_DGRAM,0);

if (sd <0) {

printf("%s: cannot open socket \n",argv[0]);

exit(1);

}

// bind any port number


cliAddr.sin_family = AF_INET;

cliAddr.sin_addr.s_addr = htonl(INADDR_ANY);

cliAddr.sin_port = htons(0);

//for (i=0; i<8 ; i++) cli_addr.sin_zero[i]='\0';

rc=bind(sd, (struct sockaddr *) &cliAddr,

sizeof(cliAddr));

if (rc<0) {

printf("%s cannot bind \n", argv[0]);

exit(1);

}

// send data

for (i=2;i<argc;i++){

rc = sendto (sd, argv[i], strlen(argv[i])+1,0,

(struct sockaddr *) &servAddr, sizeof(servAddr));

if (rc<0){

printf("%s: cannot send data %d\n",argv[0], i-1);

close(sd);

exit(1);

}

} // end for

// close socket and exit

close(sd);

exit(0);

}

/********************************************************/

/* udpServ.c */

/********************************************************/

#include "inet.h"

int main(int argc, char *argv[]){

int sd, rc, i, n, cliLen; // socket descriptor
struct sockaddr_in cliAddr, servAddr;
char msg[MAX_MSG];
// create socket
sd = socket(AF_INET, SOCK_DGRAM, 0);
if (sd <0) {

printf("%s: cannot open socket \n",argv[0]);

exit(1);

}

// bind server port

servAddr.sin_family = AF_INET;

servAddr.sin_addr.s_addr = htonl(INADDR_ANY);

servAddr.sin_port = htons(SERVER_PORT);

rc = bind (sd, (struct sockaddr *) &servAddr,

sizeof(servAddr));

if (rc<0) {

printf("%s cannot bind port number %d \n", argv[0],

SERVER_PORT);

exit(1);

}
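The receive step of the server in Figure 2.2 can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the original listing: the helper name receive_and_format and the output format are hypothetical, and MAX_MSG is repeated here so that the block is self-contained. A server would call the helper in an endless loop and print the resulting line.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>

#define MAX_MSG 80   /* same value as in inet.h */

/* Hypothetical helper: waits for one datagram on the bound UDP socket sd
   and writes "from <address>:<port> : <message>" into line.
   Returns the number of bytes received, or -1 on error. */
int receive_and_format(int sd, char *line, size_t len)
{
    char msg[MAX_MSG];
    struct sockaddr_in cliAddr;
    socklen_t cliLen = sizeof(cliAddr);   /* updated by recvfrom */
    int n;

    n = (int)recvfrom(sd, msg, MAX_MSG - 1, 0,
                      (struct sockaddr *)&cliAddr, &cliLen);
    if (n < 0)
        return -1;
    msg[n] = '\0';                        /* make sure it is a C string */
    snprintf(line, len, "from %s:%d : %s",
             inet_ntoa(cliAddr.sin_addr),
             ntohs(cliAddr.sin_port), msg);
    return n;
}
```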

2.7 TCP sockets

2.7.1 Di�erences with UDP sockets

TCP sockets differ from UDP sockets in a number of ways.

• Since TCP is connection oriented, a TCP socket can be used only after a connection establishment phase. This uses the connect, listen and accept calls.

• A TCP server uses at least two sockets. One socket is non-connected and is used to receive connection requests ("SYN" packets). Once a connection request is accepted, a new socket is created; this new socket is connected to the remote end.

• A TCP socket on which data can be exchanged is always connected, so there is no use for recvfrom and sendto. Otherwise, send and recv are used as with connected UDP sockets.

• When you close a TCP socket, the socket continues to exist for some time (it is in the TIME-WAIT state). This prevents the same connection from being reused for a time long enough to avoid confusion between an old incarnation of the connection and a new one. See the socket FAQ for more details.

• When recv returns 0, this means that the connection is half closed: it can be used for sending, but not for receiving, because the return direction was closed by the other end.

As with UDP sockets, be aware that if several processes access the same socket, then it has to be closed by all processes.

2.7.2 Server side

listen is used to tell the operating system to wait for incoming connection requests on an open socket sd:

int listen(int sd, int queueLength);

The maximum number of connection requests that may be pending in the socket buffer is queueLength.

accept consumes one connection request (if any has arrived; otherwise it blocks). It creates a new socket and returns its socket descriptor (or -1 if there is a problem). The address of the remote end is written into adrDest; as with recvfrom, the length is passed as a pointer because accept modifies it.

int accept(int sd, struct sockaddr* adrDest, int* longueur);

2.7.3 Client side

On the client side, the only additional step is to request a connection establishment, with connect. Note the difference in semantics with connect applied to a UDP socket.

int connect (int sd, struct sockaddr* adrDest, int longueur);
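The server-side and client-side calls can be exercised together inside one process over the loopback interface. The sketch below is only an illustration, with a hypothetical helper name tcp_loopback; getsockname() is used to learn the port of the listening socket. It works in a single process because the operating system completes the connection establishment on behalf of the listening socket before accept is called.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: sends text (with its final '\0') over a loopback
   TCP connection and receives it into out. Returns the number of bytes
   received by the first recv call, or -1 on error. */
int tcp_loopback(const char *text, char *out, int len)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int sd, cliSd, newSd = -1, n = -1;

    sd = socket(AF_INET, SOCK_STREAM, 0);      /* listening socket */
    cliSd = socket(AF_INET, SOCK_STREAM, 0);   /* client socket */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (sd >= 0 && cliSd >= 0 &&
        bind(sd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(sd, 5) == 0 &&
        getsockname(sd, (struct sockaddr *)&addr, &alen) == 0 &&
        connect(cliSd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        (newSd = accept(sd, NULL, NULL)) >= 0 &&   /* new connected socket */
        send(cliSd, text, strlen(text) + 1, 0) >= 0)
        n = (int)recv(newSd, out, len, 0);
    if (newSd >= 0) close(newSd);
    if (cliSd >= 0) close(cliSd);
    if (sd >= 0) close(sd);
    return n;
}
```

For a short message such as "bonjour" the single recv call normally returns everything, but TCP gives no such guarantee for larger amounts of data.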

2.7.4 Example: A simple TCP client server pair

The example below is a straightforward adaptation of the UDP example above. The design of both programs is illustrated in Figure 2.3.

A potential problem with this simplistic program is that it does not correctly read from the socket. Remember that the TCP service is a byte stream, and does not have a concept of packets. Thus, when you read data from a TCP socket, you might well obtain only part of the data that was sent by the other end. A common solution is to read byte by byte (with one recv call for every byte), until you have obtained all the data you need. See the exercises for more details.
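The idea of reading until all the expected data has arrived can be sketched as a loop around recv. The helper name recv_all is hypothetical. It is a variation on the byte-by-byte approach mentioned above: each call asks for all the bytes still missing rather than for a single byte, and the loop continues for as long as recv delivers less than that.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: reads exactly wanted bytes from the connected
   socket sd into buf, calling recv as many times as needed.
   Returns the number of bytes read (less than wanted only if the other
   end closed the connection), or -1 on error. */
int recv_all(int sd, char *buf, int wanted)
{
    int got = 0;

    while (got < wanted) {
        int n = (int)recv(sd, buf + got, wanted - got, 0);
        if (n < 0)
            return -1;      /* error */
        if (n == 0)
            break;          /* other end closed its direction */
        got += n;           /* partial read: keep going */
    }
    return got;
}
```

The receiver must know how many bytes to expect, for example from a fixed message size or from a length field sent first; that convention is part of the application protocol and is an assumption of this sketch.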


Figure 2.3: Design of a simple TCP client and server. (Client: socket(), bind(), connect(), send(), close(). Server: socket(), bind(), listen(), accept(), recv(), close().)

/*********************************************************/

/* tcpClient.c */

/*********************************************************/
#include "inet.h"
int main(int nbArgPlusUn, char *mot[]){
int sd, i; // socket descriptor
int rc; // REXXish return code
struct sockaddr_in cliAddr, servAddr;
struct hostent *h;


if (nbArgPlusUn < 3) {

printf("usage: %s <server> <data1>...<dataN>\n",

mot[0]);

exit(1);

}

// resolve server name, print result

// and populate server address and port


h = gethostbyname(mot[1]);

if (h == NULL){

printf("%s: unknown host '%s'\n", mot[0], mot[1]);

exit(1);

}

printf("%s: now preparing to send data to host '%s' \nat address: %s \n",

mot[0],

h->h_name,

inet_ntoa(*(struct in_addr *) h->h_addr_list[0]));


servAddr.sin_family = h->h_addrtype;
memcpy((char *) &servAddr.sin_addr.s_addr, h->h_addr_list[0],
h->h_length);
servAddr.sin_port = htons(SERVER_PORT);
// create socket

sd = socket(AF_INET,SOCK_STREAM,0);

if (sd <0) {

printf("%s: cannot open socket \n",mot[0]);

exit(1);

}





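// bind any port number
cliAddr.sin_family = AF_INET;
cliAddr.sin_addr.s_addr = htonl(INADDR_ANY);
cliAddr.sin_port = htons(0);
rc=bind(sd, (struct sockaddr *) &cliAddr,
sizeof(cliAddr));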

if (rc<0) {

printf("%s cannot bind \n", mot[0]);

exit(1);

}

// connect to server

rc = connect (sd, (struct sockaddr *) &servAddr,

sizeof(servAddr));

if (rc<0){

printf("%s: cannot connect \n",mot[0]);

close(sd);

exit(1);

}

printf("%s: connecting... \n",mot[0]);

// send arguments one by one

for (i=2; i < nbArgPlusUn; i++){


// send data

rc = send(sd, mot[i], strlen(mot[i])+1, 0);

if (rc<0){

printf("%s: cannot send data%d\n",mot[0], i-1);

close(sd);

exit(1);

}

printf("%s: sent data%d: '%s'\n",mot[0],i, mot[i]);

}// end for


close(sd);

exit(0);

}

/***************************************************/

/* tcpServer.c */

/* */

/* simple sequential test server */

/* connection closed by client */

/***************************************************/

#include "inet.h"


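int main(int nbArgPlusUn, char *mot[]){
int sd, newSd, rc, i, n, cliLen;
// socket descriptors and return code
struct sockaddr_in cliAddr, servAddr;
char msg[MAX_MSG];
// create socket
sd = socket(AF_INET,SOCK_STREAM,0);
if (sd <0) {
printf("%s: cannot open socket \n",mot[0]);
exit(1);
}
// bind server port
servAddr.sin_family = AF_INET;
servAddr.sin_addr.s_addr = htonl(INADDR_ANY);
servAddr.sin_port = htons(SERVER_PORT);
rc = bind (sd, (struct sockaddr *) &servAddr,
sizeof(servAddr));
if (rc<0) {
printf("%s cannot bind port number %d \n", mot[0], SERVER_PORT);
exit(1);
}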

// tell OS to receive SYN packets on sd

// sd is an unconnected socket (associated with local host and port only)

listen(sd, 5);

2.8 Socket Options

2.8.1 Socket Options

Special settings on sockets are made using setsockopt() and getsockopt(). The arguments specify the socket descriptor, the type of option, the option itself, and the data structure containing the parameters of the option. The type expresses the layer at which the option applies (for example: IP or TCP, or generic socket option). The option itself can be, for example: reuse a local port (general socket option), disable Nagle's algorithm (a TCP level option), etc.
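The two options mentioned above can be set as in the following sketch. The helper name set_and_check_options is hypothetical. SO_REUSEADDR is the general socket option for reusing a local port and TCP_NODELAY is the TCP level option that disables Nagle's algorithm; getsockopt is used to read the second one back.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_NODELAY */
#include <unistd.h>

/* Hypothetical helper: sets SO_REUSEADDR and TCP_NODELAY on a new TCP
   socket. Returns 1 if both calls succeed and TCP_NODELAY reads back
   as enabled, 0 otherwise. */
int set_and_check_options(void)
{
    int on = 1, val = 0, ok = 0;
    socklen_t len = sizeof(val);
    int sd = socket(AF_INET, SOCK_STREAM, 0);

    if (sd < 0)
        return 0;
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == 0 &&
        setsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == 0 &&
        getsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &val, &len) == 0)
        ok = (val != 0);
    close(sd);
    return ok;
}
```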

Multicast IP is implemented by means of socket options of type IP. See the example below.

/*********************************************************/

/* mcastClient.c */

/* multicast test client */

/*********************************************************/

#include "inet.h"


int main(int nbArgPlusUn, char *mot[]){
int sd, rc, i; // socket descriptor and ret code
unsigned char ttl = 1; // send multicast with ttl =1 !
struct sockaddr_in cliAddr, servAddr;
struct hostent *h;


if (nbArgPlusUn < 3) {

printf("usage: %s <server> <data1>...<dataN>\n",mot[0]);

exit(1);

}

// resolve server name, print result and populate server
// address and port
h = gethostbyname(mot[1]);
if (h == NULL){

printf("%s: unknown host '%s'\n", mot[0], mot[1]);

exit(1);


}

printf("%s: trying to send data to host '%s' at address: %s \n",

mot[0],

h->h_name,

inet_ntoa(*(struct in_addr *) h->h_addr_list[0]));


servAddr.sin_family = h->h_addrtype;
memcpy((char *) &servAddr.sin_addr.s_addr, h->h_addr_list[0],
h->h_length);
servAddr.sin_port = htons(SERVER_PORT);
// check dest addr is multicast;

if (!IN_MULTICAST(ntohl(servAddr.sin_addr.s_addr))){

printf("%s: dest addr %s is not multicast \n",mot[0],

inet_ntoa(servAddr.sin_addr));

exit(1);

}

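// create socket
sd = socket(AF_INET,SOCK_DGRAM,0);
if (sd <0) {
printf("%s: cannot open socket \n",mot[0]);
exit(1);
}
// bind any port number
cliAddr.sin_family = AF_INET;
cliAddr.sin_addr.s_addr = htonl(INADDR_ANY);
cliAddr.sin_port = htons(0);
rc=bind(sd, (struct sockaddr *) &cliAddr, sizeof(cliAddr));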

if (rc<0) {

printf("%s cannot bind \n", mot[0]);

exit(1);

}

// set ttl on the socket

rc = setsockopt(sd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));

if ( rc < 0) {

printf("%s cannot set ttl = %d IPPROTO_IP, IP_MULTICAST_TTL \n",

mot[0], ttl);

exit(1);

}

// send data

for (i=2;i<nbArgPlusUn;i++){


rc = sendto (sd, mot[i], strlen(mot[i])+1,0,

(struct sockaddr *) &servAddr, sizeof(servAddr));

if (rc<0){

printf("%s: cannot send data %d\n",mot[0], i-1);

close(sd);

exit(1);

}

} // end for


close(sd);

exit(0);

}

/*******************************************************/

/* mcastServ.c */

/* */

/* multicast test server */

/*******************************************************/

#include "inet.h"


int main(int nbArgPlusUn, char *mot[]){
int sd, rc, i, n, cliLen;
struct ip_mreq mreq; // req block for mcast address
struct sockaddr_in cliAddr, servAddr;
struct in_addr mcastAddr;
struct hostent *h;
char msg[MAX_MSG];


if (nbArgPlusUn != 2) {

printf("usage: %s <mcast address>\n",mot[0]);

exit(1);

}

// get multicast address for server to listen to
h = gethostbyname(mot[1]);
if (h == NULL){

printf("%s: unknown group '%s'\n", mot[0], mot[1]);

exit(1);


}

memcpy(&mcastAddr, h ->h_addr_list[0], h->h_length);

// check dest addr is multicast;

if (!IN_MULTICAST(ntohl(mcastAddr.s_addr))){

printf("%s: dest addr %s is not multicast \n",mot[0],

inet_ntoa(mcastAddr));

exit(1);

}

printf("%s: server ready to listen to %s\n", mot[0], mot[1]);

2.8.2 How do I make a socket non-blocking ?

This is controlled by I/O and file control system calls.

fcntl(sd, F_SETFL, FNDELAY)

sets the socket as non-blocking. If an operation on the socket would otherwise block, then it returns

-1 and sets errno to EWOULDBLOCK.
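The behaviour described above can be observed with a short sketch. The helper name nonblocking_read_would_block is hypothetical. The sketch uses the POSIX flag name O_NONBLOCK rather than FNDELAY, on the assumption that both have the same effect on the target system, and it keeps the flags that are already set by reading them first with F_GETFL.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: makes a UDP socket non-blocking and tries to read
   from it while no data is available. Returns 1 if recv returned -1 with
   errno set to EWOULDBLOCK (or EAGAIN), 0 otherwise. */
int nonblocking_read_would_block(void)
{
    char c;
    int flags, n, result = 0;
    int sd = socket(AF_INET, SOCK_DGRAM, 0);

    if (sd < 0)
        return 0;
    flags = fcntl(sd, F_GETFL, 0);              /* current file status flags */
    if (flags >= 0 && fcntl(sd, F_SETFL, flags | O_NONBLOCK) == 0) {
        n = (int)recv(sd, &c, 1, 0);            /* nothing to read: no wait */
        result = (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN));
    }
    close(sd);
    return result;
}
```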

2.9 Concurrency

In many cases it is necessary to monitor several sockets and act on the next available input. This can

be done in a number of ways. We examine two methods: select(), and parallelism (with fork()). A

third family of methods uses signals, which we do not study in this chapter.

2.9.1 Select

select() allows you to wait on several sockets at the same time. It can also be used to implement

timers.

The select() call comes with macros used to set up the masks.

#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int select(int n, fd_set *readfds, fd_set *writefds,

fd_set *exceptfds, struct timeval *timeout);

FD_CLR(int fd, fd_set *set);

FD_ISSET(int fd, fd_set *set);

FD_SET(int fd, fd_set *set);

FD_ZERO(fd_set *set);

The man pages say something like:

select() examines the I/O file descriptor sets whose addresses are passed in readfds, writefds, and exceptfds to see if any of their file descriptors are ready for reading, are ready for writing, or have an exceptional condition pending, respectively. n is the number of bits to be checked in each bit mask that represents a file descriptor; the file descriptors from 0 to n-1 in the file descriptor sets are examined. On return, select() replaces the given file descriptor sets with subsets consisting of those file descriptors that are ready for the requested operation. The return value from the call to select() is the number of ready file descriptors.

The file descriptor sets are stored as bit fields in arrays of integers. The following macros are provided for manipulating such file descriptor sets: FD_ZERO() initializes a file descriptor set fdset to the null set. FD_SET() includes a particular file descriptor fd in fdset. FD_CLR() removes fd from fdset. FD_ISSET() is nonzero if fd is a member of fdset, zero otherwise. The behavior of these macros is undefined if a file descriptor value is less than zero or greater than or equal to FD_SETSIZE. FD_SETSIZE is a constant defined in <sys/select.h>.

If timeout is not a NULL pointer, it specifies a maximum interval to wait for the selection to complete. If timeout is a NULL pointer, the select() blocks indefinitely. To effect a poll, the timeout argument should be a non-NULL pointer, pointing to a zero-valued timeval structure.

RETURN VALUE On success, select returns the number of descriptors contained in the descriptor sets, which may be zero if the timeout expires before anything interesting happens. On error, -1 is returned, and errno is set appropriately; the sets and timeout become undefined, so do not rely on their contents after an error.

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int
main(void)
{
    fd_set rfds;
    struct timeval tv;
    int retval;

    /* Watch stdin (fd 0) to see when it has input. */
    FD_ZERO(&rfds);
    FD_SET(0, &rfds);

    /* Wait up to five seconds. */
    tv.tv_sec = 5;
    tv.tv_usec = 0;

    retval = select(1, &rfds, NULL, NULL, &tv);
    /* Don't rely on the value of tv now! */

    if (retval == -1)
        perror("select()");
    else if (retval)
        printf("Data is available now.\n");
        /* FD_ISSET(0, &rfds) will be true. */
    else
        printf("No data within five seconds.\n");

    exit(0);
}
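The two uses announced at the start of this section (waiting on several descriptors at once, and timers) can be sketched as follows. This is only a minimal illustration: the function names wait_readable() and sleep_ms() and their return conventions are made up for this example, and the descriptors can be sockets, pipes or any other file descriptors below FD_SETSIZE.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait until fd1 or fd2 is readable, for at most timeout_sec seconds.
 * Returns the ready descriptor (fd1 if both are ready),
 * -1 on error, -2 if the timeout expired.
 * (hypothetical helper, not part of the socket library)
 */
int wait_readable(int fd1, int fd2, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    int maxfd = (fd1 > fd2) ? fd1 : fd2;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd1, &rfds);
    FD_SET(fd2, &rfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* first argument: highest descriptor to examine, plus one */
    rc = select(maxfd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return -2;
    return FD_ISSET(fd1, &rfds) ? fd1 : fd2;
}

/*
 * Timer: with no descriptor to watch, select() simply returns when the
 * timeout expires. Returns 0 on success, -1 on error (for example if a
 * signal interrupted the wait).
 */
int sleep_ms(long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return select(0, NULL, NULL, NULL, &tv);
}
```

A server could call wait_readable(sd1, sd2, 5) in its main loop and then call recvfrom() on whichever socket is returned. Note that the sets and the timeout are rebuilt before every call, since select() overwrites them.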

2.9.2 Fork

One alternative method to implement concurrency is to have separate processes; in Unix, use fork().

A process is a running program with all its environment (state information, execution stack, heap,
open file descriptors, etc). Every process has a number allocated to it by the operating system; it
is called the process id. You see it with the ps command. A process knows its process id by using
getpid().

fork() is a very special function. When a process uses fork(), a clone, called the child, is created. The
child differs from the parent in only two points:

• its process id is different
• when fork() completes, the returned value is 0 for the child, and the child's process id for the
parent.

Both child and parent run concurrently. At the end of the fork, both execute the same instruction
(though with different return codes). On a single processor machine, it is the operating system which
decides which process uses the processor, based on complex allocation rules.

Here is an example.

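/* forkEx.c */
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rc;

    printf("%s : I will fork\n", argv[0]);
    fflush(stdout);   /* empty the buffer so the child does not print this line again */
    rc = fork();
    if (rc < 0) {
        /* fork failed: no child was created */
        perror("Fork failed ");
        exit(1);
    }
    else if (rc > 0) {
        /* father: rc is the process id of the child */
        sleep(10);
        printf("%s [%i]: I am the father\n",
               argv[0], (int) getpid());
    }
    else {
        /* son: rc is 0 */
        printf("%s [%i]: I am the son \n",
               argv[0], (int) getpid());
    }
    return 0;
}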

When you run this program, a second process is created. If fork succeeds, then the parent process
sleeps for 10 seconds and prints its message. The child process prints its message immediately. You
are likely to see the child's message before the parent's one. If the parent did not sleep, we could not
predict which message would be printed first.

When a child is created, the two processes become independent. The child inherits a copy of all

structures which existed in the parent process at the time of the fork. After the fork however, the two

processes have their own life; variables in the two processes are updated independent of each other.
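The independence of variables can be checked with a small sketch such as the one below. The function fork_copy_demo() is a made-up name for this illustration; it uses waitpid() and the status macros of <sys/wait.h> only to wait for the child and to learn which value the child had.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that modifies its own copy of `counter`.
 * Stores the child's value in *child_value and returns the parent's
 * value of `counter` after the child has ended, or -1 on error.
 * (hypothetical helper for illustration)
 */
int fork_copy_demo(int *child_value)
{
    int counter = 1;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter = 42;      /* changes the child's copy only */
        _exit(counter);    /* report the child's value as its exit code */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    *child_value = WEXITSTATUS(status);
    return counter;        /* the parent's copy is still 1 */
}
```

After the call, the value stored through child_value is 42 while the returned value is 1: the assignment made by the child never reaches the parent.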

There is a difference however for files and sockets; they exist in the operating system space, not in the
process space. If a process opens a socket and then forks, the socket is not duplicated (but the socket
descriptor is duplicated). The child has access to the same socket, not to a clone. In particular, if a
socket is accessible by several processes, the socket is closed only after all processes have closed the
socket (or exited). Be careful when you fork a process to close all sockets that are not needed in the
father or the child process.
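The following sketch illustrates why unneeded descriptors must be closed. It is an illustration under assumptions, not code from this chapter: it uses socketpair(), which creates two connected Unix-domain sockets, and the function name shared_socket_demo() is made up. The father sees the end of the stream only because both processes have closed their descriptor for the other end.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the father reads the byte sent by the child and then sees
 * end-of-file, 0 if not, -1 on error. (hypothetical helper)
 */
int shared_socket_demo(void)
{
    int sv[2];
    char c;
    int ok;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: the descriptors were copied, the sockets were not */
        close(sv[0]);                  /* not needed in the child */
        if (write(sv[1], "x", 1) != 1)
            _exit(1);
        close(sv[1]);
        _exit(0);
    }
    /* father: sv[1] is not needed here; if it stayed open, the socket
       would remain open after the child's close() and the second read()
       below would block forever */
    close(sv[1]);
    ok = (read(sv[0], &c, 1) == 1 && c == 'x'
          && read(sv[0], &c, 1) == 0);  /* 0 means end-of-file */
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return ok;
}
```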

An alternative to processes is the concept of thread, also called "lightweight process". A thread is
used to have parallelism inside a process. It is more complex to use, since several concurrent threads
have access to all global variables in the process. See for example _beginthread() under Windows.

For concepts of permissions and secure programming, see the Secure Unix Programming FAQ.

The exec family of system calls is commonly used to replace a forked process by the image of another
program. This is how you start another program (see the Unix Programming FAQ).
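A common pattern for starting another program is to fork and then call one of the exec functions in the child. The sketch below is one possible way to do it; the function name run_program() is made up, execvp() is the variant of exec that searches the PATH, and the exit code 127 for a failed exec is a shell convention, not a requirement.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program argv[0] with the arguments argv (NULL-terminated) and
 * wait for it to end. Returns its exit code (127 if exec failed), or -1
 * if fork failed or the program did not end normally.
 * (hypothetical helper for illustration)
 */
int run_program(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);   /* returns only if it failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, with char *args[] = {"sh", "-c", "exit 3", NULL}, the call run_program(args) starts a shell that ends at once with code 3, and the function returns 3.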

2.9.3 Interprocess Communication

Once you have created a child process, you may have to exchange information between processes.
Interprocess communication is a topic of its own. We just mention here two simple ways.

A first method is to use sockets.

A second method is to use wait():


#include <sys/wait.h>

pid_t wait (int *statusRetourne);

/* pid_t is an integer type defined in sys/wait.h */

When a process, say P, calls wait(), the following happens.

• If P has no child, the call returns immediately with a value of -1. The global variable errno is
set to ECHILD (10 on Linux).
• Else, the call blocks until one of the child processes dies (it returns at once if a child is already
dead). The call returns the process id of the child, and the termination status is written at the
location pointed to by statusRetourne.
• The convention for the status is as follows. Only the two low-order bytes have a meaning.
If the low-order byte is 0, then the second low-order byte is equal to the return code of the child.
Else, the child was aborted and the low-order byte is the number of the signal which caused the
child to abort.

To access bytes inside an integer, use masking as below.

#include <stdio.h>

int main(void) {
    unsigned int a = 0x12345678;
    unsigned int masque = 0xFF;   /* selects the low-order byte */

    printf("a=%08X \t masked a = %08X\n", a, masque & a);
    return 0;
}
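Applied to wait(), the status convention described above can be sketched as follows. The function names are made up for this illustration. exit_code_by_masking() assumes the traditional layout of the status word given in this section; the macros WIFEXITED() and WEXITSTATUS() of <sys/wait.h> do the same job without relying on that layout and are the safer choice.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Decode a status word by masking, assuming the traditional layout:
 * low-order byte 0 means normal end, and the next byte is the exit code.
 * Returns the exit code, or -1 if the child was aborted by a signal.
 */
int exit_code_by_masking(int status)
{
    if ((status & 0xFF) != 0)
        return -1;
    return (status >> 8) & 0xFF;
}

/*
 * Fork a child that ends at once with `code` (0..255), wait for it and
 * return the exit code found in the status, or -1 on error.
 */
int child_exit_code(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);              /* child */
    if (wait(&status) != pid)     /* father: blocks until the child dies */
        return -1;
    if (!WIFEXITED(status))       /* portable test for a normal end */
        return -1;
    return WEXITSTATUS(status);   /* portable access to the exit code */
}
```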

2.9.4 Zombie

When a process exits, it normally has an exit code. Unix allows the parent of a process, say P, to
request the exit code of a child process, say C; the parent does so by using wait(). By default the
operating system keeps the status of the dead child process until P asks for it. In the meantime, the
child process is said to be in a "zombie" state: it is dead, but still exists (the process id cannot be
reused). If you do ps on a zombie process, you see a Z flag. If P dies before waiting on the child
process, the child is adopted by init, the common ancestor of all processes, which waits for it.

Zombies may become a problem on a server which is never rebooted. Over time, the space of process
ids may be exhausted. Thus, if you fork a process, you should either make sure the parent waits for the
child, or avoid creation of zombies.

Avoiding creation of zombies is easy on some Unix flavours; you simply tell the system not to keep
zombies. See the Unix Programming FAQ for details. On some systems you need a more sophisticated
signal manipulation.
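Both approaches can be sketched as below. The function names are made up for this illustration. The first one relies on the system honouring signal(SIGCHLD, SIG_IGN), which is the case on many systems but not all; the second one works wherever waitpid() exists and can be called from time to time, for instance once per iteration of the server loop.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Option 1: tell the system not to keep zombies.
 * Returns 0 on success, -1 on error.
 */
int ignore_dead_children(void)
{
    return (signal(SIGCHLD, SIG_IGN) == SIG_ERR) ? -1 : 0;
}

/*
 * Option 2: collect the status of every child that is already dead,
 * without blocking (WNOHANG). Returns the number of children collected.
 */
int reap_children(void)
{
    int n = 0;

    while (waitpid(-1, NULL, WNOHANG) > 0)
        n++;
    return n;
}
```

The two options should not be mixed: once dead children are ignored, waitpid() has nothing left to collect.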

2.9.5 Example: a UDP parallel server

The example below is a straightforward adaptation of the UDP example above. The design of both
programs is illustrated in Figure 2.4.


[Figure 2.4. Client: socket(); bind(); sendto(); close(). Parallel server: socket(); bind();
recvfrom(); fork(); the child does the job and calls close().]

Figure 2.4: Design of a parallel UDP server.

/*********************************************************/
/* udpParServ.c                                          */
/* simple parallel udp server                            */
/*********************************************************/
#include "inet.h"
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *mot[]) {
    int sd, rc, n;
    // socket descriptor and return code
    socklen_t cliLen;
    struct sockaddr_in cliAddr, servAddr;
    char msg[MAX_MSG];
    int sleepTime;
    int pid;

    // create socket
    sd = socket(AF_INET, SOCK_DGRAM, 0);
    if (sd < 0) {
        printf("%s: cannot open socket \n", mot[0]);
        exit(1);
    }

    // bind server port
    servAddr.sin_family = AF_INET;
    servAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servAddr.sin_port = htons(SERVER_PORT);
    rc = bind(sd, (struct sockaddr *) &servAddr,
              sizeof(servAddr));
    if (rc < 0) {
        printf("%s cannot bind port number %d \n",
               mot[0], SERVER_PORT);
        exit(1);
    }

    /* avoid zombies */
    signal(SIGCHLD, SIG_IGN);

    // server infinite loop
    while (1) {
        // receive (keep one byte for the final '\0')
        cliLen = sizeof(cliAddr);
        n = recvfrom(sd, msg, MAX_MSG - 1, 0,
                     (struct sockaddr *) &cliAddr, &cliLen);
        if (n < 0) {
            printf("%s: cannot receive data \n", mot[0]);
            continue;
        }
        msg[n] = '\0';

        // start a new process
        pid = fork();
        if (pid < 0) {
            printf("%s: cannot fork \n", mot[0]);
            continue;
        }
        else if (pid == 0) {
            // son process
            // do the job
            printf("%s[%d]: processing message '%s' from %s\n",
                   mot[0], (int) getpid(), msg,
                   inet_ntoa(cliAddr.sin_addr));
            sleepTime = atoi(msg);
            sleep(sleepTime);
            // close socket and die
            close(sd);
            exit(0);
        }
        // father process: go back and wait for the next message
    } // end of infinite while
    // never reach this line
}

2.9.6 Example: a TCP parallel server

A TCP parallel server is similar to the UDP parallel server, with one important difference. Because
the TCP server creates one socket for every connection (with accept()), we have to be careful to close
the socket so created in the father process. Indeed, remember that a socket closes only when all
processes that have access to it have closed the socket. Figure 2.5 shows the design. The alert reader
will enjoy writing and testing the corresponding code.

[Figure 2.5. Client: socket(); bind(); connect(); send(); ...; close(); exit(). Server: sd = socket();
bind(); listen(); newSd = accept(sd, ); fork(). After the fork, the father calls close(newSd); the
child calls close(sd); recv(); does the job; close(newSd); exit().]

Figure 2.5: Design of a parallel TCP server.
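For readers who want a starting point, here is one possible sketch of the server side of Figure 2.5. It is not the reference solution: the function names tcp_listen(), accept_and_fork() and tcp_par_serv() are made up, the "job" is assumed to be a simple echo of the received bytes, and error handling is kept to a minimum.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a TCP socket bound to `port` (host order) and listening.
   Returns the socket descriptor, or -1 on error. */
int tcp_listen(unsigned short port)
{
    struct sockaddr_in servAddr;
    int sd = socket(AF_INET, SOCK_STREAM, 0);

    if (sd < 0)
        return -1;
    memset(&servAddr, 0, sizeof(servAddr));
    servAddr.sin_family = AF_INET;
    servAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servAddr.sin_port = htons(port);
    if (bind(sd, (struct sockaddr *) &servAddr, sizeof(servAddr)) < 0
        || listen(sd, 5) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}

/* Accept one connection and give it to a child process.
   Returns the process id of the child, or -1 on error. */
pid_t accept_and_fork(int sd)
{
    int newSd = accept(sd, NULL, NULL);
    pid_t pid;

    if (newSd < 0)
        return -1;
    pid = fork();
    if (pid == 0) {
        /* child: works on newSd only */
        char buf[256];
        ssize_t n;

        close(sd);
        /* do the job (assumed here: echo until the client closes) */
        while ((n = recv(newSd, buf, sizeof(buf), 0)) > 0)
            send(newSd, buf, (size_t) n, 0);
        close(newSd);
        _exit(0);
    }
    /* father (also if fork failed): close its descriptor for the
       connection, otherwise the connection never closes */
    close(newSd);
    return pid;
}

/* Server loop, as in the UDP parallel server. Never returns. */
void tcp_par_serv(unsigned short port)
{
    int sd = tcp_listen(port);

    if (sd < 0)
        exit(1);
    signal(SIGCHLD, SIG_IGN);   /* avoid zombies */
    for (;;)
        accept_and_fork(sd);
}
```

The important line is the close(newSd) executed by the father: without it, the connection would stay open after the child has finished, because the father would still hold a descriptor for it.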

2.10 Unix programming miscellanies

2.10.1 How do I use man pages ?

Before applying anything from this document, please check on your system with the man pages. Unix
differs slightly from one system to another, and from one release to another.

Start by typing

man man

to obtain the right syntax. Read the output of man man carefully; it will save you many hours.

Man pages are the place where the truth about a library and a function is to be found. They are
organized in sections and subsections. For example:

• Section 1 is for user commands and application programs
• Section 3 is for library functions
• Section 3, subsection C is for C library functions

Each section has one man page called intro. You invoke man pages inside a given section (for example

in section 3) with the option -s3. Obtain the introduction to section 3 by typing something like

man -s3 intro

Man pages give the section and subsection in the header. For example, if you type man socket you get
a page starting with socket(3N), which means that socket() is in section 3N.

2.10.2 Handling errors in system and library calls

When calling system or socket library calls, you should always catch possible errors. In principle, an
error is indicated by return code -1. The normal procedure after detecting an error is to exit, after
printing an error message.

The first examples in this chapter print error messages on the standard output; this is to keep things
simple. As a good programmer, you should print error messages on stderr, the error output. This is
done, in a primitive way, by using the perror() function:

#include <stdio.h>

void perror(const char* monMessage);
/* displays monMessage on the error output followed by an error
   message generated by the system */

More sophisticated error messages would be implemented using strerror().

What is the error message generated by perror()? There exists a global variable, errno, which stores
the last error code. Most library functions set the value of errno when they return on error. The man
pages tell you whether a function sets errno, and the meanings of the values. To use errno, include
<errno.h> (older programs declare extern int errno instead). For example:

#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main(void) {
    int nbChar = 0;
    char mot[10];

    printf("Please give a word\n");
    nbChar = read(3, mot, 9);   /* keep one byte for the final '\0' */
    printf("errno = %d\n", errno);
    if (nbChar == -1) {
        perror("Could not read correctly");
        exit(1);
    }
    mot[nbChar] = '\0';
    printf("%s\n", mot);
    return 0;
}

This program attempts to read a word from the file with file descriptor 3, which is not open. This
is a stupid program in many respects, but it shows the use of perror and errno. Note that the
program calls printf between read and perror. If printf itself fails, it overwrites errno, and the error
reported by perror is no longer the one from read. On a Linux system, the program was called err; this
version of Unix translates errno=11 into "Try again".

% ./err

Please give a word

errno = 11

Could not read correctly: Try again

Read the man pages to discover more about error code 11.
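As mentioned above, strerror() allows more flexible error messages than perror(). The sketch below is one possible way to use it; the function name read_or_report() and the format of the message are made up for this illustration. It also saves errno right after the failing call, which avoids the problem of a later library call changing it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to read up to len bytes from fd into buf.
 * Returns 0 on success; on error, prints a message on stderr and
 * returns the error code. (hypothetical helper for illustration)
 */
int read_or_report(int fd, char *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n == -1) {
        int saved = errno;   /* save it before any other call changes it */

        fprintf(stderr, "read on descriptor %d failed: %s (errno = %d)\n",
                fd, strerror(saved), saved);
        return saved;
    }
    return 0;
}
```

Called with a descriptor that is not open, the function returns EBADF and prints the corresponding system message.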

Chapter 3

The MAC layer



Chapter 4

The Internet Protocol



Chapter 5

The Transport Layer



Chapter 6

The Application Layer



Chapter 7

Mixed Architectures


