chapter 6 i/o multiplexing: select and poll function
Chapter 6
I/O Multiplexing: select and poll functions
Abstract
– Introduction
– I/O models (5 types)
– Synchronous I/O versus asynchronous I/O
– select function
– Batch input
– shutdown function
– pselect function
– poll function
Introduction
The TCP client handles two inputs at the same time: standard input and a TCP socket.
– While the client was blocked in a call to read, the server process was killed.
– The server TCP sends a FIN to the client TCP, but the client never sees the FIN because it is blocked reading from standard input.
=> We need the capability to tell the kernel that we want to be notified when one or more I/O conditions are ready. This capability is I/O multiplexing (select and poll).

When I/O multiplexing is used:
– A client handles multiple descriptors (interactive input and a network socket).
– A client handles multiple sockets (rare).
– A TCP server handles both a listening socket and its connected sockets.
– A server handles both TCP and UDP.
– A server handles multiple services and multiple protocols.
I/O Models
– Blocking I/O
– Nonblocking I/O
– I/O multiplexing (select and poll)
– Signal-driven I/O (SIGIO)
– Asynchronous I/O (the POSIX.1 aio_ functions)

Two distinct phases for an input operation:
1. Waiting for the data to be ready
2. Copying the data from the kernel to the process
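Blocking I/O
[Figure: blocking I/O model. The application calls recvfrom (system call). No datagram is ready, so the kernel waits for data. When a datagram is ready, the kernel copies it from kernel to user. When the copy is complete, recvfrom returns OK and the application processes the datagram. The process blocks in the call to recvfrom during both phases.]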
Nonblocking I/O
[Figure: nonblocking I/O model. The application calls recvfrom (system call). While no datagram is ready, each call returns the error EWOULDBLOCK immediately. Once a datagram is ready, the next recvfrom copies the data from kernel to user and returns OK, and the application processes the datagram. The process repeatedly calls recvfrom, waiting for an OK return (polling).]
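The polling behaviour of the nonblocking model can be sketched as follows. This is a minimal illustration, not code from the chapter: it uses read on an arbitrary descriptor instead of recvfrom on a datagram socket, and the names set_nonblocking and try_read are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a descriptor into nonblocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * One polling attempt (hypothetical helper).
 * Returns the byte count (0 means end-of-file),
 * -1 if no data is ready yet, or -2 on any other error.
 */
long try_read(int fd, char *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    n = read(fd, buf, len);
    if (n >= 0)
        return (long) n;
    if (errno == EWOULDBLOCK || errno == EAGAIN)
        return -1;              /* phase 1 not finished: caller polls again */
    return -2;
}
```

A caller would invoke try_read in a loop until it stops returning -1, which is exactly the busy polling that makes this model wasteful of CPU time.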
I/O multiplexing (select and poll)
[Figure: I/O multiplexing model. The application calls select (system call). No datagram is ready, so the kernel waits for data; the process blocks in the call to select, waiting for one of possibly many sockets to become readable. When a datagram is ready, select returns readable. The application then calls recvfrom (system call); the process blocks while the data is copied from kernel to user into the application buffer. When the copy is complete, recvfrom returns OK and the application processes the datagram.]
Signal-driven I/O (SIGIO)
[Figure: signal-driven I/O model. The application establishes a SIGIO signal handler with the sigaction system call, which returns immediately; the process continues executing while the kernel waits for data. When a datagram is ready, the kernel delivers SIGIO. The signal handler calls recvfrom (system call); the process blocks while the data is copied from kernel to user into the application buffer. When the copy is complete, recvfrom returns OK and the application processes the datagram.]
Asynchronous I/O
[Figure: asynchronous I/O model. The application calls aio_read (system call), which returns immediately even though no datagram is ready; the process continues executing. The kernel waits for data and, when a datagram is ready, copies it from kernel to user. When the copy is complete, the kernel delivers the signal specified in aio_read, and the signal handler processes the datagram.]
Comparison of the I/O Models
[Figure: comparison of the five I/O models across the two phases.]
Model               Phase 1: wait for data         Phase 2: copy data from kernel to user
Blocking            initiate, blocked              blocked until complete
Nonblocking         check, check, check, ...       blocked until complete
I/O multiplexing    check, blocked until ready     initiate, blocked until complete
Signal-driven I/O   notification                   initiate, blocked until complete
Asynchronous I/O    initiate                       notification when complete
The first four models handle the 1st phase differently and the 2nd phase the same. Asynchronous I/O handles both phases.
Synchronous I/O, Asynchronous I/O
– Synchronous I/O: causes the requesting process to be blocked until that I/O operation (recvfrom) completes (blocking, nonblocking, I/O multiplexing, signal-driven I/O).
– Asynchronous I/O: does not cause the requesting process to be blocked (asynchronous I/O).
select function
Allows the process to instruct the kernel to wait for any one of multiple events to occur and to wake up the process only when one or more of these events occurs or when a specified amount of time has passed (readable, writable, time expired).

#include <sys/select.h>
#include <sys/time.h>
int select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timeval *timeout);

struct timeval {
    long tv_sec;    /* seconds */
    long tv_usec;   /* microseconds */
};
Timeout conditions of the select function
– Wait forever: return only when a descriptor is ready (timeval pointer = NULL).
– Wait up to a fixed amount of time: return when a descriptor is ready, but no later than the time given in the timeval structure.
– Do not wait at all: return immediately after checking the descriptors (timeval value = 0).
The wait is normally interrupted if the process catches a signal and returns from the signal handler.

Descriptor arguments
– readset: descriptors to check for readability
– writeset: descriptors to check for writability
– exceptset: descriptors to check for exception conditions. Two exception conditions are supported:
  – the arrival of out-of-band data for a socket
  – the presence of control status information to be read from the master side of a pseudo-terminal
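The three timeout modes of select (NULL pointer, zero value, positive value) can be sketched with a single helper. This is an illustrative sketch only: read_when_ready is a hypothetical name, only the read set is used, and the millisecond parameter is a convenience chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait for fd to become readable, then read from it.
 *   timeout_ms <  0: wait forever (NULL timeval pointer)
 *   timeout_ms == 0: check once and return immediately
 *   timeout_ms >  0: wait at most that many milliseconds
 * Returns the byte count from read, -2 on timeout, -1 on error.
 */
long read_when_ready(int fd, char *buf, size_t len, long timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    struct timeval *tvp = NULL;
    int nready;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }
    nready = select(fd + 1, &rset, NULL, NULL, tvp);
    if (nready < 0)
        return -1;      /* error, or interrupted by a caught signal */
    if (nready == 0)
        return -2;      /* time expired, nothing readable */
    return (long) read(fd, buf, len);
}
```

Because some systems modify the timeval structure on return, the sketch rebuilds it on every call rather than reusing one across calls.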
Descriptor sets
fd_set: an array of integers, with each bit in each integer corresponding to a descriptor.

void FD_ZERO(fd_set *fdset);            /* clear all bits in fdset */
void FD_SET(int fd, fd_set *fdset);     /* turn on the bit for fd in fdset */
void FD_CLR(int fd, fd_set *fdset);     /* turn off the bit for fd in fdset */
int  FD_ISSET(int fd, fd_set *fdset);   /* is the bit for fd on in fdset? */
Example of the descriptor set functions
fd_set rset;
FD_ZERO(&rset);      /* initialize: all bits off */
FD_SET(1, &rset);    /* turn on the bit for fd 1 */
FD_SET(4, &rset);    /* turn on the bit for fd 4 */
FD_SET(5, &rset);    /* turn on the bit for fd 5 */
maxfdp1 argument
– Specifies the number of descriptors to be tested.
– Its value is the maximum descriptor to be tested, plus one (hence the name maxfdp1). Example: for fd 1, 2 and 5, maxfdp1 is 6.
– The constant FD_SETSIZE, defined by including <sys/select.h>, is the number of descriptors in the fd_set datatype (1024).
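The "maximum descriptor plus one" rule can be sketched as a helper that builds a set from a list of descriptors and returns the matching maxfdp1. The name build_set is hypothetical and is not part of the select API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Fill set with fds[0..n-1] and return the maxfdp1 value to pass to select. */
int build_set(const int *fds, size_t n, fd_set *set)
{
    int maxfd = -1;
    size_t i;

    FD_ZERO(set);
    for (i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;   /* 0 when the list is empty */
}
```

The assert guards against descriptors at or above FD_SETSIZE, which must not be stored in an fd_set.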
Conditions that cause a socket to be ready for select
Condition                                    Readable?  Writable?  Exception?
Data to read                                 yes        -          -
Read half of the connection closed           yes        -          -
New connection ready for listening socket    yes        -          -
Space available for writing                  -          yes        -
Write half of the connection closed          -          yes        -
Pending error                                yes        yes        -
TCP out-of-band data                         -          -          yes
Conditions handled by select in str_cli (Section 5.5)
– The earlier version could be blocked in the call to fgets when something happened on the socket.
– The new version blocks in a call to select instead, waiting for either standard input or the socket to be readable.
[Figure: conditions handled by select in str_cli. The client selects for readability on either standard input or the socket. Standard input delivers data or EOF. The socket becomes readable when the peer TCP delivers data, a FIN (EOF), or an RST (error).]
Three conditions are handled with the socket
– The peer TCP sends data: the socket becomes readable and read returns a value greater than 0.
– The peer TCP sends a FIN (the peer process terminates): the socket becomes readable and read returns 0 (end-of-file).
– The peer TCP sends an RST (the peer host has crashed and rebooted): the socket becomes readable, read returns -1, and errno contains the specific error code.
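The three outcomes can be told apart from the return value of a single read. The sketch below is an illustration under stated assumptions: read_event and enum sock_event are hypothetical names, and the helper works on any stream socket (for a quick local experiment a Unix-domain socketpair can stand in for a TCP connection).

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

enum sock_event { SOCK_DATA, SOCK_EOF, SOCK_ERROR };

/* Read once from a readable socket and classify the outcome. */
enum sock_event read_event(int sockfd, char *buf, size_t len, ssize_t *nread)
{
    ssize_t n;

    assert(buf != NULL && nread != NULL);
    n = read(sockfd, buf, len);
    *nread = n;
    if (n > 0)
        return SOCK_DATA;   /* peer sent data */
    if (n == 0)
        return SOCK_EOF;    /* peer sent a FIN */
    return SOCK_ERROR;      /* for example an RST; errno holds the code */
}
```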
Implementation of the str_cli function using select

#include "unp.h"    /* capitalized wrappers such as Select and Readline */

void str_cli(FILE *fp, int sockfd)
{
    int maxfdp1;
    fd_set rset;
    char sendline[MAXLINE], recvline[MAXLINE];

    FD_ZERO(&rset);
    for ( ; ; ) {
        FD_SET(fileno(fp), &rset);
        FD_SET(sockfd, &rset);
        maxfdp1 = max(fileno(fp), sockfd) + 1;
        Select(maxfdp1, &rset, NULL, NULL, NULL);

        if (FD_ISSET(sockfd, &rset)) {      /* socket is readable */
            if (Readline(sockfd, recvline, MAXLINE) == 0)
                err_quit("str_cli: server terminated prematurely");
            Fputs(recvline, stdout);
        }

        if (FD_ISSET(fileno(fp), &rset)) {  /* input is readable */
            if (Fgets(sendline, MAXLINE, fp) == NULL)
                return;                     /* all done */
            Writen(sockfd, sendline, strlen(sendline));
        }
    }
}
Stop and wait
The client sends a line to the server and then waits for the reply.
[Figure: time line of stop-and-wait mode, time 0 through time 7. A request travels from the client to the server and the reply travels back to the client before the next request is sent.]
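Batch input
[Figure: filling the pipe between client and server in batch mode.
Time 7: request8, request7, request6, request5 are on the way to the server; reply1, reply2, reply3, reply4 are on the way to the client.
Time 8: request9, request8, request7, request6 are on the way to the server; reply2, reply3, reply4, reply5 are on the way to the client.]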
The problem with our revised str_cli function
– After handling an end-of-file on input, the function returns to the main function, and the program terminates.
– However, in batch mode, there are still other requests and replies in the pipe.
We need a way to close one half of the TCP connection:
– Send a FIN to the server, telling it we have finished sending data, but leave the socket descriptor open for reading. This is done with the shutdown function.
shutdown function
– Closes one half of the TCP connection (example: send a FIN to the server, but leave the socket descriptor open for reading).
– close function: decrements the descriptor’s reference count and closes the socket only if the count reaches 0; it terminates both directions (reading and writing).
– shutdown function: terminates just one of them (reading or writing).
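[Figure: calling shutdown to close half of a TCP connection. The client calls write, write, shutdown, which sends data, data, and a FIN to the server; on the server, read returns > 0, read returns > 0, then read returns 0, and the server sends an ACK of the data and FIN. The server then calls write, write, close, which sends data, data, and a FIN to the client; on the client, read returns > 0, read returns > 0, then read returns 0, and the client sends an ACK of the data and FIN.]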
#include <sys/socket.h>
int shutdown(int sockfd, int howto);
/* returns: 0 if OK, -1 on error */

howto argument
– SHUT_RD: the read half of the connection is closed
– SHUT_WR: the write half of the connection is closed
– SHUT_RDWR: both halves are closed
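A half-close with SHUT_WR can be sketched as below. This is an assumption-laden illustration: send_last_and_half_close is a hypothetical helper, and a Unix-domain socketpair can stand in for a TCP connection when trying it locally.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send the final request, then close only the write half.
 * The descriptor stays open for reading the remaining replies.
 * Returns 0 on success, -1 on error.
 */
int send_last_and_half_close(int sockfd, const char *msg, size_t len)
{
    assert(msg != NULL);
    if (write(sockfd, msg, len) != (ssize_t) len)
        return -1;
    return shutdown(sockfd, SHUT_WR);   /* peer will see end-of-file */
}
```

After the call, the peer reads the last request followed by end-of-file (read returns 0), yet it can still write replies, and the caller can still read them from the same descriptor.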
str_cli function using select and shutdown

#include "unp.h"

void str_cli(FILE *fp, int sockfd)
{
    int maxfdp1, stdineof;
    fd_set rset;
    char sendline[MAXLINE], recvline[MAXLINE];

    stdineof = 0;
    FD_ZERO(&rset);
    for ( ; ; ) {
        if (stdineof == 0)                  /* select on standard input for readability */
            FD_SET(fileno(fp), &rset);
        FD_SET(sockfd, &rset);
        maxfdp1 = max(fileno(fp), sockfd) + 1;
        Select(maxfdp1, &rset, NULL, NULL, NULL);

        if (FD_ISSET(sockfd, &rset)) {      /* socket is readable */
            if (Readline(sockfd, recvline, MAXLINE) == 0) {
                if (stdineof == 1)
                    return;                 /* normal termination */
                else
                    err_quit("str_cli: server terminated prematurely");
            }
            Fputs(recvline, stdout);
        }

        if (FD_ISSET(fileno(fp), &rset)) {  /* input is readable */
            if (Fgets(sendline, MAXLINE, fp) == NULL) {
                stdineof = 1;
                Shutdown(sockfd, SHUT_WR);  /* send FIN */
                FD_CLR(fileno(fp), &rset);
                continue;
            }
            Writen(sockfd, sendline, strlen(sendline));
        }
    }
}
TCP echo server
Rewrite the server as a single process that uses select to handle any number of clients, instead of forking one child per client.
Data structures for the TCP server (1): before the first client has established a connection
client[]: [0] = -1, [1] = -1, [2] = -1, ..., [FD_SETSIZE-1] = -1
rset:     fd0 = 0, fd1 = 0, fd2 = 0, fd3 = 1
maxfd + 1 = 4
* fd 0, 1, 2 => stdin, stdout, stderr
* fd3 => listening socket fd
Data structures for the TCP server (2): after the first client connection is established
client[]: [0] = 4, [1] = -1, [2] = -1, ..., [FD_SETSIZE-1] = -1
rset:     fd0 = 0, fd1 = 0, fd2 = 0, fd3 = 1, fd4 = 1
maxfd + 1 = 5
* fd3 => listening socket fd
* fd4 => client socket fd
Data structures for the TCP server (3): after the second client connection is established
client[]: [0] = 4, [1] = 5, [2] = -1, ..., [FD_SETSIZE-1] = -1
rset:     fd0 = 0, fd1 = 0, fd2 = 0, fd3 = 1, fd4 = 1, fd5 = 1
maxfd + 1 = 6
* fd3 => listening socket fd
* fd4 => client1 socket fd
* fd5 => client2 socket fd
Data structures for the TCP server (4): after the first client terminates its connection
client[]: [0] = -1, [1] = 5, [2] = -1, ..., [FD_SETSIZE-1] = -1
rset:     fd0 = 0, fd1 = 0, fd2 = 0, fd3 = 1, fd4 = 0, fd5 = 1
maxfd + 1 = 6
* fd3 => listening socket fd
* fd4 => client1 socket fd, deleted
* fd5 => client2 socket fd
* maxfd does not change
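The bookkeeping shown in these four snapshots can be sketched in isolation from the networking code. The struct and function names below (struct clients, clients_init, clients_add, clients_remove) are hypothetical and only illustrate the client[] array, the descriptor set, and maxfd.

```c
#include <assert.h>
#include <sys/select.h>

struct clients {
    int fd[FD_SETSIZE];   /* connected descriptors, -1 marks a free slot */
    int maxi;             /* highest index in use, -1 if none */
    int maxfd;            /* highest descriptor seen so far */
    fd_set allset;        /* descriptors to hand to select */
};

void clients_init(struct clients *c, int listenfd)
{
    int i;

    for (i = 0; i < FD_SETSIZE; i++)
        c->fd[i] = -1;
    c->maxi = -1;
    c->maxfd = listenfd;
    FD_ZERO(&c->allset);
    FD_SET(listenfd, &c->allset);
}

/* Store connfd in the first free slot; returns its index, or -1 if full. */
int clients_add(struct clients *c, int connfd)
{
    int i;

    assert(connfd >= 0 && connfd < FD_SETSIZE);
    for (i = 0; i < FD_SETSIZE; i++) {
        if (c->fd[i] < 0) {
            c->fd[i] = connfd;
            FD_SET(connfd, &c->allset);
            if (connfd > c->maxfd)
                c->maxfd = connfd;
            if (i > c->maxi)
                c->maxi = i;
            return i;
        }
    }
    return -1;
}

/* Free slot i; maxfd is deliberately left unchanged, as in the snapshots. */
void clients_remove(struct clients *c, int i)
{
    assert(i >= 0 && i < FD_SETSIZE && c->fd[i] >= 0);
    FD_CLR(c->fd[i], &c->allset);
    c->fd[i] = -1;
}
```

The caller remains responsible for closing the descriptor before or after clients_remove; the sketch only updates the tables.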
TCP echo server using a single process

#include "unp.h"

int main(int argc, char **argv)
{
    int i, maxi, maxfd, listenfd, connfd, sockfd;
    int nready, client[FD_SETSIZE];
    ssize_t n;
    fd_set rset, allset;
    char line[MAXLINE];
    socklen_t clilen;
    struct sockaddr_in cliaddr, servaddr;

    listenfd = Socket(AF_INET, SOCK_STREAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(SERV_PORT);

    Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));
    Listen(listenfd, LISTENQ);

    maxfd = listenfd;               /* initialize */
    maxi = -1;                      /* index into client[] array */
    for (i = 0; i < FD_SETSIZE; i++)
        client[i] = -1;             /* -1 indicates available entry */
    FD_ZERO(&allset);
    FD_SET(listenfd, &allset);

    for ( ; ; ) {
        rset = allset;              /* structure assignment */
        nready = Select(maxfd+1, &rset, NULL, NULL, NULL);

        if (FD_ISSET(listenfd, &rset)) {    /* new client connection */
            clilen = sizeof(cliaddr);
            connfd = Accept(listenfd, (SA *) &cliaddr, &clilen);
#ifdef NOTDEF
            printf("new client: %s, port %d\n",
                   Inet_ntop(AF_INET, &cliaddr.sin_addr, 4, NULL),
                   ntohs(cliaddr.sin_port));
#endif
            for (i = 0; i < FD_SETSIZE; i++)
                if (client[i] < 0) {
                    client[i] = connfd;     /* save descriptor */
                    break;
                }
            if (i == FD_SETSIZE)
                err_quit("too many clients");

            FD_SET(connfd, &allset);        /* add new descriptor to set */
            if (connfd > maxfd)
                maxfd = connfd;             /* for select */
            if (i > maxi)
                maxi = i;                   /* max index in client[] array */

            if (--nready <= 0)
                continue;                   /* no more readable descriptors */
        }

        for (i = 0; i <= maxi; i++) {       /* check all clients for data */
            if ( (sockfd = client[i]) < 0)
                continue;
            if (FD_ISSET(sockfd, &rset)) {
                if ( (n = Readline(sockfd, line, MAXLINE)) == 0) {
                    /* connection closed by client */
                    Close(sockfd);
                    FD_CLR(sockfd, &allset);
                    client[i] = -1;
                } else
                    Writen(sockfd, line, n);

                if (--nready <= 0)
                    break;                  /* no more readable descriptors */
            }
        }
    }
}
Denial-of-service attacks
– A malicious client connects to the server, sends 1 byte of data (other than a newline), and then goes to sleep.
=> The server calls readline and is blocked waiting for the rest of the line, so it cannot service any other client.
Solutions:
– Use nonblocking I/O.
– Have each client serviced by a separate thread of control (spawn a process or a thread to service each client).
– Place a timeout on the I/O operation.
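One way to place a timeout on the I/O operation is the SO_RCVTIMEO socket option, sketched below. This is only one of several possible techniques and is not necessarily the one the slides have in mind; set_read_timeout and read_or_timeout are hypothetical helpers, and support for the option can vary between systems and socket types.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Limit how long a read on sockfd may block.
 * A value of 0 ms disables the timeout. Returns 0 on success, -1 on error.
 */
int set_read_timeout(int sockfd, long ms)
{
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Read up to len bytes; returns -2 if the timeout expired first. */
long read_or_timeout(int sockfd, char *buf, size_t len)
{
    ssize_t n = read(sockfd, buf, len);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return (long) n;
}
```

With this in place, a client that sends one byte and then sleeps stalls the server for at most the configured interval, after which the server can drop the connection or move on to other clients.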
pselect function
#include <sys/select.h>
#include <signal.h>
#include <time.h>
int pselect(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timespec *timeout, const sigset_t *sigmask);

The pselect function was invented by POSIX.1g.

struct timespec {
    time_t tv_sec;    /* seconds */
    long   tv_nsec;   /* nanoseconds */
};
sigmask => pointer to a signal mask.
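The two differences from select, the nanosecond timespec and the sigmask argument, can be sketched as follows. The helper name pwait_readable is hypothetical, and blocking SIGINT during the wait is just an example policy chosen for this sketch; the feature-test macro is an assumption that may be unnecessary on some systems.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>   /* pipe, read, write for callers of this helper */

/*
 * Wait at most ms milliseconds for fd to become readable.
 * The mask passed to pselect is installed only for the duration of the
 * call, so SIGINT stays blocked exactly while the process is waiting.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int pwait_readable(int fd, long ms)
{
    fd_set rset;
    struct timespec ts;
    sigset_t mask;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE && ms >= 0);
    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;

    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);

    n = pselect(fd + 1, &rset, NULL, NULL, &ts, &mask);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}
```

Passing NULL for sigmask would leave the signal mask untouched, which makes pselect behave like select with a finer-grained timeout.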
poll function
Similar to select, but provides additional information when dealing with STREAMS devices.

#include <poll.h>
int poll(struct pollfd *fdarray, unsigned long nfds, int timeout);
/* returns: count of ready descriptors, 0 on timeout, -1 on error */

struct pollfd {
    int   fd;        /* descriptor to check */
    short events;    /* events of interest on fd */
    short revents;   /* events that occurred on fd */
};

Each element of fdarray specifies the conditions to be tested for a given descriptor fd:
– events: the conditions to be tested
– revents: the status of that descriptor
Input events and returned revents for poll
Constant     Input to events?  Result from revents?  Description
POLLIN       yes               yes                   Normal or priority band data can be read
POLLRDNORM   yes               yes                   Normal data can be read
POLLRDBAND   yes               yes                   Priority band data can be read
POLLPRI      yes               yes                   High-priority data can be read
POLLOUT      yes               yes                   Normal data can be written
POLLWRNORM   yes               yes                   Normal data can be written
POLLWRBAND   yes               yes                   Priority band data can be written
POLLERR      -                 yes                   An error has occurred
POLLHUP      -                 yes                   Hangup has occurred
POLLNVAL     -                 yes                   Descriptor is not an open file
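The division between events (input) and revents (result) can be sketched for a single descriptor. The helper name poll_readable is hypothetical; it requests only POLLIN, but the returned bits may also include POLLERR, POLLHUP or POLLNVAL, which are reported without being requested.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>   /* pipe, read, write for callers of this helper */

/*
 * Test one descriptor for readable data.
 * Returns the revents bits (0 on timeout) or -1 on error.
 */
int poll_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;   /* condition we want tested */
    pfd.revents = 0;       /* filled in by poll */

    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    return n == 0 ? 0 : pfd.revents;
}
```

A caller would typically test the result with a bit mask, for example (result & POLLIN), rather than comparing it for equality.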
Timeout value for poll
The timeout argument specifies how long the function is to wait before returning.
Timeout value   Description
INFTIM          Wait forever
0               Return immediately, do not block
> 0             Wait the specified number of milliseconds
If we are no longer interested in a particular descriptor, we just set the fd member of its pollfd structure to a negative value.
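Two details of poll can be sketched together: a timeout of 0 never blocks, and an entry whose fd is negative is skipped. The helper names poll_ignore and poll_now are hypothetical. Note that POSIX specifies -1 as the wait-forever timeout; INFTIM is not defined on every system.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>   /* pipe, read, write for callers of these helpers */

/* Stop watching entry i without shrinking or reshuffling the array. */
void poll_ignore(struct pollfd *fds, int i)
{
    assert(fds != NULL && i >= 0);
    fds[i].fd = -1;        /* poll skips entries with a negative fd */
    fds[i].revents = 0;
}

/* Check all entries once; returns the number ready, or -1 on error. */
int poll_now(struct pollfd *fds, nfds_t n)
{
    return poll(fds, n, 0);   /* timeout 0: return immediately */
}
```

Keeping the slot in place means indexes into the array stay stable, which is convenient when the array mirrors another per-client table.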