Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Upload: makayla-routon

Post on 15-Jan-2016


Page 1: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Communication

Dr. Ying Lu

ylu@cse.unl.edu

CSCE455/855 Distributed Operating Systems

Page 2: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Giving credit where credit is due:

Most of the lecture notes are from the textbook companion website

Some of the lecture notes are based on slides created by Dr. Krzyzanowski at Rutgers University

I have modified them and added new slides

Page 3: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Layered Protocols (1)

Reference model for networked communication

Page 4: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Layered Protocols (2)

Page 5: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Client-Server Communication

Assume that you are developing a client-server application: how can the two processes (client and server), located on two machines, communicate with each other?

Socket programming: using functions like connect(sd, (struct sockaddr *)&sin, sizeof(sin)), write(sd, buf, strlen(buf)), etc.
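A minimal sketch of the client side of the socket approach, assuming IPv4 over TCP. The helper names make_addr and send_text are hypothetical and not from the slides; socket, connect, write, close, inet_pton and htons are the standard POSIX calls.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill in an IPv4 socket address; returns 0 on success, -1 on a bad address. */
int make_addr(const char *ip, unsigned short port, struct sockaddr_in *sin)
{
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);            /* port in network byte order */
    return inet_pton(AF_INET, ip, &sin->sin_addr) == 1 ? 0 : -1;
}

/* Connect to ip:port and send buf; returns bytes written, or -1 on error. */
ssize_t send_text(const char *ip, unsigned short port, const char *buf)
{
    struct sockaddr_in sin;
    ssize_t n;
    int sd;

    if (make_addr(ip, port, &sin) < 0)
        return -1;
    sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;
    if (connect(sd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        close(sd);
        return -1;
    }
    n = write(sd, buf, strlen(buf));
    close(sd);
    return n;
}
```

Every message exchange is explicit here: the programmer builds the address, opens the connection and writes the bytes. RPC aims to hide exactly this.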

Page 6: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Remote Procedure Calls (RPC)

• Avoid explicit message exchange between processes
• Basic idea is to allow a process on a machine to call procedures on a remote machine
– Make a remote procedure possibly look like a local one
• Original paper on RPC:
– A. Birrell, B. Nelson, “Implementing Remote Procedure Calls”, ACM Transactions on Computer Systems 2(1), 1984

Page 7: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Conventional Procedure Call

• How are parameters passed in a local procedure call?
– E.g.,

    #include <sys/types.h>
    #include <unistd.h>
    ...
    char buf[20];
    size_t nbytes;
    ssize_t bytes_read;
    int fd;
    ...
    nbytes = sizeof(buf);
    bytes_read = read(fd, buf, nbytes);
    ...

Page 8: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Conventional Procedure Call

Figure 4-5. (a) Parameter passing in a local procedure call: the stack before the call to read. (b) The stack while the called procedure is active.

Page 9: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Remote Procedure Calls (RPC)

• How are parameters passed in a remote procedure call, while making it look like a local procedure call?

Page 10: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Client and Server Stubs

Principle of RPC between a client and server program.

Page 11: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Steps of a Remote Procedure Call
1. Client procedure calls client stub in normal way
2. Client stub builds message, calls local OS
3. Client's OS sends message to remote OS
4. Remote OS gives message to server stub
5. Server stub unpacks parameters, calls server
6. Server does work, returns result to the stub
7. Server stub packs it in message, calls local OS
8. Server's OS sends message to client's OS
9. Client's OS gives message to client stub
10. Stub unpacks result, returns to client
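The ten steps can be traced in a single-process sketch. Everything here is an assumption made for illustration: the remote procedure is a hypothetical add, and the function transport stands in for both operating systems and the network by handing the request straight to the server stub.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Step 6: the actual server procedure. */
static int32_t add_impl(int32_t a, int32_t b) { return a + b; }

/* Steps 4-7: server stub unpacks parameters, calls the server, packs the result. */
static size_t server_stub(const unsigned char *req, unsigned char *reply)
{
    uint32_t a, b, r;

    memcpy(&a, req, 4);
    memcpy(&b, req + 4, 4);
    r = htonl((uint32_t)add_impl((int32_t)ntohl(a), (int32_t)ntohl(b)));
    memcpy(reply, &r, 4);
    return 4;
}

/* Steps 3 and 8: stand-in for the two operating systems and the network. */
static size_t transport(const unsigned char *req, unsigned char *reply)
{
    return server_stub(req, reply);
}

/* Steps 1, 2, 9 and 10: client stub; the caller sees an ordinary function. */
int32_t add(int32_t a, int32_t b)
{
    unsigned char req[8], reply[4];
    uint32_t na = htonl((uint32_t)a), nb = htonl((uint32_t)b), r;

    memcpy(req, &na, 4);                 /* build the request message */
    memcpy(req + 4, &nb, 4);
    if (transport(req, reply) != 4)
        return 0;                        /* a real stub would report the failure */
    memcpy(&r, reply, 4);                /* unpack the result */
    return (int32_t)ntohl(r);
}
```

A caller simply writes add(2, 3); nothing in the call site reveals that the arguments were packed into a message and unpacked on the other side.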

Page 12: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Passing Value Parameters (1)

Steps involved in doing remote computation through RPC


Page 13: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Passing Value Parameters (2)

Page 14: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Passing Value Parameters (3)

a) Original message on the Pentium (little-endian)
b) The message after receipt on the SPARC (big-endian)
Note: the little numbers in boxes indicate the address of each byte

Page 15: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Passing Value Parameters (3)

a) Original message on the Pentium (little-endian)
b) The message after receipt on the SPARC (big-endian)
c) The message after being inverted (integer 5, string: “LLIJ”)
Note: the little numbers in boxes indicate the address of each byte
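Case (c) can be reproduced with a short sketch. It assumes an 8-byte message holding the integer 5 in little-endian order followed by the four characters "JILL"; the function names are hypothetical. Reversing every 32-bit word repairs the integer but scrambles the string, which is why the receiver must know the type of each field.

```c
#include <stddef.h>
#include <string.h>

/* Reverse the four bytes of one 32-bit word in place. */
void reverse_word(unsigned char w[4])
{
    unsigned char t;

    t = w[0]; w[0] = w[3]; w[3] = t;
    t = w[1]; w[1] = w[2]; w[2] = t;
}

/* Blindly reverse every complete word of a message (the "inversion" of case c). */
void invert_message(unsigned char *msg, size_t len)
{
    size_t i;

    for (i = 0; i + 4 <= len; i += 4)
        reverse_word(msg + i);
}
```

Applied to the bytes 5, 0, 0, 0, 'J', 'I', 'L', 'L', the first word becomes 0, 0, 0, 5 (the big-endian encoding of 5) while the second becomes 'L', 'L', 'I', 'J'.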

Page 16: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Passing reference parameters

– What is call by value and call by reference?
– Example: call foo(int, int *) or read(fd, buf, nbytes)
– Call by copy/restore
– The dreaded “pointer problem”
• Linked list
• Complex graph

Figure: call by copy/restore between Machine A (holding a and b) and Machine B (holding a’ and b’).
Call: foo(a, &b) on Machine A becomes foo(a’, &b’) on Machine B; copy value a and contents of loc b into a’ and loc b’.
Return: copy contents of loc b’ into b.
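Call by copy/restore can be sketched in one process, assuming a hypothetical foo that adds a to *b. The local variables a_copy and b_copy play the role of a’ and b’ on Machine B; the memcpy calls stand in for the messages that carry the values across.

```c
#include <string.h>

/* The remote procedure: runs on machine B and only sees B's memory. */
static void foo(int a, int *b) { *b = *b + a; }

/* Hypothetical stand-in for the whole RPC: copy in, call, copy back. */
void call_foo_copy_restore(int a, int *b)
{
    int a_copy, b_copy;                   /* a' and b' in machine B's address space */

    memcpy(&a_copy, &a, sizeof a);        /* copy value a into a' */
    memcpy(&b_copy, b, sizeof *b);        /* copy contents of loc b into loc b' */
    foo(a_copy, &b_copy);                 /* call foo(a', &b') on machine B */
    memcpy(b, &b_copy, sizeof *b);        /* on return, copy loc b' back into b */
}
```

This works because the stub knows the size of what b points to. For a linked list or a general graph the stub cannot tell how far to follow the pointers, which is the pointer problem named above.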

Page 17: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Marshalling

Values must cross the network

Machine formats differ
– Integer byte order
• Little-endian or big-endian
– Floating point format
• IEEE 754 or not

Marshalling: transferring a data structure used in a remote procedure call from one address space to another.

Define a “network format”, for example following XDR (eXternal Data Representation) standard http://www.ietf.org/rfc/rfc1832.txt
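A minimal sketch of such a network format, assuming 32-bit integers sent in big-endian order (the order XDR uses for its 4-byte integers) and character data copied unchanged. The helper names put_int, get_int and put_string match those used in the stub examples of these slides, but the bodies below are assumptions, not the course's implementation.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Append a 32-bit integer in network (big-endian) order; returns the new write position. */
char *put_int(char *p, int32_t v)
{
    uint32_t n = htonl((uint32_t)v);

    memcpy(p, &n, sizeof n);
    return p + sizeof n;
}

/* Read a big-endian 32-bit integer into *v; returns the new read position. */
char *get_int(char *p, int32_t *v)
{
    uint32_t n;

    memcpy(&n, p, sizeof n);
    *v = (int32_t)ntohl(n);
    return p + sizeof n;
}

/* Copy len raw bytes; character data needs no byte swapping. */
char *put_string(char *p, const char *s, size_t len)
{
    memcpy(p, s, len);
    return p + len;
}
```

Because both sides convert to and from the same wire format, neither needs to know the other's native byte order.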

Page 18: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

RPC: The basic mechanism

Figure: the client process (client routines, client stub, RPC runtime, network routines) and the server process (server routines, server stub, RPC runtime, network routines), each running above its kernel. The numbers 1 to 6 in the figure mark the steps below.

1. Client calls a local procedure on the client stub
2. The client stub acts as a proxy and marshalls the call and the args
3. The client stub sends this to the remote system (via TCP/UDP)
4. The server stub unmarshalls the call and args from the client
5. The server stub calls the actual procedure on the server
6. The server stub marshalls the reply and sends it back to the client

Source: R. Stevens, Unix Network Programming (IPC) Vol 2, 1998

Page 19: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Example 1: A Time Server Interface

    struct time {
        int seconds;
        int minutes;
        int hours;
        int day;
        int month;
        int year;
        char timezone[4];
    };

    int gettime(struct time *t);
    int settime(struct time *t);

Page 20: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

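Example 1: Client Stub for Settime

    int settime(struct time *t)
    {
        char *p, message[32];
        int stat;

        /* marshal the operation code and the seven integer fields (28 bytes) */
        p = message;
        p = put_int(p, SETTIME);
        p = put_int(p, t->seconds);
        p = put_int(p, t->minutes);
        p = put_int(p, t->hours);
        p = put_int(p, t->day);
        p = put_int(p, t->month);
        p = put_int(p, t->year);
        p = put_string(p, t->timezone, 4);   /* last 4 bytes of the 32-byte message */

        /* send the request; the reply overwrites message */
        stat = do_operation("time_server", message, 32);
        if (stat == SUCCESS)
            get_int(message, &stat);         /* unmarshal the remote return value */
        return stat;
    }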

Page 21: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Example 1: Server Stub (1)

    void main_loop(void)
    {
        char *p, message[32];
        int len, op_code;
        struct time t;

        for (;;) {
            len = receive_request(message, 32);
            if (len < 4) {
                /* error handling code */
            }
            p = message;
            p = get_int(p, &op_code);        /* pass the address so op_code is filled in */
            switch (op_code) {
            case SETTIME:
                if (len < 32) {
                    /* error handling code */
                }
                p = get_int(p, &t.seconds);
                p = get_int(p, &t.minutes);
                p = get_int(p, &t.hours);
                p = get_int(p, &t.day);
                p = get_int(p, &t.month);
                p = get_int(p, &t.year);
                p = get_string(p, t.timezone, 4);
                len = settime(&t);           /* call the real procedure; len holds its status */
                put_int(message, len);
                len = 4;                     /* the reply is a single 4-byte integer */
                break;
            case GETTIME:
                /* code for unmarshalling and calling gettime */
                break;
            }
            send_reply(message, len);
        }
    }

Page 22: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Writing a Client and a Server

Figure 4-12. The steps in writing a client and a server in DCE RPC

DCE: Distributed Computing Environment

Page 23: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Binding a Client to a Server (1)

• Registration of a server makes it possible for a client to locate the server and bind to it.

• Server location is done in two steps:
1. Locate the server’s machine.
2. Locate the server on that machine.

Page 24: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Binding a Client to a Server (2)

Figure 4-13. Client-to-server binding in DCE.

Page 25: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Asynchronous RPC (1)

a) The interconnection between client and server in a traditional RPC
b) The interaction using asynchronous RPC

Page 26: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Asynchronous RPC (2)

A client and server interacting through two asynchronous RPCs


Page 27: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

LPC vs. RPC

• Global variables: client and server run in separate address spaces, so they cannot share them

• Client and server fail independently
– RPC: requires code to deal with server crashes

Page 28: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

When Things Go Wrong

• Semantics of remote procedure calls
– Local procedure call: exactly once
• How many times may a remote procedure be executed per call?
• A remote procedure may be executed:
– 0 times: server crashed or server process died before executing server code
– 1 time: everything worked well
– 1 or more times: due to excess latency or lost reply from server, client retransmitted
• Exactly once may be difficult to achieve with RPC

Page 29: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

RPC Semantics

• Most RPC systems will offer either:
– at least once semantics
– or at most once semantics
• Understand the application:
– Which applications are suited to “at least once”?
• Idempotent functions: may be run any number of times without harm
– Which applications are suited to “at most once”?
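The difference between the two semantics shows up when a request is retransmitted. The sketch below is an illustration under stated assumptions: a hypothetical deposit operation that is not idempotent, and a server that recognises duplicates by a client-chosen request id. None of these names come from a particular RPC system.

```c
#include <stdint.h>

#define MAX_SEEN 64

struct server {
    int balance;                 /* state changed by a non-idempotent call */
    uint32_t seen[MAX_SEEN];     /* request ids already executed */
    int reply[MAX_SEEN];         /* cached reply for each seen id */
    int nseen;
};

/* Non-idempotent operation: running it twice changes the outcome. */
static int deposit(struct server *s, int amount)
{
    s->balance += amount;
    return s->balance;
}

/* At-least-once: every (re)transmitted request is executed again. */
int handle_at_least_once(struct server *s, uint32_t id, int amount)
{
    (void)id;                    /* the id is ignored, so duplicates are not detected */
    return deposit(s, amount);
}

/* At-most-once: a repeated id gets the cached reply without re-execution. */
int handle_at_most_once(struct server *s, uint32_t id, int amount)
{
    int i, r;

    for (i = 0; i < s->nseen; i++)
        if (s->seen[i] == id)
            return s->reply[i];
    r = deposit(s, amount);
    if (s->nseen < MAX_SEEN) {   /* sketch only: entries never expire */
        s->seen[s->nseen] = id;
        s->reply[s->nseen] = r;
        s->nseen++;
    }
    return r;
}
```

If the client resends request 1 because the first reply was lost, the at-least-once handler deposits twice while the at-most-once handler deposits once. For an idempotent operation, such as reading a value, the duplicate would be harmless and the cheaper at-least-once handler would do.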

Page 30: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Useful Links for RPC

RFC 1831: RPC Specification

RFC 1832: XDR Specification

Page 31: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

In-Class Exercises (I)

1. C has a construction called a union, in which a field of a record (called a struct in C) can hold any one of several alternatives. At run time, there is no sure-fire way to tell which one is in there. Does this feature of C have any implications for remote procedure call? Explain your answer.

If the runtime system cannot tell what type of value is in the field, it cannot marshal it correctly. Thus unions cannot be tolerated in an RPC system unless there is a tag field that unambiguously tells what the variant field holds.
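A sketch of the tag-field remedy, similar in spirit to the discriminated unions of XDR. The type names, tag values and the marshal_value function are hypothetical; the wire layout (4-byte tag followed by a 4-byte payload) is an assumption chosen for brevity.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

enum value_tag { TAG_INT = 1, TAG_TEXT = 2 };

struct value {
    enum value_tag tag;          /* discriminant: tells which member is valid */
    union {
        int32_t i;
        char text[4];
    } u;
};

/* Marshal the tag, then the active member; returns bytes written, 0 for an unknown tag. */
size_t marshal_value(const struct value *v, unsigned char *out)
{
    uint32_t n = htonl((uint32_t)v->tag);

    memcpy(out, &n, 4);
    switch (v->tag) {
    case TAG_INT:
        n = htonl((uint32_t)v->u.i);       /* integers are byte-swapped */
        memcpy(out + 4, &n, 4);
        return 8;
    case TAG_TEXT:
        memcpy(out + 4, v->u.text, 4);     /* characters are copied as-is */
        return 8;
    }
    return 0;
}
```

Without the tag, the stub would have to guess between swapping the four bytes and copying them unchanged, and a wrong guess corrupts the value.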

Page 32: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

In-Class Exercises (II)

2. One way to handle parameter conversion in RPC systems is to have each machine send parameters in its native representation, with the other one doing the translation, if need be. The native system could be indicated by a code in the first byte. However, since locating the first byte in the first word is precisely the problem, can this actually work?

First of all, when one computer sends byte 0, it always arrives in byte 0. Thus the destination computer can simply access byte 0 (using a byte instruction) and the code will be in it. An alternative scheme is to put the code in all the bytes of the first word. Then no matter which byte is examined, the code will be there.
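The first scheme can be sketched as follows, assuming a 5-byte message: one format-code byte followed by a 32-bit integer in the sender's native order. The code values FMT_LITTLE and FMT_BIG and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { FMT_LITTLE = 1, FMT_BIG = 2 };   /* hypothetical format codes */

/* Report this machine's byte order by inspecting the first byte of the integer 1. */
int native_format(void)
{
    uint32_t one = 1;
    unsigned char first;

    memcpy(&first, &one, 1);
    return first == 1 ? FMT_LITTLE : FMT_BIG;
}

/* Sender: byte 0 carries the format code, bytes 1..4 the integer in native order. */
void send_native(uint32_t v, unsigned char msg[5])
{
    msg[0] = (unsigned char)native_format();
    memcpy(msg + 1, &v, 4);
}

/* Receiver: read byte 0 with a byte access, then decode according to the code. */
uint32_t receive_any(const unsigned char msg[5])
{
    const unsigned char *b = msg + 1;

    if (msg[0] == FMT_LITTLE)
        return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
               (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
    return (uint32_t)b[3] | (uint32_t)b[2] << 8 |
           (uint32_t)b[1] << 16 | (uint32_t)b[0] << 24;
}
```

The receiver never interprets byte 0 as part of a word, so its own byte order does not affect where the code is found.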

Page 33: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems
Page 34: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Appendix

Page 35: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Example: Writing a client and a server (1)

    /* interface.x */
    /* Example Interface Definition */
    struct square_in {
        long arg;
    };

    struct square_out {
        long result;
    };

    program SQUARE_PROG {
        version SQUARE_VERS {
            square_out SQUAREPROC(square_in) = 1;  /* Procedure number = 1 */
        } = 1;                                     /* Version number = 1 */
    } = 0x31230000;                                /* program number */

Source: R. Stevens, Unix Network Programming (IPC) Vol 2, 1998

Page 36: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Example: Writing a client and a server (2)

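Figure: rpcgen takes interface.x and generates interface.h, interface_clnt.c (client stub), interface_xdr.c and interface_svc.c (server stub). The client is built from the client main program, interface_clnt.c, interface_xdr.c and the runtime library; the server is built from Server.c, interface_svc.c, interface_xdr.c and the runtime library.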

Source: R. Stevens, Unix Network Programming (IPC) Vol 2, 1998

Page 37: Communication Dr. Ying Lu ylu@cse.unl.edu CSCE455/855 Distributed Operating Systems

Unix Network Programming

UNIX Network Programming, Volume 2, Second Edition: Interprocess Communications, Prentice Hall, 1999, ISBN 0-13-081081-9