DISTRIBUTED OPERATING
SYSTEMS
Sandeep Kumar Poonia, Head of Dept. CS/IT
B.E., M.Tech., UGC-NET
LM-IAENG, LM-IACSIT,LM-CSTA, LM-AIRCC, LM-SCIEI, AM-UACEE
ISSUES IN CLIENT-SERVER COMMUNICATION
Addressing
Blocking versus non-blocking
Buffered versus unbuffered
Reliable versus unreliable
Server architecture: concurrent versus
sequential
Scalability
ADDRESSING ISSUES
Question: how is the server
located?
Hard-wired address
Machine address and process
address are known a priori
Broadcast-based
Server chooses address from a
sparse address space
Client broadcasts request
Can cache response for future
Locate address via name
server
[Diagram: locating the server via a hard-wired address, via broadcast, and via a name server (NS)]
BLOCKING VERSUS NON-BLOCKING
Blocking communication (synchronous)
Send blocks until message is actually sent
Receive blocks until message is actually received
BLOCKING VERSUS NON-BLOCKING
Non-blocking communication (asynchronous)
Send returns immediately
Receive does not block; it returns immediately, whether or not a message has arrived
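The blocking/non-blocking distinction can be seen directly in the sockets API. A minimal Python sketch (UDP over localhost, purely illustrative):

```python
import select
import socket

# Create a UDP receiver on localhost to illustrate the difference.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))   # let the OS pick a free port
recv_sock.setblocking(False)       # non-blocking mode

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# A non-blocking receive with nothing queued returns immediately
# (raising BlockingIOError) instead of blocking the caller.
try:
    recv_sock.recvfrom(1024)
    got_data_early = True
except BlockingIOError:
    got_data_early = False

send_sock.sendto(b"request", recv_sock.getsockname())

# Wait until the datagram is deliverable; event-driven servers use
# select/epoll loops for exactly this purpose.
select.select([recv_sock], [], [], 5.0)
data, _ = recv_sock.recvfrom(1024)
print(got_data_early, data)   # False b'request'
```

A blocking socket (the default) would simply have parked the caller inside `recvfrom` until the datagram arrived.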
BUFFERING ISSUES
Unbuffered communication
Server must call receive before client can call send
BUFFERING ISSUES
Buffered communication
Client sends to a mailbox
Server receives from a mailbox
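The mailbox idea can be sketched in a few lines: because the buffer decouples the two sides, the client's send can complete before the server ever calls receive. A toy Python sketch using an in-process queue as the mailbox:

```python
import queue
import threading

mailbox = queue.Queue()   # the buffer that decouples sender from receiver

def client():
    # Client can send even though the server has not called receive yet.
    mailbox.put("request-1")

def server(results):
    # Server receives from the mailbox whenever it is ready.
    results.append(mailbox.get(timeout=5))

client()                       # send happens first, message is buffered
results = []
t = threading.Thread(target=server, args=(results,))
t.start()
t.join()
print(results)                 # ['request-1']
```

With unbuffered communication the `put` would have to rendezvous with a waiting `get`, which is exactly the "server must call receive before client can call send" constraint above.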
RELIABILITY
Unreliable channel
Need acknowledgements (ACKs)
Applications handle ACKs
ACKs for both request and reply
RELIABILITY
Reliable channel
Reply acts as ACK for request
Explicit ACK for response
SERVER ARCHITECTURE
Sequential
Serve one request at a time
Can service multiple requests by employing events and
asynchronous communication
Concurrent
Server spawns a process or thread to service each request
Can also use a pre-spawned pool of threads/processes (Apache)
Thus servers could be
Pure-sequential, event-based, thread-based, process-based
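The pre-spawned pool variant can be sketched with a thread pool; each request is handed to an idle worker instead of spawning a fresh thread per request:

```python
from concurrent.futures import ThreadPoolExecutor

def handle_request(req):
    # Stand-in for real request processing (parse, compute, reply).
    return f"handled {req}"

# Pre-spawned pool of worker threads, as in Apache's worker model:
# requests are dispatched to already-running threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle_request, ["a", "b", "c"]))

print(results)  # ['handled a', 'handled b', 'handled c']
```

A pure-sequential server would instead loop `for req in requests: handle_request(req)`, serving one request at a time.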
SCALABILITY
Question: how can you scale server capacity?
Buy a bigger machine!
Replicate
Distribute data and/or algorithms
Ship code instead of data
Cache
TO PUSH OR PULL ?
Client-pull architecture
Clients pull data from servers (by sending requests)
Example: HTTP
Pro: stateless servers, failures are easy to handle
Con: limited scalability
Server-push architecture
Servers push data to client
Example: video streaming
Pro: more scalable
Con: stateful servers, less resilient to failure
REMOTE PROCEDURE CALLS
Goal: Make distributed computing look like centralized
computing
Allow remote services to be called as procedures
Transparency with regard to location, implementation,
language
Issues
How to pass parameters
Bindings
Semantics in the face of errors
Two classes: integrated into programming language
and separate
CONVENTIONAL PROCEDURE CALL
a) Parameter passing in a
local procedure call: the
stack before the call to
read
b) The stack while the called
procedure is active
PARAMETER PASSING
Local procedure parameter passing
Call-by-value
Call-by-reference: arrays, complex data structures
Remote procedure calls simulate this through:
Stubs – proxies
Flattening – marshalling
Related issue: global variables are not allowed
in RPCs
CLIENT AND SERVER STUBS
Principle of RPC between a client and server program.
STUBS
Client makes procedure call (just like a local
procedure call) to the client stub
Server is written as a standard procedure
Stubs take care of packaging arguments and
sending messages
Packaging parameters is called marshalling
Stub compiler generates stub automatically
from specs in an Interface Definition Language
(IDL)
Simplifies programmer task
A PAIR OF STUBS
Client-side stub
Looks like local server
function
Same interface as local
function
Bundles arguments into
message, sends to
server-side stub
Waits for reply, unbundles results, returns
Server-side stub
Looks like local client
function to server
Listens on a socket for
message from client
stub
Un-bundles arguments
to local variables
Makes a local function
call to server
Bundles result into reply
message to client stub
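In practice the two stubs are generated by the stub compiler, but their behavior can be sketched by hand. A minimal Python sketch, where `add` stands in for a remote procedure and the network is simulated by calling the server stub directly:

```python
import struct

# Wire format (illustrative): two 32-bit ints in, one 32-bit int out,
# network (big-endian) byte order throughout.

def client_stub_add(a, b, transport):
    msg = struct.pack("!ii", a, b)          # bundle arguments into a message
    reply = transport(msg)                  # send request, wait for reply
    (result,) = struct.unpack("!i", reply)  # unbundle the result
    return result

def server_stub_add(msg):
    a, b = struct.unpack("!ii", msg)        # unbundle args to local variables
    result = add(a, b)                      # local call to the real procedure
    return struct.pack("!i", result)        # bundle result into reply message

def add(a, b):                              # server written as a standard procedure
    return a + b

# In-process "transport" standing in for the socket layer.
print(client_stub_add(2, 3, server_stub_add))  # 5
```

The caller of `client_stub_add` sees the same interface as a local `add`; all messaging is hidden in the stubs, which is the transparency the slides describe.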
STEPS OF A REMOTE PROCEDURE CALL
1. Client procedure calls client stub in normal way
2. Client stub builds message, calls local OS
3. Client's OS sends message to remote OS
4. Remote OS gives message to server stub
5. Server stub unpacks parameters, calls server
6. Server does work, returns result to the stub
7. Server stub packs it in message, calls local OS
8. Server's OS sends message to client's OS
9. Client's OS gives message to client stub
10. Stub unpacks result, returns to client
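The ten steps above can be exercised end to end with Python's standard-library XML-RPC, where the `ServerProxy` object plays the client stub and `SimpleXMLRPCServer` plays the server stub plus dispatcher (an illustrative sketch, not the only RPC runtime one could use):

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Server side: a standard procedure registered with the RPC runtime.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: attribute access on the proxy returns a callable that
# builds the message, sends it, waits for the reply, and unpacks the
# result -- steps 1 through 10 in one line.
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.add(2, 3)
server.shutdown()
print(result)  # 5
```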
EXAMPLE OF AN RPC
[Figure 2-8]
RPC – ISSUES
How to make the “remote” part of RPC invisible
to the programmer?
What are semantics of parameter passing?
E.g., pass by reference?
How to bind (locate & connect) to servers?
How to handle heterogeneity?
OS, language, architecture, …
How to make it go fast?
RPC MODEL
A server defines the service interface using an
interface definition language (IDL)
the IDL specifies the names, parameters, and types
for all client-callable server procedures
A stub compiler reads the IDL declarations and
produces two stub functions for each server
function
Server-side and client-side
RPC MODEL (CONTINUED)
Linking:
Server programmer implements the server’s
functions and links with the server-side stubs
Client programmer implements the client program
and links it with client-side stubs
Operation:
Stubs manage all of the details of remote
communication between client and server
RPC STUBS
A client-side stub is a function that looks to the client
as if it were a callable server function
I.e., same API as the server’s implementation of the function
A server-side stub looks like a caller to the server
I.e., like a hunk of code invoking the server function
The client program thinks it’s invoking the server
but it’s calling into the client-side stub
The server program thinks it’s called by the client
but it’s really called by the server-side stub
The stubs send messages to each other to make the RPC happen transparently
MARSHALLING ARGUMENTS
Marshalling is the packing of function
parameters into a message packet
the RPC stubs call type-specific functions to
marshal or unmarshal the parameters of an RPC
Client stub marshals the arguments into a message
Server stub unmarshals the arguments and uses them to
invoke the service function
on return:
the server stub marshals return values
the client stub unmarshals return values, and returns to
the client program
MARSHALLING
Problem: different machines have different data formats
Intel: little endian, SPARC: big endian
Solution: use a standard representation
Example: external data representation (XDR)
Problem: how do we pass pointers?
If it points to a well-defined data structure, pass a copy and the server
stub passes a pointer to the local copy
What about data structures containing pointers?
Prohibit
Chase pointers over network
Marshalling: transform parameters/results into a byte stream
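Marshalling of a call like `read(fd, buf, nbytes)` can be sketched with Python's `struct` module: fixed-size fields are packed in network byte order, and a variable-length field (here a hypothetical file name, standing in for a string parameter) is length-prefixed:

```python
import struct

HEADER = "!iiI"  # two signed 32-bit ints + unsigned 32-bit string length

def marshal(fd, nbytes, name):
    data = name.encode("utf-8")
    # Fixed-size fields first, then the length-prefixed string,
    # all in network (big-endian) byte order.
    return struct.pack(HEADER, fd, nbytes, len(data)) + data

def unmarshal(msg):
    fd, nbytes, slen = struct.unpack_from(HEADER, msg)
    name = msg[struct.calcsize(HEADER):][:slen].decode("utf-8")
    return fd, nbytes, name

msg = marshal(3, 1024, "motd.txt")
print(unmarshal(msg))  # (3, 1024, 'motd.txt')
```

Note that only values cross the wire: the pointer `buf` itself is never sent, only the data (or buffer size) it refers to, which is exactly the pointer problem the slide raises.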
ISSUE #1 — REPRESENTATION OF DATA
Big endian vs. little endian
[Diagram: a 32-bit integer as sent by a Pentium (little endian), as received by a SPARC (big endian), and after byte inversion]
REPRESENTATION OF DATA (CONTINUED)
IDL must also define representation of data on
network
Multi-byte integers
Strings, character codes
Floating point, complex, …
…
example: Sun’s XDR (external data representation)
Each stub converts machine representation to/from
network representation
Clients and servers must not try to cast data!
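The endian mismatch, and the fix of always converting to a fixed network representation, can be demonstrated in a few lines (Python sketch; `!` in `struct` formats means big-endian network order, the same convention XDR uses for 32-bit integers):

```python
import struct

value = 0x01020304

little = value.to_bytes(4, "little")   # byte layout on a little-endian CPU (Pentium)
big = value.to_bytes(4, "big")         # byte layout on a big-endian CPU (SPARC)
print(little.hex(), big.hex())         # 04030201 01020304

# Stubs sidestep the mismatch by always converting to one agreed
# network representation; '!' selects big-endian, XDR-style.
wire = struct.pack("!I", value)
assert wire == big
(decoded,) = struct.unpack("!I", wire)
assert decoded == value
```

Interpreting `little` on a big-endian machine without conversion would yield 0x04030201, which is why clients and servers must go through the stubs rather than cast raw bytes.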
ISSUE #2 — POINTERS AND REFERENCES
read(int fd, char* buf, int nbytes)
Pointers are only valid within one address space
Cannot be interpreted by another process, even on the same machine!
Pointers and references are ubiquitous in C, C++
Even in Java implementations!
POINTERS AND REFERENCES —
RESTRICTED SEMANTICS
Option: call by value
Sending stub dereferences pointer, copies result to
message
Receiving stub conjures up a new pointer
Option: call by result
Sending stub provides buffer, called function puts
data into it
Receiving stub copies data to caller’s buffer as
specified by pointer
POINTERS AND REFERENCES —
RESTRICTED SEMANTICS (CONTINUED)
Option: call by value-result
Caller’s stub copies data to message, then copies result back to client buffer
Server stub keeps data in own buffer, server updates it; server sends data back in reply
Not allowed:
Call by reference
Aliased arguments
RPC BINDING
Binding is the process of connecting the client
to the server
the server, when it starts up, exports its interface
identifies itself to a network name server
tells RPC runtime that it is alive and ready to accept calls
the client, before issuing any calls, imports the
server
RPC runtime uses the name server to find the location of
the server and establish a connection
The import and export operations are explicit in
the server and client programs
BINDING: COMMENTS
Exporting and importing incurs overheads
Binder can be a bottleneck
Use multiple binders
Binder can do load balancing
RPC - BINDING
Static binding: hard-coded stub
Simple, efficient
not flexible
stub recompilation necessary if the location of the server changes
use of redundant servers not possible
Dynamic binding: name and directory server
load balancing
IDL used for binding
flexible
redundant servers possible
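The export/import protocol behind dynamic binding can be sketched with a toy in-memory registry (all names here are illustrative; a real binder is a separate network service):

```python
# Toy name/directory server: maps service names to (host, port) addresses.
registry = {}

def export(service, address):
    # Server registers itself with the binder at start-up.
    registry.setdefault(service, []).append(address)

def import_service(service):
    # Client looks the service up before its first call. With several
    # redundant servers registered, the binder could load-balance here
    # instead of always returning the first address.
    return registry[service][0]

export("add", ("10.0.0.5", 9000))
export("add", ("10.0.0.6", 9000))   # redundant server for the same service
print(import_service("add"))        # ('10.0.0.5', 9000)
```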
RPC - DYNAMIC BINDING
[Diagram: dynamic binding. At start-up the server stub registers the service with the name and directory server. The client stub binds by looking the service up at the name server, then (un)marshals and (de)serializes parameters and exchanges messages with the server through communication modules on each side; on the server, a dispatcher selects the stub, which invokes the server procedure. Numbered arrows (1-13) trace one call from client procedure to server and back. Diagram credit: Wolfgang Gassler, Eva Zangerle]
FAILURE SEMANTICS
Client unable to locate server: return error
Lost request messages: simple timeout mechanisms
Lost replies: timeout mechanisms
Make operation idempotent
Use sequence numbers, mark retransmissions
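The sequence-number technique can be sketched as a server-side duplicate filter: the server caches the reply for each (client, sequence number) pair, so a retransmitted request replays the cached reply instead of re-executing a non-idempotent operation (illustrative Python sketch):

```python
# Cache of replies keyed by (client_id, sequence number).
executed = {}
side_effects = []   # records each real execution of the operation

def server_handle(client_id, seq, amount):
    key = (client_id, seq)
    if key in executed:
        return executed[key]          # retransmission: replay cached reply
    side_effects.append(amount)       # the non-idempotent work (e.g. a deposit)
    reply = f"deposited {amount}"
    executed[key] = reply
    return reply

r1 = server_handle("c1", 7, 100)      # original request
r2 = server_handle("c1", 7, 100)      # client timed out and retransmitted
print(r1 == r2, side_effects)         # True [100]
```

The deposit happens once even though the request arrived twice, which is what makes timeout-based retransmission safe for non-idempotent operations.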
FAILURE SEMANTICS
Server failures: did failure occur before or after
operation?
At-least-once semantics: keep trying until a reply has been received
At-most-once semantics: give up immediately and report failure
No guarantee: the RPC may have been carried out anywhere from 0 to a large number of times
Exactly once: desirable but difficult to achieve
FAILURE SEMANTICS
Client failure: what happens to the server
computation?
Referred to as an orphan
Extermination: log at client stub and explicitly kill orphans
Overhead of maintaining disk logs
Reincarnation: Divide time into epochs between failures
and delete computations from old epochs
Gentle reincarnation: upon a new epoch broadcast, try to
locate owner first (delete only if no owner)
Expiration: give each RPC a fixed quantum T; explicitly
request extensions
Periodic checks with client during long computations
IMPLEMENTATION ISSUES
Choice of protocol [affects communication costs]
Use existing protocol (UDP) or design from scratch
Packet size restrictions
Reliability in case of multiple packet messages
Flow control
Copying costs are dominant overheads
Need at least 2 copies per message
From client to the NIC, and from the server's NIC to the server
As many as 7 copies
Stack in stub – message buffer in stub – kernel – NIC – medium –NIC – kernel – stub – server
Scatter-gather operations can reduce overheads
RPC PROTOCOL
CRITICAL PATH
CASE STUDY: SUNRPC
One of the most widely used RPC systems
Developed for use with NFS
Built on top of UDP or TCP
TCP: stream is divided into records
UDP: max packet size < 8192 bytes
UDP: timeout plus limited number of retransmissions
TCP: return error if connection is terminated by server
Multiple arguments marshaled into a single structure
At-least-once semantics if a reply is received; "at-least-zero" semantics if no reply arrives. With UDP, tries at-most-once
Use SUN’s eXternal Data Representation (XDR)
Big endian order for 32 bit integers, handle arbitrarily large data structures
BINDER: PORT MAPPER
Server start-up: create port
Server stub calls svc_register
to register prog. #, version #
with local port mapper
Port mapper stores prog #,
version #, and port
Client start-up: call
clnt_create to locate server
port
Upon return, client can call
procedures at the server
SUMMARY
RPCs make distributed computations look like
local computations
Issues:
Parameter passing
Binding
Failure handling
Case Study: SUN RPC
GROUP COMMUNICATION
One-to-many communication: useful for
distributed applications
Issues:
Group characteristics:
Static/dynamic, open/closed
Group addressing
Multicast, broadcast, application-level multicast (unicast)
Atomicity
Message ordering
Scalability
PUTTING IT ALL TOGETHER: EMAIL
User uses mail client to compose a message
Mail client connects to mail server
Mail server looks up address to destination
mail server
Mail server sets up a connection and passes
the mail to destination mail server
Destination stores mail in input buffer (user
mailbox)
Recipient checks mail at a later time
EMAIL: DESIGN CONSIDERATIONS
Structured or unstructured?
Addressing?
Blocking/non-blocking?
Buffered or unbuffered?
Reliable or unreliable?
Server architecture
Scalability
Push or pull?
Group communication