distributed operating systems. coverage distributed systems (ds) paradigms –ds & os’s...
TRANSCRIPT
Distributed Operating Systems
Coverage• Distributed Systems (DS) Paradigms
– DS & OS’s– Services and Models– Communication– Distributed File Systems
• Coordination– Distributed Mutual Exclusion (ME)– Distributed Co-ordination– Synchronization
• DS Scheduling & Misc. Issues
What is a Distributed System“A distributed system is the one preventing you from working
because of the failure of a machine that you had never heard
of.” Leslie Lamport
Multiple computers sharing (same) state and interconnected by a network
“A distributed system is a collection of independent computers that appear to the users of the system as a single computer.” Tanenbaum
shared memory multiprocessor
message passing multiprocessor
distributed system
Distribution: Example Pro/Cons All the Good Stuff: High Performance, Distributed Access,
Scalable, Heterogeneous, Sharing (Concurrency), Load Balancing (Migration, Relocation), Fault Tolerance , …
• Bank account database (DB) example– Naturally centralized: easy consistency and performance
– Fragment DB among regions: exploit locality of reference, security & reduce reliance on network for remote access
– Replicate each fragment for fault tolerance
• But, we now need (additional) DS techniques– Route request to right fragment
– Maintain access/consistency of fragments as a whole database
– Maintain access/consistency of each fragment’s replicas
– …
Transparency: Global Access • Illusion of a single computer across a DS
Distribution transparency:Access transparency + location transparency + replication transparency + fragmentation transparency
Fragmentation Hide wether the resource is fragmented or not
Multiprocessor OS Types (1)
- Each CPU has its own operating system- Shared bus Comm. blocking & CPU idling!
Bus
Multiprocessor OS Types (2)
Master-Slave multiprocessorsBus
- Master-Slave multiprocessors- Master is a bottleneck!
Multiprocessor OS Types (3)
• Symmetric Multiprocessors– SMP multiprocessor model
Bus
- Eliminates the CPU bottleneck, but have issues associated to ME, synchronization
- Mutex on OS?
OS’s for DS’s• Loosely-coupled OS
– A collection of computers each running their own OS, OS’s allow sharing of resources across machines
– A.K.A. Network Operating System (NOS)– Manages heterogeneous multicomputer DS– Difference: provides local services to remote clients via remote logging– Data transfer from remote OS to local OS via FTP (File Transfer Protocols)
• Tightly-coupled OS– OS tries to maintain single global view of resources it manages– A.K.A. Distributed Operating System (DOS)– Manages multiprocessors & homogeneous multicomputers– Similar “local access feel” as a non-distributed, standalone OS– Data migration or computation migration modes (entire process or threads)
Network Operating Systems (NOS)
Distributed Operating Systems (DOS)
Client Server Model for DOS & NOS
Middleware
• Can we have the best of both worlds?– Scalability and openness of a NOS– Transparency and relative ease of a DOS
• Solution: additional layer of SW above NOS– Mask heterogeneity– Improve distribution transparency (and others)
“Middleware” (MW)
Middleware (& Openness)
• Document-based middleware (e.g. WWW)• Coordination-based MW (e.g., Linda, publish subscribe, Jini etc.)
• File system based MW• Shared object based MW
File System-Based Middleware (1)
(a)(b)
• Approach: make a DS look like a big file system• Transfer models:
(a) Download/upload model (work done locally)(b) Remote access model (work done remotely)
(a) Two file systems(b) Naming Transparency: All clients have same view of FS(c) Some clients with different FS views
File System-Based Middleware (2)
File System-Based Middleware (3)
• Semantics of file sharing (ordering and session semantics)– (a) single processor gives sequential consistency– (b) distributed system may return obsolete value
Shared Object-Based Middleware (1)
• Main elements of CORBA based system– Common Object Request Broker Architecture
• Approach: make a DS look like objects (variables + methods)• Easy scaling to large systems
• replicated objects (C++, Java)• flexibility
inter-ORB protocol
Example 1: Main elements of CORBA (Common Object Request Broker Architecture) based system:
- Object Request Broker (ORB) - Internet Inter-ORB Protocol (IIOP)
Shared Object-Based Middleware (2)Example 2: Globe [1] - provides a model of distributed shared objects and basic support services (naming, locating objects etc.)- an object is completely self-contained- designed to scale to a billion users, a trillion objects around the world
[1] M. van Steen, P. Homburg, and A. Tanenbaum. “Globe: A Wide-Area Distributed System”, 1999.
GIDS: Globe Infrastructure Directory Service
Shared Object-Based Middleware (3)
A distributed shared object in Globe– can have its state copied on multiple computers at once– how to maintain sequential consistency of write operations?
OS’s, DS’s & MW
Network Hardware
The Internet
Network Services and Protocols
Network Services
(blocking)
(non-blocking)
• Internet Protocol (IP)• Transmission Control Protocol (TCP): Connection-oriented• Universal Datagram Protocol (UDP): Connectionless
• Network services:
Client-Server Communications
• Unbuffered msg. passing– send(addr,msg), recv(addr,msg)
– all request and reply at C/S level
– all msg. acks between kernels only
• Buffered msg. passing– msg. sent to kernel mailbox or
kernel/user interface socket
client server
kernel kernel
msg. directed at a process
client server
kernel kernel
blocking? non-blocking?
Remote Procedure Calls (RPC)
• Synchronous/Asynchronous (blocking/non-blocking) communication– [Sync] client generated request, STUB kernel– [Sync] kernel blocks process till reply received from server– [ASync] buffers msg
RPC & Stubs (Dummy Procedure i.p.o RPC)
• [C] call “client stub” procedure• [CS] prepare msg. buffer• [CS] load parameters into buffer• [CS] prepare msg. header • [CS] send trap to kernel• [K] context switch to kernel• [K] copy msg. to kernel• [K] determine server address (NS)• [K] put address in header• [K] set up network interface• [K] start timer for msg
• [S] process req; initiate “server stub”• [SS] call server• [SS] set up parameter stack/unbundle• [K] context switch to server stub• [K] copy msg. to stub• [K] see if stub is waiting• [K] decide which stub to assign• [K] check packet for validity• [K] process interrupt (save PC, kernel
state)
NS
C: Client; CS: Client Stub, [K] KernelS: Server; SS: Server Stub, NS : Network service
Remote Procedure Call
Implementation Issues:• Can we pass pointers? (local context…)
– call by reference becomes copy-restore (but might fail)
• Weakly typed languages (e.g., C allows computations, say product of arrays sans array size specs)– can client stub determine unspecified size to pass on?
• Not always possible to determine parameter types
• Cannot use global variables– C/S may get moved to remote machine
RPC Failures?
• C/S failure vs. communication failure?• Who detects? Timeouts?• Does it matter if a node (C/S) failed
– BEFORE or AFTER a request arrived?
– BEFORE or AFTER a request is processed?
• Client failure: orphan requests? add expiration counters
• Server crash?
Communication
• Delivers messages despite– communication link(s) failure
– process failures
• Main kinds of failures to tolerate– timing (link and process)
– omission (link and process)
– value
Communication: Reliable Delivery
• Omission failure tolerance (degree k).
• Design choices:a) Error masking (spatial): several (> k) links
b) Error masking (temporal): repeat K+1 times
c) Error recovery: detect error and recover
Reliable Delivery (cont.)
Error detection and recovery: ACK’s and timeouts
• Positive ACK: sent when a message is received– Timeout on sender without ACK: sender retransmits
• Negative ACK: sent when a message loss detected– Needs sequence #s or time-based reception semantics
• Tradeoffs– Positive ACKs faster failure detection usually– NACKs : fewer msgs…
Q: what kind of situations are good for– Spatial error masking?– Temporal error masking?– Error detection and recovery with positive ACKs?– Error detection and recovery with NACKs?
Resilience to Sender Failure
• Multicast FT-Communication harder than point-to-point– Basic problem is of failure detection
– Subsets of senders may receive msg, then sender fails
• Solutions depend on flavor of multicast reliabilitya) Unreliable: no effort to overcome link failures
b) Best-effort: some steps taken to overcome link failures
c) Reliable: participants coordinate to ensure that all or none of correct recipients get it (sender failed in b)