distributed operating systems. coverage distributed systems (ds) paradigms –ds & os’s...

Distributed Operating Systems

Coverage• Distributed Systems (DS) Paradigms

– DS & OS’s– Services and Models– Communication– Distributed File Systems

• Coordination– Distributed Mutual Exclusion (ME)– Distributed Co-ordination– Synchronization

• DS Scheduling & Misc. Issues

What is a Distributed System“A distributed system is the one preventing you from working

because of the failure of a machine that you had never heard

of.” Leslie Lamport

Multiple computers sharing (same) state and interconnected by a network

“A distributed system is a collection of independent computers that appear to the users of the system as a single computer.” Tanenbaum

shared memory multiprocessor

message passing multiprocessor

distributed system

Distribution: Example Pro/Cons All the Good Stuff: High Performance, Distributed Access,

Scalable, Heterogeneous, Sharing (Concurrency), Load Balancing (Migration, Relocation), Fault Tolerance , …

• Bank account database (DB) example– Naturally centralized: easy consistency and performance

– Fragment DB among regions: exploit locality of reference, security & reduce reliance on network for remote access

– Replicate each fragment for fault tolerance

• But, we now need (additional) DS techniques– Route request to right fragment

– Maintain access/consistency of fragments as a whole database

– Maintain access/consistency of each fragment’s replicas

– …

Transparency: Global Access • Illusion of a single computer across a DS

Distribution transparency:Access transparency + location transparency + replication transparency + fragmentation transparency

Fragmentation Hide wether the resource is fragmented or not

Multiprocessor OS Types (1)

- Each CPU has its own operating system- Shared bus Comm. blocking & CPU idling!

Bus


Master-Slave multiprocessorsBus

- Master-Slave multiprocessors- Master is a bottleneck!


• Symmetric Multiprocessors– SMP multiprocessor model

Bus

- Eliminates the CPU bottleneck, but have issues associated to ME, synchronization

- Mutex on OS?

OS’s for DS’s• Loosely-coupled OS

– A collection of computers each running their own OS, OS’s allow sharing of resources across machines

– A.K.A. Network Operating System (NOS)– Manages heterogeneous multicomputer DS– Difference: provides local services to remote clients via remote logging– Data transfer from remote OS to local OS via FTP (File Transfer Protocols)

• Tightly-coupled OS– OS tries to maintain single global view of resources it manages– A.K.A. Distributed Operating System (DOS)– Manages multiprocessors & homogeneous multicomputers– Similar “local access feel” as a non-distributed, standalone OS– Data migration or computation migration modes (entire process or threads)

Network Operating Systems (NOS)

Distributed Operating Systems (DOS)

Client Server Model for DOS & NOS

Middleware

• Can we have the best of both worlds?– Scalability and openness of a NOS– Transparency and relative ease of a DOS

• Solution: additional layer of SW above NOS– Mask heterogeneity– Improve distribution transparency (and others)

“Middleware” (MW)

Middleware (& Openness)

• Document-based middleware (e.g. WWW)• Coordination-based MW (e.g., Linda, publish subscribe, Jini etc.)

• File system based MW• Shared object based MW

File System-Based Middleware (1)

(a)(b)

• Approach: make a DS look like a big file system• Transfer models:

(a) Download/upload model (work done locally)(b) Remote access model (work done remotely)

(a) Two file systems(b) Naming Transparency: All clients have same view of FS(c) Some clients with different FS views



• Semantics of file sharing (ordering and session semantics)– (a) single processor gives sequential consistency– (b) distributed system may return obsolete value

Shared Object-Based Middleware (1)

• Main elements of CORBA based system– Common Object Request Broker Architecture

• Approach: make a DS look like objects (variables + methods)• Easy scaling to large systems

• replicated objects (C++, Java)• flexibility

inter-ORB protocol

Example 1: Main elements of CORBA (Common Object Request Broker Architecture) based system:

- Object Request Broker (ORB) - Internet Inter-ORB Protocol (IIOP)

Shared Object-Based Middleware (2)Example 2: Globe [1] - provides a model of distributed shared objects and basic support services (naming, locating objects etc.)- an object is completely self-contained- designed to scale to a billion users, a trillion objects around the world

[1] M. van Steen, P. Homburg, and A. Tanenbaum. “Globe: A Wide-Area Distributed System”, 1999.

GIDS: Globe Infrastructure Directory Service

Shared Object-Based Middleware (3)

A distributed shared object in Globe– can have its state copied on multiple computers at once– how to maintain sequential consistency of write operations?

OS’s, DS’s & MW

Network Hardware

The Internet

Network Services and Protocols

Network Services

(blocking)

(non-blocking)

• Internet Protocol (IP)• Transmission Control Protocol (TCP): Connection-oriented• Universal Datagram Protocol (UDP): Connectionless

• Network services:

Client-Server Communications

• Unbuffered msg. passing– send(addr,msg), recv(addr,msg)

– all request and reply at C/S level

– all msg. acks between kernels only

• Buffered msg. passing– msg. sent to kernel mailbox or

kernel/user interface socket

client server

kernel kernel

msg. directed at a process

client server

kernel kernel

blocking? non-blocking?

Remote Procedure Calls (RPC)

• Synchronous/Asynchronous (blocking/non-blocking) communication– [Sync] client generated request, STUB kernel– [Sync] kernel blocks process till reply received from server– [ASync] buffers msg

RPC & Stubs (Dummy Procedure i.p.o RPC)

• [C] call “client stub” procedure• [CS] prepare msg. buffer• [CS] load parameters into buffer• [CS] prepare msg. header • [CS] send trap to kernel• [K] context switch to kernel• [K] copy msg. to kernel• [K] determine server address (NS)• [K] put address in header• [K] set up network interface• [K] start timer for msg

• [S] process req; initiate “server stub”• [SS] call server• [SS] set up parameter stack/unbundle• [K] context switch to server stub• [K] copy msg. to stub• [K] see if stub is waiting• [K] decide which stub to assign• [K] check packet for validity• [K] process interrupt (save PC, kernel

state)

NS

C: Client; CS: Client Stub, [K] KernelS: Server; SS: Server Stub, NS : Network service

Remote Procedure Call

Implementation Issues:• Can we pass pointers? (local context…)

– call by reference becomes copy-restore (but might fail)

• Weakly typed languages (e.g., C allows computations, say product of arrays sans array size specs)– can client stub determine unspecified size to pass on?

• Not always possible to determine parameter types

• Cannot use global variables– C/S may get moved to remote machine

RPC Failures?

• C/S failure vs. communication failure?• Who detects? Timeouts?• Does it matter if a node (C/S) failed

– BEFORE or AFTER a request arrived?

– BEFORE or AFTER a request is processed?

• Client failure: orphan requests? add expiration counters

• Server crash?

Communication

• Delivers messages despite– communication link(s) failure

– process failures

• Main kinds of failures to tolerate– timing (link and process)

– omission (link and process)

– value

Communication: Reliable Delivery

• Omission failure tolerance (degree k).

• Design choices:a) Error masking (spatial): several (> k) links

b) Error masking (temporal): repeat K+1 times

c) Error recovery: detect error and recover

Reliable Delivery (cont.)

Error detection and recovery: ACK’s and timeouts

• Positive ACK: sent when a message is received– Timeout on sender without ACK: sender retransmits

• Negative ACK: sent when a message loss detected– Needs sequence #s or time-based reception semantics

• Tradeoffs– Positive ACKs faster failure detection usually– NACKs : fewer msgs…

Q: what kind of situations are good for– Spatial error masking?– Temporal error masking?– Error detection and recovery with positive ACKs?– Error detection and recovery with NACKs?

Resilience to Sender Failure

• Multicast FT-Communication harder than point-to-point– Basic problem is of failure detection

– Subsets of senders may receive msg, then sender fails

• Solutions depend on flavor of multicast reliabilitya) Unreliable: no effort to overcome link failures

b) Best-effort: some steps taken to overcome link failures

c) Reliable: participants coordinate to ensure that all or none of correct recipients get it (sender failed in b)

distributed operating systems. coverage distributed systems (ds) paradigms –ds & os’s...

Documents

distributed access

bus slide

dos nos slide

remote os

local os

multiprocessor os types

middleware mw slide

nos transparency