distributed systems · distributed systems principles and paradigms chapter 04 (version february...

42
Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) Maarten van Steen Vrije Universiteit Amsterdam, Faculty of Science Dept. Mathematics and Computer Science Room R4.20. Tel: (020) 598 7784 E-mail:[email protected], URL: www.cs.vu.nl/steen/ 01 Introduction 02 Architectures 03 Processes 04 Communication 05 Naming 06 Synchronization 07 Consistency and Replication 08 Fault Tolerance 09 Security 10 Distributed Object-Based Systems 11 Distributed File Systems 12 Distributed Web-Based Systems 13 Distributed Coordination-Based Systems 00 – 1 /

Upload: others

Post on 27-Mar-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Distributed SystemsPrinciples and Paradigms

Chapter 04(version February 18, 2008)

Maarten van Steen

Vrije Universiteit Amsterdam, Faculty of ScienceDept. Mathematics and Computer Science

Room R4.20. Tel: (020) 598 7784E-mail:[email protected], URL: www.cs.vu.nl/∼steen/

01 Introduction02 Architectures03 Processes04 Communication05 Naming06 Synchronization07 Consistency and Replication08 Fault Tolerance09 Security10 Distributed Object-Based Systems11 Distributed File Systems12 Distributed Web-Based Systems13 Distributed Coordination-Based Systems

00 – 1 /

Page 2: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Layered Protocols

• Low-level layers

• Transport layer

• Application layer

• Middleware layer

04 – 1 Communication/4.1 Layered Protocols

Page 3: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Basic Networking Model

Physical

Data link

Network

Transport

Session

Application

Presentation

Application protocol

Presentation protocol

Session protocol

Transport protocol

Network protocol

Data link protocol

Physical protocol

Network

1

2

3

4

5

7

6

Drawbacks:

• Focus on message-passing only• Often unneeded or unwanted functionality• Question: Violates transparency?

04 – 2 Communication/4.1 Layered Protocols

Page 4: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Low-level layers

Physical layer: contains the specification and imple-mentation of bits, and their transmission betweensender and receiver

Data link layer: prescribes the transmission of a se-ries of bits into a frame to allow for error and flowcontrol

Network layer: describes how packets in a networkof computers are to be routed.

Observation: for many distributed systems, the lowest-level interface is that of the network layer.

04 – 3 Communication/4.1 Layered Protocols

Page 5: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Transport Layer

Important: The transport layer provides the actualcommunication facilities for most distributed systems.

Standard Internet protocols:

• TCP: connection-oriented, reliable, stream-orientedcommunication• UDP: unreliable (best-effort) datagram communi-

cation

Note: IP multicasting is generally considered a stan-dard available service.

04 – 4 Communication/4.1 Layered Protocols

Page 6: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Middleware Layer

Observation: Middleware is invented to provide com-mon services and protocols that can be used by manydifferent applications:

• A rich set of communication protocols , but whichallow different applications to communicate

• (Un)marshaling of data, necessary for integratedsystems

• Naming protocols , so that different applicationscan easily share resources

• Security protocols , to allow different applicationsto communicate in a secure way

• Scaling mechanisms , such as support for repli-cation and caching

Note: what remains are truly application-specificprotocols

Question: Such as...?04 – 5 Communication/4.1 Layered Protocols

Page 7: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Types of Communication (1/3)

Client

Server

Synchronize after processing by server

Synchronize at request delivery

Synchronize at request submission

Request

Reply

Storage facility

Transmission interrupt

Time

Distinguish:

• Transient versus persistent communication• Asynchrounous versus synchronous commu-

nication

04 – 6 Communication/4.1 Layered Protocols

Page 8: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Types of Communication (2/3)

Client

Server

Synchronize after processing by server

Synchronize at request delivery

Synchronize at request submission

Request

Reply

Storage facility

Transmission interrupt

Time

Transient communication: A message is discardedby a communication server as soon as it cannot bedelivered at the next server, or at the receiver.

Persistent communication: A message is stored ata communication server as long as it takes to deliverit at the receiver.

04 – 7 Communication/4.1 Layered Protocols

Page 9: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Types of Communication (3/3)

Client

Server

Synchronize after processing by server

Synchronize at request delivery

Synchronize at request submission

Request

Reply

Storage facility

Transmission interrupt

Time

Places for synchronization:

• At request submission• At request delivery• After request processing

04 – 8 Communication/4.1 Layered Protocols

Page 10: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Client/Server

Some observations: Client/Server computing is gen-erally based on a model of transient synchronouscommunication :

• Client and server have to be active at the time ofcommunication• Client issues request and blocks until it receives

reply• Server essentially waits only for incoming requests,

and subsequently processes them

Drawbacks synchronous communication:

• Client cannot do any other work while waiting forreply• Failures have to be dealt with immediately (the

client is waiting)• In many cases the model is simply not appropri-

ate (mail, news)

04 – 9 Communication/4.1 Layered Protocols

Page 11: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Messaging

Message-oriented middleware: Aims at high-levelpersistent asynchronous communication :

• Processes send each other messages, which arequeued• Sender need not wait for immediate reply, but can

do other things• Middleware often ensures fault tolerance

04 – 10 Communication/4.1 Layered Protocols

Page 12: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Remote Procedure Call (RPC)

• Basic RPC operation

• Parameter passing

• Variations

04 – 11 Communication/4.2 Remote Procedure Call

Page 13: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Basic RPC Operation (1/2)

Observations:

• Application developers are familiar with simple pro-cedure model• Well-engineered procedures operate in isolation

(black box)• There is no fundamental reason not to execute

procedures on separate machine

Conclusion: communication between caller & calleecan be hidden by using procedure-call mechanism.

Call local procedureand return results

Call remoteprocedure

Returnfrom call

Client

Request Reply

ServerTime

Wait for result

04 – 12 Communication/4.2 Remote Procedure Call

Page 14: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Basic RPC Operation (2/2)

Implementationof add

Client OS Server OS

Client machine Server machine

Client stub

Client process Server process1. Client call to

procedure

2. Stub buildsmessage

5. Stub unpacksmessage

6. Stub makeslocal call to "add"

3. Message is sentacross the network

4. Server OShands messageto server stub

Server stubk = add(i,j) k = add(i,j)

proc: "add"int: val(i)int: val(j)

proc: "add"int: val(i)int: val(j)

proc: "add"int: val(i)int: val(j)

1. Client procedure calls client stub as usual.2. Client stub builds message and calls local OS.3. Client’s OS sends message to remote OS.4. Remote OS gives message to server stub.5. Server stub unpacks parameters and calls server.6. Server does work and returns result to the stub.7. Server stub packs it in message and calls OS.8. Server’s OS sends message to client’s OS.9. Client’s OS gives message to client stub.

10. Client stub unpacks result and returns to the client.

04 – 13 Communication/4.2 Remote Procedure Call

Page 15: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

RPC: Parameter Passing (1/2)

Parameter marshaling: There’s more than just wrap-ping parameters into a message:

• Client and server machines may have differentdata representations (think of byte ordering)• Wrapping a parameter means transforming a value

into a sequence of bytes• Client and server have to agree on the same en-

coding:– How are basic data values represented (inte-

gers, floats, characters)– How are complex data values represented (ar-

rays, unions)• Client and server need to properly interpret mes-

sages, transforming them into machine-dependentrepresentations.

04 – 14 Communication/4.2 Remote Procedure Call

Page 16: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

RPC: Parameter Passing (2/2)

RPC parameter passing:

• RPC assumes copy in/copy out semantics:while procedure is executed, nothing can be as-sumed about parameter values (only Ada sup-ports this model).• RPC assumes all data that is to be operated on is

passed by parameters. Excludes passingreferences to (global) data .

Conclusion: full access transparency cannot be re-alized.

Observation: If we introduce a remote reference mech-anism, access transparency can be enhanced:

• Remote reference offers unified access to remotedata• Remote references can be passed as parameter

in RPCs

04 – 15 Communication/4.2 Remote Procedure Call

Page 17: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Asynchronous RPCs

Essence: Try to get rid of the strict request-reply be-havior, but let the client continue without waiting for ananswer from the server.

Call local procedure

Call remoteprocedure

Returnfrom call

Request Accept request

Wait for acceptance

Call local procedureand return results

Call remoteprocedure

Returnfrom call

Client Client

Request Reply

Server ServerTime Time

Wait for result

(a) (b)

Variation: deferred synchronous RPC :

Call local procedure

Call remoteprocedure

Returnfrom call

Client

RequestAcceptrequest

ServerTime

Wait foracceptance

Interrupt client

Returnresults Acknowledge

Call client withone-way RPC

04 – 16 Communication/4.2 Remote Procedure Call

Page 18: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

RPC in Practice

Essence: Let the developer concentrate on only theclient- and server-specific code; let the RPC system(generators and libraries) do the rest.

C compiler

Uuidgen

IDL compiler

C compiler C compiler

Linker Linker

C compiler

Server stubobject file

Serverobject file

Runtimelibrary

Serverbinary

Clientbinary

Runtimelibrary

Client stubobject file

Clientobject file

Client stubClient code Header Server stub

Interfacedefinition file

Server code

#include#include

04 – 17 Communication/4.2 Remote Procedure Call

Page 19: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Client-to-Server Binding (DCE)

Issues: (1) Client must locate server machine, and(2) locate the server.

Example: DCE uses a separate daemon for eachserver machine.

Endpointtable

Server

DCEdaemon

Client1. Register endpoint

2. Register service3. Look up server

4. Ask for endpoint

5. Do RPC

Directoryserver

Server machineClient machine

Directory machine

04 – 18 Communication/4.2 Remote Procedure Call

Page 20: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Message-OrientedCommunication

• Transient Messaging

• Message-Queuing System

• Message Brokers

• Example: IBM Websphere

04 – 19 Communication/4.3 Message-Oriented Communication

Page 21: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Transient Messaging: Sockets

Example: Consider the Berkeley socket interface ,which has been adopted by all UNIX systems, as wellas Windows 95/NT/2000/XP:

SOCKET Create a new communication endpointBIND Attach a local address to a socketLISTEN Announce willingness to accept N con-

nectionsACCEPT Block until someone remote wants to

establish a connectionCONNECT Attempt to establish a connectionSEND Send data over a connectionRECEIVE Receive data over a connectionCLOSE Release the connection

connect

socket

socket

bind listen read

read

write

write

accept close

close

Server

Client

Synchronization point Communication

04 – 20 Communication/4.3 Message-Oriented Communication

Page 22: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Message-Oriented Middleware

Essence: Asynchronous persistent communicationthrough support of middleware-level queues . Queuescorrespond to buffers at communication servers.

PUT Append a message to a specified queueGET Block until the specified queue is nonempty, and

remove the first messagePOLL Check a specified queue for messages, and re-

move the first. Never blockNOTIFY Install a handler to be called when a message is

put into the specified queue

04 – 21 Communication/4.3 Message-Oriented Communication

Page 23: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Message Broker

Observation: Message queuing systems assume acommon messaging protocol: all applications agreeon message format (i.e., structure and data represen-tation)

Message broker: Centralized component that takescare of application heterogeneity in an MQ system:

• Transforms incoming messages to target format• Very often acts as an application gateway• May provide subject-based routing capabilities⇒ Enterprise Application Integration

Queuinglayer

Brokerprogram

Repository with conversion rules and programsSource client Destination client

OS OSOS

Message broker

Network

04 – 22 Communication/4.3 Message-Oriented Communication

Page 24: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

IBM’s WebSphere MQ (1/3)

Basic concepts:

• Application-specific messages are put into, andremoved from queues• Queues always reside under the regime of a queue

manager• Processes can put messages only in local queues,

or through an RPC mechanism

Message transfer:

• Messages are transferred between queues• Message transfer between queues at different pro-

cesses, requires a channel• At each endpoint of channel is a message chan-

nel agent• Message channel agents are responsible for:

– Setting up channels using lower-level networkcommunication facilities (e.g., TCP/IP)

– (Un)wrapping messages from/in transport-levelpackets

– Sending/receiving packets

04 – 23 Communication/4.3 Message-Oriented Communication

Page 25: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

IBM’s WebSphere MQ (2/3)

MCA MCAMCA MCA

MQ Interface

Stub StubServerstub

Serverstub

Send queue

Program ProgramQueuemanager

Queuemanager

Routing table

Enterprise networkRPC(synchronous)

Local network

Message passing(asynchronous)

To other remotequeue managers

Client's receivequeueSending client Receiving client

• Channels are inherently unidirectional

• MQ provides mechanisms to automatically startMCAs when messages arrive, or to have a re-ceiver set up a channel

• Any network of queue managers can be created;routes are set up manually (system administra-tion)

04 – 24 Communication/4.3 Message-Oriented Communication

Page 26: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

IBM’s WebSphere MQ (3/3)

Routing: By using logical names , in combinationwith name resolution to local queues, it is possible toput a message in a remote queue

SQ1SQ2

SQ1

SQ1SQ1SQ2

QMBQMCQMD

SQ1

SQ1SQ1

SQ1

SQ2SQ1

SQ1

SQ1SQ1

QMA

QMA QMA

QMC

QMCQMB

QMD

QMBQMD

Routing tableRouting table

Routing table Routing table

QMB

QMC

QMA

LA1LA1

LA1

LA2LA2

LA2

QMCQMA

QMA

QMDQMD

QMC

Alias tableAlias table

Alias table

QMD

SQ1

SQ2

SQ1

Question: What’s a major problem here?

04 – 25 Communication/4.3 Message-Oriented Communication

Page 27: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Stream-Oriented Communication

• Support for continuous media

• Streams in distributed systems

• Stream management

04 – 26 Communication/4.4 Stream-Oriented Communication

Page 28: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Continuous Media

Observation: All communication facilities discussedso far are essentially based on a discrete , that istime-independent exchange of information

Continuous media: Characterized by the fact thatvalues are time dependent :

• Audio• Video• Animations• Sensor data (temperature, pressure, etc.)

Transmission modes: Different timing guarantees withrespect to data transfer:

• Asynchronous: no restrictions with respect towhen data is to be delivered• Synchronous: define a maximum end-to-end de-

lay for individual data packets• Isochronous: define a maximum and minimum

end-to-end delay (jitter is bounded)

04 – 27 Communication/4.4 Stream-Oriented Communication

Page 29: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Stream

Definition: A (continuous) data stream is a connection-oriented communication facility that supports isochronousdata transmission

Some common stream characteristics:

• Streams are unidirectional• There is generally a single source , and one or

more sinks• Often, either the sink and/or source is a wrapper

around hardware (e.g., camera, CD device, TVmonitor, dedicated storage)

Stream types:

• Simple : consists of a single flow of data, e.g.,audio or video• Complex : multiple data flows, e.g., stereo audio

or combination audio/video

04 – 28 Communication/4.4 Stream-Oriented Communication

Page 30: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Streams and QoS

Essence: Streams are all about timely delivery ofdata. How do you specify this Quality of Service(QoS)? Basics:

• The required bit rate at which data should betransported.

• The maximum delay until a session has beenset up (i.e., when an application can start sendingdata).

• The maximum end-to-end delay (i.e., how longit will take until a data unit makes it to a recipient).

• The maximum delay variance, or jitter .

• The maximum round-trip delay .

04 – 29 Communication/4.4 Stream-Oriented Communication

Page 31: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Enforcing QoS (1/2)

Observation : There are various network-level tools,such as differentiated services by which certain pack-ets can be prioritized.

Also : use buffers to reduce jitter:

0 5

1 2 3 4 5 6 7 8

10Time (sec)

Time in buffer

15 20

Gap in playback

Packet removed from buffer

1 2 3 4 5 6 7 8Packet arrives at buffer

1 2 3 4 5 6 7 8Packet departs source

04 – 30 Communication/4.4 Stream-Oriented Communication

Page 32: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Enforcing QoS (2/2)

Problem: How to reduce the effects of packet loss(when multiple samples are in a single packet)?

Solution: simply spread the samples:

1 2 3 4 5 6 7 8 9 10 11 12

1 5 9 13 2 6 10 14 3 7 11 15

13 14 15 16

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

4 8 12 16

Lost packet

Lost packet

Gap of lost frames

Lost frames

(a)

(b)

Sent

Delivered

Sent

Delivered

04 – 31 Communication/4.4 Stream-Oriented Communication

Page 33: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Stream Synchronization

Problem: Given a complex stream, how do you keepthe different substreams in synch?

Example: Think of playing out two channels, that to-gether form stereo sound. Difference should be lessthan 20–30 µsec!

Network

Incoming stream

Application

Receiver's machine

Procedure that readstwo audio data units foreach video data unit

OS

Alternative: multiplex all substreams into a singlestream, and demultiplex at the receiver. Synchroniza-tion is handled at multiplexing/demultiplexing point(MPEG).

04 – 32 Communication/4.4 Stream-Oriented Communication

Page 34: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Multicast Communication

• Application-level multicasting

• Gossip-based data dissemination

04 – 33 Communication/4.5 Multicast Communication

Page 35: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Application-Level Multicasting

Essence: Organize nodes of a distributed system intoan overlay network and use that network to dissem-inate data. Example: Consider a Chord-based peer-to-peer system:

1. Initiator generates a multicast identifier mid.

2. Lookup su (mid), the node responsible for mid.

3. Request is routed to succ(mid), which will be-come the root .

4. If P wants to join, it sends a join request to theroot.

5. When request arrives at Q:

• Q has not seen a join request before⇒ it be-comes forwarder ; P becomes child of Q. Joinrequest continues to be forwarded.

• Q knows about tree⇒ P becomes child of Q.No need to forward join request anymore.

04 – 34 Communication/4.5 Multicast Communication

Page 36: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

ALM: Some costs

A

BD

C

Ra

RbRd

Rc

Internet

RouterEnd host

Overlay network

75

1

1

1

1

50

40

• Link stress : How often does an ALM messagecross the same physical link? Example: mes-sage from A to D needs to cross 〈Ra, Rb〉 twice.

• Stretch : Ratio in delay between ALM-level pathand network-level path. Example: messages B

to C follow path of length 71 at ALM, but 47 atnetwork level⇒ stretch = 71/47.

04 – 35 Communication/4.5 Multicast Communication

Page 37: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Epidemic Algorithms

• General background

• Update models

• Removing objects

04 – 36 Communication/4.5 Multicast Communication

Page 38: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Principles

Basic idea: Assume there are no write–write con-flicts:

• Update operations are initially performed at oneor only a few replicas• A replica passes its updated state to a limited

number of neighbors• Update propagation is lazy, i.e., not immediate• Eventually, each update should reach every replica

Anti-entropy : Each replica regularly chooses anotherreplica at random, and exchanges state differences,leading to identical states at both afterwards

Gossiping : A replica which has just been updated(i.e., has been contaminated ), tells a number ofother replicas about its update (contaminating themas well).

04 – 37 Communication/4.5 Multicast Communication

Page 39: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Anti-Entropy

• A node P selects another node Q from the systemat random.

• Push : P only sends its updates to Q

• Pull : P only retrieves updates from Q

• Push-Pull : P and Q exchange mutual updates(after which they hold the same information).

Observation: for push-pull it takesO(log(N)) roundsto disseminate updates to all N nodes (round = whenevery node as taken the initiative to start an exchange).

04 – 38 Communication/4.5 Multicast Communication

Page 40: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Gossiping

Basic model: A server S having an update to re-port, contacts other servers. If a server is contactedto which the update has already propagated, S stopscontacting other servers with probability 1/k.

If s is the fraction of ignorant servers (i.e., which areunaware of the update), it can be shown that withmany servers

s = e−(k+1)(1−s)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

-15.0

-12.5

-10.0

-7.5

-5.0

-2.5

k

ln(s)

Observation: If we really have to ensure that all serversare eventually updated, gossiping alone is not enough

04 – 39 Communication/4.5 Multicast Communication

Page 41: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Deleting Values

Fundamental problem: We cannot remove an oldvalue from a server and expect the removal to prop-agate. Instead, mere removal will be undone in duetime using epidemic algorithms

Solution: Removal has to be registered as a specialupdate by inserting a death certificate

Next problem: When to remove a death certificate (itis not allowed to stay for ever):

• Run a global algorithm to detect whether the re-moval is known everywhere, and then collect thedeath certificates (looks like garbage collection)• Assume death certificates propagate in finite time,

and associate a maximum lifetime for a certificate(can be done at risk of not reaching all servers)

Note: it is necessary that a removal actually reachesall servers.

Question: What’s the scalability problem here?

04 – 40 Communication/4.5 Multicast Communication

Page 42: Distributed Systems · Distributed Systems Principles and Paradigms Chapter 04 (version February 18, 2008) ... 11 Distributed File Systems ... Transient communication: A message is

Example Applications

Data dissemination : Perhaps the most important one.Note that there are many variants of dissemina-tion.

Aggregation : Let every node i maintain a variablexi. When two nodes gossip, they each reset theirvariable to

xi, xj← (xi + xj)/2

Result: in the end each node will have computedthe average x̄ = ∑i xi/N.

Question: What happens if initially xi = 1 andxj = 0, j 6= i?

04 – 41 Communication/4.5 Multicast Communication