Consistent Global States of Distributed Systems: Fundamental Concepts and Mechanisms. Author: Ozalp Babaoglu and Keith Marzullo. Distributed Systems: 526 U1580. Professor: Ching-Chi Hsu

Upload: cecily-whitehead

Post on 12-Jan-2016


Consistent Global States of Distributed Systems: Fundamental Concepts and Mechanisms

Author: Ozalp Babaoglu and Keith Marzullo

Distributed Systems: 526 U1580

Professor: Ching-Chi Hsu


Introduction

Many problems in distributed computing can be cast as executing some notification or reaction when the state of the system satisfies a particular condition

Global Predicate Evaluation (GPE): establishing the truth of a Boolean expression whose variables may refer to the global system state

A global state may not be consistent

Asynchronous system: there are no bounds on the relative speeds of processes or on message delays, so it is impossible to maintain synchronized local clocks, and communication remains the only possible mechanism for synchronization

Channels are reliable but may deliver messages out of order


Outline

Two classes of solutions to the GPE problem:

A reactive architecture: each process, when executing an event, notifies the monitor $p_0$ by sending it a message describing the event

A snapshot architecture: the monitor $p_0$ sends each process a 'state enquiry' message


Definitions (1)

distributed system: a collection of sequential processes $p_1, p_2, \ldots, p_n$ networked by unidirectional communication channels

events: the activity of each sequential process, consisting of internal events and communication events with another process, $send(m)$ or $receive(m)$

local history of process $p_i$: $h_i = e_i^1 e_i^2 \ldots$

global history: $H = h_1 \cup h_2 \cup \cdots \cup h_n$

cause-effect relation $\rightarrow$:

If $e_i^k, e_i^l \in h_i$ and $k < l$, then $e_i^k \rightarrow e_i^l$

If $e_i = send(m)$ and $e_j = receive(m)$, then $e_i \rightarrow e_j$

If $e \rightarrow e'$ and $e' \rightarrow e''$, then $e \rightarrow e''$

Concurrent, $e \parallel e'$: neither $e \rightarrow e'$ nor $e' \rightarrow e$
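The three rules above can be read as a transitive closure. The following is a minimal sketch of that reading, not part of the original slides: events are written as `(process, index)` pairs, and the names `happened_before` and `concurrent` are chosen here for illustration only.

```python
from itertools import product

def happened_before(histories, messages):
    """Return the set of pairs (e, e') with e -> e'.

    histories: dict process -> number of events; event k of process i is (i, k).
    messages:  list of (send_event, receive_event) pairs.
    """
    events = [(i, k) for i, n in histories.items() for k in range(1, n + 1)]
    # Rule 1: events of the same process are ordered by their index.
    rel = {(a, b) for a, b in product(events, events)
           if a[0] == b[0] and a[1] < b[1]}
    # Rule 2: a send precedes the matching receive.
    rel |= set(messages)
    # Rule 3: transitivity, iterated until no new pair appears.
    changed = True
    while changed:
        new = {(a, d) for (a, b) in rel for (c, d) in rel if b == c} - rel
        changed = bool(new)
        rel |= new
    return rel

def concurrent(rel, a, b):
    """e || e': neither e -> e' nor e' -> e."""
    return a != b and (a, b) not in rel and (b, a) not in rel

# Hypothetical computation: p1's first event is a send, received as p2's second event.
rel = happened_before({1: 2, 2: 2}, [((1, 1), (2, 2))])
print(((1, 1), (2, 2)) in rel)
print(concurrent(rel, (1, 2), (2, 1)))
```

The closure is quadratic per iteration, which is fine for slide-sized examples; the timestamp mechanisms introduced later avoid materializing the relation at all.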


Definitions (2)

distributed computation: a partially ordered set defined by the pair $(H, \rightarrow)$

space-time diagram: a graphical representation of a distributed computation

[Figure: space-time diagram of three processes $p_1$, $p_2$, $p_3$ with events $e_1^1 \ldots e_1^6$, $e_2^1 \ldots e_2^3$ and $e_3^1 \ldots e_3^6$]


Definitions (3)

local state of $p_i$ immediately after executing event $e_i^k$ is denoted by $\sigma_i^k$

global state: $\Sigma = (\sigma_1, \ldots, \sigma_n)$

a cut $C$, specified by the tuple $(c_1, \ldots, c_n)$, is a subset of the global history $H$ that contains an initial prefix of each of the local histories, i.e. $C = h_1^{c_1} \cup \cdots \cup h_n^{c_n}$

a run $R$ is a total ordering of all events in $H$ that is consistent with each local history. Example: pp6

Note that a single distributed computation may have many runs


Example

Inconsistent cut and phantom deadlock

[Figure: space-time diagram of $p_1$, $p_2$, $p_3$ with events $e_1^1 \ldots e_1^6$, $e_2^1 \ldots e_2^3$ and $e_3^1 \ldots e_3^6$, two cuts $C$ and $C'$, and req/resp messages between the processes]


Consistency

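A consistent cut $C$ is such that, for all events $e$ and $e'$: $(e \in C) \wedge (e' \rightarrow e) \Rightarrow e' \in C$

A consistent global state is one corresponding to a consistent cut

A consistent run $R$ is such that, for all events $e$ and $e'$: $(e \rightarrow e') \Rightarrow e$ appears before $e'$ in $R$. Example: pp6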

If the run is consistent then all the global states in the sequence will be consistent as well
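The definition of a consistent cut can be checked directly against an explicit cause-effect relation. The sketch below is only an illustration under assumptions: events are `(process, index)` pairs, a cut is given by its tuple of prefix lengths $c_i$, and `is_consistent_cut` is a name introduced here.

```python
def is_consistent_cut(cut, rel):
    """cut: dict process -> c_i (number of events of p_i included in the cut).
    rel: set of (e', e) pairs meaning e' -> e, with events written as (process, index).
    """
    def in_cut(event):
        proc, idx = event
        return idx <= cut.get(proc, 0)
    # Consistency: whenever e is in the cut, every e' with e' -> e is in it too.
    return all(in_cut(before) for before, after in rel if in_cut(after))

# Hypothetical relation: p1's first event is a send received as p2's second event.
rel = {((1, 1), (1, 2)), ((2, 1), (2, 2)), ((1, 1), (2, 2))}
print(is_consistent_cut({1: 1, 2: 2}, rel))  # the receive and its send are both included
print(is_consistent_cut({1: 0, 2: 2}, rel))  # the receive is included without its send
```

The second cut is the kind that produces phantom conditions: it records the effect of a message without its cause.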


Observing Distributed Computations

A monitor p0 will assume a passive role in that it will not send any messages of its own

The application processes notify p0 by sending it a message whenever they execute an event

The monitor $p_0$ constructs an observation of the underlying distributed computation as the sequence of events in the order in which their notification messages arrive

Due to the variability of message delays, an observation can correspond to a consistent run, an inconsistent run or no run at all

$O_1 = e_2^1\, e_1^1\, e_3^1\, e_3^2\, e_3^4\, e_1^2\, e_2^2\, e_3^3\, e_1^3\, e_1^4\, e_3^5 \ldots$ => not a run

$O_2 = e_1^1\, e_3^1\, e_2^1\, e_3^2\, e_1^2\, e_3^3\, e_3^4\, e_1^3\, e_2^2\, e_3^5\, e_3^6 \ldots$ => inconsistent run

$O_3 = e_3^1\, e_2^1\, e_1^1\, e_1^2\, e_3^2\, e_3^3\, e_1^3\, e_3^4\, e_1^4\, e_2^2\, e_1^5 \ldots$ => consistent run

The order of messages can be restored by defining a delivery rule that decides when received messages are presented to the application process


FIFO Delivery

First-In-First-Out (FIFO) delivery: for all messages $m$ and $m'$ from $p_i$ to $p_j$, $send_i(m) \rightarrow send_i(m') \Rightarrow deliver_j(m) \rightarrow deliver_j(m')$

FIFO can be implemented by adding sequence numbers to messages

While FIFO delivery is sufficient to guarantee that observations correspond to runs, it is not sufficient to guarantee consistent observations
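The sequence-number idea can be sketched as a receiver that buffers early arrivals. This is one possible reading, not the slides' own code: the class name `FifoReceiver`, sequence numbers starting at 0, and reliable channels without duplicates are all assumptions made for the example.

```python
class FifoReceiver:
    """Delivers messages from each sender in sequence-number order."""

    def __init__(self):
        self.next_seq = {}   # sender -> next sequence number expected
        self.pending = {}    # sender -> {seq: payload} received but not yet deliverable

    def receive(self, sender, seq, payload):
        """Buffer one message and return the payloads that become deliverable."""
        self.pending.setdefault(sender, {})[seq] = payload
        expected = self.next_seq.get(sender, 0)
        delivered = []
        # Release the longest run of consecutive sequence numbers now available.
        while expected in self.pending[sender]:
            delivered.append(self.pending[sender].pop(expected))
            expected += 1
        self.next_seq[sender] = expected
        return delivered

rx = FifoReceiver()
print(rx.receive("p1", 1, "second"))  # held back: message 0 has not arrived yet
print(rx.receive("p1", 0, "first"))   # releases both, in send order
```

Ordering is enforced per sender only, which is why FIFO alone cannot rule out inconsistent observations across different senders.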


Observing Distributed Computations with Real-Time Clocks

Environment: message delays are bounded by $\delta$; channels are FIFO; a global real-time clock exists; each message includes $RC(e)$, the value of the global real-time clock when event $e$ occurs, as its timestamp

DR1: At time $t$, deliver all received messages with timestamps up to $t - \delta$ in increasing timestamp order

An observation constructed with DR1 is consistent iff the clock condition is satisfied: $e \rightarrow e' \Rightarrow RC(e) < RC(e')$


Observing Distributed Computations with Logical Clocks

Environment: communication is asynchronous; channels are FIFO; logical clocks are implemented; each message includes $LC(e)$, the value of the logical clock when event $e$ occurs, as its timestamp

DR2: Deliver all messages that are stable at $p_0$ in increasing timestamp order

Note: a message $m$ is stable at $p$ if no future message with timestamp smaller than $TS(m)$ can be received by $p$

Given FIFO channels, $m$ is stable at $p_0$ when $p_0$ has received at least one message with timestamp greater than $TS(m)$ from all other processes
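A rough sketch of DR2 at the monitor follows. It assumes FIFO channels, strictly positive timestamps, and ties between equal timestamps broken by sender name; `StableDeliveryMonitor` and its fields are hypothetical names, not taken from the slides.

```python
import heapq

class StableDeliveryMonitor:
    def __init__(self, processes):
        self.latest = {p: 0 for p in processes}  # largest timestamp received per process
        self.heap = []                           # buffered (timestamp, sender, payload)

    def receive(self, sender, ts, payload):
        """Buffer a message, then deliver every buffered message that is stable."""
        self.latest[sender] = max(self.latest[sender], ts)
        heapq.heappush(self.heap, (ts, sender, payload))
        delivered = []
        while self.heap:
            ts0, s0, _ = self.heap[0]
            # Stable: every other process has been heard from with a larger timestamp.
            if all(t > ts0 for p, t in self.latest.items() if p != s0):
                delivered.append(heapq.heappop(self.heap))
            else:
                break
        return delivered

mon = StableDeliveryMonitor(["p1", "p2"])
print(mon.receive("p1", 1, "a"))  # nothing yet: p2 has not been heard from
print(mon.receive("p2", 2, "b"))  # "a" becomes stable and is delivered
```

A process that stops sending blocks delivery indefinitely under this rule, which is the usual price of stability-based ordering.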


Logical Clocks

[Figure: space-time diagram annotated with logical clock values; $p_1$: 1 2 4 5 6 7, $p_2$: 1 5 6, $p_3$: 1 2 3 4 5 7]

Logical clock: each process $p_i$ maintains a local variable $LC_i$

When a new event $e_i$ occurs, $p_i$ sets $LC_i$ to

$LC_i + 1$ if $e_i$ is an internal or send event

$\max\{LC_i, TS(m)\} + 1$ if $e_i = receive(m)$
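The update rule can be written out as a small class. This is a minimal sketch; the class name `LogicalClock` and the initial value 0 are assumptions made for the example.

```python
class LogicalClock:
    def __init__(self):
        self.value = 0  # assumed initial value of LC_i

    def tick(self):
        """Internal or send event: LC_i := LC_i + 1. Returns the event's timestamp."""
        self.value += 1
        return self.value

    def receive(self, ts):
        """Receive of a message with timestamp ts: LC_i := max(LC_i, ts) + 1."""
        self.value = max(self.value, ts) + 1
        return self.value

p1, p2 = LogicalClock(), LogicalClock()
p1.tick()              # internal event at p1, LC = 1
ts = p1.tick()         # send event at p1, LC = 2
p2.tick()              # internal event at p2, LC = 1
print(p2.receive(ts))  # receive at p2: max(1, 2) + 1
```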


Observing Distributed Computations with Causal Delivery

Causal Delivery (CD): $send_i(m) \rightarrow send_j(m') \Rightarrow deliver_k(m) \rightarrow deliver_k(m')$

If p0 uses a delivery rule satisfying CD, then all of its observations will be consistent


Efficient Delivering

For implementing causal delivery, what is really needed is an effective procedure for deciding: given events $e$, $e'$ that are causally related and their clock values, does there exist some other event $e''$ such that $e \rightarrow e'' \rightarrow e'$?

Given $RC(e) < RC(e')$ (or $LC(e) < LC(e')$), it may be that $e \rightarrow e'$ or $e \parallel e'$; all that is known for certain is $\neg(e' \rightarrow e)$

The above observations suggest a timing mechanism $TC$ whereby causal precedence relations between events can be deduced from their timestamps

Strong Clock Condition: $e \rightarrow e' \equiv TC(e) < TC(e')$


Causal History (1)

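[Figure: space-time diagram of $p_1$, $p_2$, $p_3$ with events $e_1^1 \ldots e_1^6$, $e_2^1 \ldots e_2^3$ and $e_3^1 \ldots e_3^6$, highlighting the causal history of event $e_1^4$]

Causal history of event $e$: $\theta(e) = \{e' \in H \mid e' \rightarrow e\} \cup \{e\}$

That is, $\theta(e)$ is the smallest consistent cut that includes $e$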


Causal Histories (2)

Maintaining causal histories: each process $p_i$ initializes a local variable $\theta_i$ to $\emptyset$; each message $m$ contains a timestamp $TS(m)$, which is the causal history of its send event

Scheme:

If $e_i$ is an internal or send event, then $\theta_i = \{e_i\} \cup$ the causal history of the previous local event

If $e_i$ is the receive of message $m$ by process $p_i$ from $p_j$, then $\theta_i = \{e_i\} \cup$ the causal history of the previous local event of $p_i$ $\cup$ the causal history of the corresponding send event at $p_j$

The strong clock condition is satisfied if clock comparison is interpreted as set inclusion: $e \rightarrow e' \equiv \theta(e) \subset \theta(e')$, or equivalently $e \rightarrow e' \equiv e \in \theta(e')$ if $e \neq e'$

Problem: the causal histories will grow rapidly
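The scheme can be sketched with Python sets standing in for causal histories. Everything named here (`HistoryProcess`, events as `(pid, index)` pairs) is an assumption made for illustration; the point is only that the membership test implements the strong clock condition, and that every timestamp carries the whole history.

```python
class HistoryProcess:
    def __init__(self, pid):
        self.pid = pid
        self.count = 0
        self.theta = frozenset()  # theta_i, initially empty

    def _new_event(self, extra=frozenset()):
        self.count += 1
        event = (self.pid, self.count)
        # theta_i := {e_i} U previous local history U (history carried by the message, if any)
        self.theta = self.theta | extra | {event}
        return event, self.theta

    def internal(self):
        return self._new_event()

    def send(self):
        """Returns (event, TS(m)); TS(m) is the causal history of the send event."""
        return self._new_event()

    def receive(self, ts):
        return self._new_event(ts)

p1, p2 = HistoryProcess(1), HistoryProcess(2)
a, theta_a = p1.send()
b, theta_b = p2.internal()
c, theta_c = p2.receive(theta_a)
print(a in theta_c)  # a -> c, so a belongs to theta(c)
print(b in theta_a)  # b does not precede a
```

Each timestamp here grows with the number of events in the history, which is the growth problem the slide points out.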


Vector Clocks

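The causal history of an event can be represented as a fixed-dimensional vector $VC(e)[1..n]$ rather than a set, where $VC(e)[i] = k$ iff $\theta_i(e) = h_i^k$ for $i = 1, 2, \ldots, n$, with $\theta_i(e) = \theta(e) \cap h_i$ the projection of $\theta(e)$ on process $p_i$

[Figure: space-time diagram annotated with vector clock values; $p_1$: (1,0,0) (2,1,0) (3,1,3) (4,1,3) (5,1,3) (6,1,3); $p_2$: (0,1,0) (1,2,4) (4,3,4); $p_3$: (0,0,1) (1,0,2) (1,0,3) (1,0,4) (1,0,5) (1,0,6)]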


Maintaining Vector Clocks

Each process $p_i$ maintains a local vector $VC_i[1..n]$

Each message $m$ contains a timestamp $TS(m)$, which is the vector clock value $VC(e)$ of its send event $e$

Scheme:

If $e_i$ is an internal or send event: $VC_i[i] = VC_i[i] + 1$, and $VC(e_i) = VC_i$

If $e_i = receive(m)$: $VC_i = \max\{VC_i, TS(m)\}$ (componentwise), then $VC_i[i] = VC_i[i] + 1$

$VC(e_i)[j]$, for $j \neq i$, is the number of events of $p_j$ that causally precede event $e_i$ of $p_i$

$V < V' \equiv (V \neq V') \wedge (\forall k: 1 \le k \le n: V[k] \le V'[k])$
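The update scheme and the comparison $V < V'$ can be sketched as follows. The class name `VectorClock`, 0-based process indices, and the all-zero initial vector are assumptions for this example; the message pattern in the demo is hypothetical, chosen so that the results agree with timestamps listed for the vector clock figure.

```python
class VectorClock:
    def __init__(self, i, n):
        self.i = i         # index of the owning process (0-based in this sketch)
        self.v = [0] * n   # assumed initial value: all zeros

    def tick(self):
        """Internal or send event: increment the local component."""
        self.v[self.i] += 1
        return tuple(self.v)

    def receive(self, ts):
        """Receive event: componentwise maximum with TS(m), then increment."""
        self.v = [max(a, b) for a, b in zip(self.v, ts)]
        self.v[self.i] += 1
        return tuple(self.v)

def vc_less(v, w):
    """V < V': V differs from V' and is componentwise no larger."""
    return v != w and all(a <= b for a, b in zip(v, w))

p1, p2, p3 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
m1 = p1.tick()          # send at p1
m2 = p2.tick()          # send at p2
p3.tick()               # internal event at p3
print(p3.receive(m1))   # p3 receives p1's message
print(p1.receive(m2))   # p1 receives p2's message
print(vc_less(m1, m2))  # the two sends are concurrent, so neither is smaller
```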


Properties of Vector Clocks

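Strong Clock Condition: $e \rightarrow e' \equiv VC(e) < VC(e')$

Simple Strong Clock Condition: for event $e_i$ of $p_i$ and event $e_j$ of $p_j$ with $i \neq j$, $e_i \rightarrow e_j \equiv VC(e_i)[i] \le VC(e_j)[i]$

Concurrent: $e_i \parallel e_j \equiv (VC(e_i)[i] > VC(e_j)[i]) \wedge (VC(e_j)[j] > VC(e_i)[j])$

Pairwise Inconsistent: for $i \neq j$, $e_i$ and $e_j$ are pairwise inconsistent iff $(VC(e_i)[i] < VC(e_j)[i]) \vee (VC(e_j)[j] < VC(e_i)[j])$

Consistent Cut: a cut $(c_1, c_2, \ldots, c_n)$ is consistent iff $\forall i, j: 1 \le i, j \le n: VC(e_i^{c_i})[i] \ge VC(e_j^{c_j})[i]$

Counting: the number of events that causally precede $e$ is given by $\#(e) = \left(\sum_{j=1}^{n} VC(e)[j]\right) - 1$

Weak Gap-Detection: given $e_i$ and $e_j$, if $VC(e_i)[k] < VC(e_j)[k]$ for some $k \neq j$, then there exists an event $e_k$ such that $\neg(e_k \rightarrow e_i) \wedge (e_k \rightarrow e_j)$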
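Several of these properties reduce to one-line tests on timestamps. The sketch below uses 0-based indices and function names invented for the example; the sample timestamps are the ones listed for the vector clock figure, and a process with no events in a cut is assumed to contribute the all-zero vector.

```python
def precedes(vi, i, vj):
    """Simple strong clock condition, for events of different processes: e_i -> e_j."""
    return vi[i] <= vj[i]

def concurrent(vi, i, vj, j):
    """e_i || e_j: each event is ahead of the other on its own component."""
    return vi[i] > vj[i] and vj[j] > vi[j]

def cut_is_consistent(frontier):
    """frontier[i] is VC(e_i^{c_i}), the timestamp of the last event of p_i in the cut."""
    n = len(frontier)
    return all(frontier[i][i] >= frontier[j][i] for i in range(n) for j in range(n))

def count_preceding(v):
    """#(e): the number of events that causally precede e."""
    return sum(v) - 1

print(precedes((1, 0, 0), 0, (1, 0, 2)))       # first event of p1 vs second event of p3
print(concurrent((0, 1, 0), 1, (1, 0, 2), 2))  # first event of p2 vs second event of p3
print(cut_is_consistent([(1, 0, 0), (0, 1, 0), (1, 0, 2)]))
print(cut_is_consistent([(0, 0, 0), (0, 1, 0), (1, 0, 2)]))  # p3 saw an event of p1 that the cut omits
print(count_preceding((4, 3, 4)))
```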


Implementing Causal Delivery with Vector Clocks

Babaoglu & Marzullo: monitor $p_0$ maintains an array $D[1..n]$ where $D[i]$ contains $TS(m_i)[i]$, $m_i$ being the last message delivered from process $p_i$

DR3: Deliver message $m$ from process $p_j$ when both of the following are satisfied:

$D[j] = TS(m)[j] - 1$ => guarantees FIFO

$D[k] \ge TS(m)[k], \forall k \neq j$ => guarantees the causal relation

DR4: Monitor $p_0$ maintains a counter $D$

Deliver message $m$ of event $e_i$ as soon as $D = \#(e_i) - 1$
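DR3 can be sketched as a buffering monitor. The sketch assumes that every event is reported to $p_0$ with its vector timestamp, that indices are 0-based, and that `CausalMonitor` and its methods are names made up here; it covers DR3 only.

```python
class CausalMonitor:
    def __init__(self, n):
        self.D = [0] * n   # D[i]: own component of the last timestamp delivered from p_i
        self.buffer = []   # received but undelivered (sender, timestamp) pairs

    def _deliverable(self, j, ts):
        return (self.D[j] == ts[j] - 1 and                      # FIFO condition
                all(self.D[k] >= ts[k]                          # causal condition
                    for k in range(len(ts)) if k != j))

    def receive(self, j, ts):
        """Buffer a notification and return everything DR3 now allows, in delivery order."""
        self.buffer.append((j, ts))
        delivered = []
        progress = True
        while progress:               # a delivery may unblock other buffered messages
            progress = False
            for item in list(self.buffer):
                if self._deliverable(*item):
                    self.buffer.remove(item)
                    self.D[item[0]] = item[1][item[0]]
                    delivered.append(item)
                    progress = True
        return delivered

mon = CausalMonitor(2)
print(mon.receive(1, (1, 1)))  # depends on an event of process 0 not yet delivered
print(mon.receive(0, (1, 0)))  # unblocks the buffered notification as well
```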


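Causal Delivery with Vector Clocks: Examples

[Figure: examples of causal delivery at the monitor $p_0$ for two processes $p_1$ and $p_2$; the labels recovered from the figure are the vector timestamps (0,0), (1,0), (1,1), (1,2), (2,2), (3,2) and the initial array $D = [0,0]$]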


Distributed Snapshots

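In this strategy, $p_0$ requests the states of the other processes and then combines them into a global state

Definition: channel state: for each channel from $p_i$ to $p_j$, $\chi_{i,j}$ is the set difference between the messages recorded as sent in $\sigma_i$ and those recorded as received in $\sigma_j$, i.e. the messages sent by $p_i$ but not yet received by $p_j$

incoming channels of process $p_i$: $IN_i$

outgoing channels of process $p_i$: $OUT_i$

Snapshot protocols: Chandy and Lamport [1985], Morgan [1985]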


Snapshot Protocol 1

Assumptions: a global real-time clock $RC$ exists; each message carries a timestamp; message delays are bounded

Global clock algorithm: $p_0$ sends [take snapshot at $t_{ss}$] to all processes

When clock $RC$ reads $t_{ss}$, each process $p_i$ does the following:

records its local state $\sigma_i$,

sends an empty message over all of its outgoing channels, and starts recording all messages received over each of its incoming channels

The first time $p_i$ receives a message from $p_j$ with timestamp greater than or equal to $t_{ss}$, $p_i$ stops recording messages for that channel


Snapshot Protocol 2

Assumptions: bounded message delays; channels are FIFO

Chandy & Lamport: $p_0$ sends [take snapshot] to itself

For each process $p_i$ receiving [take snapshot]:

If it is the first time, $p_i$ records its local state $\sigma_i$, sends [take snapshot] over each of its outgoing channels, and starts recording messages from its other incoming channels

If it is not the first time, $p_i$ stops recording messages from that incoming channel
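The marker rules can be exercised in a small single-threaded simulation. This is an illustrative sketch under assumptions that are not in the slides: two application processes exchanging money transfers, FIFO channels modelled as deques, one of the processes (rather than a separate monitor) initiating the snapshot, and all names (`Process`, `transfer`, `deliver`) invented for the example.

```python
from collections import deque

MARKER = "take snapshot"

class Process:
    def __init__(self, pid, balance, peers):
        self.pid, self.balance, self.peers = pid, balance, peers
        self.recorded_state = None   # sigma_i, set when the first marker arrives
        self.channel_state = {}      # src -> messages recorded on channel src -> pid
        self.recording = set()       # incoming channels still being recorded

    def on_message(self, src, msg, net):
        if msg != MARKER:                        # application message: a transfer
            self.balance += msg
            if src in self.recording:
                self.channel_state[src].append(msg)
        elif self.recorded_state is None:        # first [take snapshot]
            self.recorded_state = self.balance
            self.channel_state = {q: [] for q in self.peers}
            for q in self.peers:
                net[(self.pid, q)].append(MARKER)
            self.recording = set(self.peers) - {src}
        else:                                    # repeated [take snapshot]
            self.recording.discard(src)

def transfer(net, procs, src, dst, amount):
    procs[src].balance -= amount
    net[(src, dst)].append(amount)

def deliver(net, procs, src, dst):
    procs[dst].on_message(src, net[(src, dst)].popleft(), net)

procs = [Process(0, 100, [1]), Process(1, 100, [0])]
net = {(0, 1): deque(), (1, 0): deque()}
transfer(net, procs, 0, 1, 10)        # 10 units in flight on channel 0 -> 1
procs[1].on_message(1, MARKER, net)   # process 1 initiates by "sending to itself"
deliver(net, procs, 0, 1)             # the in-flight transfer arrives and is recorded
transfer(net, procs, 1, 0, 5)         # sent after the marker, so outside the snapshot
for _ in range(2):
    deliver(net, procs, 1, 0)         # the marker, then the transfer of 5
deliver(net, procs, 0, 1)             # marker back to process 1 ends its recording

snapshot_total = (sum(p.recorded_state for p in procs) +
                  sum(sum(m) for p in procs for m in p.channel_state.values()))
print(snapshot_total)
```

The recorded local states (90 and 100) plus the recorded channel contents (10) add up to the 200 units in the system, even though no single instant of the run had exactly those balances.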


Chandy & Lamport (1985)

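[Figure: space-time diagram of $p_1$, $p_2$ and the monitor $p_0$; $p_1$ executes $e_1^1 \ldots e_1^6$ and $p_2$ executes $e_2^1 \ldots e_2^5$; $e_1^*$ and $e_2^*$ mark the events at which $p_1$ and $p_2$ first receive [take snapshot]]

Real computation: $R = e_2^1\, e_1^1\, e_1^2\, e_1^3\, e_2^2\, e_1^4\, e_2^3\, e_2^4\, e_1^5\, e_2^5\, e_1^6$

In terms of global states: $\Sigma^{00}\, \Sigma^{01}\, \Sigma^{11}\, \Sigma^{21}\, \Sigma^{31}\, \Sigma^{32}\, \Sigma^{42}\, \Sigma^{43}\, \Sigma^{44}\, \Sigma^{54}\, \Sigma^{55}\, \Sigma^{65}$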


Properties of Snapshots

Definitions: $\Sigma^a$ is the global state in which the snapshot protocol is initiated, $\Sigma^f$ the global state in which the protocol terminates, and $\Sigma^S$ the global state constructed

$e_i^*$ denotes the event when $p_i$ receives [take snapshot] for the first time, causing $p_i$ to record its state; let the time be $t_i$ when $e_i^*$ occurs

$e_i$ is a prerecording event if $e_i \rightarrow e_i^*$; otherwise it is a post-recording event

Property: there exists a run $R'$ such that $\Sigma^a \leadsto_{R'} \Sigma^S \leadsto_{R'} \Sigma^f$; that is to say, $\Sigma^S$ could have happened


Argumentation (1)

Chandy & Lamport (1985): consider any adjacent (post-recording, prerecording) pair of events $(e, e')$ in $R$; then $\neg(e \rightarrow e')$, so swapping all such pairs results in another consistent run $R'$

swap $(e_1^3, e_2^2)$: $r_1 = e_2^1\, e_1^1\, e_1^2\, e_2^2\, e_1^3\, e_1^4\, e_2^3\, e_2^4\, e_1^5\, e_2^5\, e_1^6$

swap $(e_1^4, e_2^3)$: $r_2 = e_2^1\, e_1^1\, e_1^2\, e_2^2\, e_1^3\, e_2^3\, e_1^4\, e_2^4\, e_1^5\, e_2^5\, e_1^6$

swap $(e_1^3, e_2^3)$: $R' = e_2^1\, e_1^1\, e_1^2\, e_2^2\, e_2^3\, e_1^3\, e_1^4\, e_2^4\, e_1^5\, e_2^5\, e_1^6$

The global state after executing the last prerecording event ($e_2^3$) in $R'$ is $\Sigma^S$ ($= \Sigma^{23}$), the constructed global state

If the computation had followed this run, $\Sigma^S$ could have happened


Argumentation (2)

Lai & Yang (1987): let $GSN(t_i : p_i \in P)$ be a snapshot taken between $\tau_1$ and $\tau_2$ during the computation $R$. Let $\delta = \tau_2 - \tau_1$ and construct $R'$ as follows:

$R'$ is the same as $R$ except that every post-recording event in $R$ is postponed by $\delta$ units of time, that is

$R'(t) = R(t)$ if $R(t)$ is an event at $p_i$ and $t \le t_i$

$R'(t) = R(t - \delta)$ if $R(t - \delta)$ is an event at $p_i$ and $t - \delta > t_i$

$R'(t)$ contains no event otherwise

Example


Properties of Global Predicates

Stable predicates: many system properties one wishes to detect have the characteristic that once they become true, they remain true

If $\Phi$ is a stable predicate, then since $\Sigma^a \leadsto \Sigma^S \leadsto \Sigma^f$:

($\Phi$ is true in $\Sigma^S$) => ($\Phi$ is true in $\Sigma^f$)

($\Phi$ is false in $\Sigma^S$) => ($\Phi$ is false in $\Sigma^a$)

Nonstable predicates: the condition encoded by the predicate may not persist long enough for it to be true when the predicate is evaluated; if a predicate $\Phi$ is found to be true by the monitor, we do not know whether $\Phi$ ever held during the actual run


Nonstable Predicates

Two problems:

The condition encoded by the predicate may not persist long enough for it to be true when the predicate is evaluated

If a predicate $\Phi$ is found to be true by the monitor, we do not know whether $\Phi$ ever held during the actual run

The predicate may have held even if it is not detected, and even if it is detected it may never have held

Extended nonstable global predicates apply to the entire distributed computation: Possibly($\Phi$), Definitely($\Phi$)


Detecting Possibly and Definitely

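$\min(\sigma_i^k)$: the global state with the smallest level in the lattice containing $\sigma_i^k$

$\max(\sigma_i^k)$: the global state with the largest level in the lattice containing $\sigma_i^k$

Examples: $\min(\sigma_1^3) = \Sigma^{31}$, $\max(\sigma_1^3) = \Sigma^{33}$

$\min(\sigma_i^k) = (\sigma_1^{c_1}, \sigma_2^{c_2}, \ldots, \sigma_n^{c_n}): \forall j: VC(\sigma_j^{c_j})[j] = VC(\sigma_i^k)[j]$

$\max(\sigma_i^k) = (\sigma_1^{c_1}, \sigma_2^{c_2}, \ldots, \sigma_n^{c_n}): \forall j: VC(\sigma_j^{c_j})[i] \le VC(\sigma_i^k)[i]$ and $((\sigma_j^{c_j} = \sigma_j^f)$ or $(VC(\sigma_j^{c_j+1})[i] > VC(\sigma_i^k)[i]))$, where $\sigma_j^f$ is the final state of $p_j$

The minimum level containing $\sigma_j^k$ is the sum of the components of the vector timestamp $VC(\sigma_j^k)$

An algorithm for detecting Definitely($\Phi$) costs $O(k^n)$, where $k$ is the maximum number of events a monitored process has executed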
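Possibly($\Phi$) holds when some consistent observation of the computation passes through a global state satisfying $\Phi$. The sketch below detects it by walking the lattice of consistent global states level by level, using the vector-clock test for consistent cuts. It is a simplified illustration under assumptions: the monitor already holds all timestamps and local states, indices are 0-based, and the names `consistent` and `possibly` are invented here; Definitely is not covered.

```python
def consistent(cut, clocks):
    """cut[i] = c_i; clocks[i][c] is the vector timestamp of p_i's c-th event
    (clocks[i][0] is the all-zero vector for the initial state)."""
    n = len(cut)
    return all(clocks[i][cut[i]][i] >= clocks[j][cut[j]][i]
               for i in range(n) for j in range(n))

def possibly(phi, clocks, states):
    """True if phi holds in some consistent global state.

    states[i][c] is the local state of p_i after its c-th event (states[i][0] is initial).
    """
    n = len(clocks)
    level = {tuple([0] * n)}          # level 0 of the lattice: the initial global state
    while level:
        next_level = set()
        for cut in level:
            if phi([states[i][cut[i]] for i in range(n)]):
                return True
            for i in range(n):        # advance one process by one event
                if cut[i] + 1 < len(clocks[i]):
                    succ = cut[:i] + (cut[i] + 1,) + cut[i + 1:]
                    if consistent(succ, clocks):
                        next_level.add(succ)
        level = next_level
    return False

# Hypothetical run: p1 sets x = 1, then resets x = 0 and sends; p2 sets y = 1 on receipt.
clocks = [[(0, 0), (1, 0), (2, 0)], [(0, 0), (2, 1)]]
states = [[0, 1, 0], [0, 1]]
print(possibly(lambda s: s[0] == 1 and s[1] == 0, clocks, states))
print(possibly(lambda s: s[0] == 1 and s[1] == 1, clocks, states))
```

The second predicate would look satisfiable if inconsistent cuts were allowed; restricting the search to the lattice rules it out. The walk visits each consistent global state at most once per level, which is where the $O(k^n)$ bound comes from.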


Example