DCS 6. Basic Distributed Algorithms Fundamentals

Wei Yuan
November 21, 2013

• Physical Clocks
• Logical Clocks
  – Lamport’s Logical Clock
  – Vector Clock
• Global Snapshots


Physical Clocks

• Most computers today keep track of the passage of time with a battery-backed CMOS clock circuit driven by a quartz oscillator.
  – The battery backup lets the clock continue measuring time when the power is off.
• Two registers are associated with the quartz crystal: a counter and a holding register.
• A programmable interval timer generates an interrupt (a clock tick) periodically.
• The interrupt service procedure simply adds one to a counter in memory.


• Getting two systems to agree on time
  – Two clocks hardly ever agree
  – Quartz oscillators oscillate at slightly different frequencies
• Clocks tick at different rates
  – This creates an ever-widening gap in perceived time
  – Clock drift
• Difference between two clocks at one point in time
  – Clock skew
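To make drift and skew concrete, here is a minimal Python sketch. The drift rates (in parts per million) and the function names `clock_reading` and `skew` are illustrative assumptions, not values or names from the slides.

```python
def clock_reading(true_seconds: float, drift_ppm: float) -> float:
    """Reading of a clock that runs drift_ppm parts per million fast
    (or slow, if drift_ppm is negative) after true_seconds of real time."""
    return true_seconds * (1.0 + drift_ppm / 1_000_000)


def skew(true_seconds: float, drift_a_ppm: float, drift_b_ppm: float) -> float:
    """Clock skew: the difference between two clock readings at one instant."""
    return clock_reading(true_seconds, drift_a_ppm) - clock_reading(true_seconds, drift_b_ppm)


# Hypothetical rates: clock A runs 20 ppm fast, clock B runs 30 ppm slow.
for seconds in (60, 3600, 86400):
    print(f"after {seconds} s of real time the skew is {skew(seconds, 20, -30):.4f} s")
```

Because the two clocks drift at different rates, the skew keeps growing with elapsed time unless the clocks are resynchronized.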


• International Atomic Time (TAI)
• Coordinated Universal Time (UTC)
• ……
• Time synchronization algorithms




Lamport’s Logical Clock

• A distributed system consists of a collection of distinct processes which are spatially separated, and which communicate with one another by exchanging messages.
  – A network of interconnected computers, such as the ARPA net
  – A single computer: the central control unit, the memory units, and the input-output channels are separate processes
• Lamport, L. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 1978, 21(7): 558-565.


Lamport’s happened before (→) relation

• Define the "happened before" relation without using physical clocks (a partial ordering)
• Assumptions
  – The system is composed of a collection of processes
  – Each process consists of a sequence of events
  – Depending on the application, an event may be the execution of a subprogram on a computer or the execution of a single machine instruction
• We assume that the events of a process form a sequence, where a occurs before b in this sequence if a happens before b.


Lamport’s happened before (→) relation

(1) If a and b are events in the same process, and a comes before b, then a → b.
(2) If a is the sending of a message by one process and b is the receipt of the same message by another process, then a → b.
(3) If a → b and b → c, then a → c.

• Two distinct events a and b are said to be concurrent if a ↛ b and b ↛ a.
• Assume that a ↛ a for any event a. (→ is an irreflexive partial ordering)
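The three rules can be sketched in Python by building the relation as a set of ordered pairs and closing it under transitivity. The function names, the event labels, and the choice to represent events as strings are assumptions made for illustration.

```python
from itertools import product


def happened_before(process_events, messages):
    """Return the set of pairs (a, b) such that a -> b.

    process_events: one list of events per process, in local order.
    messages: list of (send_event, receive_event) pairs.
    """
    events = [e for seq in process_events for e in seq]
    hb = set()
    # Rule (1): order of events within the same process.
    for seq in process_events:
        for earlier, later in zip(seq, seq[1:]):
            hb.add((earlier, later))
    # Rule (2): sending a message happens before receiving it.
    hb.update(messages)
    # Rule (3): transitive closure (Warshall style, intermediate event k outermost).
    for k, a, b in product(events, repeat=3):
        if (a, k) in hb and (k, b) in hb:
            hb.add((a, b))
    return hb


def concurrent(a, b, hb):
    """Two distinct events are concurrent if neither happened before the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb


# Hypothetical run: process P has events p1, p2; process Q has q1, q2;
# p1 sends a message that is received at q2.
hb = happened_before([["p1", "p2"], ["q1", "q2"]], [("p1", "q2")])
print(("p1", "q2") in hb, concurrent("p2", "q2", hb))
```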


space-time diagram

• horizontal: space
• vertical: time
• dots: events
• vertical lines: processes
• wavy lines: messages



• A clock is just a way of assigning a number to an event (abstract)
  – Clock Ci for each process Pi
    • Ci assigns a number Ci(a) to any event a in the process
  – Clock C for the entire system
    • C(b) = Cj(b) if b is an event in process Pj
• Clock Condition
  – For any events a, b: if a → b then C(a) < C(b).
  – We cannot expect the converse condition to hold, since that would imply that any two concurrent events must occur at the same time (e.g., p2 and p3 are both concurrent with q3).


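• A process’s clock “ticks”
  – Condition C1 (if a and b are events in process Pi and a comes before b, then Ci(a) < Ci(b)) means that there must be a tick line between any two events on a process line
  – Condition C2 (if a is the sending of a message by process Pi and b is the receipt of that message by process Pj, then Ci(a) < Cj(b)) means that every message line must cross a tick line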




Event counting example


Lamport’s logical timestamps

• Process Pi’s clock is represented by a register Ci, so Ci(a) is the value contained by Ci during the event a.
• All processes use a local counter (logical clock) with an initial value of zero
• Just before each event, the local counter is incremented by 1 and assigned to the event as its timestamp
• A send (message) event carries its timestamp
• For a receive (message) event, the counter is updated to max(receiver’s local counter, message timestamp) + 1
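The counter rules above fit in a few lines of Python. This is a minimal sketch; the class name `LamportClock` and its method names are assumptions for illustration.

```python
class LamportClock:
    """Logical clock of one process, following the counter rules on this slide."""

    def __init__(self):
        self.counter = 0  # initial value of zero

    def local_event(self) -> int:
        """Increment just before the event and use the result as its timestamp."""
        self.counter += 1
        return self.counter

    def send(self) -> int:
        """A send is an event; the returned timestamp travels with the message."""
        return self.local_event()

    def receive(self, message_timestamp: int) -> int:
        """Receive event: max(local counter, message timestamp) + 1."""
        self.counter = max(self.counter, message_timestamp) + 1
        return self.counter


# Hypothetical run with two processes p and q.
p, q = LamportClock(), LamportClock()
p.local_event()        # p's first event
stamp = p.send()       # p sends a message carrying its timestamp
print(q.receive(stamp), q.local_event())
```

The receive rule guarantees that the receipt is stamped later than the send, which is what the Clock Condition requires for message events.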



Event counting example

Applying Lamport’s algorithm


Problem: Identical timestamps

• Concurrent events (e.g., b & g; i & k) may have the same timestamp … or not
• Total ordering: every event must be assigned a unique timestamp (number).


Unique timestamps (total ordering)

We can force each timestamp to be unique
• Define the global logical timestamp (Ti, i)
  – Ti represents the local Lamport timestamp
  – i represents the process number (globally unique)
    • e.g., (host address, process ID)
• Compare timestamps:
  – (Ti, i) < (Tj, j) if and only if
  – Ti < Tj, or Ti = Tj and i < j
• Does not necessarily relate to actual event ordering
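The comparison rule is a lexicographic comparison of (timestamp, process number) pairs, which Python tuples already provide. In this sketch the type name `GlobalTimestamp` and the sample values are assumptions.

```python
from typing import NamedTuple


class GlobalTimestamp(NamedTuple):
    lamport_time: int  # Ti, the local Lamport timestamp
    process_id: int    # i, a globally unique process number


# Tuples compare field by field, so ties on lamport_time
# are broken by process_id.
a = GlobalTimestamp(lamport_time=5, process_id=2)
b = GlobalTimestamp(lamport_time=5, process_id=3)
c = GlobalTimestamp(lamport_time=4, process_id=9)

ordered = sorted([a, b, c])
print(ordered)
```

The tie-break by process number is arbitrary: it yields a total order, but it says nothing about which of two concurrent events really came first.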



• Unique (totally ordered) timestamps


Problem: Detecting causal relations

• If C(e) < C(e’)
  – We cannot conclude that e → e’.
• By looking at Lamport timestamps
  – We cannot conclude which events are causally related
• Solution: use a vector clock




Vector clocks

Rules:
1. The vector is initialized to 0 at each process
2. A process Pi increments its own element of its local vector before timestamping an event: Vi[i] = Vi[i] + 1
3. A message sent from process Pi has Vi attached to it
4. When Pj receives a message, it compares the vectors element by element and sets its local vector to the higher of the two values
• For example, received: [0, 5, 12, 1], have: [2, 8, 10, 1], new timestamp: [2, 8, 12, 1]
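A minimal Python sketch of the four rules follows. The class and method names are assumptions. The `receive` method additionally treats the receipt as an event of its own and applies rule 2 after the merge; that is a common convention, not something the slide states.

```python
class VectorClock:
    def __init__(self, process_index: int, num_processes: int):
        self.i = process_index
        self.v = [0] * num_processes      # rule 1: all zeros

    def tick(self) -> list:
        """Rule 2: increment this process's own element before timestamping."""
        self.v[self.i] += 1
        return list(self.v)

    def send(self) -> list:
        """Rule 3: a copy of the vector is attached to the outgoing message."""
        return self.tick()

    def merge(self, received) -> list:
        """Rule 4: element-wise maximum of the local and received vectors."""
        self.v = [max(a, b) for a, b in zip(self.v, received)]
        return list(self.v)

    def receive(self, received) -> list:
        """Assumption: receipt is itself an event, so merge and then tick."""
        self.merge(received)
        return self.tick()


# The merge example from the slide.
clock = VectorClock(process_index=0, num_processes=4)
clock.v = [2, 8, 10, 1]
print(clock.merge([0, 5, 12, 1]))  # [2, 8, 12, 1]
```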


Comparing vector timestamps

• Define
  – V = V’ iff V[i] = V’[i] for i = 1, …, N
  – V ≤ V’ iff V[i] ≤ V’[i] for i = 1, …, N
  – V < V’ iff V ≤ V’ and V ≠ V’
• For any two events e, e’
  – if e → e’ then V(e) < V(e’) … just like Lamport’s algorithm
  – if V(e) < V(e’) then e → e’
• Two events are concurrent if neither V(e) ≤ V(e’) nor V(e’) ≤ V(e)
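These comparisons are short enough to write out directly. In this sketch the function names are assumptions, and both vectors are assumed to have the same length N.

```python
def vc_leq(v, w) -> bool:
    """V <= W iff every element of V is <= the matching element of W."""
    return all(a <= b for a, b in zip(v, w))


def vc_less(v, w) -> bool:
    """V < W iff V <= W and the two vectors are not equal."""
    return vc_leq(v, w) and list(v) != list(w)


def vc_concurrent(v, w) -> bool:
    """Concurrent iff neither V <= W nor W <= V."""
    return not vc_leq(v, w) and not vc_leq(w, v)


# Hypothetical timestamps from a three-process system.
print(vc_less([1, 0, 0], [2, 1, 0]))        # causally ordered
print(vc_concurrent([2, 1, 0], [0, 0, 1]))  # neither dominates the other
```

Unlike Lamport timestamps, the comparison works in both directions, so causality can be read off the timestamps alone.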


Vector timestamps

[Figures: a sequence of slides illustrating vector timestamps]

Two events are concurrent if neither V(e) ≤ V(e’) nor V(e’) ≤ V(e)




“Distributed snapshots: determining global states of distributed systems”, K. Mani Chandy and Leslie Lamport, ACM TOCS 1985


Model of a Distributed System

• Finite set of processes as nodes.
• Finite set of channels as edges.
• Channels have infinite buffers, are error-free and FIFO.
• The delay experienced by a message is arbitrary but finite.

[Figure: processes p and q]






A banking example to illustrate recording of consistent states


Global State of a Distributed System

Global State:
• Union of the local states of the individual processes and the states of the channels.
• The state of a channel is determined by its messages in transit: messages that have been sent along the channel but not yet received.
• Initial global state of the system:
  – each process is in its initial state
  – the state of each channel is the empty sequence



Global State Detection

• Many problems in distributed systems can be solved by detecting a global state of the system.
• Stable property detection
  – A stable property is one that, once it becomes true, remains true forever.
  – E.g. termination, deadlock, token loss, etc.
• Checkpointing in distributed systems
  – E.g. debugging, failure recovery, etc.




Distributed Computation
• A distributed computation is a sequence of events.
• There are three kinds of events: local, send, receive.
• An event is an atomic action that may change the state of the process p and the state of at most one channel that is incident on p.

Definition of Event e
• An event is a five-tuple e = <p, s, s', M, c>, where
• p is the process in which the event occurs,
• s is the state of p immediately before the event,
• s' is the state of p immediately after the event,
• M is the message sent or received along the channel c.
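As a data structure, the five-tuple maps naturally onto a small record type. The sketch below assumes that a local event is represented by leaving M and c empty, which the definition does not spell out; the field name `s_after` (for s') and the sample banking values are also assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Event:
    p: str                    # process in which the event occurs
    s: Any                    # state of p immediately before the event
    s_after: Any              # state of p immediately after the event (s')
    M: Optional[Any] = None   # message sent or received (None: local event)
    c: Optional[str] = None   # channel used (None: local event)

    @property
    def is_local(self) -> bool:
        return self.M is None and self.c is None


# Hypothetical banking events: p holds 100 and sends 30 to q over channel "c_pq".
send = Event(p="p", s=100, s_after=70, M=30, c="c_pq")
receive = Event(p="q", s=50, s_after=80, M=30, c="c_pq")
audit = Event(p="q", s=80, s_after=80)  # local event, no channel involved
```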


Consistent Global State

• Consistency: every message that is recorded as received has also been recorded as sent.
• The consistent global state determined by a snapshot is a state that may have occurred during the computation.

A consistent global state satisfies both of the following conditions:
C1. Message conservation. Every message mij recorded as sent in the local state of process pi must appear either in the state of channel Cij or in the local state of the receiving process pj.
C2. In the resulting global state, for every effect, the cause that produced it must also be present.
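Conditions C1 and C2 can be checked mechanically on a recorded global state. The sketch below is one possible reading: it assumes messages carry unique ids, that each argument maps a channel (i, j) to a set of message ids, and that in C1 a sent message appears in exactly one of the two places. The function name `is_consistent` is hypothetical.

```python
def is_consistent(sent, in_channel, received) -> bool:
    """Check C1 and C2 for a recorded global state.

    sent[(i, j)]:       ids recorded as sent in the local state of pi
    in_channel[(i, j)]: ids recorded in the state of channel Cij
    received[(i, j)]:   ids recorded as received in the local state of pj
    """
    channels = set(sent) | set(in_channel) | set(received)
    for ch in channels:
        s = sent.get(ch, set())
        c = in_channel.get(ch, set())
        r = received.get(ch, set())
        # C1: each sent message is either in transit or received (exactly one).
        if any((m in c) == (m in r) for m in s):
            return False
        # C2: no effect without its cause; nothing in transit or received unsent.
        if not (c | r) <= s:
            return False
    return True


ok = is_consistent(
    sent={("p", "q"): {"m1", "m2"}},
    in_channel={("p", "q"): {"m2"}},
    received={("p", "q"): {"m1"}},
)
print(ok)
```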

Chandy–Lamport Algorithm

• Each process in the system records its local state and the state of its incoming channels.
• The recorded states form a consistent global state.
• The snapshot algorithm runs concurrently with the computation but does not alter the underlying computation.
• The snapshot algorithm uses a marker as a recording signal.
• Any process can initiate the snapshot by recording its own state and sending a marker on all outgoing channels.
• On receiving its first marker, a process records its own local state; the state of each incoming channel is recorded when the marker on that channel arrives.


Chandy–Lamport Algorithm contd.

Marker-Sending Rule for Process pi

(1) Process pi records its state.
(2) For each outgoing channel C on which a marker has not been sent, pi sends a marker along C before pi sends further messages along C.


Chandy–Lamport Algorithm contd.

Marker-Receiving Rule for Process pj

On receiving a marker along channel C:
  if pj has not recorded its state then
    Record the state of C as the empty set
    Execute the “marker sending rule”
  else
    Record the state of C as the set of messages received along C after pj’s state was recorded and before pj received the marker along C
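The two marker rules can be exercised in a small single-threaded simulation. This is a sketch under stated assumptions, not the paper's formulation: channels are FIFO deques, the local state is a bank balance, and the names `Process`, `MARKER`, `send_money` and `deliver_from` are invented for the example.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel used as the recording signal


class Process:
    def __init__(self, name, balance, peers, network):
        self.name, self.balance = name, balance
        self.peers = peers          # names of the other processes
        self.network = network      # maps (src, dst) to a FIFO deque
        self.recorded_state = None  # local state captured by the snapshot
        self.channel_state = {}     # finished recordings per incoming channel
        self.recording = {}         # incoming channels still being recorded

    def send_money(self, dst, amount):
        self.balance -= amount
        self.network[(self.name, dst)].append(amount)

    def record_and_send_markers(self):
        """Marker-sending rule; an initiator calls this to start a snapshot."""
        self.recorded_state = self.balance
        self.recording = {src: [] for src in self.peers}
        for dst in self.peers:      # marker goes out before any later message
            self.network[(self.name, dst)].append(MARKER)

    def deliver_from(self, src):
        """Deliver the oldest message on channel (src, self); marker-receiving rule."""
        msg = self.network[(src, self.name)].popleft()
        if msg == MARKER:
            if self.recorded_state is None:
                self.record_and_send_markers()  # this channel is recorded as empty
            self.channel_state[src] = self.recording.pop(src)
        else:
            self.balance += msg
            if src in self.recording:  # after own recording, before the marker
                self.recording[src].append(msg)


network = {("p", "q"): deque(), ("q", "p"): deque()}
p = Process("p", 100, ["q"], network)
q = Process("q", 50, ["p"], network)

q.send_money("p", 20)        # in transit when the snapshot starts
p.record_and_send_markers()  # p initiates the snapshot
p.send_money("q", 10)        # sent after the marker, so not in the snapshot
q.deliver_from("p")          # q gets the marker, records its state
p.deliver_from("q")          # p gets the 20 and records it as channel state
p.deliver_from("q")          # p gets q's marker, closing channel q -> p
q.deliver_from("p")          # q gets the 10 after the snapshot

total = p.recorded_state + q.recorded_state + sum(p.channel_state["q"]) + sum(q.channel_state["p"])
print(p.recorded_state, q.recorded_state, p.channel_state, q.channel_state, total)
```

In this run the recorded balances plus the recorded in-transit money add up to the initial total of 150, even though no single instant of the execution was frozen.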





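Appendix
• A relation on a set is called a partial order relation, or partial order, if and only if it is reflexive, antisymmetric, and transitive.
• Partial order: let A be a nonempty set and P a relation on A. Suppose P satisfies the following conditions:
  1. For every a ∈ A, (a, a) ∈ P (reflexivity);
  2. If (a, b) ∈ P and (b, a) ∈ P, then a = b (antisymmetry);
  3. If (a, b) ∈ P and (b, c) ∈ P, then (a, c) ∈ P (transitivity).
  Then P is called a partial order relation on A. If P is a partial order on A, we write a ≤ b for (a, b) ∈ P.
• Let ≤ be a partial order on A. If for every a, b ∈ A either a ≤ b or b ≤ a, then ≤ is called a total order or linear order on A.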
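For a finite set, the three conditions and the totality condition can be checked directly. In this sketch the function names and the two sample relations are assumptions; a relation is represented as a set of ordered pairs.

```python
def is_partial_order(A, P) -> bool:
    """P is a set of pairs over the finite set A."""
    reflexive = all((a, a) in P for a in A)
    antisymmetric = all(a == b for (a, b) in P if (b, a) in P)
    transitive = all((a, d) in P for (a, b) in P for (c, d) in P if b == c)
    return reflexive and antisymmetric and transitive


def is_total_order(A, P) -> bool:
    """A partial order in which every two elements are comparable."""
    return is_partial_order(A, P) and all(
        (a, b) in P or (b, a) in P for a in A for b in A
    )


# "a divides b" on {1, 2, 3, 6}: a partial order, but 2 and 3 are incomparable.
A = {1, 2, 3, 6}
divides = {(a, b) for a in A for b in A if b % a == 0}
# The usual <= on the same set: a total order.
leq = {(a, b) for a in A for b in A if a <= b}
print(is_partial_order(A, divides), is_total_order(A, divides), is_total_order(A, leq))
```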