![Page 1: DCS 6. Basic Distributed Algorithms Fundamentals Wei Yuan November,21,2013](https://reader033.vdocuments.net/reader033/viewer/2022061519/551c4545550346a66a8b462e/html5/thumbnails/1.jpg)
DCS 6. Basic Distributed Algorithms Fundamentals
Wei Yuan
November 21, 2013
Outline
• Physical Clocks
• Logical Clocks
  – Lamport’s Logical Clock
  – Vector Clock
• Global Snapshots
Physical Clocks
• Most computers today keep track of the passage of time with a battery-backed CMOS clock circuit driven by a quartz oscillator.
  – The battery backup lets the clock keep measuring time when the power is off
• Two registers are associated with the quartz crystal: a counter and a holding register
• A programmable interval timer generates an interrupt (clock tick) periodically
• The interrupt service procedure simply adds one to a counter in memory.
Problem
• Getting two systems to agree on the time
  – Two clocks hardly ever agree
  – Quartz oscillators oscillate at slightly different frequencies
• Clock drift: clocks tick at different rates
  – This creates an ever-widening gap in perceived time
• Clock skew: the difference between two clocks at one point in time
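As a rough illustration of how drift accumulates into skew, the sketch below models each clock as true time scaled by a constant rate error. The drift rates (in parts per million) and the function names are assumptions chosen for illustration, not measurements of real hardware.

```python
def clock_reading(true_seconds: float, drift_ppm: float) -> float:
    """Reading of a clock that runs fast (positive ppm) or slow (negative ppm)."""
    return true_seconds * (1.0 + drift_ppm * 1e-6)


def skew(true_seconds: float, drift_a_ppm: float, drift_b_ppm: float) -> float:
    """Difference between the readings of clocks A and B at one instant."""
    return clock_reading(true_seconds, drift_a_ppm) - clock_reading(true_seconds, drift_b_ppm)


one_day = 24 * 60 * 60
# Hypothetical clocks: A runs 10 ppm fast, B runs 10 ppm slow.
skew_after_one_day = skew(one_day, +10.0, -10.0)
```

Under these assumed rates the two clocks end up roughly 1.7 seconds apart after one day, and the gap keeps growing until the clocks are resynchronized.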
Solution
• International Atomic Time (TAI)
• Coordinated Universal Time (UTC)
• ……
• Time synchronization algorithms
Lamport’s Logical Clock
• A distributed system consists of a collection of distinct processes which are spatially separated, and which communicate with one another by exchanging messages.
  – A network of interconnected computers, e.g., the ARPA net
  – A single computer: the central control unit, the memory units, and the input-output channels are separate processes
• Lamport L. Time, clocks, and the ordering of events in a distributed system[J]. Communications of the ACM, 1978, 21(7): 558-565.
Lamport’s happened before (→) relation
• Define the "happened before" relation without using physical clocks (a partial ordering)
• Assumptions
  – The system is composed of a collection of processes
  – Each process consists of a sequence of events
  – An event may be, for example, the execution of a subprogram on a computer or the execution of a single machine instruction
• We are assuming that the events of a process form a sequence, where a occurs before b in this sequence if a happens before b.
Lamport’s happened before (→) relation
(1) If a and b are events in the same process, and a comes before b, then a → b.
(2) If a is the sending of a message by one process and b is the receipt of the same message by another process, then a → b.
(3) If a → b and b → c, then a → c.
• Two distinct events a and b are said to be concurrent if a ↛ b and b ↛ a.
• Assume that a ↛ a for any event a. (→ is an irreflexive partial ordering)
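The three rules can be sketched as a small computation: collect the per-process order and the send/receive pairs, then take the transitive closure. The event names, the `happened_before` and `concurrent` helpers, and the three-process example are all hypothetical and only meant to make the definition concrete.

```python
from itertools import product


def happened_before(process_events, messages):
    """Return the set of pairs (a, b) such that a -> b.

    process_events: dict mapping a process name to its events in local order.
    messages: list of (send_event, receive_event) pairs.
    """
    events = [e for seq in process_events.values() for e in seq]
    order = set()
    # Rule (1): order of events within one process.
    for seq in process_events.values():
        for i, a in enumerate(seq):
            for b in seq[i + 1:]:
                order.add((a, b))
    # Rule (2): the sending of a message happens before its receipt.
    order.update(messages)
    # Rule (3): transitive closure (Warshall-style, intermediate event k outermost).
    for k, a, b in product(events, repeat=3):
        if (a, k) in order and (k, b) in order:
            order.add((a, b))
    return order


def concurrent(a, b, order):
    """Two distinct events are concurrent if neither happened before the other."""
    return a != b and (a, b) not in order and (b, a) not in order


hb = happened_before(
    {"P": ["p1", "p2"], "Q": ["q1", "q2", "q3"], "R": ["r1"]},
    [("p1", "q2"), ("q3", "r1")],
)
```

In this example `("p1", "r1")` is in `hb` only through transitivity (p1 → q2 → q3 → r1), while `p2` and `q3` are concurrent because no chain of process order and messages connects them.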
Space-time diagram
• horizontal: space
• vertical: time
• dots: events
• vertical lines: processes
• wavy lines: messages
• A clock is just a way of assigning a number to an event (abstract)
  – Clock Ci for each process Pi
    • Ci assigns a number Ci(a) to any event a in the process
  – Clock C for the entire system
    • C(b) = Cj(b) if b is an event in process Pj
• Clock Condition
  – For any events a, b: if a → b then C(a) < C(b).
  – We cannot expect the converse condition to hold, since that would imply that any two concurrent events must occur at the same time (e.g., p2 and p3 are both concurrent with q3, yet p2 → p3).
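• A process’s clock “ticks”
  – Condition (1), Ci(a) < Ci(b) if a comes before b in process Pi, means that there must be a tick line between any two events on a process line
  – Condition (2), Ci(a) < Cj(b) if a is the sending of a message by Pi and b is its receipt by Pj, means that every message line must cross a tick line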
Event counting example
Lamport’s logical timestamps
• Process Pi’s clock is represented by a register Ci, so Ci(a) is the value contained by Ci during the event a.
• All processes use a local counter (logical clock) with an initial value of zero
• Just before each event, the local counter is incremented by 1 and assigned to the event as its timestamp
• A send (message) event carries its timestamp
• For a receive (message) event, the counter is updated to max(receiver’s local counter, message timestamp) + 1
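The rules above can be sketched as a small class. The class and method names are hypothetical; the update rules follow the bullets directly.

```python
class LamportClock:
    """Minimal sketch of a Lamport logical clock for one process."""

    def __init__(self):
        self.counter = 0  # initial value of zero

    def local_event(self) -> int:
        # Increment just before the event and use the result as its timestamp.
        self.counter += 1
        return self.counter

    def send(self) -> int:
        # Sending is itself an event; the message carries this timestamp.
        return self.local_event()

    def receive(self, message_timestamp: int) -> int:
        # max(receiver's local counter, message timestamp) + 1
        self.counter = max(self.counter, message_timestamp) + 1
        return self.counter


p, q = LamportClock(), LamportClock()
a = p.local_event()   # first event on p
m = p.send()          # p sends a message stamped with its clock
b = q.local_event()   # an unrelated event on q
c = q.receive(m)      # q's clock jumps past the message timestamp
```

Here the send gets timestamp 2 and the receive gets timestamp 3, so the clock condition holds for the message even though q's clock was behind p's when the message arrived.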
Event counting example
Applying Lamport’s algorithm
Problem: Identical timestamps
• Concurrent events (e.g., b and g; i and k) may have the same timestamp … or not
• Total ordering requires that every event be assigned a unique timestamp (number)
Unique timestamps (total ordering)
We can force each timestamp to be unique
• Define the global logical timestamp (Ti, i)
  – Ti represents the local Lamport timestamp
  – i represents the process number (globally unique)
    • e.g., (host address, process ID)
• Compare timestamps: (Ti, i) < (Tj, j) if and only if
  – Ti < Tj, or
  – Ti = Tj and i < j
• The resulting order does not necessarily relate to the actual event ordering
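Because Python compares tuples lexicographically, a (timestamp, process number) pair already behaves like the global timestamp described above. The event names and values below are made up for illustration and are not taken from the slide's figure.

```python
def global_timestamp(lamport_time, process_id):
    """Pair a local Lamport timestamp with a globally unique process number."""
    return (lamport_time, process_id)


# Hypothetical events: b and g share Lamport timestamp 2 on different processes.
stamped = {
    "b": global_timestamp(2, 1),
    "g": global_timestamp(2, 3),
    "c": global_timestamp(3, 1),
    "f": global_timestamp(1, 3),
}

# Sorting by the pair breaks the tie between b and g using the process number.
total_order = sorted(stamped, key=stamped.get)
```

The tie between `b` and `g` is resolved in favor of the lower process number, which is an arbitrary but consistent choice: it says nothing about which of the two events really occurred first.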
• Unique (totally ordered) timestamps
Problem: Detecting causal relations
• If C(a) < C(b)
  – We cannot conclude that a → b.
• By looking at Lamport timestamps
  – We cannot conclude which events are causally related
• Solution: use a vector clock
Vector clocks
Rules:
1. The vector is initialized to 0 at each process: Vi[j] = 0 for i, j = 1, …, N
2. A process increments its own element of its local vector before timestamping an event: Vi[i] = Vi[i] + 1
3. A message sent from process Pi has Vi attached to it
4. When Pj receives a message, it compares the vectors element by element and sets each element of its local vector to the higher of the two values: Vj[k] = max(Vj[k], Vi[k]) for k = 1, …, N
• For example, received: [0, 5, 12, 1], have: [2, 8, 10, 1], new timestamp: [2, 8, 12, 1]
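A minimal sketch of these rules follows. The names `merge` and `VectorClock` are hypothetical. One assumption is labeled in the code: the receive is treated as an event of the receiver, so the receiver also increments its own element after merging. The `merge` helper alone reproduces the element-wise maximum from the example above.

```python
def merge(local, received):
    """Rule 4: element-wise maximum of two vector timestamps."""
    return [max(a, b) for a, b in zip(local, received)]


class VectorClock:
    def __init__(self, process_index, num_processes):
        self.i = process_index
        self.v = [0] * num_processes        # rule 1: all zeros

    def tick(self):
        self.v[self.i] += 1                 # rule 2: increment own element
        return list(self.v)

    def send(self):
        return self.tick()                  # rule 3: message carries a copy

    def receive(self, received):
        self.v = merge(self.v, received)    # rule 4: take the higher values
        # Assumption: the receive is itself an event, so rule 2 applies too.
        return self.tick()


merged = merge([2, 8, 10, 1], [0, 5, 12, 1])

p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
p0.tick()
msg = p0.send()
after_receive = p1.receive(msg)
```

With three processes, the message leaves `p0` stamped `[2, 0, 0]`, and `p1` ends up at `[2, 1, 0]` once it has merged the message and counted its own receive event.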
Comparing vector timestamps
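• Define
  – V = V’ iff V[i] = V’[i] for i = 1, …, N
  – V ≤ V’ iff V[i] ≤ V’[i] for i = 1, …, N
  – V < V’ iff V ≤ V’ and V ≠ V’
• For any two events e, e’
  – if e → e’ then V(e) < V(e’) … just like Lamport’s algorithm
  – if V(e) < V(e’) then e → e’
• Two events are concurrent if neither V(e) ≤ V(e’) nor V(e’) ≤ V(e)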
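These comparisons translate almost directly into code. The function names below are hypothetical, and vectors are assumed to be equal-length lists of integers.

```python
def leq(v, w):
    """V <= W: every element of V is at most the matching element of W."""
    return all(a <= b for a, b in zip(v, w))


def less(v, w):
    """V < W: V <= W and the vectors differ in at least one element."""
    return leq(v, w) and v != w


def concurrent(v, w):
    """Neither V <= W nor W <= V, so the events are causally unrelated."""
    return not leq(v, w) and not leq(w, v)
```

For example, `less([1, 0, 0], [2, 1, 0])` holds, so the first event happened before the second, while `concurrent([2, 0, 0], [0, 0, 1])` holds because each vector is ahead of the other in some element. This is the information a single Lamport timestamp cannot provide.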
Vector timestamps
[Figure: step-by-step example on a space-time diagram with three processes. Every process starts at (0,0,0); the timestamps shown across the frames include (1,0,0), (2,0,0), (2,1,0), and (2,2,0).]
“Distributed snapshots: determining global states of distributed systems”, K. Mani Chandy and Leslie Lamport, ACM TOCS 1985
Model of a Distributed System
• Finite set of processes as nodes.
• Finite set of channels as edges.
• Channels have infinite buffers, are error-free, and are FIFO.
• The delay experienced by a message is arbitrary but finite.
[Figure: example system with processes p, q, and r connected by channels c1, c2, c3, and c4]
A banking example to illustrate recording of consistent states
Global State of a Distributed System
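Global State:
• Union of the local states of the individual processes and the states of the channels.
• The state of a channel is determined by the messages in transit: those that have been sent along the channel but not yet received.
• Initial global state of the system:
  – each process is in its initial state
  – the state of each channel is the empty sequence
Every component of a distributed system has a local state. Process state: described by the process's local memory and its history of activity. Channel state: described by the sequence of messages sent along the channel minus the sequence of messages received along it.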
Global State Detection
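• Many problems in distributed systems can be solved by detecting a global state of the system.
• Stable property detection
  – A stable property is one that, once it becomes true, remains true forever.
  – E.g., termination, deadlock, token loss, etc.
• Checkpointing in distributed systems
  – E.g., debugging, failure recovery, etc.
A distributed system has no shared memory and no global clock. Distributed characteristics such as local clocks and local memory make it hard to record the global state of the system efficiently.
Detecting stable properties such as deadlock and termination requires examining the global state of the system. For failure recovery, the global state of the distributed system must be saved periodically (a checkpoint); after a process failure, recovery restores the system to the most recently saved global state and resumes from there.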
Distributed Computation
• A distributed computation is a sequence of events.
• There are three kinds of events: local, send, receive.
• An event is an atomic action that may change the state of the process p and the state of at most one channel that is incident on p.
Definition of Event e
• An event is a five-tuple e = <p, s, s', M, c>, where
  – p is the process in which the event occurs,
  – s is the state of p immediately before the event,
  – s' is the state of p immediately after the event,
  – M is the message sent or received along the channel c.
Consistent Global State
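• Consistency: every message that is recorded as received has also been recorded as sent.
• The consistent global states determined by a snapshot are states that may have occurred during the computation.
A consistent global state satisfies both of the following conditions:
C1. Message conservation: a message mij recorded as sent in the local state of process pi must appear either in the state of channel Cij or in the local state of the receiving process pj.
C2. For every effect in the recorded global state, the cause of that effect must also be present.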
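For a single channel, the two conditions can be checked with plain set operations. This is only a sketch under simplifying assumptions: messages are identified by unique ids, and the function name `check_channel` and its three arguments (the ids recorded as sent, recorded as received, and recorded in the channel state) are hypothetical.

```python
def check_channel(sent, received, in_channel):
    """Return (C1, C2) for one channel's recorded message ids, given as sets."""
    # C1: every sent message is either still in the channel or already received,
    # and nothing is counted in both places.
    c1 = sent == (received | in_channel) and not (received & in_channel)
    # C2: every recorded receipt has its send recorded as well.
    c2 = received <= sent
    return c1, c2


consistent = check_channel({"m1", "m2"}, {"m1"}, {"m2"})
lost_message = check_channel({"m1", "m2"}, {"m1"}, set())
effect_without_cause = check_channel({"m1"}, {"m1", "m2"}, set())
```

The second case violates only C1 (a sent message has vanished from the record), while the third violates C2 as well, because `m2` is recorded as received without ever being recorded as sent.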
Chandy–Lamport Algorithm
• Each process in the system records its local state and the state of its incoming channels.
• The recorded states form a consistent global state.
• The snapshot algorithm runs concurrently with the computation but does not alter the underlying computation.
• The snapshot algorithm uses a marker as a recording signal.
• Any process can initiate the snapshot by recording its own state and sending a marker on all of its outgoing channels.
• On receiving its first marker, a process records its own local state and starts recording the states of its incoming channels.
Chandy–Lamport Algorithm contd.
Marker-Sending Rule for Process pi
(1) Process pi records its state.
(2) For each outgoing channel C on which a marker has not been sent, pi sends a marker along C before pi sends further messages along C.
Chandy–Lamport Algorithm contd.
Marker-Receiving Rule for Process pj
On receiving a marker along channel C:
  if pj has not recorded its state then
    Record the state of C as the empty set
    Execute the “marker sending rule”
  else
    Record the state of C as the set of messages received along C after pj’s state was recorded and before pj received the marker along C
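The two rules can be exercised in a small single-threaded simulation. Everything below is a sketch under stated assumptions: two hypothetical bank processes `p` and `q`, FIFO channels modeled as `deque` objects, integer messages standing for money transfers, and message delivery driven by hand rather than by a network.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance       # local state: an account balance
        self.out = {}                # destination name -> outgoing FIFO channel
        self.incoming = []           # names of processes with a channel to us
        self.recorded_state = None
        self.channel_state = {}      # source name -> messages recorded for it
        self.recording = set()       # incoming channels still being recorded

    def send(self, dest, amount):
        self.balance -= amount
        self.out[dest].append(amount)

    def record_and_send_markers(self):
        # Marker-sending rule: record own state, then send a marker on every
        # outgoing channel before any further message.
        self.recorded_state = self.balance
        self.channel_state = {src: [] for src in self.incoming}
        self.recording = set(self.incoming)
        for channel in self.out.values():
            channel.append(MARKER)

    def deliver(self, src, message):
        if message == MARKER:
            # Marker-receiving rule.
            if self.recorded_state is None:
                self.record_and_send_markers()
            self.recording.discard(src)  # channel from src is now fully recorded
        else:
            self.balance += message
            if src in self.recording:    # received after recording, before marker
                self.channel_state[src].append(message)


def connect(src, dst):
    channel = deque()
    src.out[dst.name] = channel
    dst.incoming.append(src.name)
    return channel


p, q = Process("p", 100), Process("q", 50)
p_to_q, q_to_p = connect(p, q), connect(q, p)

q.send("p", 20)                    # 20 is in transit on q -> p
p.record_and_send_markers()        # p initiates the snapshot
p.send("q", 10)                    # sent after the marker on p -> q
p.deliver("q", q_to_p.popleft())   # p receives 20 while recording q -> p
q.deliver("p", p_to_q.popleft())   # q receives the marker and records itself
q.deliver("p", p_to_q.popleft())   # q receives 10, after its snapshot
p.deliver("q", q_to_p.popleft())   # p receives q's marker

snapshot_total = (p.recorded_state + q.recorded_state
                  + sum(p.channel_state["q"]) + sum(q.channel_state["p"]))
```

In this run the snapshot records 100 for `p`, 30 for `q`, and the transfer of 20 as the state of the channel from `q` to `p`, so the recorded total of 150 matches the money in the system even though no single instant of the execution had exactly those balances.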
Thanks! Q&A
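Appendix
• A relation R on a set A is called a partial order relation, or partial order, if and only if R is reflexive, antisymmetric, and transitive.
• Partial order: let A be a nonempty set and P a relation on A. If P satisfies the following conditions:
  1. for every a ∈ A, (a, a) ∈ P (reflexivity);
  2. if (a, b) ∈ P and (b, a) ∈ P, then a = b (antisymmetry);
  3. if (a, b) ∈ P and (b, c) ∈ P, then (a, c) ∈ P (transitivity);
  then P is called a partial order relation on A. If P is a partial order relation on A, we write a ≤ b to denote (a, b) ∈ P.
• Let ≤ be a partial order on A. If for every a, b ∈ A either a ≤ b or b ≤ a, then ≤ is called a total order or linear order on A.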
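For a finite set, these definitions can be checked by brute force. The function names and the two example relations (divisibility and the usual ≤ on a small set of integers) are chosen here for illustration only.

```python
def is_partial_order(elements, relation):
    """Check reflexivity, antisymmetry, and transitivity of a set of pairs."""
    reflexive = all((a, a) in relation for a in elements)
    antisymmetric = all(a == b for (a, b) in relation if (b, a) in relation)
    transitive = all((a, d) in relation
                     for (a, b) in relation
                     for (c, d) in relation if b == c)
    return reflexive and antisymmetric and transitive


def is_total_order(elements, relation):
    """A partial order in which every two elements are comparable."""
    return is_partial_order(elements, relation) and all(
        (a, b) in relation or (b, a) in relation
        for a in elements for b in elements)


A = {1, 2, 3, 6}
divides = {(a, b) for a in A for b in A if b % a == 0}
at_most = {(a, b) for a in A for b in A if a <= b}
```

Divisibility on `A` is a partial order but not a total one, since 2 and 3 are incomparable, whereas `at_most` is total. Note that these definitions describe a reflexive order, while Lamport's happened-before relation is the irreflexive (strict) variant.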