distributed synchronization

04/19/23 ICSS741 - Time and Coordination 1

Distributed Synchronization

• In single CPU systems
  – Semaphores and monitors
  – Essentially shared memory solutions
• How about distributed synchronization?
  – Relevant information is scattered
  – Processes make decisions based on local information
  – A single point of failure in a system should be avoided
  – No common clock or other precise global time source exists


What Is Needed

• In order to coordinate events in a distributed system
  – We may need to know the time at which a particular event took place
  – We may need to determine the order in which two events took place, or should take place, without respect to the time they actually occur
• Synchronization requires either
  – Global time
  – Global ordering


Time Synchronization

• Is it possible to synchronize all clocks to produce a single, unambiguous time standard?

• Time synchronization need not be absolute
  – What usually matters is that processes agree on the order in which events occur (not necessarily the time at which they occur)


Time and Coordination

• Basically two problems with time
  – External synchronization
    • Synchronize clocks with an authoritative external source of time
  – Internal synchronization
    • The internal consistency of the clocks is what matters, not whether they are close to the real time


Astronomical Time

• Since the 17th century time has been measured astronomically
  – The event of the sun reaching the highest point in the sky is called the transit of the sun
  – The interval between two consecutive transits of the sun is called a solar day
• In the 1940s, it was established that the earth’s rotation is not constant
  – The earth is spinning slower
  – 300 million years ago there were about 400 days per year


Atomic Time

• The atomic clock was invented in 1948
  – One second is the time it takes the cesium-133 atom to make 9,192,631,770 transitions
  – Currently about 50 cesium-133 clocks exist
  – Periodically they are averaged to produce international atomic time (TAI)
  – The Bureau International de l’Heure (BIH) maintains the official clock


Leap Seconds

• Currently 86,400 TAI seconds is about 3 msec less than a mean solar day
  – Not a problem until noon becomes 6am
• BIH solves the problem by inserting leap seconds to compensate for the difference
  – Leap seconds are added whenever the discrepancy grows to 800 msec
  – Power companies will increase their frequencies to compensate
• UTC (Coordinated Universal Time) is the result


Obtaining Accurate Time

• UTC is an international standard for the current time
  – WWV shortwave radio from Fort Collins (accuracy 0.1 – 10 milliseconds)
  – GEOS satellites (0.1 milliseconds)
  – GPS satellites (1 millisecond)


Physical Clocks

• Computers each contain their own physical clock
  – Timer might be a better word…
  – Utilizes a crystal that oscillates at a known frequency
  – A count of the oscillations is maintained
  – Software typically takes this count, divides it down, and stores it as a number in a register
• Most systems provide date/time from the counter
• Ordering events, in a single machine, with such a clock is easy
  – Provided the clock resolution is fine enough


Clock Drift

• Crystal-based clocks are subject to drifting
  – Drift rate: the change in the offset between the clock and a nominal perfect reference clock per unit of time measured by the reference clock
• Typical drift rates
  – Quartz crystals – 10^-6 (a difference of about one second every 1,000,000 seconds, or 11.6 days)
  – Atomic clocks – 10^-13
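As a rough check of the quartz figure, the following Python sketch computes how long a clock with a given drift rate takes to accumulate a given offset. The function name `seconds_to_drift` is hypothetical, and the drift rate is assumed to be constant, which real crystals only approximate.

```python
SECONDS_PER_DAY = 86_400


def seconds_to_drift(offset_s: float, drift_rate: float) -> float:
    """Reference-clock seconds needed to accumulate offset_s of error.

    Assumes a constant drift rate (offset gained per reference second).
    """
    if drift_rate <= 0:
        raise ValueError("drift_rate must be positive")
    return offset_s / drift_rate


# Days for a quartz clock (drift rate 10^-6) to be off by one second.
quartz_days = seconds_to_drift(1.0, 1e-6) / SECONDS_PER_DAY

# Years for an atomic clock (drift rate 10^-13) to be off by one second.
atomic_years = seconds_to_drift(1.0, 1e-13) / (SECONDS_PER_DAY * 365)

print(f"quartz: 1 s of drift after about {quartz_days:.1f} days")
print(f"atomic: 1 s of drift after about {atomic_years:,.0f} years")
```

The quartz result comes out at roughly 11.6 days, matching the figure quoted on the slide.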


External Synchronization

• Let’s say you have access to a UTC time source
• Assume the machine has a timer that causes an interrupt H times a second
  – Current clock value is C
  – When UTC time is t, the value of the clock on machine p is Cp(t)
  – Ideally Cp(t) = t for all p and t (dC/dt should be 1)


Maximum Drift Rate

[Figure: clock time C plotted against UTC t. Perfect clock: dC/dt = 1. Fast clock: dC/dt > 1. Slow clock: dC/dt < 1.]


Synchronizing Physical Time

• What exactly does it mean to synchronize two clocks?
  – Clocks inherently suffer from drifting
  – Assuming clocks can always be precisely synchronized is unrealistic
  – Define an acceptable range for the difference in time reported by two clocks (clock skew)

• A distributed physical clock synchronization service defines, and maintains, a maximum skew throughout the system.


The Basic Algorithm

• A wants to read B’s clock
  1) A sends a request to B
  2) B records its current clock value
  3) The clock value is sent back to A
  4) B’s clock value is adjusted to reflect travel time
  5) B’s clock value can now be compared to A’s

• Step 4 is difficult to implement accurately


Interesting Question

• So you have to adjust your time
  – Your clock is slow – move it ahead
  – Your clock is fast – move it back?
• Implementations
  – Slow down your clock so it will continually move towards the real time
  – Speed up your clock so it will move towards the real time
  – Just move your clock ahead to the real time
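The gradual options can be sketched as follows, assuming a timer interrupt that normally adds a fixed number of milliseconds to the clock. The function name `slew` and the 10 ms tick and 1 ms adjustment are illustrative assumptions, not values from the slides. The point of the sketch is that a fast clock is corrected without ever running backwards.

```python
from typing import List


def slew(clock_ms: int, error_ms: int, tick_ms: int = 10,
         adjust_ms: int = 1) -> List[int]:
    """Return successive clock readings while error_ms is absorbed.

    error_ms > 0 means the clock is fast (ahead of real time), so each
    interrupt adds less than tick_ms. error_ms < 0 means the clock is
    slow, so each interrupt adds more. The clock never moves backwards.
    """
    if not 0 < adjust_ms < tick_ms:
        raise ValueError("adjust_ms must be between 0 and tick_ms")
    readings = []
    remaining = error_ms
    while remaining != 0:
        step = min(adjust_ms, abs(remaining))
        if remaining > 0:
            clock_ms += tick_ms - step  # fast clock: run slow for a while
            remaining -= step
        else:
            clock_ms += tick_ms + step  # slow clock: run fast for a while
            remaining += step
        readings.append(clock_ms)
    return readings


# A clock that is 3 ms fast is corrected over three interrupts.
fast = slew(clock_ms=1_000, error_ms=3)
print(fast)  # [1009, 1018, 1027]
```

During those three interrupts real time advances 30 ms while the clock advances 27 ms, so the 3 ms lead disappears and every reading is still larger than the one before it.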


Cristian’s Algorithm

• One machine knows the true time

• Periodically each machine sends a request for the current time

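[Figure: timeline of one exchange. The client sends a request at T0 and receives the reply carrying C_UTC at T1. I is the interrupt handling time at the time server. T0 and T1 are both measured with the same clock.]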


Transit Time

• Estimating propagation time
  – ( T1 – T0 ) / 2
  – ( T1 – T0 – I ) / 2
  – If minimum possible propagation delay is known, the estimate can be made better
• Accuracy can be improved by taking several measurements
  – Any measurement in which T1 – T0 exceeds a threshold is discarded (congestion)
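A minimal sketch of the estimate in Python follows. The function names are hypothetical, and averaging the measurements that survive the threshold is one reasonable choice assumed here; the slides only say that several measurements are taken and slow ones are discarded.

```python
from typing import Iterable, Tuple


def propagation_estimate(samples: Iterable[Tuple[float, float]],
                         interrupt_time: float = 0.0,
                         threshold: float = float("inf")) -> float:
    """Average (T1 - T0 - I) / 2 over the measurements that are kept.

    samples holds (T0, T1) pairs read from the client's own clock.
    Measurements whose round trip T1 - T0 exceeds threshold are dropped.
    """
    kept = [
        (t1 - t0 - interrupt_time) / 2
        for t0, t1 in samples
        if t1 - t0 <= threshold
    ]
    if not kept:
        raise ValueError("every measurement exceeded the threshold")
    return sum(kept) / len(kept)


def cristian_time(c_utc: float, propagation: float) -> float:
    """Server time corrected for the estimated one-way delay."""
    return c_utc + propagation


# All values in milliseconds; the third round trip (300 ms) is discarded.
samples_ms = [(0, 20), (1000, 1024), (2000, 2300)]
delay_ms = propagation_estimate(samples_ms, interrupt_time=4, threshold=100)
print(delay_ms)                                     # 9.0
print(cristian_time(c_utc=50_000, propagation=delay_ms))  # 50009.0
```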


ICMP Timestamp Request/Reply

Type (13 or 14) | Code (0) | Checksum
Identifier | Sequence Number
32-bit originate timestamp
32-bit receive timestamp
32-bit transmit timestamp

rtt: the originate timestamp and the arrival time of the reply are read from the same clock, so the difference is accurate
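The message layout can be illustrated by packing a request with Python's `struct` module. This is only a sketch of the byte layout: the function names are hypothetical, the packet is built but not sent (raw ICMP sockets usually need elevated privileges), and the type value 13 for a timestamp request and the millisecond-since-midnight-UT timestamp unit are taken from RFC 792 rather than from the slides.

```python
import struct

ICMP_TIMESTAMP_REQUEST = 13  # RFC 792: 13 = request, 14 = reply


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of data, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_timestamp_request(identifier: int, sequence: int,
                            originate_ms: int) -> bytes:
    """Pack a 20-byte timestamp request.

    The receive and transmit timestamps are left at 0 for the
    responder to fill in.
    """
    header = struct.pack("!BBHHHIII", ICMP_TIMESTAMP_REQUEST, 0, 0,
                         identifier, sequence, originate_ms, 0, 0)
    checksum = internet_checksum(header)
    return header[:2] + struct.pack("!H", checksum) + header[4:]


# 43,200,000 ms after midnight is noon.
packet = build_timestamp_request(identifier=0x1234, sequence=1,
                                 originate_ms=43_200_000)
print(len(packet), packet.hex())
```

Running the checksum over the finished packet yields 0, which is how a receiver verifies it.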


The Berkeley Algorithm

• Time server is active, and polls each machine periodically for its time
  – Based on the answers, an average time is computed
  – A fault-tolerant average is used
  – Machines are then told to slow down, or speed up their clocks

• Suitable for systems where no UTC source is available


Berkeley Algorithm

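[Figure: Berkeley algorithm example. The time server's current time is 740. Machine A reports a current time of 720 over a link with network delay 10, giving an adjusted time of 730. Machine B reports a current time of 737 over a link with network delay 5, giving an adjusted time of 742. The average is 737, so A is told to move its clock forward by 7.]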
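The averaging step can be sketched in Python using the numbers from the example (server at 740, A at 720 with delay 10, B at 737 with delay 5). The function name is hypothetical, and the fault-tolerant average shown here, which ignores clocks too far from the median, is only one possible interpretation assumed for illustration.

```python
from typing import Dict, Optional, Tuple


def berkeley_adjustments(server_time: float,
                         reported: Dict[str, float],
                         delays: Dict[str, float],
                         max_deviation: Optional[float] = None
                         ) -> Tuple[float, Dict[str, float]]:
    """Return the average time and the correction for every clock.

    reported holds the clock value each machine returned, and delays
    holds the estimated network delay of each reply.
    """
    adjusted = {"server": server_time}
    for name, value in reported.items():
        adjusted[name] = value + delays[name]

    values = list(adjusted.values())
    if max_deviation is not None:
        # Assumed fault-tolerance rule: drop clocks far from the median.
        median = sorted(values)[len(values) // 2]
        values = [v for v in values if abs(v - median) <= max_deviation]

    average = sum(values) / len(values)
    corrections = {name: average - value for name, value in adjusted.items()}
    return average, corrections


average, corrections = berkeley_adjustments(
    server_time=740,
    reported={"A": 720, "B": 737},
    delays={"A": 10, "B": 5},
)
print(round(average), {k: round(v) for k, v in corrections.items()})
```

With these inputs the average rounds to 737 and A's correction rounds to +7, in line with the example. A positive correction means the clock should speed up until it has gained that much; a negative one means it should slow down.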


Network Time Protocol

• NTP is used to synchronize the time of a computer client to another server or reference time source

• Client accuracies are typically within a millisecond on LANs and up to a few tens of milliseconds on WANs

• NTP configurations utilize multiple redundant servers and diverse network paths in order to achieve high accuracy and reliability

• Configurations can use authentication to prevent accidental or malicious protocol attacks


NTP Strata

Primary Servers (stratum 1) sync to UTC source

Secondary Servers (stratum 2) sync to primary servers

Workstations


USNA NTP Time Servers


Rules of Engagement

• Clients should avoid using the primary servers whenever possible
  – In most cases the accuracy of the NTP secondary (stratum 2) servers is only slightly degraded relative to the primary servers
  – As a group, the secondary servers may be just as reliable


When to Use a Primary

• As a general rule, a secondary server should use a primary server only when
  – The secondary server provides synchronization to a sizable population of other servers and clients
  – The server operates with at least two and preferably three other secondary servers in a common synchronization subnet
  – The administration(s) that operates these servers coordinates other servers within the region, in order to reduce the resources required outside that region

• In order to ensure reliability, clients should spread their use over many different servers


NTP Servers

• http://www.ntp.org (home page for NTP)

• List of Primary Servers (100)
  – http://www.eecis.udel.edu/~mills/ntp/clock1.htm
• List of Secondary Servers (110)
  – http://www.eecis.udel.edu/~mills/ntp/clock2.htm
• Our server
  – timehost.cs.rit.edu


Synchronization Modes

• Servers synchronize in one of three modes
  – Multicast
    • Used on high speed LANs
    • Servers periodically broadcast their time
    • Low accuracies, but efficient
  – Procedure-call
    • Similar to the operation of Cristian’s algorithm
  – Symmetric
    • Used by master servers
    • Pairs of servers exchange information
    • Timing data is retained in order to improve accuracy


NTP Design Goals

• The four primary design goals of NTP are
  – Allow accurate UTC synchronization
  – Enable survival despite significant losses of connectivity
  – Allow frequent resynchronization
  – Protect against malicious or accidental interference


Accurate Synchronization

• NTP provides the following information relative to the primary server
  – Clock offset
    • Difference between the two clocks
  – Round-trip delay
    • Total transmission time for the messages
  – Dispersion
    • Offsets are predicted
    • Dispersion is a measure of how much the prediction differs from what was reported
    • Large dispersion values indicate inaccuracy
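The offset and round-trip delay can be computed from the four timestamps of one request/reply exchange. The sketch below uses the standard formulas from the NTP specification, which the slides do not spell out; the function and parameter names are hypothetical, and dispersion is not modelled.

```python
from typing import Tuple


def ntp_offset_and_delay(t1: float, t2: float,
                         t3: float, t4: float) -> Tuple[float, float]:
    """Estimate clock offset and round-trip delay from one exchange.

    t1: client clock when the request is sent
    t2: server clock when the request arrives
    t3: server clock when the reply is sent
    t4: client clock when the reply arrives
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)          # time spent on the network only
    return offset, delay


# Client is 50 units behind the server; each direction takes 10 units
# and the server spends 5 units handling the request.
offset, delay = ntp_offset_and_delay(t1=100, t2=160, t3=165, t4=125)
print(offset, delay)  # 50.0 20
```

The offset estimate is exact only when the two directions take equal time, which is the same assumption behind halving the round trip in Cristian's algorithm.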


Logical Clocks

• Since physical clocks cannot be perfectly synchronized across a distributed system
  – Physical time cannot be used to determine the order in which events occur

• Logical clocks can be used to order events within a distributed system

• The essence of a logical clock is the happens-before relationship


Happens-Before

• The happens-before relationship is denoted a → b
  – If a and b are events in the same process, and a occurs before b, then a → b
  – If a is the event of a message being sent by one process, and b is the event of the message being received by another process, then a → b
  – If a → b, and b → c, then a → c
• Any two events that are not in a happens-before relationship are concurrent


Events

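[Figure: processes p1, p2, and p3 drawn along a physical time axis. Events a and b occur at p1, c and d at p2, e and f at p3. Messages m1 and m2 pass between the processes.]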


Lamport’s Logical Clock

• To obtain logical ordering, timestamps that are independent of physical clocks are used

• Lamport clocks follow these rules
  – Each process increments its clock between every two consecutive events
  – If a sends a message to b, the message includes T(a). Upon receipt, b sets its clock to the greater of T(a)+1 and the current clock
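These rules can be sketched as a small Python class. The class and method names are hypothetical. The receive step is written as max(local, T(a)) + 1, which satisfies both rules at once: the result is at least T(a)+1, and the local clock still advances between consecutive events. The message pattern in the usage example (m1 sent at b and received at c, m2 sent at d and received at f) is an assumption chosen for illustration.

```python
class LamportClock:
    """Logical clock for a single process."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Timestamp a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Timestamp a send event; the result travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Timestamp a receive event for a message stamped message_time."""
        self.time = max(self.time, message_time) + 1
        return self.time


p1, p2, p3 = LamportClock(), LamportClock(), LamportClock()

a = p1.tick()        # local event on p1
b = p1.send()        # p1 sends m1
c = p2.receive(b)    # p2 receives m1
d = p2.send()        # p2 sends m2
e = p3.tick()        # local event on p3
f = p3.receive(d)    # p3 receives m2

print(a, b, c, d, e, f)  # 1 2 3 4 1 5
```

Note that a and e both receive timestamp 1 even though they are different events in different processes.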


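Lamport’s Algorithm

[Figure: three processes whose clocks tick at different rates, shown before and after Lamport's correction.
Before: clock values 0, 1, 2, ..., 10; 0, 8, 16, ..., 80; 0, 10, 20, ..., 100. Events ordered by timestamp: A(6) F(10) B(24) C(50) D(60) E(64).
After: clock values 0, 1, ..., 8, 70, 71; 0, 8, ..., 48, 61, 69, 77, 85; 0, 10, 20, ..., 100. Events ordered by timestamp: A(6) B(24) C(50) D(60) E(64) F(71).]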


Partial Ordering

• If a → b then L(a) < L(b)
• Note that
  – L(d) < L(e)
• Does not imply that
  – d → e
  – Since d and e might be concurrent
• Plus L(a) might equal L(b) for two distinct events a and b in different processes


Example

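[Figure: the events diagram annotated with Lamport timestamps along a physical time axis. p1: a = 1, b = 2. p2: c = 3, d = 4. p3: e = 1, f = 5. Messages m1 and m2 pass between the processes.]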


Total Ordering

• Between every event the clock must tick at least once

• Since events cannot happen at the same time, attach the process number to the low-order end of the time, separated by a decimal point

• Now
  – If a happens before b in the same process, C(a) < C(b)
  – If a and b represent the sending and receiving of a message, C(a) < C(b)
  – For all events a and b, C(a) is not equal to C(b)
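One way to sketch this tie-breaking in Python is to represent each timestamp as a (time, process number) pair instead of a literal decimal number. That substitution is an assumption made here: pairs compare in the same way as "time.process" but stay unambiguous when process numbers reach 10 or more. The event timestamps below are illustrative values.

```python
from typing import Dict, List, Tuple

# Each event maps to (Lamport time, process number).
events: Dict[str, Tuple[int, int]] = {
    "a": (1, 1),
    "b": (2, 1),
    "c": (3, 2),
    "d": (4, 2),
    "e": (1, 3),
    "f": (5, 3),
}


def total_order(stamped: Dict[str, Tuple[int, int]]) -> List[str]:
    """Sort events by time first, then by process number to break ties."""
    return sorted(stamped, key=lambda name: stamped[name])


print(total_order(events))  # ['a', 'e', 'b', 'c', 'd', 'f']
```

Events a and e share Lamport time 1, and the process number decides that a comes first. The resulting order is arbitrary for concurrent events but is the same at every process, which is all that total ordering requires.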


Vector Clocks

• Vector clocks were designed to overcome the shortcomings of Lamport’s clocks
  – A vector clock is an array of times
• The rules:
  – Initially, Vi[j] = 0, for i, j = 1, 2, …, N
  – Just before pi timestamps an event, it increments Vi[i]
  – pi includes the value t = Vi in every message it sends
  – When pi receives a timestamp in a message, it takes the component-wise maximum of the two vector timestamps
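The rules can be sketched as a Python class. The class and method names are hypothetical, process indices are 0-based here rather than 1-based, and the receive step merges first and then increments the receiver's own entry so that the receive itself is timestamped as an event. The message pattern in the usage example (m1 from b to c, m2 from d to f) is assumed for illustration.

```python
from typing import List, Sequence, Tuple


class VectorClock:
    """Vector clock for process number index out of n processes."""

    def __init__(self, index: int, n: int) -> None:
        self.index = index
        self.v: List[int] = [0] * n

    def event(self) -> Tuple[int, ...]:
        """Increment our own entry and return the new timestamp."""
        self.v[self.index] += 1
        return tuple(self.v)

    def send(self) -> Tuple[int, ...]:
        """Timestamp a send event; the result travels with the message."""
        return self.event()

    def receive(self, timestamp: Sequence[int]) -> Tuple[int, ...]:
        """Merge with the component-wise maximum, then timestamp the receive."""
        self.v = [max(mine, theirs) for mine, theirs in zip(self.v, timestamp)]
        return self.event()


p1, p2, p3 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)

a = p1.event()       # (1, 0, 0)
b = p1.send()        # (2, 0, 0), carried by m1
c = p2.receive(b)    # (2, 1, 0)
d = p2.send()        # (2, 2, 0), carried by m2
e = p3.event()       # (0, 0, 1)
f = p3.receive(d)    # (2, 2, 2)

print(a, b, c, d, e, f)
```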


Example

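[Figure: the events diagram annotated with vector timestamps along a physical time axis. p1: a = (1,0,0), b = (2,0,0). p2: c = (2,1,0), d = (2,2,0). p3: e = (0,0,1), f = (2,2,2). Messages m1 and m2 pass between the processes.]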


Comparing Timestamps

• Vector timestamps are compared as follows
  – V = V’ iff V[j] = V’[j] for j = 1, 2, …, N
  – V <= V’ iff V[j] <= V’[j] for j = 1, 2, …, N
  – V < V’ iff V <= V’ and V != V’
• So what?
  – If V(e) < V(e’) then e → e’
  – c and e are concurrent since neither V(c) <= V(e) nor V(e) <= V(c)
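The comparison rules translate directly into Python. The function names are hypothetical, and the timestamps used are V(a) = (1,0,0), V(c) = (2,1,0), V(e) = (0,0,1), and V(f) = (2,2,2).

```python
from typing import Sequence


def leq(v: Sequence[int], w: Sequence[int]) -> bool:
    """V <= W: every component of V is <= the matching component of W."""
    return all(x <= y for x, y in zip(v, w))


def less(v: Sequence[int], w: Sequence[int]) -> bool:
    """V < W: V <= W and the two timestamps differ."""
    return leq(v, w) and tuple(v) != tuple(w)


def concurrent(v: Sequence[int], w: Sequence[int]) -> bool:
    """Neither V <= W nor W <= V, so neither event happened before the other."""
    return not leq(v, w) and not leq(w, v)


V_a, V_c, V_e, V_f = (1, 0, 0), (2, 1, 0), (0, 0, 1), (2, 2, 2)

print(less(V_a, V_f))        # True: a happened before f
print(concurrent(V_c, V_e))  # True: c and e are concurrent
print(less(V_c, V_c))        # False: a timestamp is not less than itself
```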
