ICS362 – Distributed Systems

Dr. Ken Cosh

Lecture 7

Recap

Synchronisation
– Clock Synchronisation
– Logical Clocks
– Mutual Exclusion Algorithms
– Election Algorithms

This Week

Replication & Consistency
– Data Centric Consistency Models
– Client Centric Consistency Models
– Replica Management
– Consistency Protocols

Replication

Making copies of a data store. Why?

– Reliability
  If one goes down, we can use another one.
  Triple Modular Redundancy
– Performance
  Scalability by numbers
  – Reduces load on server
  Scalability by geographical area
  – A physically more proximate copy of the data

Why Replicate?

Enhance reliability.

Improve performance.

Replicas allow remote sites to continue working in the event of local failures.

It is also possible to protect against data corruption.

Replicas allow data to reside close to where it is used.

This directly supports the distributed systems goal of enhanced scalability.

Replication Performance?

Having geographically close replicas may make the local process perceive better performance.

But keeping all replicas up to date may increase the bandwidth the system consumes.
– i.e. consistency problems

Retrieving a webpage from my machine’s cache might be quick, but is it up to date?

Consistency

Consider if P accesses a local replica N times per second.

Consider if that replica is updated M times per second.
– If N<<M then many of the updates will never be read, so the network traffic spent propagating them is wasted.

What is consistency?

There are different types, but here is one:
– If a read operation is performed at any copy, it will return the same result.
– If an update operation is performed at any copy, it will propagate to all copies before any subsequent operation.

This is sometimes referred to as “Strict Consistency” or synchronous replication.

Consistency

Synchronising Replicas
– Lamport’s timestamps?
– A co-ordinator responsible for managing them?

Either way, there is significant communication across the network.
– Improved performance through scalability, but decreased performance through network traffic?

Consistency

Relax the consistency constraints?
– Replicas may not always be the same everywhere.
– Performance may be improved

It depends on the system as to what consistency model is adequate.

Consistency Models

A contract between processes and the data store;
– If the processes obey certain rules, the data store will behave in an expected way.

Generally processes expect that when they make a read operation they get the results of the last write operation.
– But without a global clock, what is the last write operation?

Continuous Consistency

Where deviations between replicas form continuous consistency ranges with tolerated inconsistencies specified.
– Numerical Value deviations
– Staleness deviations
– Ordering deviations

Numerical Value Deviations

Consider exchange rates across bank branches
– We could specify 2 copies should not deviate by more than $0.02 (absolute numerical deviation)
– Or by more than 0.5% (relative numerical deviation)

So if the rate goes up (i.e. one replica is updated), the other replicas can still be considered mutually consistent as long as the differences stay within these bounds.
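The two bounds above can be checked mechanically. The following is a minimal sketch, not a prescribed implementation: the function names `within_absolute` and `within_relative` are hypothetical, the rates are made-up values, and measuring the relative deviation against the larger of the two copies is an assumption.

```python
from decimal import Decimal

def within_absolute(a: Decimal, b: Decimal, bound: Decimal) -> bool:
    """True if two copies differ by no more than `bound` (e.g. $0.02)."""
    return abs(a - b) <= bound

def within_relative(a: Decimal, b: Decimal, bound: Decimal) -> bool:
    """True if two copies differ by no more than the fraction `bound` (e.g. 0.005 for 0.5%).

    Assumption: the deviation is measured relative to the larger copy.
    """
    reference = max(abs(a), abs(b))
    if reference == 0:
        return True  # both copies are zero, so they cannot deviate
    return abs(a - b) / reference <= bound

# Hypothetical exchange rates held at two branches.
branch_a = Decimal("35.10")
branch_b = Decimal("35.11")
print(within_absolute(branch_a, branch_b, Decimal("0.02")))
print(within_relative(branch_a, branch_b, Decimal("0.005")))
```

A replica that fails either check would need to be brought up to date before it can again be treated as consistent under that bound.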

Staleness Deviations

Relate to the last time a replica was updated.
– Consider weather reports.
  They stay relatively accurate over several hours
– In this case the main server may decide to update the replicas periodically.

Ordering deviations

Concern the order in which updates occur.

An update on one replica may be tentatively applied, while waiting for global agreement
– So some updates may need to be rolled back and reapplied in a different order before becoming permanent.

Inconsistencies

Under continuous consistency models, some inconsistencies are tolerated, for a short time.

To define inconsistencies we need a ‘consistency unit’ or ‘conit’.
– Perhaps a single currency’s exchange rate
– Perhaps a single destination’s weather report

When a conit exceeds the tolerated inconsistency it gets updated
– So consideration needs to be given to the size of a conit.

Consider this

The consistency model states that two replicas may differ in no more than one outstanding update.
– On the left, a larger conit receives two updates and so propagates the updates
– On the right, each smaller conit receives one update and doesn’t propagate

Conit Size

Small conits lead to greater management overheads
– And so overall performance considerations.

Larger conits can force more frequent updates
– Consider the previous example.

Consider if data items contained in a conit are used completely independently…

Strict Consistency

The Rules:
– Any read on a data item ‘x’ returns a value corresponding to the result of the most recent write on ‘x’ (regardless of where the write occurred).

Consider this code;
– int a=1; a=2; cout << a;

If this displayed 1 (or anything other than 2) you’d get frustrated as a programmer.

But that is because you are used to strict consistency!

Strict Consistency

For strict consistency to succeed we need a global time (remember the discussion on synchronisation last week).
– Suppose the 3 lines of code in the last example were executed 1 nanosecond apart on two computers 3 metres apart.
– The 2nd write command would have to travel at 10 times the speed of light to beat the 3rd command!
– Is it fair to expect code to break Einstein’s special theory of relativity?
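The "10 times the speed of light" figure follows directly from the distance and the time interval. A small sketch of the arithmetic (the variable names are illustrative only):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre

distance_m = 3.0    # separation between the two computers
interval_s = 1e-9   # 1 nanosecond between successive operations

# Speed a message would need in order to cover the distance within the interval.
required_speed = distance_m / interval_s
ratio = required_speed / SPEED_OF_LIGHT_M_PER_S

print(f"required speed is about {ratio:.1f} times the speed of light")
```

Light covers only about 0.3 metres in a nanosecond, so no signal can propagate the write in time.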

Sequential Consistency

The Rules:
– The result of any execution is the same as if the (read and write) operations by all processes on the data-store were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program.

Essentially this means that all processes must see the same interleaving of operations; each process is aware of only its own reads, but of everyone’s writes.

Sequential Consistency (2)

(a) demonstrates an acceptable interleaving of reads and writes, while (b) is unacceptable.

Consider updating the football scores, in the correct order.


A problem with this model could be if reads or writes are prioritised.
– Consider 3 variables (x, y and z) all initialised to 0. 3 Processes operate on these variables;
  Process P1 : x=1; cout << y << z;
  Process P2 : y=1; cout << x << z;
  Process P3 : z=1; cout << x << y;
– How many different possibilities are there? 6! = 720

Actually it’s not 720 – many of these don’t conform to sequential consistency.
– Knowing that the 2 operations on each process have to happen sequentially, but the other processes’ operations can interleave them, there are 90 different valid sequences.

These 90 different sequences produce a variety of different valid prints;
– 001011, 101011, 110101, 111111, and many many more.

Some prints are invalid, because no valid sequence can produce them;
– 000000 or 001001.

For these reasons, other weaker consistency models have been proposed and developed.
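The counts and prints above can be reproduced by brute force. The sketch below enumerates all 6! orderings of the six operations, keeps those in which each process's write precedes its own print, and collects the resulting prints. It assumes that a "print" is the concatenation of the outputs of P1, P2 and P3 in that order; the names `PROGRAMS`, `valid_interleavings` and `signature` are hypothetical.

```python
from itertools import permutations

# Each process writes 1 to one variable, then prints the other two.
PROGRAMS = {
    "P1": ("x", ("y", "z")),
    "P2": ("y", ("x", "z")),
    "P3": ("z", ("x", "y")),
}

def valid_interleavings():
    """Yield every ordering in which each process's write (step 0) precedes its print (step 1)."""
    ops = [(p, step) for p in PROGRAMS for step in (0, 1)]
    for order in permutations(ops):
        if all(order.index((p, 0)) < order.index((p, 1)) for p in PROGRAMS):
            yield order

def signature(order):
    """Execute one ordering against a shared store and return P1's + P2's + P3's output."""
    store = {"x": 0, "y": 0, "z": 0}
    printed = {}
    for p, step in order:
        written, read = PROGRAMS[p]
        if step == 0:
            store[written] = 1
        else:
            printed[p] = "".join(str(store[v]) for v in read)
    return printed["P1"] + printed["P2"] + printed["P3"]

orders = list(valid_interleavings())
signatures = {signature(o) for o in orders}

print(len(orders))               # number of valid sequences
print("001011" in signatures)    # a valid print
print("000000" in signatures)    # an invalid print
```

The count of 90 also follows from 6! / (2! · 2! · 2!), since fixing the relative order within each of the three processes removes a factor of 2 per process.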

Causal Consistency

The Rules:
– Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines (i.e., by different processes).

This model distinguishes between events that are “causally related” and those that are not.

If event B is caused or influenced by an earlier event A, then causal consistency requires that every other process see event A, then event B.

Causal Consistency (2)

(a) is not valid as P2’s write is related to P1’s write due to the read on ‘x’ giving ‘a’ (all processes must see them in the same order).

(b) is valid as now the two writes are concurrent

Grouping Operations

Sequential & Causal consistency are defined at the granularity of individual read and write operations.

However, a process will often perform a series of reads and writes as one unit.

For this a synchronisation variable can be acquired.
– When a process enters a critical section it acquires the relevant synchronisation variables.
– When it leaves the critical section it releases the relevant synchronisation variables.

Synchronisation Variables

Each synchronisation variable has an owner;
– i.e. the process which last acquired it.

If a new process wants to acquire the variable it has to ask the current owner
– And get the current data values associated with the synchronisation variable

This is Entry Consistency

Entry Consistency

The Rules

1. An acquire access of a synchronization variable is not allowed to perform with respect to a process until all updates to the guarded shared data have been performed with respect to that process.

2. Before an exclusive mode access to a synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode.

3. After an exclusive mode access to a synchronization variable has been performed, any other process's next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable's owner.

Entry Consistency (2)

•Entry consistency locks individual data items to ensure that no other processes are accessing that data item at that time.

•Note; P2’s read on ‘y’ returns NIL as no lock has been requested.

Client Centric Consistency

So far we have considered how to make data consistent for a variety of processes, but maintaining data consistency might not be the important factor.
– There may be no simultaneous updates.

Here we consider how to maintain a consistent view for an individual client operating on the data store.

Client Centric Consistency

How fast should updates (writes) be made available to read-only processes?

Think of most database systems: mainly read.

Think of the DNS: write-write conflicts do not occur.

Think of WWW: as with DNS, except that heavy use of client-side caching is present: even the return of stale pages is acceptable to most users.

These systems all exhibit a high degree of acceptable inconsistency, with the replicas gradually becoming consistent over time.

Eventual Consistency

In Eventual Consistency the only requirement is that all replicas will eventually be the same.

All updates must be guaranteed to propagate to all replicas … eventually!

This works well if every client always updates the same replica.

Things are a little difficult if the clients are mobile.

Mobile Consistency

When the portable computer connects to a different replica, client consistency becomes a harder issue.

Bayou’s Consistency Models

Bayou identified 4 models of Client-Centric Consistency:
– Monotonic-Read Consistency
– Monotonic-Write Consistency
– Read-Your-Writes Consistency
– Writes-Follow-Reads Consistency

Monotonic Reads

If a process reads the value of a data item ‘x’, any successive read operation on ‘x’ by that process will always return that same value or a more recent value.

E.g. – Email system
– If I check my email in Chiang Mai, and then move to Bangkok, the mailbox should contain at least the mails I had in Chiang Mai
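One way to picture monotonic reads is a client session that remembers the newest version it has seen and refuses to read from a replica that is behind it. This is only a sketch under simplifying assumptions: a single version counter stands in for the write set, and the classes `MailReplica` and `ClientSession` are hypothetical, not part of Bayou.

```python
class MailReplica:
    """A mailbox replica; `version` counts the mails applied so far (simplifying assumption)."""

    def __init__(self, name):
        self.name = name
        self.version = 0
        self.mailbox = []

    def deliver(self, mail):
        self.mailbox.append(mail)
        self.version += 1

    def sync_from(self, other):
        # Copy the other replica's state if it is more recent.
        if other.version > self.version:
            self.mailbox = list(other.mailbox)
            self.version = other.version


class ClientSession:
    """Tracks the newest version this client has read, to enforce monotonic reads."""

    def __init__(self):
        self.last_seen_version = 0

    def read(self, replica):
        if replica.version < self.last_seen_version:
            raise RuntimeError(f"{replica.name} is stale for this client")
        self.last_seen_version = replica.version
        return list(replica.mailbox)


chiang_mai = MailReplica("Chiang Mai")
bangkok = MailReplica("Bangkok")
chiang_mai.deliver("meeting at 10")

session = ClientSession()
print(session.read(chiang_mai))   # the client sees the mail in Chiang Mai
bangkok.sync_from(chiang_mai)     # Bangkok must catch up before it may serve this client
print(session.read(bangkok))
```

Without the `sync_from` call, the read in Bangkok would be refused rather than returning an older mailbox.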

Monotonic Reads

In (b) we can’t be sure that R(x2) contains WS(x1)

Monotonic Writes

A write operation by a process on a data item ‘x’ is completed before any successive write operation on ‘x’ by the same process.

This resembles FIFO consistency: the write operations of a single process appear in the same order everywhere.

Read Your Writes

The effect of a write operation by a process on data item ‘x’ will always be seen by a successive read operation on ‘x’ by the same process.

Consider updating a web document
– If we edit a document and upload it to the server, but then access a cached version through the web browser then we may not satisfy Read Your Writes consistency

Consider changing your password
– The password change might not be propagated immediately, resulting in not being able to access certain areas.

Read Your Writes

Writes Follow Reads

A write operation by a process on a data item ‘x’ following a previous read operation on ‘x’ by the same process, is guaranteed to take place on the same or a more recent value of ‘x’ than the one that was read.

This guarantees that we can only read a reaction to an article after viewing the original article.
– Suppose I read article A and react with response B; B will only be visible at a replica after A has been written there first.

Writes Follow Reads

Replica Management

When, Where and By Whom replicas should be placed – and then how to keep them consistent.

2 Sub problems
– Placing of Replica Servers
  Finding the best places to locate replica servers
– Placing of Content
  Finding the best servers for placing content

Replica Server Placement

Often driven by management / commercial reasons rather than by optimisation reasons.

For Optimisation:
– Choose the best K out of N locations where K<N.

Qiu suggests taking the distance (in terms of latency or bandwidth) between clients and potential locations and repeatedly finding the next best location for a replica server.
– But this is O(N^2), leading to slow decisions over replica placement

There are alternatives, but let’s consider content placement instead.
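The "find the next best location" idea can be sketched as a greedy loop. This is an illustration in the spirit of that approach, not Qiu's published algorithm: the function name `greedy_placement`, the latency matrix, and the cost measure (total latency from each client to its nearest chosen site) are all assumptions.

```python
def greedy_placement(latency, k):
    """Pick k replica sites one at a time.

    latency[c][s] is the latency from client c to candidate site s.
    At each step, add the site that minimises the total latency from every
    client to its nearest chosen site (an assumed cost measure).
    """
    n_sites = len(latency[0])
    if not 0 < k <= n_sites:
        raise ValueError("k must be between 1 and the number of candidate sites")

    chosen = []
    for _ in range(k):
        best_site, best_cost = None, float("inf")
        for site in range(n_sites):
            if site in chosen:
                continue
            trial = chosen + [site]
            cost = sum(min(row[s] for s in trial) for row in latency)
            if cost < best_cost:
                best_site, best_cost = site, cost
        chosen.append(best_site)
    return chosen


# Hypothetical latencies: 4 clients (rows) by 3 candidate sites (columns).
latency = [
    [1, 8, 9],
    [2, 7, 9],
    [9, 1, 8],
    [9, 8, 2],
]
print(greedy_placement(latency, 2))
```

Each step scans every remaining site against every client, which is why the cost of deciding grows quickly as the number of locations increases.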

Content Placement

Permanent Replicas

The initial set of replicas

For example:
– With a website the original copy of the webpage and any mirror sites.

Server Initiated Replicas

If my website suddenly gets a lot of hits from a specific location, it may enhance performance to install a temporary replica closer to the requests.

Based on traffic (count of hits) we can set a ‘replication threshold’ and a ‘deletion threshold’.
– Deletion Threshold: if there are not enough requests the replica can be deleted (so long as it isn’t the last replica)
– Replication Threshold: indicating it is worth creating a replica close to where the requests are coming from.

Server Initiated Replicas

If the requests are between the deletion and replication thresholds, we can consider migrating the data closer to the requests.
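The threshold rules can be written as a single decision function. A minimal sketch, assuming a hypothetical function name `placement_action`, made-up threshold values, and that the request count is measured per replica over some fixed period:

```python
def placement_action(requests, deletion_threshold, replication_threshold,
                     is_last_replica=False):
    """Decide what to do with a replica given its request count."""
    if deletion_threshold >= replication_threshold:
        raise ValueError("deletion threshold must be below replication threshold")
    if requests < deletion_threshold:
        # Too few requests: delete, unless this is the only remaining copy.
        return "keep" if is_last_replica else "delete"
    if requests > replication_threshold:
        # Enough demand to justify an extra replica near the requests.
        return "replicate"
    # In between the two thresholds: consider moving the data closer instead.
    return "migrate"


print(placement_action(5, deletion_threshold=10, replication_threshold=100))
print(placement_action(50, deletion_threshold=10, replication_threshold=100))
print(placement_action(500, deletion_threshold=10, replication_threshold=100))
```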

Client Initiated Replicas

Essentially Caches
– To be managed by the clients

Although some help from the server can reduce staleness.

Content Distribution

A further design decision concerns how updates are propagated
– Propagate a notification of an update
– Transfer data from one copy to another
– Propagate the update operation to other copies

Another design decision is where updates are initiated from – the server or the replica?
– Push vs Pull

Propagate a Notification

Essentially let the replicas know there is an update
– The replica can later decide when to update
– A notification takes up very little bandwidth
– If there are lots of writes and few reads, it may be inefficient to constantly update the data

Propagate the Data

Transfer the modified data
– If there is a high read-write ratio it may be worth propagating the modified data.

To reduce communication overhead, multiple modifications can be aggregated into a single transfer.

Propagate the Update Operation

“Active Replication”
– Rather than propagating the data, propagate the operation required – along with any parameters required for it.

Assumes all replicas are capable of performing the operation.

Bandwidth costs can be small (assuming parameters are not too large)

Push vs Pull

Push Based Approach
– Updates propagated without replica requesting
– Often used between server initiated replicas
– If the read-to-update ratio is high every update can be expected to be read more than once.
– However, server needs to keep track of clients (stateful or softstate)
  If softstate then the client might get a lease or a promise for updates.
– Replicas also need to inform the server if they purge some data (perhaps due to lack of space) – increasing communication.

Push vs Pull

Pull Based Approach
– Replicas request any updates to be sent
– Effective if read-to-update ratio is low, i.e. each time there is a read, the replica could check if there have been any updates
– The replica will poll the server to check if there have been any updates

Leases

Hybrid between push and pull.
– Promise from the server to push updates for a specified time.

When the lease expires the client can renew the lease or poll the server for updates.

Leases could be based on age
– Where if the data isn’t changed for some time, it is assumed that it won’t be changed for some time more
  Consider the football scores at the end of a game.

Leases could be based on client requests
– Where frequently accessed replicas maintain a longer lease.
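An age-based lease can be sketched as a duration that grows with the time since the data last changed. Everything specific here is an assumption made for illustration: the `Lease` class, the `age_based_duration` function, the scaling factor of 0.5 and the minimum and maximum bounds. Times are passed in explicitly as plain numbers of seconds so the example stays deterministic.

```python
class Lease:
    """A promise by the server to push updates until `expires_at`."""

    def __init__(self, granted_at, duration):
        self.expires_at = granted_at + duration

    def is_valid(self, now):
        # While valid, the client relies on pushed updates; afterwards it
        # must renew the lease or fall back to polling the server.
        return now < self.expires_at


def age_based_duration(seconds_since_last_change,
                       factor=0.5, minimum=1.0, maximum=3600.0):
    """Data that has been unchanged for longer gets a longer lease (assumed policy)."""
    return max(minimum, min(maximum, factor * seconds_since_last_change))


# Data last changed 60 seconds ago; lease granted at t = 100 seconds.
lease = Lease(granted_at=100.0, duration=age_based_duration(60.0))
print(lease.is_valid(now=120.0))   # still inside the lease: rely on push
print(lease.is_valid(now=130.0))   # expired: renew or poll
```

A request-based lease would follow the same shape, with the duration derived from how often the client accesses the replica instead of from the data's age.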

Consistency Protocols

Thus far we have considered a variety of consistency models & design issues.

Consistency Protocols are an implementation of a specific consistency model.
– Primary Based Protocols
– Replicated-Write Protocols

Primary Based Protocols

In primary based protocols each data item has a primary server responsible for coordinating write operations on it.
– Useful for sequential consistency as the primary server manages the sequence of updates.

Remote Write Protocol
– Where all writes are performed on the primary server

Local Write Protocols
– Where the primary can be moved to the writing process when a write is necessary

Remote Write Protocols

Local Write Protocols

Replicated Write Protocols

Where write operations can be carried out at multiple replicas
– Active Replication
  Totally Ordered Multicast is necessary, for example using Lamport’s timestamps
  Alternatively a central co-ordinator (sequencer) could be responsible for ordering writes (but this is essentially a primary based protocol)
– Quorum Based Protocols
  Where replicas ‘vote’ on updates

Quorum Based Protocols

To update a file the client must first contact more than half of the replicas (half + 1) to get a majority to agree to do the update.
– Each of those replicas then updates its version number to the latest version.

To read a file, a client also contacts more than half of the replicas (half + 1) to check that they agree on the version number for the file. Since any two majorities overlap, at least one of the contacted replicas holds the latest version.
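The majority rule for reads and writes can be sketched with version numbers as follows. This is a simplified illustration, assuming no failures or concurrent writers; the names `FileReplica`, `majority`, `quorum_write` and `quorum_read` are hypothetical, and the caller chooses which replicas are contacted.

```python
class FileReplica:
    def __init__(self):
        self.version = 0
        self.value = None


def majority(n):
    """Smallest number of replicas that is more than half of n."""
    return n // 2 + 1


def quorum_write(replicas, contacted, value):
    """Write `value` to the contacted replicas if they form a majority."""
    if len(set(contacted)) < majority(len(replicas)):
        raise ValueError("write quorum not reached")
    # The contacted majority must include the newest version, so bump from its maximum.
    new_version = max(replicas[i].version for i in contacted) + 1
    for i in contacted:
        replicas[i].version = new_version
        replicas[i].value = value
    return new_version


def quorum_read(replicas, contacted):
    """Read from a majority and return the value with the highest version."""
    if len(set(contacted)) < majority(len(replicas)):
        raise ValueError("read quorum not reached")
    newest = max((replicas[i] for i in contacted), key=lambda r: r.version)
    return newest.value, newest.version


replicas = [FileReplica() for _ in range(5)]
quorum_write(replicas, [0, 1, 2], "draft 1")
# A different majority still overlaps the write set (here at replica 2).
print(quorum_read(replicas, [2, 3, 4]))
```

With 5 replicas a majority is 3, so a read quorum and a write quorum always share at least one replica, which is what lets the reader find the latest version.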
