TRANSCRIPT
Distributed Systems CS 15-440
Programming Models - Part V
Replication and Consistency - Part I
Lecture 18, Oct 29, 2014
Mohammad Hammoud
Today…
Last Session: Programming Models – Part IV: Pregel
Today's Session:
- Programming Models – Part V: GraphLab
- Replication and Consistency – Part I: Motivation, Overview & Types of Consistency Models
Announcements:
- Project 3 is now posted. It is due on Wednesday, Nov 12, 2014 by midnight
- PS4 is now posted. It is due on Saturday, Nov 15, 2014 by midnight
- We will practice more on MPI tomorrow in the recitation
Objectives
Discussion on Programming Models:
- Why parallelize our programs?
- Parallel computer architectures
- Traditional models of parallel programming
- Types of parallel programs
- Message Passing Interface (MPI)
- MapReduce, Pregel and GraphLab (last 4 sessions; cont'd today)
The GraphLab Analytics Engine
GraphLab roadmap:
- Motivation & Definition
- The Programming Model
- Input, Output & Components
- The Architectural Model
- Fault-Tolerance
- The Computation Model
Motivation for GraphLab
There is an exponential growth in the scale of Machine Learning and Data Mining (MLDM) algorithms.
Designing, implementing and testing MLDM algorithms at large scale is challenging due to:
- Synchronization
- Deadlocks
- Scheduling
- Distributed state management
- Fault-tolerance
Interest is increasing in analytics engines that can execute MLDM algorithms automatically and efficiently:
- MapReduce is inefficient with iterative jobs (common in MLDM algorithms)
- Pregel cannot run asynchronous problems (common in MLDM algorithms)
What is GraphLab?
GraphLab is a large-scale graph-parallel distributed analytics engine.
Some characteristics:
- In-memory (as opposed to MapReduce; similar to Pregel)
- High scalability
- Automatic fault-tolerance
- Flexibility in expressing arbitrary graph algorithms (more flexible than Pregel)
- Shared-memory abstraction (as opposed to Pregel; similar to MapReduce)
- Peer-to-peer architecture (dissimilar to both Pregel and MapReduce)
- Asynchronous execution (dissimilar to both Pregel and MapReduce)
Input, Graph Flow and Output
GraphLab assumes problems modeled as graphs.
It adopts two phases: the initialization phase and the execution phase.
[Figure: In the initialization phase, a (MapReduce) graph builder parses and partitions raw graph data from a distributed file system into an atom collection, then constructs an atom index; the atom index and atom files are stored back in the distributed file system. In the GraphLab execution phase, the cluster's GL engines load the atom index and atom files, communicate over TCP RPC, and a monitoring component handles atom placement.]
Components of the GraphLab Engine: The Data-Graph
The GraphLab engine incorporates three main parts:
1. The data-graph, which represents the user program state at a cluster machine.
[Figure: a data-graph consisting of vertices and edges.]
Components of the GraphLab Engine: The Update Function
The GraphLab engine incorporates three main parts:
2. The update function, which involves two main sub-functions:
   2.1. Altering data within the scope of a vertex
   2.2. Scheduling future update functions at neighboring vertices
The scope of a vertex v (i.e., Sv) is the data stored in v and in all of v's adjacent edges and vertices.
[Figure: a vertex v with its scope Sv highlighted; the update function at v schedules v's neighbors.]
[Figure: two CPUs pull vertices from a shared scheduler and execute their update functions in parallel; updated vertices may place their neighbors back into the scheduler.] The process repeats until the scheduler is empty.
Components of the GraphLab Engine: The Sync Operation
The GraphLab engine incorporates three main parts:
3. The sync operation, which maintains global statistics describing the data stored in the data-graph.
Global values maintained by the sync operation can be written by all update functions across the cluster machines.
The sync operation is similar to Pregel's aggregators.
A mutual exclusion mechanism is applied by the sync operation to avoid write-write conflicts.
For scalability reasons, the sync operation is not enabled by default.
The Architectural Model
GraphLab adopts a peer-to-peer architecture:
- All engine instances are symmetric
- Engine instances communicate using the Remote Procedure Call (RPC) protocol over TCP/IP
- The first triggered engine has the additional responsibility of being a monitoring/master engine
Advantages: highly scalable; precludes centralized bottlenecks and single points of failure.
Main disadvantage: complexity.
The Programming Model
GraphLab offers a shared-memory programming model.
It allows scopes to overlap and vertices to read/write from/to their scopes.

Consistency Models in GraphLab
GraphLab guarantees sequential consistency: it provides the same result as a sequential execution of the computational steps.
User-defined consistency models:
- Full Consistency
- Vertex Consistency
- Edge Consistency
[Figure: a chain of five vertices with data D1…D5 on the vertices and D1↔2…D4↔5 on the edges. Under full consistency, an update at a vertex may read and write its entire scope; under edge consistency, it may write its own data and adjacent edges but only read neighboring vertices; under vertex consistency, it may write only its own vertex data.]
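The three models differ only in how much of the scope an update may touch. A minimal sketch of the read/write sets on the 5-vertex chain from the figure; the `scope_permissions` helper and the data-item names are illustrative assumptions, not GraphLab's API.

```python
def scope_permissions(v, model, num_vertices=5):
    """For an update at vertex v on the chain 1-2-...-num_vertices, return
    (writable, readable) data items under the given consistency model."""
    nbrs = [u for u in (v - 1, v + 1) if 1 <= u <= num_vertices]
    own = {f"D{v}"}                                        # vertex data
    edges = {f"D{min(v, u)}<->{max(v, u)}" for u in nbrs}  # adjacent edge data
    nbr_data = {f"D{u}" for u in nbrs}                     # neighbor vertex data
    if model == "full":     # read/write the entire scope
        return own | edges | nbr_data, own | edges | nbr_data
    if model == "edge":     # write v and adjacent edges; additionally read neighbors
        return own | edges, own | edges | nbr_data
    if model == "vertex":   # touch only v's own data
        return own, own
    raise ValueError(f"unknown model: {model}")

writable, readable = scope_permissions(3, "edge")
print(sorted(writable))  # → ['D2<->3', 'D3', 'D3<->4']
```

The stricter the model, the larger the write set an update must lock, and hence the less parallelism is available between neighboring updates.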
The Computation Model
GraphLab employs an asynchronous computation model.
It suggests two asynchronous engines:
- Chromatic Engine
- Locking Engine
The chromatic engine executes vertices partially asynchronously:
- It applies vertex coloring (i.e., no adjacent vertices share the same color)
- All vertices with the same color are executed before proceeding to a different color
The locking engine executes vertices fully asynchronously:
- Data on vertices and edges are susceptible to corruption
- It applies a permission-based distributed mutual exclusion mechanism to avoid read-write and write-write hazards
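The chromatic engine's color-by-color execution can be illustrated with a greedy coloring. This is a sketch on a toy graph; GraphLab's actual engine distributes the colored phases across machines.

```python
def greedy_coloring(neighbors):
    """Assign each vertex the smallest color unused by its neighbors,
    so that no two adjacent vertices share the same color."""
    colors = {}
    for v in sorted(neighbors):
        used = {colors[n] for n in neighbors[v] if n in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

def chromatic_schedule(neighbors):
    """Group vertices by color: all same-colored vertices can be updated
    in parallel without races; colors execute one after another."""
    colors = greedy_coloring(neighbors)
    phases = {}
    for v, c in colors.items():
        phases.setdefault(c, []).append(v)
    return [sorted(phases[c]) for c in sorted(phases)]

# A 4-cycle a-b-c-d is 2-colorable: {a, c} run together, then {b, d}.
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}
print(chromatic_schedule(cycle))  # → [['a', 'c'], ['b', 'd']]
```

Within a color, no two scheduled vertices are adjacent, so their update functions cannot conflict; the barrier between colors is what makes the engine only "partially" asynchronous.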
Fault-Tolerance in GraphLab
GraphLab uses distributed checkpointing to recover from machine failures
It suggests two checkpointing mechanisms:
- Synchronous checkpointing (it suspends the entire execution of GraphLab)
- Asynchronous checkpointing
How Does GraphLab Compare to MapReduce and Pregel?

Aspect                       | Hadoop MapReduce         | Pregel             | GraphLab
-----------------------------+--------------------------+--------------------+--------------------------
Programming Model            | Shared-Memory            | Message-Passing    | Shared-Memory
Computation Model            | Synchronous              | Synchronous        | Asynchronous
Parallelism Model            | Data-Parallel            | Graph-Parallel     | Graph-Parallel
Architectural Model          | Master-Slave             | Master-Slave       | Peer-to-Peer
Task/Vertex Scheduling Model | Pull-Based               | Push-Based         | Push-Based
Application Suitability      | Loosely-Connected/       | Strongly-Connected | Strongly-Connected
                             | Embarrassingly Parallel  | Applications       | Applications (more
                             | Applications             |                    | precisely, MLDM apps)
A New Chapter: Replication and Consistency
- Motivation
- Overview
- Types of Consistency Models
Why Replication?
Replication is the process of maintaining the data at multiple computers.
Replication is necessary for:
1. Improving performance: a client can access the replicated copy of the data that is near its location.
2. Increasing the availability of services: replication can mask failures such as server crashes and network disconnection.
3. Enhancing the scalability of the system: requests to the data can be distributed across many servers which contain replicated copies of the data.
4. Securing against malicious attacks: even if some replicas are malicious, correct data can be guaranteed to the client by relying on the replicated copies at the non-compromised servers.
1. Replication for Improving Performance
Example applications:
- Caching webpages at the client browser
- Caching IP addresses at clients and DNS name servers
- Caching in Content Delivery Networks (CDNs): commonly accessed content, such as software and streaming media, is cached at various network locations
[Figure: a main server with replicated servers placed closer to clients]
2. Replication for High Availability
Availability can be increased by storing the data at replicated locations (instead of storing one copy of the data at a server).
Example: the Google File System replicates data at computers across different racks, clusters and data centers. If one computer, rack or cluster crashes, the data can still be accessed from another source.
3. Replication for Enhancing Scalability
Distributing the data across replicated servers helps avoid bottlenecks at the main server. It balances the load between the main and the replicated servers.
Example: Content Delivery Networks decrease the load on a website's main servers.
[Figure: a main server offloading requests to replicated servers]
4. Replication for Securing Against Malicious Attacks
If a minority of the servers that hold the data are malicious, the non-malicious servers can outvote the malicious servers, thus providing security.
The technique can also be used to provide fault-tolerance against non-malicious but faulty servers.
Example: in a peer-to-peer system, peers can coordinate to prevent delivering faulty data to the requester.
[Figure: servers with correct data outvote the servers with faulty data; some servers do not have the requested data.]
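The outvoting idea is simply majority voting over replica responses. A sketch with made-up values; real systems additionally authenticate replies and use quorum protocols.

```python
from collections import Counter

def majority_read(responses):
    """Return the value reported by a strict majority of replicas,
    or None when no value reaches a majority. A None response means
    the replica does not have the requested data."""
    answers = [v for v in responses if v is not None]
    if not answers:
        return None
    value, count = Counter(answers).most_common(1)[0]
    return value if count > len(responses) // 2 else None

# 5 replicas: three correct, one faulty, one without the requested data.
print(majority_read([100, 100, 100, 999, None]))  # → 100
```

As long as fewer than half the replicas are faulty or compromised, the correct value wins the vote.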
Why Consistency?
In a DS with replicated data, one of the main problems is keeping the data consistent.
An example: in an e-commerce application, the bank database has been replicated across two servers, each starting with Bal = 1000. Event 1 adds $1000 and Event 2 adds interest of 5%. If one replica applies Event 1 before Event 2, it ends with Bal = 2100; if the other replica applies Event 2 before Event 1, it ends with Bal = 2050. The replicas diverge.
Maintaining consistency of replicated data is a challenge.
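The divergence is pure order-sensitivity of the two updates. A worked sketch of the slide's numbers (integer dollars, so the arithmetic is exact):

```python
def add_1000(bal):              # Event 1
    return bal + 1000

def add_interest(bal):          # Event 2: add 5% interest
    return bal * 105 // 100     # integer dollars, exact

balance = 1000

# Replica 1 receives Event 1 first, then Event 2:
replica1 = add_interest(add_1000(balance))   # (1000 + 1000) * 1.05 = 2100
# Replica 2 receives Event 2 first, then Event 1:
replica2 = add_1000(add_interest(balance))   # 1000 * 1.05 + 1000 = 2050

print(replica1, replica2)  # → 2100 2050 : the replicas diverge
```

Because the two updates do not commute, any replication scheme that lets replicas apply them in different orders needs a consistency mechanism to reconcile the result.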
Overview of Consistency and Replication
- Consistency Models (today's lecture)
  - Data-Centric Consistency Models
  - Client-Centric Consistency Models
- Replica Management (next lectures)
  - When, where and by whom should replicas be placed?
  - Which consistency model should be used for keeping replicas consistent?
- Consistency Protocols (next lectures)
  - We study various implementations of consistency models
Introduction to Consistency and Replication
[Figure: three processes each operate on a local copy of data replicated in a distributed data-store.]
Maintaining Consistency of Replicated Data

Notation:
- R(x)b = read variable x; the result is b
- W(x)b = write value b to variable x
- P1 = process P1, with its own timeline

[Figure: a data-store with n replicas of x, all initially x=0. One process performs W(x)2; the update is propagated to every replica (x=2 everywhere), so subsequent reads by the other processes return R(x)2. A later W(x)5 again updates all replicas (x=5), and later reads return R(x)5.]

Strict Consistency:
- Data is always fresh
- After a write operation, the update is propagated to all the replicas
- A read operation will result in reading the most recent write
- Even occasional writes and reads lead to large overheads
Maintaining Consistency of Replicated Data (Cont'd)

[Figure: the same setup, but updates propagate lazily. After W(x)2 and W(x)5, the replicas hold different values (e.g., x=0, x=5 and x=3), so reads at different processes may return stale values such as R(x)3 or R(x)5.]

Loose Consistency:
- Data might be stale
- A read operation may result in reading a value that was written long back
- Replicas are generally out-of-sync
- The replicas may sync at a coarse time granularity, thus reducing the overhead
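The two figures differ only in when a write reaches the other replicas. A toy sketch contrasting eager (strict-style) and lazy (loose-style) propagation; the class and its method names are illustrative assumptions:

```python
class ReplicatedX:
    """Toy data-store holding n replicas of a single variable x."""
    def __init__(self, n, x=0):
        self.copies = [x] * n

    def write(self, value, eager):
        if eager:                        # strict-style: propagate immediately
            self.copies = [value] * len(self.copies)
        else:                            # loose-style: update one replica only
            self.copies[0] = value

    def read(self, replica):
        return self.copies[replica]

    def sync(self):                      # coarse-grained reconciliation
        self.copies = [self.copies[0]] * len(self.copies)

strict_store = ReplicatedX(3)
strict_store.write(2, eager=True)
print(strict_store.read(2))  # → 2 : every replica sees the most recent write

loose_store = ReplicatedX(3)
loose_store.write(2, eager=False)
print(loose_store.read(2))   # → 0 : stale read, replicas are out-of-sync
loose_store.sync()
print(loose_store.read(2))   # → 2 : fresh again after the replicas sync
```

Eager propagation pays a cost on every write; lazy propagation batches that cost into occasional syncs at the price of stale reads in between.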
Trade-offs in Maintaining Consistency
Maintaining consistency should balance the strictness of consistency against efficiency. Good-enough consistency depends on your application.
- Strict consistency: generally hard to implement, and inefficient
- Loose consistency: easier to implement, and efficient
Consistency Model
A consistency model is a contract between:
- the processes that want to use the data, and
- the replicated data repository (or data-store)
A consistency model states the level of consistency provided by the data-store to the processes while reading and writing the data.
Types of Consistency Models
[Figure: taxonomy of consistency models, split into data-centric and client-centric models.]
Summary
Next three classes:
- Data-Centric Consistency Models: Sequential and Causal Consistency Models
- Client-Centric Consistency Models: Eventual Consistency, Monotonic Reads, Monotonic Writes, Read Your Writes and Writes Follow Reads
- Replica Management: when, where and by whom replicas should be placed, and which consistency model to use for keeping replicas consistent
- Consistency Protocols: various implementations of consistency models
Back-up Slides
PageRank
PageRank is a link analysis algorithm:
- The rank value indicates the importance of a particular web page
- A hyperlink to a page counts as a vote of support
- A page that is linked to by many pages with high PageRank receives a high rank itself
- A PageRank of 0.5 means there is a 50% chance that a person clicking on a random link will be directed to the document with the 0.5 PageRank
PageRank (Cont'd)
Iterate:

    R[i] = α + (1 − α) Σ_{j ∈ N[i]} W_ji R[j]

Where:
- α is the random reset probability
- L[j] is the number of links on page j
- W_ji is the weight of the link from page j to page i (e.g., 1/L[j])

[Figure: a small example graph with six pages]

PageRank Example in GraphLab
The PageRank algorithm is defined as a per-vertex operation working on the scope of the vertex, which enables dynamic computation:

    pagerank(i, scope) {
      // Get neighborhood data
      (R[i], W_ji, R[j]) <- scope;

      // Update the vertex data
      R[i] = α + (1 − α) * Σ_{j ∈ N[i]} W_ji * R[j];

      // Reschedule neighbors if needed
      if R[i] changes then reschedule_neighbors_of(i);
    }
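A runnable Python version of the per-vertex PageRank update above. The tiny web graph, the weight choice W_ji = 1/L[j], and the set-based scheduler are illustrative assumptions; GraphLab's real API differs.

```python
def pagerank(in_links, out_degree, alpha=0.15, tol=1e-10):
    """Dynamic per-vertex PageRank in the style of the update function above:
    R[i] = alpha + (1 - alpha) * sum_{j in N[i]} W_ji * R[j],
    with the (assumed) weight W_ji = 1 / L[j], where L[j] is the number
    of links on page j. A vertex reschedules its out-neighbors only when
    its own rank changes (dynamic computation)."""
    R = {v: 1.0 for v in in_links}
    out_nbrs = {v: [] for v in in_links}     # reverse edges, for rescheduling
    for v, ins in in_links.items():
        for j in ins:
            out_nbrs[j].append(v)
    scheduler = set(in_links)                # initially schedule every vertex
    while scheduler:
        i = scheduler.pop()
        new_rank = alpha + (1 - alpha) * sum(
            R[j] / out_degree[j] for j in in_links[i])
        if abs(new_rank - R[i]) > tol:       # reschedule neighbors if R[i] changed
            scheduler.update(out_nbrs[i])
        R[i] = new_rank
    return R

# Toy 3-page web: pages 1 and 2 both link to 3; page 3 links back to 1 and 2.
in_links = {1: [3], 2: [3], 3: [1, 2]}
out_degree = {1: 1, 2: 1, 3: 2}
ranks = pagerank(in_links, out_degree)
print(ranks[3] > ranks[1])  # → True : page 3 has two in-links, so a higher rank
```

Only vertices whose rank actually moved push their neighbors back into the scheduler, so computation naturally concentrates on the parts of the graph that have not yet converged.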