Distributed Systems CS 15-440 Case Study: Replication in Google Chubby Recitation 5, Oct 06, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud


Distributed Systems

CS 15-440

Case Study: Replication in Google Chubby

Recitation 5, Oct 06, 2011

Majd F. Sakr, Vinay Kolar, Mohammad Hammoud


Today…

Last recitation session: Google Chubby Architecture

Today’s session: Consensus and Replication in Google Chubby

Announcement: Project 2 Interim Design Report is due soon


Overview

Recap: Google Chubby

Consensus in Chubby

Paxos Algorithm


Recap: Google Data center Architecture

(To avoid clutter the Ethernet connections are shown from only one of the clusters to the external links)


Chubby Overview

A Chubby cell is the first level of the hierarchy inside the Chubby namespace (ls): /ls/chubby_cell/directory_name/…/file_name
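To make the naming scheme concrete, here is a minimal sketch that splits a name of this form into its cell and the remaining path components. The names `ChubbyPath` and `parse_chubby_path` are hypothetical helpers for illustration only; they are not part of any Chubby client API.

```python
from typing import NamedTuple, Tuple


class ChubbyPath(NamedTuple):
    cell: str                    # first level of the hierarchy
    components: Tuple[str, ...]  # directories followed by the file name


def parse_chubby_path(path: str) -> ChubbyPath:
    """Split '/ls/<cell>/<dir>/.../<file>' into cell and components."""
    parts = path.split("/")
    # A leading '/' yields an empty first element; 'ls' must come next.
    if len(parts) < 3 or parts[0] != "" or parts[1] != "ls" or not parts[2]:
        raise ValueError(f"not a Chubby-style name: {path!r}")
    return ChubbyPath(cell=parts[2], components=tuple(p for p in parts[3:] if p))
```

For example, `parse_chubby_path("/ls/cell_a/dir/file")` separates the cell `cell_a` from the components `dir` and `file`.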

A Chubby instance is implemented as a small number of replicated servers (typically 5) with one designated master

Replicas are placed at failure-independent sites

Typically, they are placed within the same cluster but not within the same rack

The consistency of the replicated database is ensured through a consensus protocol that uses operation logs
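The role of the operation log can be illustrated with a small sketch: if every replica applies the same operations in the same log order, every replica ends up with the same database. The class `ReplicaDatabase` and its key/value operations are assumptions made for this example; the slides do not describe Chubby's actual log format.

```python
from typing import Dict, List, Tuple

Operation = Tuple[str, str]  # (key, value) write; a hypothetical operation format


class ReplicaDatabase:
    def __init__(self) -> None:
        self.log: List[Operation] = []   # agreed operations, in order
        self.data: Dict[str, str] = {}   # state obtained by replaying the log

    def apply(self, index: int, op: Operation) -> None:
        """Apply the operation agreed for log position `index`."""
        if index < len(self.log):
            # Duplicate delivery: harmless if it matches what was agreed.
            if self.log[index] != op:
                raise ValueError("conflicting entry at index %d" % index)
            return
        if index > len(self.log):
            # Entries must be applied in order, without gaps.
            raise ValueError("gap in log before index %d" % index)
        self.log.append(op)
        key, value = op
        self.data[key] = value
```

The consensus protocol's job is to decide which operation occupies each log position; the sketch only shows why agreeing on the log is enough to keep the databases consistent.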


Chubby Architecture Diagram


Consistency and Replication In Chubby

Challenges in replicating data in the Google infrastructure:

1. Replica servers may run at arbitrary speeds and may fail

2. Replica servers have access to stable persistent storage that survives crashes

3. Messages may be lost, reordered, duplicated or delayed

Google has implemented a consensus protocol, based on the Paxos algorithm, to ensure consistency

The protocol operates over a set of replicas with the goal of reaching agreement on an update to a common value


Paxos Algorithm

Another algorithm proposed by Lamport

Paxos ensures correctness (safety), but not liveness

Algorithm initiation and termination:

Any replica can submit a value with the goal of achieving consensus on a final value

In Chubby, consensus is achieved when all replicas have this value as the next entry in their update logs

Paxos is guaranteed to achieve consensus if a majority of the replicas run for long enough with sufficient network stability
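The majority condition is simple arithmetic, sketched below. The function names `majority` and `tolerated_failures` are illustrative and not taken from the slides.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that may be down while a majority is still available."""
    return replicas - majority(replicas)
```

For the typical Chubby cell of 5 replicas, `majority(5)` is 3 and `tolerated_failures(5)` is 2, so the cell can keep making progress with up to two replicas unavailable.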


Paxos Approach

Steps

1. Election

A group of replica servers elects a coordinator

2. Selection of candidate value

The coordinator selects the final value and disseminates it to the group

3. Acceptance of final value

The group accepts or rejects the value; an accepted value is finally stored in all replicas


1. Election

Approach:

Each replica maintains the highest sequence number it has seen so far

If a replica wants to bid for coordinator:

It picks a unique number that is higher than all sequence numbers it has seen so far

It broadcasts a “propose” message with this unique sequence number

If the other replicas have not seen a higher sequence number, they reply with a “promise” message

A promise message signifies that the replica will not promise to any other candidate with a sequence number lower than the proposed one

The promise message may include a value that the replica wants to commit

The candidate replica that receives “promise” messages from a majority of the replicas wins

Challenges:

Multiple coordinators may co-exist

Replicas reject messages from old coordinators
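The election step can be sketched in a few lines. This is a single-process simulation under stated assumptions, not Chubby's implementation: sequence numbers are modelled as `(counter, replica_id)` pairs so that they are unique across replicas, message delivery is a direct method call, and the names `Replica`, `Promise`, and `run_election` are made up for the example.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple

SeqNo = Tuple[int, int]  # (counter, replica_id): the id makes each number unique


@dataclass
class Promise:
    sender: int
    value: Any = None  # value the promising replica wants to commit, if any


class Replica:
    def __init__(self, replica_id: int, pending_value: Any = None) -> None:
        self.replica_id = replica_id
        self.highest_seen: SeqNo = (0, 0)
        self.pending_value = pending_value

    def next_sequence_number(self) -> SeqNo:
        # Higher than everything seen so far, and unique to this replica.
        return (self.highest_seen[0] + 1, self.replica_id)

    def on_propose(self, seq: SeqNo) -> Optional[Promise]:
        # Promise only if no equal or higher sequence number has been seen.
        if seq <= self.highest_seen:
            return None  # stale proposal, e.g. from an old coordinator
        self.highest_seen = seq
        return Promise(self.replica_id, self.pending_value)


def run_election(candidate: Replica, group: List[Replica]):
    """Broadcast a propose message; the candidate wins with a majority of promises."""
    seq = candidate.next_sequence_number()
    promises = []
    for replica in group:  # the group includes the candidate itself
        reply = replica.on_propose(seq)
        if reply is not None:
            promises.append(reply)
    won = len(promises) > len(group) // 2
    return won, seq, promises
```

Because every replica remembers the highest number it has promised, a later propose message carrying an older number is ignored, which is how messages from old coordinators get rejected in this sketch.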


Message Exchanges in Election


2. Selection of candidate values

Approach:

The elected coordinator selects a value from the promise messages it received

If the promise messages did not contain any value, the coordinator is free to choose any value

The coordinator sends an “accept” message (with the value) to the group of replicas

Replicas should acknowledge the accept message

The coordinator waits until a majority of the replicas answer

This wait may be indefinite
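A minimal sketch of this step follows. The slides do not say which value is picked when several promise messages carry one; the sketch assumes the usual Paxos rule of taking the value attached to the highest sequence number. The names `choose_value`, `Acceptor`, and `send_accept` are hypothetical, and unreachable replicas are modelled with a simple `reachable` flag list rather than real timeouts.

```python
from typing import Any, List, Optional, Tuple

SeqNo = Tuple[int, int]
PromiseInfo = Tuple[Optional[SeqNo], Any]  # (sequence number of carried value, value)


def choose_value(promises: List[PromiseInfo], own_value: Any) -> Any:
    """Pick the value to send in the accept message."""
    carried = [(seq, value) for seq, value in promises if seq is not None]
    if not carried:
        return own_value  # no promise carried a value: free choice
    # Assumption: prefer the value with the highest sequence number.
    return max(carried, key=lambda item: item[0])[1]


class Acceptor:
    def __init__(self, promised: SeqNo) -> None:
        self.promised = promised
        self.accepted: Optional[Tuple[SeqNo, Any]] = None

    def on_accept(self, seq: SeqNo, value: Any) -> bool:
        if seq < self.promised:
            return False  # promised to a newer coordinator in the meantime
        self.promised = seq
        self.accepted = (seq, value)
        return True  # acknowledgement


def send_accept(acceptors: List[Acceptor], seq: SeqNo, value: Any,
                reachable: List[bool]) -> bool:
    """Return True if a majority of the group acknowledged the accept message."""
    acks = sum(1 for acceptor, up in zip(acceptors, reachable)
               if up and acceptor.on_accept(seq, value))
    return acks > len(acceptors) // 2
```

A real coordinator cannot see a `reachable` list; it only observes missing replies, which is why it needs some timeout policy to avoid waiting indefinitely.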


Message Exchanges in Consensus


3. Commit the value

Approach

If a majority of the replicas acknowledge, then the coordinator sends a “commit” message to all replicas

Otherwise, the coordinator restarts the election process
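The coordinator's decision at this point can be sketched as a single function. The name `finish_round`, the string results, and the representation of each replica's update log as a Python list are assumptions made for illustration.

```python
from typing import Any, List


def finish_round(acks: int, logs: List[List[Any]], value: Any) -> str:
    """Commit `value` if a majority acknowledged, otherwise ask for a new election.

    `logs` holds one update log per replica in the group.
    """
    if acks > len(logs) // 2:
        # "commit" message: every replica appends the value to its update log.
        for log in logs:
            log.append(value)
        return "committed"
    return "restart_election"
```

In this sketch the commit reaches every replica immediately; in a real deployment a replica that misses the commit message has to catch up later before its log matches the others.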


Message Exchanges in Commit


References

http://cdk5.net

Tushar Chandra, Robert Griesemer, and Joshua Redstone, “Paxos Made Live – An Engineering Perspective”, 26th ACM Symposium on Principles of Distributed Computing (PODC), 2007