Paxos Made Simple Jinghe Zhang

Post on 07-Jan-2016

Page 1: Paxos Made Simple

Paxos Made Simple

Jinghe Zhang

Page 2: Paxos Made Simple

Introduction

Locks are the easiest way to manage concurrency: mutexes and semaphores, read and write locks.

In a distributed system: no master for issuing locks, and failures.

Page 3: Paxos Made Simple

Problem

How can a distributed system reach consensus (data consistency) while tolerating non-malicious failures?

Page 4: Paxos Made Simple

Requirements

Safety:

Only a value that has been proposed may be chosen.

Only a single value is chosen.

A node never learns that a value has been chosen unless it actually has been.

Liveness:

Some proposed value is eventually chosen.

If a value has been chosen, a node can eventually learn the value.

Page 5: Paxos Made Simple

Paxos Properties

Paxos is an asynchronous consensus algorithm.

Asynchronous networks:

No common clocks or shared notion of time (local ideas of time are fine, but different processes may have very different “clocks”)

No way to know how long a message will take to get from A to B

Page 6: Paxos Made Simple

Paxos Properties

Paxos is guaranteed safe.

Consensus is a stable property: once reached it is never violated; the agreed value is not changed.

Page 7: Paxos Made Simple

Paxos Properties

Paxos is not guaranteed live.

Consensus is reached if “a large enough subnetwork...is non-faulty for a long enough time.”

Otherwise Paxos might never terminate.

Page 8: Paxos Made Simple

Paxos Algorithm

Key Assumptions:

The set of processes that run Paxos is known a priori.

Processes suffer crash failures.

All processes have Greek names (but translate as “Fred”, “Cynthia”, “Nancy”…).

Page 9: Paxos Made Simple

Paxos Algorithm

3 roles: proposer, acceptor, learner.

A node can act in more than one role (usually all 3).

2 phases:

Phase 1: prepare request and response.

Phase 2: accept request and response.

Page 10: Paxos Made Simple

Phase 1: (prepare request)

(1) A proposer chooses a new proposal number n and sends a prepare request (“prepare”, n) to a majority of acceptors:

(a) Can I make a proposal with number n?

(b) If yes, do you suggest some value for my proposal?

Page 11: Paxos Made Simple

Phase 1: (prepare request)

(2) If an acceptor receives a prepare request (“prepare”, n) with n greater than that of any prepare request it has already responded to, it sends out (“ack”, n, n’, v’) or (“ack”, n, , ):

(a) It responds with a promise not to accept any more proposals numbered less than n.

(b) It suggests the value v’ of the highest-numbered proposal (n’, v’) that it has accepted, if any; otherwise it leaves those two fields empty.
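The acceptor's side of Phase 1 can be sketched in a few lines. This is a minimal illustration rather than a definitive implementation: the class name `Acceptor`, its field names, and the use of `None` for the empty ack fields are assumptions made here, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# ("ack", n, n', v'); the last two fields are None when nothing was accepted yet.
Ack = Tuple[str, int, Optional[int], Optional[str]]


@dataclass
class Acceptor:
    """Hypothetical acceptor state for Phase 1."""
    promised_n: int = -1                # highest prepare number responded to so far
    accepted_n: Optional[int] = None    # number of the highest proposal accepted, if any
    accepted_v: Optional[str] = None    # value of that proposal, if any

    def on_prepare(self, n: int) -> Optional[Ack]:
        """Handle ("prepare", n); return an ack, or None to stay silent."""
        if n <= self.promised_n:
            return None  # not greater than a prepare request already responded to
        self.promised_n = n  # promise: accept nothing numbered less than n
        return ("ack", n, self.accepted_n, self.accepted_v)


acceptor = Acceptor()
first = acceptor.on_prepare(1)   # nothing accepted yet, so both fields are empty
stale = acceptor.on_prepare(1)   # not greater than the promised number: ignored
```

Staying silent on a stale prepare is one possible choice; an implementation could equally send an explicit rejection so the proposer retries sooner.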

Page 12: Paxos Made Simple

Phase 2: (accept request)

(3) If the proposer receives responses from a majority of the acceptors, then it can issue an accept request (“accept”, n, v) with number n and value v:

(a) n is the number that appears in the prepare request.

(b) v is the value of the highest-numbered proposal among the responses, or any value the proposer chooses if no response reported a proposal.
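The proposer's choice of v in step (3) can be sketched as a small function. The name `choose_value`, the tuple layout of the acks, and the `my_value` fallback (the proposer's own value when no acceptor reports an earlier proposal, as in Lamport's description) are assumptions for illustration.

```python
from typing import List, Optional, Tuple

# ("ack", n, n', v'); n' and v' are None when the acceptor has accepted nothing.
Ack = Tuple[str, int, Optional[int], Optional[str]]


def choose_value(acks: List[Ack], num_acceptors: int, my_value: str) -> Optional[str]:
    """Return the value for ("accept", n, v), or None if no majority has acked yet."""
    if len(acks) <= num_acceptors // 2:
        return None  # fewer than a majority of acceptors have responded
    # Keep only acks that report a previously accepted proposal (n', v').
    reported = [(a[2], a[3]) for a in acks if a[2] is not None]
    if not reported:
        return my_value  # no earlier proposal reported: free to use our own value
    # Otherwise adopt the value of the highest-numbered reported proposal.
    return max(reported, key=lambda p: p[0])[1]


acks = [("ack", 3, None, None), ("ack", 3, 1, "a"), ("ack", 3, 2, "b")]
value = choose_value(acks, num_acceptors=5, my_value="mine")
```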

Page 13: Paxos Made Simple

Phase 2: (accept request)

(4) If the acceptor receives an accept request (“accept”, n , v) , it accepts the proposal unless it has already responded to a prepare request having a number greater than n.
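A minimal sketch of the acceptor's rule in step (4), assuming hypothetical field names. One choice here goes slightly beyond the slide and is labeled in the code: accepting a proposal also raises the promised number, so the recorded proposal number never decreases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Acceptor:
    """Hypothetical acceptor state for Phase 2."""
    promised_n: int = -1                # highest prepare number responded to so far
    accepted_n: Optional[int] = None    # number of the highest proposal accepted
    accepted_v: Optional[str] = None    # value of that proposal

    def on_accept(self, n: int, v: str) -> bool:
        """Handle ("accept", n, v); return True if the proposal is accepted."""
        if n < self.promised_n:
            return False  # already responded to a prepare numbered greater than n
        # Assumption (not on the slide): treat the accept as a promise as well,
        # so later lower-numbered requests are rejected.
        self.promised_n = n
        self.accepted_n, self.accepted_v = n, v
        return True


a = Acceptor(promised_n=5)
rejected = a.on_accept(3, "x")   # 3 < 5: the earlier promise forbids it
accepted = a.on_accept(5, "y")   # matches the promised number: accepted
```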

Page 14: Paxos Made Simple

Learning the decision

Obvious algorithm: whenever an acceptor accepts a proposal, it sends (“accept”, n, v) to all learners.

No Byzantine failures: acceptors inform a distinguished learner and let the distinguished learner broadcast the result.

A learner that receives (“accept”, n, v) from a majority of acceptors decides v and sends (“decide”, v) to all other learners.

Learners that receive (“decide”, v) decide v.
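The majority-counting step at a learner might look like the following sketch. The `Learner` class, the string acceptor ids, and the in-memory vote table are illustrative assumptions; broadcasting (“decide”, v) to the other learners is left out.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple


class Learner:
    """Hypothetical learner that decides once a majority reports the same proposal."""

    def __init__(self, num_acceptors: int) -> None:
        self.majority = num_acceptors // 2 + 1
        # (n, v) -> ids of the acceptors that reported accepting it
        self.votes: Dict[Tuple[int, str], Set[str]] = defaultdict(set)
        self.decided: Optional[str] = None

    def on_accept(self, acceptor_id: str, n: int, v: str) -> Optional[str]:
        """Record ("accept", n, v) from one acceptor; return the decision, if any."""
        self.votes[(n, v)].add(acceptor_id)  # a set ignores duplicate reports
        if self.decided is None and len(self.votes[(n, v)]) >= self.majority:
            self.decided = v
        return self.decided


learner = Learner(num_acceptors=3)
r1 = learner.on_accept("a1", 1, "x")
r2 = learner.on_accept("a1", 1, "x")   # duplicate from the same acceptor
r3 = learner.on_accept("a2", 1, "x")   # second distinct acceptor: majority of 3
```

Counting distinct acceptor ids rather than raw messages keeps a retransmitted message from being mistaken for a second vote.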

Page 15: Paxos Made Simple

In Well-Behaved Runs

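[Figure: message flow in a well-behaved run. Proposer 1 sends (“prepare”, 1) to acceptors 1..n; the acceptors reply (“ack”, 1, , ); proposer 1 sends (“accept”, 1, v1); the acceptors send (“accept”, 1, v1) to learners 1..n, which decide v1. Legend: 1: proposer; 1-n: acceptors; 1-n: learners.]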
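A well-behaved run of this kind, with one proposer, no failures, and no lost messages, can be simulated in a single function. This is a sketch under stated assumptions: the function name `well_behaved_run`, the list-based acceptor state, and the message log format are all invented for illustration, and learners are collapsed into a single decision at the end.

```python
from typing import List, Optional, Tuple


def well_behaved_run(num_acceptors: int, n: int, value: str) -> Tuple[Optional[str], List[tuple]]:
    """Simulate one failure-free run with a single proposer; return (decision, log)."""
    majority = num_acceptors // 2 + 1
    promised = [-1] * num_acceptors  # per-acceptor highest prepare number seen
    accepted: List[Optional[Tuple[int, str]]] = [None] * num_acceptors
    log: List[tuple] = []

    # Phase 1: the proposer sends ("prepare", n) and collects acks.
    acks = []
    for i in range(num_acceptors):
        log.append(("prepare", n))
        if n > promised[i]:
            promised[i] = n
            prev = accepted[i] or (None, None)
            acks.append(prev)
            log.append(("ack", n, prev[0], prev[1]))
    if len(acks) < majority:
        return None, log

    # Adopt the highest-numbered reported value, or keep our own if none is reported.
    reported = [p for p in acks if p[0] is not None]
    v = max(reported, key=lambda p: p[0])[1] if reported else value

    # Phase 2: the proposer sends ("accept", n, v); acceptors accept unless
    # they have promised a higher number.
    accepts = 0
    for i in range(num_acceptors):
        log.append(("accept", n, v))
        if n >= promised[i]:
            accepted[i] = (n, v)
            accepts += 1

    decision = v if accepts >= majority else None
    if decision is not None:
        log.append(("decide", decision))
    return decision, log


decision, log = well_behaved_run(num_acceptors=3, n=1, value="v1")
```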

Page 16: Paxos Made Simple

Paxos is safe…

Intuition: if a proposal with value v is decided, then every higher-numbered proposal issued by any proposer has value v.

A majority of acceptors accept (n, v), so v is decided.

Consider the next prepare request, with proposal number n+1: its majority of acks overlaps the majority that accepted (n, v), so the proposer must adopt v. (What if n+k?)

Page 17: Paxos Made Simple

Safety (proof)

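Suppose (n, v) is the earliest proposal that passed. If there is none, safety holds trivially.

Let (n’, v’) be the earliest proposal issued after (n, v) with a different value v’!=v.

Issuing (n’, v’) requires acks from a majority of acceptors, and (n, v) was accepted by a majority. Thus some acceptor both accepted (n, v) and acked the prepare request for n’, and its ack suggests value v with number k>=n.

Since (n’, v’) was nevertheless issued with v’!=v, its proposer must have received a response (“ack”, n’, j, v’) to its prepare request, with n<j<n’. But then (j, v’) is a proposal issued after (n, v) and before (n’, v’) with a different value, which contradicts the choice of (n’, v’) as the earliest.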
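The step "some acceptor approved both" rests on the fact that any two majorities of the same set of acceptors share at least one member. A brute-force check for small clusters makes this concrete; the helper names `majorities` and `all_majorities_intersect` are hypothetical, and enumeration is only practical for a handful of nodes.

```python
from itertools import combinations
from typing import List, Set


def majorities(nodes: List[int]) -> List[Set[int]]:
    """Enumerate every subset of nodes containing more than half of them."""
    need = len(nodes) // 2 + 1
    return [set(c)
            for k in range(need, len(nodes) + 1)
            for c in combinations(nodes, k)]


def all_majorities_intersect(nodes: List[int]) -> bool:
    """Check that every pair of majorities has at least one node in common."""
    quorums = majorities(nodes)
    return all(q1 & q2 for q1 in quorums for q2 in quorums)


five_nodes_ok = all_majorities_intersect([0, 1, 2, 3, 4])
four_nodes_ok = all_majorities_intersect([0, 1, 2, 3])
```

The general argument is a counting one: two sets that each hold more than half of the acceptors cannot be disjoint, since together they would hold more acceptors than exist.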

Page 18: Paxos Made Simple

Liveness

Fischer-Lynch-Paterson (1985): no deterministic protocol can guarantee consensus in an asynchronous communication system if even one process may fail.

Intuition: a “failed” process may just be slow, and can rise from the dead at exactly the wrong time.

Page 19: Paxos Made Simple

Liveness

FLP tells us that it is impossible for an asynchronous system to agree on anything with accuracy and liveness!

Liveness requires that agents are free to accept different values in subsequent rounds.

But: safety requires that once some round succeeds, no subsequent round can change it.

Page 20: Paxos Made Simple

Liveness (cont.)

The paper gives a scenario with 2 proposers in which no decision can ever be made.

As the paper points out, selecting a distinguished proposer solves the problem (“leader election”).

But Paxos doesn’t block in case of a lost message: Phase 1 can start with a new rank even if previous attempts never ended.
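The two-proposer scenario can be reproduced with a toy simulation. It is heavily simplified and every name in it is an assumption: the majority of acceptors is collapsed into one shared `promised` number, proposers take strict turns, and fresh proposal numbers come from a global counter.

```python
from typing import Dict, List, Tuple


def dueling_proposers(proposers: List[str], steps: int) -> Tuple[bool, List[str]]:
    """Alternate turns among proposers; return (value_chosen, trace of events)."""
    promised = 0                   # highest prepare number the acceptors responded to
    pending: Dict[str, int] = {}   # proposer -> number it prepared but has not yet used
    next_n = 1                     # source of fresh, increasing proposal numbers
    trace: List[str] = []

    for step in range(steps):
        who = proposers[step % len(proposers)]
        if who in pending:
            # Phase 2: try to get the earlier prepared number accepted.
            n = pending.pop(who)
            if n >= promised:
                trace.append(f"{who}: accept {n} ok")
                return True, trace
            trace.append(f"{who}: accept {n} rejected")
        # Phase 1: (re)start with a new, higher number; this preempts the rival.
        n = next_n
        next_n += 1
        promised = n
        pending[who] = n
        trace.append(f"{who}: prepare {n} ok")
    return False, trace


livelocked, trace = dueling_proposers(["p", "q"], steps=20)
solo, _ = dueling_proposers(["p"], steps=2)
```

With two proposers, each accept request arrives after the rival's newer prepare and is rejected, so no value is chosen within the simulated steps; with a single distinguished proposer the first accept request succeeds.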

Page 21: Paxos Made Simple

Applications

Chubby lock service.

Petal: distributed virtual disks.

Frangipani: a scalable distributed file system.

Page 22: Paxos Made Simple

Summary

Consensus is “impossible”.

But this doesn’t turn out to be a big obstacle.

We can achieve consensus with probability one in many situations.

Paxos is an example of a consensus protocol, and a very simple one.

Page 23: Paxos Made Simple

If you are interested...

Lamport, Leslie (May 1998). "The Part-Time Parliament". ACM Transactions on Computer Systems 16 (2): 133–169. (http://research.microsoft.com/users/lamport/pubs/lamport-paxos.pdf)

Page 24: Paxos Made Simple

Questions?

Jenkins, if I want another yes-man, I’ll build one!