probabilistically bounded staleness for practical partial quorums brian hudson peter bailis,...

21
PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN, ION STOICA Data store replication results in a fundamental trade-off between operation latency and data consistency. In this paper, we examine this trade-off in the context of quorum-replicated data stores. Under partial, or non-strict quorum replication, a data store waits for responses from a subset of replicas before answering a query, without guaranteeing that read and write replica sets intersect. As deployed in practice, these con gurations provide only basic eventual consistency guarantees, with no limit to the recency of data returned. However, anecdotally, partial quorums are often “good enough” for practitioners given their latency bene ts. In this work, we explain why partial quorums are regularly acceptable in practice, analyzing both the staleness of data they return and the latency bene ts they offer. We introduce Probabilistically Bounded Staleness (PBS) consistency, which provides expected bounds on staleness with respect to both versions and wall clock time. We derive a closed-form solution for versioned staleness as well as model real-time staleness for representative Dynamo-style systems under internet-scale production workloads. Using PBS, we measure the latency-consistency trade-off for partial quorum systems. We quantitatively demonstrate how eventually consistent systems frequently return consistent data within tens of milliseconds while offering signi cant latency bene ts. CS 644

Upload: marquise-pitman

Post on 15-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL

PARTIAL QUORUMS

Brian Hudson

PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN, ION STOICA

Data store replication results in a fundamental trade-off between operation latency and data consistency. In this paper, we examine this trade-off in the context of quorum-replicated data stores. Under partial, or non-strict quorum replication, a data store waits for responses from a subset of replicas before answering a query, without guaranteeing that read and write replica sets intersect. As deployed in practice, these configurations provide only basic eventual consistency guarantees, with no limit to the recency of data returned. However, anecdotally, partial quorums are often “good enough” for practitioners given their latency benefits. In this work, we explain why partial quorums are regularly acceptable in practice, analyzing both the staleness of data they return and the latency benefits they offer. We introduce Probabilistically Bounded Staleness (PBS) consistency, which provides expected bounds on staleness with respect to both versions and wall clock time. We derive a closed-form solution for versioned staleness as well as model real-time staleness for representative Dynamo-style systems under internet-scale production workloads. Using PBS, we measure the latency-consistency trade-off for partial quorum systems. We quantitatively demonstrate how eventually consistent systems frequently return consistent data within tens of milliseconds while offering significant latency benefits.

CS 644

Page 2: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

Paper Selection

Did NOT propose yet another new consistency model or develop a system implementing new semantics

Instead it examines what level of consistency existing quorum-replicated systems actually provide. Amazon Dynamo Basho Riak Project Voldemort

Page 3: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

Background

Modern distributed data stores need to be scalable, highly available, and fast.

These systems often replicate data across different machines and often across datacenters for two reasons: Provide high availability when components

fail Provide improved performance by serving

requests from multiple replicas

Page 4: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

Background

To provide low read and write latency, these systems often forgo guaranteeing consistency of reads and instead opt for eventual consistency.

BUT – these “eventually consistent” systems make no guarantees on the staleness (recency in terms of version written) beyond simply saying that the system will “eventually” return the most recent version in the absence of new writes.

Probabilistically Bounded Staleness (PBS) aims to fill this gap, providing SLA-style consistency predictions. Time: “How eventual?” Version history: “How consistent?” A way to quantify latency-consistency trade-offs

Page 5: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

Background – Quorum Systems Quorum systems ensure

strong consistency across reads and writes to replicas by ensuring that read and write replica sets overlap/intersect. R + W ≥ N

Partial quorum systems lower latency by requiring fewer replicas to respond. Sets of replicas written to

and read from need not overlap: R + W ≤ N

N replicas/keyRead: wait for R repliesWrite: wait for W acks

R = read quorum sizes W = write quorum sizes N = number of replicas

N = 3, W = 2

Page 6: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

Background – Partial Quorums Probabilistic Quorum Systems

Provides probabilistic guarantees of quorum intersection

By scaling the number of replicas, an arbitrarily high probability of consistency can be achieved

Page 7: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: k-staleness

k-staleness Models the probability

that we will read a value that is no more than k versions older than the last written version.

How consistent is eventual consistency?

A quorum system obeys PBS k-staleness consistency if, with probability 1 – psk, at least one value in any read quorum has been committed within k versions of the latest committed version when the read begins.

Page 8: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: k-staleness

Simple closed form solution

N = 3, R = W = 1 Probability of returning

a version Within 2 versions is 0.5 Within 3 versions is

0.703 Within 5 versions >

0.868 Within 10 versions >

0.98

Number of quorums of size R composed of nodes that were not written to in the write quorum divided by the number of possible read quorums

Page 9: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: k-staleness

N = 3, R = W = 1 Probability of

returning a version Within 2 versions is

0.5 Within 3 versions is

0.703 Within 5 versions >

0.868 Within 10 versions >

0.98

Page 10: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: t-staleness

t-visibility Models the probability of inconsistency for

expanding quorums. t-visibility is the probability that a read operation, starting t seconds after a write commits, will observe the latest value of a data item.

This t captures the expected length of the “window of consistency”.

How eventual is eventual consistency?

Page 11: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: t-staleness

The above equation makes several assumptions Reads occur instantly Writes commit immediately after W replicas

have the version Not realistic! T-staleness in real systems

depend on write latency and propagation speeds

Page 12: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: t-staleness

WARS Model Describes the

message latencies between coordinator and a single replica for a write followed by a read t seconds after commit. In an N replica system, this messaging occurs N times.

Page 13: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: t-staleness

Given a distribution for each WARS, calculating t-visibility for a given value of t is straightforward

Implemented a Monte Carlo simulation using the WARS model for computing t-staleness

The simulation runs quickly and web version can be found here: http://pbs.cs.berkeley.edu/#demo

Modified Cassandra, instrumented it to record information needed to form WARS distributions Source code for patch is available online WARS predictions closely matched empirical

observations, validating the simulation

Page 14: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: Production Latency Distributions

Obtained real-world latency information from LinkedIn on a Voldemort distribution running on traditional disks and one on SSDs.

Single node under peak traffic, representing 60% read and 40% read-modify-write traffic

Big improvement with SSDs! Voldemort goes from being IO bound when running on spinning drives to being CPU or network bound when running on SSDs.

Probably why Amazon DynamoDB runs on SSDs

Page 15: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS: Production Latency Distributions

Yammer provides private social networking for companies.

Uses Riak

Page 16: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS Latency Model Fitting

Provided data (summary statistics) was under-specified. WARS requires distributions.

Authors massaged this data into WARS distributions

NOTE: WARS distribution information can be easily collected (as their Cassandra patch shows), it just currently is not collected in practice

Page 17: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS

Maintaining a large number of replicas for availability or better performance results in a potentially large impact on consistency immediately following a write, but it still converges quickly.

Write latency has a huge impact: compare SSD to disk SSD write latency median .489ms Disk write latency median 1.5ms SSD has 97.4% chance of consistent reads immediately after write Disk has 43.9% chance of consistent reads immediately after write

Data suggests SSDs can make a HUGE impact on consistency due to reduce write latency and write variance. Probably why Amazon DynamoDB is built using only SSDs

Page 18: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

PBS

Increasing N while keeping R and W constant Maintaining a large number of replicas for

availability or better performance results in a potentially large impact on consistency immediately following a write, but it still converges quickly.

Page 19: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

Cassandra Defaults

Page 20: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

Riak Defaults

Page 21: PROBABILISTICALLY BOUNDED STALENESS FOR PRACTICAL PARTIAL QUORUMS Brian Hudson PETER BAILIS, SHIVARAM VENKATARAMAN, MICHEAL J. FRANKLIN, JOSEPH M. HELLERSTEIN,

Riak “Low Value” Data Recommendation