Riak DT Map: A Composable, Convergent Replicated Dictionary

Russell Brown*, Sean Cribbs, Sam Elliot, Christopher Meiklejohn @basho.com

First Workshop on the Principles and Practice of Eventual Consistency, April 13, 2014

Sorry

• Informal “experience report”

• Crap slides

Platform

• Riak

• Dynamo[1] Inspired KV Store

• Clustered and Replicated

• Eventually Consistent

Platform

• Consistent hashing[5]

• N replicas == Preference List

• R/W/PW/PR sloppy/strict quorums
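
As a rough illustration of how consistent hashing yields a preference list, here is a minimal sketch. The function names, the 64-partition ring, and the use of SHA-1 are assumptions made for this example; a real preference list maps partitions to nodes and handles unavailable ones, which this ignores.

```python
import hashlib
from bisect import bisect_right

RING_SIZE = 2**160  # size of the SHA-1 output space (assumed hash for this sketch)


def key_position(key: str) -> int:
    """Hash a key onto the ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")


def preference_list(key: str, num_partitions: int, n: int) -> list[int]:
    """Return the indexes of the n partitions that follow the key's position."""
    partition_size = RING_SIZE // num_partitions
    starts = [i * partition_size for i in range(num_partitions)]
    # First partition whose start lies after the key, wrapping around the ring.
    first = bisect_right(starts, key_position(key)) % num_partitions
    return [(first + i) % num_partitions for i in range(n)]


# Hypothetical key; N = 3 replicas on a 64-partition ring.
preflist = preference_list("player:42", num_partitions=64, n=3)
```

The same key always hashes to the same run of N consecutive partitions, which is what lets any node compute the replicas for a key without coordination.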

Platform

• Version Vectors Per Key

• Partition is a Vnode is an Actor

• Opaque Values

• Syntactic Merge in Riak

• Semantic Merge at Client Application
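
A syntactic merge can be sketched as a comparison of version vectors that never looks inside the value. This is an illustrative sketch, not Riak's code; `descends`, `syntactic_merge`, and the sample values are names invented here.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def syntactic_merge(stored, incoming):
    """Resolve two (clock, value) pairs by clocks alone; values stay opaque."""
    (clock_a, _), (clock_b, _) = stored, incoming
    if descends(clock_b, clock_a):
        return [incoming]          # incoming supersedes (or equals) stored
    if descends(clock_a, clock_b):
        return [stored]            # stored already covers incoming
    return [stored, incoming]      # concurrent: keep both as siblings


v1 = ({"a": 1}, b"gold=10")
v2 = ({"a": 2}, b"gold=15")
v3 = ({"a": 1, "b": 1}, b"gold=12")

newer = syntactic_merge(v1, v2)      # v2 descends v1, so only v2 survives
siblings = syntactic_merge(v2, v3)   # neither descends the other
```

Deciding what `gold` should be when two siblings come back is the semantic merge, and that part is left to the client application.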

History

• “Siblings” are a usability problem

• Application-level semantic merges “are ad-hoc and error-prone.” [Shapiro et al]

Problem?

• Focussed on a single use case

• Aimed for general usefulness

• Aware of CRDTs[2]

Use Case

• Mobile game progress data

• Game State

• Non-trivial merge

Desired Outcome

• Express updates as operations

• Apply related updates together

• Avoid “hand coded” resolution

Engineering Constraints

• Can’t build a new database from scratch

• Support Riak’s features

• Anti Entropy

• MDC Replication

• Hide CRDTs in Riak!

• Riak Object, a sort of multi value register

How Hard Can It Be?

• Ignorance as an asset!

• Aim: Ad Hoc general composition of any CRDTs

• Result: Something less (but useful (we hope))

Structure

• ORSWOT[3] like

• DVV[4] like

• A “dot” is the {actor, count} pair for a single event in the version vector's history

• The event at time of creation

Structure

• {version_vector, [{{name, type}, [{dot, crdt},…]},…], [deferred]}

Structure

• Version vector for Map

• Set of {name, type} pairs as field names, each of which maps to

• a set of {dot, crdt} pairs

• List of {version_vector, [field_name]} pairs for deferred operations

Structure

• e.g.

{[{a, 3}, {b, 2}, {d, 1}],
 [{{“gold”, counter}, [{{a,1}, …}, {{b, 1}, …}]},
  {{“weapons”, set}, [{{b, 2}, …}, {{d, 1}, …}]}],
 [{[{a, 5}, {z, 25}], [{“cars”, set}]}]}

• First line: version vector

• Second and third lines: each field with its dots and CRDTs

• Last line: deferred
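
For readers who prefer a typed view, the same shape can be mirrored in Python. This is only a sketch: the `Map` class and its attribute names are invented here, and the embedded CRDT states that the slide elides are left as `...` rather than filled in.

```python
from dataclasses import dataclass, field

Dot = tuple[str, int]          # (actor, count)
Clock = dict[str, int]         # version vector: actor -> highest count seen
FieldName = tuple[str, str]    # (name, type)


@dataclass
class Map:
    clock: Clock = field(default_factory=dict)
    # field name -> {dot: embedded CRDT state}
    entries: dict[FieldName, dict[Dot, object]] = field(default_factory=dict)
    # (context clock, field names) for removes that cannot be applied yet
    deferred: list[tuple[Clock, list[FieldName]]] = field(default_factory=list)


def seen(clock: Clock, dot: Dot) -> bool:
    """True if the clock already covers this single event."""
    actor, count = dot
    return clock.get(actor, 0) >= count


def descends(a: Clock, b: Clock) -> bool:
    """True if clock a covers every event in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


example = Map(
    clock={"a": 3, "b": 2, "d": 1},
    entries={
        ("gold", "counter"): {("a", 1): ..., ("b", 1): ...},
        ("weapons", "set"): {("b", 2): ..., ("d", 1): ...},
    },
    deferred=[({"a": 5, "z": 25}, [("cars", "set")])],
)

# Every dot in the entries is covered by the map clock, while the deferred
# remove carries a context the clock has not caught up with yet.
all_dots_seen = all(seen(example.clock, d)
                    for dots in example.entries.values() for d in dots)
still_future = not descends(example.clock, example.deferred[0][0])
```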

Semantics

• Observed Remove (add wins)

• Updating embedded CRDT is a field Add

• An update merges the field's set of dotted CRDTs, creates a new {dot, crdt} pair, and discards the merged entries.

Semantics

• Concurrent field remove + update: update persists

• Removed “dots” are dropped from the remaining CRDT

• e.g. A with clock [{a,3}, {b,1}] removes {“weapons”, set} at {a,3}, with entries

{a, 1} -> “sword”
[{a, 2}, {b,1}] -> “dagger”
{a, 3} -> “bazooka”

Semantics

• Merges with B, which has clock [{b, 3}, {a, 2}], with entries

{a, 1} -> “sword”
[{a, 2}, {b,1}] -> “dagger”
{b,2} -> “Halberd”
{b,3} -> “ICBM”

• Result is [{b,3}, {a,3}] -> {“weapons”, set} -> {b, 3} -> [{b, 2} -> “Halberd”, {b,3} -> “ICBM”]
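
The weapons example can be replayed in a few lines. This sketch only models the set's entries and the two clocks; it assumes an entry survives when at least one of its dots is unseen by the replica that removed the field, and it does not model the field-level dot shown in the result. The helper names are invented for the example.

```python
def seen(clock, dot):
    """True if the clock already covers this single event."""
    actor, count = dot
    return clock.get(actor, 0) >= count


def merge_clocks(x, y):
    """Pairwise maximum of two version vectors."""
    return {actor: max(x.get(actor, 0), y.get(actor, 0))
            for actor in x.keys() | y.keys()}


def survivors(entries, remover_clock):
    """Keep elements with at least one dot the removing replica never saw."""
    kept = {}
    for element, dots in entries.items():
        unseen = [d for d in dots if not seen(remover_clock, d)]
        if unseen:
            kept[element] = unseen
    return kept


clock_a = {"a": 3, "b": 1}   # A has removed the field, so it holds no entries
clock_b = {"b": 3, "a": 2}
weapons_at_b = {
    "sword": [("a", 1)],
    "dagger": [("a", 2), ("b", 1)],
    "Halberd": [("b", 2)],
    "ICBM": [("b", 3)],
}

merged_clock = merge_clocks(clock_a, clock_b)
merged_weapons = survivors(weapons_at_b, clock_a)
```

“sword” and “dagger” carry only dots that A had already seen when it removed the field, so they go; “Halberd” and “ICBM” were added concurrently at B and persist.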

Semantics

• Shared causal context

• The Map's version vector is also the embedded Set's version vector

• And the embedded Map's version vector

• Dots are shared down through structure

Semantics

• Remove ops that carry a “future” context are “deferred”

• Remove dots contained in context

• Defer removal of unseen dots.
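
A context-carrying remove might look like the sketch below. The dictionary layout and the `remove_field` name are assumptions for illustration; the point is only the two steps: drop the dots the context covers, and remember the remove if the context is ahead of the local clock.

```python
def seen(clock, dot):
    """True if the clock already covers this single event."""
    actor, count = dot
    return clock.get(actor, 0) >= count


def descends(a, b):
    """True if clock a covers every event in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def remove_field(state, name, context):
    """Remove the dots covered by context; defer if context is from the future."""
    dots = state["fields"].get(name, {})
    remaining = {d: v for d, v in dots.items() if not seen(context, d)}
    if remaining:
        state["fields"][name] = remaining
    else:
        state["fields"].pop(name, None)
    if not descends(state["clock"], context):
        state["deferred"].append((context, name))
    return state


# The client read from a replica that had seen {b, 1}; this replica has not.
behind = {"clock": {"a": 1},
          "fields": {"cars": {("a", 1): "set state"}},
          "deferred": []}
remove_field(behind, "cars", context={"a": 1, "b": 1})

# The client's context is older than the replica: only the seen dot is removed.
ahead = {"clock": {"a": 1, "b": 1},
         "fields": {"cars": {("a", 1): "old", ("b", 1): "new"}},
         "deferred": []}
remove_field(ahead, "cars", context={"a": 1})
```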

Merge

• Merge Map Version Vectors

• For each field, keep dots on both sides, drop dominated dots, keep frontier dots

• If a field is on one side, but not on the other, merge “empty” CRDT with absent side version vector (drops internal dots)

• Union deferred operations

• Apply deferred operations that are no longer from “the future”
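
The merge steps above can be sketched as follows. To keep it short, each field is reduced to its set of dots and the embedded CRDT states are left out, so the “empty CRDT” step collapses into dropping dominated dots. The state layout and function names are assumptions for this example, not the riak_dt code.

```python
def seen(clock, dot):
    """True if the clock already covers this single event."""
    actor, count = dot
    return clock.get(actor, 0) >= count


def descends(a, b):
    """True if clock a covers every event in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def merge_dots(dots_a, clock_a, dots_b, clock_b):
    """Keep common dots plus dots the other side has never seen (the frontier)."""
    common = dots_a & dots_b
    frontier_a = {d for d in dots_a - dots_b if not seen(clock_b, d)}
    frontier_b = {d for d in dots_b - dots_a if not seen(clock_a, d)}
    return common | frontier_a | frontier_b


def merge(a, b):
    clock = {k: max(a["clock"].get(k, 0), b["clock"].get(k, 0))
             for k in a["clock"].keys() | b["clock"].keys()}
    fields = {}
    for name in a["fields"].keys() | b["fields"].keys():
        dots = merge_dots(a["fields"].get(name, set()), a["clock"],
                          b["fields"].get(name, set()), b["clock"])
        if dots:
            fields[name] = dots
    deferred = []
    for context, name in a["deferred"] + b["deferred"]:
        # Re-run each deferred remove against the merged fields.
        if name in fields:
            fields[name] = {d for d in fields[name] if not seen(context, d)}
            if not fields[name]:
                del fields[name]
        # Keep it only while its context is still ahead of the merged clock.
        if not descends(clock, context) and (context, name) not in deferred:
            deferred.append((context, name))
    return {"clock": clock, "fields": fields, "deferred": deferred}


a = {"clock": {"a": 2},
     "fields": {"gold": {("a", 1)}, "weapons": {("a", 2)}},
     "deferred": [({"b": 1}, "cars")]}      # remove of "cars" arrived early
b = {"clock": {"a": 1, "b": 1},
     "fields": {"cars": {("b", 1)}},        # b saw {a, 1} and removed "gold"
     "deferred": []}

merged = merge(a, b)
```

In this run “gold” is dropped because b's clock dominates its only dot, “weapons” is kept as a frontier dot, and “cars” disappears once the deferred remove is no longer from the future.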

Op Based Interface

• Send batch of operations

• Execute atomically at one replica

• Replicate new state to preflist

• Send map vv as context (for removes!)

Failure

• Not arbitrary composition

• Embedded ORSWOT /= ORSWOT

• Embedded Cntr /= PN-counter

• Ask me about the counter I made

Global Clock

• Map clock is the clock for all embedded CRDTs

• Dots must be passed down through structure

• e.g. embedded ORSWOT

Generality?

• Can't just "bang together" CRDTs

• Must share some causal relationship (? verify ?)

Embedded ORSWOT - bad

• ORSWOT has a VV

• Update ORSWOT field at A

• [{a, 1}] -> "sword"

• replicate to B

• remove field at A

• Update ORSWOT field

• [{a, 1}] -> "bazooka"

• BUSTED!
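
One reading of why this is busted: when the embedded set keeps its own version vector, removing the field throws that vector away, so the next update reuses the dot {a, 1} and B mistakes “bazooka” for something it has already seen. The sketch below replays that reading with a simplified ORSWOT merge (one dot per element, names invented here) and contrasts it with a clock that keeps counting, where the second update is assumed to receive {a, 2}.

```python
def seen(clock, dot):
    """True if the clock already covers this single event."""
    actor, count = dot
    return clock.get(actor, 0) >= count


def orswot_merge(a, b):
    """Merge two (clock, {element: dot}) states; returns the surviving entries."""
    (clock_a, entries_a), (clock_b, entries_b) = a, b
    merged = {}
    for mine, theirs, their_clock in ((entries_a, entries_b, clock_b),
                                      (entries_b, entries_a, clock_a)):
        for element, dot in mine.items():
            # Keep entries both sides hold, or that the other side never saw.
            if theirs.get(element) == dot or not seen(their_clock, dot):
                merged[element] = dot
    return merged


at_b = ({"a": 1}, {"sword": ("a", 1)})

# Embedded set with its own clock: the clock restarted after the field remove.
own_clock_at_a = ({"a": 1}, {"bazooka": ("a", 1)})
busted = orswot_merge(own_clock_at_a, at_b)

# Clock shared with the map: the count carries on past the remove.
shared_clock_at_a = ({"a": 2}, {"bazooka": ("a", 2)})
shared = orswot_merge(shared_clock_at_a, at_b)
```

With the restarted clock the merge loses “bazooka” entirely; with the shared clock “sword” is correctly dropped and “bazooka” survives.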

Some mistaeks I mad

• Merging Field CRDTs in Merge - need event

• Not dropping dominated dots

• Can be in both sides, but removed after merge

• Not sharing causal context down the Map

Map “design” iterations

• OR-Set[2] of Fields

• Lattice of Types

What’s in a name?

• {“gold”, counter} -> counter

• {“gold”, set} -> set

• Sam Elliot - intern extraordinaire

Next Iteration

• OR-Sets get BIG

• Updating a CRDT means adding the field (again!)

• ORSWOT[3] to the rescue

ORSWOT

• Didn’t understand the paper

• Didn’t grok CvRDT - CmRDT equivalence

• Learned about DVV[4]

• (re)Invented the ORSWOT!

[Diagram sequence: ORSWOT replicas, each drawn as a version vector with its dots and elements]

• Step 1. One replica: [{a, 1}] with {a, 1} -> Shelly

• Step 2. First replica: [{a, 1}] with {a, 1} -> Shelly. Second replica: [{a, 1}] with {a, 1} -> Shelly

• Step 3. First replica: [{a, 1}] with {a, 1} -> Shelly. Second replica: [{a, 1}, {b, 3}] with {a, 1} -> Shelly; dots {b, 1}, {b, 2}, {b, 3}; elements Bob, Pete, Phil

• Step 4. Both replicas: [{a, 1}, {b, 3}] with {a, 1} -> Shelly; dots {b, 1}, {b, 2}, {b, 3}; elements Bob, Pete, Phil

• Step 5. First replica: [{a, 2}, {b, 3}] with {a, 1} -> Shelly, {a, 2} -> Anna; dots {b, 1}, {b, 3}; elements Bob, Pete. Second replica: [{a, 1}, {b, 3}] with {a, 1} -> Shelly; dots {b, 1}, {b, 2}, {b, 3}; elements Bob, Pete, Phil

• Step 6. First replica: [{a, 2}, {b, 3}] with {a, 1} -> Shelly, {a, 2} -> Anna; dots {b, 1}, {b, 3}; elements Bob, Pete. Second replica: [{a, 2}, {b, 3}] with {a, 1} -> Shelly; dots {b, 1}, {a, 2}, {b, 3}; elements Bob, Pete, Anna

Flag (an aside)

• Observed Remove Boolean Flag

• Is a new thing?

Next Iteration

• ORSWOT like

• {vv, [{{name, type}, [{dot, crdt}]}]}

• on remove, simply discard

• on merge, keep the un-dominated and common dots

On Going Work

• More compact representation

• More efficient merge implementation

• Replicate “fragments” not whole state

• Depend on Active Anti-Entropy more

Production Report

• Not in Production Yet :(

• Basho’s difficult 2nd Album - Riak 2.0

• Some ML posts - positive experiences

• Some customer use cases

• TapJoy - Maps of Counters for Stats

Testing

• Verification is hard

• No general tools for jobbing engineers

• Property-based testing is a reasonable compromise

QuickCheck

• Model the system as a set of actors

• Model the state

• Generate millions of interleavings of operations

• create, add field, remove field, update field
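
The flavour of such a test can be shown without QuickCheck itself. The sketch below uses Python's `random` module in place of a real property-testing tool, models three actors holding a stripped-down map (fields reduced to sets of dots), generates random interleavings of add, remove, and merge, and checks one property: merging all replicas in different orders gives the same state. Every name and size here is an assumption for illustration.

```python
import random

ACTORS = "abc"
FIELDS = ["gold", "weapons", "cars"]


def seen(clock, dot):
    actor, count = dot
    return clock.get(actor, 0) >= count


def merge(x, y):
    clock = {k: max(x["clock"].get(k, 0), y["clock"].get(k, 0))
             for k in x["clock"].keys() | y["clock"].keys()}
    fields = {}
    for name in x["fields"].keys() | y["fields"].keys():
        dx, dy = x["fields"].get(name, set()), y["fields"].get(name, set())
        dots = ((dx & dy)
                | {d for d in dx - dy if not seen(y["clock"], d)}
                | {d for d in dy - dx if not seen(x["clock"], d)})
        if dots:
            fields[name] = dots
    return {"clock": clock, "fields": fields}


def add_field(state, actor, name):
    """Adding (or updating) a field mints a new dot that replaces the old ones."""
    count = state["clock"].get(actor, 0) + 1
    return {"clock": {**state["clock"], actor: count},
            "fields": {**state["fields"], name: {(actor, count)}}}


def remove_field(state, name):
    fields = {k: v for k, v in state["fields"].items() if k != name}
    return {"clock": state["clock"], "fields": fields}


def run_trial(rng, steps=30):
    """Apply a random interleaving, then merge everything in three orders."""
    replicas = {actor: {"clock": {}, "fields": {}} for actor in ACTORS}
    for _ in range(steps):
        actor = rng.choice(ACTORS)
        op = rng.choice(["add", "remove", "merge"])
        if op == "add":
            replicas[actor] = add_field(replicas[actor], actor, rng.choice(FIELDS))
        elif op == "remove":
            replicas[actor] = remove_field(replicas[actor], rng.choice(FIELDS))
        else:
            replicas[actor] = merge(replicas[actor], replicas[rng.choice(ACTORS)])
    finals = []
    for order in ("abc", "cba", "bac"):
        state = {"clock": {}, "fields": {}}
        for actor in order:
            state = merge(state, replicas[actor])
        finals.append(state)
    return finals


def converged(finals):
    return all(f == finals[0] for f in finals)


all_converged = all(converged(run_trial(random.Random(seed)))
                    for seed in range(200))
```

A real property test would also compare the converged state against a model of what the map should contain, which is where most of the modelling effort goes.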

QuickCheck Lessons

• Bugs slipped through

• State space too large

• Smaller state space

• Focus on concurrent operations

Lets fight Talk

• Questions?

References

[1] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon’s highly available key-value store. SIGOPS Oper. Syst. Rev., 41(6):205–220, Oct. 2007.

[2] M. Shapiro, N. Preguiça, C. Baquero, and M. Zawirski. A comprehensive study of Convergent and Commutative Replicated Data Types. Rapport de recherche RR-7506, INRIA, Jan. 2011.

[3] A. Bieniusa, M. Zawirski, N. Preguiça, M. Shapiro, C. Baquero, V. Balegas, and S. Duarte. An Optimized Conflict-free Replicated Set. 2012. http://arxiv.org/abs/1210.3368

[4] N. Preguiça, C. Baquero, P. S. Almeida, V. Fonte, and R. Gonçalves. Dotted Version Vectors: Logical Clocks for Optimistic Replication. http://arxiv.org/abs/1011.5808

[5] Karger et al. Web Caching with Consistent Hashing. 1999.