Dynamo Systems - QCon SF 2012 Presentation

Dynamo: Theme and Variations

Upload: shanley-kane

Post on 28-Nov-2014


DESCRIPTION

A look at Dynamo-based systems: the architectural principles, use cases and requirements; where they differ from relational databases; and where they are going.

TRANSCRIPT

Page 1: Dynamo Systems - QCon SF 2012 Presentation

Dynamo: Theme and Variations

Page 2: Dynamo Systems - QCon SF 2012 Presentation

@shanley

Page 3: Dynamo Systems - QCon SF 2012 Presentation

Riak

Page 4: Dynamo Systems - QCon SF 2012 Presentation
Page 5: Dynamo Systems - QCon SF 2012 Presentation
Page 6: Dynamo Systems - QCon SF 2012 Presentation

150 Services

Page 7: Dynamo Systems - QCon SF 2012 Presentation
Page 8: Dynamo Systems - QCon SF 2012 Presentation
Page 9: Dynamo Systems - QCon SF 2012 Presentation

Global access
Multiple machines
Multiple datacenters
Scale to peak loads easily
Tolerance of continuous failure

Page 10: Dynamo Systems - QCon SF 2012 Presentation
Page 11: Dynamo Systems - QCon SF 2012 Presentation

Traditionally production systems store their state in relational databases. For many of the more common usage patterns of state persistence, however, a relational database is a solution that is far from ideal.

Most of these services only store and retrieve data by primary key and do not require the complex querying and management functionality offered by an RDBMS.

This excess functionality requires expensive hardware and highly skilled personnel for its operation, making it a very inefficient solution.

In addition, the available replication technologies are limited and typically choose consistency over availability.

Although many advances have been made in the recent years, it is still not easy to scale-out databases or use smart partitioning schemes for load balancing.

Dynamo: Amazon’s Highly Available Key-value Store

Page 12: Dynamo Systems - QCon SF 2012 Presentation

CAP Theorem

Page 13: Dynamo Systems - QCon SF 2012 Presentation

People tend to focus on consistency/availability as the sole driver of emerging database models because it provides a simple and academic explanation for more complex evolutionary factors. In fact, the CAP theorem, according to its original author, “prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare… there is little reason to forfeit C or A when the system is not partitioned.” In reality, a much larger range of considerations and tradeoffs has informed the “NoSQL” movement…

Page 14: Dynamo Systems - QCon SF 2012 Presentation


Page 15: Dynamo Systems - QCon SF 2012 Presentation

Spanner is Google’s scalable, multi-version, globally-distributed, and synchronously-replicated database… It is the first system to distribute data at global scale and support externally-consistent distributed transactions...

Spanner is designed to scale up to millions of machines across hundreds of datacenters and trillions of database rows… Spanner’s main focus is managing cross-datacenter replicated data…

Spanner started… as part of a rewrite of Google’s advertising backend called F1 [35]. This backend was originally based on a MySQL database…

Resharding this revenue-critical database as it grew in the number of customers and their data was extremely costly. The last resharding took over two years of intense effort…

Spanner: Google’s Globally-Distributed Database

Page 16: Dynamo Systems - QCon SF 2012 Presentation

Shanley’s Theorem

Page 17: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

Page 18: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

Stringent latency requirements measured at the 99.9th percentile
Highly available
Always writeable
Modeled as keys/values

Page 19: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

Choice to manage conflict resolution themselves or manage on the data store level

Simple, primary-key only interface
No need for relational data model

Page 20: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

Functions on commodity hardware
Each object must be replicated across multiple DCs

Can scale out one node at a time with minimal impact on system and operators

Page 21: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

Page 22: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

1995: Less than 40 million internet users; now: 2.4 billion

Latency perceived as unavailability
New types of applications

Page 23: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

Much more data
Unstructured data
New kinds of business requirements
App scales gracefully without high development overhead

Page 24: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

Scale-out design on less expensive hardware
Ability to easily meet peak loads
Run efficiently across multiple sites
Low operational burden

Page 25: Dynamo Systems - QCon SF 2012 Presentation

Aspects of the database:
• How to distribute data around the cluster
• Adding new nodes
• Replicating data
• Resolving data conflicts
• Dealing with failure scenarios
• Data model

Page 26: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

how to distribute data around the cluster

Page 27: Dynamo Systems - QCon SF 2012 Presentation

Database design is driven by a virtuous tension between the requirements of the app, the profile of developer productivity, and the limitations of the operational scenario.

how to distribute data around the cluster

Page 28: Dynamo Systems - QCon SF 2012 Presentation

how to distribute data around the cluster

Bunny Names A-G

Bunny Names H-R

Bunny Names S-Z

Page 29: Dynamo Systems - QCon SF 2012 Presentation

how to distribute data around the cluster

Bunny Names Disproportionately Trend Towards Bunny, Cuddles, Fluffy, Mr. Bunny, Peter Rabbit, Velveteen, Peter Cottontail, and Mitten

Page 30: Dynamo Systems - QCon SF 2012 Presentation

how to distribute data around the cluster

Page 31: Dynamo Systems - QCon SF 2012 Presentation

how to distribute data around the cluster

Reduce risk of hot spots in the database
Data is automatically assigned to nodes
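The slides do not name the scheme behind "automatically assigned", but the Dynamo paper describes consistent hashing, so the following is a minimal sketch under that assumption. It contrasts the alphabetical ranges from the bunny-name slides with a hash ring. The node names, the `HashRing` class, the choice of MD5, and the 64 virtual nodes per machine are all illustrative assumptions, not details from the talk.

```python
import bisect
import hashlib


def key_hash(value: str) -> int:
    """Map a string to a position on a 2**32 ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16) % 2**32


class HashRing:
    """Hypothetical consistent-hash ring; not any particular product's API."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets several ring positions so that load spreads more evenly.
        self._ring = sorted(
            (key_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first position at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._positions, key_hash(key)) % len(self._ring)
        return self._ring[idx][1]


def range_node_for(name: str) -> str:
    """Alphabetical range partitioning, as on the bunny-name slides."""
    first = name[0].upper()
    if first <= "G":
        return "A-G"
    if first <= "R":
        return "H-R"
    return "S-Z"


names = ["Bunny", "Cuddles", "Fluffy", "Mr. Bunny", "Peter Rabbit",
         "Velveteen", "Peter Cottontail", "Mitten"]
ring = HashRing(["node-a", "node-b", "node-c"])
by_range = {name: range_node_for(name) for name in names}
by_hash = {name: ring.node_for(name) for name in names}
```

With range partitioning, popular names pile up on whichever shard owns their first letter. With hashing, placement depends only on the hash of the key, so no one has to pick the ranges by hand.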

Page 32: Dynamo Systems - QCon SF 2012 Presentation

how to distribute data around the cluster

Page 33: Dynamo Systems - QCon SF 2012 Presentation

how to distribute data around the cluster

Page 34: Dynamo Systems - QCon SF 2012 Presentation

how to distribute data around the cluster

Page 35: Dynamo Systems - QCon SF 2012 Presentation

adding new nodes

Page 36: Dynamo Systems - QCon SF 2012 Presentation

adding new nodes

Page 37: Dynamo Systems - QCon SF 2012 Presentation

adding new nodes

Bunny Names A-G

Page 38: Dynamo Systems - QCon SF 2012 Presentation

adding new nodes

Bunny Names A-G (label repeated four times on the slide)

Page 39: Dynamo Systems - QCon SF 2012 Presentation

adding new nodes

Page 40: Dynamo Systems - QCon SF 2012 Presentation

adding new nodes

Page 41: Dynamo Systems - QCon SF 2012 Presentation

adding new nodes

“decoupling of partitioning and partition placement”
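One way to read the quoted phrase: the key space is cut into a fixed number of partitions once, and a separate table records which node owns each partition. Adding a node then only changes entries in that table. The sketch below illustrates the idea under assumptions of my own (12 partitions, round-robin initial placement, and the helper names `initial_placement` and `add_node`); it is not taken from the talk.

```python
from collections import Counter

NUM_PARTITIONS = 12  # fixed when the cluster is created (assumed value)


def initial_placement(nodes):
    """Assign the fixed partitions to nodes round-robin."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def add_node(placement, new_node):
    """Hand partitions to new_node until it holds its fair share.

    Partition boundaries never change; only ownership entries move.
    """
    placement = dict(placement)
    node_count = len(set(placement.values())) + 1
    fair_share = NUM_PARTITIONS // node_count
    for _ in range(fair_share):
        # Take one partition from whichever existing node currently holds the most.
        load = Counter(owner for owner in placement.values() if owner != new_node)
        donor = load.most_common(1)[0][0]
        victim = max(p for p, owner in placement.items() if owner == donor)
        placement[victim] = new_node
    return placement


before = initial_placement(["node-a", "node-b", "node-c"])
after = add_node(before, "node-d")
moved = [p for p in before if before[p] != after[p]]
```

Only the partitions listed in `moved` need their data transferred; every other partition stays where it was, which is what keeps the impact of adding a single node small.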

Page 42: Dynamo Systems - QCon SF 2012 Presentation

replicating data

Page 43: Dynamo Systems - QCon SF 2012 Presentation

replicating data

master

slave slave

Page 44: Dynamo Systems - QCon SF 2012 Presentation

replicating data

master

slave slave

writes

Page 45: Dynamo Systems - QCon SF 2012 Presentation

replicating data

master

slave slave

writes

reads

Page 46: Dynamo Systems - QCon SF 2012 Presentation

replicating data

master

slave slave

Page 47: Dynamo Systems - QCon SF 2012 Presentation

replicating data

master

slave slave

1. Time to figure out the master is gone
2. Master election

Availability Clock

Page 48: Dynamo Systems - QCon SF 2012 Presentation

replicating data

master

slave slave

Consistency > Availability

Unavailable to writes until data can be confirmed correct

Availability Clock

Page 49: Dynamo Systems - QCon SF 2012 Presentation

replicating data

Page 50: Dynamo Systems - QCon SF 2012 Presentation

“Every node in Dynamo should have the same set of responsibilities as its peers; there should be no distinguished node or nodes that take special roles or extra set of responsibilities.”

replicating data

Page 51: Dynamo Systems - QCon SF 2012 Presentation

replicating data

[Diagram: reads and writes directed at every node]

Page 52: Dynamo Systems - QCon SF 2012 Presentation

replicating data

[Diagram: reads and writes directed at every node]

Clients can read / write to any node
All updates reach all replicas eventually

Page 53: Dynamo Systems - QCon SF 2012 Presentation

replicating data

w and r values

Page 54: Dynamo Systems - QCon SF 2012 Presentation

replicating data

number of replicas that need to participate in a read/write for a success response

Page 55: Dynamo Systems - QCon SF 2012 Presentation

put( )

replicating data

w = 1

only one node needs to be available to complete the write request

Page 56: Dynamo Systems - QCon SF 2012 Presentation

put( )

replicating data

w = 3
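The w and r settings can be sketched as a simple count: a request succeeds only if at least that many replicas take part. The example below assumes three replicas with two of them down; `quorum_met` is a hypothetical helper name, not a real client call.

```python
def quorum_met(replicas_up, needed):
    """True if at least `needed` replicas are reachable to take part in the request.

    replicas_up holds one boolean per replica (True means reachable).
    """
    return sum(replicas_up) >= needed


# Three replicas of an object; two of them are currently down.
replicas_up = [True, False, False]

w1_ok = quorum_met(replicas_up, needed=1)  # w = 1: one live replica is enough
w3_ok = quorum_met(replicas_up, needed=3)  # w = 3: every replica must respond
```

The same check applies to reads with r in place of w. Lower values favor availability, while higher values mean more replicas have seen the data before the client gets a success response.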

Page 57: Dynamo Systems - QCon SF 2012 Presentation

resolving conflicts

reads when not all writes have propagated (laggy or down node)

Page 58: Dynamo Systems - QCon SF 2012 Presentation

resolving conflicts

different clients update at the exact same time

Page 59: Dynamo Systems - QCon SF 2012 Presentation

a85hYGBgzmDKBVIsTFUPPmcwJTLmsTIcmsJ1nA8qzK7HcQwqfB0hzNacxCYWcA1ZIgsA

resolving conflicts

Page 60: Dynamo Systems - QCon SF 2012 Presentation

resolving conflicts

vector clocks that show relationships between objects

Whether one object is a direct descendant of the other
Whether the objects are direct descendants of a common parent
Whether the objects are unrelated in recent heritage
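A minimal sketch of how a vector clock can answer these questions, assuming a clock is a plain dictionary that maps an actor (a client or node ID) to an update counter. The function names are illustrative. This simplified comparison reports both "common parent" and "unrelated" cases as concurrent, since either way neither version supersedes the other.

```python
def increment(clock, actor):
    """Return a copy of the clock with the actor's counter bumped by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify the relationship of clock a to clock b."""
    a_from_b = descends(a, b)
    b_from_a = descends(b, a)
    if a_from_b and b_from_a:
        return "equal"
    if a_from_b:
        return "descendant"
    if b_from_a:
        return "ancestor"
    return "concurrent"  # diverged versions: a conflict that needs resolving


parent = increment({}, "client-x")
child = increment(parent, "client-x")   # a later update by the same client
left = increment(parent, "client-y")    # two clients update the same parent
right = increment(parent, "client-z")
```

Here `compare(child, parent)` reports a descendant, so the parent can be discarded safely, while `compare(left, right)` reports concurrent versions that must be reconciled.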

Page 61: Dynamo Systems - QCon SF 2012 Presentation

resolving conflicts

vector clock is updated when objects are updated
last-write wins or conflicts can be resolved on client side
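The two resolution strategies can be sketched side by side. The sibling format (a dictionary with `timestamp` and `value` fields) and the shopping-cart merge are assumptions chosen for illustration; the cart example echoes the Dynamo paper, not a specific slide.

```python
def last_write_wins(siblings):
    """Keep only the value with the newest timestamp; older siblings are dropped."""
    return max(siblings, key=lambda sibling: sibling["timestamp"])["value"]


def merge_carts(siblings):
    """Client-side resolution: union the items so that no added item is lost."""
    merged = set()
    for sibling in siblings:
        merged |= set(sibling["value"])
    return sorted(merged)


# Two conflicting versions of the same shopping cart.
siblings = [
    {"timestamp": 100, "value": ["carrots", "hay"]},
    {"timestamp": 105, "value": ["carrots", "lettuce"]},
]

lww_result = last_write_wins(siblings)
merged_result = merge_carts(siblings)
```

Last-write-wins is simple but silently discards "hay"; the client-side merge keeps everything, at the price of application code that understands the data.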

Page 62: Dynamo Systems - QCon SF 2012 Presentation

resolving conflicts

if stale responses are returned as part of the read, those replicas are updated
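This behavior is commonly called read repair. A rough sketch follows, assuming each replica is a dictionary that maps a key to a `(version, value)` pair. Integer versions stand in for vector clocks to keep the example short, and `read_with_repair` is a hypothetical name.

```python
def read_with_repair(replicas, key):
    """Read a key from every replica, return the newest value, and fix stale copies.

    replicas is a list of dicts mapping key -> (version, value).
    """
    responses = [replica.get(key) for replica in replicas]
    found = [response for response in responses if response is not None]
    if not found:
        return None
    newest = max(found, key=lambda response: response[0])
    for replica, response in zip(replicas, responses):
        # A missing or older copy is overwritten with the newest version.
        if response is None or response[0] < newest[0]:
            replica[key] = newest
    return newest[1]


replica_a = {"bunny:1": (2, "Fluffy")}
replica_b = {"bunny:1": (1, "Flufy")}   # stale copy
replica_c = {}                           # missed the write entirely

value = read_with_repair([replica_a, replica_b, replica_c], "bunny:1")
```

After the read, all three replicas hold version 2, so the read itself helps the cluster converge.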

Page 63: Dynamo Systems - QCon SF 2012 Presentation

failure conditions

Page 64: Dynamo Systems - QCon SF 2012 Presentation

failure conditions

n = 3

Page 65: Dynamo Systems - QCon SF 2012 Presentation

writes & updates

failure conditions

n = 3

Page 66: Dynamo Systems - QCon SF 2012 Presentation

failure conditions

hinted handoff
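Hinted handoff can be sketched as follows: when one of the n = 3 preferred replicas is down, another node accepts the write on its behalf along with a hint naming the intended owner, and delivers it once that owner is back. The `Node` class and the `put` and `handoff` functions are illustrative assumptions, not a real API.

```python
class Node:
    """Hypothetical storage node used only for this illustration."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (intended_node, key, value) held for nodes that were down


def put(key, value, preferred, fallback):
    """Write to each preferred replica, parking the write on fallback if one is down."""
    for node in preferred:
        if node.up:
            node.data[key] = value
        else:
            fallback.hints.append((node, key, value))


def handoff(fallback):
    """Deliver parked writes to intended nodes that have come back."""
    remaining = []
    for intended, key, value in fallback.hints:
        if intended.up:
            intended.data[key] = value
        else:
            remaining.append((intended, key, value))
    fallback.hints = remaining


a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
c.up = False
put("bunny:1", "Fluffy", preferred=[a, b, c], fallback=d)
hints_while_down = len(d.hints)

c.up = True
handoff(d)
```

The write is accepted even though one replica is unavailable, and the cluster returns to three full copies without operator involvement once node c recovers.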

Page 67: Dynamo Systems - QCon SF 2012 Presentation

developing apps

Page 68: Dynamo Systems - QCon SF 2012 Presentation

developing apps

“Most of these services only store and retrieve data by primary key and do not require the complex querying and management functionality offered by an RDBMS.”

Page 69: Dynamo Systems - QCon SF 2012 Presentation

developing apps

“schema-less”
more flexibility, agility
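As a small illustration of what "schema-less" can mean in practice, the sketch below stores JSON values of different shapes under plain string keys. The in-memory `store` dictionary and the `put`/`get` helpers are stand-ins invented for this example, not a real client library.

```python
import json

store = {}  # stand-in for the key/value store


def put(key, value):
    """Serialize any JSON-compatible value; the store itself never inspects it."""
    store[key] = json.dumps(value)


def get(key):
    """Fetch by primary key and decode the stored JSON."""
    return json.loads(store[key])


# Two values with different shapes, and no table definition or migration needed.
put("session:42", {"cart": ["carrot"], "ttl": 3600})
put("user:shanley", {"email_verified": True, "tags": ["speaker"], "plan": None})

session = get("session:42")
```

The flexibility comes with a tradeoff: because the store treats values as opaque, any structure or validation has to live in the application.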

Page 70: Dynamo Systems - QCon SF 2012 Presentation

developing apps

Application type | Key | Value
Session | User/Session ID | Session Data

Page 71: Dynamo Systems - QCon SF 2012 Presentation

developing apps

Application type | Key | Value
Session | User/Session ID | Session Data
Advertising | Campaign ID | Ad Content

Page 72: Dynamo Systems - QCon SF 2012 Presentation

developing apps

Application type | Key | Value
Session | User/Session ID | Session Data
Advertising | Campaign ID | Ad Content
Logs | Date | Log File

Page 73: Dynamo Systems - QCon SF 2012 Presentation

developing apps

Application type | Key | Value
Session | User/Session ID | Session Data
Advertising | Campaign ID | Ad Content
Logs | Date | Log File
Sensor | Date, Date/Time | Updates

Page 74: Dynamo Systems - QCon SF 2012 Presentation

developing apps

Application type | Key | Value
Session | User/Session ID | Session Data
Advertising | Campaign ID | Ad Content
Logs | Date | Log File
Sensor | Date, Date/Time | Updates
User Data | Login, Email, UUID | User Attributes

Page 75: Dynamo Systems - QCon SF 2012 Presentation

developing apps

Application type | Key | Value
Session | User/Session ID | Session Data
Advertising | Campaign ID | Ad Content
Logs | Date | Log File
Sensor | Date, Date/Time | Updates
User Data | Login, Email, UUID | User Attributes
Content | Title, Integer, Etc. | Text, JSON, XML

Page 76: Dynamo Systems - QCon SF 2012 Presentation

future

Page 77: Dynamo Systems - QCon SF 2012 Presentation

future

Page 78: Dynamo Systems - QCon SF 2012 Presentation

future

Page 79: Dynamo Systems - QCon SF 2012 Presentation

more data types

Page 80: Dynamo Systems - QCon SF 2012 Presentation

more data types

counters
sets

server-side structure and conflict resolution policy
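The slides do not say how such counters would be built. One known approach to a data type that carries its own conflict resolution policy is a grow-only counter, where each node tracks its own count and replicas merge by taking the per-node maximum. The `GCounter` class below is a sketch of that general technique, offered as an assumption about what server-side structure could look like.

```python
class GCounter:
    """Grow-only counter: one slot per actor, merged with an element-wise max."""

    def __init__(self):
        self.counts = {}

    def increment(self, actor, amount=1):
        self.counts[actor] = self.counts.get(actor, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Combine two replicas; the result is the same in either order."""
        merged = GCounter()
        for actor in self.counts.keys() | other.counts.keys():
            merged.counts[actor] = max(
                self.counts.get(actor, 0), other.counts.get(actor, 0)
            )
        return merged


# Two replicas accept increments independently, then reconcile.
replica_a = GCounter()
replica_b = GCounter()
replica_a.increment("node-a", 2)
replica_b.increment("node-b", 3)

total = replica_a.merge(replica_b).value()
```

Because the merge is deterministic and order-independent, the store can resolve concurrent updates itself instead of handing siblings back to the client.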

Page 81: Dynamo Systems - QCon SF 2012 Presentation

strong consistency

there is little reason to forfeit C or A when the system is not partitioned

Page 82: Dynamo Systems - QCon SF 2012 Presentation

strong consistency

conditional writes
consistent reads
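A conditional write succeeds only if the stored object is still at the version the writer last read. The sketch below shows the idea in a single process; `VersionedStore` and `put_if_version` are hypothetical names, and a distributed store would additionally have to coordinate this check across replicas.

```python
class VersionedStore:
    """Single-process illustration of compare-and-set semantics."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); a missing key is reported as version 0."""
        return self._data.get(key, (0, None))

    def put_if_version(self, key, expected_version, value):
        """Write only if nobody else has written since expected_version was read."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            return False
        self._data[key] = (current_version + 1, value)
        return True


store = VersionedStore()
version, _ = store.get("inventory:carrots")

first = store.put_if_version("inventory:carrots", version, 10)   # accepted
second = store.put_if_version("inventory:carrots", version, 99)  # rejected: stale version
```

The rejected writer has to re-read and retry, which trades some availability for the guarantee that no update is silently overwritten.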

Page 83: Dynamo Systems - QCon SF 2012 Presentation

other advanced features

Page 84: Dynamo Systems - QCon SF 2012 Presentation

other advanced features

Page 85: Dynamo Systems - QCon SF 2012 Presentation

other advanced features

metadata
aggregation tasks
search

Page 86: Dynamo Systems - QCon SF 2012 Presentation

other advanced features

metadata
aggregation tasks
search

Page 87: Dynamo Systems - QCon SF 2012 Presentation

other advanced features

Page 88: Dynamo Systems - QCon SF 2012 Presentation

In summary…

Page 89: Dynamo Systems - QCon SF 2012 Presentation

rapid evolutionary change

Page 90: Dynamo Systems - QCon SF 2012 Presentation

significant events

Page 91: Dynamo Systems - QCon SF 2012 Presentation

explosion of new systems

Page 92: Dynamo Systems - QCon SF 2012 Presentation

evolving into higher-order systems

Page 93: Dynamo Systems - QCon SF 2012 Presentation

We’re hiring. [email protected]