nosql distilled - goto conference - goto...

87
NoSQL Distilled A guide to polyglot persistence #NoSQLDistilled @pramodsadalage ThoughtWorks Inc. Tuesday, June 11, 13

Upload: haphuc

Post on 09-Mar-2018

233 views

Category:

Documents


0 download

TRANSCRIPT

NoSQL DistilledA guide to polyglot persistence

#NoSQLDistilled@pramodsadalageThoughtWorks Inc.

Tuesday, June 11, 13

Why RDBMS?

Tuesday, June 11, 13

ACID TransactionsAtomicityConsistencyIsolationDurability

Tuesday, June 11, 13

Standard Query Interface

Tuesday, June 11, 13

Interact with many languages

Tuesday, June 11, 13

Everyone knows SQL

Tuesday, June 11, 13

Everyone knows SQL

Tuesday, June 11, 13

Limit less indexing

Tuesday, June 11, 13

Handles many data models

Tuesday, June 11, 13

Why NoSQL

Tuesday, June 11, 13

Schema changes are hard

Tuesday, June 11, 13

line items:

customer: Ann

$4820321293533

$3910321601912

$5110131495054

$96

$39

$51

payment details:

Card: AmexCC Number: 12345expiry: 04/2001

ID: 1001orders

customers

order lines

credit cards

Impedance mismatchTuesday, June 11, 13

Application vs Integration databases

Billing

Inventory

Billing

Inventory

Integration Database

Application Database web service

Tuesday, June 11, 13

Running on clustersTuesday, June 11, 13

Un-Structured Data

Tuesday, June 11, 13

Un-Even rate of data growth

Tuesday, June 11, 13

Domain Models

Tuesday, June 11, 13

Domain driven data models

Tuesday, June 11, 13

RDBMS dataTuesday, June 11, 13

Aggregate model (Embedding objects)

Tuesday, June 11, 13

Aggregate Data

// in customers{ "customer": { "id": 1, "name": "Martin", "billingAddress": [{"city": "Chicago"}], "orders": [ { "id":99, "orderItems":[ { "productId":27, "price": 32.45, "productName": "NoSQL Distilled" } ], "shippingAddress":[{"city":"Chicago"}] "orderPayment":[ { "ccinfo":"1000-1000-1000-1000", "txnId":"abelif879rft", "billingAddress": {"city": "Chicago"} } ], } ] }}

Tuesday, June 11, 13

Aggregate model (Referencing Objects)

Tuesday, June 11, 13

Aggregate data

// in Customers{ "id":1, "name":"Martin", "billingAddress":[{"city":"Chicago"}]}// in Orders{ "id":99, "customerId":1, "orderItems":[ { "productId":27, "price": 32.45, "productName": "NoSQL Distilled" } ], "shippingAddress":[{"city":"Chicago"}] "orderPayment":[ { "ccinfo":"1000-1000-1000-1000", "txnId":"abelif879rft", "billingAddress": {"city": "Chicago"} } ],}

Tuesday, June 11, 13

Aggregate Orientation

Tuesday, June 11, 13

RDBMS’s have no concept of aggregates

Tuesday, June 11, 13

Aggregates reduce the need for ACID

Tuesday, June 11, 13

Better for clusters, can be distributed easily

Tuesday, June 11, 13

Key-ValueDocument

Column-Family

Tuesday, June 11, 13

Key Value Databases

Tuesday, June 11, 13

Key-ValueDatabase

•One Key-One Value•Value is opaque to database•Like a Hash•Some are distributed

Oracle Riak

instance cluster

table bucket

row key-value

row-id key

Tuesday, June 11, 13

“key” (VIN) “value” (car facts)

JTTDR… …

make#Ford model#Mustang

year#2011 ….

Tuesday, June 11, 13

Document Databases

Tuesday, June 11, 13

Document Database

•One Key-One Value•Value is visible to database•Value can be queries•JSON/XML documents

Oracle MongoDB

instance mongod

schema database

table collection

row document

row_id _id

Tuesday, June 11, 13

“id” (VIN) “document” (car facts)

JTTDR…

{… “make”: “Ford”, “model”: “Mustang”, “year”: 2011, … }

Tuesday, June 11, 13

Column-Family Databases

Tuesday, June 11, 13

Column-Family Database

•Data organized as columns•Each row has row key•Columns have versioned data

•Row data is sorted by column name

Oracle Cassandra

instance cluster

database keyspace

table column-family

row rowcolumns same for

every row

columns can be different for each row

Tuesday, June 11, 13

“id” (VIN) “column families”(car facts)

JTTDR…

{… “car”:{“make”: “Ford”, “model”: “focus”…} “service”:{…} }

Tuesday, June 11, 13

Key-Points Aggregate Databases

Tuesday, June 11, 13

Inter-aggregate relations are hard to maintain

Tuesday, June 11, 13

Schema-less means implicit schema

Tuesday, June 11, 13

Graph Databases

Tuesday, June 11, 13

Tuesday, June 11, 13

Graph Databases

•Is multi-relational graph•Relationships are first-class citizens

•Traversal algorithms•Nodes and Edges can have data (key-value pairs)

Tuesday, June 11, 13

Graph databases work best for data with complex

relations

Tuesday, June 11, 13

Key-Value Database Usage

Tuesday, June 11, 13

Session Storage

Tuesday, June 11, 13

User Profiles/Preferences

Tuesday, June 11, 13

Shopping Cart

Tuesday, June 11, 13

Single user analytics

Tuesday, June 11, 13

Document Database Usage

Tuesday, June 11, 13

Event Logging

Tuesday, June 11, 13

Prototype development

Tuesday, June 11, 13

eCommerce Application

Tuesday, June 11, 13

Content Management Applications

Tuesday, June 11, 13

Column-Family Database Usage

Tuesday, June 11, 13

Large write volume

Tuesday, June 11, 13

Content Management

Tuesday, June 11, 13

eCommerce Application

Tuesday, June 11, 13

Graph Database Usage

Tuesday, June 11, 13

Connected Data

Tuesday, June 11, 13

Routing things/money

Tuesday, June 11, 13

Location Services

Tuesday, June 11, 13

Recommendation engines

Tuesday, June 11, 13

Schema-less really?

Tuesday, June 11, 13

Schema-free does not mean no schema-migration

Tuesday, June 11, 13

Schema is implicit in code

Tuesday, June 11, 13

Data must be migrated, when schema in code is

changed

Tuesday, June 11, 13

All data need not be migrated at the same time

(lazy migration)

Tuesday, June 11, 13

Polyglot Persistence

Tuesday, June 11, 13

Use different data storage technology for varying

needs

Tuesday, June 11, 13

Can be across the enterprise or in single

application

Tuesday, June 11, 13

Encapsulate data access through services

Tuesday, June 11, 13

Order persistence service

Document store

e-commerce platform

Session/Cart storage service

Key-Value store

Inventory and Price service

RDBMS(Legacy DB)

Nodes and Relations service

Graph store

Shopping cart and session data

Completed Orders

Inventoryand

Item Price Customer social graph

Tuesday, June 11, 13

RDBMS

e-commerce platform

Shopping cart data

Completed Orders

Session dataSOLR

Searchrequests

Update Indexed Data

Update indexed data, batch or realtime

Tuesday, June 11, 13

martinfowler.com/bliki/PolyglotPersistence.html

Session Storage

Financial Data

Shopping Cart

Recommendationengine

Product Catalog

Reporting Analytics User Activity Logs

Speculative Retail Web Application

Redis RDBMS Riak Neo4J

MongoDB RDBMS Cassandra Cassandra

Tuesday, June 11, 13

Experience

Tuesday, June 11, 13

e-learning (before)

Tuesday, June 11, 13

e-learning (after)

Tuesday, June 11, 13

e-commerce (before)

Tuesday, June 11, 13

e-commerce (after)

Tuesday, June 11, 13

How do I choose?

Tuesday, June 11, 13

Choose for programmer productivity

Tuesday, June 11, 13

Choose for data access performance

Tuesday, June 11, 13

Choose to stick with the default

Tuesday, June 11, 13

Choose by testing your expectations

Tuesday, June 11, 13

Try the databases, they are all open-source

Tuesday, June 11, 13

Thanks#NoSQLDistilled

@pramodsadalagesadalage.com

Tuesday, June 11, 13