developing polyglot persistence applications (gluecon 2013)

56
Developing polyglot persistence applications Chris Richardson Author of POJOs in Action Founder of the original CloudFoundry.com @crichardson [email protected] http://plainoldobjects.com

Upload: chris-richardson

Post on 10-May-2015

5.573 views

Category:

Technology


0 download

DESCRIPTION

This is 30 minute GlueCon 2013 version of a much longer talk. See http://plainoldobjects.com/presentations/developing-polyglot-persistence-applications/ for other versions and the example code. NoSQL databases such as Redis, MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. However, using a NoSQL database means giving up the benefits of the relational model such as SQL, constraints and ACID transactions. For some applications, the solution is polyglot persistence: using SQL and NoSQL databases together. In this talk, you will learn about the benefits and drawbacks of polyglot persistence and how to design applications that use this approach. We will explore the architecture and implementation of an example application that uses MySQL as the system of record and Redis as a very high-performance database that handles queries from the front-end. You will learn about mechanisms for maintaining consistency across the various databases.

TRANSCRIPT

Page 1: Developing polyglot persistence applications (gluecon 2013)

Developing polyglot persistence applications

Chris RichardsonAuthor of POJOs in ActionFounder of the original CloudFoundry.com @crichardson [email protected]://plainoldobjects.com

Page 2: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Presentation goalThe benefits and drawbacks of

polyglot persistence and

How to design applications that use this approach

Page 3: Developing polyglot persistence applications (gluecon 2013)

@crichardson

About Chris

Page 4: Developing polyglot persistence applications (gluecon 2013)

@crichardson

(About Chris)

Page 5: Developing polyglot persistence applications (gluecon 2013)

@crichardson

About Chris()

Page 6: Developing polyglot persistence applications (gluecon 2013)

@crichardson

About Chris

Page 7: Developing polyglot persistence applications (gluecon 2013)

@crichardson

About Chris

http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/

Page 8: Developing polyglot persistence applications (gluecon 2013)

@crichardson

vmc push About-Chris

Developer Advocate for CloudFoundry.com

Page 9: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Agenda

• Why polyglot persistence?

• Optimizing queries using Redis materialized views

• Synchronizing MySQL and Redis

Page 10: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Food To Go Architecture - 2006

Order taking

Restaurant Management

MySQL Database

HungryConsumer Restaurant

Success =

inadequate

Page 11: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Solution: Spend Money

http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG

OR

http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#

Page 12: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Solution: Use NoSQL

Benefits

• Higher performance

• Higher scalability

• Richer data-model

• Schema-less

Drawbacks

• Limited transactions

• Limited querying

• Relaxed consistency

• Unconstrained data

Page 13: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Example NoSQL DatabasesDatabase Key features

CassandraExtensible column store, very scalable, distributed

MongoDB Document-oriented, fast, scalable

Redis Key-value store, very fast

http://nosql-database.org/ lists 122+ NoSQL databases

Page 14: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Redis: Fast and Scalable

Redis master

Redis slave

Redis slave

Filesystem

ClientIn-memory= FAST

Durable

Redis master

Redis slave

Redis slave

Filesystem

Consistent Hashing

Page 15: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Redis: advanced key/value store

K1 V1

K2 V2

... ...

String SET, GET

List LPUSH, LPOP, ...

Set SADD, SMEMBERS, ..

Sorted Set

Hashes

ZADD, ZRANGE

HSET, HGET

Page 16: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Sorted sets

myset a

5.0

b

10.

Key Value

ScoreMembers are sorted by score

Page 17: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Adding members to a sorted setRedis Server

zadd myset 5.0 a myset a

5.0

Key Score Value

Page 18: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Adding members to a sorted setRedis Server

zadd myset 10.0 b myset a

5.0

b

10.

Page 19: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Adding members to a sorted setRedis Server

zadd myset 1.0 c myset a

5.0

b

10.

c

1.0

Page 20: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Retrieving members by index range

Redis Server

zrange myset 0 1

myset a

5.0

b

10.

c

1.0ac

Key StartIndex

EndIndex

Page 21: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Retrieving members by score

Redis Server

zrangebyscore myset 1 6

myset a

5.0

b

10.

c

1.0ac

Key Min value

Max value

Page 22: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Redis is great but there are tradeoffs

• No rich queries: PK-based access only

• Limited transaction model:

• Read first and then execute updates as batch

• Difficult to compose code

• Data must fit in memory

• Single-threaded server : run multiple with client-side sharding

• Missing features such as access control, ...

Hope you don’t need ACID

or rich queries one day

Page 23: Developing polyglot persistence applications (gluecon 2013)

@crichardson

The future is polyglot

IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg

e.g. Netflix• RDBMS• SimpleDB• Cassandra• Hadoop/Hbase

Page 24: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Polyglot Pattern: Cache

Order taking

Restaurant Management

MySQL Database

CONSUMER RESTAURANT OWNER

RedisCacheFirst Second

Page 25: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Redis: Timeline

Polyglot Pattern: write to SQL and NoSQL

Follower 1

Follower 2

Follower 3 ...

LPUSHXLTRIM

MySQL

TweetsFollowers

INSERT

NEW TWEET

Follows

Avoids expensive

MySQL joins

Page 26: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Polyglot Pattern: replicate from MySQL to NoSQL

QueryService

Update Service

MySQL Database

Reader Writer

Redis

Systemof RecordMaterialized

view query() update()replicate

anddenormalize

Page 27: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Agenda

• Why polyglot persistence?

• Optimizing queries using Redis materialized views

• Synchronizing MySQL and Redis

Page 28: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Finding available restaurantsAvailable restaurants =

Serve the zip code of the delivery addressAND

Are open at the delivery time

public interface AvailableRestaurantRepository {

List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime); ...}

Page 29: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Database schemaID Name …

1 Ajanta

2 Montclair Eggshop

Restaurant_id zipcode

1 94707

1 94619

2 94611

2 94619

Restaurant_id dayOfWeek openTime closeTime

1 Monday 1130 1430

1 Monday 1730 2130

2 Tuesday 1130 …

RESTAURANT table

RESTAURANT_ZIPCODE table

RESTAURANT_TIME_RANGE table

Page 30: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Finding available restaurants on Monday, 6.15pm for 94619 zipcode

Straightforward three-way join

select r.*from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime

Page 31: Developing polyglot persistence applications (gluecon 2013)

@crichardson

How to scale this query?

• Query caching is usually ineffective

• MySQL Master/slave replication is straightforward BUT

• Complexity of slaves

• Risk of replication failing

• Limitations of MySQL master/slave replication

Page 32: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Query denormalized copy stored in Redis

Order taking

Restaurant Management

MySQL Database

CONSUMER RESTAURANT OWNER

Redis

Systemof RecordCopy

findAvailable()update()

Page 33: Developing polyglot persistence applications (gluecon 2013)

@crichardson

BUT how to implement findAvailableRestaurants() with Redis?!

?select r.*from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime

K1 V1

K2 V2

... ...

Page 34: Developing polyglot persistence applications (gluecon 2013)

@crichardson

ZRANGEBYSCORE = Simple SQL Query

ZRANGEBYSCORE myset 1 6

select valuefrom sorted_setwhere key = ‘myset’ and score >= 1 and score <= 6

=

key value score

sorted_set

Page 35: Developing polyglot persistence applications (gluecon 2013)

@crichardson

select r.*from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime

select valuefrom sorted_setwhere key = ? and score >= ? and score <= ?

?

How to transform the SELECT statement?

We need to denormalize

Page 36: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Simplification #1: Denormalization

Restaurant_id Day_of_week Open_time Close_time Zip_code

1 Monday 1130 1430 947071 Monday 1130 1430 946191 Monday 1730 2130 947071 Monday 1730 2130 946192 Monday 0700 1430 94619…

SELECT restaurant_id FROM time_range_zip_code WHERE day_of_week = ‘Monday’ AND zip_code = 94619 AND 1815 < close_time AND open_time < 1815

Simpler query:§ No joins§ Two = and two <

Page 37: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Simplification #2: Application filtering

SELECT restaurant_id, open_time FROM time_range_zip_code WHERE day_of_week = ‘Monday’ AND zip_code = 94619 AND 1815 < close_time AND open_time < 1815

Even simpler query• No joins• Two = and one <

Page 38: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Simplification #3: Eliminate multiple =’s with concatenation

SELECT restaurant_id, open_time FROM time_range_zip_code WHERE zip_code_day_of_week = ‘94619:Monday’ AND 1815 < close_time

Restaurant_id Zip_dow Open_time Close_time

1 94707:Monday 1130 14301 94619:Monday 1130 14301 94707:Monday 1730 21301 94619:Monday 1730 21302 94619:Monday 0700 1430…

key

range

Page 39: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Simplification #4: Eliminate multiple SELECT VALUES with concatenation

SELECT open_time_restaurant_id, FROM time_range_zip_code WHERE zip_code_day_of_week = ‘94619:Monday’ AND 1815 < close_time

zip_dow open_time_restaurant_id close_time

94707:Monday 1130_1 1430

94619:Monday 1130_1 1430

94707:Monday 1730_1 2130

94619:Monday 1730_1 2130

94619:Monday 0700_2 1430

...

Page 40: Developing polyglot persistence applications (gluecon 2013)

@crichardson

zip_dow open_time_restaurant_id close_time

94707:Monday 1130_1 1430

94619:Monday 1130_1 1430

94707:Monday 1730_1 2130

94619:Monday 1730_1 2130

94619:Monday 0700_2 1430

...

Sorted Set [ Entry:Score, …]

[1130_1:1430, 1730_1:2130]

[0700_2:1430, 1130_1:1430, 1730_1:2130]

Using a Redis sorted set as an index

Key

94619:Monday

94707:Monday

94619:Monday 0700_2 1430

Page 41: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Finding restaurants that have not yet closed

ZRANGEBYSCORE 94619:Monday 1815 2359è

{1730_1}

1730 is before 1815 è Ajanta is open

Delivery timeDelivery zip and day

Sorted Set [ Entry:Score, …]

[1130_1:1430, 1730_1:2130]

[0700_2:1430, 1130_1:1430, 1730_1:2130]

Key

94619:Monday

94707:Monday

Page 42: Developing polyglot persistence applications (gluecon 2013)

@crichardson

NoSQL ⇒ Denormalized representation for each query

Page 43: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Sorry Ted!

http://en.wikipedia.org/wiki/Edgar_F._Codd

Page 44: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Agenda

• Why polyglot persistence?

• Optimizing queries using Redis materialized views

• Synchronizing MySQL and Redis

Page 45: Developing polyglot persistence applications (gluecon 2013)

@crichardson

MySQL & Redis need to be consistent

Page 46: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Two-Phase commit is not an option

• Redis does not support it

• Even if it did, 2PC is best avoided http://www.infoq.com/articles/ebay-scalability-best-practices

Page 47: Developing polyglot persistence applications (gluecon 2013)

@crichardson

AtomicConsistentIsolatedDurable

Basically AvailableSoft stateEventually consistent

BASE: An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128

Page 48: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Updating Redis #FAILbegin MySQL transaction update MySQL update Redisrollback MySQL transaction

Redis has updateMySQL does not

begin MySQL transaction update MySQLcommit MySQL transaction<<system crashes>> update Redis

MySQL has updateRedis does not

Page 49: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Updating Redis reliablyStep 1 of 2

begin MySQL transactionupdate MySQLqueue CRUD event in MySQL

commit transaction

ACID

Event IdOperation: Create, Update, DeleteNew entity state, e.g. JSON

Page 50: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Updating Redis reliably Step 2 of 2

for each CRUD event in MySQL queue get next CRUD event from MySQL queue

If CRUD event is not duplicate then Update Redis (incl. eventId)end ifbegin MySQL transaction

mark CRUD event as processedcommit transaction

Page 51: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Generating CRUD events

• Explicit code

• Hibernate event listener

• Service-layer aspect

• CQRS/Event-sourcing

• Database triggers

• Reading the MySQL binlog

Page 52: Developing polyglot persistence applications (gluecon 2013)

@crichardson

ID JSON processed?

ENTITY_CRUD_EVENT

EntityCrudEventProcessor

RedisUpdater

apply(event)

Step 1 Step 2

INSERT INTO ...

Timer

SELECT ... FROM ...

Redis

Page 53: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Representing restaurants in Redis

...JSON...

restaurant:lastSeenEventId:1

restaurant:1

55

94619:Monday [0700_2:1430, 1130_1:1430, ...]

94619:Tuesday [0700_2:1430, 1130_1:1430, ...]

duplicate detection

Page 54: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Updating Redis

WATCH restaurant:lastSeenEventId:≪restaurantId≫

Optimistic locking

Duplicate detection

lastSeenEventId = GET restaurant:lastSeenEventId:≪restaurantId≫

if (lastSeenEventId >= eventId) return;

Transaction

MULTI SET restaurant:lastSeenEventId:≪restaurantId≫ eventId ... SET restaurant:≪restaurantId≫ ≪restaurantJSON≫ ZADD ≪ZipCode≫:≪DayOfTheWeek≫ ... ... update the restaurant data...

EXEC

Page 55: Developing polyglot persistence applications (gluecon 2013)

@crichardson

Summary

• Each SQL/NoSQL database = set of tradeoffs

• Polyglot persistence: leverage the strengths of SQL and NoSQL databases

• Use Redis as a distributed cache

• Store denormalized data in Redis for fast querying

• Reliable database synchronization required

Page 56: Developing polyglot persistence applications (gluecon 2013)

@crichardson

@crichardson [email protected]

http://plainoldobjects.com - code and slides