Using Chained Transactions for Maximum Concurrency Under Load (QCon SF 2010)

Post on 18-Jan-2015

DESCRIPTION

My chained transaction talk on achieving maximum concurrency in the presence of lock contention, such as during shopping cart checkout.

TRANSCRIPT

@billynewport 1

Thoughts on Transactions

Chained transactions for scaling ACID

@billynewport

@billynewport 2

Agenda

Explain scenario

Implement using a single database

Explain concurrency issues under load

Implement using a sharded database

Implement using WebSphere eXtreme Scale and chained transactions

@billynewport 3

Scenario

Large eCommerce web site

Problem is order checkout

We track inventory levels for each SKU

Orders during checkout need to adjust available inventory.

@billynewport 4

Shopping cart metrics

Millions of SKUs

Cart size of 5 items for electronics/big ticket items

Cart size of 20 items for clothing

Expect concurrent load of 2500 checkouts per second

@billynewport 5

Database

Begin

For each item in cart
  Select for update where sku = item.sku
  Decrement available sku level
  If not available then rollback…
  Update level where sku = item.sku

Commit
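The loop above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from the talk: `InventoryDB`, `InsufficientStock`, and `checkout` are hypothetical names, an in-memory dict with one `threading.Lock` per SKU stands in for the table and its row locks, and locking SKUs in sorted order is an added assumption to keep two carts from deadlocking. The point to notice is that every lock stays held until the whole cart has been processed.

```python
import threading


class InsufficientStock(Exception):
    """Raised when a SKU cannot cover the requested quantity."""


class InventoryDB:
    """Hypothetical in-memory stand-in for the inventory table."""

    def __init__(self, levels):
        self.levels = dict(levels)
        # One lock per SKU plays the role of a row-level lock.
        self.row_locks = {sku: threading.Lock() for sku in levels}


def checkout(db, cart):
    """cart maps sku -> quantity. Locks are held until commit or rollback."""
    held = []
    pending = {}
    try:
        # Sorted order is an assumption added here to avoid deadlocks.
        for sku in sorted(cart):
            db.row_locks[sku].acquire()        # select for update
            held.append(sku)
            new_level = db.levels[sku] - cart[sku]
            if new_level < 0:
                raise InsufficientStock(sku)   # rollback: nothing is written
            pending[sku] = new_level
        db.levels.update(pending)              # commit
    finally:
        for sku in held:                       # locks released only at the end
            db.row_locks[sku].release()
```

A cart of 20 items therefore keeps its first row locked while the other 19 are processed, which is the behaviour the following slides identify as the problem.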

Cart items randomly distributed amongst all 2m items, lots of concurrency.

Simple enough, right? All is good?

@billynewport 6

Problem: cabbage patch dolls

Cabbage patch dolls are popular this fall…

@billynewport 7

Database killers!

The dolls cause major concurrency problems

Lots of row level locks

Contention on doll rows

Possible table lock escalation

App server thread issues

Connection pools empty

Then DEATH!

They aren’t sweet and cuddly any more…

@billynewport 8

Database killers

We need a way to get locks to decrement inventory

But, we don’t want to hold the lock for very long

Bigger carts make the problem worse: all the locks are held for longer

Ideally, hold locks for constant time

Any contentious items make the problem worse
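The cart-size argument can be made concrete with a rough model. This is a sketch under a loud assumption: each item costs a fixed amount of work while its row is locked, and the 2 ms figure used below is a hypothetical placeholder, not a measurement from the talk. `lock_hold_time_ms` is an illustrative name.

```python
def lock_hold_time_ms(cart_size, per_item_ms, constant_time):
    """Longest time any single inventory row stays locked during a checkout.

    constant_time=False models one transaction spanning the whole cart, so
    the first row locked is held until the last item has been processed.
    constant_time=True models the ideal where each row is locked only for
    its own step.
    """
    if constant_time:
        return per_item_ms
    return cart_size * per_item_ms


PER_ITEM_MS = 2  # hypothetical value, for illustration only

for size in (5, 20):  # the two cart sizes from the metrics slide
    whole_cart = lock_hold_time_ms(size, PER_ITEM_MS, constant_time=False)
    ideal = lock_hold_time_ms(size, PER_ITEM_MS, constant_time=True)
    print(size, whole_cart, ideal)
```

Under this model the hold time grows linearly with cart size in the single-transaction case and stays flat in the ideal case, whatever the real per-item cost turns out to be.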

@billynewport 9

Solution

Hold lock on inventory rows for as short a time as possible

Decouple this from size of cart.

How?

@billynewport 10

Chained transactions

Programmers think of transactions in synchronous terms.

Begin / Do Work / Hold locks / Commit

Easy to program, bad for concurrency.

@billynewport 11

Inspiration

Microsoft had COM objects with apartment model threading.

Modern Actor support is similar. Some state with a mailbox.

BPEL supports flows with compensation

Data meets actors is a good analogy

Send a message (cart) to a group of actors identified using their keys with a compensator

@billynewport 12

Alternative

We need to think asynchronously in terms of flows with compensation

Map of <SKU Key/SKU Amount>

Brick:

Do{ code to reduce inventory level for SKU }

Undo{ code to increase inventory level for SKU }

Provide Map with do/undo bricks

Easy to program, great concurrency.
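A minimal sketch of the map plus do/undo brick idea, assuming a plain dict as the inventory. `Brick`, `reduce_level`, `increase_level`, and `run_flow` are hypothetical names chosen for illustration; they are not the WebSphere eXtreme Scale API, and the flow here runs in one process rather than across shards.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Inventory = Dict[str, int]


@dataclass(frozen=True)
class Brick:
    """A pair of code blocks: the forward step and its compensator."""
    do: Callable[[Inventory, str, int], bool]
    undo: Callable[[Inventory, str, int], None]


def reduce_level(inventory, sku, amount):
    """Do block: reduce the level, or report failure without changing it."""
    if inventory[sku] < amount:
        return False
    inventory[sku] -= amount
    return True


def increase_level(inventory, sku, amount):
    """Undo block: put back what the do block took."""
    inventory[sku] += amount


decrement_brick = Brick(do=reduce_level, undo=increase_level)


def run_flow(inventory, cart, brick):
    """Apply brick.do per SKU; on failure undo completed steps in reverse."""
    completed = []
    for sku, amount in cart.items():
        if not brick.do(inventory, sku, amount):
            for done_sku, done_amount in reversed(completed):
                brick.undo(inventory, done_sku, done_amount)
            return False
        completed.append((sku, amount))
    return True
```

The application only supplies the map (`cart`) and the brick; the ordering, failure handling, and compensation live in the flow runner.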

@billynewport 13

Transactions and sharded stores

Option 1: Write transaction to one shard then spread out asynchronously

Option 2: 2 phase commit

Option 3: Chained transactions

@billynewport 16

Implementation

1PC only required

Data store supporting

Row locks

Row oriented data

Integrated FIFO messaging

IBM WebSphere eXtreme Scale provides these capabilities.

@billynewport 17

Implementation

Application makes map and code bricks

Submits transaction as an asynchronous job.

Uses a Future to check on job outcome.

Do blocks can trigger flow reversal if a problem occurs. Invoke undo block for each completed step
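The submit-then-poll pattern can be sketched with Python's standard `concurrent.futures`. This is an assumption-laden illustration, not the product's API: `checkout_job` and `submit_checkout` are hypothetical names, the inventory is a plain dict, and a single worker thread is used so that jobs run one at a time and need no locking.

```python
from concurrent.futures import ThreadPoolExecutor


def checkout_job(inventory, cart):
    """Run the do step per SKU; reverse completed steps if one fails."""
    completed = []
    for sku, amount in cart.items():
        if inventory.get(sku, 0) < amount:
            # The do block hit a problem: invoke undo for each completed step.
            for done_sku, done_amount in reversed(completed):
                inventory[done_sku] += done_amount
            return ("REVERSED", sku)
        inventory[sku] -= amount
        completed.append((sku, amount))
    return ("COMMITTED", None)


def submit_checkout(executor, inventory, cart):
    """Submit the job asynchronously and hand back a Future."""
    return executor.submit(checkout_job, inventory, cart)


if __name__ == "__main__":
    stock = {"doll": 1, "tv": 4}
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = submit_checkout(pool, stock, {"tv": 1, "doll": 2})
        # The caller is free to do other work, then checks the outcome.
        print(future.result(timeout=5))
```

The caller never holds a lock while waiting; it only inspects the Future to learn whether the flow committed or was reversed.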

@billynewport 18

Mechanism

Loop
  Receive message for actor key
  Process it
  Send modified cart to next ‘sku’ using local ‘transmit q’
  Commit transaction

Background thread pushes messages in transmit q to the destination shards using exactly once semantics.
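The loop and the transmit queue can be simulated in a single process as below. Everything here is a hypothetical sketch: `Shard`, `Grid`, `shard_for`, and `NUM_SHARDS` are illustrative names, shards are plain objects, the background pusher is replaced by an inline drain inside `run`, and exactly-once delivery is approximated by remembering which (message id, hop) pairs a shard has already processed.

```python
import zlib
from collections import deque

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(sku):
    """Route a SKU key to a shard by hashing it."""
    return zlib.crc32(sku.encode()) % NUM_SHARDS


class Shard:
    def __init__(self):
        self.inventory = {}
        self.inbox = deque()       # FIFO mailbox for this shard's actors
        self.transmit_q = deque()  # local queue of outgoing messages
        self.processed = set()     # (message id, hop) pairs already handled


class Grid:
    def __init__(self, levels):
        self.shards = [Shard() for _ in range(NUM_SHARDS)]
        self.outcomes = {}
        for sku, level in levels.items():
            self.shards[shard_for(sku)].inventory[sku] = level

    def level(self, sku):
        return self.shards[shard_for(sku)].inventory[sku]

    def submit(self, msg_id, cart):
        items = list(cart.items())
        if not items:
            self.outcomes[msg_id] = "committed"
            return
        msg = {"id": msg_id, "items": items, "pos": 0, "undo": False, "hop": 0}
        self.shards[shard_for(items[0][0])].inbox.append(msg)

    def process_one(self, shard):
        """One short local transaction: receive, process, queue the next hop."""
        msg = shard.inbox.popleft()
        key = (msg["id"], msg["hop"])
        if key in shard.processed:   # duplicate delivery is ignored
            return
        shard.processed.add(key)
        sku, amount = msg["items"][msg["pos"]]
        nxt = dict(msg, hop=msg["hop"] + 1)
        if msg["undo"]:
            shard.inventory[sku] += amount        # compensate, walk backwards
            nxt["pos"] = msg["pos"] - 1
        elif shard.inventory.get(sku, 0) >= amount:
            shard.inventory[sku] -= amount        # do step, move forwards
            nxt["pos"] = msg["pos"] + 1
        else:
            nxt["undo"] = True                    # trigger flow reversal
            nxt["pos"] = msg["pos"] - 1
        if nxt["pos"] < 0:
            self.outcomes[msg["id"]] = "reversed"
        elif nxt["pos"] >= len(msg["items"]):
            self.outcomes[msg["id"]] = "committed"
        else:
            shard.transmit_q.append(nxt)

    def run(self):
        """Drive every shard until no messages remain anywhere."""
        busy = True
        while busy:
            busy = False
            for shard in self.shards:
                while shard.inbox:
                    self.process_one(shard)
                    busy = True
                while shard.transmit_q:  # stands in for the background pusher
                    msg = shard.transmit_q.popleft()
                    dest = shard_for(msg["items"][msg["pos"]][0])
                    self.shards[dest].inbox.append(msg)
                    busy = True
```

Each call to `process_one` touches exactly one SKU, so in this sketch no SKU is tied up for longer than a single step, regardless of how many items the cart contains.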

@billynewport 19

Performance

A single logical transaction will be slower than a 1PC DBMS implementation.

However, under concurrent load it will deliver:

Higher throughput

Better response times

Through better contention management

Each ‘SKU’ only locked for a very short period

@billynewport 20

Generalization

This could be thought of as a workflow engine.

But, a big difference here is that a workflow engine usually talks with a remote store.

Here: the flow state is the MESSAGE. It moves around to where the data is for the next step.

Using a MESSAGE for flow state rather than a database means it scales linearly.

The message ‘store’ is integrated and scales with the data store.

@billynewport 21

Architecture Comparison

[Diagram comparing the Conventional and Message oriented architectures. Labels: Flow DB, Appl DB, Msg Store, BP Engine (three instances), Flow State, Flow Edge = Msg, Integrated Msg/Data store (three instances), Appl DB, Write behind.]

@billynewport 22

Sample implementation

Coming soon.

Running in lab

Working with several eCommerce customers looking to implement soon.

Soon to be published on github as sample code.
