omid, reloaded scalable and highly-available transaction ... · scalable and highly-available...

21
Omid, Reloaded Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal Edward Bortnikov Eshcar Hillel Idit Keidar Ivan Kelly Sameer Paranjpye Matthieu Morel

Upload: others

Post on 24-May-2020

58 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Omid, ReloadedScalable and Highly-Available Transaction Processing

Ohad Shacham Francisco PerezSorrosal

Edward Bortnikov Eshcar Hillel

Idit Keidar Ivan Kelly Sameer ParanjpyeMatthieu Morel

Page 2: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Dynamic indexing at Yahoo!

Real-TimeSearch Index

Indexing Pipeline

Content Processing Serving

Que

ries

Crawl feed

Document store

Page 3: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Web indexing pipeline

Crawl DocprocLink

Analysis Queue

Crawlschedule

Content

QueueLinks

STORM

HBase

Page 4: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transactional API over HBaseSnapshot Isolation ConsistencyScalable, Highly AvailableDatabase-neutral

Apache incubator project

Omid – transactions for key-value store

Page 5: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transactions and snapshot isolation

Aborts only on write-write conflicts

Read point Write point

begin commitread(x) write(y) write(x) read(y)

Page 6: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Architecture

Transaction ManagerClient

Begin/Commit

Datastore Datastore Datastore Committable

Results/Timestamp

Atomic commitGet commit info

Read/Write

Conflict Detection

Page 7: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transaction ManagerClient

Begin

Datastore Datastore Datastore Committable

t1

Write (k1, v1, t1)Write (k2, v2, t1)

Read (k’, last committed t’ < t1)

(k1, v1, t1) (k2, v2, t1)

Running exampletr = t1

Page 8: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transaction ManagerClient

Commit: t1, {k1, k2}

Datastore Datastore Datastore Committable

t2

(k1, v1, t1) (k2, v2, t1)

Write (t1, t2)

(t1, t2)

Running exampletr = t1tc = t2

Page 9: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transaction ManagerClient

Datastore Datastore Datastore Committable

Read (k1, t3)

(k1, v1, t1) (k2, v2, t1) (t1, t2)

Read (t1)

Running example

tr = t3

Page 10: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transaction ManagerClient

Datastore Datastore Datastore Committable

t2

(t1, t2)(k1, v1, t1, t2) (k2, v2, t1, t2)

Remove (t1)

Post committr = t1tc = t2

Update commit

cells

Page 11: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Datastore(k1, v1, t1, t2)

Transaction Manager

Datastore Datastore Committable

Read (k1, t3)

(k2, v2, t1, t2)

Commit cells

Client

tr = t3

Page 12: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

High availability

Transaction ManagerClient

Begin/Commit/Abort

Datastore Datastore Datastore Committable

Results/Timestamp

Atomic commitGet commit info

Read/Write

Single point of failure

Page 13: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transaction Manager

Transaction ManagerClient

Datastore Datastore Datastore Committable

Recovery state

Architecture with HA

Page 14: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transaction Manager

Split brain

Transaction ManagerClient

Datastore Datastore Datastore Committable

Recovery state

Avoid Split Brain

Page 15: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Transaction Manager

Our HA solution

Transaction ManagerClient

Datastore Datastore Datastore Committable

Recovery state

ABORT

Infrequent access

Page 16: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

§ No runtime overhead in mainstream execution• Minor overhead after failover

§ Leases for leader election• Local lease status check before/after writing to Commit Table

High availability

Page 17: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

050

100150200250300350400450500550

Omid1 Omid1 Non Durable

Omid Omid Non Durable

Tps

* 103

Page 18: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

0

500

1000

1500

2000

2500

document inversion duplicate detection out-link processing in-link processing stream to runtime

Late

ncy

(ms)

Commit + CT update BeginCompute ReadUpdate

Page 19: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

0

50

100

150

200

250

15 300 500 2000 8000 45000 55000 70000

Tim

e (m

s)

Txn / Sec[Batch size]

Queueing (avg)HBASE write (avg)Network and CD (avg)

[10000][10000][240][240][50][25][1] [10000]

Page 20: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

§ OLTP on-top-of HBase

§ Integrating Omid as Phoenix transaction processing engine

§ Augment semantics to support secondary index creation

§ Augment Omid for concurrent transaction execution

Integration with Apache Phoenix

Page 21: Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available Transaction Processing Ohad Shacham Francisco Perez Sorrosal ... Transactional API over

Summary

Transactions – important use case in stream processing

Snapshot Isolation – scalable consistency model

Omid – an apache incubator TP system for HBaseBattle-tested, Web-scale, HA