Distributed STMs

- STMs are being employed in new scenarios:
  - Database caches in three-tier web apps (FénixEDU)
  - HPC programming language (X10)
  - In-memory cloud data grids (Coherence, Infinispan)
- New challenges:
  - Scalability
  - Fault-tolerance

Euro-TM Workshop on Transactional Memory (WTM 2012), Bern, Switzerland


Partial Replication

- Each site stores a partial copy of the data.
- Genuine partial replication schemes maximize scalability by ensuring that:
  - Only the sites that replicate data items read or written by a transaction T exchange messages to execute/commit T.
- Existing 1-Copy Serializable implementations enforce distributed validation of read-only transactions [SRDS10]:
  - considerable overheads in typical workloads
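
To make the genuineness condition concrete, here is a minimal Python sketch that computes which sites have to exchange messages for a transaction T. The placement map, the site names, and the helper `involved_sites` are hypothetical and exist only for illustration.

```python
# Hypothetical placement: which sites replicate which data item.
PLACEMENT = {
    "x": {"site1", "site2"},
    "y": {"site2", "site3"},
    "z": {"site4"},
}


def involved_sites(read_set, write_set, placement=PLACEMENT):
    """Sites that replicate at least one item read or written by T."""
    sites = set()
    for item in set(read_set) | set(write_set):
        sites |= placement[item]
    return sites


# T reads x and writes y: site4 stores neither item, so under a genuine
# scheme it takes no part in executing or committing T.
print(sorted(involved_sites({"x"}, {"y"})))  # ['site1', 'site2', 'site3']
```

A non-genuine scheme would instead contact every site in `PLACEMENT`, regardless of what T accessed.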


Issues with Partial Replication

- Extending existing local multiversion (MV) STMs is not enough.
- Local MV STMs rely on a single global counter to track version advancement.
- Problem:
  - With a global counter, committing a transaction would have to involve ALL NODES.
  - NO GENUINENESS = POOR SCALABILITY


GMU: Genuine Multiversion Update-Serializable Replication [ICDCS12]

- In the execution/commit phase of a transaction T, ONLY the nodes that store data items accessed by T are involved.
- It uses multiple versions of each data item.
- It builds visible snapshots = freshest consistent snapshots, taking into account:
  1. causal dependencies on the transactions already committed at the time the transaction began,
  2. previous reads executed by the same transaction.
- Vector clocks are used to establish visible snapshots.


High Level Overview (i)

- Transactions commit using a vector clock.
- Each node stores a log of committed vector clocks.

Initial view of the visible snapshot
- When a transaction T begins on a node N, it acquires the most recent vector clock in N's commit log.

View extension of the visible snapshot
- When T reads on a node N:
  - T's vector clock can be modified according to N's commit log.
  - Three reading rules are applied using T's vector clock.
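
The initial view of the snapshot can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Node` and `Transaction` classes, their field names, and the tuple representation of vector clocks are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    # Log of committed vector clocks, oldest first (assumed layout).
    commit_log: list = field(default_factory=list)

    def most_recent_vc(self):
        return self.commit_log[-1]


@dataclass
class Transaction:
    vc: tuple  # T's current view of the visible snapshot


def begin(node):
    """T starts on `node` with the most recent vector clock in its commit log."""
    return Transaction(vc=tuple(node.most_recent_vc()))


n0 = Node(0, commit_log=[(0, 0, 0), (1, 1, 1)])
t = begin(n0)
print(t.vc)  # (1, 1, 1)
```

Later reads on other nodes may then change `t.vc`, as the view-extension step above describes.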


High Level Overview (ii)


Write operation
- When a transaction T writes V on data item O, it inserts <O,V> into T's write-set.

Commit operation
- Read-only transactions always commit.
- Update transactions run a genuine 2-Phase Commit:
  - Upon reception of the prepare message (participant side):
    - acquire read/write locks and validate the read-set,
    - send back a tentative commit vector clock.
  - If all replies are positive (coordinator side):
    - multicast the write-set and the final commit vector clock.
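
The commit steps above can be sketched in a single process, without networking or concurrency. Everything named here (`Participant`, `commit_update`, the per-transaction lock table) is hypothetical. In particular, the proposal rule (increment the node's own entry) and the merge rule (entry-wise maximum of the tentative clocks) are assumptions standing in for the actual agreement on the commit vector clock.

```python
class Participant:
    """One node holding some data items (single-threaded sketch)."""

    def __init__(self, node_id, num_nodes, items):
        self.node_id = node_id
        self.vc = [0] * num_nodes  # most recent committed vector clock
        self.store = {k: [(0, v)] for k, v in items.items()}  # item -> [(VN, value)]
        self.locks = {}  # transaction id -> set of locally locked items

    def prepare(self, tx_id, read_versions, write_set):
        """Lock local items, validate the read-set, propose a tentative clock."""
        local = {k for k in set(read_versions) | set(write_set) if k in self.store}
        held = set().union(*self.locks.values())
        if local & held:
            return False, None  # lock conflict: negative vote
        # Validation: a local item read by T must not have been overwritten since.
        if any(self.store[k][-1][0] != read_versions[k]
               for k in local if k in read_versions):
            return False, None
        self.locks[tx_id] = local
        tentative = list(self.vc)
        tentative[self.node_id] += 1  # assumed proposal rule
        return True, tentative

    def finish(self, tx_id, write_set=None, final_vc=None):
        """Apply the write-set if a final clock is given, then release locks."""
        if final_vc is not None:
            for k, v in write_set.items():
                if k in self.store:
                    self.store[k].append((final_vc[self.node_id], v))
            self.vc = [max(a, b) for a, b in zip(self.vc, final_vc)]
        self.locks.pop(tx_id, None)


def commit_update(tx_id, read_versions, write_set, nodes):
    """Coordinator side. Returns the final commit clock, or None on abort."""
    accessed = set(read_versions) | set(write_set)
    # Genuine: contact only the nodes storing an item accessed by the transaction.
    involved = [n for n in nodes if accessed & set(n.store)]
    votes = [n.prepare(tx_id, read_versions, write_set) for n in involved]
    if not all(ok for ok, _ in votes):
        for n in involved:
            n.finish(tx_id)
        return None
    final_vc = [max(col) for col in zip(*(vc for _, vc in votes))]  # assumed merge
    for n in involved:
        n.finish(tx_id, write_set, final_vc)
    return final_vc


nodes = [Participant(0, 3, {}), Participant(1, 3, {"x": "a"}), Participant(2, 3, {"y": "b"})]
final = commit_update("T0", {"x": 0}, {"x": "v", "y": "w"}, nodes)
stale = commit_update("T1", {"x": 0}, {"x": "z"}, nodes)  # read version 0 of x is outdated
print(final, stale)  # [0, 1, 1] None
```

Node 0 stores neither item, so it is never contacted and its clock stays at `[0, 0, 0]`.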


Rule 1: Reading Lower Bound

[Figure: timeline over Node 0, Node 1 (it stores X) and Node 2 (it stores Y). Events shown: T0:W(X,v), T0:W(Y,w), T0:Commit, T1:R(X) returning X(2), T1:R(Y) returning Y(2). Vector clocks shown: (1,1,1) and (1,2,2), labelled "Most recent VC in VCLog" and "T1.VC".]


Rule 2: Reading Upper Bound

[Figure: timeline over Node 0, Node 1 (it stores X) and Node 2 (it stores Y). Events shown: T0:W(X,v), T0:W(Y,w), T0:Commit, T1:R(X) returning X(1), T1:R(Y) returning Y(2), T1:Commit. Versions shown: X(1), X(3), Y(1), Y(2), Y(3). Vector clocks shown: (1,1,1), (1,1,2) and (1,3,3), labelled "Most recent VC in VCLog" and "T1.VC".]


Rule 3: Selection of Data Versions

- Informally: observe the most recent consistent version of data item id on node i, based on T's history (previous reads).
- Formally: iterate over the versions of id and return the most recent one such that

  id.version.VN <= T.VC[i]
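
The formal rule translates almost directly into code. In this minimal sketch the representation of versions as `(VN, value)` pairs sorted by VN, and the function name `select_version`, are assumptions for illustration.

```python
def select_version(versions, tx_vc, i):
    """Return the most recent version on node i with VN <= T.VC[i].

    `versions` is a list of (VN, value) pairs in increasing VN order.
    """
    for vn, value in reversed(versions):
        if vn <= tx_vc[i]:
            return vn, value
    raise LookupError("no version visible in this snapshot")


# Item Y lives on node 2 and has three versions.
y_versions = [(1, "y1"), (2, "y2"), (3, "y3")]
print(select_version(y_versions, tx_vc=(1, 1, 2), i=2))  # (2, 'y2')
```

With `T.VC = (1,1,2)` the version with VN 3 is skipped because it is newer than the snapshot T is allowed to observe on node 2.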


Building the commit Vector Clock

- Based on a variant of Skeen's total order multicast algorithm [SKEEN85].
- Intuition:
  - Serialize all-and-only conflicting transactions, tracking:
    - direct and transitive conflict dependencies,
    - causal relationships.
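
For readers unfamiliar with the timestamp agreement usually attributed to Skeen, here is a scalar sketch of the idea: each destination proposes a timestamp, and the maximum proposal becomes final. This is not GMU's vector-clock variant. The class and function names are hypothetical, and the hold-back delivery queue of the real algorithm is simplified to sorting by final timestamp.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.clock = 0        # local logical clock
        self.pending = {}     # message id -> proposed (not yet final) timestamp
        self.delivered = []   # (final timestamp, message id), kept sorted

    def propose(self, msg_id):
        """First phase: propose a timestamp larger than anything seen locally."""
        self.clock += 1
        self.pending[msg_id] = self.clock
        return self.clock

    def finalize(self, msg_id, final_ts):
        """Second phase: adopt the agreed timestamp and order the message by it."""
        self.clock = max(self.clock, final_ts)
        del self.pending[msg_id]
        self.delivered.append((final_ts, msg_id))
        self.delivered.sort()


def multicast(msg_id, destinations):
    """Only the destinations of the message take part in ordering it."""
    final_ts = max(p.propose(msg_id) for p in destinations)
    for p in destinations:
        p.finalize(msg_id, final_ts)
    return final_ts


a, b, c = Process("a"), Process("b"), Process("c")
multicast("T1", [a, b])
multicast("T2", [b, c])
multicast("T3", [a, c])
print(a.delivered, b.delivered, c.delivered)
```

Any two processes that receive the same pair of messages order them the same way, while a process that is not a destination of a message never hears about it.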


Consistency Criterion

- GMU ensures Extended Update Serializability:
  - Update Serializability (US) [ICDT86] ensures:
    - 1-Copy Serializability (1CS) on the history restricted to committed update transactions;
    - 1CS on the history restricted to committed update transactions and any single read-only transaction.
    - But it can admit non-1CS histories containing at least 2 read-only transactions.
  - Extended Update Serializability [Adya99]:
    - ensures the US property also for executing transactions;
    - analogous to opacity in STMs.
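
A small worked example shows the kind of history Update Serializability admits. The four transactions and the brute-force checker below are illustrative assumptions. The checker searches for a serial order over a single logical copy that reproduces every observed read, which is used here as a stand-in for 1CS.

```python
from itertools import permutations

# A transaction is a list of operations:
# ("w", item, value) or ("r", item, observed value). Initial state: x = y = 0.
U1 = [("w", "x", 1)]
U2 = [("w", "y", 1)]
R1 = [("r", "x", 1), ("r", "y", 0)]  # sees U1 but not U2
R2 = [("r", "x", 0), ("r", "y", 1)]  # sees U2 but not U1


def serializable(transactions):
    """True if some serial order reproduces every observed read (brute force)."""
    for order in permutations(transactions):
        state = {"x": 0, "y": 0}
        ok = True
        for tx in order:
            for kind, item, value in tx:
                if kind == "w":
                    state[item] = value
                elif state[item] != value:
                    ok = False
        if ok:
            return True
    return False


print(serializable([U1, U2]))          # True
print(serializable([U1, U2, R1]))      # True
print(serializable([U1, U2, R2]))      # True
print(serializable([U1, U2, R1, R2]))  # False
```

Each read-only transaction is consistent with some serial order of the updates, but R1 needs U1 before U2 while R2 needs U2 before U1, so no single order satisfies both.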


Experiments on private cluster

[Figure: "Read & Write Transactions - (TPC-C)". Throughput (committed tx/sec, axis from 0 to 9000) versus Number of Nodes (2 to 20). Series: GMU, RR, NGM.]

- 8-core physical nodes
- TPC-C:
  - 90% read-only transactions
  - 10% update transactions
  - 4 threads per node
  - moderate contention (15% abort rate at 20 nodes)


Thanks for the attention


References


[Adya99] A. Adya, "Weak consistency: A generalized theory and optimistic implementations for distributed transactions," tech. rep., PhD Thesis, Massachusetts Institute of Technology, 1999.

[ICDCS12] Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia, Luís Rodrigues. "When Scalability Meets Consistency: Genuine Multiversion Update-Serializable Partial Replication". The IEEE 32nd International Conference on Distributed Computing Systems, June 2012.

[ICDT86] R. C. Hansdah and L. M. Patnaik, "Update serializability in locking," International Conference of Database Theory, vol. 243 of Lecture Notes in Computer Science, pp. 171–185, Springer Berlin / Heidelberg, 1986.

[SKEEN85] D. Skeen. "Unpublished communication", 1985. Referenced in K. Birman, T. Joseph, "Reliable Communication in the Presence of Failures", ACM Trans. on Computer Systems, 47-76, 1987.

[SRDS10] Nicolas Schiper, Pierre Sutra, Fernando Pedone. "P-Store: Genuine Partial Replication in Wide Area Networks". Proc. of the 29th Symposium of Reliable Distributed Systems, 2010.