concurrency control in transactional systems jinyang li some slides adapted from stonebraker and...

29
Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Mad

Upload: dinah-cooper

Post on 29-Dec-2015

224 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Concurrency control in transactional systems

Jinyang Li

Some slides adapted from Stonebraker and Madden

Page 2: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

What we’ve learnt so far…

• All-or-nothing atomicity in transactions– So that failures do not result in (undesirable)

intermediate state

• How to realize all-or-nothing atomicity?– Logging– REDO/UNDO logging vs. REDO-only

logging

Page 3: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Concurrency control

• Many application clients are concurrently accessing a storage system.

• Concurrent transactions might interfere– Similar to data races in parallel programs

Page 4: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

A_bal = READ(A)If (A_bal >50) { A_bal -= 50 WRITE(A, A_bal) dispense $50 to user}

Withdraw $50 from A

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)

Report sum of money

An example

Storage system

A_bal = READ(A)If (A_bal >100) { B_bal = READ(B) B_bal += 100 A_bal -= 100 WRITE(A, A_bal) WRITE(B, B_bal)}

Transfer $100 from A to B

Page 5: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Yfs lab example

extentserver

d = GET(dir)newd = append “x” to dPUT(newd)

yfs client 1 (CREATE)

d = GET(dir)newd = append “y” to dPUT(newd)

yfs client 2 (CREATE)

Page 6: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Solutions?

1. Leave it to application programmers– Application programmers are responsible

for locking data items

2. Storage system performs automatic concurrency control

– Concurrent transactions execute as if serially

Page 7: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

ACID transactions

• A (Atomicity)– All-or-nothing w.r.t. failures

• C (Consistency)– Transactions maintain any internal storage state

invariants

• I (Isolation)– Concurrently executing transactions do not

interfere

• D (Durability)– Effect of transactions survive failures

Page 8: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

What’s ACID? an example

T1: Transfer $100 from A to B

T2: Transfer $50 from B to A

• A T1 completes or nothing (ditto for T2)

• D once T1/T2 commits, stays done, no updates lost

• I no races, as if T1 happens either before or after T2

• C preserves invarants, e.g. account balance > 0

Page 9: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Ideal isolation semantics: serializability

• Definition: execution of a set of transactions is equivalent to some serial order– Two executions are equivalent if they have the

same effect on database and produce same output.

Page 10: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Conflict serializability

• An execution schedule is the ordering of read/write/commit/abort operations

A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)Commit

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)Commit

A (serial) schedule: R(A),R(B),W(A),W(B),C,R(A),R(B),C

Page 11: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Conflict serializability

• Two schedules are equivalent if they:– contain same operations– order conflicting operations the same way

• A schedule is serializable if it’s identical to some serial schedule

Two ops conflict if they access the same data item and one is a write.

Page 12: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Examples

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)

A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)

Serializable? R(A),R(B),R(A),R(B),C W(A),W(B),C

Serializable? R(A),R(B), W(A), R(A),R(B),C, W(B),C

Equivalent serial schedule: R(A),R(B),C,R(A),R(B),W(A),W(B),C

Page 13: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Realize a serializable schedule

• Locking-based approach

• Strawman solution 1:– Grab global lock before transaction starts– Release global lock after transaction commits

• Strawman solution 2:– Grab lock on item X before reading/writing X– Release lock on X after reading/writing X

Page 14: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Strawman 2’s problem

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)

A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)

Possible with strawman 2? (short-duration locks on writes) R(A),R(B), W(A), R(A),R(B),C, W(B),C

Locks on writes should be held till end of transaction

Read an uncommitted value

Page 15: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

More Strawmans

• Strawman 3– Grab lock on item X before writing X– Release locks at the end of transaction– Grab lock on item X before reading X– Release locks immediately after reading

Possible with strawman 3? (short-duration locks on reads) R(A),R(B), W(A), R(A),R(B),C, W(B),C

R(A), R(A),R(B), W(A), W(B),C, R(B), C

Non-repeatable readsRead locks must be held

till commit time

Page 16: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Strawman 3’s problem

Possible with strawman 3? (short-duration locks on reads) R(A),R(B), W(A), R(A),R(B),C, W(B),C

R(A), R(A),R(B), W(A), W(B),C, R(B), C

Non-repeatable readsRead locks must be held

till commit time

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)

A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)

Page 17: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

• 2 phase locking (2PL)– A growing phase in which the transaction is acquiring

locks– A shrinking phase in which locks are released

• In practice,– The growing phase is the entire transaction– The shrinking phase is at the commit time

• Optimization:– Use read/write locks instead of exclusive locks

Realize a serializable schedule

Page 18: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

2PL: an example

R_lock(A)A_bal = READ(A)R_lock(B)B_bal = READ(B)B_bal += 100A_bal -= 100W_lock(A)WRITE(A, A_bal)W_lock(B)WRITE(B, B_bal)W_unlock(A)W_unlock(B)

R_lock(A)A_bal = READ(A)R_lock(B)B_bal = READ(B)Print(A_bal+B_bal)R_unlock(A)R_unlock(B)

Possible? R(A),R(B), W(A), R(A),R(B),C W(B),C

Page 19: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

More on 2PL

• What if a lock is unavailable?

• Deadlocks possible?

• How to cope with deadlock? – Grab locks in order? No always possible– Transaction manager detects deadlock cycles

and aborts affected transactions– Alternative: timeout and abort yourself

wait

detect & abort

Use crash recovery mechanism (e.g. UNDO records) to clean up

Page 20: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

The Phantom problem

T1: begin_txupdate emp (set salary = 1.1 * salary) where dept = ‘CS’end_tx

T2: begin_txUpdate emp (set dept = ‘CS’) where emp.name = ‘jinyang’end_tx

T1: update BobT1: update HerbertT2: move JinyangT2: move LakshmiT2 commits and releases all locksT1: update Lakshmi(!!!!!)T1: update MichaelT1: commits

Page 21: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

The Phantom problem

• Issue is lock things you touch – but must guarantee non-existence of any new ones!

• Solutions:– Predicate locking– Range locks in a B-tree index (assumes an

index on dept). Otherwise, lock entire table– Often ignored in practice

Page 22: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Why less than serializability?

• Performance of locking?

• What to do:– run analytics on a companion data

warehouse (common practice)– take the stall (rarely done)– run less than serializability

T1: begin_txSelect avg (sal) from empend_tx

T2: begin_txUpdate emp …end_tx

Page 23: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Degrees of isolation• Degree 0 (read uncommitted)

– no locks (guarantees nothing)

• Degree 1 (read committed)– Short read locks, long write locks

• Degree 2– Long read/write locks. (serializable unless you squint)

• Degree 3– Long read/write locks plus solving the phantom

problem

Page 24: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Disadvantages of locking-based approach

• Need to detect deadlocks– Distributed implemention needs distributed

deadlock detection (bad)

• Read-only transactions can block update transactions– Big performance hit if there are long

running read-only transactions

Page 25: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Multi-version concurrency control

• No locking during transaction execution!• Each data item is associated with multiple

versions• Multi-version transactions:

– Buffer writes during execution– Reads choose the appropriate version– At commit time, system validates if okay to make

writes visible (by generating new versions)

Page 26: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Snapshot isolation

• A popular multi-version concurrency control scheme

• A transaction:– reads a “snapshot” of database image– Can commit only if there are no write-write

conflict

Page 27: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Implementing snapshot isolation

• T is assigned a start timestamp,T.sts

R(A) W(B)

• T buffers writes to B and addsB to its writeset, T.wset += {B}

• T is assigned a commit timestamp• System checks forall T’, s.t. T’.cts > T.sts && T’.cts < T.ctsT’.wset and T.wset do not overlap

• T reads the biggest versionof A, A(i), such that i <= T.sts

time

Page 28: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

An example

R(A) R(B) W(B)W(A)

R(A) R(B)

R(A) R(B)

No read-uncommitted

No unrepeatable-read

105 50 55 60

Page 29: Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

Snapshot isolation < serializability

• The write-skew problem

Possible under SI? R(A),R(B),W(B), W(A),C,C

Serializable?