concurrency control in transactional systems jinyang li some slides adapted from stonebraker and...
TRANSCRIPT
Concurrency control in transactional systems
Jinyang Li
Some slides adapted from Stonebraker and Madden
What we’ve learnt so far…
• All-or-nothing atomicity in transactions– So that failures do not result in (undesirable)
intermediate state
• How to realize all-or-nothing atomicity?– Logging– REDO/UNDO logging vs. REDO-only
logging
Concurrency control
• Many application clients are concurrently accessing a storage system.
• Concurrent transactions might interfere– Similar to data races in parallel programs
A_bal = READ(A)If (A_bal >50) { A_bal -= 50 WRITE(A, A_bal) dispense $50 to user}
Withdraw $50 from A
A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)
Report sum of money
An example
Storage system
A_bal = READ(A)If (A_bal >100) { B_bal = READ(B) B_bal += 100 A_bal -= 100 WRITE(A, A_bal) WRITE(B, B_bal)}
Transfer $100 from A to B
Yfs lab example
extentserver
d = GET(dir)newd = append “x” to dPUT(newd)
yfs client 1 (CREATE)
d = GET(dir)newd = append “y” to dPUT(newd)
yfs client 2 (CREATE)
Solutions?
1. Leave it to application programmers– Application programmers are responsible
for locking data items
2. Storage system performs automatic concurrency control
– Concurrent transactions execute as if serially
ACID transactions
• A (Atomicity)– All-or-nothing w.r.t. failures
• C (Consistency)– Transactions maintain any internal storage state
invariants
• I (Isolation)– Concurrently executing transactions do not
interfere
• D (Durability)– Effect of transactions survive failures
What’s ACID? an example
T1: Transfer $100 from A to B
T2: Transfer $50 from B to A
• A T1 completes or nothing (ditto for T2)
• D once T1/T2 commits, stays done, no updates lost
• I no races, as if T1 happens either before or after T2
• C preserves invarants, e.g. account balance > 0
Ideal isolation semantics: serializability
• Definition: execution of a set of transactions is equivalent to some serial order– Two executions are equivalent if they have the
same effect on database and produce same output.
Conflict serializability
• An execution schedule is the ordering of read/write/commit/abort operations
A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)Commit
A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)Commit
A (serial) schedule: R(A),R(B),W(A),W(B),C,R(A),R(B),C
Conflict serializability
• Two schedules are equivalent if they:– contain same operations– order conflicting operations the same way
• A schedule is serializable if it’s identical to some serial schedule
Two ops conflict if they access the same data item and one is a write.
Examples
A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)
A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)
Serializable? R(A),R(B),R(A),R(B),C W(A),W(B),C
Serializable? R(A),R(B), W(A), R(A),R(B),C, W(B),C
Equivalent serial schedule: R(A),R(B),C,R(A),R(B),W(A),W(B),C
Realize a serializable schedule
• Locking-based approach
• Strawman solution 1:– Grab global lock before transaction starts– Release global lock after transaction commits
• Strawman solution 2:– Grab lock on item X before reading/writing X– Release lock on X after reading/writing X
Strawman 2’s problem
A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)
A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)
Possible with strawman 2? (short-duration locks on writes) R(A),R(B), W(A), R(A),R(B),C, W(B),C
Locks on writes should be held till end of transaction
Read an uncommitted value
More Strawmans
• Strawman 3– Grab lock on item X before writing X– Release locks at the end of transaction– Grab lock on item X before reading X– Release locks immediately after reading
Possible with strawman 3? (short-duration locks on reads) R(A),R(B), W(A), R(A),R(B),C, W(B),C
R(A), R(A),R(B), W(A), W(B),C, R(B), C
Non-repeatable readsRead locks must be held
till commit time
Strawman 3’s problem
Possible with strawman 3? (short-duration locks on reads) R(A),R(B), W(A), R(A),R(B),C, W(B),C
R(A), R(A),R(B), W(A), W(B),C, R(B), C
Non-repeatable readsRead locks must be held
till commit time
A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)
A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)
• 2 phase locking (2PL)– A growing phase in which the transaction is acquiring
locks– A shrinking phase in which locks are released
• In practice,– The growing phase is the entire transaction– The shrinking phase is at the commit time
• Optimization:– Use read/write locks instead of exclusive locks
Realize a serializable schedule
2PL: an example
R_lock(A)A_bal = READ(A)R_lock(B)B_bal = READ(B)B_bal += 100A_bal -= 100W_lock(A)WRITE(A, A_bal)W_lock(B)WRITE(B, B_bal)W_unlock(A)W_unlock(B)
R_lock(A)A_bal = READ(A)R_lock(B)B_bal = READ(B)Print(A_bal+B_bal)R_unlock(A)R_unlock(B)
Possible? R(A),R(B), W(A), R(A),R(B),C W(B),C
More on 2PL
• What if a lock is unavailable?
• Deadlocks possible?
• How to cope with deadlock? – Grab locks in order? No always possible– Transaction manager detects deadlock cycles
and aborts affected transactions– Alternative: timeout and abort yourself
wait
detect & abort
Use crash recovery mechanism (e.g. UNDO records) to clean up
The Phantom problem
T1: begin_txupdate emp (set salary = 1.1 * salary) where dept = ‘CS’end_tx
T2: begin_txUpdate emp (set dept = ‘CS’) where emp.name = ‘jinyang’end_tx
T1: update BobT1: update HerbertT2: move JinyangT2: move LakshmiT2 commits and releases all locksT1: update Lakshmi(!!!!!)T1: update MichaelT1: commits
The Phantom problem
• Issue is lock things you touch – but must guarantee non-existence of any new ones!
• Solutions:– Predicate locking– Range locks in a B-tree index (assumes an
index on dept). Otherwise, lock entire table– Often ignored in practice
Why less than serializability?
• Performance of locking?
• What to do:– run analytics on a companion data
warehouse (common practice)– take the stall (rarely done)– run less than serializability
T1: begin_txSelect avg (sal) from empend_tx
T2: begin_txUpdate emp …end_tx
Degrees of isolation• Degree 0 (read uncommitted)
– no locks (guarantees nothing)
• Degree 1 (read committed)– Short read locks, long write locks
• Degree 2– Long read/write locks. (serializable unless you squint)
• Degree 3– Long read/write locks plus solving the phantom
problem
Disadvantages of locking-based approach
• Need to detect deadlocks– Distributed implemention needs distributed
deadlock detection (bad)
• Read-only transactions can block update transactions– Big performance hit if there are long
running read-only transactions
Multi-version concurrency control
• No locking during transaction execution!• Each data item is associated with multiple
versions• Multi-version transactions:
– Buffer writes during execution– Reads choose the appropriate version– At commit time, system validates if okay to make
writes visible (by generating new versions)
Snapshot isolation
• A popular multi-version concurrency control scheme
• A transaction:– reads a “snapshot” of database image– Can commit only if there are no write-write
conflict
Implementing snapshot isolation
• T is assigned a start timestamp,T.sts
R(A) W(B)
• T buffers writes to B and addsB to its writeset, T.wset += {B}
• T is assigned a commit timestamp• System checks forall T’, s.t. T’.cts > T.sts && T’.cts < T.ctsT’.wset and T.wset do not overlap
• T reads the biggest versionof A, A(i), such that i <= T.sts
time
An example
R(A) R(B) W(B)W(A)
R(A) R(B)
R(A) R(B)
No read-uncommitted
No unrepeatable-read
105 50 55 60
Snapshot isolation < serializability
• The write-skew problem
Possible under SI? R(A),R(B),W(B), W(A),C,C
Serializable?