transactions all up
TRANSCRIPT
8/6/2019 Transactions All Up
http://slidepdf.com/reader/full/transactions-all-up 1/11
Transactions
Advanced Databases Transactions 2
Definition
• Transaction: a collection of operations performing a single
logical function
• written in high-level language (C, Java, SQL)
• delimited by BEGIN TRANSACTION and END TRANSACTION
• ACID properties
BEGIN TRANSACTION transfer
UPDATE accounts SET balance = balance - 100 WHERE account=A
UPDATE accounts SET balance = balance + 100 WHERE account=B
END TRANSACTION transfer
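The transfer above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the table layout, the function names `transfer` and `balances`, and the overdraft check are assumptions made for illustration, not part of the slides.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; both updates commit together or not at all."""
    # `with conn` commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE account = ?",
                     (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE account = ?",
                     (amount, dst))
        row = conn.execute("SELECT balance FROM accounts WHERE account = ?",
                           (src,)).fetchone()
        if row is None or row[0] < 0:
            # Assumed business rule, used here only to force a rollback.
            raise ValueError("insufficient funds or unknown account")

def balances(conn):
    return dict(conn.execute("SELECT account, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 1000), ("B", 900)])
conn.commit()

transfer(conn, "A", "B", 100)        # both UPDATEs are applied
after_ok = balances(conn)

try:
    transfer(conn, "A", "B", 5000)   # would overdraw A, so neither UPDATE survives
except ValueError:
    pass
after_failed = balances(conn)
```

After the failed call the balances are the same as after the successful one, which is the atomicity property in miniature.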
Advanced Databases Transactions 3
ACID
• Atomicity: either all operations in the transaction are executed
properly, or none are
– both UPDATE operations need to be performed
• Consistency: the execution of a single transaction preserves
the consistency of the database
– the sum of balances should remain unchanged after execution
• Isolation: each transaction is unaware of other transactions
executed concurrently
– as if Ti finished before Tj started
• Durability: changes that a committed transaction has performed persist
in the database
– recovery
Advanced Databases Transactions 4
Transaction States
• active: initial state, transaction stays in this state while
executing
• partially committed: after the final operation has been
performed
• failed: when normal execution cannot proceed
• aborted: the transaction has been rolled back and the database restored
to its prior state
• committed: successful completion
Advanced Databases Transactions 5
Abstraction of transaction
• a sequence of read and write operations
– T1: r1(balanceA), w1(balanceA), r1(balanceB), w1(balanceB)
• other operations
– ci: commit everything done
– ai: abort and roll back
Recovery
Advanced Databases Transactions 7
Example
• T1: r1(balanceA), w1(balanceA), r1(balanceB), w1(balanceB)
BEGIN TRANSACTION transfer
UPDATE accounts SET balance = balance - 100 WHERE account=A
UPDATE accounts SET balance = balance + 100 WHERE account=B
END TRANSACTION transfer
[Figure: accounts table (account, balance), initially A = 1000, B = 900.]
[After r1(balanceA), w1(balanceA), fail: accounts holds A = 900, B = 900.]
[After r1(balanceA), w1(balanceA), r1(balanceB), w1(balanceB): memory holds A = 900, B = 1000 while accounts still holds A = 1000, B = 900.]
Advanced Databases Transactions 8
Failure Classification
• Transaction failure:
– logical error
• e.g. bad input
– system error
• e.g. deadlock
• System crash
– e.g. bug
• Disk failure
Advanced Databases Transactions 9
Recovery algorithms
• ensure there is enough information to allow recovery from failure
• actions taken after failure to ensure consistency, atomicity and
durability
• examples:
– dirty pages not written to disk when failure occurs
• need to re-execute operation: REDO
– partial writes have been performed
• need to UNDO operations
[Figure: accounts table (account, balance), initially A = 1000, B = 900.]
[After r1(balanceA), w1(balanceA), fail: accounts holds A = 900, B = 900.]
[After r1(balanceA), w1(balanceA), r1(balanceB), w1(balanceB): memory holds A = 900, B = 1000 while accounts still holds A = 1000, B = 900.]
Advanced Databases Transactions 10
Log-based Recovery
• Record modification in log
– stored on disk
– sequence of log records
• transaction identifier
• data-item identifier
• old value
• new value
• <Ti start>
• <Ti abort>
• <Ti commit>
• <Ti, Item, OldValue, NewValue>
Advanced Databases Transactions 11
Example
• For each transaction and each write operation write log records
to disk:
– <T1 start>
– <T1, balanceA, 1000, 900>
– <T1, balanceB, 900, 1000>
– <T1 commit>
• Problems?
[Figure: after r1(balanceA), w1(balanceA), r1(balanceB), w1(balanceB), memory holds A = 900, B = 1000 while the accounts table still holds A = 1000, B = 900.]
Advanced Databases Transactions 12
Write-ahead logging
• disk access for each write
• high overhead
• use buffer
– but log records might be lost!
– log records stored in stable storage (e.g. RAID)
• write-ahead logging
– before data are changed, log records are created
– if transaction commits, flush log to disk
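The buffering and write-ahead rules can be illustrated with a toy in-memory store. This is a sketch under stated assumptions: the class `WalStore`, its method names, and the use of a Python list as a stand-in for stable storage are all hypothetical.

```python
class WalStore:
    """Toy key-value store with a buffered write-ahead log (hypothetical names)."""

    def __init__(self, data):
        self.data = dict(data)   # in-memory data items
        self.log_buffer = []     # log records not yet on stable storage
        self.stable_log = []     # stands in for the log on stable storage

    def start(self, tid):
        self.log_buffer.append((tid, "start"))

    def write(self, tid, item, new_value):
        old_value = self.data.get(item)
        # Write-ahead rule: create the log record before changing the data.
        self.log_buffer.append((tid, item, old_value, new_value))
        self.data[item] = new_value

    def commit(self, tid):
        self.log_buffer.append((tid, "commit"))
        self.flush()             # the commit only counts once the log is flushed

    def flush(self):
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()


store = WalStore({"balanceA": 1000, "balanceB": 900})
store.start("T1")
store.write("T1", "balanceA", 900)
store.write("T1", "balanceB", 1000)
records_before_commit = len(store.stable_log)   # nothing flushed yet
store.commit("T1")
```

Buffering avoids one disk access per write; the price is that records still in `log_buffer` are lost on a crash, which is why the flush happens at commit.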
Advanced Databases Transactions 13
Recovery procedure (1/2)
• Scan back through the log
– collect set of committed transactions C
– UNDO any transaction not in C
• Scan forward through the log
– REDO any transaction in C
[Figure: initial state → w1(a), c1, w2(b), w3(a), c3, w4(b) → final state]
[Figure: initial state → UNDO(w4(b)), UNDO(w2(b)), REDO(w1(a)), REDO(w3(a)) → final state]
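The two scans can be sketched in a few lines. The record shapes, the function name `recover`, and the concrete values in the example log are assumptions; only the order of operations (w1(a), c1, w2(b), w3(a), c3, w4(b)) is taken from the slide.

```python
def recover(log, db):
    """Apply UNDO/REDO recovery to `db` (a dict) using `log` (a list of tuples).

    Assumed record shapes:
      (tid, "start"), (tid, "commit"), (tid, "abort"),
      (tid, item, old_value, new_value)
    """
    committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "commit"}

    # Backward scan: UNDO updates of transactions that did not commit.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _, item, old_value, _ = rec
            db[item] = old_value

    # Forward scan: REDO updates of committed transactions.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _, item, _, new_value = rec
            db[item] = new_value
    return db


# w1(a), c1, w2(b), w3(a), c3, w4(b) with made-up values
log = [
    (1, "start"), (1, "a", 0, 1), (1, "commit"),
    (2, "start"), (2, "b", 0, 2),
    (3, "start"), (3, "a", 1, 3), (3, "commit"),
    (4, "start"), (4, "b", 2, 4),
]
recovered = recover(log, {"a": 1, "b": 4})   # assumed on-disk state at the crash
```

T1 and T3 committed, so their writes to a are redone; T2 and T4 did not, so their writes to b are undone in reverse order.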
Advanced Databases Transactions 14
Log Exercise
• Perform recovery from the current to the recovered state
– <T4 start>
– <T1 start>
– <T1, balance56, 94340.45, 84340.95>
– <T2 start>
– <T2, balance34, 10900.67, 8900.67>
– <T2, balance67, 34005.00, 36005.25>
– <T7 start>
– <T2 commit>
– <T1, balance34, 8900.67, 18900.67>
– <T7, balance67, 36005.25, 37005.25>
– <T7 abort>
– <T4 commit>
accounts (current state)
account  balance
56       84340.45
34       18900.67
67       34005.00

accounts (recovered state, to be filled in)
account  balance
56
34
67
Advanced Databases Transactions 15
Recovery procedure (2/2)
• Any problems?
– need to read complete log
– time consuming
– unnecessary (when most of the REDO operations have already
been written onto disk)
• Use checkpoints
– write all log records to stable storage
– write all dirty pages to disk
– store log record <checkpoint L> to stable storage
• L: list of active transactions
• Recovery needed up to checkpoint
– faster
– shorter log
– updates are suspended
Advanced Databases Transactions 16
Checkpoint
• scan backwards until checkpoint
• collect set of committed transactions C
• the records of each transaction in L which is not in C must be
undone
• UNDO any transaction not in C (all its operations)
• REDO each record of each transaction in C
Advanced Databases Transactions 17
Fuzzy Checkpoint
• checkpoint can be slow if many pages to write
• everything suspended
• fuzzy checkpoint
– permit updates after <checkpoint> has been written, before the modified
pages are written to disk
• incomplete checkpoints
– last-checkpoint record
– keep a list of modified pages
– update last-checkpoint when all modified pages have been written
to disk
Concurrency
Advanced Databases Transactions 19
Concurrent Execution
• improved throughput
– number of transactions executed in given time
• improved resource utilization
• reduced waiting time
– don’t need to wait behind long transactions
• transaction histories interleave
• what happens when transactions access common items?
– preserve isolation (ACID)
Advanced Databases Transactions 20
Examples
• H1 = r1(b56), w1(b56), r1(b34), w1(b34), c1
• H2 = r2(b34), w2(b34), r2(b67), w2(b67), c2
• Possible concurrent executions:
– order preserved
– Hx = r2(b34), r1(b56), w1(b56), r1(b34), w1(b34), c1, w2(b34), r2(b67),
w2(b67), c2
– Hy = r2(b34), w2(b34), r1(b56), w1(b56), r1(b34), w1(b34), c1, r2(b67),
w2(b67), c2
Advanced Databases Transactions 21
Restrictions in transactions
• Example:
– H = r1(a), w1(a), r2(a), c2, r1(b), fail
– What’s the problem here?
– non-recoverable history
• recoverable histories
– a transaction does not commit until all transactions it has read from
have committed
• Rolling back:
– H = r1(a), w1(b), w1(a), r2(a), w2(a), r2(a), a1
– problem?
• cascadeless history
– a transaction reads only values written by committed transactions
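Both properties can be checked mechanically on a history written in the r/w/c/a notation. The sketch below is a simplification: the function names `parse` and `check` are hypothetical, "fail" is written as an explicit abort a1, and the effect of an abort on later reads is not modelled.

```python
import re

def parse(history):
    """Turn 'r1(a), c1' into [('r', 1, 'a'), ('c', 1, None)]."""
    pattern = r"([rwca])(\d+)(?:\((\w+)\))?"
    return [(kind, int(tid), item or None)
            for kind, tid, item in re.findall(pattern, history)]

def check(history):
    """Return (recoverable, cascadeless) for a history string."""
    last_writer = {}    # item -> transaction that wrote it most recently
    reads_from = {}     # reader -> set of transactions it has read from
    committed = set()
    recoverable = cascadeless = True
    for kind, tid, item in parse(history):
        if kind == "w":
            last_writer[item] = tid
        elif kind == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != tid:
                reads_from.setdefault(tid, set()).add(writer)
                if writer not in committed:
                    cascadeless = False     # read of uncommitted data
        elif kind == "c":
            if not reads_from.get(tid, set()) <= committed:
                recoverable = False         # committed before a transaction it read from
            committed.add(tid)
    return recoverable, cascadeless


first = check("r1(a), w1(a), r2(a), c2, r1(b), a1")
second = check("r1(a), w1(b), w1(a), r2(a), w2(a), r2(a), a1")
```

In the first history T2 commits after reading T1's uncommitted write, so it is not recoverable. The second is recoverable (T2 never commits) but not cascadeless, because aborting T1 forces T2 to be rolled back as well.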
Advanced Databases Transactions 22
More Examples and Problems
BEGIN TRANSACTION transfer
UPDATE accounts SET balance = balance - X WHERE account=A
UPDATE accounts SET balance = balance + X WHERE account=B
END TRANSACTION transfer
transfer(From, To, Amount)
T1: transfer(56, 34, 10000)
T2: transfer(34, 67, 2000)

accounts
account  balance
56       84340.45
34       18900.67
67       34005.00
Advanced Databases Transactions 23
Anomaly 1: Lost Update
T1: transfer(56, 34, 10000)
T2: transfer(34, 67, 2000)

accounts
account  balance
56       84340.45
34       18900.67
67       34005.00

H1: r1(b56), w1(b56), r1(b34), w1(b34), c1
H2: r2(b34), w2(b34), r2(b67), w2(b67), c2
Hx: r2(b34, 18900.67), r1(b56, 84340.45), w1(b56, 74340.45), r1(b34, 18900.67),
w1(b34, 28900.67), c1, w2(b34, 16900.67), r2(b67, 34005.00), w2(b67, 36005.00), c2
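Replaying Hx step by step makes the lost update visible. This is an illustrative sketch: the dictionaries `t1` and `t2` (each transaction's private copies of what it has read) are an assumed modelling device, and `Decimal` is used only to keep the arithmetic exact.

```python
from decimal import Decimal

db = {
    "b56": Decimal("84340.45"),
    "b34": Decimal("18900.67"),
    "b67": Decimal("34005.00"),
}
total_before = sum(db.values())

# Each transaction computes its writes from the values it read earlier.
t1, t2 = {}, {}

t2["b34"] = db["b34"]            # r2(b34)
t1["b56"] = db["b56"]            # r1(b56)
db["b56"] = t1["b56"] - 10000    # w1(b56)
t1["b34"] = db["b34"]            # r1(b34)
db["b34"] = t1["b34"] + 10000    # w1(b34), c1
db["b34"] = t2["b34"] - 2000     # w2(b34): based on the stale read, overwrites T1
t2["b67"] = db["b67"]            # r2(b67)
db["b67"] = t2["b67"] + 2000     # w2(b67), c2

total_after = sum(db.values())
```

Account 34 ends at 16900.67 instead of 26900.67, and the total across the three accounts drops by exactly the 10000 that T1 credited.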
Advanced Databases Transactions 24
Concurrency anomalies
• Must ensure that concurrent execution is free of anomalies
– e.g. lost update, inconsistent analysis, dirty writes, etc.
• use a serial history
– no interleaving!
Advanced Databases Transactions 25
Serialization
• Isolation (ACID): each transaction is unaware of other
transactions executed concurrently
– as if Ti finished before Tj started
• Serial history: all operations of a given transaction appear
together, one after the other
– Hz = r2(b34), w2(b34), r2(b67), w2(b67), c2, r1(b56), w1(b56), r1(b34),
w1(b34), c1
– T2,T1
• All histories equivalent to a serial history are also accepted
(and they are free of concurrency anomalies)
– serializable histories
Advanced Databases Transactions 26
When are two histories equivalent?
• equivalence:
– view equivalence
– conflict equivalence
• view equivalence
– database ends in the same state
– writes and reads are performed in the same order in the two histories
– they have the same final writes
– but, difficult to test at run time
• conflict equivalence
– two operations conflict if they belong to different transactions, access the
same item, and at least one of them is a write
– two histories are conflict equivalent if they contain the same operations and
the order of the conflicting operations is the same
• conflict serializable
– a history is conflict serializable if it is conflict equivalent with a serial history
Advanced Databases Transactions 27
Conflict serializable history
T1: transfer(56, 34, 10000)
T2: transfer(34, 67, 2000)
H1: r1(b56), w1(b56), r1(b34), w1(b34), c1
H2: r2(b34), w2(b34), r2(b67), w2(b67), c2
Hy: r2(b34), w2(b34), r1(b56), w1(b56), r1(b34), w1(b34), c1, r2(b67), w2(b67), c2
Serial history T2, T1: r2(b34), w2(b34), r2(b67), w2(b67), c2, r1(b56), w1(b56), r1(b34), w1(b34), c1
Advanced Databases Transactions 28
Identifying conflict serializability
• For history H, create precedence graph Gh
– a node for each transaction in H
– a directed edge Ti → Tj for each pair OP1 (of Ti), OP2 (of Tj) of conflicting operations
• where OP1 appears before OP2 in H
• H is conflict serializable if
– no cycles in Gh
Advanced Databases Transactions 29
Example
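• H: r1(a), w1(a), r1(b), r2(b), w2(b), w1(b)
• Not conflict serializable
[Figure: precedence graph with edges T1 → T2 (r1(b) before w2(b)) and T2 → T1 (r2(b), w2(b) before w1(b)), forming a cycle]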
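The precedence-graph test is easy to automate. The following sketch uses hypothetical function names (`precedence_graph`, `serial_order`) and expects histories written without spaces inside operations, e.g. r1(a); commit and abort operations are ignored.

```python
import re
from itertools import combinations

def precedence_graph(history):
    """Return (nodes, edges) of the precedence graph of a history string."""
    ops = [(kind, int(tid), item)
           for kind, tid, item in re.findall(r"([rw])(\d+)\((\w+)\)", history)]
    edges = set()
    # combinations() keeps the history order, so op1 always precedes op2.
    for (k1, t1, x1), (k2, t2, x2) in combinations(ops, 2):
        if t1 != t2 and x1 == x2 and "w" in (k1, k2):
            edges.add((t1, t2))
    return {tid for _, tid, _ in ops}, edges

def serial_order(history):
    """Return an equivalent serial order of transactions, or None if there is a cycle."""
    nodes, edges = precedence_graph(history)
    order = []
    while nodes:
        # Transactions with no incoming edge from a remaining transaction.
        free = sorted(n for n in nodes
                      if not any(v == n and u in nodes for u, v in edges))
        if not free:
            return None          # every remaining node lies on or behind a cycle
        order.append(free[0])
        nodes.remove(free[0])
    return order


cyclic = serial_order("r1(a), w1(a), r1(b), r2(b), w2(b), w1(b)")
serial = serial_order("r2(b34), w2(b34), r2(b67), w2(b67), c2, "
                      "r1(b56), w1(b56), r1(b34), w1(b34), c1")
```

For the history H above the function finds the cycle and returns None; for the serial history T2, T1 it returns the order [2, 1].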
Advanced Databases Transactions 30
Serializability Exercise
• Are the following histories conflict-serializable?
– Hx = r2(b34), r1(b56), w1(b56), r1(b34), w1(b34), c1, w2(b34), r2(b67), w2(b67), c2
– Hy = r2(b34), w2(b34), r1(b56), w1(b56), r1(b34), w1(b34), c1, r2(b67), w2(b67), c2
– Hz = r1(a), w1(a), w3(b), w3(c), r2(c), r1(a), w1(b), w2(b)
• which are their equivalent serial histories (if they have one)?
Concurrency control protocols
Advanced Databases Transactions 32
So far…
• Transactions
• Recoverable histories
– H = r1(a), w1(a), r2(a), w2(a), c2, r1(b), fail
– recovery procedures
• Concurrency
– Hx = r2(b34), r1(b56), w1(b56), r1(b34), w1(b34), c1, w2(b34), r2(b67),
w2(b67), c2
– serializable histories
• desirable…
– cascadeless histories
Advanced Databases Transactions 33
Concurrency control protocol
• protocols that schedule operations (r,w) so that transactions
executing concurrently generate only serializable histories
• lock-based protocols
– using locks to protect items
• timestamp-based protocols
– transactions are assigned a timestamp (unique identifier)
Advanced Databases Transactions 34
Lock-based Protocols (1/2)
• l(item) … op(item) … u(item)
• lock manager
– if Ti requests a lock on an item already locked by Tj, then Ti goes into a “wait queue”
• binary lock: lock - unlock
– T1: l1(a), r1(a), u1(a), l1(b), r1(b), [b = a+b], w1(b), u1(b)
– T2: l2(b), r2(b), u2(b), l2(a), r2(a), [a = a+b], w2(a), u2(a)
– concurrently: l1(a), r1(a), u1(a), l2(b), r2(b), u2(b),
l2(a), r2(a), [a = a+b], w2(a), u2(a), l1(b), r1(b), [b = a+b],
w1(b), u1(b)
• initially, a=5, b=10:
– T1,T2: b=15, a=20
– T2,T1: a=15, b=25
– concurrently: a=15, b=15, not serializable
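The three outcomes can be reproduced by following the operations literally. This is a sketch with hypothetical function names; locks are shown only as comments, since each one is released right after the read or write it protects.

```python
def serial_t1_t2(a, b):
    b = a + b            # T1: b = a + b
    a = a + b            # T2: a = a + b
    return a, b

def serial_t2_t1(a, b):
    a = a + b            # T2: a = a + b
    b = a + b            # T1: b = a + b
    return a, b

def interleaved(a, b):
    t1_a = a             # l1(a), r1(a), u1(a)
    t2_b = b             # l2(b), r2(b), u2(b)
    t2_a = a             # l2(a), r2(a)
    a = t2_a + t2_b      # [a = a+b], w2(a), u2(a)
    t1_b = b             # l1(b), r1(b)
    b = t1_a + t1_b      # [b = a+b], w1(b), u1(b): uses the stale value of a
    return a, b


results = {
    "T1,T2": serial_t1_t2(5, 10),
    "T2,T1": serial_t2_t1(5, 10),
    "concurrent": interleaved(5, 10),
}
```

The interleaved result matches neither serial order, so releasing each lock immediately after use does not give serializability.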
Advanced Databases Transactions 35
Lock-based Protocols (2/2)
• batch locking
– acquire all locks at start time
– release all locks at commit/abort time
• example:
– l(b56), l(b34), r1(b56), w1(b56), r1(b34), w1(b34), c1, u(b34), u(b56)
• does this guarantee serializability, recoverability? problems?
[Figure: number of locks held (#locks) over time, from start to end]
Advanced Databases Transactions 36
Two-Phase Locking (1/2)
• read locks: rl(item) … ru(item)
– shared
• write locks: wl(item) … wu(item)
– exclusive
• downgrade from wl to rl
• upgrade from rl to wl
• two phases
– growing phase
– shrinking phase
– once a lock is released,
no more locks are acquired
[Figure: number of locks held (#locks) over time, from start to end]
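Whether a sequence of lock operations obeys the two-phase rule can be checked with a single pass. A minimal sketch, assuming the slide notation (l/u for binary locks, rl/wl and ru/wu for read and write locks); the function name `is_two_phase` is hypothetical and lock upgrades or downgrades are not modelled.

```python
import re

def is_two_phase(schedule):
    """True if no transaction acquires a lock after it has released one."""
    shrinking = set()     # transactions that have already released a lock
    for op, tid in re.findall(r"\b(rl|wl|ru|wu|l|u)(\d+)\(", schedule):
        if op in ("ru", "wu", "u"):
            shrinking.add(tid)          # the transaction enters its shrinking phase
        elif tid in shrinking:
            return False                # lock acquired after a release
    return True


two_phase = is_two_phase("wl1(a), w1(a), wl1(b), w1(b), wu1(b), wu1(a)")
not_two_phase = is_two_phase("l1(a), r1(a), u1(a), l1(b), r1(b), w1(b), u1(b)")
```

The first sequence acquires both locks before releasing any; the second releases the lock on a before locking b, which is what allowed a non-serializable interleaving with binary locks.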
Advanced Databases Transactions 37
Two-Phase Locking (2/2)
• does 2PL guarantee serializability, recoverability?
• but:
– T1: wl1(a), w1(a), wl1(b), w1(b), wu1(b), wu1(a)
– T2: wl2(b), w2(b), wl2(a), w2(a), wu2(a), wu2(b)
– concurrently: wl1(a), w1(a), wl2(b), w2(b), deadlock
• T1 waits for lock on b
• T2 waits for lock on a
• 2PL is susceptible to deadlocks
• waits-for graph
[Figure: waits-for graph with T1 → T2 labelled wl1(b) and T2 → T1 labelled wl2(a)]
Advanced Databases Transactions 39
Preventing deadlocks
• conservative 2PL
• using timestamps
– timestamp TS(Ti) for transaction Ti
• unique identifier for each transaction
• monotonically increasing
– if Ti started before Tj: TS(Ti) < TS(Tj)
– Ti waits for lock on item X that Tj holds
• wait-die: if TS(Ti) < TS(Tj): Ti waits
– else Ti aborts and restarts with the same timestamp
• wound-wait: if TS(Ti) < TS(Tj): Tj is aborted and restarted with the same
timestamp
– else Ti waits
– deadlock-free: why?
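The two schemes differ only in who gives way. A minimal sketch with hypothetical function names and return strings; Ti is the requester and Tj the current lock holder.

```python
def wait_die(ts_requester, ts_holder):
    """Older requesters wait; younger requesters abort (die)."""
    if ts_requester < ts_holder:
        return "requester waits"
    return "requester aborts"

def wound_wait(ts_requester, ts_holder):
    """Older requesters abort (wound) the holder; younger requesters wait."""
    if ts_requester < ts_holder:
        return "holder aborts"
    return "requester waits"


decisions = [
    wait_die(1, 2),      # older T1 asks younger T2
    wait_die(2, 1),      # younger T2 asks older T1
    wound_wait(1, 2),
    wound_wait(2, 1),
]
```

In wait-die only an older transaction ever waits for a younger one, and in wound-wait only a younger one waits for an older one. Every edge in the waits-for graph therefore points in the same timestamp direction, so no cycle can form.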
Advanced Databases Transactions 40
Deadlock Detection
• construct wait-for graphs
– deadlock when cycle
– select victim transaction
• expensive in space and time
• timeouts
– if waiting exceeds specified timeout, abort
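Cycle detection on a waits-for graph is a depth-first search. The sketch below assumes the graph is given as a dictionary from each transaction to the set of transactions it waits for; the name `find_cycle` is hypothetical, and choosing a victim from the returned cycle is left open.

```python
def find_cycle(waits_for):
    """Return one cycle of the waits-for graph as a list, or None."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(t):
        colour[t] = GREY
        path.append(t)
        for u in waits_for.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:
                return path[path.index(u):]   # back edge closes a cycle
            if state == WHITE:
                found = visit(u)
                if found:
                    return found
        path.pop()
        colour[t] = BLACK
        return None

    for t in list(waits_for):
        if colour.get(t, WHITE) == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None


deadlocked = find_cycle({"T1": {"T2"}, "T2": {"T1"}})
no_deadlock = find_cycle({"T1": {"T2"}, "T2": set()})
```

Running this check on every lock request is what makes detection expensive in space and time; timeouts trade that cost for occasional unnecessary aborts.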
Advanced Databases Transactions 41
Livelocks
• deadlock prevention
– transactions aborted based on timestamp
• deadlock detection
– transactions aborted to break waits-for cycle
• transactions are assigned a priority
• priority increased with time
• abort transactions with lower priority
Advanced Databases Transactions 42
Timestamp-based protocols (1/2)
• concurrency control
• timestamp ordering
– history equivalent to the serial order corresponding to timestamps
– for each item, check that conflicting operations do not violate the serializability order
– read_TS(item)
– write_TS(item)
• basic timestamp ordering
– T issues a read: if TS(T) < write_TS(item)
• T aborted
– T issues a write: if TS(T) < read_TS(item) or TS(T) < write_TS(item)
• T aborted
Advanced Databases Transactions 43
Timestamp-based protocols (2/2)
• T1: r 1(a), w1(a)
• T2: w2(a)
• concurrently: r 1(a), w2(a), w1(a)
– r1(a),
• read_TS(a) = 1
– w2(a),
• TS(T2) = 2 > read_TS(a) = 1
• write_TS(a) = 2
– w1(a)
• TS(T1) = 1 < write_TS(a) = 2
• T1 aborted
• but, w1(a) doesn’t need to be executed
– if T issues a write(X), and write_TS(X) > TS(T), ignore write
– Thomas’ write rule
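The example can be traced with a small model of the timestamp checks. This is a sketch under assumptions: the class `Item`, the functions `read`, `write` and `run`, and the string results are hypothetical, and timestamps start at 0 for an item nobody has touched.

```python
class Item:
    def __init__(self):
        self.read_ts = 0      # largest timestamp of a transaction that read the item
        self.write_ts = 0     # largest timestamp of a transaction that wrote the item

def read(item, ts):
    if ts < item.write_ts:
        return "abort"                    # a younger transaction already wrote the item
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts, thomas=False):
    if ts < item.read_ts:
        return "abort"                    # a younger transaction already read the item
    if ts < item.write_ts:
        # Obsolete write: abort under basic ordering, skip under Thomas' write rule.
        return "ignore" if thomas else "abort"
    item.write_ts = ts
    return "ok"

def run(thomas):
    """Replay r1(a), w2(a), w1(a) with TS(T1) = 1 and TS(T2) = 2."""
    a = Item()
    return [read(a, 1), write(a, 2, thomas), write(a, 1, thomas)]


basic = run(thomas=False)
with_thomas = run(thomas=True)
```

Under basic timestamp ordering the late w1(a) aborts T1; with Thomas' write rule it is simply skipped, because T2's newer value would have overwritten it anyway.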
Advanced Databases Transactions 44
Delete and Insert operations
• what are the conflicts when delete and insert operations are also
examined?
• how should the locking protocols be adapted?
• how about timestamp ordering?