![Page 1: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/1.jpg)
Contention in shared memory multiprocessors
Multiprocessor synchronization algorithms (20225241)
Lecturer: Danny Hendler
• Definitions
• Lower bound for consensus
• Lower bounds for counters, stacks and queues
![Page 2: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/2.jpg)
Contention in shared-memory systems
Contention: the extent to which processes access the same memory locations simultaneously
When multiple processes write to the same memory location simultaneously, they are stalled
High contention hurts performance!
![Page 3: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/3.jpg)
Memory Stalls & Write-Contention
[Figure: processes p0, p1, p2, …, pj access the same variable simultaneously; the stall counts shown are 0, 1, 2, …, j.]
Write-contention is the maximum number of processes that can be enabled to perform a write or read-modify-write operation to the same memory location simultaneously.
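The stall counts in the figure can be reproduced with a small model. This is only a sketch under two assumptions that are not stated on the slide: every pending access is a write or read-modify-write, and accesses pending on the same location are served one at a time, so a process is stalled once for each access served before its own. The names `stalls_per_process` and `write_contention` are illustrative.

```python
from collections import defaultdict

def stalls_per_process(pending):
    """pending: list of (process, location) pairs in the order the accesses are served.
    Returns {process: number of stalls it incurs} (assumed serialization model)."""
    served = defaultdict(int)      # accesses already served, per location
    stalls = {}
    for proc, loc in pending:
        stalls[proc] = served[loc]  # one stall per earlier access to the same location
        served[loc] += 1
    return stalls

def write_contention(pending):
    """Largest number of processes poised on the same location."""
    counts = defaultdict(int)
    for _, loc in pending:
        counts[loc] += 1
    return max(counts.values(), default=0)

pending = [("p0", "x"), ("p1", "x"), ("p2", "x"), ("p3", "y")]
print(stalls_per_process(pending))  # {'p0': 0, 'p1': 1, 'p2': 2, 'p3': 0}
print(write_contention(pending))    # 3
```

In this model, spreading accesses over more locations lowers both the stall counts and the write-contention, which is the tension the later slides quantify.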
![Page 4: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/4.jpg)
Recall the consensus implementation we saw…
Decide(v) ; code for pi, i=0,1
1. CAS(C, null, v)
2. return C
Initially C=null
We use a single object, C, that supports the compare&swap and read operations.
What is the write-contention of this algorithm?
Answer: n. It can be shown that this is the write-contention of any consensus algorithm.
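The two-line Decide procedure can be tried out with threads. The sketch below is an illustration, not the lecture's code: Python has no hardware compare&swap, so the hypothetical `CASRegister` class uses a lock purely to model the atomicity of the primitive, and proposals are assumed to be non-null. All n threads may be poised on the CAS to C at once, which is why the write-contention is n.

```python
import threading

class CASRegister:
    """Register with read and compare&swap; the lock only models atomicity."""
    def __init__(self):
        self._value = None            # "Initially C=null"
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

    def read(self):
        with self._lock:
            return self._value

def decide(C, v):
    C.compare_and_swap(None, v)       # line 1: only the first CAS succeeds
    return C.read()                   # line 2: everyone returns the winner's value

def run_consensus(proposals):
    C = CASRegister()
    decisions = [None] * len(proposals)

    def worker(i):
        decisions[i] = decide(C, proposals[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(proposals))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions

decisions = run_consensus(["a", "b", "c", "d"])
print(len(set(decisions)) == 1)       # all processes decide the same proposed value
```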
![Page 5: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/5.jpg)
What can we say about the worst-case time complexity of objects such as counters,
stacks and queues?
![Page 6: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/6.jpg)
Naïve Counter Implementation
[Figure: six processes, numbered 1 to 6, each apply FAI to a single shared FAI object.]
Last processes to succeed incur Θ(n) time complexity!
Can we do much better?
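A minimal sketch of the naïve counter follows, assuming a single shared word whose fetch-and-increment is atomic; the lock inside the hypothetical `FAICounter` class stands in for that atomic primitive. If all n processes issue FAI at the same time, the process that obtains value k was served after k others, so the last one waits behind n-1 operations.

```python
import threading

class FAICounter:
    """Single shared word supporting fetch-and-increment (atomicity modeled by a lock)."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_and_increment(self):
        with self._lock:
            old = self._value
            self._value += 1
            return old

def run_naive_counter(n):
    counter = FAICounter()
    results = [None] * n

    def worker(i):
        results[i] = counter.fetch_and_increment()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)

# Each process gets a distinct value; value k means k operations were served first.
print(run_naive_counter(6))  # [0, 1, 2, 3, 4, 5]
```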
![Page 7: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/7.jpg)
We will see a time lower bound of Ω(√n) on non-blocking implementations of counters, stacks, queues…
Any algorithm either (a) suffers high contention or (b) suffers high latency.
![Page 8: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/8.jpg)
Capture Influence between processes
[Figure: six processes, numbered 1 to 6, whose operations influence one another.]
Time complexity is determined by the extent to which operations by different processes influence each other.
![Page 9: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/9.jpg)
Influence-level
[Figure: a shared counter holding 17, accessed through FAI. Process p: "Hmmm… I will soon request a value." The other processes: "Each of us may precede you and modify the value you will get!" Label: influence level (w.r.t. p).]
![Page 10: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/10.jpg)
Modifying Steps
[Figure: a shared counter holding 17, accessed through FAI. Process p: "Hmmm… I will soon request a value." Process q, one of the other processes: "Each of us may precede you!"]
![Page 13: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/13.jpg)
Modifying Steps
[Figure: after a FAI step by q that returns 17, the shared counter holds 18.]
There's an atomic step in which q modifies p's return value.
We bring all the 'Influencers' to be on the verge of performing a modifying step.
![Page 14: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/14.jpg)
Space/Write-contention tradeoff
• We bring all Influencers to be on the verge of a modifying step
• Each modifying step is necessarily a write/RMW operation
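$S \ge I / C$, where S is the space complexity, I is the influence-level and C is the write-contention (at most C of the I influencers can be poised on the same base object).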
![Page 15: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/15.jpg)
Latency/Contention tradeoff
[Figure: a shared counter holding 17, accessed through FAI, with the base objects on which there are outstanding modifying steps. Process p: "Hmmm… I will soon request a value."]
Process p can be made to read all these variables in the course of its operation!
$L_R \ge I / C$, where $L_R$ is the number of base objects read, I is the influence-level and C is the write-contention.
![Page 16: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/16.jpg)
Time lower bound
$L_R \cdot C \ge I$
Time complexity is at least $\sqrt{I}$
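The step from the product bound to a square-root bound can be spelled out as follows. The derivation assumes, as the surrounding slides suggest but do not state explicitly, that the worst-case time T of an operation is at least the number of base objects it reads, $L_R$, and at least the write-contention C, since a process may be stalled behind the other processes poised on the same object.

```latex
\begin{align*}
L_R \ge \frac{I}{C} \quad &\Longrightarrow \quad L_R \cdot C \ge I \\
T \ge \max(L_R, C) \quad &\Longrightarrow \quad T \ge \sqrt{L_R \cdot C} \ge \sqrt{I}
\end{align*}
```

For an object whose influence-level is n-1, this gives a time lower bound on the order of √n.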
![Page 17: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/17.jpg)
Influence(n) Objects Class
Definition: The Influence-function, $I_O(n)$, of a generic object O is defined as follows: $I_O(n) = k$ if the influence-level of any n-process nonblocking implementation of O is at least k.
Definition: Influence(n) is the class of generic objects whose Influence-function is in $\Omega(n)$.
Influence(n) includes: stacks, queues, hash-tables, pools, linearizable counters, consensus, approximate-agreement…
![Page 18: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/18.jpg)
Concurrent Counter is in Influence(n)
[Figure: a shared counter holding 17, accessed through FAI. Process p: "Hmmm… I will soon request a value." The other processes: "Each of us may precede you!"]
Influence-level is (n-1): every q≠p can influence p
![Page 19: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/19.jpg)
Stack is in Influence(n)
[Figure: a stack holding the elements 1, 2, 3, …, n, with its top marked. Process p: "Hmmm… I will soon attempt to pop a value." The other processes: "Each of us may precede you!"]
Influence-level is (n-1), e.g. if every q≠p has a pending pop operation.
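The counter and stack arguments can be checked with a sequential proxy. The sketch below is a simplification of influence-level, which is defined over concurrent executions: it only counts how many processes q≠p can, with a single operation placed just before p's, change the value p returns. The helper `count_influencers` and the `Counter` class are hypothetical names introduced for illustration.

```python
def count_influencers(make_object, q_op, p_op, n):
    """Count processes q != p whose single operation, run just before p's,
    changes p's return value (sequential proxy; all q behave identically here)."""
    baseline = p_op(make_object())   # p runs alone on a fresh object
    count = 0
    for _ in range(n - 1):           # one trial per process q != p
        obj = make_object()
        q_op(obj)                    # q's operation precedes p's
        if p_op(obj) != baseline:
            count += 1
    return count

class Counter:
    def __init__(self, value=17):
        self.value = value

    def fetch_and_increment(self):
        old = self.value
        self.value += 1
        return old

n = 6
counter_level = count_influencers(lambda: Counter(17),
                                  Counter.fetch_and_increment,
                                  Counter.fetch_and_increment, n)
# Stack modeled as a Python list; pop removes the top element.
stack_level = count_influencers(lambda: list(range(1, n + 1)),
                                list.pop, list.pop, n)
print(counter_level, stack_level)    # 5 5, i.e. n-1 in both cases
```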
![Page 20: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/20.jpg)
Approximate Agreement is in Influence(n)
In approximate agreement, each process proposes its value.
• Validity: Each process must decide on a value that is legal (in the range of proposed values).
• Approximate agreement: The values decided by any two processes must be no more than ε apart.
[Figure: processes p1, p2, p3, p4, p5, …, pn; p1 proposes 0 and every other process proposes 2ε.]
Influence-level is (n-1).
If p1 runs first, it must return 0. If it is preceded by an execution in which some q≠p1 terminates, p1 must return a value no less than ε.
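The two requirements of approximate agreement can be written as a small checker. This is a sketch with a hypothetical function name, using ε = 1 so that the proposals are 0 for p1 and 2ε = 2 for every other process. The three calls mirror the influence argument: p1 running first and returning 0, p1 wrongly returning 0 after some q has already returned 2ε, and p1 returning ε in that case.

```python
def satisfies_approximate_agreement(proposed, decided, eps):
    """Check validity and approximate agreement for the values decided so far."""
    lo, hi = min(proposed), max(proposed)
    validity = all(lo <= d <= hi for d in decided)   # decisions lie in the proposed range
    agreement = max(decided) - min(decided) <= eps    # any two decisions are within eps
    return validity and agreement

eps = 1
proposed = [0, 2, 2, 2, 2, 2]        # p1 proposes 0, the others propose 2*eps

# p1 runs first and returns 0; the others must then stay within eps of 0.
print(satisfies_approximate_agreement(proposed, [0, 1, 1, 1, 1, 1], eps))  # True
# Some q ran alone first and returned 2*eps; p1 returning 0 is now illegal.
print(satisfies_approximate_agreement(proposed, [0, 2], eps))              # False
# With q's decision at 2*eps, p1 must return at least eps.
print(satisfies_approximate_agreement(proposed, [1, 2], eps))              # True
```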
![Page 21: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/21.jpg)
The First-Generation Problem
• Every process calls a First operation once.
• We say an operation is in the first generation of execution E if it is not preceded in E by any other operation.
• All operations not in the first generation of the execution must return false.
• In quiescence, at least one operation from the first generation must have returned true.
The bound for Influence(n) is tight
Lemma: The First-Generation object is in Influence(n), and for this problem our bound is tight.
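The specification of the First-Generation problem can be expressed as a checker over a finished execution. The sketch below assumes each First operation is recorded as a hypothetical `(start, end, result)` triple with distinct timestamps, and that the execution is quiescent and non-empty. An operation is preceded by another one exactly when some operation ended before it started, so the first generation consists of the operations that start before the earliest end time.

```python
def check_first_generation(ops):
    """ops: non-empty list of (start, end, result) triples for completed First operations.
    Returns True if the execution satisfies the First-Generation specification."""
    earliest_end = min(end for _, end, _ in ops)
    # Not preceded by any operation <=> started before any operation finished.
    first_gen = [op for op in ops if op[0] < earliest_end]
    later = [op for op in ops if op[0] >= earliest_end]
    later_all_false = all(not result for _, _, result in later)
    some_first_true = any(result for _, _, result in first_gen)
    return later_all_false and some_first_true

# Two overlapping first-generation operations, then a later one that returns false.
print(check_first_generation([(0, 3, True), (1, 2, False), (4, 5, False)]))  # True
# A later-generation operation returns true: violation.
print(check_first_generation([(0, 1, True), (2, 3, True)]))                  # False
# No first-generation operation returned true: violation.
print(check_first_generation([(0, 3, False), (1, 2, False)]))                # False
```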
![Page 22: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/22.jpg)
An Optimal Implementation for the First Generation Problem
[Figure: processes arranged in groups (label: "Groups of n processes") alongside the mark array (label: "The mark array of n multi-reader multi-writer atomic variables").]
![Page 23: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/23.jpg)
A linear lower bound on the number of stalls for long-lived objects
The following material is not required for the exam/assignments.
![Page 24: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/24.jpg)
“Naïve” Counter Implementation
[Figure: six processes, numbered 1 to 6, each apply FAI (Fetch-and-Increment) to a single shared word supporting fetch&inc.]
Last process incurs Θ(n) time complexity!
Can we do better?
![Page 25: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/25.jpg)
Theorem: In any n-process obstruction-free implementation of a counter, the worst-case number of stalls incurred by a process as it performs a fetch&increment operation is at least n-1.
![Page 26: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/26.jpg)
Worst-case stalls number ≥ n-1
Start from an initial state. Fix a process p about to perform a fetch&increment operation.
Consider the path it takes if it runs uninterrupted when only first-accesses to shared words are considered.
[Figure: p's path through the shared words it accesses, numbered 1, 2, 3, 4 in order.]
![Page 30: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/30.jpg)
Let O1 be the first word along p's path that is written by some other process in any p-free execution. There must be such a word.
[Figure: p's path through words 1, 2, 3, 4, with O1 marked.]
![Page 31: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/31.jpg)
Let E1 be an execution that maximizes, over all p-free executions, the number of processes that are about to write to O1. Let G1 be the set of these processes, |G1| = K1.
![Page 32: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/32.jpg)
If K1 = n-1 then we are done.
Otherwise, we show that p must access yet another word that may be written by other processes.
![Page 33: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/33.jpg)
What happens if p incurs the stalls on O1?
[Figure: the K1 processes of G1 are about to write to O1, which lies on p's path.]
![Page 39: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/39.jpg)
But now the rest of the path may change...
![Page 42: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/42.jpg)
[Figure: p's path after it incurs the stalls on O1; the words accessed after O1 differ from the original path.]
Assume p gets value v.
![Page 43: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/43.jpg)
v: the value returned by p if we let it run and incur the stalls.
c: the number of fetch&increment operations completed before p starts its operation.
We have: $v \in \{c, \ldots, c+K_1\}$
![Page 44: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/44.jpg)
[Figure: timeline. c fetch&inc operations complete before p's operation starts, and the K1 pending fetch&inc operations of G1 overlap it, so p returns $v \in \{c, \ldots, c+K_1\}$.]
![Page 45: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/45.jpg)
We select some process $q \notin G_1 \cup \{p\}$.
We let q perform K1+1 fetch&increment operations.
q must write to a word read by p after O1.
![Page 47: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/47.jpg)
[Figure: timeline. If q completes K1+1 fetch&inc operations before p runs, p must return a value v' ≥ c+K1+1, so v' > v.]
Since p's return value changes, q must write to a word read by p after O1.
![Page 48: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/48.jpg)
Let O2 be the first word accessed by p after it incurs the K1 stalls that is written by some process not in $G_1 \cup \{p\}$.
Let E2 be an execution that maximizes, over all $(G_1 \cup \{p\})$-free executions, the number of processes that are about to write to O2.
![Page 49: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/49.jpg)
Continuing with this construction we get words O1, O2, …, Om on p's path, with |G1| = K1, |G2| = K2, …, |Gm| = Km.
The construction ends when K1 + K2 + … + Km = n-1, at which point p incurs n-1 stalls.
![Page 50: Contention in shared memory multiprocessors Multiprocessor synchronization algorithms (20225241) Lecturer: Danny Hendler Definitions Lower bound for consensus](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d585503460f94a3856e/html5/thumbnails/50.jpg)
Conclusion: the “Naïve” implementation is best possible!
[Figure: six processes, numbered 1 to 6, each apply FAI to a single shared FAI object.]