![Page 1: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/1.jpg)
Improving Bloom Filter Configuration for Lazy Transactional Memory
Mark Jeffrey and J. Gregory Steffan
ECE, University of Toronto
November 10, 2011
![Page 2: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/2.jpg)
Mark Jeffrey, Improving Bloom Filter Configuration for Lazy TM 2
Parallel Programming is Hard
T1: Rd(a), Rd(b), Wr(a)
T2: Rd(a), Wr(c), Rd(a)
T3: Rd(x), Rd(a)
Tools offload some burden of managing data accesses:
– Memory Race Replay
– Atomicity Violation Survival
– Transactional Memory
– Speculative Optimizations
Many tools are using Bloom filters
![Page 3: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/3.jpg)
Bloom Filter
• Bit-vector-based data structure [1970]
– offers fast set operations
– in exchange for some imprecision
• Recently used to compare memory accesses
• With unconventional practices: Intersection (bitwise & of two filters)
We show new practices are inefficient! (in theory and empirically)
![Page 4: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/4.jpg)
Bloom Filters in Concurrency Tools

| System | Year | Application |
|---|---|---|
| Bulk | 2006 | Hardware TM |
| BulkSC | 2007 | Memory Consistency |
| HARD | 2007 | Race Detection |
| DeLorean | 2008 | Deterministic Race Replay |
| SoftSig | 2008 | Code Analysis/Optimization/Debug |
| RingSTM | 2008 | Software TM |
| SigRace | 2009 | Race Detection |
| ColorSafe | 2010 | Atomicity Violation |
| InvalSTM | 2010 | Software TM |
| AdapSig | 2010 | Software TM |
| SvS | 2011 | Auto-protection of shared state |

Our propositions will improve parallelism!
![Page 5: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/5.jpg)
Tracking Address-Set Conflicts
![Page 6: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/6.jpg)
Address-Sets
T1: Rd(a), Rd(b), Wr(a)
T2: Rd(a), Wr(c), Rd(a)
T3: Rd(x), Rd(a)
Read Set:
• memory locations read
• $R_{T1} = \{a, b\}$
Write Set:
• memory locations written
• $W_{T1} = \{a\}$
![Page 7: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/7.jpg)
Burden: Address-Set Conflicts
T1: Rd(a), Rd(b), Wr(a)
T2: Rd(a), Wr(c), Rd(a)
T3: Rd(x), Rd(a)
Conflicts
– address accesses are dependent
– independence -> parallelism!
– address conflicts -> no parallelism
Conflict Detection requires
– read and write set comparison
![Page 8: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/8.jpg)
Lazy Conflict Detection
Detect conflicts at the end of a transaction
Test address-sets for null-intersections
T1: Wr(b), Rd(a), Rd(c)
T2: Rd(a), Rd(b)
$R_1 = \{a, c\}$, $W_1 = \{b\}$
$W_1 \cap R_2 = \emptyset$ ?
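
As a point of reference before Bloom filters enter the picture, the null-intersection test on this slide can be sketched with exact sets. This is a minimal sketch, not the presenters' code: the helper name `conflicts` is hypothetical, and `R2 = {a, b}` is read off the T2 column of the diagram rather than stated on the slide.

```python
def conflicts(write_set, read_set):
    """Return True when the two address sets share at least one address."""
    return not write_set.isdisjoint(read_set)


# Address sets of T1, as given on the slide.
R1, W1 = {"a", "c"}, {"b"}
# Assumption: T2's read set, taken from the Rd(a) and Rd(b) in its column.
R2 = {"a", "b"}

# At the end of T1, test whether W1 and R2 have a null intersection.
overlap = conflicts(W1, R2)
print("conflict" if overlap else "no conflict")
```

Exact sets never report a conflict that is not there, but they grow with the number of addresses touched; the Bloom filter slides that follow trade that precision for a fixed-size representation.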
![Page 9: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/9.jpg)
Bloom Filters (BF)
![Page 10: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/10.jpg)
Bloom Filter Background
• Bloom filter is a compact set representation
– bit vector, much smaller than address space
[Figure: x → h() → bit vector; $x \in \mathrm{BF}(S)$]
![Page 11: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/11.jpg)
Bloom Filter Background
Query for an address, y
[Figure: y → h() → bit vector; $y \in \mathrm{BF}(S)$? → {Yes, No}]
![Page 12: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/12.jpg)
Bloom Filter False Positives (FPs)
• Encode a large address space into a bit-vector
– response to query is actually No or Maybe
• False Positives
– when “maybe” is wrong
[Figure: addresses x and y and the bit vector; is y in the set?]
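
The No/Maybe behaviour can be reproduced with a deliberately tiny filter. This is a minimal sketch under stated assumptions: the class name, the 8-bit length, and the toy hash `addr % m` are all illustrative choices made so the collision can be traced by hand; none of them come from the slides.

```python
class BloomFilter:
    """Single-hash Bloom filter stored in one Python integer."""

    def __init__(self, m):
        self.m = m       # number of bits in the vector
        self.bits = 0    # all bits start cleared

    def _index(self, addr):
        # Toy hash (assumption): many addresses share each bit index.
        return addr % self.m

    def insert(self, addr):
        self.bits |= 1 << self._index(addr)

    def query(self, addr):
        """False means "No"; True only means "Maybe"."""
        return bool((self.bits >> self._index(addr)) & 1)


bf = BloomFilter(m=8)
bf.insert(3)                 # x = 3 sets bit 3

maybe_x = bf.query(3)        # inserted, so the answer is Maybe (and correct)
maybe_y = bf.query(11)       # never inserted, but 11 % 8 == 3: a false positive
no_z = bf.query(4)           # bit 4 is clear, so the answer is a definite No
```

A "No" is always trustworthy because inserting never clears a bit; only "Maybe" can be wrong.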
![Page 13: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/13.jpg)
Partitioned Bloom Filter
Insert an address, x:
– k hash functions encode k bit indices to set
[Figure: x → h1(), h2(), …, hk() → one bit set per partition; $x \in \mathrm{BF}(S)$]
![Page 14: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/14.jpg)
Partitioned Bloom Filter
Query for an address, y:
[Figure: y → h1(), h2(), …, hk() → bit vector; $y \in \mathrm{BF}(S)$? → {Maybe, No}]
Probability of False Positives is well understood
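
Insertion and querying for a partitioned filter can be sketched as follows. This is an illustration, not the configuration used in the talk: the class name, the sizes `m=64, k=4`, and the salted SHA-256 hash family are assumptions chosen only to keep the example self-contained and deterministic.

```python
import hashlib


class PartitionedBloomFilter:
    """m bits split into k partitions; hash i sets one bit in partition i."""

    def __init__(self, m, k):
        assert m % k == 0, "this sketch assumes k divides m"
        self.k = k
        self.part_bits = m // k
        self.parts = [0] * k  # one integer bit vector per partition

    def _index(self, i, addr):
        # Hypothetical hash family: SHA-256 salted with the partition number.
        digest = hashlib.sha256(f"{i}:{addr}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.part_bits

    def insert(self, addr):
        for i in range(self.k):
            self.parts[i] |= 1 << self._index(i, addr)

    def query(self, addr):
        """False means No; True means Maybe (all k indexed bits are set)."""
        return all(
            (self.parts[i] >> self._index(i, addr)) & 1 for i in range(self.k)
        )


bf = PartitionedBloomFilter(m=64, k=4)
inserted = (0x1000, 0x1008, 0x2000)
for addr in inserted:
    bf.insert(addr)
```

Every inserted address queries as Maybe, and an empty filter answers No for everything; what varies with `m`, `k`, and the number of insertions is how often a non-member also gets Maybe.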
![Page 15: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/15.jpg)
Unconventional Bloom Filter Null-Intersection Tests
![Page 16: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/16.jpg)
Bloom Filter Null-Intersection Tests
Two existing approaches:
1. build a Queue of Queries (QoQ)
2. combine queries into distinct Bloom filter
– replace many queries with 1 intersection!
[Figure: a queue of addresses a1, a2, a3, a4, a5 queried one at a time, versus a single filter-to-filter test]
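
The first approach, a Queue of Queries, can be sketched as a loop of ordinary membership queries. This is a minimal sketch under assumptions: the function names, the sizes `M` and `K`, and the salted SHA-256 hashes are hypothetical, and the queue is modelled as a plain Python list of exact addresses.

```python
import hashlib

M, K = 64, 4          # illustrative filter length and hash count
PART = M // K         # bits per partition


def index(i, addr):
    # Hypothetical hash family: SHA-256 salted with the partition number.
    digest = hashlib.sha256(f"{i}:{addr}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % PART


def build_filter(addrs):
    parts = [0] * K
    for addr in addrs:
        for i in range(K):
            parts[i] |= 1 << index(i, addr)
    return parts


def maybe_contains(parts, addr):
    return all((parts[i] >> index(i, addr)) & 1 for i in range(K))


def qoq_maybe_overlap(filter_a, queue_b):
    """Query every queued address of one set against the other set's filter."""
    return any(maybe_contains(filter_a, b) for b in queue_b)


bf_a = build_filter({1, 2, 3})
shared = qoq_maybe_overlap(bf_a, [9, 3])   # 3 is in both sets
```

The cost is visible in the signature: one side must keep its exact addresses around, and the test takes one query per queued address instead of a single bitwise operation.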
![Page 17: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/17.jpg)
Partitioned BF Intersection
Do two sets share any elements?
$S_1 \cap S_2 = \emptyset$ ?
[Figure: the two partitioned filters are combined with bitwise & → {Disjoint, Maybe Overlap}]
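
One way to read a partitioned intersection: an address present in both sets sets the same bit in every partition of both filters, so the sets can overlap only if every partition of the ANDed result is non-zero. The sketch below assumes that rule; the toy hashes and the sizes `K = 2`, `PART = 4` are hypothetical and chosen so the bits can be traced by hand.

```python
K = 2       # partitions, one hash function each
PART = 4    # bits per partition, so m = K * PART = 8


def index(i, addr):
    # Toy hash family (assumption): h_0 = addr % 4, h_1 = (addr // 4) % 4.
    return (addr // PART**i) % PART


def build_filter(addrs):
    parts = [0] * K
    for addr in addrs:
        for i in range(K):
            parts[i] |= 1 << index(i, addr)
    return parts


def partitioned_maybe_overlap(f1, f2):
    """Disjoint if any partition's AND is zero; otherwise Maybe Overlap."""
    return all(p1 & p2 for p1, p2 in zip(f1, f2))


# {0} and {4} collide in partition 0 but not in partition 1: reported Disjoint.
disjoint_case = partitioned_maybe_overlap(build_filter({0}), build_filter({4}))
# 5 is in both sets: reported Maybe Overlap, correctly.
true_overlap = partitioned_maybe_overlap(build_filter({0, 5}), build_filter({5}))
# {0, 5} and {1, 4} are disjoint, yet every partition collides: a false set-overlap.
false_overlap = partitioned_maybe_overlap(build_filter({0, 5}), build_filter({1, 4}))
```

The first case shows why partitioning helps: a single colliding bit is not enough to report an overlap.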
![Page 18: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/18.jpg)
Unpartitioned BF Intersection
$S_1 \cap S_2 = \emptyset$ ?
Any asserted bits indicate set overlap
[Figure: the two unpartitioned filters are combined with bitwise & → {Disjoint, Maybe Overlap}]
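
For the unpartitioned case the rule on the slide is direct: AND the two bit vectors and report a possible overlap if any bit survives. A minimal sketch, assuming an 8-bit vector and two toy hash functions that each range over the whole vector (both are illustrative choices, not values from the talk):

```python
M = 8  # bits in the whole vector (illustrative)


def indices(addr):
    # Two toy hash functions (k = 2), each mapping into all M bits.
    return (addr % M, (3 * addr + 1) % M)


def build_filter(addrs):
    bits = 0
    for addr in addrs:
        for idx in indices(addr):
            bits |= 1 << idx
    return bits


def unpartitioned_maybe_overlap(bits1, bits2):
    """Any asserted bit in the AND is reported as Maybe Overlap."""
    return (bits1 & bits2) != 0


a = build_filter({0})    # sets bits 0 and 1
b = build_filter({1})    # sets bits 1 and 4
c = build_filter({4})    # sets bits 4 and 5

false_overlap = unpartitioned_maybe_overlap(a, b)   # disjoint sets, shared bit 1
disjoint_case = unpartitioned_maybe_overlap(a, c)   # no shared bit
```

Here `{0}` and `{1}` are disjoint, but one hash of each lands on bit 1, so a single coincidence between different hash functions is enough to produce a false set-overlap.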
![Page 19: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/19.jpg)
Imprecision in BF Intersection
• Bloom filter was intended for fast Querying
• Recent systems use filter for Intersection
– Imprecision can produce False Set-Overlaps (FSO)
– We are the first to study Bloom filter FSOs
– Our goal is to understand and improve Bloom filter intersection
![Page 20: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/20.jpg)
Important Questions
When using BFs for testing null-intersection
1. How do BF Intersection and QoQ compare?
– theoretical study [SPAA ’11]
2. Can we compromise?
– new Bloom filter design
3. Does theory work in practice?
– empirical study
![Page 21: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/21.jpg)
1. How do BF Intersection and QoQ compare?
Bloom Filters for Null-Intersection Tests
![Page 22: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/22.jpg)
Definitions
$A$, $B$: disjoint address access sets
$m$ bits
![Page 23: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/23.jpg)
Definitions
$A$, $B$: disjoint address access sets
$m$ bits
$k$ partitions, with hash functions h1(), h2(), …, hk()
![Page 24: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/24.jpg)
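Probability of FSO [SPAA ’11]
• Unpartitioned BF Intersection
$$p_{\mathrm{Unpart}} = 1 - \left(1 - \frac{1}{m}\right)^{k^2 |A| |B|}$$
• Partitioned BF Intersection
$$p_{\mathrm{Part}} = \left(1 - \left(1 - \frac{k}{m}\right)^{|A| |B|}\right)^{k}$$
• Queue of BF Queries
$$p_{\mathrm{QoQ}} = 1 - \left(1 - \left(1 - \left(1 - \frac{k}{m}\right)^{|A|}\right)^{k}\right)^{|B|}$$
[Figure: hash functions h1, h2, …, hk beside each filter; queued addresses b1, b2, b3, b4, b5 each tested with ∈?]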
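
To get a feel for the magnitudes, the three expressions can be evaluated numerically. The sketch below transcribes them as read from this slide (see the docstrings), so it inherits any error in that reading; the function names and the parameter values `m = 1024`, `k = 4`, `|A| = |B| = 10` are illustrative assumptions, not figures from the talk.

```python
def p_unpartitioned(m, k, a, b):
    """1 - (1 - 1/m)^(k^2 |A||B|)"""
    return 1 - (1 - 1 / m) ** (k * k * a * b)


def p_partitioned(m, k, a, b):
    """(1 - (1 - k/m)^(|A||B|))^k"""
    return (1 - (1 - k / m) ** (a * b)) ** k


def p_qoq(m, k, a, b):
    """1 - (1 - (1 - (1 - k/m)^|A|)^k)^|B|"""
    false_positive = (1 - (1 - k / m) ** a) ** k  # one query against BF(A)
    return 1 - (1 - false_positive) ** b


m, k, a, b = 1024, 4, 10, 10  # illustrative configuration
unpart = p_unpartitioned(m, k, a, b)
part = p_partitioned(m, k, a, b)
qoq = p_qoq(m, k, a, b)
print(f"unpartitioned={unpart:.3g} partitioned={part:.3g} qoq={qoq:.3g}")
```

For this configuration the unpartitioned intersection reports a false overlap most of the time (roughly 0.8), the partitioned intersection about once in a hundred tests, and the queue of queries on the order of a few in 100,000.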
![Page 25: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/25.jpg)
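Comparing FSOs [SPAA ’11]
For any length m, and k > 1 hash functions,
$$p_{\mathrm{QoQ}} \le p_{\mathrm{Partitioned}} \le p_{\mathrm{Unpartitioned}}$$
Queue of Queries gives the fewest false conflicts
Partitioned intersection improves on Unpartitioned
[Figure: queued addresses b1, b2, b3, b4 each tested with ∈?, beside two filters with hash functions h1, …, hk]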
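
A small worked case makes the ordering concrete. It assumes the [SPAA ’11] expressions take the forms $p_{\mathrm{Unpartitioned}} = 1 - (1 - 1/m)^{k^2|A||B|}$, $p_{\mathrm{Partitioned}} = (1 - (1 - k/m)^{|A||B|})^k$ and $p_{\mathrm{QoQ}} = 1 - (1 - (1 - (1 - k/m)^{|A|})^k)^{|B|}$, and picks $|A| = |B| = 1$, $m = 1024$, $k = 2$ purely for illustration:

$$p_{\mathrm{QoQ}} = p_{\mathrm{Partitioned}} = \left(\frac{k}{m}\right)^{k} = \left(\frac{2}{1024}\right)^{2} = 2^{-18} \approx 3.8 \times 10^{-6}$$

$$p_{\mathrm{Unpartitioned}} = 1 - \left(1 - \frac{1}{1024}\right)^{4} \approx 3.9 \times 10^{-3}$$

With single-element sets the queue of queries and the partitioned intersection coincide, which is why the first inequality is not strict, while the unpartitioned intersection is already about a thousand times more likely to report a false set-overlap.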
![Page 26: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/26.jpg)
2. Can we compromise? A new Bloom filter design
Bloom Filters for Null-Intersection Tests
![Page 27: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/27.jpg)
Batch-of-Bloom-filters (BoB)
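[Figure: x → h_pre selects one of b Bloom filters; h1, …, hk then set bits within that filter; $x \in \mathrm{BoB}(S)$]
$S = S_1 \cup S_2 \cup \dots \cup S_b$
[Figure: a single BF(S) shown alongside the batch BF(S_1), BF(S_2), …, BF(S_b)]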
![Page 28: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/28.jpg)
BoB Intersection
$S_1 \cap S_2 = \emptyset$ ?
[Figure: the Bloom filters of two BoBs are combined with bitwise & → {Disjoint, Maybe Overlap}]
BoB: compromise between QoQ and Intersect
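
The Batch-of-Bloom-filters idea can be sketched end to end: a pre-hash routes each address to one of b small filters, and two batches are intersected filter by filter. This is one reading of the slides, not the presenters' implementation. The sizes, the salted SHA-256 hashes, and the rule "maybe overlap if some pair of corresponding filters has a set bit in every partition of its AND" are assumptions; the rule follows from the observation that an address shared by both sets is routed to the same filter index in both batches.

```python
import hashlib

B, K, PART = 4, 2, 16   # b filters, k partitions each, bits per partition


def h(tag, addr, modulus):
    # Hypothetical hash family: SHA-256 salted with a tag.
    digest = hashlib.sha256(f"{tag}:{addr}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def build_bob(addrs):
    """bob[j][i] is partition i of the j-th Bloom filter in the batch."""
    bob = [[0] * K for _ in range(B)]
    for addr in addrs:
        j = h("pre", addr, B)               # h_pre picks one filter
        for i in range(K):                  # h_1..h_k set one bit per partition
            bob[j][i] |= 1 << h(i, addr, PART)
    return bob


def bob_maybe_overlap(bob1, bob2):
    """Intersect corresponding filters; any surviving pair means Maybe Overlap."""
    return any(
        all(p1 & p2 for p1, p2 in zip(f1, f2))
        for f1, f2 in zip(bob1, bob2)
    )


shared = bob_maybe_overlap(build_bob({1, 2, 3}), build_bob({3, 7}))
empty = bob_maybe_overlap(build_bob(set()), build_bob({1}))
```

Each small filter holds only the addresses routed to it, so it behaves like a lightly loaded filter, while the whole test is still a fixed number of bitwise ANDs rather than one query per address.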
![Page 29: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/29.jpg)
3. Does theory work in practice?
Bloom Filters for Null-Intersection Tests
![Page 30: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/30.jpg)
Methodology
• Augment RingSTM [Spear et al. SPAA ’08] with alternate BF configs
– baseline: unpartitioned Bloom filter intersection
• Stress BF configurations using the STAMP benchmarks
• 8-core Intel Xeon with SSE2 ISA
– 32-bit Linux 2.6.32-5-686
![Page 31: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/31.jpg)
Performance Results: Labyrinth
[Figure: two panels, Execution Time and Aborts, with a “Better” direction marker and a “21% Speedup” annotation]
QoQ, BoB, part. intersect outperform baseline
![Page 32: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/32.jpg)
Performance Results: Kmeans-low
[Figure: two panels, Execution Time and Aborts, with a “Better” direction marker and a “>25% slowdown” annotation]
Querying overhead counteracts reduced aborts
![Page 33: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/33.jpg)
Conclusion
![Page 34: Improving Bloom Filter Configuration for Lazy Transactional Memory](https://reader036.vdocuments.net/reader036/viewer/2022062323/56816374550346895dd44fde/html5/thumbnails/34.jpg)
Conclusion
Conflict detection often applies Bloom filters
– for fast set operations: $y \in S$ and $S_1 \cap S_2$
– unconventionally using BFs for null-intersection
Our recommendations (from theory & practice)
1. strongly consider querying before intersection
2. in hardware, consider intersecting BoBs
3. build adaptive systems for application behaviors