Light64: Lightweight Hardware Support for Data Race Detection during Systematic Testing of Parallel Programs
A. Nistor, D. Marinov and J. Torellas
to appear MICRO’09
LBA reading group – 09/29/09
(by Evangelos)
Introduction – Context
• Debugging of parallel applications
  • Even for 1 input, too many interleavings
• Systematic Testing
  • Execute many times to explore all interleavings
• Assumptions:
  • Input provided
  • Thread interleaving is the only cause of non-determinism
• Goal: Hardware support for data race detection under Systematic Testing
Background of Systematic Testing
• Serializing of threads (multiplexing)
• New scheduler implementation
• Happens-before definition
• Segment-based interleaving
Background of Systematic Testing
State: represented by a Serial Log, an ordered list of segments
Background of Systematic Testing
Light64 – The Idea
“Two different thread interleavings that have the same happens-before graph but a flipped data race, will very likely have at least a small deviation in the execution history”
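The idea quoted above can be illustrated with a small software sketch. It assumes, purely for illustration, that the execution history is the stream of values each thread loads, hashed per thread with CRC-32 from Python's `zlib`; the `run` helper, the tuple-based toy programs, and the memory layout are hypothetical and are not taken from the paper.

```python
import zlib

def run(schedule, programs):
    """Run each thread's segment serially in `schedule` order.

    Returns a per-thread hash of the values that thread loaded
    (an assumed stand-in for the execution history hash).
    """
    memory = {"x": 0, "y": 0}
    hashes = {tid: 0 for tid in programs}
    for tid in schedule:
        for op, addr, *rest in programs[tid]:
            if op == "store":
                memory[addr] = rest[0]
            else:  # "load": fold the loaded value into this thread's hash
                value = memory[addr]
                hashes[tid] = zlib.crc32(value.to_bytes(8, "little"), hashes[tid])
    return hashes

# Thread 0 writes x while thread 1 reads x with no ordering: a data race.
racy = {0: [("store", "x", 1)], 1: [("load", "x")]}
# Thread 1 reads a different variable: no race.
independent = {0: [("store", "x", 1)], 1: [("load", "y")]}

# Flip the two unordered segments and compare the hashes.
racy_differs = run([0, 1], racy) != run([1, 0], racy)
independent_differs = run([0, 1], independent) != run([1, 0], independent)
```

In this toy setting, flipping the racy pair changes the value thread 1 loads, so its hash deviates, while flipping the independent pair leaves every per-thread hash unchanged.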
Corner cases?
• No false positives; few false negatives
  • Systematic tester environment highly deterministic
  • Extremely improbable for two different streams of values to generate the same hash
• Cannot identify benign races; races on data that will never be consumed
  • By construction…
Design
• Small hardware modifications
  • CRC logic at the head of the ROB
  • ISA extensions: start/stop, save/load hash history
• Two modes of execution
  • Passive Mode
  • Active Mode
  • Tradeoff between accuracy and performance
Passive Mode
• During step 4
  • Augment each state with the Execution History Hash
  • Check if executions with the same happens-before have the same hash value (e.g., S2 & S11)
• No guarantees on coverage
  • Depends on the systematic tester’s exploration strategy and pruning heuristics
• No practical overhead
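The passive check can be sketched as a grouping step over the states the tester has already explored. This is a minimal sketch under stated assumptions: each state is a hypothetical `(state_id, happens_before_graph, history_hash)` tuple, the graph is a `frozenset` of edges, and the hash values are made up for the example.

```python
from collections import defaultdict

def find_suspect_groups(states):
    """Group explored states by happens-before graph and report groups
    whose members disagree on the execution history hash."""
    groups = defaultdict(lambda: defaultdict(list))
    for state_id, hb_graph, history_hash in states:
        groups[hb_graph][history_hash].append(state_id)
    return [
        sorted(sid for ids in by_hash.values() for sid in ids)
        for by_hash in groups.values()
        if len(by_hash) > 1  # same graph, different hashes: likely a race
    ]

# Hypothetical explored states; edges and hash values are illustrative only.
graph_a = frozenset({("t0.seg0", "t1.seg1")})
graph_b = frozenset({("t1.seg0", "t0.seg1")})
explored = [
    ("S2", graph_a, 0x1A2B),
    ("S5", graph_b, 0x9999),
    ("S11", graph_a, 0x3C4D),
]

suspects = find_suspect_groups(explored)
```

Here `suspects` contains the single group with S2 and S11, which share a happens-before graph but carry different hashes. Because the check only sees states the tester happened to visit, coverage depends on the exploration strategy, as the slide notes.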
Active Mode
• During step 2:
  • While re-executing to reach the selected state ‘S’, flip as many segments as possible
  • Compare Execution History Hash against original execution
• Heuristic 1 – efficient segment reordering
  • Smallest-ID Thread first during first run
  • Biggest-ID Thread first during re-execution
• Heuristic 2 – additional re-executions to increase coverage
  • ActiveFIN – re-execute all final states
  • ActiveFULL – re-execute all states
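Heuristic 1 can be sketched as a serial scheduler whose only variable is the tie-breaking rule among enabled threads. This is an illustrative sketch, not the tester's actual scheduler: the `serialize` function, the segment names, and the `must_follow` map (segments that must happen before another one) are all assumptions.

```python
def serialize(threads, must_follow, pick):
    """Produce a serial order of segments that respects happens-before.

    threads:     {thread_id: [segment names in program order]}
    must_follow: {segment: set of segments that must run before it}
    pick:        chooses one thread id among the enabled ones (min or max)
    """
    next_index = {tid: 0 for tid in threads}
    done, order = set(), []
    total = sum(len(segs) for segs in threads.values())
    while len(order) < total:
        # A thread is enabled if its next segment's prerequisites have run.
        enabled = [
            tid for tid, segs in threads.items()
            if next_index[tid] < len(segs)
            and must_follow.get(segs[next_index[tid]], set()) <= done
        ]
        tid = pick(enabled)
        segment = threads[tid][next_index[tid]]
        next_index[tid] += 1
        done.add(segment)
        order.append(segment)
    return order

threads = {0: ["a0", "a1"], 1: ["b0", "b1"]}
must_follow = {"b1": {"a0"}}  # e.g. b1 waits on synchronization performed in a0

first_run = serialize(threads, must_follow, min)  # smallest-ID thread first
rerun = serialize(threads, must_follow, max)      # biggest-ID thread first
```

In this example the first run yields `a0, a1, b0, b1` and the re-execution yields `b0, a0, b1, a1`. Both keep `a0` before `b1`, so the happens-before graph is unchanged, while every unordered cross-thread pair is flipped, which is what lets a hash comparison between the two runs expose a race.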
Experimental Setup
• Used Pin to model a system running a systematic tester
• Instruction count as a performance metric
• SPLASH-2 benchmarks (modified & unmodified)
• 6 versions of a system: Plain, Plain+RD, ActiveNO, ActiveFIN, ActiveFULL, Passive
State Space Characterization
Race Detection Capability
Runtime Overhead
Runtime Overhead – Software-based
Conclusions
• Lightweight support for data race detection in a Systematic Tester world
• Relatively low overhead for S.T.
• Not a conventional MICRO paper