
SOS: Saving Time in Dynamic Race Detection with Stationary Analysis

Du Li, Witawas Srisa-an, Matthew B. Dwyer

Data Race

• Two concurrent accesses to the same data, and at least one access is a write

• One of the most common concurrency bugs

Thread 1 Thread 2

Release L

Acquire L

Dynamic Race Detection

• Lockset-based approaches
  – Imprecise (false positives)
  – Eraser

• Vector-clock based approaches
  – Precise
  – FastTrack
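As a rough illustration of why lockset checking can be imprecise, the sketch below follows the basic lockset idea used by Eraser: each variable keeps a candidate set of locks, which is intersected with the locks held at every access, and an empty set triggers a warning. The class and method names are hypothetical, and Eraser's state refinements (initialization, read-sharing) are left out.

```python
class LocksetChecker:
    """Simplified lockset check in the style of Eraser (hypothetical names)."""

    def __init__(self):
        self.candidates = {}   # variable -> locks held on every access so far
        self.warnings = []     # variables whose candidate set became empty

    def access(self, var, locks_held):
        held = set(locks_held)
        if var not in self.candidates:
            self.candidates[var] = held
        else:
            self.candidates[var] &= held   # keep only locks common to all accesses
        if not self.candidates[var] and var not in self.warnings:
            self.warnings.append(var)


checker = LocksetChecker()
checker.access("x", {"L"})     # thread 1 accesses x holding L
checker.access("x", {"L"})     # thread 2 accesses x holding L: consistent
checker.access("y", {"L1"})    # thread 1 accesses y holding L1
checker.access("y", {"L2"})    # thread 2 accesses y holding L2: no common lock
print(checker.warnings)        # ['y']
```

The warning on y is a false positive whenever the two accesses are ordered by some other synchronization (for example a thread join), which is the imprecision noted above.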

Vector Clock & Race Detection

[Figure: vector clocks of Thread 1 and Thread 2 around Release L and Acquire L. The clock (2, 0) is compared with the clock (1, 5).]

(2, 0) does not happen before (1, 5)
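The comparison on this slide can be sketched as a component-wise check. This is a minimal illustration of the standard vector clock ordering, with hypothetical function names; it is not the detector's actual code.

```python
def happens_before(a, b):
    """True if clock a is ordered before (or equal to) clock b:
    every component of a is <= the matching component of b."""
    return all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """Neither clock is ordered before the other."""
    return not happens_before(a, b) and not happens_before(b, a)


earlier_access = (2, 0)
current_clock = (1, 5)

# The first component fails the check (2 > 1), so the accesses are unordered.
print(happens_before(earlier_access, current_clock))   # False
print(concurrent(earlier_access, current_clock))       # True
```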

Overhead of Vector Clock Based Race Detection

• Compare and update vector clocks
  – Greatly reduced by FastTrack*
    • Reduces comparison cost by replacing vector clocks with epochs, O(n) to O(1)
    • Still suffers 8.5X slowdown

• Monitor read/write operations
  – A dominating cost

• Monitor lock operations
  – Not a major factor due to small number of locks

*Flanagan and Freund. FastTrack: Efficient and Precise Dynamic Race Detection (PLDI '09)
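To make the O(n) to O(1) point concrete, here is a small sketch of an epoch check. An epoch records only the clock value and the thread of the last access, so ordering against a full vector clock needs a single comparison. The `Epoch` type and the 0-based thread indices are assumptions made for this illustration.

```python
from typing import NamedTuple, Sequence


class Epoch(NamedTuple):
    clock: int   # clock value of the thread at the time of the access
    tid: int     # index of the thread that performed the access (assumed 0-based)


def epoch_before(e: Epoch, vc: Sequence[int]) -> bool:
    """O(1) check: the access is ordered before vc if its clock
    does not exceed vc's entry for the accessing thread."""
    return e.clock <= vc[e.tid]


last_write = Epoch(clock=2, tid=0)
print(epoch_before(last_write, (1, 5)))          # False: the write is unordered
print(epoch_before(Epoch(clock=1, tid=0), (1, 5)))   # True
```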

Reducing R/W Monitors

• Race detection in deployed software?
  – Pacer*: “get what you pay for”
    • Based on FastTrack, precise
    • Uses random sampling to reduce monitoring effort
      – 3% sampling rate yields 86% overhead
      – 100% sampling rate yields 13.8x slowdown
      – Detection rate = sampling rate

*Bond, Coons, and McKinley. PACER: Proportional Detection of Data Races (PLDI '10).

Reducing R/W Monitors

Focus of our Work

Monitor only objects that can race and ignore those that cannot race while maintaining precision

Stationary Analysis

• Final fields
  – Fields written only once during execution

• Stationary fields*
  – Fields for which all writes occur before all reads

• Stationary objects
  – Objects that are read-only after thread escaping

*Unkel and Lam. Automatic Inference of Stationary Fields: A Generalization of Java's Final Fields. (POPL '08)
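The difference between final and stationary fields can be shown on a per-field access trace. The sketch below is only an illustration of the two definitions; the trace encoding ("w" for write, "r" for read) and the function names are assumptions, and a never-written field is treated as final here.

```python
def is_final(ops):
    """Final: the field is written at most once during execution."""
    return ops.count("w") <= 1


def is_stationary(ops):
    """Stationary: no write occurs after the first read."""
    seen_read = False
    for op in ops:
        if op == "r":
            seen_read = True
        elif op == "w" and seen_read:
            return False
    return True


print(is_final(["w", "r", "r"]), is_stationary(["w", "r", "r"]))            # True True
print(is_final(["w", "w", "r", "r"]), is_stationary(["w", "w", "r", "r"]))  # False True
print(is_final(["w", "r", "w", "r"]), is_stationary(["w", "r", "w", "r"]))  # False False
```

The second trace is the interesting case: the field is written twice, so it is not final, yet every write precedes every read, so it is stationary.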

Stationary Analysis

• Object lifecycle

• Only monitor reads/writes to objects in non-stationary state

• Initialization (read/write): thread local, no race
  – On lose: move to Stationary

• Stationary (read): thread shared but no write, no race
  – On write: move to Non-Stationary

• Non-Stationary (read/write): races can occur

Lose: write the address of an object to the heap.
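The lifecycle can be sketched as a three-state machine. This is a minimal model assuming the transitions described on the slide (lose moves an object out of initialization, a later write makes it non-stationary); the class and method names are hypothetical and do not reflect the Jikes RVM implementation.

```python
from enum import Enum


class State(Enum):
    INITIALIZATION = "initialization"    # thread local, reads and writes allowed
    STATIONARY = "stationary"            # shared, only reads seen so far
    NON_STATIONARY = "non-stationary"    # shared and written: races can occur


class TrackedObject:
    def __init__(self):
        self.state = State.INITIALIZATION

    def lose(self):
        """The object's address is written to the heap."""
        if self.state is State.INITIALIZATION:
            self.state = State.STATIONARY

    def write(self):
        """A write after the object was lost ends the stationary state."""
        if self.state is State.STATIONARY:
            self.state = State.NON_STATIONARY

    def is_monitored(self):
        """Only non-stationary objects need read/write monitoring."""
        return self.state is State.NON_STATIONARY


obj = TrackedObject()
obj.write()                      # initializing write: state unchanged
obj.lose()                       # now reachable from the heap
print(obj.state, obj.is_monitored())
obj.write()                      # first write after sharing
print(obj.state, obj.is_monitored())
```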

Potential Savings

Program     All Reads   Non-Stationary   Potential     All Writes   Non-Stationary
            (10^6)      Reads (10^6)     Savings (%)   (10^6)       Writes (10^6)
avrora      25.85       20.37            21.22         2.01         2.00
eclipse     63.41       40.65            35.91         18.13        18.04
hsqldb      12.99       4.17             67.89         1.90         1.87
pseudojbb   4.08        2.06             49.49         1.34         1.33
sunflow     9.76        6.80             30.33         0.270        0.269
xalan       8.86        5.40             38.97         1.47         1.45
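The savings column appears to be the share of reads that go to objects outside the non-stationary state. The sketch below recomputes it from the read counts in the table under that assumption; results differ from the reported values by small amounts, presumably because the counts shown are rounded.

```python
# (all reads, non-stationary reads) in millions, copied from the table
read_counts = {
    "avrora": (25.85, 20.37),
    "eclipse": (63.41, 40.65),
    "hsqldb": (12.99, 4.17),
    "pseudojbb": (4.08, 2.06),
    "sunflow": (9.76, 6.80),
    "xalan": (8.86, 5.40),
}


def potential_savings(all_reads, non_stationary_reads):
    """Percentage of reads that would not need monitoring."""
    return 100.0 * (all_reads - non_stationary_reads) / all_reads


for program, (all_reads, ns_reads) in read_counts.items():
    print(f"{program:10s} {potential_savings(all_reads, ns_reads):6.2f}%")
```

The write counts barely change between the two columns, so under this reading nearly all of the potential savings come from reads.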

Implementation

• Built on Pacer code base, which is on top of Jikes RVM 3.1.0

• Instrument R/W barriers to monitor object state transitions

Main Features

• Enable/disable monitoring at run time on a per object basis

• Efficient dynamic analysis to detect stationary objects

• Optimistically assume all thread shared objects are stationary until a write is observed

Further Optimization

• Insights:
  – Races tend to occur in cold regions of code*
  – Most races occur repeatedly; some of them occur thousands of times in a run

• Reduce monitoring for hot code
  – Ignore object state transitions from stationary to non-stationary in hot code

* Marino, Musuvathi, and Narayanasamy. LiteRace: Effective Sampling for Lightweight Data-race Detection. (PLDI

Two Versions

• SO = Monitoring state transitions in both baseline and optimizing compilers

• SOn = Monitoring state transitions only in the baseline compiler
  – Ignore state transitions in the optimizing compiler

Two Types of Evaluations

• Experiment 1: Set sampling to 100% and compare the overhead of Pacer, SO, and SOn

• Experiment 2: Control the overhead via sampling, then compare the number of unique races detected by SOn and by Pacer at each overhead value

Exp 1: Overhead with 100% Sampling

Normalized against performance of FastTrack_Pacer

Exp 1: Overhead with 100% Sampling

Slowdown Factor (times)

Program     FastTrack_Pacer   SO      SOn
avrora      5.1               4.5     4.2
eclipse     21.0              16.8    8.2
hsqldb      12.2              7.3     3.8
pseudojbb   7.7               5.1     4.3
sunflow     29.2              24.1    13.1
xalan       11.6              7.1     2.6
Average     13.8              10.3    5.9
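One way to read these numbers against the normalized chart is to divide each SO and SOn slowdown by the FastTrack_Pacer slowdown for the same program and then average the ratios. The sketch below does that with the values from the table; whether the authors averaged in exactly this way is an assumption.

```python
# program: (FastTrack_Pacer, SO, SOn) slowdown factors from the table
slowdowns = {
    "avrora": (5.1, 4.5, 4.2),
    "eclipse": (21.0, 16.8, 8.2),
    "hsqldb": (12.2, 7.3, 3.8),
    "pseudojbb": (7.7, 5.1, 4.3),
    "sunflow": (29.2, 24.1, 13.1),
    "xalan": (11.6, 7.1, 2.6),
}


def mean_normalized(column):
    """Average of (slowdown / FastTrack_Pacer slowdown) over all programs.
    column: 1 for SO, 2 for SOn."""
    ratios = [row[column] / row[0] for row in slowdowns.values()]
    return sum(ratios) / len(ratios)


so_ratio = mean_normalized(1)
son_ratio = mean_normalized(2)
print(f"SO:  {so_ratio:.2f} of FastTrack_Pacer")
print(f"SOn: {son_ratio:.2f} of FastTrack_Pacer")
```

Under this reading the SOn average comes out at roughly 0.46, in line with the 45% figure quoted in the evaluation summary.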

Two Types of Evaluations

• Experiment 1: Set sampling to 100% and compare the overhead of Pacer, SO, and SOn

• Experiment 2: Control the overhead via sampling, then compare the number of unique races detected by SOn and by Pacer at each overhead value

Exp 2: Controlling Overhead

Comparing race detection effectiveness (average) between SOn and Pacer

Summary of Evaluation

• Average overhead with 100% sampling is 45% of that of the FastTrack implementation in Pacer

• Detects up to 6 times more races than Pacer under a tight overhead budget (100%)

Shortcomings

• SOS misses races due to
  – Optimistic stationary analysis

[Figure: Thread 1 and Thread 2 timelines, marked as Stationary or Non-stationary. Labels: "Still stationary, no monitor"; "write"; "Now non-stationary"; "missed".]
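The missed race can be reproduced with a toy monitor that records accesses only once an object is non-stationary. This is a sketch under stated assumptions: the object has already escaped to the heap, the write that ends the stationary state is itself recorded, and all names are hypothetical.

```python
class OptimisticMonitor:
    """Records accesses only after the first write to a shared object."""

    def __init__(self):
        self.non_stationary = False
        self.log = []   # accesses the race detector actually sees

    def read(self, thread):
        if self.non_stationary:
            self.log.append((thread, "read"))

    def write(self, thread):
        self.non_stationary = True          # ends the stationary state
        self.log.append((thread, "write"))


shared = OptimisticMonitor()
shared.read("T2")     # still stationary: not monitored, nothing recorded
shared.write("T1")    # now non-stationary: the write is recorded
shared.read("T2")     # monitored from here on
print(shared.log)     # [('T1', 'write'), ('T2', 'read')]
```

The first read by T2 conflicts with the write by T1, but it never reaches the detector, so a race between those two accesses cannot be reported.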

Shortcomings

• When compared to 100% sampling, SOS misses races due to
  – Optimistic stationary analysis
  – Further optimization (SOn)
    • No monitoring of state changes in the optimizing compiler

Shortcomings

Detected Races by FastTrack

Detected Races by SO

Detected Races by

Shortcomings

Comparing the number of races missed by SOn with the number missed by SO, normalized to the number of races detected by FastTrack.

Conclusions

• Dynamic stationary analysis
  – Implemented inside a JVM to support per-object monitoring
  – Reduces the overhead of monitoring R/W operations in vector-clock based race detection
  – Applicable to general race detection approaches

• When combined with sampling, increases detection effectiveness while maintaining low overhead
  – Makes race detection in deployed systems more feasible

Acknowledgment

• Supported by NSF CNS-0720757 and CCF-0912566, NASA NNX08AV20A, AFOSR FA9550-09-1-0129 and FA9550-09-1-0687

• Many thanks to Stephen Blackburn for pseudojbb2005 and to the authors of Pacer for making their implementation available and for insightful discussions
