RacerX: Effective, Static Detection of Race Conditions and Deadlocks
by Dawson Engler & Ken Ashcraft (published in SOSP '03)
Hong, Shin @ PSWLAB
April 20, '23
Contents
• Introduction
• Overview
• Lockset Analysis
• Deadlock Checking
• Datarace Checking
• Conclusion
Introduction 1/2
• Finding data races and deadlocks is difficult.
• There have been many approaches to detecting these errors:
  – Dynamic detection tools (e.g. Eraser)
    • These tools can only find errors on executed paths.
  – Model checking
    • Model checking does not scale (state explosion problem).
  – Static tools
    • Many static tools rely heavily on annotations to inject knowledge into the analysis.
Introduction 2/2
• Approach
  – Require no annotations other than an indication of which functions are used to acquire and release locks.
  – Minimize the impact of false positives (false alarms).
  – Scale to large industrial programs, both in speed and in the ability to report complex errors.
• RacerX is a static tool that uses flow-sensitive, interprocedural analysis to detect both race conditions and deadlocks.
• It aggressively infers checking information (e.g. which locks protect which operations, which code contexts are multithreaded, which shared accesses are dangerous).
• The tool sorts errors from most to least severe.
Overview 1/3
• At a high level, checking a system with RacerX involves five phases:
  (1) Retargeting to the system-specific locking functions
  (2) Extracting a control flow graph from the system
  (3) Analysis
  (4) Ranking errors
  (5) Inspection
Overview 2/3
(1) Retargeting to the system-specific locking functions
  – Users supply a table specifying the functions used to acquire/release locks and to disable/enable interrupts.
  – Users may optionally specify that a function is single-threaded, multi-threaded, or an interrupt handler.
(2) Extracting a control flow graph from the system
  – The tool extracts a CFG from the system and stores it in a file.
  – The CFG contains all function calls, uses of global variables, uses of parameter pointer variables, concurrency operations, and optionally uses of all local variables.
  – The CFG includes symbolic information for these objects, such as their names, types, whether an access is a read or a write, whether a variable is a parameter, whether a function or variable is static, the line number, etc.
Overview 3/3
(3) Analysis
  – The tool reads the emitted CFG, constructs a linked whole-system CFG, and traverses it checking for deadlocks or data races.
  – The traversal is depth-first, flow-sensitive, and interprocedural, and it tracks the set of locks held at any point.
  – At each program statement, the race checker or deadlock checker is passed the current statement, the current lockset, etc.
(4) Ranking errors
  – Compute ranking information for error messages.
  – Ranking sorts error messages based on two features: the likelihood of being a false positive and the difficulty of inspection.
(5) Inspection
  – Present the ranked error messages to users.
Lockset Analysis 1/5
• The tool computes locksets at all program points using a top-down, flow-sensitive, context-sensitive, interprocedural analysis.
  – Top-down: it starts at the root of each call graph and does a DFS traversal down the CFG.
  – Flow-sensitive: it analyzes the effects of each path rather than conflating paths at join points.
  – Context-sensitive: it analyzes the lockset at each actual call site.
• In the DFS traversal over the CFG, the tool
  (1) adds and removes locks as needed, and
  (2) calls the race and deadlock checkers on each statement.
Lockset Analysis 2/5
• Caching
  – Statement cache: the tool caches the locksets that have reached each statement in the CFG.
  – Summary cache: the tool caches the effect of each function by recording, for each lockset l that entered function f, the set of locksets (l1, … , ln) that was produced.
  – Caching works because the analysis is deterministic: two executions that both start from the same statement with the same lockset will always produce the same result.
  – Since the analysis is flow-sensitive, a function could produce an exponential number of locksets. In practice, however, the effect is more modest.
Lockset Analysis 3/5
• Pseudo-code for interprocedural lockset algorithm (1/2)

    void traverse_cfg(set of nodes roots)
        foreach r in roots
            traverse_fn(r, {}) ;
    end

    set of locksets traverse_fn(fn, ls)
        // Check summary cache
        foreach edge x in fn->cache
            if (x->entry_lockset == ls)
                return x->exit_locksets ;
        // Break recursive call
        if (fn->on_stack_p)
            return {} ;
        fn->on_stack_p = 1 ;
        x = new edge ;
        x->entry_lockset = ls ;
        x->exit_locksets = traverse_stmts(fn->entry, ls, ls) ;
        fn->on_stack_p = 0 ;
        // Cache update
        fn->cache = fn->cache union x ;
        return x->exit_locksets ;
    end
Lockset Analysis 4/5
• Pseudo-code for interprocedural lockset algorithm (2/2)

    set of locksets traverse_stmts(s, entry_ls, ls)
        // Check statement cache
        if ((entry_ls, ls) in s->cache) return {} ;
        // Cache update
        s->cache = s->cache union (entry_ls, ls) ;
        if (s is end-of-path) return ls ;
        // Lockset update
        if (s is lock acquire operation) ls = add_lock(ls, s) ;
        if (s is lock release operation) ls = remove_lock(ls, s) ;
        if (s is not resolved call) worklist = {ls} ;
        else worklist = traverse_fn(s->fn, ls) ;
        // DFS traversal
        summ = {} ;
        foreach l in worklist
            foreach k in s->succ
                summ = summ union traverse_stmts(k, entry_ls, l) ;
        return summ ;
    end
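The pseudo-code above can be tried out with a small runnable sketch. This is only an illustration under assumptions, not the RacerX implementation: the `Stmt` and `Fn` classes, the `kind` strings, and the tiny hand-built CFG are hypothetical, and locksets are modeled as `frozenset`s so they can be used as cache keys.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Stmt:                  # hypothetical CFG node, not a RacerX data structure
    kind: str = "plain"      # "plain", "acquire", "release", or "call"
    lock: str = ""           # lock name for acquire/release statements
    callee: object = None    # resolved callee (an Fn) for "call" statements
    succ: list = field(default_factory=list)   # successor statements
    cache: set = field(default_factory=set)    # (entry_ls, ls) pairs seen here

@dataclass(eq=False)
class Fn:                    # hypothetical function record
    entry: Stmt
    cache: dict = field(default_factory=dict)  # entry lockset -> exit locksets
    on_stack: bool = False

def traverse_fn(fn, ls):
    if ls in fn.cache:                # check summary cache
        return fn.cache[ls]
    if fn.on_stack:                   # break recursive call
        return set()
    fn.on_stack = True
    exits = traverse_stmts(fn.entry, ls, ls)
    fn.on_stack = False
    fn.cache[ls] = exits              # summary cache update
    return exits

def traverse_stmts(s, entry_ls, ls):
    if (entry_ls, ls) in s.cache:     # check statement cache
        return set()
    s.cache.add((entry_ls, ls))       # statement cache update
    if not s.succ:                    # end of path
        return {ls}
    if s.kind == "acquire":           # lockset update
        ls = ls | {s.lock}
    elif s.kind == "release":
        ls = ls - {s.lock}
    if s.kind == "call" and s.callee is not None:
        worklist = traverse_fn(s.callee, ls)
    else:
        worklist = {ls}
    summ = set()
    for l in worklist:                # DFS traversal over successors
        for k in s.succ:
            summ |= traverse_stmts(k, entry_ls, l)
    return summ

# CFG for: void foo(int x) { if (x) lock(l); ...; if (x) unlock(l); }
exit_ = Stmt()
rel = Stmt("release", "l", succ=[exit_])
mid = Stmt(succ=[rel, exit_])
acq = Stmt("acquire", "l", succ=[mid])
foo = Fn(Stmt(succ=[acq, mid]))
foo_exits = traverse_fn(foo, frozenset())
```

Here `foo_exits` contains both the empty lockset and `{"l"}`, because the traversal does not correlate the two tests of `x`. A second call with the same entry lockset is answered from the summary cache.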
Lockset Analysis 5/5
• Limitations
  – No alias analysis.
    The tool represents local and parameter pointer variables by their type name rather than their variable name (e.g. a parameter foo that is a pointer to a structure of type bar will be named “local:struct bar”).
  – Only simple function pointer resolution.
    The tool records all functions ever assigned to a function pointer of a given type and, at each call site, assumes that any of those functions could be invoked.
Deadlock Checking 1/9
(1) Computing locking cycles
(2) Ranking
(3) Increasing analysis accuracy
(4) Handling lockset mistakes
(5) Experimental results
Deadlock Checking 2/9
Computing locking cycles
(1) Constraint extraction
    At every lock acquisition, emit the lock ordering constraints produced by that acquisition (e.g. if the current lockset is {l1, l2} and the currently acquired lock is l3, then emit l1 → l3 and l2 → l3).
(2) Constraint solving
    Read in the emitted locking constraints and compute the transitive closure of all dependencies. The solver records the shortest path between any cyclic lock dependencies.
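The two steps can be sketched as follows. This is a minimal illustration, assuming constraints are stored as `(earlier, later)` pairs. The function names are hypothetical, the closure is a naive fixed-point loop, and the shortest-path bookkeeping mentioned on the slide is left out.

```python
def emit_constraints(lockset, acquired):
    """Constraint extraction: every held lock is ordered before the new one."""
    return {(held, acquired) for held in lockset}

def transitive_closure(edges):
    """Constraint solving: naive fixed-point transitive closure."""
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

def cyclic_locks(edges):
    """A lock that (transitively) must precede itself lies on a cycle."""
    return {a for (a, b) in transitive_closure(edges) if a == b}

# One code path holds {l1, l2} and acquires l3; another holds {l3} and acquires l1.
constraints = emit_constraints({"l1", "l2"}, "l3") | emit_constraints({"l3"}, "l1")
on_cycle = cyclic_locks(constraints)
```

In this example `on_cycle` is `{"l1", "l3"}`: the orderings l1 → l3 and l3 → l1 contradict each other, which is what a potential deadlock looks like at the constraint level.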
Deadlock Checking 3/9
Ranking
• Rank error messages based on three criteria:
  (1) The number of threads involved
      - Errors with fewer threads are preferred to ones with many threads.
  (2) Whether the locks involved are local or global
      - Global lock errors are preferred over local ones.
  (3) The depth of the call chain
      - Short call chains are better than longer ones.
• Apply these ranking criteria hierarchically to sort error messages: (1) > (2) > (3)
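One way to picture the hierarchical ordering (1) > (2) > (3) is as a lexicographic sort key. The sketch below is an assumption about how such a ranking could be coded, not RacerX's actual scoring. The `DeadlockReport` class, its fields, and the sample reports are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DeadlockReport:        # hypothetical record for one error message
    name: str
    threads: int             # (1) number of threads involved
    global_locks: bool       # (2) True if the locks involved are global
    call_depth: int          # (3) depth of the call chain

def rank_key(r):
    # Tuples compare element by element, giving (1) > (2) > (3):
    # fewer threads first, then global before local, then shorter call chains.
    return (r.threads, not r.global_locks, r.call_depth)

reports = [
    DeadlockReport("A", threads=3, global_locks=True, call_depth=1),
    DeadlockReport("B", threads=2, global_locks=False, call_depth=2),
    DeadlockReport("C", threads=2, global_locks=True, call_depth=9),
    DeadlockReport("D", threads=2, global_locks=True, call_depth=4),
]
ranked = [r.name for r in sorted(reports, key=rank_key)]
```

Report A has the shortest call chain but still ranks last, because the thread count dominates the other two criteria.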
Deadlock Checking 4/9
[Figure: example error message for a simple deadlock between two global locks]
Deadlock Checking 5/9
Increasing analysis accuracy (1/2)
• There are two significant sources of false lock dependencies:
(1) Semaphores used to enforce scheduling dependencies
  - A semaphore may be used to implement scheduling dependencies (signal-wait).
  - Signal-wait semaphores show two behavior patterns: their operations are almost never paired, and they have more lock operations than unlock operations.
  - Statistical approach:
    (1) Calculate how often true locks satisfy these two behaviors by counting the number of lock acquisitions, lock releases, and unlock errors.
    (2) Discard semaphores below some probability threshold.
Deadlock Checking 6/9
Increasing analysis accuracy (2/2)
(2) “Release-on-block” locks
• Many operating systems such as FreeBSD and Linux use global, coarse-grained locks (e.g. the big kernel lock) that have “release-on-block” semantics.

Apparent lock-order cycle:

    <Thread1>            <Thread2>
    lock_kernel() ;      down(sem) ;
    down(sem) ;          lock_kernel() ;

Actual execution:

    <Thread1>            <Thread2>
    lock_kernel() ;
                         down(sem) ;
    down(sem) ;
                         lock_kernel() ;
    /* No deadlock */

The kernel lock is released whenever down(sem) would block:

    down(sem) {
        …
        while ( down(sem) would block ) {
            unlock_kernel() ;
            schedule() ;
            lock_kernel() ;
        }
        …
    }
Deadlock Checking 7/9
Handling lockset mistakes
• Most deadlock false positives are caused by invalid locksets.
• Almost all invalid locksets arise from data-dependent lock releases or correlated branches.

e.g. void foo(int x) {
         if (x) lock(l) ;
         …
         if (x) unlock(l) ;
     }

• Without path-sensitive analysis, the tool will believe there are four paths through foo, although only two are feasible.
• The tool uses simple, novel propagation techniques to minimize the propagation of invalid locksets.
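To make the four-path claim concrete, the sketch below enumerates every combination of the two branches in foo and records the lockset each one produces. It is an illustration only: the function name and the tuple layout are hypothetical, and the `feasible` flag encodes knowledge (both branches test the same x) that a path-insensitive analysis does not have.

```python
from itertools import product

def foo_paths():
    """Enumerate the four syntactic paths through foo and their locksets."""
    results = []
    for take_lock, take_unlock in product([True, False], repeat=2):
        lockset = set()
        error = None
        if take_lock:                      # first  "if (x) lock(l)"
            lockset.add("l")
        if take_unlock:                    # second "if (x) unlock(l)"
            if "l" in lockset:
                lockset.discard("l")
            else:
                error = "unlock of a lock not held"
        feasible = take_lock == take_unlock   # both branches test the same x
        results.append((take_lock, take_unlock, frozenset(lockset), error, feasible))
    return results

paths = foo_paths()
infeasible = [p for p in paths if not p[4]]
```

The two infeasible paths are the interesting ones: one leaves `l` in the lockset at exit, and the other attempts to unlock a lock that is not held. Both are invalid locksets of the kind this slide describes.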
Deadlock Checking 8/9
• Cutting off lock-error paths
  - Cut off the lockset on paths that contain a locking error.
• Downward-only lockset propagation
  - A significant source of false positives occurs when the tool falsely believes that a lock is held on function exit when it actually is not.
  - Propagate locksets downward from caller to callee but never upward.
  - This causes false negatives for wrapper functions.
• Selecting the right summary
  - Majority summary selection: rather than following all locksets a function call generates, take the one produced by the largest number of exit points within the function.
  - Minimum-size summary selection
• Unlockset analysis
  - At program statement s, remove any lock l from the current lockset if no successor statement s’ reachable from s contains an unlock operation on l.
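The two summary-selection rules can be sketched as below, assuming a function's summary is given as one lockset per exit point. The function names are hypothetical. The slide only names minimum-size selection, so reading it as "take the lockset with the fewest locks" is an assumption.

```python
from collections import Counter

def majority_summary(exit_locksets):
    """Pick the lockset produced by the largest number of exit points."""
    return Counter(exit_locksets).most_common(1)[0][0]

def minimum_size_summary(exit_locksets):
    """Assumed reading of the slide: pick the lockset holding the fewest locks."""
    return min(exit_locksets, key=len)

# One lockset per exit point of some function (illustrative data).
exits = [frozenset({"a"}), frozenset({"a"}), frozenset()]
majority = majority_summary(exits)
smallest = minimum_size_summary(exits)
```

With this data, majority selection continues with `{"a"}` while minimum-size selection continues with the empty lockset. The second choice is the more conservative one for deadlock checking, since fewer held locks generate fewer ordering constraints.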
Deadlock Checking 9/9
Experimental results
[Figure: example deadlock involving scsiLock and handleArrayLock; an acquired lock is released and then reacquired by the same thread]
Data Race Checking 1/6
• The data race checker is called by the lockset analysis on each statement.
• The checker can be run in three modes:
  (1) Simple checking
      - Only flags global accesses that occur without any lock held.
  (2) Simple statistical
      - Infers which non-global variables and functions must be protected by some lock.
  (3) Precise statistical
      - Infers which specific lock protects an access and flags an access when the lockset does not contain that lock.
Data Race Checking 2/6
• The tool uses a set of heuristics to rank data race errors by a scoring function.
• The heuristics answer the following questions:
  - Is the lockset valid?
  - Is the code multithreaded?
  - Does x need to be protected?
  - Does x need to be protected by L?
Data Race Checking 3/6
- Is the code multithreaded?
  There are two ways to determine whether code is multithreaded:
  (1) Multithreading inference
    – Any concurrency operation (e.g. lock acquire/release, atomic operations) implies that the programmer believes the surrounding code is multithreaded.
    – The tool marks a function as multithreaded if concurrency operations occur anywhere within its body or anywhere above it in a call chain.
  (2) Programmer-written automatic annotator
    – Users can mark a function as single-threaded, multithreaded, an interrupt handler, or a function that should be ignored.
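The inference rule in (1) amounts to a reachability computation over the call graph. The sketch below assumes the call graph is a dictionary from caller to callees and that the set of functions containing a concurrency operation is already known. Both inputs and the function name are hypothetical.

```python
def infer_multithreaded(calls, has_concurrency_op):
    """Mark functions that contain a concurrency operation, plus everything
    reachable from them by calls (i.e. with such an operation above them)."""
    marked = set(has_concurrency_op)
    worklist = list(marked)
    while worklist:
        fn = worklist.pop()
        for callee in calls.get(fn, []):
            if callee not in marked:
                marked.add(callee)
                worklist.append(callee)
    return marked

# main calls a, a calls b; x calls y. Only a contains a lock operation.
calls = {"main": ["a"], "a": ["b"], "x": ["y"]}
multithreaded = infer_multithreaded(calls, {"a"})
```

Here `a` is marked because of its own lock operation and `b` because `a` is above it in a call chain. `main`, `x`, and `y` stay unmarked.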
Data Race Checking 4/6
- Does x need to be protected?
• There are three approaches to answering this question:
  (1) Eliminating accesses unlikely to be dangerous
      - Avoid flagging data races on variables that are private to a thread.
      - Demote errors where data appears to be written only during initialization and only read afterwards.
  (2) Promoting accesses that have a good chance of being unsafe
      - Favor errors that write data over errors that read data.
      - Flag unprotected variables that cannot be read or written atomically (e.g. 64-bit variables on a 32-bit machine).
  (3) Inferring which variables programmers believe must not be accessed without a lock
      - Count how many times each variable is accessed with a lock held versus without.
      - Variables the programmer believes should be protected will have a relatively high number of locked accesses and few unlocked accesses.
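The counting in approach (3) can be sketched as follows. The input format (a list of `(variable, lockset)` pairs), the function name, and the sample data are assumptions made for illustration. How RacerX turns these counts into a score is not shown on the slide and is not modeled here.

```python
from collections import defaultdict

def count_locked_accesses(accesses):
    """Return {variable: (locked_count, unlocked_count)}."""
    counts = defaultdict(lambda: [0, 0])
    for var, lockset in accesses:
        counts[var][0 if lockset else 1] += 1   # index 0: some lock held
    return {var: (locked, unlocked) for var, (locked, unlocked) in counts.items()}

# Illustrative accesses: counter is usually locked, debug_flag never is.
accesses = [
    ("counter", {"L"}), ("counter", {"L"}), ("counter", {"L"}), ("counter", set()),
    ("debug_flag", set()), ("debug_flag", set()),
]
counts = count_locked_accesses(accesses)
```

With three locked accesses against one unlocked access, `counter` looks like a variable the programmer believes must be protected, so its single unlocked access is the suspicious one. `debug_flag` is never accessed under a lock and gives no such evidence.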
Data Race Checking 5/6
- Does x need to be protected by L?
• The tool infers whether a given lock protects a variable (or a function) using a statistical approach.
• For each variable (or function), it counts
  (1) the number of accesses to the variable (function), and
  (2) the number of times these accesses held a specific lock.
• It then picks the single best lock out of all the candidates and does an interprocedural check with this information.
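A minimal sketch of picking the best candidate lock, assuming each access to a variable is recorded as the lockset held at that point. "Best" is simplified here to "held on the most accesses", which is a stand-in for the statistical ranking the slide refers to. All names and data are hypothetical.

```python
from collections import Counter

def best_lock(locksets):
    """Return the lock held on the largest number of accesses, or None."""
    held = Counter(lock for ls in locksets for lock in ls)
    if not held:
        return None
    return held.most_common(1)[0][0]

def unprotected_accesses(locksets, lock):
    """Indices of accesses whose lockset does not contain the chosen lock."""
    return [i for i, ls in enumerate(locksets) if lock not in ls]

# Locksets held at six accesses to one variable (illustrative data).
locksets = [{"a", "b"}, {"a"}, {"a", "b"}, {"b", "c"}, {"a"}, set()]
lock = best_lock(locksets)
flagged = unprotected_accesses(locksets, lock)
```

Lock `a` is held on four of the six accesses, so it is chosen, and the two accesses made without it (indices 3 and 5) are the ones a precise statistical check would flag.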
Data Race Checking 6/6
Experimental results
[Figure: example data race error]
Conclusion
• RacerX is a static tool that uses flow-sensitive, interprocedural analysis to detect both data races and deadlocks.
• RacerX found errors in large systems such as FreeBSD and Linux.
Further Work
• Chord, by Mayur Naik and Alex Aiken, POPL '07
  A static race detection system for Java; a flow-insensitive, context-sensitive static analysis tool.
Reference
[1] RacerX: Effective, Static Detection of Race Conditions and Deadlocks, Dawson Engler & Ken Ashcraft, SOSP '03