A Novel Test Coverage Metric for Concurrently-Accessed Software Components


Page 1: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

PLDI 2005, June 12-15, Chicago, U.S.

Serdar Tasiran, Tayfun Elmas,
Guven Bolukbasi, M. Erkan Keremoglu

Koç University, Istanbul, Turkey

A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Page 2: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Our Focus

• Widely-used software systems are built on concurrently-accessed software components
  – File systems, databases, internet services
  – Standard Java and C# class libraries

• Intricate synchronization mechanisms to improve performance
  – Prone to concurrency errors

• Concurrency errors
  – Data loss/corruption
  – Difficult to detect, reproduce through testing

Page 3: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

The Location Pairs Metric

Goal of metric: To help answer the question
  – "If I am worried about concurrency errors only, what unexamined scenario should I try to trigger?"

• Coverage metrics: Link between validation tools
  – Communicate partial results, testing goals between tools
  – Direct tools toward unexplored, distinct new executions

• The "location pairs" (LP) metric
  – Directed at concurrency errors ONLY

• Focus: "High-level" data races
  – Atomicity violations
  – Refinement violations
  – All variables may be lock-protected, but operations not implemented atomically

Page 4: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Outline

• Runtime Refinement Checking
• Examples of Refinement/Atomicity Violations
• The "Location Pairs" Metric
• Discussion, Ongoing Work

Page 5: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Refinement as Correctness Criterion

[Figure: four client threads concurrently invoke operations on the component implementation: Thread 1: LookUp(3), Thread 2: Insert(3), Thread 3: Insert(4), Thread 4: Delete(3). The interleaved execution contains actions such as Call Insert(3), A[0].elt=3, Unlock A[0], Return "success"; Call LookUp(3), read A[0], Return "true"; Call Insert(4), A[1].elt=4, Unlock A[1], Return "success"; Call Delete(3), A[0].elt=null, Unlock A[0], Return "success".]

• Client threads invoke operations concurrently

• Data structure operations should appear to be executed
  – atomically
  – in a linear order
  to client threads.

Page 6: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Runtime Refinement Checking

• Refinement
  – For each execution of Impl
    • there exists an "equivalent", atomic execution of data structure Spec

• Spec: "Atomized" version of Impl
  – Client methods run one at a time
  – Obtained from Impl itself

• Use refinement as correctness criterion
  – More thorough than assertions
  – More observability than pure testing

• Runtime verification: Check refinement using execution traces
  – Can handle industrial-scale programs
  – Intermediate between testing & exhaustive verification
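The trace-based check can be sketched as follows. This is an illustrative toy, not VYRD's actual API: the logged operations of the implementation trace are replayed, in witness order, against an atomized Spec (here a plain multiset), and each observed return value is compared with the one the Spec produces. All class and method names here are our own.

```java
import java.util.*;

// Sketch, not VYRD's implementation: replay a logged witness ordering of
// multiset operations against an atomic Spec and check observed results.
public class RefinementCheck {
    // One logged operation: name, argument, and the result observed in Impl.
    record LoggedOp(String name, int arg, Object observed) {}

    // Atomized Spec: a plain multiset (value -> multiplicity),
    // methods run one at a time.
    static Object run(Map<Integer, Integer> spec, String name, int arg) {
        switch (name) {
            case "Insert": spec.merge(arg, 1, Integer::sum); return "success";
            case "Delete": spec.computeIfPresent(arg, (k, c) -> c == 1 ? null : c - 1); return "success";
            case "LookUp": return spec.containsKey(arg);
            default: throw new IllegalArgumentException(name);
        }
    }

    // Impl refines Spec (for this trace) if every observed result matches
    // the result the atomic Spec produces in witness order.
    static boolean check(List<LoggedOp> witnessOrder) {
        Map<Integer, Integer> spec = new HashMap<>();
        for (LoggedOp op : witnessOrder)
            if (!run(spec, op.name(), op.arg()).equals(op.observed())) return false;
        return true;
    }
}
```

The real tool compares data structure contents via state snapshots as well, not only return values; this sketch shows only the I/O side of the check.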

Page 7: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

The VYRD Tool

[Figure: Vyrd architecture. A multi-threaded test drives Impl; every action is written to a log, yielding traceImpl. The Replay Mechanism reads the log and executes the logged actions on Impl_replay, while Spec runs the same methods atomically, yielding traceSpec. The Refinement Checker compares the two traces.]

• At certain points for each method, take "state snapshots"
• Check consistency of data structure contents

Page 8: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

The Vyrd Experience

• Scalable method: Caught previously undetected, serious but subtle bugs in industrial-scale designs
  – Boxwood (30K LOC)
  – Scan Filesystem (Windows NT)
  – Java libraries with known bugs

• Reasonable runtime overhead

• Key novelty: Checking refinement improves observability
  – Catches bugs that are triggered but not observed by testing
  – Significant improvement

Page 9: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Experience: The Boxwood Project

[Figure: Boxwood architecture. The BLinkTree module: a Root Pointer Node at the root level, Internal Pointer Nodes at levels n+1 and n, Leaf Pointer Nodes at level 0, and Data Nodes below. The Chunk Manager module: Global Disk Allocator and Replicated Disk Manager, serving Reads and Writes. The Cache module: dirty and clean cache entries.]

Page 10: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Refinement vs. Testing: Improved Observability

• Using Vyrd, caught previously undetected bug in
  – Boxwood Cache
  – Scan File System (Windows NT)

• Bug manifestation:
  – Cache entry is correct, marked "clean"
  – Permanent storage has corrupted data

• Hard to catch through testing
  – As long as "Read"s hit in Cache, return value is correct
  – Caught through testing only if
    1. Cache fills, clean entry in Cache is evicted
       • Not written again to permanent storage since entry is marked "clean"
    2. Entry read from permanent storage after eviction
       – With no "Write"s to the entry in the meantime

Page 11: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Outline

• Runtime Refinement Checking
• Examples of Refinement/Atomicity Violations
• The "Location Pairs" Metric
• Discussion, Ongoing Work

Page 12: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Idea behind the LP metric

• Observation: The bug occurs whenever
  1. Method1 executes up to line X, context switch occurs
  2. Method2 starts execution from line Y
  3. Provided there is a data dependency between
     • Method1's code "right before" line X: BlockX
     • Method2's code "right after" line Y: BlockY

• Description of bug in the log follows the pattern above

• Only requirement on program state, other threads, etc.:
  – Make the interleaving above possible
  – May require many other threads, complicated program state, ...

• A "one-bit" data abstraction captures the error scenario
  – depdt: Is there a data dependency between BlockX and BlockY?
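The depdt bit can be approximated from the read and write sets of the two blocks. A minimal sketch (our own helper, not the paper's implementation; shared locations are represented here as strings):

```java
import java.util.*;

// Sketch: the one-bit "depdt" abstraction. BlockX and BlockY are dependent
// iff they conflict on some shared location (write/write or read/write).
public class Depdt {
    static boolean intersects(Set<String> a, Set<String> b) {
        for (String s : a) if (b.contains(s)) return true;
        return false;
    }

    static boolean depdt(Set<String> readX, Set<String> writeX,
                         Set<String> readY, Set<String> writeY) {
        return intersects(writeX, writeY)   // write/write conflict
            || intersects(writeX, readY)    // X writes what Y reads
            || intersects(readX, writeY);   // Y writes what X reads
    }
}
```

For example, a BlockX that writes `count` and a BlockY that reads `count` are dependent, so the corresponding location pair is a coverage target.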

Page 13: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

public synchronized StringBuffer append(StringBuffer sb) {
    int len = sb.length();
    int newCount = count + len;
    if (newCount > value.length)
        ensureCapacity(newCount);
    sb.getChars(0, len, value, count);
    count = newCount;
    return this;
}

public synchronized void setLength(int newLength) {
    ...
    if (count < newLength) {
        ...
    } else {
        count = newLength;
        ...
    }
}
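The violation can be replayed deterministically on a simplified model of the two methods. MiniBuffer below is our own toy, not java.lang.StringBuffer; a hook between the read of sb's length and the character copy stands in for the context switch. append caches sb's length, then copies that many characters even though sb was truncated in between.

```java
// Toy model (not java.lang.StringBuffer) of the append/setLength
// atomicity violation: append synchronizes on the destination, not on sb,
// so sb can change between reading its length and copying its characters.
public class MiniBuffer {
    char[] value = new char[16];
    int count = 0;

    MiniBuffer(String s) {
        for (char c : s.toCharArray()) value[count++] = c;
    }

    void setLength(int n) { count = n; }                     // simplified

    void append(MiniBuffer sb, Runnable contextSwitch) {
        int len = sb.count;                                  // line 1
        int newCount = count + len;                          // line 2
        contextSwitch.run();                                 // adversarial scheduler
        System.arraycopy(sb.value, 0, value, count, len);    // line 5: stale len!
        count = newCount;                                    // line 6
    }

    @Override public String toString() { return new String(value, 0, count); }
}
```

In the real class the observable failure is stale characters appended from a truncated source, or an exception when the source shrinks below the cached length.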

Page 14: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Experience: Concurrency Bug in Cache

[Figure: successive snapshots of the Cache and Chunk Manager contents for one handle. Write(handle, AB) starts, overwriting the cache entry byte by byte; Flush() starts and ends while the copy is in progress, writing the half-updated entry to the Chunk Manager; Write(handle, AB) ends. Result: different byte-arrays for the same handle, i.e. corrupted data in persistent storage.]

Page 15: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

private static void CpToCache(byte[] buf, CacheEntry te, int lsn, Handle h) {
    for (int i = 0; i < buf.length; i++) {
        te.data[i] = buf[i];
    }
    te.lsn = lsn;
    ...
}

public static void Flush(int lsn) {
    ...
    lock (clean) {
        BoxMain.alloc.Write(h, te.data, te.data.length, 0, 0, WRITE_TYPE_RAW);
    }
    ...
}
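The race can be replayed deterministically in miniature (our own toy names, not Boxwood's code): the flush fires while the copy loop is halfway done, so a half-updated entry reaches "disk".

```java
// Toy replay of the Cache bug (illustrative names, not Boxwood's code):
// flushing te.data while CpToCache is mid-loop writes a torn entry to disk.
public class CacheRace {
    static byte[] disk = new byte[2];

    static void cpToCache(byte[] buf, byte[] teData, Runnable flushAt) {
        for (int i = 0; i < buf.length; i++) {
            if (i == 1) flushAt.run();      // "context switch" after first byte
            teData[i] = buf[i];
        }
    }

    static void flush(byte[] teData) {      // stands in for the raw disk write
        disk = teData.clone();
    }

    public static void main(String[] args) {
        byte[] teData = {'X', 'Y'};         // old cache entry
        cpToCache(new byte[]{'A', 'B'}, teData, () -> flush(teData));
        // Cache now holds "AB", but disk holds the torn entry "AY".
    }
}
```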

Page 16: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Outline

• Runtime Refinement Checking
• Examples of Refinement/Atomicity Violations
• The "Location Pairs" Metric
• Discussion, Ongoing Work

Page 17: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

[Action decomposition of an execution of append, one atomic action per line:]

acquire(this)
invoke sb.length()
L1: int len = sb.length()
L2: int newCount = count + len
if (newCount > value.length)
ensureCapacity(newCount)
invoke sb.getChars()
sb.getChars(0, len, value, count)
count = newCount
return this

public synchronized StringBuffer append(StringBuffer sb) {
1    int len = sb.length();
2    int newCount = count + len;
3    if (newCount > value.length)
4        ensureCapacity(newCount);
5    sb.getChars(0, len, value, count);
6    count = newCount;
7    return this;
8 }

Page 18: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Coverage FSM State

State: (LX, pend1, LY, pend2, depdt)

– LX: Location in the CFG of Method 1
– pend1: Is an "interesting" action in Method 1 expected next?
– LY: Location in the CFG of Method 2
– pend2: Is an "interesting" action in Method 2 expected next?
– depdt: Do the actions following LX and LY have a data dependency?

Page 19: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Coverage FSM

[Figure: a fragment of a coverage FSM over states
(L1, !pend1, L3, !pend2, !depdt), (L1, pend1, L3, !pend2, !depdt),
(L1, !pend1, L3, !pend2, depdt), (L2, !pend1, L3, pend2, !depdt),
with transitions labeled by the thread actions t1: L1→L2 and t2: L3→L4.]

Page 20: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Coverage Goal

• The "pend1" bit gets set when
  – The depdt bit is TRUE
  – Method2 takes an action
  – Intuition: Method1's dependent action must follow

• Must cover all (reachable) transitions of the form
  – p = (LXp, TRUE, LY, pend2p, depdtp) → q = (LXq, pend1q, LY, pend2q, depdtq)
  – p = (LX, pend1p, LYp, TRUE, depdtp) → q = (LX, pend1q, LYq, pend2q, depdtq)

• Separate coverage FSM for each method pair: FSM(Method1, Method2)
  – Cover required transitions in each FSM
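A coverage-measurement tool might record covered transitions for one method pair roughly as follows. This is our own sketch; the state tuple mirrors the slides, but the class and method names are assumptions.

```java
import java.util.*;

// Sketch of transition-coverage bookkeeping for one FSM(Method1, Method2).
public class LPCoverage {
    // Coverage FSM state: (LX, pend1, LY, pend2, depdt).
    record LPState(int lx, boolean pend1, int ly, boolean pend2, boolean depdt) {}

    private final Set<String> covered = new HashSet<>();

    // Record a transition p -> q; it counts toward the coverage goal only
    // if a dependent action is pending in one of the methods at p.
    void onTransition(LPState p, LPState q) {
        if (p.pend1() || p.pend2()) covered.add(p + " -> " + q);
    }

    int coveredTransitions() { return covered.size(); }
}
```

A full tool would also enumerate the (approximated) set of reachable required transitions, so that the uncovered ones can be reported to the programmer.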

Page 21: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Important Details

• Action: Atomically executed code fragment
  – Defined by the language

• Method calls:
  – Call action: Method call, all lock acquisitions
  – Return action: Total net effect of method, atomically executed + lock releases

• Separate coverage FSM for each method pair: FSM(Method1, Method2)
  – Cover required transitions in each FSM

• But what if there is interesting concurrency inside a called method?
  – Considered separately when that method is one of the method pair
  – If Method1 calls Method3:
    • Considered when FSM(Method3, Method2) is covered

Page 22: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Outline

• Runtime Refinement Checking
• Examples of Refinement/Atomicity Violations
• The "Location Pairs" Metric
• Discussion, Ongoing Work

Page 23: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Empirical evidence

• Does this metric correspond well with high-level concurrency errors?

• Errors captured by metric
  – 100% metric coverage ⇒ bug guaranteed to be triggered
  – Triggered vs. detected:
    • May need view-refinement checking to improve observability

• Preliminary study
  – Bugs in Java class libraries
  – Bug found in Boxwood cache
  – Bug found in Scan file system
  – Bug categories reported in
    E. Farchi, Y. Nir, S. Ur, "Concurrent Bug Patterns and How to Test Them," 17th Intl. Parallel and Distributed Processing Symposium (IPDPS '03)

• How many are covered by random testing? How does coverage change over time?
  – Don't know yet. Implementing coverage measurement tool.

Page 24: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Reducing the Coverage FSM

• Method-local actions:
  – A basic block consisting of method-local actions is considered a single atomic action

• Pure blocks [Flanagan & Qadeer, ISSTA '04]
  – A "pure" execution of a pure block does not affect global state
    • Example: Acquire lock, read global variable, decide resource not free, release lock
  – Considered a "no-op"
  – Modeled by a "bypass transition" in the coverage FSM
    • Does not need to be covered
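A pure execution in the sense above might look like this in Java (our own minimal example): when the resource is busy, the block returns having changed nothing globally, so the coverage FSM can treat that path as a bypass transition.

```java
// Minimal example of a "pure" block: the failed-acquire path reads global
// state under the lock but leaves it unchanged -- observably a no-op.
public class Resource {
    private boolean free = true;

    synchronized boolean tryAcquire() {
        if (!free) {
            return false;   // pure execution: no global state modified
        }
        free = false;       // impure path: global state changes
        return true;
    }

    synchronized void release() { free = true; }
}
```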

Page 25: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Discussion

• The metric is NOT for deciding when to stop testing/verification

• Intended use:
  – Testing, runtime verification is applied to program
  – List of non-covered coverage targets provided to programmer

• Intuition: Given an unexercised scenario, the programmer must have a simple reason to believe that
  – the scenario is not possible, or
  – the scenario is safe

• Given an uncovered coverage target, the programmer
  – either provides hints to the coverage tool to rule the target out
  – or assumes that the coverage target is a possibility, and
    • writes a test to trigger it
    • or makes sure that no concurrency error would result if the coverage target were to be exercised

Page 26: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Future Work: Approximating the Reachable LP Set

• # of locations per method in Boxwood: ~10, after factoring out atomic and pure blocks

• LP reachability undecidable
  – Metric only intended as an aid to the programmer
    • What have I tested?
    • What should I try to test?
    • Make sure an LP does not lead to an error if it looks like it can be exercised.

• Future work: Better approximate the reachable LP set
  – Do conservative reachability analysis of the coverage FSM using predicate abstraction.
  – Programmer can add predicates for better FSM reduction

Page 27: A Novel Test Coverage Metric for Concurrently-Accessed Software Components


Page 28: A Novel Test Coverage Metric for Concurrently-Accessed Software Components


Multiset

• Multiset data structure
  M = { 2, 3, 3, 3, 9, 8, 8, 5 }

• Has highly concurrent implementations of
  – Insert
  – Delete
  – InsertPair
  – LookUp

Implementation: LookUp

LookUp(x)
  for i = 1 to n
    acquire(A[i])
    if (A[i].content == x && A[i].valid)
      release(A[i])
      return true
    else
      release(A[i])
  return false

[Figure: array A of slots, each with a content field and a valid bit; the contents include 2, 3, 3, null, 3, 5, 8, 6, 8, 9.]
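The LookUp pseudocode above translates to Java roughly as follows (a sketch with our own field names, not the paper's implementation): one lock per slot, so at most a single A[i] is held while scanning.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the fine-grained Multiset LookUp: one lock per slot.
public class Multiset {
    final int n;
    final int[] content;
    final boolean[] valid;
    final ReentrantLock[] locks;

    Multiset(int n) {
        this.n = n;
        content = new int[n];
        valid = new boolean[n];
        locks = new ReentrantLock[n];
        for (int i = 0; i < n; i++) locks[i] = new ReentrantLock();
    }

    boolean lookUp(int x) {
        for (int i = 0; i < n; i++) {
            locks[i].lock();                         // acquire(A[i])
            try {
                if (content[i] == x && valid[i]) return true;
            } finally {
                locks[i].unlock();                   // release(A[i])
            }
        }
        return false;
    }
}
```

Because every variable access is lock-protected, this code has no low-level data race, yet whole operations are not atomic: exactly the "high-level" races the LP metric targets.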

Page 29: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Multiset: Testing

[Figure: the interleaved execution. Call Insert(3), A[0].elt=3, Unlock A[0], Return "success"; Call LookUp(3), read A[0], Return "true"; Call Insert(4), A[1].elt=4, Unlock A[1], Return "success"; Call Delete(3), A[0].elt=null, Unlock A[0], Return "success".]

• Don't know which happened first
  – Insert(3) or Delete(3)?

• Should 3 be in the multiset at the end?
  – Must accept both possibilities as correct

• Common practice:
  – Run long multi-threaded test
  – Perform sanity checks on final state

Page 30: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

Multiset: I/O Refinement

[Figure: the implementation trace annotated with a commit point for each method: Commit Insert(3) at A[0].elt=3, Commit LookUp(3) at read A[0], Commit Insert(4) at A[1].elt=4, Commit Delete(3) at A[0].elt=null. The witness ordering of commit points induces the atomic Spec trace:

  M = Ø
  Call Insert(3),  M = M ∪ {3},  Return "success"  →  M = {3}
  Call LookUp(3),  Check 3 ∈ M,  Return "true"     →  M = {3}
  Call Insert(4),  M = M ∪ {4},  Return "success"  →  M = {3, 4}
  Call Delete(3),  M = M \ {3},  Return "success"  →  M = {4}  ]

Page 31: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

View-refinement

• State correspondence
  – Hypothetical "view" variables must match at commit points

• "view" variable:
  – Value of variable is abstract data structure state
  – Updated atomically once by each method

• For A[1..n]
  – Extract content if valid = true

View Variables

[Figure: array A with content and valid fields; extracting the content of entries whose valid bit is set gives viewImpl = {3, 3, 5, 5, 8, 8, 9}.]
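Computing the Multiset's view variable amounts to the following sketch (our own helper name; the view, a multiset of values, is represented as a sorted list):

```java
import java.util.*;

// Sketch: the Multiset's view variable is the multiset of content values of
// entries whose valid bit is set, represented here as a sorted list.
public class MultisetView {
    static List<Integer> view(int[] content, boolean[] valid) {
        List<Integer> v = new ArrayList<>();
        for (int i = 0; i < content.length; i++)
            if (valid[i]) v.add(content[i]);
        Collections.sort(v);
        return v;
    }
}
```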

Page 32: A Novel Test Coverage Metric for Concurrently-Accessed Software Components

View-refinement

[Figure: the implementation trace with viewImpl updated atomically at each commit point: viewImpl = {3} at Commit Insert(3) (A[0].elt=3), viewImpl = {3, 4} at Commit Insert(4) (A[1].elt=4), viewImpl = {4} at Commit Delete(3) (A[0].elt=null). The witness ordering induces the atomic Spec trace:

  M = Ø
  Call Insert(3),  M = M ∪ {3},  Return "success",  viewSpec = {3}
  Call LookUp(3),  Check 3 ∈ M,  Return "true",     viewSpec = {3}
  Call Insert(4),  M = M ∪ {4},  Return "success",  viewSpec = {3, 4}
  Call Delete(3),  M = M \ {3},  Return "success",  viewSpec = {4}

At each commit point, viewImpl must equal the corresponding viewSpec.]