Finding Errors in .NET with Feedback-Directed Random Testing
Carlos Pacheco (MIT)
Shuvendu Lahiri (Microsoft)
Thomas Ball (Microsoft)
July 22, 2008
Feedback-directed random testing (FDRT)
classes under test + properties to check → feedback-directed random test generator → failing test cases
Example classes under test: java.util.Collections, java.util.ArrayList, java.util.TreeSet, java.util.LinkedList, ...
Reflexivity of equality:
o != null : o.equals(o) == true
public void test() {
  Object o = new Object();
  ArrayList a = new ArrayList();
  a.add(o);
  TreeSet ts = new TreeSet(a);
  Set us = Collections.unmodifiableSet(ts);
  // Fails at runtime.
  assertTrue(us.equals(us));
}
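The reflexivity property can also be written as a small reusable check. The sketch below is only an illustration, not Randoop's implementation: the names `ReflexivityCheck` and `violatesReflexivity` are hypothetical, and treating an exception thrown by `equals` as a violation is an assumption made here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Randoop: checks the property
// "o != null : o.equals(o) == true" for a single object.
final class ReflexivityCheck {
    private ReflexivityCheck() {}

    /** Returns true if o violates reflexivity of equals, or if equals throws. */
    static boolean violatesReflexivity(Object o) {
        if (o == null) {
            return false; // the property only constrains non-null objects
        }
        try {
            return !o.equals(o);
        } catch (RuntimeException e) {
            return true; // assumption: an exception from equals(o) counts as a violation
        }
    }

    public static void main(String[] args) {
        List<Object> list = new ArrayList<>();
        list.add("x");
        System.out.println(violatesReflexivity(list));   // prints false

        // A deliberately broken equals, to show what a violation looks like.
        Object broken = new Object() {
            @Override public boolean equals(Object other) { return false; }
            @Override public int hashCode() { return 0; }
        };
        System.out.println(violatesReflexivity(broken)); // prints true
    }
}
```

A generator can apply a check like this to every object a sequence produces, so the property is stated once rather than per test.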
Technique overview
• Creates method sequences incrementally
• Uses runtime information to guide the generation
• Avoids illegal inputs
• Each generated sequence is executed and classified:
− normal: used to create larger sequences
− error-revealing: output as tests
− exception-throwing: discarded
Reference: Feedback-Directed Random Test Generation. Pacheco, Lahiri, Ball and Ernst. ICSE 2007.
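The loop on this slide can be sketched on a toy example. This is a minimal illustration under stated assumptions, not Randoop's implementation: the class under test (`ToyStack`, with a seeded bug), its two operations, the checked invariant, and every name in the code are hypothetical. Sequences are represented as lists of operation names rather than real method calls.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Toy sketch of a feedback-directed generation loop. All names are
// hypothetical; this is not Randoop's code.
final class FdrtSketch {
    enum Outcome { NORMAL, ERROR_REVEALING, EXCEPTION }

    // Hypothetical class under test with a seeded bug in pop().
    static final class ToyStack {
        private final List<Integer> items = new ArrayList<>();
        private int size = 0;

        void push(int x) { items.add(x); size++; }

        int pop() {
            if (items.isEmpty()) throw new IllegalStateException("empty stack");
            return items.remove(items.size() - 1); // seeded bug: size is not decremented
        }

        // Property to check: the cached size matches the real size.
        boolean invariantHolds() { return size == items.size(); }
    }

    static final String[] OPS = { "push", "pop" };

    // Runs a sequence on a fresh object and classifies the result.
    static Outcome execute(List<String> sequence) {
        ToyStack stack = new ToyStack();
        try {
            for (String op : sequence) {
                if (op.equals("push")) stack.push(1); else stack.pop();
            }
        } catch (IllegalStateException e) {
            return Outcome.EXCEPTION; // illegal input: never extended
        }
        return stack.invariantHolds() ? Outcome.NORMAL : Outcome.ERROR_REVEALING;
    }

    // Returns the error-revealing sequences found in `steps` attempts.
    static List<List<String>> generate(long seed, int steps) {
        Random random = new Random(seed);
        List<List<String>> pool = new ArrayList<>(); // normal sequences, reusable as prefixes
        pool.add(new ArrayList<>());                 // start from the empty sequence
        Set<List<String>> tried = new HashSet<>();
        List<List<String>> failing = new ArrayList<>();
        for (int i = 0; i < steps; i++) {
            // Extend a randomly chosen normal sequence by one random operation.
            List<String> candidate = new ArrayList<>(pool.get(random.nextInt(pool.size())));
            candidate.add(OPS[random.nextInt(OPS.length)]);
            if (!tried.add(candidate)) continue;     // already executed this sequence
            switch (execute(candidate)) {
                case NORMAL: pool.add(candidate); break;             // used to create larger sequences
                case ERROR_REVEALING: failing.add(candidate); break; // output as a test
                default: break;                                      // exception-throwing: discarded
            }
        }
        return failing;
    }

    public static void main(String[] args) {
        List<List<String>> failing = generate(42L, 200);
        System.out.println(failing.size() + " error-revealing sequences, e.g. "
                + (failing.isEmpty() ? "none" : failing.get(0)));
    }
}
```

The feedback is the `switch`: because only normal sequences enter the pool, a sequence that starts with an illegal `pop` on an empty stack is never used as a prefix again.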
Prior experimental evaluation (ICSE 2007)
• Compared with other techniques
− Model checking, symbolic execution, traditional random testing
• On collection classes (lists, sets, maps, etc.)
− FDRT achieved equal or higher code coverage in less time
• On a large benchmark of programs (750KLOC)
− FDRT revealed more errors
Goal of the Case Study
• Evaluate FDRT’s effectiveness in an industrial setting
− Error-revealing effectiveness
− Cost effectiveness
− Usability
• These are important questions to ask about any test generation technique
Case study structure
• Asked engineers from a test team at Microsoft to use FDRT on their code base over a period of 2 months
• We provided
− A tool implementing FDRT
− Technical support for the tool (bug fixes, feature requests)
• We met on a regular basis (approx. every 2 weeks)
− Asked team for experience and results
Randoop
.NET assembly → Randoop (FDRT) → failing C# test cases
• Properties checked:
− sequence does not lead to runtime assertion violation
− sequence does not lead to runtime access violation
− executing process should not crash
Subject program
• Test team responsible for a critical .NET component
− 100KLOC, large API, used by all .NET applications
• Highly stable, heavily tested
− High reliability particularly important for this component
− 200 man years of testing effort (40 testers over 5 years)
− Test engineer finds 20 new errors per year on average
− High bar for any new test generation technique
• Many automatic techniques already applied
Discussion outline
• Results overview
• Error-revealing effectiveness
− Kinds of errors, examples
− Comparison with other techniques
• Cost effectiveness
− Earlier/later stages
Case study results: overview
Human time spent interacting with Randoop: 15 hours
CPU time running Randoop: 150 hours
Total distinct method sequences: 4 million
New errors revealed: 30
Error-revealing effectiveness
• Randoop revealed 30 new errors in 15 hours of human effort (i.e., 1 new error per 30 minutes)
− This time included interacting with Randoop, inspecting the resulting tests, and discarding redundant failures
• A test engineer discovers on average 1 new error per 100 hours of effort.
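The two rates can be checked directly from the figures in the talk. In the worked arithmetic below, the 2,000 working hours per year is an assumption (it is not stated on the slides), used only to show that the figure of 20 new errors per engineer per year from the subject-program slide agrees with 1 error per 100 hours.

```latex
\begin{align*}
\text{Randoop (human effort):}\quad & \frac{15\ \text{hours}}{30\ \text{errors}} = 0.5\ \text{hours per error} \\
\text{Test engineer:}\quad & 100\ \text{hours per error} \\
\text{Ratio:}\quad & \frac{100}{0.5} = 200 \\
\text{Consistency check:}\quad & \frac{2000\ \text{hours/year (assumed)}}{20\ \text{errors/year}} = 100\ \text{hours per error}
\end{align*}
```

The ratio counts human time only; the 150 hours of CPU time reported in the results overview are not included.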
Example error 1: memory management
• Component includes memory-managed and native code
• If native call manipulates references, must inform garbage collector of changes
• Previously untested path in native code reported a new reference to an invalid address
• This error was in code for which existing tests achieved 100% branch coverage
Example error 2: missing resource string
• When exception is raised, component finds message in resource file
• Rarely-used exception was missing message in file
• Attempting lookup led to assertion violation
• Two errors:
− Missing message in resource file
− Error in tool that verified state of resource file
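A check of the kind the verification tool was meant to perform can be sketched as follows. The talk does not describe the component's resource format or the real tool, so everything here is hypothetical: the class `ResourceCheck`, the method `missingMessages`, the key names, and the choice to model the resource file as a map from key to message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical consistency check; the component's real resource format and
// verification tool are not described in the talk.
final class ResourceCheck {
    private ResourceCheck() {}

    /** Returns every required key that has no non-empty message in the table. */
    static List<String> missingMessages(List<String> requiredKeys, Map<String, String> messages) {
        List<String> missing = new ArrayList<>();
        for (String key : requiredKeys) {
            String message = messages.get(key);
            if (message == null || message.isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> messages = new HashMap<>();
        messages.put("ArgumentNull", "Value cannot be null.");
        // "RarelyUsedError" stands in for the rarely used exception with no message.
        List<String> required = Arrays.asList("ArgumentNull", "RarelyUsedError");
        System.out.println(missingMessages(required, messages)); // prints [RarelyUsedError]
    }
}
```

Running such a check over every exception the component can raise would flag the missing entry before any lookup reaches the assertion.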
Errors revealed by expanding Randoop's scope
• Test team also used Randoop’s tests as input to other tools
• This expanded the scope of the exploration and the types of errors revealed beyond those that Randoop could find on its own
− For example, the team discovered concurrency errors this way
Traditional random testing
• Randoop found errors not caught by fuzz testing
• Fuzz testing’s domain is files, streams, protocols
• Randoop’s domain is method sequences
• Think of Randoop as a smart fuzzer for APIs
Symbolic execution
• Concurrently with Randoop, test team used a method sequence generator based on symbolic execution
− Conceptually more powerful than FDRT
• Symbolic tool found no errors over the same period of time, on the same subject program
• Symbolic approach achieved higher coverage on classes that
− Can be tested in isolation
− Do not go beyond managed code realm
The Plateau Effect
• Randoop was cost effective during the span of the study
• After this initial period of effectiveness, Randoop ceased to reveal errors
• After the study, test team made a parallel run of Randoop
− Dozens of machines, hundreds of machine hours
− Each machine with a different random seed
− Found fewer errors than in its first 2 hours of use on a single machine
Overcoming the plateau
• Reasons for the plateau
− Spends majority of time on a subset of classes
− Cannot cover some branches
• Work remains to be done on new random strategies
• Hybrid techniques show promise
− Random/symbolic
− Random/enumerative
Conclusion
• Feedback-directed random testing
− Effective in an industrial setting
• Randoop used internally at Microsoft
− Added to list of recommended tools for other product groups
− Has revealed dozens more errors in other products
• Random testing techniques are effective in industry
− Find deep and critical errors
− Scalability yields impact