Metamorphic Testing Techniques to Detect Defects in Applications without Test Oracles Christian Murphy Thesis Defense April 12, 2010


Page 1: Metamorphic Testing Techniques to Detect Defects in Applications without Test Oracles

Christian Murphy
Thesis Defense
April 12, 2010

Page 2: Overview

Software testing is important!

Certain types of applications are particularly hard to test because there is no “test oracle”
  Machine Learning, Discrete Event Simulation, Optimization, Scientific Computing, etc.

Even when there is no oracle, it is possible to detect defects if properties of the software are violated

My research introduces and evaluates new techniques for testing such “non-testable programs” [Weyuker, Computer Journal’82]

Page 3: Motivating Example: Machine Learning

Page 4: Motivating Example: Simulation

[Figure: Length of Stay versus Utilization. Axes: number of beds; units of time; percent utilization. Series: LOS, DoctorUtilization, NurseUtilization, TriageUtilization, ClerkUtilization]

Page 5: Problem Statement

Partial oracles may exist for a limited subset of the input domain in applications such as Machine Learning, Discrete Event Simulation, Scientific Computing, Optimization, etc.

Obvious errors (e.g., crashes) can be detected with certain inputs or testing techniques

However, it is difficult to detect subtle computational defects in applications without test oracles in the general case

Page 6: What do I mean by “defect”?

Deviation of the implementation from the specification
Violation of a sound property of the software

“Discrete localized” calculation errors
  Off-by-one
  Incorrect sentinel values for loops
  Wrong comparison or mathematical operator

Misinterpretation of specification
  Parts of input domain not handled
  Incorrect assumptions made about input

Page 7: Observation

Many programs without oracles have properties such that certain changes to the input yield predictable changes to the output

We can detect defects in these programs by looking for any violations of these “metamorphic properties”

This is known as “metamorphic testing” [T.Y. Chen et al., Info. & Soft. Tech vol.4, 2002]

Page 8: Research Goals

Facilitate the way that metamorphic testing is used in practice

Develop new testing techniques based on metamorphic testing

Demonstrate the effectiveness of metamorphic testing techniques

Page 9: Hypotheses

For programs that do not have a test oracle, an automated approach to metamorphic testing is more effective at detecting defects than other approaches

An approach that conducts function-level metamorphic testing in the context of a running application will further increase the effectiveness

It is feasible to continue this type of testing in the deployment environment, with minimal impact on the end user

Page 10: Contributions

1. A set of guidelines to help identify metamorphic properties

2. New empirical studies comparing the effectiveness of metamorphic testing to other approaches

3. An approach for detecting defects in non-deterministic applications called Heuristic Metamorphic Testing

4. A new testing technique called Metamorphic Runtime Checking based on function-level metamorphic properties

5. A generalized technique for testing in the deployment environment called In Vivo Testing

Page 11: Outline

Background
  Related Work
  Metamorphic Testing
Metamorphic Testing Empirical Studies
Metamorphic Runtime Checking
Future Work & Conclusion

Page 12: Other Approaches [Baresi & Young, 2001]

Formal specifications
  A complete specification is essentially a test oracle

Embedded assertions
  Can check that the software behaves as expected

Algebraic properties
  Used to generate test cases for abstract datatypes

Trace checking & Log file analysis
  Analyze intermediate results and sequence of executions

Page 13: Metamorphic Testing [Chen et al., 2002]

Initial test case: x -> f -> f(x)
New test case: t(x) -> f -> f(t(x))
t is a transformation function based on metamorphic properties of f
f(x) and f(t(x)) are “pseudo-oracles”

If new test case output f(t(x)) is as expected, it is not necessarily correct

However, if f(t(x)) is not as expected, either f(x) or f(t(x)), or both, is wrong

Page 14: Metamorphic Testing Example

Consider a function to determine the standard deviation of a set of numbers

Initial input: a b c d e f -> std_dev -> s
New test case #1: c e b a f d -> std_dev -> s ?
New test case #2: a+2 b+2 c+2 d+2 e+2 f+2 -> std_dev -> s ?
New test case #3: 2a 2b 2c 2d 2e 2f -> std_dev -> 2s ?
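
The three new test cases for the standard deviation example can be sketched in Java as follows. This is a minimal illustration, not code from the thesis: the class name, the helpers `permute`, `add`, and `multiply`, the tolerance in `approxEqual`, and the choice of the population standard deviation are all assumptions made for the example.

```java
import java.util.Random;

final class StdDevMetamorphicSketch {
    // Population standard deviation; an assumed stand-in for the function under test.
    static double stdDev(double[] a) {
        double mean = 0.0;
        for (double v : a) mean += v;
        mean /= a.length;
        double sumSq = 0.0;
        for (double v : a) sumSq += (v - mean) * (v - mean);
        return Math.sqrt(sumSq / a.length);
    }

    // Test case #1: the same elements in a shuffled order (Fisher-Yates shuffle).
    static double[] permute(double[] a, long seed) {
        double[] out = a.clone();
        Random rng = new Random(seed);
        for (int i = out.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            double tmp = out[i];
            out[i] = out[j];
            out[j] = tmp;
        }
        return out;
    }

    // Test case #2: every element increased by the constant k.
    static double[] add(double[] a, double k) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) out[i] = a[i] + k;
        return out;
    }

    // Test case #3: every element multiplied by the constant k.
    static double[] multiply(double[] a, double k) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) out[i] = a[i] * k;
        return out;
    }

    // Floating-point comparison with a small relative tolerance (assumed value).
    static boolean approxEqual(double x, double y) {
        return Math.abs(x - y) <= 1e-9 * Math.max(1.0, Math.max(Math.abs(x), Math.abs(y)));
    }

    // Checks the three expected relations: s, s, and 2s.
    static boolean allPropertiesHold(double[] input) {
        double s = stdDev(input);
        return approxEqual(stdDev(permute(input, 42L)), s)
            && approxEqual(stdDev(add(input, 2.0)), s)
            && approxEqual(stdDev(multiply(input, 2.0)), 2.0 * s);
    }

    public static void main(String[] args) {
        double[] input = {3.0, 1.5, 4.0, 1.0, 5.5, 9.0};
        System.out.println("properties hold: " + allPropertiesHold(input));
    }
}
```

Note that no expected value of `s` is ever needed: only the relation between the outputs is checked, which is what makes the approach usable without an oracle.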

Page 15: Outline

Background
  Related Work
  Metamorphic Testing
Metamorphic Testing Empirical Studies (current section)
Metamorphic Runtime Checking
Future Work & Conclusion

Page 16: Empirical Study

Is metamorphic testing more effective than other approaches in detecting defects in applications without test oracles?

Approaches investigated
  Metamorphic Testing
    Using metamorphic properties of the entire application
  Runtime Assertion Checking
    Using Daikon-detected program invariants
  Partial Oracle
    Simple inputs for which correct output can easily be determined

Page 17: Applications Investigated

Machine Learning
  C4.5: decision tree classifier
  MartiRank: ranking
  Support Vector Machines (SVM): vector-based classifier
  PAYL: anomaly-based intrusion detection system

Discrete Event Simulation
  JSim: used in simulating hospital ER

Information Retrieval
  Lucene: Apache framework’s text search engine

Optimization
  gaffitter: genetic algorithm approach to bin-packing problem

Page 18: Methodology

Mutation testing was used to seed defects into each application
  Comparison operators were reversed
  Math operators were changed
  Off-by-one errors were introduced

For each program, we created multiple versions, each with exactly one mutation

We ignored mutants that yielded outputs that were obviously wrong, caused crashes, etc.

Effectiveness is determined by measuring what percentage of the mutants were “killed”

Page 19: Experimental Results

[Figure: bar chart of % of Mutants Killed for C4.5, MartiRank, SVM, PAYL, JSim, Lucene, gaffitter, and TOTAL. Series: Partial Oracle, Runtime Assertion Checking, Metamorphic Testing]

Page 20: Analysis of Results

Assertions are good for checking bounds and relationships but not for changes to values

Metamorphic testing particularly good for detecting errors in loop conditions

Metamorphic testing was not very effective for PAYL (5%) and gaffitter (33%)
  fewer properties identified
  defects had little impact on output

Page 21: Outline

Background
  Related Work
  Metamorphic Testing
Metamorphic Testing Empirical Studies
Metamorphic Runtime Checking (current section)
Future Work & Conclusion

Page 22: Metamorphic Runtime Checking

Results of previous study revealed limitations of scope and robustness in metamorphic testing

What if we consider the metamorphic properties of individual functions and check those properties as the entire program is running?

A combination of metamorphic testing and runtime assertion checking

Page 23: Metamorphic Runtime Checking

Tester specifies the metamorphic properties of individual functions using a special notation in the code (based on JML)

Pre-processor instruments code with corresponding metamorphic tests

Tester runs entire program as normal (e.g., to perform system tests)

Violation of any property reveals a defect

Page 24: MRC Model of Execution

Function f is about to be executed with input x in state S
Create a sandbox for the test
Execute f(x) to get result
Send result to test
Program continues

Metamorphic test:
  Transform input to get t(x)
  Execute f(t(x))
  Compare outputs
  Report violations

The metamorphic test is conducted at the same point in the program execution as the original function call

The metamorphic test runs in parallel with the rest of the application

Page 25: Empirical Study

Can Metamorphic Runtime Checking detect defects not found by system-level metamorphic testing?

Same mutants used in previous study
  29% were not found by metamorphic testing

Metamorphic properties identified at function level using suggested guidelines

Page 26: Experimental Results

[Figure: bar chart of % of defects detected for C4.5, MartiRank, SVM, PAYL, JSim, Lucene, gaffitter, and TOTAL. Series: MRC only, Both, MT only]

Page 27: Analysis of Results

Scope: Function-level testing allowed us to:
  identify additional metamorphic properties
  execute more tests

Robustness: Metamorphic testing “inside” the application detected subtle defects that did not have much effect on the overall program output

Page 28: Combined Results

[Figure: bar chart for C4.5, MartiRank, SVM, PAYL, JSim, Lucene, gaffitter, and TOTAL. Series: Partial Oracle, Runtime Assertion Checking, Metamorphic Testing, MT + MRC]

Page 29: Outline

Background
  Related Work
  Metamorphic Testing
Metamorphic Testing Empirical Studies
Metamorphic Runtime Checking
Future Work & Conclusion (current section)

Page 30: Results

Demonstrated that metamorphic testing advances the state of the art in detecting defects in applications without test oracles

Proved that Metamorphic Runtime Checking will reveal defects not found by using system-level properties

Showed that it is feasible to continue this type of testing in the deployment environment, with minimal impact on the end user

Page 31: Short-Term Opportunities

Automatic detection of metamorphic properties
  Using dynamic and/or static techniques

Fault localization
  Once a defect has been detected, figure out where it occurred and how to fix it

Implementation issues
  Reducing overhead
  Handling external databases, network traffic, etc.

Page 32: Long-Term Directions

Testing of multi-process or distributed applications in these domains

Collaborative defect detection and notification

Investigate the impact on the software development processes used in the domains of non-testable programs

Page 33: Contributions & Accomplishments

1. A set of metamorphic testing guidelines [Murphy, Kaiser, Hu, Wu; SEKE’08]

2. New empirical studies [Xie, Ho, Murphy, Kaiser, Xu, Chen; QSIC’09]

3. Heuristic Metamorphic Testing [Murphy, Shen, Kaiser; ISSTA’09]

4. Metamorphic Runtime Checking [Murphy, Shen, Kaiser; ICST’09]

5. In Vivo Testing [Murphy, Kaiser, Vo, Chu; ICST’09] [Murphy, Vaughan, Ilahi, Kaiser; AST’10]

Page 34: Thank you!

Page 35: Backup Slides!

Page 36: Assessment of Quality

1994: Hatton et al. pointed out a “disturbing” number of defects due to calculation errors in scientific computing software [TSE vol.20]

2007: Hatton reports that “many scientific results are corrupted, perhaps fatally so, by undiscovered mistakes in the software used to calculate and present those results” [Computer vol.40]

Page 37: Complexity vs. Effectiveness

[Figure: plot of Effectiveness against Complexity. Points: Embedded Assertions, Algebraic Specifications, Formal Specifications, Trace Checking & Log Analysis, System-level Metamorphic Testing, Metamorphic Runtime Checking]

Page 38: Metamorphic Properties

Page 39: Categories of Metamorphic Properties

Additive: Increase (or decrease) numerical values by a constant
Multiplicative: Multiply numerical values by a constant
Permutative: Randomly permute the order of elements in a set
Invertive: Negate the elements in a set
Inclusive: Add a new element to a set
Exclusive: Remove an element from a set
Compositional: Compose a set

Page 40: Sample Metamorphic Properties

1. Permuting the order of the examples in the training data should not affect the model

2. If all attribute values in the training data are multiplied by a positive constant, the model should stay the same

3. If all attribute values in the training data are increased by a positive constant, the model should stay the same

4. Updating a model with a new example should yield the same model created with training data originally containing that example

5. If all attribute values in the training data are multiplied by -1, and an example to be classified is also multiplied by -1, the classification should be the same

6. Permuting the order of the examples in the testing data should not affect their classification

7. If all attribute values in the training data are multiplied by a positive constant, and an example to be classified is also multiplied by the same positive constant, the classification should be the same

8. If all attribute values in the training data are increased by a positive constant, and an example to be classified is also increased by the same positive constant, the classification should be the same
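
Properties 5, 7, and 8 can be illustrated with a small sketch. The classifier below is a hypothetical 1-nearest-neighbour stand-in written for this example only (it is not one of the applications studied), and the class name, `transform` helpers, and data are assumptions. The check is that transforming the training data and the example in the same way leaves the classification unchanged.

```java
final class NearestNeighborMtSketch {
    // 1-nearest-neighbour classifier using squared Euclidean distance (assumed stand-in).
    static int classify(double[][] train, int[] labels, double[] x) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < train.length; i++) {
            double d = 0.0;
            for (int j = 0; j < x.length; j++) {
                double diff = train[i][j] - x[j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return labels[best];
    }

    // Applies v -> scale * v + shift to every attribute value of one example.
    static double[] transform(double[] row, double scale, double shift) {
        double[] out = new double[row.length];
        for (int j = 0; j < row.length; j++) out[j] = scale * row[j] + shift;
        return out;
    }

    // Applies the same transformation to every example in a data set.
    static double[][] transform(double[][] rows, double scale, double shift) {
        double[][] out = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) out[i] = transform(rows[i], scale, shift);
        return out;
    }

    // True if the classification is unchanged after transforming both inputs.
    static boolean sameClass(double[][] train, int[] labels, double[] x,
                             double scale, double shift) {
        int original = classify(train, labels, x);
        int followUp = classify(transform(train, scale, shift), labels,
                                transform(x, scale, shift));
        return original == followUp;
    }

    public static void main(String[] args) {
        double[][] train = {{1, 1}, {2, 1}, {8, 9}, {9, 8}};
        int[] labels = {0, 0, 1, 1};
        double[] x = {2, 2};
        System.out.println("property 7 (multiply by 3): " + sameClass(train, labels, x, 3.0, 0.0));
        System.out.println("property 8 (add 10):        " + sameClass(train, labels, x, 1.0, 10.0));
        System.out.println("property 5 (multiply by -1): " + sameClass(train, labels, x, -1.0, 0.0));
    }
}
```

Whether a given property is sound depends on the algorithm: for this stand-in, exact distance ties could make the follow-up classification differ for reasons unrelated to a defect, so the sample data avoids ties.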

Page 41: Other Classes of Properties (1)

Statistical
  Same mean, variance, etc. as the original

Heuristic
  Approximately equal to the original

Semantically Equivalent
  Domain specific

Page 42: Other Classes of Properties (2)

Noise Based
  Add/Change data that should not affect result

Partial
  Change to part of input only affects part of output

Compositional
  New input relies on original output
  ShortestPath(a, b) = ShortestPath(a, c) + ShortestPath(c, b), where c lies on a shortest path from a to b
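
A minimal sketch of the compositional ShortestPath property follows. The Floyd-Warshall routine, the class name, and the sample graph are assumptions chosen for illustration. The equality holds only when c lies on a shortest path from a to b; for an arbitrary c the relation weakens to ShortestPath(a, b) <= ShortestPath(a, c) + ShortestPath(c, b), which the sketch checks for every triple.

```java
final class ShortestPathMtSketch {
    // "No edge" marker; half of the int range so that INF + INF does not overflow.
    static final int INF = Integer.MAX_VALUE / 2;

    // Floyd-Warshall all-pairs shortest path lengths (assumed function under test).
    static int[][] allPairs(int[][] weights) {
        int n = weights.length;
        int[][] d = new int[n][];
        for (int i = 0; i < n; i++) d[i] = weights[i].clone();
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (d[i][k] + d[k][j] < d[i][j]) d[i][j] = d[i][k] + d[k][j];
        return d;
    }

    // Weakened property that must hold for every choice of a, b, and c.
    static boolean compositionBoundHolds(int[][] d) {
        int n = d.length;
        for (int a = 0; a < n; a++)
            for (int b = 0; b < n; b++)
                for (int c = 0; c < n; c++)
                    if (d[a][b] > d[a][c] + d[c][b]) return false;
        return true;
    }

    // Small undirected graph: 0-1 (1), 1-2 (2), 0-2 (5), 2-3 (1).
    static int[][] sampleGraph() {
        return new int[][] {
            {0, 1, 5, INF},
            {1, 0, 2, INF},
            {5, 2, 0, 1},
            {INF, INF, 1, 0}
        };
    }

    public static void main(String[] args) {
        int[][] d = allPairs(sampleGraph());
        // Vertex 1 lies on the shortest path from 0 to 2, so equality is expected here.
        System.out.println("equality via vertex 1: " + (d[0][2] == d[0][1] + d[1][2]));
        System.out.println("bound holds for all triples: " + compositionBoundHolds(d));
    }
}
```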

Page 43: Automatic Detection of Properties

Static
  Use machine learning to model what code looks like that exhibits certain properties, then determine whether other code matches that model
  Use symbolic execution to check “algebraically”

Dynamic
  Observe multiple executions and infer properties

Page 44: Automated Metamorphic Testing

Page 45: Automated Metamorphic Testing

Tester specifies the application’s metamorphic properties

Test framework does the rest:
  Transform inputs
  Execute program with each input
  Compare outputs according to specification

Page 46: AMST Model

Page 47: Specifying Metamorphic Properties

Page 48: Heuristic Metamorphic Testing

Page 49: Statistical Metamorphic Testing

Introduced by Guderlei & Mayer in 2007

The application is run multiple times with the same input to get a mean value μ0 and variance σ0

Metamorphic properties are applied

The application is run multiple times with the new input to get a mean value μ1 and variance σ1

If the means are not statistically similar, then the property is considered violated

Page 50: Heuristic Metamorphic Testing

When we expect that a change to the input will produce “similar” results, but cannot determine the expected similarity in advance

Use input X to generate outputs M1 through Mk

Use some metric to create a profile of the outputs

Use input X’ (created according to a metamorphic property) to generate outputs N1 through Nk

Create a profile of those outputs

Use statistical techniques (e.g. Student t-test) to check that the profile of outputs N is similar to that of outputs M
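
The comparison of the two profiles can be sketched as below. This is an assumption-laden illustration rather than the thesis implementation: each output is assumed to be already reduced to one number by the chosen metric, the statistic is Welch's unequal-variance form of the t statistic rather than a full Student t-test with a p-value, the cut-off of 2.0 is an arbitrary illustrative threshold, and `noisyProgram` is a hypothetical non-deterministic program.

```java
import java.util.Random;

final class HeuristicMtSketch {
    static double mean(double[] v) {
        double sum = 0.0;
        for (double x : v) sum += x;
        return sum / v.length;
    }

    // Unbiased sample variance (divides by n - 1); needs at least two values.
    static double sampleVariance(double[] v) {
        double m = mean(v);
        double sumSq = 0.0;
        for (double x : v) sumSq += (x - m) * (x - m);
        return sumSq / (v.length - 1);
    }

    // Welch's t statistic for two samples; undefined if both variances are zero
    // and the means differ (division by zero gives an infinite value).
    static double welchT(double[] m, double[] n) {
        double se = Math.sqrt(sampleVariance(m) / m.length + sampleVariance(n) / n.length);
        if (se == 0.0 && mean(m) == mean(n)) return 0.0;
        return (mean(m) - mean(n)) / se;
    }

    // Profiles are treated as similar when |t| is below the assumed cut-off.
    static boolean profilesSimilar(double[] m, double[] n, double cutoff) {
        return Math.abs(welchT(m, n)) < cutoff;
    }

    // Hypothetical non-deterministic program: the input mean plus Gaussian noise.
    static double noisyProgram(double[] x, Random rng) {
        return mean(x) + rng.nextGaussian();
    }

    // Runs the program k times on the same input and collects the outputs.
    static double[] runK(double[] x, int k, Random rng) {
        double[] out = new double[k];
        for (int i = 0; i < k; i++) out[i] = noisyProgram(x, rng);
        return out;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        double[] x = {4.0, 1.0, 3.0, 2.0};
        double[] xPrime = {2.0, 3.0, 1.0, 4.0}; // permuted input X'
        double[] m = runK(x, 30, rng);
        double[] n = runK(xPrime, 30, rng);
        System.out.println("t = " + welchT(m, n));
        System.out.println("profiles similar: " + profilesSimilar(m, n, 2.0));
    }
}
```

Because the check is statistical, a correct program will occasionally fail it by chance; the cut-off trades false alarms against missed defects.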

Page 51: Heuristic Metamorphic Testing

[Figure: the non-deterministic function nd_f is run n times on input x to give outputs y1…yn, and n times on t(x) to give outputs y’1…y’n. The profile of y1…yn is compared with the profile of y’1…y’n. Do the profiles demonstrate the expected relationship?]

Page 52: HMT Example

[Figure: a list containing the values 1 to 4 and unknown (“?”) entries is sorted several times, giving different orderings. Build a profile P based on normalized equivalence. The list is then permuted and sorted again several times. Build a profile P’ based on normalized equivalence and compare it statistically to the first profile.]

Page 53: HMT Empirical Study

Is Heuristic Metamorphic Testing more effective than other approaches in detecting defects in non-deterministic applications without test oracles?

Approaches investigated
  Heuristic Metamorphic Testing
  Embedded Assertions
  Partial Oracle

Applications investigated
  MartiRank: sorting sparse data sets
  JSim: non-deterministic event timing

Page 54: HMT Study Results & Analysis

Heuristic Metamorphic Testing killed 59 of the 78 mutants

Partial oracle and assertion checking ineffective for JSim because no single execution was outside the specified range

Page 55: Metamorphic Runtime Checking

Page 56: Extensions to JML

Page 57: Creating Test Functions

/*@
  @meta std_dev(\multiply(A, 2)) == \result * 2
 */
public double __std_dev(double[] A) {
    ...
}

protected boolean __MRCtest0_std_dev(double[] A, double result) {
    return Columbus.approximatelyEqualTo(
        __std_dev(Columbus.multiply(A, 2)), result * 2);
}

Page 58: Instrumentation

public double std_dev(double[] A) {
    // call original function and save result
    double result = __std_dev(A);

    // create sandbox
    int pid = Columbus.createSandbox();

    // program continues as normal
    if (pid != 0)
        return result;
    else {
        // run test in child process
        if (!__MRCtest0_std_dev(A, result))
            Columbus.fail(); // handle failure
        Columbus.exit(); // clean up
    }
    // not on the original slide: added so the method compiles,
    // assuming Columbus.exit() ends the sandbox process and this is never reached
    return result;
}

Page 59: MRC: Case Studies

We investigated the WEKA and RapidMiner toolkits for Machine Learning in Java

For WEKA, we tested four apps:
  Naïve Bayes, Support Vector Machines (SVM), C4.5 Decision Tree, and k-Nearest Neighbors

For RapidMiner, we tested one app:
  Naïve Bayes

Page 60: MRC: Case Study Setup

For each of the five apps, we specified 4-6 metamorphic properties of selected methods (based on our knowledge of the expected behavior of the overall application)

Testing was conducted using data sets from UCI Machine Learning Repository

Goal was to determine whether the properties held as expected

Page 61: MRC: Case Study Findings

Discovered defects in WEKA k-NN and WEKA Naïve Bayes related to modifying the machine learning “model”
  This was the result of a variable not being updated appropriately

Discovered a defect in RapidMiner Naïve Bayes related to determining confidence
  There was an error in the calculation

Page 62: Metamorphic Testing Experimental Study

Page 63: Approaches Not Investigated

Formal specification
  Issues related to completeness
  Prev. work converted specifications to invariants

Algebraic properties
  Not appropriate at system-level
  Automatic detection only supported in Java

Log/trace file analysis
  Need more detailed knowledge of implementation

Pseudo-oracles
  None appropriate for applications investigated

Page 64: Methodology: Metamorphic Testing

Each variant (containing one mutation) acted as a pseudo-oracle for itself:
  Program was run to produce an output with the original input dataset
  Metamorphic properties applied to create new input datasets
  Program run on new inputs to create new outputs
  If outputs not as expected, the mutant had been killed (i.e. the defect had been detected)

Page 65: Methodology: Partial Oracle

Data sets were chosen so that the correct output could be calculated by hand

These data sets were typically smaller than the ones used for other approaches

To ensure fairness, the data sets were selected so that the line coverage was approximately the same for each approach

Page 66: Methodology: Runtime Assertion Checking

Daikon was used to detect program invariants in the “gold standard” implementation

Because Daikon can generate spurious invariants, programs were run with a variety of inputs, and obvious spurious invariants were discarded

Invariants then checked at runtime

Page 67: Defects Detected in Study #1

Page 68: Study #1: SVM Results

Permuting the input was very effective at killing off-by-one mutants

Many functions in SVM analyze a set of numbers (mean, standard dev, etc.)

Off-by-one mutants caused some element of the set to be omitted

By permuting, a different number would be omitted
  This revealed the defect
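
The mechanism described above can be shown with a small sketch. The mutant below is a hypothetical example written for illustration (it is not taken from the SVM code): an off-by-one loop bound skips the last element, so permuting the input changes which element is skipped and therefore changes the output.

```java
final class OffByOneSketch {
    // Hypothetical mutant of a mean function: the loop bound is off by one,
    // so the last element of the array is never added to the sum.
    static double mutantMean(double[] a) {
        double sum = 0.0;
        for (int i = 0; i < a.length - 1; i++) sum += a[i];
        return sum / a.length;
    }

    // Permutation used as the transformation: rotate the array left by one.
    static double[] rotateLeft(double[] a) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) out[i] = a[(i + 1) % a.length];
        return out;
    }

    // Metamorphic property: the mean should not depend on element order.
    static boolean permutationPropertyHolds(double[] a) {
        return Math.abs(mutantMean(a) - mutantMean(rotateLeft(a))) < 1e-9;
    }

    public static void main(String[] args) {
        double[] input = {1.0, 2.0, 3.0, 4.0};
        System.out.println("original order: " + mutantMean(input));
        System.out.println("permuted order: " + mutantMean(rotateLeft(input)));
        System.out.println("property holds: " + permutationPropertyHolds(input));
    }
}
```

The defect goes unnoticed only when the omitted elements happen to be equal, for example when all values in the set are the same.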

Page 69: Study #1: SVM Example

Permuting the input reveals this defect because both m_I1 and m_I4 will be different

Partial oracle does not because only one element is omitted, so one will remain same; for small data sets, this did not affect the overall result

Page 70: Study #1: C4.5 Results

Negating the input was very effective

C4.5 creates a decision tree in which nodes contain clauses like “if attr_n > α then class = C”

If the data set is negated, those nodes should change to “if attr_n ≤ -α then class = C”, i.e. both the operator and the sign of α

In most cases, only one of the changes occurred

Page 71: Study #1: C4.5 Example

Mutant causes ClassFreq to have negative values, violating assertion

Permuting the order of elements does not affect the output in this case

Page 72: Study #1: MartiRank Results

Permuting and negating were effective at killing comparison operator mutants

MartiRank depends heavily on sorting

Permuting and negating change which numbers get sorted and what the result should be, thus inducing the differences in the final sorted list

Page 73: Study #1: Effectiveness of Properties

Page 74: Study #1: Lucene Results

Most mutants gave a non-zero score to the term “foo”, thus L3 detected the defect

Page 75: Study #1: gaffitter Results

G1: increasing the number of generations should increase the overall quality

G2: multiplying item and bin sizes by a constant should not affect the solution

Most of the defects killed by G1 related to incorrectly selecting candidate solutions

Page 76: Empirical Studies: Threats to Validity

Representativeness of selected programs
Types of defects
Data sets
Daikon-generated program invariants
Selection of metamorphic properties

Page 77: Metamorphic Testing Techniques to Detect Defects in Applications without Test Oracles Christian Murphy Thesis Defense April 12, 2010

77

MotivationMetamorphic RuntimeChecking

Experimental Study

Page 78

Study #2 Results

If we only consider functions for which metamorphic properties were identified, there were 189 total mutants

MRC detected 96.3%, compared to 67.7% for system-level metamorphic testing

Page 79

Study #2 PAYL Results

Both functions call numerous other functions, but we can circumvent restrictions on the input domain

Permuting input tends to kill off-by-one mutants

Page 80

Study #2 gaffitter Results

Page 81

Study #2: gaffitter Example

Genetic Algorithm takes two sets and “crosses over” at a particular element

Inputs: 1 2 3 4 5 and 6 7 8 9

Output: 1 2 3 9

Metamorphic Property: If we switch the order, the new output should be predictable

Inputs: 6 7 8 9 and 1 2 3 4 5

New output contains 4 5 and 6 7 8: simply, the elements not included in the original cross-over

Page 82

Study #2: gaffitter Example

Now consider a defect in which the cross-over happens at the wrong point

Inputs: 1 2 3 4 5 and 6 7 8 9

Output: 1 2 3 8 9

Output after switching the order: 6 7 8 3 4 5

Metamorphic property is violated: elements 3 and 8 should not appear in both sets
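The cross-over property can be sketched as a function-level check: the output for (a, b) and the output for (b, a) should together contain every input element exactly once. The code is an assumption-laden illustration. The one-point cross-over, the buggy flag, and the class name are hypothetical and are not taken from gaffitter's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CrossoverPropertySketch {
    // One-point cross-over: the first 'point' elements of a, then the rest of b.
    // With buggy=true the suffix of b starts one element too early (hypothetical defect).
    static List<Integer> crossover(List<Integer> a, List<Integer> b, int point, boolean buggy) {
        List<Integer> out = new ArrayList<>(a.subList(0, point));
        int start = buggy ? point - 1 : point;
        out.addAll(b.subList(start, b.size()));
        return out;
    }

    // Property: switching the order yields exactly the elements left out the first time,
    // so both outputs together must equal both inputs together.
    static boolean switchOrderHolds(List<Integer> a, List<Integer> b, int point, boolean buggy) {
        List<Integer> combined = new ArrayList<>(crossover(a, b, point, buggy));
        combined.addAll(crossover(b, a, point, buggy));
        List<Integer> expected = new ArrayList<>(a);
        expected.addAll(b);
        Collections.sort(combined);
        Collections.sort(expected);
        return combined.equals(expected);
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> b = Arrays.asList(6, 7, 8, 9);
        System.out.println(crossover(a, b, 3, false) + " " + switchOrderHolds(a, b, 3, false));
        System.out.println(crossover(a, b, 3, true) + " " + switchOrderHolds(a, b, 3, true));
    }
}
```

With the inputs from the slide and a cross-over point of 3, the buggy variant produces 1 2 3 8 9 and 6 7 8 3 4 5, so elements 3 and 8 appear twice and the check fails.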

Page 83

Study #2: gaffitter Example

Correct implementation: inputs 1 2 3 4 5 and 6 7 8 9 give output 1 2 3 9

Erroneous implementation: inputs 1 2 3 4 5 and 6 7 8 9 give output 1 2 3 8

This defect is only detected by system-level metamorphic testing if element 8 has any impact on the “quality” of the final solution. However, a single element is unlikely to do so.

Page 84

Study #2 Lucene Results

MRC killed three mutants not killed by MT

All three were in the idf function

Page 85

Study #2: Lucene Example

Search query results are ordered according to a score

Query: “ROMEO or JULIET”

Results with correct scores: Act 3 Scene 5 (5.837), Act 2 Scene 4 (4.681), Act 5 Scene 1 (3.377)

Consider a defect in which the scores are off by one. The results stay the same because only the order is important.

Results with defective scores: Act 3 Scene 5 (6.837), Act 2 Scene 4 (5.681), Act 5 Scene 1 (4.377)

Partial oracle does not reveal this defect because the scores cannot be calculated in advance.

Page 86

Study #2: Lucene Example

Query: “ROMEO or JULIET”

Results: Act 3 Scene 5 (6.837), Act 2 Scene 4 (5.681), Act 5 Scene 1 (4.377)

System-level metamorphic property: changing the query order shouldn’t affect result

Query: “JULIET or ROMEO”

Results: Act 3 Scene 5 (6.837), Act 2 Scene 4 (5.681), Act 5 Scene 1 (4.377)

Even though the defect exists, the property still holds and the defect is not detected.

Page 87

Study #2: Lucene Example

The score itself is computed as the result of many subcalculations.

Score(q) = ∑Similarity(f)*Weight(qi) + … + idf(q) + …

Metamorphic Runtime Checking can detect that there is an error in this function by checking its individual (mathematical) properties.
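A function-level check of this kind can be sketched as follows. Everything here is an assumption made for illustration: the idf formula is modeled on the classic 1 + ln(numDocs / (docFreq + 1)) form, the mutant that drops the "+ 1" is hypothetical, and the scaling property is one that follows from that formula rather than a property reported in the study.

```java
class IdfPropertySketch {
    // Assumed formula: 1 + ln(numDocs / (docFreq + 1)).
    // With mutant=true the "+ 1" is dropped (a hypothetical off-by-one mutant).
    static double idf(long docFreq, long numDocs, boolean mutant) {
        long denominator = mutant ? docFreq : docFreq + 1;
        return 1.0 + Math.log(numDocs / (double) denominator);
    }

    // Hypothetical metamorphic property: scaling (docFreq + 1) and numDocs by the
    // same factor k leaves the ratio, and therefore idf, unchanged.
    static boolean scalingHolds(long docFreq, long numDocs, long k, boolean mutant) {
        double original = idf(docFreq, numDocs, mutant);
        double scaled = idf(k * (docFreq + 1) - 1, k * numDocs, mutant);
        return Math.abs(original - scaled) < 1e-9;
    }

    public static void main(String[] args) {
        System.out.println(scalingHolds(4, 100, 2, false));
        System.out.println(scalingHolds(4, 100, 2, true));
    }
}
```

The mutant still decreases as docFreq grows, so a ranking built on it may come out in the same order, and a system-level check on result order can miss it. The function-level property compares two calls to idf directly and does not depend on the final ranking.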

Page 88

Motivation

In Vivo Testing

Page 89

Generalization of MRC

In Metamorphic Runtime Checking, the software tests itself

Why only run metamorphic tests?

Why limit ourselves only to applications without test oracles?

Why not allow the software to continue testing itself as it runs in the production environment?

Page 90

In Vivo Testing

An approach whereby software tests itself in the production environment by running any type of test (unit, integration, “parameterized unit”, etc.) at specified program points

Tests are run in a sandbox so as not to affect the original process

Invite implementation: less than half a millisecond overhead per test

Page 91

Example of Defect: Cache

private int numItems = 0, currSize = 0;   // number of items in the cache; their size (in bytes)
private int maxCapacity = 1024;           // maximum capacity, in bytes

public int getNumItems() {
    return numItems;
}

public boolean addItem(CacheItem i) throws ... {
    numItems++;   // defect: should only be incremented within "if" block
    if (currSize + i.size < maxCapacity) {
        add(i);
        currSize += i.size;
        return true;
    } else {
        return false;
    }
}

Page 92

Insufficient Unit Test

public void testAddItem() {
    Cache c = new Cache();
    assert(c.addItem(new CacheItem()));
    assert(c.getNumItems() == 1);
    assert(c.addItem(new CacheItem()));
    assert(c.getNumItems() == 2);
}

1. Assumes an empty/new cache

2. Doesn’t take into account various states that the cache can be in
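The gap between the two slides can be made concrete with a small runnable sketch. It keeps the defect from the Cache example. The CacheItem(int size) constructor, the item sizes, and the CacheDefectDemo helper are assumptions added so that the code is self-contained.

```java
class CacheItem {
    final int size;                       // size in bytes (constructor argument is an assumption)
    CacheItem(int size) { this.size = size; }
}

class Cache {
    private int numItems = 0, currSize = 0;
    private final int maxCapacity = 1024; // in bytes

    int getNumItems() { return numItems; }

    boolean addItem(CacheItem i) {
        numItems++;                       // defect kept from the slide: belongs inside the "if" block
        if (currSize + i.size < maxCapacity) {
            currSize += i.size;
            return true;
        } else {
            return false;
        }
    }
}

class CacheDefectDemo {
    // Mirrors the slide's unit test: it always starts from a new, empty cache.
    static boolean freshCacheTestPasses() {
        Cache c = new Cache();
        return c.addItem(new CacheItem(100)) && c.getNumItems() == 1
            && c.addItem(new CacheItem(100)) && c.getNumItems() == 2;
    }

    // The same kind of check, run against a cache that is already almost full.
    static boolean nearlyFullCacheCheckPasses() {
        Cache c = new Cache();
        c.addItem(new CacheItem(1000));
        int before = c.getNumItems();
        boolean added = c.addItem(new CacheItem(100)); // does not fit
        return added ? c.getNumItems() == before + 1 : c.getNumItems() == before;
    }

    public static void main(String[] args) {
        System.out.println("fresh cache test passes: " + freshCacheTestPasses());
        System.out.println("nearly full cache check passes: " + nearlyFullCacheCheckPasses());
    }
}
```

The fresh-cache test passes despite the defect, because both items fit. The check on the nearly full cache fails, because the rejected item is still counted.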

Page 93

Defects Targeted

1. Unit tests that make incomplete assumptions about the state of objects in the application

2. Possible field configurations that were not tested in the lab

3. A legal user action that puts the system in an unexpected state

4. A sequence of unanticipated user actions that breaks the system

5. Defects that only appear intermittently

Page 94

In Vivo: Model of Execution

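Function is about to be executed

Run a test?

No: execute function; rest of program continues

Yes: create sandbox and fork; the test process runs the test and stops, while the original process executes the function and the rest of the program continues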

Page 95

Writing In Vivo Tests

/* Method to be tested */
public boolean addItem(CacheItem i) { . . . }

/* JUnit style test */
public void testAddItem() {
    Cache c = new Cache();
    if (c.addItem(new CacheItem()))
        assert (c.getNumItems() == 1);
}

/* In Vivo style test */
public boolean testAddItem(CacheItem i) {
    Cache c = this;
    int oldNumItems = getNumItems();
    if (c.addItem(i))
        return (c.getNumItems() == oldNumItems+1);
    else return true;
}

Page 96

Instrumentation

/* Method to be tested */
public boolean __addItem(CacheItem i) { . . . }

/* In Vivo style test */
public boolean testAddItem(CacheItem i) { ... }

public boolean addItem(CacheItem i) {
    if (Invite.runTest("Cache.addItem")) {
        Invite.createSandboxAndFork();
        if (Invite.isTestProcess()) {
            if (testAddItem(i) == false) Invite.fail();
            else Invite.succeed();
            Invite.destroySandboxAndExit();
        }
    }
    return __addItem(i);
}
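The wrapper pattern can be approximated in plain Java without the Invite framework. This is a minimal sketch under several assumptions: an in-memory copy() stands in for the sandbox and fork, a failures counter stands in for Invite.fail(), testProbability is a hypothetical knob for deciding whether to run a test, and items are reduced to an int size. The test's "else" branch also checks that the count is unchanged, which is stricter than the In Vivo test shown earlier.

```java
import java.util.Random;

class InVivoCacheSketch {
    private int numItems = 0, currSize = 0;
    private final int maxCapacity = 1024;      // in bytes
    private final Random rng = new Random(42);
    double testProbability = 1.0;              // hypothetical: chance that a call triggers a test
    int failures = 0;                          // hypothetical stand-in for Invite.fail()

    int getNumItems() { return numItems; }

    // Stand-in sandbox: the test runs on a copy of the state instead of a forked process.
    InVivoCacheSketch copy() {
        InVivoCacheSketch c = new InVivoCacheSketch();
        c.numItems = numItems;
        c.currSize = currSize;
        return c;
    }

    // Original method, renamed as on the slide; it keeps the defect from the Cache example.
    boolean __addItem(int itemSize) {
        numItems++;                            // defect: belongs inside the "if" block
        if (currSize + itemSize < maxCapacity) {
            currSize += itemSize;
            return true;
        }
        return false;
    }

    // In Vivo style test: starts from whatever state the object is in right now.
    boolean testAddItem(int itemSize) {
        int oldNumItems = getNumItems();
        if (__addItem(itemSize)) return getNumItems() == oldNumItems + 1;
        else return getNumItems() == oldNumItems;
    }

    // Wrapper: possibly run the test on a copy, then always run the real method.
    boolean addItem(int itemSize) {
        if (rng.nextDouble() < testProbability) {
            InVivoCacheSketch sandbox = copy();
            if (!sandbox.testAddItem(itemSize)) failures++;
        }
        return __addItem(itemSize);
    }

    public static void main(String[] args) {
        InVivoCacheSketch c = new InVivoCacheSketch();
        c.addItem(1000);
        c.addItem(100);
        System.out.println("failures observed: " + c.failures);
    }
}
```

Because the test runs on a copy, the state the application sees changes only through the real call to __addItem.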

Page 97

In Vivo Testing: Case Studies

Applied testing approach to two caching systems

OSCache 2.1.1

Apache JCS 1.3

Both had known defects that were found by users (no corresponding unit tests for these defects)

Goal: demonstrate that “traditional” unit tests would miss these but In Vivo testing would detect them

Page 98

In Vivo Testing: Experimental Setup

An undergraduate student created unit tests for the methods that contained the defects

These tests passed in “development”

Student was then asked to convert the unit tests to In Vivo tests

Driver created to simulate real usage in a “deployment environment”

Page 99

In Vivo Testing: Discussion

In Vivo testing revealed all defects, even though unit testing did not

Some defects only appeared in certain states, e.g. when the cache was at full capacity

These are the very types of defects that In Vivo testing is targeted at

However, the approach depends heavily on the quality of the tests themselves

Page 100

In Vivo Testing: Performance

Page 101

More Robust Sandboxes

“Safe” test case selection [Willmor and Embury, ICSE’06]

Copy-on-write database snapshots

MS SQL Server v8

Page 102

In Vivo Testing: Related Work

Self-checking Software

Gamma [A.Orso et al., ISSTA’02]

Skoll [A.Memon et al., ICSE’03]

Cooperative Bug Isolation [B.Liblit et al., PLDI’03]

COTS components [S.Beydeda, COMPSAC’06]

Property-based Software Testing

D.Rosenblum: runtime assertion checking

I.Nunes: checking algebraic properties [ICFEM’06]

Page 103

Motivation

Related Work

Page 104

Limitations of Other Approaches

Formal specification languages

Issues related to completeness

Balance between expressiveness and implementability

Algebraic properties

Useful for data structures, but not for arbitrary functions or entire programs

Limitations of previous work in runtime checking

Log/trace file analysis

Requires careful planning in advance

Page 105

Previous Work in MT

T.Y.Chen et al.: applying metamorphic testing to applications without oracles [Info. & Soft. Tech. vol.44, 2002]

Domain-specific testing

Graphics [J.Mayer and R.Guderlei, QSIC’07]

Bioinformatics [T.Y.Chen et al., BMC Bioinf. 10(24), 2009]

Middleware [W.K.Chan et al., QSIC’05]

Others…

Page 106

Previous Studies [Hu et al., SOQUA’06]

Invariants hand-generated

Smaller programs

Only deterministic applications

Didn’t consider partial oracle

Page 107

Developer Effort [Hu et al., SOQUA’06]

Students were given three-hour training sessions on MT and on assertion checking

Given three hours to identify metamorphic properties and program invariants

Averaged about the same number of metamorphic properties as invariants

The metamorphic properties were more effective at killing mutants

Page 108

Fault Localization

Delta debugging [Zeller, FSE’02]

Compare trace of failed execution vs. successful ones

Cooperative Bug Isolation [Liblit et al., PLDI’03]

Numerous instances report results and failed execution is compared to those

Statistical approach [Baah, Gray, Harrold; SoQUA’06]

Combines model of normal behavior with runtime monitoring