Mutation Testing

Chris Sinjakli

Ruby Edition

Testing is a good thing

But how do we know our tests are good?

Code coverage is a start

But it can give a “good” score with really dreadful tests

Really dreadful tests

    class Adder
      def self.add(x, y)
        return x - y
      end
    end

    describe Adder do
      it "should add the two arguments" do
        Adder.add(1, 1)
      end
    end

Coverage: 100%
Usefulness: 0

A contrived example

But how could we detect it?

Mutation Testing!

“Who watches the watchmen?”

If you can change the code, and a test doesn’t fail, either the code is never run or the tests are wrong.

1. Run test suite
2. Change code (mutate)
3. Run test suite again

If the tests now fail, the mutant dies. Otherwise it survives.
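The three steps can be sketched as a toy loop in plain Ruby. This is only an illustration, not how Heckle, Mutant or PIT work internally: the source string, the token-swapping operators and the deliberately weak `suite_passes?` check are all made up for this example.

```ruby
# Toy mutation-testing loop. Every name here is hypothetical.

# The code under test, kept as text so it can be mutated.
ORIGINAL_SOURCE = "lambda { |a, b| a + b }"

# Each mutation operator swaps one token for another.
MUTATION_OPERATORS = [["+", "-"], ["+", "*"]]

# A deliberately weak "test suite": a single check on a single input.
def suite_passes?(implementation)
  implementation.call(2, 2) == 4
end

def run_mutation_testing(source, operators)
  # Step 1: the suite must pass against the unmodified code.
  raise "suite fails before mutation" unless suite_passes?(eval(source))

  operators.map do |from, to|
    mutant_source = source.sub(from, to)                          # Step 2: mutate
    status = suite_passes?(eval(mutant_source)) ? :survived : :killed  # Step 3: re-run
    [mutant_source, status]
  end
end

RESULTS = run_mutation_testing(ORIGINAL_SOURCE, MUTATION_OPERATORS)
RESULTS.each { |mutant_source, status| puts "#{status}: #{mutant_source}" }
```

The `-` mutant is killed because 2 - 2 is not 4. The `*` mutant survives because 2 * 2 is also 4, which points at a test that needs better inputs.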

Going with our previous example

    class Adder
      def self.add(x, y)
        return x - y
      end
    end

Let’s change something

Going with our previous example

    class Adder
      def self.add(x, y)
        return x + y
      end
    end

This still passes

Success

We know something is wrong
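A small plain-Ruby sketch of why the mutant survives. The lambdas stand in for the two versions of `Adder.add`, and both test helpers are hypothetical: one mirrors the spec from the slides, the other is a guess at what a stronger test could look like.

```ruby
# Stand-ins for the two versions of Adder.add.
SUBTRACTING_ADD = lambda { |x, y| x - y } # the code as written
ADDING_ADD      = lambda { |x, y| x + y } # the mutant

# Mirrors the spec from the slides: runs the code but checks nothing.
def assertion_free_test(add)
  add.call(1, 1)
  true
end

# Hypothetical stronger test: actually checks the result.
def asserting_test(add)
  add.call(1, 1) == 2
end

# Each pair is [result against the code as written, result against the mutant].
OUTCOMES = {
  assertion_free: [assertion_free_test(SUBTRACTING_ADD), assertion_free_test(ADDING_ADD)],
  asserting:      [asserting_test(SUBTRACTING_ADD), asserting_test(ADDING_ADD)]
}

puts OUTCOMES.inspect
```

The assertion-free test cannot tell the two versions apart. The asserting test can, and it also shows that the code as written was the broken one.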

So what? It caught a really rubbish test

How about something slightly less obvious?

Slightly less obvious (and I mean slightly)

    class ConditionChecker
      def self.check(a, b)
        if a && b
          return 42
        else
          return 0
        end
      end
    end

    describe ConditionChecker do
      it "should return 42 when both arguments are true" do
        ConditionChecker.check(true, true).should == 42
      end

      it "should return 0 when both arguments are false" do
        ConditionChecker.check(false, false).should == 0
      end
    end

Coverage: 100%
Usefulness: >0
But still wrong

    class ConditionChecker
      def self.check(a, b)
        if a && b
          return 42
        else
          return 0
        end
      end
    end

Mutate

    class ConditionChecker
      def self.check(a, b)
        if a || b
          return 42
        else
          return 0
        end
      end
    end

Passing tests

Mutation testing caught our mistake
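One way to see which tests were missing, sketched in plain Ruby rather than RSpec so it runs without any gems. `MutantConditionChecker`, `EXPECTED` and `failing_cases` are names invented for this illustration.

```ruby
class ConditionChecker
  def self.check(a, b)
    if a && b
      return 42
    else
      return 0
    end
  end
end

# The mutant: && swapped for ||.
class MutantConditionChecker
  def self.check(a, b)
    if a || b
      return 42
    else
      return 0
    end
  end
end

# Expected result for every input combination, including the two
# mixed cases that the spec on the slides never exercised.
EXPECTED = {
  [true, true]   => 42,
  [false, false] => 0,
  [true, false]  => 0,
  [false, true]  => 0
}

# Returns the inputs for which the checker gives the wrong answer.
def failing_cases(checker)
  EXPECTED.reject { |args, want| checker.check(*args) == want }.keys
end

puts failing_cases(ConditionChecker).inspect
puts failing_cases(MutantConditionChecker).inspect
```

Only the mixed true/false inputs distinguish `&&` from `||`, so those are the cases a test needs in order to kill this mutant.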

Useful technique

But still has its flaws

The downfall of mutation (Equivalent Mutants)

    index = 0

    while index != 100 do
      doStuff()
      index += 1
    end

Mutates to

    index = 0

    while index < 100 do
      doStuff()
      index += 1
    end

But the programs are equivalent, so no test will fail

There is no possible test which can “kill” the mutant

The programs are equivalent
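A quick sketch that illustrates (but does not prove) the equivalence: both loop conditions are run and the iterations counted. The `iterations` helper is hypothetical, and incrementing a counter stands in for `doStuff()`.

```ruby
# Runs the loop with the given continue-condition and counts iterations.
def iterations(&keep_going)
  index = 0
  count = 0
  while keep_going.call(index)
    count += 1 # stands in for doStuff()
    index += 1
  end
  count
end

NOT_EQUAL_COUNT = iterations { |index| index != 100 }
LESS_THAN_COUNT = iterations { |index| index < 100 }

puts NOT_EQUAL_COUNT
puts LESS_THAN_COUNT
```

Because `index` starts at 0 and goes up by exactly 1, it always lands on 100, so the two conditions stop the loop at the same point and no test can observe a difference.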

Also (potentially)

• Infinite loops
• More memory used
• Compile/run time errors – tools should minimise these

How bad is it?

• Good paper assessing the problem [SZ10]
• Took 7 widely used, “large” projects
• Found:
  – 15 mins to assess one mutation
  – 45% of uncaught mutations are equivalent
  – Better tested project -> worse signal-to-noise ratio

Can we detect the equivalents?

• Not in the general case [BA82]
• Some specific cases can be detected
  – Using compiler optimisation techniques [BS79]
  – Using mathematical constraints [DO91]
  – Line coverage changes [SZ10]

• All heuristic algorithms – not seen any claiming to kill all equivalent mutants

Some Ruby, then a Java one I liked

• Looked into Heckle
• Seemed unmaintained (nothing since 2009)
• Then I saw...

• Mutant seems to be the new favourite
• Runs in Rubinius (1.8 or 1.9 mode)
• Only supports RSpec
• Easy to set up

    rvm install rbx-head
    rvm use rbx-head
    gem install mutant

• And easy to use

    mutate "ClassName#method_to_test" spec

• Loads of tools to choose from
• Bytecode vs source mutation
• Will look at PIT (seems like one of the better ones)

PIT - pitest.org

• Works with “everything”
  – Command line
  – Ant
  – Maven
• Bytecode level mutations (faster)
• Very customisable
  – Exclude classes/packages from mutation
  – Choose which mutations you want
  – Timeouts

• Makes pretty HTML reports (line/mutation coverage)

Summary

• Can point at weak areas in your tests
• At the same time, can be prohibitively noisy
• Try it and see

Questions?

References

• [BA82] - T. A. Budd and D. Angluin. Two notions of correctness and their relation to testing. Acta Informatica, 18(1):31-45, November 1982.

• [BS79] - D. Baldwin and F. Sayward. Heuristics for determining equivalence of program mutations. Research report 276, Department of Computer Science, Yale University, 1979.

• [DO91] - R. A. DeMillo and A. J. Offutt. Constraint-based automatic test data generation. IEEE Transactions on Software Engineering, 17(9):900-910, September 1991.

• [SZ10] - D. Schuler and A. Zeller. (Un-)Covering Equivalent Mutants. Third International Conference on Software Testing, Verification and Validation (ICST), pages 45-54. April 2010.

Also interesting

• [AHH04] - K. Adamopoulos, M. Harman and R. M. Hierons. How to Overcome the Equivalent Mutant Problem and Achieve Tailored Selective Mutation Using Co-evolution. Genetic and Evolutionary Computation -- GECCO 2004, pages 1338-1349. 2004.
