
Localizing Failure-Inducing Program Edits Based on Spectrum Information

Lingming Zhang, Miryung Kim, Sarfraz Khurshid

The University of Texas at Austin

ICSM2011, September 27th 2011

Overview

Change impact analysis is effective at finding suspicious edits but lacks precise ranking.

Spectrum-based fault localization is effective at ranking but does not scale well.

Our insight: combine change impact analysis and spectrum-based fault localization.
• Identify suspicious edits using extended call graphs.
• Rank suspicious edits using dynamic program spectrum information.


Summary of our results

FaultTracer localizes failure-inducing edits with high precision:

• Identifying suspicious edits: outperforms Chianti by 19.37%.
• Ranking all suspicious edits: ranks real regression faults within top 3 edits for 14 of the 22 studied real-world failures.
• Ranking method-level suspicious edits: outperforms existing heuristic by 56.25%.

Outline

• FaultTracer Approach
• Empirical Evaluation
• Related Work
• Conclusions

Example

Program P

    public class A {
        public static int f1=0;
        public static int f2=0;
        ...
    }
    class B {
        int f1=0; int f2=0; int f3=0;
        public int foo(){return f1;}
        ...
    }
    class C extends B{
        ...
    }

Program P' (P evolves into P')

    public class A {
        public static int f1=1;
        public static int f2=1;
        ...
    }
    class B {
        int f1=0; int f2=1; int f3=1;
        int f4=1;
        public int foo(){
            if(f1>=0) return f1;
            else return f4;
        }
    }
    class C extends B{
        public int f1=3;
        public void bar(int f) {f3=f+f1;}
        ...
    }

Regression test suite T

    public void test1() { A.bar(1); }
    public void test2() { ... }
    public void test3() { ... }
    public void test4() {
        C c = new C();
        int f = c.foo();
    }
    public void test5() { ... }

T is run on P (Test) and re-run on P' (Re-Test), where it exposes a bug.

FaultTracer overview

[Figure: FaultTracer workflow over the new program P', the regression test suite T, and the change set ∆.]

① Detecting changes and dependences
② Selecting tests based on Extended Call Graph analysis
③ Identifying suspicious edits based on Extended Call Graph analysis
④ Ranking suspicious edits based on program spectrum information

Extended Call Graph representation

    public void test1() { A.bar(1); }
    public void test4() {
        C c = new C();
        int f = c.foo();
    }

[Figure: two graphs for test1 and test4, side by side. Left: the traditional call graph used by Chianti. Right: the Extended Call Graph used by FaultTracer. Method nodes: test1, test4, A.bar(), A.Clinit(), C.C(), C.foo(), B.B(). Field nodes: A.f2, B.f1. Edge labels: <C,C.foo()>, <SFW,A.f2>, <FR,C.f1>.]

Step 1. Detecting atomic changes and dependences

Atomic Change Types

Change type  Description
CM           Change method
AM           Add method
DM           Delete method
AF           Add field
DF           Delete field
CFI          Change instance field
CSFI         Change static field
LCm          Method look-up change
LCf          Field look-up change

Change dependences inference rules

Step 2. Test selection based on Extended Call Graph (ECG) analysis

FaultTracer directly matches all changes with test ECGs before edits to select the influenced tests.
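The matching step can be pictured as a set intersection. The sketch below is a simplified illustration, not FaultTracer's implementation: it assumes each test's pre-edit ECG has been flattened into a set of element names (methods and fields) and that each change is identified by the element it touches. The class `TestSelectionSketch`, the method `selectAffectedTests`, and the sample ECG contents are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TestSelectionSketch {
    /**
     * Returns the tests whose pre-edit ECG contains at least one changed element.
     * ecgBeforeEdit maps a test name to the element names on its ECG
     * (an assumed, flattened view of the graph).
     */
    static Set<String> selectAffectedTests(Map<String, Set<String>> ecgBeforeEdit,
                                           Set<String> changedElements) {
        Set<String> affected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : ecgBeforeEdit.entrySet()) {
            for (String element : entry.getValue()) {
                if (changedElements.contains(element)) {
                    affected.add(entry.getKey());
                    break; // one match is enough to select the test
                }
            }
        }
        return affected;
    }

    public static void main(String[] args) {
        // Hypothetical ECG contents, loosely named after the running example.
        Map<String, Set<String>> ecgs = new LinkedHashMap<>();
        ecgs.put("test1", new HashSet<>(Arrays.asList("A.bar()", "A.f2")));
        ecgs.put("test4", new HashSet<>(Arrays.asList("C.C()", "B.foo()", "B.f1")));
        Set<String> changed = new HashSet<>(Arrays.asList("B.foo()", "A.f1"));
        System.out.println(selectAffectedTests(ecgs, changed)); // prints [test4]
    }
}
```

Tests that match no change are skipped, so only the selected tests need to be re-run and analyzed in the later steps.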

Step 3. Suspicious edit identification based on Extended Call Graph analysis

FaultTracer directly selects the non-look-up changes that appear on test ECGs after edits as suspicious edits.

FaultTracer selects method or field edits that have caused look-up changes on test ECGs as suspicious edits.
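The two selection rules can be sketched as follows. This is an illustrative simplification under stated assumptions, not FaultTracer's implementation: the ECG of a test is a flat set of element names, each non-look-up change names the element it touches, a precomputed map records which edits caused each look-up change, and the same post-edit ECG is used for both rules. All class, method, and parameter names, and the sample data, are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SuspiciousEditSketch {
    /**
     * ecgAfterEdit:          element names on one test's post-edit ECG
     * changeToElement:       non-look-up change -> element it touches
     * lookupElementToCauses: ECG element showing a look-up change -> edits that caused it
     */
    static Set<String> suspiciousEdits(Set<String> ecgAfterEdit,
                                       Map<String, String> changeToElement,
                                       Map<String, Set<String>> lookupElementToCauses) {
        Set<String> suspicious = new TreeSet<>();
        // Rule 1: non-look-up changes whose element appears on the ECG.
        for (Map.Entry<String, String> e : changeToElement.entrySet()) {
            if (ecgAfterEdit.contains(e.getValue())) {
                suspicious.add(e.getKey());
            }
        }
        // Rule 2: edits that caused a look-up change observed on the ECG.
        for (Map.Entry<String, Set<String>> e : lookupElementToCauses.entrySet()) {
            if (ecgAfterEdit.contains(e.getKey())) {
                suspicious.addAll(e.getValue());
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        // Hypothetical data, loosely named after the running example.
        Set<String> ecg = new HashSet<>(Arrays.asList("B.foo()", "B.f1", "<FR,C.f1>"));
        Map<String, String> changes = new HashMap<>();
        changes.put("CM(B.foo)", "B.foo()");
        changes.put("CSFI(A.f1)", "A.f1");
        Map<String, Set<String>> causes = new HashMap<>();
        causes.put("<FR,C.f1>", new HashSet<>(Arrays.asList("AF(C.f1)")));
        System.out.println(suspiciousEdits(ecg, changes, causes)); // prints [AF(C.f1), CM(B.foo)]
    }
}
```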

Step 4. Spectrum-based fault localization for program edits

Correlation between suspicious edits and tests

Edits        test2  test3  test4  test5
CSFI(A.f1)
CM(B.foo)
AF(C.f1)
AM(C.bar)
Outcome      Pass   Pass   Pass   Fail

(The per-test coverage marks from the slide are not preserved.)

Suspiciousness score computation

             Suspiciousness Score
Edits        Tarantula  SBI   Jaccard  Ochiai  TieBreak
CSFI(A.f1)   0.00       0.00  0.00     0.00    -
CM(B.foo)    0.75       0.50  0.50     0.71    1
AF(C.f1)     0.75       0.50  0.50     0.71    0
AM(C.bar)    1.00       1.00  1.00     1.00    -
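The scores in the table can be reproduced with the commonly cited definitions of the four formulas, which the slides name but do not spell out. The sketch below uses those definitions as an assumption about what was computed. Here `ef` and `ep` are the numbers of failing and passing tests that cover an edit, and `totalFailed` and `totalPassed` are the suite totals (1 and 3 for test2 to test5). The class name and the coverage counts used in `main` are illustrative; the counts are inferred from the scores, not read from the slide.

```java
import java.util.Locale;

class SuspiciousnessSketch {
    // Tarantula: (ef/F) / (ef/F + ep/P)
    static double tarantula(int ef, int ep, int totalFailed, int totalPassed) {
        double failRatio = totalFailed == 0 ? 0.0 : (double) ef / totalFailed;
        double passRatio = totalPassed == 0 ? 0.0 : (double) ep / totalPassed;
        double denom = failRatio + passRatio;
        return denom == 0.0 ? 0.0 : failRatio / denom;
    }

    // SBI: 1 - ep / (ep + ef)
    static double sbi(int ef, int ep) {
        int covered = ef + ep;
        return covered == 0 ? 0.0 : 1.0 - (double) ep / covered;
    }

    // Jaccard: ef / (F + ep)
    static double jaccard(int ef, int ep, int totalFailed) {
        int denom = totalFailed + ep;
        return denom == 0 ? 0.0 : (double) ef / denom;
    }

    // Ochiai: ef / sqrt(F * (ef + ep))
    static double ochiai(int ef, int ep, int totalFailed) {
        double denom = Math.sqrt((double) totalFailed * (ef + ep));
        return denom == 0.0 ? 0.0 : ef / denom;
    }

    public static void main(String[] args) {
        // Counts consistent with the CM(B.foo) row: covered by the one failing
        // test and by one of the three passing tests (inferred, see lead-in).
        System.out.printf(Locale.ROOT, "%.2f %.2f %.2f %.2f%n",
                tarantula(1, 1, 1, 3), sbi(1, 1), jaccard(1, 1, 1), ochiai(1, 1, 1));
        // prints 0.75 0.50 0.50 0.71
    }
}
```

An edit covered only by the failing test (`ef = 1`, `ep = 0`) scores 1.00 under all four formulas, matching the AM(C.bar) row, and an edit not covered by the failing test scores 0.00, matching CSFI(A.f1).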

Outline

• FaultTracer Approach
• Empirical Evaluation
• Related Work
• Conclusions

Research Questions

RQ1: How does FaultTracer compare to Chianti in identifying suspicious edits?

RQ2: How effective is FaultTracer in ranking suspicious edits?

Subjects: overview

Subjects from Software-artifact Infrastructure Repository (SIR).

Project       Version  Program Size (KLoC)  Number of Tests
Jtopas        0.0-3.0  1.83 ~ 5.36          95-209
Xml-Security  0.0-3.0  17.44 ~ 18.99        84-106
JMeter        0.0-5.0  31.01 ~ 41.05        70-97
Ant           0.0-8.0  17.20 ~ 80.44        112-878

Subjects: change statistics

Number of changes for each version pair

[Figure: bar chart of the number of changes per version pair, horizontal axis from 0 to 7000. Version pairs: Jtopas 0.0-1.0 through 2.0-3.0, XmlSec 0.0-1.0 through 2.0-3.0, JMeter 0.0-1.0 through 4.0-5.0, and Ant 0.0-1.0 through 7.0-8.0. Legend (partial): DM.]

RQ1: How does FaultTracer compare to Chianti in identifying suspicious edits?

FaultTracer achieves a 19.37% improvement in the precision of identifying suspicious edits.

[Figure: chart comparing Chianti and FaultTracer.]

RQ2: How effective is FaultTracer in ranking suspicious edits?

Ranks all types of edits:

• Average performance.

                           Tarantula  SBI    Jaccard  Ochiai  Suspicious edit num.  Edit number
Average                    8.50       8.50   10.83    14.66   68.83                 3932
Percentage to edit number  0.22%      0.22%  0.28%    0.37%   1.75%                 --

• Example (Ant5.0-6.0)

Test: ant.taskdefs.optional.EchoPropertiesTest.testEchoToBadFile

Tarantula  SBI  Jaccard  Ochiai  Suspicious edit num.  Edit number
1          1    1        10      182                   5019

RQ2: How effective is FaultTracer in ranking suspicious edits?

Ranks method edits (FaultTracer vs. Heuristic)

• Achieves 56.25% improvement in the precision of localizing method-level failure-inducing edits

Limitations

Does not currently filter out refactorings (e.g., use RefFinder [Prete+2010]).

Uses only four spectrum-based fault localization techniques.

The experimental evaluation is limited by the small number of real regression faults.

Related work

Change impact analysis
• Chianti [Ren+2004]
• Crisp [Chesley+2005]
• Heuristic ranking [Ren+2007]

Fault localization
• Spectrum-based
  • E.g., Tarantula [Jones+2002], SBI [Liblit+2005], Jaccard [Abreu+2007], Ochiai [Abreu+2007].
• Delta debugging [Zeller1999]
• Model-based
  • E.g., Bayesian diagnosis [Kleer+1987]

Conclusion

FaultTracer combines change impact analysis with dynamic spectra.

FaultTracer improves change impact analysis based on extended call graph analysis.

Experimental evaluation shows FaultTracer:
• Performs 19.37% better than Chianti in determining affecting changes.
• Localizes failure-inducing edits within top 3 edits for 14 of the 22 regression failures.
• Performs 56.25% better than previous heuristic for localizing failure-inducing program edits.

zhanglm10@gmail.com
