mining software archives to support software development

Mining Software Archives to Support Software Development Tom Zimmermann Saarland University

Upload: thomas-zimmermann

Post on 27-Jun-2015

DESCRIPTION

Job application talk.

TRANSCRIPT

Page 1: Mining Software Archives to Support Software Development

Mining Software Archives to Support Software Development

Tom Zimmermann • Saarland University

Page 2: Mining Software Archives to Support Software Development

Software Development

Build

Hello Calgary!

Page 3: Mining Software Archives to Support Software Development

Software Development

Build

Page 4: Mining Software Archives to Support Software Development

Collaboration

Page 6: Mining Software Archives to Support Software Development

Collaboration

Comm. Archive

Page 7: Mining Software Archives to Support Software Development

Collaboration

Comm. Archive

Version Archive

Page 8: Mining Software Archives to Support Software Development

Collaboration

Comm. Archive

Bug Database

Version Archive

Page 9: Mining Software Archives to Support Software Development

Collaboration

Comm. Archive

Bug Database

Version Archive

Mining Software Archives

Page 10: Mining Software Archives to Support Software Development

Mining Software Archives

Page 11: Mining Software Archives to Support Software Development

Mining Software Archives

eROSE BugCache Vulture

Page 12: Mining Software Archives to Support Software Development

eROSE: Related Changes

(ICSE 2004, TSE 2005)

Tom Zimmermann • Saarland University

Peter Weißgerber • University of Trier

Stephan Diehl • University of Trier

Andreas Zeller • Saarland University

Page 13: Mining Software Archives to Support Software Development
Page 14: Mining Software Archives to Support Software Development
Page 15: Mining Software Archives to Support Software Development
Page 16: Mining Software Archives to Support Software Development

Developers who changed this function also changed...

Page 17: Mining Software Archives to Support Software Development

eROSE: Guiding Developers

Purchase History

Customers who bought this item also bought...

Page 18: Mining Software Archives to Support Software Development

eROSE: Guiding Developers

Purchase History

Customers who bought this item also bought...

Version Archive

Developers who changed this function also changed...

Page 19: Mining Software Archives to Support Software Development
Page 20: Mining Software Archives to Support Software Development
Page 21: Mining Software Archives to Support Software Development
Page 22: Mining Software Archives to Support Software Development

eROSE suggests further locations.

Page 23: Mining Software Archives to Support Software Development
Page 24: Mining Software Archives to Support Software Development

eROSE prevents incomplete changes.

Page 25: Mining Software Archives to Support Software Development

Processing CVS data

Page 27: Mining Software Archives to Support Software Development

Processing CVS data

1. Comparing files
2. Building transactions

Page 28: Mining Software Archives to Support Software Development

Comparing Files

Page 29: Mining Software Archives to Support Software Development

A()

C()

E()

D()

B()

Comparing Files

Page 30: Mining Software Archives to Support Software Development

A()

C()

E()

D()

B()

A()

B()

E()

F()

D()

Comparing Files
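A minimal sketch of the file comparison step, assuming the first list of entities on the slide is the older revision and the second is the newer one. It only compares entity names, so a reordered entity such as B() does not count as a change; comparing entity bodies to find modified entities is not shown. The function name diff_entities is hypothetical.

```python
def diff_entities(old_entities, new_entities):
    """Return (added, removed) entity names between two revisions of a file."""
    old_set, new_set = set(old_entities), set(new_entities)
    added = sorted(new_set - old_set)      # present only in the newer revision
    removed = sorted(old_set - new_set)    # present only in the older revision
    return added, removed


# Entity lists as they appear on the slide (old/new assignment is an assumption).
old_revision = ["A()", "C()", "E()", "D()", "B()"]
new_revision = ["A()", "B()", "E()", "F()", "D()"]

added, removed = diff_entities(old_revision, new_revision)
print("added:", added)      # added: ['F()']
print("removed:", removed)  # removed: ['C()']
```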

Page 32: Mining Software Archives to Support Software Development

Building Transactions

CVS

150,000

Page 33: Mining Software Archives to Support Software Development

Building Transactions

CVS

150,000

createGeneralPage(), createTextComparePage(), fKeys[], initDefaults(), buildnotes_compare.html, PatchMessages.properties, plugin.properties

2003-02-19 (aweinand): fixed #13332

Page 34: Mining Software Archives to Support Software Development

Building Transactions

CVS

150,000

createGeneralPage(), createTextComparePage(), fKeys[], initDefaults(), buildnotes_compare.html, PatchMessages.properties, plugin.properties

2003-02-19 (aweinand): fixed #13332

same author + message + time
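CVS records changes per file, so transactions have to be rebuilt from individual revisions. The sketch below groups revisions that share author and log message and lie close together in time. The 200-second window, the Revision type, and the timestamps are assumptions for illustration; the slide only names the three criteria.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    path: str
    author: str
    message: str
    timestamp: int  # seconds since epoch


def build_transactions(revisions, window_seconds=200):
    """Group per-file revisions into transactions (same author + message + time)."""
    transactions = []
    open_tx = {}  # (author, message) -> most recent transaction for that key
    for rev in sorted(revisions, key=lambda r: r.timestamp):
        key = (rev.author, rev.message)
        tx = open_tx.get(key)
        # Sliding window: compare against the last revision already in the group.
        if tx is not None and rev.timestamp - tx[-1].timestamp <= window_seconds:
            tx.append(rev)
        else:
            tx = [rev]
            open_tx[key] = tx
            transactions.append(tx)
    return transactions


# Hypothetical timestamps; file names, author, and message are from the slide.
revisions = [
    Revision("buildnotes_compare.html", "aweinand", "fixed #13332", 0),
    Revision("PatchMessages.properties", "aweinand", "fixed #13332", 3),
    Revision("other.txt", "someone", "unrelated change", 5),
    Revision("plugin.properties", "aweinand", "fixed #13332", 7),
]
transactions = build_transactions(revisions)
print([len(tx) for tx in transactions])  # [3, 1]
```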

Page 35: Mining Software Archives to Support Software Development

Mining Associations

User changes fKeys[] and initDefaults()

Page 36: Mining Software Archives to Support Software Development

Mining Associations

Page 37: Mining Software Archives to Support Software Development

Mining Associations

EROSE finds past transactions

Page 38: Mining Software Archives to Support Software Development

Mining Associations

#104223: fKeys[], initDefaults(), ..., plugin.properties
#756: fKeys[], initDefaults(), ..., plugin.properties
#6721: fKeys[], initDefaults(), ..., plugin.properties
#21078: fKeys[], initDefaults(), ..., plugin.properties
#42432: fKeys[], initDefaults(), ..., plugin.properties
#51345: fKeys[], initDefaults(), ..., plugin.properties
#59998: fKeys[], initDefaults(), ..., plugin.properties
#71003: fKeys[], initDefaults(), ..., plugin.properties
#87264: fKeys[], initDefaults(), ...
#91220: fKeys[], initDefaults(), ..., plugin.properties
#101823: fKeys[], initDefaults(), ..., plugin.properties

EROSE finds past transactions

Page 39: Mining Software Archives to Support Software Development

Mining Associations

EROSE finds past transactions

#104223: fKeys[], initDefaults(), ..., plugin.properties
#756: fKeys[], initDefaults(), ..., plugin.properties
#6721: fKeys[], initDefaults(), ..., plugin.properties
#21078: fKeys[], initDefaults(), ..., plugin.properties
#42432: fKeys[], initDefaults(), ..., plugin.properties
#51345: fKeys[], initDefaults(), ..., plugin.properties
#59998: fKeys[], initDefaults(), ..., plugin.properties
#71003: fKeys[], initDefaults(), ..., plugin.properties
#87264: fKeys[], initDefaults(), ...
#91220: fKeys[], initDefaults(), ..., plugin.properties
#101823: fKeys[], initDefaults(), ..., plugin.properties

{fKeys[], initDefaults()} ⇒ {plugin.properties}

Support 10, Confidence 10/11 = 0.909
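The support and confidence figures can be reproduced with a small sketch. It assumes each past transaction is a set of changed entities and leaves out the entities elided as "..." on the slide. The function name rule_stats is hypothetical and not part of eROSE.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent => consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Transactions that changed everything in the antecedent.
    matching = [t for t in transactions if antecedent <= set(t)]
    # Support: how many of those also changed the consequent.
    support = sum(1 for t in matching if consequent <= set(t))
    confidence = support / len(matching) if matching else 0.0
    return support, confidence


# 11 past transactions changed fKeys[] and initDefaults();
# 10 of them also changed plugin.properties.
with_props = [{"fKeys[]", "initDefaults()", "plugin.properties"}] * 10
without_props = [{"fKeys[]", "initDefaults()"}]

support, confidence = rule_stats(
    with_props + without_props,
    {"fKeys[]", "initDefaults()"},
    {"plugin.properties"},
)
print(support, round(confidence, 3))  # 10 0.909
```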

Page 40: Mining Software Archives to Support Software Development

PostgreSQL

Evaluation

jEdit KOffice

GIMP

Page 41: Mining Software Archives to Support Software Development

PostgreSQL

Evaluation

jEdit KOffice

GIMP

EROSE predicts 33% of all changed entities (files: 44%).

Page 42: Mining Software Archives to Support Software Development

PostgreSQL

Evaluation

jEdit KOffice

GIMP

EROSE predicts 33% of all changed entities (files: 44%).

In 70% of all transactions, EROSE’s topmost three suggestions contain a changed entity (files: 72%).

Page 43: Mining Software Archives to Support Software Development

PostgreSQL

Evaluation

jEdit KOffice

GIMP

EROSE predicts 33% of all changed entities (files: 44%).

In 70% of all transactions, EROSE’s topmost three suggestions contain a changed entity (files: 72%).

EROSE learns quickly (within 30 days).

Page 44: Mining Software Archives to Support Software Development

eROSE: Related Changes

(ICSE 2004, TSE 2005)

non-program elements(documentation)

learns quickly

guides developers

Page 45: Mining Software Archives to Support Software Development

BugCache: Predicting Defects

(ASE 2006, ICSE 2007)

Sung Kim • MIT

Tom Zimmermann • Saarland University

Jim Whitehead • Univ. of California SC

Andreas Zeller • Saarland University

Page 46: Mining Software Archives to Support Software Development

The Problem

How should we allocate our resources for quality assurance?

Page 47: Mining Software Archives to Support Software Development

One Solution

List with elements that (will) have defects

List is adaptive, i.e., it changes over time

Page 48: Mining Software Archives to Support Software Development

One Solution

List with elements that (will) have defects

List is adaptive, i.e., it changes over time

Cache

Page 49: Mining Software Archives to Support Software Development

The BugCache Model

Cache size: 2

Hypothesis: Temporal locality between defects

What is loaded in the cache?

Page 54: Mining Software Archives to Support Software Development

The BugCache Model

Miss

Cache size: 2

Hypothesis: Temporal locality between defects

What is loaded in the cache?

Page 56: Mining Software Archives to Support Software Development

The BugCache Model

Miss

Cache size: 2

Page 58: Mining Software Archives to Support Software Development

The BugCache Model

Miss Hit

Cache size: 2

Page 60: Mining Software Archives to Support Software Development

The BugCache Model

Miss Hit Miss

Cache size: 2

Page 62: Mining Software Archives to Support Software Development

The BugCache Model

Miss Hit Miss

Cache size: 2

Hit rate = #Hits / #Defects = 33.3%
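A minimal sketch of the cache model with two slots. It assumes least-recently-used replacement, which the slides do not state, and a hypothetical defect sequence chosen to produce Miss, Hit, Miss. Only loading on a defect (temporal locality) is modeled.

```python
from collections import OrderedDict


def simulate_hit_rate(defect_sequence, cache_size=2):
    """Replay defects against a fixed-size cache and return #Hits / #Defects."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for entity in defect_sequence:
        if entity in cache:
            hits += 1
            cache.move_to_end(entity)      # mark as most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict least recently used (assumed policy)
            cache[entity] = True           # load the defective entity
    return hits / len(defect_sequence) if defect_sequence else 0.0


# Hypothetical sequence of defective entities: Miss, Hit, Miss.
rate = simulate_hit_rate(["a.c", "a.c", "b.c"])
print(f"{rate * 100:.1f}%")  # 33.3%
```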

Page 65: Mining Software Archives to Support Software Development

The BugCache Model

Miss Hit Miss Miss

Cache size: 2

Page 68: Mining Software Archives to Support Software Development

Loading Elements

Temporal locality – as shown before

Spatial locality – load “nearby” elements (i.e., co-changed before)

Changed-entity locality – load changed elements

New-entity locality – load new elements

Initial pre-fetch – start with a loaded cache

Page 69: Mining Software Archives to Support Software Development

Evaluation

PostgreSQL

jEdit

Mozilla

Columba

Page 70: Mining Software Archives to Support Software Development

Hit Rates

                   Methods                Files
Project        BugCache  FixCache    BugCache  FixCache
Apache 1.3       59.6%     61.5%       83.9%     81.5%
Columba          58.9%     67.6%       83.5%     83.0%
Eclipse          64.5%     71.6%       95.1%     95.0%
JEdit            50.5%     48.9%       85.7%     85.4%
Mozilla          49.3%     55.0%       93.3%     88.0%
PostgreSQL       61.9%     59.2%       73.9%     71.0%
Subversion       68.3%     43.8%       82.0%     81.3%

Cache size = 10%

Page 72: Mining Software Archives to Support Software Development

Reasons for Hits

Spatial locality: 18%

Temporal locality: 60%

Initial pre-fetch: 18%

Legend: Initial pre-fetch, Temporal locality, Spatial locality, Changed-entity locality, New-entity locality

Page 73: Mining Software Archives to Support Software Development

Warning Developers

“Safe” Location (not in FixCache)

Risky Location (red, in FixCache)

Page 74: Mining Software Archives to Support Software Development

BugCache: Predicting Defects

(ASE 2006, ICSE 2007)

adaptive

hit rates of 71% to 95%

temporal locality

Page 75: Mining Software Archives to Support Software Development

Vulture: Predicting Security Vulnerabilities (Work in Progress)

Stephan Neuhaus • Saarland University

Tom Zimmermann • Saarland University

Andreas Zeller • Saarland University

Page 76: Mining Software Archives to Support Software Development

Firefox/Mozilla

14,368 C/C++ files (10,452 components)

1,012,512 revisions

228,365 commits

>700 developers

Page 78: Mining Software Archives to Support Software Development

Vulnerabilities

Page 80: Mining Software Archives to Support Software Development

Vulnerabilities0

Vulnerabilities

Page 81: Mining Software Archives to Support Software Development

Security Advisory 2005-12

Title: Livefeed bookmarks can steal cookies
Impact: High
Products: Firefox
Description: Earlier versions of Firefox allowed javascript: and data: URLs as Livefeed bookmarks. When they updated the URL would be run in the context of the current page and could be used to steal cookies or data displayed on the page. If the user were on a page with elevated privileges (for example, about:config) when the Livefeed was updated, the feed URL could potentially run arbitrary code on the user's machine.

Vulnerabilities0

Vulnerabilities

Page 83: Mining Software Archives to Support Software Development

Security Advisory 2005-13

Title: Window Injection Spoofing
Severity: Low
Products: Firefox, Mozilla Suite
Description: A website can inject content into a popup opened by another site if the target name of the popup window is known. An attacker who knows you are going to visit that other site could spoof the contents of the popup.

Vulnerabilities0

Vulnerabilities

Page 84: Mining Software Archives to Support Software Development

Security Advisory 2005-14

Title: SSL "secure site" indicator spoofing
Severity: Moderate
Products: Firefox, Mozilla Suite
Description: Various schemes were reported that could cause the "secure site" lock icon to appear and show certificate details for the wrong site. These could be used by phishers to make their spoofs look more legitimate, particularly in windows that hide the address bar showing the true location.

Security Advisory 2005-15

Title: Heap overflow possible in UTF8 to Unicode conversion
Severity: High
Products: Firefox, Thunderbird, Mozilla Suite
Description: It is possible for a UTF8 string with invalid sequences to trigger a heap overflow of converted Unicode data. Exploitability would depend on the attackers ability to get the string into the buggy converter. General web content is converted elsewhere but we can't rule out the possibility of a successful attack.

Security Advisory 2005-16

Title: Spoofing download and security dialogs with overlapping windows
Severity: High
Products: Firefox, Mozilla Suite
Description: Michael Krax demonstrates that the download dialog and security dialogs can be spoofed by partially covering them with an overlapping window. Some users may not notice the OS window border and browser statusbar bisecting what appears to be a single dialog, and be convinced by the spoofing text of the top-most window to click on the "Allow" or "Open" button of the window below.

Vulnerabilities0

Security Advisory 2005-41

Title: Privilege escalation via DOM property overrides
Severity: Critical
Products: Firefox, Mozilla Suite
Description: moz_bug_r_a4 reported several exploits giving an attacker the ability to install malicious code or steal data, requiring only that the user do commonplace actions like click on a link or open the context menu. The common cause in each case was privileged UI code ("chrome") being overly trusting of DOM nodes from the content window.

Security Advisory 2006-76

Title: XSS using outer window's Function object
Impact: High
Products: Firefox 2.0
Description: moz_bug_r_a4 demonstrated that the Function prototype regression described in bug 355161 could be exploited to bypass the protections against cross site script (XSS) injection, which could be used to steal credentials or sensitive data from arbitrary sites or perform destructive actions on behalf of a logged-in user.

Vulnerabilities

Page 86: Mining Software Archives to Support Software Development

Vulnerabilities0

Vulnerabilities

424 of 10,452 components vulnerable (4.05%)

Page 87: Mining Software Archives to Support Software Development

Vulnerabilities0

What other components are vulnerable?

Vulnerabilities

Page 89: Mining Software Archives to Support Software Development

Vulnerabilities0

Vulnerabilities

?

Page 90: Mining Software Archives to Support Software Development

Vulnerabilities0

Is this new component likely to be vulnerable?

Vulnerabilities

?

Page 91: Mining Software Archives to Support Software Development

Vulture

Vulnerability Database

Version Archive

Code Code Code Code

Redo diagram

Page 92: Mining Software Archives to Support Software Development

Vulture

Vulnerability Database

Version Archive

Code Code Code Code

Vulture

Redo diagram

Page 93: Mining Software Archives to Support Software Development

Component Component Component

Vulture

Vulnerability Database

Version Archive

Code Code Code Code

Vulture

Redo diagram

Page 94: Mining Software Archives to Support Software Development

Predictor

Component Component Component

Vulture

Vulnerability Database

Version Archive

Code Code Code Code

Vulture

Redo diagram

Page 95: Mining Software Archives to Support Software Development

Predictor

Code

Component Component Component

Vulture

Vulnerability Database

Version Archive

Code Code Code Code

Vulture

Redo diagram

Page 96: Mining Software Archives to Support Software Development

Correlations

Page 97: Mining Software Archives to Support Software Development

Programmer Code Complexity

Correlations

Language

Page 98: Mining Software Archives to Support Software Development

Code Complexity

Correlations

Language

Page 99: Mining Software Archives to Support Software Development

Correlations

Language

Page 100: Mining Software Archives to Support Software Development

Problem Domain

Correlations

Language

Page 101: Mining Software Archives to Support Software Development

Imports

Page 102: Mining Software Archives to Support Software Development

GUI Database Certificates OS

Imports

Page 103: Mining Software Archives to Support Software Development

GUI Database Certificates OS

Imports

Page 104: Mining Software Archives to Support Software Development

GUI Database Certificates OS

Imports

Page 105: Mining Software Archives to Support Software Development

nsIContent.h

nsIContentUtils.h

nsIScriptSecurityManager.h

Example (1)

Page 106: Mining Software Archives to Support Software Development

nsIContent.h

nsIContentUtils.h

nsIScriptSecurityManager.h

Example (1)

import

Page 107: Mining Software Archives to Support Software Development

✘✔

nsIContent.h

nsIContentUtils.h

nsIScriptSecurityManager.h

Example (1)

import

95.5%

Page 108: Mining Software Archives to Support Software Development

nsIPrivateDOMEvent.h

nsReadableUtils.h

Example (2)

Page 109: Mining Software Archives to Support Software Development

nsIPrivateDOMEvent.h

nsReadableUtils.h

Example (2)

import

Page 110: Mining Software Archives to Support Software Development

nsIPrivateDOMEvent.h

nsReadableUtils.h

Example (2)

import

100%

✘✘

Page 111: Mining Software Archives to Support Software Development

• How well do imports predict vulnerabilities?

• Can imports be used for
  − classification (vulnerable or not) and for
  − regression (number of vulnerabilities)?

Research Questions

Page 112: Mining Software Archives to Support Software Development

Component            Vulnerability-related bug reports
nsCOMArray           0
nsIDocument.h        1
nspr_md.h            0
nsDOMClassInfo       10
EmbedGTKTools        0
MozillaControl.cpp   0

nsDOMClassInfo has had 10 vulnerability-related bug reports

Input Data

Page 113: Mining Software Archives to Support Software Development

Component            Vulnerability-related bug reports
nsCOMArray           0
nsIDocument.h        1
nspr_md.h            0
nsDOMClassInfo       10
EmbedGTKTools        0
MozillaControl.cpp   0

nsDOMClassInfo has had 10 vulnerability-related bug reports

Input Data

Import matrix (one row per component, one column per imported header; 1 = imports)

Column headers shown: stdio.h, util.h, nsStackFrame.h, sys/file.h, ssImpl.h, nsIXPConnect.h, btree.h (9,059 more)

1 0 0 0 1 0 0
0 0 1 0 0 1 0
0 1 1 0 0 1 0
0 0 1 0 1 0 0
0 0 0 0 1 0 0
0 1 0 1 0 0 0

nsDOMClassInfo imports “nsIXPConnect.h”

Page 114: Mining Software Archives to Support Software Development

[Figure: Distribution of MFSAs (x-axis: Number of MFSAs, y-axis: Number of Components) and Distribution of Bug Reports (x-axis: Number of Bug Reports, y-axis: Number of Components)]

Distribution

Page 115: Mining Software Archives to Support Software Development

• 40 random splits: 6,968 rows in training set, 3,484 rows in validation set

• Classification: train SVM, compute recall and precision

• Regression: train SVM, compute rank correlation on top 1%

• SVM: linear kernel with default parameters; R implementation (up to 10GB of main memory)

Experiments
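A sketch of how the 40 random splits could be generated, assuming a plain (non-stratified) two-thirds / one-third shuffle of the 10,452 component rows. The seeding scheme and the function name random_split are assumptions; training the SVM on each split is not shown.

```python
import random


def random_split(n_rows, seed=None):
    """Shuffle row indices and split them 2/3 training, 1/3 validation."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_train = (2 * n_rows) // 3
    return indices[:n_train], indices[n_train:]


# One split per experiment run; 10,452 is the number of components.
splits = [random_split(10452, seed=i) for i in range(40)]
train, validation = splits[0]
print(len(splits), len(train), len(validation))  # 40 6968 3484
```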

Page 116: Mining Software Archives to Support Software Development

[Figure: (a) Precision and Recall (x-axis: Recall, y-axis: Precision); (b) Rank Correlation (x-axis: Rank Correlation, y-axis: Cumulative Distribution)]

Results

Page 117: Mining Software Archives to Support Software Development

[Figure: (a) Precision and Recall (x-axis: Recall, y-axis: Precision); (b) Rank Correlation (x-axis: Rank Correlation, y-axis: Cumulative Distribution)]

45% (about 1/2) of predictions correct

Results

Page 118: Mining Software Archives to Support Software Development

[Figure: (a) Precision and Recall (x-axis: Recall, y-axis: Precision); (b) Rank Correlation (x-axis: Rank Correlation, y-axis: Cumulative Distribution)]

2/3 of all vulnerable components detected

45% (about 1/2) of predictions correct

Results

Page 120: Mining Software Archives to Support Software Development

[Figure: (a) Precision and Recall (x-axis: Recall, y-axis: Precision); (b) Rank Correlation (x-axis: Rank Correlation, y-axis: Cumulative Distribution)]

2/3 of all vulnerable components detected

45% (about 1/2) of predictions correct

moderately strong correlation (mostly significant at p < 0.01)

Results
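Recall is the share of vulnerable components that were detected, and precision is the share of predictions that were correct. A small sketch of both measures follows; the component names in the example are made up and are not Vulture's results.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for predicted vs. actually vulnerable components."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical validation set: 3 vulnerable components, 4 flagged by the classifier.
actual_vulnerable = {"c1", "c2", "c3"}
predicted_vulnerable = {"c1", "c2", "c7", "c8"}

precision, recall = precision_recall(predicted_vulnerable, actual_vulnerable)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.50 recall=0.67
```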

Page 121: Mining Software Archives to Support Software Development

Ranking

Page 122: Mining Software Archives to Support Software Development

Rank  Component              Actual Rank
1     nsDOMClassInfo         3
2     SGridRowLayout         95
3     xpcprivate             6
4     jsxml                  2
5     nsGenericHTMLElement   8
6     jsgc                   3
7     nsISEnvironment        12
8     jsfun                  1
9     nsHTMLLabelElement     18
10    nsHttpTransaction      35
... (3,474 components)

Ranking
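One way to compare a predicted ranking with the actual ranking is a rank correlation. The sketch below assumes Spearman's coefficient with average ranks for ties; the slides only say "rank correlation", so the choice of coefficient is an assumption. It is undefined (division by zero) when one of the inputs is constant.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Predicted ranks 1..10 and the actual ranks from the slide's top-10 table.
predicted = list(range(1, 11))
actual = [3, 95, 6, 2, 8, 3, 12, 1, 18, 35]
print(round(spearman(predicted, actual), 3))
```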

Page 126: Mining Software Archives to Support Software Development

Similar Results for Bugs

Packages + Import relationships (ISESE 2006)

Precision: 66.7% Recall: 69.4%

Binaries + Dependencies (Internship @ Microsoft Research, 2006)

Precision: 64.4% Recall: 75.3%

Page 127: Mining Software Archives to Support Software Development

Vulture: Predicting Security Vulnerabilities (Work in Progress)

locates past + predicts new vulnerabilities

problem domain

Page 128: Mining Software Archives to Support Software Development

?

Future Work

Page 129: Mining Software Archives to Support Software Development

#1: Mining across Projects

• Complement source code search engines with mining techniques.

• Large-scale mining (144,000 SF projects)

Page 130: Mining Software Archives to Support Software Development

#2: Developer Buddy

MOCKUP

Page 131: Mining Software Archives to Support Software Development

eROSE BugCache Vulture

Page 132: Mining Software Archives to Support Software Development

automatic

eROSE BugCache Vulture

Page 133: Mining Software Archives to Support Software Development

automatic

large-scale

eROSE BugCache Vulture

Page 134: Mining Software Archives to Support Software Development

automatic

tool-oriented

large-scale

eROSE BugCache Vulture

Page 135: Mining Software Archives to Support Software Development

2.0

Empirical Software Engineering 2.0

automatic

tool-oriented

large-scale

Page 136: Mining Software Archives to Support Software Development

2.0

Empirical Software Engineering 2.0

automatic

tool-oriented

large-scale

Thanks! Questions?