carol v. alexandru, sebastiano panichella, harald c. gall · 2017. 2. 22. · 0.525 0.109 0.082...
TRANSCRIPT
![Page 1: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/1.jpg)
Carol V. Alexandru, Sebastiano Panichella, Harald C. GallSoftware Evolution and Architecture Lab
University of Zurich, Switzerland{alexandru,panichella,gall}@ifi.uzh.ch
22.02.2017
SANER’17
Klagenfurt, Austria
![Page 2: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/2.jpg)
The Problem Domain
• Static analysis (e.g. #Attr., McCabe, coupling...)
1
![Page 3: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/3.jpg)
The Problem Domain
v0.7.0 v1.0.0 v1.3.0 v2.0.0 v3.0.0 v3.3.0 v3.5.0
2
• Static analysis (e.g. #Attr., McCabe, coupling...)
![Page 4: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/4.jpg)
The Problem Domain
v0.7.0 v1.0.0 v1.3.0 v2.0.0 v3.0.0 v3.3.0 v3.5.0
2
• Static analysis (e.g. #Attr., McCabe, coupling...)
• Many revisions, fine-grained historical data
![Page 5: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/5.jpg)
A Typical Analysis Process
3
www
clone
select project
![Page 6: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/6.jpg)
A Typical Analysis Process
www
clone
checkout
select project
select revision
3
![Page 7: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/7.jpg)
A Typical Analysis Process
www
clone
checkout analysis tool
Res
select project
select revision
applytool
storeanalysis
results
3
![Page 8: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/8.jpg)
A Typical Analysis Process
www
clone
checkout analysis tool
Res
select project
select revision
applytool
storeanalysis
results
more revisions?
3
![Page 9: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/9.jpg)
A Typical Analysis Process
www
clone
checkout analysis tool
Res
select project
select revision
applytool
storeanalysis
results
more revisions?
more projects?
3
![Page 10: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/10.jpg)
Redundancies all over...
4
Redundancies in historical code analysis
Impact on Code Study Tools
![Page 11: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/11.jpg)
Redundancies all over...
Redundancies in historical code analysis
Few files change
Only small parts of a file change
Across RevisionsImpact on Code
Study Tools
4
![Page 12: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/12.jpg)
Redundancies all over...
Redundancies in historical code analysis
Few files change
Only small parts of a file change
Across RevisionsImpact on Code
Study Tools
Repeated analysis of "known" code
4
![Page 13: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/13.jpg)
Redundancies all over...
Redundancies in historical code analysis
Few files change
Only small parts of a file change
Changes may not even affect results
Across RevisionsImpact on Code
Study Tools
Repeated analysis of "known" code
Storing redundant results
4
![Page 14: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/14.jpg)
Redundancies all over...
4
Redundancies in historical code analysis
Few files change
Across Languages
Only small parts of a file change
Changes may not even affect results
Each language has their own toolchain
Yet they share many metrics
Across RevisionsImpact on Code
Study Tools
Repeated analysis of "known" code
Storing redundant results
![Page 15: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/15.jpg)
Redundancies all over...
Redundancies in historical code analysis
Few files change
Across Languages
Only small parts of a file change
Changes may not even affect results
Each language has their own toolchain
Yet they share many metrics
Across RevisionsImpact on Code
Study Tools
Repeated analysis of "known" code
Storing redundant results
Re-implementing identical analyses
Generalizability is expensive
4
![Page 16: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/16.jpg)
Redundancies all over...
Redundancies in historical code analysis
Few files change
Across Languages
Only small parts of a file change
Changes may not even affect results
Each language has their own toolchain
Yet they share many metrics
Across RevisionsImpact on Code
Study Tools
Repeated analysis of "known" code
Storing redundant results
Re-implementing identical analyses
Generalizability is expensive
5
![Page 17: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/17.jpg)
Important!
6
Techniques implemented
in LISA
Your favouriteanalysis features
Pick what you like!
![Page 18: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/18.jpg)
#1: Avoid Checkouts
![Page 19: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/19.jpg)
Avoid checkouts
7
clone
![Page 20: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/20.jpg)
Avoid checkouts
7
clone
checkout
read write
![Page 21: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/21.jpg)
Avoid checkouts
7
clone
checkout
read
read write
analyze
![Page 22: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/22.jpg)
Avoid checkouts
7
clone
checkout
read
For every file: 2 read ops + 1 write opCheckout includes irrelevant filesNeed 1 CWD for every revision to be analyzed in parallel
read write
analyze
![Page 23: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/23.jpg)
Avoid checkouts
8
clone analyze
read
![Page 24: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/24.jpg)
Avoid checkouts
8
clone analyze
Only read relevant files in a single read opNo write opsNo overhead for parallization
read
![Page 25: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/25.jpg)
Avoid checkouts
8
clone analyze
Only read relevant files in a single read opNo write opsNo overhead for parallization
Git
Analysis Tool
File Abstraction Layer
read
![Page 26: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/26.jpg)
Avoid checkouts
8
clone analyze
Only read relevant files in a single read opNo write opsNo overhead for parallization
Git
Analysis Tool
File Abstraction Layer
E.g. for the JDK Compiler:
class JavaSourceFromCharrArray(name: String, val code: CharBuffer)extends SimpleJavaFileObject(URI.create("string:///" + name), Kind.SOURCE) {override def getCharContent(): CharSequence = code
}
read
![Page 27: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/27.jpg)
Avoid checkouts
clone analyze
Only read relevant files in a single read opNo write opsNo overhead for parallization
Git
Analysis Tool
File Abstraction Layer
E.g. for the JDK Compiler:
class JavaSourceFromCharrArray(name: String, val code: CharBuffer)extends SimpleJavaFileObject(URI.create("string:///" + name), Kind.SOURCE) {override def getCharContent(): CharSequence = code
}
read
9
![Page 28: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/28.jpg)
#2: Use a multi-revision representationof your sources
![Page 29: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/29.jpg)
Merge ASTs
rev. 1
rev. 2
rev. 3
rev. 4
10
![Page 30: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/30.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
11
rev. 1
Merge ASTs
![Page 31: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/31.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
12
rev. 2
Merge ASTs
![Page 32: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/32.jpg)
Merge ASTs
rev. 1
rev. 2
rev. 3
rev. 4
13
rev. 3
![Page 33: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/33.jpg)
Merge ASTs
rev. 1
rev. 2
rev. 3
rev. 4
14
rev. 4
![Page 34: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/34.jpg)
Merge ASTs
rev. 1
rev. 2
rev. 3
rev. 4
15
![Page 35: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/35.jpg)
Merge ASTs
rev. 1
rev. 2
rev. 3
rev. 4
16
rev. range [1-4]
rev. range [1-2]
![Page 36: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/36.jpg)
Merge ASTs
rev. 1
rev. 2
rev. 3
rev. 4AspectJ (~440k LOC):
1 commit: 2.2M nodesAll >7000 commits: 6.5M nodes
17
![Page 37: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/37.jpg)
Merge ASTs
rev. 1
rev. 2
rev. 3
rev. 4AspectJ (~440k LOC):
1 commit: 2.2M nodesAll >7000 commits: 6.5M nodes
18
![Page 38: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/38.jpg)
Merge ASTs
rev. 1
rev. 2
rev. 3
rev. 4AspectJ (~440k LOC):
1 commit: 2.2M nodesAll >7000 commits: 6.5M nodes
19
![Page 39: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/39.jpg)
#3: Store AST nodes only ifthey're needed for analysis
![Page 40: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/40.jpg)
public class Demo {
public void run() {
for (int i = 1; i< 100; i++) {
if (i % 3 == 0 || i % 5 == 0) {
System.out.println(i)
}
}
}
What's the complexity (1+#forks) and name for each method and class?
20
![Page 41: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/41.jpg)
public class Demo {
public void run() {
for (int i = 1; i< 100; i++) {
if (i % 3 == 0 || i % 5 == 0) {
System.out.println(i)
}
}
}
140 AST nodes(using ANTLR)
parse
What's the complexity (1+#forks) and name for each method and class?
20
![Page 42: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/42.jpg)
public class Demo {
public void run() {
for (int i = 1; i< 100; i++) {
if (i % 3 == 0 || i % 5 == 0) {
System.out.println(i)
}
}
}
140 AST nodes(using ANTLR)
parseCompilationUnit
TypeDeclaration
Modifiers
public
Members
Method
Modifiers
public
Name
run
Name
Demo
Parameters ReturnType
PrimitiveType
VOID
Body
Statements
...
...
What's the complexity (1+#forks) and name for each method and class?
20
![Page 43: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/43.jpg)
public class Demo {
public void run() {
for (int i = 1; i< 100; i++) {
if (i % 3 == 0 || i % 5 == 0) {
System.out.println(i)
}
}
}
What's the complexity (1+#forks) and name for each method and class?
140 AST nodes(using ANTLR)
parse filtered parse
TypeDeclaration
Method
Name
run
Name
DemoForStatement
IfStatement
ConditionalExpression
7 AST nodes(using ANTLR)
21
![Page 44: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/44.jpg)
public class Demo {
public void run() {
for (int i = 1; i< 100; i++) {
if (i % 3 == 0 || i % 5 == 0) {
System.out.println(i)
}
}
}
What's the complexity (1+#forks) and name for eachmethod and class?
140 AST nodes(using ANTLR)
parse filtered parse
TypeDeclaration
Method
Name
run
Name
DemoForStatement
IfStatement
ConditionalExpression
7 AST nodes(using ANTLR)
22
![Page 45: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/45.jpg)
public class Demo {
public void run() {
for (int i = 1; i< 100; i++) {
if (i % 3 == 0 || i % 5 == 0) {
System.out.println(i)
}
}
}
What's the complexity (1+#forks) and name for eachmethod and class?
140 AST nodes(using ANTLR)
parse filtered parse
TypeDeclaration
Method
Name
run
Name
DemoForStatement
IfStatement
ConditionalExpression
7 AST nodes(using ANTLR)
23
![Page 46: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/46.jpg)
#4: Use non-duplicative data structuresto store your results
![Page 47: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/47.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
24
![Page 48: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/48.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
24
![Page 49: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/49.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
24
[1-1]
label
#attr
mcc
[4-4]
label
#attr
mcc
InnerClass
0 4
1 2 4
[2-3]
label
#attr
mcc
![Page 50: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/50.jpg)
rev. 1
rev. 2
rev. 3
rev. 4 [1-1]
label
#attr
mcc
[4-4]
label
#attr
mcc
InnerClass
0 4
1 2 4
[2-3]
label
#attr
mcc
25
![Page 51: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/51.jpg)
LISA also does:#5: Parallel Parsing#6: Asynchronous graph computation#7: Generic graph computationsapplying to ASTs from compatiblelanguages
26
![Page 52: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/52.jpg)
To Summarize...
![Page 53: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/53.jpg)
The LISA Analysis Process
27
www
clone
select project
![Page 54: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/54.jpg)
The LISA Analysis Process
27
www
clone
parallel parseinto mergedgraph
select project
Language Mappings(Grammar to Analysis)
determines which ASTnodes are loaded
ANTLRv4Grammar used by
Generates Parser
![Page 55: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/55.jpg)
The LISA Analysis Process
27
www
clone
parallel parseinto mergedgraph
Res
select project
storeanalysis
results
Analysis formulated as Graph Computation
Language Mappings(Grammar to Analysis) used by
Async. compute
determines which ASTnodes are loaded
determines whichdata is persisted
ANTLRv4Grammar used by
Generates Parser
runson graph
![Page 56: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/56.jpg)
The LISA Analysis Process
27
www
clone
parallel parseinto mergedgraph
Res
select project
storeanalysis
results
more projects?
Analysis formulated as Graph Computation
Language Mappings(Grammar to Analysis) used by
Async. compute
determines which ASTnodes are loaded
determines whichdata is persisted
ANTLRv4Grammar used by
Generates Parser
runson graph
![Page 57: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/57.jpg)
How well does it work, then?
![Page 58: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/58.jpg)
Marginal cost for +1 revision
41.670
4.6330.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033
0
5
10
15
20
25
30
35
40
45
50
1 10 100 1000 2000 3000 4000 5000 6000 7000
Average Parsing+Computation time per Revision when analyzing n revisions of AspectJ (10 common metrics)
Seco
nd
s
# of revisions
28
![Page 59: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/59.jpg)
Overall Performance Stats
29
Language Java C# JavScript
#Projects 100 100 100
#Revisions 646'261 489'764 204'301
#Files (parsed!) 3'235'852 3'234'178 507'612
#Lines (parsed!) 1'370'998'072 961'974'773 194'758'719
Total Runtime (RT)¹ 18:43h 52:12h 29:09h
Median RT¹ 2:15min 4:54min 3:43min
Tot. Avg. RT per Rev.² 84ms 401ms 531ms
Med. Avg. RT per Rev.² 30ms 116ms 166ms
¹ Including cloning and persisting results² Excluding cloning and persisting results
![Page 60: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/60.jpg)
What's the catch?(There are a few...)
![Page 61: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/61.jpg)
The (not so) minor stuff
• Must implement analyses from scratch
• No help from a compiler
• Non-file-local analyses need some effort
30
![Page 62: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/62.jpg)
The (not so) minor stuff
• Must implement analyses from scratch
• No help from a compiler
• Non-file-local analyses need some effort
• Moved files/methods etc. add overhead
• Uniquely identifying files/entities is hard
• (No impact on results, though)
30
![Page 63: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/63.jpg)
Language matters
31
Javascript C# Java
E.g.: Javascript takes longer because:• Larger files, less modularization• Slower parser (automatic semicolon-insertion)
![Page 64: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/64.jpg)
LISA is E X TREME
32
complexfeature-richheavyweight
simplegeneric
lightweight
![Page 65: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/65.jpg)
Thank you for your attention
SANER ‘17, Klagenfurt, 22.02.2017
Read the paper: http://t.uzh.ch/Fj
Try the tool: http://t.uzh.ch/Fk
Get the slides: http://t.uzh.ch/Fm
Contact me: [email protected]
![Page 66: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/66.jpg)
Parallelize Parsing
66
Single Git tree traversal
src/Main.java {1: 1251a4}, {3: fc2452}, {4: 251929}
src/Settings.java {2: fa255a}
src/Foo.java {1: 512fc2}, {4: 791c2a}, {5: bcb215}
src/Bar.java {4: 8a23b2}, {5: b2399f}
Obtain sequence of Git blob ids for old versions of each unique path
![Page 67: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/67.jpg)
Parallelize Parsing
67
Single Git tree traversal
src/Main.java {1: 1251a4}, {3: fc2452}, {4: 251929}
src/Settings.java {2: fa255a}
src/Foo.java {1: 512fc2}, {4: 791c2a}, {5: bcb215}
src/Bar.java {4: 8a23b2}, {5: b2399f}
Obtain sequence of Git blob ids for old versions of each unique path
Parse files with different paths in parallelSome files will have more revisions, taking longer to parse in total Parsing only takes roughly as long as required for the file with the most revisions
![Page 68: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/68.jpg)
Parallelize Parsing
68
Single Git tree traversal
src/Main.java {1: 1251a4}, {3: fc2452}, {4: 251929}
src/Settings.java {2: fa255a}
src/Foo.java {1: 512fc2}, {4: 791c2a}, {5: bcb215}
src/Bar.java {4: 8a23b2}, {5: b2399f}
Obtain sequence of Git blob ids for old versions of each unique path
Parse files with different paths in parallelSome files will have more revisions, taking longer to parse in total Parsing only takes roughly as long as required for the file with the most revisions
![Page 69: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/69.jpg)
"Speed-up factor" for each technique
• Parallel parsing: Roughly 2
• Merged ASTs: >1000 for many revisions
• Filtered parsing: >10 during
computation, depends on how much is
filtered
• All depends on file sizes / parser speed
![Page 70: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/70.jpg)
Asynchronous Graph Analysis
rev. 1
rev. 2
rev. 3
rev. 4
Depending on the node type:- Signal specfic data
70
![Page 71: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/71.jpg)
Asynchronous Graph Analysis
rev. 1
rev. 2
rev. 3
rev. 4
Depending on the node type:- Signal specfic data- Collect specific data
71
![Page 72: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/72.jpg)
Asynchronous Graph Analysis
rev. 1
rev. 2
rev. 3
rev. 4
Depending on the node type:- Signal specfic data- Collect specific data- Store specific data
72
![Page 73: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/73.jpg)
Asynchronous Graph Analysis
rev. 1
rev. 2
rev. 3
rev. 4
Depending on the node type:- Signal specfic data- Collect specific data- Store specific data- Create new nodes & edges
73
![Page 74: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/74.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
74
Example - Method Count (NOM):
Signal:
Collect:
Store:
Depending on the node type:- Signal specfic data- Collect specific data- Store specific data- Create new nodes & edges
Asynchronous Graph Analysis
![Page 75: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/75.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
75
Example - Method Count (NOM):
Signal:METHOD – nom: 1
Collect:
Store:
Depending on the node type:- Signal specfic data- Collect specific data- Store specific data- Create new nodes & edges
Asynchronous Graph Analysis
![Page 76: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/76.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
76
Example - Method Count (NOM):
Signal:METHOD – nom: 1
Collect:CLASS-like – nom += s.nom
Store:
Depending on the node type:- Signal specfic data- Collect specific data- Store specific data- Create new nodes & edges
Asynchronous Graph Analysis
![Page 77: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/77.jpg)
rev. 1
rev. 2
rev. 3
rev. 4
77
Example - Method Count (NOM):
Signal:METHOD – nom: 1
Collect:CLASS-like – nom += s.nom
Store:CLASS-like – nom
Depending on the node type:- Signal specfic data- Collect specific data- Store specific data- Create new nodes & edges
Asynchronous Graph Analysis
![Page 78: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/78.jpg)
Static source code analysis?
78
![Page 79: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/79.jpg)
Static source code analysis?
79
![Page 80: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/80.jpg)
Static source code analysis?
80
Simple Code Metrics(NOC, NOM, WMC, Complexity, ...)
![Page 81: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/81.jpg)
Static source code analysis?
81
Structure(Coupling, Inheritance, ...)
Simple Code Metrics(NOC, NOM, WMC, Complexity, ...)
![Page 82: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/82.jpg)
Static source code analysis?
82
Structure(Coupling, Inheritance, ...)
Simple Code Metrics(NOC, NOM, WMC, Complexity, ...)
![Page 83: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/83.jpg)
Static source code analysis?
83
Practicecode smells
refactoring advicehot-spot detection
bug prediction
Structure(Coupling, Inheritance, ...)
Simple Code Metrics(NOC, NOM, WMC, Complexity, ...)
![Page 84: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/84.jpg)
Static source code analysis?
84
Practicecode smells
refactoring advicehot-spot detection
bug prediction
Researchunderstanding software evolution identifying patterns &anti-patternscode quality assessment techniques...→ code studies
Structure(Coupling, Inheritance, ...)
Simple Code Metrics(NOC, NOM, WMC, Complexity, ...)
![Page 85: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/85.jpg)
(Many) existing studies
85
![Page 86: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/86.jpg)
(Many) existing studies
86
![Page 87: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/87.jpg)
(Many) existing studies
2g
• investigate a small number of projects
![Page 88: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/88.jpg)
(Many) existing studies
• investigate a small number of projects
source: github.com
source: openhub.net
88
![Page 89: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/89.jpg)
(Many) existing studies
• investigate a small number of projects
• analyze a few snapshots of multi-year projects
v0.7.0 v1.0.0 v1.3.0 v2.0.0 v3.0.0 v3.3.0 v3.5.0
source: github.com
source: openhub.net
89
![Page 90: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/90.jpg)
(Many) existing studies
• investigate a small number of projects
• analyze a few snapshots of multi-year projects
90
![Page 91: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/91.jpg)
(Many) existing studies
• investigate a small number of projects
• analyze a few snapshots of multi-year projects
• focus on very few programming languages
91
![Page 92: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/92.jpg)
Why?
92
Only few LanguagesOnly a few projects Only a few snapshots
![Page 93: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/93.jpg)
Why?
93
Only few LanguagesOnly a few projects Only a few snapshots
Time & resource intensive analysis
![Page 94: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/94.jpg)
Why?
94
Only few LanguagesOnly a few projects Only a few snapshots
Time & resource intensive analysis
Each commit analyzed seperately
![Page 95: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/95.jpg)
Why?
95
Only few LanguagesOnly a few projects Only a few snapshots
Manual work
Time & resource intensive analysis
Each commit analyzed seperately
![Page 96: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/96.jpg)
Why?
96
Only few LanguagesOnly a few projects Only a few snapshots
Manual work
Time & resource intensive analysis
Each commit analyzed seperately
Many preconditions for analyses
![Page 97: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/97.jpg)
Why?
97
Only few LanguagesOnly a few projects Only a few snapshots
Manual work
Time & resource intensive analysis
Each commit analyzed seperately
Many preconditions for analyses
![Page 98: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/98.jpg)
Why?
98
Only few LanguagesOnly a few projects Only a few snapshots
Manual work
Time & resource intensive analysis
Each commit analyzed seperately
Many preconditions for analyses
Analysis Tools are purpose-built, not designed for large-scale studies
![Page 99: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/99.jpg)
Why?
99
Only few LanguagesOnly a few projects Only a few snapshots
Manual work
Time & resource intensive analysis
Each commit analyzed seperately
Many preconditions for analyses
Analysis Tools are purpose-built, not designed for large-scale studies
Possible solution: LISAA fast, multi-purpose, graph-based analysis approach
![Page 100: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/100.jpg)
#3: Use asynchronous graph analysistechniques
![Page 101: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/101.jpg)
Rapid Analysis using LISA
101
Only few LanguagesOnly a few projects Only a few snapshots
Manual work
Time & resource intensive analysis
Each commit analyzed seperately
Many preconditions for analyses
Fast, general purpose code analysis tool aimed specifically at large scale
![Page 102: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/102.jpg)
Rapid Analysis using LISA
102
Only few LanguagesOnly a few projects Only a few snapshots
Manual work
Time & resource intensive analysis
Analyze all commits in parallel
Many preconditions for analyses
Fast, general purpose code analysis tool aimed specifically at large scale
![Page 103: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/103.jpg)
Rapid Analysis using LISA
103
Only few LanguagesOnly a few projects Only a few snapshots
«Point and Shoot»
Time & resource intensive analysis
Analyze all commits in parallel
Many preconditions for analyses
Fast, general purpose code analysis tool aimed specifically at large scale
![Page 104: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/104.jpg)
Rapid Analysis using LISA
104
Only few LanguagesOnly a few projects Only a few snapshots
«Point and Shoot»
Time & resource intensive analysis
Analyze all commits in parallel
Analysis directly on source code
Fast, general purpose code analysis tool aimed specifically at large scale
![Page 105: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/105.jpg)
Rapid Analysis using LISA
105
All code is equal(given a parser)
Only a few projects Only a few snapshots
«Point and Shoot»
Time & resource intensive analysis
Analyze all commits in parallel
Analysis directly on source code
Fast, general purpose code analysis tool aimed specifically at large scale
![Page 106: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/106.jpg)
Rapid Analysis using LISA
106
All code is equal(given a parser)
Only a few projects Only a few snapshots
«Point and Shoot»
Quick analysis of 1000s of commits
Analyze all commits in parallel
Analysis directly on source code
Fast, general purpose code analysis tool aimed specifically at large scale
![Page 107: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/107.jpg)
Rapid Analysis using LISA
107
All code is equal(given a parser)
Only a few projects All commits
«Point and Shoot»
Quick analysis of 1000s of commits
Analyze all commits in parallel
Analysis directly on source code
Fast, general purpose code analysis tool aimed specifically at large scale
![Page 108: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/108.jpg)
Rapid Analysis using LISA
108
All code is equal(given a parser)
Only a few projects All commits
«Point and Shoot»
Quick analysis of 1000s of commits
Analyze all commits in parallel
Analysis directly on source code
Fast, general purpose code analysis tool aimed specifically at large scale
![Page 109: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/109.jpg)
# of Commits: Linear Scaling
MarchUpdate(Total)
109
![Page 110: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/110.jpg)
Multi-Project Parallelization
27
LISA
AspectJ:7642 commits, 440k LOC
Requirements:20 GB memory45 min on 4 cores
![Page 111: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/111.jpg)
Multi-Project Parallelization
27
LISA
AspectJ:7642 commits, 440k LOC
Requirements:20 GB memory45 min on 4 cores
VM VM
VM
VMVM
VM
VM
VM
VM
VM
Parallelization scenario:10x Amazon EC 2 r3.8xlarge10x 244GB memory10x 32 cores
Potential to analyze 160AspectJ-sized projects per hour
(Most projects on git-hub are much smaller)
![Page 112: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/112.jpg)
Benchmarking
10
![Page 113: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/113.jpg)
Benchmarking
10
Is this good or bad?What does it even
mean?
Is this metric too high?
Here‘s a code smell - should I fix it?
![Page 114: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/114.jpg)
Benchmarking
10
![Page 115: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/115.jpg)
Benchmarking
10
Create reference points for orientation
![Page 116: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/116.jpg)
Cluster & Compare
11
![Page 117: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/117.jpg)
Cluster & Compare
11
• Discover „phenotypes“
• By metric values
![Page 118: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/118.jpg)
Cluster & Compare
11
• Discover „phenotypes“
• By metric values
• By metric evolution over
time
![Page 119: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/119.jpg)
Cluster & Compare
11
• Discover „phenotypes“
• By metric values
• By metric evolution over
time
• Compare programming
languages / paradigms
![Page 120: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/120.jpg)
Cluster & Compare
11
• Discover „phenotypes“
• By metric values
• By metric evolution over
time
• Compare programming
languages / paradigms
Find interesting projects for further study &identify evolution patterns to find (un-)desirable ones
![Page 121: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/121.jpg)
Machine Learning
12
Sequence of Changes(observable)
Bug Reports(training)
HMM
![Page 122: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/122.jpg)
Machine Learning
12
Sequence of Changes(observable)
Bug Reports(training)
HMM
Bug-Proneness(hidden)
![Page 123: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/123.jpg)
Machine Learning
12
Sequence of Changes(observable)
Bug Reports(training)
HMM
Bug-Proneness(hidden)
Leverage big data to train classifiers and predictors
![Page 124: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/124.jpg)
Study Replication
13
Conclusions X, Y, Z
![Page 125: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/125.jpg)
Study Replication
13
Conclusions X, Y, Z
Replicable with 1000s of projects?
![Page 126: Carol V. Alexandru, Sebastiano Panichella, Harald C. Gall · 2017. 2. 22. · 0.525 0.109 0.082 0.071 0.052 0.041 0.032 0.033 0 5 10 15 20 25 30 35 40 45 50 1 10 100 1000 2000 3000](https://reader033.vdocuments.net/reader033/viewer/2022052002/6014c010a6e0c55cbc054a16/html5/thumbnails/126.jpg)
Study Replication
13
Conclusions X, Y, Z
Replicable with 1000s of projects?
Confirmation or rebuttal of existing assumptions