TRANSCRIPT
RECOMMENDING RELEVANT SECTIONS FROM A WEBPAGE ABOUT PROGRAMMING ERRORS AND EXCEPTIONS
Mohammad Masudur Rahman and Chanchal K. Roy
Software Research Lab, Department of Computer Science
University of Saskatchewan, Canada
25th Center for Advanced Studies Conference (CASCON 2015)
EXCEPTION: A COMMON EXPERIENCE!!
[Screenshot: stack trace with the exception triggering point highlighted]
SOLVING EXCEPTION (STEP I: QUERY SELECTION)
Selection of a traditional search query
Switching to the web browser for web search
This query may not be sufficient for most of the exceptions.
SOLVING EXCEPTION (STEP II: WEB SEARCH)
The browser does NOT know the context (i.e., details) of the exception.
Not much helpful ranking; forces the developer to SWITCH back and forth between IDE and browser.
Trial and error in searching: 19% of development time is spent in web search.
Switching is often distracting.
SOLVING EXCEPTION (STEP III: MAPPING TO PAGE SECTIONS)
Mapping between the exception & relevant page sections is non-trivial.
Automated mapping between exception & relevant page sections
IDE-based web page content suggestion for review
OUTLINE OF THIS TALK
ContentSuggest Architecture
Metrics & Algorithm
Empirical evaluation & validation (using webpages)
Validation with IR techniques (using SO posts)
Conclusion
CONTENTSUGGEST—ARCHITECTURE
[Architecture diagram: processing pipeline from Start to End]
PROPOSED METRICS
Content Density (CTD): Text Density (TD), Link Density (LD), Code Density (CD). Captures purity of textual content (fewer hyperlinks).
Content Relevance (CTR): Text Relevance (TR), Code Relevance (CR). Captures relevance of textual content to the exception (interesting tokens).
Content Score (CTS) = γ * Content Density + δ * Content Relevance (normalized metrics)
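As a rough illustration, the metric combination above can be sketched as follows. The function names, the density and relevance formulas, and the default weights γ = δ = 0.5 are assumptions for illustration only, not the authors' exact implementation.

```python
# Illustrative sketch of the proposed metrics (assumed formulas, not the
# authors' code). Densities and relevance are normalized to [0, 1].

def content_density(text_len, link_text_len, code_len, total_len):
    """Content Density (CTD): purer text and code, fewer hyperlinks."""
    if total_len == 0:
        return 0.0
    td = text_len / total_len       # Text Density (TD)
    ld = link_text_len / total_len  # Link Density (LD)
    cd = code_len / total_len       # Code Density (CD)
    return (td + cd) * (1.0 - ld)   # penalize link-heavy sections

def content_relevance(section_tokens, exception_tokens):
    """Content Relevance (CTR): token overlap with the exception details."""
    if not section_tokens:
        return 0.0
    shared = set(section_tokens) & set(exception_tokens)
    return len(shared) / len(set(section_tokens))

def content_score(density, relevance, gamma=0.5, delta=0.5):
    """CTS = gamma * Content Density + delta * Content Relevance."""
    return gamma * density + delta * relevance
```

With γ = δ = 0.5, a section with density 0.8 and relevance 0.4 would score 0.6; sections are then ranked by this combined score.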
PROPOSED TECHNIQUE (CONTENTSUGGEST)
[DOM tree of an example web page: HTML root with HEAD (TITLE, STYLE, SCRIPT) and BODY (DIV, H1, P, B, OL/LI, TABLE/TBODY/TR/TD) nodes, with text leaves]
PROPOSED TECHNIQUE (CONTENTSUGGEST)--SCORING
[Same DOM tree, now with a content score assigned to each node]
PROPOSED TECHNIQUE (CONTENTSUGGEST)--TAGGING
[Same DOM tree, with each node tagged as Content or Noise]
PROPOSED TECHNIQUE (CONTENTSUGGEST)--FILTERING
[Same DOM tree after filtering: noise subtrees removed, content sections retained]
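The traverse, score, tag, and filter steps of the last three slides can be sketched as follows. The `Node` class, the threshold, and the labeling rule (a node is content if it or any descendant scores high enough) are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed, not the authors' code) of the scoring-tagging-
# filtering pipeline over a DOM tree.

class Node:
    def __init__(self, tag, score=0.0, children=None):
        self.tag = tag            # HTML tag name, e.g. "DIV"
        self.score = score        # precomputed content score for this node
        self.children = children or []
        self.label = None         # "content" or "noise" after tagging

def tag_nodes(node, threshold):
    """Post-order traversal: label a node content if its score passes the
    threshold or any of its children were labeled content."""
    labels = [tag_nodes(c, threshold) for c in node.children]
    child_content = "content" in labels
    node.label = "content" if (node.score >= threshold or child_content) else "noise"
    return node.label

def filter_content(node):
    """Keep only the tags of content-labeled nodes; prune noise subtrees."""
    if node.label != "content":
        return []
    kept = [node.tag]
    for c in node.children:
        kept.extend(filter_content(c))
    return kept
```

For example, a BODY with one high-scoring DIV and one low-scoring DIV (containing only a SCRIPT) would keep the first subtree and discard the second.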
EXPERIMENTS
[Experiment workflow: 80 exceptions + 250 web pages; manual analysis (25 hours) produces gold sections for evaluation; validation against Sun et al; SO posts from the Stack Overflow crowd used for validation with IR techniques (VSM, LSA)]
PERFORMANCE METRICS
Precision (P): % of the retrieved content (a) that belongs to the gold content (b) of the page.
Recall (R): % of the gold content (b) that is retrieved (a) by the technique.
F1-measure (F1): Harmonic mean of Precision (P) & Recall (R).

P = |LCS(a, b)| / |a|
R = |LCS(a, b)| / |b|
F1 = 2PR / (P + R)
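The three formulas above can be written directly in code. Treating the retrieved content `a` and gold content `b` as token lists, and using a standard dynamic-programming LCS, is an illustrative assumption about the representation:

```python
# Sketch of the LCS-based evaluation metrics: a = retrieved content,
# b = gold content, both as token sequences.

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def precision_recall_f1(retrieved, gold):
    """P = |LCS(a,b)|/|a|, R = |LCS(a,b)|/|b|, F1 = 2PR/(P+R)."""
    lcs = lcs_length(retrieved, gold)
    p = lcs / len(retrieved) if retrieved else 0.0
    r = lcs / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1
```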
RESEARCH QUESTIONS (4)
RQ1: How effective is ContentSuggest in recommending relevant content from a web page?
RQ2: How effective are the proposed metrics in identifying relevant page content?
RQ3: Can ContentSuggest outperform the baseline technique?
RQ4: Does ContentSuggest perform better than IR techniques (VSM, LSI) in identifying relevant content?
ANSWERING RQ1 & RQ2– EVALUATION OF TECHNIQUE & METRICS

Scores                                  Metric   SO-Pages   Non-SO Pages   All Pages
{Content Density}                       MP       50.91%     49.50%         50.07%
                                        MR       91.74%     75.71%         82.18%
                                        MF       62.32%     53.76%         57.22%
{Content Relevance}                     MP       86.63%     69.17%         76.23%
                                        MR       52.17%     57.66%         55.44%
                                        MF       61.07%     55.88%         57.98%
{Content Density, Content Relevance}    MP       92.64%     74.60%         81.96%
(Proposed Technique)                    MR       74.17%     78.51%         76.74%
                                        MF       80.95%     73.09%         76.30%

[SO = Stack Overflow, MP = Mean Precision, MR = Mean Recall, MF = Mean F1-measure]
ANSWERING RQ3– COMPARISON WITH BASELINE TECHNIQUE

Content Extractor             Metric   SO-Pages   Non-SO Pages   All Pages
Sun et al. (SIGIR 2011)       MP       52.63%     38.89%         44.44%
                              MR       86.49%     41.84%         59.88%
                              MF       62.57%     34.49%         45.84%
ContentSuggest                MP       92.64%     74.60%         81.96%
(Proposed Technique)          MR       74.17%     78.51%         76.74%
                              MF       80.95%     73.09%         76.30%

[SO = Stack Overflow, MP = Mean Precision, MR = Mean Recall, MF = Mean F1-measure]

ContentSuggest performed better for all 3 sets of pages (SO pages, Non-SO pages, and All pages) and for all metrics (precision, recall and F1-measure).
ANSWERING RQ4– COMPARISON WITH IR TECHNIQUES (VSM, LSI)

Content Extractor                  Metric   Accepted Posts   Most Voted Posts
Latent Semantic Analysis           MP       19.98%           23.02%
(Marcus et al, ICSE 2003)          MR       21.78%           23.17%
                                   MF       18.43%           21.07%
Vector Space Model                 MP       22.50%           33.89%
(Antoniol et al, TSE 2002)         MR       23.08%           31.90%
                                   MF       19.77%           30.44%
ContentSuggest                     MP       23.10%           31.36%
(Proposed Technique)               MR       45.15%           54.42%
                                   MF       26.99%           35.90%
THREATS TO VALIDITY
Gold content preparation: despite cross-validation, it may contain subjective bias.
Limited training dataset: metric weights were trained on a limited dataset.
Usability concern: a fully fledged user study is required to validate the applicability of the technique; only a limited study with 6 participants was performed.
TAKE-HOME MESSAGE
19% of development time is spent simply in web search (Brandt et al, SIGCHI 2009).
Mapping between information in the IDE and in a web page can be non-trivial and time-consuming.
ContentSuggest automates such mapping in the context of exception handling.
Content Density and Content Relevance are found effective in identifying relevant sections from a web page.
ContentSuggest outperforms one baseline technique and two IR techniques (VSM, LSI).
THANK YOU!!
REFERENCES
[1] J. Brandt, P.J. Guo, J. Lewenstein, M. Dontcheva, and S.R. Klemmer. Two Studies of Opportunistic Programming: Interleaving Web Foraging, Learning, and Writing Code. In Proc. SIGCHI, pages 1589-1598, 2009.
[2] G. Antoniol, G. Canfora, G. Casazza, A. De Lucia, and E. Merlo. Recovering Traceability Links between Code and Documentation. TSE, 28(10):970-983, 2002.
[3] A. Marcus and J.I. Maletic. Recovering Documentation-to-Source-Code Traceability Links Using Latent Semantic Indexing. In Proc. ICSE, pages 125-135, 2003.
[4] F. Sun, D. Song, and L. Liao. DOM Based Content Extraction via Text Density. In Proc. SIGIR, pages 245-254, 2011.
[5] L. Ponzanelli, A. Bacchelli, and M. Lanza. Seahawk: Stack Overflow in the IDE. In Proc. ICSE, pages 1295-1298, 2013.
[6] M.M. Rahman, S. Yeasmin, and C.K. Roy. Towards a Context-Aware IDE-Based Meta Search Engine for Recommendation about Programming Errors and Exceptions. In Proc. CSMR-WCRE, pages 194-203, 2014.
[7] ContentSuggest Web Portal. URL http://www.usask.ca/~mor543/contentsuggest
[8] C.K. Roy and J.R. Cordy. NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization. In Proc. ICPC, pages 172-181, 2008.