API Usage Pattern Extraction using Semantic Similarity

Upload: masud-rahman

Post on 17-Jun-2015

DESCRIPTION

A project that extracts API usage patterns by exploiting semantic similarity among API usage code examples.

TRANSCRIPT

Page 1: API Usage Pattern Extraction using Semantic Similarity

SEMANTIC NETWORK BASED API USAGE PATTERN EXTRACTION & LEARNING

Mohammad Masudur Rahman

[email protected]

Department of Computer Science

University of Saskatchewan

PRESENTATION OVERVIEW

Introduction
Motivating Example
Background Concepts
Proposed Approach
Semantic Network of Source Code
API Usage Pattern Extraction
Pattern Learning & Visualization
Experimental Results & Discussions
Threats to Validity
Conclusion & Future Works

INTRODUCTION

API (Application Programming Interface) Libraries

API Documentation, API Browser, forums
API usage learning for developers
Existing projects using APIs
API Usage Patterns

WHAT IS API USAGE PATTERN?

A frequent and consistent sequence of API method calls and field accesses

Performs a particular programming task
Widely used in multiple projects
Widely accepted by the developer community
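To make the definition concrete, the sketch below counts how many projects contain a given ordered sequence of API calls and keeps the sequences seen in at least a minimum number of projects. The call sequences, the project names, and the `min_projects` cutoff are all invented for illustration; the slides do not state how frequency is measured.

```python
from collections import defaultdict

# Hypothetical call sequences (illustrative data, not taken from the study).
READ_LINES = ("FileReader.<init>", "BufferedReader.<init>",
              "BufferedReader.readLine", "BufferedReader.close")
READ_ONCE = ("FileReader.<init>", "FileReader.read")

usages = {
    "project_a": [READ_LINES],
    "project_b": [READ_LINES, READ_ONCE],
    "project_c": [READ_LINES],
}


def frequent_sequences(usages, min_projects):
    """Map each call sequence to the number of distinct projects using it,
    keeping only sequences used by at least `min_projects` projects."""
    projects_per_sequence = defaultdict(set)
    for project, sequences in usages.items():
        for sequence in sequences:
            projects_per_sequence[sequence].add(project)
    return {
        sequence: len(projects)
        for sequence, projects in projects_per_sequence.items()
        if len(projects) >= min_projects
    }


patterns = frequent_sequences(usages, min_projects=2)
for sequence, count in patterns.items():
    print(count, " -> ".join(sequence))
```

Here only the read-lines sequence survives, because the single-read sequence appears in one project only.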

API USAGE PATTERN

BIG QUESTION?

How to extract the API usage patterns from the source code?

SEMANTIC WEB OR NETWORK

Where does the author of a particular software manual live?

MOTIVATING EXAMPLE

RESEARCH QUESTIONS

RQ 1: Can semantic network technologies represent the semantics of OO source code properly?

RQ 2: Can this representation be used for API usage pattern extraction and learning?

BACKGROUND CONCEPTS

API Usage Patterns
API Usage Violation & Anomalies
Semantic Web
Semantic Network of Source Code
Resource Description Framework (RDF)
RDF Statement or Triples

RDF TRIPLE (BUILDING BLOCK OF SEMANTIC WEB OR NETWORK)

Subject Predicate Object
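As a rough illustration of how subject-predicate-object triples can answer the question raised on the semantic web slide (where a manual's author lives), the sketch below stores triples as plain Python tuples and follows two predicates in a row. The resource names and the predicates `hasAuthor`, `livesIn`, and `describes` are hypothetical; a real RDF store would use URIs and a query language such as SPARQL.

```python
# Each statement is one (subject, predicate, object) triple.
# All names below are invented for illustration.
triples = {
    ("manual:UserGuide", "hasAuthor", "person:Alice"),
    ("person:Alice", "livesIn", "city:Saskatoon"),
    ("manual:UserGuide", "describes", "software:EditorX"),
}


def objects(triples, subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def author_residences(triples, manual):
    """Follow manual -> hasAuthor -> livesIn to find where the authors live."""
    return {
        city
        for author in objects(triples, manual, "hasAuthor")
        for city in objects(triples, author, "livesIn")
    }


print(author_residences(triples, "manual:UserGuide"))
```

The answer is not stored in any single triple; it comes from chaining two of them, which is the property the semantic network relies on.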

PROPOSED APPROACH FOR API USAGE PATTERN EXTRACTION & LEARNING

Fig: Workflow of the proposed approach, with steps numbered 1 to 9. Components: API Class List, OSS Projects, "Contains API?" check (Yes/No), Source code parser, Semantic Network Builder, API Pattern Explorer, API Usage Pattern Manager, RDF Pattern Visualizer, Pattern Source Skeleton Builder. Data passed between components: API classes, source files, parsed expressions, RDF files, patterns.
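The "Contains API?" check in the workflow can be pictured as a filter over source files. The sketch below is one possible reading, assuming the check looks at Java import statements; the slides do not describe how the check is implemented, and the file contents and class list here are made up.

```python
import re

# Hypothetical API class list (fully qualified names).
API_CLASSES = {"java.io.BufferedInputStream", "java.util.ArrayList"}

# Matches single-type imports such as "import java.io.File;".
# Wildcard imports ("import java.io.*;") are not matched by this pattern.
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)\s*;", re.MULTILINE)


def contains_api(source, api_classes):
    """True if the source text imports at least one class from the list."""
    imported = set(IMPORT_RE.findall(source))
    return bool(imported & api_classes)


# Illustrative source files, keyed by file name.
sources = {
    "Reader.java": "package demo;\nimport java.io.BufferedInputStream;\nclass Reader {}\n",
    "Plain.java": "package demo;\nclass Plain {}\n",
}

selected = [name for name, text in sources.items()
            if contains_api(text, API_CLASSES)]
print(selected)
```

Only files that pass this filter would be handed to the parser; files that use fully qualified names without an import would need a different check.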

SOURCE CODE SEMANTIC NETWORK

Fig: Construction of the source code semantic network. Elements shown: Java source code, AST Parser (Javaparser), Java expressions, API expression selection rules, RDF Maker, Apache Jena Framework, RDF triples, RDF network.
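The step from parsed expressions to triples might look like the sketch below. It assumes the parser has already reduced each Java statement to a small record, and it applies a single selection rule: keep only expressions that involve a class from the API list. The record layout and the predicate names `instanceOf` and `calls` are assumptions made for this example; the actual tool uses Javaparser and Apache Jena, which are not shown.

```python
# Hypothetical API class list.
API_CLASSES = {"BufferedInputStream", "FileInputStream"}

# Hypothetical parser output for a short Java method.
expressions = [
    {"kind": "creation", "var": "fis", "type": "FileInputStream"},
    {"kind": "creation", "var": "in", "type": "BufferedInputStream"},
    {"kind": "call", "var": "in", "method": "read"},
    {"kind": "call", "var": "in", "method": "close"},
    {"kind": "creation", "var": "sb", "type": "StringBuilder"},
    {"kind": "call", "var": "sb", "method": "append"},
]


def to_triples(expressions, api_classes):
    """Emit (subject, predicate, object) triples for API-related expressions."""
    var_types = {}  # variable name -> declared class
    triples = []
    for expr in expressions:
        if expr["kind"] == "creation":
            var_types[expr["var"]] = expr["type"]
            if expr["type"] in api_classes:
                triples.append((expr["var"], "instanceOf", expr["type"]))
        elif expr["kind"] == "call":
            # Selection rule: keep calls whose receiver is an API object.
            if var_types.get(expr["var"]) in api_classes:
                triples.append((expr["var"], "calls", expr["method"]))
    return triples


network = to_triples(expressions, API_CLASSES)
for triple in network:
    print(triple)
```

The `StringBuilder` expressions are dropped because that class is not in the list, so the resulting network holds only the API usage.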

API USAGE PATTERN EXTRACTION

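Fig: Pattern extraction flow. Elements shown: all usages of an API class, common sub-graph selection, candidate API usage patterns, and a "Pattern Score > threshold?" decision whose branches lead to selected API usage patterns (Yes) and discarded candidates (No).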
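A simplified version of this flow can be sketched as follows. Each usage is modelled as a set of triples, the edges shared by a pair of usages stand in for a common sub-graph, and a candidate's score is the fraction of usages that contain it. Both the pairwise-intersection shortcut and the scoring formula are assumptions for illustration; the slides name the steps but do not define the score, and real common sub-graph mining is more involved.

```python
from itertools import combinations

C = "BufferedInputStream"


def edge(method):
    """Hypothetical edge: the API class calls the given method."""
    return (C, "calls", method)


# Illustrative usages of one API class, each a set of edges.
usages = [
    frozenset({edge("<init>"), edge("read"), edge("close")}),
    frozenset({edge("<init>"), edge("read"), edge("close"), edge("available")}),
    frozenset({edge("<init>"), edge("read"), edge("mark")}),
    frozenset({edge("<init>"), edge("skip")}),
]


def candidate_patterns(usages):
    """Edge sets shared by each pair of usages."""
    candidates = set()
    for first, second in combinations(usages, 2):
        common = first & second
        if common:
            candidates.add(common)
    return candidates


def score(pattern, usages):
    """Fraction of usages that contain every edge of the pattern."""
    return sum(pattern <= usage for usage in usages) / len(usages)


def select_patterns(usages, threshold):
    """Keep candidates whose score is strictly above the threshold."""
    scored = {p: score(p, usages) for p in candidate_patterns(usages)}
    return {p: s for p, s in scored.items() if s > threshold}


selected = select_patterns(usages, threshold=0.6)
for pattern, value in selected.items():
    print(value, sorted(method for _, _, method in pattern))
```

With a threshold of 0.6, the constructor-read-close candidate is discarded because only half of the usages contain it, while the two smaller candidates are kept.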

EXPERIMENTAL RESULTS

25 open source projects
3 API libraries (java.io, java.util, java.awt)
250 API classes selected
API usages found for 113 API classes
Patterns found for 76 API classes
Total: 776 patterns

API USAGE PATTERNS

SOURCE CODE SKELETON

Fig: BufferedInputStream Usage Pattern

EXPERIMENTAL RESULTS

Project       #Class   #M&C  #ATCF  #ADCF  #ATPF  #ADPF
Ant-Contrib      186   1388     96     23   1865    280
AOI              461   6489    218     55   1651    494
Groimp          1202  13875    132     41   1632    407
JFreechart      1059  12368    507     38   6841    410
JHotdraw7        689   7330    310     49   2547    462

#M&C = Methods & Constructors, #ATCF = Total API classes, #ADCF = Distinct API classes, #ATPF = Total API patterns found, #ADPF = Distinct API patterns found

PATTERNS PER CLASS

Fig: # patterns extracted per class comparison

RESULTS DISCUSSION

RQ 1: Can semantic network technologies represent the semantics of OO source code properly?

Graph-based API Usage Extraction by Nguyen et al., FSE, 2009: Incomplete semantics for edges and attributes

Source code ontology by Wursch et al., ICSE, 2010: Does not represent the complete source code

The proposed approach captures expression-level syntax and semantics

Focuses on API usage patterns

RESULTS DISCUSSION

RQ 2: Can this representation be used for API usage pattern extraction and learning?

Successfully extracts 776 patterns for 76 API classes from 25 open source projects

A promising approach that merits further exploration for API usage pattern extraction

Visualization of the RDF network helps in learning
Source code as visual entities rather than lines
More comprehensive idea about OO source code
Applicable for complex OO relationships
Very useful for quick learning

THREATS TO VALIDITY

Representing complete semantics: a non-trivial task.

More expressions for more accurate representation

RDF pattern visualization within limited display

Users need to be introduced to RDF conventions

CONCLUSION & FUTURE WORKS

Applicability of semantic web technologies for API usage pattern extraction

Semantic representation for learning by the developers

Real world user study
Extracted patterns for automatic code completion in the IDE
Extracted patterns for API violation and anomaly detection

THANK YOU!!!

REFERENCES

[1] Semantic web diagram. URL http://www.w3.org/Talks/2002/10/16-sw/slide7-0.html.
[2] Tung Thanh Nguyen, Hoan Anh Nguyen, Nam H. Pham, Jafar M. Al-Kofahi, and Tien N. Nguyen. Graph-based mining of multiple object usage patterns. In Proc. ESEC/FSE, 2009, pages 383-392.
[3] M. Wursch, G. Ghezzi, G. Reif, and H. C. Gall. Supporting developers with natural language queries. In Proc. ICSE, 2010, pages 165-174.
[4] Tao Xie and Jian Pei. MAPO: mining API usages from open source repositories. In Proc. MSR, 2006, pages 574-57
[5] Semantic web technology. URL http://www.w3.org/2001/sw
[6] Visual learning style. URL http://www.learning-styles-online.com/style/visual-spatial.
[7] Apache Jena framework. URL http://jena.apache.org/.
[8] Javaparser: Java 1.5 parser and AST. URL http://code.google.com/p/javaparser/.
[9] RDF-gravity tool. URL http://semweb.salzburgresearch.at/apps/rdf-gravity/.