Reverse Engineering Web Applications

Source: wpage.unina.it/ptramont/download/presentazione_icsm2005.pdf

Ph.D. Dissertation Forum, ICSM 2005. Ph.D. Dissertation: Reverse Engineering Web Applications. Porfirio Tramontana, University of Naples “Federico II”


Page 1:

Ph.D. Dissertation Forum – ICSM 2005

Ph.D. Dissertation

Reverse Engineering Web Applications

Porfirio Tramontana

University of Naples “Federico II”

Page 2:

Web Applications: open problems

In recent years there has been great demand for Web Applications, driven by the spread of the World Wide Web, which makes many services available all over the world

Web Applications have been developed with immature design methodologies and technologies

Today there are a number of legacy Web Applications in need of maintenance and re-engineering

Page 3:

Ph.D. Thesis Goals

• To propose models, methods and tools supporting the Reverse Engineering and Comprehension of Web Applications

• Reverse Engineering and Comprehension are fundamental tasks for efficiently supporting the maintenance, testing and quality assessment of Web Applications

Page 4:

Peculiarities of script-based Web Applications

• Page based

• Client-Server Architecture

• Interpreted languages

• Client pages may be generated “on the fly”

• Client pages are executed in a browser (and the designer doesn’t know what kind of browser will be used)

• HTML interpreters are fault tolerant

... and so on ...

Page 5:

A process for the Reverse Engineering of Web Applications

[Figure: process diagram with two phases, Extraction and Abstraction]

Extraction:

• Static Analysis of the WA source code

• Dynamic Analysis of the WA execution

Abstraction (activity: resulting artifacts):

• Identification of cloned components: Cloned components

• Identification of Interaction Design Patterns: Interaction Design Patterns

• Assignment of Concepts: Concepts describing Reverse Engineering artifacts

• Functional Clustering: Groups of pages realizing Web Application use cases

• Business Level UML Diagram Abstractions: Structural and Business Level UML diagrams

• Maintainability assessment

G.A. Di Lucca, A.R. Fasolino, P. Tramontana, “Reverse Engineering Web Application: the WARE approach”, Journal of Software Maintenance and Evolution: Research and Practice, Volume 16, Issue 1-2, Date: January - April 2004, Pages: 71-101

Page 6:

Analysis of Web Applications

1) Static analysis of the source code

• A multi-language parser that analyses the source code of Server pages, Client pages and Script modules has been implemented.

• During the analysis of server pages, facts about the client pages built by those server pages are also recorded.

• Static analysis results are stored in an intermediate form and are used to populate a relational database

2) Dynamic analysis

• Built Client pages are analysed in order to add to the database facts observed while executing the application

The reference model adopted is an extension of the one proposed by Conallen for the forward engineering of Web Applications
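The static-analysis step described above (parse pages, record facts, fill a relational database) can be illustrated with a minimal sketch. It is not the WARE parser: the `fact` table, the relation names (`link`, `submit`, `frame`) and the helper functions are hypothetical, and only plain HTML is handled, using the Python standard library.

```python
import sqlite3
from html.parser import HTMLParser


class FactExtractor(HTMLParser):
    """Collects (page, relation, target) facts from the source of one page."""

    def __init__(self, page):
        super().__init__()
        self.page = page
        self.facts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Hypothetical relation names; a real model would cover many more items.
        if tag == "a" and attrs.get("href"):
            self.facts.append((self.page, "link", attrs["href"]))
        elif tag == "form" and attrs.get("action"):
            self.facts.append((self.page, "submit", attrs["action"]))
        elif tag == "frame" and attrs.get("src"):
            self.facts.append((self.page, "frame", attrs["src"]))


def extract_facts(page, source):
    """Statically analyse one page and return the facts found in it."""
    parser = FactExtractor(page)
    parser.feed(source)
    return parser.facts


def store_facts(conn, facts):
    """Fill a (hypothetical) relational table with the extracted facts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact (page TEXT, relation TEXT, target TEXT)"
    )
    conn.executemany("INSERT INTO fact VALUES (?, ?, ?)", facts)
    conn.commit()


def targets(conn, page, relation):
    """Query the repository: targets reached from `page` via `relation`."""
    rows = conn.execute(
        "SELECT target FROM fact WHERE page = ? AND relation = ? ORDER BY target",
        (page, relation),
    )
    return [row[0] for row in rows]
```

Facts observed by dynamic analysis of built client pages could be added to the same table, so later abstraction steps would only need to query the database.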

Page 7:

Model of Web Applications

[Figure: class diagram of the Web Application reference model]

Classes shown: Web Page, Static Page, Server Page, Client Page, Built Page, Server Script, Server Function, Server Class, Server Cookie, Session Variable, Client Script, Client Function, Client Module, HTML Tag, Web Object, Form, Field, Textarea, Select, Button, Parameter, Hyperlink, Anchor, Frame, Frameset, Java Applet, Flash Object, Media, Other Object, Generic File, Download, Mail Address, Interface Object, DB Interface, Mail Interface, Server File Interface, Other Interface

Other labels shown: Submits, include, source, redirect, event, Modify Tag

Page 8:

WARE (Web Application Reverse Engineering) tool

[Figure: WARE Architecture. Labels shown: WARE-Tool, WA Source Files, Interface, Service, Repository, WARE GUI, Extractor, Abstractor, Parser (HTML, ASP, VBS, PHP, JS, ...), IRF Translator, Query Executor, UML Diagrams Abstractor, IRF, DBR, Diagrams, Graphical Visualizer (Dotty, VCG, RIGI)]

[Figure: Detail Class Diagram abstracted by WARE. Pages shown: /areadocente.html, /check.asp, /autenticazionedocente.html; relationships shown: Submit, Redirect, Builds]

G. A. Di Lucca, A.R. Fasolino, U. De Carlini, F. Pace, P. Tramontana, “WARE: a tool for the Reverse Engineering of web Applications”, Proc. of 6th IEEE European Conference on Software Maintenance and Reengineering, CSMR 2002, IEEE CS Press, Los Alamitos, CA, Pages: 241 - 250

Page 9:

Functional Clustering of Web Pages

• Goal:

To cluster together the subsets of components that realize Web Application functionalities

• Proposed Technique:

A hierarchical clustering algorithm that groups Web Application pages into subsets, maximizing the cohesion within each subset and minimizing the coupling between subsets

G. A. Di Lucca, A.R. Fasolino, U. De Carlini, F. Pace, P. Tramontana, “Comprehending Web Applications by a Clustering Based Approach”, Proc. of 10th IEEE Workshop on Program Comprehension, IWPC 2002, Pages:261 - 270
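As an illustration of hierarchical clustering over pages, the sketch below uses a simple agglomerative strategy: it repeatedly merges the two clusters joined by the most links, so that strongly coupled pages end up inside the same cluster. This is a simplification under stated assumptions, not the algorithm of the cited paper: the coupling measure (a raw link count) and the stopping rule (`target` clusters) are hypothetical.

```python
from itertools import combinations


def coupling(a, b, links):
    """Number of links, in either direction, between clusters a and b."""
    return sum(
        1
        for src, dst in links
        if (src in a and dst in b) or (src in b and dst in a)
    )


def cluster_pages(pages, links, target):
    """Merge the most coupled pair of clusters until `target` clusters remain."""
    target = max(target, 1)
    clusters = [frozenset([page]) for page in sorted(pages)]
    while len(clusters) > target:
        best = max(
            combinations(clusters, 2),
            key=lambda pair: coupling(pair[0], pair[1], links),
        )
        if coupling(best[0], best[1], links) == 0:
            break  # the remaining clusters are not connected at all
        # Links between the merged clusters become internal (cohesion).
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters
```

Recording each merge instead of only the final partition would yield the full hierarchy (dendrogram), from which a cut level can be chosen.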

Page 10:

Concept Assignment

Goal:

To identify the most relevant concepts in client pages, in order to suggest a semantic description of client pages and of functional clusters of pages

Proposed Technique:

Heuristic algorithms based on Information Retrieval

• Candidate concepts are searched for in the textual content of client pages

• Single common words and short word sequences are candidates to be concepts

[Figure: class diagram of the concept assignment model. Classes shown: Web Page, Server Page, Client Page (File name), Static Client Page, Built Client Page, Control Component, Data Component, Tag (Name, Weight), Attribute (Name), Text (Weight), Word, StopWord, Concept. Associations shown: <<builds>>, nested in, has synonym, has stem]

G.A. Di Lucca, A.R.Fasolino, P.Tramontana, U.De Carlini, “Supporting Concept Assignment in the Comprehension of Web Applications”, Proceedings of the 28th IEEE Annual International Computer Software and Applications Conference, COMPSAC 2004
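A minimal sketch of the idea of candidate concepts: words and two-word sequences are taken from the text of a client page, stop words are dropped, and each occurrence is scored with a weight that depends on the enclosing tag. The stop-word list, the tag weights and the scoring rule are all hypothetical, and stemming and synonyms (present in the model above) are left out.

```python
import re
from collections import Counter
from html.parser import HTMLParser

STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "your"}  # hypothetical
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "b": 1.5}  # hypothetical weights


class WeightedText(HTMLParser):
    """Scores words and word pairs by the heaviest tag that encloses them."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.scores = Counter()

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        weight = max((TAG_WEIGHTS.get(t, 1.0) for t in self.stack), default=1.0)
        words = [
            w for w in re.findall(r"[a-z]+", data.lower()) if w not in STOP_WORDS
        ]
        for word in words:
            self.scores[word] += weight
        # Short word sequences: pairs of adjacent non-stop words.
        for first, second in zip(words, words[1:]):
            self.scores[first + " " + second] += weight


def candidate_concepts(source, top=3):
    """Return the `top` highest-scoring candidate concepts of a client page."""
    parser = WeightedText()
    parser.feed(source)
    parser.close()
    return [term for term, _ in parser.scores.most_common(top)]
```

For a hypothetical exam-reservation page, the terms repeated in the title and headings would rank first and could be proposed to the analyst as a description of the page.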

Page 11:

Interaction Design Patterns Identification

Goal:

• To identify repetitive structures in Web Client pages

• These structures can be related to known Programming Patterns

Proposed Technique:

• Statistical methodology based on features extracted from the source code of client pages

• Features: presence, quantity and dimension of forms, tables, input fields, frames, common keywords and so on

G.A. Di Lucca, A.R. Fasolino, P. Tramontana, “Recovering Interaction Design Patterns in Web Applications”, submitted to 9th IEEE European Conference on Software Maintenance and Reengineering, CSMR 2005
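The feature-extraction step can be sketched as follows: each client page is reduced to a vector counting a few structural elements. The selection of tags in `FEATURE_TAGS` is an assumption made for illustration; the statistical analysis that relates such vectors to known patterns is not shown.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical feature set: one count per structural element of interest.
FEATURE_TAGS = ("form", "table", "input", "frame")


class FeatureCounter(HTMLParser):
    """Counts the occurrences of the selected tags in a client page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in FEATURE_TAGS:
            self.counts[tag] += 1


def page_features(source):
    """Return the feature vector of a page, ordered as FEATURE_TAGS."""
    parser = FeatureCounter()
    parser.feed(source)
    return [parser.counts[tag] for tag in FEATURE_TAGS]
```

A page with one form and two input fields and no tables or frames would map to `[1, 0, 2, 0]`, a profile that a login-style page might plausibly share with other pages of the same kind.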

Page 12:

Identification of cloned components

Goals:

• Re-Engineering of cloned components via code transformations

• Classification of Built Client Pages

• Identification of reusable Programming Patterns

Proposed Techniques:

• Extraction of features from the structure of Client pages and from the source code of server pages

• Computation of distance measures between pages (Euclidean distance, Levenshtein edit distance)

G.A. Di Lucca, A.R. Fasolino, P. Tramontana, U. De Carlini, “Identifying Reusable Components in Web Applications”, IASTED International Conference on Software Engineering, SE 2004, pp.526-531
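The two distance measures named above can be sketched as follows. Representing a page by its sequence of HTML tags (for the edit distance) or by a numeric feature vector (for the Euclidean distance) is an assumption made here for illustration; the cited paper defines its own features.

```python
import math
from html.parser import HTMLParser


class TagSequence(HTMLParser):
    """Records the sequence of opening tags of a page, ignoring its text."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(source):
    parser = TagSequence()
    parser.feed(source)
    return parser.tags


def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of tags)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + (x != y),  # substitution
                )
            )
        previous = current
    return previous[-1]


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

Two pages whose tag sequences have distance 0 share the same structure even if their text differs, which makes them clone candidates; a small non-zero distance suggests a near clone.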

Page 13:

Abstraction of Business Level Models

Goals:

To abstract object-oriented business level models of Web Applications

Proposed Techniques:

• Classes and attributes are identified by analysing the data exchanged between users, Web pages and databases

• Class methods are identified by analysing the functions implemented by clusters of pages

• Relationships between classes are identified by analysing data structures and data flows among pages

[Figure: example business level class diagram. Classes and attributes shown:]

• Tutoring request: Date

• Teacher: Name, Surname, E-mail, Phone number, Password, Code

• Tutoring: Date, Start time, End time

• News: Number, Date, Text

• Student: Name, Surname, E-mail, Password, Code, Phone number

• Exam: Date, Time, Classroom

• Course: Academic year, Code, Name

• Exam Reservation: Date

G.A. Di Lucca, A.R.Fasolino, U.De Carlini, P.Tramontana, “Recovering a Business Object Model from Web Applications”, Proceedings of the 27th IEEE Annual International Computer Software and Applications Conference, COMPSAC 2003, Pages: 348 - 353

Page 14:

Maintainability Model

Goals:

To propose models and methods for assessing the maintainability of Web Applications

Proposed Models and Techniques:

• Adaptation to Web Applications of the Oman model (conceived for traditional applications)

• Selection of a set of product metrics and proposal of a maintainability index that can be calculated with negligible effort and time

G.A. Di Lucca, A.R.Fasolino, P.Tramontana, C.A.Visaggio, “Towards the definition of a maintainability model for web applications”, Proceedings of the Eighth IEEE European Conference on Software Maintenance and Reengineering, CSMR 2004, pages:279 - 287

Page 15:

Current and future work

• Techniques for the dynamic analysis of Web Applications

• Accessibility assessment of Client pages

• Migration from Web Applications to Web Services

• Testing of Web Applications: Mutation Testing techniques

• Maintainability assessment: definition of ageing measures for Web Applications

G.A. Di Lucca, M. Di Penta, A.R. Fasolino, P. Tramontana, “Supporting Web Application Evolution by Dynamic Analysis”, IWPSE 2005

G.A. Di Lucca, A.R. Fasolino, P. Tramontana, “Web Site Accessibility: Identifying and Fixing of Accessibility Problems in Client Page Code”, WSE 2005
