sreedevi sampath, university of delaware amie souter, drexel university
Post on 07-Feb-2016
37 Views
Preview:
DESCRIPTION
TRANSCRIPT
Towards Defining and Exploiting Similarities in Web Application Use
Cases through User Session Analysis
Sreedevi Sampath, University of Delaware
Amie Souter, Drexel University
Lori Pollock, University of Delaware
Workshop on Dynamic Analysis (WODA), May 25, 2004co-located with
International Conference on Software Engineering (ICSE 2004)
Motivation and Overview12.3.40.65 GET index.jsp12.3.40.65 GET login.jsp12.3.40.65 GET reg.jsp? login=xxx&password=hello&12.3.40.65 GET myinfo.jsp
View user sessions as use casesBehaviorally relatedsequence of eventsperformed by the user through interaction with the system
User session analysis Test case generation
Monitoringload of traffic
Contentpersonalization
Test suitereduction
Softwaredevelopment/maintenancetools
• Clustering via concept analysis• Common subsequences analysis
Learn about dynamic behavior
Step 1Clustering via Concept Analysis
• Mathematical technique for clustering objects that have common discrete attributes
• Set of objects, O: user sessions, us
• Set of attributes, A: URLs, u
• Relation, R: us requests u
• Concept analysis identifies all concepts (Oi, Aj) for a given tuple (O, A, R)
GD GR GL PL GS GB GM
us1 x x x
us3 x x x x x
us4 x x x x x
us5 x x x
us6 x x x x x x
us2 x x x x x
,{GD,GR,GL})({us1, us2, us3, us4, us5, us6}
,{GD,GR,GL,GS})
,{GD,GR,GL,PL,GS}) ,{GD,GR,GL,GS,GB}) ({us3} ({us4}
({us2} ,{GD,GR,GL,GS,GM})({us6}
({us2, us3, us4, us6}
,{GD,GR,GL,PL,GS,GB})
{GD,GR,GL,PL,GS,GB,GM})(null, CONCEPT LATTICE: FULL
objects
attributes (URLs)RELATION TABLE GD GR GL
us1 us5GS
PL GB
us3
us2
GM
us6
SPARSE
us4
Step 2Heuristic for Test Suite Reduction
• Smallest set of user sessions• Covers all the URLs• Represents common URL subsequences
of different use casesus1 us5
GS
PL GB
us3 us4
us2
GM
us6
GD GR GL
Resulting reduced test suite: {us2, us6}
Identify next-to-bottom nodes
Pick one user session from each of these next-to-bottom nodes
Hypothesis Motivating the Approach
• Common Subsequences Hypothesis: The set of user sessions clustered together into the same concept node will have a high commonality in the subsequences of URLs in their sessions
Finding Common Subsequences of URLs
NODE 003
objects
{ us3, us6 }
attributes
{ GD, GL, GR, GS, PL }
us3
GD
GR
GL
PL
GS
PL
GR
GL
PL
GS
us6
GD
GR
GL
GB
PL
GS
GR
GL
GB
PL
GS
Common Subsequences
[GD, GR, GL]
[PL, GS]
[GR, GL]
Subsequences of URLs arerepresentative of partial use cases of the user sessions
Metric for Common Subsequences Hypothesis
• attr-size[n] set: level of node in lattice
# user sessions= 4
# URLs = 5
# usersessions = 3# URLs = 5
attr-size[5]: level 5
{a, b, c, d, e}
Subseqsize
Common subsequence
Percent
attrs covered
1 a, b, c, d, e 100 %
2 ab, bc, be 80%
3 abc, abe 80%
us1 us2 us3
abcdabe
abceabed
abcdabea
MetricPercent of attributes covered
by common subsequences of URLs of various sizes
Experiment: Applications Used
• Bookstore web application 9,748 LOC, 385 methods, 11 classes Front end: JSP, Backend: MySql 123 user sessions
• uPortal application 38,589 LOC, 4233 methods, 508 classes Java, JSP, XML, J2EE 2083 user sessions
Results for Common Subsequences Hypothesis
Bookstore
14
710
1316
1922
2528
3134
3740
Attr-Size Set
8
Subsequence Size
242628303234363840
42
44
46
48
50
52
Percent of Attributes Covered
Results for Common Subsequence Hypothesis
Bookstore
Result: subsequences of various sizes cover reasonable percent of attributes
Conclusions for Common Subsequences Hypothesis
• Between user sessions of a node there exists commonality in subsequences of URLs
• These common subsequences cover a reasonable percent of URLs (attributes) of the node
• Clustering based on single URLs clusters similar use cases can choose one object from each node
Next-to-bottom Coverage of Use Cases Hypothesis
reduced set{us2, us6}
remaining set{us1, us3, us4
us5}
user sessionsbelonging to next-to-bottomnodes
all other user sessions except sessionsbelonging to next-to-bottom nodes
GD GR GL
us1 us5GS
PL GB
us3 us4
us2
GM
us6
In addition to covering all the URLs of the original test suite, the user sessions in next-to-bottom nodes execute a high percentage of the subsequences of URLs of the rest of the original test suite
Results for Next-to-bottom Coverage of Use Cases Hypothesis
Bookstore
Result: short sequences present but long sequences are missing
Metric: loss of coverage of use cases in remaining set by the reduced set
Conclusion for Next-to-bottom Coverage of Use Cases Hypothesis
• Long sequences absent but smaller sequences are present in reduced set
• reduced set contains more URLs hence may contain other URL sequences absent in remaining set
• Moderately supports picking next-to-bottom nodes for reduced test suite
Pros and Cons of Our Approach
+ Results from common subsequences hypothesis support using concept analysis for clustering user sessions
+ Experiments show little coverage loss (tech report) by reduced test suite
- Results from next-to-bottom coverage of use cases hypothesis indicate further work needed on heuristic
Future Work
• Explore additional heuristics
• Additional user session analysis Useful for other software engineering tasks
top related