
Chaoyang University of Technology

Clustering web transactions using rough approximation

Source : Fuzzy Sets and Systems 148 (2004) 131–138

Author : Supriya Kumar De, P. Radha Krishna

Adviser : RC. Chen

Presenter : Yu-Hsiang Fu (傅昱翔)

Date : 2006/12/14

Outline

• Abstract
• Introduction
• Rough Set
• Rough Set Approximation
• Experimental Results
• Conclusions
• References

Abstract

• Web usage mining is the application of data mining techniques to discover user access patterns from web access logs.

• Rough sets can be used to mine web log records effectively and discover web page access patterns.

Introduction (1/2)

• The WWW contains a huge number of hyperlinks, together with access and usage information.

• Web Mining
  – Web content mining
  – Web structure mining
  – Web usage mining

Introduction (2/2)

• User’s behaviors
  – A click stream is the sequence of clicks or pages requested as a visitor explores a Web site.

• Web transaction
  – A user session is the click-stream of page views for a single user across the entire web.

• Usage patterns differ from user to user: different users may navigate the same site in different ways.

Rough Set (1/5)

• The Rough Set theory was introduced by Zdzislaw Pawlak in the early 1980s.

• Rough set theory deals with the classification analysis of data tables.

• Rough set methods support an efficient search for relevant tolerance relations and the extraction of interesting patterns from data.



• Universe and Relation



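• Lower and Upper Approximation
  – Lower approximation: objects that surely belong to the set
  – Upper approximation: objects that possibly belong to the set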
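The formulas on this slide were not preserved. As a minimal sketch of the standard rough set definitions, assuming the universe is partitioned into equivalence classes, the lower approximation collects the classes fully contained in the target set and the upper approximation collects the classes that overlap it. The function name and the toy data are illustrative assumptions, not taken from the paper.

```python
def approximations(partition, target):
    """Return (lower, upper) approximations of `target` under `partition`.

    `partition` is an iterable of disjoint sets (equivalence classes).
    """
    target = set(target)
    lower, upper = set(), set()
    for block in partition:
        block = set(block)
        if block <= target:   # whole class inside the target: surely members
            lower |= block
        if block & target:    # class overlaps the target: possibly members
            upper |= block
    return lower, upper


# Hypothetical universe of six objects in three equivalence classes.
partition = [{1, 2}, {3, 4}, {5, 6}]
target = {1, 2, 3}
lower, upper = approximations(partition, target)
print(lower, upper)
```

Here objects 1 and 2 surely belong to the target, while 3 and 4 only possibly belong because their class straddles the target's edge.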



• Boundary and Negative region
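The slide body is missing here as well. Under the usual rough set definitions, the boundary region is the upper approximation minus the lower approximation, and the negative region is the universe minus the upper approximation. The sketch below assumes a partition into equivalence classes; all names and data are hypothetical.

```python
def regions(universe, partition, target):
    """Split `universe` into positive, boundary and negative regions of `target`."""
    universe, target = set(universe), set(target)
    blocks = [set(b) for b in partition]
    # Lower approximation: union of classes fully inside the target.
    lower = set().union(*(b for b in blocks if b <= target))
    # Upper approximation: union of classes that intersect the target.
    upper = set().union(*(b for b in blocks if b & target))
    return {
        "positive": lower,             # surely in the target
        "boundary": upper - lower,     # cannot be decided either way
        "negative": universe - upper,  # surely not in the target
    }


universe = {1, 2, 3, 4, 5, 6}
partition = [{1, 2}, {3, 4}, {5, 6}]
result = regions(universe, partition, {1, 2, 3})
print(result)
```

A set is rough exactly when its boundary region is non-empty, as it is in this toy example.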

Rough Set Approximation (1/7)

• A user transaction is a sequence of items

• Let there be m users and the user transactions be

• Let U be the set of n distinct clicks (hyperlinks/URLs) clicked by users
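The notation above can be illustrated with a small sketch. The URLs and variable names are made up for illustration, and treating each transaction as a set (dropping order and repeated clicks) is an assumption made here to simplify later set operations; the slide itself describes a transaction as a sequence.

```python
# Hypothetical click-streams for m = 3 users.
transactions = [
    ["/index.html", "/about.html", "/index.html"],
    ["/index.html", "/courses.html"],
    ["/about.html", "/contact.html"],
]
m = len(transactions)

# U: the distinct clicks (URLs) across all user transactions.
U = sorted({url for t in transactions for url in t})
n = len(U)

# Assumed simplification: each transaction as a subset of U.
transaction_sets = [set(t) for t in transactions]

print(m, n, U)
```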

Experimental Results (1/2)

• Log files from www.idrbt.ac.in
  – The web site consists of 62 web pages and 283 links.
  – Log files record every click that users make.
  – Session time is 30 min.

http://www.idrbt.ac.in/
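One way to turn raw log records into user sessions is sketched below. It assumes that "session time is 30 min" means a gap of more than 30 minutes between two clicks by the same user starts a new session; the slides do not say whether this or a fixed session length was used. The record layout `(user, timestamp, url)` and the sample data are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed to be an inactivity timeout


def sessionize(clicks, timeout=SESSION_TIMEOUT):
    """Group (user, timestamp, url) records into per-user sessions."""
    sessions = []   # list of (user, [urls in click order])
    last_seen = {}  # user -> (time of last click, index of open session)
    for user, ts, url in sorted(clicks, key=lambda c: (c[0], c[1])):
        prev = last_seen.get(user)
        if prev is None or ts - prev[0] > timeout:
            sessions.append((user, [url]))  # start a new session
            idx = len(sessions) - 1
        else:
            idx = prev[1]
            sessions[idx][1].append(url)    # extend the open session
        last_seen[user] = (ts, idx)
    return sessions


clicks = [
    ("u1", datetime(2006, 12, 14, 9, 0), "/index.html"),
    ("u1", datetime(2006, 12, 14, 9, 10), "/about.html"),
    ("u1", datetime(2006, 12, 14, 10, 0), "/index.html"),  # 50 min gap
    ("u2", datetime(2006, 12, 14, 9, 5), "/courses.html"),
]
sessions = sessionize(clicks)
print(sessions)
```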

Experimental Results (2/2)

• Steps:
  – First, the data is preprocessed and transformed.
  – Second, the similarity upper approximation is computed for each transaction.
  – Finally, the transactions are clustered using rough approximation (threshold = 0.5).
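The steps above can be sketched as follows. The formulas did not survive in these slides, so this is one plausible reading rather than the paper's exact procedure. It assumes that transactions are sets of URLs, that similarity is the Jaccard ratio |t ∩ s| / |t ∪ s|, that two transactions are related when their similarity is at least the threshold, and that the similarity upper approximation of a transaction is grown by repeatedly adding the similarity classes of its members until nothing changes. All function names and data are illustrative.

```python
def jaccard(a, b):
    """Assumed similarity measure: shared items over all items."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def similarity_classes(transactions, threshold):
    """For each transaction index, the indices of transactions similar to it."""
    n = len(transactions)
    return [
        {j for j in range(n)
         if jaccard(transactions[i], transactions[j]) >= threshold}
        for i in range(n)
    ]


def cluster(transactions, threshold=0.5):
    """Cluster by iterating similarity upper approximations to a fixed point."""
    classes = similarity_classes(transactions, threshold)
    clusters = set()
    for i in range(len(transactions)):
        current = set(classes[i])
        while True:
            # Add everything similar to any member of the current set.
            expanded = set().union(*(classes[j] for j in current))
            if expanded == current:
                break
            current = expanded
        clusters.add(frozenset(current))
    return sorted(sorted(c) for c in clusters)


# Hypothetical transactions over a toy set of pages.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "c", "d"},
    {"x", "y"},
    {"x", "y", "z"},
]
clusters = cluster(transactions)
print(clusters)
```

With threshold 0.5, transactions 1 and 2 are not directly similar (similarity 0.25) but end up in the same cluster because both are similar to transaction 0. This chaining is a property of the fixed-point reading assumed here.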

Conclusion

• This paper presented a novel algorithm that uses rough approximation to cluster the web transactions of user access.

• This approach is useful for finding interesting user access patterns in web logs.

• The results can help in building adaptive web sites that respond to user behavior.

References

• Zdzislaw Pawlak, Jerzy Grzymala-Busse, Roman Slowinski, and Wojciech Ziarko, Rough Sets, Communications of the ACM, Vol. 38, No. 11, November 1995, 88–95.
• Zdzislaw Pawlak, Rough Sets (Abstract), 262–264.
• Zdzisław Pawlak, Andrzej Skowron, Rudiments of rough sets, Information Sciences 177 (2007) 3–27.
• Nils Kammenhuber, Julia Luxenburger, Anja Feldmann, Gerhard Weikum, Web Search Clickstreams, IMC’06, October 25–27, 2006.
• A. Jain, Data Clustering: A Review, ACM Computing Surveys, Vol. 31, No. 3, September 1999, 274–275, 281–285.
