wum tutorial

Post on 13-Apr-2018


7/27/2019 WUM Tutorial

http://slidepdf.com/reader/full/wum-tutorial 1/24

http://Ka.rsten-Winkler.de

 


The Unfortunately Incomplete WUM Tutorial

This small tutorial should enable you to start the Web Utilization Miner WUM, to create a new demo mining base, to import the first demo log file that comes with this distribution, to create the visitors' sessions contained in this log file, to build the aggregated log and to execute your first MINT query with the MINT query processor. It covers the basic techniques that you should know about before mining your own log files with WUM. Advanced techniques in using WUM are covered by the second part of the tutorial. It is strongly recommended to work your way through both parts of the tutorial before starting your own mining session.

It is assumed that you have successfully installed the Web Utilization Miner on your system and modified all necessary configuration files. If you have not installed WUM yet, please refer to the Installation Guide that is part of this User Documentation and continue with the installation of this mining software. The demo version of WUM is supposed to be pure Java. Therefore it should run without difficulties on all existing Java Virtual Machines supporting Java 1.2.2 or higher. Please note that this Web Utilization Miner is a beta version intended for use in research and education. The WUM team would really appreciate receiving all kinds of bug reports and feature suggestions for the future development of this software. Simply drop us an e-mail. Good luck in exploring WUM: The Web Utilization Miner.

Alternatively, you may be interested in reading what others write about WUM:

Felix Schendel. Web-Usage-Mining: Analyse vorhandener Technologien und kombinierter Einsatz für kennzahl- und effizienzorientierte Analyse von Server-Logfiles. Projektdokumentation, Fachbereich Wirtschaft, Hochschule Wismar. Wismar, Germany, January 2004. In German. [PDF File, Mail, Web]

How to Start WUM

UNIX and Linux: Open a new X-Terminal and make sure that your current working directory is the bin/ subdirectory of the WUM_HOME directory. In the given example, the environment variable JAVA_HOME is set to /usr/local/jdk1.2.2 and WUM_HOME is set to /users/kwinkler/WUM.v60. The miner can be started as a background process by executing the shell script wumgui.

sten Winkler http://ka.rsten-winkler.de/hypknowsys/wum/wumTutorial.html

24 30/07/2014 10:28


Windows 95/98/NT: Open the Windows Explorer by right-clicking the Start icon of your task bar and selecting Explorer, open the home directory of WUM by browsing the tree view of your file system and finally double-click the icon corresponding to the file startwum.pif. Using Linux and the K-Desktop Environment, the main frame of the Web Utilization Miner may look like this. The main window of WUM can be resized or moved on your desktop without difficulties.

How to Create a New Mining Base


Each mining project requires a mining base within WUM. A mining base contains descriptive information as well as an ObjectStore PSE Pro database and various other files created by the miner during the mining process. In order to create a new mining base for this tutorial, please open the File menu and select Create Mining Base.

There are five text fields for the parameters of the new mining base. Each mining base must have a unique name that may include blank spaces and numbers. The corresponding web server URL can optionally be stored for future use.

Each mining base must have its own directory to store the database and other related files. It is recommended to create a subdirectory in the directory data for each new mining base before starting the miner. The mining base of this tutorial will be stored in the existing directory data/demoWebSite. Click on the button (Directory) ... to open a file dialog of your operating system. In order to select the necessary directory websites/demoWebSite, please select the directory and click OK. Alternatively, the name of an existing directory can be entered in the corresponding text field.


After selecting or entering the home directory of the new mining base, the current dialog should look more or less like this:

Additionally, the local directory containing the log files of your Web server must be specified. The demo log file AccessLog.txt is stored in the same directory as the database. Therefore, click on the button (Log Files:) ... to open the file dialog of your operating system. Open the directory data/demoWebSite and finally click OK to select the log file directory.


After checking the entered parameters, please click the button OK in order to create the new mining base for this tutorial. Clicking Cancel would abort the creation of a new mining base. In this case, the focus would be returned to the main window of WUM.

After successfully creating a new mining base, the title of the main window contains the name of the new mining base in brackets. The new mining base is now open and can be used for further operations. There can be only one open mining base at a time. The ObjectStore PSE Pro database consists of three files WUM.MiningBase.* that are stored in the same directory. Please do not edit, modify or delete these files.

Please note that the underlying ObjectStore PSE Pro is a single-user database only. The DBMS of ObjectStore PSE Pro uses a locking mechanism to ensure that each mining base is accessed by exactly one user at a time. The database of an open mining base is locked by creating a subdirectory WUM.MiningBase.odx in its home directory.

If the previous mining session ended abnormally, the lock directory can be deleted by WUM in order to start the miner. Before unlocking a database by force, make sure that there is no other user working with the corresponding mining base.
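The idea of locking by creating a directory can be illustrated with a minimal Python sketch. It is not the actual implementation of ObjectStore PSE Pro or WUM: the function names acquire_lock and release_lock are hypothetical, and only the directory name WUM.MiningBase.odx is taken from the text above. The sketch relies on the fact that creating a directory either succeeds or fails as a whole, so two processes cannot both create it.

```python
import os
import tempfile

LOCK_NAME = "WUM.MiningBase.odx"  # lock directory name taken from the tutorial


def acquire_lock(home_dir):
    """Try to lock a mining base; return True on success, False if already locked."""
    try:
        # os.mkdir raises FileExistsError if the directory already exists.
        os.mkdir(os.path.join(home_dir, LOCK_NAME))
        return True
    except FileExistsError:
        return False


def release_lock(home_dir):
    """Remove the lock directory (this is also what a forced unlock amounts to)."""
    os.rmdir(os.path.join(home_dir, LOCK_NAME))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as home:
        print(acquire_lock(home))  # first user obtains the lock
        print(acquire_lock(home))  # second attempt is refused
        release_lock(home)
        print(acquire_lock(home))  # lock is available again
```

A lock directory that survives a crash looks exactly like one held by a live user, which is why the text asks you to check for other users before unlocking by force.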

How to Import a Log File


After creating a new mining base or opening an already existing one, HTTP server log files with increasing time stamps can subsequently be imported into the mining base. The import module performs basic data cleaning operations on each log file line and updates the database with data of new visitors and Web pages. In order to import the small demo log file into the tutorial mining base, please open the File menu and select Import Log File.
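Since log files are expected to have increasing time stamps, it can be useful to check a file before importing it. The following Python sketch is only an illustration and not part of WUM: it assumes lines in the common log file format with a bracketed time stamp such as [10/Dec/1999:23:06:31 +0200], the helper names extract_time and is_chronological are hypothetical, and it treats equal time stamps on consecutive lines as acceptable, which is an assumption.

```python
from datetime import datetime


def extract_time(line):
    """Parse the bracketed time stamp of a common log file format line."""
    stamp = line[line.index("[") + 1 : line.index("]")]
    # %b expects English month abbreviations under the default C locale.
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def is_chronological(lines):
    """Return True if time stamps never decrease from one line to the next."""
    times = [extract_time(line) for line in lines]
    return all(earlier <= later for earlier, later in zip(times, times[1:]))


if __name__ == "__main__":
    demo = [
        'host - - [10/Dec/1999:23:06:31 +0200] "GET /index.html HTTP/1.0" 200 3540',
        'host - - [10/Dec/1999:23:07:00 +0200] "GET /contact.html HTTP/1.0" 200 1200',
    ]
    print(is_chronological(demo))
```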

The user interface of the import module is depicted in the next picture. There are a few parameters that must be specified by the user before a log file can be imported. Apart from simply entering the log file name and its format, all parameters concerning the data cleaning process should be considered very carefully.

The text field Filename contains the default directory of HTTP server log files. By clicking the button (Filename) ..., you can specify the log file to be imported using the file dialog of your operating system. After choosing the correct file and clicking OK, the complete log file name will be shown in the text field.

WUM currently supports four widespread log file formats. An example line of each format is given in the table below. The example log file AccessLog.txt corresponds to the common log file format. Therefore, please check the Common Log File radio button.


The following table contains an example log file line for each log file format supported by WUM:

Common:
picasso.wiwi.hu-berlin.de - - [10/Dec/1999:23:06:31 +0200] "GET /index.html HTTP/1.0" 200 3540

Extended:
picasso.wiwi.hu-berlin.de - - [10/Dec/1999:23:06:31 +0200] "GET /index.html HTTP/1.0" 200 3540 "http://www.berlin.de/" "Mozilla/3.01 (Win95; I)"

Cookie:
picasso.wiwi.hu-berlin.de - - [10/Dec/1999:23:06:31 +0200] "GET /index.html HTTP/1.0" 200 3540 "http://www.berlin.de/" "Mozilla/3.01 (Win95; I)" "VisitorID=10001; SessionID=20001"

MS-IIS:
picasso.wiwi.hu-berlin.de, -, 10.12.99, 23:06:31, W3SVC2, WWW, 100.100.100.100, 547, 444, 0, 200, 0, GET, /index.html, -,
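To make the structure of the common log file format more tangible, here is a small Python sketch that splits such a line into its fields. It is not the import module of WUM; the regular expression, the field names (host, ident, user, time, request, status, size) and the function parse_common_line are assumptions chosen for illustration.

```python
import re

# One common log file format line: host, identity, user, time stamp, request, status, size.
COMMON_LOG = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_common_line(line):
    """Return the fields of a log line as a dict, or None if the line does not match."""
    match = COMMON_LOG.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    example = ('picasso.wiwi.hu-berlin.de - - [10/Dec/1999:23:06:31 +0200] '
               '"GET /index.html HTTP/1.0" 200 3540')
    for name, value in parse_common_line(example).items():
        print(name, "=", value)
```

Lines in the extended or cookie format carry additional quoted fields after the size and would need a longer pattern.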

In order to reduce the number of web pages within the WUM database, HTTP requests can be truncated by cutting off all characters starting at the first occurrence of '#' (HTML anchors) or '?' (CGI parameters). Examples: If the option Truncate Requests: HTML Anchors is enabled, the requests GET /contact.html#address and GET /contact.html#email will both be shortened to GET /contact.html and will therefore be treated as requests concerning the same web page. If the option Truncate Requests: CGI Parameter is enabled, the requests POST /cgi-bin/download.cgi?userid=123&version=a and POST /cgi-bin/download.cgi?userid=456&version=b will both be shortened to POST /cgi-bin/download.cgi.
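The truncation rule described above can be sketched in a few lines of Python. This is an illustration only, not WUM's own code: the function name truncate_request and its two flags are hypothetical stand-ins for the two options, and the function works on strings consisting of method and URL, as in the examples.

```python
def truncate_request(request, cut_anchors=True, cut_cgi=True):
    """Cut a request at the first '#' (anchors) and/or '?' (CGI parameters)."""
    cut_chars = ("#" if cut_anchors else "") + ("?" if cut_cgi else "")
    # Positions of the enabled cut characters that actually occur in the request.
    positions = [request.find(c) for c in cut_chars if c in request]
    return request[:min(positions)] if positions else request


if __name__ == "__main__":
    print(truncate_request("GET /contact.html#address"))
    print(truncate_request("GET /contact.html#email"))
    print(truncate_request("POST /cgi-bin/download.cgi?userid=123&version=a"))
```

With both flags enabled, the two anchor requests collapse to the same string, so they would be counted as one web page.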

The WUM distribution contains a very small log file AccessLog.txt that is to be used in this tutorial. [The tutorial is hopefully to be continued at some point in time. Do you want to help?]


Please keep in mind that the import module of WUM performs only basic substring operations on each log file line. Depending on the user's individual mining goals, preprocessing the raw log file with the help of user-specific Perl scripts etc. can be extremely useful.

How to Analyze a Log File


How to Visualize the Contents


The generated HTML report can be found here.


Image of Complete Aggregated Log

How to Execute MINT Queries


How to Exit from WUM


Remarks and an Additional Example

WUM accepts as input a template, i.e. an ordered list of variables and wildcards, and a conjunction of constraints on the statistics of those variables. It finds all sequences which, taken together, build a pattern (actually a directed acyclic graph) that satisfies the template and the constraints.

Example: We are interested in an event x that occurs after y with probability at least 95%. This event y should appear in at least 100 of our sequences. x need not occur immediately after y, but it should not be more than 5 events away from y. This specification produces the template y [0;5] x where x and y are variables. The wildcard [0;5] stands for any number of events, and the interval [0;5] constrains the wildcard to between zero and 5 events. The constraints on x and y result in two restrictions:

y.support >= 100

and ( x.support / y.support ) > 0.95

To find the sequences satisfying this template and constraints, issue the following MINT query:

select t
from node as x y, template y [0;5] x as t
where y.support >= 100
and ( x.support / y.support ) > 0.95

You can use this query in the demo. But you have to reduce the support of y, because there is no event that appears in more than 8 sequences. This was just an example. For the formal definitions and the description of the miner at work, please refer to the publications about the Web Utilization Miner WUM.

When issuing a MINT query, WUM finds all acceptable bindings for the template variables. A binding is a list of events, i.e. of values, bound to the variables. A binding is acceptable if the events comprising it appear in sequences which:

- conform to the template's structure
- taken together constitute a group, the statistics of which satisfy the query constraints

In the above example, a URL Y.html in the dataset could be bound to variable y. A URL X.html could then be bound to x, only if there exists a sequence where X.html appears within 6 positions after Y.html. For the binding to be acceptable, there should be at least 100 sequences containing Y.html and 95% of them should contain X.html in at most 6 positions after Y.html. Those sequences contribute the binding (Y.html, X.html).
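The acceptance check for a single binding can be sketched in Python as follows. This is a simplified reading of the description above and not the algorithm used by the MINT query processor: sequences are assumed to be plain lists of URLs, the names binding_statistics and is_acceptable are hypothetical, and a sequence counts towards x if X.html occurs within max_gap + 1 positions after any occurrence of Y.html.

```python
def binding_statistics(sequences, y_event, x_event, max_gap=5):
    """Return (y_support, x_support) for the template y [0;max_gap] x."""
    y_support = 0
    x_support = 0
    for seq in sequences:
        if y_event not in seq:
            continue
        y_support += 1
        # x may follow y with at most max_gap events in between.
        if any(
            x_event in seq[i + 1 : i + 2 + max_gap]
            for i, event in enumerate(seq)
            if event == y_event
        ):
            x_support += 1
    return y_support, x_support


def is_acceptable(sequences, y_event, x_event, min_support=100, min_confidence=0.95):
    """Apply the two restrictions of the example query to one binding."""
    y_support, x_support = binding_statistics(sequences, y_event, x_event)
    return (y_support >= min_support and y_support > 0
            and x_support / y_support > min_confidence)


if __name__ == "__main__":
    demo = [
        ["Y.html", "A.html", "X.html"],
        ["Y.html", "X.html"],
        ["B.html", "Y.html"],
        ["X.html", "Y.html"],
    ]
    print(binding_statistics(demo, "Y.html", "X.html"))
    print(is_acceptable(demo, "Y.html", "X.html", min_support=2, min_confidence=0.4))
```

As in the demo data of the tutorial, the support threshold has to be lowered for small inputs, which is why the thresholds are parameters here.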

WUM discovers all acceptable bindings for the query and builds a navigation pattern for each binding. A navigation pattern is a directed acyclic graph comprised of the sequences contributing the binding: the sequences have been merged at common prefix and at each event of the binding.

The visualization tool of WUM can display a navigation pattern in two ways:

- The template tree consists only of the events comprising the binding. The events are annotated with the number of contributing sequences. This format gives an overview of the events that satisfy our query, without information on the surrounding events.
- An aggregate tree is a set of subsequences merged on common prefix. For two consecutive events in the binding, the aggregate tree shows the fragments of the contributing sequences between those two events. WUM cannot yet display graphs. So, a navigation pattern is split into aggregate trees, one per event in the binding. This event is then the root of the aggregate tree.
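Merging sequences on their common prefix can be illustrated with a small prefix tree in Python. This is a sketch under assumptions and not WUM's internal data structure: the nested-dictionary layout, the root label "^" and the function names build_aggregate_tree and render are hypothetical. Each node counts how many sequences pass through it, similar to the annotation with the number of contributing sequences mentioned above.

```python
def build_aggregate_tree(sequences):
    """Merge sequences on their common prefixes into a nested dict with counts."""
    root = {"count": 0, "children": {}}
    for seq in sequences:
        root["count"] += 1
        node = root
        for event in seq:
            # Reuse the child for this event if an earlier sequence created it.
            node = node["children"].setdefault(event, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def render(node, label="^", depth=0):
    """Return the tree as indented text lines, one node per line."""
    lines = ["  " * depth + f"{label} ({node['count']})"]
    for event, child in sorted(node["children"].items()):
        lines.extend(render(child, event, depth + 1))
    return lines


if __name__ == "__main__":
    tree = build_aggregate_tree([["a", "b"], ["a", "c"], ["a", "b"]])
    print("\n".join(render(tree)))
```

Sequences that share a prefix share the corresponding nodes, so the tree stays small when many visitors follow the same paths.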

 

December 3, 2004
