Mark Levene, An Introduction to Search Engines and Web Navigation © Pearson Education Limited 2005
Slide 3.1
Chapter 3: The Problem of Web Navigation
• Users often get “lost in hyperspace” when
  – Following links on web pages, or
  – Jumping to and from search engine results.
• Machine learning can provide a sound basis for improving web interaction.
Slide 3.2
Getting lost in hyperspace
Figure 3.1: The navigation problem
Slide 3.3
Getting lost in hyperspace
Figure 3.2: Being lost in hyperspace
Slide 3.4
The Naïve Bayes Classifier
• Automatic classification of web pages can widen the scope and size of web directories.
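As an illustration of how a web page might be assigned to a directory category automatically, here is a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing. The category names, training texts, and the function names `train` and `classify` are all hypothetical and are not taken from the slides; a real directory classifier would need far more training data and better tokenisation.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Collect the counts needed by Naive Bayes from (category, text) pairs."""
    cat_counts = Counter()              # number of training pages per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocab = set()
    for cat, text in docs:
        cat_counts[cat] += 1
        for word in text.lower().split():
            word_counts[cat][word] += 1
            vocab.add(word)
    return cat_counts, word_counts, vocab


def classify(text, model):
    """Return the category with the highest log posterior for the given text."""
    cat_counts, word_counts, vocab = model
    total_docs = sum(cat_counts.values())
    best_cat, best_score = None, -math.inf
    for cat in cat_counts:
        # log prior: fraction of training pages in this category
        score = math.log(cat_counts[cat] / total_docs)
        total_words = sum(word_counts[cat].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothing avoids zero probabilities
                score += math.log(
                    (word_counts[cat][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_cat, best_score = cat, score
    return best_cat


# Hypothetical training pages for two directory categories.
training_pages = [
    ("sports", "football match goal team league"),
    ("sports", "tennis match player tournament"),
    ("computing", "search engine web crawler index"),
    ("computing", "web browser software code"),
]
model = train(training_pages)
print(classify("web search index", model))
```

The classifier treats the words of a page as independent given its category, which is the "naïve" assumption; working in log space keeps the product of many small probabilities from underflowing.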
Slide 3.5
Trails should be First-Class Objects
Figure 3.3: Example web site
Slide 3.6
Trails should be First-Class Objects
Figure 3.4: Four trails within a web site
Slide 3.7
Trails should be First-Class Objects
Figure 3.5: Query results for “mark research”
Slide 3.8
Trails should be First-Class Objects
Figure 3.6: Relevant trail for “mark research”
Slide 3.9
Markov chains
• Markov chains have been extensively studied by statisticians and have been applied in a wide variety of areas.
Slide 3.10
The probabilities of following links
Figure 3.7: Markov chain for example web site
Slide 3.11
The probabilities of following links
Figure 3.8: Two trails in the Markov chain
Slide 3.12
The probabilities of following links
Figure 3.9: Probabilities of the four trails
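The figures are not reproduced in this transcript, so the following is only a sketch of the general idea: in a Markov chain, the probability of a trail is the product of the probabilities of the links followed along it. The page names and transition probabilities below are hypothetical and do not come from Figures 3.7 to 3.9, and the trail is assumed to start at its first page with probability `start_prob`.

```python
# Hypothetical Markov chain: chain[page][target] is the probability of
# following the link from page to target. Each row sums to 1.
chain = {
    "home": {"research": 0.6, "teaching": 0.4},
    "research": {"papers": 0.7, "home": 0.3},
    "teaching": {"home": 1.0},
    "papers": {"research": 1.0},
}


def trail_probability(trail, chain, start_prob=1.0):
    """Multiply the transition probabilities along consecutive pages of a trail."""
    probability = start_prob
    for source, target in zip(trail, trail[1:]):
        # A link that does not exist in the chain has probability 0.
        probability *= chain.get(source, {}).get(target, 0.0)
    return probability


print(trail_probability(["home", "research", "papers"], chain))
print(trail_probability(["home", "teaching", "home"], chain))
```

Under these assumed numbers, the first trail multiplies 0.6 by 0.7 and the second multiplies 0.4 by 1.0, so longer trails through low-probability links quickly become unlikely.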
Slide 3.13
The relevance of links
Figure 3.10: Scoring web pages
Slide 3.14
The relevance of links
Figure 3.11: Constructing a chain from scores
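The construction shown in the figure is not reproduced in this transcript. One plausible way to build a chain from page scores, offered here purely as an assumption, is to make the probability of following a link proportional to the relevance score of the page it leads to. The link structure, the scores, and the function name `chain_from_scores` are hypothetical.

```python
def chain_from_scores(links, scores):
    """Turn page scores into link probabilities by normalising over each page's outlinks.

    Assumption: a link's probability is proportional to the score of its target page.
    """
    chain = {}
    for page, targets in links.items():
        total = sum(scores[t] for t in targets)
        if total > 0:
            chain[page] = {t: scores[t] / total for t in targets}
        elif targets:
            # All targets scored zero: fall back to a uniform choice among them.
            chain[page] = {t: 1.0 / len(targets) for t in targets}
        else:
            chain[page] = {}  # a page with no outgoing links
    return chain


# Hypothetical web site: outgoing links per page and a relevance score per page.
links = {
    "home": ["research", "teaching"],
    "research": ["home"],
    "teaching": [],
}
scores = {"home": 1.0, "research": 3.0, "teaching": 1.0}

for page, row in chain_from_scores(links, scores).items():
    print(page, row)
```

With these assumed scores, the link from "home" to "research" receives probability 3 / (3 + 1) = 0.75, so trails that pass through higher-scoring pages become more probable.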
Slide 3.15
Conflict Between Web Site Owner and Visitor
• The web site owner has objectives related to the business model of the site, e.g. selling products in an e-commerce site.
• The objectives of visitors are related to their information needs, e.g. gathering information in an e-commerce site.
• Web site owners would like to identify their visitors (e.g. via cookies), while visitors may prefer to remain anonymous.
Slide 3.16
Conflict Between Semantics of Web Site and Business Model
• For example, the objective of an e-commerce site is to convert visitors into customers.
• But to keep visitors satisfied, a web site must provide solutions to users’ information needs.
• There must be a balance between web site navigability and the business objectives of the site.