Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

01/28/22 Michał Kopycki 1 Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop Bachelor Thesis Leibniz University of Hanover Michał Kopycki

Page 1: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

04/20/23Michał Kopycki 1

Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Bachelor Thesis Leibniz University of Hanover Michał Kopycki

Page 2: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop


How to write SPYWARE for “research purposes” and get paid for it

Bestseller

Page 3: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop


Personalization Research Issues (from Eelco’s presentation)

Data Acquisition

Knowledge Inference

Adaptation Decision Making

Adaptation Mechanism

User Model

Page 4: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop


Outline

Motivation

Logging Framework

User study

Conclusion and future work


Page 5: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop


MovieLens

Amazon

Del.icio.us

LastFM

Haystack ‘97

Letizia ‘95

Stuff I’ve Seen ‘03

LifeStreams ‘96

JIRIT ‘00

[BM02]

[CDH+08][Her06]

[CSC+07]

[RM00]

Beagle++ ‘05

[WJR02]

[CGNP05][CN06]

[TDH05]

StumbleUpon

Libra

User Context

Page 6: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

User Context ... in our context


TFxIDF

GPS location

Reference

Genre

Sender

Resource as context

Web address

Interaction with resource as context

Sequence of access

Time windows

Bookmarking

Reading time

Printing document
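As a minimal sketch of how "sequence of access" and "time windows" could serve as interaction context, the Python below counts how often two resources are accessed within a fixed window of each other. The function name `co_access_counts`, the event format (timestamp, resource id), the sample events and the 5-minute default are illustrative assumptions, not part of the Logging Framework presented in these slides.

```python
from collections import Counter
from datetime import datetime, timedelta


def co_access_counts(events, window=timedelta(minutes=5)):
    """Count pairs of distinct resources accessed within `window` of each other.

    `events` is an iterable of (timestamp, resource_id) tuples (assumed format).
    """
    ordered = sorted(events)  # chronological order
    counts = Counter()
    start = 0  # index of the oldest event still inside the window
    for i, (ts, resource) in enumerate(ordered):
        # Drop events that are older than `window` relative to the current one.
        while ts - ordered[start][0] > window:
            start += 1
        # Pair the current resource with every earlier resource in the window.
        for _, other in ordered[start:i]:
            if other != resource:
                counts[tuple(sorted((resource, other)))] += 1
    return counts


# Hypothetical log excerpt: one file and one URL opened close together,
# then a second file opened shortly before the first file is reopened.
t0 = datetime(2008, 3, 12, 10, 30)
events = [
    (t0, "a.doc"),
    (t0 + timedelta(minutes=2), "http://example.org/"),
    (t0 + timedelta(minutes=20), "b.pdf"),
    (t0 + timedelta(minutes=21), "a.doc"),
]
pairs = co_access_counts(events)
```

Pairs with high counts could then be treated as candidate relationships between resources; the window length is a tuning choice that would need to be validated against real logs.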

Page 7: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

04/20/23Sergey Chernov, Task Detection for Activity-Based Desktop Search, L3S Research Seminar


What is user context good for?

Desktop Search

Search!

Logger

www.pharos-audiovisual-search.eu

Pharos Project

Pharos Deliverable

Pharos Review

pas.kbs.uni-hannover.de

PIM Research

PIM 2008 paper

Logger v0.2

Timeline for 3/12/2008 (axis 11:00 to 17:00):

- 10:30 - 11:59 Pharos work
- 12:58 - 13:58 PIM Research
- 13:58 - 14:57 Pharos
- 14:57 - 17:01 PIM

1. Relationships between resources

2. Elicitation of user interests

3. Activity based computing

Page 8: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Thesis goals

1. User context recognition support

2. Public Desktop dataset alternative


“…exploiting usage analysis information about sequences of accesses to local resources…” (L3S 2006)

“…The absence of shared information makes it difficult to focus research problems, and to compare research results…” (Newman 1997)

“…an appropriate common test collection that is accepted by the community is required…” (Voorhees 2001)

“…Desktop datasets within different research groups using a single methodology and a common set of tools…” (L3S 2008)

“…Building a Desktop IR testbed seems to be more challenging…” (L3S 2007)

Page 9: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop


Outline

Motivation

Logging Framework

User study

Conclusion and future work

Page 10: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Requirements

- Automatic
- Cross-application
- Implicit Feedback
- Privacy preserving
- Extensible

[Diagram labels: items A, B, C, each marked "Relevant" / "Not relevant"; sources Web, Email, File System, IM; Logging Framework with "New best Email client plug-in" and "New best Web browser plug-in"]

Page 11: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Applications

Our approach


Resources

Page 12: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Component view


User Activity Logger

Desktop

Window Events

File System

Internet Explorer

Outlook Express

Thunderbird

Firefox

Thunderbird

Firefox

Outlook 2003

Outlook 2007

C/C++

Window hooks

File system drivers

Windows undocumented API

JavaScript

XUL

C#

VSTO

.NET

Page 13: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Logging Framework


Page 14: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Supported notifications


Page 15: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Nepomuk adaptation


User Observation Hub

Logging Framework

Page 16: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop


Outline

Motivation

Logging Framework

User study

Conclusion and future work

Page 17: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

User study

- 21 participants
- Average of 170 active logging days
- 2,828,706 events
- Average of 2,815 distinct emails per user
- Average of 9,337 distinct URLs per user
- Average of 902 events per user per day
- Average of 5 hours of active interaction per user per day

Page 18: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Dataset activity coverage


Page 19: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Data collection


Data Encryption schema

- File path: level1 \ level2 \ filename . extension
- URL: Protocol \ host \ dynamic part
- URL host part: Host name . Domain name . TLD
- Address book entry: User name \ email address
- Email address: domain name . TLD

Encryption schemas:

Methodology:

www

l3s

google

de
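One possible reading of the encryption schema on this slide is that each structural component (folder level, file name, extension, host label) is obscured separately, so the shape of a path or host name survives while its content does not. The sketch below illustrates that idea with a keyed hash (HMAC-SHA256) standing in for the encryption step. The function names, the key and the 12-character truncation are hypothetical and are not taken from the thesis.

```python
import hashlib
import hmac


def pseudonymize(token: str, key: bytes) -> str:
    """Map one component to a short keyed hash (assumed stand-in for encryption)."""
    digest = hmac.new(key, token.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]  # truncated for readability; an illustrative choice


def pseudonymize_path(path: str, key: bytes) -> str:
    """Obscure level1 \\ level2 \\ filename . extension component by component."""
    *folders, last = path.split("\\")
    name, dot, ext = last.rpartition(".")
    if not dot:  # file name without an extension
        name = last
    parts = [pseudonymize(folder, key) for folder in folders]
    hidden_name = pseudonymize(name, key)
    if dot:
        hidden_name += "." + pseudonymize(ext, key)
    parts.append(hidden_name)
    return "\\".join(parts)


def pseudonymize_host(host: str, key: bytes) -> str:
    """Obscure Host name . Domain name . TLD label by label."""
    return ".".join(pseudonymize(label, key) for label in host.split("."))


# Hypothetical inputs: two files in the same folder, two hosts under the same TLD.
key = b"per-study secret"
path_a = pseudonymize_path("docs\\l3s\\report.doc", key)
path_b = pseudonymize_path("docs\\l3s\\notes.doc", key)
host_a = pseudonymize_host("www.l3s.de", key)
host_b = pseudonymize_host("www.google.de", key)
```

Because equal components map to equal tokens, analyses such as "files in the same folder" or "hosts under the same TLD" would remain possible on the obscured data. Whether the original framework keeps extensions or TLDs in clear text is not stated on the slide.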

Page 20: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

A glimpse into user behavior


Instant reader Moderate reader

Page 21: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop


Outline

Motivation

Logging Framework

User study

Conclusion and future work

Page 22: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Conclusion

1. Logging Framework
   • http://pas.kbs.uni-hannover.de/
   • http://sourceforge.net/projects/activity-logger
2. User study
3. Desktop Dataset
4. Nepomuk integration
5. PIM’08 Workshop paper

Page 23: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Future work

1. Logging Framework:
   • centralized architecture
   • ontology based RDF output format
   • support for new applications and notifications
   • Vista support
2. Exploratory analysis of the Desktop dataset
   • Email interaction
   • Web search interaction
   • Application interaction

Page 24: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

References

[BM02] Peter Brusilovsky and Mark T. Maybury. From adaptive hypermedia to the adaptive web. Communications of the ACM, volume 45, pages 30–33, 2002.

[CDH+08] Sergey Chernov, Gianluca Demartini, Eelco Herder, Michał Kopycki, and Wolfgang Nejdl. Evaluating Personal Information Management using an activity logs enriched Desktop dataset. In PIM ’08: Proceedings of the Workshop on Personal Information Management, 2008 (to appear).

[CSC+07] Sergey Chernov, Pavel Serdyukov, Paul-Alexandru Chirita, Gianluca Demartini, and Wolfgang Nejdl. Building a desktop search test-bed. In ECIR ’07: Proceedings of 29th European Conference on IR Research, Advances in Information Retrieval, pages 686–690. Springer, 2007.

[Her06] E. Herder. Forward, Back and Home Again - Analyzing User Behavior on the Web. PhD thesis, University of Twente, Enschede, 2006.

[RM00] B. J. Rhodes and P. Maes. Just-in-time information retrieval agents. IBM Systems Journal, volume 39, pages 685–704, 2000.

[TDH05] Jaime Teevan, Susan T. Dumais, and Eric Horvitz. Personalizing search via automated analysis of interests and activities. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 449–456. ACM, 2005.

[WJR02] R.W. White, J.M. Jose, and I. Ruthven. Comparing explicit and implicit feedback techniques for web retrieval: Trec-10 interactive track report. TREC ’02: Proceedings of the Tenth Text Retrieval Conference, 2002.

[CN06] Paul-Alexandru Chirita and Wolfgang Nejdl. Analyzing User Behavior to Rank Desktop Items. In String Processing and Information Retrieval, 13th International Conference, SPIRE 2006, Proceedings, pages 86–97, 2006.

[CGNP05] Paul-Alexandru Chirita, Stefania Costache, Wolfgang Nejdl, and Raluca Paiu. Beagle++: Semantically Enhanced Searching and Ranking on the Desktop. In The Semantic Web: Research and Applications, 3rd European Semantic Web Conference, ESWC 2006, Proceedings, pages 348–362, 2006.

[WTN00] Steve Whittaker, Loren Terveen, and Bonnie A. Nardi. Let’s stop pushing the envelope and start addressing it: a Reference Task Agenda for HCI. Human Computer Interaction, volume 15, pages 75–106, 2000.

[McG95] Joseph E. McGrath. Methodology matters: doing research in the behavioral and social sciences. Human-computer interaction: toward the year 2000, pages 152–169, 1995.

[CLWB01] Mark Claypool, Phong Le, Makoto Wased, and David Brown. Implicit interest indicators. In IUI ’01: Proceedings of the 6th international conference on Intelligent user interfaces, pages 33–40. ACM, 2001.

[TAAK04] Jaime Teevan, Christine Alvarado, Mark S. Ackerman, and David R. Karger. The perfect search engine is not enough: a study of orienteering behavior in directed search. In CHI ’04: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 415–422. ACM, 2004.

[WRJ02] Ryen W. White, Ian Ruthven, and Joemon M. Jose. Finding relevant documents using top ranking sentences: an evaluation of two alternative schemes. In SIGIR ’02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, pages 57–64. ACM, 2002.

[Voo02] Ellen M. Voorhees. The philosophy of information retrieval evaluation. In CLEF ’01: Revised Papers from the Second Workshop of the Cross-Language Evaluation Forum on Evaluation of Cross-Language Information Retrieval Systems, pages 355–370, London, 2002.

Page 25: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Many thanks to:

Sergey and Eelco


Study participants

YOU !!

Page 26: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Related work


Implicit Feedback

Explicit Feedback

Single domain (Web, Email)

Cross domain

Dragontalk

Connections

Beagle++

Stuff I’ve Seen

LifeStreams

Haystack

MyLifeBits

[TAAK04]

[WRJ02]

Page 27: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Collected data


Page 28: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

A glimpse into user behavior

File access over folder hierarchy


Page 29: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

A glimpse into user behavior

Web page visit length


Page 30: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Alternative to the public Desktop dataset


Dataset 1

Desktop dump

Logging Framework

Dataset 2

Desktop dump

Logging Framework

Dataset 3

Desktop dump

Logging Framework

Comparable Soft-repeatable

Common output

Common structure

Page 31: Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop

Seems hard, but…


“It is possible”[BLA06],[APRILFOOL08],[HAHA07] DEADLINE