Recognizing User Interest and Document Value from Reading and Organizing Activities in Document Triage Rajiv Badi, Soonil Bae, J. Michael Moore, Konstantinos Meintanis, Anna Zacchi, Haowei Hsieh, Frank Shipman and Cathy Marshall Center for the Study of Digital Libraries & Department of Computer Science Texas A&M University Microsoft Corporation

Post on 15-Jan-2016


Page 1: Recognizing User Interest and Document Value from Reading and Organizing Activities in Document Triage Rajiv Badi, Soonil Bae, J. Michael Moore, Konstantinos

Recognizing User Interest and Document Value from Reading and Organizing Activities in Document Triage

Rajiv Badi, Soonil Bae, J. Michael Moore, Konstantinos Meintanis, Anna Zacchi, Haowei Hsieh, Frank Shipman and Cathy Marshall

Center for the Study of Digital Libraries & Department of Computer Science, Texas A&M University

Microsoft Corporation

Page 2

What is Document Triage?

● People quickly evaluate a large set of documents, selecting which ones to read

● People organize them into a personal information collection

● People re-read the documents, progressively refining the organization

● Knowledge forms incrementally as initial understanding becomes more refined over time

A specific form of information collecting, reading and organizing


Page 3

Prior Document Triage Study (2004)

● Task: acting as a reference librarian, organize the documents to help a teacher prepare a set of lessons on ethnomathematics

● 24 subjects

● 40 documents from NSDL & Google searches

● Organizing tool: Visual Knowledge Builder (VKB)

● Reading tool: Internet Explorer (IE)

● Logged reading & editing events

● Asked subjects to select the five most useful and five least useful documents


Page 4

Initial Document List

[Figure: the initial document list in the organizing tool. Labeled elements: document object, collection, metadata (page title, page URL, summary), NSDL Search, Google Search, system-generated visualization based on metadata.]

Page 5

Document in a Web Browser


Page 6

Final Organization Sample

[Figure: a sample final organization. Labeled elements: categories (collections), background color, border color, border thickness.]

Page 7

Proactive Support for Document Triage

1. Recognizing user interest and document value

2. Representing user interests

3. Recognizing documents of interest

4. Visualizing interest information

Motivations

Page 8

Recognizing User Interest (1)

● Explicit and implicit interest indicators

● Correlation between reading activity and user interest: reading time, # of visits, # of scrolls, …

● Correlation between organizing activity and user interest: resize, move, delete, …

● Correlation between document attributes and user interest: # of characters, # of links, # of images, …

Page 9

Recognizing User Interest (2)

● Prior work has focused on a single application as the source for interest indicators

● Document triage occurs in the context of multiple applications

● An interest profile is the basis for determining, sharing and storing implicit interest


Page 10

Interest Profile Manager

[Architecture diagram. Components: Reading Application, Organizing Application, Overview Application, Interest Profile Manager, Interest Profile, Interest Models. The label "Communication" appears four times in the diagram.]
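
The slide names the components but not their interfaces. As a minimal sketch of the idea of one manager collecting activity from several applications, the following assumes a hypothetical event schema (`ActivityEvent`) and hypothetical method names (`report`, `indicators`); none of these come from the original system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityEvent:
    """One logged user action reported by an application (hypothetical schema)."""
    app: str             # e.g. "reading" or "organizing"
    doc_url: str         # document the action applies to
    kind: str            # e.g. "scroll", "resize", "reading_time"
    amount: float = 1.0  # a count or a duration, depending on kind


class InterestProfileManager:
    """Accumulates events from several applications into one per-document profile."""

    def __init__(self):
        # _profile[doc_url][(app, kind)] -> accumulated amount
        self._profile = defaultdict(lambda: defaultdict(float))

    def report(self, event):
        """Called by any application to add one event to the shared profile."""
        self._profile[event.doc_url][(event.app, event.kind)] += event.amount

    def indicators(self, doc_url):
        """Return a plain-dict snapshot of the indicators for one document."""
        return dict(self._profile.get(doc_url, {}))


manager = InterestProfileManager()
manager.report(ActivityEvent("reading", "http://example.org/a", "scroll"))
manager.report(ActivityEvent("reading", "http://example.org/a", "reading_time", 42.5))
manager.report(ActivityEvent("organizing", "http://example.org/a", "resize"))
manager.report(ActivityEvent("reading", "http://example.org/a", "scroll"))
snapshot = manager.indicators("http://example.org/a")
```

Keying each indicator by both application and event kind keeps reading and organizing activity separable, so a model can use either source alone or both together.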

Page 11

Data Analysis (1)

Document Attributes: # of characters, # of links, # of images

Reading Activity: reading time, # of clicks, # of text selections, # of scrolls, # of scrolling direction changes, time spent in scrolling, scroll offset, # of document accesses

Organizing Activity: # of object moves, # of object resizes, # of object deletions, # of content changes, # of background color changes, # of border color changes, # of border width changes

Page 12

Data Analysis (2)

● Identified correlations of user activity and document attributes with user interest

● Found meaningful interest indicators in user activity: reading time, # of scrolls, # of resize events, …

● Found meaningful interest indicators in document attributes: # of characters, # of links, # of images, …

● No single indicator can dominantly identify user interest

● Significant differences between individual styles
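
The slides do not say which correlation measure was used. As an illustration only, the sketch below assumes the Pearson coefficient and computes it between one indicator and an interest rating; the sample values are made up and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only, one entry per document.
reading_time_sec = [10.0, 20.0, 30.0, 40.0]
interest_rating = [0.0, 1.0, 1.0, 2.0]
r = pearson(reading_time_sec, interest_rating)
```

Running the same function over every indicator gives a ranking of candidate indicators, which fits the observation that several are informative but none dominates.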

Page 13

Interest Models

● Models to estimate average interest in documents

Model name: data used

Statistical models:

● Reading activity model: reading activity

● Organizing activity model: organizing activity

● Combined model: reading & organizing activity

Qualitative model: reading & organizing activity

Page 14

Evaluation (1)

● The same task and topic as in the prior study in 2004

● 16 subjects

● 40 documents from NSDL & Google searches

● Asked subjects to select the five most useful and five least useful documents

● Scaled these selections to a continuous value between 0 (least useful) and 2 (most useful)

● Calculated the absolute value of the difference between the explicit user rating and each model's predicted rating
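
The slide does not spell out how selections became a continuous value. One plausible reading, assumed here purely for illustration, is to score each subject's choice as 2 (most useful), 0 (least useful) or 1 (not selected) and average across subjects. The function names and sample data below are hypothetical.

```python
def explicit_rating(doc, selections):
    """Average score for one document across subjects, in the range 0 to 2.

    selections: list of (most_useful, least_useful) sets, one pair per subject.
    Assumed scoring: 2 if picked as most useful, 0 if least useful, 1 otherwise.
    """
    scores = []
    for most, least in selections:
        if doc in most:
            scores.append(2.0)
        elif doc in least:
            scores.append(0.0)
        else:
            scores.append(1.0)
    return sum(scores) / len(scores)


def absolute_error(explicit, predicted):
    """Absolute difference between the explicit rating and a model's prediction."""
    return abs(explicit - predicted)


# Four made-up subjects, each picking one most and one least useful document.
selections = [
    ({"d1"}, {"d3"}),
    ({"d1"}, {"d2"}),
    ({"d2"}, {"d3"}),
    ({"d1"}, {"d3"}),
]
rating_d1 = explicit_rating("d1", selections)
error_d1 = absolute_error(rating_d1, 1.5)  # 1.5 stands in for a model's prediction
```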


Page 15

Evaluation (2)

● Combined and qualitative models using reading and organizing activity show better performance than others

[Figure: frequency of residue error for each model. Axes: Models (Reading, Organizing, Combined, Qualitative), Residue Error (5% bins from 0 - 5% to 30 - 35%), Frequency (0% to 45%).]
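
A chart of this kind can be produced by binning each model's absolute errors. The sketch below assumes that error is expressed as a percentage of the 0 to 2 rating range and grouped into 5% bins; that normalization is a guess, and the input values are made up.

```python
from collections import Counter

RATING_RANGE = 2.0  # ratings run from 0 to 2
BIN_WIDTH = 5       # percentage points per bin


def error_histogram(abs_errors):
    """Map bin label -> share of predictions whose error falls in that bin."""
    counts = Counter()
    for err in abs_errors:
        percent = 100.0 * err / RATING_RANGE        # assumed normalization
        low = int(percent // BIN_WIDTH) * BIN_WIDTH  # lower edge of the bin
        counts[f"{low} - {low + BIN_WIDTH}%"] += 1
    total = len(abs_errors)
    return {label: n / total for label, n in counts.items()}


# Made-up absolute errors for one model, for illustration only.
hist = error_histogram([0.05, 0.07, 0.15, 0.55])
```

Computing one such histogram per model and plotting them side by side would show which models concentrate their errors in the lowest bins.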

Page 16

Conclusion

● Predictive models based on user activity collected from multiple applications have been built

● Using user activity from multiple applications rather than a single application can improve prediction accuracy

● A software infrastructure, the Interest Profile Manager, has been developed to support this approach