Evaluating Information Searching in Digital Cultural Heritage: Thinking Outside the (Search) Box? Paul Clough Information School, University of Sheffield, UK Presented at the Evaluating Use and Impact Workshop 2016

Upload: scotdigich

Post on 19-Feb-2017


Page 1: Paul Clough Sheffield iSchool Evaluating Info Searching in Digital Cultural Heritage

Evaluating Information Searching in Digital Cultural Heritage: Thinking Outside the (Search) Box?

Paul Clough
Information School, University of Sheffield, UK

Presented at the Evaluating Use and Impact Workshop 2016

Page 2: Outline

• Evaluating search success
• Summary of the PATHS project
• Evaluation activities in PATHS
• Some issues and challenges

Page 3: What makes a search system successful?

• Whether it retrieves ‘relevant’ documents
• How quickly it returns results
• How well it supports user interaction
• Whether the user is satisfied with the results
• How easily users can use the system
• Whether the system helps users resolve their information needs, carry out their tasks or make decisions
• Whether the system impacts on the wider use environment
• …

Which of these are the most important? It depends on who you ask, the users and their context.

Page 4: Focus (and contexts) of evaluation

Tefko Saracevic (1995)

Page 5: Evaluating search

• Most typical evaluation in IR focuses on assessing the quality of search results (system-oriented evaluation or IR)
– Evaluation is typically comparative (System A vs. System B)
– The most common evaluation criteria include relevance, retrieval effectiveness and retrieval efficiency
– Common evaluation measures include precision, recall and speed of response
– Methods include standardised benchmarks (e.g. test collections) or ad hoc (heuristic) testing
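The set-based measures named above can be sketched in a few lines. This is a minimal illustration rather than anything from the talk: the document identifiers, the relevance judgements and the two result lists (standing in for "System A" and "System B") are all hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    precision = relevant items retrieved / items retrieved
    recall    = relevant items retrieved / relevant items in the collection
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical relevance judgements for one query in a test collection.
relevant = {"d1", "d3", "d5", "d8"}

# Hypothetical result lists returned by two systems for that query.
system_a = ["d1", "d2", "d3", "d4"]
system_b = ["d1", "d3", "d5", "d9", "d10"]

for name, results in [("A", system_a), ("B", system_b)]:
    p, r = precision_recall(results, relevant)
    print(f"System {name}: precision={p:.2f} recall={r:.2f}")
```

In a comparative evaluation these figures would normally be averaged over many queries from a test collection before concluding that one system outperforms another.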

Page 6: Evaluating search

• But often we want to measure aspects of retrieval performance beyond system effectiveness (user-oriented evaluation or Interactive IR)
– User satisfaction with results, usability of the interface, engagement, user performance with a task and effects of system changes on user behaviour
– Common criteria include satisfaction, usability, utility, etc.
– Evaluation methods include lab-based controlled experiments, naturalistic observation, predictive evaluation, etc.
– Measures often include characteristics of interaction (e.g. number of queries issued), performance measures (e.g. number of relevant documents saved) or subjective measures (e.g. usability, engagement)
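As an illustration of the interaction and performance measures mentioned above, the sketch below summarises one logged user session. The event format, the action names ("query", "view", "save") and the item identifiers are assumptions made for this example and do not describe any particular system's log.

```python
from dataclasses import dataclass


@dataclass
class Event:
    action: str        # hypothetical action names: "query", "view" or "save"
    item_id: str = ""  # empty for actions that do not involve an item


def summarise_session(events, relevant_ids):
    """Compute simple interaction and performance measures for one session."""
    queries = sum(1 for e in events if e.action == "query")
    saved = {e.item_id for e in events if e.action == "save"}
    return {
        "queries_issued": queries,
        "items_saved": len(saved),
        "relevant_items_saved": len(saved & set(relevant_ids)),
    }


# Hypothetical session log and relevance judgements for the task.
session = [
    Event("query"),
    Event("view", "i1"),
    Event("save", "i1"),
    Event("query"),
    Event("save", "i7"),
    Event("save", "i9"),
]
summary = summarise_session(session, relevant_ids={"i1", "i9", "i4"})
print(summary)
```

Subjective measures such as usability or engagement would come from questionnaires rather than logs, so they are not covered here.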

Page 7: Landscape of (I)IR evaluation

Kelly, D. (2009). Methods for evaluating interactive information retrieval systems with users. Foundations and Trends in Information Retrieval, 3(1-2), 1-224. DOI: 10.1561/1500000012.

Page 8: Thinking outside the search box

• Evaluation typically focuses on the search box, but search-based applications are typically rich in features to support information searching and seeking
• Many search applications involve multiple components
– e.g. visualisations, recommendations, taxonomies, facets
• In practice evaluation will take place during system development (formative and summative) using combinations of system- and user-oriented methods
• But how do we evaluate components beyond the search box from a system- and user-oriented perspective?

Clough, P. (2015) Evaluation: Thinking Outside the (Search) Box, In Proceedings of the Forum for Information Retrieval Evaluation (FIRE '14), Prasenjit Majumder, Mandar Mitra, Madhulika Agrawal, and Parth Mehta (Eds.), ACM, New York, NY, USA, pp. 1-9. http://ir.shef.ac.uk/cloughie/papers/Clough_FIRE2014.pdf

Page 9: The PATHS project

• PATHS (Personalised Access To cultural Heritage Spaces) project funded under EU FP7
• Multidisciplinary project involving academic and industrial partners from various disciplines
– Cultural Heritage, Library and Information Science and Computer Science
• Developed techniques to support expert and non-expert users with navigating and using cultural heritage materials from Europeana
• Investigated use of trails/paths to facilitate narrative-like structures through digital collections for use as guides and learning aids (like exhibitions/guides in physical space)

“Cultural heritage involves rich and highly heterogeneous collections that are challenging to archive and convey to the general public” Hardman, L., Aroyo, L., van Ossenbruggen, J. and Hyvönen, E. (2009) Using AI to Access and Experience Cultural Heritage, IEEE Intelligent Systems, 24(2), pp. 23-25.

Page 10: The PATHS system

• More generally the PATHS system aims to support information seeking and the ‘information journey’
– Recognising an information need
– Acquiring information
– Interpreting and validating this information
– Using the information
• PATHS also aims to support exploration (exploratory search) and help users to make sense of concepts and items in a digital library collection (sense-making)
– Providing functionalities to overview collections, aid interpretation of information, use information to create paths

Clough, P. (2015) Supporting Exploration and Use of Digital Cultural Heritage Materials, EuropeanaTech Insight, Issue 4. http://ir.shef.ac.uk/cloughie/publications.html

Page 11: Guiding interaction model

“Find → Collect → Use”

Page 12: Interface components

• Interface components developed to support activities in the conceptual model and specific requirements gathered from user studies
– Standard search box and facets
– Thematic map-based visualisation (similar to Google Maps)
– Thesaurus based on data-driven concept hierarchy
– Links to related items (based on typed similarity)
– Item-level (non-personalised) recommendations (based on mining Europeana logs)
– Features for creating, editing, publishing and following ‘paths’ (tree structures)
• Components used in desktop and mobile (iPad) interfaces

http://paths.sheffield.ac.uk/pathsui
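The slides do not say how the Europeana logs were mined, so the following is only a sketch of one common way to derive item-level, non-personalised recommendations from usage logs: counting how often two items are viewed in the same session. The session data, item names and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count, for every item, how often each other item shares a session with it."""
    co = defaultdict(Counter)
    for items in sessions:
        # Use each item once per session so repeated views are not over-counted.
        for a, b in combinations(sorted(set(items)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(co, item_id, k=3):
    """Return up to k items most often co-viewed with item_id (ties broken by name)."""
    counts = co.get(item_id, Counter())
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical sessions: each list holds the items one visitor viewed.
sessions = [
    ["vase", "coin", "map"],
    ["vase", "coin"],
    ["coin", "map"],
    ["vase", "helmet"],
]
co = build_cooccurrence(sessions)
print(recommend(co, "vase", k=2))
```

Because the counts ignore who the visitor is, every user looking at the same item sees the same suggestions, which is what "non-personalised" means in the list above.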

Page 14: Example ‘paths’

Goodale, P., Clough, P., Hall, M., Stevenson, M., Fernie, K., Griffiths, J., and Agirre, E. (2013) Pathways to Discovery: Supporting Exploration and Information Use in Cultural Heritage Collections. In Proceedings of Museums and the Web Asia 2013, Hong Kong, 9-12 December, 2013.

Page 15: Evaluation activities in PATHS

• System development followed a classic user-centred approach of requirements gathering → prototyping → evaluation [repeat]
• Evaluation of components (evaluations carried out by researchers to select best algorithms; had to learn from domains beyond search)
– Search box
– Recommender systems
– Visualisations
– Related/similar items
– Subject hierarchies and facets
• System architecture/infrastructure testing (evaluations carried out by software developers)
• Evaluation of user interface designs (evaluations carried out by UI designers)
• Evaluation of the integrated prototype (evaluations carried out by ‘end users’)
– Controlled lab-based user testing
– Field trials

Page 16: Example: lab-based user testing

Page 17: Issues and challenges

• Many issues and challenges face evaluation when we think outside the search box, including
– Combining user- and system-oriented approaches (e.g. to ‘inform and predict’)
– Understanding the relationship between evaluation criteria (and associated measures)
– Sharing evaluation practices between domains and disciplines
– Thinking beyond ad hoc search tasks
– Combining the evaluation results (e.g. does the whole = sum of parts?)
– Evaluating whole-page relevance
• What constitutes success?
• It depends on the stakeholder, the user and their context

Page 18: Paul Clough Sheffield iSchool Evaluating Info Searching in Digital Cultural Heritage

Otegi, A., Agirre, E. and Clough, P. (2014) Personalised PageRank for making recommendations in digital cultural heritage collections, In Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL), 8-12 September 2014, pp. 49-52. [Recommendations]

Hall, M., Fernando, S., Clough, P., Soroa, A., Agirre, E., and Stevenson, M. (2014) Evaluating hierarchical organisation structures for exploring digital libraries, Information Retrieval, Volume 17(4), pp. 351-379. [Automatic hierarchy induction]

Aletras, N., Stevenson, M. and Clough, P. (2013) Computing Similarity between Items in a Digital Library of Cultural Heritage, Journal on Computing and Cultural Heritage, Volume 5(4), Article 16. [Similarity of items]

Goodale, P., Clough, P., Hall, M., Stevenson, M., Fernie, K., Griffiths, J., and Agirre, E. (2013) Pathways to Discovery: Supporting Exploration and Information Use in Cultural Heritage Collections. In Proceedings of Museums and the Web Asia 2013, Hong Kong, 9-12 December, 2013. [Analysis of manually-created paths]

Agirre, E., Aletras, N., Clough, P., Fernando, S., Goodale, P., Hall, M., Soroa, A., and Stevenson, M. (2013) PATHS: A System for Accessing Cultural Heritage Collections, In Proceedings of 51st Annual Meeting of the Association for Computational Linguistics (ACL'13), Sofia, Bulgaria, August 4-9 2013. pp. 151-166. [Project overview]

Hall, M. and Clough, P. (2013) Exploring Large Digital Library Collections using a Map-based Visualisation, In Proceedings of The International Conference on Theory and Practice of Digital Libraries (TPDL 2013), pp. 216-227. [Collection visualisations]
