
The University of Sheffield

Curriculum Vitae

Paul David Clough

Contact details: Information School University of Sheffield Room 226, Regent Court, 211 Portobello Street, Sheffield, S1 4DP UK.

Tel : +44 (0) 114 2222664 Fax : +44 (0) 114 2780300 [email protected] http://ir.shef.ac.uk/cloughie/


July, 2011 Page 2 of 29


Summary

General

• Senior Lecturer in Information Retrieval – Information School (formerly Department of Information Studies), University of Sheffield (UK).

Teaching

• Coordinator of four undergraduate and postgraduate courses: Database Design (UG and PGT); Digital Multimedia (UG) and Information Systems in Healthcare (PGT). Delivered lectures on several other courses in the Information School.

• Involved in supervising 10 PhD students since appointment (2 successfully completed). Co-supervising 1 PhD student in Computer Science.

• External examiner for 5 PhDs and internal examiner for 2 PhDs.

• External advisor on the MSc, PGDip and PGCert in Geospatial Intelligence run by the Royal School of Military Survey (RSMS) and Cranfield University (from 2011).

Leadership, Management and Administration

• Postgraduate exams officer from July 2010 (until 2014).

• Member of the Departmental Research Committee (DRC) from 2010 (until 2014).

• Programme coordinator of BA undergraduate programmes (2006-2009).

Professional and External Development Activities

• Programme co-chair for the European Conference on Information Retrieval (ECIR) 2011 and member of the Senior Programme Committee for the Conference on Information and Knowledge Management (CIKM) 2010 and 2011.

• Co-organiser of 6th Workshop on Geographic Information Retrieval (GIR'10) and Session Track at the 2010 Text REtrieval Conference (TREC). Organiser of TrebleCLEF Query Log Analysis Workshop (QLAW) 2009.

• Involved in the Cross Language Evaluation Forum (CLEF) for several years, serving on the steering committee and organising tracks (GeoCLEF, ImageCLEF and iCLEF).

• Reviewer for many of the major journals, conferences and workshops in my field.

• Member of the editorial board of two journals: (i) International Journal of Digital Library Systems and (ii) Plagiary.

• External reviewer for research projects for the Science Foundation Ireland (SFI), the UK Engineering and Physical Sciences Research Council (EPSRC) and the Arts and Humanities Research Council (AHRC).

• Fellow of the UK Higher Education Academy.

• Given 17 invited talks or research seminars at organisations worldwide.

Research & Publications

• Internationally recognised researcher in information storage and retrieval, particularly multi-lingual searching of texts and images, image retrieval and geographic search; evaluation of retrieval systems; text re-use and plagiarism detection.

• Principal or co-investigator on 6 research grants and contracts since employment in 2005 worth ~£3.8 million in total (~£1 million for Sheffield).

• Co-author of 1 book (in press), 2 edited books, 5 book chapters, 15 refereed journal papers, 8 professional articles, 23 refereed conference papers, 10 refereed conference short papers/posters, and 51 refereed workshop papers since 2001.

• Founder and co-founder of two internationally recognised research collaborations (ImageCLEF and GeoCLEF) that have run for 12 years in total and involved 60 international research groups producing hundreds of papers.

• Awarded US Patent for a new information management system.


1 PERSONAL DETAILS

SURNAME: Clough FORENAMES: Paul David

DEPARTMENT: Information School (formerly Information Studies)

DATE OF BIRTH: 28th December 1974

CITIZENSHIP: British

EDUCATION AFTER SCHOOL: University of Sheffield, 1999-2002

University of York, 1995-1998

Suffolk College, 1990-1995

QUALIFICATIONS: Postgraduate Certificate in Higher Education (PGCHE), University of Sheffield, 2008

D.Phil. Department of Computer Science, University of Sheffield, 2002

BEng. Computer Science, University of York (First Class Honours), 1998

BTEC Higher National Certificate Eng. (Software Engineering), Suffolk College (Distinction), 1995

BTEC National Certificate Eng. (Electronics and Communications), Suffolk College (Distinction), 1993

CURRENT APPOINTMENTS: Senior Lecturer in Information Retrieval (Information School), University of Sheffield, 2010-present

Lecturer in Information Systems (Information School), University of Sheffield, 2005-2010

Research Assistant (Information Studies), University of Sheffield, 2002-2005

Research Assistant (Computer Science), University of Sheffield, 1999-2002

PREVIOUS APPOINTMENTS: Manager, British Telecommunications Plc, 1998-1999

Technician, British Telecommunications Plc, 1994-1998

BT Laboratories training programme for technicians, British Telecommunications Plc, 1991-1994


2 TEACHING

2.1 CURRENT UNDERGRADUATE AND POSTGRADUATE TEACHING

Masters Course Title | Dates | No.* | Contribution | Student Contact Hours
INF6050 – Database Design | 2009-present | 33 | Module coordinator and main lecturer | 20 hours lectures and 4 hours seminars
INF6512 – Information Systems in Health | 2008-present | 27 | Module coordinator and main lecturer (distance learning) | 16 hours lectures
INF6090 – Information Storage and Retrieval Research | 2007-present | 13 | Lecturer | 4 hours lectures
INF6017 – Content Management Systems | 2009-present | 12 | Lecturer | 6 hours lectures
INF6840 – Archives and Records Management | 2009-present | 17 | Lecturer | 2 hours lectures
INF6040 – Business Intelligence | 2008-present | 52 | Lecturer | 3 hours lectures
PG Dissertations | 2005-present | 7 | Supervise student dissertations | 7-10 hours per student

* figures for 2009-10

Undergraduate Course Title | Dates | No.* | Contribution | Student Contact Hours
INF205 – Database Design | 2009-present | 30 | Module coordinator and main lecturer | (lectures and seminars shared with INF6050)
INF208 – Digital Multimedia | 2007-present | 20 | Module coordinator and main lecturer | 26 hours lectures and practicals
INF209 – Information Storage and Retrieval Research | 2007-present | 15 | Lecturer | (lectures shared with INF6090)
INF317 – Content Management Systems | 2009-present | 6 | Lecturer | (lectures shared with INF6017)
INF304 – Business Intelligence | 2008-present | 13 | Lecturer | 2 hours lectures
UG Dissertations | 2005-present | 1 | Supervise student dissertations | 7-10 hours per student

* figures for 2009-10


2.2 PREVIOUS UNDERGRADUATE AND POSTGRADUATE TEACHING

Course Title | Dates | No. | Contribution | Student Contact Hours
INF205 (INF305)/6050 – Database Design | 2005-2008 | ~90 | Lecturer | 8 hours lectures
INF211 (INF311)/6110 – Information Systems Modelling | 2005-2008 | ~90 | Lecturer | 8 hours lectures
INF6400 – E-Business | 2005-2007 | 14 | Lecturer | 4 hours lectures
INF6440 – Electronic Publishing | 2005-2007 | 12 | Lecturer | 6 hours lectures
INF312 – Information Management in the Digital Economy | 2005-2007 | 12 | Lecturer | 4 hours lectures
INF6400 – Information Systems and the Society | 2005-2007 | 12 | Lecturer | 2 hours lectures
INF6001 – Information Systems Project Management | 2006-2008 | ~40 | Lecturer | 6 hours practical sessions

2.3 TEACHING INNOVATION AND DEVELOPMENT

Creating new modules and re-vamping existing ones

In 2008 I developed a new course on Information Systems in Health (INF6512) for the MSc Health Informatics aimed mainly at healthcare professionals (e.g. UK NHS staff). This involved putting into practice all that I learned in the second year of the Sheffield University Certificate in Learning and Teaching course on designing modules. This included creating learning objectives and appropriate assessment criteria, and generating 9 new lectures (all 2 hour) and a range of online resources. The module is taught in a distance learning environment using new technologies from WIMBA. Overall the course ran well and the following comments were particularly pleasing (based on student feedback): “I have found this a thoroughly enjoyable module which has given me a good introduction to the design and implementation of information systems which I feel will be very useful to me as my career develops in the future” and “I have been able to utilise aspects of this module immediately in my job which has been particularly satisfying.”

In 2007 I became module coordinator of the (optional) undergraduate module INF208 Digital Multimedia. I re-designed the module as part of completing a Certificate in Learning and Teaching (leading towards a PGCHE) and used a model of constructive alignment to make the learning objectives, lecture material and practical sessions tie together more coherently. The module regularly has 20-30 students registering for it, including those from Computer Science and the Management School. One of the objectives of the module has been to teach students Adobe Flash, professional animation software, to demonstrate principles from the lectures and expose students to a professional software tool used by animators worldwide. The quality of students’ work has been very high and has been demonstrated at undergraduate open days and our research panel advisory day.


In 2009 I took over as module coordinator of the undergraduate and postgraduate Database Design module (INF205/6050). Around 75-80 students take this module each year. I completely re-designed the course to make it more focused on data management and provide a more unified view on database design. I wrote 8 new lectures (2 hours each) for the course on a variety of topics and created a new assessment for the course: one part assesses students’ abilities in designing and implementing relational databases using Oracle (professional database software), and the rest assesses how well students can explain their understanding of databases to non-technical managers of an SME and provide an overview of current research trends. The quality of students’ work has been outstanding and a number of Computer Science students have taken this module to refresh their understanding of database design.

Teaching students professional software

To develop students’ transferable skills I have endeavoured to include exposure to professional software within the modules I teach where possible. Since 2007 Adobe Flash, professional software for time-based authoring of animations, has featured as a core part of the practical sessions for INF208 Digital Multimedia. I use Flash to reinforce theoretical principles of animation taught in the lectures, and the coursework is based on developing a professional animation to communicate, through multimedia, a news event. In the undergraduate and postgraduate Database Design module (INF205/6050), Peter Stordy and I use Oracle’s DBMS to introduce relational database design and teach SQL. Oracle is widely used in industry for data storage solutions.

Supervising Computer Science dissertations

In 2007 and 2008 I was involved, with Prof. Mark Sanderson (Information Studies) and Dr. Mark Stevenson (Computer Science), in supervising Computer Science students who took projects within the DARWIN initiative, a group software development project. One of these groups contributed towards a submission to the journal Language Resources and Evaluation that was published in 2010.

Strong student feedback

I have successfully supervised undergraduate students with a number of personal issues. In 2008 I was shortlisted (12 out of 124 recommendations) for a campaign run by the Student Union called “I love my personal tutor”. I have consistently received positive feedback from students for the modules that I have taught and supervised. For example, from the 2010 INF6512 Information Systems in Health cohort I received the following comment: “The lectures were very thorough and even though there was a lot of technical detail, it was explained well with good examples. I was able to relate the module to my job which meant that I could use both my own experiences and information from the lectures in the assignment”.

In 2008 I received a formal letter of thanks from Major Alan Easingwood (Cartographer in the British Army) for the quality of supervision for his Master’s dissertation.

Outreach to schools

In 2006 and 2007 I gave a talk on literary forensics to local Sheffield College students studying forensics as part of the Audiometry Day, organised by the Department of Computer Science. In 2009, working with Mr. Peter Stordy and Prof. Mark Sanderson from our department, I contributed to setting up and running the department’s inaugural “excellence hub”. This was a one-day workshop for schools to bring classes of pupils into the University to give them a flavour of what undergraduate courses are like. I presented a lecture and exercise on search engines to a class of pupils from two schools.

International teaching

I gave a seminar at the cross-language search summer school in Pisa, Italy in 2009. This was attended by approximately 30 postgraduate research students from around the world and included invited talks from a number of well-known academics in the field of cross-language IR. I gave two tutorials: "Multilingual Search Assistance: Interactive Aspects of Cross Language Information Access" and "MultiMatch: A Multilingual/Multimedia Search Engine for Cultural Heritage: Interface Design and Demo" (15-19 June 2009).

Guest lecturer at the University of Valencia (Valencia, Spain); gave lectures to around 20 PhD students on "Research in Geographical Information Retrieval" and "Measuring Text Reuse in the British Press Industry" (June 2009).

Guest lecturer at the Universidad Nacional de Educación a Distancia (UNED) (Madrid, Spain); gave lectures to around 40 PhD students on plagiarism detection and business intelligence (Nov. 2008).

2.4 OTHER TEACHING ACTIVITIES

In 2003 I was invited to write a report for the UK Plagiarism Advisory Service on the technical aspects of plagiarism detection, one of my research areas. I wrote a report entitled “Old and new challenges in automatic plagiarism detection” that has been cited 31 times.

I coordinated INF6001 Information Systems Project Management whilst the module coordinator was on sabbatical.

2.5 EXTERNAL EXAMINING

External reviewer (and panel member) for Universidad Nacional de Educación a Distancia (UNED) of the following thesis: “Harnessing Folksonomies for Resource Classification”, Arkaitz Zubiaga Mendialdua (July 2011).

External reviewer for RMIT University of the following thesis: “Source code authorship attribution”, Steven Burrows (Feb 2011).

External reviewer (and panel member) for Polytechnic University of Valencia of the following thesis: "Toponym Disambiguation in Natural Language Processing", Davide Buscaldi (September 2010).

PhD examiner at the University of Bradford for “Equivalence class model for structured peer-to-peer information retrieval systems” (2008).

External reviewer of the MSc Information Management programme as a part of a five-yearly Institution-Led Subject Review of the Department of Information Management at Aberdeen Business School, Robert Gordon University (2010).

External advisor on the MSc, PGDip and PGCert in Geospatial Intelligence run by the Royal School of Military Survey (RSMS) and Cranfield University in 2009. I have been invited to act as external examiner on the course from 2011.


2.6 INTERNAL EXAMINING

PhD thesis examiner at the University of Sheffield (Information Studies) for 2 PhDs (Robert Polding in 2009 and Wen-Chin Hsu in 2011).

3 LEADERSHIP, MANAGEMENT AND ADMINISTRATION

3.1 CURRENT ACTIVITIES

Postgraduate Examinations Officer (2010-). The primary duty of this role is to define the means of assessment for postgraduate teaching in the department. This means being responsible for ensuring postgraduate student course work is returned to students in a specified time, that assessment of work has been conducted properly, dealing with requests for extensions, and most importantly acting as the initial point of access for all student inquiries regarding PGT modules. The role also requires attending all postgraduate exam boards and chairing those where the head of department is not present.

First Year Undergraduate Mentor (2010-). The primary duty of this role is to provide pastoral support for first year undergraduate students in the department. The role also includes helping to coordinate departmental activities organised for introduction week.

Member of the Departmental Research Committee (DRC) (2010-). Duties of this role include attending DRC meetings, reviewing research activities within the department and contributing to discussions about departmental research strategies.

3.2 PREVIOUS ADMINISTRATIVE ACTIVITIES

Programme Coordinator (2006-2009). The primary duty of this role was to coordinate the recruitment of students onto the BA programmes run in conjunction with the Sheffield University Management School (SUMS). The dual degrees were Business Management/Information Management (NP21/MGTU17) and Accounting & Finance/Information Management (NP41/MGTU18). The role included dealing with undergraduate admissions (UCAS forms), reviewing the curriculum and helping organise and run Open Days. This role required interacting with students, prospective students and their families, the admissions office, other members of staff from our department, as well as members of staff from the Management School.

Member of the Board of the Faculty of Pure Science (2006-2009). In this role I acted as a departmental representative on the Faculty of Pure Science. Duties included attending faculty meetings and contributing to faculty learning and teaching discussions.

Dissertation coordinator on the MSc Health Informatics Programme (2006-2009). The primary duty of this role was to keep track of the progress of dissertation students on the Masters level Health Informatics programme (distance learning), help to allocate appropriate supervisors and assist with the practical organisation of annual day schools (3 per year).

4 PROFESSIONAL AND EXTERNAL STANDING

4.1 EDITORIAL

Editorial board


I am a member of the editorial board for the following journals: (i) International Journal of Digital Library Systems, published by IGI Global, and (ii) Plagiary, published by the Scholarly Publishing Office, University of Michigan.

Programme co-chair

Programme co-chair of the world’s 3rd most prestigious conference on IR: the European Conference on Information Retrieval (ECIR) 2011. http://www.ecir2011.dcu.ie/

Lab Organizing Committee Chair for the Cross Language Evaluation Forum (CLEF) 2011 conference (equivalent to PC co-chair). http://clef2011.org/

4.2 REVIEWING

Programme committees

On the Senior Programme Committee for the second most prestigious conference in information and knowledge management: the Conference on Information and Knowledge Management (CIKM) 2010 (http://www.yorku.ca/cikm10/) and 2011 (http://www.cikm2011.org/).

Involved in the mentoring programme for the ACM Special Interest Group on Information Retrieval (SIGIR) in 2008 and 2009.

Regularly on the programme committees of various conferences and workshops in my field including European Conference on Information Retrieval (ECIR), Association for Computing Machinery Conference of the Special Interest Group in Information Retrieval (ACM SIGIR), Conference on Information and Knowledge Management (CIKM), European Association for Computational Linguistics (EACL), Conference on Empirical Methods in Natural Language Processing (EMNLP), European Semantic Web Conference (ESWC) and the European Conference on Digital Libraries (ECDL).

Research proposal reviewing

I have reviewed grant applications for Science Foundation Ireland (SFI), the UK Engineering and Physical Sciences Research Council (EPSRC) and the UK Arts and Humanities Research Council (AHRC).

Journal/conference reviewing

Reviewer for the journals: Journal of the American Society for Information Science and Technology, Computers, Environment and Urban Systems, Database Systems for Advanced Applications, ACM Transactions on Software Engineering and Methodology, International Journal on Geographic Information Systems, Information Processing and Management, Transactions on Information Systems, ACM Transactions on Asian Language Information Processing, ACM Transactions on the Web, Computers and the Humanities, Computer Speech and Language, Journal of Machine Learning Research, Plagiary (cross-disciplinary studies in plagiarism, fabrication and falsification), Language Resources and Evaluation, Software: Practice and Experience, Transactions on Knowledge and Data Engineering.

Reviewer for the conferences: ACM Special Interest Group on Information Retrieval, European Conference on Information Retrieval, European Semantic Web Conference, Joint Conference on Digital Libraries, Conference on Information and Knowledge Management, European Association for Computational Linguistics, Conference on Empirical Methods in Natural Language Processing, IRF Conference, European Conference on Digital Libraries.

Reviewer for the workshops: Workshop on Geographic Information Retrieval, Workshop on Simulated Interaction, International Workshop on Content-Based Multimedia Indexing, Cross Language Evaluation Forum, PAN Workshop “Uncovering Plagiarism, Authorship and Social Software Misuse”, Workshop on Web 2.0 and Natural Language Engineering tasks, Workshop on Language Technology for Cultural Heritage Data, Multimedia Mining and Image Understanding, International Workshop on Location and the Web, Geographic Information on the Internet Workshop.

4.3 CONFERENCE AND WORKSHOP ORGANISATION

Lab Organising Committee Chair for the Cross Language Evaluation Forum (CLEF) 2011 conference (equivalent to PC co-chair). The premier conference on multilingual information systems in Europe (and worldwide).

Programme co-chair for the premier IR conference in Europe: the European Conference on Information Retrieval (ECIR) 2011. http://www.ecir2011.dcu.ie/

Co-organiser of 6th Workshop on Geographic Information Retrieval (GIR'10), Zurich 18-19 February 2010. http://www.geo.uzh.ch/~rsp/gir10/

Co-organiser of Session Track at Text REtrieval Conference (TREC 2010). http://ir.cis.udel.edu/sessions/

Organiser of TrebleCLEF Query Log Analysis Workshop "Query Log Analysis: From Research to Best Practice" (QLAW2009), May 27-28 2009, London. This workshop focused on the field of query log analysis and involved a range of speakers from around the world to provide input and discussion on the topic (Jim Jansen, USA; Lynn Silipigni Connaway, USA; Filip Radlinski, Microsoft UK; Vanessa Murdock, Yahoo! Spain; Mark Levene, UK; Bettina Berendt, Belgium). http://ir.shef.ac.uk/cloughie/qlaw2009/

Involved in the Cross Language Evaluation Forum (CLEF) for several years, both on the steering committee and as a track organiser (ImageCLEF and iCLEF). Involved in setting up a cross-language image retrieval track (called ImageCLEF) in 2003 (this still runs). http://www.clef-campaign.org/

4.4 INVITED TALKS/LECTURES

Invited speaker at the Enterprise Search Europe Conference 2011. My talk is entitled “The challenges of multilingual search” (24 October 2011).

Invited talk to academic staff from the HCI group at York University and also two visitors from the Open Society Archives from the Central European University in Budapest. The talk was entitled "Using pathways for navigating and personalising access to cultural heritage materials" (2 June 2011).

Invited talk to the Information Retrieval group at Glasgow University. The talk was entitled "Using pathways for navigating and personalising access to cultural heritage materials" (14 March 2011).

Invited speaker to Department of Information Science at Loughborough University. The talk was entitled “MultiMatch: Multilingual Information Access to Cultural Heritage Materials” (4 February 2011).


Invited speaker at the Online Computer Library Centre, Inc. in Dublin (Ohio, USA). Talk entitled “The MultiMatch Project”, part of the Distinguished Seminar Series (23 September 2009).

Invited speaker at the TrebleCLEF Summer School on Multilingual Information Access (Pisa, Italy). Tutorials: "Multilingual Search Assistance: Interactive Aspects of Cross Language Information Access" and "MultiMatch: A Multilingual/Multimedia Search Engine for Cultural Heritage: Interface Design and Demo" (15-19 June 2009).

Invited speaker at the University of Valencia (Valencia, Spain). Talks included "Experiments involving a Web2.0 application called Flickr", "Research in Geographical Information Retrieval" and "Measuring Text Reuse in the British Press Industry" (June 2009).

Invited speaker at the MAVIR 2008 event (Madrid, Spain), "Multimedia retrieval and evaluation" (November 2008).

Guest lecturer at the Universidad Nacional de Educación a Distancia (UNED) (Madrid, Spain), lectures on plagiarism detection and business intelligence (November 2008).

Invited to speak at an interdisciplinary workshop held at Cambridge University (Cambridge, UK), "Measuring Text Reuse in Journalism". The workshop was entitled "Inspiration, Innovation, or Infringement: Multidisciplinary Perspectives on Piracy and Copyright", organised by the Centre for Intellectual Property & Information Law and held at Emmanuel College, Cambridge (July 1 2008).

Umbrella 2007 conference organised by the Chartered Institute of Library and Information Professionals (University of Hertfordshire, UK), "Trends in multimedia retrieval - the professional implications" (June 2007).

Panel session on evaluation of medical retrieval systems at ISHIMR2007 conference (Sheffield, UK), "Current state-of-the-art in image retrieval" (July 2007).

Panel session on evaluation of image retrieval systems at American Society for Information Science and Technology (ASIST) (Texas, USA), "ImageCLEF: the CLEF Cross-Language Image Retrieval Campaign" (November 2006).

The DELOS Network of Excellence on Digital Libraries Demo Day, Bibliothèque Nationale de France, François-Mitterrand site (Paris, France), "Building and Evaluating Systems for Cross-Language Image Retrieval" (1 February 2006).

The MICHAEL (Multilingual Inventory of Cultural Heritage in Europe) conference: towards a catalogue for the European digital library (Bristol, UK), "Bridging the Language Gap: making digital collections available to a multilingual society" (15 November 2005).

Invited talk at Universidad Iberoamericana, Torreon, Coahuila (Torreon, Mexico), "SPIRIT: spatially-aware information retrieval on the Internet" (August 2005).


Invited talk at Lancaster University to the Corpus Linguistics Research Group (CRG) (Lancaster, UK), "Measuring Text Reuse" (May 2005).

Invited talk at Geneva University (Geneva, Switzerland), "Cross-language image retrieval" (17 March 2005).

Invited talk at the Press Association (London, UK), "METER - Measuring Text Reuse" (4 November 2004).

Keynote speaker at a Digital Forum event organised by Business Link South Yorkshire (Sheffield, UK), "SPIRIT: Spatially-Aware Information Retrieval on the Internet" (23 September 2004).

Keynote speaker at RIAO'2004 Conference (Avignon, France), "The CLEF cross language image retrieval campaign - ImageCLEF" (May 2004).

4.5 MEMBERSHIP OF PROFESSIONAL BODIES

Fellow of the UK Higher Education Academy (recognition reference 41201) from 9th August 2010.

Member of American Society for Information Science & Technology (ASIS&T) (2007-2008).

4.6 KNOWLEDGE TRANSFER IN THE FORM OF CONSULTANCIES

In 2008 I was invited as an external reviewer for the UK National Archives (TNA) for their search committee and discussed potential project work for the coming months on geo-referencing their archived collections. This resulted in work described in (Clough et al., 2011). Prof. Mark Sanderson and I provided consultancy on how the search engine managers at TNA could monitor activity on their system, and provided general information on how users typically interact with search engines (log analysis). TNA funded a 6-month project (April – September 2010) to formalise this collaboration (worth ~£70,000).

4.7 LANGUAGE RESOURCES

I have created a number of language resources for analysing activities such as text re-use and plagiarism and building test collection resources for evaluating IR systems. These have been used by research groups worldwide:

The METER Corpus: this resource was released in June 2003 and contains 1,716 documents and was created for the study and analysis of text re-use in the newspaper industry. The corpus consists of a set of news stories written by the Press Association (PA), the major UK news agency, and a set of stories about the same news events as published in nine British newspapers. This has been cited in 23 research papers to date.

A Corpus of Plagiarised Short Answers: this resource was released in May 2010 and consists of 100 documents that can be used for the development and evaluation of plagiarism detection systems. Texts in the corpus reflect the types of plagiarism practised by students in an academic setting as far as realistically possible. This has been cited in 2 research papers to date.

ImageCLEF test collections: this evaluation campaign (from 2003) has included developing a number of test collections for distribution to CLEF participants. These are available from http://www.imageclef.org. These have been used by approximately 200 researchers worldwide.

4.8 PATENTS

Whilst working for BT at Adastral Park (UK) I was involved with designing a personalised information management tool that could be used to filter information according to registered users, group and project profiles which are defined by sets of keywords. The tool was used for searching and locating relevant information in relation to projects and a user interface controls access by users to view or modify projects and to share associated information. The tool is registered as a US Patent (Number 6424968).

5 RESEARCH & PUBLICATIONS

5.1 RESEARCH AREAS

My research interests mainly revolve around developing technologies to assist people with accessing and managing information. Much of my research has focused on Information Retrieval (IR) of natural language texts within various domains and contexts. In particular, I have worked on Multi-Lingual Information Retrieval (MLIR), Text-Based Image Retrieval (TBIR), Geographic Information Retrieval and user- and system-oriented evaluation of IR systems. More recently I have been interested in understanding users and their information needs, and have adopted a more user-centred attitude to my research. For example, in the EU-funded MultiMatch project we developed a cross-language search user interface using a user-centred approach, and a current research project is aiming to develop a recommender system for digital libraries, also using a user-centred approach. In addition to my research in IR, I am also interested in how people use, and re-use, the information that they find. In particular, I have analysed legitimate text re-use within the context of the British Press (METER), and also published on plagiarism and its detection within an educational context (considered as illegitimate re-use). A further theme of my research has been to create re-usable resources (corpora and test collections) for the wider research community. A substantial proportion of my research has involved non-academic institutions or industrial collaborators, including the UK Press Association, Ordnance Survey, the UK National Archives, OCLC Inc., Alinari, the Netherlands Sound and Vision Archives, Press Association Images and Tate Online. This has allowed me to investigate ‘real’ information access problems and facilitate knowledge transfer.

Text re-use and derivation
Text re-use is the activity whereby pre-existing written texts are used again to create a new text or version(s).
I have investigated text re-use within commercial (text re-use in the British Press) and academic (plagiarism in Higher Education) contexts, creating resources for analysing text re-use and benchmarking systems (the METER Corpus and a Corpus of Plagiarised Short Answers). Projects related to this work include:

• METER: I worked on the METER (Measuring Text Reuse) project as a researcher and for my PhD thesis. The METER project was funded by the EPSRC (Engineering and Physical Sciences Research Council) and sponsored by the UK Press Association (PA) and aimed to investigate the issue of automatically detecting and measuring text re-use between PA source texts and corresponding newspaper articles.

Multi-Lingual IR (MLIR)
Multi-Lingual IR (MLIR) and Cross-Lingual IR (CLIR) describe the situation in which the user’s query language is different from the language of the document collection (which may be one or many languages) and techniques must be employed to help users cross the language barrier. I have researched the use of Machine Translation (MT) and bilingual dictionaries for query and document translation (in the Eurovision project), investigated methods for improving cross-language search using Natural Language Processing (NLP) techniques, investigated the design of interactive CLIR and been involved in the organisation of large-scale evaluation campaigns for MLIR/CLIR. Projects related to this work include:

• Eurovision: The Eurovision project explored the cross-language retrieval of images via their associated textual metadata. I developed cross-language tools and prototypes in this project in addition to carrying out research into multi-lingual image retrieval.

• MultiMatch: The MultiMatch (Multilingual/Multimedia Access to Cultural Heritage) project was aimed at enabling users to explore and interact with online cultural heritage content, across media types and language boundaries. I led work on designing the user interface, was involved in work on semantic annotation of cultural heritage information from the UK’s Tate Online, and explored the use of various forms of visualisation for exploring the information space.

Text-Based Image Retrieval (TBIR)
With the proliferation of devices able to produce and capture visual material, designing effective methods to search and browse visual material is paramount. My focus has mainly been on Text-Based Image Retrieval (TBIR) approaches (as opposed to Content-Based approaches) that exploit textual metadata for information retrieval. I have also explored image retrieval in the context of amateur photography and people’s access to their own photographic collections within a family context. Projects related to this work include:

• Memoir: This project investigated the technology, ethics and psychology of storing and accessing a life-time of personal information. My work activities were related to personal multimedia management, particularly photos (e.g. how do we collect multimedia data, what do we collect and why, what is the role of audiovisual material within personal social structures such as the family?)

Geographic IR (GIR)
Geographic IR (GIR) systems are used to exploit geo-referenced information and enhance search and browse capabilities using these spatial semantics. My research has mainly focused on: geo-tagging (extracting geo-references from natural language and assigning them spatial coordinates); assisting with the development of a GIR system (SPIRIT); investigating notions of relevance within the context of evaluation of GIR systems; investigating the presentation of spatial information; and generating techniques for deriving boundaries for imprecise regions based on mining Web-based sources. Projects related to this work include:

• SPIRIT: The SPIRIT (Spatially Aware Access to Information on the Internet) project was engaged in the design and implementation of a search engine to find documents and datasets on the web relating to places or regions referred to in a query. I was mainly involved in developing techniques to extract placenames from natural language texts and ground these to specific locations (i.e. coordinates). These techniques were used to create a prototype spatially-aware search engine for testing and evaluation of new techniques in geographical information retrieval.

IR evaluation
Evaluation is the process of assessing the ‘worth’ of something, and evaluating the performance of Information Retrieval (IR) systems is an important part of the development process. My research has included both system-oriented (e.g. ImageCLEF) and user-oriented (e.g. iCLEF) approaches to IR system evaluation. I have been involved in organising large-scale evaluation exercises, mainly for the Cross Language Evaluation Forum (CLEF). I am also co-organiser of a track in the U.S. Text REtrieval Conference (TREC) on query reformulation, called the Session Track. I have recently been studying evaluation in the context of enterprise search at the UK National Archives (TNA). This work is also investigating the area of query log analysis for evaluating TNA’s search systems. Projects related to this topic include:

• IIF@TNA: The IIF@TNA (Improving Information Finding at The National Archives) project aims at improving access to data managed by TNA. The project involves analysing TNA's main web server logs to establish the range of subjects being searched by online visitors to their archives and identifying common information searching behaviours. Additionally we are creating a methodology that will allow TNA to evaluate their existing and future search products and services.

• TrebleCLEF: This was an EU-funded Coordination Action (CA) designed to bring together investigators working in the field of evaluation for multilingual information access to consolidate and promote best practice. The specific target for this project was the European digital library community. I contributed to discussions on evaluation of CLIR systems, wrote deliverables and ran a workshop on query log analysis and evaluation (QLAW2009).

Future Research Aims
Given the ubiquity of search and the prevalence of information in all areas of our lives, my future research interest is in developing effective retrieval technologies that support users as they seek to fulfil their information needs. I believe that in the future explicit semantics will play a more active role in information access as projects like the Semantic Web and Linked Data create more semantically-rich information; using and exploiting semantic information for search will become increasingly important in IR. The shift is also likely to focus on non-topical aspects of information, i.e. opinion rather than fact, as more user-generated content is made accessible. Allowing users to access these additional ‘dimensions’ of natural language texts will require a better understanding of such attributes. The growing amount of non-English content and increased number of scenarios that require crossing the language boundaries will mean that cross-language search continues to require investigation. This is needed not just at the level of matching algorithms, but also in understanding the information seeking behaviours and information needs of diverse groups of users and the functionality required for effective multi-lingual search. Finally, there is growing interest in supporting information discovery and serendipity through applications such as recommender systems. I am starting to explore this direction with a current AHRC project on developing a recommender system for the OCLC Inc. WorldCat online catalogue. I plan to continue researching information retrieval in this changing landscape and within different contexts with a focus on the end user, their information needs and required support. Given the ease with which information is also made available for re-use I also plan to continue researching text re-use and derivation, with a focus on identifying patterns of re-use in language as a whole and studying re-writing in more depth. This will lead to a better understanding of originality of text and ownership, and development of techniques to identify re-used vs. original texts, and if re-used, the possible sources. I believe all of this is only possible with a tighter coupling between Natural Language Processing (NLP) and IR.

5.2 RESEARCH GRANTS AND INCOME

Total: £3,796,000, of which £1,023,000 to Sheffield.
Total as PI: £2,822,000, of which £476,000 to Sheffield.
Total as Co-PI: £974,000, of which £547,000 to Sheffield.

Key to funding agencies: AHRC – U.K. Arts and Humanities Research Council; EPSRC – U.K. Engineering and Physical Sciences Research Council; CEC – Commission of the European Communities.

I have been Principal Investigator (PI) or Co-Principal Investigator (Co-PI) for the following research grants:

• CEC 7th Framework STREP “Personalized Access To Cultural Heritage Spaces (PATHS)”. Sheffield grantholders: Paul Clough (PI), Mark Stevenson (PI). Total value: £2 million, of which to Sheffield £254,000.

• AHRC Collaborative Doctoral Award (CDA) “User-Centered Design of a Recommender System for a 'Universal' Library Catalogue”. September 2010 – September 2013. Grantholders: Paul Clough (PI) and Barbara Sen. Value: ~£70,000.

• The UK National Archives (TNA) “Improving Information Finding at the UK National Archives”. April 2010 – September 2010. Grantholders: Paul Clough (PI) and Mark Sanderson (Co-PI). Value: £70,000.

• CEC 7th Framework Coordinated Action “TrebleCLEF: Evaluation, Best Practices and Collaboration for Multilingual Information Access”. Sheffield grantholders: Mark Sanderson (PI) and Paul Clough (Co-PI). Total value: ~£500,000, of which to Sheffield £73,000.

• CEC 6th Framework STREP “Multilingual/Multimedia Access To Cultural Heritage (MultiMatch)”. Sheffield grantholders: Paul Clough (PI), Mark Sanderson (Co-PI), Fabio Ciravegna (Co-PI) and Daniela Petrelli (Co-PI). Total value: £2.6 million, of which to Sheffield £254,000.

• CEC Marie Curie Transfer of Knowledge “Memoir: Learning how technology can help people create and manage long-term personal memories”. Sheffield grantholders: Steve Whittaker (PI), Mark Sanderson (Co-PI), Daniela Petrelli (Co-PI) and Paul Clough (Co-PI). Value: £474,000.

• EPSRC CASE Studentship “Defining Imprecise Regions Using Knowledge from the Web”. Grantholder: Paul Clough (PI). Value: £82,000.

5.3 RESEARCH DEGREE SUPERVISION

Currently primary supervisor for 8 PhD students: Simon Wakeling “User-Centered Design of a Recommender System for a 'Universal' Library Catalogue”, Munirah Abdulhadi “Towards enriching metadata descriptions with tags in a bilingual academic library context”, Basheer Al Farwan “Automatically constructing gazetteers by mining the web”, Monica Lestari Paramita “Methods to Build Comparable Corpora”, Rita Wan-Chik “Answers from the Quran: Online information seeking needs”, Paula Goodale “Constructing Personal Narratives in Cultural Heritage Spaces Online” and Murad Abouammoh “Investigation into Diversity in Information Retrieval” (the latter two students were taken over from Prof. Mark Sanderson who left the department in 2010).

Primary supervisor for 2 PhD students who have successfully passed: Rob Pasley “Finding and defining vernacular geography using knowledge from unstructured data sources” and Antje Bothin “Analysing Meeting notes and their role in automatic meeting summarization”.

Currently secondary supervisor for 1 PhD student: Muhammed Adeel Rao “Natural Language Processing for plagiarism detection” in conjunction with Computer Science.

I have co-supervised 3 PhDs that successfully completed: Azzah Al Maskari “Beyond classical measures: how to evaluate the effectiveness of interactive information retrieval systems?”, Shahram Sedghi “Relevance criteria used by health professionals in selecting medical images for educational purposes” and Johannes Schanda “Novelty detection in remote databases” (this last student is making final corrections).

Regularly supervise 7-8 Master’s dissertation projects on the INF6340 Research Methods and Dissertation Preparation. Four of my dissertation projects have been part of the Ordnance Survey MSc Dissertation Programme and 3 of these students have won sponsorship prizes for their work (Stephen Aladiran in 2006, Alan Easingwood in 2008 and Paul Hurst in 2010).

Successfully ran a Darwin project with Dr. Mark Stevenson in Computer Science on creating a corpus of simulated plagiarism examples, which resulted in a journal paper submission to LRE (Language Resources and Evaluation).

5.4 PUBLICATIONS

Summary: h-index of 20 (estimated 1,436 citations with 8.09 cites/paper). The two most cited papers are “The CLEF 2005 Cross–Language Image Retrieval Track” (cited 96 times) and “Plagiarism in natural and programming languages: an overview of current tools and techniques” (cited 85 times). 20 papers have been co-authored with students.

• Books in Press: 1 – see section 5.4.1
• Chapters in Press: 1 – see section 5.4.2
• Books in Print: 2 – see section 5.4.3
• Chapters in Print: 4 – see section 5.4.4
• Refereed Journals in Press: 2 – see section 5.4.5
• Refereed Journals in Print: 15 – see section 5.4.6
• Refereed Conference Papers: 34 – see section 5.4.7
• Articles in Professional Publications: 8 – see section 5.4.8
• Refereed Workshop Papers: 51 – see section 5.4.9
• Technical Reports: 10 – see section 5.4.10

5.4.1 BOOKS - IN PRESS

Peters, C., Braschler, M. and Clough, P. Multilingual Information Retrieval, Springer: Heidelberg, Germany (Forthcoming 2011).

5.4.2 CHAPTERS - IN PRESS

Clough, P. User-related issues in multilingual access to multimedia collections, In Dobreva, Dwyer and Feliciati (eds) User Studies for Digital Library Development, Facet Publishing (Forthcoming 2011).

5.4.3 BOOKS - IN PRINT

Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., and Murdock, V. (2011) Advances in Information Retrieval - 33rd European Conference on IR Research, ECIR 2011, Dublin, Ireland, April 18-21, 2011. Proceedings. Springer: Heidelberg, Germany, LNCS 6611.

Müller, H., Clough, P., Deselaers, T. and Caputo, B. (2010) ImageCLEF - Experimental Evaluation of Visual Information Retrieval, Springer: Heidelberg, Germany (Publication date: September 2010). 495 pages.

5.4.4 CHAPTERS - IN PRINT

Clough, P., Müller, H. and Sanderson, M. Seven Years of Image Retrieval Evaluation (2010) In Müller, H., Clough, P., Deselaers, T. and Caputo, B. (eds) ImageCLEF - Experimental Evaluation of Visual Information Retrieval, Springer, Heidelberg, Germany (Publication date: September 2010).

Grubinger, M., Nowak, S. and Clough, P. Data sets Created in ImageCLEF (2010) In Müller, H., Clough, P., Deselaers, T. and Caputo, B. (eds) ImageCLEF - Experimental Evaluation of Visual Information Retrieval, Springer, Heidelberg, Germany (Publication date: September 2010).

Clough, P. Measuring Text Re-Use in the News Industry (2010) In Lionel Bently, Jennifer Davis, and Jane C. Ginsburg (eds) Copyright and Piracy: An Interdisciplinary Critique, Cambridge University Press (Publication date: October 2010).

Clough, P. and Gaizauskas, R. (2009) Corpora and Text Re-use, In Anke Lüdeling, Merja Kytö and Tony McEnery (eds) Handbook of Corpus Linguistics. (Series: Handbooks of Linguistics and Communication Science), 1249–1271, Mouton de Gruyter.

5.4.5 REFEREED JOURNALS - IN PRESS

P = primary author; J = joint author; S = involves student as co-author

J S Sedghi, S., Sanderson, M., and Clough, P. Medical image resources used by health care professionals, Aslib Proceedings, (Forthcoming 2011).

J Petrelli, D. and Clough, P. Analysing User’s Queries for Cross-Language Image Retrieval from Digital Library Collections, The Electronic Library, (Forthcoming 2011).

5.4.6 REFEREED JOURNALS - IN PRINT

P = primary author; J = joint author; S = involves student as co-author

P Clough, P., Hall, M., Warner, A., and Tang, J. (2011) Linking Archival Data to Location: A Case Study at the UK National Archives, Aslib Proceedings, Volume 63(2/3), pp. 127-147.

J Cox, A. M., Clough, P., and Siersdorfer, S. (2011) Developing Metrics to Characterize Flickr Groups, Journal of the American Society for Information Science and Technology, Volume 62(3), pp. 493-506.

J Hanbury, A., Müller, H., and Clough, P. (2010) Introduction to Special Issue on Image and Video Retrieval Evaluation (Editorial), Computer Vision and Image Understanding, Vol. 114 (2010), pp. 409-410.

P Clough, P. and Stevenson, M. (2010) Developing A Corpus of Plagiarised Short Answers, Language Resources and Evaluation, Published online 16 January 2010.

P S Clough, P. and Eleta, I. (2010) Investigating Language Skills and Field of Knowledge on Multilingual Information Access in Digital Libraries, International Journal of Digital Library Systems, Vol. 1(1), pp. 89-103.

J Whittaker, S., Bergman, O. and Clough, P. (2010) Easy on that Trigger Dad: A Study of Long Term Family Photo Retrieval, Personal and Ubiquitous Computing , Volume 14(1), pp. 31-43.

P Clough, P., Ireson, N. and Marlow, J. (2009) Extending Domain-Specific Resources to Enable Semantic Access to Cultural Heritage Data, Journal of Digital Information, Volume 10(6). Available online: https://journals.tdl.org/jodi/issue/view/89

J S Sedghi, S., Sanderson, M. and Clough, P. (2008) A Study on the Relevance Criteria for Medical Images, Pattern Recognition Letters, Vol. 29(15), Nov. 2008, pp. 2046-2057.

J Rorissa, A., Clough, P. and Deselaers, T. (2008), Exploring the relationship between feature and perceptual visual spaces, Journal of the American Society for Information Science and Technology, Vol. 59(5), pp. 770-784.

J Jones, C., Purves, R.S., Clough, P., and Joho, H. (2008) Modelling Vague Regions with Knowledge from the Web, International Journal Geographic Information Systems, Vol. 22(10), pp. 1045-1065.

J S Cox, A., Clough, P. and Marlow, J. (2008), Flickr: a first look at user behaviour in the context of photography as serious leisure, Information Research, 13(1) paper 336. [Available, 6 January, 2008 at http://InformationR.net/ir/13-1/paper336.html].

J Deselaers, T., Müller, H., Clough, P., Ney, H. and Lehmann, T. (2007), The CLEF 2005 Automatic Medical Image Annotation Task, International Journal of Computer Vision, Vol. 74(1), pp. 51-58.

J Purves, R.S., Clough, P., Jones, C.B., Arampatzis, A., Bucher, B., Finch, D., Fu, G., Joho, H., Khirini, A.S., Vaid, S., and Yang, B. (2007), The Design and Implementation of SPIRIT: a Spatially-Aware Search Engine for Information Retrieval on the Internet, International Journal Geographic Information Systems, Vol. 21(7), January 2007, pp. 717-745.

P Clough, P. and Sanderson, M. (2006), User Experiments with the Eurovision Cross-Language Image Retrieval System, In Journal of the American Society for Information Science and Technology, Vol. 57(5), pp. 697 - 708.

J A. Arampatzis, M. van Kreveld., I. Reinbacher, C.B. Jones, S. Vaid, P.D. Clough, H. Joho, and M. Sanderson (2005), Web-based delineation of imprecise regions, In the Journal of Computers, Environment and Urban Systems, Vol. 30(4), pp. 436-459.

5.4.7 REFEREED CONFERENCE PAPERS

P = primary author; J = joint author; S = involves student as co-author

5.4.7.1 FULL PAPERS

J Kanoulas, E., Carterette, B., Clough, P., and Sanderson, M. (2011) Evaluating Multi-Query Sessions, In Proceedings of the 34th Annual ACM SIGIR Conference, Beijing, China, (In Print).

J Sanderson, M., Paramita, M., Clough, P. and Kanoulas, E. (2010) Do user preferences and evaluation measures line up?, In Proceedings of the 33rd Annual ACM SIGIR Conference, Geneva, Switzerland, pp. 555-562.

P Clough, P. and Stevenson, M. (2009) Creating A Corpus of Plagiarised Academic Texts, In Proceedings of Corpus Linguistics Conference 2009, Liverpool, UK , Article 131, Available Online: http://ucrel.lancs.ac.uk/publications/cl2009/

P Clough, P. and Sen, B. (2008) Evaluating Tagclouds for Health-Related Information Research, In Proceedings of the 13th International Symposium on Health Information Management Research (ISHIMR), Auckland, New Zealand, October 20-22 2008. (Nominated for Best Paper.)

J S Pasley, R., Clough, P., Purves, R. and Twaroch, F. (2008) Mapping Geographic Coverage of the Web, In Proceedings of 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS 2008), pp. 154-162.

J S Al-Maskari, A., Sanderson, M. and Clough, P. (2008), The Good and the Bad System: Does the Test Collection Predict Users’ Effectiveness?, In Proceedings of the 31st Annual International ACM SIGIR Conference (SIGIR2008), 20-24 July 2008, Singapore, pp. 59-66.

J Marlow, J., Clough, P., Ireson, N., Cigarrán Recuero, J., Artiles, J. and Debole, F. (2008), The MultiMatch Project: Multilingual/Multimedia Access to Cultural Heritage on the Web, In Museums on the Web Conference (MW2008): Proceedings, J. Trant and D. Bearman (eds). Toronto: Archives & Museum Informatics. 2008. Available online: http://www.archimuse.com/mw2008/abstracts/prg_335001834.html

J Marlow, J., Clough, P., Cigarrán Recuero, J. and Artiles, J. (2008), Exploring the Effects of Language Skills on Multilingual Web Search, In Proceedings of the 30th European Conference on IR Research (ECIR'08), Glasgow, UK, April 2008, LNCS4956, pp. 126-137.

J S Al-Maskari, A., Sanderson, M. and Clough, P. (2007), Arabic Users' Satisfaction with the Online Information as Obtained from Google , In Proceedings of Sixth International Conference on Conceptions of Library and Information Science (CoLIS), Borås, Sweden, August 13-16.

J S Marlow, J., Clough, P., and Dance, K. (2007), Multilingual needs of cultural heritage website visitors: A case study of Tate Online, In International Cultural Heritage Informatics Meeting (ICHIM07): Proceedings, J. Trant and D. Bearman (eds). Toronto: Archives & Museum Informatics. 2007. Published September 30, 2007 at http://www.archimuse.com/ichim07/papers/marlow/marlow.html.

J Müller, H., Clough, P., Hersh, W., Deselaers, T., Lehmann, T., and Geissbuhler, A. (2006), Using heterogeneous annotation and visual information for the benchmarking of image retrieval systems, SPIE conference Photonics West, Electronic Imaging, special session on benchmarking image retrieval systems, San Diego, February 2006. Vol. 6061. Available online: http://spie.org/x648.html?product_id=660259.

J Müller, H., Clough, P., Hersh, W., Deselaers, T., Lehmann, T. and Geissbuhler, A. (2005), Axes for the evaluation of medical image retrieval systems - the Image CLEF experience, In Proceedings of ACM Multimedia 2005 (Brave New Topics track), 6-12 November, Singapore, pp. 1014-1022.

J Bucher, B., Clough, P., Joho, H., Purves, R., and Syed, A. K. (2005), Geographic IR Systems: Requirements and Evaluation. In Proceedings of the 22nd International Cartographic Conference, A Coruña, Spain, CD-ROM.

J Purves, R., Clough, P. and Joho, H. (2005), Identifying imprecise regions for geographic information retrieval using the web, In Proceedings of GIS RESEARCH UK 13th Annual Conference, Glasgow, UK, pp. 313-318.

J Stevenson, M. and Clough, P. (2004), EuroWordNet as a Resource for Cross-Language Information Retrieval, In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC-04), May 2004.

P Clough, P. and Stevenson, M. (2004), Cross-Language Information Retrieval using EuroWordnet and Word Sense Disambiguation, In Advances in Information Retrieval, McDonald and Tait (Eds), Proceedings of the 26th European Conference on IR Research (ECIR'04), Sunderland, UK, April 2004, Springer-Verlag LNCS 2997, pp.327-337.

J Sanderson, M., Clough, P.D., Paterson C. and Tung Lo, W. (2004), Measuring a Cross Language Image Retrieval System, In Advances in Information Retrieval, McDonald and Tait (Eds), Proceedings of the 26th European Conference on IR Research (ECIR'04), Sunderland, UK, April 2004, Springer-Verlag LNCS 2997, pp.353-363.

P Clough, P. and Sanderson, M. (2004), The Effects of Relevance Feedback in Cross Language Image Retrieval, In Advances in Information Retrieval, McDonald and Tait (Eds), Proceedings of the 26th European Conference on IR Research (ECIR'04), Sunderland, UK, April 2004, Springer-Verlag LNCS 2997, pp.238-252.

P Clough, P. and Stevenson, M. (2004), Evaluating the Contribution of EuroWordNet and Word Sense Disambiguation to Cross-language Retrieval, In Proceedings of the Second International Global WordNet Conference (GWC-2004), Brno, Czech Republic. pp. 97-107, January 2004.

J Gaizauskas, R., Burnard, L., Clough, P. and Piao, S.L. (2003), Using the XARA XML-Aware Corpus Query Tool to investigate the METER Corpus. In Proceedings of Corpus Linguistics 2003, Lancaster, UK.

P Clough, P., Gaizauskas, R. and Piao, S. L. (2002), Building and annotating a corpus for the study of journalistic text reuse. In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC-02), pp.1678-1691 (Vol V), 29-31st May 2002, Las Palmas de Gran Canaria, Spain.

P Clough, P.D., Gaizauskas, R., Piao, S.L. and Wilks, Y. (2002), Measuring Text Reuse, In Proceedings of Association for Computational Linguistics (ACL2002), Philadelphia, PA, USA, pp.152-159.

J Gaizauskas, R., Foster, J., Wilks, Y., Arundel, J., Clough, P. and Piao, S.L. (2001), The METER corpus: a corpus for analysing journalistic text reuse, In Proceedings of Corpus Linguistics 2001, Lancaster, UK, pp.214-223.

5.4.7.2 SHORT PAPERS AND POSTERS

J S Bothin, A. and Clough, P. (2010) Quantitative Analysis of Individual Differences in Note-Taking and Talking Behaviour in Meetings, In Proceedings of IADIS Multi Conference on Computer Science and Information Systems 2010, Freiburg, Germany, 26-31 July. Available on CD-ROM.

P S Clough, P., Sanderson, M., Abouammoh, M., Navarro, S. and Paramita, M. (2009) Multiple approaches to Analysing Query Diversity, In Proceedings of the 32nd Annual ACM SIGIR Conference, Boston, Massachusetts, pp. 734-735.

J Sanderson, M., Tang, J., Arni, T. and Clough, P. (2009) What else is there? Search Diversity Examined, In Proceedings of the European Conference on Information Retrieval (ECIR'09), Toulouse, France, pp. 562-569.

J S Al-Maskari, A., Sanderson, M. and Clough, P. (2008), Relevance Judgments between TREC and Non-TREC Assessors, In Proceedings of the 31st Annual International ACM SIGIR Conference (SIGIR2008), 20-24 July 2008, Singapore, pp. 683-684.

J Carmichael, J., Larson, M., Marlow, J., Newman, E., Clough, P., Oomen, O., and Sav, S. (2008), Multimodal Indexing of Digital Audio-visual Documents: A Case Study for Cultural Heritage Data, In Proceedings of the Sixth International Workshop on Content-Based Multimedia Indexing (CBMI2008), London, UK, 18-20th June, pp. 93-100.

P S Clough, P. and Read, S. (2008), Key Design Issues with Visualising Images using Google Earth, In Proceedings of the 30th European Conference on IR Research (ECIR'08), Glasgow, UK, April 2008, LNCS4956, pp. 570-574.

J S Al-Maskari, A., Sanderson, M., and Clough, P.D. (2007), The Relationship between IR Effectiveness Measures and Users’ Satisfaction, In Proceedings of the ACM SIGIR2007 Conference, Amsterdam, Netherlands, pp. 773-774.

J Purves, R. and Clough, P. (2006), Judging spatial relevance and document location for Geographic Information Retrieval, extended abstract, In Proceedings of 4th International Conference on Geographic Information Science (GIScience 2006), Münster, Germany, September 2006, pp. 159-164.

P Clough, P., Sanderson, M. and Müller, H. (2004), A proposal for the CLEF Cross Language Image Retrieval Task 2004, In Proceedings of the 2004 CIVR conference, Dublin, Ireland, 2004, Springer-Verlag LNCS 3115, pp. 243-251.

J Sanderson, M. and Clough, P. (2004), Measuring Pseudo Relevance Feedback and CLIR, In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'04), pp. 484-485.

J Levin, S., Clough, P.D., and Sanderson, M. (2003), Assessing the effectiveness of pen-based input queries. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

5.4.6 ARTICLES IN PROFESSIONAL PUBLICATIONS

J Purves, R., Clough, P. and Jones, C. (2010) Highlights from GIR'10, ACM SIGSPATIAL Special, Vol. 2 (1), March 2010, pp. 17-23.

P Clough, P. and Berendt, B. (2009) Report on the TrebleCLEF Query Log Analysis Workshop 2009, ACM SIGIR Forum, pp. 71-77.

P Clough, P. (2007), Large-scale evaluation of cross-language image retrieval systems, ASIST Bulletin, Feb-Mar 2007.

J Grubinger, M., Clough, P., and Leung, C. (2006), IAPR TC-12 Benchmark for visual information search, International Association for Pattern Recognition Newsletter, Vol. 28(2), pp. 10-12.

J Müller, H. and Clough, P. (2006), The ImageCLEF Benchmark on Multimodal, Multilingual Visual Images, International Association for Pattern Recognition Newsletter, Vol. 28(2), pp. 13-17.

J Karlgren, J., Clough, P. and Gonzalo, J. (2006), Multilingual Interactive Experiments with Flickr, ERCIM News, 66, July 2006.

P Clough, P., Sanderson, M., and Reid, N. (2006), The Eurovision St Andrews Collection of Photographs, ACM SIGIR Forum, 40(1) (June 2006), pp. 21-30.

P Clough, P. (2003), Old and new challenges in automatic plagiarism detection, National UK Plagiarism Advisory Service (Online).

5.4.7 REFEREED WORKSHOP PAPERS

P Clough, P., Stevenson, M. and Ford, N. (2011) Personalizing Access to Cultural Heritage Collections using Pathways, In Proceedings of 3rd Workshop on Personalised Access to Cultural Heritage (PATCH 2011). In conjunction with IUI2011 Conference, Stanford University 13 Feb 2011, pp. 12-19.

P Clough, P., Gonzalo, J. and Karlgren, J. (2010) Creating Re-useable Log Files for Interactive CLIR, In SIGIR 2010 Workshop on the Simulation of Interaction (SimInt), Geneva 23 July 2010.

J Kanoulas, E., Clough, P., Carterette, B. and Sanderson, M. (2010) Session Track at TREC2010, In SIGIR 2010 Workshop on the Simulation of Interaction (SimInt), Geneva 23 July 2010.

P S Clough, P. and Pasley, R. (2010) Images and Perceptions of Neighbourhood Extents, In Proceedings of 6th Workshop on Geographic Information Retrieval (GIR'10), Zurich 18-19 February 2010.

J Warner, A. and Clough, P. (2009) A Proposal for Space Exploration at The National Archives, In Proceedings of Workshop: The Cultural Heritage of Historic European Cities and Public Participatory GIS, York (UK), September 2009.

J Lestari Paramita, M., Sanderson, M. and Clough, P. (2009) Diversity in Photo Retrieval: Overview of the ImageCLEFPhoto Task 2009, In Borri, F., Nardi, A., and Peters, C., editors, Working Notes for the CLEF 2009 Workshop.

J Gonzalo, J., Peinado, V., Clough, P. and Karlgren, J. (2009) Overview of iCLEF 2009: Exploring Search Behaviour in a Multilingual Folksonomy Environment, In Borri, F., Nardi, A., and Peters, C., editors, Working Notes for the CLEF 2009 Workshop.

J Lestari Paramita, M., Sanderson, M. and Clough, P. (2009) Developing a Test Collection to Support Diversity Analysis, In Proceedings of Redundancy, Diversity, and Interdependence Document Relevance workshop held at ACM SIGIR, pp. 39-45.

J S Easingwood, A. and Clough, P. (2009) An Evaluation of Publicly Accessible Geographic Information Websites, In Proceedings of the ECIR 2009 Workshop Geographic Information on the Internet, Toulouse, France, April 6, 2009, pp. 61-66.

J Arni, T., Clough, P., Sanderson, M. and Grubinger, M. (2009), Overview of the ImageCLEFphoto 2008 Photographic Retrieval Task, In Proceedings of 9th Workshop of the Cross-Language Evaluation Forum (CLEF'08), September 17-19 2008, LNCS 5706, pp. 500-511.

J Gonzalo, J., Clough, P. and Karlgren, J. (2009), Overview of iCLEF2008: Search Log Analysis for Multilingual Image Retrieval, In Proceedings of 9th Workshop of the Cross-Language Evaluation Forum (CLEF'08), September 17-19 2008, LNCS 5706, pp. 227-235.

J Tang, J., Arni, T., Sanderson, M. and Clough, P. (2009), Building a Diversity Featured Search System by Fusing Existing Tools, In Proceedings of 9th Workshop of the Cross-Language Evaluation Forum (CLEF'08), September 17-19 2008, LNCS 5706, pp. 560-567.

P Clough, P., Marlow, J. and Ireson, N. (2008), Enabling Semantic Access to Cultural Heritage: A Case Study of Tate Online, Larson, M., K. Fernie, J. Oomen and J. Cigarran (eds.) Proceedings of the ECDL 2008 Workshop on Information Access to Cultural Heritage, Aarhus, Denmark, September 18, 2008. ISBN 978-90-813489-1-1 [Online]

J Arni, T., Tang, J., Sanderson, M. and Clough, P. (2008), Creating a test collection to evaluate diversity in image retrieval, In Proceedings of the Workshop on Beyond Binary Relevance: Preferences, Diversity, and Set-Level Judgments, held at SIGIR2008.

P Clough, P., Gonzalo, J., Karlgren, J., Barker, E., Artiles, J. and Peinado, V. (2008), Large-Scale Interactive Evaluation of Multilingual Information Access Systems - the iCLEF Flickr Challenge, In Proceedings of Workshop on novel methodologies for evaluation in information retrieval, 30th European Conference on Information Retrieval, Glasgow, 30th March-3rd April, pp. 33-38.

J Grubinger, M., Clough, P., Hanbury, A., Mueller, H. (2008), Overview of the ImageCLEFphoto 2007 Photographic Retrieval Task, In Proceedings of 8th Workshop of the Cross-Language Evaluation Forum (CLEF'07), Budapest, September 2007, LNCS 5152, pp.433-444.

J Siersdorfer, S., Sizov, S. and Clough, P. (2007) Know the Right People? Recommender Systems for Web 2.0, Proceedings of Workshop on Knowledge and Experience Management, FGWM'07, pp. 330-337.

P S Clough, P., Pasley, R., Siersdorfer, S., San Pedro J. and Sanderson, M. (2007), Visualising the South Yorkshire Floods of '07, Proceedings of Workshop on Geographic Information Retrieval GIR'07.

J S Pasley, R., Clough, P. and Sanderson, M. (2007), Geo-Tagging for Imprecise Regions of Different Sizes, Proceedings of Workshop on Geographic Information Retrieval GIR'07.

J Minelli, S. H., Marlow, J., Clough, P., Cigarran Recuero, J.M., Gonzalo, J., Oomen, J. and Loschiavo, D. (2007), Gathering requirements for multilingual search of audiovisual material in cultural heritage, In Proceedings of Workshop on User Centricity - state of the art (16th IST Mobile and Wireless Communications Summit), Budapest, Hungary, 1-5 July, 2007.

P Clough, P., Grubinger, M., Deselaers, T., Hanbury, A. and Müller, H. (2007), Overview of the ImageCLEF 2006 Photographic Retrieval and Object Annotation Tasks, Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Alicante, Spain, September 20-22, 2006, Revised Selected Papers, Peters, C., Clough, P., Gey, F.C., Karlgren, J., Magnini, B., Oard, D.W., de Rijke, M., Stempfhuber, M. (Eds.), LNCS Vol. 4730, 2007, ISBN 978-3-540-74998-1, Softcover, pp. 579-594.

J Müller, H., Deselaers, T., Grubinger, M., Clough, P., Hanbury, A. and Hersh, W. (2007) Problems with Running a Successful Multimedia Retrieval Benchmark, In Proceedings of the third MUSCLE / ImageCLEF workshop on image and video retrieval evaluation, Budapest, Hungary, 19-21 September 2007, [online].

J Grubinger, M. and Clough, P. (2007) On the Creation of Query Topics for ImageCLEFphoto, In Proceedings of the third MUSCLE / ImageCLEF workshop on image and video retrieval evaluation, Budapest, Hungary, 19-21 September 2007, [online].

P S Clough, P., Al-Maskari, A. and Darwish, K. (2007), Providing Multilingual Access to Flickr for Arabic Users, Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Alicante, Spain, September 20-22, 2006, Revised Selected Papers, Peters, C., Clough, P., Gey, F.C., Karlgren, J., Magnini, B., Oard, D.W., de Rijke, M., Stempfhuber, M. (Eds.), LNCS Vol. 4730, 2007, ISBN 978-3-540-74998-1, Softcover, pp. 205-216.

J Gonzalo, J., Karlgren, J. and Clough, P. (2007), iCLEF2006 Overview: Searching the Flickr WWW Photo-Sharing Repository, Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Alicante, Spain, September 20-22, 2006, Revised Selected Papers, Peters, C., Clough, P., Gey, F.C., Karlgren, J., Magnini, B., Oard, D.W., de Rijke, M., Stempfhuber, M. (Eds.), LNCS Vol. 4730, 2007, ISBN 978-3-540-74998-1, Softcover, pp.

J S Al-Maskari, A., Clough, P., and Sanderson, M. (2006), Users' Effectiveness and Satisfaction for Image Retrieval, Workshop Information Retrieval 2006, University of Hildesheim, Germany, 9-11 October 2006, pp. 83-87.

P S Clough, P., Marlow, J. and Sanderson, M. (2006), Designing Multilingual Information Access to Tate Online, Workshop held at the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Workshop: New Directions in Multilingual Access, Seattle, August 2006.

P Clough, P., and Müller, H. (2006) Building and Evaluating Systems for Cross-Language Image Retrieval, DELOS Network of Excellence on Digital Libraries Demo Day, Bibliothèque Nationale de France, François-Mitterrand site, Paris, France, 1st Feb.

J Grubinger, M., Clough, P., Müller, H. and Deselaers, T. (2006), The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems, In Proceedings of International Workshop OntoImage’2006 Language Resources for Content-Based Image Retrieval, held in conjunction with LREC'06, Genoa, Italy, pp. 13-23.

P Clough, P., and Petrelli, D. (2006) Using Concept Hierarchies in Text-Based Image Retrieval: A User Evaluation, In Accessing Multilingual Information Repositories, Eds (Paul Clough, Julio Gonzalo, Thomas Mandl, Thomas Deselaers, Henning Müller, Horacio Rodríguez, Sven Hartrumpf, Felisa Verdejo, Alicia Ageno, Víctor Peinado), Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany, Volume 4022/2006, Softcover, pp. 297-306.

J Gey, F., Larson, R., Sanderson, M., Joho, H., Clough, P., and Petras, V. (2006) GeoCLEF: The CLEF 2005 Cross-Language Geographic Information Retrieval Track Overview, In Accessing Multilingual Information Repositories, Eds (Paul Clough, Julio Gonzalo, Thomas Mandl, Thomas Deselaers, Henning Müller, Horacio Rodríguez, Sven Hartrumpf, Felisa Verdejo, Alicia Ageno, Víctor Peinado), Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany, Volume 4022/2006, Softcover, pp. 908-919.

J Grubinger, M., Leung, C., and Clough, P. (2006) Linguistic Estimation of Topic Difficulty in Cross-Language Image Retrieval, In Accessing Multilingual Information Repositories, Eds (Paul Clough, Julio Gonzalo, Thomas Mandl, Thomas Deselaers, Henning Müller, Horacio Rodríguez, Sven Hartrumpf, Felisa Verdejo, Alicia Ageno, Víctor Peinado), Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany, Volume 4022/2006, Softcover, pp. 558-566.

J Gonzalo, J., Clough, P., and Vallin, A. (2006) Overview of the CLEF 2005 Interactive Track, In Accessing Multilingual Information Repositories, Eds (Paul Clough, Julio Gonzalo, Thomas Mandl, Thomas Deselaers, Henning Müller, Horacio Rodríguez, Sven Hartrumpf, Felisa Verdejo, Alicia Ageno, Víctor Peinado), Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany, Volume 4022/2006, Softcover, pp. 251-262.

P Clough, P., Müller, H., Deselaers, T., Grubinger, M., Lehmann, T., Jensen, J., and Hersh, W. (2006) The CLEF 2005 Cross–Language Image Retrieval Track, In Accessing Multilingual Information Repositories, Eds (Paul Clough, Julio Gonzalo, Thomas Mandl, Thomas Deselaers, Henning Müller, Horacio Rodríguez, Sven Hartrumpf, Felisa Verdejo, Alicia Ageno, Víctor Peinado), Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany, Volume 4022/2006, Softcover, pp. 535-557.

J S Sanderson, M., Tian, J. and Clough, P. (2006), Testing an automatic organisation of retrieved images into a hierarchy, In Proceedings of International Workshop OntoImage’2006 Language Resources for Content-Based Image Retrieval, held in conjunction with LREC'06, Genoa, Italy, pp. 44-49.

J Müller, H., Clough, P., Hersh, W. and Geissbuhler, A. (2006) Variation of Relevance Assessments for Medical Image Retrieval, In Proceedings of 4th International Workshop on Adaptive Multimedia Retrieval, University of Geneva, Switzerland (in print - to appear in LNCS Volume 3877).

P Clough, P., Gonzalo, J. and Karlgren, J. (2006) Multilingual interactive experiments with Flickr, In Proceedings of Workshop on New Texts - Wikis and blogs and other dynamic text sources, held in conjunction with the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL2006), Trento, Italy.

P Clough, P., Sanderson, M., and Shou, X.M. (2005) Searching and Organizing Images across Languages, In Proceedings of EVA (Electronic Imaging, the Visual Arts & Beyond), Moscow.

P Clough, P. (2005), Extracting Metadata for Spatially-Aware Information Retrieval on the Internet, In Proceedings of Workshop on Geographic Information Retrieval (GIR'05), held in conjunction with CIKM2005, Bremen, Germany, pp. 25-30.

P Clough, P., Joho, H. and Sanderson, M. (2005), Automatically Organising Images using Concept Hierarchies, Workshop held at the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Workshop: Multimedia Information Retrieval, August 15-19, 2005, in Salvador, Brazil.

J Müller, H., Clough, P., Geissbuhler, A. and Hersh, W. (2005) ImageCLEF 2004-2005: results, experiences and new ideas for image retrieval evaluation. In Proceedings of the Fourth International Workshop on Content-Based Multimedia Indexing (CBMI2005), Riga, Latvia, CD-ROM.

J Grubinger, M., Leung, C. and Clough, P. (2005) The IAPR Benchmark for Assessing Image Retrieval Performance in Cross Language Evaluation Tasks, In Proceedings of the first MUSCLE / ImageCLEF workshop on image and video retrieval evaluation, Vienna, Austria, 20th September 2005, pp. 33-50.

P Clough, P., Müller, H. and Sanderson, M. (2005), The CLEF 2004 Cross Language Image Retrieval Track, In Multilingual Information Access for Text, Speech and Images: Results of the Fifth CLEF Evaluation Campaign, Eds (Peters, C., Clough, P., Gonzalo, J., Jones, G., Kluck, M. and Magnini, B.), Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany, Volume 3491/2005, pp. 597-613.

P Clough, P. (2005), Caption vs. Query Translation for Cross-Language Image Retrieval, In Multilingual Information Access for Text, Speech and Images: Results of the Fifth CLEF Evaluation Campaign, Eds (Peters, C., Clough, P., Gonzalo, J., Jones, G., Kluck, M. and Magnini, B.), Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany, Volume 3491/2005, pp. 614-625.

P Clough, P. and Sanderson, M. (2004), A proposal for comparative evaluation of automatic annotation for geo-referenced documents, Workshop held at the 27th Annual International ACM SIGIR Conference on Geographic Information Retrieval, University of Sheffield, UK, July 29th 2004.

P Clough, P. and Sanderson, M. (2004), Assessing Translation Quality for Cross Language Image Retrieval, In Comparative Evaluation of Multilingual Information Access Systems, Eds (Peters, C., Gonzalo, J., Braschler, M. and Kluck, M.), Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany, Volume 3237/2004, pp. 594-610.

J Arampatzis et al. (2004), Web-based delineation of imprecise regions, Workshop held at the 27th Annual International ACM SIGIR Conference on Geographic Information Retrieval, University of Sheffield, UK, July 29th 2004.

J Müller, H., Geissbuhler, A., Marchand-Maillet, S., and Clough, P. (2004) Benchmarking Image Retrieval Applications, In Proceedings of The Tenth International Conference on Distributed Multimedia Systems (DMS'2004), Workshop on Visual Information Systems (VIS 2004), San Francisco, CA, USA, 2004, pp. 334-337.

J Karlgren, J., Eriksson, G., Franzén, K., Clough, P., Hansen, P., Mizzarro, S., and Sanderson, M. (2004), Reading between the lines: attitudinal expressions in text, In Proceedings of the AAAI Spring Symposium Workshop on Exploring Attitude and Affect in Text: Theories and Applications, 2004.

J Sanderson, M. and Clough, P. (2002), Eurovision - an image-based CLIR system, Workshop held at the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Workshop 1: Cross-Language Information Retrieval: A Research Roadmap, University of Tampere, Finland, August 15th 2002, pp. 56-59.

P Clough, P. (2001), Measuring text reuse in a journalistic domain, In Proceedings of the 4th Annual CLUK Colloquium, University of Sheffield, Sheffield, UK, pp.53-63.

5.4.8 TECHNICAL REPORTS

P Clough, P., Marlow, J., Carmichael, J., Cigarran Recuero, J.M., Artiles, J., Jones, G., Marchand-Maillet, S. and Ireson, N. (2008) Interface for Second Prototype, Deliverable number: D6.2.2, Multilingual/Multimedia Access To Cultural Heritage (MultiMatch) project.

P Clough, P., Marlow, J., Carmichael, J., Cigarran Recuero, J.M., Artiles, J., Gonzalo, J., and Petrelli, D. (2007) Designing the User Interface for the First Prototype, Deliverable number: D6.1.2, Multilingual/Multimedia Access To Cultural Heritage (MultiMatch) project.

J Peters, C., Oomen, O., Ibbotson, C., Ireson, N., Kamps, J., Jones, G., Clough, P. (2006) State of the art monitoring, Deliverable number: D1.1.1, Multilingual/Multimedia Access To Cultural Heritage (MultiMatch) project.

J Heinzle, F., Clough, P., Elias, B., and Sester, M. (2005) Metadata Annotation Methods, Deliverable number: D29 6301, Spatially-Aware Information Retrieval on the Internet (SPIRIT) project.

J Bucher, B., Clough, P., Finch, D., Joho, H., Purves, R., Khirni Syed, A. (2005) Evaluation of SPIRIT prototype following integration and testing, Deliverable number: D31 7301, Spatially-Aware Information Retrieval on the Internet (SPIRIT) project.

J Purves, R., Clough, P., Joho, H., Jones, C., van Kreveld, M. (2005) Modelling Vague Places with Knowledge from the Web, Deliverable number: D24 3301, Spatially-Aware Information Retrieval on the Internet (SPIRIT) project.

P Clough, P., Joho, H., and Sanderson, M. (2004) Extraction of semantic annotations from textual web pages, Deliverable number: D15 6201, Spatially-Aware Information Retrieval on the Internet (SPIRIT) project.

P Clough, P. (2003), Measuring text reuse, PhD thesis, University of Sheffield.

J Joho, H., Clough, P., and Sanderson, M. (2003) A Working Searching System, Deliverable number: D10 2101, Spatially-Aware Information Retrieval on the Internet (SPIRIT) project.

P Clough, P. (2000), Plagiarism in natural and programming languages: an overview of current tools and technologies, Research Memoranda: CS-00-05, Department of Computer Science, University of Sheffield, UK.