Utilising Webometric Data from Online Digitised Newspaper Collections, Paul Gooding, UCL Centre for Digital Humanities

Upload: europeana-newspapers

Post on 11-May-2015


DESCRIPTION

Utilising Webometric Data from Online Digitised Newspaper Collections by Paul Gooding, UCL Centre for Digital Humanities. Presentation given at the Europeana Newspapers Information Day, held at the British Library on 9 June 2014.

TRANSCRIPT

Page 1: Utilising Webometric Data from Online Digitised Newspaper Collections

Utilising Webometric Data from Online Digitised Newspaper Collections

Paul Gooding

UCL Centre for Digital Humanities

Page 2:

The Context for Large-Scale Digitisation

Page 3:

Digitised Newspaper Collections: Primary Source and Research Topic…

[Figure: Citations of British Library Nineteenth Century Newspapers (launch to 2012). Series: BNCN used as research tool; BNCN as a collection.]

From Gooding, P. (2014) “Search All About it”: A Mixed Methods Case Study into the Impact of Large-Scale Newspaper Digitisation (thesis, not yet published).

Page 4:

Web Analytics: Google Analytics

• Web Analytics = “The measurement, collection, analysis and reporting of web data for purposes of understanding and optimizing web usage.” (http://www.digitalanalyticsassociation.org/Files/PDF_standards/WebAnalyticsDefinitions.pdf)

• Google Analytics is the leading analytics platform, and it’s great!

• Unobtrusive;

• Easy to implement;

• Rich data source.

• But it does pose a couple of problems…

Page 6:

Web Log Analysis for Welsh Newspapers Online

• 3 types of server queries (in this case):

• “Search queries” – users undertake search on the collection;

• “Browser queries” – users use browse or filter functions;

• “Content queries” – users view digitised newspaper content.

• Results cover period from 12th March 2013 to 30th June 2013.

• Investigating a longer period would increase the significance…

Page 7:

Content Log Analysis: Welsh Newspapers Online

• Server logs look like this (except for the colours…):

• 2013-06-02T12:26:50+01:00 51a5c97c3c8d3 llgc-id:3036868 llgc-id:3039814 llgc-id:3037695 Aberystwyth Observer 21 September 1872 [2] ART40

• And they tell us the following information:

• Time and date of interaction; unique user ID; server identification; newspaper title; edition date; [page number]; article number.
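The field layout described on this slide can be sketched as a small parser. This is a minimal illustration rather than the method used in the study: the regular expression, the `ContentLogRecord` class and the `parse_content_log_line` function are all hypothetical names, and the pattern assumes every content log line follows the single sample shown on the slide (an ISO 8601 timestamp, a user ID, one or more `llgc-id:` identifiers, then title, edition date, bracketed page number and article number).

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from typing import List

# Assumed layout, inferred from the one sample line on the slide.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+"              # time and date of interaction
    r"(?P<user_id>\S+)\s+"                 # unique user ID
    r"(?P<ids>(?:llgc-id:\d+\s+)+)"        # one or more llgc-id identifiers
    r"(?P<title>.+?)\s+"                   # newspaper title (may contain spaces)
    r"(?P<edition>\d{1,2} [A-Z][a-z]+ \d{4})\s+"  # edition date, e.g. 21 September 1872
    r"\[(?P<page>\d+)\]\s+"                # [page number]
    r"(?P<article>ART\d+)$"                # article number
)


@dataclass
class ContentLogRecord:
    timestamp: datetime
    user_id: str
    identifiers: List[str]
    title: str
    edition_date: date
    page: int
    article: str


def parse_content_log_line(line: str) -> ContentLogRecord:
    """Parse one content log line; raise ValueError if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Unrecognised log line: {line!r}")
    return ContentLogRecord(
        timestamp=datetime.fromisoformat(match["timestamp"]),
        user_id=match["user_id"],
        identifiers=match["ids"].split(),
        title=match["title"],
        # %B expects English month names under the default locale.
        edition_date=datetime.strptime(match["edition"], "%d %B %Y").date(),
        page=int(match["page"]),
        article=match["article"],
    )


SAMPLE = (
    "2013-06-02T12:26:50+01:00 51a5c97c3c8d3 llgc-id:3036868 "
    "llgc-id:3039814 llgc-id:3037695 Aberystwyth Observer "
    "21 September 1872 [2] ART40"
)
record = parse_content_log_line(SAMPLE)
```

Parsing the edition date into a `date` object is a design choice that makes later grouping by year or decade straightforward; lines that do not fit the assumed layout raise an error rather than being silently skipped.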

Page 8:

Users viewed content from the 1840s more than any other decade

[Figure: Most Viewed Decades in WNO, compared to total pages per decade. Vertical axis: Popularity (0.00% to 9.00%); horizontal axis: decades from 1804-1809 to 1910-1919.]
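A comparison of views per decade against pages per decade could be computed along the following lines. This is a sketch under stated assumptions, not the calculation behind the chart: the function names are hypothetical, the input years and page counts are invented for illustration, and "compared to total pages" is interpreted here as the ratio of a decade's share of views to its share of pages. Decades are labelled by rounding the year down to a multiple of ten, so the partial 1804-1809 bin on the chart would appear as 1800-1809.

```python
from collections import Counter
from typing import Dict, Iterable


def decade_label(year: int) -> str:
    """Return a label such as '1840-1849' for the decade containing year."""
    start = year - year % 10
    return f"{start}-{start + 9}"


def view_share_by_decade(edition_years: Iterable[int]) -> Dict[str, float]:
    """Share of all content views that fall in each decade."""
    counts = Counter(decade_label(year) for year in edition_years)
    total = sum(counts.values())
    return {decade: n / total for decade, n in sorted(counts.items())}


def relative_popularity(
    view_share: Dict[str, float], pages_per_decade: Dict[str, int]
) -> Dict[str, float]:
    """Ratio of view share to page share; above 1 means viewed more than size alone predicts."""
    total_pages = sum(pages_per_decade.values())
    return {
        decade: view_share.get(decade, 0.0) / (pages / total_pages)
        for decade, pages in pages_per_decade.items()
    }


# Invented example data: edition years of viewed pages, and pages held per decade.
viewed_years = [1845, 1848, 1841, 1872]
pages = {"1840-1849": 100, "1870-1879": 300}

shares = view_share_by_decade(viewed_years)
popularity = relative_popularity(shares, pages)
```

Normalising by page share matters because a decade with more digitised pages would attract more views even if users had no particular interest in it.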

Page 9:

They searched for personal names, place names and topics relevant to Wales

Page 10:

And they engaged heavily with newspaper content

[Figure: Percentage of Queries by Type. Vertical axis: Percentage of Users (0% to 70%); horizontal axis: Pageview number (0 to 160). Series: Search %, Browser %, Content %.]
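One plausible reading of this chart is that, for each pageview number, it shows the share of users whose query at that point in their visit was a search, browser or content query. The sketch below computes that breakdown under this assumption. It is not taken from the study: the function name, the tuple layout of the events and the example data are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[str, int, str]  # (user_id, timestamp, query_type), an assumed layout


def query_type_share_by_position(events: Iterable[Event]) -> Dict[int, Dict[str, float]]:
    """For each pageview number, return the share of users making each query type."""
    per_user = defaultdict(list)
    for user_id, timestamp, query_type in events:
        per_user[user_id].append((timestamp, query_type))

    position_counts = defaultdict(Counter)
    for interactions in per_user.values():
        # Order each user's queries by time; position 1 is their first pageview.
        interactions.sort(key=lambda item: item[0])
        for position, (_, query_type) in enumerate(interactions, start=1):
            position_counts[position][query_type] += 1

    result = {}
    for position, counts in sorted(position_counts.items()):
        total = sum(counts.values())
        result[position] = {qtype: n / total for qtype, n in counts.items()}
    return result


# Invented example: three users, timestamps given as simple integers.
example_events = [
    ("u1", 1, "search"), ("u1", 2, "content"),
    ("u2", 1, "browser"), ("u2", 2, "content"),
    ("u3", 1, "search"),
]
type_shares = query_type_share_by_position(example_events)
```

Only users who reached a given pageview number contribute to its percentages, so the later positions describe a shrinking group of heavily engaged users.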

Page 11:

“But when people are past a certain age, you sort of stop asking them why they do things. It feels dangerous. What if you say So, Mr Penumbra, why do you want to know about Mr Tyndall's coat buttons? And he pauses, and scratches his chin, and there's an uncomfortable silence-- and we both realize he can't remember?”

Robin Sloan, Mr. Penumbra’s 24 Hour Bookstore.

Page 12:

The Qualitative Context

Page 13:

Thanks for listening!

Any Questions?