citylis talk, feb 1st 2016

Farces and Failures Ben O’Steen, British Library Labs @benosteen

Upload: benosteen

Post on 11-Jan-2017


TRANSCRIPT

Farces and Failures

Ben O’Steen, British Library Labs

@benosteen

Names and labels we choose shape the questions that people will ask and the assumptions that they make.

For example - “Labs”

https://www.flickr.com/photos/internetarchivebookimages/14763474682

In a nutshell:

British Library Labs works with researchers on their specific problems, trying to assess how widely each problem is felt.

With their help, we talk to communities of researchers and try to pinpoint what they need as opposed to what they think they need to ask us.

What were the researchers' initial preconceptions of working with the British Library?

“Give me all of collection X!”

Common for researchers to want all of a named collection.

Also common for us to give names to a collection based on who paid for it, or what project 'collated' it.

Farces...

A common plot mechanism:

A conversation where the participants leave in agreement but with two very different ideas of what was actually discussed.

Some common farce-inducing words

Collection

Access

Content

Metadata

Crowdsourced

Microsoft Books digitisation project

● Started in 2007, but stopped in 2009 due to the cancellation of the MS Book search project.

● Digitised approximately 49k works (~65k volumes).

● Online from 2012 via a “standard” page-turning interface, but very low usage statistics.

“I am interested in travel accounts in Europe during the 19th Century”

2013 Competition winners

http://labs.bl.uk/Ideas+for+Labs

Pieter Francois

Bias in digitisation

The tool was made to give a statistically valid sample.

Due to the paltry amount digitised, it showed how skewed the digital corpus is, compared to the overall holdings.
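As a rough illustration of what drawing a sample can look like (a minimal sketch, not the tool built for this project): the snippet below takes a simple random sample, without replacement, from a list of catalogue identifiers. The identifiers, the function name `draw_sample` and the choice of simple random sampling are all assumptions made for the example.

```python
import random


def draw_sample(record_ids, sample_size, seed=None):
    """Return a simple random sample of record identifiers, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(list(record_ids), sample_size)


# Hypothetical catalogue identifiers, for illustration only.
catalogue = [f"record-{n:05d}" for n in range(1, 1001)]

sample = draw_sample(catalogue, sample_size=50, seed=2013)
print(len(sample), "records sampled from", len(catalogue))
```

Sampling from the whole catalogue, rather than only from what has been digitised, is what lets the two be compared.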

Allen B. Riddell in “Where are the novels?”* estimates that using HathiTrust’s corpus:

“... about 58%—somewhere between 47% and 68%—of the 2,903 novels [all publications in English between 1800 and 1836] have publicly accessible scans.”

* (2012) https://ariddell.org/where-are-the-novels.html

John Cooper, https://www.flickr.com/photos/atomicshed/2436324958 CC-BY-NC-ND 2.0

[ ] The square brackets of the soul...

What about some of our metadata?

The Chartist Walk...

Katrina Navickas, our researcher, in period costume!

http://turbulentlondon.com/2015/10/01/following-the-chartists-around-london/

“Access”

The newspapers were accessible. We had access to the newspapers but...

We didn't have access to them.

Keyword search fails miserably, and bulk access is an issue.

Simple data structure would've helped!

All projects to date would’ve been made far easier if:

• Every thing had a URL.

• The URL linked to a page that tells you all about that thing.

• It should link to other, related things.

• The page was machine-readable - never assumed a human would always read it.

• Access to all data – images, XML, etc
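To make the wish list above concrete, here is a minimal sketch of what one such machine-readable page could contain, expressed as JSON. The field names, the `describe_thing` helper and the `example.org` URLs are hypothetical; this is not an existing British Library format.

```python
import json


def describe_thing(url, title, related_urls, data_urls):
    """Build a machine-readable description of one 'thing'."""
    return {
        "url": url,                     # every thing has a URL
        "title": title,                 # the page tells you about that thing
        "related": list(related_urls),  # links to other, related things
        "data": dict(data_urls),        # access to all data: images, XML, etc
    }


# Hypothetical record, for illustration only.
record = describe_thing(
    url="https://example.org/things/1",
    title="An example digitised volume",
    related_urls=["https://example.org/things/2"],
    data_urls={
        "image": "https://example.org/things/1/page1.jpg",
        "xml": "https://example.org/things/1/ocr.xml",
    },
)

page = json.dumps(record, indent=2)
print(page)
```

Because the page is plain JSON, a script can follow the `related` links without a human reading anything.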

Uptake?

Hard to measure but:

• 13-20 million hits on average every month, over 330,000,000 hits to date.

• Almost every image has been seen at least 20 times.

• Over 500,000 tags added by volunteers and machine algorithms.

• Iterative crowdsourcing is key.

Iterative crowdsourcing?

(The term is stolen with permission from Mia Ridge.)

1. Crowdsource broad facts and subcollections of related items will emerge.

2. No 'one-size-fits-all': Subcollections allow for more focussed curation.

Goto step 1
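One pass of that loop can be sketched as follows, assuming the broad facts arrive as free-text tags per image. The image identifiers, the tags and the `subcollections_from_tags` helper are hypothetical; the point is only that grouping by shared tag is enough for subcollections to emerge.

```python
from collections import defaultdict


def subcollections_from_tags(tagged_items, min_size=2):
    """Group item ids by tag; tags shared by enough items become subcollections."""
    groups = defaultdict(set)
    for item_id, tags in tagged_items.items():
        for tag in tags:
            groups[tag.lower()].add(item_id)  # treat 'Map' and 'map' as one tag
    return {tag: sorted(ids) for tag, ids in groups.items() if len(ids) >= min_size}


# Hypothetical broad tags from a first crowdsourcing pass.
first_pass = {
    "img-1": ["map"],
    "img-2": ["map", "portrait"],
    "img-3": ["Map"],
    "img-4": ["portrait"],
    "img-5": ["music"],
}

subcollections = subcollections_from_tags(first_pass)
print(subcollections)
```

Each subcollection can then get its own, more focussed task, and the results feed the next pass.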

Purposefully contextless

● Presenting them through Flickr removed the illustrations' context.

– Did this help or hinder?

● Wished to stimulate research with the illustrations themselves (linotypes, etchings, etc). CS research was primarily 'Vision'

It wasn't perfect, it was an experiment

“You know, the whole thing about perfectionism. The perfectionism is very dangerous, because of course if your fidelity to perfectionism is too high, you never do anything.

Because doing anything results in— It’s actually kind of tragic because it means you sacrifice how gorgeous and perfect it is in your head for what it really is.”

- David Foster Wallace, as told to Leonard Lopate on WNYC on March 4, 1996.

(emphasis my own)

http://blankonblank.org/interviews/david-foster-wallace-on-ambition/

Fear of imperfection

Encourages us to value the systems that provide access above the outcomes that could occur.

Adherence to a specification and 'hit' counts are easy to measure.

Once you've built one interface, people are loath to make any others that would run in parallel.

Metaphors don't translate well between media

Why do we assume that physical facsimiles are anything but a comforting solution?

Tagathon found nearly 30,000 maps!

Georeferencing - http://bl.uk/maps

Not just research use!

http://www.playingbythebook.net/2014/03/18/barbapapas-new-house-a-book-so-good-im-featuring-it-for-a-second-time/

Burning Man Festival

David Normal created light boxes around the Burning Man, using the British Library’s Flickr Images

“Crossroads of Curiosity” launched on 20th June 2015

“Crowdsourcing”

Found lots of really bad assumptions using this term:

● A crowd of people, each doing a small bit

● You must have special software for it

● If you build it, they will come – free labour!

● It's totally untrustworthy

● It's easy

● It fixes all problems

● It's cheap

“Crowdsourcing”

● A crowd of people, each doing a small bit

[Chart: axes labelled "%done" and "Crowd"]

Zooniverse usage concurs with this distribution

“Crowdsourcing”

● You must have special software for it

Capturing input, showing progress and engaging with volunteers is what is important.

Spreadsheets can be a wonderful thing!
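As a small sketch of how far a plain spreadsheet can go: the snippet below reads a sheet exported as CSV and reports progress. The column names (`item`, `volunteer`, `tag`), the sample rows and the `progress` helper are hypothetical.

```python
import csv
import io


def progress(csv_text):
    """Return (done, total) for a sheet with 'item', 'volunteer' and 'tag' columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    done = sum(1 for row in rows if (row["tag"] or "").strip())  # a filled-in tag counts as done
    return done, len(rows)


# Hypothetical sheet contents, for illustration only.
sheet = """item,volunteer,tag
img-1,alice,map
img-2,,
img-3,bob,portrait
img-4,,
"""

done, total = progress(sheet)
print(f"{done} of {total} items tagged ({100 * done / total:.0f}%)")
```

The sheet captures the input, the count shows progress, and the volunteer column tells you who to thank.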

“Crowdsourcing”

● If you build it, they will come – free labour!

“Crowdsourcing”

● It's totally untrustworthy

● It's easy

● It fixes all problems

● It's cheap

Investigation into the unusual

● Can we avoid the keyboard and mouse?

● Can we make use of casual interaction, as opposed to the usual “group of experts”?

● Can useful games be made with this constraint?

● Can they be fun, as well as rewarding?

● Which age ranges understand what an arcade machine even is?

Game Jam!

In Summary:

● Be careful with the words you use, especially those you think everyone understands

● Things do not need to be catalogued or perfect to be useful to people

● Wanting access to everything is the default

● A singular presentation of a collection is a risky strategy – only mimicking the physical may not be the best idea

● Experts are where you find them; look after them once you do!

● Make space to experiment, to fail and to learn from your mistakes.

My contact details:

[email protected]

@benosteen

Links:

http://labs.bl.uk

http://mechanicalcurator.tumblr.com

https://flickr.com/photos/britishlibrary

https://github.com/bl-labs

http://britishlibrary.typepad.co.uk/digital-scholarship/2013/12/a-million-first-steps.html