Post on 19-Jan-2018

Can Bilateral Digitization Tear Down the Wall Between Institutions and the Public?

Ben Brumfield
Digital Frontiers 2012

“You know Ben that it really stinks that I can't get access to the original. My grandfather Jeremiah wrote the diary so that I could read about his daily life happenings. My grandfather Edward used to own it and if he had known that I would be so interested in it I'm sure he would have kept it and given it to me instead of the university.”
– Alan Williams, 2009 email

Walls

• Professionally conserved
• Publicly accessible
• Catalogued
• 1000 miles away
• Reading room restrictions
• “Permission-to-publish” agreements
• Costly scanning fees

Penetrating the Walls

• Digitization
• Collaboration

Shallow Digitization (Institutional Version)

• “Scan-and-dump” facsimiles
  – Limited metadata
  – No transcripts
  – Not crawlable

Shallow Digitization (Amateur Version)

• Full transcripts
  – No facsimiles
  – No provenance
  – No metadata on sources
  – Invisible editorial decisions
• Cut-and-paste replication
  – No attribution

Deep Digitization

• Institutional Challenges
  – Funding
  – Manpower
• Non-institutional Challenges
  – Standards
  – Access to sources

Crowdsourcing

• Who are the volunteers?
• What can they do?
• OldWeather.org
• Zenas Matthews
• Harry Ransom Center Fragments

Accuracy

• Individual transcriptions are about 97% accurate

• Of 1000 transcribed logbook entries:
  – 3 will be lost because of transcription errors
  – 10 will be illegible
  – At least 3 will be errors in the logs
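
The slide does not say how a 97% per-transcription accuracy becomes only 3 lost entries per 1000. One possible reading, offered here purely as an assumption, is that each entry is transcribed independently by three volunteers and the majority reading is kept. The sketch below works through that arithmetic. The function name `majority_error_rate`, the choice of three transcribers, and the independence of errors are all illustrative assumptions, not details from the talk.

```python
from math import comb


def majority_error_rate(p_correct: float, n_transcribers: int) -> float:
    """Probability that a strict majority of independent transcribers are wrong.

    Assumes errors are independent and that any majority of wrong readings
    counts as a lost entry, even if the wrong readings disagree with each other.
    """
    p_wrong = 1.0 - p_correct
    needed = n_transcribers // 2 + 1  # smallest strict majority
    return sum(
        comb(n_transcribers, k) * p_wrong**k * p_correct ** (n_transcribers - k)
        for k in range(needed, n_transcribers + 1)
    )


# 97% accuracy comes from the slide; 3 transcribers per entry is an assumption.
rate = majority_error_rate(0.97, 3)
lost_per_1000 = 1000 * rate
print(f"{lost_per_1000:.1f} entries lost per 1000")  # about 2.6
```

Under these assumptions the result rounds to 3 lost entries per 1000, which is consistent with the figure on the slide, though the slide itself does not confirm this method.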

OldWeather Participation

• More than 1.6 million weather observations.

• 16,000 volunteers.

• 1 million log pages transcribed.

• Mean contribution of 100 transcriptions per user – but this statistic is worthless!

Power-law Distribution

• Most contributions are made by a core of well-informed enthusiasts.

• True regardless of project size.

• What are the implications?
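
A small sketch can show why a mean of 100 transcriptions per user says little under a power-law distribution. The totals (16,000 volunteers, 1.6 million observations) come from the slides. The shape of the distribution is an assumption: the code uses a Zipf-like rule in which the volunteer ranked r contributes in proportion to 1/r. The real OldWeather distribution is not given in the talk and may differ.

```python
from statistics import mean, median

N_USERS = 16_000     # from the slides
TOTAL = 1_600_000    # from the slides

# Assumed Zipf-like shape: the user ranked r contributes in proportion to 1/r.
weights = [1.0 / rank for rank in range(1, N_USERS + 1)]
scale = TOTAL / sum(weights)
contributions = [w * scale for w in weights]  # sorted from most to least active

mean_c = mean(contributions)
median_c = median(contributions)

# Share of all work done by the most active 1% of users.
top_n = N_USERS // 100
top_share = sum(contributions[:top_n]) / TOTAL

print(f"mean per user:   {mean_c:.1f}")
print(f"median per user: {median_c:.1f}")
print(f"top 1% share:    {top_share:.0%}")
```

With this assumed shape the mean is still 100, but the typical (median) volunteer contributes far less, and a small core accounts for more than half of the work. That gap is what makes the mean a poor summary.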

One “Well-Informed Enthusiast”

• In 14 days:
  – Entire diary transcribed
  – 250 revisions to 43 pages
  – Two dozen footnotes

Crowdsourcing’s Virtuous Circle

• Volunteers
• Deep digitization
• Findability
• More Volunteers!

One Volunteer’s Story

• Nat Wooding
  – Retired data analyst
  – 100 pages of Julia Brumfield’s diaries transcribed and indexed in six months
  – No relation to diarist
  – Great-uncle was diarist’s letter carrier, also named Nat Wooding

Non-institutional Digitization

The Invisible Archive

• Private collections
• Family archivists (filing cabinets)
  – or their heirs (boxes in the attic)
• Non-notable subjects
• Flickr

The Standards Problem

• “We can't overemphasize the potential futility of citing websites, any websites, but especially non-institutional websites.”
  – Diggitt McLaughlin (H-SHEAR 2011-04-27)

The Standards Problem

• “Needless to say, amateurs will continue to put out poorly edited versions of documents in print which we, as professionals, will continue to eschew using.”

  – Christopher L. Miller (H-OIEAHC list, 1996-05-07)

Solutions

• Collaboration

• Participation by professionals in amateur projects

• FreeREG/FreeCEN

Solutions

• Community

• Flickr
• RootsTech

Solutions

• Software Platforms

• Suggested rigor
• Graceful degradation

Thanks!

Ben Brumfield

benwbrum@gmail.com
http://fromthepage.com/

Slides and transcript to be posted at http://manuscripttranscription.blogspot.com/
