
Making Silent Voices Heard

Stephen Rhind-Tutt, President

Charting Vanishing Voices Workshop

June 29, 2012

1. About Alexander Street

2. The Challenge

3. The Nature of Virtual Space

4. Examples from Alexander Street

5. Partnerships and Collaboration

Agenda

1. About ASP

• Founded in 2000 by executives who used to work for Chadwyck-Healey, SilverPlatter, Wolters-Kluwer, Gale and Wilson.

• Headquartered just outside Washington DC, USA

• Offices in Stevenage, England; Shanghai, China; Kuala Lumpur, Malaysia; Sydney, Australia; Brazil; New Zealand

• 3,000 customers

• 2,500 licensors

About Alexander Street Press

Making silent voices heard…

Collaboration

More examples

2. The Challenge

The Challenge

By 2020 the web will have

• > 5 Bn users (currently 2.3 Bn - 37% of the world)

• > 90% of published works prior to 1923

• > Most works published to 2020

• > 4 Billion websites (currently 555m, 71% growth p.a.)

• > 1 Trillion photographs (Facebook adds 300m daily)

• > 100 Million pages of facsimiles of manuscripts

• > 100 Million audio files

• > 1 Billion video files (YouTube adds 72 hrs every minute)

Preservation and Access

• More than 6,500 endangered languages

• Countless cultural artifacts, audio, video, texts

• Hidden collections

• (Personal) archives

• Field Notes

• Data sets

• Little or no cataloging

• Mostly undigitized

• Decaying film and audio formats

• Increasing opportunities to embellish (HD-video, 3-D models, social annotation etc)

How are we going to do all of this?

3. The Nature of Virtual Space

“You must consult the laws of nature… you say ‘What do you want brick?’ and the brick says to you ‘I like an arch’ and you say to brick ‘Look, I want one too, but arches are expensive…’ Brick says ‘I like an arch’…”

“Honor the material you use”

Louis Kahn (1979)

The nature of virtual space…

• Steel – High cost to create, strong, easy to stamp shapes, medium weight…

• Wood – Low cost to create, moderately strong, needs to be crafted, light weight…

• Glass – Medium cost to create, weak, easy to craft, transparent

• The Web - ?

Understanding the medium

Nature of electronic publications

• Atomic

• Interconnected

• Interdependent

• The link matters more than the object

• Pliable

• Evolving quickly

• Unlimited in size

[Diagram: boxes labeled “Page”]

Understanding the medium

0111010011010000101101101000101110100010001110101010101010101011111010101010101111101011100100011101

Binary

Machine Code

Assembly Code

Programming languages

C++, PERL, VB, etc…

Understanding the medium

Communications Protocols – TCP-IP, Modems

Display Standards – Super VGA

Font Standards – Postscript

Plug-in standards – Java

Browser Standards – IE 7.0

Document formats - PDF

Mark-up Standards – SGML, XML, HTML

Image Standards – JPG, TIFF, etc, etc

Understanding the medium

Phone standards – 3G, 4G, 5G

Foursquare

Twitter – local, custom, news

Network protocols – 801

Map Standard - Google Maps, Open Map

iOS, Android,

Devices – Nook, Kindle, iPad,

Video Standards – H264, Silverlight, Flash

Evolving quickly

• Processing speed – by 2015 machines 4 times more powerful than today’s.

• Storage space – by 2015 20 Terabytes of storage (8 Bn pages) will cost under $100

• More than 90% of the developed world will have Web access

• Significant improvements in the developing world

• Phone bandwidth > 1.5 Mb/s

On current trends…

Evolving quickly

On current trends…

Year    Hard Disk Size (MB)
1988    20
1990    40
1991    80
1993    160
1994    320
1996    640
1997    1,280
1999    2,560
2000    5,120
2002    10,240
2003    20,480

Year    Hard Disk Size (MB)
2000    20,000
2002    40,000
2003    80,000
2005    160,000
2006    320,000
2008    640,000
2009    1,280,000
2011    2,560,000
2012    5,120,000
2014    10,240,000
2015    20,480,000
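
Both tables follow the same arithmetic: capacity doubles at each step, and the steps alternate between two years and one year, which is a doubling roughly every 18 months. A minimal Python sketch of that rule is below. The function name `project_disk_sizes` is a hypothetical helper for illustration and is not part of the original slides.

```python
def project_disk_sizes(start_year, start_mb, doublings):
    """Return (year, size_mb) rows, doubling capacity about every 18 months."""
    rows = []
    for i in range(doublings + 1):
        # (3*i + 1) // 2 rounds 1.5*i up, giving the +2, +1, +2, +1 year steps
        year = start_year + (3 * i + 1) // 2
        rows.append((year, start_mb * 2 ** i))
    return rows


if __name__ == "__main__":
    # Reproduce the first table, starting from 20 MB in 1988.
    for year, size_mb in project_disk_sizes(1988, 20, 10):
        print(f"{year}  {size_mb:>12,}")
```

Calling the same function with a start of 20,000 MB in 2000 gives the rows of the second table.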

Where we’re headed…

After Data, Information, Knowledge, and Wisdom, Gene Bellinger, Durval Castro, Anthony Mills. http://www.systems-thinking.org/

Who, What, When, Where?

Therefore

Why?

Value in the electronic world is about...

Understanding electronic products

“The manner in which or the efficiency with which something reacts or fulfills its intended purpose”

Webster’s Unabridged

What do we need to do?

• Comprehensive - everything on the network

• Everyone on the network

• Local and personal (unique verified identity)

• Ubiquitous access (everywhere, all devices)

• High quality (peer review)

• Workflow integration and analysis (deep links to relevant content and tools)

• Maximize efficiencies (easy ingestion and dissemination)

• Real time currency

Devices

Inbound Discovery

Quality – Bandwidth, Encodes, # of pixels, Sampling

Tools – Transcripts, Subtitles, Chaptering, Translation, Usage Stats

Permissions – Privacy, Permissions, Anonymity, Shibboleth

Indexing – MARC, Semantic, Controlled vocabularies

Outbound Discovery – API Harvesting, Promotion, Conferences, Adsense, E-mail, Mailings

Ingestion – Scanning, Uploading, Data Crosswalking

Community – Peer Review, Crowdsource, Annotation, Playlists

Producing – Filming, Recording, Licensing, Writing, Commissioning

Evolution of tasks

[Diagram: tasks placed along an axis from “Fading” to “Growing”]

Typesetting

Printing

Compiling Directories

Simple, One database Search

Rare and unpublished material

Inbound discovery

Republishing public domain

Process integration

Workflow tools & apps

Warehousing

Community Building

Outbound discovery

Automated ingestion and tagging

Human tagging

Permissions

Evolution of tasks

[Diagram: tasks placed along an axis from “Fading” to “Growing”]

Typesetting

Printing

Compiling Directories

Simple, One database Search

Rare and unpublished material

Inbound discovery

Licensing?

Republishing public domain

Process integration

Workflow tools & apps

Warehousing

Community Building

Outbound discovery

Automated ingestion and tagging

Human tagging

Commissioning?

Editorial?

Quality?

Selection?

Permissions

Marketing?

4. Examples

Searchability

Make video searchable…

30 minutes of news = 12 double-spaced pages

5 minutes to read in depth

2 minutes to scan
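
One way to picture searchable video is a time-coded transcript: each segment of text carries the offset at which it is spoken, so a text match leads straight to a point in the recording. The sketch below is an assumption made for illustration, not a description of any Alexander Street system. The `Segment` class, the `search_transcript` function and the sample data are all invented.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_seconds: int  # offset of the segment from the start of the video
    text: str           # transcript text spoken during the segment


def search_transcript(segments, query):
    """Return the start times of segments whose text contains the query."""
    needle = query.lower()
    return [s.start_seconds for s in segments if needle in s.text.lower()]


# Made-up sample transcript, for illustration only.
SAMPLE = [
    Segment(0, "Good evening, here are tonight's headlines."),
    Segment(95, "Flooding has closed roads across the region."),
    Segment(410, "In sport, the final ended in a draw."),
    Segment(1200, "More on the flooding after the weather report."),
]

if __name__ == "__main__":
    # Each hit is a point in the video a player could jump to.
    print(search_transcript(SAMPLE, "flooding"))
```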

Great functionality

Let it be embedded in courses

Annotation

Studio

Inbound discovery

Be of the web

Music Newspapers

Websites

Monographs

Primary Works

Journals

Library Branded Interface

Embeddable Search Box

Major Collections Individual Titles

Federated Search Engines

Make it accessible widely…

Indexing, discovery and analysis

The strain on keyword search…

Questions

• Google: Martin Luther King – 8.3m hits (2005), 32.5m (2012)

• Google Scholar: 202k hits, options to restrict:

  • Article

  • Legal document

  • Date range (year published)

  • Patent or Citation

‘Semantic’ Indexing

Collection

Series

Book or Volume

Chapter

Page

Word

Where? When? What? Who?

Traditional indexing >

‘Semantic’ indexing >

Increases in Utility

Access – “Do you have the book titled…”

Keyword Search – All mentions of ‘Star Wars’

Fielded Search – All mentions of ‘Star Wars’ in texts about Reagan published in 1985

Semantic Search – All mentions of ‘Star Wars’ by Reagan in speeches he delivered in 1985
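
The difference between these levels can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Text` record, its field names and the three sample records are invented, and a production system would use a real search engine instead of list filtering.

```python
from dataclasses import dataclass


@dataclass
class Text:
    title: str
    subject: str   # fielded metadata: who or what the text is about
    year: int      # fielded metadata: year published or delivered
    speaker: str   # semantic metadata: who is speaking (empty if nobody)
    genre: str     # semantic metadata: e.g. "speech" or "article"
    body: str


def keyword_search(texts, phrase):
    """Match the phrase anywhere in the body, ignoring all metadata."""
    return [t for t in texts if phrase.lower() in t.body.lower()]


def fielded_search(texts, phrase, subject, year):
    """Keyword match narrowed by catalog-style fields."""
    return [t for t in keyword_search(texts, phrase)
            if t.subject == subject and t.year == year]


def semantic_search(texts, phrase, speaker, genre, year):
    """Keyword match narrowed by who is speaking and in what kind of text."""
    return [t for t in keyword_search(texts, phrase)
            if t.speaker == speaker and t.genre == genre and t.year == year]


# Invented records, for illustration only.
TEXTS = [
    Text("Film review", "Cinema", 1983, "", "article",
         "Star Wars returns to theaters."),
    Text("Defense editorial", "Reagan", 1985, "", "article",
         "Critics call the plan Star Wars."),
    Text("Address on defense", "Reagan", 1985, "Reagan", "speech",
         "Some call this program Star Wars."),
]

if __name__ == "__main__":
    print(len(keyword_search(TEXTS, "Star Wars")))
    print(len(fielded_search(TEXTS, "Star Wars", "Reagan", 1985)))
    print(len(semantic_search(TEXTS, "Star Wars", "Reagan", "speech", 1985)))
```

Each level returns a subset of the one before it, which is the increase in utility the slide describes.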

• Identify and divide texts into content elements (e.g. letter, diary entry…)

• Identify key concepts for these elements (e.g. authors, sources, battles, encounters…)

• Index both elements and associated concepts

• Integrate to form a cohesive whole

• Unique ways of browsing through concepts

• Unique ways to ask questions

What is Semantic Indexing?
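
The steps listed on this slide can be sketched as a small pipeline. This is a simplified illustration under stated assumptions, not the indexing process actually used: elements are taken to be blank-line-separated blocks, concepts come from a tiny invented controlled vocabulary, and all function names are hypothetical.

```python
from collections import defaultdict


def split_into_elements(text):
    """Step 1: divide a text into content elements (here, blank-line blocks)."""
    return [block.strip() for block in text.split("\n\n") if block.strip()]


def find_concepts(element, vocabulary):
    """Step 2: tag an element with concepts from a controlled vocabulary."""
    lowered = element.lower()
    return {concept for concept, terms in vocabulary.items()
            if any(term in lowered for term in terms)}


def build_index(text, vocabulary):
    """Steps 3 and 4: index words and concepts together, keyed by element id."""
    word_index = defaultdict(set)
    concept_index = defaultdict(set)
    elements = split_into_elements(text)
    for element_id, element in enumerate(elements):
        for word in element.lower().split():
            word_index[word.strip(".,;:!?")].add(element_id)
        for concept in find_concepts(element, vocabulary):
            concept_index[concept].add(element_id)
    return elements, word_index, concept_index


# Invented sample text and vocabulary, for illustration only.
VOCABULARY = {"trade": ["barter", "exchange", "furs"],
              "travel": ["canoe", "voyage"]}
SAMPLE = ("We set out by canoe at dawn.\n\n"
          "The traders offered furs in exchange for tools.")

if __name__ == "__main__":
    elements, words, concepts = build_index(SAMPLE, VOCABULARY)
    # The word "trade" never appears, yet the concept index finds the passage.
    print("trade" in words, sorted(concepts["trade"]))
```

The point of the sketch is the last line: a keyword lookup for "trade" finds nothing, while the concept lookup returns the second element.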

Semantic Indexing…

Encounter – Encounter Name, Cultural Groups, Estimated # of people, Start year, Start month, Start day, Location, Expedition, Encounter Type, Fatalities, Etc…

Author – Name, Date of birth, Place of birth, Date of death, Place of death, Nationality, Religion, Sexual Orientation, Occupation, Etc…

Source – Source, Editor/Translator, Original Language, Publisher, Publication Date, Publication Place, Subject of Work, Etc…

Document – Text, Author ID, Encounter ID, Source ID, Date, Subject, Age writing, Etc…

“Show me writings by Jesuits, originally written in French, that discuss trade involving the Huron.”
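
A question like this becomes answerable once documents are linked by ID to author, source and encounter records, as the field lists on the slide suggest. The sketch below uses Python's built-in `sqlite3` module. The table layout is a cut-down guess at the slide's four record types, the column names and values (such as storing "Jesuit" under religion and "Trade" under encounter type) are assumptions, and every row is invented.

```python
import sqlite3

SCHEMA = """
CREATE TABLE author    (id INTEGER PRIMARY KEY, name TEXT, religion TEXT);
CREATE TABLE source    (id INTEGER PRIMARY KEY, title TEXT, original_language TEXT);
CREATE TABLE encounter (id INTEGER PRIMARY KEY, name TEXT,
                        cultural_group TEXT, encounter_type TEXT);
CREATE TABLE document  (id INTEGER PRIMARY KEY, text TEXT,
                        author_id INTEGER REFERENCES author(id),
                        source_id INTEGER REFERENCES source(id),
                        encounter_id INTEGER REFERENCES encounter(id));
"""

# "Writings by Jesuits, originally written in French, that discuss trade
# involving the Huron", expressed as joins across the linked records.
QUERY = """
SELECT document.text
FROM document
JOIN author    ON author.id = document.author_id
JOIN source    ON source.id = document.source_id
JOIN encounter ON encounter.id = document.encounter_id
WHERE author.religion = 'Jesuit'
  AND source.original_language = 'French'
  AND encounter.cultural_group = 'Huron'
  AND encounter.encounter_type = 'Trade'
ORDER BY document.id;
"""


def build_demo_db():
    """Create an in-memory database holding a few invented rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO author VALUES (?, ?, ?)",
                   [(1, "Author A", "Jesuit"), (2, "Author B", "Puritan")])
    db.executemany("INSERT INTO source VALUES (?, ?, ?)",
                   [(1, "Source X", "French"), (2, "Source Y", "English")])
    db.executemany("INSERT INTO encounter VALUES (?, ?, ?, ?)",
                   [(1, "Encounter 1", "Huron", "Trade"),
                    (2, "Encounter 2", "Huron", "Conflict")])
    db.executemany("INSERT INTO document VALUES (?, ?, ?, ?, ?)",
                   [(1, "doc one", 1, 1, 1),
                    (2, "doc two", 1, 1, 2),
                    (3, "doc three", 2, 2, 1)])
    return db


def run_query(db):
    """Return the text of every document matching QUERY."""
    return [row[0] for row in db.execute(QUERY)]


if __name__ == "__main__":
    print(run_query(build_demo_db()))
```

Only the first document satisfies all four conditions. A keyword search could not express this question, because "Jesuit" and "French" are properties of the author and the source and need not appear in the text itself.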

Early Encounters in North America

Fauna and Flora

Geophysical, Natural Phenomena

Peoples

Personal & Cultural Events

Specific entry points for American Indian Studies…

Encounter database

Early Encounters in North America

• More than a way to answer questions

• A framework by which users can be guided to understand, explore, discover and learn.

• A route-map to guide users through data - saving time and effort.

• The intellectual fabric by which information should be organized…

• Delivers answers to questions that cannot be asked elsewhere

• Discipline specific

• Oriented towards the user and the content

• At the ‘right’ level

• Thoroughly controlled

• Metadata should be open

Semantic Indexing…

Outbound discovery

Higher value linkages…

Loosely Held Tightly Held

Free Websites

Loosely integrated

Tightly integrated

Refuse to License

License widely

License widely and be a Licensor

• Higher value links

• Semantic indexing and keyword searching of more than 3,000 oral history collections.

• Represents the personal histories of some 300,000 people.

• Value:

  – Context

  – Selection

  – Search Power

  – Licensed material

  – Integration

Higher value linkages…

Context and Selection

Search Power

Organized Results

Building the network…

Unhelpful

• Legal warnings not to link

• Changing links constantly

• Disabling links

• No permanent URLs

• No crawling

• Randomly changing URLs

• Insisting on one interface and one access point

• Unattached pages

Helpful

• Visibility

• Permanent URLs

• RSS feeds

• OpenURL, Open Metadata

• Design for multiple interfaces

• Open to crawling

• Published open APIs

• Welcome linking

• Ask others to do the same
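
As one concrete example from the "Helpful" list, an OpenURL describes an item in the link itself, so any library's resolver can route the reader to a copy. The sketch below builds an OpenURL 1.0 (Z39.88-2004) link in key/encoded-value form for a book. The resolver address and the function name `book_openurl` are hypothetical, and the bibliographic values are placeholders.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver address; a real library would supply its own.
RESOLVER = "https://resolver.example.org/openurl"


def book_openurl(title, author, year, resolver=RESOLVER):
    """Build an OpenURL 1.0 key/encoded-value link describing a book."""
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
        "rft.btitle": title,
        "rft.au": author,
        "rft.date": str(year),
    }
    return f"{resolver}?{urlencode(params)}"


if __name__ == "__main__":
    link = book_openurl("An Example Title", "Doe, Jane", 1999)
    print(link)
    # The metadata can be read back out of the link by any receiving system.
    print(parse_qs(urlsplit(link).query)["rft.btitle"])
```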

5. Partnerships & Collaboration

Where will the £££ come from?

JSTOR: $52m revenues in 2010

American Memory

Women and Social Movements

• Collaboration with the Center for the Historical Study of Women and Gender at SUNY Binghamton and ASP

• Original site is free – new content is for fee.

• Usage across the free site dipped only slightly – more usage following commercial launch.

• Added video, audio, > 200k pages, new functionality.

• We’re engaged in a leviathan task

• Money is needed

• For fee content can sit alongside open content

• Publishers can help

• Need for collaboration and openness

Summary

• It will all be available in digital form

• It will not cost too much

• Many more people will use it

• It will be enriched through better display, better integration, better links, better context, etc.

Good for publishers

Good for academics

Good for “society”

Where we’re headed…

www.alexanderstreet.com
