digital odyssey 2012: open data

Open Data is Dead! Long Live Open Data! MJ Suhonos June 8, 2012

Upload: robotninja

Post on 29-Jun-2015

TRANSCRIPT

Page 1: Digital Odyssey 2012: Open Data

Open Data is Dead! Long Live Open Data!

MJ Suhonos, June 8, 2012

Page 2: Digital Odyssey 2012: Open Data

The web and openness

Page 3: Digital Odyssey 2012: Open Data

2009: The Next Web

• TED talk on the 20th anniversary of the WWW

• Idea of WWW born of frustration

• Unrealized potential due to incompatibility

• Virtual documentation system on the Internet

Page 4: Digital Odyssey 2012: Open Data

“vague, but exciting”

Page 5: Digital Odyssey 2012: Open Data

A new way of thinking

• CD-ROMs already had isolated hyperlinking

• Later done “on the side, as a play project”

• Made everything openly and freely available

Page 6: Digital Odyssey 2012: Open Data

A grassroots movement

• People started doing things that weren't imagined originally

• Network effect: more involvement = more new, interesting, useful things

• Most valuable thing was the community

Page 7: Digital Odyssey 2012: Open Data

Openness Movements

• About community and culture building

• Based around a new way of thinking

• Facilitated by a new technology

Page 8: Digital Odyssey 2012: Open Data

Openness Movements

• Open Access: 1997 (SPARC)

• Open Source: 1998 (Open Source Summit)

Page 9: Digital Odyssey 2012: Open Data

Old ideas rebooted

• Both actually go back to about 1910

• New movements based on the idea of non-rivalry (digital reproduction)

• Facilitated by the Internet and WWW

Page 10: Digital Odyssey 2012: Open Data

The value of data

• Data is only useful when someone does something with it

• No data = zero possibilities

• More unrealized potential

Page 11: Digital Odyssey 2012: Open Data

RawDataNow!

Page 12: Digital Odyssey 2012: Open Data
Page 13: Digital Odyssey 2012: Open Data

Gold stars of Open Data

1. Make your stuff openly available on the web ★

2. Make it available as structured data ★★ (e.g. Excel instead of PDF)

3. Use a non-proprietary format ★★★ (e.g. CSV instead of Excel)
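
As a rough illustration of the third star, structured rows can be written to CSV with nothing but Python's standard library. This is only a sketch: the rows, field names, and the `to_csv` helper are hypothetical and not taken from any real catalogue.

```python
import csv
import io

# Hypothetical catalogue rows; titles, authors, and fields are illustrative only.
records = [
    {"title": "Example Title One", "author": "Doe, Jane", "year": 2001},
    {"title": "Example Title Two", "author": "Roe, Richard", "year": 2010},
]


def to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(records)
print(csv_text)
```

The `csv` module quotes values that contain commas (such as "Doe, Jane"), so any CSV reader can parse the result without proprietary software.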

Page 14: Digital Odyssey 2012: Open Data

2010: TPL Open Data

• First project was to submit the entire catalogue to the Internet Archive

• 2.5 million MARC records, about 2 GB

http://archive.org/details/marc_toronto_public_library

Page 15: Digital Odyssey 2012: Open Data

Open catalogue data

• 2 out of 3 stars for binary MARC format ★★

• Downloaded 89 times since 2010

• U of T: 5400 times, UPEI 2900 times

• TPL is hands-off: no updates, no license

Page 16: Digital Odyssey 2012: Open Data

2009-2010

Page 17: Digital Odyssey 2012: Open Data

OCLC record use policy

• Trying to protect their business model by preventing sharing

• Deliberately exploited uncertainty of legality

• Librarians argued vocally for public domain

• Policy retracted and changed (not defensible)

Page 18: Digital Odyssey 2012: Open Data

Circling the wagons

• Libraries have the power to fight back

• Best counter-strategy is to release the data

• Need the ability to work together somehow

Page 19: Digital Odyssey 2012: Open Data

Linked Data

Page 20: Digital Odyssey 2012: Open Data

Linked Data

• Technical framework for data interoperability

• A common language for sharing data and relations online

• More unrealized potential due to massive incompatibility & “siloing”

Page 21: Digital Odyssey 2012: Open Data

A new way of thinking

• Fundamentally differs from conceptualization underlying data formats of the 20th century

• From concept of “records” as bounded sets, to an unbounded set of “statements”
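
One way to picture the shift from records to statements is the sketch below. It assumes a record is a dictionary of fields and a statement is a (subject, predicate, object) tuple; the URIs, field names, and the `record_to_statements` helper are hypothetical.

```python
# A "record" is a bounded set: one container holds every field about the item.
record = {
    "id": "http://example.org/book/1",  # hypothetical URI
    "title": "Example Title",
    "creator": "http://example.org/person/42",  # hypothetical URI
}


def record_to_statements(rec):
    """Split one record into independent (subject, predicate, object) statements."""
    subject = rec["id"]
    return [
        (subject, field, value) for field, value in rec.items() if field != "id"
    ]


statements = record_to_statements(record)

# The set is unbounded: a new statement about the same subject can be added
# later, by anyone, without changing a fixed record layout.
statements.append(("http://example.org/book/1", "subject", "Open data"))

for statement in statements:
    print(statement)
```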

Page 22: Digital Odyssey 2012: Open Data

Based on a new technology

• Same principles and mechanisms as WWW

– URIs for names, HTTP for retrieval, plus RDF

• Still organized facts about things, but infinitely more flexible structure
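
To make "URIs for names, plus RDF" concrete, here is a minimal sketch that writes statements out in the line-based N-Triples syntax for RDF. The `example.org` URIs are hypothetical, the predicates come from the Dublin Core terms vocabulary, and treating every string that starts with `http://` or `https://` as a URI is a simplifying assumption of this sketch, not a rule of RDF.

```python
def ntriples_term(value):
    """Render an object as <URI> if it looks like an http(s) URI, else as a literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(
        f"<{s}> <{p}> {ntriples_term(o)} ." for s, p, o in triples
    )


triples = [
    (
        "http://example.org/book/1",
        "http://purl.org/dc/terms/title",
        "Example Title",
    ),
    (
        "http://example.org/book/1",
        "http://purl.org/dc/terms/creator",
        "http://example.org/person/42",
    ),
]

ntriples_text = to_ntriples(triples)
print(ntriples_text)
```

Each output line is one self-contained fact, which is what lets statements from different sources be concatenated freely.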

Page 23: Digital Odyssey 2012: Open Data

“vague, but exciting”

Page 24: Digital Odyssey 2012: Open Data

Why Linked Data?

• Breaking data out of silos by pointing to and linking between other databases

• Formulate questions for which no answer exists on the current WWW

• Anyone can contribute unique expertise in a form that can be reused and recombined
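
The points above can be sketched with two tiny, entirely hypothetical datasets: a library's statements about a book and someone else's statements about its author. Neither can answer "which titles were written by people born in Toronto?" alone, but the merged statements can, because both use the same URI for the person. All names, URIs, and the helper functions are assumptions made for illustration.

```python
# Hypothetical statements published by a library.
library = [
    ("http://example.org/book/1", "title", "Example Title"),
    ("http://example.org/book/1", "creator", "http://example.org/person/42"),
]

# Hypothetical statements published by a different organization.
other = [
    ("http://example.org/person/42", "name", "Jane Doe"),
    ("http://example.org/person/42", "birthplace", "Toronto"),
]


def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def titles_by_birthplace(graph, place):
    """Follow person -> book links to find titles by creators born in `place`."""
    people = {s for s, p, o in graph if p == "birthplace" and o == place}
    books = {s for s, p, o in graph if p == "creator" and o in people}
    return sorted(t for b in books for t in objects(graph, b, "title"))


merged = library + other  # merging is just concatenating statements

print(titles_by_birthplace(library, "Toronto"))
print(titles_by_birthplace(merged, "Toronto"))
```

The first call finds nothing because the library data alone says nothing about birthplaces; the second succeeds only after the two sources are combined.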

Page 25: Digital Odyssey 2012: Open Data

“The coolest thing to do to your data will be thought of by someone else.”

Page 26: Digital Odyssey 2012: Open Data

Open Data

Page 27: Digital Odyssey 2012: Open Data

Open Data

• Legal and policy framework for data interoperability

• Clarifies the terms and purposes of data use

• Allows for a spectrum of licensing options

– see Creative Commons

Page 28: Digital Odyssey 2012: Open Data

Open Data definition

“freely usable, reusable and redistributable, subject, at most, to the requirements to attribute and share-alike”

http://opendefinition.org/okd/

Page 29: Digital Odyssey 2012: Open Data

Database hugging

• People don't want to let go of their data:

– until it's perfect or complete or “finished”

– because data is raw and unpolished and ugly

– because “we know better than everyone else”

– something unforeseeably terrible might happen

Page 30: Digital Odyssey 2012: Open Data
Page 31: Digital Odyssey 2012: Open Data

Misconception #1

• Open Data will destroy/compromise quality

– Already a lot of high-quality data being created outside of libraries

– Our MARC records aren't actually that great

Page 32: Digital Odyssey 2012: Open Data

Misconception #2

• Open Data will reveal our mistakes/problems

– everyone's data is messy; that’s its nature

– what if someone were able to clean it up for you?

Page 33: Digital Odyssey 2012: Open Data

Misconception #3

• Open Data will facilitate competition

– new and useful tools are good, even ones that involve money

– what if someone does a better job with our data than we do?

Page 34: Digital Odyssey 2012: Open Data

Misconception #4

• Open Data is a loss of control

– if you deliberately make it available, you can set the (legal) terms of its use

– requires thinking about / dealing with legal stuff

Page 35: Digital Odyssey 2012: Open Data
Page 36: Digital Odyssey 2012: Open Data

An increasing trend

• 2012: Canada Post Files Copyright Lawsuit Over Crowd-sourced Postal Code Database

http://geocoder.ca/?sued=1

1. take down the openly-licensed database

2. pay damages on lost business ($5500/year)

Page 37: Digital Odyssey 2012: Open Data
Page 38: Digital Odyssey 2012: Open Data

New library business model

1. Sell access to library catalogue data

2. Sue every organization who makes bibliographic data available for free

e.g. Internet Archive, Amazon, Library of Congress

3. Profit!

Page 39: Digital Odyssey 2012: Open Data

Open Data vs. Linked Data

• Open Data does not have to be Linked Data

• Linked Data does not have to be Open

• But the potential of both is best realized when data is published as Open Linked Data

Page 40: Digital Odyssey 2012: Open Data

Open Linked Data

[Diagram labels: Linked Data; Open Data]

Page 41: Digital Odyssey 2012: Open Data

Gold stars of Open Linked Data

1. Make your stuff openly available on the web ★

2. Make it available as structured data ★★

3. Use a non-proprietary format ★★★

4. Use URIs to identify your things ★★★★

5. Link to other people’s things using URIs ★★★★★
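
The five criteria can be read as a simple checklist. The sketch below assumes the stars are cumulative (each star requires all the earlier ones), which is how the list is numbered here; the `Dataset` fields and the `stars` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical checklist, one flag per star criterion."""

    openly_on_web: bool
    structured: bool
    non_proprietary: bool
    uses_uris: bool
    links_to_others: bool


def stars(dataset):
    """Count stars, stopping at the first criterion that is not met."""
    criteria = [
        dataset.openly_on_web,
        dataset.structured,
        dataset.non_proprietary,
        dataset.uses_uris,
        dataset.links_to_others,
    ]
    count = 0
    for met in criteria:
        if not met:
            break
        count += 1
    return count


pdf_on_website = Dataset(True, False, False, False, False)
csv_on_website = Dataset(True, True, True, False, False)
linked_rdf = Dataset(True, True, True, True, True)

for name, ds in [("PDF", pdf_on_website), ("CSV", csv_on_website), ("Linked RDF", linked_rdf)]:
    print(name, "★" * stars(ds))
```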

Page 42: Digital Odyssey 2012: Open Data

Libraries & The Semantic Web

Page 43: Digital Odyssey 2012: Open Data

2011: Library Linked Data

• W3C Library Linked Data incubator group

• Panel of invited librarians, academics, experts

• “to help increase global interoperability of library data on the Semantic Web”

• Final report produced October 2011

Page 44: Digital Odyssey 2012: Open Data

A struggle for relevancy

• "library" = all cultural heritage & memory institutions (archives, museums)

• Natural extension to the collaborative sharing models historically employed by libraries

• In a position to provide trusted metadata for resources of long-term cultural importance

Page 45: Digital Odyssey 2012: Open Data

Major goals for libraries

1. Foster discussion about Open Data and rights management issues

2. Develop library standards that are compatible with Linked Data

3. Apply library experience in curation and long-term preservation to Open Linked Data

Page 46: Digital Odyssey 2012: Open Data
Page 47: Digital Odyssey 2012: Open Data

A discussion about Open Data

• Data can have unclear and untested rights issues that hinder their release as Open Data

• Seek agreement with owners about licensing; consider the impact of usage restrictions

• Establish institutional policies for data sharing and licensing

Page 48: Digital Odyssey 2012: Open Data

Issues with library standards

• Data is expressed primarily in natural-language text

• Technology changes depend on vendor systems development

• Data is not integrated with web resources

• Designed only for the library community

Page 49: Digital Odyssey 2012: Open Data

Benefits of Open Linked Data

• Will be able to use mainstream solutions

• Can give libraries a wider choice of vendors and developers to recruit from and interact with

• Much larger community to provide IT support

• Smaller institutions can make themselves more visible and connected

Page 50: Digital Odyssey 2012: Open Data
Page 51: Digital Odyssey 2012: Open Data

Already going mainstream

• National libraries of Sweden, Hungary, Germany, France, the British Library, the Library of Congress

• BNB: 2.6 million records as 85 million RDF statements, public domain license

• Cities of Vancouver, Edmonton, Ottawa, and Toronto have created grassroots @g4open
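
As a quick back-of-the-envelope check on the BNB figures above, dividing statements by records gives the average number of RDF statements each record became. The variable names are illustrative; the two inputs are the rounded figures from the slide.

```python
records = 2_600_000      # BNB records, as stated on the slide
statements = 85_000_000  # RDF statements, as stated on the slide

per_record = statements / records
print(f"About {per_record:.1f} RDF statements per record on average")
```

So a single bibliographic record unpacks into roughly thirty separate statements.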

Page 52: Digital Odyssey 2012: Open Data

In Summary

Page 53: Digital Odyssey 2012: Open Data

Now is the time

• Missed opportunities before

• Don’t often get a second chance

• Major opportunity here for libraries to catch up and become leaders online

Page 54: Digital Odyssey 2012: Open Data

Open Data Now!

• Remember the 5 stars of Open Linked Data

1. Choose a license, keep control of the rights

2. Release the data – just get it out there

Page 55: Digital Odyssey 2012: Open Data

Thanks!

@mjsuhonos

http://mj.suhonos.ca

Page 56: Digital Odyssey 2012: Open Data