DE Conferentie 2007 Jack van Ossenbruggen

Page 1: DE Conferentie 2007 Jack van Ossenbruggen

Tumbling Walls & Building Bridges

Steps towards a Culture Web

Page 2: DE Conferentie 2007 Jack van Ossenbruggen

Interoperability: tearing down the walls between collections

• Museums have increasingly nice websites

• But: most of them are driven by stand-alone collection databases

• Data is isolated, both syntactically and semantically

• If users can do cross-collection search, the individual collections become more valuable!

Page 3: DE Conferentie 2007 Jack van Ossenbruggen

The Web: “open” documents and links

[Diagram: two documents, each identified by a URL, connected by a Web link]

Page 4: DE Conferentie 2007 Jack van Ossenbruggen

The Semantic Web: “open” data and links

[Diagram: two resources, each identified by a URL, connected by a typed Web link]

• Painting: “Green Stripe (Mme Matisse)”, Royal Museum of Fine Arts, Copenhagen
• Link: creator (Dublin Core)
• Painter: “Henri Matisse” (Getty ULAN)
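
The painting-creator-painter link on this slide can be sketched as a set of (subject, property, object) statements. This is a minimal illustration in plain Python, not the project's implementation; the two `example.org` URLs are placeholders invented for the example, and only the Dublin Core `creator` URI is a real identifier.

```python
# Dublin Core "creator" property (real URI).
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
# Placeholder URLs, assumed for illustration; not real museum or ULAN identifiers.
PAINTING = "http://example.org/museum/green-stripe"
MATISSE = "http://example.org/ulan/henri-matisse"

# Each statement is a (subject, property, object) triple.
triples = [
    (PAINTING, "label", "Green Stripe (Mme Matisse)"),
    (MATISSE, "label", "Henri Matisse"),
    (PAINTING, DC_CREATOR, MATISSE),  # the typed link between the two URLs
]


def objects(subject, prop):
    """Return everything that `subject` points to via `prop`."""
    return [o for s, p, o in triples if s == subject and p == prop]


def works_by(creator):
    """Return every resource whose creator is `creator`."""
    return [s for s, p, o in triples if p == DC_CREATOR and o == creator]
```

Because the painter is a URL rather than a text string, any other collection that links to the same URL would show up in `works_by` as well.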

Page 6: DE Conferentie 2007 Jack van Ossenbruggen

Principle 1: semantic annotation

Description of web objects with “concepts” from a shared vocabulary

Page 7: DE Conferentie 2007 Jack van Ossenbruggen

Principle 2: semantic search

• Search for objects which are linked via concepts (semantic link)
• Use the type of semantic link to provide meaningful presentation of the search results

[Diagram: query “Paris”; Montmartre is linked to Paris via PartOf]
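
A rough sketch of the Paris/Montmartre example: a query for a place also returns objects annotated with places that are part of it, and the results are grouped by the type of link that led to them. The object ids and the tiny `PART_OF` table are invented for illustration.

```python
# Toy data, assumed for illustration only.
PART_OF = {"Montmartre": "Paris"}  # place -> larger place that contains it
ANNOTATIONS = {                     # object id -> place concept
    "painting-1": "Paris",
    "painting-2": "Montmartre",
    "painting-3": "Copenhagen",
}


def semantic_search(query):
    """Group matching objects by the kind of link that connects them to the query."""
    results = {"exact": [], "partOf": []}
    for obj, place in sorted(ANNOTATIONS.items()):
        if place == query:
            results["exact"].append(obj)
            continue
        # Walk up the partOf chain to see whether the place lies inside the query place.
        parent = PART_OF.get(place)
        while parent is not None:
            if parent == query:
                results["partOf"].append(obj)
                break
            parent = PART_OF.get(parent)
    return results
```

Keeping the link type in the result lets an interface present "in Paris" and "in a part of Paris" as separate groups instead of one flat list.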

Page 8: DE Conferentie 2007 Jack van Ossenbruggen

Principle 3: vocabulary alignment

“Tokugawa”

• SVCN period: Edo (SVCN is a local in-house ethnology thesaurus)
• AAT style/period: Edo (Japanese period), also labelled Tokugawa (AAT is Getty’s Art & Architecture Thesaurus)
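
The effect of a single alignment link can be sketched as follows. The concept ids, the object records, and the assumption that “Tokugawa” is stored as a second label of the AAT concept are illustrative readings of the slide, not data taken from SVCN or AAT.

```python
# Toy vocabularies: concept id -> labels (ids are hypothetical).
LABELS = {
    "aat:edo": ["Edo (Japanese period)", "Tokugawa"],
    "svcn:edo": ["Edo"],
}
# One alignment link stating that the two concepts mean the same thing.
ALIGNED = [("aat:edo", "svcn:edo")]
# Objects annotated with concepts from their own collection's vocabulary.
OBJECTS = {"ethnology-item-7": "svcn:edo", "print-42": "aat:edo"}


def equivalents(concept):
    """Return the concept plus everything directly aligned with it (one hop only)."""
    found = {concept}
    for a, b in ALIGNED:
        if a == concept:
            found.add(b)
        if b == concept:
            found.add(a)
    return found


def search(term):
    """Find objects whose concept, or an aligned concept, carries the label `term`."""
    hits = set()
    for concept, labels in LABELS.items():
        if any(term.lower() == label.lower() for label in labels):
            hits |= equivalents(concept)
    return sorted(obj for obj, concept in OBJECTS.items() if concept in hits)
```

Without the entry in `ALIGNED`, a search for “Tokugawa” would only reach the object annotated with the AAT concept; with it, the ethnology item annotated as “Edo” in the local thesaurus is found too.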

Page 9: DE Conferentie 2007 Jack van Ossenbruggen

The myth of a unified vocabulary

• In large virtual collections there are always multiple vocabularies
  – In multiple languages

• Every vocabulary has its own perspective
  – You can’t just merge them

• But you can use vocabularies jointly by defining a limited set of links
  – “Vocabulary alignment”

• It is surprising what you can do with just a few links

Page 12: DE Conferentie 2007 Jack van Ossenbruggen

Part of the Dutch national MultimediaN project

CWI, VU, UvA, DEN, ICN

Alia Amin, Lora Aroyo, Mark van Assem, Victor de Boer, Lynda Hardman, Michiel Hildebrand, Laura Hollink, Marco de Niet, Borys Omelayenko, Marie-France van Orsouw, Jacco van Ossenbruggen, Guus Schreiber, Jos Taekema, Annemiek Teesing, Anna Tordai, Jan Wielemaker, Bob Wielinga

Artchive.com

Rijksmuseum Amsterdam

Dutch ethnology museums (Amsterdam, Leiden)

National Library (Bibliopolis)

http://e-culture.multimedian.nl

Page 14: DE Conferentie 2007 Jack van Ossenbruggen

Extra slides

Page 15: DE Conferentie 2007 Jack van Ossenbruggen

From metadata to semantic metadata

Page 16: DE Conferentie 2007 Jack van Ossenbruggen

Example textual annotation

Page 17: DE Conferentie 2007 Jack van Ossenbruggen

Resulting semantic annotation (rendered as HTML with RDFa)
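
The slide's actual markup is not preserved in this transcript. As a hedged illustration of the idea, the sketch below renders one annotation as HTML with RDFa attributes (`about`, `property`, `rel`, `resource`) using the Dublin Core namespace; the function name and the URLs passed to it are assumptions made for the example.

```python
from html import escape


def render_rdfa(subject_url, title, creator_url, creator_name):
    """Render a title and a creator link for one object as HTML with RDFa attributes."""
    template = (
        '<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="{subject}">'
        '<span property="dc:title">{title}</span> by '
        '<span rel="dc:creator" resource="{creator}">{name}</span>'
        "</div>"
    )
    # Escape every value so that user-supplied text cannot break the markup.
    return template.format(
        subject=escape(subject_url, quote=True),
        title=escape(title),
        creator=escape(creator_url, quote=True),
        name=escape(creator_name),
    )
```

A browser shows only the visible text, while an RDFa-aware tool can read the same markup as two statements about the object: its title, and a creator link to another URL.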

Page 18: DE Conferentie 2007 Jack van Ossenbruggen

Levels of interoperability

• Syntactic interoperability
  – Using data formats that you can share
  – XML family is the preferred option

• Semantic interoperability
  – How to share meaning / concepts
  – Technology for finding and representing semantic links

Page 19: DE Conferentie 2007 Jack van Ossenbruggen

Term disambiguation is a key issue in semantic search

• Post-query
  – Sort search results based on different meanings of the search term
  – Mimics Google-type search

• Pre-query
  – Ask user to disambiguate by displaying list of possible meanings
  – Interface is more complex, but more search functionality can be offered
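
The two strategies can be contrasted in a small sketch. The concept index below, with two meanings of the label “Paris”, is invented for illustration.

```python
# Toy concept index; ids, types and object ids are hypothetical.
CONCEPTS = [
    {"id": "place:paris", "label": "Paris", "type": "place",
     "objects": ["map-1", "photo-2"]},
    {"id": "myth:paris", "label": "Paris", "type": "mythological figure",
     "objects": ["vase-3"]},
]


def meanings(term):
    """Pre-query: list the possible meanings so the user can pick one before searching."""
    return [(c["id"], c["type"]) for c in CONCEPTS if c["label"].lower() == term.lower()]


def search_concept(concept_id):
    """Pre-query, second step: search for exactly the meaning the user selected."""
    for c in CONCEPTS:
        if c["id"] == concept_id:
            return c["objects"]
    return []


def search_grouped(term):
    """Post-query: search all meanings at once and group the results per meaning."""
    return {c["type"]: c["objects"] for c in CONCEPTS if c["label"].lower() == term.lower()}
```

The post-query variant needs only a plain search box, while the pre-query variant adds a selection step in exchange for an unambiguous query.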

Page 20: DE Conferentie 2007 Jack van Ossenbruggen

Semantic autocompletion

Page 21: DE Conferentie 2007 Jack van Ossenbruggen

Faceted search (pre-query)
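
As a minimal sketch of how faceted search narrows a result set, assuming invented records with two facets (`creator` and `place`): the user selects facet values, and the interface shows how many objects remain under each value of the other facets.

```python
from collections import Counter

# Toy records, assumed for illustration only.
RECORDS = [
    {"id": "obj-1", "creator": "Henri Matisse", "place": "Paris"},
    {"id": "obj-2", "creator": "Henri Matisse", "place": "Copenhagen"},
    {"id": "obj-3", "creator": "Unknown", "place": "Paris"},
]


def filter_records(selection):
    """Keep the records that match every selected facet value."""
    return [r for r in RECORDS if all(r.get(f) == v for f, v in selection.items())]


def facet_counts(selection, facet):
    """Count the values of `facet` among the records that match the current selection."""
    return Counter(r[facet] for r in filter_records(selection))
```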

Page 25: DE Conferentie 2007 Jack van Ossenbruggen

SKOS (Simple Knowledge Organization System)

Page 27: DE Conferentie 2007 Jack van Ossenbruggen

Multi-lingual labels for concepts
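
A small sketch of the idea, with an invented concept id and label table: one concept carries a label per language, so it can be displayed in the user's language and found from a term in any of them.

```python
# Toy label table; the concept id is hypothetical.
LABELS = {
    "concept:painting": {"en": "painting", "nl": "schilderij", "fr": "peinture"},
}


def label(concept, lang, fallback="en"):
    """Return the label in `lang`, or the fallback language if none exists."""
    names = LABELS[concept]
    return names.get(lang, names[fallback])


def find_concepts(term):
    """Find concepts that carry `term` as a label in any language."""
    term = term.lower()
    return sorted(
        concept
        for concept, names in LABELS.items()
        if term in (name.lower() for name in names.values())
    )
```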

Page 28: DE Conferentie 2007 Jack van Ossenbruggen

Learning alignments

• Learning relations between art styles in AAT and artists in ULAN through NLP of art-historical texts
  – “Who are Impressionist painters?”

Page 29: DE Conferentie 2007 Jack van Ossenbruggen

Perspectives

• Basic Semantic Web technology is ready for deployment

• Web 2.0 facilities fit well:
  – Involving community experts in annotation
  – Personalization, myArt

• Social barriers have to be overcome!
  – “Open door” policy
  – Involvement of general public => issues of “quality”

Page 30: DE Conferentie 2007 Jack van Ossenbruggen

Semantic interoperability

• Large, smart web “mash ups”, combining:
  – Data: images, metadata & encyclopaedic knowledge (gazetteers, thesauri, Wikipedia, …)
  – Visualisations: maps, timelines, social networks, …

• Data too diverse for a traditional database approach
  – Fixed schemas will not work
  – Data includes relational data, XML text, images, video, …

• Need to link different data sources together
  – Focus on lightweight, heuristic approaches
  – Reusing as much as possible (web standards)

• Need new interfaces and search paradigms
  – Need to find relations between pieces of information
  – Need to organize (cluster/rank/filter) the many relations we will find
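
One lightweight heuristic for linking data sources, sketched here as an assumption rather than as the project's actual method, is to propose a link wherever two vocabularies share a label after simple normalisation. The vocabularies and ids are invented for the example.

```python
import re


def normalise(label):
    """Lower-case a label and drop parenthesised qualifiers and punctuation.

    Non-ASCII letters are treated as separators in this sketch.
    """
    label = re.sub(r"\(.*?\)", "", label)
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()


def propose_links(vocab_a, vocab_b):
    """Propose (concept_a, concept_b) pairs whose normalised labels coincide."""
    index = {}
    for concept, labels in vocab_b.items():
        for text in labels:
            index.setdefault(normalise(text), set()).add(concept)
    links = set()
    for concept, labels in vocab_a.items():
        for text in labels:
            for other in index.get(normalise(text), ()):
                links.add((concept, other))
    return sorted(links)
```

Proposals of this kind are candidates only; a matching label does not guarantee a matching meaning, so they would still need review before being used as alignment links.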

Page 31: DE Conferentie 2007 Jack van Ossenbruggen

Caveats for museum software

• Be wary of Flash
  – Accessibility

• Make sure you can connect to others and others can connect to you
  – “Don’t buy software which does not support standard open APIs”

• Export facilities to common formats (XML, …)

Page 32: DE Conferentie 2007 Jack van Ossenbruggen

Semantic Web Myths *)

• Sem Web = Artificial Intelligence on the Web

• Relies on centrally controlled ontologies for “meaning”
  – as opposed to a democratic, bottom-up control of terms

• One has to manually add metadata to all Web pages, relational databases, XML data, etc. to use it

• It is just ugly XML

• One has to learn formal logic, knowledge representation, description logic, etc.

• An academic project, of no interest for industry

*) Adapted from a slide by Frank van Harmelen, panel WWW2006