isslod2011 - semantic multimedia

Semantic Multimedia. Indian Summer School on Linked Data, Leipzig, 15 Sep. 2011. Dr. Harald Sack, Hasso-Plattner-Institut for IT-Systems Engineering, University of Potsdam

Upload: harald-sack

Post on 16-Jan-2015

DESCRIPTION

My lecture at the Indian Summer School on Linked Open Data 2011 at the University of Leipzig (Germany) on 15 Sep. 2011

TRANSCRIPT

Page 1: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia
Indian Summer School on Linked Data

Leipzig, 15 Sep. 2011

Dr. Harald Sack
Hasso-Plattner-Institut for IT-Systems Engineering

University of Potsdam

Page 2: ISSLOD2011 - Semantic Multimedia

Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering, Indian Summer School on Linked Data, Leipzig, 12-18. Sep. 2011

■ HPI was founded in October 1998 as a public-private partnership
■ HPI research and teaching is focused on IT Systems Engineering
■ 10 professors and 100 scientific coworkers
■ 450 Bachelor / Master students
■ HPI is winner of the CHE Ranking 2010

http://hpi.uni-potsdam.de/

Page 3: ISSLOD2011 - Semantic Multimedia

■ Research Topics
  □ Semantic Web Technologies
  □ Ontological Engineering
  □ Information Retrieval
  □ Multimedia Analysis & Retrieval
  □ Social Networking
  □ Data/Information Visualization

■ Research Projects

Semantic Technologies & Multimedia Retrieval

Page 4: ISSLOD2011 - Semantic Multimedia

Overview
(1) Multimedia and Semantics
(2) Multimedia Metadata and Ontologies
(3) Semantic Multimedia Analysis
(4) Semantic Multimedia Retrieval

Semantic Multimedia, Indian Summer School on Linked Data, Leipzig, 15 Sep. 2011

Page 5: ISSLOD2011 - Semantic Multimedia

1. Multimedia and Semantics

Communication is the activity of conveying meaningful information

Page 6: ISSLOD2011 - Semantic Multimedia

1. Multimedia and Semantics

Shannon's Model of Communication

[Figure: Sender (Information → Encoding → Message) → Channel → Receiver (Message → Decoding → Information)]

Claude E. Shannon (1916-2001)

Claude E. Shannon: "A Mathematical Theory of Communication", Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, July and October 1948

Page 9: ISSLOD2011 - Semantic Multimedia

MEDIA: In communications, media (singular medium) are the storage and transmission channels or tools used to store and deliver information or data.

Page 11: ISSLOD2011 - Semantic Multimedia

TEXT: In literary theory, a text is a coherent set of symbols that transmits some kind of informative message.

Page 19: ISSLOD2011 - Semantic Multimedia

Video / Audio

Image

Text

Interactive Elements

Page 22: ISSLOD2011 - Semantic Multimedia

1. Multimedia and Semantics

Media
• time-dependent: audio, video / animation
• time-independent: text, image

Page 23: ISSLOD2011 - Semantic Multimedia

1. Multimedia and Semantics

One Small Step ...
This video shows Neil Armstrong climbing down the lunar module ladder to the lunar surface. The video compares existing footage with the partially restored video. The thumbnail image shows the new footage on the left and the old on the right.

• Information is encoded in media content
• Media content contains implicit semantics

Page 26: ISSLOD2011 - Semantic Multimedia

SEMANTIC MULTIMEDIA facilitates explicit semantic annotation of multimedia content on different levels of abstraction with respect to
• time,
• space, and
• provenance.

Page 27: ISSLOD2011 - Semantic Multimedia

1. Multimedia and Semantics

One Small Step ...
This video shows Neil Armstrong climbing down the lunar module ladder to the lunar surface. The video compares existing footage with the partially restored video. The thumbnail image shows the new footage on the left and the old on the right.

dbpedia:Neil_Armstrong

Text

(1) Identify media fragment
(2) Annotate with explicit semantics

Page 28: ISSLOD2011 - Semantic Multimedia

1. Multimedia and Semantics

One Small Step ...
This video shows Neil Armstrong climbing down the lunar module ladder to the lunar surface. The video compares existing footage with the partially restored video. The thumbnail image shows the new footage on the left and the old on the right.

dbpedia:Astronaut
dbpedia:Flag

Video

(1) Identify media fragment
(2) Annotate with explicit semantics

Page 29: ISSLOD2011 - Semantic Multimedia

Overview
(1) Multimedia and Semantics
(2) Multimedia Metadata and Ontologies
(3) Semantic Multimedia Analysis
(4) Semantic Multimedia Retrieval

Semantic Multimedia, Indian Summer School on Linked Data, Leipzig, 15 Sep. 2011

Page 31: ISSLOD2011 - Semantic Multimedia

2. Multimedia Metadata and Ontologies

One Small Step ...
This video shows Neil Armstrong climbing down the lunar module ladder to the lunar surface. The video compares existing footage with the partially restored video. The thumbnail image shows the new footage on the left and the old on the right.

dbpedia:Astronaut
dbpedia:Flag

Video

How can we put (semantic) metadata at the appropriate place within the media?

Page 35: ISSLOD2011 - Semantic Multimedia

2. Multimedia Metadata and Ontologies

What is "Metadata"?

"Metadata is defined as data providing information about one or more aspects of the data." (informal definition, Wikipedia)

"Metadata is structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities." (W.R. Durell, 1985)

"Metadata is machine understandable information about web resources or other things." (T. Berners-Lee, Axioms of Web Architecture: Metadata, W3C, 1997)

Page 36: ISSLOD2011 - Semantic Multimedia

Metadata

2. Multimedia Metadata and Ontologies

• Simple example: bibliographic metadata
  • Identification via ISBN / ISSN, author(s), title, ...
  • Classification via categories, keywords, abstract, ...

Page 37: ISSLOD2011 - Semantic Multimedia

Structured Metadata
• name-value pairs (e.g. author='Ernest Hemingway')
• typed (e.g. author is of type string)
• The meaning (semantics) of structured data is only implicit, i.e. it relies on mutual agreement about the proper usage of the data (e.g. standardization as in Dublin Core)

2. Multimedia Metadata and Ontologies

• Title: A name given to the resource.
• Creator: An entity primarily responsible for making the resource.
• Subject: The topic of the resource.
• Description: An account of the resource.
• Publisher: An entity responsible for making the resource available.
• Contributor: An entity responsible for making contributions to the resource.
• ...

http://dublincore.org/documents/dces/
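As a minimal sketch of structured metadata as typed name-value pairs, the record below uses a few of the Dublin Core element names listed above. The class name `DublinCoreRecord`, the helper `as_pairs`, and the sample values are illustrative assumptions and not part of the Dublin Core specification.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DublinCoreRecord:
    """Hypothetical container for a few Dublin Core elements (all values are strings)."""
    title: str
    creator: str
    subject: List[str] = field(default_factory=list)
    description: str = ""
    publisher: str = ""
    contributor: List[str] = field(default_factory=list)


def as_pairs(record):
    """Flatten a record into (name, value) pairs, skipping empty values."""
    pairs = []
    for name, value in asdict(record).items():
        values = value if isinstance(value, list) else [value]
        pairs.extend((name, v) for v in values if v)
    return pairs


# Sample values chosen for illustration only.
record = DublinCoreRecord(
    title="The Old Man and the Sea",
    creator="Ernest Hemingway",
    subject=["fiction", "sea"],
)
```

The pairs returned by `as_pairs(record)` are typed and well structured, but a consumer still has to know by prior agreement what `creator` means, which is the implicit-semantics point made on the slide.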

Page 38: ISSLOD2011 - Semantic Multimedia

Structured Metadata
• can also be structured hierarchically

2. Multimedia Metadata and Ontologies

Systema Naturae (1735)

Carl von Linné (1707-1778)

Page 39: ISSLOD2011 - Semantic Multimedia

Structured Metadata
• Classification systems, e.g. the Dewey Decimal Classification

2. Multimedia Metadata and Ontologies

DDC 23 (2011)
• 4 volumes
• >4,000 pages
• >45,000 classes
• >96,000 registered terms

DDC 1 (1876)
• 44 pages

10 Main DDC Classes
000 Computer science, information & general works
100 Philosophy & psychology
200 Religion
300 Social sciences
400 Language
500 Science
600 Technology
700 Arts & recreation
800 Literature
900 History & geography

Melvil Dewey (1851-1931)

http://www.oclc.org/dewey/
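To illustrate how a hierarchical classification encodes meaning in the notation itself, here is a small sketch that maps a DDC number to one of the ten main classes listed above. The function name `main_class` and the sample notations are assumptions made for this example; only the top level of the hierarchy is covered.

```python
# The ten main classes as listed on the slide, keyed by their hundreds value.
DDC_MAIN_CLASSES = {
    0: "Computer science, information & general works",
    100: "Philosophy & psychology",
    200: "Religion",
    300: "Social sciences",
    400: "Language",
    500: "Science",
    600: "Technology",
    700: "Arts & recreation",
    800: "Literature",
    900: "History & geography",
}


def main_class(notation):
    """Return the main class for a DDC notation such as '025.04' or '510'."""
    number = int(float(notation))  # drop the decimal subdivision
    if not 0 <= number <= 999:
        raise ValueError(f"not a DDC notation: {notation!r}")
    return DDC_MAIN_CLASSES[number // 100 * 100]
```

Calling `main_class("510")` yields "Science", because the leading digit selects the main class and every further digit only refines it.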

Page 40: ISSLOD2011 - Semantic Multimedia

Unstructured Metadata
• Text-based metadata without a predefined structure, where the meaning (semantics) is determined implicitly by the (natural language) content
• e.g. abstract/summary

2. Multimedia Metadata and Ontologies

Melville Louis Kossuth (Melvil) Dewey was an American librarian and educator, inventor of the Dewey Decimal system of library classification, and a founder of the Lake Placid Club. Dewey was born in Adams Center, New York, the fifth and last child of Joel and Eliza Greene Dewey. He attended rural schools and determined early that his destiny was to be a reformer in educating the masses. At Amherst College he belonged to Delta Kappa Epsilon, earning a bachelor's degree in 1874 and a master's in 1877. ...

Page 41: ISSLOD2011 - Semantic Multimedia

Authoritative vs. non-authoritative Metadata

Authoritative metadata are generated by a reliable (authoritative) source, e.g.
• the author of the original information
• a certified expert

Non-authoritative metadata are created by an unreliable source, e.g.
• the user
• social tagging systems

2. Multimedia Metadata and Ontologies

Page 42: ISSLOD2011 - Semantic Multimedia

Collaborative Annotation -- Social Tagging

[Figure: an author and several users annotate the same resource]
• Author: authoritative metadata (apple, fruit)
• Users: non-authoritative metadata (tasty, apple, fruit, breakfast, to buy)
Image © E.C. Publications, Inc.

2. Multimedia Metadata and Ontologies

Page 43: ISSLOD2011 - Semantic Multimedia

Collaborative Annotation -- Folksonomies

http://www.wordle.net/
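A folksonomy can be thought of as the aggregate of many individual (user, resource, tag) assignments. The sketch below counts such assignments and derives relative tag weights of the kind a tag cloud would visualize. The function names, the normalization by lowercasing, and the sample data are assumptions made for this illustration.

```python
from collections import Counter


def build_folksonomy(taggings):
    """Count how often each (resource, tag) pair was assigned.

    taggings: iterable of (user, resource, tag) tuples.
    Tags are lowercased so that 'Apple' and 'apple' count as the same tag.
    """
    counts = Counter()
    for _user, resource, tag in taggings:
        counts[(resource, tag.strip().lower())] += 1
    return counts


def tag_cloud(counts, resource):
    """Return tag weights in (0, 1] for one resource, relative to its most frequent tag."""
    tags = {tag: n for (res, tag), n in counts.items() if res == resource}
    top = max(tags.values(), default=0)
    return {tag: n / top for tag, n in tags.items()}


# Hypothetical tag assignments by three users for one image.
taggings = [
    ("u1", "img1", "apple"),
    ("u2", "img1", "Apple"),
    ("u2", "img1", "fruit"),
    ("u3", "img1", "tasty"),
    ("u3", "img1", "apple"),
]
cloud = tag_cloud(build_folksonomy(taggings), "img1")
```

Frequently assigned tags receive the highest weight, which is how non-authoritative metadata from many users can still converge on a useful description.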

2. Multimedia Metadata and Ontologies

Page 44: ISSLOD2011 - Semantic Multimedia

2. Multimedia Metadata and Ontologies

Semantic Metadata
• can be structured or semi-structured metadata
• the semantics of the metadata is defined explicitly in a formal way (ontologies) and is therefore machine readable (as well as machine understandable)

"An ontology is an explicit, formal specification of a shared conceptualization. The term is borrowed from philosophy, where an Ontology is a systematic account of Existence. For AI systems, what 'exists' is that which can be represented." (Thomas R. Gruber, 1993)

• conceptualization: abstract model (domain, relevant terms, relations)
• explicit: semantics of all terms must be defined
• formal: machine understandable
• shared: consensus about ontology

Page 63: ISSLOD2011 - Semantic Multimedia

Example for Semantic Metadata

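[Figure: example ontology for bibliographic metadata, built up step by step. Labels recoverable from the diagram:]
• Classes: publication, book, journal, publisher, Author, Person, address, male, female
• "is a" links between classes: book and journal (publication); Author (Person); male and female (Person)
• Relations: publisher publishes publication; Author writes publication, publication is written by Author (cardinality 1..n at both ends); Person has an address
• Properties: title, keywords, ... and surname, first name, street, ...
• "is a" links from individuals: Springer Verlag, Harald Sack, Internetworking
• Legend: entity, class, relation, axiom

2. Multimedia Metadata and Ontologies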
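The example ontology can be sketched as a set of subject-predicate-object triples. In the code below the predicate names `subClassOf`, `type`, `writes`, and `publishes`, the spelling of the identifiers, and the links between the individuals are one assumed reading of the diagram; a real system would use RDF(S) or OWL vocabularies instead of plain strings.

```python
# Class hierarchy and individuals, written as (subject, predicate, object).
TRIPLES = {
    ("Book", "subClassOf", "Publication"),
    ("Journal", "subClassOf", "Publication"),
    ("Author", "subClassOf", "Person"),
    ("Internetworking", "type", "Book"),
    ("HaraldSack", "type", "Author"),
    ("SpringerVerlag", "type", "Publisher"),
    # Assumed links between individuals, for illustration only.
    ("HaraldSack", "writes", "Internetworking"),
    ("SpringerVerlag", "publishes", "Internetworking"),
}


def superclasses(cls, triples):
    """Collect all direct and indirect superclasses of a class."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                todo.append(o)
    return found


def types_of(individual, triples):
    """Return the asserted classes of an individual plus all inherited ones."""
    direct = {o for s, p, o in triples if s == individual and p == "type"}
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls, triples)
    return result
```

Asking for `types_of("Internetworking", TRIPLES)` returns both `Book` and `Publication`, although only the first was stated explicitly: the "is a" hierarchy supplies the rest.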

Page 64: ISSLOD2011 - Semantic Multimedia

2. Multimedia Metadata and Ontologies

Semantic Metadata
• enable the definition of formal axioms
  • e.g. "It is not possible that the publishing date is earlier than the birth date of the author of the publication."
• enable deduction of new facts
  • e.g. "All men are mortal." "Socrates is a man." "Therefore Socrates is mortal."
• make implicitly given information explicit with the help of deduction and inference

Raphael: The School of Athens, 1510
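Both ideas on this slide, a formal axiom and the deduction of new facts, can be sketched in a few lines. The function names and the rule format (premise class, conclusion class) are assumptions chosen for this example; ontology reasoners implement far more expressive logics.

```python
from datetime import date


def violates_date_axiom(publication_date, author_birth_date):
    """Axiom check: a publication cannot appear before its author was born."""
    return publication_date < author_birth_date


def deduce(facts, rules):
    """Naive forward chaining.

    facts: set of (individual, class) pairs, e.g. ("Socrates", "Man").
    rules: list of (premise_class, conclusion_class), read as
           'every premise_class is a conclusion_class'.
    Returns the facts closed under the rules.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for individual, cls in list(known):
            for premise, conclusion in rules:
                if cls == premise and (individual, conclusion) not in known:
                    known.add((individual, conclusion))
                    changed = True
    return known


facts = {("Socrates", "Man")}
rules = [("Man", "Mortal")]  # "All men are mortal."
closure = deduce(facts, rules)
```

After running `deduce`, the set `closure` also contains `("Socrates", "Mortal")`, a fact that was only implicit in the input.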

Page 65: ISSLOD2011 - Semantic Multimedia

Multimedia Metadata Description Languages
• for time-based media
• annotation of temporal media fragments

2. Multimedia Metadata and Ontologies

[Figure: media timeline; axis: time]

Page 66: ISSLOD2011 - Semantic Multimedia

Multimedia Metadata Description Languages
• for media with spatial extent
• annotation of spatial media fragments

2. Multimedia Metadata and Ontologies

[Figure: image regions, each labelled "metadata"]

Page 67: ISSLOD2011 - Semantic Multimedia

Multimedia Metadata
• MPEG-7

The MPEG-7 standard, formally named "Multimedia Content Description Interface", provides a rich set of standard tools to describe multimedia content. Both human users and automatic systems that process audiovisual information are within the scope of MPEG-7.

• Components of the MPEG-7 Standard
  • MPEG-7 Systems
  • MPEG-7 Description Definition Language
  • MPEG-7 Visual
  • MPEG-7 Audio
  • MPEG-7 Multimedia Description Schemes (MDS)
  • MPEG-7 Reference Software
  • MPEG-7 Conformance
  • MPEG-7 Extraction and Use of Descriptions

2. Multimedia Metadata and Ontologies

Page 68: ISSLOD2011 - Semantic Multimedia

MPEG-7 Description of a Video Segment

<Mpeg7 xmlns="...">
  <Description xsi:type="ContentEntityType">
    ...
    <Video>
      <TemporalDecomposition>
        <VideoSegment>
          <CreationInformation>...</CreationInformation>
          <TextAnnotation>
            <KeywordAnnotation>
              <Keyword>mouse</Keyword>
            </KeywordAnnotation>
          </TextAnnotation>
          <MediaTime>
            <MediaTimePoint>T00:05:05:0F25</MediaTimePoint>
            <MediaDuration>PT00H00M31S0N25F</MediaDuration>
          </MediaTime>
        </VideoSegment>
      </TemporalDecomposition>
    </Video>
    ...
  </Description>
</Mpeg7>
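A description like this can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a simplified copy of the segment: the namespace and the elided parts are left out, which is an assumption made to keep the sample well-formed. The time values are read here as hours:minutes:seconds plus n fractions of 1/F second; the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Simplified sample without namespace declarations or elided elements.
SAMPLE = """<Mpeg7>
  <Description>
    <Video>
      <TemporalDecomposition>
        <VideoSegment>
          <TextAnnotation>
            <KeywordAnnotation>
              <Keyword>mouse</Keyword>
            </KeywordAnnotation>
          </TextAnnotation>
          <MediaTime>
            <MediaTimePoint>T00:05:05:0F25</MediaTimePoint>
            <MediaDuration>PT00H00M31S0N25F</MediaDuration>
          </MediaTime>
        </VideoSegment>
      </TemporalDecomposition>
    </Video>
  </Description>
</Mpeg7>"""

TIME_POINT = re.compile(r"T(\d+):(\d+):(\d+):(\d+)F(\d+)")
DURATION = re.compile(r"PT(\d+)H(\d+)M(\d+)S(\d+)N(\d+)F")


def _to_seconds(pattern, text):
    """Convert a matched time expression to seconds (n fractions of 1/f second)."""
    h, m, s, n, f = map(int, pattern.fullmatch(text.strip()).groups())
    return h * 3600 + m * 60 + s + n / f


def read_segments(xml_text):
    """Return keywords plus start and end time (in seconds) for each video segment."""
    root = ET.fromstring(xml_text)
    segments = []
    for seg in root.iter("VideoSegment"):
        keywords = [k.text for k in seg.iter("Keyword")]
        start = _to_seconds(TIME_POINT, seg.findtext("MediaTime/MediaTimePoint"))
        length = _to_seconds(DURATION, seg.findtext("MediaTime/MediaDuration"))
        segments.append({"keywords": keywords, "start": start, "end": start + length})
    return segments
```

For the sample, `read_segments(SAMPLE)` reports a single segment tagged "mouse" that starts 305 seconds into the video and lasts 31 seconds.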

2. Multimedia Metadata and Ontologies

Page 69: ISSLOD2011 - Semantic Multimedia

MPEG-7 Description of a Still Image

2. Multimedia Metadata and Ontologies

Page 70: ISSLOD2011 - Semantic Multimedia

MPEG-7 and the Semantic Web
• MDS Upper Layer represented in RDF(S) (2001: Hunter, later with link to ABC upper ontology)
• MDS fully represented in OWL-DL (2004: Tsinaraki et al., DS-MIRF model)
• MPEG-7 fully represented in OWL-DL (2005: Garcia & Celma, Rhizomik model)
• MDS and Visual Parts represented in OWL-DL (2007: Arndt et al., COMM model, re-engineering of MPEG-7 with DOLCE design patterns)

2. Multimedia Metadata and Ontologies

Page 71: ISSLOD2011 - Semantic Multimedia

Example: Tagging with an MPEG-7 Ontology

2. Multimedia Metadata and Ontologies

Reg1

• Localize a region → Draw a bounding box

• Annotate the content → Interpret the content → Tag "Astronaut"

:Reg1 foaf:depicts dbpedia:Astronaut
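The two steps on this slide, localizing a region and attaching a concept to it, can be sketched as follows. The `Region` class, the helper names, and the bounding-box coordinates are hypothetical; the output line follows the prefix notation used on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned bounding box inside an image (pixel coordinates)."""
    name: str
    x: int
    y: int
    width: int
    height: int


def contains(region, px, py):
    """True if the pixel (px, py) lies inside the bounding box."""
    return (region.x <= px < region.x + region.width
            and region.y <= py < region.y + region.height)


def depicts(region, concept):
    """Emit one annotation triple in the slide's prefix notation."""
    return f":{region.name} foaf:depicts {concept} ."


# Step 1: localize a region (coordinates are made up for this example).
reg1 = Region("Reg1", x=160, y=120, width=320, height=240)
# Step 2: annotate it with a concept.
statement = depicts(reg1, "dbpedia:Astronaut")
```

Keeping the geometry and the semantic statement separate mirrors the slide: the bounding box says where, the triple says what.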

Page 72: ISSLOD2011 - Semantic Multimedia

Example: Tagging with an MPEG-7 Ontology

2. Multimedia Metadata and Ontologies

[Figure: RDF graph for the tagged image. Labels recoverable from the diagram: mpeg7:image; mpeg7:depicts "Man on the Moon"; mpeg7:spatial_decomposition Reg1; Reg1 rdf:type mpeg7:StillRegion; Reg1 mpeg7:depicts dbpedia:Astronaut; mpeg7:SpatialMask; mpeg7:polygon; mpeg7:Coords]

Page 73: ISSLOD2011 - Semantic Multimedia

Media Fragment Identification

• Multimedia data has a temporal and a spatial dimension
• Pinpoint access to media fragments (on the web) with media fragment identifiers (W3C Media Fragments URI 1.0, Working Draft, July 2009)
• Simple examples:

http://www.example.com/example.ogg#track='audio'

http://www.example.com/example.ogg#track='audio'&t=10s,20s

http://www.example.com/example.ogg#track='video'&xywh=160,120,320,240

• Requires different handling of media data in HTTP client-server transactions

2. Multimedia Metadata and Ontologies
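
The example URIs above can be split into their track, temporal, and spatial dimensions with the standard library. This is only a sketch: the function name is hypothetical, and the parser simply tolerates the quoted track names and the trailing "s" used in the draft-style examples rather than implementing the full W3C grammar.

```python
from urllib.parse import urlsplit, parse_qsl

def parse_media_fragment(uri):
    """Split a media fragment URI into its base URI and a dict of dimensions."""
    parts = urlsplit(uri)
    dims = {}
    for name, value in parse_qsl(parts.fragment):
        value = value.strip("'")  # draft-style examples quote track names
        if name == "t":
            # temporal interval "start,end"; a trailing "s" (seconds) is tolerated
            start, _, end = value.partition(",")
            dims["t"] = tuple(float(x.rstrip("s")) if x else None
                              for x in (start, end))
        elif name == "xywh":
            # spatial region: x, y, width, height in pixels
            dims["xywh"] = tuple(int(x) for x in value.split(","))
        else:
            dims[name] = value
    base = parts._replace(fragment="").geturl()
    return base, dims

print(parse_media_fragment(
    "http://www.example.com/example.ogg#track='audio'&t=10s,20s"))
```

Because the fragment part of a URI is normally not sent to the server, a client has to translate these dimensions into a suitable request, which is the "different handling" the slide refers to.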

Page 74: ISSLOD2011 - Semantic Multimedia

Overview
(1) Multimedia and Semantics
(2) Multimedia Metadata and Ontologies
(3) Semantic Multimedia Analysis
(4) Semantic Multimedia Retrieval

Semantic Multimedia
Indian Summer School on Linked Data, Leipzig, 15 Sep. 2011

Page 75: ISSLOD2011 - Semantic Multimedia

How do we find something in a Multimedia Archive?

Page 76: ISSLOD2011 - Semantic Multimedia

How does Google find a video?

Page 77: ISSLOD2011 - Semantic Multimedia

How do you find something in an audiovisual archive?

Page 79: ISSLOD2011 - Semantic Multimedia

How do you find something in an audiovisual archive?

Step 1: Digitization of analogue data

Step 2: Annotation with (text-based) metadata

Page 80: ISSLOD2011 - Semantic Multimedia

• Manual annotation of AV-content with descriptive metadata

How do you find something in an audiovisual archive?

Page 81: ISSLOD2011 - Semantic Multimedia

...can this also be achieved in an automated way?

Page 90: ISSLOD2011 - Semantic Multimedia

Automated AV-Media Analysis

Automated content-based analysis is
• difficult (error-prone) and
• complex

Analysis tasks shown on the slide:
• Face Detection
• Overlay text
• Scene text
• Logo Detection
• Genre Analysis
• Audio Mining: structural analysis, transcription, speaker identification

Page 91: ISSLOD2011 - Semantic Multimedia

Automated AV-Media Analysis

• Result: Video segments with time-based metadata annotations
• Metadata consist of combined low-level / high-level feature descriptors
• Metadata serve as a basis for traditional and semantic retrieval

[Figure: metadata extraction along the time axis]
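
Time-based metadata annotations as described above can be pictured as labeled intervals on the video timeline. The following is a minimal sketch under that assumption; the class, the field names, and the example data are hypothetical and not taken from the lecture.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    start: float   # segment start in seconds
    end: float     # segment end in seconds (exclusive)
    level: str     # "low" (e.g. color feature) or "high" (e.g. recognized text)
    label: str

def annotations_at(annotations, t):
    """Return all annotations whose segment covers time t."""
    return [a for a in annotations if a.start <= t < a.end]

# Hypothetical example data for one video
track = [
    Annotation(0.0, 12.5, "high", "shot 1"),
    Annotation(3.0, 8.0, "high", "overlay text: news ticker"),
    Annotation(12.5, 30.0, "high", "shot 2"),
    Annotation(12.5, 30.0, "low", "dominant color: blue"),
]
print([a.label for a in annotations_at(track, 5.0)])
```

A retrieval system can use such a lookup to jump directly to the position in the video where a search term was found.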

Page 92: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia Analysis

[Figure: processing steps along the time axis]
• Video Analysis / Metadata Extraction
• Entity Recognition / Mapping
  • e.g., person xy, location yz, event abc
  • e.g., bibliographical data, geographical data, encyclopedic data, ...

Page 93: ISSLOD2011 - Semantic Multimedia

Some Examples of Automated Video Analysis

Page 94: ISSLOD2011 - Semantic Multimedia

• Structural Analysis
• Intelligent Character Recognition (ICR)
  • Character/Logo Detection
  • Character Filtering
  • Character Recognition
• Audio Analysis
  • Speaker Detection
  • Automated Speech Recognition (ASR)
• Genre Analysis / Categorization
  • graphic / real
  • indoor / outdoor
  • day / night
  • ...
• Face/Body Detection, Tracking & Clustering

Page 100: ISSLOD2011 - Semantic Multimedia

Structural Analysis

• Automated subdivision of AV media data by structural segmentation
• Subdivision of data streams into segments that are coherent in content

Segmentation hierarchy: video → scenes → shots → subshots → frames

Page 101: ISSLOD2011 - Semantic Multimedia

Structural Analysis: shots

• Shot Boundary Detection
• Identification of
  • Hard Cuts
  • Drop Outs
  • Soft Cuts, e.g., Dissolve, Wipe, Cross-Fade, etc.

Analytical Shot Boundary Detection
• Analysis of Luminance/Chrominance Histograms
• Analysis of Edge Distribution
• Analysis of Motion Vectors

Machine Learning
• Classification of Hard/Soft Cuts based on Image Features
  • K-Nearest Neighbor
  • Random Forest
  • Support Vector Machines

[Figures: Histogram Difference Analysis; Motion Vector Analysis]
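
The histogram-based variant of analytical shot boundary detection can be sketched in a few lines. This is an illustration only, not the detector used in the lecture: frames are assumed to be flat lists of 8-bit luminance values, and the bin count and threshold are hypothetical choices.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized luminance histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    return [c / len(frame) for c in counts]

def histogram_difference(h1, h2):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_hard_cuts(frames, threshold=1.0):
    """Return indices i where a cut is assumed between frame i-1 and frame i."""
    hists = [histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_difference(hists[i - 1], hists[i]) > threshold]

# Toy sequence: three dark frames followed by two bright frames
dark = [10, 20, 30, 25]
bright = [200, 220, 240, 250]
frames = [dark, dark, dark, bright, bright]
print(detect_hard_cuts(frames))
```

A fixed threshold works for hard cuts because the difference spikes within a single frame pair. Soft cuts spread the change over many frames, which is one reason the slide lists additional features and learned classifiers.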

Page 102: ISSLOD2011 - Semantic Multimedia

Structural Analysis: shots

• Shot Boundary Detection
• Identification of
  • Hard Cuts

[Figure: frame sequence 91927 to 91932]

Feature Analysis
• Luminance Histogram Difference
• Chrominance Histogram Difference
• Edge Distribution

Page 103: ISSLOD2011 - Semantic Multimedia

Structural Analysis: shots

• Shot Boundary Detection
• Identification of
  • Hard Cuts
  • Drop Outs

[Figure: Drop Out; Histogram/Chrominance Difference Analysis]

Page 104: ISSLOD2011 - Semantic Multimedia

Structural Analysis: shots

• Shot Boundary Detection
• Identification of
  • Hard Cuts
  • Drop Outs
  • Soft Cuts, e.g., Dissolve, Wipe, Cross-Fade, etc.

[Figure: Fade Out; Fade In]

Page 106: ISSLOD2011 - Semantic Multimedia

Automated AV-Media Analysis

• Face Detection, Face Clustering, Face Tracking
• Character Detection, Character Recognition
• Logo Detection
• Genre Detection

Page 107: ISSLOD2011 - Semantic Multimedia

Automated AV-Media Analysis

Intelligent Character Recognition
• Preprocessing
  • Character Identification
  • Text Preprocessing
    • Text Filtering
    • Adaptation of script geometry (Deskew)
    • Image quality enhancement
• Optical Character Recognition (OCR)
  • Standard OCR software (OCRopus)
• Postprocessing
  • Lexical analysis
  • Statistical / context-based filtering

[Figure: example overlay text "Ermittlungen nach Bombenfunden" (German: "Investigations after bomb finds")]
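
The lexical-analysis part of the postprocessing step can be illustrated with a simple lexicon lookup. This sketch uses Python's difflib for approximate string matching, which is a stand-in and not the method from the lecture; the misrecognized tokens, the lexicon, and the cutoff value are hypothetical.

```python
import difflib

def correct_tokens(tokens, lexicon, cutoff=0.75):
    """Replace each OCR token by its closest lexicon entry, if one is similar enough."""
    corrected = []
    for token in tokens:
        matches = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected

# Hypothetical OCR output with typical confusions ("l" read as "1", "d" as "cl")
lexicon = ["Ermittlungen", "nach", "Bombenfunden"]
ocr_output = ["Ermitt1ungen", "nach", "Bombenfunclen"]
print(correct_tokens(ocr_output, lexicon))
```

A purely lexical correction like this ignores the surrounding words; the statistical / context-based filtering named on the slide would additionally rank candidates by how well they fit their neighbors.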

Page 108: ISSLOD2011 - Semantic Multimedia

Intelligent Character Recognition

• Preprocessing
  • Character Identification

Filtering
• Local Binary Patterns (LBP)
• Histogram of Oriented Gradients
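
For readers unfamiliar with Local Binary Patterns, the basic 3x3 operator can be sketched as follows. The neighbor ordering (clockwise from the top-left pixel, least significant bit first) is one common convention and an assumption here; the slide does not specify a variant.

```python
# Neighbor offsets (row, col), clockwise starting at the top-left pixel
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, row, col):
    """8-bit LBP code of the pixel at (row, col); image is a list of rows of gray values."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBORS):
        # a neighbor at least as bright as the center sets its bit
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(patch, 1, 1))  # 241
```

A histogram of these codes over an image region describes its local texture, which is what makes the feature usable for separating text regions from non-text regions.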

Page 112: ISSLOD2011 - Semantic Multimedia

Original Image Bounding Box

Intelligent Character Recognition

Page 113: ISSLOD2011 - Semantic Multimedia

Advanced Image Enhancement

Intelligent Character Recognition

Page 114: ISSLOD2011 - Semantic Multimedia

Standard OCR (OCRopus)

Intelligent Character Recognition

Page 115: ISSLOD2011 - Semantic Multimedia

Context-based Spell Correction

Intelligent Character Recognition

Page 116: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia Analysis

Page 117: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia Analysis

[Figure: processing steps along the time axis]
• Video Analysis / Metadata Extraction
• Entity Recognition / Mapping
  • e.g., person xy, location yz, event abc
  • e.g., bibliographical data, geographical data, encyclopedic data, ...

Page 124: ISSLOD2011 - Semantic Multimedia

Named Entity Recognition

Main Problem in NER: Ambiguity of Terms

Example: "Jaguar" in different contexts

[Figure: "jaguar" together with the context terms "rainforest", "Steve McQueen", and "apple"]

Context matters!

Page 127: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia Analysis

Named Entity Recognition
• Mapping keyterms (text) to semantic entities
• Context Analysis and Disambiguation

Keyterm / User Tag: Jaguar

Semantic Entities (candidates):
• Jaguar (Car)?
• Jaguar (Cat)?
• Jaguar (OS)?
• Jaguar (Aircraft)?

Page 128: ISSLOD2011 - Semantic Multimedia

RDF graph to find relations between entities co-occurring in a text, maintaining the hypothesis that disambiguation of co-occurring elements in a text can be obtained by finding connected elements in an RDF graph [7]. In order to take into account the special compilation of non-textual data, static and user-generated metadata in audio-visual content, our novel approach combines the use of semantic technologies and Linked Data with linguistic methods.

III. METHOD

According to a study about the structure and characteristics of folksonomy tags [8], an average of 83% of user-generated tags are single terms. Also, an average of 82% of the reviewed tags are nouns. Based on these study results, we ignore tag practices such as camel case ("barackObama") and treat tags as subjects or categories describing a resource. As a tag could also be part of a group of nouns representing an entity or a name ("flying machine", "albert einstein"), the tags, which are stored as single words without any given order, have to be combined into term groups of two or more terms to find all appropriate entities. Hence, every tag or group of tags within a given context may represent a distinct entity. The term combination process and the subsequent mapping of terms and term groups to entities are described in sect. III-B.

To disambiguate ambiguous terms we combine two methods: a co-occurrence analysis of the context terms in Wikipedia articles and an analysis of the page link graph of the Wikipedia articles of the entity candidates. The scores of both analysis steps are combined into a total score.
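
The idea of combining two evidence sources into one score can be pictured with a small sketch. The excerpt does not state how the two scores are computed or weighted, so everything below is an assumption: the function name, the unweighted sum, the link score as a fraction of linked context entities, and all example data are hypothetical.

```python
def total_scores(candidates, cooccurrence, link_graph, context_entities):
    """Score each candidate by a co-occurrence score plus a link-graph score.

    cooccurrence: candidate -> precomputed co-occurrence score (assumed given)
    link_graph:   candidate -> set of entities its article links to
    """
    scores = {}
    for cand in candidates:
        links = link_graph.get(cand, set())
        # fraction of context entities that the candidate's article links to
        link_score = len(links & context_entities) / max(len(context_entities), 1)
        scores[cand] = cooccurrence.get(cand, 0.0) + link_score
    return scores

# Hypothetical data for the tag "jaguar" in a nature-documentary context
candidates = ["Jaguar (Cat)", "Jaguar (Car)"]
cooccurrence = {"Jaguar (Cat)": 0.8, "Jaguar (Car)": 0.1}
link_graph = {
    "Jaguar (Cat)": {"Rainforest", "Predator"},
    "Jaguar (Car)": {"Coventry"},
}
context_entities = {"Rainforest", "Amazon River"}

scores = total_scores(candidates, cooccurrence, link_graph, context_entities)
print(max(scores, key=scores.get))
```

With this toy data the cat wins because it both co-occurs with the context terms and links to one of the context entities.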

A. Context Definition

Metadata exists in a certain context and has to be interpreted according to this context. For tags of audio-visual content we identified two dimensions:

• temporal dimension
• user-centered dimension

In the temporal dimension a context can be defined as the entire video, a segment, or a single timestamp in the video. The user-centered dimension classifies a context by how many users created the metadata concerned: only tags by a certain user, or all tags regardless of user. Fig. 1 shows the combinations of the two context dimensions for metadata in audio-visual content and their interpretation regarding the significance of a context.

Audio-visual content also provides the opportunity to supply spatial information. Thus, tags in the same region of a video frame are considered as related to each other. In the current approach we did not consider this context dimension.

To describe our approach we use a sample context of our test set (see sect. IV). This sample context is composed of tags by only one user at a certain timestamp in the video. The video containing this sample context is a presentation by Dr. Garik Israelian at the TED conference (footnote 3) entitled "How spectroscopy could reveal alien life" (footnote 4). Our sample context consists of the tags "hubble", "spitzer", "carbon", "dioxide", "methan", "co2", and "water".

Figure 1. Dimensions of context definition in audio-visual content

B. Preprocessing

Term Combination: Our combination algorithm takes all tags of a specified spatio-temporal context (at a certain timestamp or in a certain segment of a video, or of a single URL/image) and generates every possible combination of at most three terms of the context in every possible order. In that way we make sure to recover groups of single terms that belong together. We chose to generate combinations of up to three words to make sure to also hit named entities consisting of more than two words, such as "public key cryptography" or "alberto santos dumont". About 90% of the DBpedia [9] labels consist of at most three words, but less than 5% consist of 4 words. Due to these numbers and performance issues we decided to limit the number of terms to be combined to three. In the remainder of this paper, "terms" refers to single terms as well as generated term groups. The number c of combinations for n tags and at most j terms per combination is calculated by

c = \sum_{k=1}^{j} \frac{n!}{(n-k)!}

For our sample context containing 7 tags (n = 7) and at most 3 terms in a combination (j = 3), 259 combinations are generated.
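
The count of 259 can be checked directly: ordered groups of k distinct tags are k-permutations, so 7 + 42 + 210 = 259. The following sketch enumerates them with itertools; the function names are hypothetical, and joining the terms with a space is an assumption about how term groups are formed.

```python
from itertools import permutations
from math import factorial

def term_groups(tags, max_terms=3):
    """All ordered groups of 1..max_terms distinct tags, joined by spaces."""
    return [" ".join(p)
            for k in range(1, max_terms + 1)
            for p in permutations(tags, k)]

def count_combinations(n, j):
    """Closed form: c = sum over k = 1..j of n! / (n - k)!"""
    return sum(factorial(n) // factorial(n - k) for k in range(1, j + 1))

tags = ["hubble", "spitzer", "carbon", "dioxide", "methan", "co2", "water"]
groups = term_groups(tags)
print(len(groups), count_combinations(len(tags), 3))  # 259 259
print("carbon dioxide" in groups)                     # True
```

The enumeration shows why the limit of three terms matters: a fourth term would add another 840 groups for the same seven tags.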

Term Mapping: The terms then have to be mapped to semantic entities. For our approach we use entities of the Linked Open Data Cloud [10], in particular of DBpedia, version 3.5.1.

DBpedia provides labels for the identification of distinct entities in 92 languages. We use English and German as well as Finnish labels, as we noticed that neither the English nor the German labels contain important acronyms, but the Finnish language version does. As tagging users prefer to keep it simple and short [2], resources dealing with "Domain Name System" would rather be tagged with "DNS" than with "Domain Name System".

After simple string matching of the terms of the context to DBpedia URIs, the URIs are revised for redirects and

3 http://www.ted.com
4 http://yovisto.com/play/14415

Context Analysis and Disambiguation
What defines a Context in AV-Data?

• Temporal Coherence
• Spatial Coherence
• Provenance

Semantic Multimedia Analysis

Page 129: ISSLOD2011 - Semantic Multimedia

Context Analysis and Disambiguation
What defines a Context in AV-Data?

• Temporal Coherence
• Spatial Coherence
• Provenance

Semantic Multimedia Analysis

Spatial Dimension

Page 130: ISSLOD2011 - Semantic Multimedia

Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering, Indian Summer School on Linked Data, Leipzig, 12-18. Sep. 2011

RDF graph to find relations between entities co-occurringin a text maintaining the hypothesis that disambiguationof co-occurring elements in a text can be obtained byfinding connected elements in an RDF graph [7]. In orderto regard the special compilation of non-textual data, staticand user-genrated metadata in audio-visual content our novelapproach combines the use of semantic technologies andLinked Data with linguistic methods.

III. METHOD

According to a study about structure and characteristicsof folksonomy tags [8] an average of 83% of user-generatedtags are single terms. Also, an average of 82% of thereviewed tags are nouns. Based on these study results, weignore tag practices, such as camel case (”barackObama”)and treat tags as subjects or categories describing a resource.As a tag could also be part of a group of nouns representingan entity or a name (”flying machine”,”albert einstein”) thetags stored as single words without any given order have tobe combined in term groups of two or more terms to findall appropriate entities. Hence, every tag or group of tagswithin a given context may represent a distinct entity. Theterm combination process and subsequent mapping of termsand term groups to entities are described in sect. III-B.

To disambiguate ambiguous terms we combine two meth-ods: a co-occurences analysis of the terms in the context inWikipedia articles and an analysis of the page link graph ofthe Wikipedia articles of entity candidates. The scores forboth analysis steps are calculated to a total score.

A. Context Definition

Metadata exists in a certain context and has to be inter-preted according to this context. For tags of audio-visualcontent we identified two dimensions:

• temporal dimension• user-centered dimensionIn the temporal dimension a context can be defined as the

entire video, a segment or a single timestamp in the video.The user-centered dimension classifies a context by howmany users created the concerning metadata - only tags by acertain user or all tags regardless of which user. Fig. 1 showsthe combinations of the two dimensions of contexts formetadata in audio-visual content the interpretation regardingthe significance of a context.

Audio-visual content also provides the opportunity tosupply spatial information. Thus, tags in the same regionof a video frame are considered as related to each other.In the current approach we did not consider this contextdimension.

To describe our approach we use a sample context of ourtest set (see sect. IV). This sample context is composed oftags by only one user at a certain timestamp in the video.The video containing this sample context is a presentation

Figure 1. Dimensions of context definition in audio-visual content

by Dr. Garik Israelian at the TED conference3 entitled ”Howspectroscopy could reveal alien life”4. Our sample contextconsists of the tags ”hubble”, ”spitzer”, ”carbon”, ”dioxide”,”methan”, ”co2”, and ”water”.

B. Preprocessing

Term Combination: Our combination algorithm takes all tags of a specified spatio-temporal context (at a certain timestamp or in a certain segment of a video, or of a single URL/image) and generates every possible combination of at most three terms of the context in every possible order. In that way we make sure to recover groups of single terms that belong together. We chose to generate combinations of up to three words to make sure to also hit named entities consisting of more than two words, such as "public key cryptography" or "alberto santos dumont". About 90% of the DBpedia [9] labels consist of at most three words, but less than 5% consist of 4 words. Due to these numbers and performance issues we decided to limit the number of terms to be combined to three. In the remainder of this paper, "terms" refers to single terms as well as generated term groups. The number c of combinations is calculated by

$c = \sum_{k=1}^{j} \frac{n!}{(n-k)!}$

where n is the number of tags in the context and j is the maximum number of terms in a combination. For our sample context containing 7 tags and at most 3 terms in a combination (j = 3), this gives 7 + 42 + 210 = 259 combinations.
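The term combination step can be sketched as follows. This is a minimal illustration using the standard library, not the authors' implementation; the function names `term_combinations` and `combination_count` are assumptions. It generates every ordered arrangement of one to three tags and checks the count against the formula.

```python
from itertools import permutations
from math import factorial


def term_combinations(tags, max_len=3):
    """Return every ordered arrangement of 1..max_len distinct tags as a string."""
    combos = []
    for k in range(1, max_len + 1):
        for perm in permutations(tags, k):
            combos.append(" ".join(perm))
    return combos


def combination_count(n, j=3):
    """Evaluate c = sum_{k=1..j} n! / (n-k)! for n tags."""
    return sum(factorial(n) // factorial(n - k) for k in range(1, min(j, n) + 1))


sample_context = ["hubble", "spitzer", "carbon", "dioxide", "methan", "co2", "water"]
terms = term_combinations(sample_context)

print(len(terms), combination_count(len(sample_context)))
print("carbon dioxide" in terms)
```

Because every order is generated, both "carbon dioxide" and "dioxide carbon" appear among the terms; only the former is expected to match a label in the mapping step.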

Term Mapping: The terms then have to be mapped to semantic entities. For our approach we use entities of the Linked Open Data Cloud [10], in particular of DBpedia, version 3.5.1.

DBpedia provides labels for the identification of distinct entities in 92 languages. We use English and German as well as Finnish labels, as we noticed that neither the English nor the German labels contain important acronyms, but the Finnish language version does. As tagging users prefer to keep it simple and short [2], resources dealing with the "Domain Name System" would rather be tagged with "DNS" than with "Domain Name System".

After simple string matching of the terms of the context to DBpedia URIs, the URIs are revised for redirects and

³ http://www.ted.com
⁴ http://yovisto.com/play/14415

Context Analysis and Disambiguation
What defines a Context in AV-Data?

• Temporal Coherence
• Spatial Coherence
• Provenance

Semantic Multimedia Analysis

Temporal Dimension

Spatial Dimension

Page 131: ISSLOD2011 - Semantic Multimedia

Context Analysis and Disambiguation
What defines a Context in AV-Data?

• Temporal Coherence
• Spatial Coherence
• Provenance

Semantic Multimedia Analysis

User-centered Dimension

Temporal Dimension

Spatial Dimension

Page 132: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia Analysis

NER Custom Workflow
• Preprocessing
  □ Term Combination
  □ Term Mapping
• Entity Candidate Disambiguation
  □ Co-Occurrence Analysis
  □ Semantic Graph Analysis
  □ Score Calculation

Input tags: 1956, Steve, jaguar, McQueen, rim, wheel

Mapped entities:
• Steve McQueen → ../resource/Steve_McQueen
• jaguar → ../resource/Jaguar_Cars
• wheel rim → ../resource/rim_(wheel)
• 1956 → ../resource/1956

Page 133: ISSLOD2011 - Semantic Multimedia

Annotation on the workflow diagram: only if no spatial information for composite terms is available

Page 134: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia Analysis
Named Entity Recognition Workflow

Term Combination

Tags: 1956, Steve, wheel, jaguar, McQueen, rim

Page 135: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia Analysis
Named Entity Recognition Workflow

Term Combination

Combined terms: 1956, Steve McQueen, wheel rim, jaguar

Page 136: ISSLOD2011 - Semantic Multimedia

Semantic Multimedia Analysis
Named Entity Recognition Workflow

Assigning Entity Candidates

Terms: 1956, Steve McQueen, wheel rim, jaguar

Candidate counts shown on the slide: 7, 2, 36, and 1 entity candidates (the transcript does not preserve which count belongs to which term)
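Assigning entity candidates amounts to looking up each term in an index from labels to entities. The sketch below assumes a tiny hand-written index; the name `LABEL_INDEX`, the function `assign_candidates`, and the index contents are illustrative assumptions and are far smaller than the real DBpedia label set.

```python
# Toy label index: lowercase label -> candidate entities (illustrative only).
LABEL_INDEX = {
    "1956": ["dbpedia:1956"],
    "steve mcqueen": ["dbpedia:Steve_McQueen"],
    "wheel rim": ["dbpedia:rim_(wheel)"],
    "jaguar": ["dbpedia:Jaguar_Cars", "dbpedia:Jaguar_(Cats)"],
}


def assign_candidates(terms, label_index):
    """Map each term that matches a label to its list of candidate entities."""
    candidates = {}
    for term in terms:
        uris = label_index.get(term.lower())
        if uris:
            candidates[term] = list(uris)
    return candidates


# A few of the generated terms, including orderings that match no label.
terms = ["1956", "Steve", "McQueen", "Steve McQueen", "McQueen Steve",
         "wheel rim", "rim wheel", "jaguar"]
candidates = assign_candidates(terms, LABEL_INDEX)

for term, uris in candidates.items():
    print(term, len(uris), uris)
```

Terms with exactly one candidate are already resolved; terms with several candidates, such as "jaguar", go on to the disambiguation steps.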

Page 140: ISSLOD2011 - Semantic Multimedia

Cooccurrence Analysis

"jaguar" → http://dbpedia.org/resource/Jaguar_(Cats)
context tags: 1956, wheel rim, steve mcqueen
score: 0.00

Page 144: ISSLOD2011 - Semantic Multimedia

Cooccurrence Analysis

"jaguar" → http://dbpedia.org/resource/Jaguar_Cars
context tags: 1956, wheel rim, steve mcqueen
score: 0.87
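The idea behind the co-occurrence step can be sketched as follows. The scoring function is an assumption: it takes the fraction of context tags that occur in the text of a candidate's article, which is not necessarily the formula behind the scores on the slides. The article texts are made-up placeholders, so the resulting numbers will not reproduce 0.00 and 0.87.

```python
def cooccurrence_score(article_text, context_terms):
    """Fraction of context terms that occur in the article text (assumed measure)."""
    if not context_terms:
        return 0.0
    text = article_text.lower()
    hits = sum(1 for term in context_terms if term.lower() in text)
    return hits / len(context_terms)


# Placeholder article texts, NOT real Wikipedia content.
ARTICLES = {
    "dbpedia:Jaguar_(Cats)": "toy article about a large cat living in forests",
    "dbpedia:Jaguar_Cars": "toy article about a car maker that mentions 1956 and Steve McQueen",
}

context_terms = ["1956", "wheel rim", "steve mcqueen"]
scores = {uri: cooccurrence_score(text, context_terms)
          for uri, text in ARTICLES.items()}
best = max(scores, key=scores.get)

print(scores)
print(best)
```

The candidate whose article shares more terms with the context receives the higher score, which is the behaviour the two slides illustrate for the "jaguar" tag.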

Page 145: ISSLOD2011 - Semantic Multimedia

Semantic Graph Analysis

Keyterm / User Tag: jaguar
Context: 1956, Steve, jaguar, McQueen, rim, wheel
LOD Cloud: Jaguar (Car), Jaguar (Cat), Jaguar (OS), Steve McQueen, 1956
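A link-graph analysis of this kind can be sketched with a toy graph. The scoring rule is an assumption: a candidate scores one point for every context entity it is directly linked with, in either direction. The graph `PAGE_LINKS` is hand-written for illustration and is not taken from Wikipedia or DBpedia.

```python
# Toy page-link graph: node -> set of nodes it links to (illustrative only).
PAGE_LINKS = {
    "Jaguar (Car)": {"Steve McQueen", "1956"},
    "Jaguar (Cat)": set(),
    "Jaguar (OS)": set(),
    "Steve McQueen": {"Jaguar (Car)"},
    "1956": set(),
}


def link_score(candidate, context_entities, links):
    """Count context entities directly linked with the candidate (assumed measure)."""
    outgoing = links.get(candidate, set())
    score = 0
    for entity in context_entities:
        if entity == candidate:
            continue
        if entity in outgoing or candidate in links.get(entity, set()):
            score += 1
    return score


context_entities = ["Steve McQueen", "1956"]
jaguar_candidates = ["Jaguar (Car)", "Jaguar (Cat)", "Jaguar (OS)"]
link_scores = {c: link_score(c, context_entities, PAGE_LINKS)
               for c in jaguar_candidates}
winner = max(link_scores, key=link_scores.get)

print(link_scores)
print(winner)
```

In a full pipeline this score would be combined with the co-occurrence score into a total score; how the two are weighted is not specified here.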

Page 146: ISSLOD2011 - Semantic Multimedia

Overview
(1) Multimedia and Semantics
(2) Multimedia Metadata and Ontologies
(3) Semantic Multimedia Analysis
(4) Semantic Multimedia Retrieval

Semantic Multimedia
Indian Summer School on Linked Data, Leipzig, 15 Sep. 2011

Page 147: ISSLOD2011 - Semantic Multimedia

Searching is not always just searching...

Page 148: ISSLOD2011 - Semantic Multimedia

Searching is not always just searching

a simple example:

"I'm looking for a book by Ernest Hemingway with the title 'For Whom the Bell Tolls' in the first German edition..."

Page 149: ISSLOD2011 - Semantic Multimedia

Wem die Stunde schlägt. - Ernest H E M I N G W A Y. (Stockholm usw., Bermann-Fischer Verlag, 1941) 560 S. 8“

II 1, 2506, 34548

Searching is not always just searching

Page 150: ISSLOD2011 - Semantic Multimedia

...but what if...

"I really liked the book 'For Whom the Bell Tolls', but I have no idea what I should read next..."

Page 152: ISSLOD2011 - Semantic Multimedia

Exploratory Search

• What if the user does not know which query string to use?
• What if the user is looking for complex answers?
• What if the user does not know the domain he/she is looking for?
• What if the user wants to know all(!) about a specific topic?

• ... 'browsing' instead of 'searching'
• ... to find something by chance
• ... serendipitous findings
• ... to get an overview
• ... to enable content-based navigation

Page 153: ISSLOD2011 - Semantic Multimedia

How to implement an exploratory search?

Page 154: ISSLOD2011 - Semantic Multimedia

Semantic Video Search

• time
• Video Analysis / Metadata Extraction
• Entity Recognition / Mapping: e.g., person xy, location yz, event abc
• e.g., bibliographical data, geographical data, encyclopedic data, ...

Page 155: ISSLOD2011 - Semantic Multimedia

Data is a precious thing and will last longer than the systems themselves. (Tim Berners-Lee) http://linkeddata.org/

The Web of Data - The Semantic Web

Page 156: ISSLOD2011 - Semantic Multimedia

http://dbpedia.org/

Page 157: ISSLOD2011 - Semantic Multimedia

DBpedia - the Semantic Wikipedia

dbpedia:For_Whom_the_Bell_Tolls
http://dbpedia.org/page/For_Whom_the_Bell_Tolls

What facts for dbpedia:For_Whom_the_Bell_Tolls are relevant?
... use heuristics

Page 158: ISSLOD2011 - Semantic Multimedia

Exploratory Search

dbpedia:For_Whom_the_Bell_Tolls -- dbpedia-owl:author --> dbpedia:Ernest_Hemingway

Page 161: ISSLOD2011 - Semantic Multimedia

Exploratory Search

dbpedia:For_Whom_the_Bell_Tolls -- dbpedia-owl:author --> dbpedia:Ernest_Hemingway

Three further dbpedia-owl:author edges are shown; the resources they connect are not preserved in the transcript.

Page 165: ISSLOD2011 - Semantic Multimedia

Exploratory Search

dbpedia:For_Whom_the_Bell_Tolls -- dbpedia-owl:author --> dbpedia:Ernest_Hemingway

dbpedia-owl:influenced_by edges connect dbpedia:Ernest_Hemingway with:
• dbpedia:Raymond_Carver
• dbpedia:Jack_Kerouac
• dbpedia:Jerome_D._Salinger

Page 169: ISSLOD2011 - Semantic Multimedia

Exploratory Search

Each of the three authors has a dbpedia-owl:notableWork edge:
• dbpedia:Jack_Kerouac
• dbpedia:Raymond_Carver
• dbpedia:Jerome_D._Salinger
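The navigation path built up over the preceding slides (book, its author, authors connected via influenced_by, their notable works) can be sketched as a traversal over a small set of triples. Everything here is an illustrative assumption: the triples are hand-written rather than queried from DBpedia, the influenced_by edges are assumed to point from the influenced author to Hemingway, and the `example:Work_*` resources are placeholders because the slides do not name the works.

```python
AUTHOR = "dbpedia-owl:author"
INFLUENCED_BY = "dbpedia-owl:influenced_by"
NOTABLE_WORK = "dbpedia-owl:notableWork"

# Hand-written toy triples (subject, predicate, object).
TRIPLES = [
    ("dbpedia:For_Whom_the_Bell_Tolls", AUTHOR, "dbpedia:Ernest_Hemingway"),
    ("dbpedia:Raymond_Carver", INFLUENCED_BY, "dbpedia:Ernest_Hemingway"),
    ("dbpedia:Jack_Kerouac", INFLUENCED_BY, "dbpedia:Ernest_Hemingway"),
    ("dbpedia:Jerome_D._Salinger", INFLUENCED_BY, "dbpedia:Ernest_Hemingway"),
    ("dbpedia:Jack_Kerouac", NOTABLE_WORK, "example:Work_A"),        # placeholder
    ("dbpedia:Raymond_Carver", NOTABLE_WORK, "example:Work_B"),      # placeholder
    ("dbpedia:Jerome_D._Salinger", NOTABLE_WORK, "example:Work_C"),  # placeholder
]


def objects(subject, predicate):
    """All o with (subject, predicate, o) in the toy graph."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """All s with (s, predicate, obj) in the toy graph."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def suggest_next_reads(book):
    """Follow author -> influenced_by -> notableWork starting from a book."""
    suggestions = set()
    for author in objects(book, AUTHOR):
        for related_author in subjects(INFLUENCED_BY, author):
            suggestions.update(objects(related_author, NOTABLE_WORK))
    return sorted(suggestions)


suggestions = suggest_next_reads("dbpedia:For_Whom_the_Bell_Tolls")
print(suggestions)
```

Choosing which properties to follow is the heuristic part: author, influenced_by and notableWork are plausible for books, while other resource types would call for other property paths.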

Page 170: ISSLOD2011 - Semantic Multimedia

...and what does an exploratory search look like?

Page 171: ISSLOD2011 - Semantic Multimedia

http://mediaglobe.yovisto.com:8080

Page 172: ISSLOD2011 - Semantic Multimedia

Semantic Search Technologies
Exploratory Search in Audiovisual Data

J. Waitelonis, H. Sack, Z. Kramer, J. Hercher: Semantically Enabled Exploratory Video Search, in Proc. of Semantic Search Workshop (SemSearch10) at the 19th Int. World Wide Web Conference (WWW2010), 26-30 April 2010, Raleigh, NC, USA, 2010.

Page 180: ISSLOD2011 - Semantic Multimedia

Overview
(1) Multimedia and Semantics
(2) Multimedia Metadata and Ontologies
(3) Semantic Multimedia Analysis
(4) Semantic Multimedia Retrieval

Semantic Multimedia
Indian Summer School on Linked Data, Leipzig, 15 Sep. 2011

Page 181: ISSLOD2011 - Semantic Multimedia

Contact:
Dr. Harald Sack
Hasso-Plattner-Institut für Softwaresystemtechnik
Universität Potsdam
Prof.-Dr.-Helmert-Str. 2-3
D-14482 Potsdam

Homepage: http://www.hpi.uni-potsdam.de/meinel/team/sack.html
http://www.yovisto.com/
Blog: http://moresemantic.blogspot.com/
E-Mail: [email protected]
[email protected]: lysander07 / biblionomicon / yovisto

Thank you for your attention!