convergence of multimedia and knowledge technologies · metadata generation & representation...

Convergence of multimedia and knowledge technologies

aceMedia, AIM@SHAPE, BOEMIE, MESH, X-Media, K-Space, VITALAS and VICTORY

Practitioner Day CIVR 2007

Yiannis Kompatsiaris
Multimedia Knowledge Laboratory

CERTH - Informatics and Telematics Institute


Outline
• Introduction
• Content - applications
• A common view
• Multimedia Ontologies
• Analysis
• Reasoning
• Retrieval
• Common issues
• Dissemination
• Conclusions



[Figure: from multimedia content to applications. Labels: Multimedia Content (clapperboard: DIRECTOR, SCENE, TAKE, TITLE); Networks; Storage & Devices; Segmentation; KA Analysis; Labeling; Cross-media analysis; Context; Reasoning; Metadata Generation & Representation; Content adaptation and distribution - Multiple Terminal & Networks; Hybrid / Content-based retrieval, recommendations and personalization; Semantic technology in Markets; Web 2.0 photo - video applications.]



Need for annotation + metadata

“The value of information depends on how easily it can be found, retrieved, accessed, filtered or managed in an active, personalized way”



Content - Applications

[Figure columns: Content; Knowledge Extraction; Applications. Labels: 3D; Industrial; Personal; Sports; Semantic Desktop; Retrieval; News; Commercial; Personalization; Mobile; Fashion News.]





[Figure labels: Personal; Personalization; Mobile; Retrieval; Commercial.]




Content distribution and adaptation

Sharing

ACE concept

Actionable content





[Figure labels: Retrieval; News; Personalization; Mobile.]




News Syndication

Multi-National & Local news providers





[Figure labels: Industrial; Semantic Desktop; Retrieval.]




Large-Scale content

Process support

Industry content





[Figure labels: 3D; Retrieval.]



Content - Applications
Maximise automation of the shape knowledge lifecycle

[Figure: shape knowledge lifecycle.
From raw data to semantics (extracting semantic content): raw data; geometric model; structural model; semantic model.
From semantics to model (embedding semantic content): conceptual sketch; shape facets and semantic mapping; semantically structured model; geometric model.]





[Figure labels: Sports; Retrieval.]




Automatic semantic annotation of digital maps

[Figure labels: MULTIMEDIA CONTENT; Content Collection (crawlers, spiders, etc.); SEMANTICS EXTRACTION; FROM VISUAL CONTENT; FROM NON-VISUAL CONTENT; FROM FUSED CONTENT; SEMANTICS EXTRACTION RESULTS; ONTOLOGY EVOLUTION; POPULATION & ENRICHMENT; COORDINATION; INITIAL ONTOLOGY; INTERMEDIATE ONTOLOGY; EVOLVED ONTOLOGY; OTHER ONTOLOGIES; EVENTS DATABASE; MAPS DATABASE; MAP ANNOTATION INTERFACE.]





[Figure labels: Personal; Sports; Retrieval; News.]




K-Space
Network of Excellence
Emphasis on integration of research activities

[Figure labels: R&D; Dissemination; Fellowships; Integration.]





[Figure labels: Retrieval; News; Personalization; Fashion News.]

Large-Scale

Real use cases





[Figure labels: 3D; Retrieval; Mobile.]

P2P and Mobile



Knowledge Extraction: A common view

[Figure blocks: Manual Annotation - Models; Single Modality Analysis; Additional Analysis Information; Semantic Analysis; Knowledge Infrastructure (Multimedia Ontology). Levels: Implicit Knowledge, Signal Level; Explicit Knowledge – Logic - Semantics & Hybrid Level.]



Knowledge Extraction: A common view

• Single Modality Analysis
  • Feature extraction
  • Text, Image analysis
  • Segmentation, SVMs
  • Evidence generation: “Vehicle”, “Building”
• Additional Analysis Information
  • Classifiers fusion
  • Global vs. Local
  • Modalities fusion
  • Context: “Ambulance”
• Semantic Analysis
  • Reasoning
  • Fusion of annotations
  • Consistency checking
  • Higher-level concepts/events: “Emergency scene”
• Manual Annotation - Models
  • Multimedia content annotation tools
  • Training
  • (Statistical) Modeling
  • Domain
  • Multimedia content
  • Annotations
  • Algorithms - Features
  • Context
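The progression from “Vehicle” and “Building” through “Ambulance” to “Emergency scene” can be illustrated with a minimal Python sketch. It is not taken from any of the projects: the `siren` context cue, the two rule tables and the confidence threshold are assumptions made only for illustration, and the single-modality stage is mocked with fixed detector outputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    label: str
    confidence: float


# Hypothetical refinement rule: (generic label, context cue) -> specific label.
CONTEXT_RULES = {("Vehicle", "siren"): "Ambulance"}

# Hypothetical domain rule: set of required labels -> higher-level event.
EVENT_RULES = {frozenset({"Ambulance", "Building"}): "Emergency scene"}


def apply_context(evidence, cues):
    """Refine generic labels using context cues (additional analysis information)."""
    refined = []
    for ev in evidence:
        label = ev.label
        for cue in cues:
            label = CONTEXT_RULES.get((ev.label, cue), label)
        refined.append(Evidence(label, ev.confidence))
    return refined


def infer_events(evidence, threshold=0.5):
    """Derive higher-level events from sufficiently confident labels (semantic analysis)."""
    labels = {ev.label for ev in evidence if ev.confidence >= threshold}
    return [event for required, event in EVENT_RULES.items() if required <= labels]


# Single-modality analysis is mocked here with fixed detector outputs.
detections = [Evidence("Vehicle", 0.8), Evidence("Building", 0.7)]
refined = apply_context(detections, cues=["siren"])
events = infer_events(refined)
print([ev.label for ev in refined], events)
```

Without the context cue no event is inferred, which mirrors the idea that signal-level evidence alone does not reach the higher-level concept.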



Use of ontologies

• Metadata representation
  • Annotation
  • Interoperability
• Reasoning
  • Extracting higher-level annotations
  • Consistency checking
  • Fusion
• Ontology-driven analysis
• Retrieval
• Personalization



Multimedia Ontologies

• Multimedia content structure
  • aceMedia (MPEG-7, RDF), AIM@SHAPE (3D content)
• Multimodality
  • MESH (OWL), BOEMIE (OWL-DL)
  • K-Space, X-Media (COMM, OWL, DOLCE)
• Fuzziness
  • K-Space, X-Media (Fuzzy-OWL)
• Changing knowledge
  • BOEMIE (evolution)
  • X-Media (versioning, reasons of change)
• Specific domains



COMM: Core Ontology of MultiMedia
K-Space, X-Media

• Instead of translating MPEG-7 1-to-1 into an ontology, COMM provides 5 multimedia design patterns which formalize basic notions of multimedia annotation
  • Digital data pattern
  • Decomposition pattern
  • Content annotation pattern
  • Media annotation pattern
  • Semantic annotation pattern
• Usage of DOLCE as modeling basis and consideration of common requirements for multimedia annotation
• COMM already covers large parts of MPEG-7 (some additional patterns may be required for complete coverage)

[Figure: Decomposition pattern. Labels: DigitalData; MultimediaData; InputSegmentRole; OutputSegmentRole; ProcessingRole; InputRole; OutputRole; SegmentDecomposition; SegmentationAlgorithm; Algorithm; Method; Situation; DOLCE. Relations: plays; setting; satisfies; defines.]

[Figure: Content annotation pattern. Labels: DigitalData; MultimediaData; AnnotationRole; AnnotatedDataRole; ProcessingRole; InputRole; OutputRole; Annotation; Method; Situation; Algorithm; Description; StructuredDataDescription; MPEG-7 Descriptor; DOLCE. Relations: plays; setting; satisfies; defines.]



Multimedia Information Objects
MESH

Example: Analysis of a Multimedia Web Document

[Figure labels: MultimediaFile; Format (JPG, UTF-8; Lang: DE); Linguistic-IO / Visual-IO; Decomposition; Text; TextSegment; Image; spatio-temporal-region; text, image; Match; Team; „Segmentation-Tool“. Relations: hasDecomposition; hasSegment; hasContent; about; orderedBy; interpretedBy; realizedBy; instanceOf.]



Multimedia Content Annotation
M-Ontomat-Annotizer (aceMedia, K-Space)



Multimedia Content Analysis
• MPEG-7 widely used for low-level (LL) features
• Segmentation and feature extraction tools (aceMedia, K-Space)
• Well-known classifiers applied and developed
  • SVMs, EM, HMM
  • Bio-inspired approaches
• Increasing use of context (aceMedia, K-Space)
  • Spatial, Frequency, EXIF
• Fusion
  • Classifiers (K-Space, MESH: global+local)
  • Modalities
    • X-Media (Text+Image+1D data)
    • MESH (Text+Speech+Video)
    • aceMedia (Text+Video)
• Mostly statistical and machine learning (implicit) based, but also
  • Hybrid (implicit + explicit, K-Space)



Context and Reasoning for Analysis
aceMedia

[Figure labels: KAA; beach scene; person; face; person/face detection; scene classification; <RDF />; rock; sky; sea; beach; beach/rock; rock/beach; sea, sky; person/bear; …other analysis methods; Creation of contextual information; multimedia reasoning.]

• Use of contextual information
  • From metadata layer
  • Spatio-temporal relations
  • Domain knowledge
• Reduction of label sets
• Merging of segments
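As a rough illustration of the last two bullets, the Python sketch below reduces ambiguous label sets such as person/bear using scene context and then merges adjacent segments that end up with the same single label. The `PLAUSIBLE` table, the region names and the adjacency list are hypothetical, and this is not the aceMedia implementation.

```python
# Hypothetical domain knowledge: concepts considered plausible for each scene type.
PLAUSIBLE = {"beach scene": {"person", "face", "sea", "sky", "rock", "beach"}}


def reduce_label_set(candidates, scene):
    """Keep only the candidate labels that are plausible in the given scene."""
    allowed = PLAUSIBLE.get(scene)
    if allowed is None:
        return set(candidates)
    reduced = set(candidates) & allowed
    # Never discard every hypothesis: fall back to the original candidates.
    return reduced or set(candidates)


def merge_segments(labels, adjacency):
    """Group adjacent regions that share the same single remaining label."""
    parent = {region: region for region in labels}

    def find(region):
        while parent[region] != region:
            region = parent[region]
        return region

    for a, b in adjacency:
        if len(labels[a]) == 1 and labels[a] == labels[b]:
            parent[find(b)] = find(a)

    groups = {}
    for region in labels:
        groups.setdefault(find(region), []).append(region)
    return sorted(sorted(group) for group in groups.values())


# Illustrative segments with their initial label sets.
regions = {
    "r1": {"sea", "sky"},
    "r2": {"person", "bear"},
    "r3": {"person"},
    "r4": {"sky"},
}
adjacency = [("r1", "r4"), ("r2", "r3")]

reduced = {r: reduce_label_set(c, "beach scene") for r, c in regions.items()}
merged = merge_segments(reduced, adjacency)
print(reduced["r2"], merged)
```

Region `r1` stays ambiguous because both of its labels are plausible on a beach, so it is not merged with `r4`; only `r2` and `r3` are merged once "bear" has been ruled out.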



Classification Results
aceMedia

Segment’s hypothesis set:
Natural-Person: 0.456798
Sailing-Boat: 0.463645
Sand: 0.476777
Building: 0.415358
Pavement: 0.454740
Road: 0.503242
Body-Of-Water: 0.489957
Cliff: 0.472907
Cloud: 0.757926
Mountain: 0.512597
Sea: 0.455338
Sky: 0.658825
Stone: 0.471733
Waterfall: 0.500000
Wave: 0.476669
Dried-Plant: 0.494825
Dried-Plant-Snowed: 0.476524
Foliage: 0.497562
Grass: 0.491781
Tree: 0.447355
Trunk: 0.493255
Snow: 0.467218
Sunset: 0.503164
Car: 0.456347
Ground: 0.454769
Lamp-Post: 0.499387
Statue: 0.501076
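A hypothesis set like this is typically narrowed down before further reasoning. The short Python sketch below ranks a subset of the scores from the slide and keeps the strongest candidates; the function name, the `k` cut-off and the `min_conf` threshold are illustrative assumptions rather than part of aceMedia.

```python
# A subset of the segment's hypothesis set shown on the slide.
hypotheses = {
    "Cloud": 0.757926,
    "Sky": 0.658825,
    "Mountain": 0.512597,
    "Road": 0.503242,
    "Sunset": 0.503164,
    "Sea": 0.455338,
}


def top_hypotheses(scores, k=3, min_conf=0.0):
    """Return up to k (label, score) pairs with score >= min_conf, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(label, score) for label, score in ranked if score >= min_conf][:k]


print(top_hypotheses(hypotheses, k=2))
print(top_hypotheses(hypotheses, k=10, min_conf=0.6))
```

Both calls keep only Cloud and Sky: most of the remaining labels score close to 0.5, so the classifier offers little evidence either way for them.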



Semantic Region Merging
K-Space

[Figure: RSST vs. Semantic RSST. Region labels: Sky; Building; Sea; Ground.]



Cross Media Knowledge Acquisition
X-Media

Cross Media approaches:
• Result level: combining results obtained from different systems on different types of media
• Extractor level: using system results from different types of media as annotation or background knowledge
• Feature level: using features coming from different media.

Cross Media Framework:
• Multimedia Document Processing:
  • Extract single & cross media features
• Feature Processing:
  • Find optimal representation of feature space
• Cross media data models
  • Create knowledge models for all concepts
• Cross media dependency models
  • Integrate background knowledge & exploit causality information
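The difference between result-level and feature-level approaches can be sketched in a few lines of Python. This is only an illustration under simple assumptions (a weighted average for result-level fusion, plain concatenation for feature-level fusion); the function names and sample values are hypothetical and do not describe the X-Media system.

```python
def result_level_fusion(scores_per_medium, weights=None):
    """Combine per-concept scores produced independently for each medium.

    Uses a weighted average over the media that report a score for a concept.
    """
    weights = weights or {medium: 1.0 for medium in scores_per_medium}
    totals, norms = {}, {}
    for medium, scores in scores_per_medium.items():
        w = weights[medium]
        for concept, score in scores.items():
            totals[concept] = totals.get(concept, 0.0) + w * score
            norms[concept] = norms.get(concept, 0.0) + w
    return {concept: totals[concept] / norms[concept] for concept in totals}


def feature_level_fusion(features_per_medium):
    """Concatenate feature vectors from all media, in a fixed (sorted) medium order."""
    fused = []
    for medium in sorted(features_per_medium):
        fused.extend(features_per_medium[medium])
    return fused


# Result level: each system has already produced concept scores.
fused_scores = result_level_fusion({
    "text": {"car": 0.75, "tree": 0.25},
    "image": {"car": 0.25, "tree": 0.5},
})

# Feature level: raw features are merged before any classifier is applied.
fused_features = feature_level_fusion({"text": [1.0, 0.0], "image": [0.5]})
print(fused_scores, fused_features)
```

Extractor-level approaches sit between the two: the output of one medium's extractor would be passed as an additional input or as background knowledge to another medium's extractor.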


Reasoning
• Support of imprecision - uncertainty
• Logic-based approaches
  • Extensions of formal theories (X-Media, K-Space)
  • Ad-hoc solutions based on crisp reasoners (aceMedia)
• Statistical approaches (X-Media)
• Used for:
  • Fusion
  • Consistency checking
  • Higher-level results



Ad-hoc Fuzzy Reasoning
aceMedia

[Figure labels: image1; region1; region2; region3.]

Remarks
• Annotations (fuzzy ABox) considered:
  • fuzzy, positive, concept assertions
  • crisp role assertions
• Prior knowledge (TBox) considered:
  • Crisp inclusion and equivalence axioms

1. Classical (crisp) DL reasoning applied on assertions, leaving out fuzzy degrees
2. Axioms revisited by external module to update appropriately the degrees according to fuzzy interpretation semantics
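The two-step procedure can be sketched in Python as follows. This is a simplified illustration, not the aceMedia module: it handles only atomic inclusion axioms, and it assumes the usual fuzzy reading of an inclusion A ⊑ B, under which the degree of B(x) must be at least the degree of A(x). The concept names and degrees are made up.

```python
def crisp_closure(assertions, inclusions):
    """Step 1: crisp reasoning over (individual, concept) pairs; degrees are ignored."""
    inferred = set(assertions)
    changed = True
    while changed:
        changed = False
        for individual, concept in list(inferred):
            for sub, sup in inclusions:
                if concept == sub and (individual, sup) not in inferred:
                    inferred.add((individual, sup))
                    changed = True
    return inferred


def update_degrees(fuzzy_abox, inclusions):
    """Step 2: revisit the axioms and raise degrees so that deg(sup) >= deg(sub)."""
    degrees = {pair: 0.0 for pair in crisp_closure(fuzzy_abox.keys(), inclusions)}
    degrees.update(fuzzy_abox)
    changed = True
    while changed:
        changed = False
        for (individual, concept), degree in list(degrees.items()):
            for sub, sup in inclusions:
                if concept == sub and degrees[(individual, sup)] < degree:
                    degrees[(individual, sup)] = degree
                    changed = True
    return degrees


# Hypothetical fuzzy ABox and crisp TBox (sub-concept, super-concept) pairs.
fuzzy_abox = {("region1", "Sea"): 0.8, ("region1", "Lake"): 0.3}
inclusions = [("Sea", "Water"), ("Lake", "Water"), ("Water", "Natural")]

degrees = update_degrees(fuzzy_abox, inclusions)
print(degrees)
```

Here `Water` and `Natural` are first derived for `region1` by crisp reasoning and then both receive degree 0.8, the largest degree among the asserted sub-concepts.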


Fuzzy OWL
K-Space, X-Media

• Automatically derived multimedia annotation is often imprecise or error-prone
  • Model this uncertainty
• Extend OWL with fuzzy A-Box
  • “region4 shows an object which is a ball with a fuzzy degree of 0.8 and a pumpkin with a fuzzy degree of 0.3”



BOEMIE

Abduction as non-standard retrieval: acquire what should be added to a knowledge base (Σ,Δ) to make a query (set of assertions) Γ true.

Interpretation as abduction: let Γ be the concept and role assertions produced by analysis, divided into Γ1 (assertions accepted by default) and Γ2 (assertions that need to be explained).

Apart from DL axioms, Σ includes rules to answer the Γ queries in a backward-chaining way.

[Figure labels: Σ; TBox; ABox; Γ2.]

Multiple explanations may be obtained.
• Consistency checking eliminates invalid answers
• Preference measure according to the number of new assertions required

Possible explanations:
• Δ1: adds 2 new individuals and infers Pole_Vault
• Δ2: adds 1 new individual and infers Pole_Vault
• Δ3: adds 1 new individual and infers High_Jump (though it neglects pole1:Pole)
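The selection among competing explanations can be sketched in Python. The three candidates mirror Δ1 to Δ3 from the slide; the `Explanation` class, the `consistent` flag and the exact ordering rule (fewest neglected observations first, then fewest new individuals) are assumptions for illustration and are not claimed to be the BOEMIE preference measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Explanation:
    name: str
    new_individuals: int
    inferred: str
    neglected: tuple = ()    # observed assertions the explanation leaves unused
    consistent: bool = True  # outcome of a consistency check (assumed given)


def rank_explanations(candidates):
    """Drop inconsistent explanations, then prefer those that neglect fewer
    observations and require fewer new individuals."""
    valid = [e for e in candidates if e.consistent]
    return sorted(valid, key=lambda e: (len(e.neglected), e.new_individuals))


candidates = [
    Explanation("Delta1", 2, "Pole_Vault"),
    Explanation("Delta2", 1, "Pole_Vault"),
    Explanation("Delta3", 1, "High_Jump", neglected=("pole1:Pole",)),
]

ranking = rank_explanations(candidates)
print([(e.name, e.inferred) for e in ranking])
```

Under these assumptions the second explanation is preferred: it accounts for all observations, including pole1:Pole, while adding only one new individual.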



Retrieval
VITALAS, VICTORY

VITALAS:
• Adapting the search space to the user profile and providing interactive functionalities to control the results
• The system validation will be performed on professional collections: up to 10,000 hours of video (television archives – INA/IRT) and 1,500,000 still images (Belga)
• Textual annotations may be interpreted differently depending on the usage context

VICTORY:
• 3D and multimedia distributed content (MultiPedia) in Peer-to-Peer (P2P) and mobile P2P networks
• 3D content (and context) analysis, personalization, 3D object watermarking techniques



aceMedia applications

Web-based

Standalone

aceMedia PC applications



Common (Open) Issues

• Evaluation
• Annotated content
• Ontologies
• Fusion in analysis
• Uncertainty in reasoning
• Large-Scale
• Generic vs. Specific approaches
• Multiple domains support



Dissemination Activities

• SMART: Semantic MultimediA Research and Technology, networking cluster
• SAMT: International Conference on Semantics and digital Media Technologies (EWIMT)
  • 2007: 5-7 December 2007, Genova, Italy
• SSMS: Summer School on Multimedia Semantics
  • 2007: Glasgow, UK, July 15-21, 2007
• Special issues, sessions, workshops, books



Conclusions

• Semantic analysis of multimedia is already providing results
• Fundamental and applied research in
  • Logic-based + signal approaches
  • Implicit + explicit (knowledge) approaches
• Different applications and requirements
• Ongoing research in all areas
• Future direction: analysis+reasoning for social (Web 2.0) applications



Many thanks to the projects!



Thank you!
CERTH-ITI / Multimedia Knowledge Laboratory

http://mklab.iti.gr
