Multimedia Data Navigation and the Semantic Web (SemTech 2006)
Navigation for the Digital Universe
Multimedia Data Navigation and the Semantic Web
Valery A. Petrushin and Bradley P. Allen
Copyright © 2005 Accenture, LLP / Siderean Software, Inc. All rights reserved.
Outline
• About the authors
• Faceted Navigation
• Semantic Web Techniques
  – RDF(S)
  – Dublin Core
  – SKOS
  – TGM
  – LSCOM, SMIL & MPEG-7
• Case Study: BBC Rushes
• Implementation
  – BBC Rushes Navigator
    • Metadata representation
    • Architecture
    • User interface
• Future work
• Contact Information
• Demo
About the Authors
• Valery A. Petrushin, Ph.D.
  – Sr. Researcher, Accenture Technology Labs
  – Semantics of programming languages
  – Multimedia data mining, analysis, annotation and retrieval
  – Georgia Tech, Glushkov Institute for Cybernetics
• Bradley P. Allen
  – Founder and CTO, Siderean Software, Inc.
  – Semantic-based navigation, Web personalization services, case-based reasoning
  – Former founder and CTO of Limbex Corp. and TriVida Corp.
  – Carnegie-Mellon University
Faceted navigation
• Facets are metadata properties whose ranges form a near-orthogonal set of controlled vocabularies
  – Creator: “Dickens, Charles”
  – Subject: Arsenic, Antimony
  – Location: World > U.S. > California > Venice
• Facets form a frame of reference for information overview, access and discovery
• Other properties serve as landmarks and cues
• Faceted navigation uses facets to provide end-user access and discovery in the context of large collections of semi-structured information
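The idea of narrowing a collection by facet values and then showing the remaining choices can be sketched in a few lines of Python. This is only an illustration: the records, the titles "Item A" to "Item C", the creator "Doe, Jane" and the function names `select` and `facet_counts` are hypothetical; only the facet names and sample values come from the slide.

```python
from collections import Counter

# Hypothetical records; facet values echo the examples on the slide.
ITEMS = [
    {"title": "Item A", "creator": ["Dickens, Charles"], "subject": ["Arsenic"]},
    {"title": "Item B", "creator": ["Dickens, Charles"], "subject": ["Antimony"]},
    {"title": "Item C", "creator": ["Doe, Jane"], "subject": ["Arsenic", "Antimony"]},
]


def select(items, **constraints):
    """Keep the items that carry every requested facet value."""
    return [
        item for item in items
        if all(value in item.get(facet, []) for facet, value in constraints.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    return Counter(value for item in items for value in item.get(facet, []))


# Drill down on Subject = Arsenic, then list the Creator values still available.
arsenic_items = select(ITEMS, subject="Arsenic")
print([item["title"] for item in arsenic_items])
print(facet_counts(arsenic_items, "creator"))
```

The counts returned for the remaining facets are what a faceted interface would typically show next to each value, so the user always sees which refinements are still non-empty.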
Faceted Navigation Built Using Semantic Web Standards
• Define/reuse ontologies expressed in RDF(S)/OWL
  – Classes for defining instances and controlled vocabularies
  – Properties for facets and additional asset metadata attributes
• Import/transform aggregated instance metadata into an RDF representation
  – Resources referred to via URIs
  – Content and controlled vocabularies
• Write application profiles in terms of RDF
Building Faceted Navigation Applications
• Metadata is aggregated…
• … then represented as instances of concepts in ontologies and tagged using controlled vocabularies…
• … then application profiles are created…
• … that define navigation services for user applications
[Diagram labels: Term, Event, Person, Place, Text, Application Profiles]
Semantic Web Technology
• RDF(S) – Resource Description Framework (Schema)
• Dublin Core
• SKOS – Simple Knowledge Organization System
• TGM-I & II – Thesaurus for Graphic Materials
• LSCOM – Large Scale Concept Ontology for Multimedia
• SMIL – Synchronized Multimedia Integration Language
• MPEG-7 – Multimedia Content Description Interface
RDF(S)
• RDF(S) – Resource Description Framework (Schema)
  – http://www.w3.org/RDF/
  – http://www.w3.org/TR/rdf-schema/
  – Language for representing metadata about Web resources
  – Triple: subject – predicate → object
• Example:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
<contact:Person rdf:about="http://www.accenture.com/techlabs/VAP/contact#me">
<contact:fullName>Valery A. Petrushin</contact:fullName>
<contact:mailbox rdf:resource="mailto:[email protected]"/>
</contact:Person>
</rdf:RDF>
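To make the triple model concrete, the description above can be written out as plain (subject, predicate, object) tuples. This is a hand-written sketch, not the output of an RDF parser; the helper name `objects` is hypothetical, and the mailbox triple is left out because the address is not preserved in this transcript.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CONTACT_NS = "http://www.w3.org/2000/10/swap/pim/contact#"
ME = "http://www.accenture.com/techlabs/VAP/contact#me"

# A typed node element such as <contact:Person rdf:about="..."> stands for an rdf:type triple.
TRIPLES = [
    (ME, RDF_NS + "type", CONTACT_NS + "Person"),
    (ME, CONTACT_NS + "fullName", "Valery A. Petrushin"),
]


def objects(triples, subject, predicate):
    """Return every object attached to a subject through a predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects(TRIPLES, ME, CONTACT_NS + "fullName"))
```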
Dublin Core (DC)
• Dublin Core
  – http://dublincore.org/documents/
  – Vocabulary for describing documents (title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, rights)
• Example:
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF PUBLIC "-//DUBLIN CORE//DCMES DTD 2002/07/31//EN"
"http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-xml-dtd.dtd">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://www.accenture/techlabs/Petrushin">
<dc:title>Multimedia Data Mining and Knowledge Discovery</dc:title>
<dc:creator>Valery A. Petrushin</dc:creator>
<dc:publisher>Springer Verlag</dc:publisher>
</rdf:Description>
</rdf:RDF>
SKOS
• SKOS – Simple Knowledge Organization System
  – http://www.w3.org/2004/02/skos/
  – Model for expressing structure and content of concept schemes (thesauri, taxonomies, etc.)
  – Specifies concepts, collections of concepts and relations between concepts (broader, narrower, related)
• Example:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:skos="http://www.w3.org/2004/02/skos/core#">
<rdf:Description rdf:about="http://www.example.com/concepts#people">
<skos:broader rdf:resource="http://www.example.com/concepts#mammals"/>
<skos:narrower rdf:resource="http://www.example.com/concepts#children"/>
<skos:narrower rdf:resource="http://www.example.com/concepts#adults"/>
</rdf:Description>
</rdf:RDF>
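For hierarchical drill-down, a navigator needs every concept above a given one, not just its direct parent. A minimal sketch follows, assuming broader links are followed transitively (a navigation convenience, not something the example itself states) and that each skos:narrower statement is read as the inverse skos:broader link. The short concept names and the function name `ancestors` are hypothetical.

```python
# Direct broader links taken from the example above, with narrower read as the inverse of broader.
BROADER = {
    "people": ["mammals"],
    "children": ["people"],
    "adults": ["people"],
}


def ancestors(concept, broader=BROADER):
    """Collect all concepts reachable by following broader links upward."""
    seen = []
    stack = list(broader.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(broader.get(current, []))
    return seen


print(ancestors("children"))
```

An item tagged with "children" would then also be counted under "people" and "mammals" when the facet is displayed as a tree.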
TGM – I & II
• TGM – Thesaurus for Graphic Materials (The Library of Congress)
• TGM-I – Subject Terms (6,300)
  – http://www.loc.gov/rr/print/tgm1/toc.html
• TGM-II – Genre and Physical Characteristic Headings (600)
  – http://www.loc.gov/rr/print/tgm2/
• Example:
TGM-I:
Term: Sand
Narrower Term: Quicksand
Related Term: Dunes, Sand sculpture, Sandpaintings
TGM-II:
Term: Aerial views
Public Note: Views from a high vantage point.
Used For: Air views, Balloon views, Views, Aerial
Broader Term: Views
Narrower Term: Aerial photographs
Related Term: Bird's-eye views, Panoramic views
LSCOM, SMIL & MPEG-7
• LSCOM – Large Scale Concept Ontology for Multimedia
  – http://www.acemedia.org/aceMedia/files/multimedia_ontology/presentations_1st_meeting/arda.pdf
• SMIL – Synchronized Multimedia Integration Language
  – http://www.w3.org/TR/REC-smil/
  – Simple language for representing multiple synchronized media streams
• MPEG-7 – Multimedia Content Description Interface
  – http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm
  – Advanced language for representing multimedia content
  – ISO Standard
Case Study: BBC Rushes
• Rushes are raw footage … with a promise to turn into golden nuggets of stockshots
• TRECVID 2005
  – Video Retrieval Competition at NIST
  – http://www-nlpir.nist.gov/projects/trecvid/
• Problem:
  – Create a system that helps a TV program maker compose a video using current clips and rushes
• Data Statistics:
  – Duration: 49.3 hours
  – Content:
    • Clips about vacation and travel
    • 4 issues of “Summer Holiday” (~ 2 hours)
    • BBC One News (30’) + fragment (~3’)
BBC Rushes: Data Statistics - 1
• Statistics: clip level
• 615 clips (308 development + 307 test sets)
• Duration (mm:ss):
  – Minimal / Maximal – 00:03.48 / 47:11
  – Mean / Median – 04:49 / 02:25
  – Std – 06:02.73
• Keywords:
  – Different keywords / Occurrences – 1036 / 4908
  – Mean / Median – 7.98 / 7
  – Minimal / Maximal – 0 / 34
BBC Rushes: Data Statistics - 2
• Statistics: shot level
• Number of shots: 10,064
• Shot duration (mm:ss)
  – Minimal – 0:00.04
  – Maximal – 22:45.16
  – Mean – 0:17.51
  – Median – 0:09.74
  – Std – 0:33.97
• Number of key frames
  – Total: 39,132
  – Median per shot: 2
  – Mean per shot: 3.8
  – Maximal: 377
  – Minimal: 1
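As a quick consistency check, the per-clip and per-shot means can be recomputed from the totals reported on the two statistics slides. The variable names are illustrative; the four input figures are the ones on the slides.

```python
# Totals copied from the clip-level and shot-level statistics slides.
CLIPS = 615
KEYWORD_OCCURRENCES = 4908
SHOTS = 10_064
KEYFRAMES = 39_132

keywords_per_clip = KEYWORD_OCCURRENCES / CLIPS
keyframes_per_shot = KEYFRAMES / SHOTS

print(f"{keywords_per_clip:.2f} keywords per clip")      # 7.98 keywords per clip
print(f"{keyframes_per_shot:.2f} key frames per shot")   # 3.89 key frames per shot
```

The keyword mean matches the reported 7.98; the key-frame mean comes out near 3.89, so the reported 3.8 appears to be a truncated value.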
BBC Rushes: representation
• Ontologies
  – RDFS, Dublin Core, SKOS
• Controlled vocabularies
  – TGM-I (reflecting Large Scale Concept Ontology for Multimedia), ISO 8601 (temporal hierarchy of dates), MPEG-7 (visual features)
• Instances
  – trecvid:Shot, trecvid:Clip
• Application profile
  – Retrieve instances of type trecvid:Clip
    • Textual facets: dc:title (clip title), dc:subject (keywords), dc:creator (director), dcterms:created (production date), dcterms:issued (show date), dcterms:extent (duration)
  – Retrieve instances of type trecvid:Shot
    • Visual facets: dc:subject with values skos:narrower than trecvid:color, trecvid:texture and trecvid:colorplustexture
    • Textual facets through reference to containing clip
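The temporal hierarchy of dates can be pictured as a chain of skos:broader links from a day to its month to its year. The sketch below derives that chain from an ISO 8601 date string; the function names are hypothetical, and the real concept URIs used by the system are not reproduced here.

```python
def date_chain(iso_date):
    """Return the date followed by its month and year, most specific first."""
    parts = iso_date.split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]


def broader_pairs(iso_date):
    """Pair each date concept with the next broader one (day -> month -> year)."""
    chain = date_chain(iso_date)
    return list(zip(chain, chain[1:]))


print(date_chain("2000-03-01"))
print(broader_pairs("2000-03-01"))
```

With such links in place, a date facet can be browsed by year first and then refined to a month or a single day.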
Ontology Schema
[Diagram: Clip, Shot and KeyFrame nodes. TEXTUAL side: Title, Creator, Subject, Date. VISUAL side: Color, Texture, Color+Texture. Edge labels: dcterms:partOf, dc:title, dc:creator, dc:subject, dc:created, skos:broader.]
BBC Rushes: visual facets
• Facets: color, texture, [shape] + combinations
  – Color, texture, color+texture
• To build facets
  – Extract features (MPEG-7):
    • Color: dominantColor (24), colorStructure (256), colorLayout (12)
    • Texture: edgeHistogram (80), homogeneousTexture (60)
  – SOM clustering of keyframes
    • Select as a visual “word” the keyframe closest to the node centroid
  – Represent keyframes as SKOS concepts, centroids as skos:broader of cluster members
• Example:
  – SOM for color 35x28 (=980 nodes)
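Picking the visual “word” for one SOM node and linking the cluster members to it might look like the sketch below. It assumes the node's prototype vector and the members' feature vectors are already available; the two-dimensional vectors, the file names and the function names are made up for illustration.

```python
import math


def closest_keyframe(prototype, members):
    """Return the id of the member whose feature vector is nearest the prototype."""
    return min(members, key=lambda kf: math.dist(members[kf], prototype))


def broader_triples(node_uri, members):
    """Emit one (keyframe, skos:broader, node) triple per cluster member."""
    return [(kf, "skos:broader", node_uri) for kf in sorted(members)]


# Hypothetical cluster: keyframe id -> feature vector (real descriptors are much longer).
members = {"a.jpg": [0.0, 0.0], "b.jpg": [1.0, 1.0], "c.jpg": [5.0, 5.0]}
prototype = [0.9, 1.2]

print(closest_keyframe(prototype, members))
print(broader_triples("color#1", members))
```

The selected keyframe can then serve as the thumbnail shown for the node in the color or texture facet.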
Self-organizing Maps
• SOM = Kohonen NN = Topology-preserving map
• Unsupervised learning (Clustering + Visualization)
• $X = \{x_i\}$, $x_i \in \mathbb{R}^d$ – input data
• $M = \{m_k\}$, $m_k \in \mathbb{R}^d$ – prototype vectors (codebook) = neurons on 1D or 2D grid
• Training:
  1. Start with random $m_k$
  2. For $x_i$ find best-matching unit (BMU) $m_c$: $\|x_i - m_c\| = \min_k \|x_i - m_k\|$
  3. Update prototype vectors in neighborhood: $m_k(t+1) = m_k(t) + \alpha(t)\, h_{ck}(t)\, [x_i(t) - m_k(t)]$
     where $h_{ck}(t) = \exp(-d_{ck}^2 / 2\sigma_t^2)$ is the neighborhood kernel, $d_{ck} = \|r_c - r_k\|$, $\sigma_t$ is radius at time $t$, and $\alpha(t)$ is the learning rate
• Two phases: rough and fine tuning
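The three training steps can be sketched in plain Python for a one-dimensional grid. This is a toy version under stated assumptions: linear decay of the learning rate and radius, a single training phase rather than separate rough and fine phases, and default parameter values chosen only for illustration. It is not the configuration used for the 35x28 map.

```python
import math
import random


def train_som(data, n_nodes, epochs=50, alpha0=0.5, sigma0=None, seed=0):
    """Train a 1D SOM and return its prototype vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(n_nodes / 2.0, 1.0)
    # Step 1: start with random prototype vectors m_k.
    protos = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):
            frac = step / total_steps
            alpha = alpha0 * (1.0 - frac)            # learning rate decays to zero
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # radius shrinks, floored at 0.5
            # Step 2: find the best-matching unit c for x.
            c = bmu(x, protos)
            # Step 3: pull every prototype toward x, weighted by the Gaussian kernel.
            for k in range(n_nodes):
                h = math.exp(-((c - k) ** 2) / (2.0 * sigma ** 2))
                protos[k] = [m + alpha * h * (xi - m) for m, xi in zip(protos[k], x)]
            step += 1
    return protos


def bmu(x, protos):
    """Index of the prototype closest to x."""
    return min(range(len(protos)), key=lambda k: math.dist(x, protos[k]))


# Two small, well-separated toy clusters in the unit square.
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
protos = train_som(data, n_nodes=4)
print(bmu([0.0, 0.0], protos), bmu([1.0, 1.0], protos))
```

On the grid here the distance $d_{ck}$ is simply the difference of node indices; a 2D map would use the Euclidean distance between grid coordinates instead.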
BBC Rushes: RDF subgraph
[Diagram: RDF subgraph around shot v159_001.wmv and clip v159.mpg. Nodes: “thailand, chiang mai/chillis”, “michelle jones”, 2000-03-01, 2000-03, 2000, Year, Chilli_peppers, Hot_peppers, Peppers, f000000000.jpg, color#26547, Color. Edge labels: dc:title, dc:creator, dc:created, dc:subject, dcterms:partOf, skos:broader.]
BBC Rushes: RDF/XML serialization
<trecvid:Clip rdf:about="http://swvideo.techlabs.accenture.com/v159.mpg">
  <rdf:type rdf:resource="&dctype;MovingImage" />
  <dc:title>thailand, chiang mai/chillis</dc:title>
  <dcterms:extent>202200</dcterms:extent>
  <dc:creator>michelle jones</dc:creator>
  <dc:identifier>mrs320354</dc:identifier>
  <dcterms:created rdf:resource="tag:siderean.com,1752-09-14:2000-03-01" />
  <dcterms:issued rdf:resource="tag:siderean.com,1752-09-14:2000-07-18" />
  <dc:subject rdf:resource="&trecvid;thailand" />
  <dc:subject rdf:resource="&trecvid;chiang_mai" />
  <dc:subject rdf:resource="&trecvid;chillis" />
  <dc:subject rdf:resource="&trecvid;peppers" />
  <dc:subject rdf:resource="&trecvid;chilli_peppers" />
  <dc:subject rdf:resource="&trecvid;vegetables" />
  <dc:subject rdf:resource="&trecvid;markets" />
  <dc:subject rdf:resource="&trecvid;street_markets" />
  <dc:subject rdf:resource="&trecvid;food_markets" />
  <dc:subject rdf:resource="&trecvid;food" />
  <dc:subject rdf:resource="&trecvid;herbs" />
  <dc:relation>http://swvideo.techlabs.accenture.com/v159.fset/f000000000.jpg</dc:relation>
</trecvid:Clip>

<skos:Concept rdf:about="&trecvid;chilli_peppers">
  <skos:broader rdf:resource="&tgm1;Hot_peppers"/>
  <skos:prefLabel>chilli peppers</skos:prefLabel>
</skos:Concept>

<skos:Concept rdf:about='tag:siderean.com,1752-09-14:2000-03-01'>
  <skos:prefLabel>2000-03-01</skos:prefLabel>
  <skos:broader rdf:resource='tag:siderean.com,1752-09-14:2000-03'/>
</skos:Concept>

<trecvid:Shot rdf:about="http://swvideo.techlabs.accenture.com/shotsWMV/v159_001.wmv">
  <rdf:type rdf:resource="&dctype;MovingImage" />
  <dcterms:isPartOf rdf:resource="http://swvideo.techlabs.accenture.com/v159.mpg" />
  <dcterms:extent>21000</dcterms:extent>
  <dc:relation>http://swvideo.techlabs.accenture.com/v159.fset/f000000000.jpg</dc:relation>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000000000.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000000240.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000000280.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000001440.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000003120.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000005440.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000009680.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000011520.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000012040.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000013800.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000014800.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000015120.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000016760.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000018280.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000019360.jpg"/>
  <dc:subject rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000021000.jpg"/>
</trecvid:Shot>

<skos:Concept rdf:about="http://swvideo.techlabs.accenture.com/v159.fset/f000000000.jpg">
  <skos:broader rdf:resource="http://swvideo.techlabs.accenture.com/color#26547" />
  <skos:prefSymbol rdf:resource="http://swvideo.techlabs.accenture.com/v159.fset/f000000000.jpg" />
</skos:Concept>

<skos:Concept rdf:about="http://swvideo.techlabs.accenture.com/color#26547">
  <skos:broader rdf:resource="&trecvid;color" />
  <skos:prefSymbol rdf:resource="http://swvideo.techlabs.accenture.com/v289.fset/f000048880.jpg" />
</skos:Concept>
BBC Rushes Navigator: Architecture
[Diagram components: BBC Rushes RDF, Metadata Aggregator, Metadata Store, Navigation Web Services, AJAX client in Firefox. Message labels: XRBR query, XRBR response.]
http://www.siderean.com/bbcrush/bbcrush.jsp (with Firefox 1.5)
Lessons Learned
• Data preparation
  – Robust shot boundary detection
  – Careful selection of keyframes
    • Motion based
    • Salient object based
    • Filtering redundant keyframes
  – Using group-of-frames (GOF) features
• Concept recognition/propagation
  – Propagate keywords from clip to shots
  – Recognize concepts from visual data
  – Probabilistic reasoning
  – Derive concepts from data (data mining) + labeling
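The simplest form of propagation copies each clip's keywords to every shot it contains. A minimal sketch follows, using identifiers that appear elsewhere in the deck as sample data; the function name and the dictionary layout are hypothetical.

```python
def propagate_keywords(clip_keywords, shot_to_clip):
    """Give every shot the keywords of the clip it belongs to."""
    return {
        shot: sorted(clip_keywords.get(clip, []))
        for shot, clip in shot_to_clip.items()
    }


# Sample data: one clip with two of its keywords, and one of its shots.
clip_keywords = {"v159.mpg": ["markets", "chilli_peppers"]}
shot_to_clip = {"v159_001.wmv": "v159.mpg"}

print(propagate_keywords(clip_keywords, shot_to_clip))
```

Because a clip-level keyword may not apply to every shot, the inherited labels are best treated as candidates to be confirmed or pruned by the visual recognition steps listed above.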
Summary
• Methodology of Multimedia Data Representation
  – Semantic Web Technology
  – Multimedia Data Mining
• Prototype of Multimedia Retrieval System
  – BBC Rushes
  – Web-based Interface using AJAX
Future work
• More facets
  – Shape + combinations
  – Geographical location
• More Interfaces
  – Map of the world for browsing places
  – Hierarchy of SOM for browsing clips and shots
• More Tools
  – Tagging tool for creating and managing metadata
  – Tools for creating video databases (shot extraction, feature extraction, clustering, classification of events, etc.)
  – Tools for creating audio-video compositions (TV programs, commercials, etc.)
BBC Rushes Navigator: Navigation with LSCOM
BBC Rushes Navigator: Hierarchical Drill-down on People Facet
BBC Rushes Navigator: Faceted View of All Shots
BBC Rushes Navigator: Searching by Subject
BBC Rushes Navigator: Searching by Color, Playlist composition
BBC Rushes Navigator: Drill-down using Subject and Color
Contact Information
• Valery A. Petrushin: [email protected]
• Bradley P. Allen: [email protected]