Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM)


Page 1: Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM)

Guo-Jun Qi
Beckman Institute
University of Illinois at Urbana-Champaign

Page 2: LSCOM (Large Scale Concept Ontology for Multimedia)

A broadcast news video dataset
200+ news videos / 170 hours
61,901 shots
Languages
◦ English / Arabic / Chinese

Page 3: Why a broadcast news ontology?

Critical mass of users, content providers, and applications
Good content availability (TRECVID, LDC, FBIS)
Shares a large set of core concepts with other domains

Page 4: LSCOM Provides

Richly annotated video content for accomplishing the required access and analysis functions over massive amounts of video content
A large-scale, useful, well-defined semantic lexicon
◦ More than 3000 concepts
◦ 374 annotated concepts
◦ Bridging the semantic gap from low-level features to high-level concepts

Page 5: An LSCOM concept

000 - Parade
Concept ID: 000
Name: Parade
Definition: Multiple units of marchers, devices, bands, banners or music.
Labeled: Yes
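A concept entry like the one above can be held in a small record type. This is only an illustrative sketch: the class name `ConceptEntry` and its field names are hypothetical and are not part of the LSCOM distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptEntry:
    """One lexicon entry; field names are hypothetical and mirror the slide."""
    concept_id: str   # kept as a string so leading zeros ("000") survive
    name: str
    definition: str
    labeled: bool     # True if the concept has been annotated


parade = ConceptEntry(
    concept_id="000",
    name="Parade",
    definition="Multiple units of marchers, devices, bands, banners or music.",
    labeled=True,
)
```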

Page 6: LSCOM Hierarchy

http://www.lscom.org/ontology/index.html

Thing
.Individual
..Dangerous_Thing
...Dangerous_Situation
....Emergency_Incident
.....Disaster_Event
......Natural_Disaster
....Natural_Hazard
.....Avalance
.....Earthquake
.....Mudslide
.....Natural_Disaster
.....Tornado
...Dangerous_Tangible_Thing
....Cutting_Device
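The dotted listing can be turned into parent/child edges with a small stack-based parser. The sketch below assumes one concept per line and that the number of leading dots gives the depth in the tree; the function name `parse_hierarchy` is hypothetical. Edges are returned as a list rather than a parent map because a concept such as Natural_Disaster appears under more than one parent.

```python
from typing import List, Tuple


def parse_hierarchy(lines: List[str]) -> List[Tuple[str, str]]:
    """Return (parent, child) edges from lines whose leading dots encode depth."""
    edges: List[Tuple[str, str]] = []
    stack: List[str] = []  # stack[d] is the most recent concept seen at depth d
    for line in lines:
        name = line.lstrip(".")
        depth = len(line) - len(name)
        del stack[depth:]          # drop everything at this depth or deeper
        if stack:
            edges.append((stack[-1], name))
        stack.append(name)
    return edges


sample = [
    "Thing",
    ".Individual",
    "..Dangerous_Thing",
    "...Dangerous_Situation",
    "....Emergency_Incident",
    "...Dangerous_Tangible_Thing",
    "....Cutting_Device",
]
edges = parse_hierarchy(sample)
```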

Page 7: Definition: What is an ontology? (Wikipedia)

An ontology is a formal representation of the knowledge by a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to describe the domain.

Page 8: Ontology

Represents the visual knowledge base in a structured way
◦ Graph structure
◦ Tree (hierarchy) structure
Images/videos can be effectively learned and retrieved by exploiting the coherence between concepts
◦ Logical coherence
◦ Statistical coherence

Page 9: An Ontology Hierarchy: Military Vehicle

Page 10: An example from Wikipedia

Page 11: Ontology Tree for LSCOM

Page 12: A Light Scale Concept Ontology for Multimedia Understanding (LSCOM-Lite)

The aim is to divide the semantic space using a small set of concepts (39 concepts).
Selection criteria
◦ Semantic coverage: as many as possible of the semantic concepts in news videos should be covered by the light concept set.
◦ Compactness: these concepts should not overlap semantically.
◦ Modelability: these concepts can be modeled with a smaller semantic gap.

Page 13: Selected concept dimensions

Divide the semantic space into a multi-dimensional space, where each dimension is nearly orthogonal to the others
◦ Program Category
◦ Setting/Scene/Site
◦ People
◦ Objects
◦ Activities
◦ Events
◦ Graphics

Page 14: Histogram of LSCOM-Lite Concepts

Page 15: Some example keyframes

Page 16: Applications

Application I: Conceptual Fusion (most basic – early fusion)
Application II: Cross-Category Classification (inter-class relation)
Application III: Event Dynamics in Concept Space

Page 17: Application I: Conceptual Fusion

[Diagram labels: Video; Visual Features; Classifier; Concept 1, Concept 2, Concept 3, ..., Concept n]

Page 18: LSCOM 374 Models

374 LIBSVM models
◦ http://www.ee.columbia.edu/ln/dvmm/columbia374/
◦ Features used (MPEG-7 descriptors): Color Moments, Edge Histogram, Wavelet Texture
◦ LIBSVM: a library for support vector machines, at http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Page 19: Application II: Cross-category classification with concept transfer

G.-J. Qi et al., Towards Cross-Category Knowledge Propagation for Learning Visual Concepts, CVPR 2011.

Page 20: Instance-Level Concept Correlation

[Figure labels: +1 / -1 for Mountain; +1 / -1 for Castle; Mountain and castle; Castle only; Mountain only]

Page 21: Transfer Function

[Figure labels: Mountain, Castle; Mountain; Castle; None of them]

Page 22: Model Concept Relations

Page 23: Automatically construct ontology in a data-driven manner

Page 24: Application III: Event Dynamics in Concept Space

Page 25: Event Detection with Concept Dynamics

W. Jiang et al., Semantic event detection based on visual concept prediction, ICME, Germany, 2008.

Page 26: Open Problems

Cross-dataset gap
◦ Generalize the LSCOM dataset to other datasets (e.g., non-news video datasets)
Cross-domain gap
◦ Can the text scripts associated with news videos help information extraction for visual concepts?
Automatic ontology construction
◦ Task dependent vs. task independent
◦ Data driven vs. prior knowledge (e.g., WordNet)
◦ Incorporate prior human knowledge (logical relations, etc.)

Page 27: TRECVID Competition

Task I: High-Level Feature Extraction
◦ Input: subshot
◦ Output: detection results for the 39 LSCOM-Lite concepts in the subshot

Page 28: High-Level Feature Extraction

Each concept is assumed to be binary (absent or present) in each subshot.
Submission: find the subshots that contain a given concept, rank them by detection confidence score, and submit the top 2000.
Evaluation: NIST evaluated 20 medium-frequency concepts out of the 39, using a 50% random sample of all the submission pools.
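The submission rule (rank by confidence, keep the top 2000) can be sketched in a few lines. The function name `build_submission` and the dictionary-of-scores input are assumptions made for illustration; the official TRECVID submission format is not reproduced here.

```python
from typing import Dict, List


def build_submission(scores: Dict[str, float], limit: int = 2000) -> List[str]:
    """Return subshot IDs sorted by descending confidence, truncated to `limit`."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [subshot_id for subshot_id, _ in ranked[:limit]]


# Hypothetical detector scores for one concept.
example_scores = {"shot_a": 0.10, "shot_b": 0.90, "shot_c": 0.50}
top_two = build_submission(example_scores, limit=2)
```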

Page 29: 20 Evaluated Concepts

Page 30: Evaluation Metric: Average Precision

Relevant subshots should be ranked higher than irrelevant ones.
R is the total number of relevant images, R_j is the number of relevant images among the top j images, and I_j indicates whether the j-th image is relevant (I_j = 1) or not (I_j = 0). N is the number of ranked images.

$$\text{Average Precision} = \frac{1}{R} \sum_{j=1}^{N} \frac{R_j}{j} \, I_j$$
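Average precision sums the precision at rank j (R_j / j) over the ranks j that hold a relevant item, then divides by the total number of relevant items R. A minimal sketch follows; the function name and the optional `total_relevant` argument are assumptions, the latter covering the case where some relevant items never appear in the ranked list.

```python
from typing import Optional, Sequence


def average_precision(relevance: Sequence[int],
                      total_relevant: Optional[int] = None) -> float:
    """relevance[j-1] is 1 if the item at rank j is relevant, else 0."""
    r_total = sum(relevance) if total_relevant is None else total_relevant
    if r_total == 0:
        return 0.0
    hits = 0          # R_j: relevant items seen in the top j
    ap_sum = 0.0
    for j, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            ap_sum += hits / j   # precision at rank j, counted only when I_j = 1
    return ap_sum / r_total
```

For the ranking relevant, irrelevant, relevant, irrelevant, the two contributing terms are 1/1 and 2/3, so the average precision is (1 + 2/3) / 2.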

Page 31: Results

Page 32: TRECVID Competition

Task II: Video Search
◦ Input: 24 text-based topics
◦ Output: relevant subshots in the database

Page 33: Topics to search

Page 34: Topics to search (cont’d)

Page 35: Topics to search

Page 36: Three Types of Search Systems

Page 37: Results: Automatic Runs

Page 38: Results: Manual Runs

Page 39: Results: Interactive Runs

Page 40: Machine Problem 7: Shot Boundary Detection in Videos

Page 41: Goals

Detect the abrupt content changes between consecutive frames.
◦ Scene changes
◦ Scene cuts

Page 42: Steps

Step 1: Measure the change of content between video frames
◦ Visual/acoustic measurements
Step 2: Compare the content distance between successive frames. If the distance is larger than a certain threshold, a shot boundary may exist.
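The two steps can be sketched as a loop over per-frame feature vectors. The L1 distance used here is an assumption (the slides do not name a distance), and `detect_boundaries` is a hypothetical helper name; the threshold has to be chosen empirically.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_boundaries(features: List[Sequence[float]],
                      threshold: float) -> List[int]:
    """Return indices t where the distance between frame t-1 and frame t
    exceeds the threshold, i.e. candidate shot boundaries."""
    boundaries = []
    for t in range(1, len(features)):
        if l1_distance(features[t - 1], features[t]) > threshold:
            boundaries.append(t)
    return boundaries


# Toy features: the content changes abruptly between frame 1 and frame 2.
toy_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
toy_boundaries = detect_boundaries(toy_features, threshold=1.0)
```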

Page 43: Measuring Content based on Visual Information

256-dimensional color histogram
◦ In RGB space, normalize r, g, b to [0, 1]
◦ Color space: (nr, ng)
◦ 8×8 histogram

Page 44: Color Histograms

Divide each image into four parts; each part has an 8×8 histogram, giving 4 × 64 = 256 feature dimensions in total.
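A minimal sketch of the 256-dimensional feature is given below. It assumes that nr and ng are the normalized chromaticities r/(r+g+b) and g/(r+g+b), that the four parts are the image quadrants, and that each 8×8 histogram is normalized to sum to 1; none of these details is spelled out on the slides. The image is represented as a list of rows of (r, g, b) tuples to keep the example dependency-free.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b)
BINS = 8                      # 8 x 8 bins over (nr, ng)


def rg_histogram(pixels: Sequence[Pixel]) -> List[float]:
    """64-bin histogram over normalized (nr, ng), normalized to sum to 1."""
    hist = [0.0] * (BINS * BINS)
    for r, g, b in pixels:
        total = r + g + b
        if total == 0:
            nr = ng = 1.0 / 3.0   # assumption: treat black as neutral
        else:
            nr, ng = r / total, g / total
        i = min(int(nr * BINS), BINS - 1)   # clamp nr == 1.0 into the last bin
        j = min(int(ng * BINS), BINS - 1)
        hist[i * BINS + j] += 1.0
    count = sum(hist)
    return [h / count for h in hist] if count else hist


def frame_feature(image: List[List[Pixel]]) -> List[float]:
    """Concatenate the histograms of the four quadrants: 4 * 64 = 256 values."""
    height, width = len(image), len(image[0])
    mid_y, mid_x = height // 2, width // 2
    quadrants = [(0, mid_y, 0, mid_x), (0, mid_y, mid_x, width),
                 (mid_y, height, 0, mid_x), (mid_y, height, mid_x, width)]
    feature: List[float] = []
    for y0, y1, x0, x1 in quadrants:
        pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
        feature.extend(rg_histogram(pixels))
    return feature


tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (0, 0, 0)]]
tiny_feature = frame_feature(tiny)
```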

Page 45: Acoustic Features

12 cepstral coefficients
Energy (sum of squares of the raw signal samples)
Zero crossing rate (ZCR)
ZCR = sum(|sign(S(2:N))-sign(S(1:N-1))|)
Hint: normalize the energy so that it does not dominate when computing distances between successive frames
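Energy and ZCR as defined above translate directly to code. In this sketch the ZCR is left unscaled, exactly as in the formula, so each flip between a positive and a negative sample contributes 2. Dividing by the maximum is only one possible way to follow the normalization hint and is an assumption, as are the function names.

```python
from typing import List, Sequence


def sign(x: float) -> int:
    """Return -1, 0 or +1."""
    return (x > 0) - (x < 0)


def energy(samples: Sequence[float]) -> float:
    """Sum of squares of the raw samples in one audio frame."""
    return sum(s * s for s in samples)


def zero_crossing_rate(samples: Sequence[float]) -> int:
    """sum(|sign(S(2:N)) - sign(S(1:N-1))|), written with 0-based indexing."""
    return sum(abs(sign(samples[n]) - sign(samples[n - 1]))
               for n in range(1, len(samples)))


def normalize_by_max(values: Sequence[float]) -> List[float]:
    """Scale per-frame values into [0, 1] so one feature does not dominate."""
    peak = max(values, default=0.0)
    return [v / peak for v in values] if peak > 0 else list(values)
```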

Page 46: Datasets

Two videos, each a little over one minute long
Manually label the shot boundaries

Page 47: What to submit

Source code
Report
◦ Compare the shot boundary detection results returned by your algorithm with the manually labeled boundaries
◦ Explain your choice of threshold
◦ Explain the differences between the acoustic-based and visual-based detection results
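One way to compare detected boundaries with the manual labels is to report precision and recall, counting a detection as correct when it falls within a few frames of an unmatched labeled boundary. The assignment does not prescribe a metric, so the matching rule, the `tolerance` parameter and the function name below are all assumptions.

```python
from typing import List, Tuple


def match_boundaries(detected: List[int], labeled: List[int],
                     tolerance: int = 0) -> Tuple[float, float]:
    """Return (precision, recall); each labeled boundary matches at most once."""
    unmatched = sorted(labeled)
    hits = 0
    for d in sorted(detected):
        for i, l in enumerate(unmatched):
            if abs(d - l) <= tolerance:
                hits += 1
                del unmatched[i]   # this label is now used up
                break
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(labeled) if labeled else 0.0
    return precision, recall


# Hypothetical frame indices: two of three detections match the two labels.
precision, recall = match_boundaries([10, 50, 90], [11, 50], tolerance=1)
```

Running the same comparison for the acoustic-based and the visual-based detector gives a simple basis for discussing how the two differ.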

Page 48: Where and when to submit

Email to [email protected]
Due: May 2nd

Page 49: Thanks! Q&A