content-enriched classifier for web video classification

Content-enriched Classifier for Web Video Classification

By
Bin Cui & Ce Zhang
Dept. of CS, Peking University

Gao Cong
School of CE, Nanyang Technological University, Singapore

SIGIR 2010

Presented by
Ahmed Ibrahim

Outline
• Introduction
• Current Approaches
• Proposed Approach
– Content-enriched Classifier
– Content-enriched Similarity
– CES Classifier Algorithm
• Experimental Results
• Conclusions & Critique
• Proposed approach extension

Introduction

• In video sharing services, the user browses the web by categories.

• Real-time categorization plays a key role in organizing, browsing, and retrieving online video.

CS848 Winter 2011

Web Video Processing

Video Title

User Description


Web Video Classification Problems

o Although text features and content features are complementary, utilizing content features in the video classification stage is computationally expensive.

o Text classification cannot use the rich information contained in video content.

o The characteristics of text descriptions limit the classification performance of semantic similarity based on WordNet and/or term co-occurrence.


Current Approaches


Proposed Approach

1- Content-Enriched Similarity:

Visual clues of web videos are used to obtain more reasonable semantic relations among words; the result is called the Content-Enriched Similarity (CES) between words.

2- Content-Enriched Nonlinear Classifier

– At the training stage, a nonlinear SVM classifier is built to explore the semantic similarity between words using CES.

– At the classification stage, this classifier classifies a new video using its text features (but not its content features).


Proposed Approach (cont.)

3- Semantic kernels will be computed using the following formula:

4- Multi-Kernel Enhancement: several kernels are created using different pair-wise word similarity matrices and combined through multiple kernel optimization.
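
To make items 3 and 4 concrete, here is a minimal sketch. The slide's own formula is not reproduced here, so the bilinear form k(d1, d2) = d1ᵀ S d2 (a common way to build a semantic kernel from a word-similarity matrix S) is an assumption, as are the function names `semantic_kernel` and `combine_kernels`, the toy vocabulary, and the fixed weights. A plain weighted sum stands in for multiple kernel optimization, which would normally learn the weights.

```python
def semantic_kernel(d1, d2, S):
    """Bilinear kernel d1^T S d2 over term-frequency vectors (assumed form)."""
    return sum(d1[i] * S[i][j] * d2[j]
               for i in range(len(d1))
               for j in range(len(d2)))


def combine_kernels(kernel_values, weights):
    """Weighted sum of kernel values; the weights are fixed by hand here."""
    return sum(w * k for w, k in zip(weights, kernel_values))


# Toy vocabulary (hypothetical): ["goal", "soccer", "recipe"]
S_ces = [[1.0, 0.8, 0.0],        # "goal" and "soccer" are related
         [0.8, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
S_identity = [[1.0, 0.0, 0.0],   # no word relations: plain dot product
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]]

doc_a = [1, 0, 0]  # mentions only "goal"
doc_b = [0, 1, 0]  # mentions only "soccer"

k_plain = semantic_kernel(doc_a, doc_b, S_identity)  # no shared words
k_ces = semantic_kernel(doc_a, doc_b, S_ces)         # related through S
k_multi = combine_kernels([k_plain, k_ces], [0.5, 0.5])
```

With the identity matrix the two documents look unrelated, while the similarity matrix lets them match through related words. For the result to be a valid SVM kernel, S should be symmetric positive semi-definite, as both toy matrices are.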


Content-enriched Classifier

[Figure: framework diagram. Training side: Training Data (Text + Content Features), Extract CES, Content-enriched word similarity; Training Data (Text Features), Building Content-Enriched Semantic Kernel, Building Classifier (finding the hyperplane in Content-Enriched Kernel Space), Content-Enriched Classifier. Testing side: Testing Data, Applying Classifier.]

CES: Content-Enriched Similarity


Content-Enriched Similarity

Generally, two words are similar if they appear in the same cluster, within which the videos are similar in terms of content.

[Figure: Video Database (5149 videos) → Extract Visual Content Features → K-means Clustering ("K" clusters = 100) → Project 'tf' into cluster space → 'VS' video-cluster space relation matrix]
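
The idea that words are similar when they occur in the same content cluster can be sketched as follows. This is an illustration under assumptions, not the authors' exact procedure: the cluster assignment is taken as given (for example, the output of k-means on visual features), 'tf' is projected by summing term frequencies per cluster, and cosine similarity compares words in cluster space. The names `project_tf_into_clusters` and `cosine` and the toy data are hypothetical.

```python
import math


def project_tf_into_clusters(tf, cluster_of_video, k):
    """Sum each word's term frequency over the videos of each cluster.

    Returns a word-by-cluster matrix (one row per word, one column per cluster).
    """
    n_words = len(tf[0])
    word_cluster = [[0.0] * k for _ in range(n_words)]
    for video, counts in enumerate(tf):
        c = cluster_of_video[video]
        for w, count in enumerate(counts):
            word_cluster[w][c] += count
    return word_cluster


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy data: 4 videos, vocabulary ["goal", "soccer", "recipe"].
tf = [[2, 0, 0],
      [0, 3, 0],
      [0, 0, 1],
      [0, 0, 2]]
cluster_of_video = [0, 0, 1, 1]  # assumed result of clustering on visual content

wc = project_tf_into_clusters(tf, cluster_of_video, k=2)
sim_goal_soccer = cosine(wc[0], wc[1])
sim_goal_recipe = cosine(wc[0], wc[2])
```

In this toy data "goal" and "soccer" never appear in the same video, so term co-occurrence alone would not relate them. Because their videos fall in the same content cluster, they come out as similar, while "recipe" does not.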


CES Classifier Algorithm


Experimental Results

• Experimental Settings

– Datasets:
• Two real-life datasets were collected from YouTube on Sept. 23 and 24, 2009: YT923 (5149 videos) and YT924 (4447 videos).
• Both datasets are categorized into 15 categories.

– Preprocessing: Feature Extraction:
• Text features extracted from the videos include video titles and descriptions.
• Words are stemmed using the WordNet stemmer.
• Stop words are manually removed.
• The following visual content features are extracted: color, texture, and edges.
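
The text preprocessing bullets above can be sketched in a few lines. This is a simplified illustration: the stop-word list is a hypothetical hand-made one, the tokenizer keeps only alphabetic runs, and stemming is left out (the slides use a WordNet stemmer, which is not reproduced here). The function name `text_features` is an assumption.

```python
import re
from collections import Counter

# Hypothetical hand-made stop-word list, standing in for manual removal.
STOP_WORDS = {"the", "a", "of", "in", "and", "to"}


def text_features(title, description):
    """Term frequencies over the title and description, minus stop words."""
    tokens = re.findall(r"[a-z]+", (title + " " + description).lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


features = text_features("The Best Goal of 2009", "A goal in the final match")
```

The resulting counts form the 'tf' vector of a video; a stemming step would sit between tokenization and counting.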


Word Similarity Approach

• The relations discovered by CES are meaningful and agree with common sense.

• The classification results reflect the superiority of the proposed methods.


Classification effectiveness

• Classification performance on different frameworks.
• F-score: accuracy measure for classification, calculated from precision and recall.
• Macro-F: average of the F-scores of each category.
• Micro-F: average of F-score over all decisions.
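
The three measures can be illustrated with a short sketch. It assumes the balanced F1 variant, F = 2PR / (P + R), and the usual definition of Micro-F as the F-score over counts pooled across all categories; the slide's own formula is not reproduced here, and the per-category counts are made up for the example.

```python
def f_score(tp, fp, fn):
    """F1 = 2PR / (P + R), rewritten in terms of raw counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f(counts):
    """Average of the per-category F-scores."""
    return sum(f_score(*c) for c in counts) / len(counts)


def micro_f(counts):
    """F-score over all decisions, pooling counts across categories."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f_score(tp, fp, fn)


# Hypothetical (true positives, false positives, false negatives) per category.
counts = [(8, 2, 2), (1, 1, 3)]
macro = macro_f(counts)
micro = micro_f(counts)
```

Macro-F weights every category equally, so the small, poorly classified second category pulls it down more than it pulls down Micro-F, which is dominated by the larger category.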


Effectiveness Per Category

The scores of the content classifier have been excluded because its performance is much worse than that of the text classifier.


Performance on Multi-kernel

• This table shows the classification effectiveness of the multi-kernel solution.


Conclusions

• A novel framework that exploits visual content and text features to facilitate online video categorization is presented.

• A Content-enriched Semantic Kernel, which extracts word relationships by clustering the videos with visual content features, is proposed.


PRESENTED APPROACH

EXTEND

Camera Motion Model

To enhance the presented approach, we will study the feasibility of using the Camera Motion Model as a video content feature, and its effect on classification performance and efficiency, using CC_WEB_VIDEO.


Questions

Thank You

