PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL
2011-11709 Seo Seok Jun
Abstract
Video information retrieval
◦ Finding information relevant to a query
Approach
◦ Pseudo-relevance feedback
◦ Negative PRF
Questions
◦ How does this paper approach content-based video retrieval?
◦ What is the advantage of negative PRF?
◦ What does this paper do to remove extreme outliers?
Introduction
Content-based access to video information: CBVR
◦ Allows users to query and retrieve based on audio and video
◦ Limited to capturing fairly low-level physical features
  Color, texture, shape, …
◦ Difficult to determine similarity metrics
  Different query scenarios -> different similarity metrics
  Animals -> by shape; sky, water -> by color
Introduction
◦ Making the similarity metric adaptive
Adapting the similarity metric
◦ Automatically discover the discriminating feature subspace
◦ How? Cast it as a classification problem
  Margin-based classifiers: SVMs, AdaBoost
  High performance; learn the maximal-margin hyperplane
  But a user's query provides only a little positive data, with no explicit negative data at all
Introduction
◦ Thus, to use these classifiers, more training data is needed
  Negative examples via random sampling
  Since the number of positive items in a collection is very small
  Risk: positive examples might be included as negatives
◦ In standard relevance feedback, users are asked to label examples: tedious!
◦ Automatic retrieval is essential!
Introduction
Automatic relevance feedback
◦ Based on a generic metric, not tailored to specific queries
◦ Negative feedback -> sample the bottom-ranked examples
  Ex) car -> differs from the query images in "shape"
◦ Feed back the negative data, re-weight, and refine the discriminating feature subspace
◦ A learning algorithm should outperform a universal similarity metric (used for all queries)
Introduction
Learning process
◦ Purpose
  Discover a better similarity metric
  Find the most discriminating subspace between positive and negative examples
◦ Cannot produce a fully accurate classification
  The training data is too small
◦ Negative distribution -> not reliable!
◦ Risk! -> feedback from an incorrect estimate
◦ Remedy: combine with the generic similarity metric
Related work
Briefly discusses some features of the complete system
◦ The Informedia Digital Video Library
◦ Relevance and Pseudo-Relevance Feedback
Pseudo-Relevance Feedback
Similar to relevance feedback
◦ Both originated in document retrieval
◦ But without any user intervention
◦ Few studies in multimedia retrieval yet
  Can no longer assume the top-ranked results are always relevant
  Due to the relatively poor performance of visual retrieval
Pseudo-Relevance Feedback
Positive-example-based learning
◦ Partially supervised learning
◦ Begins with a small number of positive examples
◦ No negative examples
◦ Goal: associate every example in the collection with one of the given categories
  Our goal? Producing a ranked list of the examples
Pseudo-Relevance Feedback
Semi-supervised learning
◦ Two classifiers
◦ A training set of labeled data
◦ A working set of unlabeled data (transductive learning)
◦ A paradigm for utilizing the information in unlabeled data
◦ Successful in image retrieval
◦ But the computation is too expensive
  Multimedia -> large collections
Pseudo-Relevance Feedback
Query: text + audio + image/video
Retrieving a set of relevant video shots
◦ A permutation of the video shots
◦ Sorted by their similarity
  Difference between two video segments -> similarity metric
◦ Video features: multiple perspectives
  Speech transcript, audio, camera motion, video frames
Pseudo-Relevance Feedback
Retrieval as a classification problem
◦ The data collection can be separated into positive/negative
◦ Mean average precision
  Precision and recall are common measures, but they do not take rank into consideration
  Area under an ideal recall/precision curve
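The evaluation measure on this slide can be sketched as follows: average precision rewards ranking relevant shots near the top, and mean average precision averages it over queries (a minimal illustration, not the paper's evaluation code).

```python
# Minimal sketch of (mean) average precision: unlike plain
# precision/recall, it takes the rank of relevant items into account.
def average_precision(ranked_relevance):
    """ranked_relevance: 0/1 relevance flags in rank order."""
    hits, precisions = 0, []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at this recall point
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(runs):
    """Mean of per-query average precision over all queries."""
    return sum(average_precision(r) for r in runs) / len(runs)
```

For example, the ranking `[1, 0, 1]` (relevant, irrelevant, relevant) scores (1/1 + 2/3)/2, lower than the perfect ranking `[1, 1, 0]`.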
Pseudo-Relevance Feedback
PRF
◦ Users' judgments -> replaced by the output of a base similarity metric
◦ f_b: base similarity metric
◦ p: sampling strategy
◦ f_l: learning algorithm
◦ g: combination strategy
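The four components on this slide compose into one feedback loop, which can be sketched roughly as below (all names are placeholders for illustration, not the paper's implementation):

```python
# Hedged sketch of the PRF loop as the slides factor it:
# f_b = base similarity metric, p = sampling strategy,
# f_l = learning algorithm, g = combination strategy.
def pseudo_relevance_feedback(query, collection, f_b, p, f_l, g):
    # Score everything with the base metric (stand-in for user judgments).
    base_scores = {x: f_b(query, x) for x in collection}
    # p: pick pseudo-labeled training data from the base ranking.
    positives, negatives = p(base_scores)
    # f_l: learn a query-specific scorer from the pseudo labels.
    classifier = f_l(positives, negatives)
    # g: combine base and learned scores, then re-rank.
    final = {x: g(base_scores[x], classifier(x)) for x in collection}
    return sorted(collection, key=lambda x: final[x], reverse=True)
```

Any concrete choice of `f_b`, `p`, `f_l`, and `g` (e.g. bottom-rank sampling plus an SVM) plugs into this skeleton unchanged.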
Pseudo-Relevance Feedback
Algorithm Details
Base similarity metric
◦ Dissimilarity of x to the query examples q1, …, qn
◦ A score is computed for each frame
  But the retrieval unit is a shot (multiple frames)
  Choose the maximal score over the frames in a shot
Sampling strategies
◦ From the speech transcript -> positive feedback
  Due to the high precision of textual retrieval
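The frame-to-shot step above amounts to max-pooling the per-frame scores, roughly as in this sketch (function and variable names are illustrative):

```python
# Sketch: scores are computed per frame, but the retrieval unit is a
# shot, so each shot gets the maximum score among its frames.
def shot_score(frame_scores):
    """Max-pool per-frame scores into one shot-level score."""
    return max(frame_scores)

def rank_shots(shots):
    """shots: dict mapping shot id -> list of per-frame scores.
    Returns shot ids sorted by score, best first."""
    scored = {sid: shot_score(frames) for sid, frames in shots.items()}
    return sorted(scored, key=scored.get, reverse=True)
```

Max-pooling means a shot is ranked by its single best-matching frame, so one strong frame is enough to surface the shot.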
Algorithm Details
Classification Algorithm
◦ SVMs
◦ Posterior probability
  Linearly normalize the scores, then combine: S = g(S_b, S_l) = α·S_b + (1 − α)·S_l
  α: combinational factor
Algorithm Details
Combination with text retrieval
◦ Externally provided video summaries are a source of textual information
  The posterior probability is set to 1 if the keyword exists in the summary
◦ Combined posterior probability: P = λ1·P_t + λ2·P_s
  P_t: posterior probability of transcript retrieval
  P_s: posterior probability of video-summary retrieval
  λ1, λ2: weights for each source; in the experiments, λ1 = 1, λ2 = 0.2
◦ The whole video as a unit -> too coarse to be accurate
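The weighted text combination above can be sketched as follows, using the slide's experimental weights (λ1 = 1, λ2 = 0.2); the summary term is set to 1 when the keyword appears in the summary, as the slide states. Names are illustrative.

```python
# Sketch of the text-side combination: transcript retrieval and
# video-summary retrieval each contribute a posterior, weighted by
# lambda1 and lambda2 (1 and 0.2 in the slides' experiments).
LAMBDA_TRANSCRIPT = 1.0
LAMBDA_SUMMARY = 0.2

def text_posterior(p_transcript, keyword_in_summary):
    # Summary posterior is 1 if the keyword occurs in the summary, else 0.
    p_summary = 1.0 if keyword_in_summary else 0.0
    return LAMBDA_TRANSCRIPT * p_transcript + LAMBDA_SUMMARY * p_summary
```

With these weights, a summary match adds a fixed 0.2 bonus on top of the transcript score.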
Pseudo-Relevance Feedback
Positive examples
◦ The query examples
Negative examples
◦ The strongest negative examples
Feedback only one time
◦ Due to the computational cost
Automatically feed back training data based on the generic similarity metric
◦ To learn an adaptive similarity metric
◦ Generalizes the discriminating subspace across various queries
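The sampling described above (query examples as the only positives, bottom-ranked items as the strongest pseudo-negatives, one round of feedback) can be sketched like this, with illustrative names:

```python
# Sketch of negative-PRF sampling: positives are the query examples;
# pseudo-negatives are the bottom-ranked items under the base metric,
# i.e. the examples most dissimilar to the query.
def sample_training_data(base_scores, query_examples, n_neg):
    """base_scores: dict item -> base similarity (higher = more similar)."""
    ranked = sorted(base_scores, key=base_scores.get, reverse=True)
    positives = list(query_examples)
    negatives = ranked[-n_neg:]  # bottom of the ranking
    return positives, negatives
```

Sampling from the very bottom of the ranking keeps the risk low that a true positive is mislabeled as negative, which is the hazard random sampling faces.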
Pseudo-Relevance Feedback
Why is it good?
◦ Good generalization ability of margin-based learning algorithms
◦ The assumption of an isotropic data distribution is invalid
◦ The discriminating directions vary with different queries and topics
  Sky -> color; car -> shape
◦ In this case, PRF provides a better similarity metric than the generic one
Pseudo-Relevance Feedback
Two test cases
◦ Positive data
  Along the edge of the data collection
  At the center of the data collection
◦ In both cases PRF is superior
  Base similarity metric: a generic metric
  It cannot be modified across queries
Pseudo-Relevance Feedback
Pseudo-Relevance Feedback
The PRF metric can be adapted based on the global data distribution and the training data
◦ By feeding back the negative examples
◦ Near-optimal decision boundary
◦ Associates a higher score with examples farther away from the negative data
◦ Good when the positive data are near the margin
  Common in high-dimensional spaces
Pseudo-Relevance Feedback
Downside
◦ Some negative outliers are assigned a higher score than any positive data -> more false alarms
◦ Solution
  Combine the base metric and the PRF metric
  Smooths out most of the outliers
  Just a simple linear combination (1:1)
  A reasonable trade-off between local classification behavior and global discriminating ability
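The 1:1 linear combination on this slide can be sketched in one line: averaging the base and learned scores pulls down a negative outlier that the classifier scores too highly, because its base similarity to the query is low.

```python
# Sketch of the outlier fix: blend the base metric and the learned PRF
# metric. alpha = 0.5 is the slides' simple 1:1 linear combination.
def combined_score(base, learned, alpha=0.5):
    return alpha * base + (1 - alpha) * learned
```

For instance, an outlier with base score 0.2 but learned score 1.0 drops to 0.6 after blending, below a genuine positive with scores 0.9 and 0.9 (which stays at 0.9).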
Experiment
Video: TREC Video Retrieval Track; text: NIST
◦ 40 hours of MPEG-1 video
Audio: split from the video
◦ Down-sampled to 16 kHz, 16-bit samples
Speech recognition system
◦ Trained on broadcast-news transcripts
Image processing side
◦ Low-level image features: color and texture
◦ Queries given as XML
Experiment
Results
Conclusion
◦ Retrieval cast as a classification task
◦ Machine learning theory applied to video retrieval
◦ SVMs learn to weight the discriminating features
◦ Negative PRF
  Separates the means of the distributions of the negative and positive examples
◦ Smoothing via combination with the base metric