Image Retrieval Part II

Upload: kyoko

Post on 12-Jan-2016


Page 1: Image Retrieval Part II

Image Retrieval Part II

Page 2: Image Retrieval Part II


Topics

• Applications of CBIR in digital library

• Human-controlled interactive CBIR

• Machine-controlled interactive CBIR

Page 3: Image Retrieval Part II

Query by Example

Query Sample

Results

CBIR

“Get similar images”

• Pick query examples and ask the system to retrieve “similar” images.

Page 4: Image Retrieval Part II

QBIC(TM) – IBM's Query By Image Content

http://www.hermitagemuseum.org/fcgi-bin/db2www/qbicSearch.mac/qbic?selLang=English

Page 6: Image Retrieval Part II


NETRA @ UCSB

http://nayana.ece.ucsb.edu/M7TextureDemo/Demo/client/M7TextureDemo.html

Page 7: Image Retrieval Part II

Medical Decision Support

• Breast cancer is among the top killers of women in the developed world.

• Early detection of malignancy can greatly reduce the risk of death.

Mammogram

Page 8: Image Retrieval Part II


Nationwide Support for Physicians

Page 9: Image Retrieval Part II

Summary of Fundamental CBIR

• CBIR using query by example

• CBIR Algorithm:

• First step --- Image indexing

• Second step --- Content matching

• Third step --- Ranking and displaying

• Relevance feedback (RF)

Page 10: Image Retrieval Part II

Step I: Image Indexing

• Image content = { Color, Shape, Texture }

• Color: color histogram, color moments

• Shape: chain codes, Fourier descriptors

• Texture: Gabor wavelet features, co-occurrence matrix

• Feature vector $v$ = { Color, Shape, Texture }

• Example: each image is indexed by its own feature vector, $v_1, v_2, v_3, v_4, v_5$, where each $v$ is a numeric vector such as $v = [0.2, \ldots, 0.1]$

Page 11: Image Retrieval Part II

Example of feature vectors in relational database

Page 12: Image Retrieval Part II

Step II: Content Matching

• The content similarity measure is obtained by a distance function $D(v, v_q)$,

where $v_q = [v_{q1}, v_{q2}, \ldots, v_{qN}]^T$ is the feature vector of the query image and

$v = [v_1, v_2, \ldots, v_N]^T$ is the feature vector of an image in the database

• Many distance functions have been used:

• Euclidean distance: $D(v, v_q) = \left( \sum_{i=1}^{N} (v_i - v_{qi})^2 \right)^{1/2}$

• l1-norm: $D(v, v_q) = \sum_{i=1}^{N} |v_i - v_{qi}|$

• cosine measure: $S(v, v_q) = \dfrac{v \cdot v_q}{\|v\| \, \|v_q\|}$
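The three measures named on this slide can be sketched in a few lines. This is a minimal illustration using plain Python lists as feature vectors; the function names are hypothetical and not taken from the slides.

```python
import math

def euclidean_distance(v, v_q):
    # Square root of the summed squared component differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, v_q)))

def l1_distance(v, v_q):
    # Sum of absolute component differences (l1-norm of v - v_q).
    return sum(abs(a - b) for a, b in zip(v, v_q))

def cosine_similarity(v, v_q):
    # Dot product divided by the product of the vector lengths.
    # Raises ZeroDivisionError if either vector is all zeros.
    dot = sum(a * b for a, b in zip(v, v_q))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_q = math.sqrt(sum(b * b for b in v_q))
    return dot / (norm_v * norm_q)
```

Note that the first two are distances (smaller means more similar), while the cosine measure is a similarity (larger means more similar), so the sort direction in the ranking step depends on which one is used.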

Page 13: Image Retrieval Part II

Step III: Similarity Ranking

• Calculate $D_j = D(v_j, v_q)$ for each image in the database:

$D(v_1, v_q), D(v_2, v_q), D(v_3, v_q), \ldots, D(v_j, v_q), \ldots, D(v_J, v_q)$

• Sort $D_j$ from most similar to least similar, i.e. in increasing order of distance (assume J=10):

$D_6 \le D_2 \le D_9 \le D_{10} \le D_7 \le D_3 \le D_8 \le D_1 \le D_4 \le D_5$

• top 3 images: image6, image2, image9

• top 5 images: image6, image2, image9, image10, image7
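The ranking step amounts to sorting the images by their distance to the query and keeping the first few. The sketch below uses made-up distance values, chosen only so that the ordering matches the J=10 example on this slide; the function name is hypothetical.

```python
def rank_by_distance(distances, top_k):
    # distances maps an image name to its distance D_j from the query.
    # The smallest distance (most similar image) comes first.
    ordered = sorted(distances, key=distances.get)
    return ordered[:top_k]

# Hypothetical distances, chosen to reproduce the slide's ordering.
distances = {
    "image1": 0.8, "image2": 0.2, "image3": 0.6, "image4": 0.9,
    "image5": 1.0, "image6": 0.1, "image7": 0.5, "image8": 0.7,
    "image9": 0.3, "image10": 0.4,
}
top3 = rank_by_distance(distances, 3)
top5 = rank_by_distance(distances, 5)
```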

Page 14: Image Retrieval Part II


Problems with CBIR

In essence, retrieval is a pattern recognition problem with special characteristics.

• Huge volume of (visual) data.

• High dimensionality in feature space.

• Query design: Gap between the high level concepts and low level features.

• Linear matching criteria: A mismatch to the popular human perception model.

Page 15: Image Retrieval Part II

Example: Compressed Domain

Visual database in the compressed domain:

• DCT: many of the current image/video coding standards: JPEG, MPEG-1/2, and H.261/3.

• Wavelets/VQ: related to the new image coding standard JPEG2000.

Significant gap between human visual perception and information representation in DCT/wavelet.

[Figure: Image.jpg / video.mpg (JPEG/MPEG) → DCT coeff. → feature vector → Database]

Page 16: Image Retrieval Part II

State-of-the-art

• Human controlled interactive CBIR (HCI-CBIR)

• Integrating human perception into content-based retrieval.

• Machine controlled interactive CBIR (MCI-CBIR)

• To reduce bandwidth requirement for browsing and searching over the Internet.

• To minimize errors caused by excessive human involvement.

Page 17: Image Retrieval Part II

Integrating human perception into content-based retrieval

Page 18: Image Retrieval Part II

Scenario

• Machine provides initial retrieval results, through query-by-keyword, sketch, or example, etc.;

• Iteratively:

• User provides judgment on the current results as to whether, and to what degree, they are relevant to her/his request;

• The machine learns and tries again.

Page 19: Image Retrieval Part II

Relevance Feedback

[Figure: Initial sample → Query → 1st Result → Feedback → 2nd Result → Feedback]

• User gives feedback on the query results

• System recalculates feature weights and the modified query

Page 20: Image Retrieval Part II


Basic GUI for Relevance Feedback

Slider

Checkbox

Page 21: Image Retrieval Part II

ImageGroup

Palette Panel

Result View

Page 22: Image Retrieval Part II

3D MARS

Structure

Color

Texture

Initial Display

Result

Page 23: Image Retrieval Part II

Human-Controlled Interactive CBIR (HCI-CBIR)

• An attractive solution to numerous applications

• Main feature: an active role played by users to improve retrieval accuracy

• State-of-the-art

• query design considerations

• linear criteria in similarity ranking

Page 24: Image Retrieval Part II


Effective Retrieval through User Interaction

• Current systems:

• QBIC: interactive region segmentation (IBM).

• FourEyes: including human in the image annotation and retrieval loop, (MIT Media Lab).

• WebSEEk: dynamic feature vector recomputation based on the user’s feedback (Columbia University).

• PicHunter: a Bayesian framework for content based image retrieval (NEC Research Institute).

• PicToSeek: (UVA).

• MARS: a relevance feedback architecture in image retrieval (UIUC).

Page 25: Image Retrieval Part II


A “New” Proposal for HCI-CBIR

The framework: Relevance feedback

The key features:

Modeling: mapping a high level concept to low level features

Matching:

capturing user’s perceptual subjectivity to modify the query using non-linear measurement

overcoming the difficulties faced by the traditional linear matching criteria

Page 26: Image Retrieval Part II

The Relevance Feedback Framework

The goal: measure feature relevance to improve performance of image matching in retrieval

A supervised learning procedure based on regression

For a given query z and a set of retrieved items $x_n \in R^P$, $n = 1, 2, \ldots, N$, categorize the $x_n$ into two classes:

• a relevant class (visually similar to z): x_m, m = 1, 2, …, M

• an irrelevant class (not similar to z): x_q, q = 1, 2, …, Q

Structure a new query based on the information in x_m and x_q

Use the new query in the next round of retrieval

Page 27: Image Retrieval Part II

Query Modification Model 1

The new query:

$z_1 = \{z_i\}_{i=1}^{P} = \frac{1}{M} \sum_{m=1}^{M} x_m$

[Figure legend: nonrelevant image, relevant image, original query, modified query]

Page 28: Image Retrieval Part II

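Query Modification Models 2 & 3 (Anti-Reinforcement Learning)

The new query:

$z_2 = \{z_i\}_{i=1}^{P} = \bar{x}_R - \alpha_N (\bar{x}_N - z)$

$z_3 = z + \alpha_R (\bar{x}_R - z) - \alpha_N (\bar{x}_N - z)$

Where

• $\alpha_R, \alpha_N$: small positive constants

• $\bar{x}_R = \frac{1}{M} \sum_{m=1}^{M} x_m$: center of relevant items

• $\bar{x}_N = \frac{1}{Q} \sum_{q=1}^{Q} x_q$: center of non-relevant items

• $z$: query at previous iteration

Example (1-D):

$z_2 = z_a = \bar{x}_R - \alpha_N (\bar{x}_{N1} - z)$ when the non-relevant center is $\bar{x}_{N1}$

$z_2 = z_b = \bar{x}_R - \alpha_N (\bar{x}_{N2} - z)$ when the non-relevant center is $\bar{x}_{N2}$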
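The three query modification models can be sketched as follows. The signs and the constants in the transcript are damaged, so this sketch assumes the usual reading: model 1 is the mean of the relevant items, model 2 starts from that mean and moves away from the non-relevant center, and model 3 shifts the previous query toward the relevant center and away from the non-relevant one. Function names and default constants are hypothetical.

```python
def mean_vector(vectors):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def query_model_1(relevant):
    # New query = center of the relevant items.
    return mean_vector(relevant)

def query_model_2(z, relevant, nonrelevant, alpha_n=0.1):
    # Assumed form: x_R - alpha_N * (x_N - z).
    x_r = mean_vector(relevant)
    x_n = mean_vector(nonrelevant)
    return [r - alpha_n * (n - zi) for r, n, zi in zip(x_r, x_n, z)]

def query_model_3(z, relevant, nonrelevant, alpha_r=0.5, alpha_n=0.1):
    # Assumed form: z + alpha_R * (x_R - z) - alpha_N * (x_N - z).
    x_r = mean_vector(relevant)
    x_n = mean_vector(nonrelevant)
    return [zi + alpha_r * (r - zi) - alpha_n * (n - zi)
            for zi, r, n in zip(z, x_r, x_n)]
```

In all three cases the result is used as the query for the next retrieval round.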

Page 29: Image Retrieval Part II

Nonlinear Search Unit

Non-linear (Gaussian) Search Unit (NSU):

$f(x, z) = \sum_{i=1}^{P} G(x_i, z_i) = \sum_{i=1}^{P} \exp\left( -\frac{(x_i - z_i)^2}{2\sigma_i^2} \right)$

- image feature vector: $x = [x_1, \ldots, x_i, \ldots, x_P]^T$

- adjustable query vector: $z = [z_1, \ldots, z_i, \ldots, z_P]^T$

- the tuning parameters (NSU widths): $\sigma_i = \max_m |x_{mi} - z_i|, \; i = 1, \ldots, P$

Small $\sigma_i$: a relevant feature (sensitive to change)

Large $\sigma_i$: a nonrelevant feature (insensitive to change)
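A minimal sketch of a Gaussian search unit is given below. It assumes the NSU sums one 1-D Gaussian per feature component and that each width is the largest deviation of the relevant items from the query along that component; both are readings of a damaged formula, and the names and the small floor `eps` are hypothetical.

```python
import math

def nsu_widths(relevant, z, eps=1e-9):
    # Assumed width rule: sigma_i = max over relevant items of |x_mi - z_i|.
    # eps keeps a width from becoming exactly zero.
    return [max(max(abs(x[i] - z[i]) for x in relevant), eps)
            for i in range(len(z))]

def nsu_similarity(x, z, sigma):
    # Sum of per-component Gaussians; larger means more similar.
    return sum(math.exp(-((xi - zi) ** 2) / (2.0 * si ** 2))
               for xi, zi, si in zip(x, z, sigma))
```

A component on which the relevant items agree closely with the query gets a small width, so a deviation on that component lowers the similarity sharply.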

Page 30: Image Retrieval Part II

Linear Search Unit (LSU)

• To benchmark the performance of the NSU

• To initiate the search

• The parameters: exactly the same as in the NSU

$S_{linear}(x, z) = d^2(x, z) = \sum_{i=1}^{P} \frac{(x_i - z_i)^2}{\sigma_i^2}$
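For comparison with the Gaussian unit, a linear search unit can be sketched as a weighted squared Euclidean distance that reuses the same widths. The exact form on the slide is not recoverable from the transcript, so this form and the function name are assumptions.

```python
def lsu_distance(x, z, sigma):
    # Weighted squared Euclidean distance; smaller means more similar.
    return sum(((xi - zi) / si) ** 2 for xi, zi, si in zip(x, z, sigma))
```

Unlike the Gaussian unit, this measure grows without bound as a component moves away from the query.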

Page 31: Image Retrieval Part II

Architecture for Interactive CBIR

[Figure: block diagram. Blocks: query image; image database; Feature Extraction & Similarity Measure; RAM (feature database), storing of feature vectors; Initial Searching (when iteration n=0); Interactive Searching (when iteration n>0) with the (tentative) Query; User and User Interaction; Weight-parameter updating; Perceptual similarity Measure. Output: k retrieved images.]

Page 32: Image Retrieval Part II

VQ Codewords as “content descriptors”

[Figure: image blocks are matched against a codebook of codewords i = 1, 2, ..., n, producing code labels i.]

The usage of codewords reflects the content of the input image encoded.
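One way to turn codeword usage into a content descriptor is a normalized histogram of code labels. The sketch below assumes nearest-codeword assignment by squared Euclidean distance; the function names and the toy codebook are hypothetical.

```python
def nearest_codeword(block, codebook):
    # Index of the codeword closest to the block.
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(block, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def label_histogram(blocks, codebook):
    # Fraction of image blocks assigned to each codeword.
    counts = [0] * len(codebook)
    for block in blocks:
        counts[nearest_codeword(block, codebook)] += 1
    return [count / len(blocks) for count in counts]

# Toy example: two codewords, four flattened image blocks.
codebook = [[0.0, 0.0], [10.0, 10.0]]
blocks = [[1.0, 1.0], [9.0, 9.0], [0.0, 2.0], [11.0, 10.0]]
histogram = label_histogram(blocks, codebook)
```

Two images with similar histograms use the codebook in a similar way, which is the sense in which codeword usage reflects content.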

Page 33: Image Retrieval Part II

Two-level WT/VQ coding (1 bpp)

[Figure: Mallat’s two-level decomposition with a multiresolution codebook. Codebooks: CB1 (HL2), CB2 (HL1), CB3 (VL2), CB4 (VL1), CB5 (DL2). Subband bit rates shown range from 0 bpp to 8 bpp. Label histograms: H1, H2, H3, H4, H5.]

Page 34: Image Retrieval Part II

Test Database 1: Brodatz database

• A texture image database provided by Manjunath, at http://vivaldi.ece.ucsb.edu/users/wei/codes.html

• 1,856 patterns in 116 different classes

• 16 similar patterns in each class

• Maintained as a single unclassified image database

Page 35: Image Retrieval Part II

Queries (the Brodatz Database)

[116 different image classes]

Page 36: Image Retrieval Part II


Performance Comparison

• Methods Compared

• LSU2: linear search unit & query model 2

• NSU2: non-linear search unit & query model 2

• Interactive CBIR in MARS: Multimedia Analysis and Retrieval System (developed at UIUC)

Page 37: Image Retrieval Part II

Retrieval Results (the Brodatz Database)

Table 1. Average retrieval rate (%)

Method   t=0    t=1    t=2    t=3
LSU2     73.7   83.0   85.1   85.9
NSU2     73.7   84.9   88.2   89.2
MARS     67.0   75.1   76.4   76.7

Note: The retrieval rate is defined as the average percentage of images belonging to the same class as the query in the top 16 matches.

Page 38: Image Retrieval Part II

iARM: Interactive-based Analysis and Retrieval of Multimedia on the Internet

@iarm.ee.ryerson.ca:8000/corel

Page 39: Image Retrieval Part II

Strategy

• iARM implements interactive retrieval for a large image database, running on the J2EE Web Server.

• Interaction architecture.

• Based on non-linear relevance feedback: a multi-model SRBF network.

• Positive and negative feedbacks.

• Properties: local and non-linear learning; fast and robust on a small amount of input data.

Page 40: Image Retrieval Part II

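A Single-pass Radial Basis Function (SRBF) Network

• The positive examples: $x_m^{(r)}, \; m = 1, 2, \ldots, M$;

• The SRBF network characterizes the query by multiple clusters, each of which is modeled by a P-dimensional Gaussian distribution:

$G_m(x, x_m^{(r)}, \sigma_m) = \exp\left( -\frac{\sum_{p=1}^{P} \alpha_p (x_p - x_{mp}^{(r)})^2}{2\sigma_m^2} \right)$   (1)

where $\sigma_m = \min_i \| x_m^{(r)} - x_i^{(r)} \|, \; i = 1, 2, \ldots, M, \; i \ne m$   (2)

and the feature weights $\alpha_p$ are those of the weighted Euclidean space (3).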

Page 41: Image Retrieval Part II

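SRBF Network Cont..

• The Weighted-Euclidean Space

$\alpha_p = \begin{cases} 1 & \text{if } \xi_p = 0, \\ 1/\xi_p & \text{if } \xi_p \ne 0 \end{cases}$   (3)

where $\xi_p = \left[ \frac{1}{M} \sum_{m=1}^{M} (x_{mp}^{(r)} - \bar{x}_p)^2 \right]^{1/2}$   (4)

• A summation of the M Gaussian units (1) yields the similarity function for the input vector $x$ as follows:

$S(x) = \sum_{m=1}^{M} G_m(x, x_m^{(r)}, \sigma_m)$   (5)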
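The SRBF similarity can be sketched as a sum of Gaussian units, one centered on each positive example. The sketch assumes that each feature weight is the inverse of that feature's standard deviation over the positive examples (or 1 when the deviation is zero) and that each unit's width is the distance to its nearest other positive example. These are readings of damaged formulas, and all names are hypothetical. It needs at least two distinct positive examples.

```python
import math

def feature_weights(positives):
    # Assumed rule: weight = 1 / std. deviation of the feature, or 1 if it is 0.
    count = len(positives)
    weights = []
    for column in zip(*positives):
        mean = sum(column) / count
        deviation = math.sqrt(sum((v - mean) ** 2 for v in column) / count)
        weights.append(1.0 if deviation == 0 else 1.0 / deviation)
    return weights

def cluster_widths(positives):
    # Assumed rule: width = distance to the nearest other positive example.
    return [min(math.dist(center, other)
                for i, other in enumerate(positives) if i != m)
            for m, center in enumerate(positives)]

def srbf_similarity(x, positives, weights, widths):
    # Sum of one weighted Gaussian unit per positive example.
    total = 0.0
    for center, sigma in zip(positives, widths):
        d2 = sum(w * (xp - cp) ** 2 for w, xp, cp in zip(weights, x, center))
        total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total
```

Because each positive example contributes its own unit, an image close to any one of several clusters can score highly, which a single-center query cannot express.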

Page 42: Image Retrieval Part II

Single-class approach

Multi-class approach

nonrelevant image

relevant image

original query

modified query

Page 43: Image Retrieval Part II

Negative Feedback Strategy

• Tuning decision boundary with negative samples: $x_n^{(ir)}, \; n = 1, 2, \ldots, N$

• Antireinforced Learning Algorithm:

• If $D_i \le D_k, \; \forall k \ne i$   (6)

where $D_i = \sum_{p=1}^{P} (x_p^{(ir)} - x_{ip}^{(r)})^2$

• Then

$x_i^{(r)}(t+1) = x_i^{(r)}(t) - \eta(t) [x_n^{(ir)}(t) - x_i^{(r)}(t)]$   (7)
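The anti-reinforced update can be sketched as follows: find the cluster center closest to a negative sample and push that center away from it. The minus sign and the learning-rate symbol are not legible in the transcript, so this sketch assumes the usual anti-reinforced form; the function name and the default rate `eta` are hypothetical.

```python
def antireinforce(centers, negative, eta=0.1):
    # Find the center with the smallest squared distance to the negative sample.
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    winner = min(range(len(centers)),
                 key=lambda k: squared_distance(negative, centers[k]))
    # Move the winning center away from the negative sample (in place).
    centers[winner] = [c - eta * (n - c)
                       for c, n in zip(centers[winner], negative)]
    return winner
```

Only the winning center moves, so the rest of the decision boundary is left unchanged.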

Page 44: Image Retrieval Part II

Performance of iARM

• Using the Corel Image Collection, containing 40,000 real-life images, www.corel.com.

• A total of 400 queries were generated, and relevance judgments were based on the “ground truth” from Corel.

• Multiple descriptors:

<shape, color, texture>

<Fourier descriptors, HSV color histogram& color moments, Gabor Wavelet transform>

Page 45: Image Retrieval Part II

Result

Table 1: Average Precision Rate (%) obtained by retrieving 400 queries, measured from the top 16 retrievals.

Method                  Average precision rate
Non-interactive CBIR    53%
iARM, r(1)              80.08%
iARM, r(2)              85.99%
iARM, r(3)              87.58%
iARM, r(8)              89.00%

Page 46: Image Retrieval Part II

Test 1: Fast and robust with a small number of relevance feedbacks

Page 47: Image Retrieval Part II

Example: Looking for “model”.

0. Start by choosing the image at the bottom right corner as the query.

Page 48: Image Retrieval Part II

1. Result after the initial search; five relevant images are then given as feedback.

Page 49: Image Retrieval Part II

2. Result after one relevance feedback: all the top sixteen are relevant.

Page 50: Image Retrieval Part II

Test 2: Non-linearity

Page 51: Image Retrieval Part II

1. Initial result for a kung fu performance; here shape plays a very important role, but only seven images are relevant.

Page 52: Image Retrieval Part II

2. After one feedback: non-relevant images that have a similar shape were removed, and all returned images are kung fu.

Page 53: Image Retrieval Part II

After two feedbacks:

Case 1 (left): only the full-body actions were selected

Case 2 (right): only the half-body actions were selected

Page 54: Image Retrieval Part II

Test 3: Multi-model capturing

Page 55: Image Retrieval Part II

Multi-modeling can very precisely capture the local context defined by the current query session

Initial results Final results

Page 56: Image Retrieval Part II

Summary

Incorporating human perception in retrieval:

• significant improvement over the simple CBIR

• impressive improvement over other relevance feedback systems

• satisfactory support for user queries

The query models: effective in compressed domain

Page 57: Image Retrieval Part II

Machine Controlled Interactive Content-based Retrieval (MCI-CBR)

Page 58: Image Retrieval Part II

Problem with HCI-CBR

• User interaction requires

• User to specify `relevance’ or `nonrelevance’

• Inconsistent human performance

• Repeating many feedbacks for convergence

• Transmission of sample files, i.e., high bandwidth

• User-friendly environment (ideal preference)

• Fewer training samples, e.g., < 20 images/iteration

• Fewer feedbacks, e.g., 1-2 iterations

Page 59: Image Retrieval Part II

Search distributed DVL’s on Internet

[Figure: distributed search architecture. Components: Search Agent Broker; SOLO; databases DB-1, DB-2, ..., DB-N, each shown with Content, Feature, an Archivist Engine, and a SAPM Host. Callouts: Wide Coverage; Full Features Search; Third party Workshop; Third party Service; Progressive Search; Knowhow Control.]

Page 60: Image Retrieval Part II

Machine Controlled Interactive CBIR (MCI-CBIR)

A key research area in multimedia processing

Aim: To incorporate self-learning capability into CBIR which allows:

• automatic & semi-automatic retrieval

• minimization of user participation to reduce errors caused by inconsistent human performance

• reduction of bandwidth requirement in Internet browsing and searching

Page 61: Image Retrieval Part II

HCI-CBIR vs MCI-CBIR

[Figure: two block diagrams. Human Controlled Interactive System: Query, Search Unit, Image Database, Retrieval results, User interaction, Relevance Feedback. Machine Controlled Interactive System: Query, Search Unit, Image Database, Retrieval results, Relevance Feedback (no user interaction block).]

Page 62: Image Retrieval Part II

The Essence of MCI-CBR

• Based on two feature spaces: R1 and R2

• Space R1 is of reasonable quality & easy to calculate in retrieval, such as:

• DCT

• DWT

• Space R2 is of very high quality, but potentially computationally intensive in relevance identification

• Descriptors extracted from un-compressed images

• Object and region based descriptors

Page 63: Image Retrieval Part II

Architecture for Automatic Interactive CBR

[Figure: block diagram. Blocks: Query; Feature Extraction; Search Unit; WT/VQ Coded Image Database (compressed domain processing); Retrieval results; Relevance Classification; Relevance Feedback Module with RBFN; Display images; User interaction (different perceptual subjectivity) in Semi-automatic Mode.]

Page 64: Image Retrieval Part II

SOTM

• Relevance classification is performed by a Self-Organizing Tree Map (SOTM) which offers:

• Independent learning based on competitive learning technique

• A unique feature map that preserves topological ordering

• SOTM is more suitable than the conventional SOM when the input feature space is of high dimensionality

Page 65: Image Retrieval Part II

SOTM algorithm

Step one
Initialize the root node with a point selected at random from the input space.

The root node is represented by the blue cross, and the data space is represented by the green spots.

Page 66: Image Retrieval Part II

SOTM algorithm Cont..

Step two
Randomly select a new data point $x$, and compute the Euclidean distance, $d_j$, to node $w_j$ (j = 1,....,J), where J is the total number of nodes, here J = 1.

Step three
Select the winning node, j*, with minimum $d_j$.

Page 67: Image Retrieval Part II

SOTM algorithm Cont..

Step four
If $d_{j^*}(x, w_{j^*}) \le$ Threshold, where Threshold decreases with time, then assign $x$ to the j*th cluster and update the weight vector according to:

$w_{j^*}(t+1) = w_{j^*}(t) + \alpha(t) [x(t) - w_{j^*}(t)]$,

where $\alpha(t)$ is the learning rate, $0 < \alpha(t) < 1$. The position of the updated node is indicated by the red arrow.

Page 68: Image Retrieval Part II

SOTM algorithm Cont..

Step four, cont.
Else form a new subnode starting with $x$. The map now has two nodes, as indicated by the two blue arrows.

Step five
Continue from step 2.
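Steps one to five can be sketched as a short loop. This is a simplified illustration rather than the full SOTM: it keeps a flat list of nodes instead of a tree, uses a constant learning rate, and assumes an exponentially decaying threshold with a lower floor. The decay schedule, the floor, and all parameter names are hypothetical.

```python
import math
import random

def sotm(data, threshold, decay=0.99, min_threshold=1.0,
         learning_rate=0.1, iterations=200, seed=0):
    rng = random.Random(seed)
    # Step one: root node starts at a randomly chosen data point.
    nodes = [list(rng.choice(data))]
    for _ in range(iterations):
        # Step two: pick a data point and measure its distance to every node.
        x = rng.choice(data)
        distances = [math.dist(x, w) for w in nodes]
        # Step three: the winning node is the closest one.
        winner = min(range(len(nodes)), key=lambda j: distances[j])
        if distances[winner] <= threshold:
            # Step four: move the winner toward x.
            nodes[winner] = [w + learning_rate * (xi - w)
                             for w, xi in zip(nodes[winner], x)]
        else:
            # Step four, cont.: x is too far away, so it starts a new node.
            nodes.append(list(x))
        # The threshold decreases with time (assumed schedule).
        threshold = max(min_threshold, threshold * decay)
    return nodes
```

Because new nodes are only ever created at data points and only moved toward data points, no node ends up in a region without data.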

Page 69: Image Retrieval Part II

SOTM vs SOM

SOTM: no nodes converge to areas of zero data density

SOM: nodes converge to areas of zero data density

Page 70: Image Retrieval Part II

Retrieval Procedure (1)

a. Initial Search: for a given query z, retrieve the set of K most similar images, F = {O_1, O_2, …, O_K}, using the ‘nearest neighbor rule’ and the feature space R1 for retrieval

b. Characterization: use features in feature space R2 to describe the retrieved images:

O_k → x_k

and obtain the training set:

F(R2) = {x_1, …, x_k, …, x_K}, x_k ∈ R2

Page 71: Image Retrieval Part II

Retrieval Procedure (2)

c. Relevance classification: use SOTM to classify the training vectors in F(R2), then use the results to label the retrieved images:

{O_k, y_k}, k = 1, …, K, where y_k = 1 if O_k is a relevant image, otherwise y_k = 0

d. Relevance Feedback Module: implement interactive learning methods (e.g., RF, non-linear RF, or RBFN) using the training set {O_k, y_k}, k = 1, …, K, and ‘feature space R1’ for retrieval

e. Go back to Step (b)
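Steps (a) to (e) can be sketched as a loop with pluggable parts. In this illustration the relevance classifier and the query update are passed in as functions, standing in for the SOTM and for the relevance feedback module; the database layout and all names are hypothetical.

```python
def automatic_retrieval(query_r1, database, classify_relevance,
                        update_query, k=16, iterations=2):
    # database: list of (image_id, r1_features, r2_features) tuples.
    def search(query):
        # Nearest neighbor rule in feature space R1.
        ranked = sorted(database, key=lambda item: sum(
            (a - b) ** 2 for a, b in zip(item[1], query)))
        return ranked[:k]

    query = list(query_r1)
    retrieved = search(query)                          # (a) initial search
    for _ in range(iterations):
        r2_vectors = [item[2] for item in retrieved]   # (b) characterization
        labels = classify_relevance(r2_vectors)        # (c) y_k in {0, 1}
        relevant = [item[1] for item, y in zip(retrieved, labels) if y == 1]
        if not relevant:
            break
        query = update_query(query, relevant)          # (d) feedback in R1
        retrieved = search(query)                      # (e) next round
    return [item[0] for item in retrieved]

def mean_query(query, relevant):
    # Simple stand-in for the feedback module: center of relevant items.
    return [sum(column) / len(relevant) for column in zip(*relevant)]
```

The point of the split is that the expensive R2 features are computed only for the K retrieved images in each round, while the search over the whole database stays in the cheaper R1 space.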

Page 72: Image Retrieval Part II

Experiment Setup

• Brodatz texture database (1,856 images), using 116 queries

• ARR: average retrieval rate based on ground truth classes, where the retrieval rate of query q is

$RR(q) = \frac{NF(q)}{NG(q)} \in [0, 1]$

$NG(q)$ denotes the size of the ground truth set

$NF(q)$ denotes the number of ground truth images found within the top 16

• Feature representations

• R1: MHI features on compressed WT/VQ images

• R2: Gabor wavelet features
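The retrieval rate and its average over queries can be computed as below. This is a small sketch in which each retrieved image is represented only by its class label; the function names and the sample class labels are hypothetical.

```python
def retrieval_rate(retrieved_classes, query_class, ground_truth_size, top=16):
    # Number of ground truth images found in the top matches,
    # divided by the size of the ground truth set.
    found = sum(1 for c in retrieved_classes[:top] if c == query_class)
    return found / ground_truth_size

def average_retrieval_rate(rates):
    # Mean of the per-query retrieval rates.
    return sum(rates) / len(rates)

# Example: 12 of the top 16 matches share the query's class.
retrieved = ["classA"] * 12 + ["classB"] * 4
rate = retrieval_rate(retrieved, "classA", ground_truth_size=16)
```

With 16 images per class and the top 16 matches examined, a rate of 1.0 means every image of the query's class was retrieved.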

Page 73: Image Retrieval Part II

Automatic vs User Controlled Retrieval

Table 1: A comparison of ARR (%) between the MCI-CBR method and the HCI-CBR method, obtained by retrieving 116 queries, using the Brodatz database.

Method          ARR (%), 0 Iter.   ARR (%), 1 Iter.   ARR (%), 4 Iter.   User’s RF (Iter.)
(a) MCI-CBR     63.42              71.66              76.51              -
(b) HCI-CBR     63.42              77.64              80.17              4
∆ = (b) - (a)   -                  +5.98              +3.66

Page 74: Image Retrieval Part II


Retrieval Example

Non-Interactive Retrieval Automatic Interactive Retrieval

Page 75: Image Retrieval Part II

Semi-Automatic vs User-controlled Retrievals

A comparison of AVR at convergence, between the semi-automatic and HCI-CBR methods

Page 76: Image Retrieval Part II


Retrieval using DCT-compressed images

• Compressed domain descriptor is based on energy histogram of the low frequency DCT coefficients [Lay, 1999]

• JPEG photograph database distributed by Media Graphic Inc, consisting of nearly 4,700 JPEG color images

Page 77: Image Retrieval Part II

Result

Retrieval results on JPEG database. Column 2: average relative precision (%); Column 3: average number of user feedbacks (iterations) required for convergence, averaged over 30 queries.

Method                  Avg. relative precision (%)   Avg. # user RF for convergence (Iter.)
Non-interactive CBIR    49.82                         -
Automatic interaction   79.18                         -
User controlled         95.66                         2.63
Semi-automatic CBIR     98.08                         1.33

Page 78: Image Retrieval Part II

Non-interactive CBIR Automatic-interactive CBIR

Semiautomatic CBIR (two user’s RF) User controlled CBIR (three user’s RF)

Page 79: Image Retrieval Part II

Non-interactive CBIR Automatic-interactive CBIR

Semiautomatic CBIR (one user’s RF) User controlled CBIR (two user’s RF)

Page 80: Image Retrieval Part II

Summary

• MCI-CBIR

• minimizes the role of users in CBIR

• Semi-automatic retrieval reaches optimal performance quickly