Image Retrieval Part II
2
Topics
• Applications of CBIR in digital library
• Human-controlled interactive CBIR
• Machine-controlled interactive CBIR
3
Query by Example
[Figure: a query sample is sent to the CBIR system (“Get similar images”), which returns the results]
• Pick query examples and ask the system to retrieve “similar” images.
4
QBIC(TM) – IBM's Query By Image Content
http://www.hermitagemuseum.org/fcgi-bin/db2www/qbicSearch.mac/qbic?selLang=English
6
NETRA @ UCSB
http://nayana.ece.ucsb.edu/M7TextureDemo/Demo/client/M7TextureDemo.html
7
Medical Decision Support
• Breast cancer is among the top killers of women in the developed world.
• Early detection of malignancy can greatly reduce the risk of death.
[Figure: mammogram]
8
Nationwide Support for Physicians
9
Summary of Fundamental CBIR
• CBIR using query by example
• CBIR algorithm:
  • First step: image indexing
  • Second step: content matching
  • Third step: ranking and displaying
• Relevance feedback (RF)
10
Step I: Image Indexing
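• Image content = { Color, Shape, Texture }
  • Color: color histogram, color moments
  • Shape: chain codes, Fourier descriptors
  • Texture: Gabor wavelet features, co-occurrence matrix
• Feature vector: $\mathbf{v}$ = { Color, Shape, Texture }
• Example: images with feature vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5$, each of the form $\mathbf{v} = [0.2, 0.50, \ldots, 0.1]$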
11
Example of feature vectors in relational database
12
Step II: Content Matching
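• The content similarity measure is obtained by a distance function $D(\mathbf{v}, \mathbf{v}_q)$,
  where $\mathbf{v}_q = [v_{q1}, v_{q2}, \ldots, v_{qN}]^T$ is the feature vector of the query image and
  $\mathbf{v} = [v_1, v_2, \ldots, v_N]^T$ is the feature vector of an image in the database
• Many distance functions have been used:
  • Euclidean distance: $D(\mathbf{v}, \mathbf{v}_q) = \left[ \sum_{i=1}^{N} (v_i - v_{qi})^2 \right]^{1/2}$
  • l1-norm: $D(\mathbf{v}, \mathbf{v}_q) = \sum_{i=1}^{N} |v_i - v_{qi}|$
  • Cosine measure: $S(\mathbf{v}, \mathbf{v}_q) = \dfrac{\mathbf{v}^T \mathbf{v}_q}{\|\mathbf{v}\| \, \|\mathbf{v}_q\|}$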
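The three measures named on this slide can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names and the sample vectors are assumptions, not part of the original slides.

```python
import math

def euclidean(v, vq):
    """Euclidean distance between a database vector v and a query vector vq."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, vq)))

def l1_norm(v, vq):
    """l1-norm (city-block) distance: sum of absolute component differences."""
    return sum(abs(a - b) for a, b in zip(v, vq))

def cosine_measure(v, vq):
    """Cosine similarity; larger means more similar (undefined for zero vectors)."""
    dot = sum(a * b for a, b in zip(v, vq))
    norms = math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(b * b for b in vq))
    return dot / norms

# Hypothetical 3-dimensional feature vectors, for illustration only.
v_q = [0.2, 0.5, 0.1]
v = [0.1, 0.5, 0.3]
print(euclidean(v, v_q), l1_norm(v, v_q), cosine_measure(v, v_q))
```

Note that the first two are distances (smaller is better) while the cosine measure is a similarity (larger is better), so they must be sorted in opposite directions when ranking.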
13
Step III: Similarity Ranking
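• Calculate $D_j = D(\mathbf{v}_j, \mathbf{v}_q)$ for each image $j$ in the database: $D(\mathbf{v}_1, \mathbf{v}_q), D(\mathbf{v}_2, \mathbf{v}_q), D(\mathbf{v}_3, \mathbf{v}_q), \ldots, D(\mathbf{v}_j, \mathbf{v}_q), \ldots, D(\mathbf{v}_J, \mathbf{v}_q)$
• Sort the $D_j$ in order of decreasing similarity, i.e. smallest distance first (assume J = 10): $D_6 \le D_2 \le D_9 \le D_{10} \le D_7 \le D_3 \le D_8 \le D_1 \le D_4 \le D_5$
  • Top 3 images: image6, image2, image9
  • Top 5 images: image6, image2, image9, image10, image7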
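The ranking step amounts to sorting the images by their distance to the query and keeping the first few. The sketch below assumes distances are stored in a dictionary; the numeric values are invented so that the resulting order matches the J = 10 example on the slide.

```python
def rank_images(distances, top_k):
    """Return the top_k image names, most similar (smallest distance) first.

    distances: dict mapping image name -> distance D_j to the query.
    """
    ordered = sorted(distances, key=distances.get)
    return ordered[:top_k]

# Hypothetical distances for J = 10 images (values are made up for illustration).
D = {
    "image1": 0.80, "image2": 0.15, "image3": 0.60, "image4": 0.90, "image5": 0.95,
    "image6": 0.10, "image7": 0.50, "image8": 0.70, "image9": 0.20, "image10": 0.40,
}
print(rank_images(D, 3))  # ['image6', 'image2', 'image9']
print(rank_images(D, 5))  # ['image6', 'image2', 'image9', 'image10', 'image7']
```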
14
Problems with CBIR
In essence, retrieval is a pattern recognition problem with special characteristics.
• Huge volume of (visual) data.
• High dimensionality in feature space.
• Query design: Gap between the high level concepts and low level features.
• Linear matching criteria: A mismatch to the popular human perception model.
15
Example: Compressed Domain
Visual database in the compressed domain:
• DCT: used by many of the current image/video coding standards: JPEG, MPEG-1/2, and H.261/H.263.
• Wavelets/VQ: related to the new image coding standard JPEG2000.
Significant gap between human visual perception and information presentation in DCT/wavelet.
[Figure: Image.jpg / video.mpg → JPEG/MPEG → DCT coeff. → feature vector → Database]
16
State-of-the-art
• Human controlled interactive CBIR (HCI-CBIR)
  • Integrating human perception into content-based retrieval.
• Machine controlled interactive CBIR (MCI-CBIR)
  • To reduce bandwidth requirement for browsing and searching over the Internet.
  • To minimize errors caused by excessive human involvement.
Integrating human perception into content-based retrieval
18
Scenario
• Machine provides initial retrieval results, through query-by-keyword, sketch, or example, etc.;
• Iteratively:
  • User provides judgment on the current results as to whether, and to what degree, they are relevant to her/his request;
  • The machine learns and tries again.
19
Relevance Feedback
[Figure: Initial sample → Query → 1st Result → Feedback → 2nd Result → Feedback]
• User gives feedback on the query results
• System recalculates the feature weights and modifies the query
20
Basic GUI for Relevance Feedback
Slider
Checkbox
21
ImageGroup
Palette Panel, Result View
22
3D MARS
[Figure: 3D MARS display with axes Structure, Color, and Texture; panels: Initial Display, Result]
23
Human-Controlled Interactive CBIR (HCI-CBIR)
• An attractive solution to numerous applications
• Main feature: an active role played by users to improve retrieval accuracy
• State-of-the-art
  • query design considerations
  • linear criteria in similarity ranking
24
Effective Retrieval through User Interaction
• Current systems:
• QBIC: interactive region segmentation (IBM).
• FourEyes: including the human in the image annotation and retrieval loop (MIT Media Lab).
• WebSEEk: dynamic feature vector recomputation based on the user’s feedback (Columbia University).
• PicHunter: a Bayesian framework for content based image retrieval (NEC Research Institute).
• PicToSeek: (UVA).
• MARS: a relevance feedback architecture in image retrieval (UIUC).
25
A “New” Proposal for HCI-CBIR
The framework: Relevance feedback
The key features:
Modeling: mapping a high level concept to low level features
Matching:
capturing user’s perceptual subjectivity to modify the query using non-linear measurement
overcoming the difficulties faced by the traditional linear matching criteria
26
The Relevance Feedback Framework
The goal: measure feature relevance to improve performance of image matching in retrieval
A supervised learning procedure based on regression
For a given query $\mathbf{z}$ and a set of retrieved items $\mathbf{x}_n \in \mathbb{R}^P$, $n = 1, 2, \ldots, N$, categorize the $\mathbf{x}_n$ into two classes:
  a relevant class (visually similar to $\mathbf{z}$): $\mathbf{x}_m$, $m = 1, 2, \ldots, M$, and
  an irrelevant class (not similar to $\mathbf{z}$): $\mathbf{x}_q$, $q = 1, 2, \ldots, Q$
Structure a new query based on the information in $\mathbf{x}_m$ and $\mathbf{x}_q$
Use the new query in the next round of retrieval
27
Query Modification Model 1
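The new query (the centroid of the relevant images): $\mathbf{z}_1 = \{z_i\}_{i=1}^{P} = \dfrac{1}{M} \sum_{m=1}^{M} \mathbf{x}_m$
[Figure legend: nonrelevant image, relevant image, original query, modified query]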
28
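Query Modification Models 2 & 3 (Anti-Reinforcement Learning)
Example (1-D): the query $z$ is moved away from a non-relevant item:
  $Z_a = z - \alpha_N (x_{N1} - z)$ and $Z_b = z - \alpha_N (x_{N2} - z)$, each applied according to the position of $x$ relative to $z$
The new query:
  Model 2: $\mathbf{z}_2 = \{z_i\}_{i=1}^{P} = \bar{\mathbf{x}}' - \alpha_N (\bar{\mathbf{x}}'' - \mathbf{z})$
  Model 3: $\mathbf{z}_3 = \mathbf{z} + \alpha_R (\bar{\mathbf{x}}' - \mathbf{z}) - \alpha_N (\bar{\mathbf{x}}'' - \mathbf{z})$
Where
  $\alpha_R, \alpha_N$: small positive constants
  $\bar{\mathbf{x}}' = \dfrac{1}{M} \sum_{m=1}^{M} \mathbf{x}_m$: center of relevant items
  $\bar{\mathbf{x}}'' = \dfrac{1}{Q} \sum_{q=1}^{Q} \mathbf{x}_q$: center of non-relevant items
  $\mathbf{z}$: query at previous iteration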
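The three query modification models can be sketched as follows. This is an illustration only: it assumes that the center of relevant items attracts the query and the center of non-relevant items repels it (the signs are not legible in the extracted slide), and the function names and the default values of `alpha_r` and `alpha_n` are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def query_model_1(relevant):
    """Model 1: the new query is the center of the relevant items."""
    return centroid(relevant)

def query_model_2(z, relevant, nonrelevant, alpha_n=0.1):
    """Model 2 (assumed form): relevant center, pushed away from the non-relevant center."""
    x_rel, x_non = centroid(relevant), centroid(nonrelevant)
    return [r - alpha_n * (n - zi) for r, n, zi in zip(x_rel, x_non, z)]

def query_model_3(z, relevant, nonrelevant, alpha_r=0.5, alpha_n=0.1):
    """Model 3 (assumed form): move the previous query toward relevant, away from non-relevant."""
    x_rel, x_non = centroid(relevant), centroid(nonrelevant)
    return [zi + alpha_r * (r - zi) - alpha_n * (n - zi)
            for zi, r, n in zip(z, x_rel, x_non)]

# Hypothetical 2-D feature vectors.
relevant = [[1.0, 2.0], [3.0, 4.0]]
nonrelevant = [[0.0, 0.0], [2.0, 2.0]]
z = [0.0, 0.0]
print(query_model_1(relevant))
print(query_model_2(z, relevant, nonrelevant, alpha_n=0.5))
print(query_model_3(z, relevant, nonrelevant, alpha_r=0.5, alpha_n=0.5))
```

Model 1 ignores the previous query and the negative samples entirely, whereas Models 2 and 3 use the negative samples as a small correction term, which is why the constants are kept small.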
29
Nonlinear Search Unit
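Non-linear (Gaussian) Search Unit (NSU):
$f(\mathbf{x}, \mathbf{z}) = \sum_{i=1}^{P} G_i(x_i, z_i) = \sum_{i=1}^{P} \exp\left( -\dfrac{(x_i - z_i)^2}{2\sigma_i^2} \right)$
- image feature vector: $\mathbf{x} = [x_1, \ldots, x_i, \ldots, x_P]^T$
- adjustable query vector: $\mathbf{z} = [z_1, \ldots, z_i, \ldots, z_P]^T$
- the tuning parameters (NSU widths): $\sigma_i = \max_m |x_{mi} - z_i|$, $i = 1, \ldots, P$, where $x_{mi}$ is the $i$-th component of the relevant item $\mathbf{x}_m$
Small $\sigma_i$: a relevant feature (sensitive to change)
Large $\sigma_i$: a nonrelevant feature (insensitive to change)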
30
Linear Search Unit (LSU)
• To benchmark the performance of the NSU
• To initiate the search
• The parameters: exactly the same as in the NSU
$S_{linear}(\mathbf{x}, \mathbf{z}) = d^2(\mathbf{x}, \mathbf{z}) = \sum_{i=1}^{P} \dfrac{(x_i - z_i)^2}{\sigma_i^2}$
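The two search units can be compared side by side in a short sketch. It assumes the NSU is a sum of one-dimensional Gaussians with widths taken from the spread of the relevant items around the query, and that the LSU is the correspondingly weighted squared Euclidean distance. The function names and the small floor `eps` (added here only to avoid division by zero) are not from the slides.

```python
import math

def nsu_widths(relevant, z, eps=1e-6):
    """sigma_i = max over relevant items of |x_mi - z_i| (floored at eps, an added safeguard)."""
    return [max(max(abs(x[i] - z[i]) for x in relevant), eps) for i in range(len(z))]

def nsu_similarity(x, z, sigma):
    """Non-linear search unit: sum of per-feature Gaussians; larger means more similar."""
    return sum(math.exp(-((xi - zi) ** 2) / (2 * s ** 2))
               for xi, zi, s in zip(x, z, sigma))

def lsu_distance(x, z, sigma):
    """Linear search unit: sigma-weighted squared distance; smaller means more similar."""
    return sum(((xi - zi) ** 2) / (s ** 2) for xi, zi, s in zip(x, z, sigma))

# Hypothetical relevant items and query.
relevant = [[1.0, 0.0], [3.0, 0.5]]
z = [2.0, 0.0]
sigma = nsu_widths(relevant, z)
print(sigma)
print(nsu_similarity([3.0, 0.5], z, sigma), lsu_distance([3.0, 0.5], z, sigma))
```

Because each Gaussian saturates at 1, a single badly matching feature cannot dominate the NSU score, whereas in the LSU one large difference can outweigh all other features.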
31
Architecture for Interactive CBIR
[Figure: architecture for interactive CBIR. Blocks: image database, storing of feature vectors, RAM (feature database), query image, Feature Extraction & Similarity Measure, Initial Searching (when iteration n = 0), Interactive Searching (when iteration n > 0), Perceptual Similarity Measure, Weight-parameter updating, User Interaction, User, (tentative) Query. Output: k retrieved images.]
32
VQ Codewords as “content descriptors”
[Figure: image blocks are matched against a codebook with codewords i = 1, 2, …, n-3, n-2, n-1, n; each block is encoded by its code label i]
The usage of codewords reflects the content of the input image encoded.
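One way to turn codeword usage into a content descriptor is a normalized histogram of code labels. The sketch below assumes each image block is encoded by its nearest codeword under squared Euclidean distance; the function names and the toy codebook are hypothetical.

```python
def nearest_codeword(block, codebook):
    """Label (index) of the codeword closest to the block in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((b - c) ** 2 for b, c in zip(block, codebook[i])),
    )

def label_histogram(blocks, codebook):
    """Fraction of blocks assigned to each codeword; serves as the image descriptor."""
    counts = [0] * len(codebook)
    for block in blocks:
        counts[nearest_codeword(block, codebook)] += 1
    return [c / len(blocks) for c in counts]

# Toy example: 2-sample "blocks" and a 2-word codebook (values are made up).
codebook = [[0, 0], [10, 10]]
blocks = [[1, 1], [0, 0], [2, 1], [9, 9]]
print(label_histogram(blocks, codebook))  # [0.75, 0.25]
```

Since the labels are already produced by the encoder, such a histogram can be built from the compressed stream without decoding the image.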
33
Two-level WT/VQ coding (1 bpp)
[Figure: Mallat’s two-level decomposition coded with a multiresolution codebook: CB1 (H, L2), CB2 (H, L1), CB3 (V, L2), CB4 (V, L1), CB5 (D, L2); subband bit rates of 0, 0.5, 2, and 8 bpp; label histograms H1 to H5]
34
Test Database 1: Brodatz database
• A texture image database provided by Manjunath, at http://vivaldi.ece.ucsb.edu/users/wei/codes.html
• 1,856 patterns in 116 different classes
• 16 similar patterns in each class
• Maintained as a single unclassified image database
35
Queries (the Brodatz Database)
[116 different image classes]
36
Performance Comparison
• Methods Compared
• LSU2: linear search unit & query model 2
• NSU2: non-linear search unit & query model 2
• Interactive CBIR in MARS: Multimedia Analysis and Retrieval System (developed at UIUC)
37
Retrieval Results (the Brodatz Database)
Table 1. Average retrieval rate (%)
Method   t=0    t=1    t=2    t=3
LSU2     73.7   83.0   85.1   85.9
NSU2     73.7   84.9   88.2   89.2
MARS     67.0   75.1   76.4   76.7
Note: The retrieval rate is defined as the average percentage of images belonging to the same class as the query in the top 16 matched.
iARM: Interactive-based Analysis and Retrieval of Multimedia on the Internet
@ iarm.ee.ryerson.ca:8000/corel
39
Strategy
• iARM implements interactive retrieval for a large image database, running on the J2EE Web Server.
• Interaction architecture:
  • Based on non-linear relevance feedback, a multi-model SRBF network.
  • Positive and negative feedback.
  • Properties: local and non-linear learning, fast and robust on small input data.
40
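A Single-pass Radial Basis Function (SRBF) Network
• $\mathbf{x}_m^{(r)}$, $m = 1, 2, \ldots, M$: the positive examples;
• The SRBF network characterizes the query by multiple clusters, each of which is modeled by a P-dimensional Gaussian distribution:
  $G_m(\mathbf{x}, \mathbf{x}_m^{(r)}, \sigma_m) = \exp\left( -\dfrac{\sum_{p=1}^{P} \alpha_p (x_p - x_{mp}^{(r)})^2}{2\sigma_m^2} \right)$  (1)
  where $\sigma_m = \min_i \|\mathbf{x}_m^{(r)} - \mathbf{x}_i^{(r)}\|$, $i = 1, 2, \ldots, M$, $i \neq m$  (2)
  and the feature weights $\alpha_p$ are given in (3)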
41
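SRBF Network Cont.
• The weighted Euclidean space:
  $\alpha_p = 1$ if $\xi_p = 0$, and $\alpha_p = 1/\xi_p$ if $\xi_p \neq 0$  (3)
  where $\xi_p = \left[ \dfrac{1}{M-1} \sum_{m=1}^{M} (x_{mp}^{(r)} - \bar{x}_p^{(r)})^2 \right]^{1/2}$  (4)
• A summation of M Gaussian units (1) yields the similarity function for the input vector $\mathbf{x}$ as follows:
  $S(\mathbf{x}) = \sum_{m=1}^{M} G_m(\mathbf{x}, \mathbf{x}_m^{(r)}, \sigma_m)$  (5)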
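A rough sketch of such a network is given below, assuming one Gaussian unit per positive example, a width equal to the distance to the nearest other positive example, and feature weights equal to the inverse sample standard deviation of each feature over the positive examples. These forms are a reading of the garbled equations (1) to (5), not a confirmed transcription, and all function names are hypothetical. At least two distinct positive examples are required.

```python
import math

def feature_weights(positives):
    """alpha_p = 1 if the feature has zero spread, otherwise 1 / (sample std of feature p)."""
    m = len(positives)
    weights = []
    for column in zip(*positives):
        mean = sum(column) / m
        xi = math.sqrt(sum((v - mean) ** 2 for v in column) / (m - 1))
        weights.append(1.0 if xi == 0 else 1.0 / xi)
    return weights

def unit_widths(positives):
    """sigma_m = distance from center m to its nearest other center (needs distinct centers)."""
    return [
        min(math.dist(xm, xi) for i, xi in enumerate(positives) if i != m)
        for m, xm in enumerate(positives)
    ]

def srbf_similarity(x, positives, alpha, sigma):
    """S(x): sum of the M Gaussian units evaluated at x."""
    total = 0.0
    for xm, s in zip(positives, sigma):
        d2 = sum(a * (xp - xmp) ** 2 for a, xp, xmp in zip(alpha, x, xm))
        total += math.exp(-d2 / (2 * s ** 2))
    return total

# Hypothetical positive examples in a 2-D feature space.
positives = [[0.0, 0.0], [2.0, 0.0]]
alpha = feature_weights(positives)
sigma = unit_widths(positives)
print(srbf_similarity([0.0, 0.0], positives, alpha, sigma))
print(srbf_similarity([10.0, 10.0], positives, alpha, sigma))
```

Because every positive example keeps its own unit, a query whose relevant images fall into several separate clusters is still matched locally around each cluster rather than around a single averaged center.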
[Figure: single-class approach vs. multi-class approach; legend: nonrelevant image, relevant image, original query, modified query]
43
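Negative Feedback Strategy
• Tuning the decision boundary with negative samples: $\mathbf{x}_n^{(ir)}$, $n = 1, 2, \ldots, N$
• Antireinforced learning algorithm:
  • If $D_i < D_k$, $\forall k \neq i$, where $D_i = \sum_{p=1}^{P} (x_{np}^{(ir)} - x_{ip}^{(r)})^2$  (6)
  • Then $\mathbf{x}_i^{(r)}(t+1) = \mathbf{x}_i^{(r)}(t) - \eta(t) \left[ \mathbf{x}_n^{(ir)}(t) - \mathbf{x}_i^{(r)}(t) \right]$  (7)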
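The antireinforced step can be sketched as: find the center closest to a negative sample, then move that center a small step away from the sample. The sketch assumes a constant step size `eta` and a minus sign in the update (so that the center is repelled); both are assumptions, as is the function name.

```python
def antireinforce(centers, negative, eta=0.1):
    """Return a copy of `centers` with the center nearest to `negative` pushed away from it."""
    def squared_distance(center):
        return sum((n - c) ** 2 for n, c in zip(negative, center))

    # Winner: the center i with D_i < D_k for all other k.
    winner = min(range(len(centers)), key=lambda k: squared_distance(centers[k]))
    updated = [list(c) for c in centers]
    updated[winner] = [c - eta * (n - c) for c, n in zip(centers[winner], negative)]
    return updated

# Hypothetical centers (positive examples) and one negative sample.
centers = [[0.0, 0.0], [10.0, 10.0]]
negative = [1.0, 0.0]
print(antireinforce(centers, negative, eta=0.5))  # [[-0.5, 0.0], [10.0, 10.0]]
```

Only the winning center moves, so negative samples reshape the decision boundary locally and leave clusters far from the negative sample untouched.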
44
Performance of iARM
• Using the Corel Image Collection (www.corel.com), containing 40,000 real-life images.
• A total of 400 queries were generated, and relevance judgments were based on the “ground truth” from Corel.
• Multiple descriptors:
  <shape, color, texture> = <Fourier descriptors, HSV color histogram & color moments, Gabor wavelet transform>
45
Result
Table 1: Average Precision Rate (%) obtained by retrieving 400 queries, measured from the top 16 retrievals.
Non-interactive CBIR    iARM
                        r(1)      r(2)      r(3)      r(8)
53%                     80.08%    85.99%    87.58%    89.00%
Test 1: Fast and robust with a small number of relevance feedbacks
Example: Looking for “model”.
0. Start by choosing the image at the bottom right corner as the query.
1. Result after the initial search; five relevant images are then given as feedback.
2. Result after one relevance feedback: all the top sixteen are relevant.
Test 2: Non-linearity
1. Initial result for a kung fu performance query; here shape plays a very important role, but only seven images are relevant.
2. After one feedback: non-relevant images that have similar shape were removed, and all returned images show kung fu.
After two feedbacks:
Case 1 (left): only the full-body actions were selected
Case 2 (right): only the half-body actions were selected
Test 3: Multi-model capturing
Multi-modeling can capture very precisely the local context defined by the current query session
[Figure panels: Initial results, Final results]
56
Summary
Incorporating human perception in retrieval:
• significant improvement over the simple CBIR
• impressive improvement over other relevance feedback systems
• satisfactory support for user queries
The query models: effective in compressed domain
Machine Controlled Interactive Content-based Retrieval (MCI-CBR)
58
Problem with HCI-CBR
• User interaction requires
  • User to specify “relevance” or “nonrelevance”
  • Inconsistent human performance
  • Repeating many feedbacks for convergence
  • Transmission of sample files, i.e., high bandwidth
• User-friendly environment (ideal preference)
  • Fewer training samples, i.e., < 20 images/iteration
  • Fewer feedbacks, i.e., 1-2 iterations
59
Search distributed DVL’s on Internet
Wide Coverage.
Full Features Search.
Third party Workshop.
Third party Service.
Progressive Search.
Knowhow Control.
[Figure: a Search Agent Broker (SOLO) connected to N SAPM Hosts, each with an Archivist Engine, a content database (DB-1 Content, DB-2 Content, …, DB-N Content) and a feature database (DB-1 Feature, DB-2 Feature, …, DB-N Feature)]
60
Machine Controlled Interactive CBIR (MCI-CBIR)
A key research area in multimedia processing
Aim: To incorporate self-learning capability into CBIR, which allows:
• automatic & semi-automatic retrieval
• minimization of user participation to reduce errors caused by inconsistent human performance
• reduction of bandwidth requirement in Internet browsing and searching
61
HCI-CBIR vs MCI-CBIR
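[Figure: two block diagrams. Human-controlled interactive system: Query → Search Unit (Image Database) → Retrieval results → User interaction → Relevance Feedback → Search Unit. Machine-controlled interactive system: Query → Search Unit (Image Database) → Retrieval results → Relevance Feedback → Search Unit, without user interaction.]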
62
The Essence of MCI-CBR
• Based on two feature spaces: R1 and R2
• Space R1 is of reasonable quality & easy to calculate in retrieval, such as:
  • DCT
  • DWT
• Space R2 is of very high quality, but potentially computationally intensive in relevance identification
  • Descriptors extracted from uncompressed images
  • Object- and region-based descriptors
63
Architecture for Automatic Interactive CBR
[Figure: architecture for automatic interactive CBR. Blocks: Query, Feature Extraction, Search Unit, WT/VQ Coded Image Database (compressed domain processing), Retrieval results, Relevance Classification, Relevance Feedback Module (RBFN), Display images, User interaction (different perceptual subjectivity; semi-automatic mode)]
64
SOTM
• Relevance classification is performed by a Self-Organizing Tree Map (SOTM), which offers:
  • Independent learning based on a competitive learning technique
  • A unique feature map that preserves topological ordering
• SOTM is more suitable than the conventional SOM when the input feature space is of high dimensionality
65
SOTM algorithm
Step one
Initialize the root node with a point selected at random from the input space.
The root node is represented by the blue cross, and the data space is represented by the green spots.
66
SOTM algorithm Cont..
Step two
Randomly select a new data point $\mathbf{x}$, and compute the Euclidean distance, $d_j$, to node $\mathbf{w}_j$ (j = 1, ..., J), where J is the total number of nodes, here J = 1.
Step three
Select the winning node, j*, with minimum $d_j$.
67
SOTM algorithm Cont..
Step four
If $d_{j^*}(\mathbf{x}, \mathbf{w}_{j^*}) \le$ Threshold, where Threshold decreases with time, then assign $\mathbf{x}$ to the j*th cluster and update the weight vector according to:
$\mathbf{w}_{j^*}(t+1) = \mathbf{w}_{j^*}(t) + \alpha(t) [\mathbf{x}(t) - \mathbf{w}_{j^*}(t)]$,
where $\alpha(t)$ is the learning rate, $0 < \alpha(t) < 1$. The position of the updated node is indicated by the red arrow.
68
SOTM algorithm Cont..
Step four, cont.
Else form a new subnode starting with $\mathbf{x}$. The map now has two nodes, as indicated by the two blue arrows.
Step five
Continue from step 2.
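Steps one to five can be sketched as a short loop. This is a simplified illustration, not the full SOTM: it assumes a constant learning rate `alpha`, a threshold that decays geometrically down to a floor `min_threshold`, and a fixed number of iterations as the stopping rule. All parameter names and default values are hypothetical.

```python
import math
import random

def sotm(data, threshold0, min_threshold, decay=0.9, alpha=0.1, iterations=200, seed=0):
    """Grow a tree of nodes over `data`; returns (nodes, parents)."""
    rng = random.Random(seed)
    nodes = [list(rng.choice(data))]   # step one: root node at a random data point
    parents = [None]                   # parent index of each node (root has none)
    threshold = threshold0
    for _ in range(iterations):
        x = rng.choice(data)                                      # step two
        dists = [math.dist(x, w) for w in nodes]
        winner = min(range(len(nodes)), key=dists.__getitem__)    # step three
        if dists[winner] <= threshold:                            # step four: update winner
            nodes[winner] = [w + alpha * (xi - w) for w, xi in zip(nodes[winner], x)]
        else:                                                     # otherwise spawn a subnode
            nodes.append(list(x))
            parents.append(winner)
        threshold = max(threshold * decay, min_threshold)         # threshold decreases with time
    return nodes, parents

# Two well-separated hypothetical clusters.
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
nodes, parents = sotm(data, threshold0=2.0, min_threshold=1.0)
print(len(nodes), parents)
```

Since a new node is only ever created at an actual data point and an existing node only moves toward data points, no node can end up in a region without data.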
69
SOTM vs SOM
SOTM No nodes converge to areas of zero data density
SOM Nodes converge to areas of zero data density
70
Retrieval Procedure (1)
a. Initial search: for a given query z, retrieve the set of K most similar images, $F = \{O_1, O_2, \ldots, O_K\}$, using the ‘nearest neighbor rule’ & the feature space R1 for retrieval
b. Characterization: use features in feature space R2 to describe the retrieved images:
$O_k \rightarrow \mathbf{x}_k$
and obtain the training set:
$F(R_2) = \{\mathbf{x}_1, \ldots, \mathbf{x}_k, \ldots, \mathbf{x}_K\}$, $\mathbf{x}_k \in R_2$
71
Retrieval Procedure (2)
c. Relevance classification: use SOTM to classify the training vectors in F(R2), then use the results to label the retrieved images:
{Ok,yk}, k=1,…,K, where yk=1 if Ok is a relevant image, otherwise yk=0
d. Relevance Feedback Module: implement interactive learning methods (e.g., RF, non-linear RF, or RBFN) using the training set {Ok,yk}, k=1,…,K, and ‘feature space R1’ for retrieval
e. Go back to Step (b)
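The control flow of steps (a) to (e) can be sketched as below. To keep the example short, the SOTM classifier is replaced by a plain distance threshold in R2 and the relevance feedback module by the centroid of the items labeled relevant; both substitutions are assumptions made for illustration. The names `z_r2` (the query described in R2) and `radius` are hypothetical.

```python
import math

def knn(query, features, k):
    """Indices of the k feature vectors nearest to `query` (Euclidean distance)."""
    order = sorted(range(len(features)), key=lambda i: math.dist(query, features[i]))
    return order[:k]

def automatic_retrieval(z, feats_r1, feats_r2, z_r2, k, radius, iterations=2):
    """Machine-controlled feedback loop: search in R1, label relevance in R2, update query."""
    for _ in range(iterations):
        retrieved = knn(z, feats_r1, k)                       # (a) search in R1
        x = [feats_r2[i] for i in retrieved]                  # (b) describe results in R2
        # (c) stand-in for SOTM: relevant if close to the query in R2
        y = [1 if math.dist(xk, z_r2) <= radius else 0 for xk in x]
        relevant = [feats_r1[i] for i, yk in zip(retrieved, y) if yk]
        if relevant:                                          # (d) new query = relevant centroid
            z = [sum(column) / len(relevant) for column in zip(*relevant)]
    return knn(z, feats_r1, k)                                # final retrieval in R1

# Hypothetical data: items 0-3 belong to the query's class, items 4-5 do not.
feats_r1 = [[0.0], [0.3], [0.4], [0.9], [-0.2], [-0.5]]   # cheap, noisy space R1
feats_r2 = [[0.0], [0.1], [0.0], [0.1], [5.0], [5.0]]     # costly, reliable space R2
print(knn([0.0], feats_r1, 3))                            # [0, 4, 1]
print(sorted(automatic_retrieval([0.0], feats_r1, feats_r2, [0.0], k=3, radius=1.0)))
```

The expensive R2 features are computed only for the K retrieved images in each round, while all database-wide searching stays in the cheap R1 space.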
72
Experiment Setup
• Brodatz texture database (1,856 images), using 116 queries
• ARR: average retrieval rate based on ground truth classes, i.e. the average over all queries $q$ of
  $RR(q) = \dfrac{N_F(q)}{N_G(q)} \in [0, 1]$
  where $N_G$ denotes the size of the ground truth set and $N_F$ denotes the number of ground truth images found within the top 16
• Feature representations
  • R1: MHI features on compressed WT/VQ images
  • R2: Gabor wavelet features
Automatic vs User Controlled Retrieval
Table 1: A comparison of ARR (%) between MCI-CBR method and HCI-CBR method, obtained during retrieving 116 queries, using Brodatz database.
Methods        Average Retrieval Rate (ARR), %    User’s RF (Iter.)
               0 Iter.   1 Iter.   4 Iter.
(a) MCI-CBR    63.42     71.66     76.51          -
(b) HCI-CBR    63.42     77.64     80.17          4
∆=(b)-(a)      -         +5.98     +3.66
74
Retrieval Example
[Figure panels: Non-Interactive Retrieval; Automatic Interactive Retrieval]
75
Semi-Automatic VS User-controlled Retrievals
A comparison of ARR at convergence between the semi-automatic and HCI-CBR methods
76
Retrieval using DCT-compressed images
• Compressed domain descriptor is based on energy histogram of the low frequency DCT coefficients [Lay, 1999]
• JPEG photograph database distributed by Media Graphic Inc, consisting of nearly 4,700 JPEG color images
77
Result
Method                  Avg. Relative Precision (%)   Avg. # user RF for convergence (Iter.)
Non-interactive CBIR    49.82                         -
Automatic interaction   79.18                         -
User controlled         95.66                         2.63
Semi-automatic CBIR     98.08                         1.33
Retrieval results on JPEG database, Column 2: average relative precision (%); Column 3: average number of user feedbacks (iteration) required for convergence, averaged over 30 queries.
[Figure panels: Non-interactive CBIR; Automatic-interactive CBIR; Semiautomatic CBIR (two user’s RF); User controlled CBIR (three user’s RF)]
[Figure panels: Non-interactive CBIR; Automatic-interactive CBIR; Semiautomatic CBIR (one user’s RF); User controlled CBIR (two user’s RF)]
80
Summary
• MCI-CBIR
  • minimizes the role of users in CBIR
  • Semi-automatic retrieval reaches optimal performance quickly