
Post on 22-Dec-2015


Content-Based Image Retrieval using the EMD algorithm

Igal Ioffe

George Leifman

Supervisor: Doron Shaked

Winter-Spring 2000

Technion - Israel Institute of Technology Department of Electrical Engineering

The Vision Research and Image Science Laboratory

Project Goal

[Figure: a source image is queried against a color image DB, which returns similar images]

• Estimate similarity between pairs of images

• Order the images according to their similarity to the source (query) image

System Overview

[Block diagram with the labels: Query Image, Query Process, DB Image Features, Distance, Similar Images]

Overview: Images & Histograms

Overview: Distance

• Minkowski-form distance (L2)

• EMD – Earth Mover's Distance

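Minkowski-form distance between histograms $H$ and $K$:

$$d_p(H, K) = \left( \sum_i |h_i - k_i|^p \right)^{1/p} \qquad (p = 2)$$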
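The Minkowski-form distance d_p(H, K) = (sum_i |h_i - k_i|^p)^(1/p) can be sketched in a few lines of C++. This is a minimal illustration, not the project's code: the function name `minkowskiDistance` and the use of `std::vector<double>` for histograms are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minkowski-form distance between two histograms with the same number of bins.
// The default p = 2 gives the Euclidean (L2) distance named on the slide.
// (Hypothetical helper: name and signature are not from the original project.)
double minkowskiDistance(const std::vector<double>& h,
                         const std::vector<double>& k,
                         double p = 2.0)
{
    assert(h.size() == k.size());  // bins are compared pairwise
    double sum = 0.0;
    for (std::size_t i = 0; i < h.size(); ++i)
        sum += std::pow(std::fabs(h[i] - k[i]), p);
    return std::pow(sum, 1.0 / p);
}
```

Note that this distance compares bin i only with bin i, so two histograms whose mass sits in neighboring bins are treated as just as far apart as histograms whose mass sits in distant bins.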

Overview: Quantization

• Summarizing the image content

• Reducing the high computational complexity

Original Image (20154 colors)

Quantized Image (15 colors)

Quantized Image (5 colors)
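To make the idea of summarizing an image by a small color histogram concrete, here is a minimal sketch. It uses plain uniform quantization (a fixed number of levels per channel), which is a simpler stand-in and not the Median Cut or Maximum Diversity algorithms studied in the project. The `Rgb` type and the function name `quantizedHistogram` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };  // hypothetical pixel type

// Uniform quantization (an assumption for this sketch): keep `levels` values
// per channel, giving levels^3 bins, and return the normalized histogram.
std::vector<double> quantizedHistogram(const std::vector<Rgb>& pixels, int levels)
{
    assert(levels >= 1 && levels <= 256);
    const std::size_t n = static_cast<std::size_t>(levels);
    std::vector<double> hist(n * n * n, 0.0);
    if (pixels.empty()) return hist;

    for (const Rgb& px : pixels) {
        // Map each 0..255 channel value to a bin index 0..levels-1.
        const std::size_t r = static_cast<std::size_t>(px.r) * n / 256;
        const std::size_t g = static_cast<std::size_t>(px.g) * n / 256;
        const std::size_t b = static_cast<std::size_t>(px.b) * n / 256;
        hist[(r * n + g) * n + b] += 1.0;
    }
    for (double& bin : hist) bin /= static_cast<double>(pixels.size());
    return hist;
}
```

With `levels = 2` every image is reduced to at most 8 colors, so comparing two images costs a handful of bin operations instead of work proportional to the thousands of distinct colors in the original.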

Research Issues

• Color Quantization algorithms

• Quad Tree clustering

• Different color spaces

• EMD - Earth Mover's Distance algorithm

Median Cut vs. Maximum Diversity

• Maximum Diversity performs better than Median Cut for a small number of colors (< 10)

[Figure: Median Cut and Maximum Diversity quantization results, shown for 2 colors and 3 colors]

Problems with Histogram

Quad Tree Clustering

• Recursive cluster definition

• Dynamic stop constraints
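The recursive definition can be illustrated with a small sketch. The slides do not spell out the actual stop constraints, so the ones used here are assumptions: a block stops splitting when all its pixels share one quantized color index or when it reaches a minimum size. The image is assumed to be a square, row-major grid whose side is a power of two; `Region` and `quadTreeSplit` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Region { int x, y, size, label; };  // hypothetical leaf-cluster record

// True if every pixel of the size x size block at (x, y) has the same label.
static bool isUniform(const std::vector<int>& img, int n, int x, int y, int size)
{
    const int first = img[static_cast<std::size_t>(y) * n + x];
    for (int r = y; r < y + size; ++r)
        for (int c = x; c < x + size; ++c)
            if (img[static_cast<std::size_t>(r) * n + c] != first) return false;
    return true;
}

// Recursively split a block into four quadrants. Assumed stop constraints:
// the block is uniform, or it is no larger than minSize. A non-uniform leaf
// is labeled with its top-left pixel.
void quadTreeSplit(const std::vector<int>& img, int n, int x, int y,
                   int size, int minSize, std::vector<Region>& out)
{
    assert(size >= 1 && minSize >= 1);
    if (size <= minSize || isUniform(img, n, x, y, size)) {
        out.push_back(Region{x, y, size, img[static_cast<std::size_t>(y) * n + x]});
        return;
    }
    const int half = size / 2;
    quadTreeSplit(img, n, x,        y,        half, minSize, out);
    quadTreeSplit(img, n, x + half, y,        half, minSize, out);
    quadTreeSplit(img, n, x,        y + half, half, minSize, out);
    quadTreeSplit(img, n, x + half, y + half, half, minSize, out);
}
```

Calling `quadTreeSplit(img, n, 0, 0, n, 1, leaves)` on an n x n label image fills `leaves` with blocks that, unlike a plain histogram, record where each color occurs.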

Quad Tree Clustering Examples

Color Spaces

• RGB color space

– linear combination of red, green, blue

– used to represent image pixels

• CIE LAB color space

– closer to the human visual system
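For reference, a conversion from RGB to CIE LAB can be sketched as follows. The slides do not say which RGB primaries or white point the project used, so this example assumes 8-bit sRGB input and the D65 reference white; the names `Lab` and `rgbToLab` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Lab { double L, a, b; };  // hypothetical result type

// Undo the sRGB transfer curve; c is a channel value in [0, 1].
static double srgbToLinear(double c)
{
    return c <= 0.04045 ? c / 12.92 : std::pow((c + 0.055) / 1.055, 2.4);
}

// Nonlinearity of the CIE L*a*b* definition.
static double labF(double t)
{
    const double delta = 6.0 / 29.0;
    return t > delta * delta * delta ? std::cbrt(t)
                                     : t / (3.0 * delta * delta) + 4.0 / 29.0;
}

// Assumes 8-bit sRGB input and the D65 white point (not stated in the slides).
Lab rgbToLab(int r8, int g8, int b8)
{
    assert(r8 >= 0 && r8 <= 255 && g8 >= 0 && g8 <= 255 && b8 >= 0 && b8 <= 255);
    const double r = srgbToLinear(r8 / 255.0);
    const double g = srgbToLinear(g8 / 255.0);
    const double b = srgbToLinear(b8 / 255.0);

    // Linear sRGB to CIE XYZ.
    const double X = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b;
    const double Y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b;
    const double Z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b;

    // Normalize by the D65 reference white, then apply the L*a*b* formulas.
    const double fx = labF(X / 0.95047);
    const double fy = labF(Y / 1.00000);
    const double fz = labF(Z / 1.08883);
    return Lab{116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)};
}
```

Euclidean distances between LAB triples track perceived color differences more closely than distances between RGB triples, which is why LAB is a natural choice of ground distance between colors.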

EMD

• Bipartite network flow problem

• Can be formalized as the well-known transportation problem from the field of linear programming

• Minimize

$$\mathrm{EMD}(P, Q) = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij} f_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}}$$

$d_{ij}$ - cost

• Efficient and fast Simplex-based solutions
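The general EMD needs a transportation-problem solver, which is too long to show here. As a hedged illustration of what the quantity measures, the sketch below handles only a special case where the optimum has a known closed form: two 1-D histograms with equal total mass and ground distance |i - j| between bins. In that case the minimal work is the sum of the absolute differences of the running totals. The function name `emd1D` is hypothetical, and this is not the solver used in the project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// EMD for the special case of 1-D histograms with equal total mass and
// ground distance |i - j| (in bin units). Not a general transportation solver.
double emd1D(const std::vector<double>& p, const std::vector<double>& q)
{
    assert(p.size() == q.size());
    double massP = 0.0, massQ = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        massP += p[i];
        massQ += q[i];
    }
    assert(massP > 0.0 && std::fabs(massP - massQ) < 1e-9 * massP);

    double carried = 0.0;  // surplus that must still move past bin i
    double work = 0.0;     // accumulated sum of d_ij * f_ij
    for (std::size_t i = 0; i < p.size(); ++i) {
        carried += p[i] - q[i];
        work += std::fabs(carried);
    }
    return work / massP;   // total flow equals the total mass
}
```

For example, moving one unit of mass from bin 0 to bin 2 gives a distance of 2, while moving it to bin 1 gives 1, so nearby bins count as more similar than distant ones.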

Principal Block Scheme

[Block diagram: (a) Image DB, (b) DB Creator, (c) DB info file, (d) DB Navigator]

• (a) Color image database

• (b) Preprocess each image

• (c) Store the properties of each image in a file

• (d) Start database navigation
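The navigation step (d) amounts to ranking the stored records by their distance to the query. A minimal sketch under stated assumptions: `ImageRecord` is a hypothetical stand-in for one entry of the DB info file, and a plain L2 histogram distance is used as a placeholder where the project would use EMD.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct ImageRecord {              // hypothetical entry of the DB info file
    std::string name;
    std::vector<double> histogram;
};

// Placeholder distance (L2); the project compares images with EMD instead.
static double histogramDistance(const std::vector<double>& h,
                                const std::vector<double>& k)
{
    assert(h.size() == k.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < h.size(); ++i)
        sum += (h[i] - k[i]) * (h[i] - k[i]);
    return std::sqrt(sum);
}

typedef std::pair<double, std::string> Scored;  // (distance, image name)

static bool closerFirst(const Scored& a, const Scored& b)
{
    return a.first < b.first;
}

// Return all database images ordered from most to least similar to the query.
std::vector<Scored> rankBySimilarity(const ImageRecord& query,
                                     const std::vector<ImageRecord>& db)
{
    std::vector<Scored> ranked;
    ranked.reserve(db.size());
    for (std::size_t i = 0; i < db.size(); ++i)
        ranked.push_back(Scored(histogramDistance(query.histogram, db[i].histogram),
                                db[i].name));
    std::stable_sort(ranked.begin(), ranked.end(), closerFirst);
    return ranked;
}
```

Because the features are computed once in step (b) and stored in step (c), a query only has to evaluate one distance per database image and sort the results.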

DB Creator demo

DB Navigator demo

Results

[Chart: number of relevant images retrieved (out of 30) versus number of retrieved images (0 to 120), for two series: Clusters, 20 colors; Histogram, 35 colors]
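A curve of this kind can be produced by counting, for each cutoff, how many of the retrieved images belong to the set judged relevant. The sketch below shows only that counting step; the function name `relevantRetrieved` and the use of image names as identifiers are assumptions, and no data from the project is reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// For a ranked result list, counts[k-1] is the number of relevant images
// among the first k retrieved images.
std::vector<int> relevantRetrieved(const std::vector<std::string>& ranked,
                                   const std::set<std::string>& relevant)
{
    std::vector<int> counts;
    counts.reserve(ranked.size());
    int hits = 0;
    for (std::size_t i = 0; i < ranked.size(); ++i) {
        if (relevant.count(ranked[i]) > 0) ++hits;
        counts.push_back(hits);
    }
    assert(counts.size() == ranked.size());
    return counts;
}
```

A method whose curve rises faster places more of the relevant images near the top of the ranking.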

Why Visual C++?

• User-friendly graphical interface

• Faster than Matlab

• C++ Object Oriented Design Patterns

• Use of MFC: an effective and convenient way to manipulate large database structures and to reorder and query information (files, strings, arrays, etc.)

Code Optimizations

• Effective Cache Usage

• Decreasing data dependencies in out-of-order execution

• Loop Unrolling

• Using Multi-Threading to achieve performance gain on Multi-Processor systems

Code Optimizations - Examples

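// Effective cache usage (before): 'data' sits between 'key' and 'next'
struct Rec {
    Key key;
    Data data;
    Rec *next;
};

// Effective cache usage (after): 'key' and 'next' are adjacent, so a list
// walk that only compares keys is more likely to stay within one cache line
struct Rec {
    Key key;
    Rec *next;
    Data data;
};

// Data dependencies (before): every addition waits for the previous one
for (i = 0; i < N; i++)
{
    acc += a[i];
}

// Data dependencies (after): two independent accumulators
// (acc1 and acc2 start at 0; N is assumed to be even)
for (i = 0; i < N; i += 2)
{
    acc1 += a[i];
    acc2 += a[i+1];
}
acc = acc1 + acc2;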

Conclusions

• EMD captures the perceptual similarity or dissimilarity of images well

• Using both the color histogram and the image cluster map improves the results compared with the histogram alone

• No single color space is preferable, but combining them leads to better results

Issues For Further Research

• Including texture properties in image description

• Testing the application on very large image databases (> 10000 images)

• Handling various image transformations, e.g. partial images, scaling, rotation

• More advanced image feature combinations, including color, texture and position