
University of New Orleans

Intelligent Indexing and Retrieval of Images: A Machine Learning Approach

Yixin Chen
Dept. of Computer Science and Engineering
The Pennsylvania State University
http://www.cse.psu.edu/~yixchen


Outline

Introduction
A region-based image categorization method
Cluster-based retrieval of images by unsupervised learning
Conclusions and future work


Image Retrieval

The driving forces
  Internet
  Storage devices
  Computing power

Two approaches
  Text-based approach
  Content-based approach


Text-Based Approach

Input: keyword descriptions (e.g., Elephants)

[Figure: keywords are passed to a Text-Based Image Retrieval System backed by an Image Database]


Text-Based Approach

Index images using keywords (Google, Lycos, etc.)
Easy to implement
Fast retrieval
Web image search (surrounding text)
Manual annotation is not always available
A picture is worth a thousand words
Surrounding text may not describe the image


Content-Based Approach

Index images using low-level features

Content-based image retrieval (CBIR): search pictures as pictures

CBIR System

Image Database


CBIR

Applications
  Commerce (fashion catalogue, ...)
  Biomedicine (X-ray, CT, ...)
  Crime prevention (security filtering, ...)
  Cultural (art galleries, museums, ...)
  Military (radar, aerial, ...)
  Entertainment (personal album, ...)


CBIR

A highly interdisciplinary research area

[Diagram: CBIR draws on Databases, Information Retrieval, Computer Vision, and Domain Knowledge]


Previous Work on CBIR

Starting from early 1990s

General-purpose image search engines

IBM QBIC System and MIT Photobook System (two of the earliest systems)

VIRAGE System, Columbia VisualSEEK and WebSEEK Systems, UCSB NeTra System, UIUC MARS System, Stanford SIMPLIcity System, NECI PicHunter System, Berkeley Blobworld System, etc.


A Data-Flow Diagram

Image Database -> Feature Extraction -> Compute Similarity Measure -> Visualization

Feature extraction: histogram, color layout, sub-images, regions, etc.

Similarity measure: Euclidean distance, intersection, shape comparison, region matching, etc.

Visualization: linear ordering, projection to 2-D, etc.
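The data flow above can be sketched in a few lines. This is a minimal illustration, assuming a joint RGB color histogram as the feature and NumPy arrays as images; the function names are hypothetical and are not taken from any of the systems cited in this talk.

```python
import numpy as np

def color_histogram(image, bins=4):
    """Normalized joint RGB histogram of an H x W x 3 uint8 image."""
    pixels = image.reshape(-1, 3).astype(float)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                             range=((0, 256), (0, 256), (0, 256)))
    hist = hist.ravel()
    return hist / hist.sum()

def euclidean_distance(h1, h2):
    """Smaller means more similar."""
    return float(np.linalg.norm(h1 - h2))

def histogram_intersection(h1, h2):
    """Larger means more similar; equals 1.0 for identical normalized histograms."""
    return float(np.minimum(h1, h2).sum())

def rank_by_distance(query_hist, database_hists):
    """Linear ordering: database indices sorted from most to least similar."""
    dists = [euclidean_distance(query_hist, h) for h in database_hists]
    return [int(i) for i in np.argsort(dists, kind="stable")]
```

Calling `rank_by_distance(color_histogram(query), [color_histogram(img) for img in database])` gives the linear ordering used for display. A global histogram ignores layout and regions, which is one reason feature similarity and semantic similarity can diverge.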


Open Problem

Nature of digital images: arrays of numbers

Descriptions of images: high-level concepts
  Sunset, mountain, dogs, ...

Semantic gap
  Discrepancy between low-level features and high-level concepts
  High feature similarity may not always correspond to semantic similarity


Narrowing the Semantic Gap

Imagery features and similarity measure
  Select effective imagery features [Tieu et al., IEEE CVPR’00]
  Subjective experiments [Mojsilovic et al., IEEE Trans. IP 9(1)]

[Figure: Tigers and Cars in Feature Space]



Image Categorization
  Vacation images [Vailaya et al., IEEE Trans. IP 10(1)]
  SIMPLIcity [Wang et al., IEEE Trans. PAMI 23(9)]
  Indoor/outdoor [Yu and Wolf, SPIE 1995]
  ALIP [Li et al., IEEE Trans. PAMI 2003]

[Figure: an image database organized into categories: Indoor, Outdoor, City, Landscape, Sunset, Forest, Mountain]



Relevance feedback

Adjusting similarity measure [Picard et al., IEEE ICIP’96], [Rui et al., IEEE CSVT 8(5)], [Cox et al., IEEE Trans. IP 9(1)]

Support vector machine [Tong et al. ACM MM’01]

[Figure: CBIR System with returned images 1, 2, 3, 4]


Outline





Image Categorization

Image categorization
  Labeling of images into one of a number of predefined categories

Difficulties
  Variable and uncontrolled imaging conditions
  Complex and hard-to-describe objects
  Occlusion
  Semantic gap


Motivation

(a) to (d) belong to winter category since we see snow in them

(b) to (f) belong to people category since there are people in them

(b) to (d) belong to skiing category since we see people and snow

(a) to (g) belong to outdoor scene category since they all have a region or regions corresponding to snow, sky, sea, trees, or grass


Our Approach

Region-based method
Rule-based description
Generalization of supervised learning
  Labels are associated with images instead of individual regions
  Identical to multiple-instance learning (MIL) setting
    Images – bags
    Regions – instances


Multiple-Instance Learning

Every instance possesses an unknown true label, which is only indirectly accessible through bag labels

Key assumptions of previous formulation [Andrews et al., NIPS’03], [Maron et al., ICML’98], [Zhang et al., ICML’02]
  Bags and instances share the same set of labels
  A bag receives a particular label if at least one of the instances in the bag possesses the label
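The second assumption of the previous formulation amounts to a logical OR over the instances in a bag. The sketch below illustrates this with images as bags and regions as instances; the region labels are made-up illustration data, and in real MIL training these instance labels are hidden from the learner.

```python
def bag_has_label(instance_labels, label):
    """Previous MIL formulation: a bag receives `label` if at least one
    of its instances possesses that label."""
    return any(inst == label for inst in instance_labels)

# Hypothetical bags (images) with hypothetical region labels.
bags = {
    "image_1": ["snow", "sky", "trees"],
    "image_2": ["sea", "sky"],
}

snow_bags = [name for name, regions in bags.items()
             if bag_has_label(regions, "snow")]
```

Note that under this assumption a bag label must be drawn from the instance label set, so a label that names no single region cannot be expressed.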


Multiple-Instance Learning

Previous formulation does not perform well for image categorization

[Figure: example regions for Snow, People, Sky, and Trees, drawn from images (a), (e), (f), and (g)]


New Formulation

Weaker assumptions
  Bags and instances DO NOT share the same set of labels
    {winter, people, skiing, outdoor scenes}
    {snow, people, sky, sea, trees, grass}
  Each instance has multiple labels with different weights


Learning

Region Prototypes (RP)
  Each RP corresponds to an instance class
  Each RP is a local maximizer of Diverse Density [Maron et al., NIPS’98]

Distance between an RP and an image

Rule-based classifier
  Support vector learning [Chen et al., IEEE Trans. Fuzzy Systems 2003]
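To make the learning steps more concrete, here is a hedged sketch of how Diverse Density could be evaluated at a candidate point and how an image could then be described by its distances to a set of region prototypes. The noisy-or form follows the Diverse Density literature. The minimum-over-regions distance is an assumption, since the slide does not define the distance between an RP and an image. The sketch only evaluates the objective; it does not search for local maximizers or train the classifier.

```python
import numpy as np

def instance_prob(bag, t):
    """exp(-||x - t||^2) for every instance x (row) in the bag."""
    return np.exp(-np.sum((bag - t) ** 2, axis=1))

def diverse_density(t, positive_bags, negative_bags):
    """Noisy-or Diverse Density of a candidate point t."""
    dd = 1.0
    for bag in positive_bags:
        # Some instance of a positive bag should lie close to t.
        dd *= 1.0 - np.prod(1.0 - instance_prob(bag, t))
    for bag in negative_bags:
        # Every instance of a negative bag should lie far from t.
        dd *= np.prod(1.0 - instance_prob(bag, t))
    return float(dd)

def distances_to_prototypes(bag, prototypes):
    """Assumed image-to-RP distance: for each prototype, the smallest
    Euclidean distance to any region feature vector in the image."""
    diffs = bag[None, :, :] - prototypes[:, None, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
```

The vector returned by `distances_to_prototypes` has one entry per prototype, so every image maps to a fixed-length feature vector that a standard classifier can consume regardless of how many regions the image has.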


Experiments

20 thematically diverse image categories, each containing 100 images


Categorization Performance

Classification accuracy

Our approach

Color Histogram SVM

MIL (old formulation) [Andrews et al., NIPS’03]



Confusion matrix



Some misclassified images


Sensitivity to Image Segmentation

[Figure: average classification accuracy (0 to 1) versus coarseness level of image segmentation (1 to 5)]

[Figure: standard deviation of accuracy (0 to 0.025) versus coarseness level of image segmentation (1 to 5)]

6.8% 9.5% 11.7% 13.8% 27.4%

Compare our method with MI-SVM


Scalability

[Figure: average classification accuracy (0 to 1) versus number of categories (10 to 20)]

[Figure: standard deviation of accuracy (0 to 0.025), x-axis 10 to 20]

Compare our method with MI-SVM


Scalability

[Figure: difference in average classification accuracy (0 to 0.16), x-axis 10 to 20]

Difference in classification accuracy between our method and MI-SVM


Outline





Motivation

A query image and its top 29 matches returned by a CBIR system

Horses (11 out of 29), flowers (7 out of 29), golf player (4 out of 29)


CLUE: CLUster-based rEtrieval of images by unsupervised learning

Hypothesis
  In the “vicinity” of a query image, images tend to be semantically clustered

CLUE attempts to capture high-level semantic concepts by learning the way that images of the same semantics are similar


System Overview

A general diagram of a CBIR system using CLUE

[Diagram components: Image Database, Feature Extraction, Compute Similarity Measure, CLUE (Select Neighboring Images, Image Clustering), Display and Feedback]


Neighboring Images Selection

Nearest neighbors method
  Pick k nearest neighbors of the query as seeds
  Find r nearest neighbors for each seed
  Take all distinct images as neighboring images

[Figure: example with k=3, r=4]
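The three steps above can be sketched as follows. This is a minimal illustration, assuming a precomputed pairwise distance matrix `dist` in which the query is one of the indexed images. Leaving the query itself out of the result is a choice made here, not something the slide specifies, and the function names are hypothetical.

```python
import numpy as np

def nearest(dist_row, count, exclude):
    """Indices of the `count` smallest distances, skipping excluded indices."""
    order = [int(i) for i in np.argsort(dist_row, kind="stable")
             if int(i) not in exclude]
    return order[:count]

def select_neighbors(dist, query, k, r):
    """Pick k seeds around the query, expand each seed by its r nearest
    neighbors, and return all distinct images in discovery order."""
    seeds = nearest(dist[query], k, exclude={query})
    selected = list(seeds)
    for seed in seeds:
        for n in nearest(dist[seed], r, exclude={seed, query}):
            if n not in selected:
                selected.append(n)
    return selected
```

The result holds at most k + k*r images, and fewer when the seeds share neighbors, which is the situation the clustering step is meant to exploit.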


Weighted Graph Representation

Geometric representation

Graph representation
  Vertices denote images
  Edges are formed between vertices
  Nonnegative weight of an edge indicates the similarity between two vertices
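One common way to obtain nonnegative edge weights from pairwise distances is a Gaussian kernel. The slide does not state which weighting function is used, so the form below and its `scale` parameter are assumptions made for illustration.

```python
import numpy as np

def affinity_matrix(dist, scale):
    """Map a symmetric distance matrix to edge weights in (0, 1]:
    w_ij = exp(-d_ij^2 / scale^2). Closer images get larger weights."""
    dist = np.asarray(dist, dtype=float)
    return np.exp(-(dist ** 2) / (scale ** 2))
```

The `scale` value controls how quickly similarity decays with distance: a small value leaves only very close images strongly connected, while a large value makes the graph nearly uniform.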


Clustering

Graph partitioning and cut

Normalized cut (Ncut) [Shi et al., IEEE Trans. PAMI 22(8)]
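A two-way normalized cut can be approximated with the spectral relaxation from the Ncut literature: solve (D - W) y = λ D y and split on the eigenvector with the second smallest eigenvalue. The sketch below assumes a symmetric weight matrix `w` with positive row sums. Splitting at zero is one simple choice of threshold, and the sketch is not the clustering code of the experimental system.

```python
import numpy as np

def ncut_value(w, mask):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V),
    where `mask` is a boolean array marking the vertices in A."""
    cut = w[mask][:, ~mask].sum()
    return float(cut / w[mask].sum() + cut / w[~mask].sum())

def ncut_bipartition(w):
    """Boolean mask for one side of an approximate normalized cut."""
    d = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    laplacian = np.diag(d) - w
    # Symmetric form D^{-1/2} (D - W) D^{-1/2} of the generalized problem.
    sym = d_inv_sqrt[:, None] * laplacian * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(sym)        # eigenvalues in ascending order
    y = d_inv_sqrt * vecs[:, 1]          # second smallest eigenvector
    return y > 0
```

Applying `ncut_bipartition` recursively to the larger or looser side is one way to obtain more than two clusters; the stopping rule is left open here.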


An Experimental System

Similarity measure
  UFM [Chen et al., IEEE PAMI 24(9)]

Database
  COREL
  60,000 images


User Interface


Query Examples

Query examples from the 60,000-image COREL database
  Bird, car, food, historical buildings, and soccer game

CLUE UFM

Bird, 6 out of 11 Bird, 3 out of 11


Query Examples

CLUE UFM

Car, 8 out of 11 Car, 4 out of 11

Food, 8 out of 11 Food, 4 out of 11


Query Examples

CLUE UFM

Historical buildings, 10 out of 11 Historical buildings, 8 out of 11

Soccer game, 10 out of 11 Soccer game, 4 out of 11


Clustering WWW Images

Google Image Search
Keywords: tiger, Beijing
Top 200 returns
4 largest clusters
Top 18 images within each cluster



Tiger Cluster 1 (75 images)






Beijing Cluster 1 (61 images) Beijing Cluster 2 (59 images)

Beijing Cluster 3 (43 images) Beijing Cluster 4 (31 images)


Retrieval Accuracy

10 image categories, each containing 100 images


Outline





Conclusions

Two approaches to tackle the “semantic gap” problem
  Region-based image categorization method
  Retrieving image clusters by unsupervised learning

Tested using 60,000 images from COREL and images from WWW


Potential Applications

Biomedical Informatics
  MRI, CT images

Internet Image search engine

Digital libraries


Future Work

Continue to make contributions to the areas of
  Machine learning theories
  Computer vision
  Robotics
  Computational intelligence


Future Work

Apply theories to real-world problems
  Biomedical Informatics
  GIS
  Robot vision systems
  Sensor, imaging and video systems
  Autonomous intelligent agents


Related Publications

Image Categorization
  IEEE Trans. Fuzzy Systems, 11, 2003
  IEEE Int’l Conf. on Fuzzy Systems, 2003
  Machine Learning Research, (ready for submission)

CLUE
  IEEE Int’l Sym. Signal Processing and its Applications, 2003
  ACM Multimedia Conference, 2003 (under review)
  IEEE Trans. Pattern Anal. Machine Intell., (under review)

Unified Feature Matching
  IEEE Trans. Pattern Anal. Machine Intell., 24(9), 2002
  ACM Multimedia Conference, 2001
  IEEE Int’l Conf. on Fuzzy Systems, 2003


Support by

The National Science Foundation
The Pennsylvania State University
The PNC Foundation
SUN Microsystems
NEC Research Institute


Acknowledgment

Dissertation committee at Penn State
  Professor James Z. Wang, IST and CSE
  Professor C. Lee Giles, IST and CSE
  Professor John Yen, IST and CSE
  Professor Jia Li, STAT
  Professor Donald Richards, STAT

NEC Research Institute
  Dr. Robert Krovetz


More Information

Papers in PDF, 60,000-image DB, demonstrations, etc.

http://wang.ist.psu.edu

http://www.cse.psu.edu/~yixchen
