university of new orleans 1 intelligent indexing and retrieval of images a machine learning approach...
TRANSCRIPT
University of New Orleans 1
Intelligent Indexing and Retrieval of ImagesA Machine Learning Approach
Yixin ChenDept. of Computer Science and EngineeringThe Pennsylvania State Universityhttp://www.cse.psu.edu/~yixchen
University of New Orleans 2
Outline
IntroductionA region-based image categorization
methodCluster-based retrieval of images by
unsupervised learningConclusions and future work
University of New Orleans 3
Image Retrieval
The driving forces InternetStorage devicesComputing power
Two approachesText-based approachContent-based approach
University of New Orleans 4
Text-Based Approach
Input keywords descriptions
Text-BasedImage
RetrievalSystem
Elephants
ImageDatabase
University of New Orleans 5
Text-Based Approach
Index images using keywords(Google, Lycos, etc.)Easy to implementFast retrievalWeb image search (surrounding text)Manual annotation is not always availableA picture is worth a thousand wordsSurrounding text may not describe the
image
University of New Orleans 6
Content-Based Approach
Index images using low-level features
Content-based image retrieval (CBIR): search pictures as pictures
CBIRSystem
ImageDatabase
University of New Orleans 7
CBIR
ApplicationsCommerce (fashion catalogue, ……)Biomedicine (X-ray, CT, ……)Crime prevention (security filtering, ……)Cultural (art galleries, museums, ……)Military (radar, aerial, ……)Entertainment (personal album, ……)
University of New Orleans 8
CBIR
A highly interdisciplinary research area
Databases CBIR
InformationRetrieval
ComputerVision
DomainKnowledge
University of New Orleans 9
Previous Work on CBIR
Starting from early 1990sGeneral-purpose image search engines
IBM QBIC System and MIT Photobook System (two of the earliest systems)
VIRAGE System, Columbia VisualSEEK and WebSEEK Systems, UCSB NeTra System, UIUC MARS System, Stanford SIMPLIcity System, NECI PicHunter System, Berkeley Blobworld System, etc.
University of New Orleans 10
A Data-Flow Diagram
ImageDatabase
FeatureExtraction
ComputeSimilarityMeasure
Visualization
Histogram, colorlayout, sub-images,regions, etc.
Euclidean distance,intersection, shapecomparison, region matching, etc.
Linear ordering,Projection to 2-D, etc.
University of New Orleans 11
Open Problem
Nature of digital images: arrays of numbers Descriptions of images: high-level concepts
Sunset, mountain, dogs, ……
Semantic gap Discrepancy between low-level features and high-
level concepts High feature similarity may not always correspond
to semantic similarity
University of New Orleans 12
Narrowing the Semantic Gap
Imagery features and similarity measureSelect effective imagery features [Tieu et al., IEEE
CVPR’00]
Subjective experiments [Mojsilovic et al., IEEE Trans. IP 9(1)]
Tigers Cars
Feature Space
University of New Orleans 13
Narrowing the Semantic Gap
Image CategorizationVacation images [Vailaya et
al., IEEE Trans. IP 10(1)]
SIMPLIcity [Wang et al., IEEE Trans. PAMI 23(9)]
Indoor/ourdoor [Yu and Wolf, SPIE 1995]
ALIP [Li et al., IEEE Trans. PAMI 2003]
Image Database
Indoor
Landscape
Sunset
Outdoor
ForestMountain
City
University of New Orleans 14
Narrowing the Semantic Gap
Relevance feedback
Adjusting similarity measure [Picard et al., IEEE ICIP’96], [Rui et al., IEEE CSVT 8(5)], [Cox et al., IEEE Trans. IP 9(1)]
Support vector machine [Tong et al. ACM MM’01]
CBIRSystem
1 2
3 4
1, 2, 3, 4
University of New Orleans 15
Outline
IntroductionA region-based image categorization
methodCluster-based retrieval of images by
unsupervised learningConclusions and future work
University of New Orleans 16
Image Categorization
Image categorizationLabeling of images into one of a number of
predefined categoriesDifficulties
Variable and uncontrolled imaging conditions
Complex and hard-to-describe objectsOcclusionSemantic gap
University of New Orleans 17
Motivation
(a) to (d) belong to winter category since we see snow in them
(b) to (f) belong to people category since there are people in them
(b) to (d) belong to skiing category since we see people and snow
(a) to (g) belong to outdoor scene category since they all have a region or regions corresponding to snow, sky, sea, trees, or grass
University of New Orleans 18
Our Approach
Region-based methodRule-based descriptionGeneralization of supervised learning
Labels are associated with images instead of individual regions
Identical to multiple-instance learning (MIL) setting
Images – bagsRegions – instances
University of New Orleans 19
Multiple-Instance Learning
Every instances posses an unknown true label, which is only indirectly accessible through bag labels
Key assumptions of previous formulation [Andrews, et al., NIPS’03], [Maron, et al., ICML’98], [Zhang, et al., ICML’02]Bags and instances share the same set of
labelsA bag receives a particular label if at least
one of the instances in the bag possesses the label
University of New Orleans 20
Multiple-Instance Learning
Previous formulation does not perform well for image categorization
SnowPeople Sky Trees
(a) (e) (f) (e) (f) (a) (f) (g)
University of New Orleans 21
New Formulation
Weaker assumptions Bags and instances DO NOT share the same set of
labels{winter, people, skiing, outdoor scenes}
{snow, people, sky, sea, trees, grass}
Each instance has multiple labels with different weights
University of New Orleans 22
Learning
Region Prototypes (RP)Each RP corresponds to an instance classEach RP is a local maximizer of Diverse
Density [Maron et al., NIPS’98]
Distance between an RP and an imageRule-based classifier
Support vector learning [Chen et al., IEEE Trans. Fuzzy Systems 2003]
University of New Orleans 23
Experiments
20 thematically diverse image categories, each containing 100 images
University of New Orleans 24
Categorization Performance
Classification accuracy
Our approach
Color Histogram SVM
MIL (old formulation)[Andrews et al., NIPS’03]
University of New Orleans 25
Categorization Performance
Confusion matrix
University of New Orleans 26
Categorization Performance
Some misclassified images
University of New Orleans 27
Sensitivity to Image Segmentation
1 2 3 4 50
0.2
0.4
0.6
0.8
1
Different Coarseness Level of Image Segmentation
Ave
rage
Cla
ssifi
catio
n A
ccur
acy
1 2 3 4 50
0.005
0.01
0.015
0.02
0.025
Different Coarseness Level of Image Segmentation
Sta
ndar
d D
evia
tion
of A
ccur
acy
6.8% 9.5% 11.7% 13.8% 27.4%Compare our method with MI-SVM
University of New Orleans 28
Scalability
10 11 12 13 14 15 16 17 18 19 200
0.2
0.4
0.6
0.8
1
Number of Categories
Ave
rage
Cla
ssifi
catio
n A
ccur
acy
10 11 12 13 14 15 16 17 18 19 200
0.005
0.01
0.015
0.02
0.025
Number of Categories
Sta
ndar
d D
evia
tion
of A
ccur
acy
Compare our method with MI-SVM
University of New Orleans 29
Scalability
10 11 12 13 14 15 16 17 18 19 200
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
Number of Categories
Diff
ere
nce
in A
vera
ge
Cla
ssifi
catio
n A
ccu
racy
Difference in classification accuracy between our method and MI-SVM
University of New Orleans 30
Outline
IntroductionA region-based image categorization
methodCluster-based retrieval of images by
unsupervised learningConclusions and future work
University of New Orleans 31
Motivation
A query image and its top 29 matches returned by a CBIR system
Horses (11 out of 29), flowers (7 out of 29), golf player (4 out of 29)
University of New Orleans 32
CLUE: CLUsters-based rEtrieval of images by unsupervised learning
HypothesisIn the “vicinity” of a query image, images tend to be semantically clustered
CLUE attempts to capture high-level semantic concepts by learning the way that images of the same semantics are similar
University of New Orleans 33
System Overview
A general diagram of a CBIR system using the CLUE
ImageDatabase
FeatureExtraction
ComputeSimilarityMeasure
DisplayAnd
Feedback
SelectNeighboring
Images
ImageClustering
CLUE
University of New Orleans 34
Neighboring Images Selection
Nearest neighbors method Pick k nearest
neighbors of the query as seeds
Find r nearest neighbors for each seed
Take all distinct images as neighboring images
k=3, r=4
University of New Orleans 35
Weighted Graph Representation
Geometric representation Graph representation
Vertices denote images Edges are formed
between vertices Nonnegative weight of
an edge indicates the similarity between two vertices
University of New Orleans 36
Clustering
Graph partitioning and cut
Normalized cut (Ncut) [Shi et al., IEEE Trans. PAMI 22(8)]
University of New Orleans 37
An Experimental System
Similarity measureUFM [Chen et al. IEEE PAMI 24(9)]
DatabaseCOREL60,000
University of New Orleans 38
User Interface
University of New Orleans 39
Query Examples
Query Examples from 60,000-image COREL DatabaseBird, car, food, historical buildings, and soccer game
CLUE UFM
Bird, 6 out of 11 Bird, 3 out of 11
University of New Orleans 40
Query Examples
CLUE UFM
Car, 8 out of 11 Car, 4 out of 11
Food, 8 out of 11 Food, 4 out of 11
University of New Orleans 41
Query Examples
CLUE UFM
Historical buildings, 10 out of 11 Historical buildings, 8 out of 11
Soccer game, 10 out of 11 Soccer game, 4 out of 11
University of New Orleans 42
Clustering WWW Images
Google Image SearchKeywords: tiger, BeijingTop 200 returns4 largest clustersTop 18 images within each cluster
University of New Orleans 43
Clustering WWW Images
Tiger Cluster 1 (75 images)
Tiger Cluster 3 (32 images)
Tiger Cluster 2 (64 images)
Tiger Cluster 4 (24 images)
University of New Orleans 44
Clustering WWW Images
Beijing Cluster 1 (61 images) Beijing Cluster 2 (59 images)
Beijing Cluster 3 (43 images) Beijing Cluster 4 (31 images)
University of New Orleans 45
Retrieval Accuracy
10 image categorieseach containing 100images
University of New Orleans 46
Outline
IntroductionA region-based image categorization
methodCluster-based retrieval of images by
unsupervised learningConclusions and future work
University of New Orleans 47
Conclusions
Two approaches to tackle the “semantic gap” problemRegion-based image categorization methodRetrieving image clusters by unsupervised
learning
Tested using 60,000 images from COREL and images from WWW
University of New Orleans 48
Potential Applications
Biomedical InformaticsMRI, CT images
Internet Image search engine
Digital libraries
University of New Orleans 49
Future Work
Continue to make contributions to the areas ofMachine learning theoriesComputer visionRoboticsComputational intelligence
University of New Orleans 50
Future Work
Apply theories to real world problemsBiomedical InformaticsGISRobot vision systemsSensor, imaging and video systemsAutonomous intelligent agents
University of New Orleans 51
Related Publications
Image Categorization IEEE Trans. Fuzzy Systems, 11, 2003 IEEE Int’l Conf. on Fuzzy Systems, 2003 Machine Learning Research, (ready for submission)
CLUE IEEE Int’l Sym. Signal Processing and its Applications, 2003 ACM Multimedia Conference, 2003 (under review) IEEE Trans. Pattern Anal. Machine Intell., (under review)
Unified Feature Matching IEEE Trans. Pattern Anal. Machine Intell., 24(9),2003 ACM Multimedia Conference, 2001 IEEE Int’l Conf. on Fuzzy Systems, 2003
University of New Orleans 52
Support by
The National Science FoundationThe Pennsylvania State UniversityThe PNC FoundationSUN MicrosystemsNEC Research Institute
University of New Orleans 53
Acknowledgment
Dissertation committee at Penn StateProfessor James Z. Wang, IST and CSEProfessor Lee C. Giles, IST and CSEProfessor John Yen, IST and CSEProfessor Jia Li, STATProfessor Donald Richards, STAT
NEC Research InstituteDr. Robert Krovetz
University of New Orleans 54
More Information
Papers in PDF, 60,000-image DB, demonstrations, etc.
http://wang.ist.psu.edu
http://www.cse.psu.edu/~yixchen