Image Retrieval by Content (CBIR)

Upload: magdalene-stanley

Post on 18-Dec-2015


TRANSCRIPT

Page 1: Image Retrieval by Content (CBIR). Presentation Outline Introduction Introduction History of image retrieval – Issues faced History of image retrieval

Image Retrieval by Content (CBIR)

Page 2

Presentation Outline

• Introduction
• History of image retrieval – Issues faced
• Solution – Content-based image retrieval
• Feature extraction
• Multidimensional indexing
• Current Systems
• Open issues
• Conclusion

Page 3

Introduction

• Image databases, once an expensive proposition in terms of space, cost and time, have now become a reality.
• Image databases store images of various kinds.
• These databases can be searched interactively, based on image content or by indexed keywords.

Page 4

Introduction

Examples:
• Art collection – paintings could be searched by artist, genre, style, color etc.
• Medical images – searched for anatomy, diseases.
• Satellite images – for analysis/prediction.
• General – you want to write an illustrated report.

Page 5

Introduction

Database Projects:
• IBM Query by Image Content (QBIC).
  • Retrieves based on visual content, including properties such as color percentage, color layout and texture.
  • Fine Arts Museum of San Francisco uses QBIC.
• Virage Inc. Search Engine.
  • Can search based on color, composition, texture and structure.

Page 6

Introduction

Commercial Systems:
• Corbis – general purpose, 17 million images, searchable by keywords.
• Getty Images – image database organized by categories and searchable through keywords.
• The National Library of Medicine – database of X-rays, CT scans and MRI images, available for medical research.
• NASA & USGS – satellite images (for a fee!)

Page 7

History of Image Retrieval

• Images appearing on the WWW typically contain captions from which keywords can be extracted.
• In relational databases, entries can be retrieved based on the values of their textual attributes.
• Categories include objects, (names of) people, date of creation and source.
• Images are indexed according to these attributes.

Page 8

History of Image Retrieval

Traditional text-based image search engines
• Manual annotation of images
• Use text-based retrieval methods

E.g.
• Water lilies
• Flowers in a pond
• <Its biological name>

Page 9

History of Image Retrieval

SELECT * FROM IMAGEDB
WHERE CATEGORY = 'GEMS' AND
      SOURCE = 'SMITHSONIAN'

Page 10

History of Image Retrieval

SELECT * FROM IMAGEDB
WHERE CATEGORY = 'GEMS'
  AND SOURCE = 'SMITHSONIAN'
  AND (KEYWORD = 'AMETHYST' OR KEYWORD = 'CRYSTAL' OR KEYWORD = 'PURPLE')

Page 11

Limitations of text-based approach

• Problem of image annotation
  • Large volumes of databases
  • Valid only for one language – with image retrieval this limitation should not exist
• Problem of human perception
  • Subjectivity of human perception
  • Too much responsibility on the end-user
• Problem of deeper (abstract) needs
  • Queries that cannot be described at all, but tap into the visual features of images.

Page 12

Outline

• History of image retrieval – Issues faced
• Solution – Content-based image retrieval
• Feature extraction
• Multidimensional indexing
• Current Systems
• Open issues
• Conclusion

Page 13

What is CBIR?

• Images have rich content.
• This content can be extracted as various content features:
  • Mean color, Color Histogram etc.
• Take the responsibility of forming the query away from the user.
• Each image will now be described by its own features.

Page 14

CBIR – A sample search query

• User wants to search for, say, many rose images
  • He submits an existing rose picture as query.
  • He submits his own sketch of a rose as query.
• The system will extract image features for this query.
• It will compare these features with those of other images in a database.
• Relevant results will be displayed to the user.
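
The query flow above can be sketched in a few lines of Python. This is only an illustration, not the architecture of any particular system: the names `search`, `extract` and `distance` are hypothetical, and the toy feature (mean brightness of a gray-level list) is an assumption chosen to keep the example short. A real system would normally precompute and index the database features instead of extracting them at query time.

```python
def search(query, database, extract, distance, top_k=3):
    """Rank database images by feature distance to the query image.

    query    -- the query image (any representation `extract` accepts)
    database -- dict mapping image name -> image
    extract  -- function turning an image into a feature value or vector
    distance -- function comparing two extracted features
    """
    q_feat = extract(query)
    scored = sorted(
        (distance(q_feat, extract(img)), name) for name, img in database.items()
    )
    return [name for _, name in scored[:top_k]]


# Toy usage: images are flat lists of gray levels, the feature is mean brightness.
def mean_brightness(img):
    return sum(img) / len(img)


def abs_diff(a, b):
    return abs(a - b)


db = {"dark": [10, 20, 30], "mid": [120, 130, 140], "bright": [240, 250, 255]}
ranked = search([125, 125, 125], db, mean_brightness, abs_diff, top_k=2)
```

Passing `extract` and `distance` as parameters keeps the ranking loop independent of the feature in use, so the color, texture and shape measures discussed later could be plugged in without changing it.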

Page 15

Sample Query

Page 16

Sample CBIR architecture

Page 17

Outline

• History of image retrieval – Issues faced
• Solution – Content-based image retrieval
• Feature extraction
• Multidimensional indexing
• Current Systems
• Open issues
• Conclusion

Page 18

Feature Extraction

What are image features?
• Primitive features
  • Mean color (RGB)
  • Color Histogram
• Semantic features
  • Color Layout, texture etc.
• Domain specific features
  • Face recognition, fingerprint matching etc.

General features

Page 19

Mean Color

• Pixel Color Information: R, G, B
• Mean component (R, G or B) = (Sum of that component for all pixels) / (Number of pixels)

Pixel
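
The mean color formula above translates directly into code. The sketch below assumes an image is given as a flat list of `(r, g, b)` tuples; that representation and the function name `mean_color` are illustrative choices, not something the slides prescribe.

```python
def mean_color(pixels):
    """Return the mean (R, G, B) over a flat list of (r, g, b) pixels."""
    n = len(pixels)
    if n == 0:
        raise ValueError("image has no pixels")
    # For each channel: sum of that component over all pixels / number of pixels.
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


black_and_white = [(0, 0, 0), (255, 255, 255)]
avg = mean_color(black_and_white)
```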

Page 20

Histogram

• Frequency count of each individual color
• Most commonly used color feature representation

Image

Corresponding histogram
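
A color histogram can be sketched as follows. Counting every distinct 24-bit color is rarely practical, so this example quantizes each channel into a few bins first; the quantization scheme, the 8-bit channel range and the normalization to fractions are assumptions made for illustration.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over quantized colors.

    pixels -- flat list of (r, g, b) tuples with 8-bit channels (0..255)
    Returns a list of bins_per_channel**3 fractions that sum to 1.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    b = bins_per_channel
    hist = [0] * (b ** 3)
    for r, g, bl in pixels:
        # Map each 0..255 channel value to a bin index 0..b-1.
        rq, gq, bq = r * b // 256, g * b // 256, bl * b // 256
        hist[(rq * b + gq) * b + bq] += 1
    total = len(pixels)
    return [count / total for count in hist]


h = color_histogram([(255, 0, 0), (255, 0, 0), (0, 0, 255)], bins_per_channel=2)
```

Normalizing by the pixel count makes histograms of images with different sizes comparable.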

Page 21

Color Layout

• Need for Color Layout
  • Global color features give too many false positives
• How it works:
  • Divide whole image into sub-blocks
  • Extract features from each sub-block
• Can we go one step further?
  • Divide into regions based on color feature concentration
  • This process is called segmentation.
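
The sub-block step can be sketched as below, using the mean color as the per-block feature. The image representation (a list of rows of `(r, g, b)` tuples), the function name and the choice of mean color are assumptions; the sketch also assumes the image has at least as many rows and columns as the grid, so that no block is empty.

```python
def grid_mean_colors(image, rows, cols):
    """Split an image into rows x cols sub-blocks and return each block's mean color.

    image -- list of pixel rows, each a list of (r, g, b) tuples
    Returns a rows x cols nested list of (R, G, B) means.
    """
    height, width = len(image), len(image[0])
    layout = []
    for gr in range(rows):
        y0, y1 = gr * height // rows, (gr + 1) * height // rows
        row = []
        for gc in range(cols):
            x0, x1 = gc * width // cols, (gc + 1) * width // cols
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(block)
            row.append(tuple(sum(p[c] for p in block) / n for c in range(3)))
        layout.append(row)
    return layout


image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
halves = grid_mean_colors(image, rows=1, cols=2)
```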

Page 22

Example: Color layout

** Image adapted from Smith and Chang: Single Color Extraction and Image Query

Page 23

Images returned for 40% red, 30% yellow and 10% black.

Page 24

Color Similarity Measures

• Color histogram matching could be used as described earlier.
• QBIC defines its color histogram distance as

  d_{dist}(I,Q) = (h(I) - h(Q))^T A (h(I) - h(Q))

  where h(I) and h(Q) are the K-bin histograms of images I and Q respectively and A is a K×K similarity matrix.
• In this matrix similar colors have values close to 1 and colors that are different have values close to 0.
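
The quadratic-form distance above can be written out with plain lists. The function name and the tiny 2-bin example are illustrative assumptions; the similarity values in `A` are made up for the example and are not taken from QBIC.

```python
def quadratic_form_distance(h_i, h_q, A):
    """Compute (h_i - h_q)^T A (h_i - h_q) for K-bin histograms and a K x K matrix A."""
    diff = [a - b for a, b in zip(h_i, h_q)]
    k = len(diff)
    return sum(diff[r] * A[r][c] * diff[c] for r in range(k) for c in range(k))


h_i, h_q = [1.0, 0.0], [0.0, 1.0]

# With the identity matrix, bins are unrelated and the result is the
# squared Euclidean distance between the histograms.
identity = [[1.0, 0.0], [0.0, 1.0]]
d_unrelated = quadratic_form_distance(h_i, h_q, identity)

# If the two bins hold perceptually similar colors (off-diagonal close to 1),
# the same histograms are judged much closer.
similar = [[1.0, 0.9], [0.9, 1.0]]
d_similar = quadratic_form_distance(h_i, h_q, similar)
```

The off-diagonal entries are what let two images match even when their colors fall into neighboring rather than identical bins.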

Page 25

Color Similarity Measures

• Color layout is another possible distance measure.
• The user can specify regions with specific colors.
• Divide the image into a finite number of grid squares. Starting with an empty grid, associate each grid square with a specific color (chosen from a color palette).

Page 26

Page 27

Color Similarity Measures

• It is also possible to provide this information from a sample image, as was seen in Fig 8.3.
• Color layout measures that use a grid require a grid square color distance measure d_{color} that compares the grid squares of the sample image and the matched image.

  d_{gridded\_square}(I,Q) = \sum_g d_{color}(C^I(g), C^Q(g))

Page 28

• Where C^I(g) and C^Q(g) represent the color in grid square g of a database image I and query image Q respectively.
• The representation of the color in a grid square can be simple or complicated.
• Some suitable representations are
  • The mean color in the grid square
  • The mean and standard deviation of the color
  • A multi-bin histogram of the color
• The meaning of each should be fixed ahead of time, e.g. mean color could mean the separate means of R, G and B or a single value.
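
The gridded color distance sums a per-square color distance over all grid squares. In the sketch below each grid square is represented by its mean color and the per-square distance is Euclidean; both are assumptions, since the slides leave the representation and the distance open.

```python
from math import sqrt


def color_distance(c1, c2):
    """Euclidean distance between two (R, G, B) colors (an assumed choice)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def gridded_color_distance(layout_i, layout_q, d_color=color_distance):
    """Sum d_color over corresponding grid squares of two equally sized layouts."""
    return sum(
        d_color(ci, cq)
        for row_i, row_q in zip(layout_i, layout_q)
        for ci, cq in zip(row_i, row_q)
    )


layout_i = [[(0, 0, 0), (3, 4, 0)]]
layout_q = [[(0, 0, 0), (0, 0, 0)]]
d = gridded_color_distance(layout_i, layout_q)
```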

Page 29

Texture

• Texture – innate property of all surfaces
  • Clouds, trees, bricks, hair etc.
  • Refers to visual patterns of homogeneity
  • Does not result from presence of single color
• Most accepted classification of textures based on psychology studies – Tamura representation

• Coarseness

• Contrast

• Directionality

• Linelikeness

• Regularity

• Roughness

Page 30

Segmentation issues

• Considered as a difficult problem
• Not reliable
• Segments regions, but not objects
• Different requirements from segmentation:
  • Shape extraction: High accuracy required
  • Layout features: Coarse segmentation may be enough

Page 31

Texture Similarity Measures

• Texture similarity tends to be more complex to use than color similarity.
• An image that has similar texture to a query image should have the same spatial arrangements of color, but not necessarily the same colors.
• The texture measurements studied in the previous chapter can be used for matching.

Page 32

Page 33

Texture Similarity Measures

• In the previous example Laws texture energy measures were used.
• As can be seen from the results, the measure is independent of color.
• It is also possible to develop measures that look at both texture and color.
• Texture distance measures have two aspects
  • The representation of texture
  • The definition of similarity with respect to that representation

Page 34

Texture Similarity Measures

• The most commonly used texture representation is a texture description vector, which is a vector of numbers that summarizes the texture in a given image or image region.
• The vector of Haralick’s five co-occurrence-based texture features and that of Laws’ nine texture energy features are examples.

Page 35

Texture Similarity Measures

• While a texture description vector can be used to summarize the texture in an entire image, this is only a good method for describing single-texture images.
• For more general images, texture description vectors are calculated at each pixel for a small (e.g. 15 x 15) neighborhood about that pixel.
• Then the pixels are grouped by a clustering algorithm that assigns a unique label to each different texture category it finds.

Page 36

Texture Similarity Measures

• Several distances can be defined once the vector information is derived for an image. The simplest texture distance is the pick-and-click approach, where the user picks the texture by clicking on the image.
• The texture measure vector is found for the selected pixel and is used to measure similarity with the texture measure vectors for the images in the database.

Page 37

Texture Similarity Measures

• The texture distance is given by

  d_{pick\_and\_click}(I,Q) = \min_{i \in I} \| T(i) - T(Q) \|^2

  where T(i) is the texture description vector at pixel i of the image I and T(Q) is the texture description vector at the selected pixel (or region).
• While this could be computationally expensive to do on the fly, prior computation (and indexing) of the textures in the image database would be a solution.
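
The pick-and-click distance is a minimum over squared Euclidean distances. The sketch below assumes the per-pixel texture description vectors of the database image have already been computed and are passed in as a list of equal-length tuples; how those vectors are produced (Laws, Haralick, or anything else) is outside the example.

```python
def pick_and_click_distance(texture_vectors, query_vector):
    """Return min over pixels i of ||T(i) - T(Q)||^2.

    texture_vectors -- precomputed texture description vectors T(i), one per pixel
    query_vector    -- texture description vector T(Q) at the clicked pixel
    """
    return min(
        sum((a - b) ** 2 for a, b in zip(t, query_vector))
        for t in texture_vectors
    )


T_image = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
d = pick_and_click_distance(T_image, (1.0, 2.0))
```

Because the minimum scans every pixel of every database image, this brute-force form illustrates why precomputing and indexing the texture vectors matters.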

Page 38

• An alternative to pick-and-click is the gridded approach discussed under color matching.
• A grid is placed on the image and a texture description vector is calculated for each grid square of the query image. The same process is applied to the DB images.
• The gridded texture distance is given by

  d_{gridded\_texture}(I,Q) = \sum_g d_{texture}(T^I(g), T^Q(g))

  where T^I(g) and T^Q(g) are the texture description vectors in grid square g of I and Q, and d_{texture} can be Euclidean distance or some other distance metric.

Page 39

Shape Similarity Measures

• Color and texture are both global attributes of an image.
• Shape refers to a specific region of an image.
• Shape goes one step further than color and texture in that it requires some kind of region identification process to precede the shape similarity measure.
• Segmentation is still a crucial problem to be solved.
• Shape matching will be discussed here.

Page 40

Shape Similarity Measures

• 2-D shape recognition is an important aspect of image analysis.
• Comparing shapes can be accomplished in several ways – structuring elements, region adjacency graphs etc.
• They tend to be expensive in terms of time.
• In CBIR we need the shape matching to be fast.
• The matching should also be size, rotation and translation invariant.

Page 41

Shape Histogram

• Histogram distance is simply an extension from color and texture.
• The biggest challenge is to define the variable on which the histogram is defined.
• One kind of histogram matching is projection matching, using horizontal and vertical projections of the shape in a binary image.

Page 42

Projection Matching

• For an n x m image construct an (n+m)-bin histogram where each bin contains the number of 1-pixels in one row or column.
• This approach is useful if the shape is always the same size.
• To make PM size invariant, n and m are fixed.
• Translation invariance can be achieved in PM by shifting the histogram from the top-left to the bottom-right of the shape.
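
The projection histogram described above can be sketched as follows. The binary image is assumed to be a list of rows of 0/1 values, and the sum of absolute bin differences used to compare two histograms is an assumed choice; the slides do not fix a particular histogram distance here.

```python
def projection_histogram(binary):
    """Return the (n+m)-bin histogram of an n x m binary image:
    n row counts of 1-pixels followed by m column counts."""
    row_counts = [sum(row) for row in binary]
    col_counts = [sum(col) for col in zip(*binary)]
    return row_counts + col_counts


def projection_distance(shape_a, shape_b):
    """Sum of absolute bin differences; both images must have the same n and m."""
    ha, hb = projection_histogram(shape_a), projection_histogram(shape_b)
    if len(ha) != len(hb):
        raise ValueError("images must have the same dimensions")
    return sum(abs(x - y) for x, y in zip(ha, hb))


t_shape = [
    [0, 1, 0],
    [1, 1, 1],
]
hist = projection_histogram(t_shape)
```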

Page 43

Projection Matching

• Rotational invariance is harder but can be achieved by computing the axes of the best-fitting ellipse and rotating the shape so that the major axis has a standard orientation.
• Since we do not know the top of the shape we have to try two orientations.
• If the major and minor axes are about the same size four orientations are possible.
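
One standard way to find the major axis of the best-fitting ellipse is through second-order central moments, where the orientation is 0.5 * atan2(2·mu11, mu20 − mu02). The sketch below assumes a binary image given as rows of 0/1 values and measures the angle in image coordinates (x to the right, y downward); the function name is illustrative.

```python
from math import atan2, pi


def major_axis_angle(binary):
    """Orientation (radians) of the major axis of a binary shape,
    computed from its second-order central moments."""
    pts = [(x, y) for y, row in enumerate(binary) for x, v in enumerate(row) if v]
    n = len(pts)
    if n == 0:
        raise ValueError("shape has no 1-pixels")
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pts) / n
    mu02 = sum((y - cy) ** 2 for _, y in pts) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pts) / n
    return 0.5 * atan2(2 * mu11, mu20 - mu02)


diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
angle = major_axis_angle(diagonal)
```

The angle identifies the axis but not which end is the top, which is why two orientations still have to be tried after rotating the shape.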

Page 44

Projection Matching

• Another possibility is to construct the histogram over the tangent angle at each pixel on the boundary of the shape.
• This is automatically size and translation but not rotation invariant.
• The rotational invariance can be solved by rotating the histogram (K possible rotations in a K-bin histogram).
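
Rotating a K-bin tangent-angle histogram amounts to trying all K circular shifts and keeping the best match. The sketch below assumes the histograms are already computed and uses the sum of absolute bin differences as the per-shift distance, which is an assumption made for illustration.

```python
def rotation_invariant_distance(h_query, h_image):
    """Minimum histogram distance over all K circular shifts of h_image."""
    k = len(h_query)
    if len(h_image) != k:
        raise ValueError("histograms must have the same number of bins")
    best = float("inf")
    for shift in range(k):
        rotated = h_image[shift:] + h_image[:shift]
        d = sum(abs(a - b) for a, b in zip(h_query, rotated))
        best = min(best, d)
    return best


# The second histogram is the first one rotated by two bins.
d_same = rotation_invariant_distance([3, 1, 0, 0], [0, 0, 3, 1])
d_diff = rotation_invariant_distance([3, 1, 0, 0], [1, 3, 0, 0])
```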

Page 45

Boundary Matching

• BM algorithms require the extraction and representation of the boundaries of the query shape and image shape.
• The boundary can be represented as a sequence of pixels or may be approximated by a polygon.
• For a sequence of pixels, one classical matching technique uses Fourier descriptors to compare two shapes.

Page 46

Boundary Matching

• In the continuous case the FDs are the coefficients of the Fourier series expansion of the function that defines the boundary of the shape.
• In the discrete case the shape is represented by a sequence of m points <V_0, V_1, …, V_{m-1}>.
• From this sequence of points a sequence of unit vectors and a sequence of cumulative differences can be computed.

Page 47

Boundary Matching

• Unit vectors –
• Cumulative differences

Page 48

Boundary Matching

• The Fourier descriptors {a_{-M}, …, a_0, …, a_M} are then approximated by
• These descriptors can be used to define a shape distance measure.

Boundary Matching

Suppose Q is the query shape and I is the image shape. Let {a_n^Q} be the sequence of FDs for the query and {a_n^I} be the sequence of FDs for the image.

Then the Fourier distance measure is given by
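
A natural form for this measure, assuming it is the Euclidean norm of the difference between the two descriptor sequences truncated at |n| ≤ M, is:

```latex
d_{\text{Fourier}}(Q, I) =
  \left[ \sum_{n=-M}^{M} \left| a_n^{Q} - a_n^{I} \right|^{2} \right]^{1/2}
```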

Boundary Matching

This measure is only translation invariant.

Other methods can be used in conjunction with it to handle other invariances, such as rotation and scale.

If the boundary is represented by a polygon, the lengths of its sides and the angles between them can be used to compute and represent the shape.
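
The descriptor-based distance described on the preceding slides can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation. It assumes the polygonal approximation usually attributed to Persoon and Fu, a closed polygon with no repeated consecutive vertices, and it skips n = 0, where that approximation is undefined. The function names `fourier_descriptors` and `fourier_distance` are hypothetical.

```python
import cmath
import math


def fourier_descriptors(points, M):
    """Approximate Fourier descriptors a_n (n = -M..M, n != 0) of a closed polygon.

    points: list of (x, y) vertices in boundary order; the polygon is
    closed implicitly (the last vertex connects back to the first).
    """
    m = len(points)
    V = [complex(x, y) for x, y in points]
    # Edge k runs from V[k] to V[k+1], wrapping around at the end.
    edges = [V[(k + 1) % m] - V[k] for k in range(m)]
    unit = [e / abs(e) for e in edges]  # unit vectors v_k
    # Cumulative boundary length l_k at vertex k; l_0 = 0 and l_m = L.
    lengths = [0.0]
    for e in edges:
        lengths.append(lengths[-1] + abs(e))
    L = lengths[-1]

    descriptors = {}
    for n in range(-M, M + 1):
        if n == 0:
            continue  # the approximation divides by n^2
        w = 2 * math.pi * n / L
        total = sum(
            (unit[k - 1] - unit[k % m]) * cmath.exp(-1j * w * lengths[k])
            for k in range(1, m + 1)
        )
        descriptors[n] = total / (L * w * w)
    return descriptors


def fourier_distance(fd_query, fd_image):
    """Euclidean distance between two descriptor sets with the same keys."""
    return math.sqrt(sum(abs(fd_query[n] - fd_image[n]) ** 2 for n in fd_query))
```

In this sketch, a shape and a shifted copy of it are at distance zero up to rounding, while a scaled copy is not. This matches the remark that the measure is only translation invariant.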

Boundary Matching

Another boundary matching technique is elastic matching, in which the query shape is deformed to become as similar as possible to the image shape.

The distance between the query shape and the image shape depends on two components:

The energy required to deform the query shape

A measure of how well the deformed shape actually matches the image

Sketch Matching

Sketch matching systems allow the user to input a rough sketch of the major edges in an image and look for matching images.

In the ART MUSEUM system, the DB consists of color images of famous paintings. The following preprocessing steps are performed to get an abstract image of each image in the DB.

An affine transform is applied to reduce the image to a standard size, such as 64x64, and a median filter is applied to remove noise. The result is a normalized image.

Edges are detected with a gradient-based edge-finding algorithm. This is done in two steps: major edges are found with a global threshold that is based on the mean and variance of the gradient; then local edges are selected from the global edges by local thresholds. The result is a refined edge image.

Thinning and shrinking are performed on the refined edge image. The final result is an abstract image.
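
The global-threshold step can be sketched as below. This is only an illustration under stated assumptions: the gradient uses central differences, and the threshold is taken as mean + k * standard deviation of the gradient magnitude, which is one plausible reading of "based on the mean and variance" rather than the documented ART MUSEUM rule. Local thresholding, thinning and shrinking are not shown, and the names `gradient_magnitude` and `global_edges` are hypothetical.

```python
import math


def gradient_magnitude(img):
    """Central-difference gradient magnitude of a grayscale image.

    img is a list of rows of numbers; border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag[y][x] = math.hypot(gx, gy)
    return mag


def global_edges(img, k=1.0):
    """Mark pixels whose gradient exceeds mean + k * std (assumed rule)."""
    mag = gradient_magnitude(img)
    values = [v for row in mag for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    threshold = mean + k * math.sqrt(var)
    return [[1 if v > threshold else 0 for v in row] for row in mag]
```

On an image that is dark on the left and bright on the right, only the columns next to the intensity step are marked as edges.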

Sketch Matching

When the user enters a rough sketch, it is also converted to the normalized size, binarized, thinned and shrunk, resulting in a linear sketch.

Now the linear sketch must be matched to the abstract image.

The matching algorithm is (gridded) correlation-based.
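
A gridded correlation can be sketched along the following lines. The assumptions are mine: both images are binary edge maps of the same size, the sketch is cut into square blocks, each block may shift by a few pixels to find its best local overlap, and the block scores are summed. The real system's scoring is likely more refined (for example, penalizing mismatches), and the names `block_score` and `gridded_correlation` are hypothetical.

```python
def block_score(sketch, abstract, y0, x0, size, dy, dx):
    """Count sketch edge pixels in one block that land on abstract-image
    edge pixels after shifting the block by (dy, dx)."""
    h, w = len(abstract), len(abstract[0])
    score = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            yy, xx = y + dy, x + dx
            if sketch[y][x] and 0 <= yy < h and 0 <= xx < w and abstract[yy][xx]:
                score += 1
    return score


def gridded_correlation(sketch, abstract, size=8, max_shift=2):
    """Sum, over all blocks, of the best overlap found within max_shift."""
    h, w = len(sketch), len(sketch[0])
    shifts = range(-max_shift, max_shift + 1)
    total = 0
    for y0 in range(0, h - size + 1, size):
        for x0 in range(0, w - size + 1, size):
            total += max(
                block_score(sketch, abstract, y0, x0, size, dy, dx)
                for dy in shifts
                for dx in shifts
            )
    return total
```

Allowing each block to shift independently makes the score tolerant of the small local misplacements that a hand-drawn sketch will have.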

Face Finding

Face finding is both useful and difficult.

Faces can vary in size and spatial location in an image.

A system developed at CMU employs a multi-resolution approach to solve the size problem.

The system uses a neural-net classifier that was trained on 16,000 images to distinguish faces from non-faces.

Flesh Finding

Another way of finding objects is to find regions in images that have the color and texture usually associated with that object.

Fleck, Forsyth and Bregler (1996) used this to find human flesh:

Finding large regions of potential flesh

Grouping these regions to find potential human bodies

Spatial Relationship

Once objects can be recognized, their spatial relationships can also be determined.

This is the final step in the image retrieval hierarchy.

It involves segmenting images into regions that often correspond to objects or scene background.

A symbolic representation of the image, in which the regions of interest are depicted, can be extracted. This can be useful in understanding the spatial relationships of the objects with the background.

Problem of high dimensions

Mean Color = RGB = 3 dimensional vector

Color Histogram = 256 dimensions

Effective storage and speedy retrieval needed

Traditional data-structures not sufficient

R-trees, SR-Trees etc…
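
To make the two feature sizes concrete, here is a small sketch of both features. The slide does not say how the 256 histogram bins are obtained; quantizing RGB into 8 x 8 x 4 bins is an assumption made here for illustration, and the function names are hypothetical.

```python
def mean_color(pixels):
    """Mean of each RGB channel: a 3-dimensional feature vector."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def color_histogram(pixels, bins_per_channel=(8, 8, 4)):
    """Histogram over quantized RGB colors; 8 * 8 * 4 = 256 dimensions.

    pixels is a list of (r, g, b) tuples with values in 0..255.
    """
    br, bg, bb = bins_per_channel
    hist = [0] * (br * bg * bb)
    for r, g, b in pixels:
        # Map each channel to its bin, then flatten to a single index.
        idx = (r * br // 256) * bg * bb + (g * bg // 256) * bb + (b * bb // 256)
        hist[idx] += 1
    return hist
```

Every image in the database then becomes a point in a 3-dimensional or 256-dimensional space, which is what the index structure has to organize.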

2-dimensional space

[Figure: Point A plotted in a plane with axes D1 and D2]

3-dimensional space

Now, imagine…

An N-dimensional box!!

We want to conduct a nearest neighbor query.

R-trees are designed for speedy retrieval of results for such purposes

Designed by Guttman in 1984
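
For reference, the brute-force version of a nearest neighbor query looks like the sketch below. It is not an R-tree: it compares the query with every stored vector, which is exactly the linear cost that index structures such as R-trees try to avoid. The function name, the dictionary layout and the use of Euclidean distance are assumptions for illustration.

```python
import math


def nearest_neighbor(query, database):
    """Linear-scan nearest neighbor search.

    database maps an image id to its feature vector (a tuple of numbers).
    Returns (best_id, best_distance) under Euclidean distance.
    """
    best_id, best_dist = None, math.inf
    for image_id, vector in database.items():
        d = math.dist(query, vector)
        if d < best_dist:
            best_id, best_dist = image_id, d
    return best_id, best_dist
```

With N images and D dimensions, each query costs on the order of N * D operations, which becomes slow for large collections of 256-dimensional histograms.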

IBM’s QBIC

QBIC – Query by Image Content

First commercial CBIR system.

Model system – influenced many others.

Uses color, texture, shape features

Text-based search can also be combined.

Uses R*-trees for indexing

QBIC – Search by color

** Images courtesy : Yong Rao

QBIC – Search by shape

** Images courtesy : Yong Rao

QBIC – Query by sketch

** Images courtesy : Yong Rao

Virage

Developed by Virage Inc.

Like QBIC, supports queries based on color, layout, texture

Supports arbitrary combinations of these features with weights attached to each

This gives users more control over the search process
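
The idea of weighting features can be illustrated with a small sketch. It is not Virage's API. It assumes each feature already yields a distance between the query and a candidate image, and it combines them as a normalized weighted sum. The name `combined_distance` is hypothetical.

```python
def combined_distance(feature_distances, weights):
    """Weighted average of per-feature distances.

    feature_distances and weights are dicts keyed by feature name,
    for example "color", "layout", "texture". The weights for the
    features present are assumed to sum to a positive value.
    """
    total_weight = sum(weights[name] for name in feature_distances)
    weighted = sum(weights[name] * d for name, d in feature_distances.items())
    return weighted / total_weight
```

Raising the weight of one feature pulls the ranking toward images that are close in that feature, which is how a user could emphasize color over texture in a query.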

VisualSEEk

Research prototype – Columbia University

Mainly different because it considers spatial relationships between objects.

Global features like mean color and color histogram can give many false positives

Matching spatial relationships between objects and visual features together results in a powerful search.

ISearch

Feature selection in ISearch

Database Admin facility in ISearch

Open issues

Gap between low-level features and high-level concepts

Human in the loop – interactive systems

Retrieval speed – most research prototypes can handle only a few thousand images.

A reliable test-bed and measurement criterion, please!