
International Journal of Scientific Research and Management (IJSRM), Volume 3, Issue 4, Pages 2696-2707, 2015. Website: www.ijsrm.in, ISSN (e): 2321-3418

Sumiti Bansal1 IJSRM volume 3 issue 4 April 2015 [www.ijsrm.in] Page 2696

Content Based Image Retrieval Using C-SVM Technique

Sumiti Bansal1, Er. Rishamjot Kaur2

1Department of Computer Science & Engineering,

Baba Farid College of Engineering and Technology, Bathinda, India

[email protected]

2Department of Information Technology,

Baba Farid College of Engineering and Technology, Bathinda, India

[email protected]

Abstract: Content Based Image Retrieval (CBIR) is an important technique for retrieving images similar to a query image from a large database. Many traditional techniques have been employed for this purpose. Relevance feedback, which interactively refines the user’s query by asking the user whether a retrieved image is relevant or not, is often a critical component when designing image databases. The cluster based support vector machine (C-SVM) technique makes this task easier and more effective by selecting the most relevant images that satisfy the user’s requirement. Experimental results show that the technique is effective: the accuracy of the proposed system is approximately 88%.

Keywords: Image Retrieval, Query Concept, Support Vector Machines, Cluster based support vector machine, Content Based Image Retrieval.

1. Introduction

1.1 CBIR (Content Based Image Retrieval)

Content based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision to the image retrieval problem. “Content-based” means that the search analyzes the actual contents of the image. The term ‘content’ in this context may refer to colors, shapes, textures, or any other information that can be derived from the image itself, such as a similarity matrix that compares images pixel by pixel.

Content based image retrieval applies computer vision techniques to address the problems associated with text based image retrieval in large digital image databases. In content based image retrieval the query is itself an image, and its low level features are used to describe its content. Low level features are characteristics of the image such as color, texture, and shape. These features are extracted from the query image as well as from all the images in the database using feature extraction methods [2].

One key design task is the construction of image databases and the creation of an effective relevance feedback component. Creating a database by relevance feedback or by hand is very time consuming, costly and subjective. Expressing an end user's query purely through low level features such as color, texture and shape is challenging, because such requirements are hard to articulate. There is therefore a need for an alternative technique to meet the user requirements.

Content Based Image Retrieval is the task of searching a database for images that appear visually similar to a given example or query image. Content-based image retrieval uses the visual contents of an image such as color, shape, texture, and spatial layout to represent and index the image. In typical content-based image retrieval systems, the visual contents of the images in the database are extracted and described by multi-dimensional feature vectors.

Figure 1.1: CBIR and its various components (block diagram: Query Image, Feature Extraction, Query Image Feature; Image Collection, Feature Extraction, Feature Database; both paths feed Similarity Matching, which outputs Retrieved Images)

As described above, the visual contents of the database images are represented by multi-dimensional feature vectors. To retrieve images, users provide the retrieval system with example images. The system then converts these examples into its internal representation of feature vectors.

1.2 General Introduction

In image clustering (also called unsupervised classification), images are grouped into meaningful clusters on the basis of similarity and not on the basis of known structures or labels. The problem is to find groups and structures that are similar, without prior knowledge of predefined data types (hence the name ‘unsupervised classification’). Cluster analysis divides a set of data objects into multiple classes, or clusters, so that objects in the same cluster are highly similar to one another but as different as possible from objects in other clusters. Unlike classification, clustering does not make use of predefined classes derived from the known class labels of a training data set.
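To make the idea of unsupervised grouping concrete, the sketch below clusters a handful of feature vectors with k-means. The paper does not state which clustering algorithm it uses, so the choice of k-means, the function name `kmeans`, the deterministic initialisation and the toy data are all assumptions made for illustration only.

```python
import math

def kmeans(points, k, iterations=20):
    """Group points into k clusters with plain k-means (illustrative only)."""
    # Assumption: the first k points serve as initial centroids, so runs are deterministic.
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical feature vectors: mean (R, G, B) values of six images.
features = [(250, 10, 10), (10, 10, 240), (240, 20, 15),
            (20, 15, 250), (245, 5, 20), (5, 20, 245)]
labels, centers = kmeans(features, k=2)
print(labels)  # reddish images share one label, bluish images the other
```

No class labels are supplied anywhere: the two groups emerge only from the distances between the vectors, which is the sense in which clustering is unsupervised.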

Support vector machines (SVMs) are a supervised learning technique that is often used in active learning. SVMs have been applied to tasks such as handwritten digit recognition, object recognition and text classification. The training data are treated as vectors in some space X, and each training vector is given a label of -1 or +1. In their simplest form, SVMs are hyperplanes that separate the training data by a maximal margin: all vectors lying on one side of the hyperplane are labeled -1 and all vectors on the other side are labeled +1. The training instances that lie closest to the hyperplane are called support vectors.
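The geometry described here can be illustrated with a small sketch. It does not train an SVM: the hyperplane parameters `w` and `b` are chosen by hand for a toy data set, and all names and values are hypothetical. It only shows how a separating hyperplane assigns the labels -1 and +1 and how the closest training instances are identified.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign says which side of the hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Label a vector +1 or -1 depending on its side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def distance_to_hyperplane(w, b, x):
    """Geometric distance from x to the hyperplane w.x + b = 0."""
    return abs(decision_value(w, b, x)) / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical 2-D training vectors with labels in {-1, +1}.
training = [((0.0, 0.0), -1), ((1.0, 1.0), -1), ((3.0, 3.0), 1), ((5.0, 4.0), 1)]
# Assumed hyperplane x1 + x2 - 4 = 0, picked by hand rather than learned.
w, b = (1.0, 1.0), -4.0

predictions = [predict(w, b, x) for x, _ in training]
distances = [distance_to_hyperplane(w, b, x) for x, _ in training]
# The instances closest to the hyperplane play the role of support vectors.
closest = min(distances)
support_vectors = [x for (x, _), d in zip(training, distances)
                   if math.isclose(d, closest)]
print(predictions, support_vectors)
```

In a real system the hyperplane would come from an SVM solver; only the instances nearest to it determine the margin, which is why they are singled out.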

Image retrieval is in effect an extension of traditional information retrieval to include images. It is the process of searching for and retrieving images from a large database. As image collections grow larger and more complex, retrieving the right images becomes a difficult problem. Content-Based Image Retrieval (CBIR), also known as query by image content (QBIC), is the process of retrieving images from a database on the basis of features that are extracted automatically from the images themselves. ‘Content-Based’ means that the search analyzes the actual contents of the image. In CBIR, a query is an image or a portion of an image; relevant images are retrieved based on the similarity between the features of the query and the features of the individual images in the database. Possible features include texture, color, shape, orientation, or a combination thereof. Image retrieval performance can be measured in terms of Precision and Recall. CBIR aims to reduce the semantic gap between low-level features and high-level user semantics.

1.3 Features of CBIR System

Images can be searched based on the following features:

Color

Shape

Texture

1.3.1 Color

Color similarity is achieved by computing a color histogram

for each image that identifies the proportion of pixels within an

image holding specific values. Examining images based on the

colors they contain is one of the most widely used techniques

because it does not depend on orientation or image size. Color

searches will usually involve comparing color histograms,

though this is not the only technique in practice.
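As a minimal sketch of histogram-based color comparison, the code below builds a normalised RGB histogram (the proportion of pixels falling in each color bin) and compares two histograms by histogram intersection. The bin count, the intersection measure and the toy images are assumptions; the paper does not specify which histogram comparison it uses.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalised RGB histogram: proportion of pixels in each colour bin.

    Assumes 8-bit channels and that bins_per_channel divides 256.
    """
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel ** 2
                 + (g // step) * bins_per_channel
                 + (b // step))
        hist[index] += 1
    total = len(pixels)
    return [count / total for count in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical colour proportions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical images given as flat lists of (R, G, B) pixels.
red_image = [(255, 0, 0)] * 3 + [(0, 0, 255)]
rearranged = [(0, 0, 255)] + [(250, 5, 5)] * 3   # similar colours, different pixel order
blue_image = [(0, 0, 255)] * 4

h_red = color_histogram(red_image)
same = histogram_intersection(h_red, color_histogram(rearranged))
different = histogram_intersection(h_red, color_histogram(blue_image))
print(same, different)
```

Because the histogram records only proportions, rearranging the pixels leaves the similarity unchanged, which reflects the independence from orientation and image size mentioned above.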

1.3.2 Shape

Shape does not refer to the shape of an image but to the shape of a particular region of interest within it. Shapes are often determined by first applying edge detection or segmentation to an image. Other methods use shape filters to identify given shapes in an image.
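One common way to perform the edge detection step is the Sobel operator. The sketch below is an assumption about how such a step might look, not the method used in this paper; the function name `sobel_edges`, the threshold and the toy image are hypothetical.

```python
def sobel_edges(image, threshold):
    """Mark interior pixels whose Sobel gradient magnitude exceeds threshold."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges

# Toy grayscale image: dark region on the left, bright region on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]
edge_map = sobel_edges(image, threshold=10)
for row in edge_map:
    print(row)
```

Only the pixels around the dark-to-bright boundary are marked; border pixels are left at 0 because the 3x3 window does not fit there.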

1.3.3 Texture

Texture measures look for visual patterns in images and how they are spatially defined. These measures describe not only the texture itself but also where it is located in the image. Texture is a difficult concept to represent. The identification of specific textures in an image is achieved primarily by modeling texture as a two-dimensional gray level variation.
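One way to capture two-dimensional gray level variation is a gray level co-occurrence matrix (GLCM), which Section 1.4 lists among the existing approaches. The sketch below is illustrative only: the horizontal neighbour offset, the contrast statistic and the toy images are assumptions, and the paper does not state that it computes texture this way.

```python
def glcm(image, levels):
    """Count how often gray level i appears immediately to the left of gray level j."""
    matrix = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            matrix[left][right] += 1
    return matrix

def contrast(matrix):
    """GLCM contrast: (i - j)^2 weighted by the normalised co-occurrence counts."""
    total = sum(sum(row) for row in matrix)
    return sum((i - j) ** 2 * count / total
               for i, row in enumerate(matrix)
               for j, count in enumerate(row))

# Toy grayscale images with 4 gray levels (0..3).
flat = [[1, 1, 1, 1], [1, 1, 1, 1]]        # uniform region
stripes = [[0, 3, 0, 3], [0, 3, 0, 3]]     # strongly alternating pattern

flat_contrast = contrast(glcm(flat, 4))
stripes_contrast = contrast(glcm(stripes, 4))
print(flat_contrast, stripes_contrast)
```

A uniform region gives zero contrast while the striped pattern gives a high value, so a single number already separates smooth from coarse textures.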

Applications of CBIR

CBIR systems are useful in a wide range of settings:

Everyday users searching for a particular image on the web.

Professionals such as police forces, for picture recognition in crime prevention.

Medical diagnosis.

Architectural and engineering design.

Fashion and publishing.

Geographical information and remote sensing systems.

Home entertainment.

1.4 Existing Approaches of CBIR

Content based image retrieval using SVM algorithm.

Content based image retrieval using image mining techniques.

Content based image retrieval using clustering.

Content based image retrieval using histogram technique.

Content based image retrieval using GLCM approach.

2. Problem Definition

Since the early 1990s, content-based image retrieval has been a very active research area, and many systems have been developed for both commercial and research purposes. A rich set of search options is available today, but systematic studies involving actual users in practical applications are still needed to explore the trade-offs among the different options.

Researchers have proposed various techniques for the efficient retrieval of images. Many features are used in content based image retrieval: a single primary feature such as color, texture or shape, or several of these in pairs or groups. Existing systems have the following problems:

Existing systems are mainly based on SVM (support vector machine), which has the major limitation that accuracy decreases as the number of images increases.

Existing systems work with only two properties, color and texture, so they may return images that appear similar but are irrelevant.

Existing systems are not tested with real world images.

The largest dataset on which the existing systems have been tested contains 1920 images, whereas in real world situations images must be retrieved from much larger datasets.

Image retrieval measures such as Precision, Recall and F-measure need further improvement.

A new system is required that can solve all of the above problems effectively. Hence, we propose an efficient method that retrieves images using their primary features, namely color and texture, and that is more effective while using less storage space and less retrieval time.

3. Proposed Work

The proposed system for content based image retrieval works in two phases:

Pre Processing Phase: In this phase a dataset of images is provided to the system. For every image, the system evaluates features such as color, texture, shape and the distance between neighboring clusters, and then stores the results for that image in the database.

Image Retrieval Phase: In this phase a query image is passed as input to the system and its features are calculated as in the previous phase. These features are then compared with the features already stored in the database. Images whose features match exactly are given high priority, and images whose features are closely related are given lower priority. The final results are then displayed to the user in order from the highest priority images to the lowest.

The steps of the Preprocessing Phase are:

Step 1: Input the image dataset.

Step 2: Extract the features of the images (Color, Texture, Shape and cluster Distance).

Step 3: Combine these features.

Step 4: Store these features in the database.

The steps of the Image Retrieval Phase are:

Step 1: Input the query image.

Step 2: Extract the features of the query image (Color, Texture, Shape and cluster Distance).

Step 3: Combine these features.

Step 4: Compare these features with the features stored in the database.

Step 5: Display the results according to image priority.
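The two phases can be sketched as follows. This is not the authors' implementation: `extract_features` is a hypothetical stand-in that uses only the mean R, G and B values in place of the combined Color, Texture, Shape and cluster Distance features, priority is approximated by Euclidean distance, and the file names and pixel values are invented for illustration.

```python
import math

def extract_features(pixels):
    """Hypothetical stand-in for the combined feature vector: mean R, G and B."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def preprocess(dataset):
    """Pre-processing phase: extract features for every image and store them."""
    return {name: extract_features(pixels) for name, pixels in dataset.items()}

def retrieve(query_pixels, feature_db, top_k=5):
    """Retrieval phase: compare query features with stored ones, best match first."""
    query = extract_features(query_pixels)
    # Smaller distance means higher priority; an exact match has distance 0.
    ranked = sorted(feature_db, key=lambda name: math.dist(query, feature_db[name]))
    return ranked[:top_k]

# Invented dataset: image name -> list of (R, G, B) pixels.
dataset = {
    "rose.jpg": [(200, 20, 30), (220, 10, 40)],
    "sky.jpg": [(30, 60, 220), (20, 80, 240)],
    "tulip.jpg": [(190, 40, 60), (210, 30, 50)],
}
feature_db = preprocess(dataset)
results = retrieve([(200, 20, 30), (220, 10, 40)], feature_db, top_k=2)
print(results)
```

Storing the feature vectors once during pre-processing means that each query only needs one feature extraction plus a comparison against the stored vectors.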

3.1 Steps of Image Retrieval System

Image Retrieval from the image collections involves the

following steps:

Pre-processing

Image Classification based on a true factor

RGB Components processing

Pre-clustering


Texture feature extraction

SVM algorithm

Sort out the results

Target image selection

Figure 3.1: Image Retrieval System

4. Performance Measurement

Evaluating retrieval performance is a crucial problem in Content-Based Image Retrieval (CBIR). Researchers have created and used many different methods for measuring the performance of a system. We use the most common evaluation measures, namely Precision, Recall and F-measure, usually presented together as a Precision, Recall and F-measure graph. Precision or recall alone carries insufficient information: recall can always be made 1 simply by retrieving all images, and precision can similarly be kept high by retrieving only a few images. Precision and recall should therefore either be used together, or the number of images retrieved should be specified. With this in mind, the following formulae are used to compute the Precision, Recall and F-measure values.

Precision

Recall

F- measure

Precision (P):

Precision measures the number of relevant images retrieved by the CBIR system over the total number of images requested:

$$P = \frac{\text{no. of relevant images retrieved}}{\text{no. of images requested}}$$

Recall (R):

Recall measures the number of images retrieved by the CBIR system over the total number of images requested:

$$R = \frac{\text{no. of images retrieved}}{\text{no. of images requested}}$$

F-measure (F):

The F-measure is the harmonic mean of precision and recall:

$$F = \frac{2RP}{R + P}$$

The F-measure (also called the F-score) is a score that measures the accuracy of the system. It considers both the precision P and the recall R of the system to calculate a combined score.
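The three measures can be computed as in the following sketch, which follows the definitions given in this section. The function names and the example counts are illustrative. Note that the more common definition of recall divides by the total number of relevant images in the collection rather than by the number of images requested.

```python
def precision(relevant_retrieved, requested):
    """P = relevant images retrieved / images requested (definition in this section)."""
    return relevant_retrieved / requested

def recall(retrieved, requested):
    """R = images retrieved / images requested (definition in this section)."""
    return retrieved / requested

def f_measure(p, r):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    return 2 * r * p / (r + p) if (r + p) else 0.0

# Example: 5 images requested, 5 retrieved, 4 of them relevant.
p = precision(4, 5)
r = recall(5, 5)
f = f_measure(p, r)
print(f"P={p:.2f} R={r:.2f} F={f:.3f}")
```

With 4 relevant images out of 5 the F-measure is 8/9, about 0.889; with 3 out of 5 it is 0.75.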

5. Results

We used a database of 9950 images to test our system. During the first stage, queries were tested on a limited database containing images that were duplicates of, different from, or transformations of the query. The analysis checked whether or not the appropriate image was returned as the top result.

Once the initial validation was completed, groups of images were added to the database. The query image was then matched visually against groups of images from the database, and the database images were ranked according to how similar they were perceived to be to the query. These groups serve as the expected relevant results during testing. Figure 5.1 shows how the query image is selected: the system asks the user to browse for a query image from a set of images.




Figure 5.1: Browsing Query Image

In this manner, the C-SVM technique is used to retrieve the images most relevant to the query image from the database efficiently. The results are then compared with those of the previously existing technique. Figure 5.2 shows the result of applying the new C-SVM method.

5.1.1 Performance measurement

Figure 5.2: Images Retrieved by Applying C-SVM

We then evaluate system performance by computing the Precision, Recall and F-measure values for all the images. The precision achieved is 83.3% and the recall is 100%; hence, our system is 88.8% efficient. Results of the performance evaluation for different sets of images, such as butterflies, mountains, flowers and colleges, are shown below.

5.2 Result Discussion

Total no. of images in the database = 9950

Total no. of images retrieved = 5

Figure 5.3: Result of the query image from Dataset of 500 images


For Dataset Set 1

| Name of the query image | No. of images retrieved | Precision | Recall | F-measure |
|---|---|---|---|---|
| 6.jpg | 5 | 5/5 = 1 | 1 | 100% |
| 88.jpg | 5 | 5/5 = 1 | 1 | 100% |
| 191.jpg | 5 | 4/5 = 0.8 | 1 | 88% |
| 196.jpg | 5 | 5/5 = 1 | 1 | 100% |
| 301.jpg | 5 | 5/5 = 1 | 1 | 100% |
| Result | | 0.9 | 1 | 97.2% |

Table 5.1: Results Achieved for Data Set of 500 images

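For Dataset Set 2

| Name of the query image | No. of images retrieved | Precision | Recall | F-measure |
|---|---|---|---|---|
| 6.jpg | 5 | 4/5 = 0.8 | 1 | 88.8% |
| 88.jpg | 5 | 3/5 = 0.6 | 1 | 75% |
| 191.jpg | 5 | 4/5 = 0.8 | 1 | 88% |
| 196.jpg | 5 | 5/5 = 1 | 1 | 100% |
| 301.jpg | 5 | 5/5 = 1 | 1 | 100% |
| Result | | 0.8 | 1 | 88.8% |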


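For Dataset Set 3

| Name of the query image | No. of images retrieved | Precision | Recall | F-measure |
|---|---|---|---|---|
| 6.jpg | 5 | 4/5 = 0.8 | 1 | 88.8% |
| 88.jpg | 5 | 3/5 = 0.6 | 1 | 75% |
| 191.jpg | 5 | 4/5 = 0.8 | 1 | 88% |
| 196.jpg | 5 | 5/5 = 1 | 1 | 100% |
| 301.jpg | 5 | 5/5 = 1 | 1 | 100% |
| Result | | 0.8 | 1 | 88.8% |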



Figure 5.1: Graphical Representation for Precision, Recall and F-Measure (bar chart; x-axis: dataset1, dataset2, dataset3; y-axis: 0 to 1.2; series: Precision, Recall, Average; data labels: 97.20%, 88.80%, 88.80%)


6. Conclusion

The dramatic rise in the size of image databases has spurred the development of effective and efficient retrieval systems. These systems began by retrieving images using textual annotations and later introduced retrieval based on image content, which came to be known as Content Based Image Retrieval or CBIR. CBIR systems retrieve images based on visual features such as texture, colour and shape, as opposed to depending on image descriptions or textual indexing. The main objective of this work is to retrieve images from a database quickly and efficiently. The improved C-SVM is an efficient and powerful technique for handling large data sets. It combines clustering with the support vector machine algorithm and identifies results on that basis. It supports faster image retrieval and also allows the search for more relevant images in large image databases. In the proposed system we used datasets of different sizes to calculate the F-measure value of the images present in the database.

We tested the system on 9950 images taken from different websites. The overall accuracy of the CBIR system comes out to be 88.8%. The proposed system introduces a new technique, named C-SVM, for searching for an image in a large dataset. The images in the dataset include flowers, mountains, butterflies, colleges and animals. Searching for images is a crucial part of any CBIR system, and the proposed system is designed to find an image in a large dataset easily. The concluding remarks are:

The proposed technique addresses the problems listed in our problem definition.

It improves accuracy compared to existing techniques.

The technique gives accurate results not only on website images but also on real world images.

7. Future Scope

In future, this application can be used to classify medical images in order to support the diagnosis of disease.

1. The system could be used to detect diseases in humans.

2. More effort is needed to reduce the image retrieval time for a given input query image.

3. The system could be applied in the field of Computer Vision, which is concerned with the automated processing of images from the real world to extract and interpret information on a real time basis.

4. The system could be used in Astronomy for the study of celestial objects (such as stars, comets, nebulae, planets, star clusters and galaxies).

References

[1] Simon Tong, Edward Chang. Support vector machine active learning for image retrieval. Department of Computer Science, Department of Electrical Engineering, Stanford University.

[2] A. Kannan, Dr. V. Mohan, Dr. N. Anbazhagan. An Effective Method of Image Retrieval using Image Mining Techniques. The International Journal of Multimedia & Its Applications (IJMA), Vol. 2, No. 4, November 2010.

[3] Kun-Che Lu, Don-Lin Yang. Image Processing and Image Mining using Decision Trees. Journal of Information Science and Engineering 25, 989-1003, 2009.

[4] Dr. Sanjay Silakari, Dr. Mahesh Motwani and Manish Maheshwari. Color Image Clustering using Block Truncation Algorithm. IJCSI International Journal of Computer Science Issues, Vol. 4, No. 2, 2009.

[5] Amanbir Sandhu, Aarti Kochhar. Content Based Image Retrieval using Texture, Color and Shape for Image Analysis. International Journal of Computers & Technology, Volume 3, No. 1, Aug 2012.

[6] Saroj Shambharkar, Shubhangi Tirpude. Fuzzy C-Means Clustering For Content Based Image Retrieval System. International Conference on Advancements in Information Technology With workshop of ICBMG 2011, IPCSIT vol. 20, IACSIT Press, Singapore, 2011.

[7] Sonali Jain. A Machine Learning Approach: SVM for Image Classification in CBIR. Department of Computer Science, MITM, RGPV, Indore, April 2013.

[8] Ray-I Chang, Shu-Yu Lin, Jan-Ming Ho, Chi-Wen Fann, and Yu-Chun Wang. A Novel Content Based Image Retrieval System using K-means/KNN with Feature Extraction. ComSIS Vol. 9, No. 4, Special Issue, December 2012.

[9] Peter Stanchev. Using image mining for image retrieval. IASTED Conf. Computer Science and Technology, Cancun, Mexico, 214-218, May 19-21, 2003.

[10] F. Long, H. Zhang, H. Dagan, and D. Feng. Fundamentals of content based image retrieval, in D. Feng, W. Siu, H. Zhang (Eds.): Multimedia Information Retrieval and Management. Technological Fundamentals and Applications, Multimedia Signal Processing Book, Chapter 1, Springer-Verlag, Berlin Heidelberg New York, pp. 1-26, 2003.

[11] Nishchol Mishra, Dr. Sanjay Silakari. Image Mining in the Context of Content Based Image Retrieval: A Perspective. IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 4, No. 3, July 2012.

[12] S. Mangijao Singh, K. Hemachandran. Content-based image retrieval using color moment and Gabor texture feature. IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 5, No. 1, September 2012.

[13] Yong Rui, Thomas S. Huang, Shih-Fu Chang. Image retrieval: current techniques, promising directions, and open issues. Journal of Visual Communication and Image Representation 10, 39-62, 1999.

[14] Gurpreet Kaur. A Review: Content Base Image Mining Technique for Image Retrieval Using Hybrid Clustering. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Volume No. 2, Issue No. 6, June 2013.

[15] Ja-Hwung Su, Wei-Jyun Huang, Philip S. Yu, Vincent S. Tseng. Efficient Relevance Feedback for Content-Based Image Retrieval by Mining User Navigation Patterns. IEEE Transactions on Knowledge and Data Engineering, Vol. 23, No. 3, March 2011.

[16] Manimala Singha and K. Hemachandran. Content Based Image Retrieval using Color and Texture. Signal & Image Processing: An International Journal (SIPIJ), Vol. 3, No. 1, February 2012.

[17] Mahip M. Bartere, Dr. Prashant, R. Deshmukh. An Efficient Technique using Text & Content Base Image Mining Technique for Image Retrieval. International Journal of Engineering Research and Applications (IJERA), Vol. 2, Issue 1, pp. 734-739, Jan-Feb 2012.

[18] Rupinder Kaur, Navleen Kaur. Content Based Image Mining Technique for Image Retrieval Using Optimized Hybrid Clustering. International Journal of Computer Trends and Technology (IJCTT), Volume 11, Number 3, May 2014.

[19] Rajshree S. Dubey, Rajnish Choubey, Joy Bhattacharjee. Multi Feature Content Based Image Retrieval. International Journal on Computer Science and Engineering (IJCSE), Vol. 02, No. 06, 2145-2149, 2010.

[20] Jyoti Rani, Neeraj Gill. Content Based Image Retrieval. Proceedings of Futuristic and Emerging Area in Technology: Issue and Challenges, 2013.

[21] Bin Xu, Can Wang. EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval. IEEE Transactions on Knowledge and Data Engineering, Vol. 6, No. 1, January 2007.

[22] Neha Sharma. Retrieval of image by combining the histogram and HSV features along with surf algorithm. International Journal of Engineering Trends and Technology (IJETT), Volume 4, Issue 7, July 2013.

[23] A. Kannan, Dr. V. Mohan, Dr. N. Anbazhagan. An Effective Method of Image Retrieval using Image Mining Techniques. The International Journal of Multimedia & Its Applications (IJMA), Vol. 2, No. 4, November 2010.

[24] J. Eakins, M. Graham. Content-based image retrieval. Technical Report, University of Northumbria at Newcastle, vol. 7, 1999.

[25] Abby A. Goodrum. Image information retrieval: An overview of current research. Special issue on Information Science Research, Volume 3, No. 2, 2000.

[26] P. Mohanaiah, P. Sathyanarayana, L. GuruKumar. Image Texture Feature Extraction Using GLCM Approach. International Journal of Scientific and Research Publications, Volume 3, Issue 5, May 2013.

[27] Padmashri Suresh, RMD. Sundaram, Aravindhan Arumugam. Feature Extraction in Compressed Domain for Content Based Image Retrieval. International Conference on Advanced Computer Theory and Engineering, 2008.

[28] Neetesh Gupta, Niket Bhargava, Dr. Bhupendra Verma, Md. Ilyas Khan, Shiv Kumar. A New Approach for CBIR Using Coefficient Of Correlation. International Conference on Advances in Computing, Control, and Telecommunication Technologies, 2009.

[29] Kong Fanhui. Image Retrieval Based on Multi-features. International Conference on Network Computing and Information Security, 2011.

[30] Yong-Hwan Lee, Sang-Burm Rhee, Bonam Kim. Content-based Image Retrieval Using Wavelet Spatial-Color and Gabor Normalized Texture in Multi-resolution Database. Sixth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, 2012.

Author Profile

Sumiti Bansal is an M.Tech (Computer Science Engg.) student at Baba Farid College of Engineering and Technology, Bathinda. She received her B.Tech in Computer Science from Baba Farid College of Engineering and Technology, Bathinda, in 2012. She is pursuing her M.Tech thesis in the area of Digital Image Processing.

Er. Rishamjot Kaur received her M.Tech degree in Computer Science from Punjabi University, Patiala in 2012. She is working as an Assistant Professor in the Department of Information Technology at Baba Farid College of Engineering and Technology, Bathinda, India.
