International Journal of Scientific Research and Management (IJSRM), Volume 3, Issue 4, Pages 2696-2707, 2015. Website: www.ijsrm.in ISSN (e): 2321-3418
Sumiti Bansal1 IJSRM volume 3 issue 4 April 2015 [www.ijsrm.in] Page 2696
Content Based Image Retrieval Using C-SVM Technique
Sumiti Bansal1, Er. Rishamjot Kaur2
1Department of Computer Science & Engineering,
Baba Farid College of Engineering and Technology, Bathinda, India
2Department of Information Technology,
Baba Farid College of Engineering and Technology, Bathinda, India
Abstract: Content Based Image Retrieval (CBIR) is an important technique for retrieving images similar to a query image from a large database.
Many traditional techniques have been employed to retrieve images from large databases. Relevance feedback is often a critical task when designing
image databases. Relevance feedback interactively refines the user’s query by asking the user whether an image is relevant or not. The
cluster based support vector machine (C-SVM) technique makes this task easier and more effective. This technique selects the most relevant
images that satisfy the user’s requirement. Experimental results show that the technique is effective: the accuracy of the
proposed system is approximately 88%.
Keywords: Image Retrieval, Query Concept, Support Vector Machines, Cluster based support vector machine, Content Based Image
Retrieval.
1. Introduction
1.1 CBIR (Content Based Image Retrieval)
Content based image retrieval (CBIR), also known as query by
image content (QBIC) and content-based visual information
retrieval (CBVIR), is the application of computer vision to the
image retrieval problem. “Content-based” means that the
search analyzes the actual contents of the image. The term
‘content’ in this context might refer to colors, shapes, textures, or
any other information that can be derived from the image itself,
such as a similarity matrix that compares images pixel by pixel.
Content based image retrieval applies computer vision
techniques to address the problems associated with
text based image retrieval in large digital image databases. In
content based image retrieval the query is itself an image,
and its low level features are used to describe its content.
Low level features are a set of characteristics of the image such
as color, texture, and shape. These features are extracted from
the query image as well as from all the images in the database
using feature extraction methods [2].
One key design task is the construction of image databases and
the creation of an effective relevance feedback component.
Creating a database by relevance feedback or by hand is very
time consuming, costly and subjective. Expressing the end user's
query through low level features such as color, texture and shape
is challenging, because the requirement is hard to articulate in
those terms. There is a
need for an alternative technique to meet the user requirements.
Content Based Image Retrieval is the task of searching a
database and retrieving the images that appear
visually similar to a given example or query
image. Content-based image retrieval uses the visual contents
of an image such as color, shape, texture, and spatial layout to
represent and index the image. In typical content-based image
retrieval systems, the visual contents of the images in the
database are extracted and described by multi-dimensional
feature vectors.
[Block diagram: the query image and the image collection each pass through feature extraction, giving the query image feature and the feature database; similarity matching between the two produces the retrieved images.]
Figure 1.1: CBIR and its various components
To retrieve images, users provide the retrieval system with
example images. The system then converts these examples into its
internal representation of feature vectors, which are compared
with the feature vectors stored for the database images.
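The similarity matching step can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a numeric feature vector and that Euclidean distance is the similarity measure; the paper does not state which measure is used, and the names `euclidean`, `retrieve` and `feature_db`, as well as the vectors, are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query_vec, feature_db, top_k=3):
    """Return the top_k image names, ordered from most to least similar."""
    ranked = sorted(feature_db, key=lambda name: euclidean(query_vec, feature_db[name]))
    return ranked[:top_k]

# Hypothetical 3-dimensional feature vectors, one per database image.
feature_db = {
    "a.jpg": [0.90, 0.10, 0.30],
    "b.jpg": [0.20, 0.80, 0.50],
    "c.jpg": [0.85, 0.15, 0.35],
}

result = retrieve([0.88, 0.12, 0.33], feature_db, top_k=2)
print(result)  # ['a.jpg', 'c.jpg']
```

A smaller distance means a closer match, so the ranked list starts with the image whose feature vector is nearest to that of the query.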
1.2 General Introduction
In image clustering (also called unsupervised classification),
images are grouped into meaningful clusters on the basis of
similarity and not on the basis of known structures or labels.
The problem here is to find groups and structures that are
similar, without prior knowledge of predefined data types
(hence the name ‘unsupervised classification’). Cluster
analysis divides a set of data objects into multiple
classes, or clusters, so that data objects in the same cluster are
highly similar to one another but as different as possible from
objects in other clusters. A cluster is also a collection of data
objects for analysis, but clustering, unlike classification, does not
make use of predefined data types derived from the known class
labels of training data sets.
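As an illustration of grouping by similarity without labels, the following is a minimal k-means sketch. K-means is chosen here only as a familiar example; the paper does not say which clustering algorithm C-SVM uses. The function name, the starting centroids and the sample points are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on lists of numbers, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-dimensional feature vectors forming two obvious groups.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [5.0, 5.0], [5.2, 4.8], [4.8, 5.2]]
centroids, clusters = kmeans(points, centroids=[[0.0, 0.0], [6.0, 6.0]])
print(len(clusters[0]), len(clusters[1]))  # 3 3
```

No class labels are supplied at any point: the two groups emerge purely from the distances between the vectors.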
Support vector machines (SVMs) are a supervised learning
technique that is also used for active learning. SVMs are applied
to tasks such as handwritten digit recognition, object recognition
and text classification. The training data are considered as vectors
in some space X, and each vector is given a label from {-1, 1}.
Put differently, an SVM is a hyperplane that separates the
training data by a maximal margin. All vectors lying on one side
of the hyperplane are labeled -1 and all vectors on the other side
are labeled +1. The training instances that lie closest to the
hyperplane are called support vectors.
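The roles of the hyperplane, the labels and the support vectors can be seen in a small sketch. It assumes a hyperplane w.x + b = 0 that has already been found; no training is performed here, and the weights, bias and data points are hypothetical values chosen for illustration.

```python
import math

def decision(w, b, x):
    """Signed value w.x + b; its sign gives the side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Label +1 on one side of the hyperplane and -1 on the other."""
    return 1 if decision(w, b, x) >= 0 else -1

def distance_to_hyperplane(w, b, x):
    """Geometric distance from point x to the hyperplane."""
    return abs(decision(w, b, x)) / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical separating hyperplane x1 + x2 - 3 = 0 and labeled training vectors.
w, b = [1.0, 1.0], -3.0
training = [([1.0, 1.0], -1), ([0.0, 1.0], -1), ([2.0, 2.0], 1), ([3.0, 3.0], 1)]

labels = [predict(w, b, x) for x, _ in training]
margin = min(distance_to_hyperplane(w, b, x) for x, _ in training)
# Support vectors: the training instances lying closest to the hyperplane.
support = [x for x, _ in training
           if math.isclose(distance_to_hyperplane(w, b, x), margin)]

print(labels)   # [-1, -1, 1, 1]
print(support)  # [[1.0, 1.0], [2.0, 2.0]]
```

Only the two closest points determine the margin; moving the other two points further away would leave the hyperplane unchanged.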
Image retrieval is in effect an extension of traditional
information retrieval to include images. It is the
process of searching for and retrieving images from a large
database. As the images grow more complex, retrieving the right
images becomes a difficult problem. Content-Based Image
Retrieval (CBIR), also known as query by image content
(QBIC), is the process of retrieving images from a database on
the basis of features that are extracted automatically from the
images themselves. ‘Content-Based’ means that the search
analyzes the actual contents of the image. In CBIR, a query is
an image or a portion of an image; relevant images are retrieved
based on the similarity between the features of the query and the
features of the individual images in the database. Possible
features include texture, color, shape, orientation, or a
combination thereof. The quality of image retrieval can be
measured in terms of Precision and Recall. CBIR is used to
reduce the semantic gap between low-level features and
high-level user semantics.
1.3 Feature of CBIR System
Images can be searched based on the following features:
Color
Shape
Texture
1.3.1 Color
Color similarity is achieved by computing a color histogram
for each image that identifies the proportion of pixels within an
image holding specific values. Examining images based on the
colors they contain is one of the most widely used techniques
because it does not depend on orientation or image size. Color
searches will usually involve comparing color histograms,
though this is not the only technique in practice.
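A minimal sketch of histogram-based color comparison follows. It assumes images are given as lists of (R, G, B) tuples with 8-bit channels and uses histogram intersection as the comparison; the paper does not specify either choice, and the bin count and function names are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Return the proportion of pixels falling into each quantized RGB bin."""
    step = 256 / bins_per_channel
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        ri, gi, bi = (min(int(v / step), bins_per_channel - 1) for v in (r, g, b))
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    return [count / len(pixels) for count in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Two images with the same color proportions but different size and pixel order,
# and a third image with entirely different colors.
small = [(255, 0, 0)] * 3 + [(0, 0, 255)]
large = [(0, 0, 255)] * 2 + [(255, 0, 0)] * 6
green = [(0, 255, 0)] * 4

same = histogram_intersection(color_histogram(small), color_histogram(large))
different = histogram_intersection(color_histogram(small), color_histogram(green))
print(same, different)  # 1.0 0.0
```

Because the histogram stores proportions and ignores pixel positions, the two red-and-blue images match perfectly despite differing in size and layout.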
1.3.2 Shape
Shape does not refer to the shape of an image but to the shape
of a particular region that is being sought. Shapes are often
determined by first applying edge detection or segmentation to
an image. Other methods use shape filters to identify given
shapes in an image.
1.3.3 Texture
Texture measures look for visual patterns in images and how
they are spatially defined. These measures describe not only the
texture itself but also where the texture is located in the image.
Texture is a difficult concept to represent. The identification of
specific textures in an image is achieved primarily by modeling
texture as a two-dimensional gray level variation.
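One common way to capture two-dimensional gray level variation is a gray-level co-occurrence matrix (GLCM), which is also listed among the existing approaches in Section 1.4. The sketch below is illustrative only and is not claimed to be the texture measure used by the proposed system; the tiny images and the function names are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix of gray levels for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]

def contrast(p):
    """GLCM contrast: large when neighboring pixels differ strongly in gray level."""
    n = len(p)
    return sum(((i - j) ** 2) * p[i][j] for i in range(n) for j in range(n))

# Hypothetical 2-level images: a flat region and vertical stripes.
flat = [[0, 0, 0], [0, 0, 0]]
stripes = [[0, 1, 0], [0, 1, 0]]

print(contrast(glcm(flat, levels=2)))     # 0.0
print(contrast(glcm(stripes, levels=2)))  # 1.0
```

The flat region has no gray level variation between horizontal neighbors, while in the striped image every horizontal neighbor pair differs, which the contrast value reflects.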
Applications of CBIR
Applications of such systems range from ordinary
users searching for a particular image on the web to specialist uses:
Professionals such as the police force, for
picture recognition in crime prevention.
Medical diagnosis.
Architectural and engineering design
Fashion and publishing.
Geographical information and remote sensing
systems.
Home entertainment.
1.4 Existing Approaches of CBIR
Content based image retrieval using SVM algorithm.
Content based image retrieval using image mining
techniques.
Content based image retrieval using clustering.
Content based image retrieval using histogram
technique.
Content based image retrieval using GLCM approach.
2. Problem Definition
Since the early 1990s, content-based image retrieval has
become a very active research area, and many systems have been
developed for both commercial and research purposes. A rich
set of search options is available today, but
systematic studies involving actual users in practical
applications still need to be done to explore the trade-offs
among the different options.
Various techniques have been proposed by different
researchers for the efficient retrieval of images. Many features
are used in content based image retrieval. These can
be primary features such as color, texture and shape, used
individually, in pairs or in groups. Existing systems have
the following problems:
Existing systems are mainly based on SVM (support
vector machine), which has the major limitation that
accuracy decreases as the number of images increases.
This limitation must be removed.
Existing systems use only two properties, color and
texture, so they may return images that appear similar
but are irrelevant.
Existing systems are not tested with real world images.
The largest dataset on which the existing systems have been
tested contains 1920 images, whereas in real world situations
images must be retrieved from much larger datasets.
Image retrieval measures such as precision, recall and
F-measure need further improvement.
A new system is required that can solve all
the above problems effectively. Hence, we need an efficient
method for image retrieval that uses the primary image features,
that is, color and texture, and that is more effective
while using less storage space and less retrieval
time.
3. Proposed Work
The proposed system for content based image retrieval works in
two phases, which are as follows:
Pre Processing Phase: In this phase a dataset of images is
provided to the system. For every image, the system evaluates
features such as color, texture, shape and the distance between
neighboring clusters, and then
stores the results for every image in the database.
Image Retrieval Phase: In this phase the query image is passed as
an input to the system and its features are
calculated as in the previous phase. These features are then
compared with the features already stored in the database.
Images whose features match exactly are given high priority,
and images whose features are closely related are given
lower priority. The final results are then displayed to the user
in order from high priority images to lower priority images.
The proposed system works through the following steps in the
Preprocessing Phase:
Step 1: Input the image dataset.
Step 2: Extract the features of images (Color, Texture, Shape
and cluster Distance).
Step 3: Combine these features.
Step 4: Store these features in the database.
The following steps are used in Image Retrieval Phase:
Step 1: Input the query image.
Step 2: Extract the features of query image (Color, Texture,
Shape and cluster Distance)
Step 3: Combine these features
Step 4: Compare these features with the features stored in the
database.
Step 5: Display the results according to the image priority.
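The two phases can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: images are represented as flat lists of gray intensities, `extract_features` is a hypothetical stand-in for the color, texture, shape and cluster distance extractors, and priority is modeled as distance between combined feature vectors.

```python
def extract_features(image):
    """Hypothetical stand-in for the real feature extractors (Step 2)."""
    color = [sum(image) / len(image)]    # placeholder "color" feature: mean intensity
    texture = [max(image) - min(image)]  # placeholder "texture" feature: intensity range
    return color + texture               # Step 3: combine the features into one vector

def preprocess(dataset):
    """Preprocessing Phase: extract and store the features of every image (Step 4)."""
    return {name: extract_features(image) for name, image in dataset.items()}

def retrieve(query_image, feature_db):
    """Image Retrieval Phase: rank stored images from highest to lowest priority."""
    q = extract_features(query_image)

    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(q, feature_db[name])) ** 0.5

    # Distance 0 is an exact feature match and therefore comes first.
    return sorted(feature_db, key=distance)

# Hypothetical dataset of three tiny images.
dataset = {
    "x.jpg": [10, 20, 30, 40],
    "y.jpg": [12, 22, 28, 42],
    "z.jpg": [200, 210, 220, 230],
}
feature_db = preprocess(dataset)
ranking = retrieve([10, 20, 30, 40], feature_db)
print(ranking)  # ['x.jpg', 'y.jpg', 'z.jpg']
```

Storing the features once in the preprocessing phase means that only the query image needs feature extraction at retrieval time.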
3.1 Steps of Image Retrieval System
Image Retrieval from the image collections involves the
following steps:
Pre-processing
Image Classification based on a true factor
RGB Components processing
Pre-clustering
Texture feature extraction
SVM algorithm
Sort out the results
Target image selection
Figure 3.1: Image Retrieval System
4. Performance Measurement
Evaluation of retrieval performance is a crucial problem in
Content-Based Image Retrieval (CBIR). Many different
methods for measuring the performance of a system have been
created and used by researchers. We have used the most
common evaluation measures, namely Precision, Recall and
F-measure, usually presented as a Precision, Recall and
F-measure graph. Neither precision nor recall alone contains
sufficient information. We can always make the recall value 1 just by
retrieving all images. Similarly, the precision value can be
kept high by retrieving only a few images. Precision and recall
should therefore either be used together, or the
number of images retrieved should be specified. With this in mind, the
following formulae are used for finding the Precision, Recall and
F-measure values.
Precision
Recall
F- measure
Precision (P):
Precision measures the number of relevant images retrieved by
the CBIR system over the total number of images requested. It
is calculated as:
P = number of relevant images retrieved / number of images
requested
Recall (R):
In this work, recall measures the number of images obtained by
the CBIR system over the total number of images requested. It
is calculated as:
R = number of images retrieved / number of images requested
F-measure (F):
The F-measure is the harmonic mean of precision and recall,
i.e.
F = 2PR / (P + R)
The F-measure (also called F-score) is a score that
measures the accuracy of the system. It considers both
the precision P and the recall R of the system and combines
them into a single score.
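The F-measure formula can be checked with a short worked example. The function names are hypothetical; the numbers use the case of 4 relevant images among 5 retrieved with a recall of 1, which is the kind of entry that appears in the results tables.

```python
def precision(relevant_retrieved, requested):
    """P = relevant images retrieved / images requested."""
    return relevant_retrieved / requested

def f_measure(p, r):
    """Harmonic mean of precision and recall: F = 2PR / (P + R)."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

p = precision(4, 5)    # 4 relevant images among 5 requested
f = f_measure(p, 1.0)  # recall taken as 1
print(p, round(f, 3))  # 0.8 0.889
```

With recall fixed at 1, a precision of 0.8 gives F = 1.6 / 1.8, about 0.889, and a precision of 0.6 gives F = 1.2 / 1.6, about 0.75.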
5. Results
We used a database of 9950 images for testing our system.
During the first stage, queries were tested in a limited database
against images that were duplicates of the query, different from
it, or transformations of it. The analysis checked whether or
not the appropriate image was returned as the topmost result.
Once the initial validation was completed, groups of
images were added to the database. The query image was then
matched visually with groups of images from the database, and
the database images were ranked according to how similar they
were perceived to be to the query. These groups serve as the
expected relevant results during testing. Figure 5.1 shows how
to browse for the query image: the system asks the user to select
a query image from a set of images.
[Figure 3.1 block diagram (blocks as extracted): Query and Target Images; Preprocessing and Classification; RGB Components Processing; Clustering Based on RGB Components; Texture and Color Depth Calculation for Images and Clustering; SVM Algorithm; Sort out the results; Select Target Image.]
Figure 5.1: Browsing Query Image
In this manner the C-SVM technique is used to
retrieve the images most relevant to the query image
from the database in an efficient manner. The results are then
compared with those of the previously existing technique. Figure 5.2
shows the result of applying the new C-SVM method.
5.1.1 Performance measurement
Figure 5.2: Images Retrieved by Applying C-SVM
Then, we evaluate the system performance by finding the
Precision, Recall and F-measure values for all the images. The
achieved precision is 83.3% and the achieved recall
is 100%. Hence, our system is 88.8% efficient. Results
of the performance evaluation for different sets of images such as
butterflies, mountains, flowers and colleges are shown below.
5.2 Result Discussion
Total no. of images in the database = 9950
Total no. of images retrieved = 5
Figure 5.3: Result of the query image from Dataset of 500
images
Figure 5.4: Result of the query image from Dataset of 1000
images
Figure 5.5: Result of the query image from Dataset of 1650
images
Figure 5.6: Result of the query image from Dataset of 9950
images
Figure 5.7: Result of the query image from Dataset of 500
images
Figure 5.8: Result of the query image from Dataset of 1000
images
Figure 5.9: Result of the query image from Dataset of 1650
images
Figure 5.10: Result of the query image from Dataset of 9950
images
Figure 5.11: Result of the query image from Dataset of 9950
images
Figure 5.12: Result of the query image from Dataset of 9950
images
For Dataset Set 1
Name of the query image   No. of images retrieved   Precision   Recall   F-measure
6.jpg                     5                         5/5=1       1        100%
88.jpg                    5                         5/5=1       1        100%
191.jpg                   5                         4/5=0.8     1        88%
196.jpg                   5                         5/5=1       1        100%
301.jpg                   5                         5/5=1       1        100%
Result                                              0.9         1        97.2%
Table 5.1: Results Achieved for Data Set of 500 images
For Dataset Set 2
Name of the query image   No. of images retrieved   Precision   Recall   F-measure
6.jpg                     5                         4/5=0.8     1        88.8%
88.jpg                    5                         3/5=0.6     1        75%
191.jpg                   5                         4/5=0.8     1        88%
196.jpg                   5                         5/5=1       1        100%
301.jpg                   5                         5/5=1       1        100%
Result                                              0.8         1        88.8%
Table 5.2: Results Achieved for Data Set of 1650 images
For Dataset Set 3
Name of the query image   No. of images retrieved   Precision   Recall   F-measure
6.jpg                     5                         4/5=0.8     1        88.8%
88.jpg                    5                         3/5=0.6     1        75%
191.jpg                   5                         4/5=0.8     1        88%
196.jpg                   5                         5/5=1       1        100%
301.jpg                   5                         5/5=1       1        100%
Result                                              0.8         1        88.8%
Table 5.3: Results Achieved for Data Set of 9950 images
Figure 5.1: Graphical Representation for Precision, Recall and F-Measure
[Bar chart: Precision, Recall and Average for dataset1, dataset2 and dataset3; data labels 97.20%, 88.80% and 88.80%.]
6. Conclusion
The dramatic rise in the size of image databases has spurred
the development of effective and efficient retrieval systems.
The development of these systems started with retrieving
images using textual annotations and later introduced image
retrieval based on content. This came to be known as Content
Based Image Retrieval or CBIR. Systems using CBIR retrieve
images based on visual features such as texture, colour and
shape, as opposed to depending on image descriptions or
textual indexing. The main objective of this work is to
retrieve images from a database in a fast and efficient
manner. The improved C-SVM is an efficient and powerful
technique for handling large data sets. It combines
clustering with the support vector machine algorithm
and identifies the results on that basis. It assists faster image
retrieval and also allows the search for more relevant images in
large image databases. In the proposed system we used datasets
of different sizes to calculate the F-measure values for the
images present in the database.
Here we have tested this system on 9950 images taken
from different websites. The overall accuracy for CBIR comes
out to be 88.8%. In the proposed system a
new technique, named C-SVM, is developed to search for an
image in a large dataset. The images present in the
dataset include flowers, mountains, butterflies, colleges and animals.
Searching for images is a crucial part
of any CBIR system, and the proposed system is designed
to make it easy to find an image in a large
dataset. The concluding remarks are:
The proposed technique solves the problems stated in our problem
definition.
It improves the accuracy as compared to existing
techniques.
The technique gives accurate results not only on
website images but also on real world images.
7. Future Scope
This application can be used in future to classify medical
images in order to diagnose the right disease.
1. This system can be useful in future for detecting diseases in
humans.
2. More effort is needed to reduce the image retrieval time for
a given input query image.
3. In future this system can also be applied in the field of
Computer Vision, which is concerned with the automated
processing of images from the real world to extract and
interpret information on a real time basis.
4. In future this system can be used in Astronomy for the study of
celestial objects (such as stars, comets, nebulae, planets, star
clusters and galaxies).
References
[1] Simon Tong, Edward Chang. Support vector machine
active learning for image retrieval. Department of
Computer Science, Department of Electrical
Engineering, Stanford University.
[2] A.Kannan, Dr.V.Mohan, Dr.N.Anbazhagan. An
Effective Method of Image Retrieval using Image
Mining Techniques. The International journal of
Multimedia & Its Applications (IJMA) Vol.2, No.4,
November 2010
[3] Kun-Che Lu, Don-Lin Yang. Image Processing and
Image Mining using Decision Trees. Journal of
Information Science and Engineering 25, 989-1003,
2009.
[4] Dr. Sanjay Silakari, Dr.Mahesh Motwani and Manish
Maheshwari. Color Image Clustering using Block
Truncation Algorithm. IJCSI International Journal of
Computer Science Issues, Vol. 4, No. 2, 2009.
[5] Amanbir Sandhu, Aarti Kochhar. Content Based
Image Retrieval using Texture, Color and Shape for
Image Analysis. International Journal of Computers &
Technology, Volume 3, No. 1, AUG, 2012.
[6] Saroj Shambharkar, Shubhangi Tirpude. Fuzzy C-
Means Clustering For Content Based Image Retrieval
System. International Conference on Advancements
in Information Technology With workshop of
ICBMG 2011, IPCSIT vol.20 (2011) © (2011)
IACSIT Press, Singapore, 2011.
[7] Sonali Jain. A Machine Learning Approach: SVM for
Image Classification in CBIR. Department of
Computer Science, MITM, RGPV, Indore, April
2013.
[8] Ray-I Chang, Shu-Yu Lin, Jan-Ming Ho, Chi-Wen
Fann, and Yu-Chun Wang. A Novel Content Based
Image Retrieval System using K-means/KNN with
Feature Extraction. ComSIS Vol. 9, No. 4, Special
Issue, December 2012.
[9] Peter Stanchev. Using image mining for image
retrieval. IASTED conf. Computer science and
Technology, Cancun, Mexico, 214-218, May 19-21,
2003.
[10] F. Long, H. Zhang, H. Dagan, and D. Feng.
Fundamentals of content based image retrieval in D.
Feng, W. Siu, H. Zhang (Eds.): Multimedia
Information Retrieval and Management.
Technological Fundamentals and Applications,
Multimedia Signal Processing Book, Chapter 1,
Springer-Verlag, Berlin Heidelberg New York, pp.1-
26, 2003.
[11] Nishchol Mishra, Dr. Sanjay Silakari. Image Mining
in the Context of Content Based Image Retrieval: A
Perspective. IJCSI International Journal of Computer
Science Issues, Vol. 9, Issue 4, No 3, July 2012.
[12] S. Mangijao Singh, K. Hemachandran. Content-based
image retrieval using color moment and gabor texture
feature. IJCSI International Journal of Computer
Science Issues, Vol. 9, Issue 5, No 1, September
2012.
[13] Yong Rui and Thomas S. Huang, Shih-Fu Chang.
Image retrieval: current techniques, promising
directions, and open issues. Journal of Visual
Communication and Image Representation 10, 39–62,
1999.
[14] Gurpreet Kaur. A Review: Content Base Image
Mining Technique for Image Retrieval Using Hybrid
Clustering. International Journal of Advanced
Research in Computer Engineering & Technology
(IJARCET), Volume No. 2, Issue No. 6, June 2013.
[15] Ja-Hwung Su, Wei-Jyun Huang, Philip S. Yu, Vincent
S. Tseng. Efficient Relevance Feedback for Content-
Based Image Retrieval by Mining User Navigation
Patterns. IEEE Transactions on Knowledge and Data
Engineering, Vol. 23, No. 3, March 2011.
[16] Manimala Singha and K.Hemachandran. Content
Based Image Retrieval using Color and Texture.
Signal & Image Processing: An International Journal
(SIPIJ) Vol.3, No.1, February 2012.
[17] Mahip M.Bartere, Dr.Prashant, R.Deshmukh. An
Efficient Technique using Text & Content Base
Image Mining Technique for Image Retrieval.
International Journal of Engineering Research and
Applications (IJERA) Vol. 2, Issue 1, pp.734-739,
Jan-Feb 2012.
[18] Rupinder Kaur, Navleen Kaur. Content Based Image
Mining Technique for Image Retrieval Using
Optimized Hybrid Clustering. International Journal
of Computer Trends and Technology (IJCTT) –
volume 11 number 3 – May 2014.
[19] Rajshree S. Dubey, Rajnish Choubey, Joy
Bhattacharjee. Multi Feature Content Based Image
Retrieval. International Journal on Computer Science
and Engineering (IJCSE), Vol. 02, No. 06, 2145-
2149, 2010.
[20] Jyoti Rani, Neeraj Gill. Content Based Image
Retrieval. Proceedings of Futuristic and Emerging
Area in Technology: Issue and Challenges 2013.
[21] Bin Xu, Can Wang. EMR: A Scalable Graph-based
Ranking Model for Content-based Image Retrieval.
IEEE Transactions on Knowledge and Data
Engineering, Vol. 6, No. 1, January 2007.
[22] Neha Sharma. Retrieval of image by combining the
histogram and HSV features along with surf
algorithm. International Journal of Engineering
Trends and Technology (IJETT) - Volume4 Issue7-
July 2013.
[23] A.Kannan, Dr.V.Mohan, Dr.N.Anbazhagan. An
Effective Method of Image Retrieval using Image
Mining Techniques. The International journal of
Multimedia & Its Applications (IJMA) Vol.2, No.4,
November 2010.
[24] J. Eakins, M. Graham. Content-based image retrieval.
Technical Report, University of Northumbria at
Newcastle, vol. 7, 1999.
[25] Abby A. Goodrum. Image information retrieval: An
overview of current research. Special issue on
Information Science Research, volume 3 No. 2, 2000.
[26] P. Mohanaiah, P. Sathyanarayana, L. GuruKumar.
Image Texture Feature Extraction Using GLCM
Approach. International Journal of Scientific and
Research Publications, Volume 3, Issue 5, May 2013.
[27] Padmashri Suresh, RMD.Sundaram, Aravindhan
Arumugam. Feature Extraction in Compressed
Domain for Content Based Image Retrieval.
International Conference on Advanced Computer
Theory and Engineering, 2008.
[28] Neetesh Gupta, Niket Bhargava, Dr. Bhupendra
Verma, Md.Ilyas Khan, Shiv Kumar. A New
Approach for CBIR Using Coefficient Of Correlation.
International Conference on Advances in Computing,
Control, and Telecommunication Technologies, 2009.
[29] Kong Fanhui. Image Retrieval Based on Multi-
features. International Conference on Network
Computing and Information Security, 2011.
[30] Yong-Hwan Lee, Sang-Burm Rhee, Bonam Kim.
Content-based Image Retrieval Using Wavelet
Spatial-Color and Gabor Normalized Texture in
Multi-resolution Database. Sixth International
Conference on Innovative Mobile and Internet
Services in Ubiquitous Computing, 2012.
Author Profile
Sumiti Bansal is a student of M.Tech
(Computer Science Engg.) at Baba
Farid College of Engineering and
Technology, Bathinda. She
received her B.Tech in Computer
Science from Baba Farid College of
Engineering and Technology,
Bathinda, in 2012. She is pursuing
her M.Tech thesis in the area of
Digital Image Processing.
Er. Rishamjot Kaur received her
M.Tech degree in Computer Science from Punjabi University,
Patiala, in 2012. She is working as Assistant Professor in the
Department of Information Technology at Baba Farid College
of Engineering and Technology, Bathinda, India.