Relevance Feedback in Image Retrieval Systems: A Survey

Tao Huang, Lin Luo, Chengcui Zhang

School of Computer Science

Florida International University

Abstract

The relevance feedback approach to image retrieval has been an active research field in the past few years. This powerful technique has proved successful in many application areas. Various ad hoc parameter estimation techniques have been proposed for relevance feedback. In addition, methods that perform relevance feedback on a multi-level image model have been formulated. The method of relevance feedback is based on the vector model, the most popular model in information retrieval, and most previous relevance feedback research can be classified into two approaches: query point movement and re-weighting techniques. More recently, a new trend has appeared towards taking advantage of the semantic content of images in addition to the low-level features. This paper surveys recent studies on relevance feedback techniques for image databases. Throughout the discussion, we introduce the ideas behind various kinds of techniques and compare their performance.

1 Introduction

The development of the computer technology augments the importance of multimedia

information that must be browsed, retrieved and delivered. We cannot access or make use of the

information unless it is organized so as to allow efficient browsing, searching, and retrieval. The

wide spread of networked systems, among which the WWW has a distinguished role, highlights

the importance of good retrieval systems due to the vast amount of information available.

Information Retrieval (IR) is usually based on document surrogates that hold essential

information in an easily processible way. Queries are performed on surrogates instead of original

documents. Traditional approaches rely on text surrogates also for multimedia information, e.g.,

legends or keywords are attached to images, videos, sounds, speech fragments, etc. Retrieval

is performed on them rather than on the original multimedia information. This approach has the

advantage of inheriting efficient technology developed for text retrieval, but it suffers from

two serious drawbacks, as Rui et al. noted in [Rui98, Rui98a]: one is the manual effort

required for attaching text descriptors to multimedia data; the other difficulty results from the

rich content in the images and the different interpretation of the same image given by different

users, leading to inconsistency in keyword assignment.

Techniques for content-based retrieval of multimedia data aim at overcoming these

drawbacks by using numerical features computed by direct analysis of the information content.

Content-Based Image Retrieval (CBIR) was proposed in the early 1990s. CBIR systems

use visual features like color, shape, texture, object placement, line orientation and so on to

represent the image content. This approach is favorable since features can be computed

automatically, and the information used during the retrieval process is always consistent, not

depending on different human interpretation.

In CBIR systems, data objects are represented by surrogates that are feature vectors. Given

an input image, retrieval can be accomplished by extracting the query image’s visual features

and computing similarity coefficients or distances in the feature space between the feature

vectors of the query image and the stored images, then retrieving the matching or most similar images

from the database. The result is usually returned to the user ranked by decreasing similarity. This

process resembles text retrieval techniques, which use term vectors to mark the

presence, absence or relative importance of selected words representing the document content.
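The ranking step described above can be sketched as follows. This is a minimal illustration rather than any particular system's implementation: the function names, the toy three-bin color histograms, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_similarity(query, database, top_k=3):
    """Return the ids of the top_k stored images closest to the query."""
    scored = sorted(database.items(), key=lambda item: euclidean(query, item[1]))
    return [image_id for image_id, _ in scored[:top_k]]


# Hypothetical 3-bin color histograms (illustrative values only).
database = {
    "sunset": [0.8, 0.1, 0.1],
    "forest": [0.1, 0.8, 0.1],
    "ocean": [0.2, 0.2, 0.6],
}
print(rank_by_similarity([0.7, 0.2, 0.1], database))
```

Smaller distance means higher similarity, so sorting by increasing distance yields the "ranked by decreasing similarity" order mentioned in the text.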

Compared with the traditional image retrieval approaches such as keyword annotation,

CBIR is more efficient and practical. Research in this field has developed rapidly and

extensively. While these research efforts establish the basis of CBIR, most of them largely ignore

two distinct characteristics of CBIR systems: (1) the gap between high-level concepts and low-

level features; (2) subjectivity of human perception of visual content. To overcome these

shortcomings, the concept of relevance feedback (RF) associated with CBIR was proposed in

1996 [Rui98]. Relevance feedback is an interactive process in which the user judges the quality

of the retrieval performed by the system by marking, among the images retrieved by the system,

the ones the user perceives as truly relevant. This information is then used to refine the original

query, which is resubmitted for a sharper selection.

In the past few years, the RF approach to image retrieval has been an active research field.

This powerful technique has been proved successful in many application areas. Various ad hoc

parameter estimation techniques have been proposed for RF. In addition, methods that perform

RF on a multi-level image model have been formulated. The method of RF is based on the most

popular vector model [Buckley95, Salton83, Shaw95] used in information retrieval. Most

previous RF research [Aksoy00, Benitez98, Chang99, Chua98, Hsu95, Lee98, Low98,

Meilhac98, Paek99, Rui97a, Seidl97, Wood98] is based on low-level image features such as

color, texture and shape, and can be classified into two approaches: query point movement and

re-weighting techniques [Lu00]. More recently, a new trend has appeared towards taking advantage of the

semantic content of images in addition to the low-level features.

In this paper, we focus primarily on recent studies of RF techniques in the

content-based image retrieval domain. Throughout the discussion, we will introduce the ideas of

various kinds of techniques and compare their performances. The rest of the paper is organized

as follows. Section 2 reviews the development of the RF technique and briefly introduces some

recent studies on RF techniques based on the low-level features. In Section 3, we discuss the RF

techniques used in well-known image retrieval systems. Section 4 introduces the new

trends towards taking advantage of the semantic contents of the images. Section 5 gives the

concluding remarks.

2. Relevance Feedback Technique

RF is a widely accepted method of improving interactive retrieval effectiveness both in text

retrieval and image retrieval systems [Rui99]. The main idea of RF is: an initial search is made

by the system with a user-supplied query, returning a small number of results (documents or

images) to the users. The users indicate which of the returned results are useful (relevant). The

system then automatically refines the original query based upon those user relevance judgments.

The new feedback query is then compared to the collection of information (documents or

images), returning an improved set of documents or images to the user. This process can

continue to iterate until the user’s information need is satisfied.

In this section we briefly introduce the concept of the RF technique in text-based

retrieval, present the ideas behind various recently proposed RF techniques in image retrieval systems,

and compare their performance.

2.1 Relevance Feedback Technique in Text Retrieval

RF was initially introduced in text retrieval as early as the late 1960’s to increase the

number of relevant documents retrieved by a query [Rocchio66]. In RF, a user query is used to

start a search through the document collection. The documents obtained from this search are

examined and certain terms selected from the documents deemed to be relevant are used to

expand the original query. Terms from documents deemed not relevant may also be used to

modify the original query. Rocchio’s original formula, shown in equation (F1), is based on the

vector-space model.

$$Q_1 = Q_0 + \beta \sum_{k=1}^{n_1} \frac{R_k}{n_1} - \gamma \sum_{k=1}^{n_2} \frac{S_k}{n_2} \qquad \text{(F1)}$$

where

Q1 = new query vector

Q0 = initial query vector

Rk = vector for relevant document k

Sk = vector for non-relevant document k

n1 = number of relevant documents

n2 = number of non-relevant documents

β and γ = weight multipliers to control relative contributions of relevant and

non-relevant documents
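As a small worked illustration of equation (F1), the sketch below computes the new query vector from a handful of toy vectors. The function name, the default values of β and γ, and the sample data are assumptions chosen for the example, not values taken from the cited work.

```python
def rocchio_update(q0, relevant, non_relevant, beta=1.0, gamma=0.5):
    """Return Q1 = Q0 + beta * mean(relevant) - gamma * mean(non_relevant)."""

    def mean_vector(vectors):
        # An empty feedback set contributes nothing to the update.
        if not vectors:
            return [0.0] * len(q0)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    r_mean = mean_vector(relevant)
    s_mean = mean_vector(non_relevant)
    return [q + beta * r - gamma * s for q, r, s in zip(q0, r_mean, s_mean)]


# Toy term vectors (illustrative values only).
q1 = rocchio_update(
    q0=[1.0, 0.0],
    relevant=[[1.0, 1.0], [3.0, 1.0]],
    non_relevant=[[0.0, 4.0]],
    beta=0.5,
    gamma=0.25,
)
print(q1)  # [2.0, -0.5]
```

The first coordinate grows because the relevant documents score highly on it; the second shrinks because the non-relevant document dominates it.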

Much of the early research on RF focused on the impact of the relevant and non-relevant document weight multipliers β and γ [Lundquist97]. Earlier research found RF to be extremely effective in text-based IR systems [Salton90, Salton89]. Even now the technique is still adopted in some special text retrieval applications, e.g. the concept-based relevance feedback for text retrieval on the WWW proposed by Chang et al. [Chang99].

As we know, techniques in content-based image retrieval (CBIR) systems lag far behind their text counterparts [Rui99] because it is difficult for humans to express their visual queries precisely. Recently, RF-based CBIR techniques have emerged as a promising research direction. In the following, we look into the RF techniques applied in Content-Based Image Retrieval Systems (CBIRSs).

2.2 Relevance Feedback Techniques in Image Retrieval

In this section, we introduce the main ideas of RF in image retrieval, including the image

model on which the RF method is based and some specific RF methods.

2.2.1 Basic ideas and methods

In image retrieval, Picard and Minka were the first to study the RF technique, and they built it

into a learning system named “FourEyes” [Meilhac98]. FourEyes offers a practical way to get

interactive performance by using the RF technique, and is discussed at length in Section 3.

Since then, the application of RF has been intensively studied. RF techniques do not require

a user to provide accurate initial queries, but rather estimate the user’s ideal query by using

positive and negative examples (training samples) fed back by the user. The fundamental goal of

these techniques is to estimate the ideal query parameters (both the query vectors and the

associated weights) accurately and robustly. However, most of the current RF based systems

estimate the ideal query parameters only on the low-level image features such as color, texture,

and shape. Most of the previous RF research can be classified into two approaches: query point

movement and re-weighting techniques [Ishikawa98]. We will address some well-known low-

level feature based RF techniques and related theories as follows.

The query point movement method as mentioned above essentially tries to improve the

estimate of the “ideal query point” by moving it towards good example points and away from

bad example points. A frequently used technique to iteratively improve this estimate is

Rocchio’s formula, given as equation (F1), for sets of relevant documents Rk and non-relevant

documents Sk given by the user [Lu00]. This technique is implemented in the MARS system

[Rui97].

The central idea behind the re-weighting method is very simple and intuitive [Lu00]. Since

each image is represented by an N dimensional feature vector, we can view it as a point in an N

dimensional space. The weight associated with a feature defines the importance of this feature.

Intuitively, if all the relevant images have similar values for a feature, the feature can be

considered a good indicator of the user’s expectation. Conversely, if the variance of the good

examples is high along a principal axis j, then we can deduce that the values on this axis are not

very relevant to the input query, so we assign a low weight wj to it. Basically, we use the

inverse of the standard deviation calculated over this feature in the feature matrix to update the

weight wj. The smaller the variance, the higher the associated weight.
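The re-weighting idea can be sketched in a few lines. This is an illustration under stated assumptions, not the MARS implementation: the function names are hypothetical, the small constant `eps` is added here only to avoid division by zero when a feature is constant, and normalizing the weights to sum to 1 is a convenience chosen for the example.

```python
import statistics


def inverse_std_weights(relevant, eps=1e-6):
    """Weight each feature by 1 / sigma over the relevant examples."""
    raw = [1.0 / (statistics.pstdev(column) + eps) for column in zip(*relevant)]
    total = sum(raw)
    return [w / total for w in raw]  # normalized so the weights sum to 1


def weighted_distance(x, q, weights):
    """Weighted squared Euclidean distance between image x and query q."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, q))


# Feature 0 is nearly constant across the relevant images; feature 1 varies.
relevant = [[0.50, 0.1], [0.52, 0.9], [0.48, 0.5]]
weights = inverse_std_weights(relevant)
print(weights)
```

Feature 0, on which the relevant images agree, receives most of the weight, so differences along it dominate the refined distance.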

Taking the above two approaches into consideration, Rui et al. formulated methods that perform

relevance feedback on a multi-level image model [Rui97a, Rui98]. We now formalize how an image

object is modeled under this theory.

Fundamentally, an image object O is represented by a triple, as in equation (M1):

O = O(D, F, R) (M1)

where

• D is the raw image data, e.g. a JPEG image.

• F = {f_i} is a set of low-level visual features associated with the image object, such as color, texture and shape.

• R = {r_ij} is a set of representations for a given feature f_i, e.g. both color histogram and color moments are representations for the color feature. Note that each representation r_ij may itself be a vector consisting of multiple components, i.e.

r_ij = [r_ij1, r_ij2, …, r_ijK] (M2)

where K is the length of the vector.

This proposed object model supports multiple representations with dynamically updated

weights to accommodate the rich content in the image objects. Weights exist at various levels.

W_i, W_ij, and W_ijk are associated with features f_i, representations r_ij, and components r_ijk,

respectively.
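One way to picture the three weight levels is the sketch below. It is only an illustration: the text does not state how the levels are combined, so the linear combination used here, the nested-dictionary layout, and all names and values are assumptions.

```python
def weighted_sq_distance(r, q, w):
    """Component level: sum over k of W_ijk * (r_ijk - q_ijk)^2."""
    return sum(wk * (rk - qk) ** 2 for rk, qk, wk in zip(r, q, w))


def object_distance(image, query, weights):
    """Combine the three weight levels into one distance (assumed linear form)."""
    total = 0.0
    for feature, representations in image.items():
        w_i = weights[feature]["W_i"]
        for name, vector in representations.items():
            w_ij = weights[feature]["W_ij"][name]
            w_ijk = weights[feature]["W_ijk"][name]
            total += w_i * w_ij * weighted_sq_distance(
                vector, query[feature][name], w_ijk
            )
    return total


# Hypothetical image and query: the color feature has two representations
# (histogram and moments); the texture feature has one.
image = {
    "color": {"histogram": [1.0, 0.0], "moments": [2.0]},
    "texture": {"wavelet": [0.0, 0.0]},
}
query = {
    "color": {"histogram": [0.0, 0.0], "moments": [0.0]},
    "texture": {"wavelet": [1.0, 1.0]},
}
weights = {
    "color": {
        "W_i": 0.5,
        "W_ij": {"histogram": 1.0, "moments": 0.5},
        "W_ijk": {"histogram": [1.0, 1.0], "moments": [1.0]},
    },
    "texture": {
        "W_i": 2.0,
        "W_ij": {"wavelet": 1.0},
        "W_ijk": {"wavelet": [0.5, 0.5]},
    },
}
print(object_distance(image, query, weights))  # 3.5
```

Relevance feedback on this model then amounts to adjusting the entries of `weights` after each round of user judgments.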

The goal of RF based on this model is to find the appropriate weights to model the user’s

information request. The method of RF has been developed to be an effective technique for this

model.

MARS97 [Rui97] introduced both a query vector moving technique and a re-weighting

technique to estimate the ideal query parameters. MindReader [Ishikawa98] formulated a

minimization problem on the parameter estimation process. These two techniques are among the

best known techniques of RF. Compared with the MARS97 system and other previous RF

systems, the global optimization done in MindReader has shown a new trend towards improving

the robustness of the theoretical basis of RF systems, which will be further discussed in Section

4.1.1.

2.2.2 Some other proposed RF methods

Chua et al. [Chua98a] proposed an RF approach to content-based image retrieval using multiple attributes. The approach has been applied to the text and color attributes of images. To ensure that meaningful features are extracted, a pseudo-object model based on the color coherence vector (CCV) [Pass96] is adopted to model color content. The RF approach employs techniques developed in the fields of information retrieval and machine learning to extract pertinent attributes in integrated content-based image retrieval. Retrieval using the individual attributes of text or color achieves similar levels of retrieval effectiveness, but each is able to retrieve only a subset of the relevant images. To improve overall retrieval effectiveness, most image retrieval systems perform multiple-attribute retrieval by combining evidence [Faloutsos94, Low98, Ortega97]. Such techniques have improved retrieval performance, but they are not very satisfactory, since different attributes tend to have different degrees of importance for different classes of queries. In particular, without detailed knowledge of the collection make-up and the retrieval environment, most users find it difficult to formulate effective queries. Chua et al. therefore adopted the CCV as a representation of the color attribute [Pass96]. By treating the CCV’s color-coherent component as a low-level representation of objects within the images’ contents, they developed a pseudo-object-based method for image retrieval with RF. The main idea is to use the pseudo-object representation for feedback, thus permitting the system to retrieve more new relevant images with similar objects. They also use the user’s relevance judgments to estimate the importance of different attributes in a multi-attribute image retrieval model. The test results for their system indicate that the pseudo-object model and the proposed RF approach are more powerful than the standard RF approaches.

In [Aksoy00], Aksoy et al. presented a weighted distance approach in which the weights are the ratios between the standard deviations of the feature values over the whole database and those among the images selected as relevant by the user. The feedback is used for both independent and incremental updating of the weights, and these weights are used to iteratively refine the effects of different features in the database search. Experiments in [Aksoy00] showed that, with this technique, the average retrieval performance improved greatly after the first iteration.
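A minimal sketch of such a ratio-based weighting is given below. It assumes the ratio is taken as the database standard deviation divided by the standard deviation among the relevant images, so that a feature on which the relevant images agree more tightly than the database as a whole gets a weight above 1; the direction of the ratio, the function name, and the `eps` guard are assumptions of this example rather than details confirmed by the text.

```python
import statistics


def std_ratio_weights(database, relevant, eps=1e-6):
    """Per-feature weight: sigma over the database / sigma over relevant images."""
    weights = []
    for db_column, rel_column in zip(zip(*database), zip(*relevant)):
        sigma_db = statistics.pstdev(db_column)
        sigma_rel = statistics.pstdev(rel_column)
        weights.append(sigma_db / (sigma_rel + eps))
    return weights


# Toy data: the relevant images agree on feature 0 but not on feature 1.
database = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
relevant = [[1.0, 0.0], [1.1, 3.0]]
ratio_weights = std_ratio_weights(database, relevant)
print(ratio_weights)
```

Compared with a plain inverse standard deviation, the ratio takes the natural spread of each feature in the collection into account.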

In addition, MacArthur et al. [MacArthur00] proposed a relevance feedback retrieval system that, at each retrieval iteration, builds a decision tree to uncover a common thread among all images marked as relevant. The tree is then used as a model for inferring which of the unseen images the user would most likely desire. In [MacArthur00], they demonstrated that retrieval precision increases after one or two iterations with this technique, and they also showed that their retriever is fast enough to use online.

All the approaches described above perform RF at the low-level feature vector level, but fail to take into account the actual semantics of the images themselves. The inherent problem with these approaches is that low-level features are often not as powerful in representing the complete semantic content of images as keywords are in representing text documents. In other words, applying the RF approaches of text information retrieval to low-level feature based image retrieval will not be as successful as in text document retrieval. In view of this, there have been efforts to incorporate semantics into RF for image retrieval. In Section 4, we introduce this new trend.

3. Application System Examples

Quite a few well-known CBIR systems have adopted the relevance feedback technique to achieve better efficiency and user satisfaction. We briefly discuss two typical systems:

• PhotoBook

• PicToSeek

3.1. Photobook (FourEyes)

Photobook, developed by MIT’s Media Laboratory, is a set of tools for performing queries on image databases based on image content. Photobook has proved to be one of the most influential of the early CBIR systems. FourEyes, embedded in the most recent version of Photobook, is an interactive, power-assisted tool for segmenting and annotating images that includes the human in the annotation and retrieval loop. FourEyes differs from tools such as QBIC, Virage and CORE, which all support search on various features but offer little assistance in actually choosing one for a given task; FourEyes, by contrast, allows users to express their intent directly. This is possible because FourEyes calculates information-preserving features, that is, features from which all essential aspects of the original image can in theory be reconstructed. Features relevant to a particular type of search can then be computed at search time.

Figure 3.1. Interface for PhotoBook **

** http://vismod.www.media.mit.edu/~tpminka/photobook/foureyes/

The FourEyes system tries to overcome the difficulties of dimensional explosion in feature space by using a “society of models” [Minka96]. In an initial off-line phase, a number of different filtering techniques are applied to the data to hierarchically cluster it in as many ways as possible. Groups of these clusters that best represent classes of scenes are identified and employed in the on-line query phase. The user clicks on some regions and gives them a label, and FourEyes extrapolates the label to other regions, on the image and in the database, using a shared neighbor clustering algorithm. To produce a labeling, FourEyes selects and combines models from the “society of models”. A grouping is a set of image regions that are associated in some way. The elements of a grouping need not come from the same image, so FourEyes has both within-image groupings and across-image groupings. Incorporating within- and across-image groupings brings the advantages of class-likelihood continuity and a self-improving capability.

Once a set of groupings has been formed, FourEyes selects and combines these groupings to form compound groupings for users. User feedback is fed back to alter the clustering according to the success or failure of the query, thus adapting the groupings to the user’s needs. FourEyes uses a learner with a simple concept language but an adaptive weighting mechanism, so that it is easy to steer in desired directions. Each grouping has its own weight. Starting from an empty union, the learner adds to the compound grouping the grouping which maximizes the product of this number and the prior weight of the grouping.
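The greedy construction of a compound grouping can be sketched roughly as follows. The text does not say what "this number" refers to, so the sketch assumes it is the count of positive examples a grouping newly covers, and it further assumes that groupings containing a negative example are skipped. The function name, data layout, and sample values are all hypothetical.

```python
def build_compound_grouping(groupings, prior, positives, negatives):
    """Greedily pick groupings until the positive examples are covered."""
    uncovered = set(positives)
    chosen = []
    while uncovered:
        best, best_score = None, 0.0
        for name, members in groupings.items():
            # Assumption: a grouping containing a negative example is unusable.
            if name in chosen or members & negatives:
                continue
            # Assumed score: newly covered positives times the prior weight.
            score = len(members & uncovered) * prior[name]
            if score > best_score:
                best, best_score = name, score
        if best is None:
            break  # no remaining grouping covers any uncovered positive
        chosen.append(best)
        uncovered -= groupings[best]
    return chosen


# Regions are identified by integers; region 9 was marked negative by the user.
groupings = {"g1": {1, 2, 3}, "g2": {3, 4}, "g3": {4, 5, 9}, "g4": {5}}
prior = {"g1": 1.0, "g2": 1.0, "g3": 2.0, "g4": 0.5}
compound = build_compound_grouping(groupings, prior, {1, 2, 3, 4, 5}, {9})
print(compound)  # ['g1', 'g2', 'g4']
```

Under these assumptions, adapting to feedback means changing the prior weights so that groupings that proved useful are picked earlier in later queries.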

Figure 3.1 shows the interface of the PhotoBook system. Experimental results show that the “society of

models” approach is effective in interactive image annotation and retrieval.

3.2. PicToSeek

The PicToSeek system [Gevers98] is used to explore visual information on the World Wide Web. PicToSeek adopts the relevance feedback technique to offer users interactive and iterative content-based image retrieval. Through its Java interface, PicToSeek allows the user to choose the matching level (fuzzy, exact and so on), the feature type (RGB transition, H transition, etc.), and the type of image wanted (photograph, graphic, picture). The user can pick an example image either by loading it from a URL or by browsing the database on the PicToSeek server (see figure 3.2 for the interface of PicToSeek). The server searches the database and returns the results to the user. From the user’s feedback in the form of negative and positive images, the learning method can automatically learn which image features are more important. The effect of this process is to move the query point in the direction of the relevant images and away from the non-relevant ones.

PicToSeek uses Fisher’s Linear Discriminant Method to classify images into two groups: photographic and synthetic [Gevers98]. The classification method computes three features: color variation, color saturation and color transition strength. Photographic images and synthetic images tend to differ markedly in these features. Furthermore, images are also automatically cataloged according to characteristics such as JFIF-GIF, gray-color, size and creation date.

Figure 3.2. The interface of the PicToSeek WWW search system *

* Refer to: http://zomax.wins.uva.nl:5345/ret_user/.

The detailed retrieval model of the PicToSeek system is described in [Gevers98]. PicToSeek uses the vector feature model. Each image is represented by an image vector of the form

$$I = (f_0, w_{I0};\ f_1, w_{I1};\ \ldots;\ f_n, w_{In}).$$

A query is represented by a corresponding vector Q in the same form:

$$Q = (Q_0, w_{Q0};\ Q_1, w_{Q1};\ \ldots;\ Q_n, w_{Qn}).$$

PicToSeek uses the formula

$$w_i = \left(0.5 + 0.5\,\frac{ff_i}{\max\{ff_i\}}\right)\log\left(\frac{N}{n}\right)$$

to assign weights to the different features. This weight assignment takes into consideration both high feature frequencies and low overall collection frequencies. PicToSeek returns to the user the images stored in the database that are most “similar” to the query image. The similarity between the query and a stored image is measured by a similarity function. PicToSeek chooses the cross correlation similarity measure to provide the best retrieval accuracy without any object clutter; the similarity function is defined as

$$S(Q, I) = \frac{\sum_{k=1}^{n} \min(w_{Qk}, w_{Ik})}{\sum_{k=1}^{n} w_{Qk}}.$$
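The two formulas in this passage can be exercised with a short sketch. It reads the weighting formula as w_i = (0.5 + 0.5 ff_i / max ff) log(N/n) and the similarity as the sum of the element-wise minima of the query and image weights divided by the sum of the query weights. The function names and the interpretation of N and n as collection counts are assumptions of this example, and the sample values are made up.

```python
import math


def feature_weight(ff_i, ff_max, N, n):
    """w_i = (0.5 + 0.5 * ff_i / ff_max) * log(N / n)."""
    return (0.5 + 0.5 * ff_i / ff_max) * math.log(N / n)


def similarity(w_query, w_image):
    """S(Q, I) = sum_k min(w_Qk, w_Ik) / sum_k w_Qk."""
    overlap = sum(min(q, i) for q, i in zip(w_query, w_image))
    return overlap / sum(w_query)


# A feature at half the maximum frequency, present in 10 of 100 images.
print(feature_weight(ff_i=2, ff_max=4, N=100, n=10))

# Toy weight vectors for a query and a stored image.
print(similarity([2.0, 1.0, 1.0], [1.0, 1.0, 3.0]))  # 0.75
```

The similarity reaches 1 only when the image carries at least as much weight as the query on every feature, which is why extra weight in the image (the third component above) does not raise the score.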

Relevance feedback is a method of feature selection and weighting. Users feed back information on negative and positive images. The system learns from the users’ feedback which image features are more important and finds images according to the new feature weighting. PicToSeek allows the specification of relevant and non-relevant images and sub-images. The relevance feedback process is formulated as

$$Q' = \alpha Q + \beta \sum_{D_i \in \mathrm{rel}} \frac{D_i}{|D_i|} - \gamma \sum_{D_i \in \mathrm{nonrel}} \frac{D_i}{|D_i|}.$$

The aim of relevance feedback is to produce an improved query specification. The user need not give a precise initial query formulation; the relevance feedback technique can move the query in the user-desired direction.
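The feedback formula above can be sketched as follows, reading it as Q' = αQ + β Σ D_i/|D_i| over relevant images minus γ Σ D_i/|D_i| over non-relevant images. The sketch assumes |D_i| denotes the Euclidean norm of D_i; the function names, default parameter values, and sample vectors are hypothetical.

```python
import math


def unit(v):
    """Scale v to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def refine_query(q, relevant, non_relevant, alpha=1.0, beta=1.0, gamma=1.0):
    """Move the query towards relevant and away from non-relevant images."""
    new_q = [alpha * x for x in q]
    for d in relevant:
        for k, x in enumerate(unit(d)):
            new_q[k] += beta * x
    for d in non_relevant:
        for k, x in enumerate(unit(d)):
            new_q[k] -= gamma * x
    return new_q


new_q = refine_query([1.0, 1.0], [[3.0, 4.0]], [[0.0, 2.0]], beta=0.5, gamma=0.25)
print(new_q)  # approximately [1.3, 1.15]
```

Normalizing each D_i means every judged image pulls or pushes the query by the same amount regardless of the magnitude of its feature vector.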

4. New Trend

Recently, some new trends have emerged in relevance feedback techniques for content-based image retrieval. Our survey shows that there are roughly two main new trends in this field:

• Deriving more computationally robust methods that perform global optimization on the weight adjustment as well as on the correlation between different attributes.

• Incorporating semantic information, together with the low-level features, into the relevance feedback process for image retrieval.

Besides discussing the above two main trends, other techniques such as query expansion and

storing feedback information will also be included in this section.

4.1 Global Optimization

4.1.1 MindReader Retrieval System

Recently, more computationally robust methods that perform global optimization have been proposed. In [Ishikawa98], the MindReader retrieval system designed by Ishikawa et al. formulates the parameter estimation process as a minimization problem. The key point is that, unlike traditional retrieval systems, whose distance functions can be represented by ellipses aligned with the coordinate axes, MindReader proposes a distance function that is not necessarily aligned with the coordinate axes. It therefore allows for correlations between attributes in addition to different weights on each component.

In their paper, Ishikawa et al. explicitly pointed out that MARS 1997 [Rui97] is inherently incomplete. Consider the two main techniques (query-point movement and re-weighting) used in MARS 1997. 1) In query-point movement, Rui and Huang generated pseudo-document vectors from image feature vectors (by a method called “inverse document frequency”) and then directly applied Rocchio’s formula. Although this technique is based on similarity-based query processing, the similarity values can easily be transformed into straight Euclidean distances. 2) In the standard deviation method used in [Rui97], the basic idea is very intuitive: if the variance of the good examples is high along, say, the j-th axis, then the j-th axis should have a low weight wj. Accordingly, the inverse of the standard deviation of the j-th feature values in the feature matrix is used as the weight wj for feature j, that is, wj = 1/σj. The resulting weights wj are used to compute the new similarity values for images. Ishikawa et al. did not deny the correctness of the intuition behind this method, but pointed out that [Rui97] gives no justification for the specific choice wj = 1/σj. In fact, any decreasing function of σj, such as 1/log(σj), would be a candidate for the weight.

Having identified the incompleteness of MARS 1997, Ishikawa et al. proposed a more robust method, which they claimed includes both of the above query refinement techniques as special cases. This method does not use heuristic parameters such as β and γ in Rocchio’s formula, but directly seeks an optimal solution to a minimization problem over a “hidden distance” function. This function allows not only for different weights on each attribute, but also for correlations between attributes. Figure 1 [Ishikawa98] gives a visual description of the different 2-D distance functions: on the left are the isosurfaces of the straight Euclidean distance function, which are circles; in the center are the isosurfaces of a weighted Euclidean distance, as in MARS 1997, which are ellipses whose major axes must be aligned with the coordinate axes; on the right are the isosurfaces of the distance function proposed by Ishikawa et al., which are also ellipses, but whose major axes are not necessarily aligned with the coordinate axes. In other words, Ishikawa et al. used a general ellipse as the model for their distance function.

[Figure 1: Different distance functions around the query point q. Panels: Euclidean; Weighted Euclidean; Generalized ellipsoid distance.]

As stated in [Ishikawa 98], the proposed distance function is:

$$D(\vec{x}, \vec{q}) = (\vec{x} - \vec{q})^T M (\vec{x} - \vec{q})$$

where M defines a generalized ellipsoid distance matrix.
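The quadratic form above is easy to evaluate directly. The sketch below is an illustration with made-up matrices and a hypothetical function name; it shows that the identity matrix gives the squared Euclidean distance, a diagonal matrix gives a weighted Euclidean distance, and off-diagonal entries introduce correlations between attributes.

```python
def ellipsoid_distance(x, q, M):
    """Evaluate (x - q)^T M (x - q) for a square matrix M given as nested lists."""
    d = [xi - qi for xi, qi in zip(x, q)]
    size = len(d)
    return sum(d[j] * M[j][k] * d[k] for j in range(size) for k in range(size))


x, q = [2.0, 1.0], [0.0, 0.0]

identity = [[1.0, 0.0], [0.0, 1.0]]       # straight (squared) Euclidean
diagonal = [[1.0, 0.0], [0.0, 4.0]]       # weighted Euclidean, axis-aligned
correlated = [[1.0, -0.5], [-0.5, 1.0]]   # ellipse tilted off the axes

print(ellipsoid_distance(x, q, identity))    # 5.0
print(ellipsoid_distance(x, q, diagonal))    # 8.0
print(ellipsoid_distance(x, q, correlated))  # 3.0
```

With the correlated matrix, a point whose two attributes deviate in the same direction is treated as closer than the axis-aligned distances would suggest.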

Clearly, the generalized ellipsoid distance function includes the straight and weighted Euclidean distance functions as special cases. Moreover, the paper proves that the weighting scheme of MARS 1997 is optimal if M is restricted to a diagonal matrix, but that MARS 1997 is unable to “guess” a generalized ellipsoid distance. Furthermore, the experiments reported in the paper show that searching is fast, owing to recent developments in generalized ellipsoid queries in spatial access methods [Seidl97].
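The diagonal special case can be worked through in a few lines. The following is a sketch under stated assumptions, not a reproduction of the proof in the cited paper: it assumes N equally important relevant examples x_1, ..., x_N, a diagonal matrix M = diag(m_1, ..., m_n), every s_j > 0, and a normalization constraint that the product of the m_j equals 1, introduced here to rule out the trivial solution M = 0.

```latex
\begin{align*}
\min_{m_1,\dots,m_n}\ & \sum_{i=1}^{N} (x_i - q)^T M (x_i - q)
  = \sum_{j=1}^{n} m_j s_j,
  \qquad s_j = \sum_{i=1}^{N} (x_{ij} - q_j)^2, \\
\text{subject to}\ & \prod_{j=1}^{n} m_j = 1. \\
\mathcal{L} &= \sum_{j=1}^{n} m_j s_j
  - \lambda \Big( \prod_{j=1}^{n} m_j - 1 \Big), \\
\frac{\partial \mathcal{L}}{\partial m_j} &= s_j - \frac{\lambda}{m_j} = 0
  \quad\Longrightarrow\quad m_j = \frac{\lambda}{s_j},
  \qquad \lambda = \Big( \prod_{j=1}^{n} s_j \Big)^{1/n}.
\end{align*}
```

When q is the mean of the relevant examples, s_j = N σ_j², so m_j is proportional to 1/σ_j². Because m_j multiplies a squared difference, this amounts to scaling each axis by 1/σ_j, which is consistent with the inverse-standard-deviation heuristic.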

4.1.2 MARS 1999

Building on MindReader’s global minimization approach, Rui and Huang [Rui99] give a further improvement. Their CBIR system not only formulates the optimization problem but also takes into account the multi-level image model.

Using Lagrange multipliers, Rui and Huang derived explicit optimal solutions for both the query vectors and the weights associated with the multi-level image model. That is, they combined the two best-known relevance feedback techniques (MindReader and MARS) to overcome the shortcomings of each. For example, even though MindReader formulated a more rigorous estimation process than MARS, overcoming the shortcomings caused by heuristic-based parameter estimation, it failed to analyze the conditions necessary for the technique to work. Moreover, neither technique takes into account that images contain multiple levels of content. We think one of the main contributions of [Rui99] is that it embeds the multi-level image model into the optimal solution and develops a formulation that guarantees explicit optimal solutions while keeping the problem as general as possible. The formulas derived by Rui and Huang are claimed to be general enough to include both MARS and MindReader as special cases.

4.2 Semantic Information in Relevance Feedback

All the techniques discussed so far perform feedback only at the low-level feature vector level, and they fail to take into account the actual semantics of the images themselves. As mentioned in [Lu00], the inherent problem with these approaches is that low-level features are often not as powerful in representing the complete semantic content of images as keywords are in representing text documents. In view of this, there have recently been efforts to incorporate semantics into relevance feedback for content-based image retrieval. The framework proposed in [Lee98, Paek99] tried to embed semantic information into a low-level feature based image retrieval process by using a correlation matrix. In this framework, the user’s feedback helps the system learn the semantic relevance between image clusters, and this information can be used to improve retrieval performance. Another framework is Lu and Zhang’s iFind information retrieval system, which integrates both semantics and low-level features into the relevance feedback process in a new way. The basic idea is to construct a semantic network and integrate it with low-level feature vector based relevance feedback by using a modified form of Rocchio’s formula. The semantic network is represented by a set of keywords having links to the images in the database. A weight is assigned to each individual link [Lu00]:

[Figure 2: Semantic network. Keywords (Keyword 1, Keyword 2, ...) are linked to the images in

the database; each link carries a weight (w11, w12, ..., w1n).]

The weight on each link represents the degree of relevance of the keyword to the semantic

content of the associated image. However, there still seem to be some problems with this kind of

semantic network. First, it is not as rigorous as MARS (1997 and 1999) and MindReader.

For example, no normalization method is mentioned in [Lu 00] for the weights assigned to the

keyword links according to their relevance. Second, the proposed criteria for weight

adjustment seem to have no theoretical basis. A third problem is scalability: as the

vocabulary expands, the proposed framework may have serious trouble dealing with the

increased complexity. Overall, the framework proposed in [Lu 00] is not as solid

as MARS and MindReader.
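
The structure of such a semantic network, and the normalization issue raised above, can be illustrated with a minimal sketch. The class name, the update rule (add a fixed boost on positive feedback, multiply by a decay factor on negative feedback) and the per-image normalization are our own illustrative assumptions; they are not the rules used in [Lu 00].

```python
from collections import defaultdict

class SemanticNetwork:
    """Keywords linked to images by weighted links (hypothetical sketch)."""

    def __init__(self):
        # keyword -> {image_id: link weight}
        self.links = defaultdict(dict)

    def feedback(self, keywords, positives, negatives, boost=1.0, decay=0.5):
        """Adjust link weights for the query keywords after one feedback round.

        boost and decay are assumed values chosen only for illustration.
        """
        for kw in keywords:
            for img in positives:
                # Create the link if needed, then strengthen it.
                self.links[kw][img] = self.links[kw].get(img, 0.0) + boost
            for img in negatives:
                # Weaken an existing link; never create one from negative feedback.
                if img in self.links[kw]:
                    self.links[kw][img] *= decay

    def relevance(self, image_id):
        """Keyword weights of one image, normalized to sum to 1."""
        raw = {kw: imgs[image_id]
               for kw, imgs in self.links.items() if image_id in imgs}
        total = sum(raw.values())
        return {kw: w / total for kw, w in raw.items()} if total else {}

net = SemanticNetwork()
net.feedback(["tiger"], positives=["img1", "img2"], negatives=[])
net.feedback(["tiger", "grass"], positives=["img1"], negatives=["img2"])
```

Normalizing per image makes weights comparable across images with different amounts of feedback, but it is only one of several possible choices, and the dictionary grows with the vocabulary, which is the scalability concern noted above.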

4.3 Other Trends in Feedback Information Retrieval

Other new trends and techniques include query expansion [Porkaew 99a, Porkaew 99b]

and storing feedback information [Bartolini 00]. In query expansion, in each feedback

iteration the relevant objects are added to the query and the non-relevant ones are removed;

this has been shown to perform better than the query point movement approach. Another new

idea is to store the outcome of a feedback process when the process terminates. Instead of

starting each new feedback process with default parameters (even for the same query),

Bartolini and Ciaccia use wavelets to store the parameters and to predict

parameter settings for similar queries by interpolation. That is, the feedback process

for a new query can be started with a parameter setting that is usually much better (much

closer to the optimal) than the default parameters, so that both effectiveness and

response time improve.
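
The query expansion step described above can be sketched as follows. This is a minimal illustration, assuming that the query is a set of example images and that a database image is scored by its mean Euclidean distance to all query points; the function names, the aggregation rule and the toy database are our own assumptions rather than the method of [Porkaew 99a, Porkaew 99b].

```python
import numpy as np

def expand_query(query_ids, relevant, non_relevant):
    """Add the relevant objects to the query and remove the non-relevant ones."""
    return (set(query_ids) | set(relevant)) - set(non_relevant)

def rank(database, query_ids):
    """Rank all images by mean Euclidean distance to the query points.

    database: dict mapping image id -> feature vector.
    """
    points = np.array([database[i] for i in sorted(query_ids)], dtype=float)

    def score(vec):
        diffs = points - np.asarray(vec, dtype=float)
        return float(np.linalg.norm(diffs, axis=1).mean())

    return sorted(database, key=lambda i: score(database[i]))

# Toy database (hypothetical data) with 2-D feature vectors.
db = {"a": [0, 0], "b": [1, 0], "c": [10, 10], "d": [0, 1]}
query = expand_query({"a"}, relevant=["b"], non_relevant=[])
ranking = rank(db, query)
```

In contrast to query point movement, which collapses the feedback into a single new query vector, the expanded query keeps every relevant example as a separate point.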

5. Conclusion

In this paper, we have presented a brief survey of the relevance feedback (RF) techniques

used in image retrieval systems, especially in CBIR systems.

CBIR has emerged as one of the most active research areas in recent years. Retrieval is

accomplished by computing the similarities of individual feature representations with fixed

weights. Although CBIR has been widely implemented and adopted around the world, its

usefulness is limited by the difficulty of representing high-level concepts with low-level

features and by the subjectivity of human perception.

Relevance feedback is an excellent technique for improving retrieval effectiveness.

The user need not decompose his/her information need into different low-level feature

representations and specify all the associated weights precisely. With RF techniques, users

can submit a coarse query at the beginning, and the system continuously and

automatically refines the query results based on increasingly accurate feedback from

the users.

As discussed above, RF techniques are employed to improve the retrieval effectiveness of

the overall system, whether they integrate semantics, low-level features, or both. MARS and

MindReader are the best-known RF systems to date. Experiments show that RF is

an excellent technique for improving the effectiveness of queries against a database. However,

some problems remain in the RF research area:

1) First, most current RF-based systems estimate the ideal query parameters on

only the low-level image features such as color, texture, and shape. These systems

work well if the feature vectors can capture the essence of the query. On the other

hand, if the user is searching for a specific object that cannot be sufficiently

represented by combinations of the available feature vectors, these RF systems will not

return many relevant results even with a large amount of user feedback [Lu 00].

Moreover, the work surveyed here does not address the time complexity of extracting

low-level features from images or that of weight adjustment.

2) Second, even systems that embed semantic information (such as a

keyword index) still have trouble dealing with weight

normalization, threshold selection, and narrowing the return-sample space.

3) Third, no current system supports object-level queries, even given many low-level

features together with keyword information. This is also a major research issue in the

content-based retrieval community.

Employing RF in image retrieval systems has proved very promising for improving the

retrieval effectiveness of the overall system. We believe that future work in this field will

contribute greatly to information retrieval research.

References

[Aksoy00] Selim Aksoy and Robert M. Haralick, “A Weighted Distance Approach to Relevance

Feedback,” Proceedings of the International Conference on Pattern Recognition (ICPR’00).

[Allan96] J. Allan. Incremental relevance feedback for information filtering. In Proc. ACM

SIGIR Conf., Zurich, Switzerland, August 1996.

[Ana98] Ana Lelescu, Ouri Wolfson and Bo Xu, “Approximate Retrieval from Multimedia

Databases Using Relevance Feedback,” Proceedings of the String Processing and Information

Retrieval Symposium & International Workshop on Groupware, 1998.

[Bach] J. R. Bach, C. Fuller et al., “The Virage image search engine: an open framework for

image management,” in Proc. SPIE Storage and Retrieval for Image and Video Databases.

[Bartolini00] I. Bartolini, P. Ciaccia and F. Waas, “Using the Wavelet Transform to Learn from

User Feedback,” In Proceedings of the 1st DELOS Workshop on "Information Seeking,

Searching and Querying in Digital Libraries" (DELOS’00 - Network of Excellence on Digital

Libraries), Zurich, Switzerland, December 2000.

[Benitez98] Ana B. Benitez, Mandis Beigi, and Shih-Fu Chang, “Using Relevance Feedback in

Content-Based Image Metasearch,” IEEE Internet Computing, Vol. 2, No. 4, July/August 1998.

[Buckley95] Buckley, C., and Salton, G. “Optimization of Relevance Feedback Weights,” in

Proc of SIGIR’95.

[Chang99] Chia-Hui Chang and Ching-Chi Hsu, “Enabling Concept-Based

Relevance Feedback for Information Retrieval on the WWW,” IEEE Transactions on Knowledge

and Data Engineering, Vol. 11, No. 4, July/August 1999.

[Chua98a] Tat-Seng Chua, Chun-Xin Chu and Mohan Kankanhalli, “Relevance Feedback

Techniques for Image Retrieval Using Multiple Attributes,” Proceedings of the IEEE

International Conference on Multimedia Computing and Systems, Volume I, 1998.

[Chua98] T.S. Chua, W.C. Low and C.X. Chu, “Relevance Feedback Techniques for Color-

based Image Retrieval,” Proceedings of the 1998 MultiMedia Modeling.

[Cox96] I.J. Cox, M.L. Miller, S.M. Omohundro & P.N. Yianilos. Pichunter: Bayesian relevance

feedback for image retrieval. Int’l Conference on Pattern Recognition, 361-369, 1996.

[Doulamis98] Anastasios D. Doulamis, Yannis S. Avrithis, Nikolaos D. Doulamis and Stefanos

D. Kollias, “Interactive Content-Based Retrieval in Video Databases Using Fuzzy Classification

and Relevance Feedback,” Proceedings of the IEEE International Conference on Multimedia

Computing and Systems, Volume II, 1998.

[Faloutsos94] C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic & W.

Equitz. Efficient and effective querying by image content. Journal of Intelligent Information

Systems, 231-262, 1994.

[Gevers98] Theo Gevers and Arnold W.M. Smeulders, “The PicToSeek WWW Image Search

System,” Proceedings of the IEEE International Conference on Multimedia Computing and

Systems, Volume I, 1998.

[Hichem99] Frigui Hichem, “Interactive Image Retrieval Using Fuzzy Sets,” IEEE Trans. on

Image Processing, Abstract received March 24, 1999.

[Hsieh98] Jun-Wei Hsieh, Cheng-Chin Chiang and Yea-Shuan Huang, “Using Relevance

Feedback to Learn Visual Concepts from Image Instances,” Proceedings of the 10th

International Conference on Image Analysis and Processing, 1998.

[Hsu95] W. Hsu, T.S. Chua & H.K. Pung. Integrated color-spatial approach to content-based

image retrieval. ACM Multimedia ’95, 305-313, 1995.

[Ide71] Ide, E., “New Experiments in Relevance Feedback,” Gerard Salton, Editor, The SMART

Retrieval System, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1971.

[Ishikawa98] Y. Ishikawa, R. Subramanya, and C. Faloutsos, “Mindreader: Query databases

through multiple examples,” in Proc. of the 24th VLDB Conference, New York, 1998.

[Junichi00] Tatemura, Junichi, “Graphical relevance feedback: Visual exploration in the

document space,” IEEE SYMP VISUAL LANG PROC, IEEE, LOS ALAMITOS, CA, (USA), pp.

39-46, 2000

[Lee98] Lee, C., Ma, W. Y., and Zhang, H. J. “Information Embedding Based on User’s

relevance Feedback for Image Retrieval,” Technical Report HP Labs, 1998.

[Lewis95] DD Lewis. Active by accident: Relevance feedback in information retrieval. In AAAI

Fall Symposium on Active Learning, 1995.

[Lipson97] P. Lipson, E. Grimson, and P. Sinha, “Context and configuration based Scene

Classification,” Proceedings of IEEE Int. Conf. on Computer Vision and Pattern Recognition,

1997.

[Low98] W.C. Low & T.S. Chua. Color-based relevance feedback for image retrieval. In

International Workshop on MM DBMS, Dayton, USA, 116-123. IEEE Computer Society, 1998.

[Lu 00] Y. Lu, C.-H. Hu, X.-Q. Zhu, H.-J. Zhang and Q. Yang, “A Unified Framework for

Semantics and Feature Based Relevance Feedback in Image Retrieval Systems,” ACM

Multimedia, 2000.

[Lundquist97] Lundquist, C., D. Grossman, and O. Frieder, “Improving Relevance Feedback in

the Vector-Space Model,” Proceedings of the Sixth ACM International Conference on

Information and Knowledge Management, 1997.

[Lundquist99] Lundquist, Carol; Frieder, Ophir; Holmes, David O; Grossman, David “Parallel

relational database management system approach to relevance feedback in information

retrieval,” Journal of the American Society for Information Science [J. Am. Soc. Inf. Sci.], vol.

50, no. 5, pp. 413-426, 1999

[MacArthur00] S. MacArthur, C. Brodley and C. Shyu, “Relevance Feedback Decision Trees in

Content-Based Image Retrieval,” Proceedings of the IEEE Workshop on Content-based Access

of Image and Video Libraries (CBAIVL’00).

[Maron98] O. Maron, “Learning from Ambiguity,” Doctoral Thesis, Dept. of Electrical

Engineering and Computer Science, M.I.T., June 1998.

[Meilhac98] Christophe Meilhac and Chahab Nastar, “Relevance Feedback and Category Search

in Image Databases,” Proceedings of the IEEE International Conference on Multimedia

Computing and Systems, Volume I, 1998.

[Minka96] T.P. Minka & R.W. Picard. Interactive learning using a society of models. IEEE

Computer Society Conference on Computer Vision and Pattern Recognition, 447-452, 1996.

[Müller00] Henning Müller, Wolfgang Müller, Stéphane Marchand-Maillet and Thierry Pun,

“Strategies for Positive and Negative Relevance Feedback in Image Retrieval,” Proceedings of

the International Conference on Pattern Recognition (ICPR’00).

[Ortega97] M. Ortega, Y. Rui, K. Chakrabarti, S. Mehrotra & T.S Huang. Supporting similarity

queries in MARS. ACM Multimedia ’97, 403-413, 1997.

[Paek 99] S. Paek, C.L. Sable, V. Hatzivassiloglou, A. Jaimes, B.H. Schiffman, S. F. Chang

and K.R. McKeown, “Integration of Visual and Text-Based Approaches for the Content Labeling

and Classification of Photographs,” SIGIR’99.

[Pass96] G. Pass, R. Zabih & J. Miller. Comparing images using color coherence vectors. ACM

Multimedia ’96, 65-73, 1996.

[Patrice00] Blancho Patrice and Hubert Konik, “Texture Similarity Queries and Relevance

Feedback for Image Retrieval,” Proceedings of the International Conference on Pattern

Recognition (ICPR’00).

[Porkaew 99a] K. Porkaew, K. Chakrabarti & S. Mehrotra, “Query Refinement for Multimedia

Similarity Retrieval in MARS,” Proceedings of the ACM International Multimedia Conference,

Orlando, Florida, pp 235-238, 1999.

[Porkaew 99b] K. Porkaew, S. Mehrotra, M. Ortega and K. Chakrabarti, “Similarity Search

Using Multiple Examples in MARS,” in Intl Conference on Visual Information Retrieval, 1999.

[Rocchio66] Rocchio, Jr., J. J., “Document Retrieval Systems - Optimization and Evaluation,” Ph.D.

Thesis, Harvard University, March 1966.

[Rocchio71] Rocchio, Jr., J. J., “Relevance Feedback in Information Retrieval,” Gerard Salton,

Editor, The SMART Retrieval System, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1971.

[Rui97] Rui, Y.; Huang, T.S.; Mehrotra, S., “Content-based image retrieval with relevance

feedback in MARS,” Proceedings of the 1997 International Conference on Image Processing

(ICIP ’97) (3-Volume Set).

[Rui97a] Yong Rui, Thomas S. Huang, Sharad Mehrotra, and Michael Ortega, “A Relevance

Feedback Architecture for Content-based Multimedia Information Retrieval Systems,”

Proceedings of the 1997 Workshop on Content-Based Access of Image and Video Libraries

(CBAIVL ’97).

[Rui98] Yong Rui, Thomas S. Huang, Michael Ortega, and Sharad Mehrotra, “Relevance

Feedback: A Power Tool in Interactive Content-Based Image Retrieval,” IEEE Trans. on Circuits

and Systems for Video Technology, Special Issue on Segmentation, Description, and Retrieval of

Video Content, pp. 644-655, Vol. 8, No. 5, Sept. 1998.

[Rui98a] Yong Rui, Thomas S. Huang, and Sharad Mehrotra. Relevance feedback techniques in

interactive content-based image retrieval. In Proc. of IS&T SPIE Storage and Retrieval of

Images/Video Databases VI, EI'98.

[Rui99] Rui, Y., Huang, T. S. “A Novel Relevance Feedback Technique in Image Retrieval,”

ACM Multimedia, 1999.

[Salton83] Salton, G., and McGill, M. J. “Introduction to Modern Information Retrieval,”

McGraw-Hill Book Company, 1983.

[Salton89] G. Salton. Automatic text processing. Addison-Wesley Publishing Company, 1989.

[Seidl 97] T. Seidl and H. P. Kriegel, “Efficient User-adaptable Similarity Search in Large

Multimedia Databases,” in Proceedings of VLDB, pp 506-515, Athens, Greece, August 1997.

[Shaw95] Shaw, W. M. “Term-Relevance Computation and Perfect Retrieval Performance” IPM

31, 1995, 491-498

[Sltoton90] G. Salton and C. Buckley. Improving retrieval performance by relevance feedback.

Journal of the American Society for Information Science, 41(4):228-287,1990.

[Squire99] D. M. Squire, W. Müller, H. Müller, and J. Raki. “Content-based query of image

databases, inspirations from text retrieval: inverted files, frequency-based weights and relevance

feedback.” 11th Scandinavian Conference on Image Analysis (SCIA’99), pages 143–149,

Kangerlussuaq, Greenland, June 7–11 1999.

[Vasconcelos00] Nuno Vasconcelos and Andrew Lippman, “Bayesian Relevance Feedback for

Content-Based Image Retrieval,” Proceedings of the IEEE Workshop on Content-based Access

of Image and Video Libraries (CBAIVL’00).

[Wood98] M. E. Wood, N. W. Campbell, and B. T. Thomas. Iterative refinement by relevance

feedback in content-based digital image retrieval. In Proceedings of The Fifth ACM

International Multimedia Conference (ACM Multimedia 98), pages 13--20, Bristol, UK,

September 1998.

[Wu00] Ying Wu, Qi Tian and Thomas S. Huang, “Integrating Unlabeled Images for Image

Retrieval Based on Relevance Feedback,” Proceedings of the International Conference on

Pattern Recognition (ICPR’00).

[Zhou00] Xiang Sean Zhou and Thomas S. Huang, “Image Retrieval: Feature Primitives, Feature

Representation, and Relevance Feedback,” Proceedings of the IEEE Workshop on Content-based

Access of Image and Video Libraries (CBAIVL’00).