Upload: ashlee-long

Post on 28-Dec-2015

Multimedia Retrieval Architecture

Electrical Communication Engineering, Indian Institute of Science, Bangalore – 560012, India

Query processing

Query processing requirements of a multimedia presentation unit

Heterogeneous presentation units (image, video, audio, etc.) may be combined to form a single presentation unit.

a. These units may have different attributes and access methods.

b. There are temporal relationships between presentation components.

To support the retrieval and reuse of presentations and presentation components, a multimedia retrieval system must support the following types of queries.

Query Processing: Types of queries

Attribute-based queries

Queries over associated attributes, including text and numerical attributes, which may represent features extracted from the multimedia units.

– Retrieval by identifier (index)

– Retrieval by conditional statements

Content-based queries

Queries over color composition and other image or media characteristics.

e.g., a query can select all images that contain round shapes, using pattern recognition operations.

Temporal queries

Queries over the temporal relations among the media units within a presentation.

Retrieve pictures on the basis of the occurrence or non-occurrence of a certain entity.

e.g., to find the president shaking hands with the PM, the query selects the matching video clips from those stored.
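The two attribute-based retrieval modes can be sketched in a few lines of Python. This is only an illustration: the `MediaUnit` record, its fields and the sample data are assumptions made for the example, not part of the architecture described in the slides.

```python
from dataclasses import dataclass

@dataclass
class MediaUnit:
    uid: int           # identifier used as the index key
    kind: str          # e.g. "image", "video", "audio"
    duration_s: float  # a numerical attribute (0 for still images)

# Hypothetical collection of media units.
units = [
    MediaUnit(1, "image", 0.0),
    MediaUnit(2, "video", 42.5),
    MediaUnit(3, "audio", 180.0),
    MediaUnit(4, "video", 300.0),
]

# Retrieval by identifier: a dictionary plays the role of the index.
by_id = {u.uid: u for u in units}

def get_by_id(uid):
    return by_id.get(uid)

# Retrieval by conditional statement: keep the units that satisfy a predicate.
def select(predicate):
    return [u for u in units if predicate(u)]

long_videos = select(lambda u: u.kind == "video" and u.duration_s > 60)
```

Content-based and temporal queries need feature extraction and timing metadata, so they do not reduce to a simple predicate over stored attributes in the same way.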

What is XQuery?

XQuery is the language for querying XML data.

XQuery is to XML what SQL is to relational databases.

XQuery is built on XPath expressions.

XQuery is supported by all the major database engines (IBM, Oracle, Microsoft, etc.).

XML document

<?xml version="1.0" encoding="ISO-8859-1"?>

<bookstore>

<book category="COOKING">

<title lang="en">Everyday Italian</title>

<author>Giada De Laurentiis</author>

<year>2005</year>

<price>30.00</price>

</book>

<book category="CHILDREN">

<title lang="en">Harry Potter</title>

<author>J K. Rowling</author>

<year>2005</year>

<price>29.99</price>

</book>

</bookstore>

How to Select Nodes From "books.xml"?

Functions

XQuery uses functions to extract data from XML documents.

The doc() function is used to open the "books.xml" file:

doc("books.xml")

XQuery uses path expressions to navigate through elements in an XML document.

The following path expression is used to select all the title elements in the "books.xml" file:

doc("books.xml")/bookstore/book/title

(/bookstore selects the bookstore element, /book selects all the book elements under the bookstore element, and /title selects all the title elements under each book element)

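Run against the complete "books.xml", which contains two more books than the abridged listing shown earlier, the XQuery above will extract the following: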

<title lang="en">Everyday Italian</title>

<title lang="en">Harry Potter</title>

<title lang="en">XQuery Kick Start</title>

<title lang="en">Learning XML</title>

XQuery uses predicates to limit the extracted data from XML documents.

The following predicate is used to select all the book elements under the bookstore element that have a price element with a value that is less than 30:

doc("books.xml")/bookstore/book[price<30]

The XQuery above will extract the following:

<book category="CHILDREN">

<title lang="en">Harry Potter</title>

<author>J K. Rowling</author>

<year>2005</year>

<price>29.99</price>

</book>
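For readers without an XQuery processor, the same two selections can be approximated with Python's standard `xml.etree.ElementTree` module. This is a sketch rather than an XQuery implementation: ElementTree supports only a subset of XPath with no numeric comparisons, so the `price<30` predicate is applied in Python. The XML is the two-book listing from this section, embedded as a string.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKS_XML)  # root is the <bookstore> element

# Counterpart of doc("books.xml")/bookstore/book/title
titles = [t.text for t in root.findall("book/title")]

# Counterpart of doc("books.xml")/bookstore/book[price<30];
# the numeric predicate is evaluated in Python, not in the path expression.
cheap = [b for b in root.findall("book") if float(b.findtext("price")) < 30]
cheap_titles = [b.findtext("title") for b in cheap]
```

With the two-book listing, `titles` holds both titles and `cheap_titles` holds only "Harry Potter", since a price of 30.00 is not less than 30.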

XML based Multimedia Retrieval

1. Recognition of characters to locate names.

2. Identification of the type of line that represents the state boundaries and of the symbol that represents cities.

3. Definition of a region in which the search for the state boundary should be performed; this requires knowledge of at least one point inside the state, which is obtained by locating the state name.

4. Search for a closed contour formed by the type of line that represents the state boundaries; if no closed contour is found, the system should define a larger search area, which is done by returning to step 3.

5. Search for the symbols that represent cities, near the city name and within the closed contour.

6. Search for symbols representing objects (such as buildings, parks, lakes, etc.) and identify the objects based on cont

The input image and the retrieved image from the database

XML Query Format

XML Based Retrieval

It then translates the innermost query into the following operations:

Call Window: This routine defines the spatial working area where the search for the tower symbol is performed. The nearest symbol to the transformed coordinates is identified as the symbol of IISc (tower in this case).

Call Connected Component: This routine identifies potential symbols within the windowed area of the map.

Call Symbol: This routine examines the potential symbols and recognizes all the symbols contained within the window.

Call Short/Far: This routine identifies the city symbol nearest to the approximate coordinates as the symbol of the IISc tower.

Image Queries

Images are required for:

- illustration of text articles, conveying information or emotions difficult to describe in words,

- display of detailed data (such as radiology images) for analysis,

- formal recording of design data (such as architectural plans) for later use, and so on.

Image query

Image Queries

Types of attributes:

the presence of a particular combination of color, texture or shape features (e.g., green stars);

the presence or arrangement of specific types of object (e.g., chairs around a table);

the depiction of a particular type of event (e.g., a football match);

the presence of named individuals, locations, or events (e.g., the PM greeting a crowd);

subjective emotions one might associate with the image (e.g., happiness).

Image query processing

Video query

Region query

Video Queries

Even the shortest videos are made up of a number of distinct scenes, each of which can be further broken down into individual shots depicting a single view, conversation or action.

Prepare a storyboard of annotated still images (often known as key frames) representing each scene.

Prepare a series of short video clips, each capturing the essential details of a single sequence – video skimming.

1-dimensional objects:

Text and speech objects.

Reason: text and audio are accessed in a contiguous manner.

2-dimensional objects:

E.g., image objects. Access to image data can be done with reference to the spatial locations of objects.

E.g., a query can search for an object that is to the right of or below a specified object.

3-dimensional objects:

E.g., video objects, which have both spatial and temporal characteristics.

Access to video can be done by describing the temporal as well as the spatial content.

E.g., a query can ask for a movie to be shown from 10 minutes after its start.

4-dimensional objects:

3-D plus the time dimension.

E.g., 3D heart-beat visualization: a 3D heart image expanding and contracting over time.

Spatial Models

Represent the way media objects are presented, by specifying the layout of windows on a monitor.

Temporal Models

Describe the time and duration of presentation of each media object as well as their temporal relationships to other media objects. Temporal requirements of objects need to be specified and stored along with the database.

Queries based on three levels of increasing complexity

Level 1 comprises retrieval by primitive features such as color, texture, shape or the spatial location of image elements.

e.g., find images containing yellow stars. Retrieval at this level uses features that are both objective and directly derivable from the image itself.

Level 2 comprises retrieval by derived features, involving some degree of logical inference about the identity of the objects in the image.

It covers retrieval of objects of a given type and retrieval of individual objects or persons, e.g., find pictures of a double-decker bus. Queries at this level require reference to some outside store of knowledge.

Level 3 comprises retrieval by abstract attributes, involving a significant amount of high-level reasoning about the meaning and purpose of the objects or scenes depicted.

It covers retrieval of named events or types of activity and retrieval of pictures with emotional or religious significance (e.g., pictures of Rajasthani folk, or pictures depicting suffering). It requires complex reasoning and subjective judgement.

Queries for Video and Image Retrieval

Subimage Query:

Given a query image, find the parent image in the database with which it matches, either as a whole or as a part.

Give location information showing where in the parent the query is positioned.

The images may be of very high resolution, and the query and target may be at different resolutions.

(k, u, t) query: given a query image that contains k labeled objects and u unlabeled objects, and a tolerance t, retrieve all images that contain a (k, u, t) subimage, i.e., a subimage that matches the query within tolerance t.

Figure: query and result. The best matching image is shown with the sub-image identified; the query and target images also differ in resolution.

Generic search algorithm: a generic algorithm to search for a solution path in a graph.

R-tree search: R-trees are tree data structures used as spatial access methods for indexing multidimensional information such as geographical coordinates, rectangles and polygons. Issue (one or more) range queries on the (k, 1) R-tree to obtain a list of promising images (image identifiers).

Clean-up:

For each of the images obtained above, retrieve its corresponding attributed relational graph (ARG) from the graph file, and compute the actual distance between this ARG and the ARG of the original (k, u, t) query. If the distance is less than the threshold t, the image is included in the response set.

Attributed relational graphs

• Image content is represented by an ARG holding features of objects and relationships between objects.

• e.g., "find all X-rays that are similar to Smith's X-ray"

In an ARG, objects are represented by graph nodes and relationships between objects are represented by arcs between such nodes.
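A minimal Python sketch of an ARG and of the clean-up step that filters candidate images by ARG distance is given here. The slides do not define the distance function, so `arg_distance` is a toy stand-in (a sum of absolute attribute differences), and the object labels and feature values are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class ARG:
    # node label -> feature vector of that object (e.g. area, perimeter)
    nodes: dict = field(default_factory=dict)
    # (label_a, label_b) -> relationship attribute (e.g. distance between them)
    arcs: dict = field(default_factory=dict)

def arg_distance(q, g):
    """Toy distance (an assumption): sum of absolute attribute differences
    over the nodes and arcs of q; infinite if g lacks any of them."""
    total = 0.0
    for label, feats in q.nodes.items():
        if label not in g.nodes:
            return float("inf")
        total += sum(abs(a - b) for a, b in zip(feats, g.nodes[label]))
    for pair, rel in q.arcs.items():
        if pair not in g.arcs:
            return float("inf")
        total += abs(rel - g.arcs[pair])
    return total

def clean_up(query, candidates, t):
    """Keep the candidate image ids whose ARG is within distance t of the query."""
    return [img for img, g in candidates.items() if arg_distance(query, g) < t]

query = ARG({"heart": (10.0, 4.0), "lung": (20.0, 6.0)}, {("heart", "lung"): 5.0})
candidates = {
    "img1": ARG({"heart": (10.5, 4.0), "lung": (19.0, 6.0)}, {("heart", "lung"): 5.5}),
    "img2": ARG({"heart": (30.0, 9.0), "lung": (20.0, 6.0)}, {("heart", "lung"): 5.0}),
    "img3": ARG({"heart": (10.0, 4.0)}, {}),
}
result = clean_up(query, candidates, t=3.0)
```

Here only `img1` survives: `img2` differs too much in the heart features and `img3` has no lung node at all.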

Single Region Based Image Query

A single-region image query consists of computing individual queries on the color set and on region location. For region-location queries, the spatial properties of individual regions are used, through indexing of region centroids or minimum bounding rectangles.

The spatial distance between regions is given by the Euclidean distance

$d_{q,t} = \sqrt{(x_q - x_t)^2 + (y_q - y_t)^2}$

where $(x_q, y_q)$ and $(x_t, y_t)$ are the coordinates of the two points.

Single Region Based Image Query

Bounded query location

The user has flexibility in designating the spatial bounds for each region in the query. A target region that falls within these bounds is assigned a spatial distance of zero; one that falls outside them is assigned the Euclidean distance of the centroids.

Region absolute location

Fixed query location: the spatial distance is the Euclidean distance of the centroids.

Bounded query location: if the target region falls within the designated area, $d_{q,t} = 0$; otherwise, it is the Euclidean distance of the centroids.
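The fixed and bounded query location rules can be written as one small Python function. The function and parameter names are assumptions for illustration; the bounds are taken to be an axis-aligned rectangle, which the slides do not state explicitly.

```python
import math

def centroid_distance(q, t):
    """Euclidean distance between two centroids given as (x, y) tuples."""
    return math.hypot(q[0] - t[0], q[1] - t[1])

def spatial_distance(q, t, bounds=None):
    """Fixed query location when bounds is None; otherwise bounded query
    location, with bounds = (x_min, y_min, x_max, y_max)."""
    if bounds is not None:
        x_min, y_min, x_max, y_max = bounds
        # A target centroid inside the designated area costs nothing.
        if x_min <= t[0] <= x_max and y_min <= t[1] <= y_max:
            return 0.0
    return centroid_distance(q, t)
```

For example, `spatial_distance((0, 0), (3, 4))` uses the fixed rule, while passing `bounds=(0, 0, 5, 5)` makes the same target free because it lies inside the designated area.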

Index Structure

Centroid location spatial access: spatial quad-tree

Centroids of image regions are indexed using a spatial quad-tree on their x and y values. A quad-tree provides quick access to 2D data points.

Rectangle (MBR, minimum bounding rectangle) location spatial access: R-trees
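To make the centroid index concrete, here is a compact point quad-tree in Python with a rectangular range query. It is a sketch under assumptions: the leaf capacity, the splitting rule and the class layout are choices made for this example and are not taken from the slides. An R-tree for MBRs is not shown.

```python
class QuadTree:
    """Point quad-tree over the square [x0, x0+size) x [y0, y0+size)."""

    def __init__(self, x0, y0, size, capacity=4):
        self.x0, self.y0, self.size, self.capacity = x0, y0, size, capacity
        self.points = []      # (x, y, region_id) stored at a leaf
        self.children = None  # four sub-quadrants once the leaf is split

    def insert(self, x, y, region_id):
        if self.children is None:
            self.points.append((x, y, region_id))
            # Split an overfull leaf; the size check stops endless splitting
            # when many centroids coincide.
            if len(self.points) > self.capacity and self.size > 1:
                self._split()
        else:
            self._child_for(x, y).insert(x, y, region_id)

    def _split(self):
        h = self.size / 2
        self.children = [QuadTree(self.x0 + dx * h, self.y0 + dy * h, h, self.capacity)
                         for dy in (0, 1) for dx in (0, 1)]
        points, self.points = self.points, []
        for x, y, rid in points:
            self._child_for(x, y).insert(x, y, rid)

    def _child_for(self, x, y):
        h = self.size / 2
        col = 1 if x >= self.x0 + h else 0
        row = 1 if y >= self.y0 + h else 0
        return self.children[row * 2 + col]

    def range_query(self, x_min, y_min, x_max, y_max):
        """Return ids of centroids inside the closed query rectangle."""
        # Skip this quadrant if it cannot overlap the query rectangle.
        if (x_max < self.x0 or x_min >= self.x0 + self.size or
                y_max < self.y0 or y_min >= self.y0 + self.size):
            return []
        if self.children is None:
            return [rid for x, y, rid in self.points
                    if x_min <= x <= x_max and y_min <= y <= y_max]
        found = []
        for child in self.children:
            found.extend(child.range_query(x_min, y_min, x_max, y_max))
        return found

tree = QuadTree(0, 0, 256, capacity=2)
for rid, (x, y) in enumerate([(10, 10), (200, 30), (120, 130), (60, 220), (130, 140)]):
    tree.insert(x, y, rid)
hits = sorted(tree.range_query(100, 100, 150, 150))
```

The range query only descends into quadrants that overlap the query rectangle, which is what gives the quick access to 2D points mentioned in the slide.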

Single Region Based Image Query

Area

- The absolute distance between the areas of two regions

Spatial Extent

- Measures the distance between the widths and the heights of the MBRs

- Is much simpler than shape information

Single Region Based Image Query

Rectangle location spatial access: R-trees

The MBR (minimum bounding rectangle) is the smallest vertically aligned rectangle that completely encloses the region.

Size

Another important perceptual dimension of the regions is their size in terms of area and spatial extent.

Area

The distance in area between two regions is given by the absolute distance.

Spatial Extent

The distance in spatial extent between two regions is computed from the differences in their MBR widths (w) and heights (h).

Single Region Query Strategy

Integrating these approaches, the query strategy consists of a weighted sum of the color set, location, area and spatial extent distances. This weighted sum is the single region query distance.
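The formulas for these distances did not survive in the transcript, so the following Python sketch fills them in under stated assumptions: the area distance is the absolute difference the slides describe, the spatial extent distance is assumed to be the Euclidean distance between (width, height) pairs, and the weights are arbitrary example values. The color-set distance is taken as a given number.

```python
import math

def area_distance(area_q, area_t):
    """Absolute difference between the two region areas."""
    return abs(area_q - area_t)

def extent_distance(wh_q, wh_t):
    """Assumed form: Euclidean distance between the (width, height) pairs."""
    return math.hypot(wh_q[0] - wh_t[0], wh_q[1] - wh_t[1])

def region_distance(d_color, d_location, d_area, d_extent,
                    weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four per-region distances."""
    a_c, a_l, a_a, a_e = weights
    return a_c * d_color + a_l * d_location + a_a * d_area + a_e * d_extent

d = region_distance(
    d_color=0.2,       # assumed to come from a separate color-set comparison
    d_location=5.0,    # e.g. Euclidean distance of the centroids
    d_area=area_distance(400, 380),
    d_extent=extent_distance((20, 20), (17, 24)),
    weights=(10.0, 1.0, 0.1, 0.5),
)
```

The weights let a query emphasize, say, color over size; how they should be chosen is not specified in the slides.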

Multiple Regions Query

• The overall image query strategy consists of joining the queries on the individual regions in the query image.

Multiple Regions Query Strategy: Absolute Locations

For each region in the query positioned by absolute location, the query strategy outlined for the single region query is carried out, without computing the final minimization.

The resulting lists are intersected, and the best image match minimizes the combined region query distance.

Example: find the image having three regions that best matches the query.
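The intersect-then-minimize strategy can be sketched as follows in Python. The per-region result lists are hypothetical, and combining the region distances by a plain sum is an assumption, since the slides do not give the combination rule.

```python
def best_match(per_region_results):
    """per_region_results: one dict per query region, mapping
    image id -> distance of that image's best region to the query region."""
    # Intersect the candidate lists: an image must match every query region.
    common = set(per_region_results[0])
    for results in per_region_results[1:]:
        common &= set(results)
    if not common:
        return None
    # Combined distance (assumed here to be a plain sum over regions).
    combined = {img: sum(r[img] for r in per_region_results) for img in common}
    best = min(combined, key=combined.get)
    return best, combined[best]

results = [
    {"a": 1.0, "b": 0.5, "c": 2.0},   # candidates for query region 1
    {"a": 0.5, "b": 3.0},             # candidates for query region 2
    {"a": 1.5, "b": 0.5, "d": 0.1},   # candidates for query region 3
]
match = best_match(results)
```

Images "c" and "d" drop out at the intersection because they do not match all three regions, and "a" wins over "b" on combined distance.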

Query Examples

Shape based Query Processing

Shape Index

For each color region the shape index may be computed as follows:

Compute the major and minor axes of each color region.

Rotate the shape region to align the major axis to X-axis to achieve rotation normalization and scale it such that major axis is of standard fixed length (say 96 pixels).

Place the grid of fixed size (96x96 pixels) over the normalized color region and obtain the binary sequence by assigning 1's and 0's accordingly.

Using the binary sequence, compute the row and column total vectors. These along with the eccentricity form the shape index for the region.
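The last step of this procedure, turning the binary grid into a shape index, might look like the Python sketch below. It assumes the region has already been rotated and scaled, takes the axis lengths as inputs, and treats eccentricity as the ratio of major to minor axis, which is one common definition and not necessarily the one intended in the slides.

```python
def shape_index(grid, major_axis, minor_axis):
    """grid: binary occupancy grid (list of rows of 0/1) of the normalized region.
    Returns (row totals, column totals, eccentricity)."""
    row_totals = [sum(row) for row in grid]
    col_totals = [sum(col) for col in zip(*grid)]
    # Eccentricity taken here as the major/minor axis ratio (an assumption).
    eccentricity = major_axis / minor_axis
    return row_totals, col_totals, eccentricity

# A 4x4 grid stands in for the 96x96 grid to keep the example readable.
grid = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
rows, cols, ecc = shape_index(grid, major_axis=4.0, minor_axis=4.0)
```

Row and column totals compress the grid from n x n bits to 2n integers, which keeps the index small at the cost of some shape detail.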

Shape based Query Processing

Query Process

The query image is processed to obtain a list of matching images based only on color features.

For each color region in the query image, the shape representation of the region is evaluated.

The shape index of each region in the query image is compared to those of the regions in the list of images retrieved on color.

Only regions whose eccentricity matches within a threshold (t) are compared for shape similarity.

The matching images are ordered by the sum of the differences in the row and column vectors between the query and the matching image.
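A possible Python rendering of the eccentricity filter and the ranking step is shown here. Each shape index is represented as a (row totals, column totals, eccentricity) triple; that layout, the use of absolute differences and the sample values are assumptions for the example, and each image is reduced to a single region for brevity.

```python
def shape_difference(a, b):
    """Sum of absolute differences of the row and column total vectors."""
    rows_a, cols_a, _ = a
    rows_b, cols_b, _ = b
    return (sum(abs(x - y) for x, y in zip(rows_a, rows_b)) +
            sum(abs(x - y) for x, y in zip(cols_a, cols_b)))

def rank_by_shape(query, candidates, t):
    """query: (rows, cols, eccentricity); candidates: image id -> same triple.
    Only candidates whose eccentricity is within t of the query's are ranked."""
    kept = {img: idx for img, idx in candidates.items()
            if abs(idx[2] - query[2]) <= t}
    return sorted(kept, key=lambda img: shape_difference(query, kept[img]))

query = ([2, 4, 4, 2], [2, 4, 4, 2], 1.0)
candidates = {  # images already retrieved on color
    "img1": ([2, 4, 4, 1], [2, 4, 3, 2], 1.1),
    "img2": ([2, 4, 4, 2], [2, 4, 4, 2], 1.05),
    "img3": ([2, 4, 4, 2], [2, 4, 4, 2], 2.5),  # rejected on eccentricity
}
ranking = rank_by_shape(query, candidates, t=0.2)
```

Filtering on eccentricity first is cheap and avoids comparing full row and column vectors for regions that are clearly the wrong shape.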

Queries for multimedia objects

Query Model

A query model for searching multimedia objects in a database or a file needs to satisfy the following requirements:

Recognize that a match between the value of an attribute of a multimedia object and a given constant is generally not exact, i.e., the model must account for the grade of match.

Allow users to specify thresholds on the grade of match of the acceptable objects.

Enable users to ask for only a few top-matching objects.
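The three requirements map naturally onto a query function with a grading function, a threshold and a top-k limit. The Python sketch below is illustrative only; the function name, the [0, 1] grade scale and the sample grades are assumptions.

```python
import heapq

def graded_query(objects, grade, threshold=0.0, top_k=None):
    """objects: iterable of candidates; grade: function returning a grade of
    match in [0, 1]. Returns (object, grade) pairs, best match first."""
    scored = [(obj, grade(obj)) for obj in objects]
    # Requirement 2: drop objects below the user's threshold.
    accepted = [(obj, g) for obj, g in scored if g >= threshold]
    # Requirement 3: optionally return only the few top-matching objects.
    if top_k is not None:
        return heapq.nlargest(top_k, accepted, key=lambda pair: pair[1])
    return sorted(accepted, key=lambda pair: pair[1], reverse=True)

# Hypothetical grades of match for the condition "color = red".
redness = {"sunset.jpg": 0.9, "rose.jpg": 0.8, "sky.jpg": 0.1, "brick.jpg": 0.6}
top2 = graded_query(redness, redness.get, threshold=0.5, top_k=2)
```

Unlike a Boolean selection, every accepted object carries its grade, so results can be ordered and cut off at either a grade or a count.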

Queries for multimedia documents

Four main phases of query processing:

During the preprocessing phase, parsing and catalog access are performed, and the query is modified in light of the type hierarchy.

The multicluster query resolution phase determines the set of document clusters that must be accessed. Because document distribution over the various clusters is transparent to applications, evaluating a query requires determining which clusters contain documents that can potentially satisfy it.

Once the set of clusters involved in the query is determined, the single-cluster query optimization phase is performed and a query processing strategy is defined for each cluster.

The query execution phase applies the strategies defined in the previous phase.

Queries for multimedia documents

Predicates in a query are divided into four classes:

Structure predicates. These predicates are evaluated by accessing the system catalogs.

Index predicates. These predicates are evaluated by using the indexes.

Text predicates. These predicates are evaluated by means of signature scanning.

Residual predicates. These are predicates on components for which there are no access structures and so can only be evaluated by accessing the documents. This is the case for data attributes with no indexes. In addition, predicates defined on spring nodes belong to this class.

Queries for multimedia documents

Index query. A query issued against the index segments by using the access paths provided by the index handler.

Text query. A query issued against the signature segments by using the access paths provided by the signature handler.

Document query. A query issued against the bulk storage segments by using the access paths provided by the bulk storage handler.

Query Preprocessing Phase

Parsing. The query is parsed by a conventional parser.

Catalog Access. After parsing of the query, the definitions of the conceptual types appearing in the query are retrieved from the system catalogs.

Component Checking. If the query contains a type-clause, then the conceptual components present in the query are verified as belonging to the specified types.

Shape based multimedia retrieval

Registration. Given two 3D models, align them optimally and compute the geometric similarity between them;

Retrieval. Given a database of 3D models and a geometric query, find the models that best match the query;

Recognition. Given a database of 3D models and a query model, either find the query model in the database or determine it is not there;

Verification. Given a 3D model and a specification, determine whether they match to within some tolerance;

Clustering. Given a database of 3D models, automatically partition them into a set of classes;

Shape based multimedia retrieval

Feature detection. Given a 3D model, find geometric features of interest on its surface;

Classification. Given a set of model class specifications and a query model, determine the class to which the query model belongs;

Segmentation. Partition a given 3D model into its salient parts;

Semantic labeling. Infer semantic meaning regarding the purpose and function of a given 3D model;

Synthesis. Automatically synthesize new examples typical of a given model class specification;

Indexing and retrieval

Used for pdf files

Indexing

Each video sample is processed by the text recognition software. For each frame, the recognized characters are stored after deletion of all text lines with fewer than 3 characters.

Retrieval

Video sequences are retrieved by specifying a search string. Two search modes are supported: exact substring matching and approximate substring matching.
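The two search modes can be illustrated in Python. Exact matching is a plain substring test; for approximate matching the sketch uses a standard dynamic-programming formulation (edit distance in which a match may start anywhere in the text). The slides do not say which approximate matching algorithm the system uses, so this choice and the sample frame text are assumptions.

```python
def exact_match(text, pattern):
    return pattern in text

def approx_match(text, pattern, max_errors):
    """True if some substring of text is within max_errors edits
    (insertions, deletions, substitutions) of pattern."""
    m = len(pattern)
    # col[i] = fewest edits turning pattern[:i] into a substring of text
    # that ends at the current text position.
    col = list(range(m + 1))
    if col[m] <= max_errors:
        return True
    for ch in text:
        new = [0]  # an occurrence may start at any position of text
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new.append(min(col[i - 1] + cost,  # match or substitution
                           col[i] + 1,         # extra character in text
                           new[i - 1] + 1))    # pattern character missing
        col = new
        if col[m] <= max_errors:
            return True
    return False

# Recognized text of a frame, with a typical recognition error (0 for O).
frame_text = "BREAKING NEWS: ELECTI0N RESULTS"
```

Approximate matching matters here because text recognition on video frames is error-prone: `exact_match(frame_text, "ELECTION")` fails on the misread character, while `approx_match(frame_text, "ELECTION", 1)` still finds the frame.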

Shape based multimedia retrieval

FIBSSR (Feature Index-based Similar Shape Retrieval)

A general and flexible shape similarity-based approach that enables retrieval of both rigid and articulated shapes.

Spatial Access based Retrieval Methods

Space-filling curves: assume a finite precision in the representation of each coordinate, say K bits. The address space is then a square (the image), represented as a $2^K \times 2^K$ array of 1 x 1 squares (pixels).

R-Trees: Z-ordering, and R-trees and their variants
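Z-ordering is the simplest space-filling curve to compute: the bits of the two K-bit coordinates are interleaved into a single 2K-bit value. A short Python sketch follows; which coordinate takes the higher bit of each pair is a convention chosen for this example.

```python
def z_value(x, y, k):
    """Interleave the k-bit coordinates x and y into one 2k-bit Z-value."""
    z = 0
    for bit in range(k):
        z |= ((x >> bit) & 1) << (2 * bit + 1)  # x bits go to odd positions
        z |= ((y >> bit) & 1) << (2 * bit)      # y bits go to even positions
    return z

# Visit order of the pixels of a 2^2 x 2^2 image along the Z-curve.
k = 2
order = sorted(((x, y) for x in range(2 ** k) for y in range(2 ** k)),
               key=lambda p: z_value(p[0], p[1], k))
```

Because pixels that are close in the image tend to get close Z-values, the one-dimensional Z-value can be stored in an ordinary ordered index such as a B-tree.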

Content based retrieval methods

Retrieving stored images from a collection by comparing features automatically extracted from the images themselves, such as measures of color, texture or shape.

Color retrieval

Each image added to the collection is analyzed to compute a color histogram which shows the proportion of pixels of each color within the image.

Texture retrieval

Similarity is judged by comparing values of what are known as second-order statistics, calculated from the query and stored images.

Shape retrieval

A number of features characteristic of object shape are computed for every object identified within each stored image.
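Color retrieval is the easiest of the three to sketch. The Python example below builds a quantized color histogram of pixel proportions and compares two histograms by histogram intersection. The quantization into 4 levels per channel and the use of histogram intersection as the similarity measure are assumptions; the slides only say that a histogram is computed.

```python
from collections import Counter

def color_histogram(pixels, levels=4):
    """pixels: iterable of (r, g, b) with 0-255 channels. Each channel is
    quantized to `levels` bins; returns bin -> proportion of pixels."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 means identical color proportions."""
    return sum(min(p, h2.get(bin_, 0.0)) for bin_, p in h1.items())

red, blue = (250, 10, 10), (10, 10, 250)
query = color_histogram([red] * 3 + [blue])        # 75% red, 25% blue
stored = color_histogram([red] * 2 + [blue] * 2)   # 50% red, 50% blue
similarity = histogram_intersection(query, stored)
```

A histogram ignores where colors occur in the image, which is why the region-based queries earlier in the deck add location and size on top of color.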

Retrieval using indexing

Objects are represented as collections of features.

Similarity depends on context and frame of reference.

Features are characterized by multiple multimodal feature measures.

Challenges in Indexing

The index must be created using all features of an object class.

Nodes in the index tree must be consistent with respect to the context and frame of reference.

Multiple multimodal feature measures should be fused properly to generate the index tree, so that a valid categorization is possible.

Similarity based retrieval

Uses similarity measures.

When presented with a sample facial image, similarity retrieval proceeds in the same way as pattern classification with a decision tree.

Retrieval follows the tree down to the leaf nodes. At each level, similarity measures determine the decision.

Using distance as the similarity measure, the index tree selects the jth node in the next level if $d(x, t_j) = \min_i d(x, t_i)$, where x is the sample image and $t_j$ is the template of the jth node.

At the leaf node level, all leaf nodes similar to the sample image are selected.
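The descent through the index tree can be sketched in Python as follows. Several details are assumptions made for the example: templates are plain feature vectors, the distance is Euclidean, all leaves sit at the same depth, and "similar" at the leaf level means within a tolerance `tol` of the sample.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    template: tuple                 # feature vector representing this node
    label: str = ""
    children: list = field(default_factory=list)

def distance(x, t):
    """Euclidean distance between two feature vectors (an assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(x, t)) ** 0.5

def retrieve(root, x, tol):
    """Follow the nearest template at each internal level, then return the
    labels of all leaves under the last node that lie within tol of x."""
    node = root
    # Descend while the children are themselves internal nodes.
    while node.children and node.children[0].children:
        node = min(node.children, key=lambda c: distance(x, c.template))
    return [leaf.label for leaf in node.children
            if distance(x, leaf.template) <= tol]

root = Node((0, 0), "root", [
    Node((0, 0), "A", [Node((0, 1), "a1"), Node((1, 0), "a2")]),
    Node((10, 10), "B", [Node((9, 10), "b1"), Node((10, 11), "b2"),
                         Node((14, 14), "b3")]),
])
matches = retrieve(root, (10, 10), tol=1.5)
```

Each level costs only as many distance computations as the node has children, so a sample is compared against a small fraction of the stored templates.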