Compact Descriptors for Visual Search


Upload: antonio-capone

Post on 14-Jun-2015


Page 1: Compact Descriptors for Visual Search

Compact Descriptors 4 Visual Search

Danilo Pau ([email protected])

Senior Principal Engineer

Senior Member of Technical Staff

SMIEEE

SI/CVRP

STMicroelectronics/AST

Courtesy: M. Funamizu

Page 2: Compact Descriptors for Visual Search

Agenda

• Visual Search: Context

• MPEG initiative on Visual Search

• Compact Descriptors for Visual Search

• Implementation

• Use Cases

• Visual Search Evolution: Moving Pictures and 3D

• Questions and Answers


15/01/2013Presentation Title


Page 4: Compact Descriptors for Visual Search

Visual Search Context

• Millions of images and videos continue to be uploaded to remote servers all over the world

• Each day, 300 million photos are uploaded to Facebook

• roughly 58 photos uploaded each second

• One hour of video is uploaded to YouTube every second


Page 5: Compact Descriptors for Visual Search

Content-Based Image Retrieval (CBIR)

• CBIR covers the concept of search that analyzes the actual content in the image, rather than relying on metadata.

• The development of this concept incorporated many algorithms and techniques from fields such as statistics, pattern recognition and computer vision.

• CBIR attracted a lot of attention and, after many years of research, it has expanded towards the marketplace.

• The application of CBIR to the mobile market is called Mobile Visual Search.

• Visual Search is the capability to initiate a search using an image of a rigid object as the query.

• The market potential of mobile visual search covers any mobile device with a camera (phones, tablets and hybrids).


Page 6: Compact Descriptors for Visual Search

CBIR vs QR Codes

• Quick Response (QR) codes are a type of two-dimensional barcode.

• The code is scanned by the mobile imager to produce a URL for redirection and browsing.

• QR codes are used by 6.2% of smartphone users in the USA.


Page 7: Compact Descriptors for Visual Search

Lots of Existing Applications

• Google’s Goggles

• Nokia’s Point and Find

• oMoby

• Like.com

• Kooaba

• Moodstocks

• Snaptell

• pixlinQ

• Bing


Page 8: Compact Descriptors for Visual Search

Existing Apps Use JPEG

• The applications above use the mobile imager to send JPEG-compressed queries

[Diagram: the mobile device sends JPEG images to a remote server with a database; the server returns the visual search result]

Page 9: Compact Descriptors for Visual Search

An Example of Visual Search

[Figure panels: Query; Interest Point Description; Descriptor pairing; Inliers. Courtesy Telecom Italia]

Page 10: Compact Descriptors for Visual Search

The Rise of Compressed Descriptors

• Alternatively send “compact features” extracted from raw images

• For example, Scale Invariant Feature Transform (SIFT) visual descriptors

• Consider 1200 descriptors per frame, each 128 bytes plus 4 bytes for coordinates, at 30 fps → a network load of nearly 38 Mbit/s → unacceptable
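The 38 Mbit/s figure follows directly from the numbers on the slide. A minimal sketch of the arithmetic, assuming 1 Mbit = 10^6 bits and that every frame carries the full set of uncompressed descriptors (the function name is illustrative, not from the presentation):

```python
def sift_stream_bitrate_mbit(descriptors_per_frame=1200,
                             descriptor_bytes=128,
                             coordinate_bytes=4,
                             frames_per_second=30):
    """Return the raw network load, in Mbit/s, of streaming uncompressed SIFT."""
    # Each descriptor travels with its coordinates: 128 + 4 = 132 bytes.
    bytes_per_frame = descriptors_per_frame * (descriptor_bytes + coordinate_bytes)
    bits_per_second = bytes_per_frame * 8 * frames_per_second
    return bits_per_second / 1_000_000  # assumption: 1 Mbit = 10**6 bits


if __name__ == "__main__":
    print(f"{sift_stream_bitrate_mbit():.1f} Mbit/s")
```

With the default values this gives 1200 × 132 × 8 × 30 = 38,016,000 bit/s, which is the "nearly 38 Mbit/s" quoted above.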

[Bar chart: data size in KB (axis 0 to 160) for a VGA image sent as JPEG High, JPEG Low, and SIFT]

Page 11: Compact Descriptors for Visual Search

Systems Considered

• Instead of sending images (a),

• the application can send compact descriptors (b),

• or even perform the search locally (c).

Page 12: Compact Descriptors for Visual Search

Previous Attempts

• Hashing
  • Locality Sensitive Hashing [Yeo et al., 2008]
  • Similarity Sensitive Coding [Torralba et al., 2008]
  • Spectral Hashing [Weiss et al., 2008]

• Transform Coding
  • Karhunen-Loève Transform [Chandrasekhar et al., 2009]
  • ICA-based Transform [Narozny et al., 2008]

• Vector Quantization
  • Product Quantization [Jégou et al., 2010]
  • Tree Structured Vector Quantization [Nistér et al., 2006]

• Alternative to SIFT
  • Compressed Histogram of Gradients [Chandrasekhar et al., 2011]


Page 14: Compact Descriptors for Visual Search

Is a standard on Visual Search needed?

• Reduce the load on wireless networks carrying visual search-related information.

• Ensure interoperability of visual search applications and databases.

• Enable hardware support for descriptor extraction and matching in mobile devices.

• Enable a high level of performance of implementations conformant to the standard.

• Simplify the design of descriptor extraction and matching for visual search applications.

Page 15: Compact Descriptors for Visual Search

What is a suitable standardization body?

• Informal title:
  • Moving Picture Experts Group (MPEG)

• Formal title:
  • ISO/IEC JTC1 SC29 WG11 (Coding of Moving Pictures and Audio)

• Parent SDOs:
  • ISO: International Organization for Standardization
  • IEC: International Electrotechnical Commission
  • JTC 1: Joint Technical Committee One
  • SC29: Subcommittee 29: Coding of Audio, Picture, Multimedia and Hypermedia Information

• Members: National Bodies (25 voting, 16 observers)

[Diagram: JTC 1 → SC29 → WG11 (MPEG)]


Page 18: Compact Descriptors for Visual Search

CDVS: Scope

• Descriptor extraction process needed to ensure interoperability.

• Bitstream of compact descriptors

[Diagram: Query Image → Descriptor extraction → Descriptor bitstream → Descriptor matching → Geometric verification → List of results. Other labels: Database, Standard]

Page 19: Compact Descriptors for Visual Search

Requirements

• Robustness
  • High matching accuracy shall be achieved at least for images of textured rigid objects, landmarks, and printed documents.
  • The matching accuracy shall be robust to changes in vantage points, camera parameters, lighting conditions, as well as in the presence of partial occlusions.

• Sufficiency
  • Descriptors shall be self-contained, in the sense that no other data are necessary for matching.

• Compactness
  • Shall minimize lengths/size of image descriptors

• Scalability
  • Shall allow adaptation of descriptor lengths to support the required performance level and database size.
  • Shall enable design of web-scale visual search applications and databases.

Page 20: Compact Descriptors for Visual Search

How to achieve robustness

• Image content is transformed into visual features with coordinates that are invariant to illumination, scale, rotation, affine and perspective transforms

Page 21: Compact Descriptors for Visual Search

Types of invariance

• Illumination

• Scale

• Rotation

• Affine Transform

• Full Perspective

Page 26: Compact Descriptors for Visual Search

Compactness

[Bar chart: data size in KB (axis 0 to 160) for a VGA image sent as JPEG High, JPEG Low, SIFT, and compact descriptors of 512B, 1KB, 2KB, 4KB, 8KB and 16KB]

Page 27: Compact Descriptors for Visual Search

Extraction Pipeline

• H-Mode uses SQ encoding (256B)

• S-Mode uses MSVQ encoding (38KB)

• Both modes use SCFV (49KB)

[Diagram blocks: Image; Resizing; SIFT DoG; Keypoint selection; Local Description Extraction; Coordinate coding; SCFV Descriptor; H Mode (Transform & SQ, Arithmetic coding); S Mode (MSVQ encoding); Encoding; Compact descriptors]

Page 28: Compact Descriptors for Visual Search

Properties of SIFT

David Lowe’s local descriptor detection and extraction (1999-2004)

Extraordinarily robust matching technique

• Can handle changes in viewpoint
  • Up to about 30 degrees of out-of-plane rotation

• Can handle significant changes in illumination
  • Sometimes even day vs. night

• Lots of code available → http://www.vlfeat.org (BSD license)

Page 29: Compact Descriptors for Visual Search

Pyramid of DoG

[Figure: difference-of-Gaussians (DoG) images computed across Scale 1 to Scale m within each of Octave 1 to Octave n]

Page 30: Compact Descriptors for Visual Search

Actual Interest Point Detector Output

Page 31: Compact Descriptors for Visual Search

Building a Descriptor

• Take a 16x16 patch window around the detected interest point

• Subdivide the patch into a 4x4 grid of sub-patches

• For each sub-patch, create an 8-bin histogram over edge orientations, weighted by gradient magnitude

• These lead to a 4x4x8 = 128-element vector → the SIFT descriptor
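The steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Lowe's full method: it uses central-difference gradients and hard bin assignment, and it leaves out the Gaussian weighting, interpolation between bins, rotation to the dominant orientation, and final normalisation. The name `describe_patch` is made up for this example.

```python
import math

PATCH = 16            # patch side in pixels
GRID = 4              # 4x4 grid of sub-patches
BINS = 8              # orientation bins per sub-patch
CELL = PATCH // GRID  # each sub-patch is 4x4 pixels


def describe_patch(patch):
    """Build a 4x4x8 = 128-element orientation histogram from a 16x16 patch."""
    assert len(patch) == PATCH and all(len(row) == PATCH for row in patch)
    descriptor = [0.0] * (GRID * GRID * BINS)
    for y in range(PATCH):
        for x in range(PATCH):
            # Central differences, clamped at the patch border (an assumption).
            dx = patch[y][min(x + 1, PATCH - 1)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, PATCH - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)  # orientation in [0, 2*pi)
            bin_index = int(angle / (2 * math.pi) * BINS) % BINS
            cell = (y // CELL) * GRID + (x // CELL)
            # The vote is weighted by the gradient magnitude.
            descriptor[cell * BINS + bin_index] += magnitude
    return descriptor


if __name__ == "__main__":
    ramp = [[float(x) for x in range(PATCH)] for _ in range(PATCH)]
    print(len(describe_patch(ramp)))
```

For a horizontal intensity ramp, every gradient points along the x axis, so all the weight falls into the first orientation bin of each sub-patch.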

[Figure: angle histogram over 0 to 2π]

Page 32: Compact Descriptors for Visual Search

Key point selection

• Basic idea: inlier features do not behave, in a statistical sense, as outlier features do.

• Each key point receives a relevance value that takes into account its distance from the image center, scale, orientation, peak, and the mean and variance of its SIFT descriptor.

Page 33: Compact Descriptors for Visual Search

Local Descriptor Compression H mode

• Main idea is to generate a compressed descriptor from uncompressed SIFT by
  • Simple linear combinations of histograms
  • Scalar quantisation of resultant values
  • Adaptive Arithmetic coding

• Main benefits
  • Very low computational complexity
  • Negligible memory requirements
  • Highly scalable
  • Allows for very efficient matching and retrieval
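A rough sketch of the first two H-mode steps may help. The slide does not give the actual linear combinations or quantiser levels, so everything specific here is an assumption: the transform takes differences between adjacent orientation bins, the quantiser has three levels, and `threshold` is a hypothetical parameter. The adaptive arithmetic coder that would follow is not shown.

```python
def transform_cell(hist):
    """Hypothetical transform: differences between adjacent orientation bins."""
    n = len(hist)
    return [hist[i] - hist[(i + 1) % n] for i in range(n)]


def scalar_quantise(values, threshold):
    """Map each value to -1, 0 or +1 (an assumed three-level quantiser)."""
    return [(v > threshold) - (v < -threshold) for v in values]


def compress_descriptor(descriptor, bins=8, threshold=0.1):
    """Transform and quantise every 8-bin histogram of a SIFT-like descriptor."""
    symbols = []
    for start in range(0, len(descriptor), bins):
        cell = descriptor[start:start + bins]
        symbols.extend(scalar_quantise(transform_cell(cell), threshold))
    return symbols


if __name__ == "__main__":
    one_cell = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(compress_descriptor(one_cell * 16)[:8])
```

Both steps are additions, subtractions and comparisons only, which is consistent with the low complexity and negligible memory claimed on the slide.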

Page 34: Compact Descriptors for Visual Search

Vector Quantizer Scheme: S-Mode

Page 35: Compact Descriptors for Visual Search

Location Encoding

• Histogram Map: The positions of the nonzero bins are encoded as binary words through scanning columns and compressing the words by arithmetic coding.

• Histogram Count: The number of coordinates in the nonzero bins is encoded in an iterative fashion, by specifying first which bins contain more than 1 key point, then by specifying which among these contain more than 2 key points, and so forth.
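The two parts of the location code can be illustrated with a small sketch. It assumes key point coordinates are first quantised into square spatial bins (`bin_size` is a hypothetical parameter), and it stops before the arithmetic coding stage. All function names are invented for the example.

```python
from collections import Counter


def location_histogram(points, bin_size, width, height):
    """Count key points per spatial bin; bin_size is an assumed parameter."""
    cols = -(-width // bin_size)   # ceiling division
    rows = -(-height // bin_size)
    counts = Counter((int(x) // bin_size, int(y) // bin_size) for x, y in points)
    return counts, cols, rows


def histogram_map(counts, cols, rows):
    """One binary word per column: 1 where the bin holds at least one key point."""
    return ["".join("1" if counts.get((c, r), 0) else "0" for r in range(rows))
            for c in range(cols)]


def histogram_count(counts, cols, rows):
    """Layer k flags which of the surviving bins hold more than k key points."""
    survivors = [(c, r) for c in range(cols) for r in range(rows)
                 if counts.get((c, r), 0)]
    layers, k = [], 1
    while survivors:
        flags = [1 if counts[b] > k else 0 for b in survivors]
        layers.append(flags)
        survivors = [b for b, f in zip(survivors, flags) if f]
        k += 1
    return layers


if __name__ == "__main__":
    pts = [(1, 1), (2, 3), (9, 1), (30, 30), (31, 29), (30, 31)]
    counts, cols, rows = location_histogram(pts, bin_size=8, width=32, height=32)
    print(histogram_map(counts, cols, rows))
    print(histogram_count(counts, cols, rows))
```

In this example three bins are occupied, holding 2, 1 and 3 key points. The first count layer flags the bins with more than one key point, the second flags which of those hold more than two, and the iteration ends once no bin survives.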


Page 37: Compact Descriptors for Visual Search

Extraction times

• SIFT interest point detection and feature extraction make the biggest contribution to extraction time

• Global descriptor extraction is as complex as interest point detection

• Local descriptor encoding and coordinate encoding are very fast

15/01/2013 Quantitative evaluation of CDVS extraction and pairwise matching


Page 39: Compact Descriptors for Visual Search

Mobile Visual Search: Music CDs

[Figure labels: Query; Stream Music]

Page 40: Compact Descriptors for Visual Search

Visual Search: eReaders, Printers

[Diagram labels: Snapshot; Paper-copy; Initiate Visual Search; Mass Storage; Send Compact Query; Selective quality & content printing; Multimedia Content Retrieval from the cloud; Augmentation Rendering; Composition of augmentations and image; Augmentation 3D models and markers; Transmission of markers and 3D models; 2D / 3D Rendering; Content Augmentation]

Page 41: Compact Descriptors for Visual Search

News Finder

Still Pictures - Visual Search

Page 42: Compact Descriptors for Visual Search

Applications and Use Cases from a Broadcaster's Point of View

• Logo Detection

• Interactive Fruition

Courtesy RAI

Page 43: Compact Descriptors for Visual Search

Automotive 3D Top View

[Diagram: four cameras (Cam) and an ECU]

Page 44: Compact Descriptors for Visual Search

Automotive 3D Top View

Page 45: Compact Descriptors for Visual Search

Moving Pictures Visual Search

Courtesy Telecom Design


Page 47: Compact Descriptors for Visual Search

Inter Predicted Descriptors

• Desirable Properties:
  • An inter descriptor is coded in a compact visual stream
  • It is expressed in terms of one or more temporally neighboring descriptors.
  • The "inter" part of the term refers to the use of Inter Frame Prediction.
  • Designed to achieve higher compression rates and/or better precision-recall performance
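One way to picture a descriptor expressed in terms of a temporal neighbor is residual coding, by analogy with inter frame prediction in video. The sketch below is only an illustration under that assumption; the slide does not specify a prediction scheme, and the function names are hypothetical.

```python
def encode_inter(current, reference):
    """Residual of the current descriptor relative to a temporally neighboring one."""
    return [c - r for c, r in zip(current, reference)]


def decode_inter(residual, reference):
    """Rebuild the current descriptor from the reference and the residual."""
    return [r + d for r, d in zip(reference, residual)]


def nonzero_count(values):
    """Crude proxy for coding cost: the number of symbols that are not zero."""
    return sum(1 for v in values if v != 0)


if __name__ == "__main__":
    previous = [3, 0, 5, 1]   # descriptor of a feature in the previous frame
    current = [3, 0, 6, 1]    # the same feature tracked into the current frame
    residual = encode_inter(current, previous)
    print(residual, decode_inter(residual, previous) == current)
```

When a feature changes little between frames, most residual entries are zero, so an entropy coder would typically spend fewer bits on the residual than on the descriptor itself.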

Page 48: Compact Descriptors for Visual Search

3D Mobile Devices Will Surpass 148 Million in 2015

• Advances in 3D technology are very fast

• Industry adoption opens new opportunities → 3D Visual Search

• From In-Stat studies:
  • ~30% of all handheld game consoles will be 3D by 2015.
  • 3D mobile devices will increase demand for image sensors by 130%.
  • In 2012, notebooks will be the first 3D-enabled mobile devices to reach 1 million units.
  • By 2014, 18% of all tablets will be 3D.

• Nintendo, Fuji, GoPro, Sony, ViewSonic, LG, Origin, Toshiba, Fujitsu, HP, ASUS, Lenovo, Dell, Alienware, HTC and Sharp are focusing on autostereoscopic mobile technologies

Page 49: Compact Descriptors for Visual Search

[Product images: Microsoft Kinect; Asus Xtion; Google 3D Warehouse; LG Optimus 3D P920; LG Optimus Pad; HTC EVO 3D; Sharp Aquos SH-12C; 3DS by Nintendo]

Page 50: Compact Descriptors for Visual Search

3D Object Recognition with Kinect

http://www.youtube.com/watch?v=eRW1zG_aONk

Courtesy: CV laboratory University of Bologna

SHOT: Unique Signatures of Histograms for Local Surface Description
