Perspective Multiscale Detection and Tracking of Persons

Post on 07-Jul-2015

DESCRIPTION

Using perspective information can boost multiscale detectors for finding objects in images.

TRANSCRIPT

Perspective Multiscale Detection and Tracking of Persons

Marcos Nieto, Juan Diego Ortega, Andoni Cortés, and Seán Gaines

MMM 2014 – The 20th Anniversary International Conference on Multimedia Modeling, Dublin (Ireland), 6,7,8-10th January 2014

Outline

1. Motivation

2. Perspective calibration

3. Approach

4. Results

5. Conclusions

Outline

1. Motivation

1. Object detection in images

2. Real-time application

3. Contextual information

2. Perspective calibration

3. Approach

4. Results

5. Conclusions

Motivation

• Object detection in images

Detection-by-classification

Supervised learning

Feature extraction

Binary or multiclass

Multiscale detection

Sliding window

Spans position & size

Bounding boxes

Motivation

• Real-time applications

Multiscale detection

Kind of brute-force

Too many evaluations

Some are absurd given the context

[Figure: Num. Evaluations (Th.) versus Levels (1 to 61), with curves labelled 1.02, 1.05 and 1.1]

Parameters

Initial (smallest) size

Number of scales

Factor between scales

Offset (stride)

Therefore, some knowledge about the scene must be provided
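The cost of a brute-force multiscale scan can be estimated from the four parameters listed above. The following is a minimal sketch, not the authors' implementation: the function name `count_evaluations` is hypothetical, and it assumes that both the window and the stride grow by the scale factor at each level (equivalent to shrinking the image and keeping the window fixed).

```python
def count_evaluations(img_w, img_h, win_w, win_h, num_scales, factor, stride):
    """Count classifier evaluations of a brute-force multiscale sliding window.

    Assumption (illustrative): at level k the window is (win_w, win_h) * factor**k
    and the stride is stride * factor**k.
    """
    total = 0
    for level in range(num_scales):
        scale = factor ** level
        w = win_w * scale
        h = win_h * scale
        if w > img_w or h > img_h:
            break  # the window no longer fits inside the image
        step = stride * scale
        nx = int((img_w - w) // step) + 1  # horizontal window positions
        ny = int((img_h - h) // step) + 1  # vertical window positions
        total += nx * ny
    return total
```

For a fixed range of object sizes, a scale factor closer to 1 needs more levels to cover the range, so the total grows quickly.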

Motivation

• Contextual information

Color, motion, depth

Low generality

Particular to each application

Perspective of the scene

High generality

Allows the multiscale technique to be kept

Applicable in real-time

Two assumptions

There is a dominant ground plane

Objects lie on the plane, and their 3D size is approximately known

Surveillance, ADAS

Vehicles, persons

Outline

1. Motivation

2. Perspective Calibration

1. Plane view calibration

2. GUI

3. Projection of objects

3. Approach

4. Results

5. Conclusions

Perspective calibration

• Plane view calibration

Homography calculation

4-points

2 metric references

Extrinsics from Homography

Rotation and translation of the camera

1 DoF Camera model

Focal length from homography

Refinement using Levenberg-Marquardt
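One common way to get a focal length from a plane-to-image homography uses the orthogonality of the first two rotation columns. The sketch below is an assumption about how this step could look, not the authors' code: it assumes square pixels, zero skew and image coordinates already centred on the principal point, so that K = diag(f, f, 1). The function names are hypothetical, and the Levenberg-Marquardt refinement is not shown.

```python
import math


def homography_from_pose(f, R, t):
    """Build H = K [r1 r2 t] for the ground plane Z = 0, with K = diag(f, f, 1)."""
    return [
        [f * R[0][0], f * R[0][1], f * t[0]],
        [f * R[1][0], f * R[1][1], f * t[1]],
        [R[2][0], R[2][1], t[2]],
    ]


def focal_from_homography(H):
    """Estimate f from a 3x3 homography given as nested lists.

    With H ~ K [r1 r2 t], the constraint r1 . r2 = 0 gives
    f^2 = -(h11*h12 + h21*h22) / (h31*h32), independent of the scale of H.
    """
    (h11, h12, _), (h21, h22, _), (h31, h32, _) = H
    denom = h31 * h32
    if abs(denom) < 1e-12:
        raise ValueError("degenerate view: focal length is not observable")
    f2 = -(h11 * h12 + h21 * h22) / denom
    if f2 <= 0:
        raise ValueError("homography inconsistent with the assumed camera model")
    return math.sqrt(f2)
```

The estimate degenerates when the camera looks straight down at the plane or exactly along it, since the third row of the homography then has a zero entry in one of its first two columns.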

Perspective calibration

GUI

Useful to calibrate videos

Quick (2-5 minutes)

Also lens distortion correction

Perspective calibration

• Projection of objects

Farthest size of object

Closest size of object
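Once K, R and t are known, the image size of an object at the farthest and closest ground positions can be obtained by projecting its foot and head points. This is a sketch under stated assumptions: a pinhole camera without skew, a world frame with the ground plane at Z = 0 and Z pointing up, and hypothetical function names. The camera pose and the millimetre units in the usage note are invented for illustration.

```python
import math


def project(K, R, t, X):
    """Project a 3D world point X with a zero-skew pinhole camera (x_cam = R X + t)."""
    xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    u = (K[0][0] * xc[0] + K[0][2] * xc[2]) / xc[2]
    v = (K[1][1] * xc[1] + K[1][2] * xc[2]) / xc[2]
    return u, v


def projected_height(K, R, t, x, y, height):
    """Pixel height of an upright object of the given 3D height standing at (x, y, 0)."""
    foot = project(K, R, t, (x, y, 0.0))
    head = project(K, R, t, (x, y, height))
    return math.hypot(head[0] - foot[0], head[1] - foot[1])
```

For example, with a focal length of 1000 px and a camera 3000 mm above the ground looking horizontally, a 1700 mm person spans 85 px at 20 m and 170 px at 10 m.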

Outline

1. Motivation

2. Perspective Calibration

3. Approach

1. Overview

2. Perspective Multiscale

3. Perspective Grid

4. Results

5. Conclusions

Approach

• Define the perspective of the scene

Camera calibration

Intrinsic parameters

Camera pose

Extrinsic parameters

Homography calibration

• Define the 3D size of the object to search

Persons: 1700 x 500 x 500

Car: 1500 x 1700 x 3500

• A) Calculate the best parameters for multiscale

• B) Define a fixed grid of positions in the plane

Approach

• A) Perspective multiscale

• Rescale the original image so that the model size fits the farthest object

• Compute the scale factor so that the model size coincides with the closest object at the smallest image

• Filter out invalid positions

Focused effort: fewer levels are required

[Figure: Multiscale versus Perspective Multiscale]

Approach

• It is still necessary to filter out invalid position-size combinations

• The advantage of this approach is that traditional multiscale implementations can still be used, with far fewer levels

[Figure: Num. Evaluations (Th.) versus Levels (1 to 61), with curves labelled 1.02, 1.05 and 1.1]

Focused effort: fewer levels are required (typically 3 to 5)
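The two steps of the perspective multiscale (rescale so the model fits the farthest object, then choose the factor so the last level matches the closest object) can be sketched as below. This is an illustration under assumptions, not the authors' code: levels are spaced geometrically, and `perspective_multiscale_params` and its arguments are hypothetical names.

```python
def perspective_multiscale_params(model_h, far_h, near_h, num_levels):
    """Return (first_scale, factor, scales) for a perspective-driven multiscale scan.

    model_h: height in pixels of the detector model (window).
    far_h / near_h: projected pixel height of the farthest / closest object.
    Image scale k is first_scale / factor**k, so the object matches the model
    at the farthest position on level 0 and at the closest on the last level.
    """
    if not 0 < far_h <= near_h:
        raise ValueError("expected 0 < far_h <= near_h")
    if num_levels < 1:
        raise ValueError("num_levels must be at least 1")
    first_scale = model_h / far_h
    if num_levels == 1 or near_h == far_h:
        factor = 1.0
    else:
        factor = (near_h / far_h) ** (1.0 / (num_levels - 1))
    scales = [first_scale / factor ** k for k in range(num_levels)]
    return first_scale, factor, scales
```

With a 128 px model, a farthest object of 64 px, a closest object of 256 px and 3 levels, the image is first upscaled by 2 and then halved at each level. Positions where the window size disagrees with the perspective still have to be filtered out afterwards.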

Approach

• B) Grid of fixed positions

• Predefine feasible locations of objects

• No need to filter

• Cannot be used in multiscale implementations

One evaluation per candidate

Much more focused effort

[Figure: Bounding boxes versus Projected boxes]
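A grid of fixed positions can be precomputed offline by projecting an object-sized rectangle at each ground-plane location and keeping the boxes that fall inside the image. The sketch below is illustrative only: it assumes a zero-skew pinhole camera, a ground plane at Z = 0 with Z up, and a flat upright rectangle facing the camera as the object model. All names are hypothetical.

```python
def project(K, R, t, X):
    """Project world point X (x_cam = R X + t); return None if behind the camera."""
    xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    u = (K[0][0] * xc[0] + K[0][2] * xc[2]) / xc[2]
    v = (K[1][1] * xc[1] + K[1][2] * xc[2]) / xc[2]
    return u, v


def grid_candidates(K, R, t, xs, ys, obj_w, obj_h, img_w, img_h):
    """One image box (u_min, v_min, u_max, v_max) per feasible ground position."""
    boxes = []
    for y in ys:
        for x in xs:
            # Four corners of an upright rectangle standing at (x, y) on the plane.
            corners = [(x + dx, y, z)
                       for dx in (-obj_w / 2.0, obj_w / 2.0)
                       for z in (0.0, obj_h)]
            pts = [project(K, R, t, c) for c in corners]
            if any(p is None for p in pts):
                continue
            us = [p[0] for p in pts]
            vs = [p[1] for p in pts]
            box = (min(us), min(vs), max(us), max(vs))
            # Keep only boxes that lie fully inside the image.
            if box[0] >= 0 and box[1] >= 0 and box[2] <= img_w and box[3] <= img_h:
                boxes.append(box)
    return boxes
```

Each returned box is evaluated once by the classifier, so the number of evaluations equals the number of feasible grid positions.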

Outline

1. Motivation

2. Perspective Calibration

3. Approach

4. Results

1. Case study: person detection

2. Case study: vehicle detection

5. Conclusions

Results

• Case study: Person detection

– Full-body and Head & Shoulder SVM-HOG detector

– Perspective Multiscale

– Linear multiobject tracking

– Active Vision Group dataset (1920x1080, 4500 frames, 71460 persons labeled)

Results

• Performance

– Reduction from 144880 to 46226 evaluations (68%) for similar performance

– Using 3 levels is enough because the perspective effect is soft

[Figure: Multiscale versus Perspective Multiscale]

[Figure: Precision (0.978 to 1) versus Recall (0 to 0.6) for FB, FBUB, FBUB* and DAF, with L = 3, 5, 7. Annotations: Filtering; Tracking; Fewer FN; Fewer FP but also some missed detections]

Results

• Case study: Vehicle detection

– Vehicle detection application for an embedded vision system

– The road can be assumed planar over short distances

– Ground-truth sequence of 2 minutes

– Grid of fixed positions

Results

• Case study: Vehicle detection

– Detections are sparse and noisy

– Tracking is still necessary

Results

• 1000x fewer evaluations

• 7x speed-up on PC

• Same TP

• 5 times fewer FP

Results

Type            Processor       RAM      CPU clock   OS                        Language

PC              Intel Core i5   8 GB     3.0 GHz     Windows 7, Ubuntu 12.04   C++

Embedded HW 1   ARM Cortex      512 MB   800 MHz     Xilinx Zynq Linux         C++

[Figure: timing comparison on a Fast-to-Slow axis. Labels: Perspective multiscale; Brute-force multiscale; 2 - 10 ms in PC; 11 - 40 ms in PC; 30 - 40 ms in ARM Cortex; 25 fps real-time]

Conclusions

• Perspective is contextual information available in many situations

• Assumptions: dominant ground plane and known object size

• Its computation is easy (K, R, t) using homographies

• It can be used for object detection to focus computational effort

• Two ways of applying it

• A) Perspective Multiscale: Wrapping multiscale function (~60% reduction in typical surveillance scene)

• B) Grid of fixed positions: for even more reduction of complexity (x7 speed-up in low-perspective scenes like on-board vehicle detection)

Thank You!

Dr. Marcos Nieto

Researcher

mnieto@vicomtech.org

Offline process

Online process
