Poster: AVSS 2012


Upload: mahfuzul-haque

Post on 05-Jul-2015


Background Subtraction for Real-time Video Analytics Based on Multi-hypothesis Mixture-of-Gaussians

Mahfuzul Haque and Manzur Murshed
Gippsland School of Information Technology, Monash University, Victoria 3842, Australia
Email: {Mahfuzul.Haque, Manzur.Murshed}@monash.edu

1 Abstract

Robust background subtraction (BS) is essential for high quality foreground detection in most video analytics systems. Recent BS techniques achieve superior detection quality mostly by exploiting the complementary strengths of multiple background models. Consequently, these techniques fail to meet the operational requirements of real-time video analytics. The proposed BS technique, named multi-hypothesis mixture-of-Gaussians (MH-MOG), maintains a single background model based on perception-aware mixture-of-Gaussians and then generates multiple detection hypotheses with different processing bases. The complementary strengths of the hypotheses are exploited only during the detection stage, which achieves superior detection quality without significant computational overhead.

[1] M. Haque and M. Murshed, Background Subtraction for Real-time Video Analytics Based on Multi-hypothesis Mixture-of-Gaussians, IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Beijing, China, 2012.

[2] M. Haque and M. Murshed, Robust Background Subtraction Based on Perceptual Mixture-of-Gaussians with Dynamic Adaptation Speed, IEEE International Workshop on Advances in Automated Multimedia Surveillance for Public Safety, Melbourne, Australia, 2012.

[3] D.-S. Lee, Effective Gaussian mixture learning for video background subtraction, IEEE TPAMI, 27(5):827–832, 2005.

The images shown in the header have been taken from http://www.informationliberation.com

2 Dynamic Background Subtraction

[Figure: Video Frame, Background Model, Foreground Mask]

Dynamic background subtraction is an essential precursor to moving foreground detection in most video analytics systems. The quality of foreground detection directly impacts the performance of subsequent processing tasks.

3 Multiple Detection Hypotheses for Superior Detection Quality

[Figure: Conventional: Model 1, Model 2, …, Model n feed Hypothesis 1, Hypothesis 2, …, Hypothesis n, which feed the Detection Decision. Proposed: a single Model feeds Hypothesis 1, Hypothesis 2, …, Hypothesis n, which feed the Detection Decision.]

To achieve superior detection quality, conventional approaches use the complementary strengths of multiple detection hypotheses that originate from different background models, whereas the proposed technique uses a single underlying background model to generate complementary detection hypotheses.

4 The Proposed Background Subtraction Technique (MH-MOG)

[Figure: Incoming Video Stream; MH-MOG Background Model; Perception Inspired Detection Hypothesis and Probabilistic Detection Hypothesis, each with a Confidence Level for Detection Hypothesis; Detection Algorithm; High Quality Foreground Mask]

In the proposed background subtraction technique, a single background model is maintained based on observed video frames. Two independent detection hypotheses (perception inspired and probabilistic) are then generated from this background model. For each hypothesis, an associated confidence level is computed from the spatial detection results in the corresponding hypothesis space. Finally, the detection algorithm uses all of this information to produce a high quality foreground mask by maximising the complementary strengths of both hypotheses [1].
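The detection stage of MH-MOG, in which two hypothesis masks and their confidence levels are combined into one foreground mask, can be illustrated with a small sketch. The poster does not state how the confidence levels are computed or how the detection algorithm weighs them, so everything here is an assumption made for illustration: confidence is taken to be the fraction of a pixel's 8-neighbours that share its label within the same hypothesis mask, and where the hypotheses disagree the label with the higher confidence wins. The names `neighbour_support` and `fuse_hypotheses` are hypothetical.

```python
def neighbour_support(mask, r, c):
    """Fraction of in-frame 8-neighbours that share the label of mask[r][c].

    Assumed stand-in for the per-hypothesis confidence level; the poster
    only says it is based on spatial detection results.
    """
    rows, cols = len(mask), len(mask[0])
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                total += 1
                same += int(mask[rr][cc] == mask[r][c])
    return same / total if total else 1.0


def fuse_hypotheses(mask_a, mask_b):
    """Combine two binary masks (1 = foreground) into one decision per pixel.

    Assumed rule: keep the label when both hypotheses agree; otherwise
    take the label whose hypothesis has stronger neighbourhood support.
    """
    rows, cols = len(mask_a), len(mask_a[0])
    fused = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            a, b = mask_a[r][c], mask_b[r][c]
            if a == b:
                fused[r][c] = a
            else:
                conf_a = neighbour_support(mask_a, r, c)
                conf_b = neighbour_support(mask_b, r, c)
                fused[r][c] = a if conf_a >= conf_b else b
    return fused
```

Under this assumed rule, a one-pixel hole in one mask is filled when the other mask shows solid foreground around it, and an isolated foreground pixel is dropped when the other mask shows uniform background.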

5 Background Modelling

The background of the operating environment is modelled at pixel level by maintaining at most N observed intensity values (m_1, m_2, …, m_N). For each sample, the associated Gaussian variables (µ, σ, and ω) are maintained to determine the order of the samples based on observation frequency.

Observed intensity value: m
Mean: µ
Standard deviation: σ
Weight: ω

[Figure: P(x) plotted against Intensity]

[Figure: Intensity axis from 0 to 255 with samples m_1, m_2, m_3 marked]
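The per-pixel model described above (at most N samples, each carrying m, µ, σ, and ω, ordered by how often they are observed) can be sketched as follows. The poster does not give the update equations, so this sketch borrows the familiar running-average update of standard mixture-of-Gaussians background models. The matching test `k * sigma`, the learning rate `alpha`, the initial and minimum σ, and the names `Sample` and `update_pixel_model` are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    m: float      # most recent observed intensity matched to this sample
    mu: float     # running mean
    sigma: float  # running standard deviation
    omega: float  # weight, a proxy for observation frequency


def update_pixel_model(samples, x, max_samples=3, alpha=0.1, k=2.5,
                       init_sigma=10.0, min_sigma=1.0):
    """Update one pixel's sample list in place with a new intensity x."""
    # Find the first (highest-weight) sample that x falls close to.
    matched = next((s for s in samples if abs(x - s.mu) <= k * s.sigma), None)

    # Decay every weight, then reward the matched sample.
    for s in samples:
        s.omega *= 1.0 - alpha

    if matched is not None:
        matched.omega += alpha
        matched.mu += alpha * (x - matched.mu)
        var = (1.0 - alpha) * matched.sigma ** 2 + alpha * (x - matched.mu) ** 2
        matched.sigma = max(var ** 0.5, min_sigma)
        matched.m = x
    else:
        # No match: drop the least observed sample if the model is full.
        if len(samples) >= max_samples:
            samples.remove(min(samples, key=lambda s: s.omega))
        samples.append(Sample(m=x, mu=x, sigma=init_sigma, omega=alpha))

    # Renormalise weights and keep samples ordered by observation frequency.
    total = sum(s.omega for s in samples)
    for s in samples:
        s.omega /= total
    samples.sort(key=lambda s: s.omega, reverse=True)
    return samples
```

Calling `update_pixel_model(model, intensity)` once per frame for a pixel keeps the most frequently observed intensity at the front of `model`, which is the ordering the text refers to.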

6 Perception inspired detection hypothesis

A confidence interval is determined for each believed-to-be-background intensity value based on the characteristics of the human visual system in perceiving noticeable intensity deviation from the background (Weber's Law). An observed intensity value is classified as background if it falls within any background confidence interval [2].
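A minimal sketch of this test, assuming the textbook form of Weber's Law (the just-noticeable intensity change is proportional to the background intensity). The Weber fraction of 0.1, the minimum half-width that keeps very dark values from getting a near-zero interval, and the function names are illustrative assumptions, not values taken from the poster or from [2].

```python
def weber_interval(m, weber_fraction=0.1, min_half_width=5.0):
    """Confidence interval around a background intensity m on a 0..255 scale.

    The half-width grows with m, so brighter backgrounds tolerate larger
    absolute deviations before a change counts as noticeable.
    """
    half = max(weber_fraction * m, min_half_width)
    return max(0.0, m - half), min(255.0, m + half)


def is_background_perceptual(x, background_values,
                             weber_fraction=0.1, min_half_width=5.0):
    """True if x lies inside the interval of any background intensity value."""
    for m in background_values:
        lo, hi = weber_interval(m, weber_fraction, min_half_width)
        if lo <= x <= hi:
            return True
    return False
```

With background values 200 and 40, a deviation of 15 grey levels is absorbed near 200 but flagged near 40, which is the asymmetry a perception based threshold is meant to capture.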

7 Probabilistic detection hypothesis

Unlike the perception inspired hypothesis, no subset of samples is chosen as background for intensity comparison. Rather, a probabilistic formulation involving all Gaussian components is used [3] for background/foreground classification. This hypothesis shows higher foreground sensitivity and thus recovers foreground regions that the perception inspired hypothesis misses because of intensity thresholding.
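One way to read "a probabilistic formulation involving all Gaussian components" is as a posterior-weighted sum: each component's likelihood of being background is weighted by how responsible that component is for the observed intensity. The sketch below follows that reading. The poster does not spell the formula out, so the sigmoid mapping from ω/σ to a per-component background likelihood, its parameters `a` and `b`, the 0.5 decision threshold, and all function names are assumptions for illustration.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def component_background_likelihood(omega, sigma, a=100.0, b=5.0):
    """Assumed sigmoid on omega/sigma: heavy, narrow components look like background."""
    return 1.0 / (1.0 + math.exp(-(a * omega / sigma - b)))


def background_probability(x, components):
    """Estimate P(background | x) from all (mu, sigma, omega) components."""
    weighted = [omega * gaussian_pdf(x, mu, sigma) for mu, sigma, omega in components]
    total = sum(weighted)
    if total == 0.0:
        return 0.0  # x is unexplained by every component: treat as foreground
    return sum(
        component_background_likelihood(omega, sigma) * w / total
        for (mu, sigma, omega), w in zip(components, weighted)
    )


def is_foreground_probabilistic(x, components, threshold=0.5):
    """Foreground when the estimated background probability is below the threshold."""
    return background_probability(x, components) < threshold
```

Because every component contributes, no hard choice of background samples is needed, which matches the contrast the text draws with the perception inspired hypothesis.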

8 Experiments

More than 50 test sequences were selected from eight different datasets, including PETS, Wallflower, IBM, VSSN06, CAVIAR, and UCF, and categorised into the following classes based on the characteristics of the operating environments: low contrast foreground (LC), shadows and reflections (SR), multi-modal background (MM), indoor (INDOOR), and outdoor (OUTDOOR).

9 Quantitative Evaluation

Quantitative comparison: This figure shows overall (ALL), dataset-wise (PETS, WF, UCF, IBM, CAV, VSSN), and sequence-class-wise (SR, MM, LC) performance comparisons.

10 Visual Comparison

[Figure: columns show First Frame, Test Frame, Ground Truth, MOG (S&G), MOG (Lee), ViBe, MH-MOG]

MOG (S&G) – TPAMI 2000, MOG (Lee) – TPAMI 2005, ViBe – TIP 2011, and MH-MOG – Proposed.