Object Tracking: A Survey

Object Tracking and Detection, by Alper Yilmaz, Omar Javed, and Mubarak Shah. Compiled by Haseeb Hassan ([email protected]), School of Computer Sciences, Anhui University, Hefei, China.

Upload: haseeb-hassan

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Object tracking a survey

Object Tracking and Detection

By Alper Yilmaz, Omar Javed, and Mubarak Shah

Compiled by Haseeb Hassan [email protected]

School of Computer Sciences, Anhui University, Hefei, China

Page 2: Object tracking a survey

About this Review Paper

• The three authors discuss articles published between 1979 and 2006 and present the field in depth.
• The paper cites approximately 162 references.
• It is difficult to understand every detail of the paper, but this presentation tries its best to establish some basic concepts.
• The survey focuses on methodologies for tracking objects in general and not on trackers tailored for specific objects, for example, person trackers that use human kinematics as the basis of their implementation.

Page 3: Object tracking a survey

Preface

The paper is an extensive survey of object tracking methods and also gives a brief review of related topics.

• We divide tracking methods into three categories based on the object representation used: point correspondence, primitive geometric models, and contour evolution.

• Point trackers require detection in every frame, whereas geometric-region or contour-based trackers require detection only when the object first appears in the scene.

• Some discussion of object detection is also included.

• Summaries of object trackers, object representations, and motion models are provided.

• We believe that this survey of object tracking, with its rich bibliography, can give valuable insight into this important research topic and encourage new research.

Page 4: Object tracking a survey

1.What is Object Tracking

Object tracking is the problem of estimating the trajectory of an object over time by locating its position in every frame or, equivalently, estimating the trajectory of an object in the image plane as it moves around a scene. It is an important task within the field of computer vision.

There are three key steps in video analysis:
• Detection of interesting moving objects
• Tracking of these objects from frame to frame
• Analysis of the object tracks for recognition

Page 5: Object tracking a survey

1.2-Difficulties in Tracking

Difficulties in tracking objects can arise due to:

• Abrupt object motion
• Changing appearance patterns of both the object and the scene
• Non-rigid object structures
• Object-to-object and object-to-scene occlusions
• Camera motion

Page 6: Object tracking a survey

1.3-Object Tracking Applications

• Motion-based recognition, that is, human identification based on gait, automatic object detection, etc.
• Automated surveillance, that is, monitoring a scene to detect suspicious activities.
• Video indexing, that is, automatic annotation and retrieval of the videos in multimedia databases.
• Human-computer interaction, that is, gesture recognition, eye gaze tracking for data input to computers, etc.
• Traffic monitoring, that is, real-time gathering of traffic statistics to direct traffic flow.
• Vehicle navigation, that is, video-based path planning and obstacle avoidance capabilities.

Page 7: Object tracking a survey

Different Approaches Proposed

Numerous approaches for object tracking have been proposed. They differ in how they answer the following questions:

A. Which object representation is suitable?
B. Which image features should be used?
C. How should the motion, appearance, and shape of the object be modeled?

The answers depend on the context/environment in which the tracking is performed. A large number of tracking methods have been proposed which attempt to answer these questions for a variety of scenarios.

Page 8: Object tracking a survey

2.Object Representation

In a tracking scenario, an object is anything that is of interest for further analysis. For instance, boats on the sea, fish inside an aquarium, vehicles on a road, planes in the air, people walking on a road, or bubbles in the water are objects that may be important to track in a specific domain.

Objects can be represented by their shapes and appearances. Common shape representations are:

• Points
• Primitive geometric shapes
• Object silhouette and contour
• Articulated shape models
• Skeletal models

Page 9: Object tracking a survey

Objects Shape Representation

[Figure: object shape representations. Panels: points, primitive geometric shapes, object silhouette and contour, articulated shape models, skeletal models.]

Page 10: Object tracking a survey

Continued…

• Probability densities of object appearance
o The density estimates can be either parametric, such as a Gaussian or a mixture of Gaussians, or non-parametric.
o The probability densities of object appearance features (color, texture) can be computed from the image regions specified by the shape models (interior region of an ellipse or a contour).

• Templates
o Templates are formed using simple geometric shapes or silhouettes [Fieguth and Terzopoulos 1997].
o A template carries both spatial and appearance information.
o Templates are only suitable for tracking objects whose poses do not vary considerably during the course of tracking.

Page 11: Object tracking a survey

Models

• Active appearance models. These are generated by simultaneously modeling the object shape and appearance [Edwards et al. 1998]. The object shape is defined by a set of landmarks, and for each landmark an appearance vector is stored in the form of color, texture, or gradient magnitude.

• Multiview appearance models. These encode different views of an object. One approach to represent the different object views is to generate a subspace from the given views. Subspace approaches, for example Principal Component Analysis (PCA) and Independent Component Analysis (ICA), have been used for both shape and appearance representation [Mughadam and Pentland 1997; Black and Jepson 1998].

Another approach to learning the different views of an object is to train a set of classifiers, for example, support vector machines [Avidan 2001] or Bayesian networks [Park and Aggarwal 2004].

Page 12: Object tracking a survey

3.Feature Selection

1.Color

2.Edges

3.Optical Flow

4.Texture

Selecting the right features plays a critical role in tracking.

Page 13: Object tracking a survey

4. Object Detection

• Every tracking method requires an object detection mechanism.
• A common approach for object detection is to use the information in a single frame.
• Some object detection methods make use of temporal information computed from a sequence of frames to reduce the number of false detections.
• This temporal information is usually in the form of frame differencing, which highlights changing regions in consecutive frames.
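Frame differencing as described above can be illustrated with a minimal sketch. The function name `frame_difference_mask`, the threshold value, and the synthetic frames are assumptions made for illustration; they do not come from the survey.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose intensity changed by more than `threshold` (assumed value)."""
    # Cast to a signed type first so the subtraction of uint8 images cannot wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Synthetic example: a bright 2x2 block appears in an otherwise static scene.
prev_frame = np.zeros((6, 6), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[2:4, 2:4] = 200

mask = frame_difference_mask(prev_frame, curr_frame)
print(mask.astype(int))
```

Only the four pixels of the new block are flagged; everything that stayed constant between the two frames is suppressed.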


Page 14: Object tracking a survey

Object Detection Categories


Page 15: Object tracking a survey

4.1-Point Detectors


Page 16: Object tracking a survey

4.2-Background Subtraction

• Object detection can be achieved by building a representation of the scene called the background model.

• Any significant change in an image region from the background model signifies a moving object.

• The pixels constituting the regions undergoing change are marked for further processing.

• Background subtraction became popular following the work of Wren et al. [1997].

• An alternate approach for background subtraction is to model the intensity variations of a pixel in an image sequence.
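The idea of a background model can be sketched with a per-pixel running average. This is a deliberately simplified stand-in chosen for illustration, not the model of Wren et al. [1997]; the names `update_background` and `foreground_mask`, the learning rate `alpha`, and the threshold are all assumptions.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (running average)."""
    return (1.0 - alpha) * background + alpha * frame

def foreground_mask(background, frame, threshold=30.0):
    """Pixels that deviate strongly from the background model are foreground."""
    return np.abs(frame.astype(np.float64) - background) > threshold

# Learn the background from 20 noisy frames of a static scene.
rng = np.random.default_rng(0)
background = np.full((8, 8), 100.0)
for _ in range(20):
    frame = 100.0 + rng.normal(0.0, 2.0, size=(8, 8))
    background = update_background(background, frame)

# A bright object enters the scene.
test_frame = np.full((8, 8), 100.0)
test_frame[3:5, 3:5] = 220.0
fg = foreground_mask(background, test_frame)
print(int(fg.sum()), "foreground pixels")
```

The marked pixels would then be passed on for further processing, for example connected-component grouping.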

Page 17: Object tracking a survey

Background Subtraction

[Figure: mixture of Gaussian modeling for background subtraction.]

• Most state-of-the-art tracking methods for fixed cameras, for example, Haritaoglu et al. [2000] and Collins et al. [2001], use background subtraction methods to detect regions of interest.

• The most important limitation of background subtraction is the requirement of stationary cameras.

• The methods can still be applied to video acquired by mobile cameras when the motion between successive frames is small.

Page 18: Object tracking a survey

5-Segmentation

The aim of image segmentation algorithms is to partition the image into perceptually similar regions. Every segmentation algorithm addresses two problems: the criteria for a good partition and the method for achieving efficient partitioning [Shi and Malik 2000].

5.1-Mean Shift Clustering

For the image segmentation problem, Comaniciu and Meer [2002] propose the mean-shift approach to find clusters in the joint spatial+color space, [l, u, v, x, y], where [l, u, v] represents the color and [x, y] represents the spatial location.

Page 19: Object tracking a survey

Clustering

[Figure: mean-shift vectors. All trajectories lead to the same mode.]

Page 20: Object tracking a survey

Continued…

Mean-shift clustering is scalable to various other applications such as edge detection, image regularization [Comaniciu and Meer 2002], and tracking [Comaniciu et al. 2003].
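The mode-seeking behavior of mean shift can be sketched as follows. This is a minimal illustration on 2D points with a Gaussian kernel, not the joint spatial+color segmentation of Comaniciu and Meer [2002]; the function name `mean_shift_mode`, the bandwidth, and the synthetic data are assumptions.

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth=1.0, max_iter=100, tol=1e-5):
    """Move `start` uphill to a density mode of `points` by repeated mean shifts."""
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Gaussian kernel weights: nearby points pull harder than distant ones.
        sq_dist = np.sum((points - x) ** 2, axis=1)
        weights = np.exp(-0.5 * sq_dist / bandwidth ** 2)
        new_x = weights @ points / weights.sum()
        # The mean-shift vector is new_x - x; stop once it becomes negligible.
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x

rng = np.random.default_rng(0)
cluster_a = rng.normal([0.0, 0.0], 0.3, size=(100, 2))
cluster_b = rng.normal([5.0, 5.0], 0.3, size=(100, 2))
points = np.vstack([cluster_a, cluster_b])

# Two different starting points converge to the modes of their own clusters.
mode_a = mean_shift_mode(points, start=[0.8, -0.6])
mode_b = mean_shift_mode(points, start=[4.2, 5.5])
print(mode_a, mode_b)
```

Clustering then amounts to assigning every point to the mode its own trajectory converges to.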

Page 21: Object tracking a survey

5.2-Image Segmentation Using Graph-Cuts

• A cut in a graph is a set of edges whose removal disconnects the graph.

• Image segmentation can also be formulated as a graph partitioning problem, where the vertices (pixels), V = {u, v, . . .}, of a graph (image), G, are partitioned into N disjoint subgraphs (regions), $A_i$, such that $\bigcup_{i=1}^{N} A_i = V$ and $A_i \cap A_j = \emptyset$ for $i \neq j$.

• A limitation of the minimum cut is its bias toward oversegmenting the image.

• Shi and Malik [2000] propose the normalized cut to overcome the oversegmentation problem.

[Figure: normalized cut]
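For reference, the normalized cut criterion of Shi and Malik [2000] for a two-way partition of V into A and B can be written as below. The symbol w(u, v), denoting the weight of the edge between vertices u and v, is notation assumed here, since the slides do not define it.

```latex
\mathrm{cut}(A, B) = \sum_{u \in A,\; v \in B} w(u, v),
\qquad
\mathrm{assoc}(A, V) = \sum_{u \in A,\; t \in V} w(u, t),
\]
\[
\mathrm{Ncut}(A, B) =
  \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} +
  \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}.
```

Dividing by the total connection of each side to the whole graph penalizes cutting off small isolated regions, which is the bias of the plain minimum cut.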

Page 22: Object tracking a survey

Active Contours

o Object segmentation is achieved by evolving a closed contour to the object's boundary, such that the contour tightly encloses the object region.

o The concept of active contour models was first introduced in 1987.

o Active contour models are also called snakes.

o Snakes do not solve the entire problem of finding contours in images, since the method requires knowledge of the desired contour shape beforehand.


Page 23: Object tracking a survey

6.Supervised Learning

• In supervised learning we are given a data set for which the correct outputs are already known, on the assumption that there is a relationship between the input and the output.

• Supervised learning methods generate a function that maps inputs to desired outputs.

• Learning different object views waives the requirement of storing a complete set of templates.

• Supervised learning methods require a large collection of manually labeled samples from each object class.

• A possible approach to reducing the amount of labeled data is to combine cotraining with supervised learning [Blum and Mitchell 1998].

The workflow is: build the model, train the model, and test the model. As an analogy, suppose a student wants to learn machine learning.
1 - The student plays the role of the model.
2 - The teacher teaches the student machine learning using some resources. This is the training process, in which we train our model with past/current data.
3 - At the end of the course, the teacher may test the student's knowledge to check how well the student has done. This is the testing process.

Page 24: Object tracking a survey

Cotraining Means

In the case of web-page classification, you build one model on the URL features of a web page and a different model on its text features. The idea is that these models are complementary to one another and can help “correct” each other, since they are each likely to make different mistakes. Generally, this process is run iteratively until some convergence criterion is met. It works well if certain assumptions hold, such as that the two views are independent and that each is sufficient for learning the class targets.

Page 25: Object tracking a survey

6.1-Adaptive Boosting(Classifiers)

• Boosting is an iterative method of finding a very accurate classifier by combining many base classifiers.

• In each iteration, the boosting mechanism selects the base classifier that gives the least error.

• The algorithm then encourages the selection of another classifier that performs better on the misclassified data in the next iteration.

• In 2003, Viola et al. used the Adaboost framework to detect pedestrians. In their approach, perceptrons were chosen as the weak classifiers.

• The individual learners can be weak, but as long as the performance of each one is slightly better than random guessing, the final model can be proven to converge to a strong learner.
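The boosting loop described above can be sketched as follows. For brevity this sketch uses single-feature threshold classifiers (decision stumps) as the weak learners rather than the perceptrons of Viola et al.; the function names and the toy data set are assumptions made for illustration.

```python
import numpy as np

def stump_predict(X, j, thr, sign):
    """Weak classifier: +sign where feature j >= thr, -sign elsewhere."""
    return np.where(X[:, j] >= thr, sign, -sign)

def train_stump(X, y, w):
    """Select the stump with the least weighted error."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                err = np.sum(w[stump_predict(X, j, thr, sign) != y])
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, rounds):
    w = np.full(len(y), 1.0 / len(y))      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, j, thr, sign = train_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # vote of this weak classifier
        # Increase the weight of misclassified samples for the next round.
        w = w * np.exp(-alpha * y * stump_predict(X, j, thr, sign))
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def predict(ensemble, X):
    score = sum(a * stump_predict(X, j, t, s) for a, j, t, s in ensemble)
    return np.sign(score)

# Toy problem: the positive class is an interval, which no single stump can separate.
X = np.arange(8, dtype=float).reshape(-1, 1)
y = np.array([-1, -1, 1, 1, 1, 1, -1, -1])
ensemble = adaboost(X, y, rounds=3)
print(predict(ensemble, X))
```

On this toy set, the best single stump misclassifies two of the eight samples, while the weighted vote of three stumps classifies all of them correctly.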

Page 26: Object tracking a survey

6.2-Support Vector Machines

A support vector machine (SVM) is a classifier that separates data into two classes by finding the maximum marginal hyperplane that separates one class from the other [Boser et al. 1992].

In the context of object detection, Papageorgiou et al. [1998] use SVM for detecting pedestrians and faces in images.
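For linearly separable data, the maximum-margin hyperplane mentioned above is commonly stated as the optimization problem below. The notation is assumed here rather than taken from the slides: training samples $x_i$ with labels $y_i \in \{-1, +1\}$, hyperplane normal $w$, and offset $b$.

```latex
\min_{w,\, b} \; \frac{1}{2} \lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, \dots, n.
```

The margin between the two classes is $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes the margin; a new sample $x$ is classified by the sign of $w^{\top} x + b$.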

Page 27: Object tracking a survey

7.Taxonomy and Categories of tracking methods


Page 28: Object tracking a survey

Continued…


Page 29: Object tracking a survey

Point Tracking

Point tracking is divided into two broad categories:

• Deterministic methods

• Statistical methods

Page 30: Object tracking a survey

7.1-Deterministic Methods

• Deterministic methods define a cost function made up of constraints such as maximum velocity, common motion, and rigidity.

• This cost function must then be minimized to establish the point correspondences.

• A greedy algorithm that iteratively optimizes point correspondences can be used for this [26].

• This algorithm is based on the algorithm used in a paper by Sethi and Jain.

• The algorithm is modified in [26] to preserve more motion information so that point measurements are not missed.

The constraints are:

• Proximity assumes that the location of the object does not change notably from one frame to the next.
• Maximum velocity defines an upper bound on the object velocity and limits the possible correspondences to the circular neighborhood around the object.
• Small velocity change (smooth motion) assumes that the direction and speed of the object do not change drastically.
• Common motion constrains the velocity of objects in a small neighborhood to be similar. This constraint is suitable for objects represented by multiple points.

Page 31: Object tracking a survey

Continued…

• Rigidity assumes that objects in the 3D world are rigid; therefore, the distance between any two points on the actual object will remain unchanged (see Figure 10(e)).
• Proximal uniformity is a combination of the proximity and the small velocity change constraints.

Note that these constraints are not specific to the deterministic methods; they can also be used in the context of point tracking using statistical methods.
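How constraints turn into a correspondence cost can be sketched with a simple greedy matcher that uses only the proximity and maximum velocity constraints. This is a simplified illustration, not the algorithm of Sethi and Jain or of [26]; the function name `greedy_correspondence` and the sample points are assumptions.

```python
import numpy as np

def greedy_correspondence(prev_pts, curr_pts, max_velocity):
    """Match points between two frames, cheapest pair first.

    Cost is the displacement (proximity); pairs that would require a
    displacement above `max_velocity` are never considered.
    """
    prev_pts = np.asarray(prev_pts, dtype=float)
    curr_pts = np.asarray(curr_pts, dtype=float)
    # dist[i, j] = distance from previous point i to current point j.
    dist = np.linalg.norm(prev_pts[:, None, :] - curr_pts[None, :, :], axis=2)
    candidates = sorted(
        (dist[i, j], i, j)
        for i in range(len(prev_pts))
        for j in range(len(curr_pts))
        if dist[i, j] <= max_velocity
    )
    matches, used_curr = {}, set()
    for _, i, j in candidates:
        if i not in matches and j not in used_curr:
            matches[i] = j          # one-to-one assignment
            used_curr.add(j)
    return matches

prev_pts = [(0, 0), (10, 10), (20, 0)]
curr_pts = [(11, 10), (1, 1), (40, 40)]
matches = greedy_correspondence(prev_pts, curr_pts, max_velocity=5.0)
print(matches)
```

The third previous point stays unmatched because every candidate lies outside its circular neighborhood, which is how a missed detection or an exiting object would show up.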

Page 32: Object tracking a survey

7.2-Statistical Methods

o Statistical methods model uncertainties to handle noise in an image. A well-known method for statistical point tracking is multiple hypothesis tracking (MHT). A set of hypotheses is maintained for an object, and a prediction of the object's position is made for each hypothesis. The hypothesis whose prediction is the most likely is chosen for tracking.

o MHT is used in [Fieguth, P. & Terzopoulos] in order to overcome occlusion.

o For tracking single objects, the Kalman filter and particle filters are used. The Kalman filter is limited to linear systems and uses prediction and correction to estimate an object's motion.

o In [18], initialization of the particle filter was done using an algorithm based on Support Vector Machines. The results of that study showed that using color distributions along with particle filtering is very effective in tracking fast-moving, non-rigid objects.

o These methods have been used extensively for tracking contours [Isard and Blake 1998], activity recognition [Vaswani et al. 2003], object identification [Zhou et al. 2003], and structure from motion [Matthies et al. 1989].
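The prediction and correction steps of the Kalman filter can be sketched for a point moving along one axis. The constant-velocity motion model, the noise covariances `Q` and `R`, and the simulated measurements are assumptions chosen for illustration.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/correct cycle of a linear Kalman filter."""
    # Prediction: propagate the state and its uncertainty through the motion model.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correction: blend the prediction with the measurement z.
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state = [position, velocity]
H = np.array([[1.0, 0.0]])              # only the position is measured
Q = 1e-4 * np.eye(2)                    # process noise (assumed)
R = np.array([[0.25]])                  # measurement noise variance (assumed)

rng = np.random.default_rng(0)
x = np.array([0.0, 0.0])                # initial guess
P = 10.0 * np.eye(2)                    # large initial uncertainty
for t in range(1, 31):
    true_pos = 2.0 * t                  # object moves 2 units per frame
    z = np.array([true_pos + rng.normal(0.0, 0.5)])
    x, P = kalman_step(x, P, z, F, H, Q, R)

print("estimated position and velocity:", x)
```

Although only noisy positions are observed, the filter also recovers the velocity, which is what allows it to keep predicting the position when measurements are missing.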

Page 33: Object tracking a survey

8.Kernel Tracking

o Kernel tracking represents the object as a geometric shape, called a kernel, and estimates the motion of this kernel in consecutive frames.

o Kernel tracking is commonly used to track a single object. A basic approach uses brute force to search an image for the region that matches the template from the previous image [28].

o The brute-force search makes this method computationally expensive, but this can be overcome by optimizations such as limiting the search to a certain region.

o Mean-shift can be used in place of template matching, which eliminates the need for a brute-force search. Mean shift was first introduced in 1975 by Fukunaga and Hostetler. It is an iterative algorithm that shifts a point towards the average of the other points in its neighborhood.

o A limitation of kernel tracking is that parts of the background may appear inside the kernel, but this can be overcome by placing the kernel inside the object instead of around it.

o We divide these tracking methods into two subcategories based on the appearance representation used.

Page 34: Object tracking a survey

8.1 Tracking single objects Approaches

• Template matching is a common approach; it is a brute-force method of searching the image.

• A limitation of template matching is its high computation cost due to the brute-force search.

• Other object representations can be used for tracking; for example, color histograms or mixture models can be computed from the appearance of pixels inside rectangular or ellipsoidal regions.

• Fieguth and Terzopoulos [1997] generate object models by finding the mean color of the pixels inside the rectangular object region. To reduce computational complexity, they search for the object in eight neighboring locations.

• Comaniciu and Meer [2003] use a weighted histogram computed from a circular region to represent the object. Instead of performing a brute-force search, they use the mean-shift procedure.

• Jepson et al. [2003] propose an object tracker that tracks an object as a three-component mixture, consisting of the stable appearance features, transient features, and noise process.
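Brute-force template matching, and the optimization of limiting the search to a region around the previous position, can be sketched as follows. The sum of squared differences (SSD) is used as the similarity measure; the function name `match_template_ssd`, its parameters, and the random test image are assumptions made for illustration.

```python
import numpy as np

def match_template_ssd(image, template, center=None, radius=None):
    """Return the top-left position with the lowest SSD and that SSD.

    Without `center`/`radius`, every position is tested (brute force).
    With them, only positions within `radius` of the previous top-left
    position `center` are tested.
    """
    ih, iw = image.shape
    th, tw = template.shape
    rows = range(ih - th + 1)
    cols = range(iw - tw + 1)
    if center is not None and radius is not None:
        r0, c0 = center
        rows = range(max(0, r0 - radius), min(ih - th, r0 + radius) + 1)
        cols = range(max(0, c0 - radius), min(iw - tw, c0 + radius) + 1)
    best_pos, best_ssd = None, np.inf
    for r in rows:
        for c in cols:
            patch = image[r:r + th, c:c + tw]
            ssd = np.sum((patch - template) ** 2)
            if ssd < best_ssd:
                best_pos, best_ssd = (r, c), ssd
    return best_pos, best_ssd

rng = np.random.default_rng(0)
image = rng.random((40, 40))
template = image[12:20, 25:33].copy()     # the object as seen in the previous frame

full_pos, _ = match_template_ssd(image, template)
local_pos, _ = match_template_ssd(image, template, center=(10, 24), radius=4)
print(full_pos, local_pos)
```

Both searches find the same location, but the restricted search evaluates 81 candidate positions instead of 1089, which illustrates why limiting the search region reduces the computation cost.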

Page 35: Object tracking a survey

Examples

• In 1994, Shi and Tomasi proposed the KLT tracker.

[Figure: results of the robust online tracking method by Jepson et al. [2003].]

[Figure: tracking features using the KLT tracker.]

Page 36: Object tracking a survey

8.2 Tracking Multiple Objects

One proposed method is based on modeling the whole image, I^t, as a set of layers. This representation includes a single background layer and one layer for each object. Each layer consists of shape priors (ellipse), a motion model (translation and rotation), and layer appearance, A (intensity modeled using a single Gaussian).

Isard and MacCormick [2001] propose joint modeling of the background and foreground regions for tracking. The background appearance is represented by a mixture of Gaussians.

The appearance of all foreground objects is also modeled by a mixture of Gaussians.

Kernel trackers can be compared based on whether they track single or multiple objects, their ability to handle occlusion, their requirement for training, and the type of motion model.

Page 37: Object tracking a survey

9. Silhouette Tracking

Objects may have complex shapes; for example, hands, head, and shoulders cannot be well described by simple geometric shapes. Silhouette-based methods provide an accurate shape description for these objects.

Silhouette trackers find the object region in each frame by means of an object model generated from the previous frames. This model can be in the form of a color histogram, object edges, or the object contour. We divide silhouette trackers into two categories: shape matching and contour tracking.

Shape matching can be performed similarly to tracking based on template matching, where an object silhouette and its associated model are searched for in the current frame.

The search is performed by computing the similarity of the object with the model generated from the hypothesized object silhouette based on the previous frame.

In 1993, Huttenlocher et al. performed shape matching using an edge-based representation.

Another approach to matching shapes is to find corresponding silhouettes detected in two consecutive frames. Establishing silhouette correspondence, or in short silhouette matching, can be considered similar to the point matching discussed under point tracking.

Page 38: Object tracking a survey

Silhouette Tracking Categories

Contour tracking methods, in contrast to shape matching methods, iteratively evolve an initial contour in the previous frame to its new position in the current frame. This contour evolution requires that some part of the object in the current frame overlap with the object region in the previous frame.

Silhouette tracking is employed when tracking of the complete region of an object is required.

Page 39: Object tracking a survey

10.Resolving Occlusion

o Occlusion falls into three categories: self occlusion, interobject occlusion, and occlusion by the background scene structure.

o Self occlusion occurs when one part of the object occludes another. This situation most frequently arises while tracking articulated objects.

o For interobject occlusion, multiobject trackers (MOT) like MacCormick and Blake [2000] and Elgammal et al. [2002] can exploit knowledge of the objects' positions.

o A common approach to handling complete occlusion is to model the object motion by linear dynamic models or by nonlinear dynamics and to keep predicting the object location until the object reappears.

o A nonlinear dynamic model is used in Isard and MacCormick [2001], and a particle filter is employed for state estimation.

o Other features, for example silhouette projections and optical flow, have also been utilized to resolve occlusion.

o Yilmaz et al. [2004] build online shape priors using a mixture model based on the level set contour representation. Their approach is able to handle complete object occlusion.

Page 40: Object tracking a survey

11.Future Direction

o Significant progress has been made in the last few years, and many trackers have been developed.

o The survey shows that common assumptions, such as smoothness of motion, minimal amount of occlusion, illumination constancy, and high contrast with respect to the background, are violated in many realistic scenarios, so more robust trackers are needed.

o For tracking, the associated problems of feature selection, object representation, dynamic shape, and motion estimation are very active areas of research, and new solutions are continuously being proposed.

Challenges:

1: One challenge is to develop algorithms for tracking objects in unconstrained videos, such as broadcast and home videos, which are noisy, compressed, and acquired by moving cameras from multiple views.

2: In formal and informal meetings, many people share a small field of view, so severe occlusion occurs. One solution is to employ audio for tracking.

Another consideration while developing tracking algorithms is the integration of contextual information. In a vehicle tracking application, the location of vehicles should be constrained to paths on the ground as opposed to vertical walls or the sky. Recent work in the area of object recognition [Torralba 2003; Kumar and Hebert 2003] has shown that exploiting contextual information is helpful in recognition.

Page 41: Object tracking a survey

Future Direction

• In addition, advances in classifiers [Friedman et al. 2000; Tipping 2001] have made accurate detection of scene context possible. A tracker that takes advantage of contextual information is expected to perform better.

• The feature set used for tracking also affects performance; good features discriminate between multiple objects and between the objects and the background.

• A wide range of feature selection algorithms has been investigated, but these algorithms require offline training information about the target. Collins and Liu [2003] have done some work on online feature selection, but the selection of feature sets remains an unsolved problem.

• One interesting direction that has largely been unexplored is the use of semisupervised learning techniques for modeling objects.

• Kalman Filters [Bar-Shalom and Foreman 1988], JPDAFs [Cox 1993], HMMs [Rabiner 1989], and Dynamic Bayesian Networks (DBNs) [Jensen 2001] have been extensively used to estimate object motion parameters.

• Overall, we believe that additional sources of information, in particular prior and contextual information, should be exploited.