Tracking multiple moving objects by binary object forest segmentation



David Nichol and Merrilyn Fiebig*

In an earlier study it was shown that the low-level image segmentation technique known as binary object forest (BOF) analysis could be used successfully to extract one or two moving objects from complex backgrounds, even when the motion involved was very large. The method involved performing BOF analysis on each of a pair of images from a sequence and then matching the vertices of the resulting graphs. In the present study the problem of tracking multiple objects in complex backgrounds, and in difficult circumstances such as partial occlusion, is considered. The approach taken is once again to perform an initial BOF analysis of each image, but now to attempt matching over subgraphs of the BOF rather than simply on individual vertices. It is shown theoretically and experimentally that this results in a much more robust matching scheme. This increase in robustness not only allows multiple objects to be tracked but also facilitates correct matching even when partial object occlusion occurs and when motion towards the sensor results in large (apparent) size changes between frames.

Keywords: disparity analysis, image segmentation, motion detection, binary object forest

The binary object forest (BOF) is a low-level approach to image segmentation based on slicing a grey-level image into a set of binary images and finding topological relationships between connected regions (binary objects or atoms) in these binary images. It has been shown¹,² that these relationships can be described as a set of trees where each vertex corresponds to an atom. These trees constitute the binary object forest. The process of deriving the BOF is illustrated in Figures 1 and 2. The application we considered elsewhere was the problem of finding a moving object in a complex background where the object could move a considerable distance between frames. (To avoid any confusion between moving objects and binary objects the latter will be subsequently referred to as atoms.) The procedure used was to derive the BOF for each of the two images and then compare their vertices to obtain a set of possible matches. These matches were classified as corresponding to background or moving object. Essentially, only the size and grey level of the atoms were used for comparison, and mismatches can occur. As in most feature matching methods³, a ‘local-smoothness’ rule was used to reduce such mismatches.

Information Technology Division, Electronics Research Laboratory, *Optoelectronics Division, Surveillance Research Laboratory, DSTO, Salisbury, SA 5108, Australia
© Crown copyright 1991
Paper received: 20 December 1989. Revised paper received: January 1991

In the present paper, a more complex matching method is proposed. This seeks matches between subgraphs rather than individual nodes of the two BOF. The major advantage of this is that matching becomes much more robust. This increase in robustness enables several moving objects to be tracked in complex backgrounds, and correct matches can still be obtained even with large changes in position and in object size between frames. Large changes in position are very difficult for ‘optical flow’ (gradient) methods, and large changes in size may arise if the object has a significant velocity component in the direction of the observer. In addition, the increased robustness largely removes the need for postprocessing. A second type of benefit arises because, instead of extracting individual disparity vectors, the subgraph matching automatically produces matched groups of vectors. In this sense it partially performs the object extraction stage of processing. It may be thought that by increasing frame sampling the need to process sequences with larger interframe disparities can be averted. Whilst this may be true for some applications it is certainly not true for all. For example, a ‘pushbroom’ scanning sensor only acquires one frame per pass, and revisits may be days apart in the case of satellite systems. Similarly, ‘pop up’ surveillance only yields frames intermittently.

It is usually the case that objects in a two-dimensional image (i.e. projection) of a scene appear to have a boundary which encloses ‘internal substructure’. For example, an automobile appears to consist of internal structures such as windows, doors, wheels and panels which are enclosed by the vehicle outline. Now the extraction of the boundary of such an object when viewed against a complex background is very difficult if the latter includes similar grey levels to the object. However, the internal substructure is not affected by the background. If the BOF subgraphs used for matching correspond to such substructure then matching should be very robust. For example, the topological and other information about the substructure produced by BOF analysis is essentially invariant to (planar) translations and rotation. Also, it is invariant to changes in object size between images, and there should thus also be some robustness to non-planar rotation. Of course, for certain object geometries changes in aspect may be abrupt rather than graceful. To fully accommodate such changes, prior knowledge of the three-dimensional nature of the object is required. BOF analysis does not assume the existence of such knowledge and, not surprisingly, some matches may be missed in such difficult cases. However, if an object contains several nodes it is unlikely they would all vanish simultaneously, and sufficient matches may be found to enable tracking to continue even during severe changes of aspect. In addition, as the matching procedure operates hierarchically, it may still be possible to obtain matches even if an object is partially occluded in one (or both) frames. Obviously, it will no longer be possible to extract all boundary atoms, but there may still be sufficient information present to produce a correct match.

The BOF approach facilitates the extraction of this internal topological information because all region relationships are fully described by trees. As a tree is a graph without cycles, it is easier in principle to search an image thus segmented than, say, a region adjacency graph segmentation.

Figure 1. Grey-level image decomposed using three thresholds into three binary images. The region enclosure tree (RET) for each binary slice is also shown

Figure 2. Object subset trees (OST) showing the topological relationships between binary atoms in adjacent slices for the image shown in Figure 1. The binary object forest (BOF) of an image is the set of its RET and OST

0262-8856/91/006362-10 © 1991 Butterworth-Heinemann Ltd. Image and Vision Computing, vol 9 no 6, December 1991, page 362

SUMMARY OF BOF ANALYSIS

The first stage of BOF analysis is to slice the images at $K$ thresholds $(\lambda_1, \lambda_2, \ldots, \lambda_k, \ldots, \lambda_K)$, where $\lambda_1 < \lambda_2 < \ldots < \lambda_k < \ldots < \lambda_K$. This results in $K$ binary-valued images $W_1(x), W_2(x), \ldots, W_k(x), \ldots, W_K(x)$, where $W_k(x)$ is set to 1 if $f(x) \geq \lambda_k$ and 0 otherwise. Figure 1 shows the set of binary images which result from slicing a simple grey-level image. The choice of thresholds has been discussed previously, and the usual approach is to set them to be equispaced within two standard deviations of the mean. This works well for most images, but an image pair containing a lot of gentle gradients may not match particularly well due to problems such as region growth. In such cases increasing the number of thresholds, and allowing matching between different thresholds, leads to a significant improvement.

Next the elementary connected regions (atoms) in each slice are found. These are of two types: 1-atoms, which consist of pixels with truth value 1, and 0-atoms, which consist of pixels with truth value 0. All atoms are labelled and elementary statistical information, such as size and centroid position, is recorded for each atom. It was shown that the topological relationships of enclosure in a given slice, and subset relationships (between adjacent slices), can be described by a series of $K + 2$ trees which constitute the binary object forest. As illustrated in Figures 1 and 2, the BOF consists of $K$ region enclosure trees (RET) and two object subset trees (OST). As each atom is enclosed by, and is a subset of, only one other atom, it is very convenient from a computational viewpoint to store both the statistical and the topological information as a series of one-dimensional arrays indexed by atom label.
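The slicing and atom-labelling stage can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the placement of the thresholds strictly inside the two-standard-deviation interval, and the use of 4-connectivity for both truth values are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, pstdev


def thresholds(image, K):
    """K equispaced thresholds within two standard deviations of the mean."""
    pixels = [v for row in image for v in row]
    mu, sigma = mean(pixels), pstdev(pixels)
    lo, hi = mu - 2 * sigma, mu + 2 * sigma
    # Assumption: thresholds sit strictly inside (lo, hi).
    return [lo + (hi - lo) * (k + 1) / (K + 1) for k in range(K)]


def slice_image(image, lam):
    """Binary slice W_k: 1 where f(x) >= lambda_k, 0 otherwise."""
    return [[1 if v >= lam else 0 for v in row] for row in image]


def label_atoms(binary):
    """Label connected regions (atoms) of both truth values by flood fill.

    Returns (labels, atoms). atoms[i] records truth value, size and centroid.
    Assumption: 4-connectivity for 1-atoms and 0-atoms alike.
    """
    h, w = len(binary), len(binary[0])
    labels = [[-1] * w for _ in range(h)]
    atoms = []
    for y0 in range(h):
        for x0 in range(w):
            if labels[y0][x0] != -1:
                continue
            truth, ident = binary[y0][x0], len(atoms)
            labels[y0][x0] = ident
            queue = deque([(y0, x0)])
            size = sum_x = sum_y = 0
            while queue:
                y, x = queue.popleft()
                size, sum_x, sum_y = size + 1, sum_x + x, sum_y + y
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and binary[ny][nx] == truth):
                        labels[ny][nx] = ident
                        queue.append((ny, nx))
            atoms.append({"truth": truth, "size": size,
                          "centroid": (sum_x / size, sum_y / size)})
    return labels, atoms


# A bright ring on a dark background: one outer 0-atom, one 1-atom, one inner 0-atom.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 0, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0],
]
labels, atoms = label_atoms(slice_image(image, 5))
```

Storing each atom's statistics in a list indexed by label mirrors the one-dimensional arrays described above; the enclosure and subset relationships would be further arrays of the same length.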

INTERNAL SUBSTRUCTURE AND THE SET OF SUBATOMS

Underlying concepts

In principle it is possible to determine the boundary of an object in an image, though in practice factors such as imperfect optics, sampling effects and propagation conditions may make it difficult to do so exactly. Suppose the boundary of an object $O$ which is completely within the field-of-regard of a sensor is given by a closed curve $C_O$. Suppose the rectangular image region is considered enclosed by an imaginary outer region $Z$ of unspecified truth value. Then:

Definition 1. The internal substructure of an object $O$ consists of those atoms $\alpha$ for which a path from $Z$ to $\alpha$ must pass through $C_O$. This set $S_O = \{\alpha, \beta, \gamma, \ldots\}$ is said to represent the set of internal atoms of $O$.

The set $S_O$ can be divided into two disjoint subsets. These are $S_{O1}$ and $S_{O2}$, where $S_{O1}$ consists of the primary internal atoms and $S_{O2}$ of the secondary internal atoms.

Definition 2. An internal atom $\alpha \in S_O$ is said to be primary if it is neither enclosed by, nor a subset atom of, any other atom $\in S_O$. Otherwise, $\alpha$ is a secondary atom of $S_O$.

The aim of this distinction is that some atoms, namely the set of primary atoms, can be reached from $Z$ without passing through other atoms. The secondary atoms are internal to these atoms and a path to them must pass through a primary atom. (More strictly, it must pass through pixels which comprise the outer boundary of a primary atom.) Now it is not generally the case that a single atom will coincide exactly with an object boundary, so there may well be more than one primary atom per object. This distinction can be seen in Figure 1. The atoms labelled h, i and j are all internal to the object CAR, but as i and j can only be reached via h they are secondary atoms. Atom h is primary because, although h is a subset atom of p, this latter atom is not internal to CAR due to the fact that it is merged with part of the object TREE, and therefore can be reached without crossing the boundary of CAR. Similarly, atoms k and l are primary atoms of CAR, and n and o are secondary.

Definition 3. The set of subatoms of an atom $\alpha$ whose outer boundary is the closed curve $C_\alpha$ consists of those atoms $\beta$ for which a path from $Z$ to $\beta$ must pass through $C_\alpha$.

Obviously, this is a very similar definition to that for internal substructure. The difference is that for internal substructure the outer boundary always corresponds to an object, whereas for subatoms it corresponds to an atom. From the above definitions it follows that:

Lemma 1. The set of secondary atoms of an object $O$ is equivalent to the set of subatoms of the primary atoms of $O$.

In general, of course, we do not know a priori where object boundaries are located in images. However, in the matching scheme discussed below it is proposed to seek matches between sets consisting of atoms and their subatoms. If good matches are obtained which correspond to significant shifts between frames then this set will be presumed to be part of the internal substructure of a moving object. Before matching can occur the set of subatoms must be extracted from the BOF.

Extracting subatom sets from the BOF

Suppose the top-level atom $\alpha$ which it is wished to match occurs in slice $k_\alpha$, where $0 < k_\alpha \leq K$. Suppose the set $S_\alpha$ contains the subatoms of $\alpha$. Then it is obvious that:

Lemma 2. All atoms in slice $k_\alpha$ which $\in S_\alpha$ can be found by noting the vertices of the branch of the region enclosure tree which has $\alpha$ as a root.

All vertices found correspond to subatoms of $\alpha$ in slice $k_\alpha$. However, other subatoms may be located in different slices. To see how these can be identified the following two theorems are needed:

Theorem 1. All 1-atoms $\in S_\alpha$ at level $k + 1$, where $k \geq k_\alpha$, are subset atoms of 1-atoms $\in S_\alpha$ at level $k$, and all 0-atoms $\in S_\alpha$ at level $k' - 1$, where $k' \leq k_\alpha$, are subset atoms of 0-atoms $\in S_\alpha$ at level $k'$.

Proof. Consider the set of all pixels which belong to 1-atoms $\in S_\alpha$ at level $k_\alpha$. Now as threshold $\lambda(k_\alpha + 1) > \lambda(k_\alpha)$, the set of pixels with truth value 1 within the boundary of $\alpha$ at level $k_\alpha + 1$ will be a subset of this set. All 1-atoms at level $k_\alpha + 1$ can contain only pixels from this subset; thus, by definition, they must be subset atoms of 1-atoms at level $k_\alpha$ which $\in S_\alpha$. Clearly, any subset atom of an atom $\in S_\alpha$ must also $\in S_\alpha$. This argument can be repeated for any pair of adjacent slices $k_\alpha + m + 1$ and $k_\alpha + m$, where $m \geq 1$, and thus the first part of the theorem follows by induction.

A similar argument for 0-atoms, but for slices $< k_\alpha$, proves the second part and the theorem follows. □

Theorem 2. All 0-atoms $\in S_\alpha$ at level $k$, where $k > k_\alpha$, are enclosed by 1-atoms $\in S_\alpha$ at level $k$, and all 1-atoms $\in S_\alpha$ at level $k'$, where $k' < k_\alpha$, are enclosed by 0-atoms $\in S_\alpha$ at level $k'$.

Proof. Consider a 0-atom $\beta \in S_\alpha$ in slice $k$, where $k > k_\alpha$. Now all (non-edge) 0-atoms must be enclosed by a 1-atom. Suppose the enclosing 1-atom for $\beta$ is $\gamma$ but $\gamma \notin S_\alpha$. Now $\gamma$ cannot be contained within $C_\alpha$, as it then would, by Definition 3, $\in S_\alpha$. Neither can it lie completely outside $C_\alpha$, as then $\beta \notin S_\alpha$. So $\gamma$ must lie partly inside and partly outside $C_\alpha$. Now as threshold $\lambda(k_\alpha) < \lambda(k)$, all pixels of truth value 1 in slice $k$ must also be equal to 1 in slice $k_\alpha$. This, however, implies that connected pixels of the same truth value must lie both inside and outside the atom $\alpha$. However, from the definition of an atom this is a contradiction. Thus $\gamma$ must lie completely within $C_\alpha$ and hence $\gamma \in S_\alpha$.

A similar argument for 1-atoms, but for slices $< k_\alpha$, proves the second part and the theorem follows. □

It follows from Lemma 2 and Theorems 1 and 2 that the following algorithm will extract the subatom set of any atom $\alpha$:


Algorithm 1

Step 1. Find all atoms which are enclosed by $\alpha$ and add these to the set $S_{enc1}$. Find all atoms which are enclosed by atoms $\in S_{enc1}$. Denote these atoms by $S_{enc2}$. Repeat this process until no new enclosed atoms are found. Denote the union of these sets by $S_{\alpha 1}$. At this stage all vertices of the subtree of the RET with root $\alpha$ have been found (Lemma 2).

Step 2. Find all atoms which are subset atoms of the members of $S_{\alpha 1}$. Denote these by $S_{sub1}$. Find the subset atoms of the members of $S_{sub1}$. Denote these by $S_{sub2}$. Repeat this process until no new atoms are found. Denote the union of these sets by $S_{\alpha 2}$. At this stage all 1-subset atoms $\in S_\alpha$ at levels $> k_\alpha$ and 0-subset atoms $\in S_\alpha$ at levels $< k_\alpha$ have been found (Theorem 1).

Step 3. Find all atoms which are enclosed by members of $S_{\alpha 2}$. Denote these by $S_{\alpha 3}$. From Theorem 2 this set constitutes the 0-atoms $\in S_\alpha$ in slices $> k_\alpha$ and 1-atoms $\in S_\alpha$ in slices $< k_\alpha$.

Step 4. The required subatom set is $S_\alpha = S_{\alpha 1} \cup S_{\alpha 2} \cup S_{\alpha 3}$.

Thus all members of the subatom set for a given atom can be found. This process is easily implemented if, as previously suggested, the topological relationships of the BOF are stored as one-dimensional arrays. The process then involves searching each array for the sets of enclosed and subset objects for each atom. These sets can be stored as a data structure consisting of an array of lists. When this process is complete the $k$th element of this structure represents the subatom set of atom $k$. After repeating this process for the second image, matching between these subatom sets can occur.
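A minimal sketch of Algorithm 1 over such arrays follows. The array names `enclosed_by` and `subset_of`, and the convention that -1 marks an atom with no parent, are assumptions for the example. The steps are followed literally as stated; in particular, whether $\alpha$ itself should seed Step 2 is not spelled out in the text, and here it does not.

```python
def children_lists(parent):
    """Invert a parent array (index = atom label, -1 = no parent) into child lists."""
    kids = [[] for _ in parent]
    for atom, par in enumerate(parent):
        if par >= 0:
            kids[par].append(atom)
    return kids


def descendants(seeds, kids):
    """Atoms reached from the seeds by repeatedly following child links."""
    found, stack = set(), list(seeds)
    while stack:
        for child in kids[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def subatom_set(alpha, enclosed_by, subset_of):
    """Subatom set of atom alpha, following Steps 1 to 4 of Algorithm 1."""
    encl_kids = children_lists(enclosed_by)
    subs_kids = children_lists(subset_of)
    s1 = descendants([alpha], encl_kids)      # Step 1: RET subtree below alpha
    s2 = descendants(s1, subs_kids)           # Step 2: subset atoms of S_alpha1
    s3 = set()
    for atom in s2:                           # Step 3: atoms enclosed by S_alpha2
        s3.update(encl_kids[atom])
    return s1 | s2 | s3                       # Step 4: union


# Hypothetical six-atom forest: 1 and 2 nest inside 0, 3 is a subset atom of 1,
# 4 is enclosed by 3, and 5 is unrelated.
enclosed_by = [-1, 0, 1, -1, 3, -1]
subset_of = [-1, -1, -1, 1, -1, -1]

# Array of lists: element k is the subatom set of atom k.
all_sets = [sorted(subatom_set(k, enclosed_by, subset_of))
            for k in range(len(enclosed_by))]
```

An atom whose entry in `all_sets` is empty has no subatoms, which is a quick way to separate atoms with substructure from those without.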

MATCHING

Theoretical basis of robustness

An atom in an image is simple if its subatom set is empty. Such atoms are easily found in the BOF, as they are terminal nodes of both the RET and the OST in which they occur. All other atoms are complex. The proposed disparity matching process is concerned with finding matches between complex atoms in each BOF. This is a step beyond single vertex matching¹. By using the extra information it was hoped to make the matching process more robust. A full theoretical examination of the increase in robustness is rather involved, so only sufficient detail to demonstrate the underlying principle is given in this study.

Suppose the complex atom to be matched in image 1 is denoted by $\alpha$, of truth value $p_\alpha$ and subatom set $S_\alpha = \{\alpha_1, \alpha_2, \alpha_3, \ldots\}$ of size $N_s$. Suppose the density function of atoms in image 2 is $F(s, k, p)$, where $s$ is the size of the atom in pixels, $k$ is the slice number and $p$ the truth value. Then the possible number of primary matches $n_\alpha$ for $\alpha$ in image 2 is found by summing $F(s, k, p_\alpha)$ over the range of sizes and slices allowed by the matching algorithm. Typically, under the weak constraints used to accommodate potentially large changes between images, this generates a hundred or more possible matches. Obviously, at most one match can be correct, so the question to be addressed is: can this large false alarm rate be reduced to an acceptable level by using subatom matching?

Consider any one of the subatoms $\alpha_i \in S_\alpha$, of truth value $p'$, which lies in slice $k_i$. Now the secondary matching process, described below, searches for atoms in image 2 based on similar (though tighter) constraints on size and position to the primary search. Suppose the appropriate summation limits on $F(s, k, p)$ yield $n_i$ possible matches to $\alpha_i$ in image 2. Now if it is assumed that all these matches are false (for example, the object in question has moved out of the image) then it is of interest to calculate how many complex atoms in image 2 (falsely) match the complex atom $\{\alpha, \alpha_i\}$ in image 1. To simplify the discussion suppose that all searches are restricted to the equivalent slice only in image 2, and suppose, initially, that $\alpha_i$ is a subset atom of $\alpha$. Suppose also that the $n_i$ matching atoms are randomly distributed in slice $k_i$. (However, as these atoms are subset atoms they must lie within the boundary of the $p'$ portion of this slice.) If the total area of $p'$ pixels in slice $k_i$ is $\Phi_{p'}$ (pixels), the mean size of atoms which match $\alpha$ is $\phi_\alpha$, and the mean size of atoms which match $\alpha_i$ is $\phi_i$, then the maximum number of $p'$ atoms of size $\phi_i$ which can be placed in slice $k_i$ is $M_{p'i} = \Phi_{p'} / \phi_i$, and the maximum number which can be assigned within the boundary of the $n_\alpha$ primary matches is $M_{\alpha i} = n_\alpha \phi_\alpha / \phi_i$. This is illustrated in Figure 3. This last number corresponds to the maximum number of cases where matches are obtained for both the primary and secondary atoms. That is, this is the maximum number of ‘successes’ (actually false alarms) for the matching process.

The problem thus reduces to finding the number of such successes that occur when the $n_i$ atoms are randomly distributed over allowable parts of the image. However, if a success (or failure) occurs on a given trial the number of possible successes (or failures) must be reduced by one for the next trial. The appropriate probability distribution is thus hypergeometric rather than binomial. Thus out of $n_i$ trials the probability of $m$ successes is:

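$$h(m; M_{p'i}, n_i, n_\alpha) = \binom{n_\alpha}{m} \binom{M_{p'i} - n_\alpha}{n_i - m} \Big/ \binom{M_{p'i}}{n_i}$$

The mean of this distribution is $\mu_i = n_i n_\alpha / M_{p'i}$ with variance $\sigma_i^2 = (M_{p'i} - n_i)\, n_i\, n_\alpha\, (1 - n_\alpha / M_{p'i}) / [(M_{p'i} - 1)\, M_{p'i}]$.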

Obviously, $F(s, k, p)$ is data dependent, but to illustrate the sorts of values of $\mu_i$ and $\sigma_i^2$ that arise the empirical values of the image shown in Plate 1 (see page 369) will be used. (This particular image contains a significantly greater density of atoms than the others used, so false alarms here will tend to be higher.) Part of the distribution, as a function of size, is shown in Table 1.

Figure 3. To obtain a secondary match an atom of appropriate characteristics must fall within one of the primary match candidates (image 1, left; image 2, right)

Clearly, there is a very strong inverse relationship between numbers of atoms and size. This has been found in all images processed. Using the primary and higher order matching parameters discussed in the previous section yields the following raw figures of $n'_\alpha = 784$ and $n'_i = 599$. (These correspond to $\phi_\alpha = 20$ and $\phi_i = 10$, and thus are close to the worst case from a false alarm viewpoint.) However, these numbers are the totals for all slices and both truth values. If it is assumed that these atoms are uniformly spread over the slices and truth values then these numbers must be divided by the product of the number of slices and the number of truth values. Typically, this product is 16 (8 slices and 2 truth values). Thus for this example $n_\alpha = 49$ and $n_i = 37$. Using these numbers in the above formulas gives $\mu_i = 0.28$ and a standard deviation of $\sigma_i = 0.28$. So for a complex atom containing only a single subatom the expected number of false alarms reduces from approximately 50 to less than 0.5 if a secondary match is required besides the primary one.
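The hypergeometric quantities used in this argument are easy to evaluate numerically. The sketch below uses the standard hypergeometric probability, mean and variance with the quoted $n_\alpha = 49$ and $n_i = 37$. The population size $M_{p'i}$ is not stated in the text, so the value used here is a placeholder chosen only for illustration, and the function names are likewise assumptions.

```python
from math import comb


def hypergeom_pmf(m, M, n, K):
    """P(m successes) when n items are drawn from M, of which K are successes."""
    if m < 0 or m > min(n, K):
        return 0.0
    return comb(K, m) * comb(M - K, n - m) / comb(M, n)


def hypergeom_mean(M, n, K):
    return n * K / M


def hypergeom_var(M, n, K):
    return (M - n) * n * K * (1 - K / M) / ((M - 1) * M)


n_alpha, n_i = 49, 37   # values quoted in the text
M = 6554                # HYPOTHETICAL population size; not given in the source

mu = hypergeom_mean(M, n_i, n_alpha)
var = hypergeom_var(M, n_i, n_alpha)

# Cross-check the closed forms against the distribution itself.
pmf = [hypergeom_pmf(m, M, n_i, n_alpha) for m in range(n_i + 1)]
mu_from_pmf = sum(m * p for m, p in enumerate(pmf))
var_from_pmf = sum((m - mu) ** 2 * p for m, p in enumerate(pmf))
```

With a population this much larger than the number of trials, the mean and the variance are nearly equal, so the expected number of false alarms stays well below one whenever $n_i n_\alpha$ is small compared with $M_{p'i}$.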

Now as many complex atoms have more than a single subatom there may be the opportunity to reduce false alarms further. Suppose atom $\alpha$ has $N_s$ subatoms; then the question becomes: what is the probability of obtaining $m$ ($0 < m \leq N_s$) or more matching subatoms in addition to the primary match? To simplify the analysis assume that $n_i = n^*$ and $\phi_i = \phi^*$ $\forall\, 1 \leq i \leq N_s$. With this simplification the expected number of successes for each random distribution of the $n^*$ subatoms is identical and given by $\nu^* = n^* n_\alpha / N^*$, where $N^* = \Phi_{p'} / \phi^*$. As $\nu^*$ is the mean number of hits in $n^*$ trials, it is reasonable to set the probability of a success over all trials for a given subatom to $p^* = \nu^* / n^*$. (For the above example $p^* = 0.0075$.) Assuming each matching of subatoms is independent, the probability $b(m; N_s, p^*)$ of obtaining $m$ or more matches is given by summing the binomial formula. Thus:

$$b(m; N_s, p^*) = \sum_{k=m}^{N_s} \binom{N_s}{k} (p^*)^k (1 - p^*)^{N_s - k}$$

Table 2 shows the probabilities for obtaining m or more matches randomly for various N, and p *.

The power of secondary matching is clear from Table 2. Even for a highly pessimistic $p^* = 0.1$ the probabilities of mismatches become acceptably low if sufficient numbers of submatches are required a priori. For the case of $p^* = 0.01$, which is more realistic though still pessimistic, requiring only a single match in addition to the primary match leads to an acceptable false alarm rate. This leads to confidence in the robustness of the matching scheme which is now described.
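The tail probability $b(m; N_s, p^*)$ is a plain binomial sum, so the entries of Table 2 can be recomputed directly. The short sketch below does that; the function name `binomial_tail` is an assumption made for the example.

```python
from math import comb


def binomial_tail(m, n_s, p):
    """Probability of m or more successes in n_s independent trials."""
    return sum(comb(n_s, k) * p**k * (1 - p) ** (n_s - k)
               for k in range(m, n_s + 1))


# Recompute the rows of Table 2 for both values of p*.
for p_star in (0.01, 0.1):
    for n_s in (1, 2, 4, 8):
        row = [binomial_tail(m, n_s, p_star) for m in range(min(n_s, 6) + 1)]
        print(p_star, n_s, " ".join(f"{v:.4f}" for v in row))
```

Evaluated this way, most entries agree with Table 2 to four decimal places. A few differ slightly: for example, the sum gives 0.5695 for $N_s = 8$, $p^* = 0.1$, $m = 1$, where the table as extracted reads 0.5895.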

Table 1. Number of atoms falls off exponentially with atom size

Atom size   1      2      3      4      ...   8     16    32    64    128
Count       5593   2907   1690   1326   ...   448   104   30    5     5

Table 2. Probability of obtaining $m$ or more false-matching subatoms in image 2 given $N_s$ subatoms in object $\alpha$ in image 1. The probability of a single mismatch is $p^*$

              Minimum number of successes m
N_s     0        1        2        3        4        5        6

p* = 0.01
1       1.0000   0.0100
2       1.0000   0.0200   0.0001
4       1.0000   0.0394   0.0005   0.0000   0.0000
8       1.0000   0.0772   0.0027   0.0001   0.0000   0.0000   0.0000

p* = 0.1
1       1.0000   0.1000
2       1.0000   0.1900   0.0100
4       1.0000   0.3439   0.0523   0.0037   0.0001
8       1.0000   0.5895   0.1869   0.0381   0.0050   0.0004   0.0000

Matching process

Step 1. Assembly of complex atoms. Find the set of subatoms for each non-edge atom $\alpha$ in image 1 using Algorithm 1. This process is then repeated for image 2.

Step 2. Finding primary matches. For each complex atom $\alpha$ in image 1 find the set of possible matches in image 2. Such matches will be referred to as first order or primary, and are selected on the basis of comparative area, slice, truth value and position. If the number of subatoms, area, slice number, truth value and centroid of an atom $\xi$ are denoted by $\text{sub}(\xi)$, $\text{area}(\xi)$, $\text{slice}(\xi)$, $\text{truth}(\xi)$ and $(x_\xi, y_\xi)$, respectively, then an atom $\beta$ in image 2 is accepted as a possible match for $\alpha$ if the following five primary matching goals are met:

Goal 1: $\text{area}(\alpha) \geq \text{min1}$ AND $\text{sub}(\alpha)/\text{fact1} \leq \text{sub}(\beta) \leq \text{sub}(\alpha) \times \text{fact1}$, where $\text{fact1} \geq 1$ and $\text{min1} > 0$

Goal 2: $\text{area}(\alpha)/\text{fact2} \leq \text{area}(\beta) \leq \text{area}(\alpha) \times \text{fact2}$, where $\text{fact2} \geq 1$

Goal 3: $\text{slice}(\alpha) - g1 \leq \text{slice}(\beta) \leq \text{slice}(\alpha) + g2$, where $g1 \geq 0$ and $g2 \geq 0$

Goal 4: $\text{truth}(\alpha) = \text{truth}(\beta)$

Goal 5: $|x_\beta - x_\alpha| < dx1$ AND $|y_\beta - y_\alpha| < dy1$, where $dx1$ and $dy1$ are $\geq 0$

Notes on Step 1 and Step 2

From Table 1 it is clear that many possible matches occur for very small atoms. To reduce searching time, and minimize false alarms, min1 in Goal 1 should be set as large as possible without risking oversights. For the examples shown a value of 20 was chosen.

Typically, in Goal 2 fact2 is chosen to be 2.0, which allows significant changes in object size to be accommodated. This is, however, achieved at the expense of a significant increase in the numbers of primary matches which have to be processed. In the examples discussed in the present paper only atoms of the same slice are compared; thus $g1 = g2 = 0$ in Goal 3. As one of the major aims of BOF disparity analysis is to allow for great variation in object position, no limits are placed on possible object shifts. Thus effectively $dx1 = dy1 = \infty$ in Goal 5.
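The five primary goals translate directly into a predicate on a pair of atoms. The sketch below is illustrative only: the `Atom` record and parameter container are hypothetical, and the default for `fact1` is a placeholder because the text gives no typical value for it. The other defaults follow the notes above.

```python
from dataclasses import dataclass
from math import inf


@dataclass
class Atom:
    area: int        # size in pixels
    sub: int         # number of subatoms
    slice_no: int    # slice in which the atom occurs
    truth: int       # truth value, 0 or 1
    x: float         # centroid x
    y: float         # centroid y


@dataclass
class PrimaryParams:
    min1: int = 20
    fact1: float = 2.0   # HYPOTHETICAL: no typical value is given in the text
    fact2: float = 2.0
    g1: int = 0
    g2: int = 0
    dx1: float = inf     # no limit on shift
    dy1: float = inf


def primary_match(a, b, p):
    """True if atom b (image 2) is a possible primary match for atom a (image 1)."""
    goal1 = a.area >= p.min1 and a.sub / p.fact1 <= b.sub <= a.sub * p.fact1
    goal2 = a.area / p.fact2 <= b.area <= a.area * p.fact2
    goal3 = a.slice_no - p.g1 <= b.slice_no <= a.slice_no + p.g2
    goal4 = a.truth == b.truth
    goal5 = abs(b.x - a.x) < p.dx1 and abs(b.y - a.y) < p.dy1
    return goal1 and goal2 and goal3 and goal4 and goal5


params = PrimaryParams()
alpha = Atom(area=100, sub=3, slice_no=4, truth=1, x=10.0, y=10.0)
moved = Atom(area=150, sub=4, slice_no=4, truth=1, x=200.0, y=50.0)
other_slice = Atom(area=150, sub=4, slice_no=5, truth=1, x=200.0, y=50.0)
```

Here `primary_match(alpha, moved, params)` accepts a large shift because no positional limit is set, while `other_slice` is rejected by Goal 3.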


Step 3. Finding secondary matches. Having established a set of possible primary matches, the process of substructure, or higher order, matching occurs. This is achieved by searching for matches between elements of the subatom sets of each pair of possible primary matches. The following three goals must be met for a higher order match between atoms $\alpha_i \in S_\alpha$ and $\beta_j \in S_\beta$ to be accepted:

Goal 6: $\text{area}(\alpha_i) \geq \text{min2}$ AND $\text{area}(\beta_j) \geq \text{min2}$, where min2 is a positive integer

Goal 7: $\text{ratio1}/\text{fact2} \leq \text{area}(\alpha_i)/\text{area}(\beta_j) \leq \text{ratio1} \times \text{fact2}$, where $\text{fact2} \geq 1$ and $\text{ratio1} = \text{area}(\alpha)/\text{area}(\beta)$

Goal 8: $\big|\, |x_\beta - x_\alpha| - |x_{\beta_j} - x_{\alpha_i}| \,\big| < dx2$ AND $\big|\, |y_\beta - y_\alpha| - |y_{\beta_j} - y_{\alpha_i}| \,\big| < dy2$, where $dx2$ and $dy2 > 0$

Notes on Goals 6 to 8

There is obviously no point in making min2 > min1. Typically, min2 was set to 4. Goal 7 uses the idea that for a rigid body all subatoms will change in a similar fashion to the primary atom. Typically, fact2 = 1.3. The limits set in Goal 8 depend on the amount of expected object rotation. If this is unknown then $dx2$ and $dy2$ should be set to allow for maximum rotation. Effectively, this means setting no upper bounds. (Note: in the previous section this was done implicitly, so the false alarm rates correspond to the worst case.)
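Goals 6 to 8 can be sketched in the same way as the primary goals. The record type and function name are hypothetical; the defaults for `min2` and `fact2` follow the typical values in the notes, and the infinite limits correspond to setting no upper bounds.

```python
from collections import namedtuple
from math import inf

# Hypothetical minimal atom record: area in pixels and centroid coordinates.
Atom = namedtuple("Atom", "area x y")


def secondary_match(a, b, a_i, b_j, min2=4, fact2=1.3, dx2=inf, dy2=inf):
    """True if subatoms (a_i, b_j) support the primary match (a, b)."""
    # Goal 6: both subatoms are large enough to be worth comparing.
    goal6 = a_i.area >= min2 and b_j.area >= min2
    # Goal 7: the subatoms scale in roughly the same way as their parent atoms.
    ratio1 = a.area / b.area
    goal7 = ratio1 / fact2 <= a_i.area / b_j.area <= ratio1 * fact2
    # Goal 8: the subatom shift is similar in magnitude to the parent shift.
    goal8 = (abs(abs(b.x - a.x) - abs(b_j.x - a_i.x)) < dx2
             and abs(abs(b.y - a.y) - abs(b_j.y - a_i.y)) < dy2)
    return goal6 and goal7 and goal8


a = Atom(area=100, x=10.0, y=10.0)
b = Atom(area=200, x=50.0, y=10.0)      # parent doubled in area, shifted 40 in x
a_i = Atom(area=10, x=12.0, y=11.0)
b_j = Atom(area=20, x=54.0, y=12.0)     # subatom also doubled, shifted 42 in x
```

With tight limits such as `dx2=5, dy2=5` this pair still matches, since the subatom shift differs from the parent shift by only 2 pixels in x and 1 in y.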

Step 4. Conflict resolution. For each possible primary match (a, p) there is now a set of possible higher order matches S,, = {(ai, Pi), (%c, P,), a,, Pn), . . .}. By simply counting the number of elements in S,, it is possible to derive an estimate of the strength of the overall match between (Y and p. These estimates are then compared to produce a monotonically decreasing sequence R = ((a, P), (7, a>, (6 n * . .) ordered on the basis of substructure matches per primary match. A second sequence R * is derived from this as follows:

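Take the first element $(\alpha, \beta)$ in sequence $R$ and remove all elements $(\iota, \varphi)$ from $R$ where $\iota = \alpha$ XOR $\varphi = \beta$, that is, every other match which shares exactly one atom with $(\alpha, \beta)$. This is consistent with the analysis given in the previous section.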

Repeat this removal procedure for each element $(\alpha_i, \beta_j) \in S_{\alpha\beta}$. The reason for this is that Steps 1 to 3 will generate many cases where smaller complex atoms will be subatoms of larger complex atoms. If better matches are obtained with the larger complex atoms there is no need to retain the smaller complex atom matches independently. Denote the resulting sequence by $R'$.

Repeat the above process for the next strongest match in $R'$. This results in $R''$. This process is repeated until there are no more matches to be considered. In view of the discussion in the previous section the residual sequence is then further processed to remove all matches $(\alpha, \beta)$ where $S_{\alpha\beta}$ is empty. The resulting sequence is $R^*$.
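The conflict resolution procedure can be sketched as follows. This is an illustrative reading of the text rather than the authors' code: the function name `resolve_conflicts` and the dictionary layout (each primary match, a pair of atom identifiers, mapped to its set of secondary matches) are assumptions. The removal rule is taken literally, so a match is dropped only when it shares exactly one atom with the current winner or with one of the winner's secondary matches.

```python
def resolve_conflicts(secondary):
    """Derive R* from a dict mapping each primary match (a, b) to its set of secondary matches."""
    # R: primary matches ordered by decreasing number of secondary matches.
    remaining = sorted(secondary, key=lambda m: len(secondary[m]), reverse=True)
    accepted = []
    while remaining:
        # Take the strongest match still under consideration.
        best = remaining.pop(0)
        accepted.append(best)
        # Apply the removal rule for the winner and for each of its secondary matches.
        for a, b in [best, *secondary[best]]:
            # Keep a match only if it shares both atoms or neither atom (the negation of XOR).
            remaining = [(i, p) for (i, p) in remaining if (i == a) == (p == b)]
    # Finally discard matches that have no substructure support.
    return [m for m in accepted if secondary[m]]
```

Because Python's `sorted` is stable, ties in the number of secondary matches are broken by the insertion order of the dictionary; the text does not say how ties should be handled, so this is one arbitrary choice among several.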

Step 5. Grouping. From Definitions 1 and 2 it is clear that $R^*$ will contain the strongest matching primary atoms of an object $O$. It is of course possible that the moving object has more than one primary internal atom. This final step seeks to group similar primary matches into the set of internal atoms of $O$. If the mean x and y displacements of group $\alpha$ are $dx_\alpha$ and $dy_\alpha$, respectively, then complex $\alpha$ is grouped with complex $\beta$ if:

Goal 9 $|dx_\alpha - dx_\beta| < \text{dx3}$ AND

$|dy_\alpha - dy_\beta| < \text{dy3}$, where dx3 and dy3 $\ge$ 0.

Values of dx3 = dy3 = 10 were used in the examples shown.
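A minimal sketch of the grouping step is given below. The text states only the pairwise test of Goal 9, so the greedy strategy used here (each complex joins the first existing group whose first member satisfies Goal 9, otherwise it starts a new group) is an assumption, as is the function name `group_matches`. The input is a list of mean (dx, dy) displacements, one per complex, taken in order of match strength.

```python
def group_matches(displacements, dx3=10.0, dy3=10.0):
    """Greedily group complexes whose mean displacements agree under Goal 9."""
    groups = []  # each group is a list of indices into displacements
    for idx, (dx, dy) in enumerate(displacements):
        for group in groups:
            # Compare against the first (strongest) member of the group.
            seed_dx, seed_dy = displacements[group[0]]
            # Goal 9: both displacement differences must be below the limits.
            if abs(dx - seed_dx) < dx3 and abs(dy - seed_dy) < dy3:
                group.append(idx)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append([idx])
    return groups
```

For example, `group_matches([(30, 5), (34, 8), (-12, 0), (41, 5)])` returns `[[0, 1], [2], [3]]`: the last complex differs from the first by 11 in x, which exceeds dx3 = 10, so it is not merged. Two parts of the same object can be split in exactly this way when their mean disparities differ by more than the limits.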

EXAMPLES

Image Pair A

Image Pair A is seen in Plate 1 (see page 369). This shows a tabletop scene with nine objects, all of which move between frames. The movements are independent and exhibit varying degrees of planar rotation and translation. In addition, a global size change has occurred due to a change in focal length between frames. Even though the background is only moderately complex, Pair A is quite a difficult example due to the number of moving objects. In addition, the change of focal length makes the patterned background appear as many small moving objects. Pair A was processed using four techniques. These were BOF analysis as in Nichol and Fiebig1 (vertex matching with postprocessing smoothing), BOF analysis with substructure matching as described above, and disparity matching using the methods of Prazdny5 and of Barnard and Thompson4. Plates 2-5 (see pages 369-70) show the resulting disparity vectors for the four methods.

The basic BOF technique (Plate 2) performs quite well. Correct vectors are picked up on seven of the nine objects, and the change of scale is also detected in the correct matching of the background pattern. Two erroneous matches in the background (lower right corner) are also apparent. The vectors detected on objects are fairly sparse (often only a single vector per object), but trying to increase their number by altering thresholds also leads to more erroneous vectors.

Plate 3 shows the results of BOF subgraph matching. Unlike the other methods, the vectors shown are automatically grouped both by the complex matching process and by the grouping rule of Goal 9. Correct matches are found for all objects, even the very difficult PEN. An erroneous group occurs for the CALCULATOR, probably due to the fact that the keys are identical. No background groupings are found as the atoms in the background are simple.

Plate 4 shows the results of the Prazdny5 algorithm. This has produced dense correct vectors on the large highly structured objects (CALCULATOR, SCISSORS and DISKETTE) but has missed four objects completely. Some bad mismatches also occur (PEN matched with PAPERWEIGHT). The worst mismatches, however, occur in the background pattern.

vol 9 no 6 december 1991 367

Plate 5 shows the results of the Barnard and Thompson4 algorithm. The results here are generally comparable to Prazdny5 although the density of vectors is less. Both methods work best on the detailed objects containing many ‘corners’. This is not surprising as both use the Moravec interest operator12 to generate features for matching. Plate 6 (see page 370) shows the features detected by this operator for Pair A. The density is greatest in the highly detailed parts of the images. Objects which generate only a few interest points have little chance of being detected.

Image Pair B

The two images of Pair B (see Plate 7 on page 370) were taken (handheld) about five seconds apart at the finish of an ocean yacht race. Of the three major moving objects the helicopter is the easiest, as it occurs against a simple background in both images. Both the large and small yachts are partially occluded in both frames. Plate 8 (see page 371) shows the results of BOF subgraph matching. As expected, the helicopter is the strongest object found. The next strongest group found corresponds to various background objects. The third and fourth strongest are the small and large yachts, respectively. In both cases the complex objects found correspond to smaller parts of the sails: the larger parts are occluded (differently) in both images and are (correctly) not matched. It has not been possible to achieve any worthwhile results using the other two algorithms on either this Image Pair or Pair C; accordingly their disparity vectors are not reproduced.

Image Pair C

The two images of Pair C (see Plate 9 on page 371) were taken about 200 ms apart during the running of a Formula 1 race. Due to the high speeds involved, considerable changes in apparent size have occurred between frames. There has also been a significant change in aspect. Plate 10 (see page 371) shows the results of BOF subgraph matching. The two strongest groupings found are actually part of the same car, but the difference between their mean disparity vectors is greater than that allowed by Goal 9. The next strongest grouping corresponds to the front car. The final groupings correspond to the background.

CONCLUDING REMARKS

BOF substructure matching provides a robust technique for disparity analysis. Images which contain multiple moving objects in complex backgrounds can be successfully analysed. The robustness derives from the unlikelihood of atoms corresponding to different objects having the same topological substructure. Thus the probability of false matches is reduced drastically compared with single node matching. However, it is of course true that the number of missed matches also increases, between simple atoms for example. Empirically, the total number of matches found is sufficient to accommodate some losses due to such oversights. If this were a problem, one possible remedy would be to allow single node matches but invoke postprocessing, using local smoothing derived from higher order matches, to filter out erroneous single matches. It has not yet been found necessary to implement this.

REFERENCES

1 Nichol, D G and Fiebig, M J ‘Image segmentation and matching using the binary object forest’, Image & Vision Comput., Vol 9 No 3 (June 1991) pp 139-149

2 Nichol, D G and Fiebig, M J ‘The application of binary object forest matching to images with high disparity’, Proc. Acoustics, Speech & Signal Process. Australian Conf. (ASSPA-89) (1989)

3 Price, K E ‘Relaxation matching techniques - a comparison’, IEEE Trans. PAMI, Vol PAMI- (1985) pp 617-623

4 Barnard, S T and Thompson, W B ‘Disparity analysis of images’, IEEE Trans. PAMI, Vol PAMI- (1980) pp 333-340

5 Prazdny, K ‘Egomotion and relative depth map from optical flow’, Biol. Cybern., Vol 36 (1980) pp

6 Adiv, G ‘Determining three-dimensional motion and structure from optical flow generated by several moving objects’, IEEE Trans. PAMI, Vol PAMI- (1985) pp 384-401

7 Thompson, W B, Mutch, K M and Berzins, V A ‘Dynamic occlusion analysis in optical flow fields’, IEEE Trans. PAMI, Vol PAMI-7 (1985) pp 374-383

8 Berge, C Graphs and Hypergraphs, North-Holland, Netherlands (1970)

9 Pavlidis, T Structural Pattern Recognition, Springer-Verlag, Germany (1977)

10 Nichol, D G ‘Region adjacency analysis of remotely sensed imagery’, Int. J. Remote Sensing (1989) (to appear)

11 Walpole, R E and Myers, R H Probability and Statistics for Engineers and Scientists, Macmillan, USA (1978)

12 Moravec, H P ‘Towards automatic visual obstacle avoidance’, Proc. 5th Int. Joint Conf. on Artif. Intell. (1987) p 584



Plate 1. Image Pair A. Nine objects have moved (independently) between frames. There has also been a global scale change

Plate 2. The results of using only primary matches and post-detection smoothing as described in reference 1. Whilst most objects have been found, some have been missed

Plate 3. Using BOF subgraph matching, correct groupings are found on all objects. An incorrect grouping is found on the calculator due to the fact that all its keys are identical


Plate 4. The result of using the Prazdny algorithm on Pair A. Many correct vectors are found on the larger complex objects but few are found on the simpler objects. There are many erroneous vectors in the background pattern and some on the moving objects

Plate 5. The results of using the Barnard and Thompson algorithm on Pair A. Generally the results are similar to the Prazdny method but the vectors found, both correct and erroneous, are sparser

Plate 6. The features found by the Moravec interest operator in Image 1 of Pair A. Only objects with moderately dense feature points will be retained by the methods which use this operator

Plate 7. Image Pair B. Finish of the Sydney to Hobart Yacht Race. Three moving objects are visible, two of which undergo partial occlusion


Plate 8. Moving objects extracted from Pair B

Plate 9. Image Pair C. The Australian Grand Prix. Two moving objects are seen with a large velocity component towards the observer. Shadows on the track also make segmentation more difficult

Plate 10. Moving objects extracted from Pair C