
[IEEE 2010 International Conference on Electronics and Information Engineering (ICEIE 2010), Kyoto, Japan, August 1-3, 2010]

MBLBP Face Detection with Multi-exit Asymmetric Boosting

Maha Sharkas
Electronics & Communications Eng. Dept., AAST, Alexandria, Egypt
[email protected]

Amr El-Helw
Electronics & Communications Eng. Dept., AAST, Alexandria, Egypt
[email protected]

Eslam AlSaba
Electronics & Communications Eng. Dept., AAST, Alexandria, Egypt
[email protected]

Abstract— Face detection plays an important role in many applications such as video surveillance, face recognition, and face image database management. This paper presents a new technique which reduces the learning and detection time by using the multi-block local binary pattern (MBLBP) with multi-exit asymmetric boosting. In this technique, the number of selected features is reduced to around 1/20 of that of the Haar-like method, so the learning time is also reduced to about 1/20. The detection time is reduced by more than a quarter relative to the Haar-like detector. Multi-exit asymmetric boosting reduces the number of features to about 1/5 of that of the cascade method, so the learning and detection time is reduced further.

Keywords— MBLBP, multi-exit asymmetric boosting, Haar-like features

I. INTRODUCTION

Face detection has gained increased interest in recent years. As computers become faster and more affordable, many applications that use face detection are becoming an integral part of our lives. For example, face recognition systems are being tested and installed in airports to provide a new level of security. Human-computer interfaces based on facial expressions and body gestures are being explored as ways to replace traditional interfaces such as the mouse and the keyboard. These and other related applications all require, as an initial step, some form of face detection, which can be simply defined as follows: given an image I, find all occurrences of faces and the extent of each face in I. This definition implies that some form of discrimination must be made between faces and all other objects. However, there are many difficulties and challenges associated with face detection, such as pose, expression, image view, and imaging conditions. In any face detection algorithm, three factors must be considered to evaluate the performance:
• Detection rate (or false rejection rate)
• False acceptance rate
• Detection speed

In this paper, the integral image and the AdaBoost algorithm as used for face detection are discussed in Section II. In Section III, the multi-block local binary pattern (MBLBP) is presented. Multi-exit asymmetric boosting is introduced in Section IV. Section V presents the experimental results, the results are compared in Section VI, and the paper is concluded in Section VII.

II. BASIC TOOLS

A. Integral Image
Integral images were first introduced by Viola and Jones in 2001 [1, 2]. Computing the integral image is the first stage of both learning and detection: the input image must first be turned into an integral image. To do so, each pixel is made equal to the sum of all pixels above and to the left of it; equivalently, a cumulative summation over columns and rows is performed. This is demonstrated in Figure 1 [3].


Figure 1. Integral image

This allows for the calculation of the sum of all pixels inside any given rectangle using only four values. These values are the pixels in the integral image that coincide with the corners of the rectangle in the input image [1,2,4]. This is demonstrated in Figure 2.

2010 International Conference on Electronics and Information Engineering (ICEIE 2010)

V2-234 Volume 2. 978-1-4244-7681-7/$26.00 ©2010 IEEE


Figure 2. Sum calculation

The sum of the pixels within rectangle D can be computed with four array references. The value of the integral image at location 1 is the sum of the pixels in rectangle A. The value at location 2 is A + B, at location 3 it is A + C, and at location 4 it is A + B + C + D. The sum within D can therefore be computed as 4 + 1 − (2 + 3). Since locations 2 and 3 both include the sum of A, A is subtracted twice, so the value at location 1 (the sum of A) has to be added back once. This demonstrates how the sum of pixels within a rectangle of arbitrary size can be calculated in constant time.
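The cumulative summation and the four-corner rectangle sum can be sketched in a few lines of Python (an illustrative sketch, not the paper's code; the function names and the toy 4x4 image are assumptions for the example):

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows and columns, as in Figure 1."""
    return np.cumsum(np.cumsum(np.asarray(img, dtype=np.int64), axis=0), axis=1)

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four corner values (Figure 2).

    With corners 1..4 labeled as in the text, the sum within D is
    4 + 1 - (2 + 3). Corners that fall at row or column -1 contribute 0.
    """
    def at(r, c):
        return int(ii[r, c]) if r >= 0 and c >= 0 else 0
    bottom, right = top + height - 1, left + width - 1
    return (at(bottom, right) + at(top - 1, left - 1)
            - at(top - 1, right) - at(bottom, left - 1))

img = np.arange(16).reshape(4, 4)   # toy 4x4 "image"
ii = integral_image(img)
# Constant-time rectangle sum matches a direct summation.
assert rect_sum(ii, 1, 1, 2, 2) == int(img[1:3, 1:3].sum())
```

Whatever the rectangle size, only four lookups are needed, which is what makes Haar-like and MBLBP features cheap to evaluate.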

B. Adaboost
AdaBoost is a machine learning boosting algorithm capable of constructing a strong classifier through a weighted combination of weak classifiers. (A weak classifier classifies correctly in only a little more than half the cases.) To match this terminology to the presented theory, each feature is considered a potential weak classifier. A weak classifier h(x, f, p, θ) is mathematically described as [3]:

h(x, f, p, θ) = 1 if p·f(x) < p·θ, and 0 otherwise

Where x is a sub-window, f is the applied feature, p is the polarity, and θ is the threshold that decides whether x should be classified as a positive (a face) or a negative (a non-face). Since only a small number of feature values are expected to yield useful weak classifiers, Viola and Jones modified the AdaBoost algorithm to select only the best features [1, 2, 4, 5, 6]. The modified AdaBoost algorithm is presented in [3]. An important part of the modified algorithm is the determination of the best feature, polarity, and threshold. There seems to be no clever solution to this problem, and Viola and Jones suggested a simple brute-force method: determining each new weak classifier involves evaluating every feature on all the training examples in order to find the best-performing one. This is expected to be the most time-consuming part of the training procedure. The best-performing feature is chosen based on the weighted error it produces, which is a function of the weights belonging to the training examples. As seen in part 4 of the modified AdaBoost algorithm, the weight of a correctly classified example is decreased while the weight of a misclassified example is kept constant. As a result, it is more "expensive" for the second feature (in the final classifier) to misclassify an example also misclassified by the first feature than an example classified correctly. An alternative interpretation is that the second feature is forced to focus harder on the examples misclassified by the first. The point is that the weights are a vital part of the mechanics of the AdaBoost algorithm. With the integral image, the computationally efficient features, and the modified AdaBoost algorithm in place, the face detector is ready for implementation [3].
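The brute-force search over threshold and polarity can be sketched as follows (a hedged illustration, not the paper's implementation; it uses ±1 labels rather than the {1, 0} outputs of the formula above, and all names and the toy data are invented for the example):

```python
import numpy as np

def best_stump(feature_values, labels, weights):
    """Return (weighted error, threshold, polarity) of the best decision stump.

    feature_values: f(x) for each training example.
    labels: +1 for faces, -1 for non-faces.
    weights: example weights (summing to 1), as maintained by AdaBoost.
    """
    best = (np.inf, None, None)
    for theta in np.unique(feature_values):      # brute force over thresholds
        for p in (+1, -1):                       # and both polarities
            # Predict "face" when p * f(x) < p * theta.
            pred = np.where(p * feature_values < p * theta, 1, -1)
            err = weights[pred != labels].sum()  # weighted classification error
            if err < best[0]:
                best = (err, theta, p)
    return best

# Toy data: small feature values for two faces, large ones for two non-faces.
f = np.array([0.2, 0.4, 0.6, 0.9])
y = np.array([1, 1, -1, -1])
w = np.full(4, 0.25)
err, theta, p = best_stump(f, y, w)
assert err == 0.0   # this toy data is separable by a single stump
```

In a real detector this search runs once per candidate feature at every boosting round, which is why training is dominated by this step.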

III. MBLBP

The multi-block local binary pattern (MBLBP) extends the local binary pattern from single pixels to rectangular blocks, which is why it is called "multi-block". The basic idea of MBLBP is to encode rectangular regions with the local binary pattern operator [7]. MBLBP features can also be calculated rapidly through integral images, while capturing more information about the image structure than Haar-like features and showing more distinctive performance. Compared with the original local binary pattern, which is calculated in a local 3x3 neighborhood of pixels, MBLBP features can capture the large-scale structures that may dominate an image. The feature value is directly the output of the LBP operator. One problem is that this value is just a symbol representing a binary string; for such a non-metric feature value, a multi-branch regression tree is designed as the weak classifier. AdaBoost is applied for feature selection and classifier construction, and then a multi-exit asymmetric boosting detector is built. Another advantage of MBLBP is that the exhaustive set of MBLBP features is much smaller than that of Haar-like features (about 1/20 of the number of Haar-like features). Boosting-based methods use the AdaBoost algorithm to select a significant feature set from the large complete feature set, a process that is often very time-consuming, sometimes taking several weeks. The smaller MBLBP feature set therefore makes this procedure simpler.

The basic LBP feature, in its simplest form, is computed in the following manner:

• Divide the examined window into cells (e.g. 24x24 pixels per cell).

• For each pixel in a cell, compare the pixel to each of its 8 neighbors (left-top, left-middle, left-bottom, right-top, etc.). Follow the pixels along a circle, i.e. clockwise or counter-clockwise, as shown in Figure 3.

• Where the center pixel's value is greater than the neighbor's, write "1"; otherwise, write "0". This gives an 8-digit binary number (which is usually converted to decimal for convenience).

• Compute the histogram, over the cell, of the frequency of each "number" occurring (i.e., each combination of which pixels are smaller and which are greater than the center).

• Optionally normalize the histogram, as shown in Figure 4.


• Concatenate the normalized histograms of all cells. This gives the feature vector for the window.

The feature vector can then be used with AdaBoost.
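The steps above can be sketched in Python (an illustrative sketch, not the paper's code; it follows the convention stated in the text of writing "1" where the center pixel exceeds the neighbor, and the function names and toy cell are assumptions):

```python
import numpy as np

def lbp_code(cell, r, c):
    """8-bit LBP code for pixel (r, c), reading the 8 neighbors clockwise."""
    center = cell[r, c]
    # Clockwise circle of neighbor offsets, starting at the top-left.
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    bits = [1 if center > cell[r + dr, c + dc] else 0 for dr, dc in ring]
    return int("".join(map(str, bits)), 2)   # binary string -> decimal

def lbp_histogram(cell):
    """256-bin histogram of LBP codes over the interior pixels of one cell."""
    h = np.zeros(256, dtype=int)
    for r in range(1, cell.shape[0] - 1):
        for c in range(1, cell.shape[1] - 1):
            h[lbp_code(cell, r, c)] += 1
    return h

cell = np.array([[9, 1, 1],
                 [1, 5, 1],
                 [1, 1, 1]])
# Only the top-left neighbor (9) beats the center (5), so its bit is 0.
assert lbp_code(cell, 1, 1) == 0b01111111
```

The window's feature vector is then the concatenation of the (optionally normalized) histograms of all its cells.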

Figure 3. Basic LBP operator

Figure 4. Multi-block LBP feature for image representation. As shown in the figure, MBLBP features encode the intensities of rectangular regions by the local binary pattern. The resulting binary patterns can describe diverse image structures. Compared with the original local binary pattern calculated in a local 3x3 neighborhood of pixels, MBLBP can capture large-scale structure.
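The difference from the basic operator can be made concrete with a hedged sketch of one MBLBP feature: the 3x3 neighborhood of pixels is replaced by a 3x3 grid of rectangular blocks, and average block intensities are compared instead of single pixels (illustrative code, not the paper's; a real detector would obtain the block sums from an integral image, while plain sums are used here for brevity):

```python
import numpy as np

def mblbp_code(img, top, left, bh, bw):
    """8-bit MBLBP code for the 3*bh x 3*bw region at (top, left).

    bh, bw are the height and width of one block in the 3x3 block grid.
    """
    # Mean intensity of each of the 9 blocks.
    means = np.array([[img[top + i * bh:top + (i + 1) * bh,
                           left + j * bw:left + (j + 1) * bw].mean()
                       for j in range(3)] for i in range(3)])
    center = means[1, 1]
    # Clockwise ring of the 8 surrounding blocks, starting at the top-left.
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if center > means[i, j] else 0 for i, j in ring]
    return int("".join(map(str, bits)), 2)

img = np.zeros((6, 6))
img[2:4, 2:4] = 10                         # bright 2x2 center block
assert mblbp_code(img, 0, 0, 2, 2) == 255  # center beats all 8 neighbor blocks
```

Varying the block size (bh, bw) and position over the detection window generates the exhaustive MBLBP feature set, which is far smaller than the Haar-like one.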

IV. MULTI-EXIT ASYMMETRIC BOOSTING

Multi-exit asymmetric boosting [8, 9] is best described in the form of a problem and its solution.

A. Problem
In object detection methods using boosting, it is often desirable to learn a boosted classifier FM(x) = f1(x) + f2(x) + ... + fM(x) of M weak classifiers, such that FAR(FM) ≤ α0 and FRR(FM) ≤ β0 for some given error rates (α0, β0). Because M is roughly proportional to the running time of the classifier, it is also desirable to minimize M. Two trends for solving this problem are introduced:

1. At each iteration, a new weak classifier fm+1(x) is learned by minimizing FAR(Fm + fm+1) + FRR(Fm + fm+1). A threshold is then introduced to trade off FAR against FRR; if FAR(Fm+1) ≤ α0 and FRR(Fm+1) ≤ β0 are achieved for some value of the threshold, training stops. Otherwise, the next iteration runs.
Issues:
- The weak classifiers are not trained to attain the boosted classifier's goal; they are therefore sub-optimal.
- Too many weak classifiers are often required, increasing the classifier's running time (and training time).

2. Train many boosted classifiers. For each boosted classifier, the weak classifiers fm+1(x) are learned with an asymmetric goal FAR(Fm + fm+1) + λ·FRR(Fm + fm+1) for some parameter λ chosen empirically. The training of each boosted classifier stops when either the conditions FAR(Fm) ≤ α0 and FRR(Fm) ≤ β0 are met, or M becomes too large. Then, the boosted classifier that achieves the two conditions with the smallest M is selected.
Issues:
- How should λ be chosen?
- How can training many boosted classifiers (which multiplies the training time of the detector) be avoided?

B. Solution
Optimal asymmetric goal: learn each weak classifier fm+1(x) using a single asymmetric goal

Gλ(Fm + fm+1) = FAR(Fm + fm+1) + λ·FRR(Fm + fm+1), where λ = α0/β0.

Why? Consider two desired bounds, or targets, for learning a boosted classifier FM(x):
• Exact bound: FAR(FM) ≤ α0 and FRR(FM) ≤ β0 (1)
• Conservative bound: FAR(FM) + λ·FRR(FM) ≤ α0 (2)
(2) is more conservative than (1) because its left-hand side is at least as large as both FAR(FM) and λ·FRR(FM), so any classifier satisfying (2) also satisfies (1). When λ = α0/β0, for every new weak classifier learned, the ROC operating point moves the fastest towards the conservative bound, as shown in Figure 5.

Figure 5. Illustration of how the operating point of a boosted classifier moves when more weak classifiers are trained. Blue solid curves are equivalent to the ROC curves of the boosted classifier. (a): A case when the symmetric goal G1 is used, i.e., λ = 1. (b): A case when an asymmetric goal Gλ is used.
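Why setting λ = α0/β0 makes the conservative bound safe can be checked numerically (a toy check, not from the paper; the values of α0, β0, FAR, and FRR below are arbitrary assumptions):

```python
# If FAR + lambda*FRR <= alpha0, then FAR <= alpha0 and
# lambda*FRR <= alpha0, i.e. FRR <= alpha0/lambda = beta0:
# the conservative bound implies the exact bound.
alpha0, beta0 = 0.5, 0.01
lam = alpha0 / beta0                   # lambda = 50

far, frr = 0.3, 0.003                  # an operating point of some classifier
assert far + lam * frr <= alpha0       # conservative bound (2) holds
assert far <= alpha0 and frr <= beta0  # so the exact bound (1) holds too
```

Minimizing the single scalar goal Gλ therefore drives the classifier towards both error-rate targets at once, without hand-tuning λ per stage.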

C. Multi-exit Asymmetric Boosting


Multi-exit asymmetric boosting is a method to train a single boosted classifier with multiple exit nodes [8]. Its features are:
• The classifier does the same job as a cascade of boosted classifiers: it sequentially rejects a large proportion of the negative examples.
• All weak classifiers are trained with the same asymmetric goal Gλ.
• At every exit node, rejection and detection are guaranteed with FAR ≤ α0 and FRR ≤ β0.
• The score of an example is propagated from one node to the next, including exit nodes.
Its main advantages are:
• Weak classifiers are learned (approximately) optimally.
• There is no training of multiple boosted classifiers.
• Many fewer weak classifiers are needed than in traditional cascades.
In this paper, a Haar sub-window is applied, followed by MBLBP, on multi-exit asymmetric boosting, and the results are presented in the next section.
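The score propagation across exit nodes can be sketched as follows (an illustrative sketch under assumed names, weak classifiers, and thresholds, not the paper's detector):

```python
def multi_exit_classify(x, weak_classifiers, exits):
    """Classify one example with a multi-exit boosted classifier.

    weak_classifiers: ordered callables, each returning a real-valued score.
    exits: dict mapping weak-classifier index -> rejection threshold.
    """
    score = 0.0
    for m, f in enumerate(weak_classifiers, start=1):
        score += f(x)                 # the score carries over across exits
        if m in exits and score < exits[m]:
            return False              # rejected early at this exit (non-face)
    return True                       # survived every exit node (face)

# Toy example: three stumps on a scalar "window", with exits after
# weak classifiers 1 and 3.
weaks = [lambda x: 1.0 if x > 0 else -1.0,
         lambda x: 1.0 if x > 2 else -0.5,
         lambda x: 1.0 if x > 4 else -0.5]
exits = {1: 0.0, 3: 0.5}
assert multi_exit_classify(5.0, weaks, exits) is True
assert multi_exit_classify(-1.0, weaks, exits) is False  # rejected at exit 1
```

Unlike a cascade, the running score is not reset at each stage, so later exits benefit from all the evidence accumulated so far.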

V. EXPERIMENTAL RESULTS

A 24x24 sub-window is used for learning and detection. It was found that MBLBP with multi-exit asymmetric boosting achieved the target of reducing the number of learned features, which accordingly reduced the learning time. MBLBP is used with multi-exit asymmetric boosting instead of a cascade because multi-exit asymmetric boosting is faster in learning and detection and also gives more accurate results than a cascade, as shown in Figure 6.

[Figure: ROC curves for GentleBoost with T = 100, comparing the cascade, multi-exit, and full boosting techniques.]

Figure 6. ROC for MBLBP: comparison between the cascade and multi-exit asymmetric boosting.

[Figure: ROC curves for multi-exit asymmetric FastAdaBoost with T = 23, comparing the full and multi-exit classifiers, together with a detection result.]

Figure 7. ROC for MBLBP with multi-exit asymmetric boosting (6 exits) and its detection result.

Figure 8. Detection result for MBLBP with 6 exits.


[Figure: ROC curves for multi-exit asymmetric FastAdaBoost with T = 23, nexit = 7, comparing the full and multi-exit classifiers.]

Figure 9. ROC for MBLBP with multi-exit asymmetric boosting (7 exits) and its detection result.

VI. COMPARISON BETWEEN HAAR-LIKE FEATURES AND MBLBP

The Haar-like features and MBLBP can be compared based on the presented ROC curves and detection results. For example, with a 20x20 sub-window there are 45,891 Haar features but only 2,049 MBLBP features, about 1/20 [7] of the Haar features, so the consumed time should also be about 1/20. When the same data is trained with 5 exits, the learning time with Haar features is 9953.22 seconds (2.77 hours), while with MBLBP it is 473.96 seconds (7.9 minutes), which means the MBLBP time is around 1/20 of the Haar-feature time. Table I compares the learning times of both techniques in seconds.

TABLE I. COMPARISON BETWEEN THE LEARNING TIME (IN SECONDS) OF THE HAAR-LIKE FEATURES AND THE MBLBP FEATURES

No. of exits |       5      |       6      |      7
Haar         |   9953.22    |   12868.89   |   15522.44
MBLBP        |    473.96    |     643.44   |     777.89

VII. CONCLUSION

In this paper, the learning and detection times of Haar-like features and MBLBP are compared. In general, multi-exit asymmetric boosting reduces the number of weak classifiers to about 1/5 of that of a cascade, and MBLBP reduces the number of features to about 1/20 of that of the Haar-like set, so MBLBP is faster when used with multi-exit asymmetric boosting.

REFERENCES

[1] Paul Viola and Michael J. Jones. Robust real-time object detection. Technical Report CRL 2001/01, Cambridge Research Laboratory, February 2001.

[2] Paul Viola and Michael J. Jones. Rapid object detection using a boosted cascade of simple features. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Hawaii, USA, 2001.

[3] Ole Helvig Jensen. Implementing the Viola-Jones Face Detection Algorithm. Kongens Lyngby, 2008. IMM-M.Sc.-2008-93.

[4] Paul Viola and Michael J. Jones. Robust real-time face detection. International Journal of Computer Vision 57(2), 137-154, 2004. Kluwer Academic Publishers, The Netherlands.

[5] Paul Viola and Michael J. Jones. Fast and robust classification using asymmetric AdaBoost and a detector cascade. Mitsubishi Electric Research Laboratories, 2002.

[6] Paul Viola and Michael J. Jones. Fast multi-view face detection. Mitsubishi Electric Research Laboratories, TR2003-096, August 2003.

[7] Lun Zhang, Rufeng Chu, Shiming Xiang, Shengcai Liao, and Stan Z. Li. Face detection based on multi-block LBP representation. Center for Biometrics and Security Research & National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Vol. 4642, 2007.

[8] Minh-Tri Pham, Viet-Dung D. Hoang, and Tat-Jen Cham. Detection with multi-exit asymmetric boosting. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Anchorage, AK, 2008.

[9] Minh-Tri Pham and Tat-Jen Cham. Fast training and selection of Haar features using statistics in boosting-based face detection. In Proc. IEEE Int. Conf. on Computer Vision (ICCV), Rio de Janeiro, Brazil, 2007.
