

[IEEE Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing - Kaohsiung, Taiwan (2007.11.26-2007.11.28)]

An Improved Face Detection Method in Low-resolution Video

Chih-Chung Hsu and Hsuan T. Chang∗

Photonics and Information Laboratory
Department of Electrical Engineering
National Yunlin University of Science & Technology
Douliu Yunlin, 64045 Taiwan ROC

{g9412716, htchang}@yuntech.edu.tw

Ting-Cheng Chang
Department of Commercial Technology & Management
Lin Tung University
Taichung, 40852 Taiwan ROC

[email protected]

Abstract

In this study, an efficient face detection method is proposed for low-resolution video. The cascaded face detector proposed by Viola can achieve real-time detection and a high detection rate. However, face images in low-resolution video usually suffer from motion blur. The detection rate in low-resolution video is lower than that in static images because the training set used by the AdaBoost algorithm contains only normal face images. Therefore, an enhanced training set containing both normal and motion-blurred face images is used to improve the detection rate. The simulation results show that face images in low-resolution video can be efficiently extracted.

1 Introduction

Face detection is a widely used technique, and many studies have been proposed to improve it. There are four approaches to face detection [3]: rule-based, feature-based, template matching, and learning-based methods. The learning-based methods are widely used. However, their detection performance depends on the training set. If the training set is too small, the detection rate decreases significantly.

∗This research was partially supported by the National Science Council, Taiwan, under contract NSC 95-2221-E-224-070-MY2.

An efficient and fast face detection system was proposed by Viola in 2001 [1] [2]. Viola's system makes three contributions. First, the AdaBoost [4] algorithm is used to select the rectangle features of face images. If the training set is large, the training time increases significantly. In addition, the integral image is used to accelerate the calculation of the rectangle features. Finally, a cascaded detection structure that rapidly rejects non-face images is proposed, which makes real-time face detection possible. However, moving faces in low-resolution video are usually motion blurred. Therefore, the face detection results in such video are not good enough.

In Ref. [5], a robust and efficient rotation-invariant multiview face detection system is proposed. However, both the training images and the target images are high-resolution images, and the detection rate decreases when the image contains motion-blurred faces. In Ref. [6], the support vector machine (SVM) is used to enhance the training set and increase the detection rate. However, this method does not consider motion blur in faces. In practice, the resolution of most surveillance monitors is low, especially when a web cam is used. Moreover, if the frame rate in frames per second (FPS) is low, motion blur occurs in moving face images, and the detection rate decreases. To solve this problem, the face detection system could use motion blur parameters while detecting face features. In Ref. [7], a motion blur estimation method is proposed, which could be combined with the face detection system to solve this problem. However, doing so would decrease the detection speed significantly. Hence, the positive samples in the training set must contain both motion-blurred and normal face images.


Figure 1. Four rectangle features

Then, the AdaBoost algorithm is applied to this training set to generate a strong classifier. Finally, the enhanced face detector is built to increase the detection rate.

In general, the more tolerant the parameters of the face detector, the higher the detection rate. However, the false alarm rate increases as well. Therefore, motion detection based on the Gaussian mixture model (GMM) [11] is used in the face detection system to decrease the false alarm rate. In addition, motion-blurred face samples are added to the training set to increase the overall detection rate without decreasing the detection speed.

This paper is organized as follows. Section 2 presents the fundamentals of face detection. Section 3 describes the proposed face detection method. Section 4 presents the experimental results of the proposed scheme. Finally, Section 5 concludes this paper.

2 Related Work

The first contribution of the real-time face detector [1] [2] is the integral image. The integral image at coordinate (x, y) contains the sum of the pixels above and to the left of (x, y). Viola then defined rectangle features for face images. Figure 1 shows the rectangle features of a face image. For the gray rectangle feature in Fig. 1, the luminance sum can be obtained by computing F − E − D + C. Therefore, the difference between the gray and the white rectangle sums can be calculated quickly. In Refs. [8]–[10], a method for calculating oblique rectangle features from the integral image was proposed, which makes non-frontal face detection possible. This method is also used in this paper to calculate the rectangle features.
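The four-corner lookup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the image is a plain list of rows, and the table is padded with a zero row and column so that each rectangle sum follows the same "corner minus two corners plus corner" pattern as F − E − D + C.

```python
def integral_image(img):
    """Return a (rows+1) x (cols+1) table; ii[y][x] is the sum of img above and left of (x, y)."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle feature: right half minus left half (an assumed layout for illustration)."""
    half = w // 2
    return rect_sum(ii, x + half, y, half, h) - rect_sum(ii, x, y, half, h)


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
table = integral_image(image)
center = rect_sum(table, 1, 1, 2, 2)            # 6 + 7 + 10 + 11
feature = two_rect_feature(table, 0, 0, 4, 2)   # (3 + 4 + 7 + 8) - (1 + 2 + 5 + 6)
```

Once the table is built, every rectangle sum costs four lookups regardless of the rectangle size, which is what makes the feature evaluation fast.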

Second, a classifier is constructed by selecting a small number of important features using the AdaBoost algorithm. Each selected feature is given a binary decision by its corresponding weak classifier. The strong classifier is then formed by combining all the weak classifiers, and it is used to decide whether a sub-image is a face.
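How weak decisions combine into a strong one can be sketched as below, following the thresholded-stump form used by Viola and Jones [1]: each weak classifier compares one feature value with a threshold, and the strong classifier accepts when the weighted vote reaches half of the total weight. Only the evaluation step is shown; the stump parameters here are made-up values for illustration, not trained ones, and the function names are hypothetical.

```python
def weak_classify(feature_value, threshold, polarity):
    """Decision stump: 1 (face) if polarity * value < polarity * threshold, else 0."""
    return 1 if polarity * feature_value < polarity * threshold else 0


def strong_classify(feature_values, stumps):
    """Weighted vote of stumps; each stump is (feature_index, threshold, polarity, alpha)."""
    score = sum(alpha * weak_classify(feature_values[index], threshold, polarity)
                for index, threshold, polarity, alpha in stumps)
    total_alpha = sum(alpha for _, _, _, alpha in stumps)
    return 1 if score >= 0.5 * total_alpha else 0


# Illustrative (untrained) stumps: feature index, threshold, polarity, weight.
stumps = [(0, 10.0, 1, 0.8), (1, 5.0, -1, 0.5), (2, 0.0, 1, 0.3)]
face_like = strong_classify([3.0, 7.0, 1.0], stumps)
non_face_like = strong_classify([20.0, 7.0, 1.0], stumps)
```

In training, AdaBoost would choose the feature, threshold, polarity, and weight of each stump from the labeled samples; that step is omitted here.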

Finally, a cascaded detector is proposed. The input sub-image is first evaluated by the classifier of the first stage. If this sub-image is a non-face image, the classifier rejects it. Most of the non-face images are rejected in the first stage. Therefore, cascaded face detection can be achieved in real time.

Figure 2. Comparison of the rectangle feature difference values between the normal and motion-blurred face images.
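The early-rejection idea can be sketched as a loop that stops at the first stage that says "non-face". The stage functions below are hypothetical stand-ins (simple statistics on a list of pixel values), chosen only to make the control flow visible; in a real detector each stage would be a boosted strong classifier.

```python
def cascade_classify(window, stages):
    """Return (is_face, stages_passed); evaluation stops at the first rejecting stage."""
    for index, stage in enumerate(stages):
        if not stage(window):
            return False, index
    return True, len(stages)


# Hypothetical stages: a cheap test first, a stricter test second.
def bright_enough(window):
    return sum(window) / len(window) > 50


def has_contrast(window):
    return max(window) - min(window) > 20


stages = [bright_enough, has_contrast]
dark = cascade_classify([10, 10, 10], stages)
flat = cascade_classify([60, 60, 60], stages)
candidate = cascade_classify([40, 100, 60], stages)
```

Because most sub-images fail the first cheap stage, the later and more expensive stages run on only a small fraction of the windows.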

In general, the GMM can achieve motion detection in real time. Once the GMM is constructed, only its parameters need to be updated to perform motion detection. To increase the detection rate of the GMM, a parameter estimation method was proposed in Ref. [12]; this method is used in this paper.
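To make the update step concrete, the following is a simplified per-pixel sketch in the spirit of the adaptive mixture model of Ref. [11]. It is not the improved estimation method of Ref. [12] that the paper uses, and several choices are assumptions made for brevity: gray-level input, a single learning rate `alpha` used for both weights and Gaussian parameters, a variance floor `min_var`, and the default values shown.

```python
import math


class PixelGMM:
    """Mixture of up to k Gaussians modeling the gray level of one pixel."""

    def __init__(self, k=3, alpha=0.05, bg_ratio=0.7, init_var=225.0, min_var=4.0):
        self.k, self.alpha, self.bg_ratio = k, alpha, bg_ratio
        self.init_var, self.min_var = init_var, min_var
        self.modes = []  # each mode is [weight, mean, variance]

    def update(self, value):
        """Update the model with one observation; return True if it is foreground."""
        # A mode matches when the value lies within 2.5 standard deviations.
        matched = None
        for mode in self.modes:
            if abs(value - mode[1]) <= 2.5 * math.sqrt(mode[2]):
                matched = mode
                break
        # Decay every weight and reinforce the matched mode.
        for mode in self.modes:
            mode[0] = (1 - self.alpha) * mode[0] + (self.alpha if mode is matched else 0.0)
        if matched is not None:
            matched[1] += self.alpha * (value - matched[1])
            diff = value - matched[1]
            matched[2] = max((1 - self.alpha) * matched[2] + self.alpha * diff * diff,
                             self.min_var)
        else:
            new_mode = [self.alpha, float(value), self.init_var]
            if len(self.modes) < self.k:
                self.modes.append(new_mode)
            else:
                weakest = min(range(len(self.modes)), key=lambda i: self.modes[i][0])
                self.modes[weakest] = new_mode
        total = sum(mode[0] for mode in self.modes)
        for mode in self.modes:
            mode[0] /= total
        # Background = most reliable modes (high weight, low spread) up to bg_ratio.
        self.modes.sort(key=lambda m: m[0] / math.sqrt(m[2]), reverse=True)
        background, cumulative = [], 0.0
        for mode in self.modes:
            background.append(mode)
            cumulative += mode[0]
            if cumulative > self.bg_ratio:
                break
        return matched is None or not any(mode is matched for mode in background)


pixel = PixelGMM()
history = [pixel.update(100) for _ in range(50)]  # static background value
moving = pixel.update(200)                        # sudden change
back = pixel.update(100)                          # background again
```

Running one such model per pixel yields a binary foreground image, which is the kind of input the motion test in the proposed method relies on.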

3 Proposed Method

In low-resolution video, moving objects are usually motion blurred, which changes the rectangle features of motion-blurred face images. Therefore, a strong classifier constructed from a normal training set performs poorly on them, and motion-blurred face images cannot be extracted easily. Figure 2 compares the rectangle feature difference values of the normal and the motion-blurred face images. Figure 2 (a) shows the normal face image and Fig. 2 (b) shows the corresponding rectangle feature, whose difference value is 15501. Figure 2 (c) and (d) show the motion-blurred face image and the corresponding rectangle feature, whose difference value is 17145. The rectangle feature is clearly altered by motion blur. Therefore, the face detection system is inefficient for motion-blurred face images.
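As a quick check on the size of this shift, the relative change of the difference value can be worked out from the two numbers reported for Fig. 2:

$$\frac{17145 - 15501}{15501} = \frac{1644}{15501} \approx 0.106$$

That is, motion blur changes this particular feature value by roughly 10.6%. This percentage is derived here from the two reported values; a threshold learned only from normal faces may therefore fall on the wrong side for the blurred version of the same face.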

Motion blur is defined as

$$F'(x, y) = \mathrm{IDFT}\big(\mathrm{DFT}(F(x, y)) \times H(u, v)\big), \tag{1}$$

where

$$H(u, v) = \frac{T}{\pi(ua + vb)} \sin[\pi(ua + vb)]\, e^{-j\pi(ua + vb)}, \tag{2}$$


Figure 3. The flow chart of the proposed method

where DFT and IDFT denote the forward and inverse discrete Fourier transforms, F(x, y) denotes the original image, a and b denote the motion range, and T denotes the motion strength. The left-hand side of Equation 1 is the motion-blurred image. To restore a motion-blurred image, the parameters of H used by the Wiener filter can be estimated from Equation 1. However, these parameters cannot be estimated easily. Moreover, motion blur estimation is time consuming [7], so the face detection system could not run in real time. Therefore, identifying the motion blur in the detection stage is not practical. In this paper, positive samples with motion blur are instead generated using Equation 1 and added to the training set, so that the positive samples contain both normal and motion-blurred face images. The cascaded face detector is then constructed using the AdaBoost algorithm. However, if the number of motion-blurred face images is large, the detection rate for normal faces decreases. On the other hand, with too few motion-blurred face images, motion-blurred faces are not detected. In our observation, the best detection rate is obtained when the ratio of motion-blurred face images to normal face images is 10%.
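Generating a blurred sample with Equations 1 and 2 might look like the sketch below. It rests on several assumptions that the source does not spell out: u and v are taken as signed integer frequency indices, the value of H at ua + vb = 0 is taken as its limit T, and a naive DFT is used so that the example needs only the standard library (it is slow, and an FFT routine would replace it in practice). The function names are hypothetical.

```python
import cmath
import math


def dft2(data, inverse=False):
    """Naive 2-D DFT of a list of rows (O(N^4)); adequate for tiny samples only."""
    rows, cols = len(data), len(data[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = 2 * math.pi * (u * x / rows + v * y / cols)
                    acc += data[x][y] * cmath.exp(sign * 1j * angle)
            out[u][v] = acc / (rows * cols) if inverse else acc
    return out


def signed_index(k, n):
    """Map DFT bin k to a signed frequency index (0, 1, ..., -2, -1)."""
    return k if k < (n + 1) // 2 else k - n


def blur_transfer(s, strength):
    """H of Equation 2 as a function of s = ua + vb, using the limit T at s = 0."""
    if s == 0:
        return complex(strength)
    return strength * math.sin(math.pi * s) / (math.pi * s) * cmath.exp(-1j * math.pi * s)


def motion_blur(img, a, b, strength=1.0):
    """Apply Equation 1: multiply the spectrum by H and transform back."""
    rows, cols = len(img), len(img[0])
    spectrum = dft2(img)
    for u in range(rows):
        for v in range(cols):
            s = signed_index(u, rows) * a + signed_index(v, cols) * b
            spectrum[u][v] *= blur_transfer(s, strength)
    return [[z.real for z in row] for row in dft2(spectrum, inverse=True)]


sample = [[float((3 * r + 5 * c) % 7) for c in range(6)] for r in range(6)]
unchanged = motion_blur(sample, 0.0, 0.0)   # zero motion range: H = T = 1 everywhere
blurred = motion_blur(sample, 0.0, 0.3)     # motion along the second image axis
```

With T = 1 the zero-frequency term is untouched, so the blurred sample keeps the overall brightness of the original; only the distribution of intensity changes, which is what shifts the rectangle feature values.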

In low-resolution video, the available information is limited, and some face features are insignificant. To obtain a higher detection rate, the threshold values of the detector parameters should be decreased. However, the false alarm rate increases at the same time. Therefore, GMM motion detection is used to decrease the false alarm rate. If Equation 3 holds, a face candidate is regarded as a face image.

$$\frac{1}{w_k h_k} \sum_{i = x_k,\; j = y_k}^{x_k + w_k,\; y_k + h_k} F(i, j) > T, \tag{3}$$

where (x_k, y_k) denotes the coordinate of the kth face rectangle, which has width w_k and height h_k, F(i, j) denotes the foreground image, and T is the threshold value. Here the threshold value T is 0.2.
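The test in Equation 3 can be sketched as a mean over the foreground mask inside each candidate rectangle. The sketch assumes a binary mask (1 for moving pixels) stored as a list of rows and indexed as `mask[row][column]`, and it uses half-open ranges so that exactly w × h pixels are averaged; the function names are hypothetical.

```python
def foreground_ratio(mask, x, y, w, h):
    """Fraction of foreground pixels in the w-by-h rectangle with top-left corner (x, y)."""
    total = sum(mask[j][i] for j in range(y, y + h) for i in range(x, x + w))
    return total / (w * h)


def keep_moving_faces(candidates, mask, threshold=0.2):
    """Keep only candidates (x, y, w, h) whose foreground ratio exceeds the threshold."""
    return [rect for rect in candidates if foreground_ratio(mask, *rect) > threshold]


# 6x6 mask whose left three columns are moving.
mask = [[1, 1, 1, 0, 0, 0] for _ in range(6)]
candidates = [(0, 0, 4, 4), (3, 0, 3, 3)]
kept = keep_moving_faces(candidates, mask)
```

A candidate that lies on static background is discarded even if the cascade accepted it, which is how the motion test lowers the false alarm rate.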

Figure 3 shows the flow chart of the proposed method. First, the cascaded face detector is applied to the input image to obtain the face rectangle candidates. Then, GMM motion detection is used to obtain the moving objects. Finally, if the amount of motion within a face rectangle candidate (Equation 3) is larger than the threshold value T, the candidate is regarded as a true face image.

Figure 4. The test videos (a) and (b)

Figure 5. ROC curves (detection rate versus false alarm rate) of the face detector on (a) test video 1 and (b) test video 2, comparing the conventional method with GMM and the proposed method with GMM

4 Experimental Results

There are 1002 face images with a sampling size of 24×24 and 400 non-face images. Our program is based on the Intel OpenCV library, and the negative samples are sampled randomly from the non-face images. The training set of the proposed face detector contains 900 normal face images and 100 motion-blurred face images. Note that the motion-blurred face images are blurred copies of the normal face images. The test video has a frame rate of 5 frames per second (FPS) and a resolution of 320×240. Faces must be larger than 24×24 to be detected. Figure 4 shows the two test videos, where the red rectangle marks a motion-blurred face image. The number of the test video (a) is 30, and that of the test video (b) is 116. Figures 5 (a) and (b) compare the receiver operating characteristic (ROC) curves of the proposed method and the conventional method. The proposed method achieves a higher detection rate on both test videos (a) and (b).

Finally, Fig. 6 compares the faces extracted by the conventional method and the proposed method. In Fig. 6(a), the conventional method cannot extract all face images accurately. In Fig. 6(c) and Fig. 6(d), the motion-blurred test image is constructed using Equation 1. The proposed method successfully extracts the motion-blurred face images in these two test images.

Figure 6. Comparison of the faces extracted by the conventional method and the proposed method. (a) and (c) show the conventional method; (b) and (d) show the proposed method.

5 Conclusion

An efficient face detection system is proposed for low-resolution video. The detection rate is increased by an enhanced training set that contains motion-blurred positive samples, and GMM motion detection is used to decrease the false alarm rate. Therefore, the proposed method can detect faces in motion-blurred images.

Our future work is to analyze the feature selection for motion-blurred face images in the AdaBoost training stage in order to construct a more robust face classifier.

References

[1] P. Viola and M. Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features,” Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, December 2001.

[2] P. Viola and M. Jones, “Robust Real Time Object Detection,” IEEE ICCV Workshop Statistical and Computational Theories of Vision, July 2001.

[3] M.H. Yang, D. Kriegman, and N. Ahuja, “Detecting Faces in Images: A Survey,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.

[4] Y. Freund and R. E. Schapire, “A Decision-theoretic Generalization of On-line Learning and An Application to Boosting,” Journal of Computer and System Sciences, vol. 55, pp. 119–139, December 1997.

[5] C. Huang, H. Ai, Y. Li, and S. Lao, “High-Performance Rotation Invariant Multiview Face Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 671–686, April 2007.

[6] R. Wang, J. Chen, S. Shan, X. Chen, and W. Gao, “Enhancing Training Set for Face Detection,” The 18th International Conference on Pattern Recognition, vol. 3, pp. 477–480, Hong Kong, August 20–24, 2006.

[7] W. Tan, J. Zhang, G. Rong, and H. Chen, “Identification of Motion Blur Direction Based on Analysis of Intentional Restoration Errors,” 2004 IEEE International Conference on Networking, Sensing and Control, vol. 2, pp. 1253–1258, Taipei, Taiwan, March 21–23, 2004.

[8] R. Lienhart and J. Maydt, “An Extended Set of Haar-like Features for Rapid Object Detection,” 2002 IEEE International Conference on Image Processing, vol. 1, pp. 900–903, September 2002.

[9] C. Lerdsudwichai and M. Abdel-Mottaleb, “Algorithm for Multiple Faces Tracking,” 2003 IEEE International Conference on Multimedia and Expo, vol. 2, pp. 777–780, July 2003.

[10] R. Lienhart, A. Kuranov, and V. Pisarevsky, “Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection,” MRL Technical Report, Intel Labs, May 2002.

[11] C. Stauffer and W. E. L. Grimson, “Adaptive background mixture models for real-time tracking,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 246–252, 1999.

[12] P. KaewTraKulPong and R. Bowden, “An Improved Adaptive Background Mixture Model for Real-time Tracking and Shadow Detection,” Proc. 2nd European Workshop on Advanced Video Based Surveillance Systems, vol. 25, 2001.