2006 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP'06), Pasadena, CA, USA (2006.12.18-2006.12.18)

Efficient Moving Object Extraction in Compressed Low-bit-rate Video

Chih-Chung Hsu and Hsuan T. Chang∗

Photonics and Information Laboratory
Department of Electrical Engineering
National Yunlin University of Science & Technology
Douliu Yunlin, 64045 Taiwan ROC

{g9412716, htchang}@yuntech.edu.tw

Ting-Cheng Chang
College of Information Science
Lin Tung University
Taichung City, 40852 Taiwan ROC

[email protected]

Abstract

Previous studies usually consider the motion object extraction (MOE) problem for uncompressed video, which is impractical because the high-frequency components have already been discarded in compressed low-bit-rate video. Moreover, extracting the moving object based only on illumination variation does not work well when the moving object and the background have similar light intensities. Therefore, an MOE algorithm based on the RGB color model is proposed in this study. In addition, a block-based background extraction method, which constructs the background image more efficiently, is proposed. The simulation results show that moving objects can be efficiently extracted using the proposed methods.

1 Introduction

Motion estimation, or motion object detection and extraction, is an important issue in video surveillance applications. Most research on this issue focuses on detecting complete objects, designing algorithms that are robust to light/shadow variation, and constructing the background efficiently, while assuming that the video sequences have not yet been compressed. However, video sequences are usually stored or transmitted in compressed formats to reduce the memory or bandwidth requirement, respectively. Therefore, it is more practical to perform motion object extraction (MOE) on video sequences in the compressed domain.

Several related techniques exist for performing MOE in video sequences. The basic method is background subtraction (BS) [7], which subtracts the pixel values of the constructed background frame from those of the current frame. The differences are then binarized and the moving object can be roughly detected. In addition, the multiple-resolution growth (MRG) method [3], differential histogram equalization (DHE) [4], and linear dependence (LD) methods [5] have been proposed. However, most methods cannot be applied directly to practical surveillance systems because of camera noise, light/shadow variation, and the compression effects in low-bit-rate video sequences. To overcome these problems, a combination of the above methods and newly proposed ones is desirable.

∗ This research was partially supported by the National Science Council, Taiwan, under contract NSC 95-2221-E-224-070-MY2.

Background extraction (BE) is the first step in extracting moving objects from video sequences, and precise BE is quite helpful for the MOE problem. Conventional BE methods determine the temporal variation of the grayscale value of each pixel [1], [2]. If the variation is less than a threshold value during a finite time period, then the corresponding pixel is assigned as a background pixel. However, the temporal grayscale variation of a pixel that does not belong to the moving object might increase for the following reasons: (1) the blocking effects caused by video compression, especially at low bit rates; (2) the different types of camera noise and vibration; (3) the intensity variation from imperfect light sources. Increasing the threshold value may solve the problems above. Instead of using the pixel-based BE (PBE) method, here we employ the block-based BE (BBE) method, in which only the block variation is determined, to construct the background image more efficiently and accurately.

Luminance-based motion detection [5], [6], [8], [9] is usually employed to detect moving objects in video sequences. However, in compressed video sequences the objects are blurred and their high-frequency components are discarded. An object whose luminance is similar to that of the background is difficult to detect using the luminance-based MOE (L-MOE) method. Therefore, in this study, we propose the RGB-color model-based motion object extraction (RGB-MOE) method, which also considers the color information when detecting moving objects. Furthermore, the previous background extraction, multi-resolution growing, differential histogram equalization, and morphological methods are integrated into a complete MOE system.

Proceedings of the 2006 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP'06) 0-7695-2745-0/06 $20.00 © 2006

2 Preliminary Methods

Conventional PBE is a pre-processing method in MOE. First, the variation of the grayscale value of each pixel in the frame is determined. A pixel is assigned to the background if the variation is less than a threshold value. Let B(x, y) denote the background information of the video, in which the digits '1' and '0' indicate that the (x, y)th pixel does and does not belong to the background, respectively. For every N frames, B(x, y) can be determined as

$$
B(x, y) =
\begin{cases}
1, & \dfrac{1}{N-1} \sum_{t=2}^{N} \left| I_t(x, y) - I_{t-1}(x, y) \right| < r \\
0, & \text{otherwise,}
\end{cases}
\tag{1}
$$

where $I_t(x, y)$ denotes the grayscale value of the (x, y)th pixel in the tth frame and r denotes the threshold value. Eq. (1) is applied to each pixel in the frame until the corresponding background pixel has been determined, at which point the background extraction is complete.
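The pixel-based test in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `pixel_background_mask` and the representation of a frame as a list of rows of grayscale values are assumptions made for the example.

```python
def pixel_background_mask(frames, r):
    """Apply Eq. (1) to a list of N grayscale frames.

    frames: list of frames, each a list of rows of grayscale values
            (hypothetical representation chosen for this sketch).
    r:      threshold on the mean absolute frame-to-frame difference.
    Returns a mask with 1 where the pixel is assigned to the background.
    """
    n = len(frames)
    if n < 2:
        raise ValueError("at least two frames are needed")
    height, width = len(frames[0]), len(frames[0][0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Sum of |I_t - I_{t-1}| for t = 2..N (indices 1..n-1 here).
            total = sum(abs(frames[t][y][x] - frames[t - 1][y][x])
                        for t in range(1, n))
            mask[y][x] = 1 if total / (n - 1) < r else 0
    return mask
```

For instance, with three 1 × 2 frames in which the first pixel barely changes and the second jumps by 40 gray levels, only the first pixel is marked as background for r = 2.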

Several motion detection methods are used in this paper: the BS, DHE, and MRG methods, which are briefly introduced as follows. The BS method is the simplest way to perform MOE in videos. By subtracting the background image from the current frame, the moving object can be detected wherever the pixel difference is greater than a threshold value. To reduce the disturbance from noise, a large threshold value is usually chosen. However, the accuracy is then low because some details of the moving object may be lost. To solve this problem, the DHE method determines the histogram of the difference image and then uses histogram equalization to enhance the difference image. The equalized histograms of a current frame with and without moving objects are quite different. By examining the characteristics of the equalized histogram, the moving object in the frame can be correctly detected.
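The first two stages of DHE, forming the difference image and equalizing its histogram, can be sketched as follows. This is only an illustration under stated assumptions: it uses the textbook cumulative-histogram equalization for integer gray levels in 0..255, the function names are hypothetical, and the final step of examining the equalized histogram is not specified in the text above and is therefore left out.

```python
def difference_image(frame, background):
    """Absolute per-pixel difference between the current frame and the background."""
    return [[abs(f - b) for f, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


def equalize(image, levels=256):
    """Textbook histogram equalization for integer values in 0..levels-1."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Flat image: nothing to stretch.
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in image]
```

In a difference image such as `[[0, 2], [190, 1]]`, the small differences 1 and 2 are stretched to 85 and 170, which makes weak changes much easier to separate from an unchanged background.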

The MRG method performs the MOE in a hierarchical fashion. First, the image frame is subsampled such that both the noise and the moving object information are suppressed. Then, MOE based on the proposed BBE method is performed on the low-resolution image. At each level, morphological dilation is applied to the extracted moving object. During the binarization process, a smaller threshold value is assigned to the neighboring area of the up-sampled dilated object. Finally, the real moving object can be extracted and the noise can be significantly suppressed. In this study, we integrate the above BBE, BS, DHE, and MRG methods in the R, G, and B color domains to efficiently perform MOE in compressed low-bit-rate videos.
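One possible reading of the hierarchical idea, reduced to two levels, is sketched below. It is an interpretation and not the MRG algorithm of [3]: the 2 × 2 averaging, the 3 × 3 dilation, the nearest-neighbour up-sampling, and the names `mrg_two_level`, `high_thr`, and `low_thr` are all assumptions, and the input is taken to be a difference image with even dimensions.

```python
def downsample(img):
    """Average non-overlapping 2x2 blocks (assumes even height and width)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def dilate(mask):
    """Binary dilation with a 3x3 structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w:
                            out[yy][xx] = 1
    return out


def upsample(mask):
    """Nearest-neighbour 2x up-sampling of a binary mask."""
    out = []
    for row in mask:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def mrg_two_level(diff, high_thr, low_thr):
    """Detect at low resolution, then relax the threshold near the detection."""
    coarse = [[1 if v > high_thr else 0 for v in row] for row in downsample(diff)]
    region = upsample(dilate(coarse))
    return [[1 if v > (low_thr if region[y][x] else high_thr) else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(diff)]
```

The point of the two thresholds is that a weak difference next to a confident detection is kept as part of the object, while an equally weak difference far away from any detection is still rejected as noise.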

[Flow chart nodes: Input image sequences; Partition into 8 × 8 blocks; Measure Vth of each 8 × 8 block in all frames; Measure variance of all Vth values; If variance < r (Yes: block of background; No: next block); Last block?; The background is completed? (No: r++); Output background image.]

Figure 1. The BBE flow chart.

3 Motion Object Extraction in Compressed Low-bit-rate Video

3.1 Block-based Background Extraction

In compressed low-bit-rate video, the mean value of a block is easily affected by noise. Therefore, the variance of the same pixel could increase across different frames. The threshold value in Eq. (1) must then be increased to accommodate the temporal variation of compressed low-bit-rate video, and thus the probability of false detection increases.

Figure 1 shows the flow chart of the proposed BBE method. The steps of the proposed BBE method are as follows. The first step is to measure the variance Vth of each block. Next, the variance D of these variance values across different frames is determined. If the value D is smaller than a predefined threshold value, then the corresponding block is regarded as background. The variance of a block of n × m pixels can be determined as

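$$
\mu_t^k = \frac{1}{m \times n} \sum_{x=0}^{n} \sum_{y=0}^{m} F_t^k(x, y), \qquad
\Delta_t^k = \mu_t^k - F_t^k(x, y),
\tag{2}
$$

where $\mu_t^k$ denotes the mean value of the kth block of the frame at time t, $F_t^k(x, y)$ denotes the kth block of the image, and $\Delta_t^k$ is the block variance. Here Eq. (1) can be rewritten as

$$
B^k =
\begin{cases}
1, & \dfrac{1}{N} \sum_{t=1}^{N} \left| \Delta_t^k - \Delta_{t-1}^k \right| < r \\
0, & \text{otherwise,}
\end{cases}
\tag{3}
$$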

where $B^k$ is the kth block in the background image, r is the threshold, and N is the number of frames. If the averaged variation on the left-hand side of Eq. (3) is smaller than r, then the corresponding kth block is regarded as a background block. Note that the computational complexity of the proposed BBE method is much lower than that of the PBE method.
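The block-based test can be sketched along the same lines as the pixel-based one. The sketch below rests on several assumptions that go beyond the text: the block statistic is taken to be the ordinary population variance of the pixel values in a square block, the sum runs over the N − 1 available frame pairs and is normalized by N − 1 (Eq. (3) as printed would need a frame at t = 0), and the names `block_variance` and `block_background_flags` are hypothetical.

```python
def block_variance(frame, top, left, size):
    """Population variance of the pixel values in one size x size block."""
    values = [frame[y][x]
              for y in range(top, top + size)
              for x in range(left, left + size)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def block_background_flags(frames, size, r):
    """Return one flag per block: 1 if the block is treated as background.

    frames: list of grayscale frames (lists of rows); partial blocks at the
            right and bottom edges are ignored in this sketch.
    """
    n = len(frames)
    if n < 2:
        raise ValueError("at least two frames are needed")
    height, width = len(frames[0]), len(frames[0][0])
    flags = []
    for top in range(0, height - size + 1, size):
        row = []
        for left in range(0, width - size + 1, size):
            variances = [block_variance(f, top, left, size) for f in frames]
            # Mean absolute change of the block variance between frames.
            change = sum(abs(variances[t] - variances[t - 1])
                         for t in range(1, n)) / (n - 1)
            row.append(1 if change < r else 0)
        flags.append(row)
    return flags
```

Because only one decision is made per block rather than per pixel, an 8 × 8 partition replaces 64 threshold tests with one, which is consistent with the lower complexity noted above.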


[Flow chart: Input image sequences → R, G, B components → BS, DHE, MRG → R, G, B results → OR operation → Extracted motion objects.]

Figure 2. The RGB-MOE flow chart.

3.2 Motion Object Extraction Method

Figure 2 shows the flow chart of the proposed RGB-MOE method. First, the moving objects in the R, G, and B color components (frames) are separately detected. The moving objects are determined using the following equation:

$$
D(x, y) = \frac{|F_R(x, y) - B_R(x, y)| + |F_G(x, y) - B_G(x, y)| + |F_B(x, y) - B_B(x, y)|}{3},
\tag{4}
$$

where $F_R(x, y)$ and $B_R(x, y)$ denote the red components in the foreground and background images, respectively. Similar representations are used for the green and blue components of the images. To detect the motion object, the following equation is utilized:

$$
M(x, y) =
\begin{cases}
1, & D(x, y) > Thr \\
0, & \text{otherwise,}
\end{cases}
\tag{5}
$$

where the symbol Thr denotes the threshold value. For example, suppose the R, G, and B values are 175, 33, and 68 in the foreground image and 80, 116, and 80 in the background image. The difference measured in luminance, taken here as the mean of the three components, is zero (|(175 + 33 + 68)/3 − (80 + 116 + 80)/3| = 0), whereas the difference in the RGB model is not. In the RGB-MOE, the average difference level is about 63 ((|175 − 80| + |33 − 116| + |68 − 80|)/3 ≅ 63). Therefore, the RGB-MOE method is expected to outperform the luminance-based methods. The idea of the RGB-MOE method is also applied to the MRG and DHE methods. As shown in Fig. 2, the RGB-MOE method, which combines the BS, DHE, MRG, and morphology methods, is used to perform the MOE. The motion objects separately detected by the different techniques are ORed to obtain the final result.
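The worked example can be checked directly with a short sketch of Eqs. (4) and (5) for a single pixel. The function names and the threshold of 30 used in the usage line are illustrative assumptions; the text does not state the value of Thr used in the experiments.

```python
def rgb_difference(fg, bg):
    """Eq. (4): mean of the absolute R, G, and B differences for one pixel."""
    return sum(abs(f - b) for f, b in zip(fg, bg)) / 3


def mean_intensity_difference(fg, bg):
    """Difference of the mean R, G, B values, as in the luminance example."""
    return abs(sum(fg) / 3 - sum(bg) / 3)


def is_moving(fg, bg, thr):
    """Eq. (5): 1 if the pixel is classified as part of a moving object."""
    return 1 if rgb_difference(fg, bg) > thr else 0


foreground = (175, 33, 68)
background = (80, 116, 80)
print(mean_intensity_difference(foreground, background))  # 0.0
print(round(rgb_difference(foreground, background)))      # 63
print(is_moving(foreground, background, thr=30))          # 1 (thr is hypothetical)
```

Both pixels have a mean component value of 92, so a detector that looks only at that mean sees no change, while the per-channel differences of 95, 83, and 12 average to roughly 63.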

Figure 3. The background image extracted by the use of (a) conventional PBE and (b) the proposed BBE methods.

4 Experimental Results

The test video has a frame rate of 5 frames per second and contains 30 frames. The threshold value r is 0.001 within six consecutive frames. Moving objects are present in every frame of the test video. Table 1 compares the PSNR values of the PBE and the proposed BBE methods. Compared with the previous PBE method, the PSNR value of BBE is slightly improved, by 0.03 to 0.18 dB over the three sequences. In addition, Table 2 shows that the processing time of the proposed BBE method is reduced by a factor of roughly 3.5 to 4.7 owing to the block-based computation. Moreover, an extremely low-bit-rate (0.1 bpp) compressed video is employed to test the quality of the extracted background. Figure 3 compares one of the background images extracted by the proposed BBE and the PBE methods. Compared with the PBE method, the subjective quality of the background constructed by the proposed BBE method is obviously improved. In addition, Table 3 shows that the objective quality of the background constructed by the proposed BBE method is also improved.
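The PSNR computation is not spelled out here. A minimal sketch, assuming the common definition for 8-bit images, PSNR = 10 log10(255² / MSE), with the extracted background compared against a reference background, is given below; the function name `psnr` and the `peak` parameter are illustrative.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    errors = [(a - b) ** 2
              for ref_row, test_row in zip(reference, test)
              for a, b in zip(ref_row, test_row)]
    mse = sum(errors) / len(errors)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

Under this definition, a background that differs from the reference by one gray level at every pixel scores about 48.13 dB, so values near 30 dB, as in Table 1, correspond to a root-mean-square error of roughly 8 gray levels.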

The default background is constructed by subjective visual inspection. The same parameters are applied to the L-MOE and the proposed RGB-MOE methods in the experiments. Figure 4 compares the motion objects extracted by both methods. The motion objects extracted by the proposed RGB-MOE method are better than those extracted by the L-MOE method. Moreover, the objective quality of the extracted image is also improved by the RGB-MOE method: Table 4 shows that the error rate of the proposed RGB-MOE method is smaller than that of the L-MOE method.

Table 1. Comparison of PSNR values (in dB).

             Pixel-based   Block-based
Sequence 1   30.40         30.52
Sequence 2   29.92         29.95
Sequence 3   31.39         31.57

Figure 4. The extracted motion objects using (a) the proposed RGB-MOE and (b) L-MOE methods.

5 Conclusion

In this paper, an efficient RGB-MOE method for compressed low-bit-rate video is proposed. The proposed method improves the subjective and objective quality of the background image and accelerates its construction. According to the experimental results, the proposed method performs the MOE more completely than the L-MOE method. Moreover, the proposed RGB-MOE method can be applied directly to practical surveillance systems. In future work, we can integrate other L-MOE methods with the proposed RGB-MOE method to obtain better performance in MOE.

Table 2. Comparison of processing time (in seconds).

             Pixel-based   Block-based
Sequence 1   58.79         16.72
Sequence 2   43.60         9.31
Sequence 3   49.84         11.10

Table 3. Comparison of PSNR values (in dB).

             Pixel-based   Block-based
16 × 16      20.19         20.37
8 × 8        20.19         20.85

Table 4. Comparison of pixel error rates.

             L-MOE         RGB-MOE
Sequence 1   0.65 %        0.60 %
Sequence 2   1.76 %        1.66 %
Sequence 3   0.60 %        0.55 %

References

[1] A. Lai and N. Yung, “A fast and accurate scoreboard algorithm for estimating stationary backgrounds in an image sequence,” IEEE International Symposium on Circuits and Systems, vol. 4, pp. 241–244, June 1998.

[2] G. Gordon, T. Darrell, M. Harville, and J. Woodfill, “Background estimation and removal based on range and color,” Computer Vision and Pattern Recognition, vol. 2, pp. 23–25, June 1999.

[3] J.H. Chang, H.H. Wu, J.C. Hsieh, M.H. Hsu, P.K. Weng, and Y.I. Wu, “Moving object segmentation with shadow suppression,” Workshop on Consumer Electronics and Signal Processing (WCEsp2004), Taiwan, November 2004.

[4] R.P.R. Hasanzadeh, A. Shahmirzaie, and A.H. Rezaie, “Motion detection using differential histogram equalization,” Proceedings of Fifth IEEE International Symposium on Image Analysis and Processing, pp. 404–409, September 2001.

[5] M. Latzel, E. Darcourt, and J.K. Tsotsos, “People tracking using robust motion detection and estimation,” Proceedings of The 2nd Canadian Conference on Computer and Robot Vision, pp. 270–275, May 2005.

[6] M.F. Abdelkader, R. Chellappa, Q. Zheng, and A.L. Chan, “Integrated motion detection and tracking for visual surveillance,” IEEE International Conference on Computer Vision Systems, pp. 28–28, January 2006.

[7] J.-C. Tai and K.-T. Song, “Background Segmentation and Its Application to Traffic Monitoring using Modified Histogram,” Proceedings of the 2004 IEEE International Conference on Networking, Sensing & Control, vol. 1, pp. 13–18, Taipei, Taiwan, March 2004.

[8] H.-L. Eng, K.-A. Toh, A.H. Kam, J. Wang, and W.-Y. Yau, “An automatic drowning detection surveillance system for challenging outdoor pool environments,” Computer Vision, vol. 1, pp. 532–539, 2003.

[9] W. Hu, M. Hu, X. Zhou, T. Tan, J. Lou, and S. Maybank, “Principal axis-based correspondence between multiple cameras for people tracking,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, pp. 663–671, April 2006.
