
Page 1: CHAPTER 5 PROPOSED APPROACHES FOR HUMAN …shodhganga.inflibnet.ac.in/bitstream/10603/24484/10/10_chapter 5.pdf · MODELLING USING SKIN COLOUR SEGMENTATION ... The face detection


CHAPTER 5

PROPOSED APPROACHES FOR HUMAN POSE

MODELLING USING SKIN COLOUR SEGMENTATION

5.1 INTRODUCTION

The previous chapter discussed the identification of human body parts using skin colour in colour space models. The identified body parts are used to build human pose models, from which human activity analysis can be performed. Existing methods that rely on a face detection algorithm, on reducing the search space of body parts, or on a learning approach model the human pose inefficiently and can construct only upper body models. To overcome these issues, this chapter proposes marker-less human pose modelling based on skin colour segmentation and the wavelet transform for the upper as well as the lower body. For this implementation, twelve predefined human poses from monocular video sequences are considered.

5.2 BACKGROUND

Lim Siew Hooi et al (2007) implemented a method for automatic human body estimation from monocular video sequences using skin colour features. The positions of the human limbs were identified using a skin colour segmentation technique, with the HSV (Hue Saturation Value) colour space used to find the skin regions of the face and arms. The head was identified using a vertical projection histogram technique. Only the upper body parts were estimated, and a feed forward neural network was used for pose classification.


Figure 5.1 shows the upper body parts detected for the left hand rise and standing poses. The head, right hand and left hand are detected.

In this approach, the colour space was used to find the face. The other body parts, such as the right hand and left hand, were then identified using a skin colour distribution modelling method. The drawback of this work is that if the face detection algorithm does not work well, the other body parts cannot be identified efficiently. Only upper body pose modelling in 2D views was attempted.

Figure 5.1 Human body parts identification using skin colour features: (a) Left hand rise; (b) Standing

Figure 5.2 Human pose modelling by Lim Siew Hooi et al (2007): A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right salute; I. Left salute

Figure 5.2 shows simulation results of upper body pose models for different video frames and backgrounds. Pose models are constructed for the poses Standing, Right hand rise, Left hand rise, Both hands rise, Right hand up, Left hand up, Both hands up, Right salute and Left salute. Table 5.1 shows the efficiency of the existing algorithm by Lim Siew Hooi et al (2007) for different videos, and Table 5.2 gives the efficiency and computation time for the various poses in a single video.


Table 5.1 Efficiency of upper body model by Lim Siew Hooi et al

Videos      No. of frames   No. of poses correctly detected   Efficiency (%)
Video 1     1006            870                               86.4
Video 2     1147            998                               87.0
Video 3     1382            1150                              83.2
Video 4     864             755                               87.4
Video 5     1502            1200                              79.9
Video 6     1257            1125                              89.5
Video 7     1286            1170                              91.0
Video 8     671             556                               82.9
Video 9     1600            1350                              84.4
Video 10    1826            1600                              87.6
Average                                                       85.9

Table 5.2 Efficiency of various human activities by Lim Siew Hooi et al

Name of the pose   No. of poses   No. of poses correctly detected   Efficiency (%)   Time (Sec)
A                  93             85                                91.4             1.21
B                  110            97                                88.2             1.42
C                  98             89                                90.8             1.26
D                  98             85                                86.7             1.41
E                  95             84                                88.4             1.23
F                  100            79                                79.0             1.43
G                  89             75                                84.3             1.11
H                  89             77                                86.5             0.96
I                  98             81                                82.7             1.25
Total/Average      870            752                               86.4             1.25


Eichner and Ferrari (2009) developed a method for upper body human pose modelling using pictorial structures in single images. The model was built around the torso at the middle of the upper body, and the appearances of the body parts were related to each other. Input test images were estimated using trained images. The algorithm had two steps, training and testing. In training, the prior probability of each part was considered, whereas in testing the appearance models for a new image were tested. Only upper body pose modelling was attempted.

Figure 5.3 Human pose modelling using Eichner and Ferrari (2009): A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right salute; I. Left salute

Figure 5.3 shows simulation results of upper body pose models for different poses using the method of Eichner and Ferrari (2009). Table 5.3 shows the efficiency of this existing algorithm for different videos, and Table 5.4 shows the efficiency and computation time for the various poses in a single video.

Table 5.3 Efficiency of existing algorithm for upper body model

Videos      No. of frames   No. of poses correctly detected   Efficiency (%)
Video 1     1006            850                               84.4
Video 2     1147            980                               85.4
Video 3     1382            1023                              74.0
Video 4     864             776                               89.8
Video 5     1502            1102                              73.3
Video 6     1257            996                               79.2
Video 7     1286            1100                              85.5
Video 8     671             600                               89.4
Video 9     1600            1256                              78.5
Video 10    1826            1563                              85.6
Average                                                       82.5


Table 5.4 Efficiency of human activities for a video

Name of the pose   No. of poses   No. of poses correctly detected   Efficiency (%)   Time (Sec)
A                  93             85                                91.4             1.54
B                  110            81                                73.6             1.63
C                  98             84                                85.7             0.96
D                  98             85                                86.7             0.89
E                  95             84                                88.4             1.75
F                  100            80                                80.0             1.62
G                  89             75                                84.3             0.82
H                  89             78                                87.6             1.47
I                  98             81                                82.6             1.72
Total/Average      870            733                               84.4             1.37

To overcome the issues of the background works described above, this research work proposes marker-less human pose modelling for both the upper and the lower body. The human body models are based on two kinds of features: skin colour and silhouette. Three skin colour based approaches are proposed: Skin Colour segmentation Modelling (SCM), Wavelet Transform Modelling (WTM) and Wavelet transform based Skin Colour segmentation Modelling (WSCM). Two silhouette based approaches are proposed: Thinning Algorithm Modelling (TAM) and Delaunay Triangulation Modelling (DTM). The skin colour based approaches are presented in this chapter, whereas Chapter 6 presents the silhouette based approaches.


5.3 SKIN COLOUR SEGMENTATION MODELLING (SCM)

5.3.1 Overview of an Approach

Here, an approach is proposed to develop human pose models using skin colour segmentation. In the previous work by Antoni Jaume I Capo et al (2006), only nine feature points were considered to model the human body, and full body modelling was not attempted. In the present research work, thirteen feature points covering the upper body as well as the lower body are considered for making the human pose models.

First, the videos are acquired from a monocular video camera. The pre-processing stage enhances the quality of the frames. The background of the scene is removed using the background subtraction method. The five terminal body parts (left hand, left leg, head, right leg and right hand) are identified using skin colour segmentation, and then the remaining eight points are detected: neck, left shoulder, right shoulder, left hand elbow, right hand elbow, abdomen, left knee and right knee. After the thirteen points are detected, 2D modelling and 3D modelling are implemented. In 2D modelling, a stick figure model and a cylinder model are proposed for the pre-defined poses. Finally, human activity analysis is performed. Figure 5.4 shows the flow chart of the proposed research work using the SCM approach. The proposed algorithm uses the following steps to model the human body for video surveillance applications.

Proposed Human Body Modelling algorithm using SCM approach

Step 0: Import the video sequences to MATLAB environment.

Step 1: Perform pre-processing.

Step 2: Perform background subtraction.

Step 3: Apply skin colour segmentation to the human body.


Step 4: Identify the terminal skin area using skin features.

Step 5: Determine neck and shoulder points.

Step 6: Determine elbow and knee joints.

Step 7: Find the centroid of the human body which is considered as

abdomen.

Step 8: Plot thirteen points on human body model.

Step 9: Develop 2D models and 3D models.

Figure 5.4 Flow chart of the proposed research work by SCM approach: Start -> Video acquisition -> Pre-processing -> Background subtraction -> Skin colour segmentation -> Find terminal points (5 Nos) -> Determination of remaining eight points -> Plot thirteen points on model -> 2D modelling and 3D modelling -> Human activity analysis -> End


5.3.2 Role of Skin Colour Segmentation

Skin colour is a basic feature for identifying, detecting and analysing human motion in video surveillance. It is an important feature of the human body and varies from person to person. Detecting skin colour is made difficult by camera noise and lighting conditions. Here, the function of skin colour segmentation is to separate skin regions from non-skin regions using colour space models, with the skin features isolated by applying proper threshold values.

In this research work, the RGB colour space produced better performance in detecting skin features, as described in Chapter 4, so it is used in this section to find the skin features of the human body in the video sequences. RGB is also the most widely used colour space for separating skin features. The threshold value for each colour component was selected by experimentation with various videos: 95 for the red component, 40 for the green component and 25 for the blue component. The R, G and B components are stored separately. In the proposed algorithm, the five terminal points (left hand, left leg, head, right leg and right hand) are found using the skin features. Figure 5.5 shows these five skin features for the poses both hands rise, both hands up, left hand up and crouching. These five points are essential for finding the remaining feature points.
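The thresholding described above can be sketched as follows. This is a minimal illustration in Python, not the implementation used in this work. It assumes that a pixel counts as skin when each of its components exceeds the corresponding threshold, which the text does not state explicitly. The names `is_skin` and `skin_mask` are hypothetical.

```python
# Thresholds quoted in the text for the R, G and B components.
R_MIN, G_MIN, B_MIN = 95, 40, 25


def is_skin(pixel):
    """Return True when an (R, G, B) pixel passes all three thresholds.

    Assumption: "threshold" means the component must exceed the value.
    """
    r, g, b = pixel
    return r > R_MIN and g > G_MIN and b > B_MIN


def skin_mask(frame):
    """Map a frame (rows of (R, G, B) tuples) to a binary mask of 0s and 1s."""
    return [[1 if is_skin(px) else 0 for px in row] for row in frame]


# Tiny 2x2 frame: two skin-like pixels, one dark pixel, one bluish pixel.
frame = [
    [(200, 150, 120), (10, 10, 10)],
    [(90, 160, 200), (180, 120, 90)],
]
mask = skin_mask(frame)
```

The resulting binary mask is the kind of input from which the terminal skin areas would then be located.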


Figure 5.5 Determination of five terminal segmented skin areas of the body: (a) Both hands rise; (b) Both hands up; (c) Left hand up; (d) Crouching

5.3.3 Detection of Body Segments

This section deals with the detection of the body parts neck, shoulders, hand elbows, abdomen and knees. These parts are used for making the human pose models.

5.3.3.1 Detection of neck, shoulder and hand elbow

After the head, right hand, left hand, right leg and left leg are detected using skin colour segmentation, the next five points are detected: neck, right shoulder, left shoulder, right hand elbow and left hand elbow. The neck is placed 15% from the top pixel of the human body. The distance between the neck point and each shoulder point is 19.5% of the width of the human body. From the shoulder points, the elbow points are tracked. The distance between two points, for example (p,q) and (r,s), is computed using the Euclidean Distance (ED) formula given in equation (5.1).

$$ED = \sqrt{(p-r)^{2} + (q-s)^{2}} \tag{5.1}$$
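A minimal sketch of equation (5.1) and of the neck and shoulder placement is given below. The coordinate convention, the placement of the neck on the vertical centre line, the reading of "15% from the top" as a fraction of the body height, and the function names are all assumptions made for illustration; they are not taken from this work.

```python
import math


def euclidean_distance(a, b):
    """Equation (5.1): distance between points a = (p, q) and b = (r, s)."""
    (p, q), (r, s) = a, b
    return math.sqrt((p - r) ** 2 + (q - s) ** 2)


def neck_and_shoulders(top, bottom, left, right):
    """Place the neck and shoulder points from the body's bounding extents.

    Assumptions (not stated in the text): coordinates are (x, y) with y
    growing downwards, the neck lies on the vertical centre line, "15% from
    the top" is a fraction of the body height, and both shoulders sit at
    neck height. "Left" and "right" refer to image coordinates.
    """
    height = bottom - top
    width = right - left
    neck = ((left + right) / 2, top + 0.15 * height)
    offset = 0.195 * width  # neck-to-shoulder distance
    left_shoulder = (neck[0] - offset, neck[1])
    right_shoulder = (neck[0] + offset, neck[1])
    return neck, left_shoulder, right_shoulder


neck, l_sh, r_sh = neck_and_shoulders(top=0, bottom=200, left=0, right=100)
```

For a body 100 pixels wide, `euclidean_distance(neck, l_sh)` comes out at about 19.5 pixels, matching the stated proportion.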

5.3.3.2 Detection of abdomen of the body

The abdomen of the body is determined by drawing a bounding box around the body after the desired segmentation result is extracted. The centroid of the bounding box is taken as the abdomen of the human body. Figure 5.6 illustrates this with the original video frame and the human body enclosed by its bounding box.
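The bounding box step can be sketched as below, assuming the segmentation result is available as a binary mask (rows of 0s and 1s). The function names are hypothetical, and the centre of the box is used as its centroid.

```python
def bounding_box(mask):
    """Return (top, bottom, left, right) of the nonzero pixels in a binary mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    return rows[0], rows[-1], min(cols), max(cols)


def abdomen_point(mask):
    """Centre of the bounding box, used here as the abdomen point (x, y)."""
    top, bottom, left, right = bounding_box(mask)
    return ((left + right) / 2, (top + bottom) / 2)


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
]
abdomen = abdomen_point(mask)
```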

Figure 5.6 Detection of bounding box around human body: (a) Original video frame; (b) Human body with bounding box

5.3.3.3 Detection of knee points

The knee joints lie halfway between the abdomen point and the leg points, where the leg points are detected by skin colour segmentation. After all the points are determined, the thirteen points are plotted as shown in Figure 5.7 for the standing, left hand rise and crouching poses. The points are then joined by lines to form the stick figure model, and cylinders are formed on the points to obtain the 2D cylinder model. Finally, the entire human body model is displayed.
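The knee placement and the joining of points into a stick figure can be sketched as follows. The list of links in `STICK_EDGES` is an assumed connectivity, since the text does not enumerate which points are joined, and all names are hypothetical.

```python
def midpoint(a, b):
    """Point halfway between a and b, each given as (x, y)."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


# Assumed connectivity for the stick figure; the text does not list the links.
STICK_EDGES = [
    ("head", "neck"),
    ("neck", "left_shoulder"),
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_hand"),
    ("neck", "right_shoulder"),
    ("right_shoulder", "right_elbow"),
    ("right_elbow", "right_hand"),
    ("neck", "abdomen"),
    ("abdomen", "left_knee"),
    ("left_knee", "left_leg"),
    ("abdomen", "right_knee"),
    ("right_knee", "right_leg"),
]


def add_knees(points):
    """Return a copy of points with each knee midway between abdomen and leg."""
    out = dict(points)
    out["left_knee"] = midpoint(points["abdomen"], points["left_leg"])
    out["right_knee"] = midpoint(points["abdomen"], points["right_leg"])
    return out


points = add_knees(
    {"abdomen": (50, 100), "left_leg": (40, 200), "right_leg": (60, 200)}
)
```

Drawing a line for each pair in `STICK_EDGES` would give a stick figure over the thirteen named points.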

Figure 5.7 Plotting of thirteen points on human body: (a) Standing; (b) Left hand rise; (c) Crouching

5.3.4 Human Pose Modelling

Two-dimensional modelling refers to a skeletal structure made of line segments linked by joints; it represents the actions and gestures of the original human in a two-dimensional view. The limitation of 2D modelling is its restricted view point. 3D modelling represents the body segments along three axes. In Figure 5.8, column I shows the input video images, and columns II and III show the outputs of stick figure modelling and cylinder modelling respectively. Models are shown for twelve human poses: standing, right hand rise, left hand rise, both hands rise, right hand up, left hand up, both hands up, right leg rise, left leg rise, right salute, left salute and crouching.

A 3D human body model is a three-dimensional representation of the body segments. A 3D model is reconstructed from 2D images by computer vision techniques using volumetric intersection. Here, spheres are used to develop the 3D model. A sphere is formed from an ellipsoid by adjusting its dimensions along the three axes: a 2D ellipse is created first and converted into a 3D ellipsoid, which is then converted into a sphere. This is applied at the thirteen points of the human body to form the 3D model representation shown in Figure 5.9 for the twelve poses.
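The relation between the ellipsoid and the sphere can be sketched with the standard parametric form of an axis-aligned ellipsoid, which reduces to a sphere when the three semi-axes are equal. The sampling scheme, the function names and the example centre are illustrative assumptions, not the construction used in this work.

```python
import math


def ellipsoid_point(centre, semi_axes, theta, phi):
    """Point on an axis-aligned ellipsoid for polar angle theta and azimuth phi."""
    cx, cy, cz = centre
    a, b, c = semi_axes
    return (
        cx + a * math.sin(theta) * math.cos(phi),
        cy + b * math.sin(theta) * math.sin(phi),
        cz + c * math.cos(theta),
    )


def sphere_surface(centre, radius, steps=8):
    """Sample a sphere as the special case of an ellipsoid with equal semi-axes."""
    axes = (radius, radius, radius)
    return [
        ellipsoid_point(centre, axes, math.pi * i / steps, 2 * math.pi * j / steps)
        for i in range(steps + 1)
        for j in range(steps)
    ]


# One sphere centred on a hypothetical feature point (z assumed to be 0).
surface = sphere_surface(centre=(10.0, 20.0, 0.0), radius=5.0)
```

Every sampled point lies at the chosen radius from the centre, which is the property that distinguishes the sphere from a general ellipsoid.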

5.3.5 Results and Discussion

This section presents the experimental results of 2D and 3D human

pose modelling using SCM approach. Here, the twelve human poses are

considered for the analysis.

Figure 5.8 Experimental results of 2D human pose modelling. (Column I) Original video frame; (Column II) Stick figure model; (Column III) Cylinder model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching

Figure 5.9 Experimental results of 3D human pose modelling. (Column I) Original video frame; (Column II) 3D model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching


The efficiency of the proposed algorithm is determined from the simulation results. It is calculated as the number of poses correctly detected as a percentage of the total number of poses in the video sequences, as shown in Table 5.5. Ten videos are considered for the analysis; for each, the efficiency of the 2D models (stick figure and cylinder) and of the 3D model is computed. The efficiency for the twelve poses in a single video (Video 1) is presented in Table 5.6.
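As a worked illustration of the efficiency measure, the short Python sketch below computes the percentage for the 2D result of Video 1 (900 poses correct out of 1006). The function name and the rounding to one decimal place are assumptions; the tabulated values may have been rounded differently.

```python
def efficiency(correct, total):
    """Percentage of poses correctly detected, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 1)


video1_2d = efficiency(900, 1006)
```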

Table 5.5 Efficiency of pose models using SCM

                          2D Modelling                                       3D Modelling
Videos     No. of frames  Stick figure   Cylinder    Efficiency (%)          No. of poses correct   Efficiency (%)
                          (poses correct)(poses correct)
Video 1    1006           900            900         89.5                    890                    88.4
Video 2    1147           1052           1052        91.7                    1012                   88.2
Video 3    1382           1280           1280        92.6                    1207                   87.3
Video 4    864            805            805         93.2                    700                    81.0
Video 5    1502           1342           1342        89.3                    1315                   87.5
Video 6    1257           1140           1140        90.6                    1108                   88.1
Video 7    1286           1200           1200        93.3                    1190                   92.5
Video 8    671            630            630         93.9                    620                    92.3
Video 9    1600           1510           1510        94.3                    1360                   85.0
Video 10   1826           1750           1750        95.8                    1700                   93.0
Average                                              92.4                                           88.3


Table 5.6 Efficiency of poses for a single video using SCM

                                 2D Modelling                                       3D Modelling
Name of the pose   No. of poses  Stick figure   Cylinder    Efficiency (%)          No. of poses correct   Efficiency (%)
                                 (poses correct)(poses correct)
A                  86            80             80          93.0                    83                     96.5
B                  81            72             72          88.9                    75                     92.6
C                  67            60             60          90.0                    60                     90.0
D                  85            77             77          90.6                    76                     89.4
E                  88            81             81          92.0                    80                     90.9
F                  94            80             80          85.1                    84                     89.4
G                  79            66             66          83.5                    67                     84.8
H                  93            83             83          89.2                    78                     83.8
I                  73            64             64          87.7                    64                     87.7
J                  61            52             52          85.2                    50                     82.0
K                  103           95             95          92.2                    90                     87.4
L                  96            90             90          94.0                    83                     86.5
Total/Average      1006          900            900         89.5                    890                    88.4

5.4 WAVELET TRANSFORM MODELLING (WTM)

The goal of this section is to develop human pose modelling through the Discrete Wavelet Transform (DWT) for monocular videos. Here, the DWT is used to extract the human body from the video scene.


5.4.1 Overview of an Approach

The features extracted from the human body are useful for modelling persons under surveillance and are applied to recover the human body poses. In the proposed work, shown in Figure 5.10, the video sequences are acquired using a camera and a pre-processing technique is applied to improve the quality of the frames. The human body is then segmented using the Discrete Wavelet Transform (DWT): the approximation information of the reference frame and that of the current frame are subtracted to obtain the human body. After the human body parts are detected, models are developed in 2D and 3D views. Activity analysis can be performed from the feature points obtained by the pose models.

Figure 5.10 Block diagram of the proposed work using WTM approach: Video acquisition -> Pre-processing -> Human body segmentation using DWT -> Determination of body parts -> 2D modelling and 3D modelling -> Human activity analysis


5.4.2 Human Body Segmentation using DWT

In the proposed research work, extraction of the human body plays an important role in the further analysis. It is implemented with the two-dimensional discrete wavelet transform. The main function of human body segmentation is to extract the features of the human, which can be merged to build the objects of interest on which analysis is performed. The methodology of human body segmentation is explained in Section 3.5. Figure 5.11 shows the segmentation procedure: Figure 5.11(e) gives the human body obtained by background subtraction, in which each of the four frequency components of Figure 5.11(d) is subtracted from the corresponding component of Figure 5.11(c).
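The idea of subtracting wavelet sub-bands of a reference frame and a current frame can be sketched as follows. The sketch is deliberately reduced: it uses only the approximation sub-band, it assumes a one-level Haar wavelet (the wavelet family is not named in this section), and the threshold value and function names are hypothetical. Plain Python lists keep the example free of external dependencies.

```python
def haar_approximation(frame):
    """One-level 2D Haar approximation (LL) sub-band of a grey-level frame.

    Each output value is the sum of a 2x2 block divided by 2, which is the
    orthonormal Haar low-pass result. Odd trailing rows/columns are dropped.
    """
    rows, cols = len(frame) // 2 * 2, len(frame[0]) // 2 * 2
    return [
        [
            (frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) / 2
            for x in range(0, cols, 2)
        ]
        for y in range(0, rows, 2)
    ]


def segment_body(reference, current, threshold=20):
    """Binary mask where the approximation sub-bands differ by more than threshold."""
    ref_ll = haar_approximation(reference)
    cur_ll = haar_approximation(current)
    return [
        [1 if abs(c - r) > threshold else 0 for r, c in zip(ref_row, cur_row)]
        for ref_row, cur_row in zip(ref_ll, cur_ll)
    ]


# 4x4 empty background, and the same scene with a bright 2x2 "person".
reference = [[10] * 4 for _ in range(4)]
current = [row[:] for row in reference]
current[2][2] = current[2][3] = current[3][2] = current[3][3] = 200
mask = segment_body(reference, current)
```

The mask has half the resolution of the input frames, because a one-level transform halves each dimension of the approximation sub-band.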

Figure 5.11 Human body segmentation using Discrete Wavelet Transform (DWT): (a) Reference frame; (b) Current frame; (c) DWT of (a); (d) DWT of (b); (e) Human body


5.4.3 Determination of Body Parts

Thirteen feature points are considered for human pose modelling in this research work: terminating points (5 Nos), intersecting points (2 Nos), shoulder points (2 Nos), elbow joints (2 Nos) and knee joints (2 Nos). The terminating points are the head, right hand, left hand, right leg and left leg. The intersecting points are the neck and abdomen. Initially, the terminating points are determined by traversing the pixels either row wise or column wise.

Figure 5.12 Graphical ideas to find human body segments: (a) Points determination on body model (labels: Shoulder Point, Elbow Joint, Knee Joint; finding of head, right and left hands, right and left legs; finding of 2 shoulders, 2 elbows, 2 knees); (b) Plotting of thirteen points on human body model


Figure 5.12 illustrates how the proposed algorithm finds the human body segments. Figure 5.12(a) shows how the feature points are determined on the human body. Every frame is an array of rows and columns in which the pixel values are stored. Modelling is carried out on the binary form of the original image, which is easy to traverse because it contains only ones and zeros. When a person enters the room, the top pixel is detected by traversing row wise from the top of the image; this point is the head. The right hand and left hand are located by scanning the pixels from left to right and from right to left. The legs are then determined by scanning the pixels bottom-up while keeping the column constant.
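The scanning procedure can be sketched on a small binary mask as shown below. The rule used for the legs (the lowest foreground pixel on each side of the head column) is an assumption, since the text only states that the legs are found by a bottom-up scan. "Left" and "right" refer to image coordinates, and the function name is hypothetical.

```python
def terminal_points(mask):
    """Find five extreme foreground pixels in a binary mask, as (x, y) tuples.

    Assumption: each leg is the lowest foreground pixel on one side of the
    head column; if a side is empty, the lowest pixel overall is used.
    """
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no foreground pixels")
    # First foreground pixel met when traversing rows from the top.
    head = min(pixels, key=lambda p: (p[1], p[0]))
    # First foreground pixels met when scanning columns from each side.
    leftmost = min(pixels, key=lambda p: (p[0], p[1]))
    rightmost = max(pixels, key=lambda p: (p[0], -p[1]))
    left_side = [p for p in pixels if p[0] < head[0]] or pixels
    right_side = [p for p in pixels if p[0] > head[0]] or pixels
    return {
        "head": head,
        "left_hand": leftmost,
        "right_hand": rightmost,
        "left_leg": max(left_side, key=lambda p: p[1]),
        "right_leg": max(right_side, key=lambda p: p[1]),
    }


mask = [
    [0, 0, 1, 0, 0],  # head
    [1, 1, 1, 1, 1],  # outstretched arms
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],  # feet
]
points = terminal_points(mask)
```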

After the head, right hand, left hand, right leg and left leg are detected, the next five points are detected: neck, right shoulder, left shoulder, right hand elbow and left hand elbow. The neck is placed 15% from the top pixel of the human body. The shoulder point is placed 19.5% from the top pixel of the human body. From the shoulder points, the elbow points are tracked: each elbow point is approximately halfway between the shoulder and the terminating point of the corresponding hand.
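The proportions above can be sketched as follows, assuming the 15% and 19.5% figures are fractions of the body height measured down from the top pixel. That reading, the example coordinates and the function names are assumptions made for illustration.

```python
def landmark_rows(top, bottom):
    """Rows of the neck and shoulder line, measured down from the top pixel.

    Assumption: the 15% and 19.5% figures are fractions of the body height.
    """
    height = bottom - top
    return top + 0.15 * height, top + 0.195 * height


def elbow_point(shoulder, hand):
    """Elbow placed halfway between a shoulder and the matching hand point."""
    return ((shoulder[0] + hand[0]) / 2, (shoulder[1] + hand[1]) / 2)


neck_row, shoulder_row = landmark_rows(top=20, bottom=220)
elbow = elbow_point(shoulder=(70, shoulder_row), hand=(130, shoulder_row))
```

For a body spanning rows 20 to 220, the neck falls near row 50 and the shoulder line near row 59.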

The abdomen is one of the important feature points in developing the human pose model. The procedure to find the abdomen is the same as in Section 5.3.3.2. The knee joints lie halfway between the abdomen point and the detected leg points. After all the points are determined using this approach, the thirteen points are plotted as shown in Figure 5.12(b).

5.4.4 Results and Discussion

The proposed algorithm is implemented on indoor surveillance videos with straight poses. Twelve human poses from the monocular video sequences are used for 2D and 3D pose modelling through the Discrete Wavelet Transform (DWT). Figure 5.13 shows the experimental results of 2D modelling, in which the stick figure model and the cylinder model are displayed. The simulation results of the 3D models are shown in Figure 5.14.

Figure 5.13 Experimental results of 2D human pose modelling using WTM approach. (Column I) Original video frame; (Column II) Stick figure model; (Column III) Cylinder model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching

Figure 5.14 Experimental results of 3D human pose modelling. (Column I) Original video frame; (Column II) 3D model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching

Twelve human poses are considered for making the models: standing, right hand rise, left hand rise, both hands rise, right hand up, left hand up, both hands up, right leg rise, left leg rise, right salute, left salute and crouching. The efficiency is calculated as the number of poses correctly detected as a percentage of the total number of poses in the video sequences, as shown in Table 5.7. The 2D models give better results than the 3D models because developing a 3D model is more complex. The stick figure model and the cylinder model produce the same efficiency because both are easy to construct. The efficiency for the twelve poses in a single video (Video 1) is presented in Table 5.8.


Table 5.7 Efficiency of models for different videos using WTM

                          2D Modelling                                       3D Modelling
Videos     No. of frames  Stick figure   Cylinder    Efficiency (%)          No. of poses correct   Efficiency (%)
                          (poses correct)(poses correct)
Video 1    1006           950            950         94.4                    901                    89.5
Video 2    1147           1099           1099        95.8                    1045                   91.1
Video 3    1382           1296           1296        93.8                    1274                   92.1
Video 4    864            790            790         91.4                    785                    90.8
Video 5    1502           1370           1370        91.2                    1351                   89.9
Video 6    1257           1149           1149        90.7                    1141                   90.8
Video 7    1286           1220           1220        94.8                    1205                   93.7
Video 8    671            650            650         96.8                    630                    93.9
Video 9    1600           1531           1531        95.7                    1505                   94.1
Video 10   1826           1789           1789        97.9                    1751                   95.9
Average                                              94.3                                           92.2

Table 5.8 Efficiency of human poses for a single video using WTM

                                 2D Modelling                                       3D Modelling
Name of the pose   No. of poses  Stick figure   Cylinder    Efficiency (%)          No. of poses correct   Efficiency (%)
                                 (poses correct)(poses correct)
A                  86            84             84          97.7                    84                     97.7
B                  81            78             78          96.3                    76                     93.8
C                  67            64             64          95.5                    62                     92.5
D                  85            80             80          94.1                    77                     90.6
E                  88            85             85          96.6                    81                     92.0
F                  94            88             88          93.6                    86                     91.5
G                  79            70             70          88.6                    68                     86.1
H                  93            87             87          93.5                    79                     84.9
I                  73            69             69          94.5                    65                     89.0
J                  61            55             55          90.2                    50                     82.0
K                  103           97             97          94.2                    90                     87.4
L                  96            93             93          96.9                    83                     86.5
Total/Average      1006          950            950         94.4                    901                    89.5


5.5 WAVELET TRANSFORM BASED SKIN COLOUR SEGMENTATION MODELLING (WSCM)

5.5.1 Introduction

The aim of this section is to develop human pose models in two-dimensional (2D) and three-dimensional (3D) views through wavelet based skin colour segmentation for monocular videos. The approximation information of the Discrete Wavelet Transform (DWT) is used to segment the human body from the video frame and to find the skin features. The pose model is built by detecting thirteen feature points in the upper and lower body. Twelve human poses are considered for the analysis.
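As an illustration of what the approximation information of a DWT is, the following sketch computes the one-level approximation (LL) sub-band of a greyscale frame. It assumes the Haar wavelet and a frame with even dimensions stored as a list of rows. The wavelet family actually used is not stated in this section, and the function name `haar_approximation` is hypothetical.

```python
from typing import List

Frame = List[List[float]]


def haar_approximation(frame: Frame) -> Frame:
    """Return the one-level Haar approximation (LL) sub-band of a frame.

    Assumption: orthonormal Haar filters, so each output value is the sum
    of a 2 x 2 block of input pixels divided by 2.
    """
    rows = len(frame)
    cols = len(frame[0]) if rows else 0
    if rows == 0 or cols == 0:
        raise ValueError("frame must not be empty")
    if any(len(row) != cols for row in frame):
        raise ValueError("all rows must have the same length")
    if rows % 2 or cols % 2:
        raise ValueError("frame dimensions must be even")

    approximation = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            block_sum = (frame[r][c] + frame[r][c + 1]
                         + frame[r + 1][c] + frame[r + 1][c + 1])
            out_row.append(block_sum / 2.0)
        approximation.append(out_row)
    return approximation
```

Each output value summarises one 2 x 2 block of the input, so the sub-band is a smoothed, half-resolution version of the frame in which fine detail is suppressed.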

5.5.2 Proposed Approach

This section presents wavelet based skin colour segmentation for developing the human pose models. First, the video sequences are acquired using the camera, and a pre-processing technique is applied to improve the quality of each frame. The two-dimensional wavelet transform, discussed in Section 3.5, is then used to segment the human body from the video frame. Once the human body is obtained, skin colour segmentation is applied to extract the skin features used to build the pose models; the identification of skin regions is discussed in Section 5.3.2. The terminating points (head, right hand, left hand, right leg, and left leg) are found first, and the remaining points are detected from them. In total, thirteen feature points are considered for building the human pose models in 2D and 3D views. The steps of the proposed approach are as follows:


Wavelet Transform based Skin Colour Segmentation Modelling (WSCM) Algorithm

Step 0: Acquire video sequences from the video camera.

Step 1: Perform pre-processing.

Step 2: Apply DWT to segment human body from the video frames.

Step 3: Apply skin colour segmentation to extract skin features.

Step 4: Identify the thirteen human body feature points.

Step 5: Develop the human pose models.
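This section does not state how the terminating points in Step 4 are located. One simple possibility, shown here only as a sketch under that assumption, is to take extreme foreground pixels of the binary body mask: the topmost pixel for the head, the leftmost and rightmost pixels for the hands, and the lowest pixel on each side of the centroid column for the legs. The function name, the mask representation, and the image-left/image-right labels are illustrative choices, not taken from the source.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]  # (row, column) in image coordinates


def terminating_points(mask: List[List[int]]) -> Dict[str, Point]:
    """Pick five extreme foreground pixels of a binary body mask.

    Assumption: non-zero entries mark the segmented body, and the subject
    is upright. "left" and "right" refer to the image, not the subject.
    """
    pixels = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not pixels:
        raise ValueError("mask contains no foreground pixels")

    # Column of the centroid, used to split the body into two halves.
    centre_col = sum(c for _, c in pixels) / len(pixels)
    left_half = [p for p in pixels if p[1] < centre_col] or pixels
    right_half = [p for p in pixels if p[1] >= centre_col] or pixels

    return {
        # Topmost pixel; ties go to the one nearest the centroid column.
        "head": min(pixels, key=lambda p: (p[0], abs(p[1] - centre_col))),
        # Leftmost and rightmost pixels; ties go to the higher pixel.
        "hand_image_left": min(pixels, key=lambda p: (p[1], p[0])),
        "hand_image_right": max(pixels, key=lambda p: (p[1], -p[0])),
        # Lowest pixel in each half; ties go to the outermost pixel.
        "leg_image_left": max(left_half, key=lambda p: (p[0], -p[1])),
        "leg_image_right": max(right_half, key=lambda p: (p[0], p[1])),
    }


# A tiny hand-made silhouette with both arms stretched out sideways.
example_mask = [
    [0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
]
example_points = terminating_points(example_mask)
```

Extreme-pixel rules of this kind are only reliable when the limbs are clearly separated from the torso; for poses such as standing with the arms down, the skin regions from Step 3 would be needed to disambiguate the hands.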

5.5.3 Results and Discussion

The experimental results of the proposed models are presented in this section. The approach is implemented on indoor surveillance videos with straight poses. Twelve human poses from the monocular video sequences are used for the 2D and 3D pose modelling through wavelet based skin colour segmentation. Figure 5.15 shows the experimental results of 2D modelling, in which the stick figure and cylinder models are indicated. Figure 5.16 displays the results of 3D modelling for the different poses using the WSCM approach.

[Figure 5.15 images, one row per pose: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching]

Figure 5.15 Experimental results of 2D human pose modelling using WSCM approach. (Column I) Original video frame; (Column II) Stick figure model; and (Column III) Cylinder model


[Figure 5.16 images, one row per pose: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching]

Figure 5.16 Experimental results of 3D human pose modelling using WSCM approach. (Column I) Original video frame; (Column II) 3D model

The efficiency is calculated as the number of poses correctly detected out of the total number of poses in the video sequences, as shown in Table 5.9. The 2D models provide better results than the 3D model because the 3D model is more complex to develop than the 2D models. The efficiency for each of the twelve poses in a single video (Video 1) is presented in Table 5.10.
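The efficiency measure described above can be written as a short function. This is a minimal sketch: the function name `efficiency` is hypothetical, and the example values are the Video 6 counts reported in Table 5.9 (1257 frames, 1179 correct in 2D, 1100 correct in 3D).

```python
def efficiency(correct: int, total: int) -> float:
    """Percentage of poses correctly detected out of the total number of poses."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return 100.0 * correct / total


# Video 6 counts as reported in Table 5.9.
video6_2d = efficiency(1179, 1257)  # 2D models (stick figure and cylinder)
video6_3d = efficiency(1100, 1257)  # 3D model
print(f"2D: {video6_2d:.1f}%  3D: {video6_3d:.1f}%")
```

The tabulated percentages are given to one decimal place, so a recomputed value can differ from the table in the last digit.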


Table 5.9 Efficiency of pose models using WSCM approach

| Videos | No. of frames | 2D: No. of poses correct (Stick Fig.) | 2D: No. of poses correct (Cylinder) | 2D: Efficiency (%) | 3D: No. of poses correct | 3D: Efficiency (%) |
|---|---|---|---|---|---|---|
| Video 1 | 1006 | 965 | 965 | 96.0 | 940 | 93.4 |
| Video 2 | 1147 | 1102 | 1102 | 96.1 | 1095 | 95.5 |
| Video 3 | 1382 | 1300 | 1300 | 94.0 | 1296 | 93.8 |
| Video 4 | 864 | 850 | 850 | 98.3 | 840 | 97.2 |
| Video 5 | 1502 | 1410 | 1410 | 93.8 | 1396 | 92.9 |
| Video 6 | 1257 | 1179 | 1179 | 93.8 | 1100 | 87.5 |
| Video 7 | 1286 | 1250 | 1250 | 97.2 | 1195 | 92.9 |
| Video 8 | 671 | 650 | 650 | 96.8 | 630 | 93.9 |
| Video 9 | 1600 | 1580 | 1580 | 98.7 | 1550 | 96.9 |
| Video 10 | 1826 | 1750 | 1750 | 95.8 | 1740 | 95.3 |
| Average | | | | 96.1 | | 93.9 |

Table 5.10 Effectiveness of human poses for a single video

| Name of the poses | No. of poses | 2D: No. of poses correct (Stick Figure) | 2D: No. of poses correct (Cylinder) | 2D: Efficiency (%) | 3D: No. of poses correct | 3D: Efficiency (%) |
|---|---|---|---|---|---|---|
| A | 86 | 84 | 84 | 97.6 | 80 | 93.0 |
| B | 81 | 78 | 78 | 96.2 | 77 | 95.1 |
| C | 67 | 65 | 65 | 97.0 | 60 | 90.0 |
| D | 85 | 85 | 85 | 100 | 80 | 94.1 |
| E | 88 | 85 | 85 | 96.5 | 83 | 94.3 |
| F | 94 | 90 | 90 | 96.0 | 88 | 94.0 |
| G | 79 | 75 | 75 | 95.0 | 73 | 92.4 |
| H | 93 | 87 | 87 | 93.5 | 86 | 92.4 |
| I | 73 | 70 | 70 | 96.0 | 68 | 93.2 |
| J | 61 | 55 | 55 | 90.2 | 54 | 89.0 |
| K | 103 | 98 | 98 | 95.1 | 98 | 95.1 |
| L | 96 | 93 | 93 | 97.0 | 93 | 97.0 |
| Total/Average | 1006 | 965 | 965 | 96.0 | 940 | 93.4 |


5.6 SUMMARY

In this chapter, an approach to marker-less 2D and 3D human pose modelling using skin colour segmentation and the discrete wavelet transform has been discussed. Using these models, twelve predefined human poses are detected and analyzed. The efficiency for different videos and for the different poses in a single video is presented.