CHAPTER 5
PROPOSED APPROACHES FOR HUMAN POSE
MODELLING USING SKIN COLOUR SEGMENTATION
5.1 INTRODUCTION
The previous chapter discussed the identification of human body parts from skin colour using colour space models. The identified body parts are used to build human pose models, from which human activity analysis can be performed. Existing approaches that rely on a face detection algorithm, a reduced search space for body parts and a learning stage model human poses inefficiently, and only upper body models can be constructed with them. To overcome these issues, this chapter proposes marker-less human pose modelling based on skin colour segmentation and the wavelet transform for the upper as well as the lower body. Twelve predefined human poses taken from monocular video sequences are considered for the implementation.
5.2 BACKGROUND
Lim Siew Hooi et al (2007) implemented a method for automatic human body estimation from monocular video sequences using skin colour features. The positions of the human limbs were identified using a skin colour segmentation technique. The HSV (Hue Saturation Value) colour space was used to find the skin regions of the face and arms. The head was identified using the vertical projection histogram technique. Only the upper body parts were detected and estimated. A feed forward neural network was used for pose classification. Figure 5.1 shows the upper body parts for the left hand rise and standing poses, in which the head, right hand and left hand are detected.
In this approach, the colour space was used to find the face. The other body parts, such as the right hand and left hand, were then identified using a skin colour distribution modelling method. The drawback of this work is that if the face detection algorithm does not work well, the other body parts cannot be identified efficiently. Moreover, only upper body pose modelling in 2D views was attempted.
Figure 5.1 Human body parts identification using skin colour features: (a) Left hand rise; (b) Standing
Figure 5.2 Human pose modelling by Lim Siew Hooi et al (2007): A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right salute; I. Left salute
Figure 5.2 shows simulation results of the upper body pose models for different video frames and backgrounds. Pose models are constructed for the poses Standing, Right hand rise, Left hand rise, Both hands rise, Right hand up, Left hand up, Both hands up, Right salute and Left salute. Table 5.1 shows the efficiency of the existing algorithm by Lim Siew Hooi et al (2007) for different videos, and Table 5.2 gives the efficiency and computation time for the various poses in a single video.
Table 5.1 Efficiency of upper body model by Lim Siew Hooi et al
Videos    No. of frames   No. of poses correctly detected   Efficiency (%)
Video 1   1006            870                               86.4
Video 2   1147            998                               87.0
Video 3   1382            1150                              83.2
Video 4   864             755                               87.4
Video 5   1502            1200                              79.9
Video 6   1257            1125                              89.5
Video 7   1286            1170                              91.0
Video 8   671             556                               82.9
Video 9   1600            1350                              84.4
Video 10  1826            1600                              87.6
Average                                                     85.9
Table 5.2 Efficiency of various human activities by Lim Siew Hooi et al
Name of the poses   No. of poses   No. of poses correctly detected   Efficiency (%)   Time (Sec)
A                   93             85                                91.4             1.21
B                   110            97                                88.2             1.42
C                   98             89                                90.8             1.26
D                   98             85                                86.7             1.41
E                   95             84                                88.4             1.23
F                   100            79                                79.0             1.43
G                   89             75                                84.3             1.11
H                   89             77                                86.5             0.96
I                   98             81                                82.7             1.25
Total/Average       870            752                               86.4             1.25
Eichner and Ferrari (2009) developed a method for upper body human pose modelling in single images using pictorial structures. The model is built around the torso in the middle of the upper body, and the appearances of the body parts are related to one another. The poses in the input test images were estimated using trained images. The algorithm has two phases, training and testing. In training, the prior probability of each part was considered, whereas in testing the appearance models for a new image were evaluated. Here too, only upper body pose modelling was attempted.
Figure 5.3 Human pose modelling using Eichner and Ferrari (2009): A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right salute; I. Left salute
Figure 5.3 shows simulation results of the upper body pose models for different poses using the method of Eichner and Ferrari (2009). Table 5.3 shows the efficiency of this existing algorithm for different videos. Table 5.4 shows the efficiency and computation time for the various poses in a single video.
Table 5.3 Efficiency of existing algorithm for upper body model
Videos    No. of frames   No. of poses correctly detected   Efficiency (%)
Video 1   1006            850                               84.4
Video 2   1147            980                               85.4
Video 3   1382            1023                              74.0
Video 4   864             776                               89.8
Video 5   1502            1102                              73.3
Video 6   1257            996                               79.2
Video 7   1286            1100                              85.5
Video 8   671             600                               89.4
Video 9   1600            1256                              78.5
Video 10  1826            1563                              85.6
Average                                                     82.5
Table 5.4 Efficiency of human activities for a video
Name of the poses   No. of poses   No. of poses correctly detected   Efficiency (%)   Time (Sec)
A                   93             85                                91.4             1.54
B                   110            81                                73.6             1.63
C                   98             84                                85.7             0.96
D                   98             85                                86.7             0.89
E                   95             84                                88.4             1.75
F                   100            80                                80.0             1.62
G                   89             75                                84.3             0.82
H                   89             78                                87.6             1.47
I                   98             81                                82.6             1.72
Total/Average       870            733                               84.4             1.37
To overcome the issues of the background works described above, this research work proposes marker-less human pose modelling for the upper and lower body. The human body models are based on two kinds of features, skin colour and silhouette. Three skin colour based approaches are proposed: Skin Colour segmentation Modelling (SCM), Wavelet Transform Modelling (WTM) and Wavelet transform based Skin Colour segmentation Modelling (WSCM). Two silhouette based approaches are proposed: Thinning Algorithm Modelling (TAM) and Delaunay Triangulation Modelling (DTM). The skin colour based approaches are presented in this chapter, whereas Chapter 6 presents the silhouette based approaches.
5.3 SKIN COLOUR SEGMENTATION MODELLING (SCM)
5.3.1 Overview of an Approach
Here, an approach is proposed to develop human pose models using skin colour segmentation. In the previous work by Antoni Jaume I Capo et al (2006), only nine feature points were considered to model the human body and full body modelling was not attempted. In the present research work, thirteen feature points covering the upper body as well as the lower body are considered for making the human pose models.
First, the videos are acquired from a monocular video camera. The pre-processing stage enhances the quality of the frames. The background of the scene is removed using the background subtraction method. The five body parts left hand, left leg, head, right leg and right hand are identified using skin colour segmentation, and then the remaining eight points are detected: neck, left shoulder, right shoulder, left hand elbow, right hand elbow, abdomen, left knee and right knee. After the thirteen points are detected, 2D modelling and 3D modelling are implemented. In the 2D modelling, a stick figure model and a cylinder model are proposed for the predefined poses. Finally, human activity analysis is performed. Figure 5.4 shows the flow chart of the proposed research work using the SCM approach. The proposed algorithm uses the following steps to model the human body for video surveillance applications.
Proposed Human Body Modelling algorithm using SCM approach
Step 0: Import the video sequences to MATLAB environment.
Step 1: Perform pre-processing.
Step 2: Perform background subtraction.
Step 3: Apply skin colour segmentation to the human body.
Step 4: Identify the terminal skin area using skin features.
Step 5: Determine neck and shoulder points.
Step 6: Determine elbow and knee joints.
Step 7: Find the centroid of the human body which is considered as
abdomen.
Step 8: Plot thirteen points on human body model.
Step 9: Develop 2D models and 3D models.
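As an illustration of Step 2, the following is a minimal sketch of background subtraction by frame differencing, written in Python with NumPy rather than MATLAB. The function name, the greyscale input and the threshold value are assumptions made for the example; the chapter does not state which background subtraction variant or threshold is used.

```python
import numpy as np

def subtract_background(frame, background, threshold=30):
    """Return a boolean foreground mask for one greyscale frame.

    `threshold` is a hypothetical value; the chapter does not state one.
    """
    # Use a signed type so the difference of uint8 images cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

# Tiny synthetic example: an empty scene and a frame with a bright 2x2 "person".
background = np.full((4, 4), 10, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200
mask = subtract_background(frame, background)
```

The resulting mask marks the pixels that differ from the reference scene and could then be passed to the skin colour segmentation of Step 3.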
Figure 5.4 Flow chart of the proposed research work by SCM Approach
Start -> Video acquisition -> Pre-processing -> Background Subtraction -> Skin Colour Segmentation -> Find Terminal Points (5 Nos) -> Determination of remaining eight points -> Plot thirteen points on model -> 2D Modelling / 3D Modelling -> Human Activity analysis -> End
5.3.2 Role of Skin Colour Segmentation
Skin colour is a basic feature for identifying, detecting and analysing human motion in video surveillance. It is an important feature of the human body that varies from person to person. Skin colour detection is made difficult by camera noise and lighting conditions. Here, the function of skin colour segmentation is to separate skin regions from non-skin regions using colour space models, with proper threshold values applied to the colour components.
In this research work, the RGB colour space produced better performance in detecting skin features, as described in Chapter 4, so it is used in this section to find the skin features of the human body in the video sequences. RGB is the most widely used colour space for separating skin features. The threshold value for each colour component was selected by experimentation with various videos: 95 for the red component, 40 for the green component and 25 for the blue component. The R, G and B components are stored separately. In the proposed algorithm, the five terminal points left hand, left leg, head, right leg and right hand are found using the skin features. Figure 5.5 shows the five skin features of the human body that represent these body parts for the poses both hands rise, both hands up, left hand up and crouching. These five points are essential for finding the remaining feature points.
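The thresholding described above can be sketched as follows in Python with NumPy. The thresholds 95, 40 and 25 are those quoted in the text. The direction of the comparison (a pixel is treated as skin when each channel exceeds its threshold) and the function name are assumptions of this example.

```python
import numpy as np

R_MIN, G_MIN, B_MIN = 95, 40, 25  # thresholds quoted in the text

def skin_mask(rgb):
    """Boolean mask of candidate skin pixels in an H x W x 3 uint8 RGB image.

    Assumption: a pixel counts as skin when every channel exceeds its threshold.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (r > R_MIN) & (g > G_MIN) & (b > B_MIN)

image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = (200, 120, 90)   # skin-like pixel
image[1, 1] = (30, 30, 30)     # dark background pixel
mask = skin_mask(image)
```

Connected regions of the mask would then be the candidates for the five terminal skin areas.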
Figure 5.5 Determination of five terminal segmented skin area of body: (a) Both hands rise; (b) Both hands up; (c) Left hand up; (d) Crouching
5.3.3 Detection of Body Segments
This section deals with the detection of body parts such as the neck, shoulders, hand elbows, abdomen and knees. These parts are used for making human pose models.
5.3.3.1 Detection of neck, shoulder and hand elbow
After the head, right hand, left hand, right leg and left leg are detected using skin colour segmentation, the next five points, namely the neck, right shoulder, left shoulder, right hand elbow and left hand elbow, are detected. The neck is placed 15% from the top pixel of the human body. The distance between the neck point and each shoulder point is 19.5% of the width of the human body. From the shoulder point, the elbow points are tracked. The distance between two points, for example (p,q) and (r,s), is computed using the Euclidean Distance (ED) formula in equation (5.1).
ED = \sqrt{(p-r)^2 + (q-s)^2}    (5.1)
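Equation (5.1) and the placement rules for the neck and shoulders can be illustrated with a short Python sketch. It assumes that points are (row, column) pixel coordinates, that the 15% offset is measured along the body height, and that the shoulders lie on the same row as the neck; none of these details is stated explicitly in the text, and the function names are hypothetical.

```python
import math

def euclidean_distance(p, q, r, s):
    """Equation (5.1): distance between points (p, q) and (r, s)."""
    return math.sqrt((p - r) ** 2 + (q - s) ** 2)

def neck_and_shoulders(top, bottom, left, right, centre_col):
    """Place neck and shoulder points from the body extent (pixel coordinates).

    Assumptions: points are (row, col); left/right refer to image columns;
    the 15% offset is taken along the body height; the shoulders sit on the
    neck row, 19.5% of the body width away from the neck.
    """
    height = bottom - top
    width = right - left
    neck = (top + 0.15 * height, centre_col)
    offset = 0.195 * width
    left_shoulder = (neck[0], centre_col - offset)
    right_shoulder = (neck[0], centre_col + offset)
    return neck, left_shoulder, right_shoulder

neck, l_sh, r_sh = neck_and_shoulders(top=0, bottom=200, left=0, right=100, centre_col=50)
neck_to_shoulder = euclidean_distance(neck[0], neck[1], r_sh[0], r_sh[1])
```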
5.3.3.2 Detection of abdomen of the body
The abdomen of the body is determined by drawing a bounding box around the body after the desired segmentation result is extracted. The centroid of the bounding box is taken as the abdomen of the human body. Figure 5.6 shows the original video frame and the human body with its bounding box.
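A minimal Python sketch of this step is given below, assuming the segmentation result is available as a boolean NumPy mask. The function name is hypothetical, and the centroid of the bounding box is taken to be its geometric centre.

```python
import numpy as np

def abdomen_from_mask(mask):
    """Return (row, col) of the bounding-box centre of a boolean body mask."""
    rows = np.flatnonzero(mask.any(axis=1))   # rows that contain body pixels
    cols = np.flatnonzero(mask.any(axis=0))   # columns that contain body pixels
    if rows.size == 0:
        raise ValueError("mask contains no foreground pixels")
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return (top + bottom) / 2.0, (left + right) / 2.0

mask = np.zeros((10, 10), dtype=bool)
mask[2:9, 3:6] = True          # body occupies rows 2..8 and columns 3..5
abdomen = abdomen_from_mask(mask)
```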
Figure 5.6 Detection of bounding box around human body: (a) Original video frame; (b) Human body with bounding box
5.3.3.3 Detection of knee points
The knee joints lie halfway between the abdomen point and the leg points, which are detected by skin colour segmentation. After all the points are determined, the thirteen points are plotted as shown in Figure 5.7 for the standing, left hand rise and crouching poses. The points are then joined by lines to form the stick figure model. Cylinders are formed on those points to obtain the 2D cylinder model, and finally the entire human body model is displayed.
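The knee rule reduces to a midpoint computation, sketched here in Python. The coordinates are made-up (row, column) values used only to show the arithmetic.

```python
def midpoint(a, b):
    """Point halfway between two (row, col) points."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)

abdomen = (100.0, 50.0)
left_leg = (200.0, 40.0)     # hypothetical leg end points from segmentation
right_leg = (200.0, 60.0)

left_knee = midpoint(abdomen, left_leg)
right_knee = midpoint(abdomen, right_leg)
```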
Figure 5.7 Plotting of thirteen points on human body: (a) Standing; (b) Left hand rise; (c) Crouching
5.3.4 Human Pose Modelling
Two-dimensional modelling refers to a skeletal structure made of line segments linked by joints; it conveys the actions and gestures of the original human in a two-dimensional view. The limitation of 2D modelling is its restricted view point. 3D modelling represents the body segments along three axes. In Figure 5.8, column I shows the input video images, and the outputs of stick figure modelling and cylinder modelling are shown in column II and column III respectively. The figure shows the models of twelve human poses: standing, right hand rise, left hand rise, both hands rise, right hand up, left hand up, both hands up, right leg rise, left leg rise, right salute, left salute and crouching.
A 3D human body model is a three-dimensional representation of the body segments. A 3D model is reconstructed from 2D images by computer vision techniques using volumetric intersection. Here, spheres are used to develop the 3D model. A sphere is formed from an ellipsoid by adjusting its dimensions along the three axes: a 2D ellipse is created first and converted into a 3D ellipsoid, which is then converted into a sphere. This is applied to the thirteen points of the human body to form the 3D model representations of the twelve poses shown in Figure 5.9.
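The relation between the ellipsoid and the sphere can be illustrated with a parametric surface in Python with NumPy. This is only a sketch of the geometry: the function names, the sampling grid and the example joint position are assumptions, and the chapter's own construction from a 2D ellipse is not reproduced here.

```python
import numpy as np

def ellipsoid_surface(centre, radii, n=20):
    """Sample an n x n grid of surface points of an axis-aligned ellipsoid."""
    theta = np.linspace(0.0, np.pi, n)        # polar angle
    phi = np.linspace(0.0, 2.0 * np.pi, n)    # azimuth
    theta, phi = np.meshgrid(theta, phi)
    cx, cy, cz = centre
    rx, ry, rz = radii
    x = cx + rx * np.sin(theta) * np.cos(phi)
    y = cy + ry * np.sin(theta) * np.sin(phi)
    z = cz + rz * np.cos(theta)
    return x, y, z

def sphere_surface(centre, radius, n=20):
    """A sphere is the special case with equal radii on all three axes."""
    return ellipsoid_surface(centre, (radius, radius, radius), n)

# Hypothetical joint at image position (120, 80), placed at depth z = 0.
x, y, z = sphere_surface((120.0, 80.0, 0.0), radius=5.0)
```

One such sphere per feature point would give a simple 3D rendering of the thirteen points.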
5.3.5 Results and Discussion
This section presents the experimental results of 2D and 3D human
pose modelling using SCM approach. Here, the twelve human poses are
considered for the analysis.
Figure 5.8 Experimental results of 2D Human pose modelling. (Column I) Original video frame; (Column II) Stick figure model and (Column III) Cylinder model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching
Figure 5.9 Experimental results of 3D Human pose modelling. (Column I) Original video frame; (Column II) 3D model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching
The efficiency is determined from the simulation results obtained by the proposed algorithm. It is calculated as the percentage of poses correctly detected out of the total number of poses in the video sequences, as shown in Table 5.5. In the 2D modelling, stick figure and cylinder models are constructed, and 10 videos are considered for the analysis. For these videos, the efficiencies of the 2D and 3D models are computed. The efficiency of the twelve poses in a single video (Video 1) is presented in Table 5.6.
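The efficiency measure can be written as a one-line computation. The Python sketch below applies it to the Video 1 counts reported in Table 5.5; the function name is hypothetical, and the rounding used in the tables is not specified in the text.

```python
def efficiency(correct, total):
    """Percentage of poses correctly detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total

# Video 1 under the SCM approach: 900 of 1006 poses correct in 2D, 890 in 3D.
eff_2d = efficiency(900, 1006)
eff_3d = efficiency(890, 1006)
```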
Table 5.5 Efficiency of pose models using SCM
                              2D Modelling (No. of poses correct)     3D Modelling
Videos         No. of frames  Stick Figure  Cylinder  Efficiency (%)  No. of poses correct  Efficiency (%)
Video 1        1006           900           900       89.5            890                   88.4
Video 2        1147           1052          1052      91.7            1012                  88.2
Video 3        1382           1280          1280      92.6            1207                  87.3
Video 4        864            805           805       93.2            700                   81.0
Video 5        1502           1342          1342      89.3            1315                  87.5
Video 6        1257           1140          1140      90.6            1108                  88.1
Video 7        1286           1200          1200      93.3            1190                  92.5
Video 8        671            630           630       93.9            620                   92.3
Video 9        1600           1510          1510      94.3            1360                  85.0
Video 10       1826           1750          1750      95.8            1700                  93.0
Average                                               92.4                                  88.3
Table 5.6 Efficiency of poses for a single video using SCM
                                  2D Modelling (No. of poses correct)     3D Modelling
Name of the poses  No. of poses   Stick Figure  Cylinder  Efficiency (%)  No. of poses correct  Efficiency (%)
A                  86             80            80        93.0            83                    96.5
B                  81             72            72        88.9            75                    92.6
C                  67             60            60        90.0            60                    90.0
D                  85             77            77        90.6            76                    89.4
E                  88             81            81        92.0            80                    90.9
F                  94             80            80        85.1            84                    89.4
G                  79             66            66        83.5            67                    84.8
H                  93             83            83        89.2            78                    83.8
I                  73             64            64        87.7            64                    87.7
J                  61             52            52        85.2            50                    82.0
K                  103            95            95        92.2            90                    87.4
L                  96             90            90        94.0            83                    86.5
Total/Average      1006           900           900       89.5            890                   88.4
5.4 WAVELET TRANSFORM MODELLING (WTM)
The goal of this section is to develop human pose models through the Discrete Wavelet Transform (DWT) for monocular videos. Here, the DWT is used to extract the human body from the video scene.
5.4.1 Overview of an Approach
The features extracted from the human body are useful for modelling persons under surveillance and are applied to recover human body poses. In the proposed work, shown in Figure 5.10, the video sequences are acquired using a camera and a pre-processing technique is applied to improve the quality of the frames. The human body is then segmented using the Discrete Wavelet Transform (DWT): the approximation information of the reference frame and that of the current frame are subtracted to obtain the human body. After the human body parts are detected, the models are developed in 2D and 3D views. From the feature points obtained by the pose models, activity analysis can be performed.
Figure 5.10 Block diagram of the proposed work using WTM approach
Video acquisition -> Pre-processing -> Human Body Segmentation using DWT -> Determination of body parts -> 2D Modelling / 3D Modelling -> Human Activity Analysis
5.4.2 Human Body Segmentation using DWT
In the proposed research work, the extraction of the human body plays an important role in the further analysis. It is implemented with the aid of the two-dimensional discrete wavelet transform. The main function of human body segmentation is to extract the features of the human, which can be merged to build the objects of interest on which analysis is performed. The methodology of human body segmentation is explained in Section 3.5. Figure 5.11 illustrates the human body segmentation procedure. Figure 5.11(e) gives the human body obtained by background subtraction, in which each of the four frequency components of Figure 5.11(d) is subtracted from the corresponding component of Figure 5.11(c).
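The idea can be sketched in Python with NumPy using a one-level Haar transform. The choice of the Haar wavelet, the use of the approximation band alone, the threshold and the function names are assumptions of this example; Section 3.5 remains the reference for the actual procedure.

```python
import numpy as np

def haar_dwt2(image):
    """One-level 2D Haar DWT of an image with even height and width.

    Returns the approximation band and three detail bands (orthonormal scaling).
    """
    a = image[0::2, 0::2].astype(float)   # top-left pixel of each 2x2 block
    b = image[0::2, 1::2].astype(float)   # top-right
    c = image[1::2, 0::2].astype(float)   # bottom-left
    d = image[1::2, 1::2].astype(float)   # bottom-right
    approx = (a + b + c + d) / 2.0
    detail_cols = (a - b + c - d) / 2.0   # difference across columns
    detail_rows = (a + b - c - d) / 2.0   # difference across rows
    detail_diag = (a - b - c + d) / 2.0   # diagonal difference
    return approx, (detail_cols, detail_rows, detail_diag)

def body_mask(reference, current, threshold=20.0):
    """Half-resolution foreground mask from the approximation-band difference.

    `threshold` is a hypothetical value, not taken from the text.
    """
    ref_approx, _ = haar_dwt2(reference)
    cur_approx, _ = haar_dwt2(current)
    return np.abs(cur_approx - ref_approx) > threshold

reference = np.full((8, 8), 10, dtype=np.uint8)
current = reference.copy()
current[2:6, 2:4] = 200        # a bright object enters the scene
mask = body_mask(reference, current)
```

The same subtraction could be applied to the three detail bands if all four frequency components are to be used.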
Figure 5.11 Human body segmentation using Discrete Wavelet Transform (DWT): (a) Reference frame; (b) Current frame; (c) DWT of (a); (d) DWT of (b); (e) Human body
5.4.3 Determination of Body Parts
Thirteen feature points are considered for human pose modelling in this research work: terminating points (5 Nos), intersecting points (2 Nos), shoulder points (2 Nos), elbow joints (2 Nos) and knee joints (2 Nos). The terminating points are the head, right hand, left hand, right leg and left leg. The intersecting points are the neck and abdomen. Initially, the terminating points are determined by traversing the pixels either row wise or column wise.
Figure 5.12 Graphical ideas to find human body segments: (a) Points determination on body model (finding of head, right and left hands, right and left legs; finding of 2 shoulders, 2 elbows, 2 knees; labels: Shoulder Point, Elbow Joint, Knee Joint); (b) Plotting of thirteen points on human body model
Figure 5.12 illustrates how the human body segments are found with the proposed algorithm. Figure 5.12(a) shows how the feature points are determined on the human body. Every frame is an array of rows and columns in which the pixel values are stored. The modelling works on the binary form of the original image, which is easy to traverse as ones and zeros. When the person enters the room, the top pixel is detected by traversing row wise from the top of the image. This point is the head of the body. The right hand and left hand are located by scanning the pixels from left to right and vice versa. Then, the legs are determined by scanning the pixels bottom-up while keeping the column constant.
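The scanning procedure can be sketched in Python with NumPy as follows. The text does not say which columns are held constant for the leg scan, so this example assumes that each leg is the lowest foreground pixel in the left or right half of the body; "left" and "right" refer to image columns, and the function name is hypothetical.

```python
import numpy as np

def terminal_points(mask):
    """Find head, hands and legs as (row, col) points in a boolean body mask.

    Assumptions: left/right refer to image columns, and each leg is the lowest
    foreground pixel in the left or right half of the body's column range.
    """
    rows, cols = np.nonzero(mask)                      # row-major order
    if rows.size == 0:
        raise ValueError("mask contains no foreground pixels")
    head = (rows.min(), cols[rows.argmin()])           # first hit scanning top-down
    left_hand = (rows[cols.argmin()], cols.min())      # first hit scanning left to right
    right_hand = (rows[cols.argmax()], cols.max())     # first hit scanning right to left
    mid = (cols.min() + cols.max()) / 2.0
    left_half, right_half = cols < mid, cols > mid
    left_leg = (rows[left_half].max(), cols[left_half][rows[left_half].argmax()])
    right_leg = (rows[right_half].max(), cols[right_half][rows[right_half].argmax()])
    return head, left_hand, right_hand, left_leg, right_leg

# Small synthetic figure: torso in column 3, arms on row 2, legs in columns 2 and 4.
mask = np.zeros((7, 7), dtype=bool)
mask[0:4, 3] = True
mask[2, 1:6] = True
mask[4:6, [2, 4]] = True
head, left_hand, right_hand, left_leg, right_leg = terminal_points(mask)
```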
After the head, right hand, left hand, right leg and left leg are detected, the next five points, namely the neck, right shoulder, left shoulder, right hand elbow and left hand elbow, are detected. The neck is placed 15% from the top pixel of the human body. The shoulder point is placed 19.5% from the top pixel of the human body. From the shoulder point, the elbow points are tracked. Each elbow point is approximately halfway between the shoulder and the terminating point of the corresponding hand.
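A short Python sketch of these rules is shown below. It assumes that both percentages are measured along the body height from the top pixel and that the shoulder columns are supplied by the caller, since the text does not say how they are found. All names and coordinates are hypothetical.

```python
def wtm_upper_points(top_row, bottom_row, centre_col, shoulder_cols, hands):
    """Neck, shoulders and elbows as (row, col) points.

    shoulder_cols and hands are (left, right) pairs supplied by the caller;
    how the shoulder columns are found is not specified in the text.
    """
    height = bottom_row - top_row
    neck = (top_row + 0.15 * height, centre_col)
    shoulder_row = top_row + 0.195 * height
    shoulders = [(shoulder_row, c) for c in shoulder_cols]
    # Each elbow is taken halfway between a shoulder and the matching hand.
    elbows = [((s[0] + h[0]) / 2.0, (s[1] + h[1]) / 2.0)
              for s, h in zip(shoulders, hands)]
    return neck, shoulders, elbows

neck, shoulders, elbows = wtm_upper_points(
    top_row=0, bottom_row=200, centre_col=50,
    shoulder_cols=(30, 70), hands=[(100, 10), (100, 90)])
```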
The abdomen is one of the important feature points in developing the human pose model. The procedure to find the abdomen is the same as in Section 5.3.3.2. The knee joints lie halfway between the abdomen point and the detected leg points. After all the points are determined using this approach, the thirteen points are plotted as shown in Figure 5.12(b).
5.4.4 Results and Discussion
The proposed algorithm is applied to indoor surveillance videos with straight poses. Twelve human poses from the monocular video sequences are used for 2D and 3D pose modelling through the Discrete Wavelet Transform (DWT). Figure 5.13 shows the experimental results of 2D modelling, in which the stick figure model and cylinder model are displayed. The simulation results of the 3D models are shown in Figure 5.14.
Figure 5.13 Experimental results of 2D human pose modelling using WTM approach. (Column I) Original video frame; (Column II) Stick Figure model, and (Column III) Cylinder model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching
Figure 5.14 Experimental results of 3D human pose modelling. (Column I) Original video frame; (Column II) 3D model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching
Here, twelve human poses are considered for making the models: standing, right hand rise, left hand rise, both hands rise, right hand up, left hand up, both hands up, right leg rise, left leg rise, right salute, left salute and crouching. The efficiency is calculated as the percentage of poses correctly detected out of the total number of poses in the video sequences, as shown in Table 5.7. The 2D models provide better results than the 3D models because the development of a 3D model is more complex. The stick figure model and cylinder model produce the same efficiency owing to their easy construction. The efficiency of the twelve poses in a single video (Video 1) is presented in Table 5.8.
Table 5.7 Efficiency of models for different videos using WTM
                              2D Modelling (No. of poses correct)     3D Modelling
Videos         No. of frames  Stick Fig.    Cylinder  Efficiency (%)  No. of poses correct  Efficiency (%)
Video 1        1006           950           950       94.4            901                   89.5
Video 2        1147           1099          1099      95.8            1045                  91.1
Video 3        1382           1296          1296      93.8            1274                  92.1
Video 4        864            790           790       91.4            785                   90.8
Video 5        1502           1370          1370      91.2            1351                  89.9
Video 6        1257           1149          1149      90.7            1141                  90.8
Video 7        1286           1220          1220      94.8            1205                  93.7
Video 8        671            650           650       96.8            630                   93.9
Video 9        1600           1531          1531      95.7            1505                  94.1
Video 10       1826           1789          1789      97.9            1751                  95.9
Average                                               94.3                                  92.2
Table 5.8 Efficiency of human poses for a single video using WTM
                                  2D Modelling (No. of poses correct)     3D Modelling
Name of the poses  No. of poses   Stick Fig.    Cylinder  Efficiency (%)  No. of poses correct  Efficiency (%)
A                  86             84            84        97.7            84                    97.7
B                  81             78            78        96.3            76                    93.8
C                  67             64            64        95.5            62                    92.5
D                  85             80            80        94.1            77                    90.6
E                  88             85            85        96.6            81                    92.0
F                  94             88            88        93.6            86                    91.5
G                  79             70            70        88.6            68                    86.1
H                  93             87            87        93.5            79                    84.9
I                  73             69            69        94.5            65                    89.0
J                  61             55            55        90.2            50                    82.0
K                  103            97            97        94.2            90                    87.4
L                  96             93            93        96.9            83                    86.5
Total/Average      1006           950           950       94.4            901                   89.5
5.5 WAVELET TRANSFORM BASED SKIN COLOUR SEGMENTATION MODELLING (WSCM)
5.5.1 Introduction
The aim of this section is to develop human pose models in two-dimensional and three-dimensional views through wavelet based skin colour segmentation for monocular videos. The approximation information of the Discrete Wavelet Transform (DWT) is used to segment the human body from the video frame and to find the skin features. The human pose model is built by detecting thirteen feature points in the upper body as well as the lower body. Twelve human poses are considered for the analysis.
5.5.2 Proposed Approach
This section presents wavelet based skin colour segmentation for developing the human pose models. Initially, the video sequences are acquired using the camera. Then, a pre-processing technique is applied to improve the quality of the frames. The two-dimensional wavelet transform is used to segment the human body from the video frame, as discussed in Section 3.5. After the human body is obtained, skin colour segmentation is applied to extract the skin features used to make the pose models. The identification of skin regions is discussed in Section 5.3.2. First, the terminating points, namely the head, right hand, left hand, right leg and left leg, are found. From these points, the remaining points are detected. In total, thirteen feature points are considered for making the human pose models in 2D and 3D views. The steps of the proposed approach are as follows.
Wavelet transform based Skin Colour segmentation Modelling (WSCM) Algorithm
Step 0: Acquire video sequences from the video camera.
Step 1: Perform pre-processing.
Step 2: Apply DWT to segment human body from the video frames.
Step 3: Apply skin colour segmentation to extract skin features.
Step 4: Identify the thirteen human body feature points.
Step 5: Develop the human pose models.
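Steps 2 and 3 of the list above can be combined as in the following Python sketch with NumPy. It assumes a one-level Haar approximation band computed on a grey image obtained as the channel mean, an illustrative difference threshold, and the RGB thresholds of Section 5.3.2 applied with a greater-than comparison. These choices and the function names are assumptions of the example rather than details given in the text.

```python
import numpy as np

def haar_approximation(gray):
    """Approximation band of a one-level 2D Haar DWT (even-sized input)."""
    g = gray.astype(float)
    return (g[0::2, 0::2] + g[0::2, 1::2] + g[1::2, 0::2] + g[1::2, 1::2]) / 2.0

def wscm_skin_mask(reference_rgb, current_rgb, diff_threshold=20.0):
    """Skin pixels that also lie inside the DWT-segmented body region.

    `diff_threshold` and the grey conversion by channel mean are assumptions.
    """
    ref_gray = reference_rgb.mean(axis=2)
    cur_gray = current_rgb.mean(axis=2)
    diff = np.abs(haar_approximation(cur_gray) - haar_approximation(ref_gray))
    body_half = diff > diff_threshold
    # Bring the half-resolution body mask back to frame size by pixel repetition.
    body = np.repeat(np.repeat(body_half, 2, axis=0), 2, axis=1)
    r, g, b = current_rgb[..., 0], current_rgb[..., 1], current_rgb[..., 2]
    skin = (r > 95) & (g > 40) & (b > 25)   # thresholds from Section 5.3.2
    return body & skin

reference = np.full((4, 4, 3), 10, dtype=np.uint8)
current = reference.copy()
current[0:2, 0:2] = (200, 120, 90)   # skin-coloured moving region
current[2:4, 2:4] = (20, 200, 20)    # moving region that is not skin-coloured
mask = wscm_skin_mask(reference, current)
```

Restricting the skin test to the segmented body is what keeps skin-coloured background objects out of the feature points in this sketch.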
5.5.3 Results and Discussion
The experimental results of the proposed models are shown in this section. The approach is applied to indoor surveillance videos with straight poses. Twelve human poses from the monocular video sequences are used for 2D and 3D pose modelling through wavelet based skin colour segmentation. Figure 5.15 shows the experimental results of 2D modelling, in which the stick figure and cylinder models are shown. Figure 5.16 displays the results of 3D modelling for different poses using the WSCM approach.
Figure 5.15 Experimental results of 2D human pose modelling using WSCM approach. (Column I) Original video frame; (Column II) Stick figure model, and (Column III) Cylinder model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching
Figure 5.16 Experimental results of 3D human pose modelling using WSCM approach. (Column I) Original video frame; (Column II) 3D model. Rows: A. Standing; B. Right hand rise; C. Left hand rise; D. Both hands rise; E. Right hand up; F. Left hand up; G. Both hands up; H. Right leg rise; I. Left leg rise; J. Right salute; K. Left salute; L. Crouching
The efficiency is calculated as the percentage of poses correctly detected out of the total number of poses in the video sequences, as shown in Table 5.9. The 2D models provide better results than the 3D models because the development of a 3D model is more complex. The effectiveness of the twelve poses in a single video (Video 1) is presented in Table 5.10.
Table 5.9 Efficiency of pose models using WSCM approach
                              2D Modelling (No. of poses correct)     3D Modelling
Videos         No. of frames  Stick Fig.    Cylinder  Efficiency (%)  No. of poses correct  Efficiency (%)
Video 1        1006           965           965       96.0            940                   93.4
Video 2        1147           1102          1102      96.1            1095                  95.5
Video 3        1382           1300          1300      94.0            1296                  93.8
Video 4        864            850           850       98.3            840                   97.2
Video 5        1502           1410          1410      93.8            1396                  92.9
Video 6        1257           1179          1179      93.8            1100                  87.5
Video 7        1286           1250          1250      97.2            1195                  92.9
Video 8        671            650           650       96.8            630                   93.9
Video 9        1600           1580          1580      98.7            1550                  96.9
Video 10       1826           1750          1750      95.8            1740                  95.3
Average                                               96.1                                  93.9
Table 5.10 Effectiveness of human poses for a single video
                                  2D Modelling (No. of poses correct)     3D Modelling
Name of the poses  No. of poses   Stick Figure  Cylinder  Efficiency (%)  No. of poses correct  Efficiency (%)
A                  86             84            84        97.6            80                    93.0
B                  81             78            78        96.2            77                    95.1
C                  67             65            65        97.0            60                    90.0
D                  85             85            85        100             80                    94.1
E                  88             85            85        96.5            83                    94.3
F                  94             90            90        96.0            88                    94.0
G                  79             75            75        95.0            73                    92.4
H                  93             87            87        93.5            86                    92.4
I                  73             70            70        96.0            68                    93.2
J                  61             55            55        90.2            54                    89.0
K                  103            98            98        95.1            98                    95.1
L                  96             93            93        97.0            93                    97.0
Total/Average      1006           965           965       96.0            940                   93.4
5.6 SUMMARY
In this chapter, approaches to marker-less 2D and 3D human pose modelling using skin colour segmentation and the discrete wavelet transform have been discussed. With the use of these models, twelve predefined human poses are detected and analysed. The efficiency for different videos and for the different poses in a single video is presented.