www.ietdl.org
Published in IET Biometrics. Received on 1st February 2013; revised on 16th May 2013; accepted on 20th June 2013. doi: 10.1049/iet-bmt.2013.0019
© The Institution of Engineering and Technology 2014
ISSN 2047-4938
Pose-invariant face recognition using curvelet neural network

Poonam Sharma1, Ram N. Yadav2, Karmveer V. Arya3
1Department of Computer Science and Engineering, Madhav Institute of Technology and Science, Gwalior, India
2Department of Electronics and Communication Engineering, Maulana Azad National Institute of Technology, Bhopal, India
3Department of Information and Communication Technology, ABV-Indian Institute of Information Technology and Management, Gwalior, India
E-mail: [email protected]
Abstract: A novel pose-invariant face recognition method is proposed by combining curvelet-invariant moments with a curvelet neural network. First, a special set of statistical coefficients using higher-order moments of the curvelet transform is extracted as the feature vector, and the invariant features are then fed into a curvelet neural network. Finally, supervised invariant face recognition is achieved by converging the neural network using the curvelet as the activation function of the hidden-layer neurons. The experimental results demonstrate that curvelet higher-order moments and curvelet neural networks achieve higher accuracy for face recognition across pose and converge more rapidly than standard back-propagation neural networks.
1 Introduction
Face recognition has been studied extensively by a number of researchers, as it is more natural and passive than other biometric techniques. Most of the existing face recognition methods work efficiently on frontal faces [1–4] and are able to achieve low error rates in well-controlled environments. Being non-intrusive, a face recognition system must be able to identify a face in an uncontrolled environment, unnoticed and without cooperation from the subject. However, a recent literature survey [5] shows that pose variation remains a prominent unsolved problem in the development of real-time automatic face recognition systems. The performance of face recognition is seriously affected by variation of the test image from the frontal image. The difficulty in face recognition across different poses stems from the fact that within-class variance dominates between-class scatter of the data, resulting in loss of discriminative feature data. As pose variation is linear in three dimensions (3D) but non-linear in two dimensions (2D), good results can be obtained by developing a 3D model of the subject, but this has the drawbacks of requiring precise registration and exhaustive computation. Accordingly, pose-invariant face recognition methods are divided into two groups, namely 2D-based methods and 3D-based methods. 2D methods are more attractive, as the addition of a new subject to the database is easy, computation is simpler and the memory requirement is lower. Most of the recently developed 2D methods are again divided into two groups. The first set of methods is based on the synthesis of different poses from the frontal face, where the synthesised faces act as the gallery images, or a 3D face can be generated. Different synthetic poses are generated by finding a mapping function between the gallery frontal pose and the gallery non-frontal poses. Another set of 2D methods directly models the discriminative features in the face arising from pose, across the same subject and among different subjects. A number of methods have been proposed to model the local appearance change, based on the assumption that the appearance variation among different subjects with the same pose is always larger than the appearance variation among different poses of the same subject. However, this is not applicable to large pose variation, where the appearance variation among different poses of the same subject is significant compared with the appearance variation among the same pose of different subjects. These problems can be overcome by combining the global features and the local features of the face image. In the proposed method, the global features are obtained by generating multiresolution curvelets. The motivation behind using curvelets is their similarity to the human visual system (HVS). The HVS has spatial-frequency subband decomposition as a key feature and recognises images by scale, position and orientation variation. The curvelet is a multiresolution tool with variation in scale, position and orientation, and the adaptive nature of the curvelet transform makes it even more similar to the HVS. Each curvelet subband is then divided, on a grid in polar coordinates, into a central region and eight radial sectors. The local features are then considered by generating invariant higher-order moments of the different regions. Higher-order moments are used to counteract the drawback of the curvelet being global, and also to reduce the dimensionality of the feature vector without reducing its discriminatory power. The results are applied to a curvelet neural network for classification that uses the curvelet as the activation function. The curvelet neural network also closely
IET Biom., 2014, Vol. 3, Iss. 3, pp. 128–138doi: 10.1049/iet-bmt.2013.0019
resembles the multiresolution properties of visual neurons. A neural network with an activation function closer to the HVS provides higher efficiency and quicker convergence compared with basic neural networks [6–8].
The rest of this paper is organised as follows. In Section 2, we describe the related work in the field of face recognition across pose. Section 3 describes the curvelet transform, curvelet-invariant moments and curvelet neural networks. A detailed comparative analysis of experimental results is presented in Section 4. Section 5 concludes the paper.
2 Related work
The state-of-the-art methods for face recognition under varying poses may be divided into three categories. The first category is based on transforming an input image to the same pose as stored in the database. The second category of methods is based on deforming a generic face model to fit the input image. The third category is based on generating a feature vector from all views and then generating a common feature vector that is a combination of the feature vectors of all views. Pentland et al. [9] proposed a view-based method for recognition under variable pose. In this method, recognition is done on the basis of the eigenvectors of each view space and the distance from face space. This method has the drawback that it requires a large number of gallery images in different poses. Cootes et al. [10] proposed a view-based active appearance model (AAM), assuming that 2D statistical models can capture the facial features from any viewpoint; however, the experimentation was done on face tracking only. Gross et al. [11] proposed an eigen light fields (ELF) method to tackle the pose problem. In this method, the ELF of the subject's head is calculated from the gallery image, and gallery and probe images are then compared on the basis of the ELF. This method requires a large number of gallery images, the calculation of the ELF is difficult, and the feature vector is calculated by concatenating the normalised image vectors of different gallery images of the same subject. Chai et al. [12] presented an affine transformation based on statistical analysis. The face is divided into three rectangular regions, and the affine transformation of the rectangular regions under different poses is calculated and used for recognition of the face across pose. Although the results improved compared with other existing methods, the recognition rate is still low, and there is no method for automatic marking of face landmarks. Prince et al. [13] proposed a generative model that generates different poses from an identity space. An expectation-maximisation algorithm is used for estimating the linear transformation and noise data from training data. This method is probabilistic and provides a posterior probability for matching to a gallery, and it describes how underlying pose-invariant data generates pose-varying data. It gives good results for face recognition in constrained environments, but the results degrade in unconstrained or real-time environments. Shan et al. [14] presented an extension of the adaptive principal component analysis (APCA) method. They developed a face model and a rotation model that are used to generate the feature vector and synthesise the frontal face from a non-frontal face. An AdaBoost cascade face detector is used to reduce the problem of initialisation of the AAM. The AAM is used for developing the rotation model and pose estimation. APCA is used for dimensionality reduction and is insensitive to illumination
and expression change for face recognition. Sarfraz and Hellwich [15] developed a pose-invariant method that does not require perfect alignment between the gallery and the probe image. It models the approximated joint probability distribution of the gallery and the probe images at different poses. Features are extracted using gradient location-orientation histogram signatures, and features are synthesised from non-frontal to frontal views. Chai et al. [16] developed a local linear regression (LLR)
method for generating a virtual frontal face from a non-frontal face image. They used the fact that local facial regions better correspond to the local regions of the non-frontal face. Thus, the face is divided into multiple local patches, and linear regression is applied to each patch to obtain the corresponding patch of the frontal face. The recognition rate is better compared with global linear regression and the ELF method, but the image is treated as piecewise linear. Choi et al. [17] proposed pose- and illumination-invariant face recognition, where pose is estimated based on the 2D image and a classification rule is used to classify the pose of a face image. Shadow compensation is obtained after determining the light direction, and the feature is extracted by applying null-space linear discriminant analysis. Classification is then done using the nearest-neighbour rule. However, the dimensionality reduction technique reduces the efficiency. Wang et al. [18] presented face recognition across pose by considering a probe image with a different pose from the gallery images, represented by a linear combination of the gallery images. They proved that the orthogonal discriminant vector (ODV) is a pose-invariant feature. The ODVs of each subject are generated such that a subject possesses zero projection with reference to its own images and maximum projection with reference to other subjects. Classification is then done using a distance metric. However, the results degrade with large pose variations. Moallem et al. [19] proposed a fuzzy rule-based system for pose-independent face detection. Skin colour, lip position, face shape information and ear texture are fed to a fuzzy rule-based classifier to extract face candidates. The threshold on face candidates is optimised by using a genetic algorithm. It works well for slight variations in pose. They used geometric-moment-based ear texture classification to verify its outcomes. Mohammad et al. [7] developed face recognition based on multidimensional principal component analysis (PCA).
An extreme learning (EL) machine is used to classify the subjects. The face image is decomposed using the curvelet transform, and the subband that exhibits the maximum standard deviation is selected. Dimensionality reduction is then done using PCA, and the result is applied to the EL machine for classification. This method gives faster and more efficient recognition compared with other existing methods, but is dependent on the variation in gallery images, the scales of the curvelets and the number of hidden neurons. Huang et al. [20] proposed a pose-invariant face
recognition approach by generating view-specific eigenfaces for each view. The feature coefficients extracted for the eigenfaces of each view are used to train view-specific neural networks. A second layer of neural networks is then used to combine the decisions obtained from the first layer. The efficiency tends to reduce as the number of eigenfaces is reduced. Singh et al. [21] proposed a hierarchical registration method using an affine transformation and a mutual-information-based registration method. The mosaicing is done by mask generation, stitching and blending using Laplacian and Gaussian pyramids, and classification is done using a support vector machine (SVM)
classifier. Arashloo et al. [22] proposed recognition of faces in arbitrary pose by using the energy of the established match between a pair of images as the matching criterion. It uses a multiresolution approach based on the super-coupling transform to establish pixel-wise correspondences. The feature vector is obtained by applying PCA, and a nearest-neighbour classifier is used for classification. The efficiency is low when eye coordinates are used for texture comparison, as a single gallery image is used for training. Zhang et al. [23] proposed a pose-robust face recognition method using the sparse property of the representation coefficients of a face image over its corresponding view, which is invariant to pose. This method models the latent identity space via sparse coefficients and formulates the mapping from non-frontal to frontal faces as a non-linear mapping by solving a sparsity-regularised optimisation problem. The drawback of the method is that the pose of the probe image must be known as prior knowledge. Lu et al. [24] proposed face recognition via weighted sparse representation and integrated data linearity and locality, with better results in a lower-dimensional subspace. The efficiency is very low for lower dimensions. Ma et al. [25] proposed head pose estimation by combining biologically inspired features with local binary patterns (LBP) after applying a Gabor filter on the training image. PCA is used to reduce the dimensionality of the local biologically inspired features. Nearest-neighbour and SVM classifiers are used for classification. However, the dimensionality reduction reduces the efficiency. Meshgini et al. [26] proposed face recognition using a combination of Gabor wavelets, direct linear discriminant analysis (DLDA) and SVM. The feature vector is extracted using DLDA, and SVM is used for classification with a hyperhemispherically normalised polynomial as the kernel function. The combination of DLDA and Gabor filters generates pose-invariant features, but the dimensionality reduction reduces the efficiency.
Singh and Sahan [27] proposed face recognition using a combination of global and local features. Wavelet moments were combined with wavelet invariants to enhance the performance. Wavelet moments are image descriptors that include both global and local characteristics of the image and whose magnitude is invariant to image rotation. Sharma et al. [6] proposed a face recognition method using a combination of Gabor wavelets and LBP for feature extraction. The feature vector generated is classified using a generalised mean neural network, which proved to be a better classification method compared with SVM. The efficiency of that method is very high for slight variations in pose and illumination, but reduces with large pose variation.

3 Curvelet transform
3.1 Basic fast discrete curvelet transform
The curvelet transform [28–30], developed by Candes and Donoho, is a multiresolution tool with the capability of sparser representation of images compared with the wavelet transform. For facial images, the lines and curves are the main features. Face recognition using the curvelet transform [28] was first proposed to highlight the curved singularities in images and proved better than the wavelet transform in terms of efficiency. The curvelet better characterises facial images, as it can capture intrinsic geometrical structures such as edges in the face image. Wavelets can only capture point
discontinuities, whereas curvelets can capture linear segments of contours. After applying the curvelet transform, the image is converted to coefficients of low frequency and high frequency in matrix form. The low-frequency component contains the approximation of the face image, while the high-frequency component contains the detailed information of the curves. The low-frequency coefficients, also known as curvefaces, contain the most significant information of faces and are crucial for face recognition. The curvelet transform is a combination of subband
decomposition, smooth partitioning, renormalisation and ridgelet analysis [29]. The curvelet transform directly takes edges as the basic representation elements and is anisotropic with strong directionality, so it is useful for representing the edges of images efficiently. In a 2D image with a number of edges, the curvelet transform is used to capture the edge information. To form an efficient feature set, it is crucial to collect this edge information, which in turn increases the discriminatory power of a recognition system. The discrete curvelet coefficients defined by Candes and
Donoho [28] are given as:
C^D(j, l, k_1, k_2) = \sum_{0 \le m \le M} \sum_{0 \le n \le N} f[m, n]\, \phi^D_{j,l,k_1,k_2}[m, n]   (1)
Here, each \phi^D_{j,l,k_1,k_2}[m, n] is a digital curvelet waveform. The 2D moment of order (p + q) of the digital image f(x, y) of size M × N is defined as
m_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} x^p y^q f(x, y)   (2)
The magnitude of m_{pq} remains invariant even if the image rotates. We consider the curvelet family function in polar form as follows

C_{ab}^{pq}(r) = \frac{1}{\sqrt{a}}\, C_{pq}\!\left(\frac{r - b}{a}\right)   (3)

where a is a dilation parameter and b is a shifting parameter. The mother curvelet C_{pq} is a higher-order moment of the curvelet, and C_{ab}^{pq}(r) gives the invariant moments of the face image. The approximate band and detailed bands obtained after applying the curvelet transform to a sample image are shown in Fig. 1.
3.2 Curvelet-invariant higher-order moments
It is observed from the literature review that face recognition requires both global and local features to effectively represent the entire face information, and that most methods suffer from high computational complexity. Wavelet moments were used by Singh and Sahan [27], since they combine the characteristics of multiresolution and invariance and also reduce the computational complexity. Wavelet moments have been used in image-processing applications and proved better than radial moment invariants. However, they suffer from the drawback of ignoring higher-order moments, which also include important facial information. The curvelet is a multiresolution tool with better directionality, optimal approximation rate, easy implementation and non-redundancy compared with the wavelet transform. Moment-based LBP [31] has also been used for invariant pattern recognition. Thus, curvelet-based
Fig. 1 Approximate and Detailed subband for an example image
moments have been used in the proposed method to overcome the drawbacks of wavelet moments. The magnitudes of moments and of curvelet moments for sample images with varying pose are shown in Figs. 2a and b. It is clear from the figure that the curvelet moments are invariant to pose variation. Statistical moments that are invariant to pose and illumination can be used for face recognition. The 2D moment of order (p + q) of the digital image f(x, y) of size M × N is defined as [32]
m_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} x^p y^q f(x, y)   (4)
The corresponding central moment of order (p + q) is defined as

\mu_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} (x - \bar{x})^p (y - \bar{y})^q f(x, y)   (5)

where

\bar{x} = \frac{m_{10}}{m_{00}} \quad \text{and} \quad \bar{y} = \frac{m_{01}}{m_{00}}

Fig. 2 For different poses of a sample image from the CMU-PIE database: a Magnitude of moments; b Magnitude of curvelet moments
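The raw moments in (4) and the central moments in (5) are straightforward to compute directly. The following is a minimal NumPy sketch (the function names are ours, not from the paper), treating the row index as x and the column index as y:

```python
import numpy as np

def raw_moment(img, p, q):
    """Raw 2D moment m_pq of eq. (4): sum of x^p * y^q * f(x, y)."""
    M, N = img.shape
    x = np.arange(M).reshape(-1, 1)  # row index treated as x
    y = np.arange(N).reshape(1, -1)  # column index treated as y
    return float(np.sum((x ** p) * (y ** q) * img))

def central_moment(img, p, q):
    """Central moment mu_pq of eq. (5), about the centroid (x_bar, y_bar)."""
    m00 = raw_moment(img, 0, 0)
    x_bar = raw_moment(img, 1, 0) / m00
    y_bar = raw_moment(img, 0, 1) / m00
    M, N = img.shape
    x = np.arange(M).reshape(-1, 1)
    y = np.arange(N).reshape(1, -1)
    return float(np.sum(((x - x_bar) ** p) * ((y - y_bar) ** q) * img))

img = np.ones((4, 4))             # uniform 4x4 test image
print(central_moment(img, 1, 0))  # first central moments vanish by construction
```

For any image, \mu_{10} and \mu_{01} are zero by construction, which gives a quick sanity check on an implementation.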
The six statistical moments used in the proposed method include the mean, standard deviation, a measure of smoothness, skewness, a measure of uniformity and randomness. The mean intensity, which estimates the value around which intensities cluster for one dimension (1D), is
Fig. 3 Curvelet neural network
given by

m = \sum_{i=0}^{L-1} z_i\, p(z_i)   (6)
where z_i is a random variable indicating intensity, p(z_i) is the histogram of the intensity levels in the region and L is the number of possible intensity levels. The standard deviation is a measure of the average contrast for 1D and is given as

\sigma = \sqrt{\mu_2(z)}   (7)
A measure of the relative smoothness of the intensity in the region for 1D is given by

R = 1 - \frac{1}{1 + \sigma^2}   (8)
The third moment, which measures the skewness of a histogram for 1D, is given by

\mu_3 = \sum_{i=0}^{L-1} (z_i - m)^3\, p(z_i)   (9)
Fig. 4 Block diagram of the proposed method
Uniformity measures the uniformity of the intensity in the histogram for 1D and is given by

U = \sum_{i=0}^{L-1} p^2(z_i)   (10)
A measure of randomness for 1D is given by

e = -\sum_{i=0}^{L-1} p(z_i) \log_2 p(z_i)   (11)
The statistical features for 2D can be calculated for eachcombination of scale and orientation using (4) and (5).
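The six statistics of (6)–(11) can all be read off the normalised intensity histogram of a region. The sketch below is a minimal illustration, not the authors' code; the function name and the 256-level quantisation are our assumptions:

```python
import numpy as np

def histogram_stats(region, levels=256):
    """Six statistical moments of eqs. (6)-(11) for one region/subband.

    Returns mean, standard deviation, smoothness R, third moment
    (skewness), uniformity U and entropy e, all computed from the
    normalised intensity histogram p(z_i)."""
    hist, _ = np.histogram(region, bins=levels, range=(0, levels))
    p = hist / hist.sum()                 # p(z_i): normalised histogram
    z = np.arange(levels, dtype=float)    # intensity levels z_i
    m = np.sum(z * p)                     # eq. (6): mean
    mu2 = np.sum((z - m) ** 2 * p)        # second central moment
    sigma = np.sqrt(mu2)                  # eq. (7): standard deviation
    R = 1.0 - 1.0 / (1.0 + sigma ** 2)    # eq. (8): smoothness
    mu3 = np.sum((z - m) ** 3 * p)        # eq. (9): skewness (third moment)
    U = np.sum(p ** 2)                    # eq. (10): uniformity
    nz = p[p > 0]
    e = -np.sum(nz * np.log2(nz))         # eq. (11): entropy (randomness)
    return m, sigma, R, mu3, U, e

region = np.full((8, 8), 100)  # constant region: sigma = 0, U = 1, e = 0
print(histogram_stats(region))
```

A constant region is a useful sanity check: its standard deviation, smoothness, skewness and entropy are all zero, while its uniformity is one.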
3.3 Curvelet neural network
The neural network is a long-established technique, but the generalised mean neural network is a recent advancement that has shown good results for different applications of signal processing [33] and image processing [6]. It has been observed that, having a non-linear aggregation function, the generalised mean neuron enhances the classification capability and reduces the convergence time of the neural network when used for image-processing applications. The curvelet neural network, applied here for classification, is a kind of multilayer feedforward network based on curvelet analysis. The curvelet neural network has a higher function-learning capability and better directionality and anisotropy characteristics than the Gabor wavelet neural network [6–8]. The curvelet neural network has a linear transform function as the input-layer excitation function, the curvelet function as the hidden-layer excitation function and a sigmoid function as the output-layer excitation function.
Suppose the nth sample input is
X^n = \{x_i^n\}, \quad i = 0, 1, 2, \ldots, L

Fig. 5 Feature extraction using higher-order moments of the curvelet transform

The network output is

Y^n = \{y_k^n\}, \quad k = 0, 1, 2, \ldots, S

and the expected output is

D^n = \{d_k^n\}, \quad k = 0, 1, 2, \ldots, S, \quad n = 0, 1, 2, \ldots, N
Let N be the number of samples, L the number of input-layer units and S the number of output-layer units. The curvelet neural network has the connections shown in Fig. 3. The weight coefficient between the jth hidden neuron and the ith input unit is W_{ij}, and the weight coefficient between the kth output neuron and the jth hidden neuron is V_{jk}. The input to the jth hidden neuron is given by
net_j^n = \left[\sum_{i=1}^{L} W_{ij} x_i^n + W_{0j}\right]^{1/n}   (12)
where n (n \in \mathbb{R}^+) is the generalisation parameter, which gives the various means (arithmetic mean, geometric mean and harmonic mean) depending on its value. The output of the hidden-layer units is given by

C_{ab}^{pq}(net_j^n) = \frac{1}{\sqrt{a}}\, C_{pq}\!\left(\frac{net_j^n - b}{a}\right)   (13)
Fig. 6 Histograms: a Approximate band; b Detailed subband
The output of the kth unit in the output layer is given by

Y_k^n = \sigma\!\left(\left[\sum_{j} V_{jk}\, C_{ab}^{pq}(net_j^n) + V_{0k}\right]^{1/n}\right)   (14)
where \sigma(x) = 1/(1 + e^{-x}) is the sigmoid function. The mean-square error (MSE) can be calculated as follows

E = \frac{1}{2} \sum_{n=1}^{N} \sum_{k=1}^{S} \left(Y_k^n - D_k^n\right)^2   (15)
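A minimal forward pass through the network of (12)–(15) can be sketched as follows. Since the paper does not fix the mother curvelet C_pq or the weights numerically, a Mexican-hat-style stand-in and small random weights are assumed here, and an abs() guards the fractional power of (14) in this toy version:

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: L inputs, H hidden neurons, S outputs (sizes assumed for illustration).
L, H, S = 6, 4, 3
W = rng.uniform(0.0, 1.0, (L, H))   # input-to-hidden weights W_ij
W0 = rng.uniform(0.0, 1.0, H)       # hidden biases W_0j
V = rng.uniform(0.0, 1.0, (H, S))   # hidden-to-output weights V_jk
V0 = rng.uniform(0.0, 1.0, S)       # output biases V_0k
a, b = 1.0, 0.0                     # dilation and shift of eq. (13)
r = 2.0                             # generalisation parameter of eq. (12)

def mother(t):
    # Stand-in mother function (Mexican-hat shape); the paper's C_pq is not
    # specified numerically, so this is purely an assumption for the sketch.
    return (1 - t ** 2) * np.exp(-t ** 2 / 2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    net = (x ** r @ W + W0) ** (1.0 / r)      # eq. (12): generalised-mean input
    h = mother((net - b) / a) / np.sqrt(a)    # eq. (13): curvelet activation
    # eq. (14): sigmoid output; abs() keeps the fractional power real here
    return sigmoid(np.abs(h @ V + V0) ** (1.0 / r))

x = rng.uniform(0.0, 1.0, L)
y = forward(x)
d = np.zeros(S); d[0] = 1.0         # one-hot expected output
mse = 0.5 * np.sum((y - d) ** 2)    # eq. (15) for a single sample
print(y.shape, mse >= 0.0)
```

The generalised-mean aggregation here interprets the exponent as acting on the inputs, one common reading of (12); the paper overloads n as both sample index and generalisation parameter, so this choice is an assumption.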
The network can be trained by the gradient descent rule as follows

\frac{\partial E}{\partial W_{ij}} = \frac{\partial E}{\partial y}\frac{\partial y}{\partial W_{ij}} = -\sum_{k=1}^{S} \left(Y_k - D_k\right) C_{ab}^{pq}(net_j^n)   (16)

\frac{\partial E}{\partial V_{jk}} = \frac{\partial E}{\partial y}\frac{\partial y}{\partial V_{jk}} = -\sum_{k=1}^{S} \left(Y_k - D_k\right) x_i^n   (17)

\frac{\partial E}{\partial b_{ij}} = \frac{\partial E}{\partial y}\frac{\partial y}{\partial C_{ab}^{pq}(net_j^n)}\frac{\partial C_{ab}^{pq}(net_j^n)}{\partial b_{ij}} = -\sum_{k=1}^{S} \left(Y_k - D_k\right) Y_k^n \left(1 - Y_k^n\right) V_{jk} Q_{ij}\left(1 - \sum_{i=1}^{n} Q_{ij}^2\right)   (18)
Table 1 Comparison of different state-of-the-art methods in training time and classification rate for the LFW, FERET and CMU-PIE databases (each cell lists the values for the 35 / 73 / 57 sub-columns of the original table)

S.No. | Method | LFW training time, s | LFW recognition rate, % | FERET training time, s | FERET recognition rate, % | CMU-PIE training time, s | CMU-PIE recognition rate, %
1 | BP neural network | 10.6 / 15.1 / 41.2 | 56.2 / 61.3 / 63.9 | 7.8 / 13.1 / 32.2 | 76.2 / 81.3 / 89.9 | 7.67 / 14.16 / 35.2 | 75.2 / 83.3 / 90.23
2 | wavelet neural network | 10.2 / 12.6 / 26.5 | 74.6 / 76.8 / 79.3 | 7.71 / 12.3 / 24.3 | 79.6 / 84.8 / 91.3 | 7.21 / 13.30 / 32.6 | 78.3 / 82.3 / 90.6
3 | ridgelet neural network | 9.8 / 11.8 / 24.3 | 77.9 / 79.8 / 82.3 | 7.56 / 12.1 / 22.7 | 85.9 / 89.8 / 92.3 | 7.16 / 12.67 / 31.7 | 83.7 / 88.4 / 91.56
4 | ridge + Gabor [37] | 10.1 / 11.9 / 23.9 | 78.7 / 79.9 / 83.4 | 7.81 / 11.9 / 21.0 | 86.3 / 89.9 / 92.7 | 7.18 / 12.55 / 28.9 | 85.4 / 91.7 / 92.3
5 | robust regression [38] | 11.1 / 12.5 / 25.4 | 79.3 / 80.6 / 83.7 | 8.31 / 12.7 / 23.8 | 87.5 / 91.6 / 93.1 | 8.12 / 13.8 / 34.5 | 87.2 / 92.1 / 93.33
6 | PLS + Gabor [37] | 10.5 / 11.8 / 24.1 | 81.2 / 83.7 / 86.4 | 7.64 / 11.82 / 20.9 | 89.2 / 93.8 / 95.2 | 7.16 / 12.40 / 28.2 | 88.5 / 92.4 / 95.67
7 | proposed method | 9.6 / 11.6 / 23.7 | 82.3 / 84.7 / 87.4 | 7.32 / 11.8 / 18.7 | 92.3 / 94.7 / 98.4 | 7.02 / 12.40 / 26.3 | 92.6 / 95.3 / 98.77
\frac{\partial E}{\partial a_j} = \frac{\partial E}{\partial y}\frac{\partial y}{\partial C_{ab}^{pq}(net_j^n)}\frac{\partial C_{ab}^{pq}(net_j^n)}{\partial a_j} = -\sum_{k=1}^{S} \left(Y_k - D_k\right) Y_k^n \left(1 - Y_k^n\right) V_{jk} Q_{ij}^3 \left(1 - \sum_{i=1}^{n} Q_{ij}^2\right)   (19)
Thus, the parameters are updated using the following equations

W_{ij}(t + 1) = W_{ij}(t) - \eta \frac{\partial E}{\partial W_{ij}} + \alpha \Delta W_{ij}(t)   (20)

V_{jk}(t + 1) = V_{jk}(t) - \eta \frac{\partial E}{\partial V_{jk}} + \alpha \Delta V_{jk}(t)   (21)

b_{ij}(t + 1) = b_{ij}(t) - \eta \frac{\partial E}{\partial b_{ij}} + \alpha \Delta b_{ij}(t)   (22)

a_j(t + 1) = a_j(t) - \eta \frac{\partial E}{\partial a_j} + \alpha \Delta a_j(t)   (23)
where η and α are the learning rate and momentum, respectively. The learning rate and momentum have a significant effect on the learning speed and stability of the neural network. A large value of η results in faster convergence but may lead to instability and divergence, whereas a small value of η is stable but leads to slower convergence. Thus, an adaptive learning rate has been used in the proposed algorithm: if the error reduces, the learning rate is increased, and vice versa. The momentum coefficient α reduces high-frequency weight changes and provides stability and fast learning. Momentum values are kept high to improve the training characteristics.
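The update rule of (20), combined with the adaptive learning-rate policy just described, can be sketched on a toy quadratic error surface. The adaptation factors and clamping bounds below are assumed values, not taken from the paper:

```python
def train_step(w, grad, velocity, eta, alpha):
    """One application of eq. (20): w(t+1) = w(t) - eta*dE/dw + alpha*dw(t)."""
    delta = -eta * grad + alpha * velocity
    return w + delta, delta

def adapt_eta(eta, err, prev_err, up=1.05, down=0.7, lo=0.01, hi=1.0):
    # Adaptive rule from the text: increase eta while the error falls, cut it
    # when the error grows. The factors and clamping bounds are assumed values.
    eta = eta * up if err < prev_err else eta * down
    return min(max(eta, lo), hi)

# Toy quadratic error surface E(w) = 0.5*(w - 3)^2, so dE/dw = w - 3.
w, v, eta, alpha = 0.0, 0.0, 0.1, 0.5
prev_err = float("inf")
for _ in range(200):
    grad = w - 3.0
    w, v = train_step(w, grad, v, eta, alpha)
    err = 0.5 * (w - 3.0) ** 2
    eta = adapt_eta(eta, err, prev_err)
    prev_err = err
print(round(w, 2))  # w approaches the minimiser 3
```

The momentum term converts the update into a damped second-order system, which is why the error can rise on individual steps even while the overall trajectory converges; the adaptive rule then briefly cuts η.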
4 Proposed methodology
The block diagram of the proposed method is given in Fig. 4. The proposed method for pose-invariant face recognition using a curvelet neural network is given below.
Algorithm 1
Step 1: Extract the face and normalise it to 112 × 92 pixels.
Step 2: Apply the curvelet transform on the normalised face image. Select the detailed subbands, as they are invariant to changes in illumination and pose.
Step 3: Divide the subbands into radial sections with a centre region and neighbouring regions (Fig. 5). Obtain the curvelet-invariant moments of the subbands using (4)–(11) to form the feature vector for classification (Fig. 1).
Step 4: Feed the feature vectors from the curvelet as input to the curvelet neural network. The number of input nodes is equal to the size of the feature vector; the number of output neurons is equal to the number of subjects to be recognised; the number of hidden neurons is equal to the number of clusters.
Step 5: Fix the initial values of the translation and dilation parameters based on the centres and widths of the clusters.
Step 6: Initialise all weights W_{ij} and V_{jk} to random values between 0 and 1.
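Step 3's division of a subband into a central region and eight radial sectors can be sketched with boolean masks in polar coordinates. This is an illustrative sketch only; the central-region radius fraction is an assumed value, as the paper does not specify it:

```python
import numpy as np

def radial_sectors(subband, n_sectors=8, core_frac=0.25):
    """Split a subband into a central region and n_sectors radial sectors
    (Step 3), using polar coordinates about the subband centre.

    Returns a list of boolean masks: masks[0] is the central region and
    masks[1..n_sectors] are the radial sectors. core_frac is an assumed
    radius fraction for the central region."""
    h, w = subband.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - cy, xx - cx)                       # radius of each pixel
    theta = np.mod(np.arctan2(yy - cy, xx - cx), 2 * np.pi)  # angle in [0, 2*pi)
    core = r <= core_frac * min(h, w) / 2.0
    masks = [core]
    for s in range(n_sectors):
        lo = s * 2 * np.pi / n_sectors
        hi = (s + 1) * 2 * np.pi / n_sectors
        masks.append(~core & (theta >= lo) & (theta < hi))
    return masks

masks = radial_sectors(np.zeros((64, 64)))
print(len(masks))  # 9 regions: one central region plus eight sectors
```

Each pixel falls in exactly one region, so the per-region statistics of (6)–(11) partition the subband without overlap.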
Table 2 Recognition accuracy (in %) of different methods with different views for the CMU-PIE database

Views | MSM [39] | Graph [39] | SRC [39] | JSRC [39] | JDSRC [39] | Proposed method
1 | 36.5 | 44.5 | 45.0 | 45.0 | 45.0 | 44.7
3 | 48.9 | 63.4 | 59.5 | 53.6 | 72.0 | 74.3
5 | 52.5 | 72.0 | 62.2 | 55.0 | 82.3 | 83.6
7 | 55.9 | 76.5 | 63.3 | 51.4 | 84.5 | 88.3
Table 3 Recognition accuracy (in %) of different methods with different dimensionality for the CMU-PIE database

Dimensionality | MSM [39] | Graph [39] | SRC [39] | JSRC [39] | JDSRC [39] | Proposed method
32 | 0 | 62.6 | 47.7 | 34.6 | 68.8 | 69.1
64 | 52.0 | 73.9 | 63.2 | 54.7 | 82.1 | 84.3
128 | 68.8 | 75.4 | 68.5 | 67.9 | 83.4 | 84.1
256 | 74.3 | 77.7 | 69.3 | 74.1 | 85.9 | 86.7
Table 4 Recognition accuracy (in %) of different state-of-the-art methods for the FERET, CMU-PIE and LFW databases

S.no. | Method | CMU-PIE (average of all 13 images) | FERET (average of all sets) | LFW
1 | eigenface [37] | 16.6 | — | —
2 | ridge + intensity [37] | 88.24 | 81.6 | —
3 | local linear regression [37] | 94.6 | 68.1 | 72.6
4 | ridge + Gabor [37] | 90.9 | 92.6 | —
5 | PLS + Gabor [37] | 89.5 | — | —
6 | local ridge regression [40] | 90.8 | — | 70.2
7 | ortho discriminant vector [18] | 85.5 | 92 | —
8 | probabilistic learning [15] | 80.7 | — | —
9 | latent SDA [40] | 90.08 | — | 72.4
10 | eigenspace + neural network [20] | — | — | 59.0
11 | mosaicing [21] | 94.76a | — | —
12 | multiresolution MRF [22] | 94.1 | 89 | —
13 | Gabor + LDA + SVM [26] | 94.0 | 86 | —
14 | LLR [16] (for 23°) | 93.5 | — | —
15 | LLR [16] (for 45°) | 89.7 | — | —
16 | appearance based + light field [11] | 78.8 | — | —
17 | TFA [13] (for 23°) | 100b | — | —
18 | TFA [13] (for 65°) | 91 | — | —
19 | 3 ODV [18] | 91.16 | — | —
20 | 2DPCA + ELM (very low pose variation) [7] | — | 99.78c | —
21 | Gabor + LBP + GMN [6] | 93.3 | 98.56d | 86.9d
22 | complex moment [27] | — | 98.0e | —
23 | proposed method | 94.9 | 96.6 | 76.5

a Results are calculated on the same platform by simulating the algorithm in [21].
b Result is only for the 23° pose and 65° pose [13].
c Pose variation or the set of the database has not been explicitly declared in [7]. It seems to be less than 10°, because the results are comparable with the ORL [41] database, which has pose variation of less than 10°.
d Pose variation is declared to be less than 10° [6].
e Results are only for the fb set of the FERET database [27].
Step 7: For all the images available in the dataset, do Steps 8 to 14.
Step 8: Repeat Steps 9 to 14 until the error becomes negligibly small (10^{-6}).
Step 9: Set all inputs to the activation x_i^n. Set all outputs to the activation Y_k^n.
Step 10: Compute the curvelet function at the hidden node using (13).
Step 11: Calculate the output of the curvelet neural network using (14).
Step 12: Calculate the MSE for the kth pattern using

e_k = \frac{1}{2}\left(Y_k - D_k\right)^2

Step 13: Calculate the error function E using (15).
Step 14: Update the weights W_{ij} and V_{jk}, translation b_{ij} and dilation a_j using (20)–(23).
Step 15: Test the curvelet neural network so trained on unknown patterns.
The curvelet neural network converges using (14) and the gradient descent rule (back-propagation algorithm). The activation function is the curvelet function given in (13). The training set includes 2–8 images per subject, chosen randomly. Based on the experimentation, observations are made
for the false acceptance rate (FAR) and false rejection rate (FRR). The percentage of impostor images accepted is called the FAR. The FAR is computed by training the proposed system with faces from one database and testing with faces from another database, and vice versa. The FRR is the percentage of rejection of genuine images. The FRR is calculated by training the curvelet neural network with images from one database and testing on images from the same database.
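Given genuine and impostor match scores, the two rates follow directly from a decision threshold. The sketch below is a minimal illustration; the score values are illustrative, not from the paper:

```python
import numpy as np

def far_frr(genuine_scores, impostor_scores, threshold):
    """FAR = fraction of impostor attempts accepted; FRR = fraction of
    genuine attempts rejected, at a given acceptance threshold."""
    genuine_scores = np.asarray(genuine_scores, dtype=float)
    impostor_scores = np.asarray(impostor_scores, dtype=float)
    far = np.mean(impostor_scores >= threshold)  # impostors wrongly accepted
    frr = np.mean(genuine_scores < threshold)    # genuine users wrongly rejected
    return float(far), float(frr)

# Illustrative score distributions (assumed numbers).
genuine = [0.9, 0.8, 0.85, 0.95, 0.7]
impostor = [0.2, 0.4, 0.35, 0.6, 0.1]
print(far_frr(genuine, impostor, 0.65))  # both rates are 0.0 at this threshold
```

Sweeping the threshold over the score range and plotting the genuine acceptance rate (1 − FRR) against FAR yields the ROC curves discussed later.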
5 Experiment results
The face recognition experiments were carried out in MATLAB 2012b, on a 32-bit Intel(R) Core(TM) 2 Duo 2.10 GHz processor with 3 GB RAM. Forty subjects each from the FERET [34], LFW [35] and CMU-PIE [36] databases were considered. The FERET database [34] consists of 14 051 eight-bit greyscale images of human heads with views
ranging from frontal to left and right profiles. Fa, containing 1196 frontal images of 1196 subjects, is used as the gallery set, whereas Fb contains 1195 images with expression variations, Fc contains 194 images taken under different illumination conditions, Dup I includes 722 images of different subjects taken later in time, between 1 min and 1031 days, and Dup II contains 234 images, a subset of Dup I taken at least 18 months later; these were used as the probe sets. LFW [35] is a database for unconstrained face recognition. The data set contains more than 13 000 images of faces collected from the web; 1680 of the people pictured have two or more distinct photos in the data set. In LFW, as the number of images used for training is one, there is a reduction in the recognition rate. For comparison, we simulated the available methods for the LFW database on a common platform. CMU-PIE is a database of 41 368 images of 68 people, each imaged under 13 different poses, 43 different illumination conditions and with four different expressions by extending the CMU 3D Room; this database is known as the CMU Pose, Illumination and Expression (PIE) database.
In the databases, ten images per subject having pose and illumination variation were considered. The database was divided into training and test sets by randomly considering one image from each subject for the training set and one for the test set. The training images were decomposed using the curvelet transform at scale = 2 and angle = 8. Thus nine components were produced, including one approximate and eight detailed subbands. The approximate and detailed subbands for an example image from the FERET database are shown in Fig. 1. Histograms of the approximate and detailed subbands are shown in Fig. 6. It is clear from the figure that the approximate bands are much affected by pose and illumination, but the detailed subbands are less affected.
This is the reason for considering the detailed subbands in the proposed method.
It is evident from Table 1 that, as the number of features increases, both the recognition rate and the training time increase.
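The feature extraction described above reduces each detailed subband to a few statistical moments before the network sees it. A minimal sketch of that idea in Python (the curvelet decomposition itself is assumed to come from an external fast discrete curvelet transform implementation [30]; the function names here are illustrative, not the paper's code):

```python
import math

def subband_moments(coeffs):
    """Mean, standard deviation, skewness and kurtosis of one flattened
    subband -- the kind of higher-order statistical features described
    in the text (illustrative helper, not the paper's code)."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    std = math.sqrt(var)
    skew = sum((c - mean) ** 3 for c in coeffs) / (n * std ** 3)
    kurt = sum((c - mean) ** 4 for c in coeffs) / (n * std ** 4)
    return [mean, std, skew, kurt]

def feature_vector(detailed_subbands):
    """Concatenate the moments of all detailed subbands; with
    scale = 2 and angle = 8 there are eight detailed subbands,
    giving a 32-dimensional feature vector."""
    feats = []
    for band in detailed_subbands:
        feats.extend(subband_moments(band))
    return feats
```

Note that a constant subband would give `std = 0`, so a real implementation would need to guard the division.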
Fig. 7 ROC plots for different comparable methods
a FERET database
b CMU-PIE database
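The ROC comparison in Fig. 7 is built from genuine (same-subject) and impostor (different-subject) match scores. A hedged sketch of how such (FAR, GAR) points can be traced, with toy scores rather than the paper's data:

```python
def roc_points(genuine, impostor, thresholds):
    """Trace (FAR, GAR) pairs over a list of similarity thresholds,
    accepting a match when its score is >= the threshold.
    `genuine` holds same-subject scores, `impostor` different-subject."""
    points = []
    for t in thresholds:
        gar = sum(s >= t for s in genuine) / len(genuine)    # genuine acceptance rate
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptance rate
        points.append((far, gar))
    return points

# Toy scores only -- not the paper's data.
genuine = [0.9, 0.8, 0.85, 0.7]
impostor = [0.3, 0.4, 0.2, 0.6]
curve = roc_points(genuine, impostor, [0.0, 0.5, 1.0])
```

Under this convention, the ideal curve described in the text is GAR = 1 at every FAR.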
The training time of the proposed method is the least on both databases, because of the orthogonality of the curvelet. The recognition rate of the curvelet neural network is higher than that of the other neural networks. The effectiveness of the proposed method was also tested for different numbers of views of the same face: 21 images of 100 subjects were considered, training was done using one to seven images at different angles, and the remaining images formed the test set. The proposed method showed slightly weaker results for one view, but better results for larger numbers of views; the results are given in Table 2. The effectiveness of the proposed method was also evaluated by changing the dimensionality of the data, as shown in Table 3. As the dimensionality of the data increases, the recognition accuracy also increases, and the table shows that the proposed method performs well under wide variation in pose. The proposed algorithm is compared with other state-of-the-art methods for recognition accuracy in Table 4, where the results are calculated on the same platform by simulating the algorithm in [21]. In [16] the results are only for the 23° and 65° poses. In [7], the pose variation or the database set used is not explicitly declared in the results; it appears to be less than 10°, because the results are comparable to those on the ORL [41] database, which has pose variation of less than 10°. In [26] the pose variation is declared to be <10°, and the results are only for the Fb set of the FERET database [27].
To visualise the effectiveness of the proposed method, the receiver operating characteristic (ROC) curve, which plots the genuine acceptance rate against the FAR, is shown for the proposed method and the state-of-the-art methods in Fig. 7. It is clear from Fig. 7 that the proposed scheme is efficient, as the obtained ROC curve approaches the ideal curve for both the FERET and CMU-PIE datasets; the ideal ROC curve is a straight line at a genuine acceptance rate of 1 for FAR from 0 to 1.
Fig. 8a shows the improvement in recognition accuracy for features from the approximate band against moments of the detailed subband. The efficiency for all poses using curvelet moments is higher than that using moments of the image, which shows that the curvelet moments are more effective than the moments of the image. The recognition performance of the proposed method for each of the 13 poses of the CMU-PIE test set, in comparison with other methods, is shown in Fig. 8b. It is clear from the figure that the proposed method outperforms the other existing state-of-the-art methods.

Fig. 8 Recognition accuracy for different poses of the CMU-PIE database
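The per-pose breakdown of Fig. 8b amounts to grouping test decisions by pose and averaging their correctness. A small sketch of that grouping (the pose labels and identifiers below are illustrative, not taken from the paper):

```python
def per_pose_accuracy(decisions):
    """Recognition accuracy per pose from (pose, predicted_id, true_id)
    triples -- a sketch of the grouping behind a per-pose comparison."""
    correct, total = {}, {}
    for pose, pred, true in decisions:
        total[pose] = total.get(pose, 0) + 1
        correct[pose] = correct.get(pose, 0) + (pred == true)
    return {pose: correct[pose] / total[pose] for pose in total}
```

For CMU-PIE the keys would range over the 13 camera poses of the database.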
6 Conclusion
This paper presents an efficient face recognition technique using higher-order invariant moments of the curvelet subbands. It enhances the recognition rate, as the invariant higher-order moments capture both global and local information of the face image. Furthermore, the curvelet neural network outperforms other variants of neural networks in terms of convergence time and recognition rate. Experiments on the LFW, FERET and CMU-PIE face databases show that the proposed method is robust to changes in pose and slight variations in lighting conditions, and outperforms all the curvelet-based face recognition methods considered.
7 Acknowledgment
This work was supported by the Department of Science and Technology, New Delhi, India, under Technology System Development Scheme DST/TSG/ICT/2011/56-G.
8 References
1 Aroussi, M.E., Hassouni, M.E., Ghouzali, S., Rziza, M., Aboutajdine, D.: ‘Local appearance based face recognition method using block based steerable pyramid transform’, Signal Process., 2011, 91, pp. 38–50
2 Guan, N., Tao, D., Luo, Z., Yuan, B.: ‘NeNMF: an optimal gradientmethod for non-negative matrix factorization’, IEEE Trans. SignalProcess., 2012, 60, (6), pp. 2882–2898
3 Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., Ma, Y.: ‘Robust recovery ofsubspace structures by low-rank representation’, IEEE Trans. PatternAnal. Mach. Intell., 2012, 35, (1), pp. 170–174
4 Zhang, B., Gao, Y., Zhao, S., Liu, J.: ‘Local derivative pattern versuslocal binary pattern: face recognition with high-order local patterndescriptor’, IEEE Trans. Image Process., 2010, 19, (2), pp. 533–544
5 Shen, L., Bai, L.: ‘A review on Gabor wavelets for face recognition’,Pattern Anal. Appl., 2006, 9, (2), pp. 273–292
6 Sharma, P., Arya, K.V., Yadav, R.N.: ‘Efficient face recognition usingwavelet based generalized neural network’, Signal Process., 2013, 93,(6), pp. 1557–1565
7 Mohammad, A.A., Minhas, R., Wu, Q.M.J., Sid-Ahmed, M.A.: ‘Human face recognition based on multidimensional PCA and extreme learning machine’, Pattern Recognit., 2011, 44, pp. 2588–2597
8 Zainuddin, Z., Pauline, O.: ‘Modified wavelet neural network in function approximation and its application in prediction of time-series pollution data’, Appl. Soft Comput., 2011, 11, (8), pp. 4866–4874
9 Pentland, A., Moghaddam, B., Starner, T.: ‘View-based and modular eigenspaces for face recognition’. Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1994, pp. 84–91
10 Cootes, T.F., Walker, K., Taylor, C.J.: ‘View-based active appearancemodels’. Proc. Fourth IEEE Int. Conf. Automatic Face and GestureRecognition, 2000, pp. 227–232
11 Gross, R., Matthews, I., Baker, S.: ‘Appearance based face recognition and light-fields’, IEEE Trans. Pattern Anal. Mach. Intell., 2004, 26, pp. 449–465
12 Chai, X., Shan, S., Gao, W.: ‘Pose normalization for robust facerecognition based on statistical affine transformation’. Information,Communications and Signal Processing Conf., 2003, 3, pp. 1413–1417
13 Prince, S.J.D., Elder, J.H., Warrell, J., Felisberti, F.M.: ‘Tied factoranalysis for face recognition across large pose differences’, IEEETrans. Pattern Anal. Mach. Intell., 2008, 30, (6), pp. 1–14
14 Shan, T., Lovell, B.C., Chen, S.: ‘Face recognition robust to head posefrom one sample image’. Proc. 18th Int. Conf. Pattern Recognition,2006, pp. 515–518
15 Sarfraz, M.S., Hellwich, O.: ‘Probabilistic learning for fully automaticface recognition across pose’, Image Vis. Comput., 2010, 28,pp. 744–753
16 Chai, X., Shan, S., Chen, X., Gao, W.: ‘Local linear regression (LLR)for pose invariant face recognition’, IEEE Trans. Image Process.,2007, 16, (7), pp. 1716–1729
17 Choi, S., Choi, C., Kwak, N.: ‘Face Recognition based on 2D Imagesunder illumination and pose variations’, Pattern Recognit. Lett., 2011,32, pp. 561–571
18 Wang, J., You, J., Li, Q., Xu, Y.: ‘Orthogonal discriminant vector for face recognition across pose’, Pattern Recognit., 2012, 45, (12), pp. 4069–4079, http://dx.doi.org/10.1016/j.patcog.2012.04.012
19 Moallem, P., Mousavi, B.S., Monadjemi, S.A.: ‘A novel fuzzy rule basesystem for pose independent faces detection’, Appl. Soft Comput., 2011,11, (2), pp. 1801–1810
20 Huang, F.J., Zhou, Z., Zhang, H.J., Chen, T.: ‘Pose invariant facerecognition’. Proc. IEEE Int. Conf. Automatic Face and GestureRecognition, Grenoble, France, 2000, pp. 245–250
21 Singh, R., Vatsa, M., Ross, A., Noore, A.: ‘A mosaicing scheme for pose-invariant face recognition’, IEEE Trans. Syst. Man Cybern. B, 2007, 37, (5), pp. 1212–1225
22 Arashloo, S.R., Kittler, J.J., Christmas, W.J.: ‘Pose invariant facerecognition by matching on multiresolution MRF’s linked bysupercoupling transform’, Comput. Vis. Image Underst., 2011, 115,pp. 1073–1083
23 Zhang, H., Zhang, Y., Huang, T.S.: ‘Pose robust face recognition viasparse representation’, Pattern Recognit., 2013, 46, pp. 1511–1521
24 Lu, C.Y., Min, H., Gui, J., Zhu, L., Lei, Y.K.: ‘Face recognition via weighted sparse representation’, J. Vis. Commun. Image Represent., 2013, 24, pp. 111–116
25 Ma, B., Chai, X., Wang, T.: ‘A novel feature descriptor based on biologically inspired feature for head pose estimation’, Neurocomputing, http://dx.doi.org/10.1016/j.neucom.2012.11.005, 115, (4), pp. 1–10
26 Meshgini, S., Aghagolzadeh, A., Seyedarabi, H.: ‘Face recognition using Gabor-based direct linear discriminant analysis and support vector machine’, Comput. Electr. Eng., http://dx.doi.org/10.1016/j.compeleceng.2012.12.011, 39, (3), pp. 727–745
27 Singh, C., Sahan, A.M.: ‘Face recognition using complex wavelet moments’, Opt. Laser Technol., 2013, 47, pp. 256–267
28 Candes, E.J., Donoho, D.L.: ‘Curvelets – a surprisingly effective nonadaptive representation for objects with edges’ (Vanderbilt University Press, Nashville, TN, 2000)
29 Candes, E.J., Donoho, D.L.: ‘New tight frames of curvelets and optimalrepresentations of objects with C2 singularities’, Commun. Pure Appl.Math., 2002, 57, (2), pp. 219–266
30 Candes, E.J., Demanet, L., Donoho, D.L., Ying, L.: ‘Fast discrete curvelettransform’, SIAM Multiscale Model. Simul., 2005, 5, pp. 861–899
31 Papakostas, G.A., Koulouriotis, D.E., Karakasis, E.G., Tourassis, V.D.: ‘Moment-based local binary patterns: a novel descriptor for invariant pattern recognition applications’, Neurocomputing, 2013, 99, pp. 358–37
32 Sharma, P., Arya, K.V., Yadav, R.N.: ‘Extraction of facial features usinghigher order moments in curvelet transform and recognition usinggeneralized mean neural networks’. Int. Conf. Soft Computing forProblem Solving at IIT Roorkee, 20–22 December, 2011, vol. 131,pp. 717–728
33 Yadav, R.N., Kumar, N., Kalra, P.K., John, J.: ‘Learning with generalized-mean neuron model’, Neurocomputing, 2006, 69, pp. 2026–2032
34 Phillips, P.J.: ‘The FERET database and evaluation procedure for face recognition algorithms’, Image Vis. Comput., 1998, 16, (5), pp. 295–306
35 Huang, G.B., Ramesh, M., Berg, T., Learned-Miller, E.: ‘Labeled facesin the wild: a database for studying face recognition in unconstrainedenvironments’, Technical Report 07–49, University of Massachusetts,Amherst, October, 2007
36 Sim, T., Baker, S., Bsat, M.: ‘The CMU Pose, Illumination and Expression database of human faces’, CMU Technical Report CMU-RI-TR-01-02, 2001
37 Sharma, A., Haj, M.A., Choi, J., Davis, L.S., Jacobs, D.W.: ‘Robust pose invariant face recognition using coupled latent space discriminant analysis’, Comput. Vis. Image Underst., 2012, 116, pp. 1095–1110
38 Naseem, I., Togneri, R., Bennamoun, M.: ‘Robust regression for facerecognition’, Pattern Recognit., 2012, 45, pp. 104–118
39 Zhang, H., Nasrabadi, N.M., Zhang, Y., Huang, T.S.: ‘Joint dynamicsparse representation for multi view face recognition’, PatternRecognit., 2012, 45, pp. 1290–1298
40 Xue, H., Zhu, Y., Chan, S.: ‘Local ridge regression for face recognition’, Neurocomputing, 2009, 72, pp. 1342–1346
41 ORL Database at URL: www.uk.research.att.com/facedatabase.html