statistical face recognition and intruder detection via a

Statistical Face Recognition and Intruder Detection Via a k-means Iterative Algorithm:

a Resampling Approach

C. Cifarelli
Dept. of Probability and Applied Statistics
University of Rome “Sapienza”
P.za A. Moro, 5 00185-Rome, Italy

G. Manfredi∗ and L. Nieddu‡
∗Faculty of Political Science, ‡Faculty of Economics
University of Rome “S. Pio V”
Via C. Colombo, 200 00145-Rome, Italy

ABSTRACT

In this paper a face recognition algorithm based on an iterated k-means classification technique is introduced. The suggested algorithm more than holds its own when compared with popular PCA (Principal Component Analysis) algorithms for face recognition. Unlike PCA, the presented algorithm is not a dimensionality reduction algorithm; nonetheless it yields barycentric faces which can be used to determine different types of facial expression, lighting conditions and pose. The accuracy of the PCA and k-means methods has been evaluated under varying expression, illumination and pose using standard face images.

Using a resampling approach we will show how the proposed technique can be used to detect intruders, i.e. pictures of objects or persons not in the database.

Keywords: Face Recognition, Intruder Detection, Constrained k-means, Resampling.

1 INTRODUCTION

Face recognition constitutes a very important problem in computer vision and pattern recognition. It has numerous applications in security systems, scene surveillance, identification, human-machine interfaces and area access control [1]. Face recognition is also quite a difficult problem to solve, since images of the human face are subject to a wide range of variations such as illumination, occlusion, facial expression and individual differences. The aim of a face recognition algorithm should be to recognize a face in a manner that is robust to these image variations. In 1989 a method for face recognition using deformable templates was presented by Yuille et al. [17]. In this method a template of part of a face was deformed in an attempt to get the best fit to the image by minimizing an energy function associated with the template. In 1991 Turk and Pentland published two seminal papers [13] [14] on the application of Principal Component Analysis (PCA) [7] to face recognition. The goal of this approach is to represent a picture of a face in terms of an optimal coordinate system. Among the optimality properties is the fact that the mean-square error introduced by truncating the expansion is minimized. The set of basis vectors which make up this coordinate system is referred to as eigenfaces or eigenpictures.

Since then there have been numerous works extending and implementing the eigenfaces scheme for face recognition, and all sorts of variations have been considered, such as, for instance, eigenmouth, eigennose etc. [1]. Unfortunately the eigenface approach starts to break down when untrained images (intruders) are fed to the classifier. In many ways the algorithm is robust enough to recognize whether a given image is a face or not.

In [1] a survey of several statistical-based, neural network-based and feature-based methods for face recognition has been presented. Nowadays PCA is one of the most promising techniques for frontal face recognition [4].

In 2000 Sim et al. [12] presented a system (ARENA) where, in order to reduce possible noise and speed up the algorithm, images were reduced to 16x16 pixel matrices and then PCA was performed on the reduced matrices. In order to improve the accuracy of PCA methods in face recognition by taking into account varying facial expression, illumination and pose, Gottumukkal and Asari [4] proposed, in 2004, a Modular PCA (MPCA) approach which is partially similar to the modular eigenspaces approach suggested by Pentland et al. [11].

The aim of this paper is to present an iterative k-means based classification algorithm which can be used as a valid alternative to PCA for face recognition. The algorithm presented here more than holds its own when compared with popular PCA methods, always yielding better results on benchmark face images. The approach presented in this paper differs from the previous approaches in a number of ways. It does not use PCA on the images; instead it uses a barycentric approach which can be useful in the case of additive noise on the images. The algorithm used to recognize new images is a k-means based pattern recognition algorithm which has proven to be very effective in various pattern recognition applications (see for instance [2, 5, 8, 9]). Furthermore, as a byproduct, the technique used in this approach yields a way of detecting outliers (i.e. intruders) via a resampling procedure.

In Section 2 the algorithm will be illustrated, while in Section 3 the results of its application to standard benchmark face images will be presented. Section 3.2 is dedicated to the presentation of the intruder detection technique using a resampling approach and to the assessment of its performance on some intruder images. Finally, in Section 4 the general conclusions on the proposed technique will be drawn together with a future research agenda.

2 THE ALGORITHM - T.R.A.C.E.

The algorithm presented in this paper (T.R.A.C.E.: Total Recognition by Adaptive Classification Experiments) is a supervised classification algorithm [15], i.e. a data set of elements with known classes is supposed to be available. Like any supervised learning technique, it is composed of two phases:

1. a training phase: a training set of elements of known classes is used to fine-tune the algorithm for the particular problem at hand.

2. a classification phase: once trained, the algorithm is used to classify elements of unknown classes. These elements are usually referred to as query points.

The performance of the algorithm is assessed via cross-validation [15].

Given a data set of n pattern vectors in R^p, assume a partition defined on the dataset, i.e. each pattern vector is assigned to one and only one of k known classes. Let us assume a Euclidean norm defined on the dataset and let ψ be a function from R^p onto the set C = {1, 2, ..., k} which maps each pattern vector x_j, j = 1, ..., n, into the class c ∈ C that it belongs to. T.R.A.C.E. begins by computing the barycentre of each class, yielding an initial set of k barycentres. Then the Euclidean distance of each pattern vector from each barycentre is computed. If each pattern vector is closer to the barycentre of its own class, the algorithm stops; otherwise there will be a non-empty set M of pattern vectors which belong to one class and are closer to a barycentre of a different class. In M, select the pattern vector x_w that is farthest from the barycentre of its class. This pattern vector will be used as a seed for a new barycentre for class ψ(x_w). A k-means algorithm [3] will then be performed on all the pattern vectors in class ψ(x_w) using, as starting points, the set of barycentres for class ψ(x_w) and the vector x_w. Once the k-means has been performed, the set of barycentres will be composed of k + 1 elements. The barycentres at the new iteration need not be computed for all classes, but only for class ψ(x_w), since the barycentres of the other classes have remained unchanged. In the following step the distance of each pattern vector from all the barycentres is computed anew, and so is the set M (see Figure 1).

Begin

Step 1  Let
        - x_j, j = 1, ..., n be the pattern vectors in the training set
        - B_0 be the set of k initial barycentres b_i, i = 1, ..., k
        - t ← 0

Step 2  Compute the distances of each x_j from all the b_i ∈ B_t
        Let M be the set of x_w that are closer to a barycentre of a class different from their own.

Step 3  while M ≠ ∅
        - Let x_s ∈ M be the vector with the greatest distance from its own barycentre.
        - c ← ψ(x_s)
        - Let B_{t+1} ← B_t ∪ {x_s}
        - For all the elements of class c perform a k-means routine using as starting points the barycentres of B_{t+1} that belong to class c
        - t ← t + 1
        - Compute the distances of each x_j from all the b_i ∈ B_t
        - Let M be the set of x_w that are closer to a barycentre of a class different from their own.

End

Figure 1: T.R.A.C.E. in meta-language

If M is not empty then the pattern vector in M which is farthest from a barycentre of its own class is once again selected to serve as a seed for a new barycentre. This procedure iterates until the set M is empty. The convergence of T.R.A.C.E. in a finite number of steps has been proved in various ways (see [9, 10]). Upon convergence, T.R.A.C.E. yields a set of barycentres whose number is, in the worst case, equal to the number of elements in the dataset and is bounded below by the number of classes. The aim of this algorithm is to find subclasses in the dataset which can be used to classify new vectors of unknown class. It is worth noticing that if the partition defined on the dataset is consistent with the features considered, i.e. if the pattern vectors are linearly separable, then T.R.A.C.E. generates a number of barycentres equal to the number of classes. On the other hand, if the dataset is not linearly separable, then T.R.A.C.E. continues splitting the classes until the subclasses obtained are linearly separable. Clearly, it can continue splitting until all the subclasses are composed of only one vector (singleton). The only case in which it will not converge is when two elements of the dataset belong to different classes but are represented by the same pattern vector [9, 10]. This problem can be easily overcome by increasing the dimension of the vector space. Once T.R.A.C.E. has converged, the set of barycentres can be used to classify new query points by assigning each new element to the class of the barycentre it is closest to. If elements from the training set are used as query points, the algorithm always classifies them correctly because, once it has converged, all pattern vectors in the training set are closer to a barycentre of their own class than to any barycentre of another class.
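The training and classification phases described above can be sketched in Python with NumPy. This is a minimal illustration written from the description in this section, not the authors' implementation: the function names (`trace_fit`, `trace_predict`, `_kmeans`, `_stack`), the plain Lloyd-style k-means, the handling of empty clusters and the cap on the number of splits are all assumptions made for the example.

```python
import numpy as np

def _kmeans(points, centres, n_iter=100):
    """Lloyd-style k-means on `points`, started from the given `centres`."""
    centres = centres.copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        # An empty cluster keeps its previous centre (assumption of this sketch).
        updated = np.array([
            points[assign == j].mean(axis=0) if np.any(assign == j) else centres[j]
            for j in range(len(centres))
        ])
        if np.allclose(updated, centres):
            break
        centres = updated
    return centres

def _stack(centres):
    """Flatten {class: barycentres} into one array plus one class label per row."""
    B = np.vstack(list(centres.values()))
    labels = np.concatenate([np.full(len(b), c) for c, b in centres.items()])
    return B, labels

def trace_fit(X, y, max_splits=None):
    """Split classes until every training vector is closest to an own-class barycentre."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    # Initial set: one barycentre (mean vector) per class.
    centres = {c: X[y == c].mean(axis=0, keepdims=True) for c in np.unique(y)}
    max_splits = len(X) if max_splits is None else max_splits  # safety cap (assumption)
    for _ in range(max_splits):
        B, labels = _stack(centres)
        dist = np.linalg.norm(X[:, None, :] - B[None, :, :], axis=2)
        # M: vectors whose closest barycentre belongs to a different class.
        M = np.flatnonzero(labels[dist.argmin(axis=1)] != y)
        if M.size == 0:
            break
        # Distance of every vector from the closest barycentre of its own class.
        own = np.where(labels[None, :] == y[:, None], dist, np.inf).min(axis=1)
        s = M[own[M].argmax()]          # farthest misassigned vector: the new seed
        c = y[s]
        # Re-run k-means on class c only, seeded with its barycentres plus x_s.
        centres[c] = _kmeans(X[y == c], np.vstack([centres[c], X[s]]))
    return _stack(centres)

def trace_predict(B, labels, Q):
    """Assign each query point to the class of its closest barycentre."""
    Q = np.asarray(Q, dtype=float)
    dist = np.linalg.norm(Q[:, None, :] - B[None, :, :], axis=2)
    return labels[dist.argmin(axis=1)]
```

Calling `B, labels = trace_fit(X, y)` followed by `trace_predict(B, labels, X)` on a training set without duplicated vectors of different classes should reproduce `y`, which mirrors the property stated at the end of this section.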

3 EXPERIMENTAL SETUP

In the following paragraphs the results of the application of the proposed algorithm to benchmark face images are presented, together with the suggestion of an intruder detection technique. The face images have been taken from the AT&T database, which is particularly suitable for testing the robustness of a pattern recognition technique to variations in pose and expression.

The performance of T.R.A.C.E. has been determined via a 10% cross-validation procedure, i.e. 10% of the dataset has been randomly selected for testing and the remaining 90% of the images has been used to train the algorithm. To get a more stable estimate of the correct classification rate, the average of the classification rates over 100 trials has been considered, each time randomly selecting the test set as 10% of the whole dataset of images and training the algorithm on the remaining 90%. T.R.A.C.E. has been compared with the results obtained using the PCA approach to face recognition [13].
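The evaluation protocol just described (100 random 90%/10% splits, with mean, worst and best recognition rates) can be sketched as follows. The names `repeated_holdout`, `fit_barycentres` and `predict_nearest` are hypothetical, and the one-barycentre-per-class classifier is only a stand-in used to keep the example self-contained; it is not T.R.A.C.E.

```python
import numpy as np

def fit_barycentres(X, y):
    """Stand-in classifier (not T.R.A.C.E.): one barycentre per class."""
    classes = np.unique(y)
    return np.array([X[y == c].mean(axis=0) for c in classes]), classes

def predict_nearest(model, Q):
    """Assign each query point to the class of its closest barycentre."""
    B, labels = model
    dist = np.linalg.norm(Q[:, None, :] - B[None, :, :], axis=2)
    return labels[dist.argmin(axis=1)]

def repeated_holdout(X, y, fit, predict, test_fraction=0.10, n_trials=100, seed=0):
    """Mean, worst and best recognition rate over repeated random splits."""
    rng = np.random.default_rng(seed)
    n = len(X)
    n_test = max(1, round(test_fraction * n))
    rates = np.empty(n_trials)
    for t in range(n_trials):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]   # 10% test, 90% training
        model = fit(X[train], y[train])
        rates[t] = np.mean(predict(model, X[test]) == y[test])
    return rates.mean(), rates.min(), rates.max()
```

Any pair of `fit`/`predict` callables with the same signatures can be passed in, so the same loop could be reused to compare different classifiers on identical splits by fixing `seed`.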

The intruder detection technique has been tested both on images, some of human faces and some of objects, randomly taken from the Web and on images of intruders comparable to those in the dataset of non-intruders.

3.1 Face Recognition

The AT&T database of faces (formerly 'The ORL Database of Faces') contains a set of face images taken between April 1992 and April 1994 at the AT&T Laboratories Cambridge. The database was used in the context of a face recognition project carried out in collaboration with the Speech, Vision and Robotics Group of the Cambridge University Engineering Department. There are 10 different images of each of 40 distinct subjects. For most subjects the images were shot at different times with different lighting conditions, facial expressions and facial details (glasses/no glasses). All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position, although some side movements, head rotations and tilting were tolerated. The size of each image is 92x112 pixels, with 256 grey levels per pixel. A specimen of three individuals from the AT&T database is depicted in Figure 2.

Figure 2: Images from the AT&T database

To take into account the effect of different lighting conditions the algorithm has been applied both to the original images and to normalized images. Normalization of pixel intensities has been obtained columnwise by subtracting from each pixel in the image the mean gray level and dividing by its standard deviation. In Table 1 the correct recognition rates of T.R.A.C.E. are displayed both for the Original Dataset and for the Normalized Dataset. Once again they have been obtained in a 10% cross-validation framework. The figures show that the performance of the algorithm does not improve by normalizing the data: this could be due to the fact that in the AT&T dataset all images have been taken against a dark homogeneous background, therefore normalizing for variations in the background should have little or no effect on the recognition rate. Nonetheless the results are very good: the corresponding PCA results on the same dataset vary in the literature from 0.88 to 0.95 (see e.g. [16, 18, 19]) (the large variability being mainly due to different experimental setups and cross-validation schemes) and are significantly different from the random recognition rate.

T.R.A.C.E. (AT&T)    Mean     Worst    Best
Original Data        0.953    0.844    1.000
Normalized Data      0.932    0.791    1.000

Table 1: Results on the AT&T Database: estimates obtained over 100 trials.
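The normalization step can be illustrated with a short sketch. The text does not say which columns "columnwise" refers to; the example below assumes that each image is one column of the data matrix, so that every image is standardized by its own mean gray level and standard deviation. The function name `normalise_images` and the treatment of constant images are assumptions of this sketch.

```python
import numpy as np

def normalise_images(images):
    """Standardize each image by its own mean gray level and standard deviation.

    `images` has shape (n_images, height, width). Assumption: one image per
    column of the data matrix, hence one mean/std pair per image.
    """
    imgs = np.asarray(images, dtype=float)
    flat = imgs.reshape(len(imgs), -1)
    mean = flat.mean(axis=1, keepdims=True)
    std = flat.std(axis=1, keepdims=True)
    std[std == 0] = 1.0          # constant images are mapped to all zeros
    return ((flat - mean) / std).reshape(imgs.shape)
```

After this step every image has zero mean and unit standard deviation, so global brightness and contrast differences between shots are removed while the spatial pattern of gray levels is kept.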

3.2 Intruder Detection via Resampling

The algorithm proposed in this paper assigns query points according to a minimum distance criterion. Namely, so far a Closed Universe [6] approach has been applied, which means that every individual represented by an image among the query points is also represented in the training set. This implies that if a vector representing the picture of an individual who is not in the dataset is submitted to the classifier, it will be classified according to the class of the barycenter which it is closest to. Therefore the algorithm, although very effective when applied to images of persons present in the dataset, is not able to determine whether the person is an intruder or not, or whether the image shows a person at all.

To solve this problem the information on the distance between a particular query point and the barycenter can be used to determine if the query point is an intruder.

What is to be decided is whether or not the distance d of the query point x from the closest barycenter B_c is small enough to imply that x is similar to B_c and therefore to the other images in the training set. If not, the query point is considered to be too different even from the closest barycenter and should then be treated as an intruder.

Given a dataset, let F be the theoretical distribution of the distances of all the elements in the dataset from their respective barycenters, i.e. F is the distribution of distances of the elements which are correctly classified.

The problem of determining if an image is an intruder can therefore be formulated as the following statistical system of hypotheses:

\[
\begin{cases}
H_0 : d \in F \\
H_1 : d \notin F
\end{cases}
\qquad (1)
\]

If the null hypothesis H_0 is rejected at a certain level of significance α, then the query point x should be considered an intruder. If H_0 cannot be rejected then there is not sufficient evidence, at the selected level of significance, to assume that the query point is an intruder.

Figure 3: Resampled distribution of distances of correctly classified elements: mean=2446.1; std=468.7; 95th percentile=3359

However, the distribution F is unknown and can be approximated using its bootstrap distribution F_B. In order to do so the results of the training phase performed on the AT&T dataset have been used. Namely, the process relies on resampling the training set from the AT&T dataset when applying the 10% cross-validation procedure. In each replication of the resampling process, upon convergence, the set of distances of the correctly classified elements from their barycenters will be available, one distance for each element of the training set. The resampling process has been performed 100 times, each time selecting 90% of the dataset to use in training and recording, upon convergence, the distribution of the distances of the elements from their own barycenter. All these distances have been considered together, obtaining an empirical distribution based on about 36000 points (100 replications of 360 training images each), which can be considered the bootstrap distribution F_B of the distances of each face image in the dataset from its own barycenter, i.e. the resampled distribution F_B under the null hypothesis that the element is not an intruder.

This distribution is depicted in Figure 3 together with its mean (2446.1), standard deviation (468.7) and 95th percentile (3359).
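The construction of the resampled distribution can be sketched as below. The names `pooled_own_distances` and `fit_barycentres` are hypothetical, and the one-barycentre-per-class training routine is only a stand-in for the T.R.A.C.E. training phase; any routine returning barycentres and their class labels could be plugged in.

```python
import numpy as np

def fit_barycentres(X, y):
    """Stand-in training routine (not T.R.A.C.E.): one barycentre per class."""
    classes = np.unique(y)
    return np.array([X[y == c].mean(axis=0) for c in classes]), classes

def pooled_own_distances(X, y, fit, n_rep=100, train_fraction=0.90, seed=0):
    """Pool, over `n_rep` resamples, the distance of every training vector
    from the closest barycentre of its own class."""
    rng = np.random.default_rng(seed)
    n = len(X)
    n_train = round(train_fraction * n)
    pooled = []
    for _ in range(n_rep):
        train = rng.permutation(n)[:n_train]          # random 90% of the dataset
        Xt, yt = X[train], y[train]
        B, labels = fit(Xt, yt)
        dist = np.linalg.norm(Xt[:, None, :] - B[None, :, :], axis=2)
        own = np.where(labels[None, :] == yt[:, None], dist, np.inf).min(axis=1)
        pooled.append(own)
    return np.concatenate(pooled)
```

With 400 vectors, 100 replications and a 90% training fraction the pooled sample contains 36000 distances, and a critical value at level α = 0.05 can then be read off as `np.percentile(pooled, 95)`.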

To test the null hypothesis H_0 in (1) at the significance level α, the 100(1 − α)-th percentile d_α of the resampled distribution F_B will be used as a critical value. Should the distance d of a query point from the barycenter it is assigned to be greater than d_α, then H_0 should be rejected, i.e. the image should be classified as an intruder. On the other hand, if d is less than or equal to d_α then there is not enough evidence to classify the image as an intruder according to the resampled distribution, and therefore the image should be classified according to the class of the barycenter it is closest to.

(8792.6)  (5157.1)  (6579.0)  (4121.9)  (3817.7)
(3355.9)  (7235.8)  (4972.6)  (6828.5)  (7974.3)
(5387.2)  (7098.6)  (4236.3)  (4714.7)  (6490.8)

Figure 4: Images of intruders, i.e. objects or persons not present in the training set, together with their distance d from the closest barycenter
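The decision rule above amounts to a nearest-barycenter classifier with a reject option. A minimal sketch, assuming the barycenters and the critical value d_α are already available; the function name `classify_or_reject` and the use of `None` to mark an intruder are choices made for this example:

```python
import numpy as np

def classify_or_reject(query, B, labels, d_alpha):
    """Return (label, distance); label is None when H_0 is rejected (intruder)."""
    dist = np.linalg.norm(B - np.asarray(query, dtype=float), axis=1)
    nearest = dist.argmin()
    if dist[nearest] > d_alpha:       # strictly greater: reject H_0
        return None, dist[nearest]
    return labels[nearest], dist[nearest]
```

A distance exactly equal to d_α is not rejected, in line with the "less than or equal to" condition stated in the text.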

To test the performance of this technique a sample of 15 images has been downloaded from the Internet, and the images have been cut and scaled in order to fit the format and shape of the images in our dataset (92 × 112 pixel gray-scale images) (Figure 4). The images have been classified according to the barycenters obtained in the training phase on the AT&T dataset. In Figure 4 the set of 15 intruders is depicted together with the distance d of each image from the closest barycenter of the AT&T dataset. All the images produce a distance d from the barycenter which is greater than the 95th percentile (3359) of the resampled distribution (α = 0.05), except for the “Bush” image, which, with a distance of 3355.9, is very close to the boundary of the rejection region. Therefore 14 out of 15 images have been correctly classified as intruders by the proposed technique, with an error of 6.7%.
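The count reported above can be checked directly from the distances listed in Figure 4 and the critical value 3359; only the variable names are introduced here.

```python
# Distances d of the 15 intruder images from their closest barycenter (Figure 4).
distances = [8792.6, 5157.1, 6579.0, 4121.9, 3817.7,
             3355.9, 7235.8, 4972.6, 6828.5, 7974.3,
             5387.2, 7098.6, 4236.3, 4714.7, 6490.8]
d_alpha = 3359.0   # 95th percentile of the resampled distribution

flagged = [d > d_alpha for d in distances]
n_flagged = sum(flagged)
error_rate = 1 - n_flagged / len(distances)
print(n_flagged, len(distances), round(100 * error_rate, 1))  # 14 15 6.7
```

The single undetected image is the one at distance 3355.9, which falls short of the critical value by about 3 units.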

To test this technique on images of the same “nature” as those in the dataset, five individuals have been selected out of the 40 in the dataset, i.e. 50 images, ten for each individual, have been removed from the 400 images in the dataset. The barycenters obtained on the remaining 35 individuals have been used to classify these five individuals, who should now be considered intruders because they have been removed from the dataset before training.

The performance of the proposed technique is reported in Table 2: only 8% of the images have not been recognized as intruders.

Considering that each individual is represented by ten images, and that only 4 images have not been recognized as intruders, using a majority vote rule all five individuals can be recognized as intruders.

Intruder    Non-intruder    Total Images
46          4               50
(92%)       (8%)            (100%)

Table 2: Performance of the algorithm on 50 images of intruders

Images classified as          Total
intruder    non-intruder      Images
57          343               400
(14.25%)    (85.75%)          (100%)

Table 3: Performance of the algorithm for false rejection on all the 400 images
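The majority vote argument can be made concrete with a small sketch. The per-individual breakdown of the 4 undetected images is not reported, so the example uses a hypothetical worst case in which all 4 belong to the same person; the function name `majority_vote` and the identifiers 0 to 4 are also assumptions.

```python
from collections import defaultdict

def majority_vote(person_ids, flagged):
    """Declare a person an intruder when more than half of that person's
    images have been flagged as intruders."""
    votes = defaultdict(list)
    for pid, f in zip(person_ids, flagged):
        votes[pid].append(bool(f))
    return {pid: sum(v) > len(v) / 2 for pid, v in votes.items()}

# Hypothetical worst case: 5 people, 10 images each, and all 4 undetected
# images belong to person 0 (the real breakdown is not given in the text).
person_ids = [p for p in range(5) for _ in range(10)]
flagged = [True] * 50
for i in range(4):
    flagged[i] = False

result = majority_vote(person_ids, flagged)
```

Even in this worst case person 0 still has 6 flagged images out of 10, so every individual is declared an intruder regardless of how the 4 misses are distributed.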

To estimate the false rejection rate, all the 400 images have been tested to see if they could be misrecognized as intruders by the algorithm. Only 57 out of 400 images have been recognized as intruders (false rejection), as displayed in Table 3.

4 CONCLUSIONS

A supervised k-means based pattern recognition algorithm has been presented in this paper. Results show that the algorithm is a viable alternative to the PCA approach to face recognition and more than holds its own when compared with modern PCA approaches to face recognition. The proposed technique is flexible and has been applied to a variety of pattern recognition problems, always yielding good results.

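The distribution of distances of images from their barycenters obtained during the 10% cross-validation phase has been used to detect intruders, i.e. images of people or objects that are not represented in the training set. This technique seems to work fairly well: in the experiments reported here it flagged 14 of 15 Web images and 46 of 50 images of held-out individuals as intruders, with 57 false rejections out of 400 images.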

As a future research agenda, we are planning to introduce the intruder detection routine into the classification algorithm and to carry out extensive experiments on benchmark datasets in order to determine the performance of the algorithm in detecting intruders and recognizing individuals in large datasets. Following a boosting philosophy, we are also planning to obtain a weighted resampled distribution of distances, i.e. the distances that make up the resampling distribution will be weighted according to the performance of the algorithm on the remaining 10% of the dataset that has been left out for testing during cross-validation.

REFERENCES

[1] R. Chellappa, C. L. Wilson, and S. Sirohey. Human and machine recognition of faces: a survey. In Proc. of IEEE, volume 83, pages 705–740, 1995.

[2] C. Cifarelli, L. Nieddu, O. Seref, and P. Pardalos. K-T.R.A.C.E.: A kernel k-means procedure for classification. Computers and Operations Research, 34(10):3154–3161, 2007.

[3] A. D. Gordon. Classification. Chapman & Hall Ltd, London; New York, 1999.

[4] R. Gottumukkal and V. K. Asari. An improved face recognition technique based on modular PCA approach. Pattern Recognition Letters, 25:429–436, 2004.

[5] G. Grimaldi, C. Manna, L. Nieddu, G. Patrizi, and P. Simonazzi. A diagnostic decision support system and its applications of the choice of suitable embryos in human assisted reproduction. Central European Journal of Operations Research, 10(1):29–44, April 2002.

[6] R. Gross, J. Shi, and J. Cohn. Quo vadis face recognition. In Proceedings of the 3rd Workshop on Empirical Evaluation Methods in Computer Vision, 2001.

[7] I. T. Jolliffe. Principal Component Analysis. Springer Verlag, New York, 2002.

[8] A. Lozano, G. Manfredi, and L. Nieddu. An algorithm for the recognition of levels of congestion in road traffic problems. Mathematics and Computers in Simulation, 2007.

[9] L. Nieddu and G. Patrizi. Formal properties of pattern recognition algorithms: A review. European Journal of Operational Research, 120:459–495, 2000.

[10] L. Nieddu and G. Patrizi. Optimization and algebraic techniques for image analysis. In M. Lassonde, editor, Approximation, Optimization and Mathematical Economics, pages 235–242. Physica-Verlag, 2001.

[11] A. Pentland, B. Moghaddam, and T. Startner. View-based and modular eigenspaces for face recognition. In IEEE Conf. on Computer Vision and Pattern Recognition, 1994.

[12] T. Sim, R. Sukthankar, M. Mulling, and S. Baluja. Memory-based face recognition for visitor identification. In 4th Intl. Conf. on Face and Gesture Recognition, pages 214–220, Grenoble, France, March 2000.

[13] M. A. Turk and A. P. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):71–86, 1991.

[14] M. A. Turk and A. P. Pentland. Face recognition using eigenfaces. In Proc. CVPR, pages 586–591, June 1991.

[15] S. Watanabe. Pattern Recognition: Human and Mechanical. Wiley, New York, 1985.

[16] J. ying Gan, D. pei Zhou, and C.-Z. Li. A method for improved PCA in face recognition. International Journal of Information Technology, 11(11):79–85, 2005.

[17] A. L. Yuille, D. S. Cohen, and P. W. Halliman. Feature extraction from face using deformable templates. In Proc. CVPR, San Diego, CA, June 1989.

[18] D. Zhang and Z.-H. Zhou. (2D)^2PCA: Two-directional two-dimensional PCA for efficient face representation and recognition. Neurocomputing, 69(1-3):224–231, 2005.

[19] J. Y. Zhang, D. Frangi, and A. J. yu Yang. Two-dimensional PCA: a new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26:131–137, 2004.
