[ms.c thesis] a new illumination normalization approach for face recognition 2009

A New Illumination Normalization

Approach for Face Recognition

A thesis submitted to the Department of Computer Science, Faculty of Computer and

Information Sciences, Ain Shams University, in partial fulfillment of the requirements for the degree of Master of Computer and Information Sciences

By:

Ahmed Salah ELDin Mohammed ELSayed

B.Sc. in Computer Science, Faculty of Computer and Information Sciences,

Ain Shams University. Cairo, Egypt

Supervised By:

Faculty of Computer and Information Sciences Ain Shams University

Cairo – 2009

Ain Shams University Faculty of Computer

& Information Sciences Computer Science Department

Prof. Dr. Taha I. ELAreif CS Dept., Faculty of Computer and

Information Sciences, Ain Shams University

Dr. Haitham ELMessairy CS Dept., Faculty of Computer and

Information Sciences, Ain Shams University

II

Acknowledgement

In the name of Allah, most Beneficent, most Merciful: “And whatever of

comfort you enjoy, it is from Allah…” Al-Nahl (53). First and foremost, I humbly

give my deep thanks to Allah and my Parents for giving me the opportunity and

the strength to accomplish this work. Then, all thanks to students, colleagues,

relatives and all who pray for me to finish my work.

I would like to thank Prof. Dr. Taymoor Nazmy - the vice dean of the faculty of

Computer and Information Sciences, Ain Shams University- for his support. My

great thanks to Prof. Dr. Saied ELGhonaimy, Computer Systems Dept. – Faculty

of Computer and Information Sciences, Ain Shams University, for his

encouragement and support. Special thanks to Prof. Dr. Mohammed Hashem,

Head of Information Systems Dept. – Faculty of Computer and Information

Sciences, Ain Shams University, for his advices and support.

I would like to express my deep appreciation and thanks to all who supervise me:

Prof. Dr. Mostafa Seyam (god bless him), Prof. Dr. Taha I. El-Areif, Dr.

Khaled A. Nagaty and Dr. Haitham ELMessairy, for their great help and

encouragement during the execution of this work. Special thanks to Dr. Khaled

A. Nagaty for his follow, care, patience and support for me. Also, I would like to

thank my colleagues Mona Wagdy, Amr EL-Desoky, Kareem Emara,

Mahmoud Hossam and Mohammed Hamdy for their valuable help and

cooperation. I would like to thank the team of “Face Recognition using Eigenface, NN and

Mosaicing Techniques” graduation project, Faculty of Computer and Information

Sciences, Ain Shams University, 2005, for their project which is used in

experiments of this work.

III

Publications

The work presented in this thesis has been published in the following conferences:

1. T.I. El-Arief, K.A. Nagaty, and A.S. El-Sayed, “Eigenface vs.

Spectroface: A Comparison on the Face Recognition Problems”,

IASTED Signal Processing, Pattern Recognition, and Applications

(SPPRA’07), Austria, 2007.

2. S. El-Sayed, K. A. Nagaty, T. I. El-Arief, “An Enhanced Histogram

Matching Approach using the Retinal Filter’s Compression Function

for Illumination Normalization in Face Recognition”, ICIAR’08,

Springer-Verlag LNCS 5112, pp. 873–883, Portugal, 2008.

IV

Abstract

Although many face recognition techniques and systems have been proposed,

evaluations of the state-of-the-art techniques and systems have shown that

recognition performance of most current technologies degrades due to the

variations of illumination. In the last face recognition vendor test FRVT 2006,

they conclude that relaxing the illumination condition has a dramatic effect on the

performance. Moreover, It has been proven both experimentally and theoretically

that the variations between the images of the same face due to illumination are

almost always larger than image variations due to change in face identity.

There has been much work dealing with illumination variations in face

recognition. Although most of these approaches can cope with illumination

variation well, some may bring negative influence on images without illumination

variation. In addition, some approaches show great difference on performance

when combined with different recognition methods. Some other approaches

require perfect alignment of face within the image which is difficult to achieve in

practical/real-life systems.

In this thesis, we propose an illumination normalization approach that proves

flexibility to different face recognition approaches and robustness to non-aligning

of faces in addition to having the minimum negative influence on images without

illumination variations. The proposed approach is based on enhancing the image

resulting from histogram matching, called GAMMA-HM-COMP.

To verify both the flexibility to face recognition approaches and robustness to non-

aligning of faces, the proposed approach is tested over two face recognition

methods representing the two broad categories of the holistic-based approach –

V

namely Standard Eigenface method from the Eigenspace-based category and

Spectroface from the frequency-based category. In each method, the testing is

done using both aligned and non-aligned versions of the Yale B database.

In order to compare the proposed approach with other approaches, we select four

best-of-literature illumination normalization approaches among 38 different

approaches based on surveying nine different comparative studies. All five

approaches are compared using Eigenface and Spectroface methods on images

with illumination variations and images with other facial and geometrical

variations.

In illumination variation, the proposed approach gives the best results on the

Eigenface and the second best results on the Spectroface, when images are not

perfectly aligned. Moreover, the proposed approach is the minimum affected

approach (i.e. most robust) due to the non-aligning of faces on both methods.

In other facial and geometrical variations, the proposed approach has the minimum

negative influence on each of the two methods among the four other approaches.

In this work, all illumination normalization approaches are tested on two face

recognition methods representing the two broad categories of the holistic-based

approach. It’s important to extend this work to include local-based face

recognition methods in testing these approaches as they may introduce great

difference in performance when combined with such methods.

Moreover, this work introduces a technology evaluation for the proposed approach

and the other best-of-literature approaches. In order to complete the thorough

evaluation cycle, both scenario and operational evaluations need to be performed

for these approaches.

VII

Table of Contents ACKNOWLEDGEMENT .......................................................................................................................... II

PUBLICATIONS ........................................................................................................................................ III

ABSTRACT ................................................................................................................................................ IV

TABLE OF CONTENTS ..........................................................................................................................VII

LIST OF FIGURES .................................................................................................................................... IX

LIST OF TABLES .....................................................................................................................................XII

CHAPTER 1: INTRODUCTION ............................................................................................................... 1

1.1 BIOMETRICS AND FACE RECOGNITION ................................................................................................. 1 1.2 PROBLEM DEFINITION........................................................................................................................... 3 1.3 METHODS CATEGORIZATION ................................................................................................................ 3 1.4 VARIATIONS CATEGORIZATION ............................................................................................................ 3 1.5 SUCCESSFUL SCENARIOS ...................................................................................................................... 4 1.6 COMMERCIAL SYSTEMS ........................................................................................................................ 5 1.7 RECENT EVALUATIONS ......................................................................................................................... 6 1.8 THESIS OBJECTIVES AND ORGANIZATION ............................................................................................. 7

CHAPTER 2: FACE RECOGNITION APPROACHES .......................................................................... 9

2.1 INTRODUCTION ..................................................................................................................................... 9 2.2 LOCAL-BASED APPROACHES .............................................................................................................. 10 2.3 HOLISTIC-BASED APPROACHES .......................................................................................................... 15

2.3.1 Eigenspace-based Category ....................................................................................................... 16 2.3.2 Frequency-based Category ........................................................................................................ 21 2.3.3 Other Holistic-Based Approaches .............................................................................................. 24

2.4 HYBRID APPROACHES ......................................................................................................................... 27 2.5 PERFORMANCE EVALUATIONS AND COMPARATIVE STUDIES.............................................................. 28

2.5.1 Performance Evaluation ............................................................................................................ 28 2.5.2 Comparative Studies .................................................................................................................. 30

CHAPTER 3: ILLUMINATION NORMALIZATION APPROACHES .............................................. 31

3.1 INTRODUCTION ................................................................................................................................... 31 3.2 MODEL-BASED APPROACHES ............................................................................................................. 32 3.3 IMAGE-PROCESSING-BASED APPROACHES ......................................................................................... 40

3.3.1 Global Approaches .................................................................................................................... 40 3.3.2 Local Approaches ...................................................................................................................... 45

3.4 COMPARATIVE STUDIES & BEST-OF-LITERATURE APPROACHES ........................................................ 53

CHAPTER 4: SETUP THE ENVIRONMENT ....................................................................................... 62

4.1 INTRODUCTION ................................................................................................................................... 62 4.2 METHODS DESCRIPTIONS ................................................................................................................... 63

4.2.1 Standard Eigenface Method ....................................................................................................... 63 4.2.2 Spectroface Method ................................................................................................................... 63

4.3 DATABASES DESCRIPTIONS ................................................................................................................ 65 4.3.1 UMIST database ........................................................................................................................ 65 4.3.2 Yale B database .......................................................................................................................... 65 4.3.3 Grimace database ...................................................................................................................... 66 4.3.4 JAFFE database ......................................................................................................................... 67 4.3.5 Nott-faces database .................................................................................................................... 68 4.3.6 Yale database ............................................................................................................................. 68 4.3.7 Face 94 database ....................................................................................................................... 68

VIII

4.4 EXPERIMENTAL RESULTS ................................................................................................................... 69 4.4.1 Pose Variation ........................................................................................................................... 69 4.4.2 Facial Expressions Variation ..................................................................................................... 70 4.4.3 Non-Uniform Illumination Variation ......................................................................................... 72 4.4.4 Translation Variation ................................................................................................................. 73 4.4.5 Scaling Variation ....................................................................................................................... 75

4.5 SUMMARY .......................................................................................................................................... 76

CHAPTER 5: THE PROPOSED ILLUMINATION NORMALIZATION APPROACH................... 77

5.1 INTRODUCTION ................................................................................................................................... 77 5.2 IDEA OF THE PROPOSED APPROACH .................................................................................................... 77 5.3 HISTOGRAM MATCHING ALGORITHM ................................................................................................. 79 5.4 IMAGE ENHANCEMENT METHODS ...................................................................................................... 81

5.4.1 Histogram Equalization (HE) .................................................................................................... 81 5.4.2 Log Transformation (LOG) ........................................................................................................ 81 5.4.3 Gamma Correction (GAMMA) .................................................................................................. 81 5.4.4 Compression Function of the Retinal Filter (COMP) ................................................................ 82

5.5 THE ENHANCED HM APPROACHES ..................................................................................................... 83 5.5.1 Enhancement After HM .............................................................................................................. 83 5.5.2 Enhancement Before HM ........................................................................................................... 84 5.5.3 Further Enhancement ................................................................................................................ 85

5.6 VERIFICATION OF THE SELECTION CONDITIONS ................................................................................. 85 5.7 EXPERIMENTAL RESULTS ................................................................................................................... 87 5.8 SUMMARY .......................................................................................................................................... 97

CHAPTER 6: EVALUATE THE PROPOSED APPROACH ................................................................ 99

6.1 INTRODUCTION ................................................................................................................................... 99 6.2 IMPLEMENTATION OF THE COMPARED APPROACHES .......................................................................... 99

6.2.1 Preprocessing Chain Approach (CHAIN) .................................................................................100 6.2.2 Local Normal Distribution (LNORM) .......................................................................................101 6.2.3 Single Scale Retinex with Histogram Matching (SSR-HM) ......................................................101 6.2.4 Local Binary Patterns (LBP) ....................................................................................................102 6.2.5 Proposed Approach (GAMMA-HM-COMP) ............................................................................102

6.3 COMPARISON ON ILLUMINATION VARIATIONS ..................................................................................103 6.3.1 Aligned Faces ............................................................................................................................103 6.3.2 Non-Aligned Faces ....................................................................................................................103

6.4 COMPARISON ON OTHER VARIATIONS ...............................................................................................106 6.4.1 Pose Variations .........................................................................................................................107 6.4.2 Facial Expressions Variations ..................................................................................................108 6.4.3 Translation Variations ..............................................................................................................111 6.4.4 Scaling Variations .....................................................................................................................115

6.5 SUMMARY .........................................................................................................................................116

CHAPTER 7: CONCLUSIONS AND FUTURE WORKS ....................................................................118

7.1 CONCLUSIONS....................................................................................................................................118 7.2 FUTURE WORKS ................................................................................................................................120

REFERENCES ..........................................................................................................................................121

IX

List of Figures Figure 1.1: Distribution of some biometrics over the market 1 Figure 1.2: Number of published items and citations on face recognition between 1991 and 2006 2 Figure 1.3: Easy scenarios in face recognition 4 Figure 1.4: Easy scenarios in face recognition 4 Figure 1.5: Difficult scenarios for face recognition 5 Figure 2.1: Face bunch graph (FBG) serves as a general representation of faces. It is designed to

cover all possible variations in the appearance of faces. The FBG combines information from a number of face graphs. Its nodes are labeled with set of jets, called bunches, and its edges are labeled averages of distance vectors. During comparison to an image, the best fitting jet in each bunch, indicated by gray shading, is selected independently. 11

Figure 2.2: A visualized example for the steps of automatically localizing features. In (e), the black cross on a white background indicates an extracted and stored feature vector at this location while white cross on black background indicates an ignored feature vector. 12

Figure 2.3: System overview of the component-based face detector using four components 13 Figure 2.4: Sample of the normalized whole face image and the three regions that are used for the

local analysis 14 Figure 2.5: Example for illustrating the basic LBP operator 15 Figure 2.6: Examples of circular neighborhoods with number of sampled points P and radius R (P,R)

15 Figure 2.7: Block diagram of the standard Eigenface method 16 Figure 2.8: The subspace LDA face recognition system 17 Figure 2.9: Flowchart for the Face recognition using evolutionary pursuit (EP) method 18 Figure 2.10: Image synthesis model for Architecture 1. To find a set of IC images, the images in X are

considered to be a linear combination of statistically independent basis images, S, where A is an unknown mixing matrix. The basis images are estimated as the learned ICA output U. 19

Figure 2.11: Example of the projection map and the projection-combined image 19 Figure 2.12: 3-level wavelet decomposition 22 Figure 2.13: (a) input image, (b) the log-magnitude of its DCT, (c) the scanning strategy of

coefficients 22 Figure 2.14: Most variant frequencies: a) real, b) imaginary and c) selected numbering 23 Figure 2.15: Spectroface representation steps 24 Figure 2.16: Examples of FBT of (A) an 8 radial cycles image, (B) a 4 angular cycles image and (C)

an image of the average of these images. The magnitude of the FBT coefficients is presented in colored levels (red indicates the highest value) 24

Figure 2.17: Block diagram for face recognition based on moments 25 Figure 2.18: Examples of the Trace transform on (a) full image (b) masked with rectangular shape

and (c) masked with elliptical shape. 26 Figure 2.19: Training and recognition stages of Face Recognition Using Local and Global Features

approach 27 Figure 2.20 (a) A gray-scale face image, (b) it’s edginess image, and (c) the cropped eyes. 28 Figure 3.1: The same individual imaged with the same camera and the same facial expression may

appear dramatically different with changes in the lighting conditions. 31 Figure 3.2: Effect of applying QIR on an illuminated face image from Yale B database 34 Figure 3.3: Effect of applying SQI approach to illuminated face images from Yale B and CMU PIE

databases. 34 Figure 3.4: The effect of the scale, σ, on processing an illuminated facial image using the SSR. 37 Figure 3.5: Histogram fitted version of SSR with σ = 6 38

X

Figure 3.6: Discretization lattice for the PDE in equation 3.23 39 Figure 3.7: Effect of applying GROSS approach on some illuminated face images from Yale B

database 40 Figure 3.8: Effect of applying histogram equalization on an illuminated image 41 Figure 3.9: Histogram matching process to an illuminated image 42 Figure 3.10: Transformation functions of LOG and GAMMA (L: number of gray levels) 43 Figure 3.11: Effect of applying LOG approach to an illuminated face image. 44 Figure 3.12: Effect of applying GIC to an illuminated face image 45 Figure 3.13: Effect of applying NORM approach to an illuminated image. (Note that the gray-level of

the resulting image is stretched to [0,255] for displaying purpose only) 45 Figure 3.14: Effects of applying the three local normalization methods to an illuminated face image 46 Figure 3.15: An example of ideal region partition 46 Figure 3.16: The four regions for illumination normalization 47 Figure 3.17: The effects of applying region-based strategy of HE and GIC over the four face regions

47 Figure 3.18: Block histogram matching. In each image pair, the left one is the input image while the

right one is the reference image. 47 Figure 3.19: The windowing filter H used in the Block HM method 48 Figure 3.20: Images before and after intensity normalization with BHM. (a) Input images, (b)

corresponding output images after applying BHM 49 Figure 3.21: The LBP operator 49 Figure 3.22: The extended LBP operator with (8,2) neighborhood. Pixel values are interpolated for

points which are not in the center of a pixel. 50 Figure 3.23: Original image (left) processed by the LBP operator (right). 50 Figure 3.24: Effects of applying the image processing steps proposed by [91] 51 Figure 3.25: Examples of images of one person from the Extended Yale-B frontal database. The

columns respectively give images from subsets 1 to 5. 53 Figure 3.26: Summarization for the first four comparative studies. For each study, it shows the

normalization approach to be compared, the face databases and the face recognition approaches in addition to the best normalization approaches from each study (grayed boxes) 55

Figure 3.27: Summarization for the nine comparative studies showing some relations between these studies in addition to the final best normalization approaches from all studies (dark grayed boxes). For each study, it shows the normalization approach to be compared, the face databases and the face recognition approaches in addition to the best normalization approaches from each study (light grayed boxes) 60

Figure 4.1: Standard Eigenface block diagram 63 Figure 4.2: Spectroface block diagram 64 Figure 4.3: UMIST: selected images for one subject in both training and testing sets 66 Figure 4.4: Yale B: Training images for one subject in the four subsets with the light angle of each

image 66 Figure 4.5: Selected images for one subject from each database used for studying the facial

expression variation 67 Figure 4.6: Example images from JAFFE database. The images in the database have been rated by

60 Japanese female subjects on a 5-point scale for each of the six adjectives. The majority vote is shown underneath each image (with natural being defined through the absence of a clear majority) 67

Figure 4.7: Face 94: 15 images for each subject in both training and testing sets 69 Figure 4.8: Translation Variation: example for translating with and without circulation 74 Figure 5.1: Histogram matching process to an illuminated image 80

XI

Figure 5.2: Transformation functions of LOG and GAMMA (L: number of gray levels) 82 Figure 5.3: Effect of the four enhancement methods on an illuminated face 83 Figure 5.4: Block diagram of applying the image enhancement method after the HM 83 Figure 5.5: Effects of applying the image enhancement methods after applying the HM 84 Figure 5.6: Block diagram of applying the image enhancement method before the HM 84 Figure 5.7: Effects of applying the image enhancement methods before applying the HM 84 Figure 5.8: Block diagram showing the further enhancement of combinations in 5.5.1 and 5.5.2 85 Figure 5.9: Effects of further enhancement on both HM-GAMMA and GAMMA-HM combinations

using each of the four enhancement methods 85 Figure 5.10: Sample faces from Yale B database – automatically and manually cropped 86 Figure 5.11: Eigenface method over YALE B-AUTO: Effects of further enhancement over the eight

single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP 89 Figure 5.12: Eigenface method over YALE B-MANU: Effects of further enhancement over the eight

single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP 91 Figure 5.13: Spectroface method over YALE B-AUTO: Effects of further enhancement over the eight

single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP 92 Figure 5.14: Spectroface method over YALE B-MANU: Effects of further enhancement over the

eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP 93 Figure 5.15: Effects of the five enhancement combinations that satisfy the three conditions 95 Figure 6.1: Average increasing/decreasing in recognition rates after applying each of the five

illumination normalization approaches on YALE B-MANU version 105 Figure 6.2: Average increasing/decreasing in recognition rates after applying each of the five

illumination normalization approaches on YALE B-AUTO version 106 Figure 6.3: Performance decreasing of each normalization approach due to the non-aligning of faces

(i.e. subtracting the performance on YALE B-AUTO from the performance on YALE B-MANU) 106

Figure 6.4: Average difference in recognition rates after applying each of the five illumination normalization approaches on UMIST database 108

Figure 6.5: Average difference in recognition rates after applying each of the five illumination normalization approaches on Yale database 109

Figure 6.6: Average difference in recognition rates after applying each of the five illumination normalization approaches on Grimace database 110

Figure 6.7: Average difference in recognition rates after applying each of the five illumination normalization approaches on JAFFE database 110

Figure 6.8: Average difference in recognition rates after applying each of the five illumination normalization approaches on Nott-faces database 110

Figure 6.9: Average decreasing in recognition rates after translating with circulation 114 Figure 6.10: Average decreasing in recognition rates after translating without circulation 115 Figure 6.11: Average decreasing in recognition rates when applying each of the five illumination

normalization approaches before and after scaling the Face 94 database 116

XII

List of Tables Table 1.1: Different applications of face recognition 2 Table 2.1: A brief comparison between holistic-based and local-feature-based approaches 9 Table 3.1: Default parameter settings for CHAIN approach 52 Table 3.2: List for 24 illumination normalization approaches that LNORM perform better than them

57 Table 3.3: The 38 different illumination normalization approaches appear in the above nine

comparative studies together with the corresponding studies numbers. (Note that the cited approaches, from 29 to 38, are not described in details in their corresponding comparative studies) 61

Table 4.1: Comparison between results in Lai et al. [43] and in our implementation (better rates are italic) 65

Table 4.2: Pose Variation: recognition rates over 12 training cases (top four rates in each method are italic) 70

Table 4.3: Expressions Variation: recognition rates over four databases with two Eigenface tests 71 Table 4.4: Illumination Variation: recognition rates over 25 training cases (top three rates in each

method are italic) 73 Table 4.5: Translation Variation: chosen cases from the six databases and their recognition rates 73 Table 4.6: Translation Variation: average decreasing in the recognition rates of both methods after

translating with circulation in the four directions 74 Table 4.7: Translation Variation: average decreasing in the recognition rates of both methods after

translating without circulation in the four directions 75 Table 4.8: Scaling Variation: description of the training cases 75 Table 4.9: Scaling Variation: decreasing in recognition rates after scaling all images in the testing set

76 Table 5.1: The 25 different training cases used in testing 87 Table 5.2: The number of combinations that lead to increase the recognition rates after using each of

the enhancement methods for further enhancement 94 Table 5.3: Results of using the best five combinations with the Eigenface method over the two

versions of the database. Average recognition rate is calculated over the 25 different training cases. (The best average differences are italic) 96

Table 5.4: Results of using the best five combinations with the Spectroface method over the two versions of the database. Average recognition rate is calculated over the 25 different training cases. (The best average differences are italic) 97

Table 6.1: Default parameter settings for CHAIN approach 100 Table 6.2: Results of applying CHAIN with and without sliding on Spectroface method on both

versions of the YALE B database 100 Table 6.3: Difference between our implementation of the LNORM and the original one 101 Table 6.4: Results of applying LNORM with and without sliding on Spectroface method on both

versions of the YALE B database 101 Table 6.5: Difference between our implementation of the SSR-HM and the original one 102 Table 6.6: Results of applying each of the five illumination normalization approaches with both

Eigenface and Spectroface methods over YALE B-MANU version. Average recognition rate is calculated over the 25 different training cases. 104

Table 6.7: Results of applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over YALE B-AUTO version. Average recognition rate is calculated over the 25 different training cases. (0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP, nor: normal, ver: vertical, hor: horizontal) 105

XIII

Table 6.8: Results of applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over UMIST database. Average recognition rate is calculated over all training cases. 107

Table 6.9: Results of applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over Grimace, Yale, JAFEE, and Nott-faces databases. Average recognition rate is calculated over all training cases. 109

Table 6.10: Average decreasing in the recognition rates of both methods after translating with circulation in the four directions while applying (a) LNORM, (b) LBP, (c) CHAIN, (d) SSR-HM and (e) GAMMA-HM-COMP approaches as preprocessing step. 111

Table 6.11: Average decreasing in the recognition rates of both methods after translating without circulation in the four directions while applying (a) LNORM, (b) LBP, (c) CHAIN, (d) SSR-HM and (e) GAMMA-HM-COMP approaches as preprocessing step. 113

Table 6.12: Decreasing in recognition rates after applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over Face 94 database. Average decreasing in recognition rate is calculated over all training cases. 115

1

CHAPTER 1: Introduction

1.1 Biometrics and Face Recognition Biometric recognition [1] refers to the use of distinctive physiological (e.g., fingerprints, face, retina, iris) and behavioral (e.g., gait, signature) characteristics, called biometric identifiers, for automatically recognizing individuals. Because biometric identifiers cannot be easily misplaced, forged, or shared, they are considered more reliable for person recognition than traditional token or knowledge-based methods. Others typical objectives of biometric recognition are user convenience (e.g., service access without a Personal Identification Number), better security (e.g., difficult to forge access). Fig.1.1 shows the distribution of some biometrics over the market.

Figure 1.1: Distribution of some biometrics over the market

As one of the most successful applications of image analysis and understanding, face recognition has recently gained significant attention, especially during the past several years. This is evidenced by the emergence of specific face recognition conferences such as AFGR and CVPR and systematic empirical evaluation of face recognition techniques (FRT), including the XM2VTS, FERET, FRGC and FRVT. Fig.1.2 shows how many items on face recognition were published between 1991 and 2006, and also the number of citations to source items indexed within WoS - Web of Science. The figure proves that the face recognition still a hot research area. There are at least two reasons for such a trend, the first is the wide range of commercial and law enforcement applications and the second is the availability of feasible technologies after 35 years of research.

2

Figure 1.2: Number of published items and citations on face recognition between 1991 and 2006

The strong demand for user friendly systems which can secure our assets and protect our privacy without losing our identity in a sea of numbers is obvious. At present, one needs a PIN to get cash from an ATM, a password for a computer, a dozen others to access the internet, and so on. Although extremely reliable methods of biometric personal identification exist, e.g., fingerprint analysis and retinal or iris scans, these methods have yet to gain acceptance by the general population due to their needs to the cooperation of the participants. A personal identification system based on analysis of frontal or profile images of the face is non intrusive and therefore user friendly. Moreover, personal identity can often be ascertained without the participant’s cooperation or knowledge. In addition, the need for applying FRT has been boosted by recent advances in multimedia processing along with others such as IP (Internet Protocol) technologies. Table-1.1 lists some of the applications of face recognition [2]:

Table 1.1: Different applications of face recognition

Area Specific Applications

Biometrics Drivers’ Licenses, Entitlement Programs Immigration, National ID, Passports, Voter Registration Welfare Fraud

Information Security

Desktop Logon (Windows XP, Windows Vista) Application Security, Database Security, File Encryption Intranet Security, Internet Access, Medical Records Secure Trading Terminals

Law Enforcement and Surveillance

Advanced Video Surveillance, CCTV Control Portal Control, Post-Event Analysis Shoplifting and Suspect Tracking and Investigation

Smart Cards Stored Value Security, User Authentication Access Control Facility Access, Vehicular Access

3

1.2 Problem Definition A general statement of face recognition problem can be formulated as follows [2]: Given still or video images of a scene, identify or verify one or more persons in the scene using a stored database of faces. Available collateral information such as race, age, gender, facial expression and speech may be used in narrowing the search (enhancing recognition). The solution of the problem involves segmentation of faces (face detection) from cluttered scenes, feature extraction from the face region, identification or verification. In identification problems, the input to the system is an unknown face, and the system reports back the decided identity from a database of known individuals, whereas in verification problems, the system needs to confirm or reject the claimed identity of the input face.

1.3 Methods Categorization A number of intensity-image face recognition methods have been proposed and implemented in commercial systems. These methods fall into two broad approaches, namely, holistic-based – where features are mainly extracted from the whole face – and local-feature-based in which features are mainly extracted from some locations in the face. Even though approaches of all these types have been successfully applied to the task of face recognition, they do have certain advantages and disadvantages. Thus an appropriate approach should be chosen based on the specific requirements of a given task. However, most of current face recognition techniques assume that several (at least two) samples of the same person are always available for training. Unfortunately, in many real-world applications, the number of training samples we actually have is by far smaller than that we supposedly have [3]. More specifically, in many application scenarios, especially in large-scale identification applications, such as law enforcement, driver license or passport card identification, there is usually only one training sample per person in the database. In addition, we seldom have opportunity to add more samples of the same person to the underlying database, because collecting samples is costly and even we can do so. These raise the need for developing face recognition techniques that specially deal with one-sample-per-person problem.

1.4 Variations Categorization Many issues hinder research efforts in the field of face recognition. Variation exists in every imaging modality used, and finding fast, simple algorithms that are robust to variation is difficult (as evidenced by years of research). Categorizing the variation may be helpful in the development of effective face recognition algorithms [4]. Intrinsic sources of variation include identity, facial expression, speech, gender, and age [7]. Extrinsic sources of variation include viewing geometry, illumination, imaging processes,

4

and other objects. Viewing geometry includes pose changes, either by the observer or the object to be recognized; illumination changes include shading, color, self-shadowing, and specular highlights; imaging process variations include resolution, focus, imaging noise, sampling technique, and perspective distortion effects; variation from other objects include occlusions, shadowing, and indirect illumination. These sources of variation may or may not hinder the recognition process depending on which algorithm is used. It is possible that the variation due to factors such as facial expression, lighting, occlusions, and pose is larger than the variation due to identity [6], [7]. That makes identification under such varying environments a difficult task. However, human proficiency at face recognition [8] has motivated enormous research in this area despite these challenges. (The ability of humans to recognize faces is also an actively researched field with widely varying results depending on numerous factors. Additional information on this topic can be found predominantly in the psychology literature [9], [10].)

1.5 Successful Scenarios The approaches proposed in the last years have been able to solve specific still face images recognition applications. Examples of scenarios where face recognition achieves very good results are given in Fig.1.3 and Fig.1.4, [11].

Figure 1.3: Easy scenarios in face recognition

Figure 1.4: Easy scenarios in face recognition

When the scenario departs from the easy scenario, then face recognition approaches experience severe problems. Among the special challenges let us mention: pose variation, illumination conditions, scale variability, images taken years apart, glasses, moustaches, beards, low quality image acquisition, partially occluded faces etc. Fig.1.5 shows different images which present some of the problems encountered in face recognition. In search of finding solutions for difficult face recognition scenarios, some help is found in two broad areas: video-based face recognition and multimodal approaches.

5

Figure 1.5: Difficult scenarios for face recognition

1.6 Commercial Systems Currently there are many commercial face tracking and recognition systems available. For obvious reasons, many companies are reluctant to disclose the technology used in their products. We list below many commercialization of face recognition technology together with the specific approach used [4]:

1. HNeT (Holographic/quantum Neural Technology) Facial Recognition System by AcSys Biometrics Corporation [122]. This neural network approach was developed by John Sutherland.

2. ZN-Face by ZN Vision Technologies uses an undisclosed neural network approach [135].

3. Nvisage by Neurodynamics uses an undisclosed neural network approach [132]. 4. FaceTools by Viisage uses a proprietary algorithm based on the eigenfaces

approach developed at the MIT Media Lab [133]. 5. Biometrica uses eigenfaces, but does not disclose further details [124]. 6. FaceIt by Identix (formerly Visionics) uses Local Feature Analysis (LFA)

developed by Dr. Joseph J. Atick to generate and measure intra-feature distances for recognition [129].

7. Face Guardian by Keyware uses local feature analysis. No information on this product is available on their website [131].

8. Visec-FIRE by Berninger Software uses a facial processing approach [123]. 9. ID2000 by Imagis uses a proprietary wavelet representation of the face for

recognition [130]. 10. BioID uses an undisclosed multimodal system implementing face, voice, and lip

movement identification [126], [128]. 11. FaceVACS by Cognitec applies transforms to specific areas of the face in order to

create a user specific feature vector [125]. 12. UnMask by Vision Sphere Technologies Inc. uses a proprietary feature analysis

algorithm [134]. 13. Face Key by Intelligent Verification Systems uses an undisclosed face and

fingerprint recognition algorithm [127].

6

1.7 Recent Evaluations Since 1993, a series of six face recognition technology evaluations sponsored by U.S. Government have been held. In thirteen years, performance has improved by two orders of magnitude and there exist numerous companies selling face recognition systems [12]. The evaluations provided regular assessments of the state of the technology and helped to identify the most promising approaches. The challenge problems also nurtured research efforts by providing large datasets for use in developing new algorithms. The Face Recognition Technology (FERET) program, Face Recognition Grand Challenge (FRGC) and Face Recognition Vendor Test (FRVT) evaluations and challenge problems were instrumental in advancing face recognition technology, and they show the potential for the evaluation and challenge problem paradigm to advance biometric, pattern recognition, and computer vision technologies. One of the main conclusions of the face recognition vendor test FRVT 2002 is that face recognition from outdoor imagery remains a research challenge area. Moreover, the primary goal of the last technology evaluation in the series (FRVT 2006) is to look at recognition from high-resolution still images and three-dimensional (3D) face images, and measures performance for still images taken under controlled and uncontrolled illumination. Following are some conclusions from this evaluation, [12]: • The FRVT 2006 results show that relaxing the illumination condition still has a

dramatic effect on performance. • Face recognition performance on still frontal images taken under controlled

illumination has improved by an order of magnitude since the FRVT 2002. There are three primary components to the improvement in algorithm performance since the FRVT 2002:

1. The recognition technology, 2. Higher resolution imagery, 3. Improved quality due to greater consistency of lighting.

• Since performance was measured on the low-resolution dataset in both the FRVT 2002 and the FRVT 2006, it is possible to estimate the improvement in performance due to algorithm design. The improvement in algorithm design resulted in an increase in performance by a factor of between four and six depending on the algorithm. For the results on the high and very-high resolution datasets, the improvement in performance comes from a combination of algorithm design and image size and quality. This is because new recognition techniques have been developed to take advantage of the larger high quality face images.

• The FRVT 2006 and the Iris Challenge Evaluation (ICE 2006) compared recognition performance from very-high resolution still face images, 3D face images, and single-iris images. On the FRVT 2006 and the ICE 2006 datasets, recognition performance

7

of all three biometrics is comparable when all three biometrics are acquired under controlled illumination.

Since the human visual system contains a very robust face recognition capability that is excellent at recognizing familiar faces [5]. However, human face recognition capabilities on unfamiliar faces fall far short of the capability for recognizing familiar faces. The FRVT 2006, for the first time, integrated measuring human face recognition capability into an evaluation. Performance of humans and computers was compared on the same set of images. The FRVT 2006 human and computer experiment measures the ability to recognize faces across illumination changes. This experiment found that algorithms are capable of human performance levels, and that at false accept rates in the range of 0.05, machines can out-perform humans.

1.8 Thesis Objectives and Organization As we can see from the conclusions of the recent technology evaluations, although many face recognition techniques and systems have been proposed, the recognition performance of most current technologies degrades due to the variations of illumination. Moreover, It has been proven both experimentally [13] and theoretically [14] that the variations between the images of the same face due to illumination are almost always larger than image variations due to change in face identity. There has been much work dealing with illumination variations in face recognition. Although most of these approaches can cope with illumination variation well, some may bring negative influence on images without illumination variation. In addition, some approaches show great difference on performance when combined with different recognition approaches. Some other approaches require perfect alignment of face within the image which is difficult to achieve in practical/real-life systems. This thesis aims to propose an illumination normalization approach that proves flexibility to different face recognition approaches and independency to face alignment in addition to having the minimum negative influence on images without illumination variations. We do this through the following:

1. Study the face recognition approaches. 2. Study the illumination normalization approaches for face recognition. 3. Propose an illumination normalization approach that proves flexibility to different

face recognition approaches and independency to face alignment. 4. Make comparative study between the proposed approach and other best-of-

literature approaches over images with and without illumination variations. The thesis is organized as follows: Chapter 2 surveys the three main face recognition approaches which are local-based, holistic-based and hybrid approaches with brief descriptions about some methods under

8

each approach. In addition, the types of performance evaluations and literature comparative studies are introduced in this chapter. Chapter 3 surveys the two main illumination normalization approaches for face recognition, namely model-based and image-processing-based approaches. It introduces brief descriptions about some methods under each approach. In addition, nine comparative studies are introduced at the end of the chapter to select the best-of-literature approaches. Chapter 4 introduces the detailed descriptions about the environment that we build up in order to use it for testing our proposed illumination normalization approach and the other approaches. The chapter includes descriptions about the selected face recognition methods and the selected databases that cover five different face recognition variations. The experimental results of the selected methods over each database are also introduced in this chapter. All experiments are done without applying any illumination normalization approach. This allows us to study the effects of any illumination normalization approach on the selected methods over each variation separately. Chapter 5 proposes an illumination normalization approach based on enhancing the image resulting from histogram matching. Four different image enhancement methods are experimentally tried in two different approaches, 1) After HM; on the resulting image from HM, 2) Before HM; on the reference image before matching the input image on it. The best approach is chosen such that it proves flexibility to the two selected face recognition methods and independency to face alignment. Chapter 6 evaluates the proposed illumination normalization approach and other best-of-literature approaches over images with illumination variation and images with other facial and geometrical variations using the two selected face recognition methods. Chapter 7 contains the final conclusions of this work in addition to suggestions for future works. As a start for this work, the following chapter surveys the three main face recognition approaches which are local-based, holistic-based and hybrid approaches with brief descriptions about some methods under each approach. In addition, the types of performance evaluations and literature comparative studies are introduced in this chapter.

9

CHAPTER 2: Face Recognition Approaches

2.1 Introduction A number of intensity-image face recognition methods have been proposed and implemented in commercial systems. Basically, they can be divided into holistic-based, local-feature-based, and hybrid approaches. Even though approaches of all these types have been successfully applied to the task of face recognition, they do have certain advantages and disadvantages. Thus an appropriate approach should be chosen based on the specific requirements of a given task. Local-feature-based methods rely on the identification of certain fiducial points on the face such as the eyes, the nose, the mouth, etc. The location of those points can be determined and used to compute geometrical relationships between the points as well as to analyze the surrounding region locally. Thus, independent processing of the eyes, the nose, and other fiducial points is performed and then combined to produce recognition of the face. Holistic-based methods treat the image data simultaneously without attempting to localize individual points. The face is recognized as one entity without explicitly isolating different regions in the face. Holistic techniques utilize statistical analysis, neural networks, and transformations. They usually require large samples of training data. The advantage of holistic-based methods is that they utilize the face as a whole and do not destroy any information by exclusively processing only certain fiducial points. Thus, they generally provide more accurate recognition results. However, such techniques are sensitive to variations in position, scale, etc., which restrict their use to standard, frontal mug-shot images. Table-2.1 shows a brief comparison between both approaches.

Table 2.1: A brief comparison between holistic-based and local-feature-based approaches

Holistic-based Local-feature-based Extract Feature Vector from the whole face Extract Feature Vector at certain locations Sensitive to pose and illumination changes Robust to pose and illumination changes

Feature detection is not required Depend on accurate feature detection, which is not simple

Computationally less expensive Computationally more expensive The rest of this chapter is organized as follows: section 2 contains brief description about some local-based approaches. Section 3 describes some holistic-based approaches. Examples of hybrid approaches appear in section 4. Section 5 describes the types of performance evaluation and introduces results from existing comparative studies.

10

2.2 Local-Based Approaches

1. Elastic Bunch Graph Matching 1997 This method is considered one of the famous local-based methods in the literature. The work in [15] presents a system for recognizing human faces from single images out of a large database containing one image per person. Faces are represented by labeled graphs, based on a Gabor wavelet transform, in which nodes are located at facial landmarks and labeled with a 40-dimensional Gabor-based complex vector called jets. The edges are labeled with two dimensional distance vectors between corresponding nodes. In order to extract the image graphs automatically, the face bunch graph (FBG) is first constructed for certain pose by combining a representative set of individual model graphs into a stack-like structure as shown in Fig.2.1. Each model graph has the same grid structure and the nodes refer to identical fiducial points. The first set of model graphs is generated manually. A set of jets referring to one fiducial point is called a bunch. An eye bunch, for instance, may include jets from closed, open, female and male eyes…etc. to cover these local variations. The corresponding FBG graph is then given the same grid structure as the individual model graphs, its nodes are labeled with the bunches of jets and its edges are labeled with the averaged distances between these jets. Once the system has an FBG, graphs for new images can be generated automatically by elastic bunch graph matching which based on maximizing a graph similarity between an image graph and the FBG of identical pose. A heuristic algorithm is used to find the image graph which maximizes the graph similarity function. First, the location of the face is found by a sparse scanning of the FBG over the image. Then, the FBG is varies in size and aspect ratio to adapt to the right format of the face. These steps are of no cost in the topography term of the similarity function because the edge labels are transformed accordingly. Finally, all nodes are moved locally and relative to each other to optimize the graph similarity further. After having extracted model graphs from the gallery image and image graph from the probe image, recognition is done by comparing an image graph to all model graphs and selecting the one with the highest similarity value.

2. Face Recognition Based on Multiple Facial Features 2000 In [16], a different facial feature detection scheme, which is based on the framework of Elastic Bunch Graph Matching in [15], is utilized. Only 17 facial features instead of the 48 in [15], all of which have clear meanings and exact correct positions, are localized for each new face image. The whole facial feature detection process consists of three stages— global face search, individual facial feature localization and graph adjusting. The first stage serves to find a face in an image and provides near-optimal starting points for the following individual facial feature localization stage. In the second stage, each of

11

the 17 facial features is localized individually, without taking its relative positions to other facial features into consideration. In graph adjusting stage, relative positions between facial features are utilized to localize those misplaced facial features from the second stage. After facial feature detection, 17 basic facial features, each labeled with a 40-dimensional Gabor-based complex vector, are detected for each new face image. Face recognition is then executed on the basis of these complex vectors, which represent local features of the areas around the multiple facial features. Two face recognition approaches, named Two-Layer Nearest Neighbor (TLNN) and Modular Nearest Feature Line (MNFL) respectively, are proposed.

Figure 2.1: Face bunch graph (FBG) serves as a general representation of faces. It is designed to cover all possible variations in the appearance of faces. The FBG combines information from a

number of face graphs. Its nodes are labeled with set of jets, called bunches, and its edges are labeled averages of distance vectors. During comparison to an image, the best fitting jet in each bunch,

indicated by gray shading, is selected independently.

3. Biometric system: A Face recognition approach 2000 The work in [17] proposes an automatic system in which informative feature locations in the face image are automatically located by Gabor filters, which make the system not dependent on accurate detection of facial features. The system starts by filtering the image with a set of Gabor filters. The filtered image is then multiplied with a 2-D Gaussian to focus on the center of the face, and avoid extracting features at the face contour. This Gabor filtered and Gaussian weighted image is then searched for peaks, which considered as interesting feature locations for face recognition. At each peak, a feature vector consisting of Gabor coefficients is extracted. In testing, the Euclidian distances to all feature vectors in the gallery are calculated and rank them accordingly. A visualized example for the steps of localizing features is shown in Fig.2.2.

12

Figure 2.2: A visualized example for the steps of automatically localizing features. In (e), the black cross on a white background indicates an extracted and stored feature vector at this location while

white cross on black background indicates an ignored feature vector.

4. Face Recognition with Support Vector Machines: Global versus Component-based Approach 2001

In [18], a local-based method and two holistic-based methods based on using the support vector machine SVM are presented and evaluated with respect to robustness against pose changes. Extensive tests are performed on a database which included faces rotated up to about 40° in depth. The local-based method clearly outperformed both holistic-based methods on all tests. The Local-based method starts with locating 14 different facial components, extracting and combining them into a single feature vector which is classified by a Support Vector Machine (SVM). To locate facial components, a two-level, component-based face detector is implemented. On the first level, 14 component classifiers independently detect the facial components. On the second level, a geometrical configuration classifier performs the final face detection by combining the results of the component classifiers. The steps of the component-based face detector are illustrated in Fig.2.3. Given a 58 × 58 window over the input image, the maximum continuous outputs of the component classifiers within rectangular search regions around the expected positions of the components are used as inputs to the geometrical configuration classifier. The search regions have been calculated from the mean and standard deviation of the components’ locations in the training images. The geometrical classifier is also provided with the X–Y locations of the maxima of the component classifier outputs relative to the upper left corner of the 58 × 58 window. To train the component-based face detector, the component is located and extracted from set of synthetic images to build a positive component training set. The negative component training set is extracted from non-face patterns. Thus, 14 linear SVMs are trained on the component data and applied to the whole training set in order to generate

13

the training data for the geometrical classifier. The geometrical configuration classifier, which is again a linear SVM, is trained on the X–Y locations and continuous outputs of the 14 component classifiers.

Figure 2.3: System overview of the component-based face detector using four components

In training stage, the component-based face detector is first run over each image in the training set to extract the local facial components. Only 10 out of the 14 local facial components are kept for face recognition, removing those that either contained few gray value structures (e.g. cheeks) or strongly overlapped with other components. Each of the 10 components is then normalized in size and their gray values are combined into a single feature vector. These feature vectors are used to train a one-vs-all linear SVM for every person in the database. In testing stage, the feature vector of the probe image is first extracted and then provided to the one-vs-all linear SVM for every person in order to recognize it. The matched person is the one that its corresponding SVM gives the maximum value for the probe feature vector.

5. Automatic Face Recognition System Based on Local Fourier-Bessel Features 2005

The work in [19] presents an automatic face verification system inspired by known properties of biological systems. In the proposed algorithm, the whole image is converted from the spatial to polar frequency domain by a Fourier-Bessel Transform (FBT). The local feature vector is then constructed from the FBT coefficients of the upper right region, upper middle region, and the upper left region of the face based on ground-truth information as shown in Fig.2.4. Using local features is compared to the case where the whole image FBT coefficients are considered. The resulting representations are

14

embedded in a dissimilarity space, where each image is represented by its distance to all the other images, and a Pseudo-Fisher discriminator is built.

Figure 2.4: Sample of the normalized whole face image and the three regions that are used for the

local analysis

Verification test results on the FERET database show that the local-based algorithm outperforms the global-FBT version. The local-FBT algorithm performed as state-of-the-art methods under different testing conditions, indicating that the proposed system is highly robust for expression, age, and illumination variations. In addition, the performance of the proposed system is also evaluated under strong occlusion conditions and found that it is highly robust for up to 50% of face occlusion. However, when the verification system is automated completely by implementing face and eye detection algorithms, the performance of the local approach is reduced and becomes only slightly superior to the global approach.

6. Local Features for Biometrics-Based Recognition 2004 The work in [20] introduces an approach combining a simple local representation method with a k-nearest neighbors-based direct voting scheme for both face and speaker recognition. The extraction of local features starts by first selecting pixels that have local variance above a certain global threshold. Then, for each selected pixel, a w2-dimensional vector is obtained by applying a w × w window around it. Finally, the dimensionality of this vector is reduced using PCA and each vector is labeled with an identifier of the class. Each test image is then classified by first computing the k-nearest neighbors of each of its corresponding feature vectors. Then the class with the largest number of votes accumulated over all the vectors is selected.

7. Face Description with Local Binary Patterns: Application to Face Recognition 2006

In [21], a novel and efficient facial representation is proposed. It is based on dividing a facial image into small regions and computing a texture description of each region using local binary patterns (LBP). These descriptors are then combined into a spatially enhanced histogram (or feature vector). The spatially enhanced histogram encodes both the appearance and the spatial relations of facial regions. To extract the LBP texture descriptor of a region, the operation starts by first assigning a label to every pixel of a region by thresholding the 3 × 3-neighborhood of each pixel with

15

the center pixel value and considering the result as a binary number. Then the histogram of the labels can be used as a texture descriptor. See Fig.2.5 for an illustration of the basic LBP operator.

Figure 2.5: Example for illustrating the basic LBP operator

To be able to deal with textures at different scales, a circular LBP operator is used. Defining the local neighborhood as a set of sampling points evenly spaced on a circle centered at the pixel to be labeled allows any radius and number of sampling points. Bilinear interpolation is used when a sampling point does not fall in the center of a pixel. Fig.2.6 shows some examples of circular neighborhoods.

(8,1) (16,2) (8,2)

Figure 2.6: Examples of circular neighborhoods with number of sampled points P and radius R (P,R)

As the m facial regions have been determined, a histogram of the LBP texture description is computed independently within each of the m regions. The resulting m histograms are combined yielding the spatially enhanced histogram with size m × n where n is the length of a single LBP histogram. In the spatially enhanced histogram, the face is effectively described on three different levels of locality: the LBP labels for the histogram contain information about the patterns on a pixel-level, the labels are summed over a small region to produce information on a regional level and the regional histograms are concatenated to build a global description of the face. However, since some facial features (such as eyes) play more important roles in human face recognition than other features, the regions can be weighted based on the importance of the information they contain. So, a weighted distance such as the Chi square distance can be used for classification.

2.3 Holistic-Based Approaches Most of holistic-based approaches can be classified into two wide categories, Eigenspace-based category and frequency-based category [22]. In the following subsections, we give an introduction about each category in addition to some brief description about existing methods under both categories. Moreover, we also introduce some brief description about some other holistic-based methods that are not belonging to any of the two categories.

16

2.3.1 Eigenspace-based Category In Eigenspace-based category, Principal Component Analysis (PCA) – usually called Eigenface – plays a key role in many holistic methods. Sirovich and Kirby [23] propose a method that uses Karhunen-Loève transform to represent human faces. In 1991, Turk and Pentland [24] develop a face recognition system using PCA (K-L expansion). Along this direction, many Eigenspace based recognition systems have been developed, they differ mostly in the kind of projection approaches (standard-, differential- or kernel-Eigenspace), in the projection algorithm employed (PCA, ICA and FLD), in the use of simple or differential images before/after projection, and in the similarity matching criterion or classification method employed (Euclidean, Mahalanobis, Cosine distances and SOM-Clustering, RBF, LDA, and SVM). Many comparative studies between different Eigenspace-based methods have been established. Following are some brief descriptions about existing Eigenspace-based methods in addition to some Eigenspace-based comparative studies.

1. Eigenfaces for Recognition 1991 The method in [24] is considered a key-role in developing many Eigenspace-based methods in literature. The system is initialized by first acquiring the training set (ideally a number of examples of each subject with varied lighting and expression). Eigenvectors and eigenvalues are computed on the covariance matrix of the training images. The first eigenvectors with the highest eigenvalues are kept to construct the face space. Finally, the known individuals are projected into this face space, and their weights are stored. This process is repeated as necessary. The new image is then projected into the same face space. The Euclidean distance measures the distance between the new projected image and a class of projected faces. If the minimum distance measure is less than a threshold, the face is recognized. Fig.2.7 shows the block diagram of this method.

Figure 2.7: Block diagram of the standard Eigenface method

17

2. Subspace Linear Discriminant Analysis (LDA) 1999 The method in [25] consists of two steps: first, the face image is projected from the original vector space to a face subspace via PCA where the subspace dimension is carefully chosen, and then use Linear Discriminant Analysis (LDA) to obtain a linear classifier in the subspace. The criterion that is used to choose the subspace dimension enables the system to generate class-separable features via LDA. Fig.2.8 shows the main steps of this method.

Figure 2.8: The subspace LDA face recognition system

3. Evolutionary Pursuit and Its Application to Face Recognition 2000 Evolutionary Pursuit (EP) implements strategies characteristic of genetic algorithms (GAs) for searching the space of possible solutions to determine the optimal basis. In [26], EP starts by projecting the original data into a lower dimensional whitened Principal Component Analysis (PCA) space. Directed but random rotations of the basis vectors in this space are then searched by GAs where evolution is driven by a fitness function defined in terms of performance accuracy (empirical risk) and class separation (confidence interval). Accuracy indicates the extent to which learning has been successful so far, while separation gives an indication of the expected fitness on future trials. Fig.2.9 shows a flowchart for the main steps of this method.

4. Face Recognition Using ICA and SVM 2003 The work in [27] uses the Independent Component Analysis (ICA) for feature extraction, which can be considered as a generalization of the PCA, followed by using the Support Vector Machine (SVM) as a classifier. It shows that the results obtained by using the combination PCA/SVM are not very far from those obtained with ICA/SVM. They suggest that SVMs are relatively insensitive to the representation space.

18

Figure 2.9: Flowchart for the Face recognition using evolutionary pursuit (EP) method

5. Face Recognition by Independent Component Analysis 2002 In [28], a version of ICA derived from the principle of optimal information transfer through sigmoidal neurons is used. ICA is performed on face images under two different architectures, one which treats the images as random variables (mixtures) and the pixels as outcomes (sources), and a second which treats the pixels as random variables and the images as outcomes. It founds that both ICA representations are superior to representations based on PCA for recognizing faces across days and changes in expression. In addition, a classifier that combines the two ICA representations gives the best performance. In architecture 1 for example, the ICA model is given by:

NMMMNM XWU ××× ×= ( 2.1)

Where XM×N contains M mixtures (training images) each of dimension N, WM×M is the unmixing matrix and UM×N contains M sources each with dimension N. Due to the high dimensionality of the mixtures (images), which makes the problem of solving W matrix intractable and takes long time. Instead, the dimension of the input images is first reduced by projecting them on PCA, then use these lower-dimension representation to solve W. Fig.2.10 explains the image synthesis model for architecture 1. In training, each image is first projected on the PCA then the result is projected on the ICA to get the feature vector. In testing, the same steps is applied on the probe image to extract its feature vector, then the cosine distance is used as the similarity measure between the input and the stored feature vectors.

19

Figure 2.10: Image synthesis model for Architecture 1. To find a set of IC images, the images in X are

considered to be a linear combination of statistically independent basis images, S, where A is an unknown mixing matrix. The basis images are estimated as the learned ICA output U.

6. Face Recognition with One Training Image per Person 2002 In [29], an extension of the Eigenface technique called Projection-Combined Principal Component Analysis (PC)2A, is proposed. (PC)2A combines the original face image with its horizontal and vertical projections and then performs principal component analysis on the enriched version of the image. It requires less computational cost than the standard Eigenface technique and experimental results show that on a gray-level frontal view face database where each person has only one training image, (PC)2A achieves 3%-5% higher accuracy than the standard Eigenface technique through using 10%-15% fewer Eigenfaces. Fig.2.11 shows an example of the projection map and the projection-combined image.

Figure 2.11: Example of the projection map and the projection-combined image

7. Bayesian Modeling of Facial Similarity 1998 In [30], the intrapersonal space (IPS) and extrapersonal space (EPS) are constructed first by computing the intrapersonal differences (i.e. difference images between any two image pairs belonging to the same individual) and the extrapersonal differences (by matching images of different individuals in the gallery), then, performing a separate PCA

20

analysis on each. All images are then projected on both intrapersonal and extrapersonal spaces. In testing, the probe image is projected on both spaces and then the Euclidean distances are computed between the interior projected vectors and the exterior projected vectors of both the input image and training images in order to get the Bayesian similarity score which used for recognition.

8. Intra-Personal Kernel Space for Face Recognition 2004 In [31], an intrapersonal space (IPS) is constructed first by collecting all the difference images between any two image pairs belonging to the same individual, to capture all intra-personal variations. Then, the probabilistic analysis of kernel principal components (PKPCA) is performed on this IPS which actually derives the intrapersonal kernel subspace. Finally, the Mahalanobis distance is used for recognition. The recognition performance demonstrates the advantage of this approach over other traditional subspace approaches including PCA, Kernel PCA, ICA, Kernel ICA, Fisher Discriminant Analysis FDA and Kernel FDA.

9. Eigenspace-based Comparative Studies Many comparative studies between different Eigenspace-based methods have been established [28], [32-40]. Among these comparisons, [39] presented an independent, comparative study of three most popular appearance-based face recognition projection methods (PCA, ICA and LDA) and their accompanied four distance metrics (L1, L2, cosine and Mahalanobis) in completely equal working conditions. The results show that no particular projection-metric combination is the best across all standard FERET tests and the choice of appropriate projection-metric combination can only be made for a specific task. In addition, the work in [40] presents another independent comparative study among different Eigenspace-based approaches. The study considers standard, differential and kernel Eigenspace-based methods. In the case of the standard ones, three different projection algorithms (PCA, FLD and EP) and eight different similarity measures (Euclidean, Whitening Euclidean (Mahalanobis), Cosine, and Whitening Cosine distances, SOM and Whitening SOM Clustering, FFC and Whitening FFC) are considered. In the case of the differential methods, two approaches are used, the pre-differential and the post-differential. In both cases Bayesian and SVM classification are employed. Finally, regarding kernel methods, Kernel PCA and Kernel FD are used together with the eight similarity measures employed for the standard approaches. Simulations are performed using the Yale Face Database, a database with few classes and several images per class, and FERET, a database with many classes and few images per class. They conclude that:

21

• Considering recognition rates, generalization ability as well as processing time, the best results are obtained with the post-differential approach, using either a Bayesian Classifier or SVM.

• In the specific case of the Yale Face Database, where the requirements are not very high, any of the compared approaches gives rather similar results. Thanks to their simplicity, Eigenfaces or Fisherfaces are probably the best alternatives.

• Although kernel methods obtain the best recognition rates, they suffer from problems such as low processing speed and the difficulty to adjust the kernel parameters.

2.3.2 Frequency-based Category In frequency-based category, the main idea is to map the image from spatial domain to frequency domain, and then construct the feature vector from this domain. Z. Pan et al. [41] and Spiess H. et al. [42] use DCT and FFT respectively to extract most important features. J. H. Lai et al. [43], apply FWT to make the image less sensitive to expressions variations, then apply FFT twice to make the features set invariant to translation, scale, and on-the-plane rotation. Also, Dai D. et al. [44] and [45] suggest the use of wavelet transform, followed by applying LDA as a classifier in [44] or PCA as a representation method in [45]. Following are some brief description about existing Frequency-based methods.

1. Human Face Recognition Using PCA on Wavelet Subband 2000 In [45], an image is decomposed into a number of subbands with different frequency components using the wavelet transform (WT). A mid-range frequency subband image (HH after three decomposition levels) with resolution 16 × 16 is selected, subband number 4 in Fig.2.12. The subbands of the training images are used to construct the PCA subspace in which all training images are projected on it to get their corresponding feature vectors. In recognition stage, the probe image is first subtracted by the mean value of the reference images, then, a mid-range frequency subband image (HH after three decomposition levels) is extracted followed by projecting it on the PCA subspace to get the feature vector. Finally, the similarity measurement between the feature vectors of the probe image and the reference images is performed to determine whether the input probe image matched any of the images. The proposed method reduces the computational complexity significantly. Moreover, experimental results demonstrated that applying PCA on WT sub-image with mid-range frequency components gives better recognition accuracy and discriminatory power than applying PCA on the whole original image.

22

Figure 2.12: 3-level wavelet decomposition

2. Face Recognition Based on Local Fisher Features 2000 Authors of [44] propose a Localized LDA (LLDA) system based on applying the LDA on the mid-range subband of wavelet transform. The experiments show that this system has good classification power. The LLDA features are edge information of images. It concentrates on a mid-range subband by the two reasons: (1) it gives edge information; (2) it overcomes the difficulty in solving a singular Eigen value problem.

3. High speed face recognition based on discrete cosine transforms and neural networks 2000

In this method [47], discrete cosine transforms (DCTs) are used to reduce the dimensionality of face space by truncating high frequency DCT components. The remaining coefficients are fed into a neural network for classification. The selection of the DCT coefficients is done as in Fig.2.13. Because only a small number of low frequency DCT components are necessary to preserve the most important facial features, the proposed DCT-based face recognition system is much faster than other approaches.

Figure 2.13: (a) input image, (b) the log-magnitude of its DCT, (c) the scanning strategy of

coefficients

23

4. Face Recognition in Fourier Space 2000 The work in [42] describes a simple face recognition system based on an analysis of faces via their Fourier spectra. The feature vectors are constructed by taking the Fourier coefficients at selected frequencies as shown in Fig.2.14. Recognition is done by finding the closest match between feature vectors using the Euclidian distance classifier.

Figure 2.14: Most variant frequencies: a) real, b) imaginary and c) selected numbering

5. Face Recognition Using Holistic Fourier Invariant Features 2001 Authors of [43] introduce the Spectroface representation which is based on the wavelet transform and holistic Fourier invariant features as illustrated in Fig.2.15. Wavelet transform is applied to the face image to eliminate the effect of facial expressions. Then, the holistic Fourier invariant features (Spectroface) are extracted from the low frequency subband image (LL) by applying Fourier transform twice. The first Fourier transform is applied to the low frequency subband to make it invariant to the spatial translation. Then the second Fourier transform is applied to the polar transformation of the result to make it invariant to scale and on-the-plane rotation. Recognition is done by finding the closest match, using Euclidian distance, between the Spectroface of the probe image and those stored in the gallery.

6. Face Recognition Based on Polar Frequency Features 2006 A novel biologically motivated face recognition algorithm based on polar frequency is presented in [48]. Polar frequency descriptors are extracted from face images by Fourier-Bessel transform (FBT), which based on converting the image from Cartesian to polar coordinates followed by extracting the Fourier-Bessel series. Examples of the FBT of some images are shown in Fig.2.6. Next, the Euclidean distance between FBT’s of all images is computed and each image is now represented by its dissimilarity to the other images. A Pseudo-Fisher Linear Discriminant was built on this dissimilarity space. The results indicate the high informative value of the polar frequency content of face images in relation to recognition and verification tasks.

24

Figure 2.15: Spectroface representation steps

Figure 2.16: Examples of FBT of (A) an 8 radial cycles image, (B) a 4 angular cycles image and (C)

an image of the average of these images. The magnitude of the FBT coefficients is presented in colored levels (red indicates the highest value)

2.3.3 Other Holistic-Based Approaches

1. A Hybrid Feature Extraction Approach for Face Recognition based on Moments 2004

In this method [49], different feature extraction techniques such as Fourier descriptors, Zernike moments, Hu moments and Legendre moments are considered and classification techniques such as Nearest Neighbor classifiers, Linear Discriminant Analysis LDA classifiers and neural network classifiers are compared. Results on ORL [112] database show that using hybrid features composed of Fourier descriptors and Zernike moments with back-propagation NN as a classifier give the best recognition results. Fig.2.17 shows the block diagram for this face recognition system.

25

Figure 2.17: Block diagram for face recognition based on moments

2. Face Recognition with Support Vector Machines: Global versus Component-based Approach 2001

Authors of [18] present a local-based method and two holistic-based methods based on using the support vector machine SVM and evaluate them with respect to robustness against pose changes. Local-based method starts with locating facial components, extracting and combining them into a single feature vector which is classified by a Support Vector Machine (SVM), described above in the Local-based section. The two holistic-based methods recognize faces by classifying a single feature vector consisting of the gray values of the whole face image. In the first one, a single SVM classifier is trained for each person in the database. The second system consists of sets of viewpoint-specific SVM classifiers and involves clustering during training. Extensive tests are performed on a database which included faces rotated up to about 40° in depth. The local-based method clearly outperformed both holistic-based methods on all tests.

3. Face Recognition with Pose and Illumination Variations using new SVRDM Support Vector Machine 2005

A new support vector representation and discrimination machine (SVRDM) classifier is proposed in [50] and face recognition-rejection results are presented using the CMU PIE face database; both pose and illumination variations are considered. The recognition approach is based on a view-based two-step strategy, in which the pose of a test input is first estimated using SVRDM and this is followed by an identity classification with another SVRDM assuming the estimated pose. Four different classifiers are compared; namely SVM, SVRDM, Eigenface and Fisherface classifiers. Experimental results shows that the SVRDM performs best among all classifiers using the two-step strategy and that the SVRDM is less sensitive to the size of the classification problem than are other classifiers.

26

4. Face Recognition using a New Texture Representation of Face Images 2003

Authors of [51] present a new texture representation of face image using a robust feature from the Trace transform. The masked Trace transform (MTT) offers texture information for face representation which is used to reduce the within-class variance by masking out the background and non pure-face information. Fig.2.18 shows an example of the trace transform on a full face image and its masked version. The method starts with transforming the image space to the Trace transform space to produce the MTT. Weighted Trace transform (WTT) is then calculated which identifies the tracing lines of the MTT that produce similar values irrespective of intra-class variations. Finally, a new distance measure is proposed by incorporating the WTT for measuring the dissimilarity between reference and test images.

Figure 2.18: Examples of the Trace transform on (a) full image (b) masked with rectangular shape

and (c) masked with elliptical shape.

5. Embedded Bayesian networks for face recognition 2002 The work in [52] introduces a family of embedded Bayesian networks (EBN), which considered a generalization of the embedded hidden Markov models, and investigates their performance for face recognition. Results show that the members of the EBN family outperform some of the existing approaches such as the Eigenface method and the embedded HMM method.

27

2.4 Hybrid Approaches

1. Face Recognition Using Local and Global Features 2004 The work in [53] proposes to combine local and global facial features for face recognition. Four popular face recognition methods, namely, Eigenface [24], Spectroface [43], independent component analysis (ICA) [54], and local Gabor wavelet [15] are selected for combination. Since each of Spectroface, PCA, and ICA use distance measurement for classification, while local Gabor wavelet use similarity measurement, these measurements should be normalized at the same scale to be able to combine the four methods. Two normalization methods, namely, linear-exponential normalization method and distribution-weighted Gaussian normalization method, are proposed here. In addition, to choose the best set of classifiers for recognition, a simple but effective algorithm for classifiers selection is proposed. It is based on the leave-one-out algorithm through an iterative scheme. The basic idea of the scheme is that if one classifier is redundant, the accuracy will increase if that classifier is removed from combination. Finally, a weighted combination of classifiers based on the sum rule is used instead of assigning equal weight to each classifier. Fig.2.19 shows both the training and recognition stages. The experimental results show that the proposed method has 5–7% accuracy improvement over using a single global/local classifier.

Figure 2.19: Training and recognition stages of Face Recognition Using Local and Global Features

approach

2. A Probabilistic Fusion Methodology for Face Recognition 2005 The work in [55] considers three facial features, two global and one local, which are the entire face (i.e., the gray-level image of the face), the edginess image of the face, and the eyes, respectively. Fig.2.20 shows the three facial features. The edginess image is a global facial feature that is reasonably robust to illumination. It is a measure of the change in intensity from one pixel to the next. The eyes are manually located and cropped

28

to be used as a facial feature that is robust to facial expressions and occlusions (especially when the lower part of the face is fully occluded).

Figure 2.20 (a) A gray-scale face image, (b) it’s edginess image, and (c) the cropped eyes.

Next, the facial features are encoded to lower-dimensional feature spaces using the principal component analysis (PCA) in conjunction with Fisher’s Linear Discriminant (FLD). Three individual spaces are constructed corresponding to the three facial features. The distance-in-feature-space (DIFS) values are calculated for all the images in the training set and in each of the feature spaces. These values are used to compute the distributions of the DIFS values. Given a new test image, the three facial features are first extracted and their DIFS values are computed in each feature space. Each feature provides an opinion on the claim in terms of a confidence value. The confidence values of all the three features are fused for final recognition. The identity established by the proposed fusion technique is more reliable compared to the case when features are used individually.

2.5 Performance Evaluations and Comparative Studies

2.5.1 Performance Evaluation Performance evaluations of biometric technology are divided into three categories: technology, scenario, and operational. Each category of evaluation takes a different approach and studies different aspects of the system. A thorough evaluation of a system for a specific purpose starts with a technology evaluation, followed by a scenario evaluation and finally an operational evaluation [113]. The goal of a technology evaluation is to compare competing algorithms from a single technology, which in this case is facial recognition. Testing of all algorithms is done on a standardized database collected by a "universal" sensor and should be performed by an organization that will not see any benefit should one algorithm outperform the others. The use of a test set ensures that all participants see the same data. Someone with a need for facial recognition can look at the results from the images that most closely resemble their situation and can determine, to a reasonable extent, what results they should expect. Technology evaluations are always completely repeatable. Results from a technology evaluation typically show specific areas that require future research and development, as well as provide performance data that is useful when selecting algorithm(s) for scenario

29

evaluations. This evaluation class includes the FERET (face recognition technology) series of face recognition evaluations and the FRVT (face recognition vendor test) series [114]. Scenario evaluations aim to evaluate the overall capabilities of the entire system for a specific application. In face recognition, a technology evaluation would study the face recognition algorithms only but the scenario evaluation studies the entire system, including camera and camera-algorithm interface, for a specific application. An example is face recognition systems that verify the identity of a person entering a secure room. Each tested system would normally have its own acquisition sensor and would thus receive slightly different data. Scenario evaluations are not always completely repeatable for this reason, but the approach used can always be completely repeatable. Scenario evaluations typically take a few weeks to complete because multiple trials, and for some scenario evaluations, multiple trials of multiple subjects/areas, must be completed. Results from a scenario evaluation typically show areas that require additional system integration, as well as provide performance data on systems for a specific application. An example of the scenario evaluation is the UK Biometric Product Testing [56]. At first glance, an operational evaluation appears very similar to a scenario evaluation, except that the test is at the actual site and uses actual subjects. Rather than testing for performance, however, operational evaluations aim to study the workflow impact of specific systems installed for a specific purpose. Operational evaluations are not very repeatable unless the actual operational environment naturally creates repeatable data. Operational evaluations typically last from several weeks to several months. The evaluation team must first examine workflow performance prior to technology insertion, and again after users are familiar with the technology. Accurate analysis of the benefit of the new technology requires a comparison of the workflow performance before and after the technology insertion. In an ideal three-step evaluation process, technology evaluations are performed on all applicable technologies that could conceivably meet requirements. The technical community will use the results to plan future research and development (R&D) activities, while potential end-users will use the results to select promising systems for application-specific scenario evaluations. Results from the scenario evaluation will enable end-users to find the best system for their specific application and have a good understanding of how it will operate at the proposed location. This performance data, combined with workflow impact data from subsequent operational evaluations, will enable decision makers to develop a solid business case for a large-scale installation.

30

2.5.2 Comparative Studies Many comparative studies have been established in the last 10 years due to increasing in the number of available algorithms and techniques. These comparative studies usually try to evaluate two or more face recognition algorithms using one or more small to medium size databases. They can be classified into two categories according to the nature of databases they work on:

1. Comparative studies using global database(s) usually contain more than one variation to determine which algorithm is better over these databases, as in [40], [57-60].

2. Comparative studies for specific variation(s) using suitable database(s) each with one variation only to study the algorithms over each variation separately, as in [21], [39], [61-65].

Under the first category, [57] evaluate three holistic-based algorithms and a local one using seven different classifiers over FERET database with variations in illumination and aging. In [40], seven Eigenspace-based algorithms with five similarity matching criteria are compared over two databases, Yale with variations in illumination and expressions and FERET. Five algorithms based on local binary patterns are compared with a holistic and a local algorithm in [59] over two databases, BANCA with complex background and difficult lightning conditions and XM2VTS with uniform background. In [60], two holistic algorithms and a local one are compared over four different databases, each with two or more variations. In the second category, only the pose variation is considered in [61] and [62]. In [61], two holistic-based algorithms are compared using two databases (ALAN and UMIST). In [62], five holistic-based algorithms are compared using one database (FERET). In [39] and [21], expressions, illumination and aging variations are tested separately using FERET database over three holistic-based algorithms in [39] and two holistic-based and two local-based algorithms in [21]. Non-uniform illumination variation is considered in [63] in which five holistic-based algorithms are compared over two different databases, CMU-PIE and YALE B each with illumination variations only. As the main aim of this thesis is to propose an illumination normalization approach, the next chapter will discuss the different illumination normalization approaches in the literature. Also, nine comparative studies are introduced at the end of the chapter to select the best-of-literature approaches.

31

CHAPTER 3: Illumination Normalization Approaches

3.1 Introduction Although many face recognition techniques and systems have been proposed in the last 20 years, evaluations of the state-of-the-art techniques and systems have shown that recognition performance of most current technologies degrades due to the variations of illumination [78], [12]. In the last face recognition vendor test FRVT 2006 [12], they conclude that relaxing the illumination condition has a dramatic effect on the performance. Moreover, It has been proven both experimentally [84] and theoretically [14] that the variations between the images of the same face due to illumination are almost always larger than image variations due to change in face identity. As is evident in Fig.3.1, the same subject, with the same facial expression, can appear strikingly different when light source direction and viewpoint vary [79].

Figure 3.1: The same individual imaged with the same camera and the same facial expression may

appear dramatically different with changes in the lighting conditions.

There has been much work dealing with illumination variations in face recognition. Generally, these approaches can be classified into two categories: model-based and image-processing-based approaches. Model-based approaches derive a model of an individual face, which will account for variation in lighting conditions. Examples of this approach include spherical harmonics representation [81], Eigen Light-Fields [82], illumination cone [64], Quotient Image [100], Self Quotient Image [107] and Retinex algorithms [89], [90], [66]. Though the model-based approaches are perfect in theory, they require a training set with several different lighting conditions for the same subject, which can be considered as a weakness for realistic applications. Although some work has been done to enlarge a small learning set by virtually re-imaging the input face image as in [83], [96], the requirements of additional constraints or assumptions in addition to the highly computational cost make the model-based approaches unsuitable for realistic applications [80], [71]. Image-processing-based approaches attempt to normalize the variation in appearance due to illumination, either by image transformations or by synthesizing a new image from the given image in some normalized form. Recognition is then performed using this

32

canonical form. Examples of this approach include histogram equalization/matching HE/HM [71], gamma intensity correction GIC [84], local binary patterns LBP [80] and local normal distribution LNORM [72]. Compared to the model-based approach, preprocessing has two main advantages: it is completely stand-alone and thus can be used with any classifier. Moreover, it transforms images directly without any training images, assumptions or prior knowledge. Therefore, they are more commonly used in practical systems for their simplicity and efficiency. The rest of this chapter is organized as follows: sections 2 and 3 contain the description of some model-based and image-processing-based approaches, respectively. Section 4 contains the results of existing comparative studies among different illumination normalization approaches focusing on the best approaches of each comparison and then concludes the best-of-literature approaches from these studies.

3.2 Model-Based Approaches

1. Quotient Illumination Relighting (QIR) The work in [84] firstly proposes the Quotient Illumination Relighting (QIR) approach for robust face recognition under varying lighting conditions. QIR is based on the Lambertian model in which the face image can be described by the product of the albedo and the cosine angle between a point light source and the surface normal, as follows:

syxnyxyxI T),(),(),( ρ= ( 3.1)

where 0 ≤ ρ(x, y) ≤ 1 is the surface reflectance (albedo) associated with point x, y in the image, n(x, y) is the surface normal direction associated with point x, y in the image, and s is the light source direction (point light source) and whose magnitude is the light source intensity [100]. To understand the idea of QIR, we firstly need to consider the following three definitions from [84]:

1. Ideal class of objects: is a collection of 3D objects that have the same shape but differ in the surface albedo function. The image space of such a class is represented by:

jT

i syxnyx ),(),(ρ ( 3.2)

where ρi (x, y) is the albedo of object i of the class, n(x, y) is the surface normal of the object (the same for all objects of the class), and sj is the light source direction, which can vary arbitrarily.

2. Quotient Illumination: for the lighting condition sj of an ideal class of objects (whose shape is n) is defined as:

0.),(.),(

),(syxnsyxn

yxRT

jT

j = ( 3.3)

33

Where s0 (point light source) be a pre-defined canonical lighting condition. Obviously, the Quotient Illumination is completely independent of the surface reflectance (albedo), and depends only on the variance of the lighting condition from the pre-defined canonical lighting one (considering all the shapes are assumed to be the same). Thus, Quotient Illumination can be computed easily by calculating the quotient between the images of the object i of the ideal class of objects as follows:

),(),(

.),(),(

.),(),(),(

00 yxIyxI

syxnyxsyxnyx

yxRi

ij

Ti

jT

ij ==

ρρ

( 3.4)

where Iij is the image of object i captured under the j-th lighting condition, and Ii0

is the image of the same object i captured under the canonical lighting condition. 3. Quotient illumination bootstrap set: since faces are not strictly ideal class of

objects as the 3D shapes of faces are different despite their approximate similarity. Therefore, a Quotient illumination bootstrap set needs to be constructed. It consists of a set of pairs of face images captured under some non-canonical lighting condition and under the pre-defined canonical lighting condition, as follows:

( ) LjNiII iij ,...,2,1;,...,2,1|, 0 == ( 3.5)

where N is number of persons and L is number of non-canonical lighting conditions in the system. Given such a bootstrap set, Quotient Illumination can be statistically modeled or computed simply as the mean over all the faces in the set, for instance:

LjyxIyxI

NyxR

N

i i

ijj ,...,2,1,

),(),(1),(

1 0

== ∑=

( 3.6)

Finally, the Quotient Illumination Relighting (QIR) can be computed as follows: Given an image of arbitrary face, Iij, assume that it is lighted by the j-th known lighting condition, and the j-th quotient illumination Rj has been computed too. Then, its canonical image captured under the pre-defined 0-th lighting condition can be derived by:

),(),(

),(0 yxRyxI

yxIj

iji = ( 3.7)

This provides a direct and simple way for illumination normalization but under the condition that the direction of the lighting source of the image can be known. See Fig.3.2 for its intuitive effect on an illuminated face image from Yale B database. Note that the QIR is based on the assumption that the lighting modes of the images, both probe and references, are known or can be estimated. This is a strong constraint in a practical application system.

34

Figure 3.2: Effect of applying QIR on an illuminated face image from Yale B database

2. Self-Quotient Image (SQI) To avoid the requirement for knowing/estimating the lighting modes of images in QIR, the concept of Self-Quotient Image (SQI) is first introduced in [107] for robust face recognition under varying lighting conditions. SQI is based also on the Lambertian model [107] rather than the reflectance illumination model [107] and it’s defined by:

IFI

IIQ

*ˆ == ( 3.8)

where I is the smoothed version of I, F is the smoothing kernel, and the division is point-wise. Fig.3.3 shows the effect of applying SQI on illuminated face images from two face databases, Yale B and CMU PIE.

a) Yale B

b) CMU PIE

Figure 3.3: Effect of applying SQI approach to illuminated face images from Yale B and CMU PIE databases.

The only processing needed for SQI is smoothing filtering. A weighed Gaussian filter is designed for anisotropic smoothing according to the following equation:

WGF *= ( 3.9)

(a) Illuminated Image (b) QIR image

35

where W is the weight and G is the Gaussian kernel, and N is the normalization factor for which:

∑Ω

= 11 WGN ( 3.10)

where Ω is the convolution kernel size. The convolution region is divided into two sub-regions M1 and M2 with respect to a threshold τ. Where M1 has more pixels than M2, τ is calculated by:

)( Ω= IMeanτ ( 3.11)

For the two sub-regions, W has corresponding value:

⎩⎨⎧

∈∈

=1

2

),(1),(0

),(MjiIMjiI

jiW ( 3.12)

If the convolution image region is smooth, i.e. little gray value variation (non-edge region), there is also little difference between the smoothing the whole region and part of the region. If there is large gray value variation in convolution region, i.e. edge region, the threshold can divide the convolution region into two parts M1 and M2 along the edge and the filter kernel will convolute only with the large part M1, which contains more pixels. Therefore the halo effects can be significantly reduced by the weighted Gaussian kernel. The essence of this anisotropic filter is that it smoothes only the main part of convolution region (i.e. only one side of edge region in case of step edge region). The division operation in the SQI may magnify high frequent noise especially in low signal noise ratio regions, such as in shadows. To reduce noise in Q, a nonlinear transformation function is used to transform Q into D,

)(QTD = ( 3.13)

where T is a nonlinear transform which may be Log, Arctangent or Sigmoid nonlinear function. The implementation of SQI approach is summarized below:

1. Select several smoothing kernel G1, G2, …, Gn and calculate corresponding weights W1, W2, …, Wn according to image I, and then smooth I by each weighted anisotropic filter WGi.

nkWGN

II kk ,...,2,1,1*ˆ == ( 3.14)

Calculate self-quotient image (SQI) between input image I and its smoothing version

nkIIQk

k ,...,2,1,ˆ == ( 3.15)

36

2. Transfer self-quotient image (SQI) with nonlinear function nkQTD kk ,...,2,1),( == ( 3.16)

3. Summarize nonlinear transferred results

∑=

==n

kkk nkDmQ

1

,...,2,1, ( 3.17)

The m1, m2,… mn are the weights for each scale of filter and are set to one in experiments of [107].

3. Single Scale Retinex with Histogram Matching (SSR-HM) When the dynamic range of a scene exceeds the dynamic range of the recording medium, the visibility of color and detail will usually be quite poor in the recorded image. Dynamic range compression attempts to correct this situation by mapping a large input dynamic range to a relatively small output dynamic range. Simultaneously, the colors recorded from a scene vary as the scene illumination changes. Color constancy aims to produce colors that look similar under widely different viewing conditions and illuminants. The Retinex is an image enhancement algorithm that provides a high level of dynamic range compression and color constancy [89]. The work in [66] proposes an illumination normalization approach based on applying the Retinex followed by histogram matching for illumination invariant face recognition. Different from the QIR and SQI approaches, Retinex algorithms are based on the reflectance illumination model [107] rather than the Lambertian model [107]:

),(),(),( yxLyxRyxI ×= ( 3.18)

where I is the image, R is the reflectance of the scene and L is illuminance/lighting at each point (x, y). Many variants of the Retinex have been published over the years. The last version from Land [90] is now referred to as the Single Scale Retinex (SSR) and is defined for a point (x, y) in an image as:

[ ]),(),(log),(log),( yxIyxFyxIyxI iiR ⊗−= ( 3.19)

where IR(x, y) is the Retinex output and Ii(x, y) is the image distribution in the i-th spectral band. There are three spectral bands – one each for red, green and blue channels in a color image.

In equation 3.19, the symbol ⊗ represents the convolution operator and F(x, y) is the Gaussian surround function given by equation 3.20. The final image produced by Retinex processing is denoted by IR:

[ ]222

2121 )(exp),( σκ xxxxF +−= ( 3.20)

37

where σ is the standard deviation of the filter and controls the amount of spatial detail that is retained, and κ is a normalization factor that keeps the area under the Gaussian curve equal to one. The standard deviation of the Gaussian σ is referred to as the scale of the SSR. A small value of σ provides very good dynamic range compression but at the cost of poorer color rendition, causing graying of the image in uniform areas of color. Conversely, a large scale provides better color rendition but at the cost of dynamic range compression [89]. Since face recognition is conventionally performed on grey-scale images, the loss of color is out of concern here. Moreover, the dynamic range compression gained by small scales is the essence of the illumination normalization process proposed in [66]. All the shadowed regions are grayed out to a uniform color, eliminating soft shadows and specularities and hence creating an illumination invariant signature of the original image. Fig.3.4 illustrates the effect of Retinex processing on a facial image, I, for different values of σ. As σ increases, the normalized image IN, contains reduced graying and lesser loss of intensity values, as seen in Fig.3.4 (c) and (d). However, for larger values of σ, the shadow is still visible. On the other hand, with σ = 6 in Fig.3.4 (b), the resulting image has grayed out the shadow region to blend in with the rest of the face.

Figure 3.4: The effect of the scale, σ, on processing an illuminated facial image using the SSR.

Finally, histogram matching/fitting is then applied in to bring all the images that have been processed by the SSR to the same dynamic range of intensity [66]. The histogram of IR is modified to match a histogram of a specified target image ÎR as shown in Fig.3.5.

4. GROSS Method The work in [85] proposes an illumination normalization approach, we call it GROSS approach, for illumination invariant face recognition. Same as Retinex algorithms, GROSS is based on the reflectance illumination model rather than the Lambertian model, see equation 3.18. The GROSS approach is motivated by two widely accepted assumptions about human vision:

1. Human vision is mostly sensitive to scene reflectance and mostly insensitive to the illumination conditions.

(b) IR with σ = 6 (c) IR with σ = 50 (a) Illuminated Image

(d) IR with σ = 100

38

2. Human vision responds to local changes in contrast rather than to global brightness levels.

Having these assumptions, the goal is to find an estimate of L(x, y) such that when it divides I(x, y) it produces R(x, y) in which the local contrast is appropriately enhanced.

Figure 3.5: Histogram fitted version of SSR with σ = 6

This view is called perception gain model, in which R(x, y) takes the place of perceived sensation, I(x, y) takes the place of the input stimulus and L(x, y) is then called perception gain which maps the input sensation into the perceived stimulus, that is:

),(),(

1),( yxRyxL

yxI = ( 3.21)

The solution for L(x, y) is found by minimizing: ( ) ( )∫∫∫∫ ΩΩ

++−= dxdyLLdxdyLyxLJ yx2221),()( λρ ( 3.22)

where the first term drives the solution to follow the perception gain model, while the second term imposes a smoothness constraint. Here Ω refers to the image. The parameter

(b) IR with σ = 6 (d) Well-lit Image, Î (e) ÎR with σ = 6 (a) Illuminated Image, I

(c) Source SSR Histogram (f) Target SSR Histogram

(g) Histogram Matched/Fitted Image (h) Histogram of Image in (g)

39

λ controls the relative importance of the two terms. The space varying permeability weight ρ(x, y) controls the anisotropic nature of the smoothing constraint. The Euler-Lagrange equation for this calculus of variation problem yields:

( ) ILLL yyxx =++ρλ

( 3.23)

Discretized on a rectangular lattice, this linear partial differential equation (PDE) becomes:

( ) ( ) ( ) ( ) ILLh

LLh

LLh

LLh

L jiji

ji

jiji

ji

jiji

ji

jiji

ji

ji =⎥⎥⎥

⎦

⎤

⎢⎢⎢

⎣

⎡−+−+−+−+ +

+

−

−

+

+

−

−

,1,

,21

,1,

,21

1,,

21,

1,,

21,

,

1111ρρρρ

λ ( 3.24)

where h is the pixel grid size and the value of each ρ is taken in the middle of the edge between the center pixel and each of the corresponding neighbors (see Fig.3.6). In this formulation, ρ controls the anisotropic nature of the smoothing by modulating permeability between pixel neighbors. Equation 3.24 can be solved numerically using multi-grid methods for boundary value problems [108] in order O(N).

Figure 3.6: Discretization lattice for the PDE in equation 3.23

The smoothness is penalized at every edge of the lattice by weights ρ (see Fig.3.6). To make the weight changes proportionally with the strength of the discontinuities, the following relative measure of local contrasts is used that will equally "respects" boundaries in shadows and bright regions:

( )ba

baba II

IIIIh

,min2

−=

∆=+ρ ( 3.25)

where 2

ba+ρ is the weight between two neighboring pixels whose intensities are Ia and Ib.

Fig.3.7 shows the effect of applying GROSS approach on some illuminated face images from Yale B database.

40

Note that the GROSS approach does not require any training steps, knowledge of 3D face models or reflective surface models. Only a single parameter, which meaning is intuitive and simple to understand, needs to be adjusted by the user.

Figure 3.7: Effect of applying GROSS approach on some illuminated face images from Yale B

database

3.3 Image-Processing-Based Approaches The Image-processing-based approaches are applied either globally on the whole image or locally over blocks or regions. The global approaches usually produce realistic images while the local approaches have the disadvantage that the output is not necessarily realistic. But since the objective is to obtain a representation of the face that is invariant to illumination, while keeping the information necessary to allow a discriminative recognition of the subjects, it’s possible to use them for illumination normalization. However, since local approaches are highly dependent on the local distribution of the pixels within the image, this make them sensitive to any geometrical effects on the images such as translation, rotation and scaling rather than global approaches which are not affected by such geometrical changes. Following are some examples under each approach.

3.3.1 Global Approaches

1. Histogram Equalization (HE) It’s one of the most common illumination normalization approaches [74]. It aims to create an image with uniform distribution over the whole brightness scale by using the cumulative density function of the image as a transfer function. Thus, for an image of

(a) Illuminated images from Yale B

(b) Processed images by GROSS approach

41

size M × N with G gray levels and cumulative histogram H(g), the transfer function at certain level T(g) is given as follows:

NMGgHgT

×−×

=)1()()( ( 3.26)

Fig.3.8 shows the effect of applying histogram equalization to an illuminated image. The illuminated image and its histogram are shown in Fig.3.8 (a), (b) while the equalized image and its corresponding histogram are shown in Fig.3.8 (c), (d).

Figure 3.8: Effect of applying histogram equalization on an illuminated image

2. Histogram Matching/Fitting (HM) HM is most commonly used techniques of histogram adjustment. Given an illuminated face image X and a well-lit face image Y, histogram matching [74] is applied to bring the illumination level of the input image X to that of the reference image Y. This is done by making the histogram of the X approximately “match” to the histogram of Y which makes both images having roughly the same mean and variance in their histograms. Fig.3.9 demonstrates the histogram matching process to an illuminated image. The illuminated image (source) and its histogram are shown in Fig.3.9 (a), (b). The well-lit image (target) and its corresponding histogram are shown in Fig.3.9 (c), (d) respectively. The resulting image and its histogram after applying the histogram matching are shown in Fig.3.9 (e), (f).

Illuminated Image

Equalized Image

(a) (b)

(c) (d)

42

Figure 3.9: Histogram matching process to an illuminated image

To explain the algorithm, Let H(i) be the histogram function of an illuminated image X, and G(i) be the desired histogram of the well-lit image Y, we wish to map H(i) to G(i) via a transformation FH G(i). We first compute a transformation function for both H(i) and G(i) that will map the histogram to a uniform distribution, U(i). These functions are FH U(i) and FG U(i), respectively. Equations 3.27 and 3.28 depict the mapping to a uniform distribution, which is also known as histogram equalization [74].

∑

∑−

=

=

→ = 1

0

0U(i) H

)(

)(F N

j

i

j

jH

jH

( 3.27)

∑

∑−

=

=

→ 1

0

0U(i)G

)(

)(

N

j

i

j

jG

jGF ( 3.28)

Where N is the number of discrete intensity levels. N = 256 for 8-bit grayscale images.

(a)

(f)

(d)(c)

Illuminated Image

Well-lit Image

Resulting Image

(b)

(e)

43

To find the mapping function, FH G(i), the function FG U(i) is inverted to obtain FU G(i). Since the domain and the range of the functions of this form are identical, the inverse mapping is trivial and is found by cycling through all values of the function. However, due to the discrete nature of these functions, inverting can yield a function which is undefined for certain values. Thus, linear interpolation is used and assumes smoothness to fill undefined points of the inverse function according to the values of well-defined points in the function. As a result, a fully defined mapping FU G(i) is generated which transforms a uniform histogram distribution to the distribution found in histogram G(i). The mapping FH G(i) can then be defined as in equation 3.29, [66].

)( )()( iUHGUiGH FFF →→→ = ( 3.29)

It’s common in literature to match all images, in both training and testing sets, with a single histogram of either a fixed well-lit image as in [71], [67] or an average image as in [72].

3. Logarithmic Transform (LOG) LOG is a frequently used technique of gray-scale transform. It simulates the logarithmic sensitivity of the human eye to the light intensity. The general form of the log transformation [74] is:

)1log( rcs += ( 3.30)

Where r and s are the old and new intensity value, respectively and c is a gray stretch parameter used to linearly scaling the result to be in the range of [0, 255]. The shape of the log curve in Fig.3.10 shows that this transformation maps a narrow range of dark input gray-levels (shadows) into a wider range of output gray levels. The opposite is true for the higher values of the input gray-levels. Fig.3.11 shows the effect of applying LOG to an illuminated face image.

Figure 3.10: Transformation functions of LOG and GAMMA (L: number of gray levels)

44

Figure 3.11: Effect of applying LOG approach to an illuminated face image.

4. Gamma intensity correction (GIC) Gamma correction is a technique commonly used in the field of Computer Graphics. It concerns how to display an image accurately on a computer screen. Images that are not properly corrected can look either bleached out, or too dark. Gamma correction can control the overall brightness of an image by changing the Gamma parameter, see Fig.3.10 for the effect of choosing Gamma (γ) to be greater than one which map a narrow range of dark input values (shadows) into a wider range of output values. Unlike the traditional Gamma correction technique in Computer Graphics, but motivated by its idea, the Gamma Intensity Correction (GIC) method is proposed by [84] to correct the overall brightness of the face images to a pre-defined “canonical” face images. It is formulated as following: Predefine a canonical face image, I0, which should be lighted under some normal lighting condition. Then, given any face image, I, captured under some unknown lighting condition. Its canonical image is computed by a Gamma transform pixel by pixel over the image position x, y:

( )*;γxyxy IGI =′ ( 3.31)

Where the Gamma coefficient γ* is computed by the following optimization process, which aims at minimizing the difference between the transformed image and the predefined normal face image I0:

( ) ( )[ ]∑ −=yx

xy yxIIG,

20

* ,;minarg γγγ

( 3.32)

Where Ixy is the gray-level of the image position x, y; and

( ) γγ1

.; xyxy IcIG = ( 3.33)

is the Gamma transform; c is a gray stretch parameter used to linearly scaling the result to be in the range of [0, 255], and γ is the Gamma coefficient. From equations 3.32 and 3.33, intuitively, the GIC is expected to make the overall brightness of the input images best fit that of the pre-defined normal face images. Thus, its intuitive effect is that the overall brightness of all the processed face images is

LOG

Resulting Image

Illuminated Image

45

adjusted to the same level as that of the common normal face I0. See Fig.3.12 for its intuitive effect.

Figure 3.12: Effect of applying GIC to an illuminated face image

5. Normal Distribution (NORM) This technique normalizes the image by assuming the gray values form a normal distribution [72]. The idea is to make the mean (µr) and the standard deviation (σr) of the resulting image to zero and one respectively. For an image of mean (µi) and standard deviation (σi), the output image is calculated using the following equation.

( )( ) ( )i

iyxIyxIfσ

µ−=

,, ( 3.34)

The effect of applying the NORM to an illuminated face image is shown in Fig.3.13.

Figure 3.13: Effect of applying NORM approach to an illuminated image. (Note that the gray-level of

the resulting image is stretched to [0,255] for displaying purpose only)

3.3.2 Local Approaches

1. Local Normalization Methods The local normalization methods have the disadvantage that the output is not necessarily realistic. In the face recognition problem the objective is not to have a realistic image but to obtain a representation of the face that is invariant to illumination, while keeping the information necessary to allow a discriminative recognition of the subjects. With this idea in mind, it makes sense to use local illumination normalization methods this type of application.

NORM

Illuminated Image (I)

Normal Image (I0)

GIC

GIC Image (I’)

Illuminated Image

Resulting Image

46

There are three local normalization methods proposed by [72] which are: Local Histogram Equalization (LHE), Local Histogram Matching (LHM) and Local Normal Distribution (LNORM). They are the same as their global counterparts described in section 3.3.1 but applied locally. Applying a function locally mean the following; take a window from the image, starting in the up left corner, with a window size considerably smaller than the image size. The global normalization function is applied to the windowed image. This process is repeated by moving the window pixel by pixel all over the image and for each one applying the normalization function. Because the windows overlap, the final pixel value is the average of all the results for that particular pixel. Fig.3.14 shows the effect of applying the three local normalization methods, LHE, LHM and LNORM to an illuminated face image.

Figure 3.14: Effects of applying the three local normalization methods to an illuminated face image

2. Region-based strategy combining GIC and HE It is obvious that both HE and GIC are global transforms over the whole image area. Therefore, they are doomed to fail when side lighting exists. To partly solve this problem, [84] propose to process the face images based on different local regions, that is, performing HE or GIC in some pre-defined face regions in order to better alleviate the highlight, shading and shadow effect caused by the unequal illumination. Ideally, it is expected to strictly partition the face according to the structure of the facial organs, for instance, as illustrated in Fig.3.15.

Figure 3.15: An example of ideal region partition

However, complex region partition needs complicated region segmentation approach, which is often impractical. And, since the possible side lighting mainly cause the non-symmetry between the left and right part of the face, as well as the intensity variance between the top region and the bottom region. In [84], the strategy is to simply partition the face into four regions according to the given eye centers as shown in Fig.3.16.

(d) LNORM (a) Illuminated Image (c) LHM (b) LHE

47

Figure 3.16: The four regions for illumination normalization

After the coarse partition of the face regions, HE or GIC can be conducted in the four regions separately. Hereafter, the region-based HE is abbreviated to RHE, and the region-based GIC to RGIC. The effects of the RHE and RGIC can be seen from Fig.3.17.

Figure 3.17: The effects of applying region-based strategy of HE and GIC over the four face regions

3. Block Histogram Matching (BHM) In [55], a simple block histogram matching (BHM) technique is proposed for illumination compensation. It assumes that a reference image taken under well-controlled lighting conditions is available. Let X and Y be the input and the reference images, respectively, of size N × N pixels. The goal is to bring the illumination level of the input image X to that of the reference image Y by applying BHM. Consider a block image BI from the input image X with pixel locations ranging from 1 to M and also a block image BR from the reference image Y at the corresponding pixel locations (Fig.3.18). A histogram matching is applied to the input image block BI to make the pixel intensity distribution of BI equivalent to the pixel intensity distribution of BR.

Figure 3.18: Block histogram matching. In each image pair, the left one is the input image while the

right one is the reference image.

(d) RGIC + RHE (a) Illuminated Image (c) RGIC (b) RHE

48

The histogram-matching block image intensity values are scaled with a windowing filter H which is as defined below and is pictorially shown in Fig.3.19:

( ) ( ) ( ) MmnmnHmnBmnB OO ≤≤= ,1,,,, ( 3.35)

where

( )( )

( )

( )( )⎪⎪⎪⎪

⎩

⎪⎪⎪⎪

⎨

⎧

≤<+−+−

≤<≤≤+−

≤≤≤<+−

≤≤

=

.,2

,114

,2

,2

1,14

,2

1,2

,14

,2

,1,4

,

2

2

2

2

MmnMM

mMnM

MmMMnM

mMn

MmMnMM

nMm

MmnMnm

mnH ( 3.36)

Figure 3.19: The windowing filter H used in the Block HM method

By simultaneously shifting the blocks in both the horizontal and the vertical directions in steps of M/2 + 1 pixel locations (as shown in Fig.3.18), and adding pixel intensity values in overlapping regions, the final image Z is achieved. The intensity changes are smoothed out across adjacent blocks. The blocks are overlapped to avoid edges and patches from appearing in the illumination compensated image. The window H is defined such that the sum of the weights in the overlapping region is 1. Fig.3.20 shows examples of images taken under different illumination directions and the corresponding intensity normalized images using BHM method. The reference image was kept the same for all the images.

49

Figure 3.20: Images before and after intensity normalization with BHM. (a) Input images, (b)

corresponding output images after applying BHM

4. Local Binary Patterns (LBP) The local binary pattern (LBP) is a non-parametric operator which describes the local spatial structure of an image. Ojala et al. [86] first introduced this operator and showed its high discriminative power for texture classification. At a given pixel position (xc, yc), LBP is defined as an ordered set of binary comparisons of pixel intensities between the center pixel and its eight surrounding pixels, Fig.3.21.

Figure 3.21: The LBP operator

The decimal form of the resulting 8-bit word (LBP code) can be expressed as follows:

( ) ( )∑=

−=7

0

2,n

ncncc iisyxLBP ( 3.37)

where ic corresponds to the grey value of the center pixel (xc, yc), in to the grey values of the 8 surrounding pixels, and function s(x) is defined as:

( )⎩⎨⎧

<≥

=.0,0,0,1

xx

xs ( 3.38)

By definition, the LBP operator is unaffected by any monotonic gray-scale transformation which preserves the pixel intensity order in a local neighborhood. Note that each bit of the LBP code has the same significance level and that two successive bit

(b) (a)

50

values may have a totally different meaning. Actually, The LBP code may be interpreted as a kernel structure index. Later, Ojala et al. [87] extended their original LBP operator to a circular neighborhood of different radius size. Their LBPP,R notation refers to P equally spaced pixels on a circle of radius R. In [80], the LBP8;2 operator illustrated in Fig.3.22 is used to preprocess the input image before providing it to the face authentication algorithms: the face is represented with its texture patterns given by the LBP operator at each pixel location as shown in Fig.3.23.

Figure 3.22: The extended LBP operator with (8,2) neighborhood. Pixel values are interpolated for

points which are not in the center of a pixel.

Figure 3.23: Original image (left) processed by the LBP operator (right).

The conducted experiments in [80] show that the LBP operator provides a texture representation of the face which improves the performances of two different face authentication classifiers (PCA-LDA and HMM) as compared to histogram equalization. Moreover, obtained results are comparable when using the GROSS algorithm proposed in [85] and described previously in section 3.2, on the same databases [88] and removes the need for parameter selection.

5. Laplacian of Gaussian In order to remove the influence caused by illumination variations, [91] apply image processing which is a combination of histogram equalization, Laplacian of Gaussian filter and contrast adjustment. Histogram equalization is applied first to enhance biased contrast image in which some pixels are concentrated on a narrow range of the pixel intensity. In order to remove the information of pixel intensity while reserving local features that are useful to recognition, [91] apply Laplacian of Gaussian (LoG) filter following the histogram equalization. The equation of the 2D LoG function centered on zero and with Gaussian standard deviation σ has the form:

LBP operator

51

( ) eyxyxyxLoG 2

22

22

22

4 211, σ

σπσ

+−

⎥⎦

⎤⎢⎣

⎡ +−−= ( 3.39)

Fig.3.24 shows the effects of the three steps of this approach. As shown in Fig.3.24 (c), local features of each face are reserved and the influence of lighting is almost removed. However, contrast of processed image is biased to a certain range. In order to improve this contrast and emphasize the local features, contrast adjustment is applied. The final processed image is shown in Fig.3.24 (d).

Figure 3.24: Effects of applying the image processing steps proposed by [91]

6. Preprocessing Chain Approach (CHAIN) The work in [109] proposes a preprocessing chain that incorporates a series of steps chosen to counter the effects of illumination variations, local shadowing and highlights, while still preserving the essential elements of visual appearance for use in recognition. The chain consists of the following consecutive steps:

1. Gamma Correction 2. Difference of Gaussian (DoG) 3. Masking 4. Contrast Equalization

Gamma Correction replaces gray level I with γI , where γ ∈ [0, 1] is a user-defined parameter. It has the effect of enhancing the local dynamic range of the image in dark or shadowed regions, while compressing it in bright regions and at highlights. Difference of Gaussian (DoG) Filtering. Gamma correction does not remove the influence of overall intensity gradients such as shading effects. Shading induced by surface structure is potentially a useful visual cue but it is predominantly low frequency spatial information that is hard to separate from effects caused by illumination gradients. High pass filtering can be applied to remove the effects of this shading. Moreover, suppressing the highest spatial frequencies reduces aliasing and noise, and in practice it often manages to do so without destroying too much of the underlying signal on which recognition needs to be based. DoG filtering is a convenient way to obtain the resulting bandpass behavior. Fine spatial detail is critically important for recognition so the inner (smaller) Gaussian is typically quite narrow (σ0 ≤ 1 pixel), while the outer one might have σ1 of 2–4 pixels or more, depending on the spatial frequency at which low frequency information becomes misleading rather than informative. The work in [109] finds that σ1

(c) LoG (a) Illuminated Image (b) HE (d) Contrast Adjustment

52

≈ 2 typically gives the best results, but values up to about 4 are not too damaging and may be preferable for datasets with less extreme lighting variations. Masking. If a mask is needed to suppress facial regions that are felt to be irrelevant or too variable, it should be applied at this point. Otherwise, either strong artificial gray-level edges are introduced into the convolution, or invisible regions are taken into account during contrast equalization. Contrast Equalization. The final step of the CHAIN approach is to globally rescale the image intensities to standardize a robust measure of overall contrast or intensity variation. It is important to use a robust estimator because the signal typically still contains a small admixture of extreme values produced by highlights, garbage at the image borders and small dark regions such as nostrils. A simple and rapid approximation based on a two stage process is applied to accomplish this:

( ) ( )( )( )( )[ ] aayxImean

yxIyxI 1

,

,,′′

= ( 3.40)

( ) ( )( )( )( )[ ] aayxImean

yxIyxI 1

,,min

,,′′

=τ

( 3.41)

Here, a is a strongly compressive exponent that reduces the influence of large values, τ is a threshold used to truncate large values after the first phase of normalization, and the mean is over the whole (unmasked part of the) image. The resulting image is now well scaled but it can still contain extreme values. To reduce their influence on subsequent stages of processing, a nonlinear function is finally applied to compress over-large values. In [109], the hyperbolic tangent I(x, y) = τ tanh(I(x, y)/τ), is used thus limiting I to the range (−τ, τ). In [109], the default settings of the various parameters of CHAIN approach are summarized in Table-3.1. Moreover, it’s found that the CHAIN approach gives similar results over a broad range of parameter settings, which greatly facilitates the selection of parameters. Fig.3.25 shows the effect of applying CHAIN approach on various illuminated faces from Extended Yale B database.

Table 3.1: Default parameter settings for CHAIN approach

Procedure Parameter Value

Gamma Correction γ 0.2

DoG Filtering σ0 1

σ1 2

Contrast Equalization

α 0.1

τ 10

53

Figure 3.25: Examples of images of one person from the Extended Yale-B frontal database. The

columns respectively give images from subsets 1 to 5.

3.4 Comparative Studies & Best-of-Literature Approaches Here, we give brief description about nine different comparative studies presented in literature. For each comparative study, we state the illumination normalization approaches that are compared, the database(s) they compared on and the result(s) of the study. At the end, we give a summarization about the results of these studies and introduce some relationships between these studies as a try to deduce the best illumination normalization approaches in literature.

1. Study 1 The study in [71] empirically compares five image-processing-based approaches for illumination insensitive face recognition, which are:

1. Histogram Equalization (HE) 2. Histogram Matching (HM) 3. Logarithmic Transform (LOG) 4. Gamma intensity correction (GIC) 5. Self-Quotient Image (SQI)

These approaches are compared on the CMU-PIE database [102], the FERET database [120] and the CAS-PEAL database [103]. The PCA followed by LDA approach is used for face recognition. The results on the lighting subsets of the three databases show that HM gives the best results among the four other approaches over FERET and CAS-PEAL, while comes after GIC over CMU-PIE.

2. Study 2 The work in [84] proposes the following three illumination normalization approaches:

1. Gamma Intensity Correction (GIC) method. 2. Region-based strategy combining GIC and the Histogram Equalization (HE).

(a) Illuminated face images (b) After applying CHAIN approach

54

3. Quotient Illumination Relighting (QIR) method. Experiments are then conducted to compare the following approaches:

1. HE: Histogram equalization globally over the images. 2. RHE: Region-based Histogram equalization. 3. GIC: Gamma Intensity Correction globally. 4. RGIC: Region-based GIC. 5. GIC+RHE: perform RHE after GIC. 6. RGIC+RHE: perform RHE after RGIC. 7. RHE+RGIC: perform RGIC after RHE. 8. HE+RGIC: perform RGIC after HE. 9. QIR: Quotient Illumination Relighting.

The above approaches are empirically compared on the Yale B database [64] and Harvard database [106]. The simplest normalized correlation, i.e., cosine of the angle between two image vectors, is exploited as the distance measurement. And for all experiments, classification is performed using the nearest neighbor classifier. The results show that the proposed QIR approach gives the best results on both databases. However, the terrific performance of QIR is based on the assumption that the lighting modes of the images are known or can be estimated. This is a strong constraint in a practical application system. In contrast, the RHE combined with RGIC methods are more general and practical to be exploited in a recognition system efficiently, since they need not the illumination estimation procedure.

3. Study 3 The work in [96] empirically compares the following 12 approaches for illumination insensitive face recognition:

1. Correlation method [93]. 2. Eigenface method [24]. 3. Eigenface method without the first three principle components [32]. 4. Nearest Neighbor using 9 training images per subject [93]. 5. Linear Subspace [94]. 6. Gradient angles [97] 7. Cones-attached [95]. 8. Cones-cast [95]. 9. Harmonic images (no cast shadow) [96]. 10. Harmonic images-cast (with cast shadows) [96]. 11. Nine point of lights (9PL) using simulated images [96]. 12. Nine point of lights (9PL) using real images [96].

The above approaches are empirically compared on the Yale B database [64]. In all the experiments on the last four methods, the actual recognition algorithm is straightforward.

55

For each test image, the usual L2 (Euclidean distance) is computed between the image and all the subspaces. The identity associated with the subspace that gives the minimal distance to the image is declared to be its identity. The results show that only two out of the 12 approaches give 100% recognition rate, they are the Cones-cast and the Nine point of lights (9PL) using real images.

4. Study 4 The work in [85] introduces a simple and automatic image-processing-based approach for illumination normalization in face recognition, we call it GROSS approach. It empirically compares the proposed approach with two other approaches which are Histogram Equalization (HE) and Gamma Correction (GAMMA). These approaches are compared on the Yale B database [64] and the CMU PIE database [102]. In all experiments, the recognition accuracies are reported for two algorithms: Eigenfaces (Principal Component Analysis (PCA)) and FaceIt [104], a commercial face recognition system from Identix. The results show the superiority of the proposed approach (GROSS) over the HE and GAMMA. Fig.3.26 gives a summarization about the above four studies showing the best approaches from each study.

Figure 3.26: Summarization for the first four comparative studies. For each study, it shows the

normalization approach to be compared, the face databases and the face recognition approaches in addition to the best normalization approaches from each study (grayed boxes)

Study 1 Approaches: HE, HM, GIC, LOG, SQI Databases: CMU-PIE, FERET, CAS-PEAL Recog. Method: PCA LDA

HM, GIC

Study 2 Approaches: HE, RHE, GIC, RGIC, GIC+RHE, RGIC+RHE, RHE+RGIC, HE+RGIC, QIR Databases: Yale B, Harvard Recog. Method: Cosine NN

Study 3 Approaches: Correlation, Eigenface, Eigenface w/o 1st 3 PCs, NN (9 images), LS, GA, Cones-attached, Cones-cast, Harmonic (no cast), Harmonic-cast, 9PL (simulated), 9PL (real) Database: Yale B Recog. Method: Euclidean Distance

Study 4 Approaches: GROSS, HE, GAMMA Databases: Yale B, CMU PIE Recog. Methods: Eigenfaces, FaceIt

QIR, RHE+RGIC 9PL (real), Cones-cast GROSS

HE: Histogram Equalization LS: Linear Subspace RHE: Region-based Histogram Equalization GA: Gradient Angles HM: Histogram Matching NN: Nearest Neighbor GIC: Gamma Intensity Correction 9PL: Nine Point of Lights RGIC: Region-based Gamma Intensity Correction LOG: Logarithmic Function QIR: Quotient Illumination Relighting SQI: Self Quotient Image

56

5. Study 5 The work in [66] proposes an illumination normalization approach based on applying the Single Scale Retinex (SSR) followed by histogram matching (HM) to bring all the images to the same dynamic range of intensity, we notice it by SSR HM. It then compares the proposed approach with the histogram matching (HM) approach for illumination insensitive face recognition. The Yale B face database [64] is used for face recognition experiments. Support Vector Machine (SVM) is used as the learning scheme [92] for the face recognition experiments. The results show that using the proposed approach, SSR HM, as an illumination normalization approach gives better recognition rates than using the HM alone.

6. Study 6 The study in [72] empirically compares the following seven image-processing-based approaches, four global and three local, for illumination insensitive face recognition:

1. Gamma Intensity Correction (GIC). 2. Histogram Equalization (HE). 3. Histogram Matching (HM). 4. Normal Distribution (NORM). 5. Local Histogram Equalization (LHE). 6. Local Histogram Matching (LHM). 7. Local Normal Distribution (LNORM).

These approaches are compared on the Extended Yale B database [96] and the Yale B database [64]. In all experiments, the nearest neighbor classifier using the Euclidean distance between the images is used for recognition. The results of the first experiment on the Extended Yale B database show that LNORM gives the best results among the six other approaches. The second experiment on the Yale B database is performed to be able to compare the results of the LNORM with results found in the literature. When using LNORM with window size 5 × 5 and training with five randomly images from Subset 1, the results in [72] show that:

1. LNORM outperforms all the nine illumination normalization approaches appear in Study 2 [84].

2. LNORM performs better than 10 approaches from Study 3 [96], while it gives comparable results to the Study 3’s best two approaches which are Cones-cast and the Nine point of lights (9PL) using real images.

3. LNORM perform better than Harmonic Image Exemplars approach [98]. As a result, we can find that LNORM is better than a total of 24 approaches. These approaches are listed below in Table-3.2 in conjunction with the corresponding Study of each approach.

57

Table 3.2: List for 24 illumination normalization approaches that LNORM perform better than them

I Approach Study No. 1 HE: Histogram equalization globally over the images Study 2 2 RHE: Region-based Histogram equalization Study 2 3 GIC: Gamma Intensity Correction globally Study 2 4 RGIC: Region-based GIC Study 2 5 GIC+RHE: perform RHE after GIC Study 2 6 RGIC+RHE: perform RHE after RGIC Study 2 7 RHE+RGIC: perform RGIC after RHE Study 2 8 HE+RGIC: perform RGIC after HE Study 2 9 QIR: Quotient Illumination Relighting Study 2

10 Correlation method [93] Study 3 11 Eigenface method [24] Study 3 12 Eigenface without the first three principle components [32] Study 3 13 Nearest Neighbor using 9 training images per subject [93] Study 3 14 Linear Subspace [94] Study 3 15 Gradient angles [97] Study 3 16 Cones-attached [95] Study 3 17 Harmonic images (no cast shadow) [96] Study 3 18 Harmonic images-cast (with cast shadows) [96] Study 3 19 Nine point of lights (9PL) using simulated images [96] Study 3 20 Harmonic Image Exemplars [98] Study 6 21 Histogram Matching (HM) Study 6 22 Normal Distribution (NORM) Study 6 23 Local Histogram Equalization (LHE) Study 6 24 Local Histogram Matching (LHM) Study 6

7. Study 7 The work in [105] introduces the Logarithmic Total Variation (LTV) model as a preprocessing technique for face recognition under varying illumination. The proposed approach is empirically compared with the following four approaches:

1. Quotient Image (QI) [100]. 2. Quotient Illumination Relighting (QIR) proposed in Study 2 [84]. 3. Self Quotient Image (SQI) [101]. 4. Histogram Equalization (HE).

These approaches are compared on Yale B face database [64] and the CMU PIE database [102]. Then an outdoor database [105] is used for evaluating the performance under natural lighting condition. Two different face recognition approaches are used for evaluation, template matching and PCA. The results show that the proposed approach (LTV) always gives the best results among the four other normalization approaches.

58

Moreover, the results on Yale B database show that the LTV gives 100% recognition rate over all the five subsets which is better than the Harmonic Image Exemplar [98] and comparable (or may better) to the 9PL (real images) and Cones-cast from Study 3 [96] that give the same recognition rate but on four subsets only (no results are reported wherever, for the tests on subset 5). In addition, the proposed approach (LTV) reached similar results to the ones obtained by Corefaces [99] based on PCA recognition on Yale B and CMU PIE database. As a result, we can find that LTV is better than the following four normalization approaches:

1. Quotient Image (QI) [100]. 2. Quotient Illumination Relighting (QIR) proposed in Study 2 [84]. 3. Self Quotient Image (SQI) [101]. 4. Histogram Equalization (HE).

While gives comparable (or may better) results to the following three normalization approaches:

1. Nine Point of Lights (9PL) form Study 3 [96]. 2. Cones-cast from Study 3 [96]. 3. Corefaces [99].

8. Study 8 The work in [80] proposes an original preprocessing technique based on Local Binary Pattern (LBP) for illumination robust face authentication. It empirically compares the proposed approach with two other approaches which are GROSS approach from Study 4 [85] and Histogram Equalization (HE). The efficiency of the proposed approach is empirically demonstrated using both an appearance-based (LDA) and a feature-based (HMM) face authentication systems on two databases: BANCA and XM2VTS (with its darkened set). Conducted experiments show that the proposed preprocessing approach (LBP) is suitable for face authentication: results are comparable with or even better than those obtained using the GROSS approach proposed in Study 4 [85], while it removes the need for parameter selection.

9. Study 9 The work in [109] presents a simple and efficient preprocessing chain (CHAIN), described previously in section 3.3.2, that eliminates most of the effects of changing illumination while still preserving the essential appearance details that are needed for recognition. It empirically compares the proposed approach with the following approaches:

59

1. Histogram Equalization (HE). 2. Multi Scale Retinex (MSR) [89]. 3. Self Quotient Image (SQI) [101]. 4. Logarithmic Total Variation (LTV) proposed in Study 7 [105]. 5. GROSS approach proposed in Study 4 [85].

These approaches are empirically compared over three different databases, namely Face Recognition Grand Challenge version 1 experiment 4 (FRGC-104) [110], Extended Yale B [96], and CMU PIE [102]. The efficiency of the proposed approach is empirically demonstrated using Local Ternary Patterns (LTP) [109], a generalization of the Local Binary Pattern (LBP), combined with a local distance transform (DT) based similarity metric as a classifier [111]. The results show that the proposed CHAIN approach outperforms all the five approaches over the three databases. However, the LTV approach gives recognition rates marginally less than the CHAIN on Extended Yale B database and equal to it on CMU PIE database but about 300 times slower than CHAIN approach [109]. Fig.3.27 completes the summarization appear previously in Fig.3.26 about all the above nine comparative studies showing some relations between these studies in addition to the best approaches resulting from all studies. We can conclude from these nine comparative studies that among 38 different illumination-normalization approaches appear in Table-3.3, the following 7 approaches are the best for face recognition:

1. Single Scale Retinex followed by Histogram Matching (SSR HM) [66]. 2. Local Normal Distribution (LNORM) [72]. 3. Preprocessing Chain Approach (CHAIN) [109]. 4. Nine Point of Lights with real images (9PL) [96]. 5. Cones-cast [95]. 6. Corefaces [99]. 7. Local Binary Patterns (LBP) [80].

Observe that the second approach (LNORM) is proven to be better than 24 out of these 38 different illumination-normalization approaches as shown previously in Table-3.2. Four out of the above seven best-of-literature approaches will be compared later in chapter 6 together with the proposed approach in chapter 5. The comparisons are done on illuminated and non-illuminated face images in order to study both the positive and negative effects of each approach.

60

Figure 3.27: Summarization for the nine comparative studies showing some relations between these studies in addition to the final best normalization approaches from all studies (dark grayed boxes).

For each study, it shows the normalization approach to be compared, the face databases and the face recognition approaches in addition to the best normalization approaches from each study (light

grayed boxes)

Study 1 Approaches: HE, HM, GIC, LOG, SQI Databases: CMU-PIE, FERET, CAS-PEAL Recog. Method: PCA LDA

HM, GIC

Study 2 Approaches: HE, RHE, GIC, RGIC, GIC+RHE, RGIC+RHE, RHE+RGIC, HE+RGIC, QIR Databases: Yale B, Harvard Recog. Method: Cosine NN

Study 3 Approaches: Correlation, Eigenface, Eigenface w/o 1st 3 PCs, NN (9 images), LS, GA, Cones-attached, Cones-cast, Harmonic (no cast), Harmonic-cast, 9PL (simulated), 9PL (real) Database: Yale B Recog. Method: Euclidean Distance

Study 4 Approaches: GROSS, HE, GAMMA Databases: Yale B, CMU PIE Recog. Methods: Eigenfaces, FaceIt

QIR, RHE+RGIC 9PL (real), Cones-cast GROSS

HE: Histogram Equalization LS: Linear Subspace LTV: Logarithmic Total Variation RHE: Region-based HE GA: Gradient Angles HIE: Harmonic Image Exemplar HM: Histogram Matching NN: Nearest Neighbor QI: Quotient Image GIC: Gamma Intensity Correction QIR: Quotient Illumination Relighting 9PL: Nine Point of Lights RGIC: Region-based GIC LOG: Logarithmic Function LBP: Local Binary Pattern QIR: Quotient Illumination Relighting SQI: Self Quotient Image SSR: Single Scale Retinex LHE: Local Histogram Equalization LHM: Local Histogram Matching MSR: Multi Scale Retinex LNORM: Local Normal Distribution NORM: Normal Distribution CHAIN: Preprocessing Chain Appr.

Study 5 Approaches: HM, SSR HM Databases: Yale B Recog. Method: SVM

Study 6 Approaches: GIC, HE, HM, NORM, LHE, LHM, LNORM, HIE [98] Databases: Extended Yale B, Yale B Recog. Method: Euclidean Distance

HM

Study 8 Approaches: LBP, HE, GROSS Databases: BANCA, XM2VTS Recog. Methods: LDA, HMM

SSR HM LNORM, 9PL (real), Cones-cast

LBP

Study 7 Approaches: LTV, QI, QIR, SQI, HE Corefaces [99], HIE [98] Databases: Yale B, CMU PIE, Outdoor DB [42] Recog. Methods: Template Matching, PCA

LTV, 9PL (real), Cones-cast,Corefaces

Study 9 Approaches: CHAIN, LTV, SQI, MSR, GROSS, HE Databases: Ext. Yale B, CMU PIE, FRGC-104 Recog. Methods: LTP/DT

LTV

CHAIN

QIR

: Approach is compared by direct implementation : Approach is compared by published results

[ ] : Cited approach is compared by published results

61

Table 3.3: The 38 different illumination normalization approaches appear in the above nine comparative studies together with the corresponding studies numbers. (Note that the cited

approaches, from 29 to 38, are not described in details in their corresponding comparative studies)

I Illumination Normalization

Approach Study

No. I

Illumination Normalization Approach

Study No.

1 Histogram Equalization (HE) 1,2,4,

6,7,8 2 Histogram Matching (HM)

1,5,6

3 Gamma Intensity Correction (GIC) 1,2,6 4 Logarithmic Function 1 5 Normal Distribution 6 6 Gamma correction 4 7 Local HE 6 8 Local HM 6 9 Local Normal Distribution 6 10 Local Binary Pattern 8 11 Region-based HE 2,6 12 Region-based GIC 2,6

13 GIC followed by Region-based HE

2,6 14Region-based GIC followed by Region-based HE

2,6

15 Region-based HE followed by Region-based GIC

2,6 16HE followed by Region-based GIC

2,6

17 Single Scale Retinex followed by HM

5 18Quotient Illumination Relighting

2,6,7

19 Self Quotient Image 1,7 20 Quotient Image 7 21 Logarithmic Total Variation 7 22 GROSS method 4,8

23 Harmonic images (no cast shadow)

3 24Harmonic images-cast (with cast shadows)

3

25 Nine point of lights (9PL) using simulated images

3,6,7 26Nine point of lights (9PL) using real images

3

27 Multi Scale Retinex 9 28 Preprocessing Chain approach 9 29 Harmonic Image Exemplar [98] 6,7 30 Corefaces [99] 7 31 Linear Subspace [94] 3 32 Gradient Angles [97] 3 33 Cones-attached [95] 3 34 Cones-cast [95] 3 35 Correlation method [93] 3 36 Eigenface method [24] 3

37 Eigenface without the first three principle components [32]

3 38Nearest Neighbor using 9 training images per subject [93]

3

After surveying the different face recognition approaches in chapter 2 and the different illumination normalization approaches in chapter 3, we will introduce in the following chapter the detailed descriptions about the environment that we build up in order to use it for testing our proposed illumination normalization approach and the other approaches. The chapter includes descriptions about the selected face recognition methods and the selected databases that cover five different face recognition variations. The experimental results of the selected methods over each database are also introduced in this chapter. All experiments are done without applying any illumination normalization approach. These results are considered as a baseline and allow us to study the effects of any illumination normalization approach on the selected methods over each variation separately.

62

CHAPTER 4: Setup the Environment

4.1 Introduction This chapter introduces the detailed descriptions about the environment that we build up in order to use it for testing our proposed illumination normalization approach and the other approaches. The chapter includes descriptions about the selected face recognition methods and the selected databases in addition to the reasons behind these selections. The experimental results of the selected methods over each database are also introduced in this chapter. In Chapter 2, we revise the main face recognition approaches and we found that many face recognition methods fall into the holistic-based approach. One possible reason is that these methods usually utilize the face as a whole and do not destroy any information by exclusively processing only certain fiducial points which make them generally provide more accurate recognition results. Moreover, most of these holistic-based approaches are fall into two broad categories, Eigenspace-based category and frequency-based category. To be able to widely studying the effects of any preprocessing/illumination normalization approach on the two broad holistic-based categories, we chose one method under each category representing the main characteristics of this category. The two chosen methods are the Standard Eigenface method [24] which considered the core of many Eigenspace-based methods, and the Holistic Fourier Invariant Features (Spectroface) method [43] that represents the main characteristics of the frequency-based category. These two methods are compared over five different face recognition variations using suitable database(s) for each variation. These variations are divided into two-geometrical, which are translation and scaling, and three-facial, which are 3D pose, facial expressions, and non-uniform illumination. No preprocessing/illumination normalization approach is applied to both methods during the comparison. The aim of these comparisons is to establish a base that can be used for further studying the effects of any preprocessing/illumination normalization approach on each of the five variations separately using two methods representing the two broad holistic-based categories, Eigenspace-based and frequency-based. The rest of this chapter is organized as follows; section 2 describes the two face recognition methods. Section 3 describes the face databases used in comparisons and how they are prepared and configured for training and testing. Section 4 contains the comparison results over each of the five variations in addition to some observations about these results. Finally, the chapter summary is presented in section 5.

63

4.2 Methods Descriptions

4.2.1 Standard Eigenface Method The standard Eigenface method [24] approximates the face images by lower dimensional feature vectors. In training phase, the projection matrix (W ∈ RN × M) – which achieves the dimensional reduction – is obtained using all the database face images, where N and M denote for the dimension of image and feature vector respectively. Eigenvectors and eigenvalues are computed on the covariance matrix of the training images. The M highest eigenvectors are kept – which form the projection matrix W. Finally, the known individuals are projected into the face space (pk), where p denotes for the feature vector and k denotes for person number. These feature vectors are stored in addition to the mean face. The recognition process works as in Fig.4.1: a preprocessing module transforms the face image into a unitary vector using a normalization module [24]. Then, subtract the mean face from this unitary vector. The resulting vector I is projected using the projection matrix W. This projection corresponds to a dimensional reduction of the input, starting with vector I in RN (where N is the dimension of the image vector) and obtaining the projected vector q in RM, with M<<N. Then, the similarity of q with each of the reduced vectors (pk) is computed using Euclidean distance. The class of the most similar vector is the result of the recognition process, i.e. the identity of the face.

Figure 4.1: Standard Eigenface block diagram

In this work, we use a MATLAB implementation for the standard Eigenface method publicly available on the web and link them with C# project to facilitate the training and testing steps.

4.2.2 Spectroface Method Spectroface [43] representation is based on the wavelet transform and holistic Fourier invariant features. Wavelet transform is applied to the face image to eliminate the effect of facial expressions. Also, decomposing the face image will reduce its resolution, which

64

in turn, reduces the computation of the recognition system. After decomposing the face image, the holistic Fourier invariant features (Spectroface) are extracted from the low frequency subband image by applying Fourier transform twice. The first FFT is applied to the low frequency subband to make it invariant to the spatial translation. Then the second Fourier transform is applied to the polar transformation of the result to make it invariant to scale and on-the-plane rotation. The block diagram of the Spectroface representation is shown in Fig.4.2. In the recognition stage, the probe image is translated into Spectroface representation, and then matches it – using Euclidean distance – with those referent images stored in the gallery to identify the face image. Our implementation for this method is done using C++ language. Different from implementation in [43], we do not use the two preprocessing steps, namely histogram equalization and intensity normalization, as when we use them over the ORL [112] database, the recognition rates are always decreased. One possible reason for this decreasing is that there are no illumination changes in the ORL database. Table-4.1 shows the recognition rates obtained by our implementation against those mentioned in [43]. It is shown that our implementation gives approximately the same results over the ORL database. However, in Yale database, the results of our implementation are approximately the same as those of [43] except when the testing includes the two non-uniform-illuminated images, the results are decreased significantly. The possible reason for this is that we don’t apply the two preprocessing steps mentioned in [43].

Figure 4.2: Spectroface block diagram

65

Table 4.1: Comparison between results in Lai et al. [43] and in our implementation (better rates are

italic)

ORL DB (rank 1) Yale DB –not cropped (rank 1) 1

training image

3 training images

1 training image 2 training images without 2

illum with 2 illum without 2 illum with 2 illum

Method in [43] 76.38% 94.64% 95.0% 91.33% 99.05% 95.56%

Our Implementation 78.61% 95.00% 93.33% 82.00% 99.05% 83.70%

4.3 Databases Descriptions

4.3.1 UMIST database This database [46] is used for studying the pose variation. It consists of 565 images of 20 subjects. The pose varies from frontal to profile with slight variations in the tilt of the head as well. Each image is cropped to contain only the face and is 112 × 92 pixels in grayscale. The set of subjects contains 16 males and 4 females. There 8 subjects with glasses. We use only 300 images (15 images per subject) with significant pose effects up to 80˚. We do not consider the profile images in this study since it is shown by our experiment that these profile images should be treated as separate cases in order to recognize them correctly, which is out of the scope of this comparison. Each image is flipped around y-axes to represent the pose changes in both directions. The resulting 600 images are divided into training and testing set. The training set consists of 200 images with 10 images/subject chosen according to the pose as follows: two normal images, two images with ±10~15˚, two images with ±30˚, two images with ±45˚, and two images with ±75~80˚. The testing set consists of 400 images with 20 images per subject chosen to cover all pose variations from frontal to 80˚. An example for training and testing sets is shown in Fig.4.3.

4.3.2 Yale B database We use the Yale B database [64] – frontal images only – for studying the non-uniform illumination variation. It consists of 10 subjects (9 males and 1 female) each with 65 (64 illuminations + 1 normal) images. The 64 illuminated images are acquired in about two seconds, so there are only small changes in head pose and facial expressions. Only 46 out of these 65 images are divided into four subsets according to the angle the light source direction makes with the camera’s axis (12˚, 25˚, 50˚, and 77˚).

66

Figure 4.3: UMIST: selected images for one subject in both training and testing sets

We use only these four subsets. All images are cropped to include only the head portion. Subject’s images on each subset are divided into training and testing as follows: subset 1 is divided into 3 training images and 5 testing images; each of subset 2, 3, and 4 is divided into 4 training images and 8, 8 and 10 testing images, respectively. As a result, the training set consists of 15 images × 10 subjects while the testing set consists of the remaining 31 images × 10 subjects. Fig.4.4 shows the training images, randomly selected, in each subset and the light angle of each image.

Figure 4.4: Yale B: Training images for one subject in the four subsets with the light angle of each

image

4.3.3 Grimace database We use the Grimace database [115] for studying the facial expression variation. It consists of 18 subjects with 20 images for each one. There are 2 females and 16 males. The images are with major facial expression variation and very little image lighting variations. There is no hairstyle variation as the images were taken in a single session. There are no special configurations for this database. Fig.4.5 shows sample images from this database.

Subset 1 Subset 2 Vertical

(+ 10°, - 10°) Normal Vertical

(+ 25°, - 25°)Horizontal

(+ 20°, - 20°)Vertical

(+ 70°, - 70°) Horiz & Ver (+ 65°, ± 35°)

Subset 4 Subset 3 Vertical

(+ 50°, - 50°)Horizontal

(+ 45°, - 35°)

normal 10 - 15° pose

30° pose

45° pose

75-80° pose

(a) Training images in one direction only

(b) Testing images in one direction only

67

4.3.4 JAFFE database This database [65] is used for studying the facial expression variation. It contains 213 images of 10 Japanese female models obtained in front of a semi-reflective mirror. Each subject was recorded three or four times while displaying the six basic emotions and a neutral face. The camera trigger was controlled by the subjects. The resulting images have been rated by 60 Japanese women on a 5-point scale for each of the six adjectives. The rating results are distributed along with the images. Fig.4.6 shows example images for one subject along with the majority rating. The images are originally printed in monochrome and then digitized using a flatbed scanner. In our comparisons, all images are cropped using the face detection function in the Intel OpenCV library [116] to contain only the head portion, examples are shown in Fig.4.5.

(a) Notterdam

5 images × 70 subjects

(b)

Yale 8 images × 15

subjects

(c) JAFFE

213 images 10 subjects

(d) Grimace

20 images × 18 subjects

Figure 4.5: Selected images for one subject from each database used for studying the facial expression variation

Figure 4.6: Example images from JAFFE database. The images in the database have been rated by

60 Japanese female subjects on a 5-point scale for each of the six adjectives. The majority vote is shown underneath each image (with natural being defined through the absence of a clear majority)

68

4.3.5 Nott-faces database This database [117] is used for studying the facial expression variation. It consists of 70 males each with 7 images, 4 frontal images with facial expressions, 1with bathing cap and 2 in 3/4 profile. The images are with non-fixed background. There are some translation of face in image, very small head scale variation and small linear uniform illumination. We exclude the 2 images with 3/4 profile from our test and work only on the 5 frontal images which form a total of 5 images × 70 subjects. All images are cropped using the face detection function in the Intel OpenCV library [116] to contain only the head portion. Fig.4.5 shows sample images from this database.

4.3.6 Yale database Yale database [32] consists of 15 subjects (14 males and 1 female) each with 11 images, 1 normal image, 3 non-uniform illuminated images and 7 images with facial expressions. We exclude the 3 illuminated images from our test which form a total of 8 images × 15 subjects. This allows us to use this database for studying the facial expression variation. All images are cropped using the face detection function in the Intel OpenCV library to contain only the head portion, examples are shown in Fig.4.5.

4.3.7 Face 94 database We use the Face 94 database [118] for studying the scale variation. It consists of 152 subjects with 20 images for each one. There are 19 females and 133 males. The images are with fixed background with neither head scale nor illumination variations. There are small expression variations due to the speaking of subject during the acquisition phase. Only the first 15 images for each subject are used to form the training and testing sets, 5 for training and 10 for testing. For each subject in the training set, two images are scaled by factors ±8% and two are scaled by factors ±17% and the remaining image is left without changes. In the testing set, the ten images of each subject are scaled by 10 different factors: ±3%, ±6%, ±9%, ±12.5%, and ±15.5%. Fig.4.7 shows the training and testing sets of one subject. Note that after scaling up, the image is cropped to be 128 × 128 while after scaling down, the boundary pixels are repeated to reach 128 × 128.

69

3% 6%

9%

12.5%

15.5%

Scale

Up

Scale Down

(b) Testing set

Figure 4.7: Face 94: 15 images for each subject in both training and testing sets

4.4 Experimental Results In this section, we describe the five different comparisons between the two methods, Eigenface and Spectroface. For each comparison, we state the used databases(s) and describe the training and testing methodologies that are applied on each database, and then state the results of comparison in addition to some observations about these results. In all comparisons, only two preprocessing steps are applied to both methods:

1. Convert each image to grayscale. 2. Resize each image to a fixed size of 128 × 128.

This allow us to establish a base that can be used for further studying the effects of any preprocessing/illumination normalization approach on each of the five variations separately using two methods representing the two broad holistic-based categories, Eigenspace-based and frequency-based categories. In both methods, we use single feature vector for each training image rather than using the average feature vector for each subject since it gives – by our experiments – better results over all variations. To indicate how much significant is certain method better than another, we calculate the difference between their average recognition rates over all training cases.

4.4.1 Pose Variation

1. Training and Testing Methodologies We use the UMIST database for studying this variation. First two columns in Table-4.2 describe both the training cases and the number of training images/subject in each case. There are 12 training cases differ in the degree of

0% +8%

+17%

-8%

-17%

(a) Training set

70

rotation in the chosen images. Note that the normal images are common in all cases. The cases are chosen to cover all possible combinations for training by both four and six images/subject. This is in addition to training by the two normal images per subject and by all the training images up to 75˚ (10 images per subject). The testing is done using all the 400 images of the testing set.

2. Results We observe the following results from Table-4.2: First, concerning the comparison between both methods, the Spectroface method gives better recognition rates than Eigenface method in 10 out of 12 training cases with average difference of 4.6%. This means that for the 3D pose variation, the Spectroface method outperforms the Eigenface method. Second, concerning the best training case, the top four rates in both methods show that there is no significant difference between training by all the five angles – namely 0˚, 10˚, 30˚, 45˚, and 75˚ (10 images/subject) and training by three angles, 0˚, 75˚, and an in-between angle [10˚, 45˚], (six images/subject). This means that to achieve the best recognition rate over poses up to 75-80˚, the system should be trained by normal image + 75˚ pose + an in-between angle [10˚, 45˚].

Table 4.2: Pose Variation: recognition rates over 12 training cases (top four rates in each method are italic)

Training Case # train/subject Eigenface Spectroface normal only 2 64.0 48.0 normal + 10˚ 4 68.5 67.3 normal + 30˚ 4 75.0 76.0 normal + 45˚ 4 87.0 89.0 normal + 75˚ 4 85.5 89.0 normal + 10˚ + 30˚ 6 74.0 74.5 normal + 10˚ + 45˚ 6 84.5 90.0 normal + 10˚ + 75˚ 6 87.5 94.5 normal + 30˚ + 45˚ 6 85.0 90.3 normal + 30˚ + 75˚ 6 88.5 94.8 normal + 45˚ + 75˚ 6 87.5 95.0 normal+10˚+30˚+45˚+75˚ 10 88.0 95.0

4.4.2 Facial Expressions Variation

1. Training & Testing Methodologies We use four different databases for studying this variation – namely Grimace [115], Yale [32], JAFEE [65], and Nott-faces [117]. First four columns in Table-4.3 describe the training cases for each database in addition to the number of both training and testing images per subject in each case. There are three

71

training cases for each database. In each case, the testing is done using all other images that are not included in the training. In Nott-faces, the first two cases are tested two times, one without the capped images and other with them. In Eigenface method, we try to improve its results in facial expressions variation by first applying the wavelet transform on the original image, and then compute the Eigenface from the resulting low subband. This low subband contains less information about facial expressions which usually founded in LH and HL subbands. All previous training & testing cases are applied to this low subband to be able to study the effect of it on the recognition rates.

2. Results We observe the following results from Table-4.3: First, concerning the comparison between both methods, the Spectroface method gives better recognition rates than the Eigenface on Original images in all the 14 training cases with average difference of 5.4%.

Table 4.3: Expressions Variation: recognition rates over four databases with two Eigenface tests

Database Training Case

(images/subject) # train × #subject

# test × #subject

Spectro-face

Eigenface on

Original on

Wavelet

Yale normal only 1 × 15 7 × 15 92.4 81.9 81.9 normal + 2 expressions1 3 × 15 5 × 15 98.7 96.0 96.0 normal + 3 expressions 4 × 15 4 × 15 98.3 96.7 98.3

Grimace normal only 1 × 18 19 × 18 100 96.2 96.8 normal + 2 expressions 3 × 18 17 × 18 100 97.1 97.1 normal + 4 expressions 5 × 18 15 × 18 100 96.7 96.7

JAFFE normal only 1 × 10 2032 93.1 84.2 86.2 normal + 2 expressions 3 × 10 183 97.8 90.2 90.7 normal + 4 expressions 5 × 10 163 97.6 89.6 89.6

Not

t-fa

ces with

cap.

normal only 1 × 70 4 × 70 59.6 55.4 55.7 normal + 1 expression 2 × 70 3 × 70 60.0 55.7 56.7 normal + 2 expressions 3 × 70 2 × 70 50.7 47.9 48.6

w/out cap.

normal only 1 × 70 3 × 70 76.2 69.0 70.5 normal + 1 expression 2 × 70 2 × 70 84.3 76.4 78.6

Second, in Eigenface method, the last two columns show that computing the Eigenface from the low subband of the wavelet transform gives better results than computing it from the original image directly. The results are better in 9 training cases with average

1 Normal + N expression(s):- means that we train with normal image + N images each contains single expression – randomly selected. 2 Number of testing images for all subjects, since in JAFFE database, each subject has different number of images.

72

difference of 1.2% and are equal in the remaining cases. However, the Spectroface method gives better recognition rates than the Eigenface on Wavelet also in all the 14 training cases with average difference of 4.7%. Thus, in facial expression variation, it is better to use the wavelet low subband as it contains less information about facial expressions which usually founded in LH and HL subbands. However, applying the frequency-based method on the low subband of the wavelet transform is much better than applying the PCA-based method on it. One possible reason is that the wavelet low subband contains information about the frequencies’ positions, and since the changing in facial expression causes changes in pixels’ positions within the face, thus any information about pixels’ positions will contain also information about facial expressions. As a result, applying the PCA (Eigenface) on the wavelet low subband directly will still affected by some information about facial expressions. In contrast, applying the Fourier transform on the wavelet low subband will eliminate the information about pixels’ positions contained in this subband which leads to reduce the information about facial expressions. Briefly, for the facial expressions variation, the Spectroface method outperforms both Eigenface on original image and Eigenface on the low subband of the wavelet transform. However, applying Eigenface on the low subband is better than applying it on the original image.

4.4.3 Non-Uniform Illumination Variation

1. Training & Testing Methodologies We use the Yale B database – frontal images only – for studying this variation. First two columns in Table-4.4 describe the training cases and their corresponding subsets. There are 25 different training cases. Note that the normal image is common in all cases. We have four elementary subsets – namely subset 1, 2, 3, and 4 shown in the table to the left. Each elementary subset composed of training by the normal image and either the vertical lighting, horizontal lighting or both of them. Also, we have seven combinations of the elementary subsets, shown in the table to the right, where subset 1 is essential in all of them as it contains the lowest illumination. Each combination composed of training by the normal image and either the vertical lighting or the vertical and the horizontal lighting. The testing is done using all the 31 × 10 images of the testing set.

2. Results Concerning the comparison between both methods, Table-4.4 shows that the Spectroface method gives better recognition rates than the Eigenface method in all the 25 training

73

cases with average difference of 9.0%. This means that for non-uniform illumination variation, the Spectroface method outperforms the Eigenface method.

Table 4.4: Illumination Variation: recognition rates over 25 training cases (top three rates in each method are italic)

Four Elementary Subsets

Sub-sets

Training Case (train.

images/subject)

Eigen-face

Spectroface

1 nor only 45.8 48.4 nor + 2 ver 48.7 52.3

2 nor + 2 ver 54.9 57.7 nor + 2 hor 50.7 58.1 nor + 2 ver + 2 hor 55.2 61.0



nor: normal

ver: vertical hor: horizontal

Seven Combinations

Sub-sets

Training Case (train.

images/subject)

Eigen-face

Spectroface

1, 2 nor + 4 ver 57.7 58.7 nor + 4 ver + 2 hor 59.7 60.7



1, 2, 3nor + 6 ver 61.0 66.5 nor + 6 ver + 4 hor 61.0 71.6



1, 2, 3, 4

nor + 8 ver 57.4 71.0 nor + 8 ver + 6 hor 55.5 77.1

4.4.4 Translation Variation

1. Training & Testing Methodologies We use the previous six databases for studying this variation – namely UMIST, Grimace, Yale, JAFEE, Nott-faces, and Yale B. For training, we chose one training case from each of the six previous databases, usually the one that gives best results in one or both methods, as shown in Table-4.5. Thus, we have 6 training cases with the same previous results.

Table 4.5: Translation Variation: chosen cases from the six databases and their recognition rates

Database Training Case Eigenface Spectroface UMIST normal + 45˚ + 75˚ 87.5 95.0 Grimace normal + 2 expressions 97.1 100.0 Yale normal + 3 expressions 96.7 98.3 JAFFE normal + 2 expressions 90.2 97.8 Nott-faces normal + 1 expression 76.4 84.3 Yale B Subsets 1, 2, 3 normal + 6 vertical 61.0 66.5

The testing is applied two different times; first, by translating with circulation in which the output pixels after translation are circulated to fill the empty pixels in the opposite direction. Second, translating without circulation in which the empty pixels after

74

translation are filled by fixed color (gray in our case). In both times, each test image is translated with 2, 4, 6, and 8 pixels in each of the four directions which give 16 new recognition rates for each training case. Then, the decreasing in recognition rates is calculated. Finally, the average decreasing for each translating value is calculated over the four directions. Examples for translating are shown in Fig.4.8.

2. Results First, in translating with circulation, Table-4.6 shows that the recognition rates of Eigenface are decreased significantly. On other hand, the recognition rates of Spectroface are not affected by these translations, as the maximum decrease value in all the 24 testing cases is 0.8%.

O riginal Right by 2

Right by 4

Right by 6

Right by 8

(a) Translating with circulation

(b) Translating

without ci rculation

Figure 4.8: Translation Variation: example for translating with and without circulation

Table 4.6: Translation Variation: average decreasing in the recognition rates of both methods after translating with circulation in the four directions

(a) Eigenface Method (b) Spectroface Method

Database Translation Value 2 4 6 8

UMIST 1.7 4.6 10.5 20.7 Grimace 0.1 1.3 11.7 29.9 Yale 2.1 10.5 19.6 31.7 JAFFE 2.9 11.9 20.4 31.5 Nott-faces 4.1 14.3 27.7 39.6 Yale B 2 6.1 13.6 18.8

Average 2.2 8.1 17.3 28.7


UMIST 0.0 0.0 0.0 0.0 Grimace 0.0 0.0 0.0 0.0 Yale 0.0 0.0 0.0 0.0 JAFFE 0.8 0.0 0.8 0.0 Nott-faces 0.0 0.0 0.0 0.0 Yale B 0.0 0.0 0.0 0.0

Average 0.1 0.0 0.1 0.0 Second, in translating without circulation, Table-4.7 shows that the recognition rates in both methods are decreased. However, the decreasing in Eigenface is much more significant than in Spectroface – see the average row. As a result, it is clear that the Spectroface method is more robust against the translation variation than the Eigenface method.

75

Table 4.7: Translation Variation: average decreasing in the recognition rates of both methods after translating without circulation in the four directions

(a) Eigenface Method (b) Spectroface Method



Average 2.1 9 18.4 31.2


UMIST 0.6 1.8 4.5 9.2 Grimace 0.1 2.1 7.6 12 Yale 0 0 0.4 1.2 JAFFE 1.2 1.4 3.5 5.3 Nott-faces 0 1.8 6.6 13.2 Yale B 0.1 2.3 7.5 13.3

Average 0.3 1.6 5 9

4.4.5 Scaling Variation

1. Training & Testing Methodologies We use the Face 94 database for studying this variation. There are seven different training cases according to the scaling factors of the chosen images, as shown in Table-4.8:

Table 4.8: Scaling Variation: description of the training cases

Training Case # train/ subject Description normal only 1 normal image only normal + up8 2 normal image & scaled up image with factor 8% normal + down8 2 normal image & scaled down image with factor 8% normal + up8 + down8 3 normal image, scaled up & scaled down with factor 8% normal + up17 2 normal image & scaled up image with factor 17% normal + down17 2 normal image & scaled down image with factor 17% normal + up17 + down17 3 normal image, scaled up & scaled down with factor 17%

The testing is done using all the images in the testing set. For each training case, the testing is done two times, before and after scaling, in order to record the decreasing in recognition rates after scaling all testing images, see Table-4.9.

2. Results Concerning the comparison between both methods, Table-4.9 shows that the Eigenface method gives better results (less decreasing in recognition rates) than the Spectroface method in six out of the seven used training cases with average difference 2.7%. This means that for the scaling variation, the Eigenface method outperforms the Eigenface method.

76

Table 4.9: Scaling Variation: decreasing in recognition rates after scaling all images in the testing set

Training Case Eigenface Spectroface normal only 14.6 19.1 normal + up8 6.6 7.7 normal + down8 9.8 12.8 normal + up8 + down8 0.7 0.7 normal + up17 5.8 8.0 normal + down17 8.7 13.1 normal + up17 + down17 0 0.9

4.5 Summary In this chapter, we introduce a comparison between two holistic-based face recognition methods chosen to represent the two broad categories of the holistic-based approach – namely Standard Eigenface method from the PCA-based category and Spectroface from the frequency-based category. Seven databases, ranged from small to medium size, are used to compare the two methods against five main variations separately using suitable database(s) for each variation. All comparisons are applied without using any preprocessing/illumination normalization approaches. The aim of these comparisons is to establish a base that can be used for further studying the effects of any preprocessing/illumination normalization approach on each of the five variations separately using two methods representing the two broad holistic-based categories, Eigenspace-based and frequency-based categories. Moreover, the comparison results show that the Spectroface method outperforms the Eigenface method in four out of the five variations – namely the 3D pose, facial expressions, non-uniform illumination, and translation variations while the Eigenface method is better in the scaling variation. Also, in facial expressions variation, applying the frequency-based method on the low subband of the wavelet transform is much better than applying the PCA-based method on it. One possible reason is that applying the PCA (Eigenface) on the wavelet low subband directly will still be affected by some information about pixels’ positions which represent information about facial expressions. In contrast, applying the Fourier transform on this subband will eliminate this information about pixels’ positions which leads to reduce the information about facial expressions. In the next chapter, we will describe the proposed illumination normalization approach together with its results on illuminated databases. Moreover, the comparisons of the proposed approach with other best-of-literature approaches will be discussed in chapter 6. We’ll use the results of this chapter as a baseline for those of the next two chapters to see the effect of the normalization approaches on different databases.

77

CHAPTER 5: The Proposed Illumination Normalization Approach

5.1 Introduction As we stated previously in Chapter 3, illumination normalization approaches can be classified into two categories: model-based and image-processing based approaches. Although the model-based approaches are perfect in theory, the requirement of assumptions and constraints in addition to their highly computational cost make these approaches unsuitable for realistic applications. On the other-hand, the image-processing based approaches are more commonly used in practical systems for their simplicity and efficiency. Although most of the illumination normalization approaches can cope with illumination variation well, some may bring negative influence on images without illumination variation. In addition, some approaches show great difference on performance when combined with different face recognition approaches. Some other approaches require perfect alignment of face within the image which is difficult to achieve in practical/real-life systems. So, in this chapter, we aim to propose an image-processing-based illumination normalization approach that proves flexibility to different face recognition approaches and independency to face alignment. These make it suitable for practical/real-life systems as it can be used with different face recognition approaches and doesn’t need any pre-assumptions or constraints concerning the face alignment. This chapter is organized as follows: section 2 describes the idea behind the proposed illumination normalization approach. Sections 3 to 6 contain the detailed descriptions of the proposed approach. Experiments appear in section 7. Finally, chapter summary is presented in section 8.

5.2 Idea of the Proposed Approach Among different image-processing based approaches, the histogram matching (HM) is considered one of the most common and successful approach [66], [67], [68], [69], [70]. Some comparative studies in literature show the superiority of HM among other approaches [71], [72]. For example, [71] compares five different illumination normalization approaches, namely histogram equalization (HE), histogram matching (HM), log transformation (LOG), gamma intensity correction (GIC) and self quotient image (SQI) over three large-scale face databases which are FERET, CAS-PEAL and CMU-PIE. The results show that HM gives the best results among the four other approaches over FERET and CAS-PEAL, while comes after GIC over CMU-PIE. Results in [72] over the extended Yale B face database show that HM gives the best results

78

among three globally-applied approaches, which are normal distribution (NORM), HE and GIC. Moreover, Histogram matching has the following two main advantages:

1. It is a preprocessing step that can be applied with any face recognition approach. 2. It is insensitive to geometrical effects on the image as it’s applied globally and

thus no additional alignment steps are required. Although enhancing the image resulting from the HM can lead to increase the recognition rates over using the HM alone, no attempts have been made to combine the HM with other image enhancement methods for illumination normalization. Also, the compression function of the Retinal filter [73] as an image enhancement method has not been used in the literature. So, it’s very interesting to combine the HM with other image enhancement methods as illumination normalization for face recognition. As a result, we introduce a new illumination normalization approach based on enhancing the image resulting from HM. Four different image enhancement methods are used in this study – three of them are common in literature, namely histogram equalization, log transformation and gamma correction [74], while the fourth method is newly suggested to be used as an image enhancement method in this study, which is the compression function of the Retinal filter [73]. These four image enhancement methods are applied in two different approaches through this study:

1. After histogram matching; on the resulting image from HM. 2. Before histogram matching; on the reference image before matching the input

image on it. In addition, for each approach, we try to further enhancing the results by applying one of these four methods again. Finally, the proposed approach is chosen from these combinations based on the increase in recognition rates against using the HM alone regardless of the following conditions:

1. Face recognition approach that the normalization approach is applied with it, 2. Face alignment within the image, 3. Number of training images, and the degree of illumination within these images.

This ensures both the flexibility of the proposed approach among different face recognition approaches and the ability to apply it on practical/real-life systems in which perfect alignment of faces is difficult to achieve. The verifications of these conditions are described in detail later in this chapter. All previous combinations are empirically demonstrated and compared over Yale B database [64] using the two holistic-based face recognition approaches introduced previously in Chapter 4, namely, standard Eigenface [24] and Spectroface [43]. These two approaches are chosen to represent the two broad holistic-based categories, Eigenspace-based and Frequency-based respectively [22].

79

The rest of this chapter is as follows: section 3 contains the description of the histogram matching algorithm. Section 4 contains the description of the four image enhancement methods. In section 5, the different approaches of applying these four methods to enhance the image resulting from HM are introduced. Section 6 is dedicated to describe the verification of the selection conditions using the Yale B database. Experimental results showing the best combinations of HM with different image enhancement methods are presented in section 7. Finally, chapter summary is presented in section 8.

5.3 Histogram Matching Algorithm Given an illuminated face image X and a well-lit face image Y, histogram matching [74] is applied to bring the illumination level of the input image X to that of the reference image Y. This is done by making the histogram of the X approximately “match” to the histogram of Y which makes both images having roughly the same mean and variance in their histograms. Fig.5.1 demonstrates the histogram matching process to an illuminated image. The illuminated image (source) is shown in Fig.5.1 (a) and the corresponding histogram is shown in Fig.5.1 (b). The well-lit image (target) and its corresponding histogram are shown in Fig.5.1 (c) and (d) respectively. The resulting image and its histogram after applying the histogram matching are shown in Fig.5.1 (e) and (f). To explain the algorithm, Let H(i) be the histogram function of an illuminated image X and G(i) be the desired histogram of the well-lit image Y, we wish to map H(i) to G(i) via a transformation FH G(i). We first compute a transformation function for both H(i) and G(i) that will map the histogram to a uniform distribution, U(i). These functions are FH U(i) and FG U(i), respectively. Equations 5.1 and 5.2 depict the mapping to a uniform distribution, which is also known as histogram equalization [74].

∑

∑−

=

=

→ = 1

0

0U(i) H

)(

)(F N

j

i

j

jH

jH

( 5.1)

∑

∑−

=

=

→ 1

0

0U(i)G

)(

)(

N

j

i

j

jG

jGF ( 5.2)

Where N is the number of discrete intensity levels. N = 256 for 8-bit grayscale images. To find the mapping function, FH G(i), we invert the function FG U(i) to obtain FU G(i). Since the domain and the range of the functions of this form are identical, the inverse mapping is trivial and is found by cycling through all values of the function. However, due to the discrete nature of these functions, inverting can yield a function which is undefined for certain values. Thus, we use linear interpolation and assume smoothness to fill undefined points of the inverse function according to the values of well-defined points

80

in the function. As a result, we generate a fully defined mapping FU G(i) which transforms a uniform histogram distribution to the distribution found in histogram G(i). The mapping FH G(i) can then be defined as in equation 5.3, [66].

)( )()( iUHGUiGH FFF →→→ = ( 5.3)

Figure 5.1: Histogram matching process to an illuminated image

It’s common in literature to match all images, in both training and testing sets, with a single histogram of either a fixed well-lit image as in [71], [67] or an average image as in [72]. In this work, the reference image for HM is constructed by calculating the average image of a set of well-lit images – one for each subject which gives, by our experiments, better results than using a single well-lit image for the whole image set. The complexity to match the histogram of the input image to the one of the reference image is O(L), where L is the number of pins in the histogram (equal to 255 in gray-scale image). While the complexity of applying the new histogram to the input image takes order O(N × M), where N, M represent the image resolution. This makes the complexity of the whole HM process is O(N × M).

(a)

(f)

(d)(c)

Illuminated Image

Well-lit Image

Resulting Image

(b)

(e)

81

5.4 Image Enhancement Methods The principal objective of image enhancement is to process the original image to be more suitable for the recognition process. Many image enhancement methods are available in the literature. Usually, a certain number of trial and error experiments are required before a particular image enhancement method is selected [74]. In this study, four image enhancement methods are chosen. Three of them are common in literature, namely histogram equalization, log transformation and gamma correction, while the fourth method which called the compression function of the Retinal filter [73] is newly suggested to be used as an image enhancement method in this study.

5.4.1 Histogram Equalization (HE) It’s one of the most common image enhancement methods [74]. It aims to create an image with uniform distribution over the whole brightness scale by using the cumulative density function of the image as a transfer function. Thus, for an image of size M × N with G gray levels and cumulative histogram H(g), the transfer function at certain level T(g) is given as follows:

NMGgHgT

×−×

=)1()()( ( 5.4)

5.4.2 Log Transformation (LOG) LOG is a frequently used technique of gray-scale transform. It simulates the logarithmic sensitivity of the human eye to the light intensity. The general form of the log transformation [74] is:

( )rcs += 1log ( 5.5)

Where r and s are the old and new intensity value, respectively and c is a gray stretch parameter used to linearly scaling the result to be in the range of [0, 255]. The shape of the log curve in Fig.5.2 shows that this transformation maps a narrow range of dark input gray-levels (shadows) into a wider range of output gray levels. The opposite is true for the higher values of the input gray-levels.

5.4.3 Gamma Correction (GAMMA) Gamma correction is a technique commonly used in the field of Computer Graphics. It concerns how to display an image accurately on a computer screen. Images that are not properly corrected can look either bleached out, or too dark. Gamma correction can control the overall brightness of an image by changing the Gamma parameter. The general form of the gamma correction [74] is:

γ1crs = ( 5.6)

82

where r and s are the old and new intensity value, respectively, c is a gray stretch parameter used to linearly scaling the result to be in the range of [0, 255] and γ is a positive constant. In our case, the γ is chosen to be greater than 1 (empirically, it’s chosen to be four) in order to map a narrow range of dark input values (shadows) into a wider range of output values, with the opposite being true for higher values of input levels as shown in Fig.5.2. Unlike the log transformation, the gamma correction has a family of possible transformation curves obtained simply by varying γ values.

Figure 5.2: Transformation functions of LOG and GAMMA (L: number of gray levels)

5.4.4 Compression Function of the Retinal Filter (COMP) A Retinal filter [75] acts as the human retina by inducing a local smoothing of illumination variations. It has been successfully used as an illumination normalization step in the segmentation of facial features in [73], [76]. In this work, we try to use it as an illumination normalization step in face recognition. However, our empirical results over both Eigenface and Spectroface methods using an illuminated YALE B database show that using the Retinal filter as an illumination normalization step is significantly affected when the faces are not perfectly aligned. One possible reason is that the Retinal filter produces a non-realistic image leaving only the high frequencies (edges) which in turn may require the faces to be perfectly aligned specially in the holistic-based approaches. Therefore, in this study, we use only the compression function of the Retinal filter as an image enhancement method since it’s applied globally and so produces more realistic image (for more details about the Retinal filter, see [75]). Let G be a Gaussian filter of size 15 × 15 with standard deviation σ = 2 [73]. Let Iin be the input image and let IG be the result of G filtering of Iin. Image X0 is defined by:

G

G0

I105.5I 410 0.1 X

++

= ( 5.7)

83

The definition of the compression function C is based on X0:

0in

in0

X II )X (255 C

++

= ( 5.8)

Fig.5.3 shows the result of applying each of the four enhancement methods on a face with non-uniform illumination.

Illuminated HE LOG GAMMA COMP

Figure 5.3: Effect of the four enhancement methods on an illuminated face

5.5 The Enhanced HM Approaches A total of 40 different enhancement combinations using the HM [74] combined with different enhancement methods are considered and compared in this study in order to enhance the results of applying the HM alone [77]. As stated in section 5.3, our reference image used for HM is constructed by calculating the average image of a set of well-lit images – one for each subject which gives, by our experiments, better results than using a single well-lit image. Each of the four enhancement methods is applied in three different approaches; 1) After the HM, 2) Before the HM, 3) Further enhancing 1 and 2.

5.5.1 Enhancement After HM Each of the image enhancement methods, discussed in section 5.4, is applied on the result of HM in order to enhance it, as shown in Fig.5.4. This give us four combinations, noted by HM-HE, HM-LOG, HM-GAMMA, HM-COMP, corresponding to applying HE, LOG, GAMMA and COMP, respectively, on the result of HM. Fig.5.5 shows the effect of these combinations on an illuminated face.

Figure 5.4: Block diagram of applying the image enhancement method after the HM

Histogram Matching (HM)

Input Image Matched Image

Image Enhancement

Output Image

Average well-lit Reference Image

84

Illuminated HM

HM-HE HM-LOG HM-GAMMA HM-COMP

Figure 5.5: Effects of applying the image enhancement methods after applying the HM

5.5.2 Enhancement Before HM Opposite to the approach in 5.5.1, each of the image enhancement methods is applied on the reference image before matching the input image on it, see Fig.5.6. This give us another four combinations, noted by HE-HM, LOG-HM, GAMMA-HM, COMP-HM, corresponding to applying HE, LOG, GAMMA and COMP respectively, on the reference image. Fig.5.7 shows the effect of these combinations on an illuminated face.

Figure 5.6: Block diagram of applying the image enhancement method before the HM

Illuminated HM

HE-HM LOG-HM GAMMA-HM COMP-HM

Figure 5.7: Effects of applying the image enhancement methods before applying the HM

Average well-lit Reference Image

Histogram Matching (HM)Input Image Output Image

Image Enhancement

Enhanced Reference Image

85

5.5.3 Further Enhancement Here, we further enhancing the result of each combination using each of the four enhancement methods which give us 8 × 4 = 32 additional combinations. Fig.5.8 shows block diagrams for such enhancements. The effects of further enhancement on both the HM-GAMMA and GAMMA-HM combinations using each of the four enhancement methods are illustrated in Fig.5.9.

Figure 5.8: Block diagram showing the further enhancement of combinations in 5.5.1 and 5.5.2

Illuminated HM

First Enhancement Further Enhancement Using

HM-GAMMA HE GAMMA LOG COMP

GAMMA-HM HE GAMMA LOG COMP

Figure 5.9: Effects of further enhancement on both HM-GAMMA and GAMMA-HM combinations

using each of the four enhancement methods

5.6 Verification of the Selection Conditions As stated in section 5.5, we have 40 different enhancement combinations resulting from combining the HM with different enhancement methods to enhance the results of applying the HM alone. As stated previously in section 5.2, the proposed approach is chosen from these combinations based on the increase in recognition rates against using the HM alone regardless of the following conditions:

Enhancement approach in 5.5.1/5.5.2

Image enhancement

Input image

Output Image

Enhanced image

86

1. Face recognition approach that the normalization approach is applied with it, 2. Face alignment within the image, 3. Number of training images, and the degree of illumination within these images.

This ensures both the flexibility of the proposed approach among different face recognition approaches and the ability to apply it on practical/real-life systems in which perfect alignment of faces is difficult to achieve. We use the Yale B database [64] – frontal images only – as described in section 4.3.2 for studying and comparing the 40 enhancement combinations. In order to verify the first condition, each of the 40 enhancement combinations is applied with the two face recognition methods, Eigenface and Spectroface, representing the two broad holistic-based categories, Eigenspace-based and Frequency-based respectively. The better enhancement combination the one that always enhances the recognition results in both methods. To verify the second condition, all images are cropped in two different ways to include only the head portion:

1. Automatic cropping using the face detection function in Intel OpenCV library [116] to produce non-aligned version of the database; we call it YALE B-AUTO.

2. Manual cropping using the landmarks’ coordinates available on the Yale B website [119] to produce an aligned version of it; we call it YALE B-MANU.

These two versions, shown in Fig.5.10, allow us to test the robustness of each enhancement combination against the geometrical changes in faces within the images. The better enhancement combination, the one that always enhances the recognition results either with or without aligning the faces inside images.

YALE B-

AUTO

YALE B-

MANU

Figure 5.10: Sample faces from Yale B database – automatically and manually cropped

To verify the third condition, all the 25 different training cases, described in section 4.4.3, are used in the testing with this database, as shown in Table-5.1, in which the normal image is common in all cases. These training cases are chosen to cover both the training with each elementary subset – namely subset 1, 2, 3, and 4, and the training with the seven combinations of these subsets where subset 1 is essential in all of them as it

87

contains the lowest illumination. Each elementary subset is composed of training by the normal image and either the vertical, horizontal or both lighting. While each combination is composed of training by the normal image and either vertical lighting or vertical and horizontal lighting. These training varieties help us to test the robustness of each enhancement combination against the number of training images and the changes in illumination direction of these images. The better enhancement combination, the one that always increases the recognition rates regardless of the training case.

Table 5.1: The 25 different training cases used in testing

Elementary Subsets

Subsets Training Case(train. images/subject)

1 nor only nor + 2 ver

2 nor + 2 ver nor + 2 hor nor + 2 ver + 2 hor



nor: normal ver: vertical hor: horizontal

Seven Combinations

Subsets Training Case(train. images/subject)

1, 2 nor + 4 ver nor + 4 ver + 2 hor



1, 2, 3 nor + 6 ver nor + 6 ver + 4 hor



1, 2, 3, 4

nor + 8 ver nor + 8 ver + 6 hor

5.7 Experimental Results The aim of these experiments is to choose the best enhancement combination from the 40 different combinations described in section 5.5 according to the selection conditions stated in section 5.6. Thus, each combination is applied four different times corresponding to the Eigenface and Spectroface methods over YALE B-AUTO and YALE B-MANU versions. In each time, a combination is tested over the 25 training cases and their average recognition rate is calculated and then compared with the one resulting from applying the HM alone. The best enhancement combination, the one that increases the recognition rates resulting from applying the HM alone in all of the following:

1. Both face recognition methods (Eigenface and Spectroface), 2. Over aligned and non-aligned versions (YALE B- MANU and YALE B- AUTO), 3. In all the 25 training cases.

The first condition is to ensure the flexibility of the chosen combination among different face recognition approaches. While the second ensures its suitability for practical/real-life systems, in which perfect aligning of the faces inside images is not a simple task. Finally,

88

by ensuring the increasing of recognition rates in all the 25 training cases, it proves that the chosen combination is not affected by either the number of training images or the changes in illumination direction of these images. As described in section 5.5, 32 out of the 40 enhancement combinations are for further enhancement. So, to see if further enhancement the image leads to further increasing the recognition rates or not, we plot the average difference in recognition rates from applying the HM alone for each of the eight single enhancement combinations together with the ones achieved by further enhancing them using either HE, GAMMA, LOG or COMP. Fig.5.11 shows such plotting for the Eigenface method over the YALE B-AUTO database. Fig.5.12 is dedicated for the Eigenface method over YALE B-MANU while Figures 5.13, 5.14 are dedicated for the Spectroface method over the YALE B-AUTO and YALE B-MANU, respectively. We summarize the results of the further enhancement combinations in Table-5.2 as follows: For each of the four further enhancement combinations, corresponding to applying either HE, GAMMA, LOG or COMP after the eight single enhancement combinations, we count how many times the further enhancement combination lead to increase the average difference in recognition rates over the eight single enhancement combinations.

(a) Further enhancement using HE

89

(b) Further enhancement using LOG

(c) Further enhancement using GAMMA

(d) Further enhancement using COMP

Figure 5.11: Eigenface method over YALE B-AUTO: Effects of further enhancement over the eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP

90




91


Figure 5.12: Eigenface method over YALE B-MANU: Effects of further enhancement over the eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP



92



Figure 5.13: Spectroface method over YALE B-AUTO: Effects of further enhancement over the eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP


93




Figure 5.14: Spectroface method over YALE B-MANU: Effects of further enhancement over the eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP

94

Table 5.2: The number of combinations that lead to increase the recognition rates after using each of the enhancement methods for further enhancement

Face Recognition

Method Database

Further Enhancement Combination Using: HE

(8 combinations) GAMMA

(8 combinations)LOG

(8 combinations) COMP

(8 combinations)

Eigenface

YALE B-AUTO 0 5 5 8

YALE B-MANU 0 0 2 8

Spectroface

YALE B-AUTO 0 1 0 5

YALE B-MANU 1 0 0 5

It’s clear from Table-5.2 that further enhancement the image using any of the three traditional enhancement methods – namely HE, GAMMA and LOG, doesn’t lead to further enhancement in recognition rates of both the Eigenface and Spectroface methods especially in the YALE B-MANU version, see the second and fourth rows. Only COMP that’s lead to further enhancement in recognition rates of both face recognition methods over the two database’s versions. For clarification, in Spectroface method over the YALE B-MANU (last row in Table-5.2), we can see that when applying HE as further enhancement after each of the eight single combinations, ONLY one of these combinations get further increasing in its average recognition rate due to further enhancement it with HE. When applying either GAMMA or LOG as further enhancement, NONE of the eight single combinations get further increasing in its average recognition rate due to further enhancement it with HE. On other hand, when applying COMP as further enhancement, five out of the eight single combinations get further increasing in their average recognition rates after applying the COMP over those accomplished before applying it. As a result, only five out of 40 enhancement combinations are satisfying the three previously mentioned conditions, their effect is shown in Fig.5.15:

1. GAMMA-HM, where gamma is applied before HM. 2. GAMMA-HM-COMP, where gamma is applied before HM, then the result is

further enhanced by applying the compression function. 3. HE-HM-COMP, where equalization is applied before HM, then the result is

further enhanced by applying the compression function. 4. COMP-HM-COMP, where compression function is applied before HM, then the

result is further enhanced by applying it again. 5. HM-HE-COMP, where equalization is applied after HM, then the result is further

enhanced by applying the compression function.

95

Illuminated

GAMMA-HM GAMMA-HM-

COMP HE-HM-COMP COMP-HM-COMP HM-HE-COMP

Figure 5.15: Effects of the five enhancement combinations that satisfy the three conditions

Table-5.3 and Table-5.4 show the results of using these combinations with the Eigenface and the Spectroface methods, respectively, over both versions of the Yale B database. In addition to the results over the 25 training cases, both the average recognition rate of each combination over these training cases and the difference between it and the average recognition rate of applying the HM alone are shown in the last two rows, respectively. It appears from Table-5.3 that using the second enhancement combination, namely GAMMA-HM-COMP, with the Eigenface method gives the best average difference from HM alone (see last row) over the four other combinations in both database’s versions. While in the Spectroface method, Table-5.4 shows that there are no significant differences between using any of the five combinations in each of the database’s versions, the enhancement in recognition rates ranged from 3.7% to 4.2% in YALE B-AUTO and from 6.6% to 7.4% in YALE B-MANU. As a result, we can choose the GAMMA-HM-COMP combination as the best enhancement combination over the 40 different combinations according to the criteria stated above. Complexity of the Proposed Approach The GAMMA-HM-COMP approach based on applying three consecutive steps, namely GAMMA, HM and compression function of the Retinal filter. For an N × N image, both GAMMA and HM take O(N2). Since the compression function is based on Gaussian filtering the image by applying the 1D Gaussian filter twice, it takes order O(N2 × k), where k is the mask size. But since the mask size is fixed and equal to 15 in our case [73], the overall complexity of the GAMMA-HM-COMP approach remains O(N2) which is equal to the complexity of using the HM alone.

96

Table 5.3: Results of using the best five combinations with the Eigenface method over the two versions of the database. Average recognition rate is calculated over the 25 different training cases.

(The best average differences are italic)

(1: GAMMA-HM, 2: GAMMA-HM-COMP, 3: HE-HM-COMP, 4: COMP-HM-COMP, 5: HM-HE-COMP, nor: normal, ver: vertical, hor: horizontal)

Sub-sets Training Case

YALE B-AUTO YALE B-MANU HM 1 2 3 4 5 HM 1 2 3 4 5

1 nor only 49.0 61.6 63.2 60.3 57.7 58.7 74.8 81 89.4 84.5 83.2 83.5nor + 2 ver 63.6 68.7 70 67.7 68.1 68.4 89 95.2 95.8 94.5 94.5 94.5

2 nor + 2 ver 67.1 73.2 74.5 71 71.3 71.9 88.1 96.8 96.5 94.8 95.2 94.8nor + 2 hor 58.7 66.1 68.1 65.2 62.9 65.2 82.6 89.7 92.6 90.6 89.4 90.6nor + 2 ver + 2 hor 66.8 74.5 74.5 73.2 72.6 73.2 87.7 95.8 96.8 95.5 94.8 95.5

3 nor + 2 ver 68.7 74.2 73.9 73.2 72.6 72.9 93.5 97.7 98.7 95.8 96.5 95.8nor + 2 hor 63.6 72.9 72.9 71.6 69.7 71.6 87.7 92.3 93.9 89 90.6 89 nor + 2 ver + 2 hor 67.1 73.9 74.8 74.2 73.2 74.5 89.4 96.1 96.5 94.8 95.5 94.8

4 nor + 2 ver 55.8 60.7 61.9 58.1 57.7 57.4 94.5 96.8 98.1 97.1 97.1 97.1nor + 2 hor 54.2 67.4 68.4 63.5 61.6 63.2 76.8 91 93.9 90.3 88.1 89.4nor + 2 ver + 2 hor 56.8 64.5 65.5 63.5 62.3 63.5 87.1 95.2 95.2 95.2 94.2 95.2

1, 2 nor + 4 ver 69.7 74.5 78.7 74.5 72.6 74.2 90.3 96.5 96.5 95.2 95.5 95.2nor + 4 ver + 2 hor 69.4 75.5 76.5 74.2 75.2 74.2 87.1 96.5 96.8 95.2 94.8 95.2

1, 3 nor + 4 ver 73.9 77.1 76.5 75.8 76.5 75.8 95.8 98.4 98.1 97.4 97.4 97.4nor + 4 ver + 2 hor 70.0 77.4 78.7 75.8 73.9 75.8 91 96.5 97.4 96.8 96.8 96.8

1, 4 nor + 4 ver 64.2 67.1 67.1 65.5 64.8 65.2 96.8 98.1 98.4 97.7 97.7 97.7nor + 4 ver + 2 hor 64.5 70.3 71 68.1 67.4 68.1 93.5 96.8 96.5 96.5 95.8 96.5

1, 2, 3

nor + 6 ver 75.8 79.4 79.4 78.4 78.4 78.4 96.1 98.1 98.4 97.4 97.7 97.4nor + 6 ver + 4 hor 74.2 79.7 79.4 79.4 77.1 79.4 90 96.5 97.4 96.8 95.8 96.8

1, 2, 4

nor + 6 ver 66.8 71.6 72.9 69.7 70 69 97.7 98.7 98.7 99 98.7 98.7nor + 6 ver + 4 hor 67.7 73.2 74.8 72.9 72.6 72.6 94.5 96.8 96.8 97.4 96.8 97.4

1, 3, 4

nor + 6 ver 72.6 76.5 76.1 74.5 74.5 74.2 98.1 98.1 98.7 99.7 99.4 99.7nor + 6 ver + 4 hor 71.6 76.5 77.7 76.1 76.1 75.5 94.5 96.5 96.5 96.5 96.1 96.8

1,2,3,4

nor + 8 ver 74.8 77.4 78.1 76.8 75.2 76.8 99 99 99 100 99.4 100nor + 8 ver + 6 hor 74.5 79.0 79 78.4 76.5 78.7 94.5 97.1 97.1 97.1 96.8 97.1

Average Recognition Rate 66.4 72.5 73.3 71.2 70.4 71.1 90.8 95.6 96.5 95.4 95.1 95.3

Average Difference - 6.1 6.9 4.8 4.0 4.7 - 4.8 5.7 4.6 4.3 4.5

97

Table 5.4: Results of using the best five combinations with the Spectroface method over the two

versions of the database. Average recognition rate is calculated over the 25 different training cases. (The best average differences are italic)

(1: GAMMA-HM, 2: GAMMA-HM-COMP, 3: HE-HM-COMP, 4: COMP-HM-COMP, 5: HM-HE-COMP, nor: normal, ver: vertical, hor: horizontal)


YALE B-AUTO YALE B-MANU HM 1 2 3 4 5 HM 1 2 3 4 5

1 nor only 56.8 62.6 62.9 62.6 63.2 62.6 61.9 69 67.7 69 72.3 70.3nor + 2 ver 68.7 73.2 70.7 75.8 72.6 74.8 73.6 81.3 76.1 80.3 80.7 79.7

2 nor + 2 ver 71.3 77.4 78.7 77.4 78.1 77.1 76.1 84.2 81.3 86.1 84.8 86.5nor + 2 hor 62.6 66.1 67.7 67.4 66.8 66.8 68.4 74.5 75.5 74.5 78.7 75.5nor + 2 ver + 2 hor 73.6 78.1 81.0 78.1 78.7 77.7 81.6 85.2 82.6 88.7 87.7 89

3 nor + 2 ver 72.6 76.8 77.4 77.1 78.1 76.8 80.7 89.7 92.9 89.7 89.7 89.7nor + 2 hor 61.9 64.8 65.5 64.8 66.5 65.2 67.1 74.5 74.8 73.9 77.4 74.2nor + 2 ver + 2 hor 76.5 78.4 80.3 80.0 79.7 80 85.5 91.6 94.2 92.6 93.2 92.3

4 nor + 2 ver 67.4 73.6 72.9 72.3 71.6 72.3 80 86.8 88.4 86.5 88.4 86.5nor + 2 hor 56.8 63.2 63.9 62.6 62.6 62.9 63.2 71 74.2 70 73.6 71 nor + 2 ver + 2 hor 67.4 71.9 73.9 71.6 71.0 71.3 80 87.4 89.7 85.5 87.4 85.8


1, 3 nor + 4 ver 77.4 80.7 81.0 80.7 81.3 80.7 83.6 91.6 93.2 91.6 90.7 91 nor + 4 ver + 2 hor 82.3 83.9 83.6 83.2 83.6 83.2 87.7 93.2 94.5 93.9 94.2 93.6

1, 4 nor + 4 ver 74.2 77.4 77.1 79.4 77.4 79.4 85.2 91.9 91.3 91 91.6 90.7nor + 4 ver + 2 hor 73.9 76.5 77.1 78.7 77.1 78.4 85.2 92.6 92.6 90.3 90.7 90

1, 2, 3

nor + 6 ver 77.7 80.0 80.7 80.7 83.6 81.9 82.6 92.3 93.9 91.9 91 91.6nor + 6 ver + 4 hor 82.9 83.9 82.9 82.9 84.5 84.2 87.7 93.6 94.5 93.9 94.2 93.9

1, 2, 4

nor + 6 ver 75.5 79.7 80.0 80.7 80.0 81 85.5 93.2 94.2 91.3 91.9 91.3nor + 6 ver + 4 hor 77.4 80.0 81.6 81.3 80.7 82.3 89.7 94.2 95.8 92.3 93.6 91.9

1, 3, 4

nor + 6 ver 78.7 82.6 82.9 82.6 81.9 82.9 85.8 92.9 93.9 92.3 92.3 91.9nor + 6 ver + 4 hor 83.6 86.1 87.1 86.5 86.5 87.1 90 94.8 95.5 93.6 94.5 93.9

1,2,3,4

nor + 8 ver 78.7 82.6 82.6 82.3 82.9 82.9 85.2 93.6 94.5 92.6 92.9 92.6nor + 8 ver + 6 hor 83.9 85.8 86.1 85.2 86.1 86.1 90.3 94.8 95.8 93.9 94.8 93.9

Average Recognition Rate 73.2 76.9 77.4 77.2 77.3 77.4 80.6 87.4 87.7 87.3 88.0 87.4

Difference of Averages - 3.7 4.2 4.0 4.1 4.2 - 6.8 7.1 6.6 7.4 6.7

5.8 Summary Many illumination normalization approaches have been proposed in literature and can be classified into two categories: model-based and image-processing based approaches. The image-processing based approaches are more commonly used in practical systems for their simplicity and efficiency.

98

Although most of the illumination normalization approaches can cope with illumination variation well, some may bring negative influence on images without illumination variation. In addition, some approaches show great difference on performance when combined with different face recognition approaches. Some other approaches require perfect alignment of face within the image which is difficult to achieve in practical/real-life systems. This chapter introduces a new image-processing based illumination normalization approach based on enhancing the image resulting from histogram matching using the gamma correction and the Retinal filter’s compression function, which we called GAMMA-HM-COMP approach. It is based on three consecutive steps:

1. Applying the gamma correction on the reference average well-lit image, 2. Histogram matching the input image to the result from 1, 3. Applying the Retinal filter’s compression function to further enhancing the result

of 2. Among 40 different enhancement combinations, GAMMA-HM-COMP approach proves its flexibility among different face recognition approaches and independency to face alignment. These make it suitable for practical/real-life systems as it can be used with different face recognition approaches and doesn’t need any pre-assumptions or constraints concerning the face alignment. The results show that the GAMMA-HM-COMP leads to average increasing in recognition rates over HM alone ranges from 4~7% in Eigenface and Spectroface methods using aligned and non-aligned versions of the Yale B database. Moreover, in this study, the compression function of the Retinal filter is newly applied as an image enhancement method. It proves its suitability for further enhancement rather than the other three traditional enhancement methods which are the histogram equalization, gamma correction and log transformation. In the following chapter, we will evaluate the proposed illumination normalization approach (GAMMA-HM-COMP) together with other best-of-literature approaches introduced previously in chapter 3 over images with illumination variation and images with other facial and geometrical variations using the two selected face recognition methods.

99

CHAPTER 6: Evaluate the Proposed Approach

6.1 Introduction The aim of this chapter is to establish comparative studies between the proposed illumination normalization approach and best-of-literature approaches over images with illumination variation and images with other facial and geometrical variations using the two selected face recognition methods. This allows us to test which of these approaches is flexible to different face recognition approaches, which is independent on face alignment and which has less side-effects over variations other than illumination. As chapter 3 introduced previously, there are seven best-of-literature approaches selected among 38 different illumination normalization approaches based on surveying nine different comparative studies. Here we chose four out of these seven approaches to compare with the proposed approach. The chosen approaches are:

1. Single Scale Retinex with Histogram Matching (SSR-HM). 2. Local Normal Distribution (LNORM). 3. Local Binary Patterns (LBP). 4. Preprocessing Chain Approach (CHAIN).

The detailed descriptions of these approaches are introduced previously in chapter 3. The rest of this chapter is organized as follows: section 2 describes the implementation parameters of the four approaches and the proposed one in addition to the difference in results between our implementation of some of these approaches and the published ones. The comparison between the four approaches and the proposed one over images with illumination variations and images with other facial and geometrical variations are introduced in sections 3 and 4, respectively. Finally, chapter summary is introduced in section 5.

6.2 Implementation of the Compared Approaches In all experiments through this thesis, only two preprocessing steps are applied to all face images in both Eigenface and Spectroface methods:

1. Convert each image to grayscale. 2. Resize each image to a fixed size of 128 × 128.

In the following subsections, we describe the implementation parameters of each of the four approaches and the proposed one in addition to the difference in results between our implementation of some of these approaches and the published ones. Please refer to chapter 3 for detailed descriptions about each of the four approaches and chapter 5 for the proposed approach.

100

6.2.1 Preprocessing Chain Approach (CHAIN) As described previously in chapter 3, the CHAIN approach consists of four consecutive steps, which are:

1. Gamma Correction. 2. Difference of Gaussian (DoG). 3. Masking. 4. Contrast Equalization.

Here we use the original implementation of the CHAIN approach used by the authors of [109] without the masking step. The original implementation can be found in [121]. Also, we use the same default settings of the various parameters of the CHAIN approach that are summarized in Table-6.1. Moreover, it’s found by authors of [109] that the CHAIN approach gives similar results over a broad range of parameter settings, which greatly facilitates the selection of parameters.

Table 6.1: Default parameter settings for CHAIN approach

Procedure Parameter Value Gamma Correction γ 0.2

DoG Filtering σ0 1 σ1 2

Contrast Equalization

α 0.1 τ 10

We use the output of applying CHAIN approach as it is without normalizing it to [0-255]. However, in Spectroface method, we bring to positive the whole grayscale range of the resulting image by adding fixed value (equal 15) to all pixels. This gives much better results on YALE B database with its two versions (YALE B-AUTO & YALE B-MANU) as shown in Table-6.2.

Table 6.2: Results of applying CHAIN with and without sliding on Spectroface method on both versions of the YALE B database

Database CHAIN Preprocessing without sliding with sliding

YALE B-AUTO 31.4% 72.2% YALE B-MANU 59.1% 96.5%

The possible reason behind this is when we bring the whole range to positive; the DC component of the FFT will have the maximum magnitude over all other components. Thus, when we normalize the FFT magnitudes by dividing on DC to remove the scaling factor (refer to section 4.2.2), the consistency between the FFT magnitudes remains the same on the image itself and on all other images (since the DC always the max in all images). On the other hand, if we don’t bring the whole range to positive, the max magnitude will appear at different locations for different images, even for the images of the same person. So, when we divide all FFT magnitudes on it for normalization, the consistency between the FFT magnitudes remains the same on the image itself but differ

101

on other images. This leads to misclassification even between the images of the same person due to the different location of the max value after normalization between the images.

6.2.2 Local Normal Distribution (LNORM) Here we use our implementation for the LNORM with window size 7 × 7 as we found that it is more suitable to image size 128 × 128 rather than 5 × 5 that is originally used by the authors of [72] for image size 75 × 85. To test if there is a difference between our implementation and the original one, we re-implement some of the original experiments of the LNORM published in [72] using the same recognition approach, Euclidean distance, the same image size, 75 × 85, and the best window size, 5 × 5. The results in Table-6.2 show that there are no significant difference between our implementation of the LNORM and the original one; 0.4% in Extended Yale B and 0.6% in Yale B. These small differences may be due to the difference in selection of the five training images from subset 1 (S1) in Extended Yale B and the randomization of the five training images in Yale B.

Table 6.3: Difference between our implementation of the LNORM and the original one

Normaliz. Approach Database Description

Original Results Our Results Differ-ence Recog.

Approach % Recog. Approach %

LNORM 5×5 Extended Yale B train: 5 from S1

test: remain 59 Euclidean 97.3 Euclidean 97.7 +0.4%

Yale B – pure faces (average of 10 random)

train: 5 from S3 test: remain 59 Euclidean 99.4 Euclidean 100 +0.6%

We use the output of applying LNORM approach as it is without normalizing it to [0-255]. However, in Spectroface method, we bring to positive the whole grayscale range of the resulting image by adding fixed value (equal 5) to all pixels as we done before in CHAIN. This gives much better results on YALE B database with its two versions (YALE B-AUTO & YALE B-MANU) as shown in Table-6.4.

Table 6.4: Results of applying LNORM with and without sliding on Spectroface method on both versions of the YALE B database

Database LNORM Preprocessing without sliding with sliding

YALE B-AUTO 21.7% 67.1% YALE B-MANU 48.1% 95.6%

6.2.3 Single Scale Retinex with Histogram Matching (SSR-HM) Here we use our implementation for the SSR-HM with σ = 4 as the authors of [66] conclude that the illumination correction is best at Retinex scales between σ = 2 and σ = 6. Moreover, the reference image for HM is constructed by calculating the average image

102

of a set of well-lit images – one for each subject rather than using a single well-lit image for the whole image set. To test if there is a difference between our implementation and the original one, we re-implement the original experiments of the SSR-HM published in [66] using the best sigma in the published work (σ = 2) but with the Eigenface as recognition approach rather than the SVM that is used in [66]. The results in Table-6.5 show that there are no significant difference between our implementation and the original one except in one experiment. These differences may be due to the randomization of the training images and/or the different recognition approach used by our experiments (Eigenface rather than SVM).

Table 6.5: Difference between our implementation of the SSR-HM and the original one

Normaliz. Approach Database Description

Original Results Our Results Differ-ence Recog.

Approach % Recog. Approach %

SSR HM (σ = 2)

Yale B – head train: 1 from S1 test: remain 63 SVM 99.0 Eigenface 100 +1.0%

Yale B – head (average of 20

random from the whole database)

train: 1 from any subset test: remain 63 SVM 90.2 Eigenface 98.2 +8.0%

train: 2 from any subsettest: remain 62 SVM 99.8 Eigenface 99.7 -0.1%

6.2.4 Local Binary Patterns (LBP) Here we modify the implementation of the LBP found in [121] to be similar as the one used by the authors of [80]. We use the same parameters that are used in the original implementation (i.e. applying LBP operator on 8 equally spaced neighborhood pixels on circle of radius 2). Since the original implementation of the LBP is used in [80] for face authentication rather than face recognition as in our case, unfortunately we are not able to compare the results of both implementations here.

6.2.5 Proposed Approach (GAMMA-HM-COMP) As stated previously in chapter 5, the proposed illumination normalization approach consists of applying three consecutive steps, which are:

1. Gamma correction, 2. Histogram matching, 3. Compression function of the Retinal filter.

The detailed description of the proposed approach can be found in chapter 5. Here we review quickly about the parameters used for this approach during the experiments. First, the gamma value used in first step is chosen to be equal four and the output of the gamma correction is normalized to be in range [0-255]. Second, the reference image for HM is constructed by calculating the average image of a set of well-lit images – one for

103

each subject rather than using a single well-lit image for the whole image set. Finally, the parameters of the Gaussian filter used in the compression function of the Retinal filter is originally taken from [73] which uses a Gaussian filter of size 15 × 15 with standard deviation σ = 2.

6.3 Comparison on Illumination Variations We use the Yale B database (frontal images only) to compare between the five approaches on illuminated face images. To test if each approach requires perfect face alignment or not, the comparison is applied on both aligned and non-aligned versions of the Yale B, namely YALE B-MANU and YALE B-AUTO (please refer to chapter 5 for descriptions about both versions). On both versions, the comparison is done on all the 25 training cases described previously in chapter 4 then the average recognition rate is calculated and used as a comparison measure. Following are the comparison results on each version using both Eigenface and Spectroface methods.

6.3.1 Aligned Faces Here we use the YALE B-MANU version. Table-6.6 shows the results of applying each of the five illumination normalization approaches for the 25 training cases on each of the Eigenface and the Spectroface method. It shows also the results without applying any of the five approaches. The average recognition rates over the 25 training cases are shown in the last row of the table. Fig.6.1 (a) and (b) shows the difference between the average recognition rates before and after applying each of the five approaches on Eigenface and Spectroface respectively. It’s clear from the figure that the best illumination normalization approach on both Eigenface and Spectroface methods is the SSR-HM approach.

6.3.2 Non-Aligned Faces Here we use the YALE B-AUTO version. Table-6.7 shows the results of applying each of the five illumination normalization approaches for the 25 training cases on each of the Eigenface and the Spectroface method. It shows also the results without applying any of the five approaches. The average recognition rates over the 25 training cases are shown in the last row of the table. Fig.6.2 (a) and (b) shows the difference between the average recognition rates before and after applying each of the five approaches on Eigenface and Spectroface respectively. It’s clear from Fig.6.2 that the proposed approach is the best one on the Eigenface method while the SSR-HM is the best one on the Spectroface method. Note the significant decreasing in the performance of the four best-of-literature approaches on both methods when the images are not perfectly aligned. Moreover, both LNORM and CHAIN approaches bring negative influence when they used with the Eigenface method.

104

This means that these approaches require the images to be perfectly aligned which is difficult to achieve in practical/real-life systems. Fig.6.3 (a) and (b) shows the decreasing in the performance of each approach due to the non-aligning of faces on Eigenface and Spectroface respectively. (i.e. the difference between the performance of each approach on the YALE B-MANU and YALE B-AUTO). It’s clear that the minimum affected approach due to the non-aligning of faces on both methods is the proposed approach.

Table 6.6: Results of applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over YALE B-MANU version. Average recognition rate is

calculated over the 25 different training cases.

(0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP, nor: normal, ver: vertical, hor: horizontal)


Eigenface Spectroface 0 1 2 3 4 5 0 1 2 3 4 5

1 nor only 62.9 100 100 100 100 89.4 51 79 74.5 89.4 88.7 68.1nor + 2 ver 60.6 100 100 100 100 95.8 55.5 87.7 76.8 93.6 96.5 77.7

2 nor + 2 ver 66.8 100 100 100 100 96.5 61.3 91.9 82.6 97.1 97.4 82.9nor + 2 hor 67.1 100 100 100 100 92.6 58.4 92.6 90.3 94.5 91.9 75.5nor + 2 ver + 2 hor 68.7 100 100 100 100 96.8 64.5 94.5 90.7 97.1 97.7 83.6

3 nor + 2 ver 65.8 100 100 100 100 98.7 73.6 96.8 89.4 97.4 98.4 92.3nor + 2 hor 70 100 100 100 100 93.9 61.9 93.9 91.9 96.5 95.5 73.9nor + 2 ver + 2 hor 63.2 100 100 100 100 96.5 79 98.7 93.9 97.7 98.7 93.2

4 nor + 2 ver 62.3 100 100 100 100 98.1 66.8 98.4 89 95.5 99 89 nor + 2 hor 57.4 100 99.7 100 100 93.9 52.9 86.1 82.6 94.2 92.9 73.2nor + 2 ver + 2 hor 54.2 100 100 100 100 95.2 66.1 98.4 91.6 95.5 99 89

1, 2 nor + 4 ver 70.6 100 100 100 100 96.5 61.3 90.7 81.9 96.8 97.4 84.2nor + 4 ver + 2 hor 69.7 100 100 100 100 96.8 64.5 93.9 90.3 96.8 97.7 84.2

1, 3 nor + 4 ver 70.6 100 100 100 100 98.1 74.2 96.1 90.7 96.8 98.4 92.9nor + 4 ver + 2 hor 70.6 100 100 100 100 97.4 80 98.7 93.9 97.1 98.7 93.6

1, 4 nor + 4 ver 69.7 100 100 100 100 98.4 69.7 99 91.3 97.1 99.7 93.6nor + 4 ver + 2 hor 58.1 100 99.7 100 100 96.5 68.7 99 93.2 97.4 99.7 93.6

1, 2, 3

nor + 6 ver 68.7 100 100 100 100 98.4 73.9 97.1 89.4 97.4 98.4 93.2nor + 6 ver + 4 hor 73.5 100 100 100 100 97.4 81.9 98.7 94.8 97.4 98.7 94.2

1, 2, 4


1, 3, 4


1,2,3,4

nor + 8 ver 70 100 100 100 100 99 77.7 99.7 93.6 97.7 99.7 95.8nor + 8 ver + 6 hor 65.8 100 100 100 100 97.1 86.1 99.7 96.8 97.7 99.7 96.8

Average Recognition Rate 66 100 100 100 100 96.5 69.8 95.6 90 96.4 97.7 88.1

105

LNORMCHAIN

LBPSSR-HM

Proposed

S1

28

29

30

31

32

33

34

Aver

age

diffe

renc

e fro

m N

ONE

Normalization Approach

Eigenface on YALE B-MANU(NONE = 66.0%)

(a) Eigenface (b) Spectroface

Figure 6.1: Average increasing/decreasing in recognition rates after applying each of the five illumination normalization approaches on YALE B-MANU version

Table 6.7: Results of applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over YALE B-AUTO version. Average recognition rate is

calculated over the 25 different training cases. (0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP, nor: normal, ver: vertical, hor: horizontal)


Eigenface Spectroface 0 1 2 3 4 5 0 1 2 3 4 5

1 nor only 46.5 41.6 47.7 39 53.9 66.8 48.4 46.5 45.8 55.5 62.9 57.1nor + 2 ver 48.7 49.7 53.5 50 66.8 70 52.3 61.6 57.4 67.7 74.2 70

2 nor + 2 ver 54.8 50.6 56.8 44.5 65.2 74.5 57.7 61.9 60 70.3 78.4 77.4nor + 2 hor 50.6 50.3 57.7 54.2 63.9 68.1 58.1 63.9 65.5 68.4 75.8 65.5nor + 2 ver + 2 hor 55.2 57.1 60.6 53.2 68.1 74.5 61 71 69 74.5 81 78.1

3 nor + 2 ver 55.5 49.4 59.7 47.7 65.2 73.9 61.6 55.8 65.2 65.8 79 76.1nor + 2 hor 53.2 43.2 53.9 51.9 60.6 72.9 56.8 62.6 64.5 64.5 73.2 63.9nor + 2 ver + 2 hor 52.6 51 56.5 52.3 69 74.8 65.2 68.1 74.8 71.3 81.6 76.8

4 nor + 2 ver 45.5 44.8 47.4 42.3 59.7 62.3 55.5 55.5 57.1 57.4 71 73.9nor + 2 hor 44.8 43.5 50 40 57.4 68.4 48.1 51.6 54.5 60.7 69.4 61.3nor + 2 ver + 2 hor 41.9 46.1 50.3 44.8 63.2 65.5 54.8 57.7 61.3 62.9 76.1 74.8


1, 3 nor + 4 ver 60 51.6 56.8 52.6 65.8 76.5 65.5 68.4 70 73.6 78.7 80.7nor + 4 ver + 2 hor 58.1 54.5 60 55.2 69.4 78.7 69 74.8 77.4 78.1 82.3 81.3

1, 4 nor + 4 ver 50 49.4 53.2 52.3 64.2 67.1 60.3 68.4 67.4 73.9 80.7 78.4nor + 4 ver + 2 hor 46.8 51.6 60.3 49.4 66.8 71 59.7 68.4 71.9 75.8 82.9 78.1

1, 2, 3

nor + 6 ver 61 54.5 61 53.9 68.7 79.4 66.5 69.7 71.6 75.2 79 81.3nor + 6 ver + 4 hor 61 60.6 63.2 56.8 72.6 79.4 71.6 76.1 80.7 80.7 82.9 81.6

1, 2, 4

nor + 6 ver 55.2 51.3 54.8 52.6 63.5 72.9 65.2 71.9 71 77.4 83.2 82.3nor + 6 ver + 4 hor 52.3 58.4 62.6 56.5 69.7 74.8 68.4 77.4 79 81.9 86.5 82.6

1, 3, 4

nor + 6 ver 55.5 52.9 55.2 51.9 69 76.1 70 72.6 74.8 76.8 82.6 84.5nor + 6 ver + 4 hor 53.5 53.2 59.7 54.8 69.7 77.7 74.5 80.3 83.2 81.6 87.7 86.8

1,2,3,4

nor + 8 ver 57.4 53.2 60.6 53.9 69.4 78.1 71 74.5 76.5 79.4 83.2 84.8nor + 8 ver + 6 hor 55.5 60.6 65.2 59 69 79 77.1 81.6 86.1 83.9 88.1 85.8

Average Recog. Rate 53.3 51.5 57.1 50.9 65.8 73.5 62.3 67.1 68.8 72.2 79.2 76.7

106

LNORMCHAIN

LBPSSR-HM

Proposed

S1

-5

0

5

10

15

20

25

Aver

age

diffe

renc

e fro

m N

ON

E


Eigenface on YALE B-AUTO(NONE = 53.3%)


Figure 6.2: Average increasing/decreasing in recognition rates after applying each of the five illumination normalization approaches on YALE B-AUTO version


Figure 6.3: Performance decreasing of each normalization approach due to the non-aligning of faces (i.e. subtracting the performance on YALE B-AUTO from the performance on YALE B-MANU)

The final conclusion from comparison on illuminated images is to use the SSR-HM approach when the illuminated images are aligned properly while use the proposed approach when the illuminated images are not aligned.

6.4 Comparison on Other Variations The aim of these comparisons is to study the side-effect of each of the five illumination normalization approaches on variations other than the illumination. Therefore, we compare the five approaches on each of the four face recognition variations described previously in chapter 4 (excluding illumination). These four variations are divided into two-facial, which are 3D pose, facial expressions, and two-geometrical, which are translation and scaling. We use the same databases and the same training cases that are used in chapter 4 for comparisons.

107

In the two-facial variations, each approach is applied on all training cases of each database, then, its average recognition rate over these training cases is calculated and compared with the corresponding baseline average rate that is calculated before in chapter 4. In the two-geometrical variations, the average recognition rate for each approach is calculated before and after each variation, then the difference between the two averages is calculated and compared with the corresponding baseline difference that is calculated before in chapter 4. Following are the comparison results on each face recognition variation using both Eigenface and Spectroface methods.

6.4.1 Pose Variations Same as in chapter 4, we use the UMIST database with the 12 training cases for this comparison. Table-6.8 shows the results of applying each of the five illumination normalization approaches for the 12 training cases on each of the Eigenface and the Spectroface method. It shows also the results without applying any of the five approaches – taken from chapter 4. The average recognition rates are shown in the last row of the table. Fig.6.4 (a) and (b) shows the difference between the average recognition rates before and after applying each of the five approaches on Eigenface and Spectroface respectively.

Table 6.8: Results of applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over UMIST database. Average recognition rate is calculated

over all training cases.

(0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP)

Training Case Eigenface Spectroface

0 1 2 3 4 5 0 1 2 3 4 5 normal only 64 25 35.5 23.5 28.5 44.5 48 41.3 48.8 36.8 36.3 40 normal + 10˚ 68.5 30.5 47.3 40.5 54 60 67.3 52.8 63 49 55 55.5 normal + 30˚ 75 37.5 57 43.5 56 71 76 57 74 58.5 54 65.8 normal + 45˚ 87 46.5 64.5 43 65 78 89 71.5 82.5 66.8 67.3 78 normal + 75˚ 85.5 52 67.5 50 61 80.5 89 76.8 84.8 67 73 83.3 normal + 10˚ + 30˚ 74 41.5 57.5 46 56.5 68.5 74.5 58.5 72.5 58 57.5 65 normal + 10˚ + 45˚ 84.5 55.5 69 52.5 73 79 90 76.8 85 69.3 72.3 80.5 normal + 10˚ + 75˚ 87.5 61 75.8 63 74.5 87.5 94.5 86.5 90.8 79.5 83.3 89.8 normal + 30˚ + 45˚ 85 49 71 50 67.5 80 90.3 76.3 85.5 70.8 70.5 80 normal + 30˚ + 75˚ 88.5 62.5 78 57.5 74 88 94.8 88 91 82.5 82.3 90.3 normal + 45˚ + 75˚ 87.5 63 79.3 54 70.5 87.5 95 83.5 89.3 79.5 80.5 87.8 normal+10˚+30˚+45˚+75˚ 88 68.5 83.3 66 81 90 95 90.5 92.5 86 85.8 90.3Average Recognition

Rate 81.3 49.4 65.5 49.1 63.5 76.2 83.6 71.6 80 67 68.2 75.5

108


Figure 6.4: Average difference in recognition rates after applying each of the five illumination normalization approaches on UMIST database

Although all approaches lead to decrease the recognition rates as shown in Fig.6.3, the proposed approach has the least side-effect due to 3D pose variation on Eigenface method while the LBP has the least side-effect on the Spectroface method.

6.4.2 Facial Expressions Variations Same as in chapter 4, we use each of the Grimace, Yale, JAFEE, and Nott-faces databases with their training cases for this comparison. Table-6.9 shows the results of applying each of the five illumination normalization approaches for the four databases on each of the Eigenface and the Spectroface method. It shows also the results without applying any of the five approaches – taken from chapter 4. The average recognition rates over all databases are shown in the last row of the table. Figures 6.5, 6.6, 6.7 and 6.8 show the difference between the average recognition rates before and after applying each of the five approaches on Yale, Grimace, JAFFE and Nott-faces, respectively. It’s clear from these figures that in all the four databases, the proposed approach always has the least side-effect due to facial expression variation on both Eigenface and Spectroface methods among the four other approaches. In Nott-faces database, we note that applying the proposed approach and also the SSR-HM approach lead to increase the recognition rate over the baseline. One possible reason for this increasing is the uniform illumination effects on the faces of this database. So, when we apply the illumination normalization approach, it normalizes these effects and thus gives better recognition.

109

Table 6.9: Results of applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over Grimace, Yale, JAFEE, and Nott-faces databases. Average

recognition rate is calculated over all training cases.

(0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP, expr: expression(s))

DB Training Case Eigenface Spectroface

0 1 2 3 4 5 0 1 2 3 4 5

Yal

e

normal only 81.9 75.2 90.5 66.7 81 88.6 92.4 83.8 91.4 73.3 84.8 93.3 normal + 2 expr3 96 88 93.3 84 94.7 94.7 98.7 94.7 96 86.7 97.3 97.3 normal + 3 expr 96.7 85 91.7 81.7 93.3 93.3 98.3 95 98.3 86.7 98.3 98.3

Gri

mac

e

normal only 96.2 76.6 80.1 82.2 87.1 93.6 100 84.8 93 93.3 95 100 normal + 2 expr 97.1 88.9 93.1 89.2 92.8 96.7 100 94.8 98.7 99 98 99 normal + 4 expr 96.7 91.5 94.1 88.5 91.5 96.7 100 94.4 99.3 99.3 97.8 100

JAFF

E normal only 84.2 65.5 65 55.7 73.4 79.8 93.1 82.3 81.8 69 84.2 87.7

normal + 2 expr 90.2 71 82 71.6 78.1 85.2 97.8 89.1 87.4 85.8 91.8 94.5 normal + 4 expr 89.6 77.9 78.5 75.5 77.9 89.6 97.6 90.8 93.9 84.7 95.7 97.6

Not

t-fa

ces

with

cap

. normal only 55.4 43.9 52.5 36.1 42.5 59.6 57.1 51.4 46.1 49.3 62.9 66.1 normal + 1 expr 55.7 46.7 51 35.7 45.2 60 62.9 51.9 54.8 54.3 62.4 63.3 normal + 2 expr 47.9 43.6 43.6 33.6 42.1 50.7 56.4 47.9 47.1 51.4 52.9 52.9

w/o

ut

cap normal only 69 51 62.9 39.5 48.1 76.2 63.8 65.2 55.2 57.6 80 83.3

normal + 1 expr 76.4 55.7 65 40 52.9 84.3 77.1 71.4 70 67.1 87.9 87.9Average Recognition

Rate 80.9 68.6 74.5 62.9 71.5 79.4 86.3 80.5 80.5 75.1 80.2 86.7

LNORMCHAIN

LBPSSR-HM

Proposed

S1

-16

-14

-12

-10

-8

-6

-4

-2

0

2

Ave

rage

diff

eren

ce fr

om N

ON

E


Eigenface on YALE(NONE = 91.5%)


Figure 6.5: Average difference in recognition rates after applying each of the five illumination normalization approaches on Yale database

3 normal + N expr:- means that we train with normal image + N images each contains single expression – randomly selected.

110

LNORMCHAIN

LBPSSR-HM

Proposed

S1

-12

-10

-8

-6

-4

-2

0

Ave

rage

diff

eren

ce fr

om N

ON

E


Eigenface on Grimace(NONE = 96.7%)


Figure 6.6: Average difference in recognition rates after applying each of the five illumination normalization approaches on Grimace database

LNORMCHAIN

LBPSSR-HM

Proposed

S1

-25

-20

-15

-10

-5

0

Ave

rage

diff

eren

ce fr

om N

ON

E


Eigenface on JAFFE(NONE = 88.0%)


Figure 6.7: Average difference in recognition rates after applying each of the five illumination normalization approaches on JAFFE database

LNORMCHAIN

LBPSSR-HM

Proposed

S1

-25

-20

-15

-10

-5

0

Ave

rage

diff

eren

ce fr

om N

ON

E


Eigenface on Nott-faces(NONE = 60.9%)


Figure 6.8: Average difference in recognition rates after applying each of the five illumination normalization approaches on Nott-faces database

111

6.4.3 Translation Variations Here we use the same training and testing methodologies described in chapter 4 for testing each of the five illumination normalization approaches. As described previously in chapter 4, the testing of translation is applied two different times; first, by translating with circulation in which the output pixels after translation are circulated to fill the empty pixels in the opposite direction. Second, translating without circulation in which the empty pixels after translation are filled by fixed color (gray in our case).

For translating with circulation case, Table-6.10 (a)-(e) shows the average decreasing in recognition rates4 per database on both Eigenface and Spectroface methods for each of the five approaches. Table-6.11 (a)-(e) is dedicated for translating without circulation case. The average decreasing in recognition rates over all databases are shown in the last rows of each table.

Table 6.10: Average decreasing in the recognition rates of both methods after translating with circulation in the four directions while applying (a) LNORM, (b) LBP, (c) CHAIN, (d) SSR-HM and

(e) GAMMA-HM-COMP approaches as preprocessing step.

(a) LNORM Approach

Eigenface Method Spectroface Method


UMIST 18.3 34.9 41.4 42.8 Grimace 8.5 38.6 58 64 Yale 13.3 51.7 70.8 73.7 JAFFE 11.8 39.3 53.4 53.8 Nott-faces 18.2 40.3 51.6 53.6 Yale B 7 24.1 37.5 41

Average 12.9 38.2 52.1 54.8


UMIST 1.2 0.3 1.6 0.4 Grimace 0 0.1 0 0.2 Yale 0.4 3.3 0.4 2.9 JAFFE 4.2 2.9 3.8 2.6 Nott-faces 4.1 3.2 4.6 2 Yale B 1 0 0.6 0.6

Average 1.8 1.6 1.8 1.5

(b) LBP Approach



UMIST 6.8 23.2 34.1 40.4 Grimace 4.4 16.6 28.1 41.6 Yale 1.7 15 37.5 49.6 JAFFE 14.7 32.8 46.8 53.4 Nott-faces 15.5 34.1 45.2 50.5 Yale B 4.3 14.7 26 33.1

Average 7.9 22.7 36.3 44.8


UMIST 0.2 0.7 1 1.1 Grimace 0.2 0.3 0.8 0.6 Yale 0.8 2.9 2.1 2.9 JAFFE 1.8 1 2.5 3.4 Nott-faces 2.3 7.1 6.3 7.1 Yale B 1.9 2.1 3.1 3.1

Average 1.2 2.4 2.6 3

4 There are some cases in which the recognition rates are marginally increased, we consider it as noise and set the decreasing value to 0 to indicate that there’s no decreasing due to the translation.

112

(c) CHAIN Approach



UMIST 7.6 29.4 41.4 42.3 Grimace 4 38.5 61.9 73.1 Yale 12.5 44.2 63 69.6 JAFFE 12.4 39.2 53.8 55.6 Nott-faces 8 22.3 31.1 34.3 Yale B 6.9 24.8 35.6 38.9

Average 8.6 33.1 47.8 52.3


UMIST 6.3 6.9 8.2 8 Grimace 0.5 0.7 1.3 0.8 Yale 7.1 11.7 6.7 12.1 JAFFE 4.2 4.6 4.9 4.6 Nott-faces 7.3 8.8 9.5 7.5 Yale B 2.3 4.1 3.8 3.6

Average 4.6 6.1 5.7 6.1

(d) SSR-HM Approach



UMIST 16.8 42.8 54.6 59 Grimace 2.3 27.8 57 68.7 Yale 7.9 31.2 52.9 67.1 JAFFE 5.1 24 41.8 49 Nott-faces 10.6 26.7 36.6 42 Yale B 5.1 20.8 38.1 47

Average 8 28.9 46.8 55.5


UMIST 13.2 25.8 30.1 29.1 Grimace 0.7 4.9 12.4 8.5 Yale 1.7 5 10.4 12.9 JAFFE 0 2.5 4.4 3.8 Nott-faces 8.6 21.1 27.7 26.2 Yale B 3.7 7.7 11.5 11.7

Average 4.7 11.2 16.1 15.4

(e) GAMMA-HM-COMP Approach



UMIST 1.4 7.4 18.4 31.6 Grimace 0 1.8 15.7 38.4 Yale 0.8 9.1 20.4 37.1 JAFFE 3 13 26.3 40.5 Nott-faces 3.9 16.4 28.9 42.7 Yale B 1.7 6.1 13.9 24.8

Average 1.8 9 20.6 35.9


UMIST 0 0.1 0 0 Grimace 0 0 0 0 Yale 0 0 0 0 JAFFE 0 0 0 0 Nott-faces 0.9 1.1 0.9 0.9 Yale B 0.3 0.5 0.2 0.3

Average 0.2 0.3 0.2 0.2

113

Table 6.11: Average decreasing in the recognition rates of both methods after translating without circulation in the four directions while applying (a) LNORM, (b) LBP, (c) CHAIN, (d) SSR-HM and

(e) GAMMA-HM-COMP approaches as preprocessing step.

(a) LNORM Approach



UMIST 17.9 34 41 43.4 Grimace 8.6 38.7 58.9 63.3 Yale 15 52.9 70.4 73.8 JAFFE 10.8 38.3 54.2 55 Nott-faces 17.8 40.2 51.2 52.3 Yale B 7.4 25.9 38.1 40.1

Average 12.9 38.3 52.3 54.7


UMIST 1.7 1.2 1.7 2.6 Grimace 0.1 0 0 0.5 Yale 1.7 2.9 1.3 4.2 JAFFE 4 3.4 4.4 3.4 Nott-faces 4.3 5.9 7.1 10.2 Yale B 1.5 1.9 2.6 3.9

Average 2.2 2.6 2.9 4.1

(b) LBP Approach



UMIST 7.1 23.6 35.2 41.1 Grimace 4.5 16.6 29.2 42.8 Yale 2.1 16.3 40.5 51.7 JAFFE 14.4 36.1 50.9 59 Nott-faces 15.7 37.3 47.7 53.8 Yale B 5 15.8 26.7 34.1

Average 8.1 24.3 38.4 47.1


UMIST 0.3 1.8 2.9 5.5 Grimace 0.3 1.6 2.5 4 Yale 0.8 2.5 2.9 6.2 JAFFE 1.5 5.6 11.5 15.7 Nott-faces 2.5 11.6 21.6 36.1 Yale B 1.4 4.8 9.8 15.2

Average 1.1 4.7 8.5 13.8

(c) CHAIN Approach



UMIST 6.9 30.6 42.1 41.8 Grimace 4.8 38.2 61.4 73.8 Yale 15.9 50.5 63.4 68 JAFFE 12 39.6 54.7 55.8 Nott-faces 7.5 22.1 30.5 33 Yale B 7.5 26.6 34.6 36.9

Average 9.1 34.6 47.8 51.6


UMIST 8.8 11.8 12.9 7.1 Grimace 0.4 0.2 0.7 0.3 Yale 25.4 34.6 37.5 21.3 JAFFE 3 3.7 3.4 2.6 Nott-faces 5.7 7.3 7.5 3.2 Yale B 8 12.3 14.6 11.2

Average 8.6 11.7 12.8 7.6

114

(d) SSR-HM Approach



UMIST 14.6 39.4 52.4 55.8 Grimace 2.4 22.8 54 69.4 Yale 7.5 31.6 52.9 64.1 JAFFE 4.7 23.6 38.9 46.1 Nott-faces 9.7 26.8 37 42.7 Yale B 4.4 23.1 40.5 45

Average 7.2 27.9 46 53.9



Average 5.2 14.3 19.3 19

(e) GAMMA-HM-COMP Approach



UMIST 1.9 8.6 18.4 33.6 Grimace 0 1.8 16.6 42.5 Yale 0.8 9.1 19.1 34.1 JAFFE 3.2 13.2 26.3 39 Nott-faces 4.1 15.7 30.6 41.6 Yale B 2.1 7.2 20.8 30.4

Average 2 9.3 22 36.9


UMIST 0 0.1 0.9 3.6 Grimace 0.5 1.5 2.5 5.6 Yale 0 0 0 0 JAFFE 0 0.1 0.5 1.4 Nott-faces 0.2 0.7 2.9 7 Yale B 0.5 3.3 11.6 22.3

Average 0.2 1 3.1 6.7 Fig.6.9 and Fig.6.10 show the average decreasing curves after translating with and without circulation, respectively, on both (a) Eigenface and (b) Spectroface methods. Observe that in Figures 6.9 and 6.10, in both Eigenface and Spectroface, the best two curves with minimum average decreasing in recognition rates are the NONE curve and the proposed approach curve. This means that the proposed approach has the least side-effect due to translation variation on both Eigenface and Spectroface methods among the four other approaches. Note that the performance of the SSR-HM approach is dramatically affected by the translation variation over both recognition methods.


Figure 6.9: Average decreasing in recognition rates after translating with circulation

115


Figure 6.10: Average decreasing in recognition rates after translating without circulation

6.4.4 Scaling Variations Same as in chapter 4, we use the Face 94 database with the seven training cases for this comparison. For each training case, the testing is done two times, before and after scaling, in order to record the decreasing in recognition rates after scaling all testing images. Table-6.12 shows the decreasing in recognition rates when applying each of the five illumination normalization approaches for the seven training cases on each of the Eigenface and the Spectroface method. It shows also the decreasing in recognition rates without applying any of the five approaches (baseline) – taken from chapter 4. The average decreasing in recognition rates are shown in the last row of the table.

Table 6.12: Decreasing in recognition rates after applying each of the five illumination normalization approaches with both Eigenface and Spectroface methods over Face 94 database. Average decreasing

in recognition rate is calculated over all training cases.

(0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP)

Training Case Eigenface Spectroface

0 1 2 3 4 5 0 1 2 3 4 5 normal only 14.6 51.4 46.3 38.9 40.9 19.7 19.1 62.4 47.1 62.7 49.9 24 normal + up8 6.6 33.1 30.1 24.9 27.1 9.7 7.7 42.9 30.9 43.6 33.2 9.2 normal + down8 9.8 34.3 30.2 23.9 27.2 15.8 12.8 41.3 27.4 41.9 31.5 17.3normal + up8 + down8 0.7 19.2 14.7 9.8 11.5 3.2 0.7 21.9 10.5 21.4 14.3 2.1 normal + up17 5.8 40.7 31.8 31.9 32.3 8.9 8 48.7 31 45.2 35.7 9.5 normal + down17 8.7 38.2 33.3 25.1 26.7 14.1 13.1 48.3 29.4 49.8 32.2 17.7normal + up17 + down17 0 31.7 21.2 19.2 18.1 0.5 0.9 33.9 12.9 32.6 17.8 2.5 Average Decreasing in Recognition Rate 6.6 35.5 29.7 24.8 26.3 10.3 8.9 42.8 27 42.5 30.7 11.8

Fig.6.11 (a) and (b) shows the average decreasing in recognition rates after applying each of the five approaches on Eigenface and Spectroface respectively.

116


Figure 6.11: Average decreasing in recognition rates when applying each of the five illumination normalization approaches before and after scaling the Face 94 database

It’s clear from Fig.6.10 that the proposed approach has the least side-effect due to scaling variation on both Eigenface and Spectroface methods among the four other approaches, while has side-effect slightly more than that in the NONE case (i.e. without applying any approach). Note that the performance of the other four approaches is dramatically affected by the scaling variation over both recognition methods.

6.5 Summary In this chapter, we establish comparative studies between the proposed illumination normalization approach and four best-of-literature approaches over images with illumination variation and images with other facial and geometrical variations using the two selected face recognition methods. When dealing with illuminated images, the results show that the proposed approach is the best one in the Eigenface method and the second best after SSR-HM in the Spectroface method when the images are not perfectly aligned. However, the SSR-HM is the best one in both methods when the images are perfectly aligned. In addition, the proposed approach is the minimum affected approach (i.e. most robust) due to the non-aligning of faces on both methods. When dealing with non-illuminated images, the proposed approach brings the least side-effects, among the four other approaches, in both methods for each of the two-facial variations and the two-geometrical variations (except in the Spectroface method over pose variation, it comes in second place after LBP). Moreover, the performance of the SSR-HM is dramatically affected by the two-geometrical variations, translation and scaling, on both recognition methods. Thus, we can conclude the following about the proposed approach:

1. It’s flexible to different face recognition approaches, as it usually gives the best results on both methods either on illumination variations or other variations.

117

2. It’s robust to the non-aligning of faces rather than the other four approaches which show great difference in performance when applied to non-aligned face images.

3. It has the least side-effects, among the four other approaches, over both facial variations and geometrical variations.

The following chapter of this thesis will summarize the conclusion of the work together with the suggestions for the future works.

118

CHAPTER 7: Conclusions and Future Works

7.1 Conclusions Although many face recognition techniques and systems have been proposed, evaluations of the state-of-the-art techniques and systems have shown that recognition performance of most current technologies degrades due to the variations of illumination. As prove for this, the last face recognition vendor test FRVT 2006 concludes that relaxing the illumination condition has a dramatic effect on the performance. Moreover, It has been proven both experimentally and theoretically that the variations between the images of the same face due to illumination are almost always larger than image variations due to change in face identity. There has been much work dealing with illumination variation in face recognition. Although most of these approaches can cope with illumination variation well, some may bring negative influence on images without illumination variation. In addition, some approaches show great difference on performance when combined with different recognition methods. Some other approaches require perfect alignment of face within the image which is difficult to achieve in practical/real-life systems. In this thesis, we propose an illumination normalization approach that is based on enhancing the image resulting from histogram matching; called GAMMA-HM-COMP. The proposed approach is compared with four best-of-literatures approaches selected among 38 different approaches based on surveying nine comparative studies. The comparison is performed on images with illumination variation and images with other facial and geometrical variations using two face recognition methods. These two methods are chosen to represent the two broad categories of the holistic-based approach – namely Standard Eigenface method from the Eigenspace-based category and Spectroface from the Frequency-based category. The results show that the proposed approach is the best one in the Eigenface method and the second best in the Spectroface method when dealing with illuminated images that are not perfectly aligned. Moreover, the performance of the other approaches is significantly affected by the aligning of the faces inside images, opposite to the proposed approach which not significantly affected by the aligning condition. In addition, the proposed approach brings the least side-effects, among the four other approaches, in each of the two methods when dealing with either facial or geometrical variations.

119

These results lead to conclude that the proposed approach: 1. is flexible to different face recognition approaches, 2. is robust to the non-aligning of faces, and 3. brings the least side-effects on images with either facial or geometrical variations.

In addition, this thesis establishes an environment that can be used for further studying the effects of any preprocessing/illumination normalization approach. The environment consists of:

1. Two face recognition methods representing the two broad categories of the holistic-based face recognition approach – namely Standard Eigenface method from the Eigenspace-based category and Spectroface from the Frequency-based category.

2. Seven databases representing five different face recognition variations with suitable database(s) for each variation. The variations include three-facial, which are 3D pose, expressions and non-uniform illumination, and two-geometrical, which are translation and scaling.

The comparative study between these two face recognition methods over the five variations show that the Spectroface method outperforms the Eigenface method in each of 3D pose, facial expressions, non-uniform illumination and translation variations while the Eigenface method is better in the scaling variation. Finally, we nominate seven illumination normalization approaches that are considered the best-of-literature approaches based on surveying nine different comparative studies containing 38 different approaches. These seven approaches can be considered as benchmark that can be used for comparing any further new/suggested illumination normalization approach.

120

7.2 Future Works As the proposed approach depends mainly on the histogram matching, it’s possible to apply the HM in block-wise manner rather than on the whole face. This can produce better results as it will normalize the illumination of each face region separately according to its illumination condition. Another possible modification is to apply the region-based GIC to automate the selection of gamma value over each region separately rather than using a single, fixed, gamma value over the whole face. In this work, all illumination normalization approaches are tested on two face recognition methods representing the two broad categories of the holistic-based approach. It’s important to extend this work to include both local-based and hybrid face recognition methods in testing these approaches as they may introduce difference in performance when combined with such methods. Moreover, the environment established in this work can be extended to include additional databases (specially illuminated ones) and/or other face recognition variations (e.g. aging). Here in this work we nominate seven best-of-literature illumination normalization approaches as benchmark but compare with four only. The remaining three approaches can be implemented and included in comparisons on images with illumination variation and images with other facial and geometrical variations. Finally, this work introduces a technology evaluation for the proposed approach and the other best-of-literature approaches. In order to complete the thorough evaluation cycle, both scenario and operational evaluations need to be performed for these approaches.

121

References [1] Maltoni, D.; Maio D., Jain A.K. & Prabhakar S. Handbook of Fingerprint Recognition, Springer,

New York, 2003. [2] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, Face recognition: A literature survey, ACM

Computing Surveys (CSUR), 35(4), 399–458, 2003. [3] Xiaoyang Tan , Songcan Chen , Zhi-Hua Zhou , Fuyan Zhang, Face recognition from a single image

per person: A survey, Pattern Recognition, v.39 n.9, pp.1725-1745, Sep 2006. [4] Matthew Curtis Hesher, Automated Face Tracking and Recognition, M.Sc. Thesis, the Florida State

University, 2003. [5] P. J. B. Hancock, V. Bruce, and A. M. Burton, “Recognition of unfamiliar faces,” Trends in

Cognitive Sciences, vol. 4, pp. 330–337, 2000. [6] Daugman, John. Face and Gesture Recognition: Overview. IEEE Transactions on Pattern Analysis

and Machine Intelligence, vol. 19, issue 7, Jul 1997. [7] Gong, Shaogang, Stephen J. McKenna, Alexandra Psarrou. Dynamic Vision: From Images to Face

Recognition. Imperial College Press, 2000. [8] Hochberg, Julian, Ruth Ellen Galper. Recognition of Faces: I. An Exploratory Study. Psychonomic

Science, vol. 9, pp. 619-620, 1967. [9] Brigham, J. C., A. Maass, L. D. Snyder, K. Spaulding. Accuracy of Eyewitness Identification in a

Field Setting. Journal of Personality and Social Psychology, vol. 42, pp. 673-681, 1982. [10] Brown, E., K. De_enbacher, W. Sturgill. Memory for Faces and the Circumstances of Encounter.

Journal of Applied Psychology, vol. 62, pp. 311-318, 1977. [11] L. Torres, Is there any hope for face recognition?, Proc. of the 5th International Workshop on Image

Analysis for Multimedia Interactive Services, WIAMIS 2004, Lisboa, Portugal, Apr 2004. [12] P. J. Phillips, W. T. Scruggs, A. J. O'Toole, P. J. Flynn, K.W. Bowyer, C. L. Schott, and M. Sharpe,

”FRVT 2006 and ICE 2006 Large-Scale Results”, National Institute of Standards and Technology, NISTIR 7408, http://face.nist.gov, 2007.

[13] Y. Adini, Y. Moses, and S. Ullman, “Face recognition: The problem of compensating for changes in illumination direction” IEEE Tran. PAMI, 19(7), 721-732, 1997.

[14] W. Zhao, and R. Chellappa, “Robust face recognition using symmetric shape-from-shading” Technical report, Center for Automation Research, University of Maryland, 1999.

[15] Laurenz Wiskott, Jean-Marc Fellous, Norbert Kruger, Christoph von der Malsburg, "Face Recognition by Elastic Bunch Graph Matching," IEEE Transactions on Pattern Analysis and Machine Intelligence ,vol. 19, no. 7, pp. 775-779, Jul 1997.

[16] Liao, R. and Li, S. Z. 2000. Face Recognition Based on Multiple Facial Features. In Proceedings of the Fourth IEEE international Conference on Automatic Face and Gesture Recognition 2000. FG. IEEE Computer Society, Washington, DC, 239, Mar 2000.

[17] E. Hjelmås. Biometric Systems: A Face Recognition Approach. In Proceedings of the Norwegian Conference on Informatics, pp. 189-197, 2000.

[18] B. Heisele P. Ho and T. Poggio, “Face Recognition with Support Vector Machines: Global versus Component-Based Approach,” Proc. Int'l Conf. Computer Vision, vol. 2, pp. 688-694, 2001.

[19] ZANA, Y.; CESAR-JR, R. M.; BARBOSA, R. A. Automatic face recognition system based on local Fourier-Bessel features. In: BRAZILIAN SYMPOSIUM ON COMPUTER GRAPHICS AND IMAGE PROCESSING, 18. (SIBGRAPI), Brazil, 2005.

[20] R. Paredes and E. Vidal and F. Casacuberta, Local Features for Biometrics-Based Recognition, 2nd COST 275 Workshop, Biometrics on the Internet Fundamentals, Advances and Applications, 2004.

[21] Timo Ahonen, Abdenour Hadid, Matti Pietikäinen, "Face Description with Local Binary Patterns: Application to Face Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence ,vol. 28, no. 12, pp. 2037-2041, Dec 2006.

[22] T.I. El-Arief, K.A. Nagaty, and A.S. El-Sayed, “Eigenface vs. Spectroface: A Comparison on the Face Recognition Problems”, IASTED Signal Processing, Pattern Recognition, and Applications (SPPRA), Austria, 2007.

[23] M. Kirby and L. Sirovich, Application of karhunen-Loeve procedure for characterization of human faces, IEEE Patt. Anal. Mach. Intell., 12, 103-108, 1990.

122

[24] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience, 3(1), 71-86, 1991.

[25] Zhao .W, Chellappa .R and Phillips .P `Subspace Linear Discriminant Analysis for Face Recognition', technical report, Center for Automation Research, University of Maryland, College park, 1999.

[26] Liu, C. and Wechsler, H. 2000. Evolutionary Pursuit and Its Application to Face Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 22, 6, 570-582, Jun 2000.

[27] Déniz, O., Castrillón, M., and Hernández, M. 2003. Face recognition using independent component analysis and support vector machines. Pattern Recogn. Lett. 24, 13, 2153-2157, Sep 2003.

[28] Bartlett, M.S., Movellan, J.R., T.J., Sejnowski.: Face Recognition by Independent Component Analysis. IEEE Transactions on Neural Networks, vol. 13, No. 6, Nov 2002.

[29] Wu, J. and Zhou, Z. 2002. Face recognition with one training image per person. Pattern Recogn. Lett. 23, 14, 1711-1719, Dec 2002.

[30] Moghaddam, B., Jebara, T., and Pentland, A. Bayesian modeling of facial similarity. In Proceedings of the 1998 Conference on Advances in Neural information Processing Systems II D. A. Cohn, Ed. MIT Press, Cambridge, MA, 910-916, 1999.

[31] Shaohua Zhou Chellappa, R. Moghaddam, B., Intra-personal kernel space for face recognition, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, Proceedings. pp. 235- 240, 2004.

[32] P. Belhumeur, J. Hespanha, and D. Kriegman, Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, Proc. of the Fourth European Conference on Computer Vision, Vol. 1, Cambridge, UK, pp. 45-58, Apr 1996.

[33] C. Liu and H. Wechsler, Comparative Assessment of Independent Component Analysis (ICA) for Face Recognition, Second International Conference on Audio- and Video-based Biometric Person Authentication, Washington D.C., USA, Mar 1999.

[34] K. Baek, B. Draper, J.R. Beveridge, and K. She, PCA vs. ICA: A Comparison on the FERET Data Set, Proc. of the Fourth International Conference on Computer Vision, Pattern Recognition and Image Processing, Durham, NC, USA, pp. 824-827, Mar 2002.

[35] B. Moghaddam, Principal Manifolds and Probabilistic Subspaces for Visual Recognition, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 24, No. 6, pp. 780-788, Oct 2002.

[36] J.R. Beveridge, K. She, B. Draper, and G.H. Givens, A Nonparametric Statistical Comparison of Principal Component and Linear Discriminant Subspaces for Face Recognition, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, Kaui, HI, USA, pp. 535- 542, Dec 2001.

[37] A. Martinez and A. Kak, PCA versus LDA, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 23, No. 2, pp. 228-233, Feb 2001.

[38] P. Navarrete and J. Ruiz-del-Solar, Analysis and Comparison of Eigenspace-Based Face Recognition Approaches, International Journal of Pattern Recognition and Artificial Intelligence, Vol. 16, No. 7, pp. 817-830, Nov 2002.

[39] Delac, K., Grgic, M., and Grgic, S., Independent comparative study of PCA, ICA, and LDA on the FERET data set, Int'l Journal of Imaging Systems and Technology, 15(5), 252-260, 2005.

[40] J. Ruiz-del-Solar and P. Navarrete, "Eigenspace-based face recognition: a comparative study of different approaches," IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, vol. 35, no. 3, pp. 315-325, 2005.

[41] Pan, Z. and H. Bolouri, High speed face recognition based on discrete cosine transforms and neural networks, Technical Report, Univ. of Hertfordshire, UK, 1999.

[42] Spiess, H. and Ricketts, I., Face recognition in Fourier space, Proc. Vision Interface Conf., Montreal, Canada, 2000.

[43] J. H. Lai, P. C. Yuen, and G. C. Feng, Face recognition using holistic Fourier invariant features, Pattern Recognition, 34(1), 95–109, 2001.

[44] Dai, D., Feng, G., Lai, J., and Yuen, P. C. Face Recognition Based on Local Fisher Features. In Proceedings of the Third international Conference on Advances in Multimodal interfaces. T. Tan, Y. Shi, and W. Gao, Eds. Lecture Notes In Computer Science, vol. 1948. Springer-Verlag, London, 230-236, 2000.

[45] G. C. Feng, P. C. Yuen, and D. Q. Dai, Human face recognition using PCA on wavelet subband, Journal of Electronic Imaging, 9(2), 226-233, 2000.

123

[46] D.B. Graham and N.M. Allinson, Characterizing virtual eigen signatures for general purpose face recognition, Face Recognition: From Theory to Applications, H. Wechsler, P.J. Phillips, V. Bruce, F. Fogelman-Soulie, and T.S. Huang, eds., 163, 446-456, 1998.

[47] Z. Pan, R. Adams, and H. Bolouri, “Dimensionality reduction of face images using discrete cosine transforms for recognition." submitted to IEEE Conference on Computer Vision and Pattern Recognition, 2000.

[48] Zana, Y. and Cesar, R. M. 2006. Face recognition based on polar frequency features. ACM Trans. Appl. Percept. 3, 1, 62-82, Jan 2006.

[49] Saradha, A. & Annadurai, S. A Hybrid Feature Extraction Approach for Face Recognition Systems. Institute of Road and Transport Technology, Erode, 2004.

[50] Casasent D et al, “Face Recognition with Pose and Illumination Variations Using New SVRDM Support-Vector Machine”, Optical Engineering, vol. 43, No. 8, Society of Photo-Optical Instrumentation Engineers, pp. 1804-1813, Aug 2005.

[51] S. Srisuk and W. Kurutach, Face Recognition using a New Texture Representation of Face Images, Proceedings of Electrical Engineering Conference, Cha-am, Thailand, pp. 1097-1102, Nov 2003.

[52] A. Ne an, "Embedded Bayesian networks for face recognition," ICME 2002 - IEEE International Conference on Multimedia and Expo, Lausanne, Switzerland, pp. 133-136, Aug 2002.

[53] Huang, J., Yuen, P. C., Lai, J. H., and Li, C. 2004. Face recognition using local and global features. EURASIP J. Appl. Signal Process., 530-541, Jan 2004.

[54] P. C. Yuen and J. H. Lai, “Face representation using independent component analysis,” Pattern Recognition, vol. 35, no. 6, pp. 1247–1257, 2002.

[55] Rao, K. S. and Rajagopalan, A. N. 2005. A probabilistic fusion methodology for face recognition. EURASIP J. Appl. Signal Process., 2772-2787, Jan 2005.

[56] Tony Mansfield, Gavin Kelly, David Chandler, and Jan Kane, “Biometric Product Testing Final Report”, CESG/BWG Biometric Test Programme. www.cesg.gov.uk, Mar 2001.

[57] Bolme, D. S., Beveridge, J. R., Teixeira, M. and Draper, B. A., The CSU Face Identification Evaluation System: Its Purpose, Features, and Structure, ICVS, Graz, Austria, 2003.

[58] V. Jain, Human face classification using neural networks, Project at IITK, Kanpur, India, 1998. [59] Sebastien M., Yann R., and Guillaume H., On the recent use of local binary patterns for face

authentication, IDIAP, Valais, Swiss, 2006. [60] J. Zhang, Y. Yan and M. Lades, Face Recognition: Eigenface, Elastic Matching, and Neural Nets,

Proceedings of the IEEE, Vol. 85, No. 9, pp. 1423-1435, 1997. [61] Alan Brooks and Li Gao, Face recognition: Eigenface and Fisherface performance across pose, Final project

report, Northwestern University, Evanston, IL, 2004. [62] Y. Zhang, L. Lang, and O. Hamsici, Facial image recognition by subspace learning: a comparative

study, Project at the Ohio State Univ., Ohio, USA, 2004. [63] Qi Li, Jieping Ye and Chandra Kambhamettu, Linear projection methods in face recognition under

unconstrained illuminations: A comparative study, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'04), Washington, USA, 2004.

[64] Georghiades, A.S., Belhumeur, P.N., and Kriegman, D.J., From few to many: illumination cone models for face recognition under variable lighting and pose, IEEE Patt. Anal. Mach. Intell., 23(6), 643-660, 2001.

[65] M. J. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, Coding facial expressions with gabor wavelets, Proceedings, Third IEEE Int’l Conf. on Automatic Face and Gesture Recognition, Nara, Japan, 1998.

[66] Levine, M. D., Gandhi, Maulin R. and Bhattacharyya, Jisnu: “Image Normalization for Illumination Compensation in Facial Images”. Department of Electrical & Computer Engineering & Center for Intelligent Machines, McGill University, Montreal, Canada, Unpublished Report (2004)

[67] Yang, J., Chen, X. Kunz, W. and Kundra, H., "Face as an index: Knowing who is who using a PDA", Inter. Journal of Imaging Systems and Technology, 13(1), pp. 33-41, 2003.

[68] Jebara, T., 3D Pose estimation and normalization for face recognition, Honours Thesis, McGill University, Canada, 1996.

[69] Dubuisson, S., Davoine, F. and Masson, M., A solution for facial expression representation and recognition. Signal Process. Image Commun. 17(9). 657-673, 2002.

124

[70] N. Ikizler, J. Vasanth, L. Wong and D. Forsyth, Finding Celebrities in Video, Technical Report, EECS Department, University of California, Berkeley, USA, 2006. http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-77.html

[71] B. Du, S. Shan, L. Qing, W. Gao, “Empirical comparisons of several preprocessing methods for illumination insensitive face recognition”, Proceedings ICASSP'05, V(2), 2005.

[72] Mauricio Villegas Santamar´_a and Roberto Paredes Palacios, “Comparison of Illumination Normalization Methods for Face Recognition”, Third COST 275 Workshop Biometrics on the Internet, Univ. of Hertfordshire, UK, 2005.

[73] Hammal, Z., Eveno, N., Caplier, A., and Coulon, P. 2006. Parametric models for facial features segmentation, Signal Process., 86 (2), pp. 399-413, Feb 2006.

[74] R. C. Gonzales and R. E. Woods, Digital Image Processing, Addison Wesley Publishing Company, Inc., New York, 1993.

[75] W. Beaudot, “The neural information processing in the vertebra retina: a melting pot of ideas for artificial vision”, Phd thesis, tirf laboratory, Grenoble, France, 1994.

[76] Z. Hammal, C. Massot, G. Bedoya, and A. Caplier, "Eyes segmentation applied to gaze direction and vigilance estimation," Proc. ICAPR '05, 236-246, UK, 2005.

[77] A. S. El-Sayed, K. A. Nagaty, T. I. El-Arief, “An Enhanced Histogram Matching Approach using the Retinal Filter’s Compression Function for Illumination Normalization in Face Recognition”, ICIAR’08, Springer-Verlag LNCS 5112, pp. 873–883, Portugal, 2008.

[78] Phillips P. J., Grother P., Micheals R. J, Blackburn D.M., Tabassi E., and Bone J. M. “FRVT 2002: Evaluation Report”, http://www.frvt.org/DLs/FRVT_2002_Evaluation_Report.pdf, Mar 2003.

[79] Peter Belhumeur. “Ongoing Challenges in Face Recognition”, pp. 5-14, Frontiers of Engineering: Reports on Leading-Edge Engineering from the 2005 Symposium, ISBN-10: 0-309-10102-6, 2006

[80] Heusch, G., Rodriguez, Y., and Marcel, S., “Local Binary Patterns as an Image Preprocessing for Face Authentication”, In Proceedings of the 7th international Conference on Automatic Face and Gesture Recognition (Fgr06) – Vol. 00, Washington, Apr 2006.

[81] R. Basri and D.W. Jacobs, “Lambertian reflectance and linear subspaces”, IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(2):218–233, 2003.

[82] Gross, R., Matthews, I., and Baker, S., “Eigen Light-Fields and Face Recognition Across Pose”. In Proceedings of the Fifth IEEE international Conference on Automatic Face and Gesture Recognition, Washington, USA, 2002.

[83] T.Sim, T.Kanade, Combining Models and Exemplars for Face Recognition: An Illuminating Example, In Proceedings of Workshop on Models versus Exemplars in Computer Vision, CVPR 2001.

[84] Shan, S., Gao, W., Cao, B., and Zhao, D. “Illumination Normalization for Robust Face Recognition Against Varying Lighting Conditions”, In Proceedings of the IEEE international Workshop on Analysis and Modeling of Faces and Gestures AMFG, Washington, Oct 2003.

[85] R. Gross and V. Brajovic. An image preprocessing algorithm for illumination invariant face recognition. In Audio- and Video-Based Biometric Person Authentication, AVBPA’03, 2003.

[86] T. Ojala, M. Pietikäinen, and D. Harwood. A comparative study of texture measures with classification based on featured distributions. Pattern Recognition, 29(1):51–59, 1996.

[87] T. Ojala, M. Pietikäinen, and T. Mäenpää. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(7):971–987, 2002.

[88] J. Short, J. Kittler, and K. Messer. Photometric normalization for face verification. In Audio- and Video-based Biometric Person Authentication, AVBPA’05, 2005.

[89] D. J. Jobson , Z. Rahman, G. A. Woodell, “A Multiscale Retinex for Bridging the Gap Between Color Images and the Human Observation of Scenes,” IEEE Transactions on Image Processing, Volume: 6, No: 3, Page(s): 965-976, Jul 1997.

[90] E. Land, “The Retinex Theory of Color Vision,” Scientific American, Page(s): 108-129, Dec 1977. [91] H. Ando, N. Fuchigami, M. Sasaki and A. Iwata, "Robust Face Recognition Methods under

Illumination Variations toward Hardware Implementation on 3DCSS", The Fourth Hiroshima International Workshop on Nanoelectronics for Tera-Bit Information Processing, pp. 139-141, Sep 2005.

[92] V. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.

125

[93] A. Brunelli and T. Poggio, Face Recognition: Features vs Templates, IEEE Trans. Pattern Anal. Mach. Intelligence, 15(10):1042-1053, 1993.

[94] P. Belhumeur, D. Kriegman and A. Yuille, The bas-relief ambiguity, In proc. IEEE Conf. on Comp. Vision and Patt. Recog., p.p 1040-1046, 1997.

[95] A.Georghiades, D. Kriegman, and P. Belhumeur, “From Few to Many: Generative Models for Recognition under Variable Pose and Illumination,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 40, pp. 643-660, 2001.

[96] Kuang-Chih Lee, Jeffrey Ho, David J. Kriegman, "Acquiring Linear Subspaces for Face Recognition under Variable Lighting," IEEE Transactions on Pattern Analysis and Machine Intelligence ,vol. 27, no. 5, pp. 684-698, May 2005.

[97] H. Chen, P. Belhumeur, and D. Jacobs, “In Search of Illumination Invariants,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2000.

[98] L. Zhang and D. Samaras. Face recognition under variable lighting using harmonic image exemplars. In CVPR, volume 1, pages 19–25. IEEE Computer Society, 2003.

[99] M. Savvides, B. V. K. Vijaya Kumar, and P. L. Khosla, “Corefaces - Robust Shift Invariant PCA based Correlation Filter for Illumination Tolerant Face Recognition” CVPR, 2004.

[100] A.Shashua, and T. Riklin-Raviv, “The quotient image: Class-based re-rendering and recognition with varying illuminations” IEEE Tran. PAMI, Vol. 23(2), pp. 129-139, 2001.

[101] H. Wang, S. Z. Li, and Y. Wang, “Generalized Quotient Image” CVPR, 2004. [102] Sim, T., Baker, S., Bsat, M.: The CMU Pose, Illumination, and Expression (PIE) database. In: IEEE

Int. Conf. on Automatic Face and Gesture Recognition, 2002. [103] W. Gao, B. Cao, S. Shan, D. Zhou, X. Zhang, D. Zhao, “The CAS-PEAL Large-Scale Chinese Face

Database and Baseline Evaluations”, IEEE Transactions on Systems, Man and Cybernetics, Part A, 38(1), pp. 149-161, 2008. http://www.jdl.ac.cn/peal/index.html.

[104] Blackburn, D., Bone, M., Philips, P.: Facial recognition vendor test 2000: evaluation report (2000) [105] Yin, W. Total Variation Models for Variable Lighting Face Recognition. IEEE Trans. Pattern Anal.

Mach. Intell. 28, 9, 2006. [106] Peter Halinan, Alan Yeuille, and David Mumford, Harvard face Database. [107] Haitao Wang, Stan Z Li, Yangsheng Wang, "Face Recognition under Varying Lighting Conditions

Using Self Quotient Image," fg, p. 819, Sixth IEEE International Conference on Automatic Face and Gesture Recognition (FG'04), 2004.

[108] Press, W., Teukolsky, S., Vetterling, W., Flannery, B.: Numerical Recipes in C. Cambridge University Press, 1992.

[109] X. Tan and B. Triggs. Enhanced local texture feature sets for face recognition under difficult lighting conditions. In IEEE Conf. on AMFG, pages 168--182, 2007.

[110] Phillips, P.J., Flynn, P.J., Scruggs, W.T., Bowyer, K.W., Chang, J., Hoffman, K., Marques, J.,Min, J.,Worek, W.J.: Overview of the face recognition grand challenge. In: Proc. CVPR 2005, San Diego, CA, pp. 947–954, 2005.

[111] Borgefors, G.: Distance transformations in digital images. Comput. Vision Graph. Image Process. 34(3), 344–371, 1986.

[112] F. Samaria and A. Harter, Parameterization of a stochastic model for human face identification, Proc. 2nd IEEE Workshop on Applications of Computer Vision, Sarasota, USA, 1994.

[113] “Introduction to Biometrics” section of the Biometrics Catalog, http://www.biometricscatalog.org. [114] “Face Recognition Vendor Test”, www.FRVT.org. [115] Grimace database, http://cswww.essex.ac.uk/mv/allfaces/grimace.html. [116] Intel OpenCV Library, http://sourceforge.net/projects/opencvlibrary/. [117] Nott-faces database, http://pics.psych.stir.ac.uk. [118] Face 94 database, http://cswww.essex.ac.uk/mv/allfaces/face94.html. [119] Yale B face database, http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html. [120] “The Facial Recognition Technology (FERET) Database”,

http://www.itl.nist.gov/iad/humanid/feret/feret_master.html. [121] Face normalization and descriptor code, http://lear.inrialpes.fr/people/triggs/src/amfg07-demo-

v1.tar.gz. [122] AcSys Biometrics, http://www.acsysbiometrics.com. [123] Berninger Software BmbH, http://www.berningersoftware.de. [124] Biometrica Systems, Inc., http://www.biometrica.com.

126

[125] Cognitec Systems, http://cognitec-systems.de. [126] Dialog Communication Systems Inc., http://www.bioid.com.tw/. [127] FaceKey Corp., http://www.facekey.com. [128] Humanscan GmbH, http://www.bioid.com. [129] Identix, Inc., http://www.identix.com. [130] Imagis Technologies, Inc., http://www.imagistechnologies.com. [131] Keyware Technologies N.V., http://www.keyware.com. [132] Neurodynamics Limited, http://www.neurodynamics.com. [133] Viisage Technology, http://www.viisage.com/facetools.htm. [134] VisionSphere Technologies, http://www.visionspheretech.com. [135] ZN Vision Technologies, http://www.zn-ag.com.

127

ملخص الرسالة لهذه أحدث تقييمإال أن ، خالل السنوات الماضية قد اقترحتالتعرف على الوجه ان العديد من نظم وتقنيات من رغمبال 2006لهذه النظم عام خر اختبارآفي . ةضاءتغيرات اإلل نظرا ها يتأثر سلباالتقنيات قد أظهر أن أداء معظمونظم ال

ة أن بالتجربنظريا ووعالوة على ذلك، ثبت . هذه النظم داءآتأثير آبير على لهة ءضااإلشروط الى ان تخفيف خلص نتيجة للتغيير في تحدث التى االختالفات أآبر من يكون غالبا بسبب اإلضاءةالشخص االختالفات بين صور نفس

.هذا الشخص هوية

على الرغم من ان . التعرف على الوجهاءة في اإلضاألبحاث والطرق المختلفة للتعامل مع تأثيرات هناك الكثير من له تأثير سلبي على يكون قد إال أن بعضها ، بشكل فعال اإلضاءةات التغلب على اختالف هايمكنالطرق معظم هذه

بطرق داء عندما يقترناآلفي آبير والبعض اآلخر يظهر فرق .التي لم تتعرض الختالفات بسبب اإلضاءةالصور قد مر وهو أالصورة وجه ضمنة التطلب مواءموهناك البعض اآلخر من هذه الطرق ي. ى الوجهمختلفة للتعرف عل

.نظم العملية الحقيقيةالصعب تحقيقه في ي

داء اآلفي آبير بحيث ال تظهر فرق اإلضاءة للتعامل مع اختالفات جديدة الرسالة طريقةفي هذه لذلك فإننا نقترح مما يسهل استخدامها الصورة وجه ضمنالمواءمة على الوجه وآذلك ال تشترط بطرق مختلفة للتعرف قترنتعندما .نظم العملية الحقيقيةفي ال

اقترانها بطرق مختلفة للتعرف على الوجه وآذلك داء عنداآلفي آبير أن الطريقة المقترحة ال تظهر فرق للتحقق من فئتين آبيرتين من ارها على طريقتين مختلفتين تمثالن ، فإنه قد تم اختبالصورة وجه ضمنالمواءمة عدم اشتراطها ل

قاعدة استخدام تم في آل طريقة، .طرق التعرف على الوجه والمعتمدة على دراسة الوجه آامال وليس أجزاء منهحيث يكون الطريقة المقترحة مرتين، بختبار بيانات من الصور التي تحتوي على اختالفات شديدة في األضاءة ال

الوجه مواءما ضمن الصورة في المرة األولى وال يكون مواءما في المرة الثانية وذلك حتى نتمكن من الحكم على .الطريقة المقترحة ومدى احتياجها لهذا الشرط من عدمه

وقمنا رنةدراسات المقا من ةتسعفإننا قمنا بعمل مسح ل، مناظرة خرىطرق أمع ةالمقترحالطريقة جل مقارنة أمن تم اختبارها الخمس الطرق جميع . الدراساتشملتهم هذه مختلفة طريقة 38بين باختيار أفضل أربعة طرق من

بها اختالفات بسبب على صور طريقتي التعرف على الوجه، والتي تم اختيارهما مسبقا، وذلك باستخدام تهامقارنو .وصور أخرى بها اختالفات غير اإلضاءةاإلضاءة

ة آانت األفضل من بين الطرق المقترح طريقةالأن اإلضاءة التي بها اختالفات بسبب صوررت النتائج على الأظهاألخرى على طريقة التعرف األولى وثاني أفضل طريقة على طريقة التعرف الثانية وذلك عندما يكون الوجه غير

قل تأثرا في نتائجها من بين األربع طرق األخرى إضافة إلى ذلك، آانت الطريقة المقترحة األ. مواءما ضمن الصورةالتي بها صورأما على ال. نتيجة لعدم مواءمة الوجه ضمن الصورة وذلك على آلتا طريقتي التعرف المستخدمتين

من بين األربع طرق تأثير سلبي ة لها أقل المقترح طريقةالفقد أظهرت النتائج أن اإلضاءة،اختالفات أخرى غير .آلتا طريقتي التعرفعلى أيضا لكاألخرى وذ

فئتين طريقتي تعرف على الوجه تمثالن على في هذا العمل تم اختبارها اإلضاءةجميع طرق التعامل مع اختالفات توسيع هذا العمل لذلك فإنه من المهم أن يتم . الطرق المعتمدة على دراسة الوجه آامال وليس أجزاء منهآبيرتين من

128

داء عندما تقترن هذهفرق آبير في األيحدث نه قد أللمعتمدة على دراسة أجزاء من الوجه أيضا وذلك الطرق اليشمل .اإلضاءةالطرق مع طرق التعامل مع اختالفات

لذلك فإنه من المهم . الطرق األخرى المختارةوة المقترحللطريقة تكنولوجياا وعالوة على ذلك، يقدم هذا العمل تقييم

.لهاجل انجاز تقييم شامل الطرق من أتنفيذي ألداء هذه آذلك تقييم و يناريوسعمل تقييم

129

في التعرف على الوجه اإلضاءةمشكلة للتعامل مع جديدةريقة ط

بكلية الحاسبات والمعلومات جامعة عين شمس علوم الحاسبرسالة مقدمة إلى قسم آجزء من متطلبات الحصول على درجة الماجستير فى الحاسـبات والمعلومـات

إعـداد

مد صالح الدين محمد السيدأح

علوم الحاسبمعيد بقسم آلية الحاسبات والمعلومات

جامعة عين شمس

تحت اشراف

طه إبراهيم العريف/ االستاذ الدآتور

األستاذ بقسم علوم الحاسب و المعلوماتآلية الحاسبات


هيثم المسيري/ الدآتور

بقسم علوم الحاسبالمدرس و المعلوماتحاسبات آلية ال


2009

جامعة عين شمسسبات و المعلوماتآلية الحا

قسم علوم الحاسب

[ms.c thesis] a new illumination normalization approach for face recognition 2009

Documents