Research Article

Fusion of Facial Expressions and EEG for Multimodal Emotion Recognition
Yongrui Huang, Jianhao Yang, Pengkai Liao, and Jiahui Pan

School of Software, South China Normal University, Guangzhou 510641, China
Correspondence should be addressed to Jiahui Pan; panjh82@qq.com
Received 31 May 2017; Revised 24 July 2017; Accepted 14 August 2017; Published 19 September 2017

Academic Editor: Luis Vergara

Copyright © 2017 Yongrui Huang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This paper proposes two multimodal fusion methods between brain and peripheral signals for emotion recognition. The input signals are electroencephalogram (EEG) and facial expression. The stimuli are based on a subset of movie clips that correspond to four specific areas of valence-arousal emotional space (happiness, neutral, sadness, and fear). For facial expression detection, four basic emotion states (happiness, neutral, sadness, and fear) are detected by a neural network classifier. For EEG detection, four basic emotion states and three emotion intensity levels (strong, ordinary, and weak) are detected by two support vector machine (SVM) classifiers, respectively. Emotion recognition is based on two decision-level fusion methods of both EEG and facial expression detections, using a sum rule or a production rule. Twenty healthy subjects attended two experiments. The results show that the accuracies of the two multimodal fusion detections are 81.25% and 82.75%, respectively, both higher than that of facial expression (74.38%) or EEG detection (66.88%). The combination of facial expression and EEG information for emotion recognition compensates for their defects as single information sources.
1. Introduction
Emotion plays a powerful role in social influence: not only does it include psychological responses to external stimuli or one's own stimuli, but it is also accompanied by physiological responses to psychological reactions in individuals' daily lives. Emotional influences are manifested across a variety of levels and modalities [1]. On the one hand, peripheral signals are related to the somatic nervous system and show physiological changes in emotion states; for instance, physical signals emerge as facial expressions, verbal speech, or body language. On the other hand, there are also influences on cognitive processes, including coping behaviors such as wishful thinking, resignation, or blame-shifting. The goal of our research is to perform a multimodal fusion between EEG and peripheral physiological signals for emotion recognition.
Previous studies have investigated the use of peripheral and brain signals separately, but little attention has been paid thus far to a fusion between brain and peripheral signals. In one study, Ekman and Friesen made a pioneering contribution to modern facial expression recognition [2]. They defined the six basic expressions of human beings, that is, pleasure, anger, surprise, fear, disgust, and sadness, and identified the categories of objects to be investigated. Mase made use of optical flow to determine the main direction of movement of the muscles and then constructed a face recognition system [3]. Picard and Daily at the MIT Media Laboratory developed pattern recognition algorithms that attained 78.4% classification accuracy for three categories of emotion states using the peripheral signals of galvanic skin resistance, blood pressure, respiration, and skin temperature [4].
Compared to peripheral physiological signals, EEG signals have been proven to provide greater insight into emotional processes and responses. Furthermore, because EEG has been widely used in BCIs, the study of EEG-based emotion detection may provide great value for improving the user experience and performance of BCI applications. Chanel et al. reported an average accuracy of 63% by using EEG time-frequency information as features and a support vector machine (SVM) as a classifier to characterize EEG
[Hindawi, Computational Intelligence and Neuroscience, Volume 2017, Article ID 2107451, 8 pages, https://doi.org/10.1155/2017/2107451]
signals into three emotion states [5]. Nasehi et al. made use of quadratic discriminant analysis and SVM to classify emotions into the six categories of pleasure, surprise, anger, fear, disgust, and sadness, achieving accuracies of 62.3% and 83.33%, respectively [6]. Ishino and Hagiwara categorized user status into four emotion states using neural networks, with accuracies ranging from 54.5% to 67.7% for each of the four emotion states [7]. However, the use of EEG-based emotion recognition is still in its infancy.
In recent years, with the development of multisource heterogeneous information fusion processing, it has become possible to fuse features from multicategory reference emotion states, and the ability of different types of signals to support each other through supplementary fusion processing can be greatly improved. Therefore, researchers have begun to exploit the complementarity among facial expressions, voice messages, eye movements, gestures, physiological signals, and other channels of emotional information to study identification problems, that is, multimodal emotion recognition [8]. Most previous works have focused on the fusion of audiovisual information for automatic emotion recognition, for example, combining speech with facial expression. Busso et al. proposed a rule-based decision-level fusion method for the combined analysis of speech and facial expressions [9]. Wagner et al. used boosting techniques to automatically determine adaptive weights for audio and visual features [10]. A few studies have focused on the multimodal fusion of EEG and physiological signals. In one study [11], the International Affective Picture System (IAPS) was utilized as stimuli, and the use of self-assessment labels for arousal assessment yielded accuracies of 55%, 53%, and 54% for EEG, physiological, and fused features, respectively. All of these studies have shown that the performance of emotion recognition systems can be improved by employing multimodal information fusion.
In this study, we propose two multimodal fusion methods combining brain and peripheral signals for emotion recognition. The input signals are electroencephalogram and facial expression. The stimuli are based on a subset of movie clips that correspond to four specific areas of valence-arousal emotional space (happiness, neutral, sadness, and fear). For facial expression detection, four basic emotion states are detected by a neural network classifier. For EEG detection, four basic emotion states and three emotion intensity levels (strong, ordinary, and weak) are detected by two SVM classifiers, respectively. Emotion recognition is based on two decision-level fusion methods of both EEG and facial expression detections, using a sum rule or a production rule. Twenty healthy subjects attended two experiments. The results show that the accuracies of the two multimodal fusion detections are 81.25% and 82.75%, respectively, both higher than that of facial expression or EEG detection. The combination of facial expression and EEG information for emotion recognition compensates for their defects as single information sources.
2. Methods
2.1. Data Acquisition System. A MindWave Mobile device (NeuroSky, Inc., Abbotsford, Australia) was used to capture scalp EEG signals, and a Logitech camera (25 FPS, 800 × 600 image size) was used to capture facial expressions. According to the standard 10–20 system, the EEG signals are referenced to the right mastoid. The EEG signals used for analysis were recorded from the "Fz" electrode. The impedances of all electrodes were maintained below 5 kΩ.

2.2. Data Processing and Algorithm. For our proposed system, the EEG and facial expression detectors were designed separately. The EEG and image data were fed into the two detection procedures simultaneously. Figure 1 shows the data processing procedure. The analysis methods and algorithms used in this study are described below.
2.2.1. Facial Expression Detection. For face feature extraction, the face position is detected in real time by the AdaBoost algorithm based on Haar eigenvalues. The Haar classifier uses the AdaBoost algorithm of the Boosting family, resulting in a cascade of weak classifiers trained by AdaBoost. We use the Haar-like features of the image as input to the classifier; the output of the classifier is whether the image contains a human face [12]. When a human face is found in the input image, we resize it to 48 pixels in width and 48 pixels in height. Next, the PCA method is used to reduce the dimensionality. The output of this phase is 169 dimensions (obtained by the grid search method mentioned below) after dimensionality reduction.
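The reduction step can be sketched as follows. This is a minimal illustration of the PCA stage only, assuming the face has already been detected, cropped, and resized to 48 × 48 (in practice via OpenCV's Haar `CascadeClassifier`, omitted here); the function and array names are ours, not the paper's.

```python
import numpy as np

def pca_reduce(face_vectors, n_components=169):
    """Project flattened 48x48 face images (2304-dim rows) onto the
    top principal components, mirroring the paper's 169-dim reduction."""
    X = np.asarray(face_vectors, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, mean, components

# Toy usage: 10 fake "face images" of 48*48 pixels; with only 10
# samples, at most 10 components exist, so we keep 9 for the demo
rng = np.random.default_rng(0)
faces = rng.random((10, 48 * 48))
features, mean, comps = pca_reduce(faces, n_components=9)
print(features.shape)  # (10, 9)
```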
The feature vectors are then fed to a feedforward neural network. Figure 2 shows the architecture of the proposed system for face expression classification. Applying the trained feedforward neural network classifier to the face image features, we obtain four scores (the values of the objective function of the feedforward neural network), denoted as s_{1j} (j = 1, ..., 4), where j indexes the four emotion states (happiness, neutral, sadness, and fear) detected by the face expression classifier. We normalize the four scores by mapping them to the range [0, 1]:

s_{1j} = (s_{1j} - min{s_{11}, ..., s_{14}}) / (max{s_{11}, ..., s_{14}} - min{s_{11}, ..., s_{14}}),  (1)

r_1 = argmax_j (s_{1j}).  (2)

The four normalized scores s_{1j} (j = 1, 2, 3, 4), representing the output of the feedforward neural network, are used in the decision-making step described later, and r_1 represents the emotion state detected by face expression.
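In code, the normalization of (1) and the decision of (2) amount to a min-max rescaling followed by an argmax. A small sketch with hypothetical score values (0-based indices rather than the paper's 1-based j):

```python
def normalize_scores(scores):
    """Min-max normalize classifier scores to [0, 1], as in Eq. (1)."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def decide(scores):
    """Index of the maximum score, as in Eq. (2) (0-based here)."""
    return max(range(len(scores)), key=lambda j: scores[j])

raw = [0.2, 1.4, 0.9, -0.3]   # hypothetical NN outputs for the four states
norm = normalize_scores(raw)
print(norm)          # [0.294..., 1.0, 0.705..., 0.0]
print(decide(norm))  # 1 -> second emotion state
```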
Note that the hyperparameters of the face expression classifier are determined by the grid search method, a brute-force search through a manually specified subset of the hyperparameter space of a learning algorithm. We initialize the subset of the hyperparameter space as

D ∈ {121, 144, 169, 225},
N ∈ {150, 200, 250, 300},  (3)
R ∈ {0.001, 0.01, 0.1, 1},
[Figure 1: Data processing procedure of the multimodal emotion recognition. EEG sensor and face camera data undergo feature extraction and classification (SVM scores for EEG, NN scores for faces), followed by decision fusion.]
[Figure 2: The architecture of the proposed system for face expression classification. The network has one hidden layer with 200 neurons. The input is the 169 image features obtained from dimensionality reduction, and the output is the scores of the four emotion states (happiness, neutral, sadness, and fear). The learning rate of the network is 0.1, and the sigmoid function is used as the activation function.]
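A forward pass through such a 169-200-4 sigmoid network can be sketched as below. The weights here are random placeholders, and training (which the paper performs with learning rate 0.1) is omitted; this only illustrates the architecture described for Figure 2.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """One 169-200-4 forward pass with sigmoid activations."""
    h = sigmoid(W1 @ x + b1)     # hidden layer, 200 units
    return sigmoid(W2 @ h + b2)  # output: scores for the 4 emotion states

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((200, 169)) * 0.1, np.zeros(200)
W2, b2 = rng.standard_normal((4, 200)) * 0.1, np.zeros(4)
scores = forward(rng.standard_normal(169), W1, b1, W2, b2)
print(scores.shape)  # (4,)
```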
where D is the number of dimensions, N is the number of neurons in the hidden layer, and R is the learning rate. Grid search then trains a classifier with each triple (D, N, R) in the Cartesian product of these three sets and evaluates its performance on a held-out validation set. We select the classifier parameters with the best performance and apply them to our model.
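The search over (3) can be sketched as an exhaustive loop over the Cartesian product. The `evaluate` callback is a stand-in for training plus held-out validation, which the paper does not specify in code form; the dummy scorer below is constructed only so the demo peaks at the values the paper settles on.

```python
from itertools import product

# Hyperparameter grid from Eq. (3)
D_SET = [121, 144, 169, 225]    # PCA output dimensions
N_SET = [150, 200, 250, 300]    # hidden-layer neurons
R_SET = [0.001, 0.01, 0.1, 1]   # learning rates

def grid_search(evaluate):
    """Evaluate one classifier per (D, N, R) triple in the Cartesian
    product and keep the best-performing configuration."""
    best, best_score = None, float("-inf")
    for d, n, r in product(D_SET, N_SET, R_SET):
        score = evaluate(d, n, r)  # stand-in for train + validate
        if score > best_score:
            best, best_score = (d, n, r), score
    return best

# Dummy scorer that peaks at the configuration chosen in the paper
best = grid_search(lambda d, n, r: -abs(d - 169) - abs(n - 200) - abs(r - 0.1))
print(best)  # (169, 200, 0.1)
```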
2.2.2. EEG Detection. The EEG-based detection includes two progressive stages: feature extraction based on PSD and classification using SVM. The analysis methods and algorithms used in this study are described below.
The EEG data are bandpass filtered over eight frequency bands: delta (1–3 Hz), theta (4–7 Hz), alpha1 (8–10 Hz), alpha2 (11–13 Hz), beta1 (14–20 Hz), beta2 (21–30 Hz), gamma1 (31–40 Hz), and gamma2 (41–50 Hz). We compute the traditional PSD features using the Short-Time Fourier Transform (STFT) with a 1-s nonoverlapping Hanning window. For classification, we use two linear SVM classifiers: one for emotion state classification and one for emotion intensity classification. We train on samples (x_i, y_i) and (x_i, y'_i), where

x_i = [DELTA, THETA, ALPHA1, ALPHA2, BETA1, BETA2, GAMMA1, GAMMA2],  (4)

y_i = 1 (happiness), 2 (neutral), 3 (sadness), 4 (fear),  (5)

y'_i = -1 (weak), 0 (moderate), 1 (strong),  (6)
where DELTA, THETA, ALPHA1, ALPHA2, BETA1, BETA2, GAMMA1, and GAMMA2 represent the power spectral density corresponding to the eight frequency bands mentioned above, y_i is the label of the four emotion states, y'_i is the label of the three emotion intensity levels, and x_i is the feature vector corresponding to the four emotion states or the three emotion intensity levels.
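A sketch of the band-power feature extraction behind (4), using non-overlapping 1-s Hanning windows and an FFT-based periodogram (numpy only; `scipy.signal.welch` would serve equally). The 512 Hz sampling rate is the MindWave Mobile's nominal raw-EEG rate, which the paper does not state explicitly, and the function name is ours.

```python
import numpy as np

BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha1": (8, 10), "alpha2": (11, 13),
         "beta1": (14, 20), "beta2": (21, 30), "gamma1": (31, 40), "gamma2": (41, 50)}

def band_power_features(eeg, fs):
    """Average periodogram over non-overlapping 1-s Hanning windows,
    summed into the paper's eight frequency bands."""
    win = int(fs)                      # 1-s window
    n_seg = len(eeg) // win
    hann = np.hanning(win)
    psd = np.zeros(win // 2 + 1)
    for i in range(n_seg):
        seg = eeg[i * win:(i + 1) * win] * hann
        psd += np.abs(np.fft.rfft(seg)) ** 2
    psd /= n_seg
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    return np.array([psd[(freqs >= lo) & (freqs <= hi)].sum()
                     for lo, hi in BANDS.values()])

fs = 512                               # assumed MindWave Mobile rate
t = np.arange(10 * fs) / fs
eeg = np.sin(2 * np.pi * 10 * t)       # pure 10 Hz (alpha1) test signal
x = band_power_features(eeg, fs)
print(x.shape)  # (8,) feature vector, as in Eq. (4)
```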
Applying the first trained SVM classifier to the feature vectors, we obtain four scores (the values of the objective function of the SVM), denoted as s_{2j} (j = 1, ..., 4), where j indexes the four emotion states (happiness, neutral,
sadness, and fear) detected by the EEG classifier. We normalize the four scores by mapping them to the range [0, 1]:

s_{2j} = (s_{2j} - min{s_{21}, ..., s_{24}}) / (max{s_{21}, ..., s_{24}} - min{s_{21}, ..., s_{24}}),
r_2 = argmax_j (s_{2j}).  (7)

The four normalized scores s_{2j} (j = 1, 2, 3, 4) and the index of the maximum score r_2, representing the output of the emotion state in the EEG detection, are used in the first fusion method described later.
Applying the second trained SVM classifier to the feature vectors, we obtain three scores s'_{2k} (k = 1, 2, 3) corresponding to the three emotion intensity levels (weak, moderate, and strong) and find the index of the maximum score:

r'_2 = argmax_k (s'_{2k}).  (8)

The index of the maximum score r'_2, representing the output of the emotion intensity level in the EEG detection, is used in the second fusion method described later.
2.2.3. Classification Fusion. In the decision-level fusion, the outputs generated by the two classifiers of the facial expression and EEG detections are combined. We employ two fusion methods of both EEG and facial expression detections, as follows.
For the first fusion method, we apply the sum strategy (e.g., [12]) to the decision-level fusion. Specifically, we calculate the sum of the normalized face expression classifier scores s_{1j} and EEG classifier scores s_{2j} for each of the four emotion states. Finally, we find the maximum of the four summed values:

sum_j = s_{1j} + s_{2j} (j = 1, 2, 3, 4),
r_sum = argmax_j (sum_j),  (9)

where s_{1j} (j = 1, 2, 3, 4) and s_{2j} (j = 1, 2, 3, 4) are calculated in (1) and (7), and r_sum is the index corresponding to the maximum of the summed values.
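Equation (9) reduces to an elementwise sum and an argmax. A sketch with hypothetical normalized scores (0-based indexing):

```python
def sum_rule_fusion(face_scores, eeg_scores):
    """Decision-level fusion of Eq. (9): add the normalized per-class
    scores from the two classifiers and pick the largest sum."""
    sums = [f + e for f, e in zip(face_scores, eeg_scores)]
    return max(range(len(sums)), key=lambda j: sums[j]), sums

STATES = ["happiness", "neutral", "sadness", "fear"]
# Hypothetical normalized scores: face favors sadness, EEG favors fear
face = [0.1, 0.0, 1.0, 0.8]
eeg = [0.2, 0.1, 0.3, 1.0]
idx, sums = sum_rule_fusion(face, eeg)
print(STATES[idx])  # fear (sum 1.8 beats sadness's 1.3)
```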
For the second fusion method, we adopt a decision-making strategy based on production rules, which are commonly used as a simple expert system in cognitive modeling and artificial intelligence (e.g., [13, 14]). Through the production rules, the four emotion states (happiness, neutral, sadness, or fear) and the three emotion intensity levels (strong, moderate, or weak) are combined for emotion recognition. A production rule consists of an IF part (a condition or premise) and a THEN part (an action or conclusion). The form of a production rule is

R_i: IF P THEN Q,  (10)

where R_i represents rule i, P is the antecedent of rule i, and Q is the consequent of rule i. In this study, P is formed by (r_1, r'_2), where r_1 represents the emotion state detected by facial expression
Table 1: The production rules combining the emotion state and intensity level.

R_i   P                      Q
R1    (Happiness, strong)    Happiness
R2    (Happiness, moderate)  Happiness
R3    (Happiness, weak)      Neutral
R4    (Neutral, strong)      Happiness
R5    (Neutral, moderate)    Neutral
R6    (Neutral, weak)        Neutral
R7    (Sadness, strong)      Fear
R8    (Sadness, moderate)    Sadness
R9    (Sadness, weak)        Sadness
R10   (Fear, strong)         Fear
R11   (Fear, moderate)       Fear
R12   (Fear, weak)           Sadness
while r'_2 represents the emotion intensity level detected by EEG. The production rules are defined as shown in Table 1.
A rule is triggered as soon as its condition is met. For example, for production rule R3, if the emotion state detected by facial expression is happiness and the emotion intensity level detected by EEG is weak, the final result of the emotion recognition is neutral.
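Since every antecedent (r_1, r'_2) appears exactly once in Table 1, the rule base can be implemented as a simple lookup table; a sketch:

```python
# Table 1 as a lookup: (facial state r1, EEG intensity r2') -> final emotion
PRODUCTION_RULES = {
    ("happiness", "strong"): "happiness",   # R1
    ("happiness", "moderate"): "happiness", # R2
    ("happiness", "weak"): "neutral",       # R3
    ("neutral", "strong"): "happiness",     # R4
    ("neutral", "moderate"): "neutral",     # R5
    ("neutral", "weak"): "neutral",         # R6
    ("sadness", "strong"): "fear",          # R7
    ("sadness", "moderate"): "sadness",     # R8
    ("sadness", "weak"): "sadness",         # R9
    ("fear", "strong"): "fear",             # R10
    ("fear", "moderate"): "fear",           # R11
    ("fear", "weak"): "sadness",            # R12
}

def fuse(face_state, eeg_intensity):
    """Fire the matching rule R_i: IF (r1, r2') THEN Q."""
    return PRODUCTION_RULES[(face_state, eeg_intensity)]

print(fuse("happiness", "weak"))  # neutral (rule R3)
```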
3. Experiment
Two experiments, offline and online, were conducted in this study; the data of the first experiment were used for training. Twenty healthy 19- to 33-year-old subjects from the local research unit attended the experiments. During the experiments, the subjects were seated in a comfortable chair and instructed to avoid blinking or moving their bodies.
3.1. Experiment 1 (Offline). The data collected in this experiment consisted of 40 trials for each subject. At the beginning of each trial, a fixation cross was first presented at the center of the GUI to capture the subjects' attention. After 2 s, movie clips inducing different emotional conditions were presented at the center of the GUI in a random order. Each movie clip was presented for 2-3 minutes, preceded by 5 s of a blank screen as the start hint. At the end of each trial, subjects were asked to review the movie clip, assign valence and arousal ratings, and rate the specific emotions they had experienced. The rating procedure lasted approximately 60 s. There was a 10-s break between two consecutive trials for emotional recovery. During each trial, we collected 100 human face images using a camera and 200 groups of EEG signals using a MindWave Mobile device. Valence and arousal ratings were obtained using the Self-Assessment Manikin (SAM) [15]. Four basic emotion states (happiness, neutrality, sadness, and fear) and three emotion intensities (strong, ordinary, and weak) were evaluated in this study. The self-reported emotion states and intensity levels were used to verify the facial and EEG emotion classifications. We used images and corresponding emotion states to train a feedforward neural network classifier. We used EEG signals and corresponding
[Figure 3: Example screenshots of face videos from Experiment 2: (a) a happiness emotion effect; (b) a neutral emotion effect; (c) a sadness emotion effect.]
emotion states to train an SVM classifier. A different neural network classifier and a different SVM classifier were fitted to each subject. They were both used in Experiment 2 to detect emotion states.
3.2. Experiment 2 (Online). This experiment was composed of 40 trials for each subject, corresponding to the 40 movie clips evaluated in Experiment 1. The procedure of each trial was similar to that in Experiment 1. However, at the end of each movie clip, three different detectors (a face expression detector, an EEG detector, and the first fusion detector) were used to determine the emotion state. If the detection result was correct, positive feedback consisting of auditory applause was given for 4 s; otherwise, no feedback was given. For performance evaluation, the online accuracy was calculated as the ratio of the number of correct predictions to the total number of presented trials. Figure 3 shows several screenshots of face videos from Experiment 2: Figure 3(a) shows a subject watching a lively movie clip, Figure 3(b) a subject watching a normal movie clip, and Figure 3(c) a subject watching a sad movie clip.
3.3. Data Analysis (Offline). To validate the second fusion method, combining type of emotion and intensity level, an offline data analysis was conducted. For the data set of Experiment 1, we used images and corresponding emotion states to train a feedforward neural network classifier and used EEG signals and corresponding emotion intensities to train an SVM classifier. For the data set of Experiment 2, we used the second fusion detector, based on the production rules, to determine the emotion state and calculated the corresponding offline accuracy rates.
4. Results
The average accuracies of the two fusion methods for the twenty subjects are shown in Table 2, along with the classification accuracies of the face expression detection and the EEG detection. The accuracy of the first fusion detection, using a sum rule, is 81.25%, and the accuracy of the second fusion detection, using a production rule, is 82.75%; both are higher than that of facial expression (74.38%) or EEG detection (66.88%). Specifically, seventeen of the 20 subjects achieved their highest accuracies using the fusion methods. Moreover, the accuracies of the three detections were compared using paired t-tests; results were considered significant when p values were below 0.05. The statistical analysis indicated the following: (i) higher accuracies were achieved for the two fusion methods than for the face expression detection or the EEG detection (the first fusion method versus face expression detection, p = 0.03; the first fusion method versus EEG detection, p < 0.01; the second fusion method versus face expression detection, p = 0.03; the second fusion method versus EEG detection, p < 0.01); (ii) the accuracies were not significantly different between the face expression detection and the EEG detection (EEG detection versus face expression detection,
Table 2: The accuracies (%) for the detections of face expression, EEG, and the two fusion methods.

Subject   Face expression   EEG            First fusion method (online)   Second fusion method (offline)
1         92.5              67.5           92.5                           87.5
2         57.5              70.0           82.5                           87.5
3         50.0              72.5           87.5                           87.5
4         62.5              75.0           92.5                           87.5
5         60.0              60.0           75.0                           75.0
6         87.5              75.0           95.0                           92.5
7         72.5              72.5           72.5                           80.0
8         70.0              70.0           80.0                           87.5
9         92.5              60.0           75.0                           80.0
10        85.0              62.5           72.5                           80.0
11        67.5              72.5           80.0                           80.0
12        80.0              75.0           85.0                           85.0
13        92.5              57.5           92.5                           87.5
14        72.5              55.0           77.5                           80.0
15        70.0              52.5           75.0                           77.5
16        92.5              62.5           77.5                           77.5
17        77.5              57.5           90.0                           87.5
18        92.5              80.0           92.5                           85.0
19        50.0              62.5           60.0                           75.0
20        62.5              77.5           70.0                           75.0
Average   74.38 ± 14.55     66.88 ± 8.19   81.25 ± 9.47                   82.75 ± 5.19
p = 0.08); (iii) the accuracies were also not significantly different between the first and second fusion methods (the first fusion method versus the second fusion method, p = 0.56). Furthermore, low accuracies were obtained for subjects 5, 19, and 20; this could be attributed to them having less expressive facial expressions, or perhaps our approach being less sensitive to them.
5. Discussion
This paper employs information fusion technology combining facial expression recognition and EEG emotion recognition. The stimuli are based on a subset of movie clips that correspond to four specific areas of valence-arousal emotional space (happiness, neutral, sadness, and fear). The four emotion states are detected by both facial expression and EEG. Emotion recognition is based on a decision-level fusion of both EEG and facial expression detections. Twenty healthy subjects attended two experiments. The results show that the accuracies of the two information fusion detections are 81.25% and 82.75%, both higher than that of facial expression (74.38%) or EEG detection (66.88%).
The notion that combining brain and peripheral physiological signals will result in more accurate emotion recognition than using these variables on their own seems very sensible and has frequently been suggested in the literature as a potential way to improve emotion recognition [16]. However, a few studies explicitly mention that the combination of physiological information did not result in reliable improvement (e.g., [17–19]) or did so only to a modest degree in one of multiple conditions, without statistical evidence (e.g., [5]). In this study, the experimental results and the statistical analysis have provided clear evidence for the benefit of multimodal combination for emotion recognition. This could be explained by the emotion state involving multiple processes that are presumably reflected by different types of variables (e.g., cognitive processes by EEG and physical changes by peripheral facial expression measures).
In this study, we did find a significant improvement for the multimodal fusion detection compared to the single-modality detection. The reason could be that the facial expression detection has a fast and strong but fluctuating response, whereas the EEG detection has a smooth but stable response over the trial time [20]. Specifically, there is high volatility in real emotion recognition based only on facial expressions, because subjects are able to trick the machine as long as they know how to pretend via their facial expressions. In this respect, the drawbacks of facial expression detection can be compensated for by the EEG detection to a very large extent. Thus, the facial expression detection and EEG detection are irreplaceable and complementary to each other, and the multimodal fusion should achieve higher accuracies using both detections than using either one alone. This was demonstrated by the data analysis results in Table 2.
While most studies combine information by fusion at the feature level, we thought that fusion of information at the decision level could have contributed to finding a strong, reliable advantage of combining information. On the one hand, fusion at the feature level is difficult to achieve in practice because the feature sets of the various modalities may not be
compatible (e.g., brain and peripheral physiological signals in this study) [21]. Moreover, most commercial biometric systems provide access to neither the feature sets nor the raw data that they use in their products [22]. On the other hand, the advantage of decision-level fusion is that all knowledge about the different signals can be applied separately [23]. In this study, the facial expression and EEG signals have their own capabilities and limitations, as mentioned above, and we can use this information to optimize the detection performance. In the decision-level fusion, it is relatively easy to access and combine the scores generated by the neural network and SVM classifiers pertaining to the facial expression and EEG modalities. Thus, fusion at the decision level is preferred in this study.
For the decision-level fusion, the classifier selection for the facial expression and EEG detections is also important. Several properties have to be taken into consideration, such as the long-term variability of facial expression signals and the availability of only small data sets of EEG signals. First, neural network-based methods are found to be particularly promising for facial expression recognition, since neural networks can easily implement the mapping from the feature space of face images to the facial expression space [24]. Second, a neural network model generally requires a large amount of high-quality data for training; in this study, the EEG signals recorded by a one-electrode mobile device could lack sufficient training data for a neural network-based method. Third, SVM is known to have good generalization properties and to be insensitive to overtraining and to the curse of dimensionality, especially on small data sets [25]. It should be noted that SVM classifiers are widely used in EEG-based brain-computer interfaces in practice [26–29]. Furthermore, some modified support vector classification (SVC) methods have the advantage of using a regularization parameter to control the number of support vectors and margin errors. For example, Gu and Sheng developed a modified SVC formulation based on a sum-of-margins strategy to achieve better online accuracy than the existing incremental SVC algorithm [30]. They further proposed a robust SVC method based on lower-upper decomposition with partial pivoting, which requires fewer steps and less running time than the original does [31]. Taken together, the neural network classifier was used for facial expression detection and the SVM classifier for EEG detection in this study.
Two multimodal fusion methods are proposed in this study. For the first fusion method, the SVM classifies the EEG signal into the four types of emotion, and fusion is performed using a sum rule. For the second fusion method, the SVM classifies the EEG signal into three intensity levels (weak, moderate, and strong), and fusion is performed using a production rule. It is interesting to note that the second fusion method, combining type of emotion and intensity level, yields average accuracies comparable to the first fusion method. Indeed, it might very well be what humans do for emotion recognition; for example, an expression of weak happiness is typically interpreted as neutral, whereas a strong expression of sadness usually evokes fear.
For the results of Experiment 2, average accuracies of 81.25% (online) and 82.75% (offline) were achieved by the two fusion methods for four-class emotion recognition. Superior performance was obtained compared to the state-of-the-art results [3, 11, 32]. In fact, the authors of [3] reported an average accuracy of 78.4% by using optical flow to determine the main direction of movement of the muscles. In [11], the IAPS was used as stimuli, and the use of self-assessment labels for arousal assessment yielded accuracies of 55%, 53%, and 54% for EEG, physiological, and fused features, respectively. Zheng and his colleagues presented an emotion recognition method combining EEG signals and pupillary response collected from an eye tracker and achieved average accuracies of 73.59% and 72.98% for three emotion states using feature-level and decision-level fusion strategies, respectively.
This study still has open issues that need to be considered in the future. At present, the image data set we obtained is very limited, and the EEG signals used for analysis were recorded from only one electrode. In the future, however, we will collect more image data from more subjects and use a more sophisticated model to train on our data to yield a classifier with better performance. Furthermore, we could consider an EEG device with more electrodes to obtain higher-quality data.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This study was supported by the National Natural Science Foundation of China under Grant 61503143, the Guangdong Natural Science Foundation under Grant 2014A030310244, and the Pearl River S&T Nova Program of Guangzhou under Grant 201710010038.
References
[1] J. Gratch and S. Marsella, "Evaluating a computational model of emotion," Autonomous Agents and Multi-Agent Systems, vol. 11, no. 1, pp. 23–43, 2005.

[2] P. Ekman and W. V. Friesen, "Facial action coding system: a technique for the measurement of facial movement," Rivista di Psichiatria, vol. 47, no. 2, pp. 126–138, 1978.

[3] K. Mase, "Recognition of facial expression from optical flow," IEICE Transactions, vol. 74, no. 10, pp. 3474–3483, 1991.

[4] R. W. Picard and S. B. Daily, "Evaluating affective interactions: alternatives to asking what users feel," in Proceedings of the CHI Workshop on Evaluating Affective Interfaces: Innovative Approaches (CHI '05), April 2005.

[5] G. Chanel, J. J. M. Kierkels, M. Soleymani, and T. Pun, "Short-term emotion assessment in a recall paradigm," International Journal of Human-Computer Studies, vol. 67, no. 8, pp. 607–627, 2009.

[6] S. Nasehi, H. Pourghassem, and I. Isfahan, "An optimal EEG-based emotion recognition algorithm using Gabor," WSEAS Transactions on Signal Processing, vol. 3, no. 8, pp. 87–99, 2012.
[7] K. Ishino and M. Hagiwara, "A feeling estimation system using a simple electroencephalograph," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 4204–4209, October 2003.
[8] Z. Khalili and M. H. Moradi, "Emotion recognition system using brain and peripheral signals: using correlation dimension to improve the results of EEG," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '09), pp. 1571–1575, IEEE, June 2009.
[9] C. Busso, Z. Deng, S. Yildirim et al., "Analysis of emotion recognition using facial expressions, speech and multimodal information," in Proceedings of the 6th International Conference on Multimodal Interfaces, pp. 205–211, ACM, October 2004.
[10] J. Wagner, E. Andre, F. Lingenfelser, and J. Kim, "Exploring fusion methods for multimodal emotion recognition with missing data," IEEE Transactions on Affective Computing, vol. 2, no. 4, pp. 206–218, 2011.
[11] P. J. Lang, M. M. Bradley, and B. N. Cuthbert, "International affective picture system (IAPS): affective ratings of pictures and instruction manual," Tech. Rep. A-8, 2008.
[12] Y. Li, J. Pan, F. Wang, and Z. Yu, "A hybrid BCI system combining P300 and SSVEP and its application to wheelchair control," IEEE Transactions on Biomedical Engineering, vol. 60, no. 11, pp. 3156–3166, 2013.
[13] A. Zambrano, C. Toro, M. Nieto, R. Sotaquira, C. Sanín, and E. Szczerbicki, "Video semantic analysis framework based on run-time production rules: towards cognitive vision," Journal of Universal Computer Science, vol. 21, no. 6, pp. 856–870, 2015.
[14] A. T. Tzallas, I. Tsoulos, M. G. Tsipouras, N. Giannakeas, I. Androulidakis, and E. Zaitseva, "Classification of EEG signals using feature creation produced by grammatical evolution," in Proceedings of the 24th Telecommunications Forum (TELFOR '16), pp. 1–4, IEEE, November 2016.
[15] M. M. Bradley and P. J. Lang, "Measuring emotion: the self-assessment manikin and the semantic differential," Journal of Behavior Therapy and Experimental Psychiatry, vol. 25, no. 1, pp. 49–59, 1994.
[16] M. A. Hogervorst, A.-M. Brouwer, and J. B. F. van Erp, "Combining and comparing EEG, peripheral physiology and eye-related measures for the assessment of mental workload," Frontiers in Neuroscience, vol. 8, article 322, 2014.
[17] J. C. Christensen, J. R. Estepp, G. F. Wilson, and C. A. Russell, "The effects of day-to-day variability of physiological data on operator functional state classification," NeuroImage, vol. 59, no. 1, pp. 57–63, 2012.
[18] E. B. J. Coffey, A.-M. Brouwer, and J. B. F. Van Erp, "Measuring workload using a combination of electroencephalography and near infrared spectroscopy," in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, pp. 1822–1826, SAGE Publications, Boston, Mass, USA, October 2012.
[19] M. Severens, J. Farquhar, J. Duysens, and P. Desain, "A multi-signature brain-computer interface: use of transient and steady-state responses," Journal of Neural Engineering, vol. 10, no. 2, Article ID 026005, 2013.
[20] R. Leeb, H. Sagha, R. Chavarriaga, and J. D. R. Millan, "A hybrid brain-computer interface based on the fusion of electroencephalographic and electromyographic activities," Journal of Neural Engineering, vol. 8, no. 2, Article ID 025011, 2011.
[21] K. Chang, K. Bowyer, and P. Flynn, "Face recognition using 2D and 3D facial data," in Proceedings of the ACM Workshop on Multimodal User Authentication, pp. 25–32, 2003.
[22] A. Ross and A. K. Jain, "Multimodal biometrics: an overview," in Proceedings of the 12th European Signal Processing Conference, pp. 1221–1224, IEEE, 2004.
[23] A. Ross and A. Jain, "Information fusion in biometrics," Pattern Recognition Letters, vol. 24, no. 13, pp. 2115–2125, 2003.
[24] L. Ma and K. Khorasani, "Facial expression recognition using constructive feedforward neural networks," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 3, pp. 1588–1595, 2004.
[25] F. Lotte, M. Congedo, A. Lecuyer, F. Lamarche, and B. Arnaldi, "A review of classification algorithms for EEG-based brain-computer interfaces," Journal of Neural Engineering, vol. 4, no. 2, pp. R1–R13, 2007.
[26] A. Temko, E. Thomas, W. Marnane, G. Lightbody, and G. Boylan, "EEG-based neonatal seizure detection with Support Vector Machines," Clinical Neurophysiology, vol. 122, no. 3, pp. 464–473, 2011.
[27] Y. Li, J. Pan, J. Long et al., "Multimodal BCIs: target detection, multidimensional control and awareness evaluation in patients with disorder of consciousness," Proceedings of the IEEE, vol. 104, no. 2, pp. 332–352, 2016.
[28] J. Pan, Y. Li, Z. Gu, and Z. Yu, "A comparison study of two P300 speller paradigms for brain-computer interface," Cognitive Neurodynamics, vol. 7, no. 6, pp. 523–529, 2013.
[29] M. Mohammadpour, S. M. R. Hashemi, and N. Houshmand, "Classification of EEG-based emotion for BCI applications," in Proceedings of the Artificial Intelligence and Robotics (IRANOPEN '17), pp. 127–131, IEEE, April 2017.
[30] B. Gu, V. S. Sheng, K. Y. Tay, W. Romano, and S. Li, "Incremental support vector learning for ordinal regression," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 7, pp. 1403–1416, 2015.
[31] B. Gu and V. S. Sheng, "A robust regularization path algorithm for ν-support vector classification," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 5, pp. 1241–1248, 2016.
[32] W.-L. Zheng and B.-L. Lu, "A multimodal approach to estimating vigilance using EEG and forehead EOG," Journal of Neural Engineering, vol. 14, no. 2, Article ID 026017, 2017.
signals into three emotion states [5]. Nasehi et al. made use of quadratic discriminant analysis and SVM to classify emotions into the six categories of pleasure, surprise, anger, fear, disgust, and sadness, achieving accuracies of 62.3% and 83.33%, respectively [6]. Ishino and Hagiwara categorized user status into four emotion states using neural networks, with accuracies ranging from 54.5% to 67.7% for each of the four emotion states [7]. However, the use of EEG-based emotion recognition is still in its infancy.
In recent years, with the development of multisource heterogeneous information fusion processing, it has become possible to fuse features from multiple reference emotion-state categories. Different types of signals can supplement one another, and fusing them can greatly improve recognition performance. Therefore, researchers have begun to exploit the complementarity among facial expressions, voice, eye movements, gestures, physiological signals, and other channels of emotional information, that is, multimodal emotion recognition [8]. Most previous works have focused on the fusion of audiovisual information for automatic emotion recognition, for example, combining speech with facial expression. Busso et al. proposed a rule-based decision-level fusion method for the combined analysis of speech and facial expressions [9]. Wagner et al. used boosting techniques to automatically determine adaptive weights for audio and visual features [10]. A few studies have focused on the multimodal fusion of EEG and physiological signals. In one study [11], the International Affective Picture System (IAPS) was utilized as stimuli, and the use of self-assessment labels for arousal assessment yielded accuracies of 55%, 53%, and 54% for EEG, physiological, and fused features, respectively. All of these studies have shown that the performance of emotion recognition systems can be improved by employing multimodal information fusion.
In this study, we propose two multimodal fusion methods combining brain and peripheral signals for emotion recognition. The input signals are electroencephalogram (EEG) and facial expression. The stimuli are based on a subset of movie clips that correspond to four specific areas of valence-arousal emotional space (happiness, neutral, sadness, and fear). For facial expression detection, four basic emotion states are detected by a neural network classifier. For EEG detection, four basic emotion states and three emotion intensity levels (strong, ordinary, and weak) are detected by two SVM classifiers, respectively. Emotion recognition is based on two decision-level fusion methods of both EEG and facial expression detections, using a sum rule or a production rule. Twenty healthy subjects attended two experiments. The results show that the accuracies of the two multimodal fusion detections are 81.25% and 82.75%, respectively, both higher than that of facial expression (74.38%) or EEG detection (66.88%). The combination of facial expression and EEG information for emotion recognition compensates for their defects as single information sources.
2. Methods
2.1. Data Acquisition System. A Mindwave Mobile device (NeuroSky Inc., Abbotsford, Australia) was used to capture scalp EEG signals, and a Logitech camera (25 FPS, 800 × 600 image size) was used to capture facial expressions. According to the standard 10–20 system, the EEG signals were referenced to the right mastoid. The EEG signals used for analysis were recorded from the "Fz" electrode. The impedances of all electrodes were maintained below 5 kΩ.

2.2. Data Processing and Algorithm. For our proposed system, the EEG and facial expression detectors were designed separately. The EEG and image data were fed into the two detection procedures simultaneously. Figure 1 shows the data processing procedure. The analysis methods and algorithms used in this study are described below.
2.2.1. Facial Expression Detection. For facial feature extraction, the face position is detected in real time by the AdaBoost algorithm based on Haar eigenvalues. The Haar classifier applies the AdaBoost boosting algorithm to produce a cascade of weak classifiers; the Haar-like features of an image are the input of the classifier, and the output is whether the image contains a human face [12]. When a human face is found in the input image, we resize it to 48 pixels in width and 48 pixels in height. Next, the PCA method is used to reduce the dimensionality; the output of this phase is 169 dimensions (obtained by the grid search method mentioned below) after dimensionality reduction.
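The front end of this pipeline (locating the face with a Haar cascade, cropping, and resizing to 48 × 48) is typically handled by an image library; the PCA step that follows can be sketched in plain NumPy. This is a minimal sketch, not the authors' implementation: the random "faces" below stand in for real detected crops, and the function name is ours.

```python
import numpy as np

def pca_reduce(face_vectors, n_components=169):
    """Project flattened 48x48 face crops onto the top principal components.

    face_vectors: (n_samples, 2304) array of grayscale faces, assumed to be
    already detected and resized upstream (e.g., by a Haar cascade detector).
    """
    mean = face_vectors.mean(axis=0)
    centered = face_vectors - mean
    # SVD of the centered data matrix yields the principal axes in vt
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]              # (169, 2304)
    return centered @ components.T, mean, components

# Toy usage: 200 random vectors stand in for real 48x48 face crops
rng = np.random.default_rng(0)
faces = rng.normal(size=(200, 48 * 48))
features, mean, comps = pca_reduce(faces)
print(features.shape)  # (200, 169)
```

The 169-dimensional output matches the value the paper reports from its grid search.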
The feature vectors are then fed to a feedforward neural network; Figure 2 shows the architecture of the proposed system for facial expression classification. Applying the trained feedforward neural network classifier to the face image features, we obtain four scores (the values of the objective function of the network), denoted as $s_{1j}$ $(j = 1, \ldots, 4)$, where $j$ indexes the four emotion states (happiness, neutral, sadness, and fear) detected by the facial expression classifier. We normalize the four scores by mapping them to the range $[0, 1]$:
$$\bar{s}_{1j} = \frac{s_{1j} - \min\{s_{11}, \ldots, s_{14}\}}{\max\{s_{11}, \ldots, s_{14}\} - \min\{s_{11}, \ldots, s_{14}\}}, \tag{1}$$

$$r_1 = \arg\max_j \bar{s}_{1j}. \tag{2}$$

The four normalized scores $\bar{s}_{1j}$ $(j = 1, 2, 3, 4)$, representing the output of the feedforward neural network, are used in the decision-making step described later, and $r_1$ represents the emotion state detected by facial expression.
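Equations (1) and (2) amount to a min-max rescaling followed by an argmax; a toy NumPy sketch (the raw scores below are invented for illustration):

```python
import numpy as np

def normalize_scores(s):
    """Min-max normalize classifier scores to [0, 1], as in Eq. (1)."""
    s = np.asarray(s, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

# Hypothetical raw scores for (happiness, neutral, sadness, fear)
raw = [2.0, 0.5, 1.0, 3.0]
norm = normalize_scores(raw)
r1 = int(np.argmax(norm))          # Eq. (2)
print(norm.tolist(), r1)  # [0.6, 0.0, 0.2, 1.0] 3
```

Here index 3 ("fear") wins because it holds the largest normalized score.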
Note that the hyperparameters of the facial expression classifier are determined by the grid search method, a brute-force search through a manually specified subset of the hyperparameter space of a learning algorithm. We initialize the subset of the hyperparameter space as

$$D \in \{121, 144, 169, 225\}, \quad N \in \{150, 200, 250, 300\}, \quad R \in \{0.001, 0.01, 0.1, 1\}, \tag{3}$$
Figure 1: Data processing procedure of the multimodal emotion recognition. The EEG sensor and face camera each feed a feature extraction and classification stage; the resulting SVM scores (EEG) and NN scores (face) are combined by decision fusion in the decision-making step.
Figure 2: The architecture of the proposed system for facial expression classification. The network has one hidden layer with 200 neurons. The input of this network is the 169 image features we get from dimensionality reduction, while the output is the scores of four emotion states (happiness, neutral, sadness, and fear). The learning rate of this network is 0.1. We use the sigmoid function as the activation function of this network.
where $D$ is the number of dimensions, $N$ is the number of neurons in the hidden layer, and $R$ is the learning rate. Grid search then trains a classifier for each combination $(D, N, R)$ in the Cartesian product of these three sets and evaluates its performance on a held-out validation set. We select the classifier parameters with the best performance and apply them in our model.
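The search over the 4 × 4 × 4 grid of Eq. (3) can be sketched with `itertools.product`. The `evaluate` callable is a stand-in for training the network and scoring it on the validation set; the toy scorer below is fabricated so that it peaks at the configuration reported in Figure 2 (169 dimensions, 200 neurons, learning rate 0.1).

```python
from itertools import product

# Hyperparameter subsets from Eq. (3)
D_SET = [121, 144, 169, 225]    # PCA output dimensions
N_SET = [150, 200, 250, 300]    # hidden-layer neurons
R_SET = [0.001, 0.01, 0.1, 1]   # learning rates

def grid_search(evaluate):
    """Score every (D, N, R) combination and keep the best one.

    `evaluate(d, n, r)` is a placeholder for training a classifier with the
    given hyperparameters and returning its held-out validation accuracy.
    """
    best_params, best_score = None, float("-inf")
    for d, n, r in product(D_SET, N_SET, R_SET):
        score = evaluate(d, n, r)
        if score > best_score:
            best_params, best_score = (d, n, r), score
    return best_params, best_score

# Toy scorer that peaks at (169, 200, 0.1) for demonstration only
toy = lambda d, n, r: -abs(d - 169) - abs(n - 200) - abs(r - 0.1)
print(grid_search(toy))
```

With 64 combinations the exhaustive search is cheap; in practice each `evaluate` call is a full train-and-validate cycle, which dominates the cost.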
2.2.2. EEG Detection. The EEG-based detection includes two progressive stages: feature extraction based on PSD and classification using SVM. The analysis methods and algorithms used in this study are described below.
The EEG data are bandpass filtered over eight frequency bands: delta (1–3 Hz), theta (4–7 Hz), alpha1 (8–10 Hz), alpha2 (11–13 Hz), beta1 (14–20 Hz), beta2 (21–30 Hz), gamma1 (31–40 Hz), and gamma2 (41–50 Hz). We compute the traditional PSD features using the Short-Time Fourier Transform (STFT) with a 1-s, nonoverlapping Hanning window. For classification, we use two linear SVM classifiers: one for the emotion state classification and one for the emotion intensity classification. We train on samples $(x_i, y_i)$ and $(x_i, y'_i)$:

$$x_i = [\text{DELTA}, \text{THETA}, \text{ALPHA1}, \text{ALPHA2}, \text{BETA1}, \text{BETA2}, \text{GAMMA1}, \text{GAMMA2}]^T, \tag{4}$$

$$y_i = \begin{cases} 1, & \text{happiness} \\ 2, & \text{neutral} \\ 3, & \text{sadness} \\ 4, & \text{fear}, \end{cases} \tag{5}$$

$$y'_i = \begin{cases} -1, & \text{weak} \\ 0, & \text{moderate} \\ 1, & \text{strong}, \end{cases} \tag{6}$$

where DELTA, THETA, ALPHA1, ALPHA2, BETA1, BETA2, GAMMA1, and GAMMA2 represent the power spectral density corresponding to the eight frequency bands mentioned above, $y_i$ represents the label of the four emotion states, $y'_i$ is the label of the three emotion intensity levels, and $x_i$ represents the feature vector used for both the emotion state and the emotion intensity classification.
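A minimal sketch of this band-power feature extraction for a single-channel recording. The sampling rate of 512 Hz is an assumption (the paper does not state one), and the toy sine-plus-noise signal stands in for real Mindwave data:

```python
import numpy as np

# Band edges in Hz, per the paper's eight bands
BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha1": (8, 10), "alpha2": (11, 13),
         "beta1": (14, 20), "beta2": (21, 30), "gamma1": (31, 40), "gamma2": (41, 50)}

def band_power_features(eeg, fs=512):
    """PSD features from single-channel EEG using an STFT with 1-s,
    non-overlapping Hanning windows, then summed per frequency band."""
    win = int(fs)                                   # 1-second window
    n_seg = len(eeg) // win
    segs = eeg[: n_seg * win].reshape(n_seg, win) * np.hanning(win)
    psd = (np.abs(np.fft.rfft(segs, axis=1)) ** 2).mean(axis=0)  # average windows
    freqs = np.fft.rfftfreq(win, d=1 / fs)
    return np.array([psd[(freqs >= lo) & (freqs <= hi)].sum()
                     for lo, hi in BANDS.values()])

# Toy signal: 10 s of noise plus a strong 10 Hz component (alpha1 band)
fs = 512
t = np.arange(10 * fs) / fs
sig = np.sin(2 * np.pi * 10 * t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
x = band_power_features(sig, fs)
print(x.shape)  # (8,)
```

With the dominant 10 Hz component, the alpha1 entry of the eight-dimensional feature vector should be the largest, which is what a linear SVM trained on vectors like `x` would pick up on.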
Applying the first trained SVM classifier to the feature vectors, we obtain four scores (the values of the objective function of the SVM), denoted as $s_{2j}$ $(j = 1, \ldots, 4)$, where $j$ indexes the four emotion states (happiness, neutral,
sadness, and fear) detected by the EEG classifier. We normalize the four scores by mapping them to the range $[0, 1]$:

$$\bar{s}_{2j} = \frac{s_{2j} - \min\{s_{21}, \ldots, s_{24}\}}{\max\{s_{21}, \ldots, s_{24}\} - \min\{s_{21}, \ldots, s_{24}\}}, \qquad r_2 = \arg\max_j \bar{s}_{2j}. \tag{7}$$

The four normalized scores $\bar{s}_{2j}$ $(j = 1, 2, 3, 4)$ and the index of the maximum score $r_2$, representing the output of the emotion state in the EEG detection, are used in the first fusion method described later.
Applying the second trained SVM classifier to the feature vectors, we obtain three scores $s'_{2k}$ $(k = 1, 2, 3)$ corresponding to the three emotion intensity levels (weak, moderate, and strong) and find the index of the maximum score:

$$r'_2 = \arg\max_k s'_{2k}. \tag{8}$$

The index of the maximum score $r'_2$, representing the output of the emotion intensity level in the EEG detection, is used in the second fusion method described later.
2.2.3. Classification Fusion. In the decision-level fusion, the outputs generated by the two classifiers of the facial expression and EEG detections are combined. We employ two fusion methods of both EEG and facial expression detections, as follows.
For the first fusion method, we apply the sum strategy (e.g., [12]) to the decision-level fusion. Specifically, we calculate the sum of the normalized facial expression classifier scores $\bar{s}_{1j}$ and EEG classifier scores $\bar{s}_{2j}$ for each of the four emotion states and then find the maximum of the four summed values:

$$\text{sum}_j = \bar{s}_{1j} + \bar{s}_{2j} \quad (j = 1, 2, 3, 4), \qquad r_{\text{sum}} = \arg\max_j \text{sum}_j, \tag{9}$$

where $\bar{s}_{1j}$ and $\bar{s}_{2j}$ $(j = 1, 2, 3, 4)$ are calculated in (1) and (7) and $r_{\text{sum}}$ is the index corresponding to the maximum of the summed values.
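The sum rule of Eq. (9) is then a single vector addition; a toy sketch with invented normalized score vectors:

```python
import numpy as np

def sum_rule_fusion(face_scores, eeg_scores):
    """Decision-level fusion by the sum rule of Eq. (9): add the two
    normalized score vectors and pick the emotion with the largest sum."""
    total = np.asarray(face_scores) + np.asarray(eeg_scores)
    return int(np.argmax(total)), total

STATES = ["happiness", "neutral", "sadness", "fear"]
# Hypothetical normalized scores: the face leans neutral, the EEG happiness
face = [0.8, 1.0, 0.0, 0.3]
eeg = [1.0, 0.2, 0.1, 0.0]
idx, total = sum_rule_fusion(face, eeg)
print(STATES[idx])  # happiness (1.8 beats neutral's 1.2)
```

Because both score vectors are first mapped to [0, 1], neither modality can dominate the sum merely through a larger raw scale.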
For the second fusion method, we adopt a decision-making strategy based on production rules, which are commonly used as a simple expert system in cognitive modeling and artificial intelligence (e.g., [13, 14]). Through the production rules, the four emotion states (happiness, neutral, sadness, or fear) and the three emotion intensity levels (strong, moderate, or weak) are combined for emotion recognition. A production rule consists of an IF part (a condition or premise) and a THEN part (an action or conclusion). The form of a production rule is

$$R_i: \text{IF } P \text{ THEN } Q, \tag{10}$$

where $R_i$ represents rule $i$, $P$ is the antecedent of rule $i$, and $Q$ is the consequent of rule $i$. In this study, $P$ is formed by $(r_1, r'_2)$: $r_1$ represents the emotion state detected by facial expression,
Table 1: The production rules of combining the emotion state and intensity level.

R_i    P                       Q
R1     (Happiness, strong)     Happiness
R2     (Happiness, moderate)   Happiness
R3     (Happiness, weak)       Neutral
R4     (Neutral, strong)       Happiness
R5     (Neutral, moderate)     Neutral
R6     (Neutral, weak)         Neutral
R7     (Sadness, strong)       Fear
R8     (Sadness, moderate)     Sadness
R9     (Sadness, weak)         Sadness
R10    (Fear, strong)          Fear
R11    (Fear, moderate)        Fear
R12    (Fear, weak)            Sadness
while $r'_2$ represents the emotion intensity level detected by EEG. The production rules are defined as shown in Table 1.

A rule is triggered as soon as its condition is met. For example, by production rule $R_3$, if the emotion state detected by facial expression is happiness and the emotion intensity level detected by EEG is weak, the final result of the emotion recognition is neutral.
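Since every antecedent $(r_1, r'_2)$ maps to exactly one conclusion, Table 1 can be encoded directly as a lookup table; a sketch in Python (state and intensity names are spelled out here instead of the numeric labels of Eqs. (5) and (6)):

```python
# Production rules from Table 1:
# (facial-expression state, EEG intensity level) -> final emotion
RULES = {
    ("happiness", "strong"): "happiness", ("happiness", "moderate"): "happiness",
    ("happiness", "weak"): "neutral",
    ("neutral", "strong"): "happiness", ("neutral", "moderate"): "neutral",
    ("neutral", "weak"): "neutral",
    ("sadness", "strong"): "fear", ("sadness", "moderate"): "sadness",
    ("sadness", "weak"): "sadness",
    ("fear", "strong"): "fear", ("fear", "moderate"): "fear",
    ("fear", "weak"): "sadness",
}

def production_rule_fusion(r1, r2_prime):
    """Second fusion method: combine the facial-expression state r1 with
    the EEG-detected intensity level r2' via the rules of Table 1."""
    return RULES[(r1, r2_prime)]

print(production_rule_fusion("happiness", "weak"))  # neutral (rule R3)
```

A dictionary suffices here because the rule set is exhaustive over the 4 × 3 antecedent space and no two rules share an antecedent.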
3. Experiment
Two experiments, offline and online, were conducted in this study; the data of the first experiment was used for training. Twenty healthy 19- to 33-year-old subjects from the local research unit attended the experiments. During the experiments, the subjects were seated in a comfortable chair and instructed to avoid blinking or moving their bodies.
3.1. Experiment 1 (Offline). The data collected in this experiment consisted of 40 trials for each subject. At the beginning of each trial, a fixation cross was presented at the center of the GUI to capture the subjects' attention. After 2 s, movie clips inducing different emotional conditions were presented at the center of the GUI in a random order. Each movie clip was presented for 2-3 minutes, preceded by 5 s of a blank screen as the start hint. At the end of each trial, subjects were asked to assign valence and arousal ratings to the movie clip and to rate the specific emotions they had experienced; the rating procedure lasted approximately 60 s. There was a 10-s break between two consecutive trials for emotional recovery. During each trial, we collected 100 face images using the camera and 200 groups of EEG signals using the Mindwave Mobile device. Valence and arousal ratings were obtained using the Self-Assessment Manikin (SAM) [15]. Four basic emotion states (happiness, neutrality, sadness, and fear) and three emotion intensities (strong, ordinary, and weak) were evaluated in this study. The self-reported emotion states and intensity levels were used to verify the facial and EEG emotion classifications. We used images and the corresponding emotion states to train a feedforward neural network classifier. We used EEG signals and the corresponding
Figure 3: Example screenshots of face videos from Experiment 2: (a) a happiness emotion effect; (b) a neutral emotion effect; (c) a sadness emotion effect.
emotion states to train an SVM classifier. A different neural network classifier and a different SVM classifier were fitted to each subject; both were used in Experiment 2 to detect emotion states.
3.2. Experiment 2 (Online). This experiment was composed of 40 trials for each subject, corresponding to the 40 movie clips evaluated in Experiment 1. The procedure of each trial was similar to that in Experiment 1. However, at the end of each movie clip, three different detectors (a facial expression detector, an EEG detector, and the first fusion detector) were used to determine the emotion state. If the detection result was correct, positive feedback consisting of auditory applause was given for 4 s; otherwise, no feedback was given. For performance evaluation, the online accuracy was calculated as the ratio of the number of correct predictions to the total number of presented trials. Figure 3 shows several screenshots of face videos from Experiment 2: Figure 3(a) shows a subject watching a lively movie clip, Figure 3(b) a subject watching a normal movie clip, and Figure 3(c) a subject watching a sad movie clip.
3.3. Data Analysis (Offline). To validate the second fusion method, which combines the type of emotion and the intensity level, an offline data analysis was conducted. For the data set of Experiment 1, we used images and the corresponding emotion states to train a feedforward neural network classifier, and we used EEG signals and the corresponding emotion intensities to train an SVM classifier. For the data set of Experiment 2, we used the second fusion detector based on the production rules to determine the emotion state and calculated the corresponding offline accuracy rates.
4. Results
The average accuracies of the two fusion methods for the twenty subjects are shown in Table 2, together with the classification accuracies of the facial expression detection and the EEG detection. The accuracy of the first fusion detection, using the sum rule, is 81.25%, and the accuracy of the second fusion detection, using the production rule, is 82.75%; both are higher than that of facial expression (74.38%) or EEG detection (66.88%). Specifically, seventeen of the 20 subjects achieved their highest accuracies with the fusion methods. Moreover, the accuracies of the three detections were compared using paired t-tests, with results considered significant when p values were below 0.05. The statistical analysis indicated the following: (i) higher accuracies were achieved for the two fusion methods than for the facial expression detection or the EEG detection (the first fusion method versus facial expression detection, p = 0.03; the first fusion method versus EEG detection, p < 0.01; the second fusion method versus facial expression detection, p = 0.03; the second fusion method versus EEG detection, p < 0.01); (ii) the accuracies were not significantly different between the facial expression detection and the EEG detection (EEG detection versus facial expression detection,
Table 2: The accuracies (%) for the detections of face expression, EEG, and the two fusion methods.

Subject    Face expression   EEG     First fusion (online)   Second fusion (offline)
1          92.5              67.5    92.5                    87.5
2          57.5              70.0    82.5                    87.5
3          50.0              72.5    87.5                    87.5
4          62.5              75.0    92.5                    87.5
5          60.0              60.0    75.0                    75.0
6          87.5              75.0    95.0                    92.5
7          72.5              72.5    72.5                    80.0
8          70.0              70.0    80.0                    87.5
9          92.5              60.0    75.0                    80.0
10         85.0              62.5    72.5                    80.0
11         67.5              72.5    80.0                    80.0
12         80.0              75.0    85.0                    85.0
13         92.5              57.5    92.5                    87.5
14         72.5              55.0    77.5                    80.0
15         70.0              52.5    75.0                    77.5
16         92.5              62.5    77.5                    77.5
17         77.5              57.5    90.0                    87.5
18         92.5              80.0    92.5                    85.0
19         50.0              62.5    60.0                    75.0
20         62.5              77.5    70.0                    75.0
Average    74.38 ± 14.55     66.88 ± 8.19    81.25 ± 9.47    82.75 ± 5.19
p = 0.08); (iii) the accuracies were also not significantly different between the first and the second fusion methods (the first fusion method versus the second fusion method, p = 0.56). Furthermore, low accuracies were obtained for subjects 5, 19, and 20; this could be attributed to their less expressive facial expressions, or perhaps our approach is less sensitive to them.
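The paired t-tests can be reproduced from the per-subject accuracies in Table 2; a sketch using `scipy.stats.ttest_rel` (SciPy is assumed available; only two of the reported comparisons are shown):

```python
import numpy as np
from scipy.stats import ttest_rel

# Per-subject accuracies (%) taken from Table 2
face = np.array([92.5, 57.5, 50.0, 62.5, 60.0, 87.5, 72.5, 70.0, 92.5, 85.0,
                 67.5, 80.0, 92.5, 72.5, 70.0, 92.5, 77.5, 92.5, 50.0, 62.5])
eeg = np.array([67.5, 70.0, 72.5, 75.0, 60.0, 75.0, 72.5, 70.0, 60.0, 62.5,
                72.5, 75.0, 57.5, 55.0, 52.5, 62.5, 57.5, 80.0, 62.5, 77.5])
fusion1 = np.array([92.5, 82.5, 87.5, 92.5, 75.0, 95.0, 72.5, 80.0, 75.0, 72.5,
                    80.0, 85.0, 92.5, 77.5, 75.0, 77.5, 90.0, 92.5, 60.0, 70.0])

# Paired (within-subject) two-tailed t-tests across the same 20 subjects
_, p_fusion_vs_face = ttest_rel(fusion1, face)  # significant per the paper
_, p_eeg_vs_face = ttest_rel(eeg, face)         # not significant per the paper
print(round(p_fusion_vs_face, 3), round(p_eeg_vs_face, 3))
```

A paired test is appropriate here because all three detectors were evaluated on the same subjects, so the per-subject differences are what carry the evidence.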
5. Discussion
This paper employs information fusion technology, combining facial expression recognition and EEG emotion recognition. The stimuli are based on a subset of movie clips that correspond to four specific areas of valence-arousal emotional space (happiness, neutral, sadness, and fear). The four emotion states are detected by both facial expression and EEG, and emotion recognition is based on a decision-level fusion of both detections. Twenty healthy subjects attended two experiments. The results show that the accuracies of the two information fusion detections are 81.25% and 82.75%, both higher than that of facial expression (74.38%) or EEG detection (66.88%).
The notion that combining brain and peripheral physiological signals will result in more accurate emotion recognition than using either on its own seems very sensible and has frequently been suggested in the literature as a potential way to improve emotion recognition [16]. However, a few studies explicitly mention that the combination of physiological information did not result in reliable improvement (e.g., [17–19]) or helped only to a modest degree in one of multiple conditions, without statistical evidence (e.g., [5]). In this study, the experimental results and the statistical analysis provide clear evidence for the benefit of multimodal combination for emotion recognition. This could be explained by the fact that the emotion state involves multiple processes that are presumably reflected by different types of variables (e.g., cognitive processes by EEG and physical changes by peripheral facial expression measures).
In this study, we did find a significant improvement for the multimodal fusion detection compared to the single-modality detection. The reason could be that the facial expression detection has a fast and strong but fluctuating response, whereas the EEG detection has a smooth but stable response over the trial time [20]. Specifically, there is high volatility in real emotion recognition based only on facial expressions, because subjects are able to trick the machine as long as they know how to pretend via their facial expressions. In this respect, the drawbacks of facial expression detection can be compensated for by the EEG detection to a very large extent. Thus, the facial expression detection and the EEG detection are irreplaceable and complementary to each other, and the multimodal fusion should achieve higher accuracies using both detections than using either one alone. This was demonstrated by the data analysis results in Table 2.
While most studies combine information by fusion at the feature level, we believe that fusing information at the decision level contributed to finding a strong, reliable advantage of combining information. On the one hand, fusion at the feature level is difficult to achieve in practice because the feature sets of the various modalities may not be
compatible (e.g., brain and peripheral physiological signals in this study) [21]; moreover, most commercial biometric systems provide access neither to the feature sets nor to the raw data used in their products [22]. On the other hand, the advantage of decision-level fusion is that all knowledge about the different signals can be applied separately [23]. In this study, the facial expression and EEG signals have their own capabilities and limitations, as mentioned above, and we can use this information to optimize the detection performance. In the decision-level fusion, it is relatively easy to access and combine the scores generated by the neural network and SVM classifiers pertaining to the facial expression and EEG modalities. Thus, fusion at the decision level is preferred in this study.
For the decision-level fusion, the classifier selection for the facial expression and EEG detections is also important. Several properties have to be taken into consideration, such as the long-term variability of facial expression signals and the availability of only small data sets of EEG signals. First, neural network-based methods are particularly promising for facial expression recognition, since neural networks can easily implement the mapping from the feature space of face images to the facial expression space [24]. Second, a neural network model generally requires a large amount of high-quality data for training; in this study, the EEG signals recorded by a one-electrode mobile device could provide insufficient training data for a neural network-based method. Third, SVM is known to have good generalization properties and to be insensitive to overtraining and to the curse of dimensionality, especially on small data sets [25]; indeed, SVM classifiers are widely used in EEG-based brain-computer interfaces in practice [26–29]. Furthermore, some modified support vector classification (SVC) methods have the advantage of using a regularization parameter to control the number of support vectors and margin errors. For example, Gu and Sheng developed a modified SVC formulation based on a sum-of-margins strategy to achieve better online accuracy than the existing incremental SVC algorithm [30]. They further proposed a robust SVC method based on lower-upper decomposition with partial pivoting, which requires fewer steps and less running time than the original one [31]. Taken together, these considerations led us to use the neural network classifier for the facial expression detection and the SVM classifier for the EEG detection in this study.
Two multimodal fusion methods are proposed in this study. For the first fusion method, the SVM classifies the EEG signal into the four types of emotion, and fusion is performed using a sum rule. For the second fusion method, the SVM classifies the EEG signal into three intensity levels (weak, moderate, and strong), and fusion is performed using a production rule. It is interesting to note that the second fusion method, which combines the type of emotion and the intensity level, yields average accuracies comparable to those of the first fusion method. Indeed, it may resemble what humans do for emotion recognition: for example, an expression of weak happiness is typically read as neutral, whereas a strong expression of sadness usually evokes fear.
For the results of Experiment 2, average accuracies of 81.25% (online) and 82.75% (offline) were achieved by the two fusion methods for four-class emotion recognition. Superior performance was obtained compared to the state-of-the-art results [3, 11, 32]. In fact, the authors of [3] reported an average accuracy of 78.4% by using optical flow to determine the main direction of movement of the muscles. In [11], the IAPS was used as stimuli, and the use of self-assessment labels for arousal assessment yielded accuracies of 55%, 53%, and 54% for EEG, physiological, and fused features, respectively. Zheng and his colleagues presented an emotion recognition method combining EEG signals and pupillary responses collected from an eye tracker and achieved average accuracies of 73.59% and 72.98% for three emotion states using feature-level and decision-level fusion strategies, respectively [32].
This study still has open issues that need to be considered in the future. At the present stage, the image data set we obtained is very limited, and the EEG signals used for analysis were recorded from only one electrode. In the future, we will collect more image data from more subjects and use a more sophisticated model to train our data, to yield a classifier with better performance. Furthermore, we could consider an EEG device with more electrodes to obtain higher-quality data.
Conflicts of Interest
The authors declare that there are no conflicts of interestregarding the publication of this paper
Acknowledgments
This study was supported by the National Natural ScienceFoundation of China under Grant 61503143 and GuangdongNatural Science Foundation under Grant 2014A030310244and the Pearl River SampT Nova Program of Guangzhou underGrant 201710010038
References
[1] J Gratch and S Marsella ldquoEvaluating a computational model ofemotionrdquo Autonomous Agents and Multi-Agent Systems vol 11no 1 pp 23ndash43 2005
[2] P Ekman and W V Friesen ldquoFacial action coding system atechnique for the measurement of facial movementrdquo Rivista diPsichiatria vol 47 no 2 pp 126ndash138 1978
[3] K Mase ldquoRecognition of facial expression from optical flowrdquoIEICE Transactions vol 74 no 10 pp 3474ndash3483 1991
[4] R W Picard and S B Daily ldquoEvaluating affective interactionsalternatives to asking what users feelrdquo in Proceedings of theCHI Workshop on Evaluating Affective Interfaces InnovativeApproaches (CHI rsquo05) April 2005
[5] G Chanel J J M Kierkels M Soleymani and T Pun ldquoShort-term emotion assessment in a recall paradigmrdquo InternationalJournal of Human Computer Studies vol 67 no 8 pp 607ndash6272009
[6] S Nasehi H Pourghassem and I Isfahan ldquoAn optimal EEG-based emotion recognition algorithm using gaborrdquo WSEASTransactions on Signal Processing vol 3 no 8 pp 87ndash99 2012
8 Computational Intelligence and Neuroscience
[7] K Ishino and M Hagiwara ldquoA feeling estimation system usinga simple electroencephalographrdquo in Proceedings of the IEEEInternational Conference on Systems Man and Cybernetics vol5 pp 4204ndash4209 October 2003
[8] Z Khalili and M H Moradi ldquoEmotion recognition systemusing brain and peripheral signals using correlation dimensionto improve the results of EEGrdquo in Proceedings of the Interna-tional Joint Conference on Neural Networks (IJCNN rsquo09) pp1571ndash1575 IEEE June 2009
[9] C Busso Z Deng S Yildirim et al ldquoAnalysis of emotionrecognition using facial expressions speech and multimodalinformationrdquo in Proceedings of the 6th international conferenceon Multimodal interfaces pp 205ndash211 ACM October 2004
[10] J Wagner E Andre F Lingenfelser and J Kim ldquoExploringfusion methods for multimodal emotion recognition withmissing datardquo IEEE Transactions on Affective Computing vol2 no 4 pp 206ndash218 2011
[11] P J Lang M M Bradley and B N Cuthbert ldquoInternationalaffective picture system (IAPS) affective ratings of pictures andinstruction manualrdquo Tech Rep A-8 2008
[12] Y Li J Pan F Wang and Z Yu ldquoA hybrid BCI systemcombining P300 and SSVEP and its application to wheelchaircontrolrdquo IEEE Transactions on Biomedical Engineering vol 60no 11 pp 3156ndash3166 2013
[13] A Zambrano C Toro M Nieto R Sotaquira C Sanın andE Szczerbicki ldquoVideo semantic analysis framework based onrun-time production rules-towards cognitive visionrdquo Journal ofUniversal Computer Science vol 21 no 6 pp 856ndash870 2015
[14] A T Tzallas I Tsoulos M G Tsipouras N Giannakeas IAndroulidakis and E Zaitseva ldquoClassification of EEG signalsusing feature creation produced by grammatical evolutionrdquo inProceedings of the 24th Telecommunications Forum (TELFORrsquo16) pp 1ndash4 IEEE November 2016
[15] M M Bradley and P J Lang ldquoMeasuring emotion the self-assessment manikin and the semantic differentialrdquo Journal ofBehaviorTherapy and Experimental Psychiatry vol 25 no 1 pp49ndash59 1994
[16] M A Hogervorst A-M Brouwer and J B F van ErpldquoCombining and comparing EEG peripheral physiology andeye-related measures for the assessment of mental workloadrdquoFrontiers in Neuroscience vol 8 article 322 2014
[17] J C Christensen J R Estepp G F Wilson and C A RussellldquoThe effects of day-to-day variability of physiological data onoperator functional state classificationrdquoNeuroImage vol 59 no1 pp 57ndash63 2012
[18] E B J Coffey A-M Brouwer and J B F Van Erp ldquoMeasuringworkload using a combination of electroencephalography andnear infrared spectroscopyrdquo inProceedings of theHumanFactorsand Ergonomics Society Annual Meeting pp 1822ndash1826 SAGEPublications Boston Mass USA October 2012
[19] M Severens J Farquhar J Duysens and P Desain ldquoA multi-signature brain-computer interfaceUse of transient and steady-state responsesrdquo Journal of Neural Engineering vol 10 no 2Article ID 026005 2013
[20] R Leeb H Sagha R Chavarriaga and J D R Millan ldquoAhybrid brain-computer interface based on the fusion of elec-troencephalographic and electromyographic activitiesrdquo Journalof Neural Engineering vol 8 no 2 Article ID 025011 2011
[21] K Chang K Bowyer and P Flynn ldquoFace recognition using 2Dand 3D facial datardquo in Proceedings of the ACM Workshop onMultimodal User Authentication pp 25ndash32 2003
[22] A Ross and A K Jain ldquoMultimodal biometrics an overviewrdquoin Proceedings of the 12th European Signal Processing Conferencepp 1221ndash1224 IEEE 2004
[23] A Ross and A Jain ldquoInformation fusion in biometricsrdquo PatternRecognition Letters vol 24 no 13 pp 2115ndash2125 2003
[24] L Ma and K Khorasani ldquoFacial expression recognition usingconstructive feedforward neural networksrdquo IEEE Transactionson Systems Man and Cybernetics Part B Cybernetics vol 34no 3 pp 1588ndash1595 2004
[25] F Lotte M Congedo A Lecuyer F Lamarche and B ArnaldildquoA review of classification algorithms for EEG-based brain-computer interfacesrdquo Journal of Neural Engineering vol 4 no2 pp R1ndashR13 2007
[26] A Temko E Thomas W Marnane G Lightbody and GBoylan ldquoEEG-based neonatal seizure detection with SupportVector Machinesrdquo Clinical Neurophysiology vol 122 no 3 pp464ndash473 2011
[27] Y Li J Pan J Long et al ldquoMultimodal BCIs target detectionmultidimensional control and awareness evaluation in patientswith disorder of consciousnessrdquo Proceedings of the IEEE vol104 no 2 pp 332ndash352 2016
[28] J Pan Y Li Z Gu and Z Yu ldquoA comparison study of twoP300 speller paradigms for brain-computer interfacerdquoCognitiveNeurodynamics vol 7 no 6 pp 523ndash529 2013
[29] M Mohammadpour S M R Hashemi and N HoushmandldquoClassification of EEG-based emotion for BCI applicationsrdquoin Proceedings of the Artificial Intelligence and Robotics (IRA-NOPEN rsquo17) pp 127ndash131 IEEE April 2017
[30] B Gu V S Sheng K Y TayW Romano and S Li ldquoIncrementalsupport vector learning for ordinal regressionrdquo IEEE Transac-tions on Neural Networks and Learning Systems vol 26 no 7pp 1403ndash1416 2015
[31] B Gu and V S Sheng ldquoA robust regularization path algorithmfor ]-support vector classificationrdquo IEEETransactions onNeuralNetworks and Learning Systems vol 28 no 5 pp 1241ndash12482016
[32] W-L Zheng and B-L Lu ldquoA multimodal approach to estimat-ing vigilance using EEG and forehead EOGrdquo Journal of NeuralEngineering vol 14 no 2 Article ID 026017 2017
Computational Intelligence and Neuroscience 3
Figure 1: Data processing procedure of the multimodal emotion recognition. EEG data from the EEG sensor and face data from the camera each pass through feature extraction and classification; the resulting SVM scores and NN scores are combined by decision fusion for the final decision-making.
Figure 2: The architecture of the proposed network for face expression classification. The network has one hidden layer with 200 neurons. The input is the 169 image features (x_1, ..., x_169) obtained from dimensionality reduction, and the output is the scores of the four emotion states (happiness, neutral, sadness, and fear). The learning rate of the network is 0.1, and the sigmoid function is used as the activation function.
where D is the number of dimensions, N is the number of neurons in the hidden layer, and R is the learning rate used in classification. Grid search then trains a classifier with each triple (D, N, R) in the Cartesian product of these three sets and evaluates its performance on a held-out validation set. We select the classifier parameters with the best performance and apply them in our model.
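As a concrete illustration (not the authors' code), the grid search over (D, N, R) can be sketched with scikit-learn; the candidate grids, the PCA step standing in for the dimensionality reduction, and the data loading are all assumptions:

```python
from itertools import product

import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier


def grid_search_nn(X_train, y_train, X_val, y_val,
                   dims=(50, 100, 169), hidden=(100, 200), rates=(0.01, 0.1)):
    """Try every (D, N, R) triple and keep the one with the best
    accuracy on the held-out validation set."""
    best_params, best_acc = None, -1.0
    for D, N, R in product(dims, hidden, rates):
        # D: reduced feature dimension; N: hidden-layer size; R: learning rate
        pca = PCA(n_components=D).fit(X_train)
        clf = MLPClassifier(hidden_layer_sizes=(N,), learning_rate_init=R,
                            activation="logistic", max_iter=500,
                            random_state=0)
        clf.fit(pca.transform(X_train), y_train)
        acc = clf.score(pca.transform(X_val), y_val)
        if acc > best_acc:
            best_params, best_acc = (D, N, R), acc
    return best_params, best_acc
```

The held-out split keeps the selection honest: each candidate is trained on the training portion only and scored on data it never saw.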
2.2.2. EEG Detection. The EEG-based detection includes two progressive stages: feature extraction based on PSD and classification using SVM. The analysis methods and algorithms used in this study are described below.
The EEG data are bandpass filtered over eight frequency bands: delta (1–3 Hz), theta (4–7 Hz), alpha1 (8–10 Hz), alpha2 (11–13 Hz), beta1 (14–20 Hz), beta2 (21–30 Hz), gamma1 (31–40 Hz), and gamma2 (41–50 Hz). We compute the traditional PSD features using the Short-Time Fourier Transform (STFT) with a nonoverlapping 1-s Hanning window. For classification, we use two linear SVM classifiers: one for the emotion-state classification and one for the emotion-intensity classification. We train on samples (x_i, y_i) and (x_i, y'_i), where

x_i = [DELTA, THETA, ALPHA1, ALPHA2, BETA1, BETA2, GAMMA1, GAMMA2], (4)

y_i ∈ {1 (happiness), 2 (neutral), 3 (sadness), 4 (fear)}, (5)

y'_i ∈ {−1 (weak), 0 (moderate), 1 (strong)}, (6)

where DELTA, THETA, ALPHA1, ALPHA2, BETA1, BETA2, GAMMA1, and GAMMA2 represent the power spectral density corresponding to the eight frequency bands mentioned above, y_i is the label of the four emotion states, y'_i is the label of the three emotion intensity levels, and x_i is the feature vector corresponding to the four emotion states or the three emotion intensity levels.
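To make the feature extraction concrete, here is a minimal NumPy sketch (not the authors' code): a Welch-style average of 1-s Hanning-windowed periodograms stands in for the STFT-based PSD, and the 512 Hz sampling rate is an assumption about the recording device:

```python
import numpy as np

# The eight bands used in the paper, in order (Hz)
BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha1": (8, 10), "alpha2": (11, 13),
         "beta1": (14, 20), "beta2": (21, 30), "gamma1": (31, 40), "gamma2": (41, 50)}


def band_power_features(eeg, fs=512):
    """Average the PSD over nonoverlapping 1-s Hanning windows,
    then sum the power inside each band. Requires len(eeg) >= fs."""
    win = int(fs)                       # 1-s window
    n_seg = len(eeg) // win
    hann = np.hanning(win)
    psd = np.zeros(win // 2 + 1)
    for k in range(n_seg):
        seg = eeg[k * win:(k + 1) * win] * hann
        spec = np.fft.rfft(seg)
        psd += (np.abs(spec) ** 2) / (fs * np.sum(hann ** 2))
    psd /= n_seg
    freqs = np.fft.rfftfreq(win, 1.0 / fs)
    return np.array([psd[(freqs >= lo) & (freqs <= hi)].sum()
                     for lo, hi in BANDS.values()])
```

The returned 8-vector is the feature vector x_i of (4), one entry per band.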
Applying the first trained SVM classifier to the feature vectors, we obtain four scores (the values of the objective function of the SVM), denoted s_2j (j = 1, ..., 4), where j indexes the four emotion states (happiness, neutral, sadness, and fear) detected by the EEG classifier. We normalize the four scores by mapping them to the range [0, 1]:

s_2j = (s_2j − min{s_21, ..., s_24}) / (max{s_21, ..., s_24} − min{s_21, ..., s_24}),
r_2 = argmax_j (s_2j). (7)

The four normalized scores s_2j (j = 1, 2, 3, 4) and the index of the maximum score r_2, representing the output of the emotion state in the EEG detection, are used in the first fusion method described later.
Applying the second trained SVM classifier to the feature vectors, we obtain three scores s'_2k (k = 1, 2, 3) corresponding to the three emotion intensity levels (weak, moderate, and strong) and find the index of the maximum score:

r'_2 = argmax_k (s'_2k). (8)

The index of the maximum score r'_2, representing the output of the emotion intensity level in the EEG detection, is used in the second fusion method described later.
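The two SVMs and the score handling in (7) and (8) can be sketched as follows; this is an illustration with synthetic data, where scikit-learn's LinearSVC decision_function stands in for "the values of the objective function of the SVM":

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))              # 8 band-power features per sample
y_state = rng.integers(1, 5, size=200)     # labels 1..4, as in eq. (5)
y_level = rng.integers(-1, 2, size=200)    # labels -1/0/1, as in eq. (6)

state_svm = LinearSVC(max_iter=5000).fit(X, y_state)
level_svm = LinearSVC(max_iter=5000).fit(X, y_level)

s2 = state_svm.decision_function(X[:1])[0]      # four scores s_2j
s2 = (s2 - s2.min()) / (s2.max() - s2.min())    # min-max normalise, eq. (7)
r2 = int(np.argmax(s2)) + 1                     # emotion-state index in 1..4

# Intensity-level index (0 = weak, 1 = moderate, 2 = strong), eq. (8)
r2_prime = int(np.argmax(level_svm.decision_function(X[:1])[0]))
```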
2.2.3. Classification Fusion. In the decision-level fusion, the outputs generated by the two classifiers of the facial expression and EEG detections are combined. We employ two fusion methods of the EEG and facial expression detections, as follows.
For the first fusion method, we apply the sum strategy (e.g., [12]) to the decision-level fusion. Specifically, we calculate the sum of the normalized face expression classifier scores s_1j and EEG classifier scores s_2j for each of the four emotion states and then find the maximum of the four summed values:

sum_j = s_1j + s_2j (j = 1, 2, 3, 4),
r_sum = argmax_j (sum_j), (9)

where s_1j (j = 1, 2, 3, 4) and s_2j (j = 1, 2, 3, 4) are calculated in (1) and (7) and r_sum is the index corresponding to the maximum of the summed values.
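A minimal sketch of the sum rule in (9), assuming both classifiers' per-class scores have already been normalized to [0, 1]:

```python
import numpy as np


def sum_rule_fusion(face_scores, eeg_scores):
    """Add the normalised per-class scores s_1j and s_2j and return the
    index of the winning emotion state (0 = happiness, ..., 3 = fear)."""
    total = np.asarray(face_scores, dtype=float) + np.asarray(eeg_scores, dtype=float)
    return int(np.argmax(total))
```

For example, if the face detector strongly favours happiness while the EEG detector mildly favours sadness, the summed scores still pick happiness.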
For the second fusion method, we adopt a decision-making strategy based on production rules, which are commonly used as a simple expert system in cognitive modeling and artificial intelligence (e.g., [13, 14]). Through the production rules, the four emotion states (happiness, neutral, sadness, or fear) and the three emotion intensity levels (strong, moderate, or weak) are combined for emotion recognition. A production rule consists of an IF part (a condition or premise) and a THEN part (an action or conclusion). The form of a production rule is

R_i: IF P THEN Q, (10)

where R_i represents rule i, P is the antecedent of rule i, and Q is the consequent of rule i. In this study, P is formed by the pair (r_1, r'_2): r_1 represents the emotion state detected by facial expression, while r'_2 represents the emotion intensity level detected by EEG. The production rules are defined as shown in Table 1.

Table 1: The production rules combining the emotion state and intensity level.

R_i    P                      Q
R1     (Happiness, strong)    Happiness
R2     (Happiness, moderate)  Happiness
R3     (Happiness, weak)      Neutral
R4     (Neutral, strong)      Happiness
R5     (Neutral, moderate)    Neutral
R6     (Neutral, weak)        Neutral
R7     (Sadness, strong)      Fear
R8     (Sadness, moderate)    Sadness
R9     (Sadness, weak)        Sadness
R10    (Fear, strong)         Fear
R11    (Fear, moderate)       Fear
R12    (Fear, weak)           Sadness
A rule is triggered as soon as its condition is met. For example, by production rule R3, if the emotion state detected by facial expression is happiness and the emotion intensity level detected by EEG is weak, the final result of the emotion recognition is neutral.
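Table 1 translates directly into a lookup table; a minimal sketch:

```python
# Table 1 as a lookup: (state from face detector, intensity from EEG) -> emotion
PRODUCTION_RULES = {
    ("happiness", "strong"): "happiness",   # R1
    ("happiness", "moderate"): "happiness", # R2
    ("happiness", "weak"): "neutral",       # R3
    ("neutral", "strong"): "happiness",     # R4
    ("neutral", "moderate"): "neutral",     # R5
    ("neutral", "weak"): "neutral",         # R6
    ("sadness", "strong"): "fear",          # R7
    ("sadness", "moderate"): "sadness",     # R8
    ("sadness", "weak"): "sadness",         # R9
    ("fear", "strong"): "fear",             # R10
    ("fear", "moderate"): "fear",           # R11
    ("fear", "weak"): "sadness",            # R12
}


def production_rule_fusion(face_state, eeg_intensity):
    """Fire the single rule whose antecedent (r_1, r_2') matches."""
    return PRODUCTION_RULES[(face_state, eeg_intensity)]
```

Because the antecedents partition all twelve (state, intensity) pairs, exactly one rule fires for any input.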
3. Experiment
Two experiments, offline and online, were conducted in this study; the data of the first experiment were used for training. Twenty healthy 19- to 33-year-old subjects from the local research unit attended the experiments. During the experiments, the subjects were seated in a comfortable chair and instructed to avoid blinking or moving their bodies.
3.1. Experiment 1 (Offline). The data collected in this experiment consisted of 40 trials for each subject. At the beginning of each trial, a fixation cross was first presented at the center of the GUI to capture the subjects' attention. After 2 s, movie clips inducing different emotional conditions were presented at the center of the GUI in a random order. Each movie clip was presented for 2-3 minutes, preceded by 5 s of a blank screen as the start hint. At the end of each trial, subjects were asked to review the movie clip, assign valence and arousal ratings, and rate the specific emotions they had experienced. The rating procedure lasted approximately 60 s, and there was a 10-s break between two consecutive trials for emotional recovery. During each trial, we collected 100 human face images using a camera and 200 groups of EEG signals using a Mindwave Mobile device. Valence and arousal ratings were obtained using the Self-Assessment Manikin (SAM) [15]. Four basic emotion states (happiness, neutrality, sadness, and fear) and three emotion intensities (strong, ordinary, and weak) were evaluated in this study. The self-reported emotion states and intensity levels were used to verify the facial and EEG emotion classifications. We used the images and corresponding emotion states to train a feedforward neural network classifier. We used the EEG signals and the corresponding
Figure 3: Example screenshots of face videos from Experiment 2: (a) a happiness emotion effect; (b) a neutral emotion effect; (c) a sadness emotion effect.
emotion states to train an SVM classifier. A different neural network classifier and a different SVM classifier were fitted for each subject; both were used in Experiment 2 to detect emotion states.
3.2. Experiment 2 (Online). This experiment was composed of 40 trials for each subject, corresponding to the 40 movie clips evaluated in Experiment 1. The procedure of each trial was similar to that in Experiment 1. However, at the end of each movie clip, 3 different detectors (a face expression detector, an EEG detector, and the first fusion detector) were used to determine the emotion state. If the detection result was correct, positive feedback consisting of auditory applause was given for 4 s; otherwise, no feedback was given. For performance evaluation, the online accuracy was calculated as the ratio of the number of correct predictions to the total number of presented trials. Figure 3 shows several screenshots of face videos from Experiment 2: Figure 3(a) shows a subject watching a lively movie clip, Figure 3(b) a subject watching a normal movie clip, and Figure 3(c) a subject watching a sad movie clip.
3.3. Data Analysis (Offline). To validate the second fusion method, which combines the type of emotion and the intensity level, an offline data analysis was conducted. For the data set of Experiment 1, we used the images and corresponding emotion states to train a feedforward neural network classifier and used the EEG signals and corresponding emotion intensities to train an SVM classifier. For the data set of Experiment 2, we used the second fusion detector based on the production rules to determine the emotion state and calculated the corresponding offline accuracy rates.
4. Results
The average accuracies of the two fusion methods for the twenty subjects are shown in Table 2, together with the classification accuracies of the face expression detection and the EEG detection. The accuracy of the first fusion detection, using a sum rule, is 81.25%, and the accuracy of the second fusion detection, using a production rule, is 82.75%; both are higher than that of facial expression detection (74.38%) or EEG detection (66.88%). Specifically, seventeen of the 20 subjects achieved their highest accuracies using the fusion methods. Moreover, accuracies in each of the three detections were compared using paired t-tests, with results considered significant when p values were below 0.05. The statistical analysis indicated the following: (i) higher accuracies were achieved for the two fusion methods than for the face expression detection or the EEG detection (the first fusion method versus face expression detection, p = 0.03; the first fusion method versus EEG detection, p < 0.01; the second fusion method versus face expression detection, p = 0.03; the second fusion method versus EEG detection, p < 0.01); (ii) the accuracies were not significantly different between the face expression detection and the EEG detection
Table 2: The accuracies (%) for the detections of face expression, EEG, and the two fusion methods.

Subject   Face expression   EEG     First fusion (online)   Second fusion (offline)
1         92.5              67.5    92.5                    87.5
2         57.5              70.0    82.5                    87.5
3         50.0              72.5    87.5                    87.5
4         62.5              75.0    92.5                    87.5
5         60.0              60.0    75.0                    75.0
6         87.5              75.0    95.0                    92.5
7         72.5              72.5    72.5                    80.0
8         70.0              70.0    80.0                    87.5
9         92.5              60.0    75.0                    80.0
10        85.0              62.5    72.5                    80.0
11        67.5              72.5    80.0                    80.0
12        80.0              75.0    85.0                    85.0
13        92.5              57.5    92.5                    87.5
14        72.5              55.0    77.5                    80.0
15        70.0              52.5    75.0                    77.5
16        92.5              62.5    77.5                    77.5
17        77.5              57.5    90.0                    87.5
18        92.5              80.0    92.5                    85.0
19        50.0              62.5    60.0                    75.0
20        62.5              77.5    70.0                    75.0
Average   74.38 ± 14.55     66.88 ± 8.19    81.25 ± 9.47    82.75 ± 5.19
(p = 0.08); (iii) the accuracies were also not significantly different between the first and the second fusion methods (the first fusion method versus the second fusion method, p = 0.56). Furthermore, low accuracies were obtained for subjects 5, 19, and 20. This could be attributed to their less expressive facial expressions, or perhaps our approach is less sensitive to them.
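The paired comparisons can be reproduced from the per-subject accuracies in Table 2; a sketch using SciPy's two-sided paired t-test (the variable names are ours):

```python
import numpy as np
from scipy.stats import ttest_rel

# Per-subject accuracies (%) transcribed from Table 2
face = np.array([92.5, 57.5, 50.0, 62.5, 60.0, 87.5, 72.5, 70.0, 92.5, 85.0,
                 67.5, 80.0, 92.5, 72.5, 70.0, 92.5, 77.5, 92.5, 50.0, 62.5])
eeg = np.array([67.5, 70.0, 72.5, 75.0, 60.0, 75.0, 72.5, 70.0, 60.0, 62.5,
                72.5, 75.0, 57.5, 55.0, 52.5, 62.5, 57.5, 80.0, 62.5, 77.5])
fusion1 = np.array([92.5, 82.5, 87.5, 92.5, 75.0, 95.0, 72.5, 80.0, 75.0, 72.5,
                    80.0, 85.0, 92.5, 77.5, 75.0, 77.5, 90.0, 92.5, 60.0, 70.0])

_, p_face = ttest_rel(fusion1, face)  # reported in the text as p = 0.03
_, p_eeg = ttest_rel(fusion1, eeg)    # reported in the text as p < 0.01
```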
5. Discussion
This paper employs information fusion technology, combining facial expression recognition and EEG-based emotion recognition. The stimuli are based on a subset of movie clips that correspond to four specific areas of valence-arousal emotional space (happiness, neutral, sadness, and fear). The four emotion states are detected by both facial expression and EEG, and emotion recognition is based on a decision-level fusion of the two detections. Twenty healthy subjects attended two experiments. The results show that the accuracies of the two information fusion detections are 81.25% and 82.75%, both higher than that of facial expression detection (74.38%) or EEG detection (66.88%).
The notion that combining brain and peripheral physiological signals will result in more accurate emotion recognition than using these variables on their own seems very sensible and has frequently been suggested in the literature as a potential way to improve emotion recognition [16]. However, a few studies explicitly mention that combining physiological information did not result in reliable improvement (e.g., [17–19]), or only to a modest degree in one of multiple conditions without statistical evidence (e.g., [5]). In this study, the experimental results and the statistical analysis have provided clear evidence for the benefit of multimodal combination for emotion recognition. One explanation is that the emotion state involves multiple processes that are presumably reflected by different types of variables (e.g., cognitive processes by EEG and physical changes by peripheral facial expression measures).
In this study, we did find significant improvement for the multimodal fusion detection compared to the single-modality detections. The reason could be that the facial expression detection has a fast and strong but fluctuating response, whereas the EEG detection has a smooth but stable response over the trial time [20]. Specifically, there is high volatility in real emotion recognition based only on facial expressions, because subjects are able to trick the machine as long as they know how to feign facial expressions. In this respect, the drawbacks of facial expression detection can be compensated for by the EEG detection to a very large extent. Thus, the facial expression detection and EEG detection are irreplaceable and complementary to each other, and the multimodal fusion should achieve higher accuracies using both detections than using either one alone. This was demonstrated by the data analysis results in Table 2.
While most studies combine information by fusion at the feature level, we reasoned that fusion at the decision level could have contributed to finding a strong, reliable advantage of combining information. On the one hand, fusion at the feature level is difficult to achieve in practice because the feature sets of the various modalities may not be compatible (e.g., brain and peripheral physiological signals in this study) [21], and most commercial biometric systems do not provide access to the feature sets or the raw data that they use in their products [22]. On the other hand, the advantage of decision-level fusion is that all knowledge about the different signals can be applied separately [23]. In this study, the facial expression and EEG signals have their own capabilities and limitations, as mentioned above, and we can use this information to optimize the detection performance. In the decision-level fusion, it is relatively easy to access and combine the scores generated by the neural network and SVM classifiers pertaining to the facial expression and EEG modalities. Thus, fusion at the decision level is preferred in this study.
For the decision-level fusion, the classifier selection for the facial expression and EEG detections is also important. Several properties have to be taken into consideration, such as the long-term variability of facial expression signals and the availability of only small data sets of EEG signals. First, neural network-based methods are found to be particularly promising for facial expression recognition, since neural networks can easily implement the mapping from the feature space of face images to the facial expression space [24]. Second, a neural network model generally requires a large amount of high-quality data for training; in this study, the EEG signals recorded by a one-electrode mobile device could lack sufficient training data for a neural network-based method. Third, SVM is known to have good generalization properties and to be insensitive to overtraining and to the curse of dimensionality, especially on small data sets [25]. It should be noted that SVM classifiers are widely used in EEG-based brain-computer interfaces in practice [26–29]. Furthermore, some modified support vector classification (SVC) methods have the advantage of using a regularization parameter to control the number of support vectors and margin errors. For example, Gu and Sheng developed a modified SVC formulation based on a sum-of-margins strategy to achieve better online accuracy than the existing incremental SVC algorithm [30]. They further proposed a robust SVC method based on lower-upper decomposition with partial pivoting, which requires fewer steps and less running time than the original one [31]. Taken together, the neural network classifier was used for the facial expression detection and the SVM classifier was used for the EEG detection in this study.
Two multimodal fusion methods are proposed in this study. For the first fusion method, SVM classifies the EEG signal into the four types of emotion, and fusion is performed using a sum rule. For the second fusion method, SVM classifies the EEG signal into three intensity levels (weak, moderate, and strong), and fusion is performed using a production rule. It is interesting to note that the second fusion method, combining the type of emotion and the intensity level, yields average accuracies comparable to those of the first fusion method. Indeed, it might very well be what humans do for emotion recognition; for example, an expression of weak happiness is typically read as neutral, whereas a strong expression of sadness usually evokes fear.
For the results of Experiment 2, average accuracies of 81.25% (online) and 82.75% (offline) were achieved by the two fusion methods for four-class emotion recognition. Superior performance was obtained compared to state-of-the-art results [3, 11, 32]. The authors of [3] reported an average accuracy of 78.4% by using optical flow to determine the main direction of movement of the muscles. In [11], the IAPS was used as stimuli, and the use of self-assessment labels for arousal assessment yielded accuracies of 55%, 53%, and 54% for EEG, physiological, and fused features, respectively. Zheng and his colleagues presented an emotion recognition method combining EEG signals and pupillary response collected from an eye tracker and achieved average accuracies of 73.59% and 72.98% for three emotion states using a feature-level fusion strategy and a decision-level fusion strategy, respectively.
This study still has open issues that need to be considered in the future. At the present stage, the image data set we obtained is very limited, and the EEG signals used for analysis were recorded from only one electrode. In the future, we will collect more image data from more subjects and use a more complex model to train our data to yield a classifier with better performance. Furthermore, we could consider an EEG device with more electrodes to obtain higher-quality data.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This study was supported by the National Natural Science Foundation of China under Grant 61503143, the Guangdong Natural Science Foundation under Grant 2014A030310244, and the Pearl River S&T Nova Program of Guangzhou under Grant 201710010038.
References

[1] J. Gratch and S. Marsella, "Evaluating a computational model of emotion," Autonomous Agents and Multi-Agent Systems, vol. 11, no. 1, pp. 23–43, 2005.
[2] P. Ekman and W. V. Friesen, "Facial action coding system: a technique for the measurement of facial movement," Rivista di Psichiatria, vol. 47, no. 2, pp. 126–138, 1978.
[3] K. Mase, "Recognition of facial expression from optical flow," IEICE Transactions, vol. 74, no. 10, pp. 3474–3483, 1991.
[4] R. W. Picard and S. B. Daily, "Evaluating affective interactions: alternatives to asking what users feel," in Proceedings of the CHI Workshop on Evaluating Affective Interfaces: Innovative Approaches (CHI '05), April 2005.
[5] G. Chanel, J. J. M. Kierkels, M. Soleymani, and T. Pun, "Short-term emotion assessment in a recall paradigm," International Journal of Human Computer Studies, vol. 67, no. 8, pp. 607–627, 2009.
[6] S. Nasehi, H. Pourghassem, and I. Isfahan, "An optimal EEG-based emotion recognition algorithm using gabor," WSEAS Transactions on Signal Processing, vol. 3, no. 8, pp. 87–99, 2012.
[7] K. Ishino and M. Hagiwara, "A feeling estimation system using a simple electroencephalograph," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 4204–4209, October 2003.
[8] Z. Khalili and M. H. Moradi, "Emotion recognition system using brain and peripheral signals: using correlation dimension to improve the results of EEG," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '09), pp. 1571–1575, IEEE, June 2009.
[9] C. Busso, Z. Deng, S. Yildirim et al., "Analysis of emotion recognition using facial expressions, speech and multimodal information," in Proceedings of the 6th International Conference on Multimodal Interfaces, pp. 205–211, ACM, October 2004.
[10] J. Wagner, E. Andre, F. Lingenfelser, and J. Kim, "Exploring fusion methods for multimodal emotion recognition with missing data," IEEE Transactions on Affective Computing, vol. 2, no. 4, pp. 206–218, 2011.
[11] P. J. Lang, M. M. Bradley, and B. N. Cuthbert, "International affective picture system (IAPS): affective ratings of pictures and instruction manual," Tech. Rep. A-8, 2008.
[12] Y. Li, J. Pan, F. Wang, and Z. Yu, "A hybrid BCI system combining P300 and SSVEP and its application to wheelchair control," IEEE Transactions on Biomedical Engineering, vol. 60, no. 11, pp. 3156–3166, 2013.
[13] A. Zambrano, C. Toro, M. Nieto, R. Sotaquira, C. Sanín, and E. Szczerbicki, "Video semantic analysis framework based on run-time production rules: towards cognitive vision," Journal of Universal Computer Science, vol. 21, no. 6, pp. 856–870, 2015.
[14] A. T. Tzallas, I. Tsoulos, M. G. Tsipouras, N. Giannakeas, I. Androulidakis, and E. Zaitseva, "Classification of EEG signals using feature creation produced by grammatical evolution," in Proceedings of the 24th Telecommunications Forum (TELFOR '16), pp. 1–4, IEEE, November 2016.
[15] M. M. Bradley and P. J. Lang, "Measuring emotion: the self-assessment manikin and the semantic differential," Journal of Behavior Therapy and Experimental Psychiatry, vol. 25, no. 1, pp. 49–59, 1994.
[16] M. A. Hogervorst, A.-M. Brouwer, and J. B. F. van Erp, "Combining and comparing EEG, peripheral physiology and eye-related measures for the assessment of mental workload," Frontiers in Neuroscience, vol. 8, article 322, 2014.
[17] J. C. Christensen, J. R. Estepp, G. F. Wilson, and C. A. Russell, "The effects of day-to-day variability of physiological data on operator functional state classification," NeuroImage, vol. 59, no. 1, pp. 57–63, 2012.
[18] E. B. J. Coffey, A.-M. Brouwer, and J. B. F. van Erp, "Measuring workload using a combination of electroencephalography and near infrared spectroscopy," in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, pp. 1822–1826, SAGE Publications, Boston, Mass, USA, October 2012.
[19] M. Severens, J. Farquhar, J. Duysens, and P. Desain, "A multi-signature brain-computer interface: use of transient and steady-state responses," Journal of Neural Engineering, vol. 10, no. 2, Article ID 026005, 2013.
[20] R. Leeb, H. Sagha, R. Chavarriaga, and J. D. R. Millan, "A hybrid brain-computer interface based on the fusion of electroencephalographic and electromyographic activities," Journal of Neural Engineering, vol. 8, no. 2, Article ID 025011, 2011.
[21] K. Chang, K. Bowyer, and P. Flynn, "Face recognition using 2D and 3D facial data," in Proceedings of the ACM Workshop on Multimodal User Authentication, pp. 25–32, 2003.
[22] A. Ross and A. K. Jain, "Multimodal biometrics: an overview," in Proceedings of the 12th European Signal Processing Conference, pp. 1221–1224, IEEE, 2004.
[23] A. Ross and A. Jain, "Information fusion in biometrics," Pattern Recognition Letters, vol. 24, no. 13, pp. 2115–2125, 2003.
[24] L. Ma and K. Khorasani, "Facial expression recognition using constructive feedforward neural networks," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 3, pp. 1588–1595, 2004.
[25] F. Lotte, M. Congedo, A. Lecuyer, F. Lamarche, and B. Arnaldi, "A review of classification algorithms for EEG-based brain-computer interfaces," Journal of Neural Engineering, vol. 4, no. 2, pp. R1–R13, 2007.
[26] A. Temko, E. Thomas, W. Marnane, G. Lightbody, and G. Boylan, "EEG-based neonatal seizure detection with Support Vector Machines," Clinical Neurophysiology, vol. 122, no. 3, pp. 464–473, 2011.
[27] Y. Li, J. Pan, J. Long et al., "Multimodal BCIs: target detection, multidimensional control, and awareness evaluation in patients with disorder of consciousness," Proceedings of the IEEE, vol. 104, no. 2, pp. 332–352, 2016.
[28] J. Pan, Y. Li, Z. Gu, and Z. Yu, "A comparison study of two P300 speller paradigms for brain-computer interface," Cognitive Neurodynamics, vol. 7, no. 6, pp. 523–529, 2013.
[29] M. Mohammadpour, S. M. R. Hashemi, and N. Houshmand, "Classification of EEG-based emotion for BCI applications," in Proceedings of the Artificial Intelligence and Robotics (IRANOPEN '17), pp. 127–131, IEEE, April 2017.
[30] B. Gu, V. S. Sheng, K. Y. Tay, W. Romano, and S. Li, "Incremental support vector learning for ordinal regression," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 7, pp. 1403–1416, 2015.
[31] B. Gu and V. S. Sheng, "A robust regularization path algorithm for ν-support vector classification," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 5, pp. 1241–1248, 2016.
[32] W.-L. Zheng and B.-L. Lu, "A multimodal approach to estimating vigilance using EEG and forehead EOG," Journal of Neural Engineering, vol. 14, no. 2, Article ID 026017, 2017.
[26] A Temko E Thomas W Marnane G Lightbody and GBoylan ldquoEEG-based neonatal seizure detection with SupportVector Machinesrdquo Clinical Neurophysiology vol 122 no 3 pp464ndash473 2011
[27] Y Li J Pan J Long et al ldquoMultimodal BCIs target detectionmultidimensional control and awareness evaluation in patientswith disorder of consciousnessrdquo Proceedings of the IEEE vol104 no 2 pp 332ndash352 2016
[28] J Pan Y Li Z Gu and Z Yu ldquoA comparison study of twoP300 speller paradigms for brain-computer interfacerdquoCognitiveNeurodynamics vol 7 no 6 pp 523ndash529 2013
[29] M Mohammadpour S M R Hashemi and N HoushmandldquoClassification of EEG-based emotion for BCI applicationsrdquoin Proceedings of the Artificial Intelligence and Robotics (IRA-NOPEN rsquo17) pp 127ndash131 IEEE April 2017
[30] B Gu V S Sheng K Y TayW Romano and S Li ldquoIncrementalsupport vector learning for ordinal regressionrdquo IEEE Transac-tions on Neural Networks and Learning Systems vol 26 no 7pp 1403ndash1416 2015
[31] B Gu and V S Sheng ldquoA robust regularization path algorithmfor ]-support vector classificationrdquo IEEETransactions onNeuralNetworks and Learning Systems vol 28 no 5 pp 1241ndash12482016
[32] W-L Zheng and B-L Lu ldquoA multimodal approach to estimat-ing vigilance using EEG and forehead EOGrdquo Journal of NeuralEngineering vol 14 no 2 Article ID 026017 2017
4 Computational Intelligence and Neuroscience
sadness, and fear) detected by the EEG classifier. We normalize the four scores by mapping them to the range [0, 1]:

$$s_{2j} = \frac{s_{2j} - \min\{s_{21}, \ldots, s_{24}\}}{\max\{s_{21}, \ldots, s_{24}\} - \min\{s_{21}, \ldots, s_{24}\}}, \qquad r_2 = \arg\max_{j}\left(s_{2j}\right). \tag{7}$$

The four normalized scores $s_{2j}$ ($j = 1, 2, 3, 4$) and the index of the maximum score $r_2$, representing the output of the emotion state in the EEG detection, are used in the first fusion method described later.
Applying the second trained SVM classifier to the feature vectors, we obtain three scores $s'_{2k}$ ($k = 1, 2, 3$) corresponding to the three emotion intensity levels (weak, moderate, and strong) and find the index of the maximum score:

$$r'_2 = \arg\max_{k}\left(s'_{2k}\right). \tag{8}$$

The index of the maximum score $r'_2$, representing the output of the emotion intensity level in the EEG detection, is used in the second fusion method described later.
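The min-max normalization in (7) and the winning-index selection in (7)-(8) can be sketched as follows; the function and variable names are ours, not the authors', and the scores are illustrative placeholders.

```python
import numpy as np

def normalize_scores(s):
    """Min-max normalize classifier scores to [0, 1], as in Eq. (7)."""
    s = np.asarray(s, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

def decide(s):
    """Return (normalized scores, index of the winning class), as in Eqs. (7)-(8)."""
    s_norm = normalize_scores(s)
    return s_norm, int(np.argmax(s_norm))

# e.g., four hypothetical EEG emotion-state scores for
# (happiness, neutral, sadness, fear)
s2_norm, r2 = decide([0.2, 0.8, 0.5, 0.1])  # r2 = 1, i.e., neutral wins
```

Because min-max normalization is monotonic, the argmax is unchanged by it; the normalized scores matter only when they are later summed with the facial expression scores in the first fusion method.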
2.2.3. Classification Fusion. In the decision-level fusion, the outputs generated by the two classifiers of the facial expression and EEG detections are combined. We employ two fusion methods of the EEG and facial expression detections, as follows.
For the first fusion method, we apply the sum strategy (e.g., [12]) to the decision-level fusion. Specifically, we calculate the sum of the normalized facial expression classifier scores $s_{1j}$ and EEG classifier scores $s_{2j}$ for each of the four emotion states and then find the maximum of the four summed values:

$$\mathrm{sum}_j = s_{1j} + s_{2j} \quad (j = 1, 2, 3, 4), \qquad r_{\mathrm{sum}} = \arg\max_{j}\left(\mathrm{sum}_j\right), \tag{9}$$

where $s_{1j}$ and $s_{2j}$ ($j = 1, 2, 3, 4$) are calculated in (1) and (6) and $r_{\mathrm{sum}}$ is the index corresponding to the maximum of the summed values.
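The sum rule in (9) amounts to an element-wise addition followed by an argmax. A minimal sketch, with hypothetical normalized scores (names and values are ours):

```python
import numpy as np

def sum_rule_fusion(s1_norm, s2_norm):
    """Decision-level fusion by the sum rule, Eq. (9): add the normalized
    face and EEG scores per emotion state, then pick the largest sum."""
    total = np.asarray(s1_norm, dtype=float) + np.asarray(s2_norm, dtype=float)
    return int(np.argmax(total))

# Hypothetical normalized scores for (happiness, neutral, sadness, fear):
face = [0.9, 0.4, 0.1, 0.0]  # face classifier favors happiness
eeg  = [0.3, 1.0, 0.2, 0.0]  # EEG classifier favors neutral
r_sum = sum_rule_fusion(face, eeg)  # sums are [1.2, 1.4, 0.3, 0.0] -> index 1 (neutral)
```

Note that the sum rule can override the face classifier's top choice when the EEG classifier disagrees strongly, which is exactly the complementarity the fusion is meant to exploit.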
For the second fusion method, we adopt a decision-making strategy based on production rules, which are commonly used as a simple expert system in cognitive modeling and artificial intelligence (e.g., [13, 14]). Through the production rules, the four emotion states (happiness, neutral, sadness, or fear) and the three emotion intensity levels (strong, moderate, or weak) are combined for emotion recognition. A production rule consists of an IF part (a condition or premise) and a THEN part (an action or conclusion). The form of a production rule is

$$R_i: \text{IF } P \text{ THEN } Q, \tag{10}$$

where $R_i$ represents rule $i$, $P$ is the antecedent of rule $i$, and $Q$ is the consequent of rule $i$. In this study, $P$ is formed by $(r_1, r'_2)$, where $r_1$ represents the emotion state detected by facial expression and $r'_2$ represents the emotion intensity level detected by EEG. The production rules are defined as shown in Table 1.

Table 1: The production rules combining the emotion state and intensity level.

  R_i    P                       Q
  R1     (Happiness, strong)     Happiness
  R2     (Happiness, moderate)   Happiness
  R3     (Happiness, weak)       Neutral
  R4     (Neutral, strong)       Happiness
  R5     (Neutral, moderate)     Neutral
  R6     (Neutral, weak)         Neutral
  R7     (Sadness, strong)       Fear
  R8     (Sadness, moderate)     Sadness
  R9     (Sadness, weak)         Sadness
  R10    (Fear, strong)          Fear
  R11    (Fear, moderate)        Fear
  R12    (Fear, weak)            Sadness
A rule is triggered as soon as its condition is met. For example, in production rule $R_3$, if the emotion state detected by facial expression is happiness and the emotion intensity level detected by EEG is weak, the final result of the emotion recognition is neutral.
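Since each rule has a distinct antecedent, the twelve rules of Table 1 reduce to a lookup table. A minimal sketch (dictionary encoding and function name are ours):

```python
# Table 1 encoded as a lookup:
# (emotion state from face, intensity level from EEG) -> final emotion.
PRODUCTION_RULES = {
    ("happiness", "strong"):   "happiness",   # R1
    ("happiness", "moderate"): "happiness",   # R2
    ("happiness", "weak"):     "neutral",     # R3
    ("neutral",   "strong"):   "happiness",   # R4
    ("neutral",   "moderate"): "neutral",     # R5
    ("neutral",   "weak"):     "neutral",     # R6
    ("sadness",   "strong"):   "fear",        # R7
    ("sadness",   "moderate"): "sadness",     # R8
    ("sadness",   "weak"):     "sadness",     # R9
    ("fear",      "strong"):   "fear",        # R10
    ("fear",      "moderate"): "fear",        # R11
    ("fear",      "weak"):     "sadness",     # R12
}

def fuse_by_production_rule(face_state, eeg_intensity):
    """Second fusion method: fire the matching IF-THEN rule from Table 1."""
    return PRODUCTION_RULES[(face_state, eeg_intensity)]

final = fuse_by_production_rule("happiness", "weak")  # rule R3 -> "neutral"
```

The dictionary form also makes the rule base easy to audit: every (state, intensity) pair maps to exactly one conclusion, so no two rules can conflict.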
3. Experiment
Two experiments, one offline and one online, were conducted in this study; the data from the first experiment were used for training. Twenty healthy 19- to 33-year-old subjects from the local research unit attended the experiments. During the experiments, the subjects were seated in a comfortable chair and instructed to avoid blinking or moving their bodies.
3.1. Experiment 1 (Offline). The data collected in this experiment consisted of 40 trials for each subject. At the beginning of each trial, a fixation cross was first presented at the center of the GUI to capture the subjects' attention. After 2 s, movie clips inducing different emotional conditions were presented at the center of the GUI in a random order. Each movie clip was presented for 2-3 minutes, preceded by 5 s of a blank screen as the start hint. At the end of each trial, subjects were asked to assign valence and arousal ratings to the movie clip and to rate the specific emotions they had experienced. The rating procedure lasted approximately 60 s. There was a 10-s break between two consecutive trials for emotional recovery. During each trial, we collected 100 human face images using a camera and 200 groups of EEG signals using a Mindwave Mobile device. Valence and arousal ratings were obtained using the Self-Assessment Manikin (SAM) [15]. Four basic emotion states (happiness, neutral, sadness, and fear) and three emotion intensity levels (strong, moderate, and weak) were evaluated in this study. The self-reported emotion states and intensity levels were used to verify the facial and EEG emotion classifications. We used the images and corresponding emotion states to train a feedforward neural network classifier. We used the EEG signals and corresponding
Figure 3: Example screenshots of face videos from Experiment 2. (a) A happiness emotion effect; (b) a neutral emotion effect; (c) a sadness emotion effect.
emotion states to train an SVM classifier. A different neural network classifier and a different SVM classifier were fitted for each subject; both were used in Experiment 2 to detect emotion states.
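The per-subject training step could be sketched with scikit-learn stand-ins, an `MLPClassifier` for the feedforward facial expression network and an `SVC` for the EEG emotion classifier; the feature arrays, hyperparameters, and function name below are placeholders of ours, not the authors' exact pipeline.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def train_subject_models(face_features, face_labels, eeg_features, eeg_labels):
    """Fit one facial-expression network and one EEG SVM for a single subject."""
    face_clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
    face_clf.fit(face_features, face_labels)
    eeg_clf = SVC(kernel="rbf", probability=True, random_state=0)
    eeg_clf.fit(eeg_features, eeg_labels)
    return face_clf, eeg_clf

# Toy stand-in data: 40 face samples x 10 features, 40 EEG samples x 5 features,
# labels 0..3 for (happiness, neutral, sadness, fear).
rng = np.random.default_rng(0)
X_face = rng.normal(size=(40, 10))
X_eeg = rng.normal(size=(40, 5))
y = np.arange(40) % 4  # all four classes present
face_clf, eeg_clf = train_subject_models(X_face, y, X_eeg, y)
scores = eeg_clf.predict_proba(X_eeg[:1])  # four per-state scores for later fusion
```

Fitting one model pair per subject, as the paper does, sidesteps cross-subject variability in both facial appearance and EEG, at the cost of requiring a calibration session for every new user.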
3.2. Experiment 2 (Online). This experiment was composed of 40 trials for each subject, corresponding to the 40 movie clips evaluated in Experiment 1. The procedure of each trial was similar to that in Experiment 1. However, at the end of each movie clip, three different detectors (a facial expression detector, an EEG detector, and the first fusion detector) were used to determine the emotion state. If the detection result was correct, positive feedback consisting of auditory applause was given for 4 s; otherwise, no feedback was given. For performance evaluation, the online accuracy was calculated as the ratio of the number of correct predictions to the total number of presented trials. Figure 3 shows several screenshots of face videos from Experiment 2: Figure 3(a) shows a subject watching a lively movie clip, Figure 3(b) a subject watching a normal movie clip, and Figure 3(c) a subject watching a sad movie clip.
3.3. Data Analysis (Offline). To validate the second fusion method, which combines the type of emotion and the intensity level, an offline data analysis was conducted. For the data set of Experiment 1, we used the images and corresponding emotion states to train a feedforward neural network classifier and used the EEG signals and corresponding emotion intensities to train an SVM classifier. For the data set of Experiment 2, we used the second fusion detector based on the production rules to determine the emotion state and calculated the corresponding offline accuracy rates.
4. Results
The average accuracies of the two fusion methods for the twenty subjects are shown in Table 2, along with the classification accuracies of the facial expression detection and the EEG detection. The accuracy of the first fusion detection, using a sum rule, is 81.25%, and the accuracy of the second fusion detection, using a production rule, is 82.75%; both are higher than that of facial expression detection (74.38%) or EEG detection (66.88%). Specifically, seventeen of the 20 subjects achieved their highest accuracies using the fusion methods. Moreover, the accuracies of the detections were compared using paired t-tests; results were considered significant when p values were below 0.05. The statistical analysis indicated the following: (i) higher accuracies were achieved for the two fusion methods than for the facial expression detection or the EEG detection (first fusion method versus facial expression detection, p = 0.03; first fusion method versus EEG detection, p < 0.01; second fusion method versus facial expression detection, p = 0.03; second fusion method versus EEG detection, p < 0.01); (ii) the accuracies were not significantly different between the facial expression detection and the EEG detection (p = 0.08); (iii) the accuracies were also not significantly different between the first and the second fusion methods (p = 0.56). Furthermore, low accuracies were obtained for subjects 5, 19, and 20, which could be attributed to these subjects having less expressive faces, or perhaps to our approach being less sensitive to them.

Table 2: The accuracies (%) for the detections of facial expression, EEG, and the two fusion methods.

  Subject   Face expression   EEG            First fusion (online)   Second fusion (offline)
  1         92.5              67.5           92.5                    87.5
  2         57.5              70.0           82.5                    87.5
  3         50.0              72.5           87.5                    87.5
  4         62.5              75.0           92.5                    87.5
  5         60.0              60.0           75.0                    75.0
  6         87.5              75.0           95.0                    92.5
  7         72.5              72.5           72.5                    80.0
  8         70.0              70.0           80.0                    87.5
  9         92.5              60.0           75.0                    80.0
  10        85.0              62.5           72.5                    80.0
  11        67.5              72.5           80.0                    80.0
  12        80.0              75.0           85.0                    85.0
  13        92.5              57.5           92.5                    87.5
  14        72.5              55.0           77.5                    80.0
  15        70.0              52.5           75.0                    77.5
  16        92.5              62.5           77.5                    77.5
  17        77.5              57.5           90.0                    87.5
  18        92.5              80.0           92.5                    85.0
  19        50.0              62.5           60.0                    75.0
  20        62.5              77.5           70.0                    75.0
  Average   74.38 ± 14.55     66.88 ± 8.19   81.25 ± 9.47            82.75 ± 5.19
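The paired t-test comparison can be reproduced from the per-subject accuracies with `scipy.stats.ttest_rel`; the sketch below compares the facial expression detection against the first fusion method, with the data transcribed from Table 2 (the script itself is ours, not the authors').

```python
from scipy import stats

# Per-subject accuracies (%) transcribed from Table 2.
face = [92.5, 57.5, 50.0, 62.5, 60.0, 87.5, 72.5, 70.0, 92.5, 85.0,
        67.5, 80.0, 92.5, 72.5, 70.0, 92.5, 77.5, 92.5, 50.0, 62.5]
fusion1 = [92.5, 82.5, 87.5, 92.5, 75.0, 95.0, 72.5, 80.0, 75.0, 72.5,
           80.0, 85.0, 92.5, 77.5, 75.0, 77.5, 90.0, 92.5, 60.0, 70.0]

# Paired (related-samples) t-test across the 20 subjects.
t_stat, p_value = stats.ttest_rel(fusion1, face)
significant = p_value < 0.05  # the paper reports p = 0.03 for this comparison
```

The paired form is the right choice here because the two accuracy lists come from the same 20 subjects, so per-subject variability cancels out of the comparison.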
5. Discussion
This paper employs information fusion technology combining facial expression recognition and EEG-based emotion recognition. The stimuli are based on a subset of movie clips that correspond to four specific areas of valence-arousal emotional space (happiness, neutral, sadness, and fear). The four emotion states are detected by both facial expression and EEG, and emotion recognition is based on a decision-level fusion of the two detections. Twenty healthy subjects attended two experiments. The results show that the accuracies of the two fusion detections are 81.25% and 82.75%, both higher than that of facial expression detection (74.38%) or EEG detection (66.88%).
The notion that combining brain and peripheral physiological signals will result in more accurate emotion recognition than using either signal on its own seems very sensible and has frequently been suggested in the literature as a potential way to improve emotion recognition [16]. However, a few studies explicitly mention that the combination of physiological information did not result in reliable improvement (e.g., [17–19]) or did so only to a modest degree in one of multiple conditions, without statistical evidence (e.g., [5]). In this study, the experimental results and the statistical analysis provide clear evidence for the benefit of multimodal combination for emotion recognition. This could be explained by the emotion state involving multiple processes that are presumably reflected by different types of variables (e.g., cognitive processes by EEG and physical changes by peripheral facial expression measures).
In this study, we did find significant improvement for the multimodal fusion detection compared to the single-modality detections. The reason could be that the facial expression detection has a fast and strong but fluctuating response, whereas the EEG detection has a smooth but stable response over the trial time [20]. Specifically, there is high volatility in real emotion recognition based only on facial expressions, because subjects are able to trick the machine as long as they know how to pretend via their facial expressions. In this respect, the drawbacks of facial expression detection can be compensated for by the EEG detection to a very large extent. Thus, the facial expression detection and EEG detection are complementary, and the multimodal fusion should achieve higher accuracies using both detections than using either one alone. This was demonstrated by the data analysis results in Table 2.
While most studies combine information by fusion at the feature level, we believe that fusing information at the decision level contributed to finding a strong, reliable advantage of combining information. On the one hand, feature-level fusion is difficult to achieve in practice because the feature sets of the various modalities may not be compatible (e.g., brain and peripheral physiological signals in this study) [21], and most commercial biometric systems provide access neither to the feature sets nor to the raw data they use in their products [22]. On the other hand, the advantage of decision-level fusion is that all knowledge about the different signals can be applied separately [23]. In this study, the facial expression and EEG signals have their own capabilities and limitations, as mentioned above, and we can use this information to optimize the detection performance. In the decision-level fusion, it is relatively easy to access and combine the scores generated by the neural network and SVM classifiers pertaining to the facial expression and EEG modalities. Thus, fusion at the decision level is preferred in this study.
For the decision-level fusion, the classifier selection for the facial expression and EEG detections is also important. Several properties have to be taken into consideration, such as the long-term variability of facial expression signals and the availability of only small data sets of EEG signals. First, neural network-based methods are found to be particularly promising for facial expression recognition, since neural networks can easily implement the mapping from the feature space of face images to the facial expression space [24]. Second, a neural network model generally requires a large amount of high-quality data for training; in this study, the EEG signals recorded by a one-electrode mobile device could lack sufficient training data for a neural network-based method. Third, SVMs are known to have good generalization properties and to be insensitive to overtraining and to the curse of dimensionality, especially on small data sets [25]. It should be noted that SVM classifiers are widely used in EEG-based brain-computer interfaces in practice [26–29]. Furthermore, some modified support vector classification (SVC) methods have the advantage of using a regularization parameter to control the number of support vectors and margin errors. For example, Gu and Sheng developed a modified SVC formulation based on a sum-of-margins strategy that achieves better online accuracy than the existing incremental SVC algorithm [30]. They further proposed a robust SVC method based on lower-upper decomposition with partial pivoting, which requires fewer steps and less running time than the original [31]. Taken together, a neural network classifier was used for the facial expression detection and an SVM classifier was used for the EEG detection in this study.
Two multimodal fusion methods are proposed in this study. For the first fusion method, the SVM classifies the EEG signal into the four types of emotion, and fusion is performed using a sum rule. For the second fusion method, the SVM classifies the EEG signal into three intensity levels (weak, moderate, and strong), and fusion is performed using a production rule. It is interesting to note that the second fusion method, which combines the type of emotion with its intensity level, yields average accuracies comparable to those of the first. Indeed, this may well resemble what humans do in emotion recognition; for example, an expression of weak happiness is typically read as neutral, whereas a strong expression of sadness usually evokes fear.
For the results of Experiment 2, average accuracies of 81.25% (online) and 82.75% (offline) were achieved by the two fusion methods for four-class emotion recognition. Superior performance was obtained compared to the state-of-the-art results [3, 11, 32]. The authors of [3] reported an average accuracy of 78.4% by using optical flow to determine the main direction of movement of the muscles. In [11], the IAPS was used as stimuli, and the use of self-assessment labels for arousal assessment yielded accuracies of 55%, 53%, and 54% for EEG, physiological, and fused features, respectively. Zheng and his colleagues presented an emotion recognition method combining EEG signals and pupillary responses collected from an eye tracker and achieved average accuracies of 73.59% and 72.98% for three emotion states using feature-level and decision-level fusion strategies, respectively.
This study still has open issues that need to be addressed in the future. At the present stage, the image data set we obtained is very limited, and the EEG signals used for analysis were recorded from only one electrode. In the future, we will collect more image data from more subjects and use a more sophisticated model to train on our data, yielding a classifier with better performance. Furthermore, we could consider an EEG device with more electrodes to obtain higher-quality data.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This study was supported by the National Natural Science Foundation of China under Grant 61503143, the Guangdong Natural Science Foundation under Grant 2014A030310244, and the Pearl River S&T Nova Program of Guangzhou under Grant 201710010038.
References
[1] J. Gratch and S. Marsella, "Evaluating a computational model of emotion," Autonomous Agents and Multi-Agent Systems, vol. 11, no. 1, pp. 23–43, 2005.
[2] P. Ekman and W. V. Friesen, "Facial action coding system: a technique for the measurement of facial movement," Rivista di Psichiatria, vol. 47, no. 2, pp. 126–138, 1978.
[3] K. Mase, "Recognition of facial expression from optical flow," IEICE Transactions, vol. 74, no. 10, pp. 3474–3483, 1991.
[4] R. W. Picard and S. B. Daily, "Evaluating affective interactions: alternatives to asking what users feel," in Proceedings of the CHI Workshop on Evaluating Affective Interfaces: Innovative Approaches (CHI '05), April 2005.
[5] G. Chanel, J. J. M. Kierkels, M. Soleymani, and T. Pun, "Short-term emotion assessment in a recall paradigm," International Journal of Human Computer Studies, vol. 67, no. 8, pp. 607–627, 2009.
[6] S. Nasehi, H. Pourghassem, and I. Isfahan, "An optimal EEG-based emotion recognition algorithm using Gabor," WSEAS Transactions on Signal Processing, vol. 3, no. 8, pp. 87–99, 2012.
[7] K. Ishino and M. Hagiwara, "A feeling estimation system using a simple electroencephalograph," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 4204–4209, October 2003.
[8] Z. Khalili and M. H. Moradi, "Emotion recognition system using brain and peripheral signals: using correlation dimension to improve the results of EEG," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '09), pp. 1571–1575, IEEE, June 2009.
[9] C. Busso, Z. Deng, S. Yildirim et al., "Analysis of emotion recognition using facial expressions, speech and multimodal information," in Proceedings of the 6th International Conference on Multimodal Interfaces, pp. 205–211, ACM, October 2004.
[10] J. Wagner, E. Andre, F. Lingenfelser, and J. Kim, "Exploring fusion methods for multimodal emotion recognition with missing data," IEEE Transactions on Affective Computing, vol. 2, no. 4, pp. 206–218, 2011.
[11] P. J. Lang, M. M. Bradley, and B. N. Cuthbert, "International affective picture system (IAPS): affective ratings of pictures and instruction manual," Tech. Rep. A-8, 2008.
[12] Y. Li, J. Pan, F. Wang, and Z. Yu, "A hybrid BCI system combining P300 and SSVEP and its application to wheelchair control," IEEE Transactions on Biomedical Engineering, vol. 60, no. 11, pp. 3156–3166, 2013.
[13] A. Zambrano, C. Toro, M. Nieto, R. Sotaquira, C. Sanín, and E. Szczerbicki, "Video semantic analysis framework based on run-time production rules: towards cognitive vision," Journal of Universal Computer Science, vol. 21, no. 6, pp. 856–870, 2015.
[14] A. T. Tzallas, I. Tsoulos, M. G. Tsipouras, N. Giannakeas, I. Androulidakis, and E. Zaitseva, "Classification of EEG signals using feature creation produced by grammatical evolution," in Proceedings of the 24th Telecommunications Forum (TELFOR '16), pp. 1–4, IEEE, November 2016.
[15] M. M. Bradley and P. J. Lang, "Measuring emotion: the self-assessment manikin and the semantic differential," Journal of Behavior Therapy and Experimental Psychiatry, vol. 25, no. 1, pp. 49–59, 1994.
[16] M. A. Hogervorst, A.-M. Brouwer, and J. B. F. van Erp, "Combining and comparing EEG, peripheral physiology and eye-related measures for the assessment of mental workload," Frontiers in Neuroscience, vol. 8, article 322, 2014.
[17] J. C. Christensen, J. R. Estepp, G. F. Wilson, and C. A. Russell, "The effects of day-to-day variability of physiological data on operator functional state classification," NeuroImage, vol. 59, no. 1, pp. 57–63, 2012.
[18] E. B. J. Coffey, A.-M. Brouwer, and J. B. F. van Erp, "Measuring workload using a combination of electroencephalography and near infrared spectroscopy," in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, pp. 1822–1826, SAGE Publications, Boston, Mass, USA, October 2012.
[19] M. Severens, J. Farquhar, J. Duysens, and P. Desain, "A multi-signature brain-computer interface: use of transient and steady-state responses," Journal of Neural Engineering, vol. 10, no. 2, Article ID 026005, 2013.
[20] R. Leeb, H. Sagha, R. Chavarriaga, and J. D. R. Millan, "A hybrid brain-computer interface based on the fusion of electroencephalographic and electromyographic activities," Journal of Neural Engineering, vol. 8, no. 2, Article ID 025011, 2011.
[21] K. Chang, K. Bowyer, and P. Flynn, "Face recognition using 2D and 3D facial data," in Proceedings of the ACM Workshop on Multimodal User Authentication, pp. 25–32, 2003.
[22] A. Ross and A. K. Jain, "Multimodal biometrics: an overview," in Proceedings of the 12th European Signal Processing Conference, pp. 1221–1224, IEEE, 2004.
[23] A. Ross and A. Jain, "Information fusion in biometrics," Pattern Recognition Letters, vol. 24, no. 13, pp. 2115–2125, 2003.
[24] L. Ma and K. Khorasani, "Facial expression recognition using constructive feedforward neural networks," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 3, pp. 1588–1595, 2004.
[25] F. Lotte, M. Congedo, A. Lecuyer, F. Lamarche, and B. Arnaldi, "A review of classification algorithms for EEG-based brain-computer interfaces," Journal of Neural Engineering, vol. 4, no. 2, pp. R1–R13, 2007.
[26] A. Temko, E. Thomas, W. Marnane, G. Lightbody, and G. Boylan, "EEG-based neonatal seizure detection with Support Vector Machines," Clinical Neurophysiology, vol. 122, no. 3, pp. 464–473, 2011.
[27] Y. Li, J. Pan, J. Long et al., "Multimodal BCIs: target detection, multidimensional control, and awareness evaluation in patients with disorder of consciousness," Proceedings of the IEEE, vol. 104, no. 2, pp. 332–352, 2016.
[28] J. Pan, Y. Li, Z. Gu, and Z. Yu, "A comparison study of two P300 speller paradigms for brain-computer interface," Cognitive Neurodynamics, vol. 7, no. 6, pp. 523–529, 2013.
[29] M. Mohammadpour, S. M. R. Hashemi, and N. Houshmand, "Classification of EEG-based emotion for BCI applications," in Proceedings of the Artificial Intelligence and Robotics (IRANOPEN '17), pp. 127–131, IEEE, April 2017.
[30] B. Gu, V. S. Sheng, K. Y. Tay, W. Romano, and S. Li, "Incremental support vector learning for ordinal regression," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 7, pp. 1403–1416, 2015.
[31] B. Gu and V. S. Sheng, "A robust regularization path algorithm for ν-support vector classification," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 5, pp. 1241–1248, 2016.
[32] W.-L. Zheng and B.-L. Lu, "A multimodal approach to estimating vigilance using EEG and forehead EOG," Journal of Neural Engineering, vol. 14, no. 2, Article ID 026017, 2017.
Submit your manuscripts athttpswwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 201
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 201
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience 5
(a) A happiness emotion effect (b) A neutral emotion effect
(c) A sadness emotion effect
Figure 3 Example screenshots of face videos from Experiment 2
emotion states to train a SVM classifier A different neuralnetwork classifier and a different SVM classifier were fitted toeach subject They were both used in Experiment 2 to detectemotion states
32 Experiment 2 (Online) This experiment was composedof 40 trials for each subject corresponding to the 40 movieclips evaluated in Experiment 1 The procedure of each trialwas similar to that in Experiment 1 However at the endof each movie clip 3 different detectors (a face expressiondetector EEG detectors and the first fusion detector) wereused to determine the emotion state If the detection resultwas correct positive feedback consisting of auditory applauseoccurred for 4 s Otherwise no feedback was given Forperformance evaluation the online accuracy was calculatedas the ratio of the number of correct predictions to thetotal number of presented trials Figure 3 shows severalscreenshots of face videos from Experiment 2 Figure 3(a)shows a subject who was watching a lively movie clipFigure 3(b) shows a subject who was watching a normalmovie clip Figure 3(c) shows a subject who was watching asad movie clip
3.3. Data Analysis (Offline). To validate the second fusion method combining type of emotion and intensity level, an offline data analysis was conducted. For the data set of Experiment 1, we used images and corresponding emotion states to train a feedforward neural network classifier and used EEG signals and corresponding emotion intensities to train an SVM classifier. For the data set of Experiment 2, we used the second fusion detector based on the production rules to determine the emotion state and calculated the corresponding offline accuracy rates.
4. Results
The average accuracies of the two fusion methods for the twenty subjects are shown in Table 2, together with the classification accuracies of the facial expression detection and the EEG detection. The accuracy of the first fusion detection, using a sum rule, is 81.25%, and the accuracy of the second fusion detection, using a production rule, is 82.75%; both are higher than that of facial expression detection (74.38%) or EEG detection (66.88%). Specifically, seventeen of the 20 subjects achieved their highest accuracies using the fusion methods. Moreover, the accuracies of the three detections were compared using paired t-tests; results were considered significant when p values were below 0.05. The statistical analysis based on the t-test indicated the following: (i) higher accuracies were achieved for the two fusion methods than for the facial expression detection or the EEG detection (the first fusion method versus facial expression detection, p = 0.03; the first fusion method versus EEG detection, p < 0.01; the second fusion method versus facial expression detection, p = 0.03; the second fusion method versus EEG detection, p < 0.01); (ii) the accuracies were not significantly different between the facial expression detection and the EEG detection (EEG detection versus facial expression detection,
Table 2: The accuracies (%) for the detections of face expression, EEG, and the two fusion methods.
Subject   Face expression   EEG     First fusion method (online)   Second fusion method (offline)
1         92.5              67.5    92.5                           87.5
2         57.5              70.0    82.5                           87.5
3         50.0              72.5    87.5                           87.5
4         62.5              75.0    92.5                           87.5
5         60.0              60.0    75.0                           75.0
6         87.5              75.0    95.0                           92.5
7         72.5              72.5    72.5                           80.0
8         70.0              70.0    80.0                           87.5
9         92.5              60.0    75.0                           80.0
10        85.0              62.5    72.5                           80.0
11        67.5              72.5    80.0                           80.0
12        80.0              75.0    85.0                           85.0
13        92.5              57.5    92.5                           87.5
14        72.5              55.0    77.5                           80.0
15        70.0              52.5    75.0                           77.5
16        92.5              62.5    77.5                           77.5
17        77.5              57.5    90.0                           87.5
18        92.5              80.0    92.5                           85.0
19        50.0              62.5    60.0                           75.0
20        62.5              77.5    70.0                           75.0
Average   74.38 ± 14.55     66.88 ± 8.19   81.25 ± 9.47            82.75 ± 5.19
p = 0.08); (iii) the accuracies were also not significantly different between the first and the second fusion methods (the first fusion method versus the second fusion method, p = 0.56). Furthermore, low accuracies were obtained for subjects 5, 19, and 20. That could be attributed to their less expressive facial expressions, or perhaps our approach is less sensitive to them.
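The paired t-test comparisons above can be reproduced directly from the per-subject accuracies in Table 2. A minimal sketch with SciPy, using the face expression versus EEG comparison; since the exact p-value depends on the implementation, only the reported outcome (not significant at 0.05) is checked:

```python
# Paired t-test on the per-subject accuracies from Table 2
# (face expression detection vs. EEG detection, in %).
import numpy as np
from scipy.stats import ttest_rel

face = np.array([92.5, 57.5, 50.0, 62.5, 60.0, 87.5, 72.5, 70.0, 92.5, 85.0,
                 67.5, 80.0, 92.5, 72.5, 70.0, 92.5, 77.5, 92.5, 50.0, 62.5])
eeg = np.array([67.5, 70.0, 72.5, 75.0, 60.0, 75.0, 72.5, 70.0, 60.0, 62.5,
                72.5, 75.0, 57.5, 55.0, 52.5, 62.5, 57.5, 80.0, 62.5, 77.5])

# Means match the averages reported in Table 2.
print(round(float(face.mean()), 2), round(float(eeg.mean()), 2))  # 74.38 66.88
t_stat, p_value = ttest_rel(face, eeg)
# The paper reports p = 0.08 for this comparison, i.e. not significant at 0.05.
print(p_value > 0.05)  # True
```

The same call with the fusion-method columns substituted for `eeg` reproduces the significant comparisons (i).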
5. Discussion
This paper employs information fusion technology, combining facial expression recognition and EEG-based emotion recognition. The stimuli are based on a subset of movie clips that correspond to four specific areas of the valence-arousal emotional space (happiness, neutral, sadness, and fear). The four emotion states are detected by both facial expression and EEG. Emotion recognition is based on a decision-level fusion of both EEG and facial expression detections. Twenty healthy subjects attended two experiments. The results show that the accuracies of the two information fusion detections are 81.25% and 82.75%, both higher than that of facial expression detection (74.38%) or EEG detection (66.88%).
The notion that combining brain and peripheral physiological signals results in more accurate emotion recognition than using these variables on their own seems very sensible and has frequently been suggested in the literature as a potential way to improve emotion recognition [16]. However, a few studies explicitly mention that the combination of physiological information did not result in reliable improvement (e.g., [17–19]), or did so only to a modest degree in one of multiple conditions, without statistical evidence (e.g., [5]). In this study, the experimental results and the statistical analysis provide clear evidence of the benefit of multimodal combination for emotion recognition. This could be explained by the fact that the emotion state involves multiple processes that are presumably reflected by different types of variables (e.g., cognitive processes by EEG and physical changes by peripheral facial expression measures).
In this study, we did find significant improvement for the multimodal fusion detection compared to the single-modality detection. The reason could be that facial expression detection has a fast and strong but fluctuating response, whereas EEG detection has a smooth but stable response over the trial time [20]. Specifically, there is high volatility in real emotion recognition based only on facial expressions, because subjects are able to trick the machine as long as they know how to pretend via their facial expressions. In this respect, the drawbacks of facial expression detection can be compensated for by the EEG detection to a very large extent. Thus, the facial expression detection and EEG detection are irreplaceable and complementary to each other, and the multimodal fusion should achieve higher accuracies using both detections than using either one alone. This was demonstrated by the data analysis results in Table 2.
While most studies combine information by fusion at the feature level, we believe that fusing information at the decision level contributed to finding a strong, reliable advantage of combining information. On the one hand, fusion at the feature level is difficult to achieve in practice because the feature sets of the various modalities may not be compatible (e.g., brain and peripheral physiological signals in this study) [21], and most commercial biometric systems do not provide access to the feature sets or the raw data they use in their products [22]. On the other hand, the advantage of decision-level fusion is that all knowledge about the different signals can be applied separately [23]. In this study, the facial expression and EEG signals have their own capabilities and limitations, as mentioned above, and we can use this information to optimize the detection performance. In decision-level fusion, it is relatively easy to access and combine the scores generated by the neural network and SVM classifiers pertaining to the facial expression and EEG modalities. Thus, fusion at the decision level is preferred in this study.
For the decision-level fusion, the classifier selection for the facial expression and EEG detections is also important. Several properties have to be taken into consideration, such as the long-term variability of facial expression signals and the availability of only small data sets of EEG signals. First, neural network-based methods are found to be particularly promising for facial expression recognition, since neural networks can easily implement the mapping from the feature space of face images to the facial expression space [24]. Second, a neural network model generally requires a large amount of high-quality data for training; in this study, the EEG signals recorded by a one-electrode mobile device could lack sufficient training data for a neural network-based method. Third, SVM is known to have good generalization properties and to be insensitive to overtraining and to the curse of dimensionality, especially on small data sets [25]. It should be noted that SVM classifiers are widely used in EEG-based brain-computer interfaces in practice [26–29]. Furthermore, some modified support vector classification (SVC) methods have the advantage of using a regularization parameter to control the number of support vectors and margin errors. For example, Gu and Sheng developed a modified SVC formulation based on a sum-of-margins strategy that achieves better online accuracy than the existing incremental SVC algorithm [30]. They further proposed a robust SVC method based on lower-upper decomposition with partial pivoting, which requires fewer steps and less running time than the original method [31]. Taken together, the neural network classifier was used for the facial expression detection and the SVM classifier was used for the EEG detection in this study.
Two multimodal fusion methods are proposed in this study. For the first fusion method, the SVM classifies the EEG signal into the four types of emotion, and fusion is performed using a sum rule. For the second fusion method, the SVM classifies the EEG signal into three intensity levels (weak, moderate, and strong), and fusion is performed using a production rule. It is interesting to note that the second fusion method, combining type of emotion and intensity level, yields average accuracies comparable with those of the first fusion method. Indeed, this may well resemble what humans do in emotion recognition; for example, an expression of weak happiness is typically answered with neutral, whereas a strong expression of sadness usually evokes fear.
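The two decision-level rules just described can be sketched in a few lines. The sum rule follows the description directly (add the class-score vectors and take the argmax); the production-rule table below is a hypothetical stand-in, since the exact rules are not listed in this section:

```python
# Decision-level fusion sketches. The sum rule adds the class-score
# vectors from the face and EEG classifiers and takes the argmax. The
# production rule shown is an illustrative assumption: a weak EEG
# intensity downgrades the face-detected emotion toward neutral.
import numpy as np

EMOTIONS = ["happiness", "neutral", "sadness", "fear"]

def sum_rule(face_scores, eeg_scores):
    """First fusion method: element-wise sum of the two score vectors."""
    total = np.asarray(face_scores) + np.asarray(eeg_scores)
    return EMOTIONS[int(np.argmax(total))]

def production_rule(face_emotion, eeg_intensity):
    """Second fusion method (hypothetical rule table, for illustration only)."""
    if eeg_intensity == "weak" and face_emotion != "neutral":
        return "neutral"
    return face_emotion

print(sum_rule([0.1, 0.2, 0.6, 0.1], [0.4, 0.3, 0.2, 0.1]))  # sadness
print(production_rule("happiness", "weak"))  # neutral
```

The sum rule needs only the two classifiers' score vectors, which is why decision-level fusion sidesteps the feature-compatibility problem discussed above.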
For the results of Experiment 2, average accuracies of 81.25% (online) and 82.75% (offline) were achieved by the two fusion methods for four-class emotion recognition. This performance compares favorably with state-of-the-art results [3, 11, 32]. The authors of [3] reported an average accuracy of 78.4% using optical flow to determine the main direction of movement of the muscles. In [11], the IAPS was used as stimuli, and the use of self-assessment labels for arousal assessment yielded accuracies of 55%, 53%, and 54% for EEG, physiological, and fused features, respectively. Zheng and his colleagues presented an emotion recognition method combining EEG signals and pupillary response collected from an eye tracker and achieved average accuracies of 73.59% and 72.98% for three emotion states using a feature-level fusion strategy and a decision-level fusion strategy, respectively.
This study still has open issues that need to be considered in the future. At the present stage, the image data set we obtained is very limited, and the EEG signals used for analysis were recorded from only one electrode. In the future, we will collect more image data from more subjects and use a more sophisticated model to train our data, yielding a classifier with better performance. Furthermore, we could consider an EEG device with more electrodes to obtain higher-quality data.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This study was supported by the National Natural Science Foundation of China under Grant 61503143, the Guangdong Natural Science Foundation under Grant 2014A030310244, and the Pearl River S&T Nova Program of Guangzhou under Grant 201710010038.
References
[1] J. Gratch and S. Marsella, "Evaluating a computational model of emotion," Autonomous Agents and Multi-Agent Systems, vol. 11, no. 1, pp. 23–43, 2005.
[2] P. Ekman and W. V. Friesen, "Facial action coding system: a technique for the measurement of facial movement," Rivista di Psichiatria, vol. 47, no. 2, pp. 126–138, 1978.
[3] K. Mase, "Recognition of facial expression from optical flow," IEICE Transactions, vol. 74, no. 10, pp. 3474–3483, 1991.
[4] R. W. Picard and S. B. Daily, "Evaluating affective interactions: alternatives to asking what users feel," in Proceedings of the CHI Workshop on Evaluating Affective Interfaces: Innovative Approaches (CHI '05), April 2005.
[5] G. Chanel, J. J. M. Kierkels, M. Soleymani, and T. Pun, "Short-term emotion assessment in a recall paradigm," International Journal of Human Computer Studies, vol. 67, no. 8, pp. 607–627, 2009.
[6] S. Nasehi, H. Pourghassem, and I. Isfahan, "An optimal EEG-based emotion recognition algorithm using Gabor," WSEAS Transactions on Signal Processing, vol. 3, no. 8, pp. 87–99, 2012.
[7] K. Ishino and M. Hagiwara, "A feeling estimation system using a simple electroencephalograph," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 4204–4209, October 2003.
[8] Z. Khalili and M. H. Moradi, "Emotion recognition system using brain and peripheral signals: using correlation dimension to improve the results of EEG," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '09), pp. 1571–1575, IEEE, June 2009.
[9] C. Busso, Z. Deng, S. Yildirim et al., "Analysis of emotion recognition using facial expressions, speech and multimodal information," in Proceedings of the 6th International Conference on Multimodal Interfaces, pp. 205–211, ACM, October 2004.
[10] J. Wagner, E. Andre, F. Lingenfelser, and J. Kim, "Exploring fusion methods for multimodal emotion recognition with missing data," IEEE Transactions on Affective Computing, vol. 2, no. 4, pp. 206–218, 2011.
[11] P. J. Lang, M. M. Bradley, and B. N. Cuthbert, "International affective picture system (IAPS): affective ratings of pictures and instruction manual," Tech. Rep. A-8, 2008.
[12] Y. Li, J. Pan, F. Wang, and Z. Yu, "A hybrid BCI system combining P300 and SSVEP and its application to wheelchair control," IEEE Transactions on Biomedical Engineering, vol. 60, no. 11, pp. 3156–3166, 2013.
[13] A. Zambrano, C. Toro, M. Nieto, R. Sotaquira, C. Sanín, and E. Szczerbicki, "Video semantic analysis framework based on run-time production rules: towards cognitive vision," Journal of Universal Computer Science, vol. 21, no. 6, pp. 856–870, 2015.
[14] A. T. Tzallas, I. Tsoulos, M. G. Tsipouras, N. Giannakeas, I. Androulidakis, and E. Zaitseva, "Classification of EEG signals using feature creation produced by grammatical evolution," in Proceedings of the 24th Telecommunications Forum (TELFOR '16), pp. 1–4, IEEE, November 2016.
[15] M. M. Bradley and P. J. Lang, "Measuring emotion: the self-assessment manikin and the semantic differential," Journal of Behavior Therapy and Experimental Psychiatry, vol. 25, no. 1, pp. 49–59, 1994.
[16] M. A. Hogervorst, A.-M. Brouwer, and J. B. F. van Erp, "Combining and comparing EEG, peripheral physiology and eye-related measures for the assessment of mental workload," Frontiers in Neuroscience, vol. 8, article 322, 2014.
[17] J. C. Christensen, J. R. Estepp, G. F. Wilson, and C. A. Russell, "The effects of day-to-day variability of physiological data on operator functional state classification," NeuroImage, vol. 59, no. 1, pp. 57–63, 2012.
[18] E. B. J. Coffey, A.-M. Brouwer, and J. B. F. Van Erp, "Measuring workload using a combination of electroencephalography and near infrared spectroscopy," in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, pp. 1822–1826, SAGE Publications, Boston, Mass, USA, October 2012.
[19] M. Severens, J. Farquhar, J. Duysens, and P. Desain, "A multi-signature brain-computer interface: use of transient and steady-state responses," Journal of Neural Engineering, vol. 10, no. 2, Article ID 026005, 2013.
[20] R. Leeb, H. Sagha, R. Chavarriaga, and J. D. R. Millan, "A hybrid brain-computer interface based on the fusion of electroencephalographic and electromyographic activities," Journal of Neural Engineering, vol. 8, no. 2, Article ID 025011, 2011.
[21] K. Chang, K. Bowyer, and P. Flynn, "Face recognition using 2D and 3D facial data," in Proceedings of the ACM Workshop on Multimodal User Authentication, pp. 25–32, 2003.
[22] A. Ross and A. K. Jain, "Multimodal biometrics: an overview," in Proceedings of the 12th European Signal Processing Conference, pp. 1221–1224, IEEE, 2004.
[23] A. Ross and A. Jain, "Information fusion in biometrics," Pattern Recognition Letters, vol. 24, no. 13, pp. 2115–2125, 2003.
[24] L. Ma and K. Khorasani, "Facial expression recognition using constructive feedforward neural networks," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 3, pp. 1588–1595, 2004.
[25] F. Lotte, M. Congedo, A. Lecuyer, F. Lamarche, and B. Arnaldi, "A review of classification algorithms for EEG-based brain-computer interfaces," Journal of Neural Engineering, vol. 4, no. 2, pp. R1–R13, 2007.
[26] A. Temko, E. Thomas, W. Marnane, G. Lightbody, and G. Boylan, "EEG-based neonatal seizure detection with Support Vector Machines," Clinical Neurophysiology, vol. 122, no. 3, pp. 464–473, 2011.
[27] Y. Li, J. Pan, J. Long et al., "Multimodal BCIs: target detection, multidimensional control, and awareness evaluation in patients with disorder of consciousness," Proceedings of the IEEE, vol. 104, no. 2, pp. 332–352, 2016.
[28] J. Pan, Y. Li, Z. Gu, and Z. Yu, "A comparison study of two P300 speller paradigms for brain-computer interface," Cognitive Neurodynamics, vol. 7, no. 6, pp. 523–529, 2013.
[29] M. Mohammadpour, S. M. R. Hashemi, and N. Houshmand, "Classification of EEG-based emotion for BCI applications," in Proceedings of the Artificial Intelligence and Robotics (IRANOPEN '17), pp. 127–131, IEEE, April 2017.
[30] B. Gu, V. S. Sheng, K. Y. Tay, W. Romano, and S. Li, "Incremental support vector learning for ordinal regression," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 7, pp. 1403–1416, 2015.
[31] B. Gu and V. S. Sheng, "A robust regularization path algorithm for ν-support vector classification," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 5, pp. 1241–1248, 2016.
[32] W.-L. Zheng and B.-L. Lu, "A multimodal approach to estimating vigilance using EEG and forehead EOG," Journal of Neural Engineering, vol. 14, no. 2, Article ID 026017, 2017.
[31] B Gu and V S Sheng ldquoA robust regularization path algorithmfor ]-support vector classificationrdquo IEEETransactions onNeuralNetworks and Learning Systems vol 28 no 5 pp 1241ndash12482016
[32] W-L Zheng and B-L Lu ldquoA multimodal approach to estimat-ing vigilance using EEG and forehead EOGrdquo Journal of NeuralEngineering vol 14 no 2 Article ID 026017 2017
Submit your manuscripts athttpswwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 201
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 201
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience 7
compatible (e.g., brain and peripheral physiological signals in this study) [21]. Most commercial biometric systems do not provide access to the feature sets or the raw data that they use in their products [22]. On the other hand, the advantage of decision-level fusion is that all knowledge about the different signals can be applied separately [23]. In this study, the facial expression and EEG signals have their own capabilities and limitations, as mentioned above, and this information can be used to optimize detection performance. In decision-level fusion, it is relatively easy to access and combine the scores generated by the neural network and SVM classifiers for the facial expression and EEG modalities. Thus, fusion at the decision level is preferred in this study.
For decision-level fusion, the choice of classifiers for facial expression and EEG detection is also important. Several properties must be taken into consideration, such as the long-term variability of facial expression signals and the small size of the available EEG data sets. First, neural network-based methods are particularly promising for facial expression recognition, since neural networks can readily implement the mapping from the feature space of face images to the facial expression space [24]. Second, a neural network model generally requires a large amount of high-quality training data; in this study, the EEG signals recorded by a one-electrode mobile device could lack sufficient training data for a neural network-based method. Third, SVMs are known to have good generalization properties and to be insensitive to overtraining and to the curse of dimensionality, especially on small data sets [25]. Notably, SVM classifiers have been widely used in EEG-based brain-computer interfaces in practice [26–29]. Furthermore, some modified support vector classification (SVC) methods have the advantage of using a regularization parameter to control the number of support vectors and margin errors. For example, Gu and Sheng developed a modified SVC formulation based on a sum-of-margins strategy that achieves better online accuracy than existing incremental SVC algorithms [30]. They further proposed a robust SVC method based on lower-upper decomposition with partial pivoting, which requires fewer steps and less running time than the original algorithm [31]. Taken together, a neural network classifier was used for facial expression detection and an SVM classifier for EEG detection in this study.
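As a rough illustration of this design choice, an SVM can be fit on a handful of EEG feature vectors where a neural network would be prone to overfitting. The feature values, labels, and hyperparameters below are invented for demonstration (using scikit-learn's `SVC`) and are not the paper's actual pipeline:

```python
# Illustrative sketch: an RBF-kernel SVM trained on a very small set of
# hypothetical single-channel EEG feature vectors (e.g., band powers).
# Labels 0-3 stand for happiness, neutral, sadness, and fear.
from sklearn.svm import SVC

X_train = [
    [0.9, 0.1, 0.2, 0.1, 0.3, 0.2],  # happiness
    [0.2, 0.8, 0.1, 0.2, 0.1, 0.3],  # neutral
    [0.1, 0.2, 0.9, 0.1, 0.2, 0.1],  # sadness
    [0.2, 0.1, 0.1, 0.9, 0.1, 0.2],  # fear
    [0.8, 0.2, 0.3, 0.2, 0.2, 0.1],  # happiness
    [0.1, 0.9, 0.2, 0.1, 0.3, 0.2],  # neutral
]
y_train = [0, 1, 2, 3, 0, 1]

# RBF kernel with default regularization; even six samples suffice to fit.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)

# Predict the emotion class of a new (made-up) feature vector.
pred = clf.predict([[0.85, 0.15, 0.25, 0.15, 0.25, 0.15]])
```

The same data would be far too sparse to train a neural network reliably, which is the trade-off motivating the classifier split above.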
Two multimodal fusion methods are proposed in this study. In the first fusion method, the SVM classifies the EEG signal into the four types of emotion, and fusion is performed using a sum rule. In the second fusion method, the SVM classifies the EEG signal into three intensity levels (weak, ordinary, and strong), and fusion is performed using a production rule. It is interesting to note that the second fusion method, which combines the type of emotion with its intensity level, yields average accuracies comparable to those of the first. Indeed, this may well resemble what humans do in emotion recognition; for example, an expression of weak happiness is typically read as neutral, whereas a strong expression of sadness usually evokes fear.
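The two decision-level rules can be sketched in a few lines. The scores and intensity weights below are hypothetical, and the production rule shown is a simplified stand-in for the paper's exact rule:

```python
# Class order: happiness, neutral, sadness, fear.
EMOTIONS = ["happiness", "neutral", "sadness", "fear"]

def sum_rule_fusion(face_scores, eeg_scores):
    """First fusion method: average the per-class scores from the facial
    expression classifier and the EEG classifier, then take the arg max."""
    fused = [(f + e) / 2.0 for f, e in zip(face_scores, eeg_scores)]
    return EMOTIONS[fused.index(max(fused))]

def production_rule_fusion(face_scores, intensity_weights):
    """Second fusion method (sketch): scale each emotion score by a weight
    derived from the EEG intensity level (weak/ordinary/strong) before
    taking the arg max. This weighting scheme is illustrative only."""
    fused = [f * w for f, w in zip(face_scores, intensity_weights)]
    return EMOTIONS[fused.index(max(fused))]

face = [0.50, 0.30, 0.15, 0.05]   # facial-expression network scores
eeg = [0.25, 0.35, 0.25, 0.15]    # EEG SVM scores
print(sum_rule_fusion(face, eeg))  # -> "happiness" (fused 0.375 is largest)
```

Because both rules operate only on classifier outputs, each modality can keep its own classifier and feature pipeline, which is exactly the advantage of decision-level fusion noted above.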
For the results of Experiment 2, average accuracies of 81.25% (online) and 82.75% (offline) were achieved by the two fusion methods for four-class emotion recognition. This performance is superior to state-of-the-art results [3, 11, 32]. The authors of [3] reported an average accuracy of 78.4% using optical flow to determine the main direction of movement of the facial muscles. In [11], the IAPS was used as stimuli, and self-assessment labels for arousal assessment yielded accuracies of 55%, 53%, and 54% for EEG, physiological, and fused features, respectively. Zheng and his colleagues presented an emotion recognition method combining EEG signals with pupillary response collected from an eye tracker and achieved average accuracies of 73.59% and 72.98% for three emotion states using feature-level and decision-level fusion strategies, respectively.
This study still has open issues to be addressed in the future. At the present stage, the image data set we obtained is very limited, and the EEG signals used for analysis were recorded from only one electrode. In future work, we will collect more image data from more subjects and use a more sophisticated model to train a classifier with better performance. Furthermore, we could adopt an EEG device with more electrodes to obtain higher-quality data.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This study was supported by the National Natural Science Foundation of China under Grant 61503143, the Guangdong Natural Science Foundation under Grant 2014A030310244, and the Pearl River S&T Nova Program of Guangzhou under Grant 201710010038.
References
[1] J. Gratch and S. Marsella, "Evaluating a computational model of emotion," Autonomous Agents and Multi-Agent Systems, vol. 11, no. 1, pp. 23–43, 2005.
[2] P. Ekman and W. V. Friesen, "Facial action coding system: a technique for the measurement of facial movement," Rivista di Psichiatria, vol. 47, no. 2, pp. 126–138, 1978.
[3] K. Mase, "Recognition of facial expression from optical flow," IEICE Transactions, vol. 74, no. 10, pp. 3474–3483, 1991.
[4] R. W. Picard and S. B. Daily, "Evaluating affective interactions: alternatives to asking what users feel," in Proceedings of the CHI Workshop on Evaluating Affective Interfaces: Innovative Approaches (CHI '05), April 2005.
[5] G. Chanel, J. J. M. Kierkels, M. Soleymani, and T. Pun, "Short-term emotion assessment in a recall paradigm," International Journal of Human Computer Studies, vol. 67, no. 8, pp. 607–627, 2009.
[6] S. Nasehi, H. Pourghassem, and I. Isfahan, "An optimal EEG-based emotion recognition algorithm using Gabor," WSEAS Transactions on Signal Processing, vol. 3, no. 8, pp. 87–99, 2012.
[7] K. Ishino and M. Hagiwara, "A feeling estimation system using a simple electroencephalograph," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 4204–4209, October 2003.
[8] Z. Khalili and M. H. Moradi, "Emotion recognition system using brain and peripheral signals: using correlation dimension to improve the results of EEG," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '09), pp. 1571–1575, IEEE, June 2009.
[9] C. Busso, Z. Deng, S. Yildirim et al., "Analysis of emotion recognition using facial expressions, speech and multimodal information," in Proceedings of the 6th International Conference on Multimodal Interfaces, pp. 205–211, ACM, October 2004.
[10] J. Wagner, E. Andre, F. Lingenfelser, and J. Kim, "Exploring fusion methods for multimodal emotion recognition with missing data," IEEE Transactions on Affective Computing, vol. 2, no. 4, pp. 206–218, 2011.
[11] P. J. Lang, M. M. Bradley, and B. N. Cuthbert, "International affective picture system (IAPS): affective ratings of pictures and instruction manual," Tech. Rep. A-8, 2008.
[12] Y. Li, J. Pan, F. Wang, and Z. Yu, "A hybrid BCI system combining P300 and SSVEP and its application to wheelchair control," IEEE Transactions on Biomedical Engineering, vol. 60, no. 11, pp. 3156–3166, 2013.
[13] A. Zambrano, C. Toro, M. Nieto, R. Sotaquira, C. Sanín, and E. Szczerbicki, "Video semantic analysis framework based on run-time production rules: towards cognitive vision," Journal of Universal Computer Science, vol. 21, no. 6, pp. 856–870, 2015.
[14] A. T. Tzallas, I. Tsoulos, M. G. Tsipouras, N. Giannakeas, I. Androulidakis, and E. Zaitseva, "Classification of EEG signals using feature creation produced by grammatical evolution," in Proceedings of the 24th Telecommunications Forum (TELFOR '16), pp. 1–4, IEEE, November 2016.
[15] M. M. Bradley and P. J. Lang, "Measuring emotion: the self-assessment manikin and the semantic differential," Journal of Behavior Therapy and Experimental Psychiatry, vol. 25, no. 1, pp. 49–59, 1994.
[16] M. A. Hogervorst, A.-M. Brouwer, and J. B. F. van Erp, "Combining and comparing EEG, peripheral physiology and eye-related measures for the assessment of mental workload," Frontiers in Neuroscience, vol. 8, article 322, 2014.
[17] J. C. Christensen, J. R. Estepp, G. F. Wilson, and C. A. Russell, "The effects of day-to-day variability of physiological data on operator functional state classification," NeuroImage, vol. 59, no. 1, pp. 57–63, 2012.
[18] E. B. J. Coffey, A.-M. Brouwer, and J. B. F. van Erp, "Measuring workload using a combination of electroencephalography and near infrared spectroscopy," in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, pp. 1822–1826, SAGE Publications, Boston, Mass, USA, October 2012.
[19] M. Severens, J. Farquhar, J. Duysens, and P. Desain, "A multi-signature brain-computer interface: use of transient and steady-state responses," Journal of Neural Engineering, vol. 10, no. 2, Article ID 026005, 2013.
[20] R. Leeb, H. Sagha, R. Chavarriaga, and J. D. R. Millan, "A hybrid brain-computer interface based on the fusion of electroencephalographic and electromyographic activities," Journal of Neural Engineering, vol. 8, no. 2, Article ID 025011, 2011.
[21] K. Chang, K. Bowyer, and P. Flynn, "Face recognition using 2D and 3D facial data," in Proceedings of the ACM Workshop on Multimodal User Authentication, pp. 25–32, 2003.
[22] A. Ross and A. K. Jain, "Multimodal biometrics: an overview," in Proceedings of the 12th European Signal Processing Conference, pp. 1221–1224, IEEE, 2004.
[23] A. Ross and A. Jain, "Information fusion in biometrics," Pattern Recognition Letters, vol. 24, no. 13, pp. 2115–2125, 2003.
[24] L. Ma and K. Khorasani, "Facial expression recognition using constructive feedforward neural networks," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 3, pp. 1588–1595, 2004.
[25] F. Lotte, M. Congedo, A. Lecuyer, F. Lamarche, and B. Arnaldi, "A review of classification algorithms for EEG-based brain-computer interfaces," Journal of Neural Engineering, vol. 4, no. 2, pp. R1–R13, 2007.
[26] A. Temko, E. Thomas, W. Marnane, G. Lightbody, and G. Boylan, "EEG-based neonatal seizure detection with support vector machines," Clinical Neurophysiology, vol. 122, no. 3, pp. 464–473, 2011.
[27] Y. Li, J. Pan, J. Long et al., "Multimodal BCIs: target detection, multidimensional control, and awareness evaluation in patients with disorder of consciousness," Proceedings of the IEEE, vol. 104, no. 2, pp. 332–352, 2016.
[28] J. Pan, Y. Li, Z. Gu, and Z. Yu, "A comparison study of two P300 speller paradigms for brain-computer interface," Cognitive Neurodynamics, vol. 7, no. 6, pp. 523–529, 2013.
[29] M. Mohammadpour, S. M. R. Hashemi, and N. Houshmand, "Classification of EEG-based emotion for BCI applications," in Proceedings of the Artificial Intelligence and Robotics (IRANOPEN '17), pp. 127–131, IEEE, April 2017.
[30] B. Gu, V. S. Sheng, K. Y. Tay, W. Romano, and S. Li, "Incremental support vector learning for ordinal regression," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 7, pp. 1403–1416, 2015.
[31] B. Gu and V. S. Sheng, "A robust regularization path algorithm for ν-support vector classification," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 5, pp. 1241–1248, 2016.
[32] W.-L. Zheng and B.-L. Lu, "A multimodal approach to estimating vigilance using EEG and forehead EOG," Journal of Neural Engineering, vol. 14, no. 2, Article ID 026017, 2017.