[ieee 2012 ieee embs conference on biomedical engineering and sciences (iecbes 2012) - langkawi,...

EEG Spectral Analysis for Attention State Assessment: Graphical Versus Classical Classification Techniques

Ahmed Fathy, Ahmed Fahmy, Mohamed ElHelw

Center for Informatics Science

Nile University

Cairo, Egypt

Seif Eldawlatly

Computer and Systems Engineering Department

Faculty of Engineering, Ain Shams University

Cairo, Egypt

[email protected]

Abstract: Advances in Brain-Computer Interface (BCI) technology have opened the door to assisting millions of people with disabilities worldwide. In this work, we focus on assessing brain attention state, which could be used to selectively run an application on a hand-held device. We examine different classification techniques to assess brain attention state. Spectral analysis of the recorded EEG activity was performed to compute the Alpha band power for different subjects during attentive and non-attentive tasks. The estimated power values were used to train a number of classical classifiers to discriminate between the two attention states. Results demonstrate a classification accuracy of up to 70% using both individual- and multi-channel data. We then utilize a graphical approach to assess the causal influence among EEG electrodes for each of the two attention states. The inferred graphical representations for each state were used as signatures for state classification. A classification accuracy of 83% was obtained using the graphical approach, outperforming the examined classical classifiers.

Keywords-EEG; brain-computer interface; attention state.

I. INTRODUCTION

In the last few years, tablets and smartphones have increasingly become essential devices for many people. These devices mainly rely on touch-screen technology, which is controlled by direct interaction between the user and the device. While this is considered a great achievement, it requires users to interact with the touch screen using their hands. This excludes users with hand disabilities, who cannot operate devices that rely mainly on physical touch.

One possible way to enable this large population to use such devices is to use brain activity to interact with them. Recent advances in non-invasive Brain-Computer Interfaces (BCIs) have demonstrated the efficacy of monitoring electroencephalogram (EEG) signals and subsequently translating them into actions that represent the user’s intentions [1]. Successful examples of BCIs include EEG-speller systems [2], brain-controlled games [3] and wheelchair control [4]. However, a number of limitations, such as the high cost of monitoring hardware, the large size of amplifiers and the use of wet electrodes for EEG monitoring, have kept BCIs from being of commercial interest and made them inconvenient for most users. Recently, a number of portable EEG headsets that overcome these limitations have been made commercially available at reasonable prices. Successful applications have been demonstrated using these portable units, such as emotion detection [5] and a brain-controlled dialing application for mobile phones [6].

In this paper, we propose the use of a portable EEG headset for assessing attention state. This component is part of a larger system aiming at enabling people with motor disability to fully interact with touch-screen devices using EEG activity. The attention assessment component presented in this paper will be used to run an application after a brain-controlled cursor has been moved to the application icon. We capitalize on the extensive research that has been carried out in the last two decades on using spectral analysis of EEG activity to estimate different physiological and psychological states [7, 8]. To assess attention state, we estimate the power of the Alpha waves (8 – 12 Hz) in the recorded EEG activity [9]. The abundance of Alpha waves has been shown to positively correlate with being idle, whereas the lack of Alpha waves indicates attention [10] as demonstrated in Fig. 1. We report the classification accuracy between two brain states, attentive and non-attentive, using different classification and feature extraction techniques.

Figure 1. Sample data recorded on electrode O2 over 2 sec during (top) the non-attentive state and (bottom) the attentive state. The figure clearly indicates the abundance of the Alpha wave (8 – 12 cycles/sec) in the non-attentive state compared to the attentive state.

978-1-4673-1666-8/12/$31.00 ©2012 IEEE

2012 IEEE EMBS International Conference on Biomedical Engineering and Sciences | Langkawi | 17th - 19th December 2012

II. METHODS

A. Subjects and Task

EEG activity was recorded from 3 healthy adult subjects (1 female and 2 males) using the wireless Emotiv EPOC neuroheadset - the research edition (Emotiv Systems Inc., San Francisco, USA). This neuroheadset has 14 electrodes located at positions AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8 and AF4 according to the international 10-20 system. Recorded EEG was sampled at 128 Hz.

Subjects were asked to perform 2 tasks: attentive and non-attentive. For the attentive task, subjects were asked to play 2 concentration-based online games for ~5 minutes and fully concentrate on the game and nothing else. For the non-attentive task, subjects were asked to close their eyes and relax without thinking about anything in particular for ~5 minutes.

B. Data Pre-processing

For each channel and each of the two tasks, raw signals recorded using the EPOC neuroheadset were first divided into 1 sec epochs. The signals recorded within each epoch were filtered using a common average reference (CAR) spatial filter [11]. This is done by computing the mean of all channels within the considered epoch and subtracting this mean value from each channel

$$y_i^{\tau}(t) = x_i^{\tau}(t) - \frac{1}{N}\sum_{j=1}^{N} x_j^{\tau}(t) \qquad (1)$$

where $x_i^{\tau}$ represents the raw signal recorded on electrode $i$ in epoch $\tau$, $y_i^{\tau}$ represents the filtered signal and $N$ is the total number of channels. The CAR spatial filter has been demonstrated to outperform other referencing techniques such as ear referencing [11]. Filtered signals were subsequently thresholded to eliminate blinking and muscle artifacts (mean ± 3 standard deviations). Using the Fast Fourier Transform (FFT), spectral analysis of the filtered signals within each epoch for each channel was performed to extract the frequency components corresponding to the Alpha wave in the range 8-12 Hz. Power spectral analysis was then performed to estimate the Alpha band power for each epoch on each channel.
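The CAR filter of Eq. (1) and the Alpha band power estimate can be illustrated with a minimal Python sketch, assuming NumPy is available and that each epoch is stored as a channels-by-samples array. The function names, the synthetic test signal and the normalization of the band power by the epoch length are assumptions made for illustration, not details of the original analysis; artifact thresholding is not shown.

```python
import numpy as np

FS = 128              # sampling rate in Hz, as reported for the headset
ALPHA = (8.0, 12.0)   # Alpha band limits in Hz

def common_average_reference(epoch):
    """Subtract the across-channel mean at every time sample (Eq. 1).

    epoch: array of shape (n_channels, n_samples).
    """
    return epoch - epoch.mean(axis=0, keepdims=True)

def alpha_band_power(epoch, fs=FS, band=ALPHA):
    """Return one Alpha band power value per channel for a single epoch."""
    n_samples = epoch.shape[1]
    spectrum = np.fft.rfft(epoch, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    # Sum of squared FFT magnitudes in the band, divided by the epoch
    # length (an assumed normalization; the paper does not specify one).
    return (np.abs(spectrum[:, in_band]) ** 2).sum(axis=1) / n_samples

# Synthetic 1 sec epoch with 14 channels: a 10 Hz sine on channel 0
# (inside the Alpha band) and a 30 Hz sine on channel 1 (outside it).
t = np.arange(FS) / FS
epoch = np.zeros((14, FS))
epoch[0] = np.sin(2 * np.pi * 10 * t)
epoch[1] = np.sin(2 * np.pi * 30 * t)

filtered = common_average_reference(epoch)
power = alpha_band_power(filtered)
```

With a 1 sec epoch sampled at 128 Hz, the FFT bins are spaced 1 Hz apart, so the band covers the bins at 8, 9, 10, 11 and 12 Hz. In this toy signal, channel 0 retains most of its 10 Hz component after referencing and therefore shows the largest Alpha band power.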

Attention assessment performed using individual-channel data was compared to that performed using multi-channel data. In order to efficiently combine the Alpha band power computed for individual channels, we utilized Principal Component Analysis (PCA) to project the multi-channel data into a reduced-dimension space where the most significant features are expressed [12]. Briefly, this is done by, first, subtracting the mean of the input training data for each epoch at each electrode and subsequently computing the covariance matrix. The eigenvectors of the covariance matrix are then computed and ordered based on their corresponding eigenvalues. To classify a new test epoch, it is first projected into the principal component space identified using the training data before applying the classifier.
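The PCA steps described above can be sketched in Python as follows, assuming NumPy is available. The sketch interprets the mean subtraction as removing the per-electrode mean computed over the training epochs, which is an assumption; the function names and the random placeholder data are also illustrative only.

```python
import numpy as np

def fit_pca(train, n_components):
    """Learn a PCA projection from training data.

    train: array of shape (n_epochs, n_channels) of Alpha band power.
    Returns the per-channel training mean and the leading eigenvectors.
    """
    mean = train.mean(axis=0)
    cov = np.cov(train - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]                # columns ordered by decreasing eigenvalue

def project(data, mean, components):
    """Project epochs into the principal component space of the training data."""
    return (data - mean) @ components

# Placeholder data: 200 training epochs and 50 test epochs on 14 channels.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 14)) * np.linspace(1.0, 3.0, 14)
test = rng.normal(size=(50, 14)) * np.linspace(1.0, 3.0, 14)

mean, components = fit_pca(train, n_components=3)
train_projected = project(train, mean, components)
test_projected = project(test, mean, components)   # uses the training mean and axes
```

Note that the test epochs are centered with the training mean and projected onto the training eigenvectors, so no information from the test data enters the learned projection.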

C. Classical Classification Techniques

The estimated Alpha band power data was then used to train three different classical classifiers to discriminate between two classes: attentive versus non-attentive. For each subject, the Alpha band power data was divided into two datasets: a training dataset (80% of the data) to learn the parameters of the examined classifiers and a test dataset (20% of the data) to compute the corresponding classification accuracy. The classifiers examined are:

1) Linear Discriminant Analysis (LDA) Classifier

A linear discriminant function can be expressed as [12]

$$z = \mathbf{w}^{T}\mathbf{s} + w_0 \qquad (2)$$

where $\mathbf{s}$ is the input training data (Alpha band power data in this case), $\mathbf{w}$ is the weights vector, $w_0$ is the bias and $z$ is the output of the discriminant function. If $z > 0$, $\mathbf{s}$ is classified as belonging to class $C_1$, whereas if $z < 0$, $\mathbf{s}$ is classified as belonging to class $C_2$. LDA aims at projecting the input data to a reduced-dimension space such that the separation between the projections of different classes is maximized while minimizing the within-class variance.
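A two-class discriminant of the form in Eq. (2) can be sketched in Python as follows, assuming NumPy is available. The sketch uses the Fisher criterion to obtain the weights and places the bias midway between the class means, which assumes equal class priors; these choices, the function names and the toy data are assumptions for illustration and may differ from the implementation used in the study.

```python
import numpy as np

def fit_lda(class1, class2):
    """Fit a two-class Fisher discriminant z = w^T s + w0 (Eq. 2).

    class1, class2: arrays of shape (n_epochs, n_features).
    Equal class priors are assumed when placing the bias w0.
    """
    m1, m2 = class1.mean(axis=0), class2.mean(axis=0)
    d1, d2 = class1 - m1, class2 - m2
    within = d1.T @ d1 + d2.T @ d2          # within-class scatter matrix
    w = np.linalg.solve(within, m1 - m2)    # direction maximizing class separation
    w0 = -0.5 * w @ (m1 + m2)               # boundary midway between the class means
    return w, w0

def classify(s, w, w0):
    """Return 1 (class C1) if z > 0, otherwise 2 (class C2).

    The case z == 0 is not covered by the text; it is assigned to C2 here.
    """
    z = w @ np.asarray(s, dtype=float) + w0
    return 1 if z > 0 else 2

# Toy two-feature data (hypothetical values, not recorded Alpha power).
c1 = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0]])
c2 = np.array([[6.0, 5.0], [7.0, 6.0], [6.0, 7.0], [5.0, 6.0]])
w, w0 = fit_lda(c1, c2)
label_a = classify([2.0, 2.0], w, w0)
label_b = classify([6.0, 6.0], w, w0)
```

For this toy data the within-class scatter is four times the identity matrix, so the weights reduce to a scaled difference of the class means and the decision boundary is the line halfway between the two clusters.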

2) Naive Bayes Classifier

Naive Bayes classifier can be categorized as a probabilistic generative model classifier that assumes complete independence among the input features to estimate the likelihood of the data [12]

$$\Pr(\mathbf{s} \mid C_l) = \prod_{n=1}^{N} \Pr(s_n \mid C_l) \qquad (3)$$

where $s_n$ represents the Alpha band power for channel $n$ in this case. Each of the probabilities $\Pr(s_n \mid C_l)$ is assumed to follow a Gaussian distribution $\mathcal{N}(s_n \mid \mu_n, \sigma_n^2)$ whose parameters are estimated from the training data. To classify $\mathbf{s}$, the posterior probability $\Pr(C_l \mid \mathbf{s})$ is computed using Bayes’ rule

$$\Pr(C_l \mid \mathbf{s}) = \frac{\Pr(\mathbf{s} \mid C_l)\Pr(C_l)}{\Pr(\mathbf{s})} \qquad (4)$$

where $\mathbf{s}$ is classified as belonging to class $C_1$ if $\Pr(C_1 \mid \mathbf{s})$ is greater than $\Pr(C_2 \mid \mathbf{s})$, and as belonging to $C_2$ otherwise.
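Equations (3) and (4) can be sketched in plain Python as follows. The sketch works with log-probabilities for numerical stability and drops the denominator of Eq. (4), since it is the same for both classes; the class priors are estimated from class frequencies. These choices, the function names and the toy data are assumptions for illustration.

```python
import math

def fit_gaussian_nb(samples_by_class):
    """Estimate the prior and per-feature Gaussian parameters of each class.

    samples_by_class: dict mapping a class label to a list of feature vectors.
    Every feature is assumed to have non-zero variance within each class.
    """
    total = sum(len(rows) for rows in samples_by_class.values())
    model = {}
    for label, rows in samples_by_class.items():
        n_features = len(rows[0])
        means = [sum(r[n] for r in rows) / len(rows) for n in range(n_features)]
        variances = [
            sum((r[n] - means[n]) ** 2 for r in rows) / len(rows)
            for n in range(n_features)
        ]
        model[label] = (len(rows) / total, means, variances)
    return model

def log_joint(s, prior, means, variances):
    """log Pr(s | C_l) + log Pr(C_l), using the factorization of Eq. (3)."""
    total = math.log(prior)
    for s_n, mu, var in zip(s, means, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (s_n - mu) ** 2 / (2 * var)
    return total

def classify(s, model):
    """Pick the class with the largest posterior (Eq. 4, up to the shared Pr(s))."""
    return max(model, key=lambda label: log_joint(s, *model[label]))

# Toy two-feature data (hypothetical values, not recorded Alpha power).
model = fit_gaussian_nb({
    "C1": [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]],
    "C2": [[7.0, 8.0], [8.0, 6.0], [9.0, 9.0]],
})
label_a = classify([2.0, 2.0], model)
label_b = classify([8.0, 8.0], model)
```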

3) Support Vector Machine (SVM)

The two aforementioned classifiers are linear classifiers. In the case of non-linearly separable data, a transformation can be performed to project the input data into a higher-dimensional space in which the data is linearly separable. To project the data into this new space, an inner-product kernel is typically used such as the Euclidean kernel given by [13]

$$K(\mathbf{s}_i, \mathbf{s}_j) = \exp\left(-\frac{1}{2\sigma^2}\lVert \mathbf{s}_i - \mathbf{s}_j \rVert^2\right) \qquad (5)$$

where $\mathbf{s}_i$ and $\mathbf{s}_j$ are two input data vectors, and $\sigma^2$ is the kernel variance.

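Given the training set $\{(\mathbf{s}_i, t_i)\}_{i=1}^{M}$, where $\mathbf{s}_i$ is the $i$th training vector, $t_i$ is the corresponding desired response (1 for class $C_1$ or -1 for class $C_2$), and $M$ is the number of epochs in the training data, SVM attempts to find the Lagrangian multipliers $\{\alpha_i\}_{i=1}^{M}$ that maximize the objective function

$$Q(\alpha) = \sum_{i=1}^{M} \alpha_i - \frac{1}{2}\sum_{i=1}^{M}\sum_{j=1}^{M} \alpha_i \alpha_j t_i t_j K(\mathbf{s}_i, \mathbf{s}_j) \qquad (6)$$

subject to the constraints

$$\sum_{i=1}^{M} \alpha_i t_i = 0, \qquad 0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, M \qquad (7)$$

where $C$ is a positive parameter to allow SVM to identify a non-linear decision boundary.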
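To make the kernel of Eq. (5), the objective of Eq. (6) and the constraints of Eq. (7) concrete, the following plain Python sketch evaluates them for a given set of multipliers. It does not solve the optimization problem, which in practice requires a quadratic programming solver or an SVM library; the function names and the two-point toy example are assumptions for illustration.

```python
import math

def gaussian_kernel(s_i, s_j, sigma2):
    """Kernel of Eq. (5): exp(-||s_i - s_j||^2 / (2 * sigma2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(s_i, s_j))
    return math.exp(-squared_distance / (2.0 * sigma2))

def dual_objective(alpha, s, t, sigma2):
    """Objective Q(alpha) of Eq. (6) for multipliers alpha."""
    m = len(alpha)
    quadratic = sum(
        alpha[i] * alpha[j] * t[i] * t[j] * gaussian_kernel(s[i], s[j], sigma2)
        for i in range(m)
        for j in range(m)
    )
    return sum(alpha) - 0.5 * quadratic

def is_feasible(alpha, t, c, tol=1e-9):
    """Check the constraints of Eq. (7): sum(alpha_i * t_i) = 0 and 0 <= alpha_i <= C."""
    balanced = abs(sum(a * t_i for a, t_i in zip(alpha, t))) <= tol
    bounded = all(-tol <= a <= c + tol for a in alpha)
    return balanced and bounded

# Two one-dimensional training vectors, one from each class (hypothetical values).
s = [[0.0], [2.0]]
t = [1, -1]
q_value = dual_objective([1.0, 1.0], s, t, sigma2=1.0)
feasible = is_feasible([1.0, 1.0], t, c=10.0)
infeasible = is_feasible([1.0, 0.5], t, c=10.0)
```

For the two-point example with both multipliers equal to 1, the objective evaluates to $1 + e^{-2}$, and the equality constraint holds because the two desired responses cancel.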

D. Graphical Approach for Classification

An alternative classification approach to attention state assessment that we propose here is to classify the recorded data based on the inferred causal relationships between the recorded electrodes. We utilize Dynamic Bayesian Networks (DBNs) to infer such causal relationships. DBNs represent an extension to Bayesian networks to model time-dependent causal relationships between random variables [14].

In the spectrally analyzed EEG recordings context, a DBN represents the causal relationships inferred from the Alpha band power as $B = \langle G, P \rangle$, where $G$ is a directed acyclic graph (DAG) and $P$ is a set of conditional probabilities that expresses the statistical dependence between the simultaneously observed Alpha power $(s_1, s_2, \ldots, s_n)$ [14]. Each graph $G$ consists of a set of nodes $\{v_i^{(t)}\}$, where $i = 1$ to $n$, in which each node corresponds to the Alpha band power on one electrode $s_i$ at time $t$, denoted by $s_i^{(t)}$. Each directed edge in $G$ indicates conditional dependence. Using DBN formulation, the conditional probability $\Pr(s_1^{(t)}, s_2^{(t)}, \ldots, s_n^{(t)} \mid \mathbf{s}^{(1:t-1)})$ can be factorized as a product of individual conditional probabilities $\Pr(s_i^{(t)} \mid \mathbf{s}_{\pi(i)}^{(1:t-1)})$ given that the status of any variable $s_i^{(t)}$ in a DBN is determined by only its parents’ history, denoted by $\mathbf{s}_{\pi(i)}^{(1:t-1)}$. The parents’ status history is considered up to a maximum Markov lag $T$

$$\Pr(s_1^{(t)}, s_2^{(t)}, \ldots, s_n^{(t)} \mid \mathbf{s}^{(t-T:t-1)}) = \prod_{i=1}^{n} \Pr(s_i^{(t)} \mid \mathbf{s}_{\pi(i)}^{(t-T:t-1)}). \qquad (8)$$
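As a small worked illustration of the factorization in Eq. (8), consider a hypothetical network with $n = 3$ electrodes, a maximum Markov lag $T = 1$, and assumed parent sets $\pi(1) = \{2\}$, $\pi(2) = \emptyset$ and $\pi(3) = \{1, 2\}$. These values are chosen only for illustration and are not taken from the inferred networks. Under these assumptions the joint conditional probability reduces to a product of three small terms:

```latex
\Pr\left(s_1^{(t)}, s_2^{(t)}, s_3^{(t)} \mid \mathbf{s}^{(t-1)}\right)
  = \Pr\left(s_1^{(t)} \mid s_2^{(t-1)}\right)
    \Pr\left(s_2^{(t)}\right)
    \Pr\left(s_3^{(t)} \mid s_1^{(t-1)}, s_2^{(t-1)}\right)
```

Each factor depends only on the parents of the corresponding node, so an electrode with no parents contributes a term that is independent of the history.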

The structure of a DBN for a given dataset can be learned using a score-based approach, where a criterion is first defined by which an arbitrary Bayesian network structure can be evaluated on a given dataset, then a search is carried out through the space of all possible structures to find the graph with the highest score [15]. DBNs have been shown to successfully discriminate different brain states at the individual neuron level [16].

To infer attention state networks, the observed Alpha power on each electrode was discretized to 3 uniform levels (0, 1 and 2). The training data for each attention state was divided into 4 datasets and DBN was then used to infer a network for each of the 4 datasets with 2 Markov lags. To classify the test data as attentive or non-attentive, DBN was used to infer the corresponding networks (test networks). The similarity between test networks and the networks inferred for the training data was quantified by, first, representing each inferred network as a 14 × 14 binary adjacency matrix $A$. Each element $A(i, j)$ takes the value ‘1’ if there is a connection inferred from electrode $i$ to electrode $j$ and ‘0’ if there is no connection. All adjacency matrices of the inferred networks were vectorized and stacked together into one matrix. Principal component analysis (PCA) was then applied to this matrix to extract significant features from the inferred networks by projecting the adjacency matrices into a $p$-dimensional network space that accounts for most of the variance in the networks [16]. The distance $D(A_l, A_m)$ between a pair of matrices $A_l$ and $A_m$ was defined as

$$D(A_l, A_m) = \lVert \mathbf{q}_l - \mathbf{q}_m \rVert \qquad (9)$$

where $\mathbf{q}_l$ and $\mathbf{q}_m$ are the projections of $A_l$ and $A_m$ in the $p$-dimensional network space, respectively, and $\lVert \cdot \rVert$ is the Euclidean distance. The number of principal components $p$ was set to 2. A test network was classified as belonging to the attention state with minimum distance between the test network and the corresponding training networks.
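The network comparison described above can be sketched in Python as follows, assuming NumPy is available. The sketch does not infer the networks themselves; it starts from given adjacency matrices. Two further assumptions are made for illustration: the network space is fitted on the training networks only, and "minimum distance" is taken as the distance to the nearest training network of each state. The function names and the toy networks are hypothetical.

```python
import numpy as np

def network(edges, n=14):
    """Build an n x n binary adjacency matrix from (source, target) electrode pairs."""
    a = np.zeros((n, n), dtype=int)
    for i, j in edges:
        a[i, j] = 1
    return a

def fit_network_space(adjacency_list, p=2):
    """Stack vectorized adjacency matrices and return (mean, top-p principal axes)."""
    stacked = np.array([a.ravel() for a in adjacency_list], dtype=float)
    mean = stacked.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(stacked - mean, full_matrices=False)
    return mean, vt[:p].T

def project(adjacency, mean, axes):
    """Projection q of one adjacency matrix into the p-dimensional network space."""
    return (adjacency.ravel().astype(float) - mean) @ axes

def classify_network(test_adj, train_by_state, mean, axes):
    """Assign the state whose nearest training network has the smallest distance (Eq. 9)."""
    q_test = project(test_adj, mean, axes)

    def nearest(nets):
        return min(np.linalg.norm(q_test - project(a, mean, axes)) for a in nets)

    return min(train_by_state, key=lambda state: nearest(train_by_state[state]))

# Toy training networks (hypothetical edges between electrode indices).
train = {
    "attentive": [network([(0, 1), (1, 2)]), network([(0, 1), (1, 2), (2, 3)])],
    "non-attentive": [network([(6, 7), (7, 8)]), network([(6, 7), (7, 8), (8, 9)])],
}
all_train = train["attentive"] + train["non-attentive"]
mean, axes = fit_network_space(all_train, p=2)

test_net = network([(0, 1), (1, 2), (3, 4)])
label = classify_network(test_net, train, mean, axes)
```

In this toy example the test network shares two edges with the attentive training networks, and its extra edge lies outside the space spanned by the training networks, so its projection coincides with that of the first attentive network.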

III. RESULTS

We first examined the performance of each of the three classical classifiers in discriminating between the two brain states: attentive or non-attentive. First, we tested each classifier on the Alpha band power computed for individual channels. Fig. 2 illustrates the classification accuracy for each classifier averaged across subjects (mean ± SD). As can be seen, the three classifiers performed comparably, with the maximum average accuracy obtained on channel O2 (LDA: 67.8±5.6%; Naive Bayes: 67.2±4.9%; SVM: 65.2±3.1%). The relatively higher accuracy obtained on the occipital and parietal electrodes (O2, P7 and P8) compared to other electrodes is consistent with previous studies that demonstrated the increased amplitude of the Alpha wave on the occipital and parietal areas when idle and the diminished amplitude when attentive [10].

Figure 2. Classification accuracy (%) obtained for individual channels (AF3, AF4, F3, F4, F7, F8, FC5, FC6, O1, O2, P7, P8, T7 and T8) using Linear Discriminant Analysis (LDA), Naive Bayes (NB) and Support Vector Machine (SVM) classifiers.

Figure 3. Classification accuracy (%) obtained for the multi-channel data for different numbers of principal components (PCs), from 1 to 14, using the LDA, NB and SVM classifiers.

We also examined the performance of the same classifiers when applied to multi-channel data. This was done by first using PCA to extract the most significant features from the input multi-channel data. Fig. 3 illustrates the performance of each classifier for different numbers of principal components (PCs). Similar to the results obtained using individual-channel data, the performance of the three examined classifiers was not significantly different. However, the results indicate a slight improvement in the classification accuracy compared to using individual-channel data when using 3 PCs (LDA: 70±7.7%; Naive Bayes: 69.5±4.5%; SVM: 69.6±9%).

As an alternative approach, we used a graphical representation of the causal influence among the recording electrodes as a signature of the attention state. Using a Dynamic Bayesian Network (DBN), we inferred causal connections for both the attentive and non-attentive tasks. Fig. 4 illustrates sample networks inferred for both attention states for the training data. The marked difference between the networks across the two states indicates the feasibility of using this approach as a means to classify the networks inferred for the test data. A higher average classification accuracy of 83% was obtained using DBN compared to the other approaches, as illustrated in Fig. 5.

Figure 4. Sample networks for non-attentive and attentive states. Each node corresponds to 1 electrode. Each directed edge indicates a causal relationship.

Figure 5. Comparing the maximum average classification accuracy (%) of the multi-channel data (from Fig. 3) for LDA, NB and SVM to that obtained using DBN.

IV. CONCLUSION

We investigated the use of different classification techniques in assessing brain attention state. Using an affordable wireless EEG headset, we recorded EEG activity during attentive and non-attentive tasks. Data recorded on individual channels were pre-processed and subsequently used to train each of the examined classifiers. Results demonstrate the superiority of graphical representations over classical classification techniques in discriminating between attentive and non-attentive brain states with acceptable accuracy. The examined methods can be further extended to examine other spectral bands (such as Theta and Beta waves and the corresponding Theta-Beta ratio) that have been shown to correlate with attention. The approaches presented in this paper will be used in the context of a brain-smart device interface to run an application on a touch-screen device based on the attention level of the user.

REFERENCES

[1] J. R. d. Millán, et al., "Combining brain-computer interfaces and assistive technologies: state-of-the-art and challenges," Front. Neurosci., vol. 4, p. 161, 2010.

[2] E. W. Sellers and E. Donchin, "A P300-based brain–computer interface: initial tests by ALS patients," Clin Neurophysiol, vol. 117, pp. 538–548, 2006.

[3] A. Nijholt, D. O. Bos, and B. Reuderink, "Turning shortcomings into challenges: Brain-computer interfaces for games," Entertain Comput, vol. 1, pp. 85–94, 2009.

[4] K. Tanaka, K. Matsunaga, and H. O. Wang, "Electroencephalogram-Based Control of an Electric Wheelchair," IEEE Transactions on Robotics, vol. 21, pp. 762–766, 2005.

[5] Y. Liu, O. Sourina, and M. K. Nguyen, "Real-time EEG-based human emotion recognition and visualization," in Int. Conf. on Cyberworlds, Singapore, 2010, pp. 262–269.

[6] A. Campbell, et al., "NeuroPhone: brain-mobile phone interface using a wireless EEG headset," in ACM. MobiHeld, New Delhi, India, 2010, pp. 3–8.

[7] B. Hamadicharef, et al., "Learning EEG-based spectral-spatial patterns for attention level measurement," in IEEE International Symposium on Circuits and Systems (ISCAS2009), Taipei, Taiwan, 2009, pp. 1465–1468.

[8] H. Laufs, et al., "Electroencephalographic signatures of attentional and cognitive default modes in spontaneous brain activity fluctuations at rest," Proc. Natl. Acad. Sci. U. S. A., vol. 100, pp. 11053–11058, 2003.

[9] G. Buzsáki, Rhythms of the Brain. New York: Oxford University Press, 2006.

[10] J. J. Foxe and A. C. Snyder, "The role of alpha-band brain oscillations as a sensory suppression mechanism during selective attention," Front. Psychology, vol. 2, p. 154, 2011.

[11] D. J. McFarland, L. M. McCane, S. V. David, and J. R. Wolpaw, "Spatial filter selection for EEG-based communication," Electroencephalogr. Clin. Neurophysiol., vol. 103, pp. 386–394, 1997.

[12] C. Bishop, Pattern Recognition and Machine Learning. New York: Springer, 2006.

[13] C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, vol. 2, pp. 1–47, 1998.

[14] K. Murphy, "Dynamic Bayesian Networks: Representation, Inference and Learning," PhD thesis, UC Berkeley, Computer Science Division, 2002.

[15] A. J. Hartemink, D. K. Gifford, T. Jaakkola, and R. Young, "Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks," in Pacific Symposium on Biocomputing (PSB01), 2001, pp. 422–433.

[16] S. Eldawlatly and K. G. Oweiss, "Millisecond-Timescale Local Network Coding in the Rat Primary Somatosensory Cortex," PLoS ONE, vol. 6, p. e21649, 2011.

