
1912 IEEE TRANSACTIONS ON MAGNETICS, VOL. 41, NO. 5, MAY 2005

Classifying the Multiplicity of the EEG Source Models Using Sphere-Shaped Support Vector Machines

Qing Wu1, Xueqin Shen1, Ying Li2, Guizhi Xu2, Weili Yan2, Guoya Dong2, and Qingxin Yang2

1Computer Science Department, Hebei University of Technology, Tianjin 300130, China

2Province-Ministry Joint Key Laboratory of Electromagnetic Field and Electrical Apparatus Reliability, Hebei University of Technology, Tianjin 300130, China

Support vector machines (SVMs) are learning algorithms derived from statistical learning theory and were originally designed to solve binary classification problems. How to extend SVMs effectively to multiclass classification problems is still an ongoing research issue. In this paper, a sphere-shaped SVM for multiclass problems is presented. Compared with the classical plane-shaped SVMs, the number of convex quadratic programming problems and the number of variables in each problem are smaller. This SVM classifier is applied to the electroencephalogram (EEG) source localization problem, and the multiplicity of source models is determined according to the potentials recorded on the scalp. Experimental results indicate that the sphere-shaped SVM-based classifier is an effective and promising approach for this task.

Index Terms—EEG source model, multiclass classification, sphere classifier, support vector machine.

I. INTRODUCTION

THE support vector machine (SVM) is an effective machine learning method proposed by Vapnik et al. for general purpose pattern recognition [1], [2]. Based on the idea of VC dimension and the principle of structural risk minimization, an SVM is intuitively a two-class classifier in the form of a hyperplane that leaves the largest possible fraction of points of the same class on the same side, while maximizing the distance of either class from the hyperplane. The points of either class that are closest to the hyperplane are called support vectors, and that distance is called the margin. This hyperplane, called the optimal separating hyperplane, minimizes not only the empirical risk but also the expected risk, and thus has better generalization ability than traditional classifiers. Because of their excellent performance and simple structure, SVMs have been applied in many fields.

Although SVM approaches were originally developed for binary classification, there are several ways to extend them to multiclass classification, such as the all-together, one-versus-rest, and one-versus-one methods [3]. However, for large-scale multiclass problems the computational time of these methods is too long. In this paper, a sphere-shaped SVM instead of a plane-shaped SVM is used as the basis for multiclass classifier construction. The presented algorithm can be more efficient in training time and classification time, and can offer better generalization ability.

The electroencephalogram (EEG) is the integrated representation, on the scalp, of the electric biological activities of the neuron groups within the brain after conduction through the volume conductor (including the cortex, cerebrospinal fluid, skull, scalp, etc.). Interpretation of the clinical EEG almost always involves speculation as to the possible locations of the sources inside the brain that are responsible for the observed potential distribution on the surface of the head. Generally, the bioelectromagnetic field is viewed as a quasi-static current system and the EEG sources are modeled as current dipoles. Localization of focal electrical activity in the brain using equivalent dipoles is usually performed by iteratively modifying the parameters of the source model until optimal correspondence is reached between the observed and the predicted potential vectors on the head. It is therefore necessary to choose, in advance, a source configuration that can satisfy all of the known constraints. However, in most applications the exact forms of the sources are difficult to know. For example, in single time-slice source localization, there is no quantitative way to estimate the multiplicity or type of the sources. But if the EEG source model is not correctly estimated beforehand, it is likely that none of the predicted source locations will be correct [4].

Digital Object Identifier 10.1109/TMAG.2005.846231

In this paper, a sphere-shaped multiclass SVM (M-SVM) is used to determine the EEG source multiplicity. In our experiments, the training samples are formed by solving the EEG forward problem for four assumed source models. During the training process, the SVM-based classifier builds up its own memory reflecting the relationship between the scalp potentials and the type of source model. It can then make a satisfactory decision when new EEG data are given.

II. CONSTRUCTION OF SPHERE-SHAPED M-SVM CLASSIFIERS

While SVMs were initially proposed for two-class problems, two main approaches are currently used to solve multiclass problems. One is the all-together M-SVM, which directly considers all classes in one optimization formulation; the other is the combined M-SVM, which constructs several binary classifiers through methods such as one-versus-rest and one-versus-one [3]. Each has its own advantages and disadvantages, but all of these SVM classifiers are based on an optimal separating hyperplane. For a multiclass problem, such SVMs divide the data space by several hyperplanes, and each class of data is confined to a region bounded by a number of hyperplanes. Here, inspired by a paper of Scholkopf [5] and a paper of Tax [6], a sphere-structured SVM is presented. Each class of data is bounded by an optimal separating hypersphere, and the whole data space is then divided by a number of such hyperspheres. This kind of SVM algorithm has lower computational complexity and works on smaller data sets per optimization problem.

0018-9464/$20.00 © 2005 IEEE

A. Method of Sphere Classifier

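For an $m$-class problem ($m \ge 2$), let $A$ be a training set composed of $m$ classes of examples. That is, $A = \{A^1, A^2, \dots, A^m\}$ contains $m$ element sets, where each set $A^k = \{x_1^k, \dots, x_{l_k}^k\}$ includes $l_k$ points (patterns) that belong to the same class $k$, and each point $x_i^k \in \mathbb{R}^n$ is an $n$-dimensional point.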

Based on $A$, a multiclass classifier needs to be constructed to distinguish the different categories and to classify unknown test points correctly and efficiently. For each class set $A^k$ (for simplicity, the superscript $k$ is omitted in what follows), a sphere-structured decision boundary with minimal volume that contains all (or most of) the sample points $x_i$, $i = 1, \dots, l$, should be found. This hypersphere is described by its center $a$ and minimum radius $R$. Analogous to the hyperplane SVM [1], the following primal constrained optimization problem should be solved:

$$\min_{R,\,a,\,\xi}\; R^2 + C \sum_{i=1}^{l} \xi_i \quad \text{s.t.} \quad \|x_i - a\|^2 \le R^2 + \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, l \tag{1}$$

where the $\xi_i$ are slack variables, and $C$ is a given constant that sets the tradeoff between the volume of the sphere and the number of target points rejected.

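Following the Karush–Kuhn–Tucker (KKT) conditions, the Lagrangian is constructed [6]

$$L(R, a, \xi, \alpha, \gamma) = R^2 + C \sum_{i} \xi_i - \sum_{i} \alpha_i \left[ R^2 + \xi_i - \left( x_i \cdot x_i - 2\, a \cdot x_i + a \cdot a \right) \right] - \sum_{i} \gamma_i \xi_i$$

with Lagrange multipliers $\alpha_i \ge 0$ and $\gamma_i \ge 0$. Setting the partial derivatives with respect to $R$, $a$, and $\xi_i$ to zero, new constraints are obtained

$$\sum_{i} \alpha_i = 1, \qquad a = \sum_{i} \alpha_i x_i, \qquad C - \alpha_i - \gamma_i = 0.$$

Fig. 1. Sphere-shaped classifier.

So (1) can be converted to a relatively simpler dual problem with the number of variables equal to $l$

$$\max_{\alpha}\; \sum_{i} \alpha_i (x_i \cdot x_i) - \sum_{i,j} \alpha_i \alpha_j (x_i \cdot x_j) \quad \text{s.t.} \quad \sum_{i} \alpha_i = 1, \quad 0 \le \alpha_i \le C, \quad i = 1, \dots, l \tag{2}$$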

By solving the above quadratic programming problem for each class, $m$ spheres are generated. Each sphere represents a certain class, and the points lying on the boundary surface are key to the description of the sphere (see Fig. 1). These points, which are called support vectors, can be regarded as typical representatives of the training samples.
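The per-class quadratic program can be illustrated with a small solver. The following is a minimal sketch, not the authors' implementation: it assumes the dual takes the standard support vector data description form of [6] (maximize $\sum_i \alpha_i K_{ii} - \sum_{i,j} \alpha_i \alpha_j K_{ij}$ subject to $\sum_i \alpha_i = 1$, $0 \le \alpha_i \le C$) and swaps in a plain Frank-Wolfe iteration in place of a dedicated QP solver. The names `fit_sphere`, `center`, and `radius2` are hypothetical, and the choice of support vector in `radius2` is a heuristic.

```python
def linear_kernel(u, v):
    """Plain inner product of two points."""
    return sum(a * b for a, b in zip(u, v))


def fit_sphere(points, C=1.0, kernel=linear_kernel, iters=5000):
    """Approximately maximise the dual with Frank-Wolfe steps.

    Returns (alpha, K), where K is the Gram matrix of the points.
    """
    n = len(points)
    if C * n < 1.0:
        raise ValueError("C must be at least 1/l, otherwise sum(alpha) = 1 is infeasible")
    K = [[kernel(p, q) for q in points] for p in points]
    alpha = [1.0 / n] * n  # feasible start: sums to 1 and each entry is <= C
    for _ in range(iters):
        Ka = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        grad = [K[i][i] - 2.0 * Ka[i] for i in range(n)]
        # Best feasible vertex for the linearised objective:
        # give mass (at most C each) to the largest gradients first.
        s, left = [0.0] * n, 1.0
        for i in sorted(range(n), key=lambda i: -grad[i]):
            s[i] = min(C, left)
            left -= s[i]
            if left <= 0.0:
                break
        d = [s[i] - alpha[i] for i in range(n)]
        slope = sum(grad[i] * d[i] for i in range(n))
        if slope <= 1e-12:  # no ascent direction left
            break
        curv = sum(d[i] * K[i][j] * d[j] for i in range(n) for j in range(n))
        # Exact line search for a concave quadratic, clipped to [0, 1].
        step = 1.0 if curv <= 0.0 else min(1.0, slope / (2.0 * curv))
        alpha = [alpha[i] + step * d[i] for i in range(n)]
    return alpha, K


def center(points, alpha):
    """Sphere centre a = sum_i alpha_i x_i (meaningful for the linear kernel only)."""
    dim = len(points[0])
    return [sum(a * p[k] for a, p in zip(alpha, points)) for k in range(dim)]


def radius2(alpha, K, C):
    """Squared radius measured at one support vector (heuristic choice)."""
    n = len(alpha)
    quad = sum(alpha[i] * K[i][j] * alpha[j] for i in range(n) for j in range(n))
    # Prefer the largest multiplier strictly below C, i.e. a point on the boundary.
    candidates = [i for i in range(n) if alpha[i] < C - 1e-9] or list(range(n))
    s = max(candidates, key=lambda i: alpha[i])
    cross = sum(alpha[j] * K[s][j] for j in range(n))
    return K[s][s] - 2.0 * cross + quad


if __name__ == "__main__":
    square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0), (0.0, 0.0)]
    alpha, K = fit_sphere(square, C=1.0)
    print(center(square, alpha), radius2(alpha, K, 1.0))
```

Frank-Wolfe converges slowly compared with a dedicated QP solver, so this sketch is suited only to small illustrative sets. Training an $m$-class classifier under these assumptions amounts to calling `fit_sphere` once per class.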

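To test whether a new sample point $z$ lies within the sphere, its distance to the center of the sphere has to be calculated. The test point is accepted when this distance is no larger than the radius

$$\|z - a\|^2 = (z \cdot z) - 2 \sum_{i} \alpha_i (z \cdot x_i) + \sum_{i,j} \alpha_i \alpha_j (x_i \cdot x_j) \le R^2 \tag{3}$$

where the center $a$ and the radius $R$ can be calculated based on any support vector $x_s$, i.e., $a = \sum_{i} \alpha_i x_i$, and $R^2 = (x_s \cdot x_s) - 2 \sum_{i} \alpha_i (x_s \cdot x_i) + \sum_{i,j} \alpha_i \alpha_j (x_i \cdot x_j)$.

Let the number of spheres that contain the point $z$ be $p$. The following cases are involved.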

Case 1: $p = 0$. If the point $z$ is excluded by all the spheres, seek the sphere nearest to $z$; let

$$j = \arg\min_{k = 1, \dots, m} \left( \|z - a^k\| - R^k \right)$$

where $a^k$ and $R^k$ are respectively the center and the radius of the minimum sphere of class $k$. So the point $z$ belongs to class $j$.

Case 2: $p = 1$. If the point $z$ is included in only one sphere, it belongs to the corresponding class.

Case 3: $p > 1$. If the point $z$ is located in the common area of a number of spheres, the output class can be obtained by comparing the distance between the test point and the center of each minimum sphere. To eliminate the effect of different sphere radii, a relative distance is applied, i.e.,

$$j = \arg\min_{k} \frac{\|z - a^k\|}{R^k}$$

where $k$ ranges over the $p$ spheres that contain $z$. So the point $z$ is classified into the class $j$.
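The three cases can be sketched as a single decision function. This is an illustrative sketch in the input space (no kernel), with the spheres given directly as centers and radii. Two choices are assumptions rather than details confirmed by the text: "nearest sphere" in Case 1 is taken to mean the smallest surface distance $\|z - a\| - R$, and the relative distance in Case 3 is taken to be $\|z - a\| / R$. The function name `classify` is hypothetical.

```python
import math


def classify(z, centers, radii):
    """Return the index of the class assigned to z by the three-case rule."""
    dists = [math.dist(z, a) for a in centers]
    containing = [k for k in range(len(centers)) if dists[k] <= radii[k]]
    if not containing:
        # Case 1: no sphere contains z; pick the sphere whose surface is closest
        # (assumed measure: distance to centre minus radius).
        return min(range(len(centers)), key=lambda k: dists[k] - radii[k])
    if len(containing) == 1:
        # Case 2: exactly one sphere contains z.
        return containing[0]
    # Case 3: several spheres contain z; compare distances relative to each radius.
    return min(containing, key=lambda k: dists[k] / radii[k])


if __name__ == "__main__":
    centers = [(0.0, 0.0), (3.0, 0.0)]
    radii = [2.0, 1.5]
    for point in [(0.5, 0.0), (1.8, 0.0), (10.0, 0.0)]:
        print(point, "->", classify(point, centers, radii))
```

With a kernel, the Euclidean distances would be replaced by the feature-space distances of the kernelized acceptance test; the case logic itself would not change.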

B. Generalizing to Kernel Function

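Since the sample data are usually not spherically distributed, the method can be made more flexible by substituting a kernel function $K(x_i, x_j)$ for the inner products in (2) and (3), provided that this kernel satisfies Mercer's theorem. Thus, (2) becomes

$$\max_{\alpha}\; \sum_{i} \alpha_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_{i} \alpha_i = 1, \quad 0 \le \alpha_i \le C$$

and a test point $z$ is accepted when

$$K(z, z) - 2 \sum_{i} \alpha_i K(z, x_i) + \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \le R^2.$$

Different kernel functions result in different types of feature spaces and thus differently shaped domain descriptions. This can make the method more flexible and more accurate than the very rigid spherical shape in the input space.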

Two commonly used kernel functions are the polynomial kernel

$$K(x_i, x_j) = (x_i \cdot x_j + 1)^d$$

where $d$ is the degree of the polynomial, and the radial basis kernel, for example the Gaussian function

$$K(x_i, x_j) = \exp\left( -\|x_i - x_j\|^2 / \sigma^2 \right)$$

where $\sigma$ is the width of the kernel.

If a polynomial kernel is used, distances between data points are enlarged as $d$ increases, so this kernel results in a very large and sparse description of the samples. To suppress the growing distances in large feature spaces, the radial basis kernel is more appropriate for the sphere classifier method. With a Gaussian kernel function, the number of support vectors can be regulated by changing the value of $\sigma$ [6].
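The contrast between the two kernels can be checked numerically. The sketch below computes the squared feature-space distance from a test point to a sphere center written as a weighted sum of training points. It assumes the polynomial form $(u \cdot v + 1)^d$ and the Gaussian form $\exp(-\|u - v\|^2 / \sigma^2)$ used in [6]; other normalizations of the Gaussian (for example $2\sigma^2$ in the denominator) are also common. All function names are hypothetical.

```python
import math


def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian kernel, assumed form exp(-||u - v||^2 / sigma^2)."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / sigma ** 2)


def polynomial_kernel(u, v, d=2):
    """Polynomial kernel, assumed form (u . v + 1)^d."""
    return (sum(a * b for a, b in zip(u, v)) + 1.0) ** d


def kernel_dist2(z, points, alpha, kernel):
    """Squared feature-space distance from z to the centre sum_i alpha_i phi(x_i)."""
    n = len(points)
    cross = sum(alpha[i] * kernel(z, points[i]) for i in range(n))
    quad = sum(alpha[i] * alpha[j] * kernel(points[i], points[j])
               for i in range(n) for j in range(n))
    return kernel(z, z) - 2.0 * cross + quad


if __name__ == "__main__":
    points, alpha, z = [(0.0, 0.0)], [1.0], (3.0, 4.0)
    print("gaussian:", kernel_dist2(z, points, alpha, gaussian_kernel))
    for degree in (2, 3):
        k = lambda u, v, degree=degree: polynomial_kernel(u, v, d=degree)
        print("polynomial d =", degree, ":", kernel_dist2(z, points, alpha, k))
```

Because a Gaussian kernel satisfies $K(z, z) = 1$, the squared distance to a single mapped point can never exceed 2, whereas the polynomial distance grows rapidly with the degree. This is the behavior the paragraph above describes.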

C. Performance Analysis

The sphere-shaped multiclass SVM can solve large-scale problems effectively. Since every quadratic programming problem is established for only one specified class while the other classes are not considered, the capacity for processing data is greatly increased.

Fig. 2. Source models: (a) single dipole source, (b) two dipoles source, (c) disc source, and (d) line source.

Solving the quadratic programming problems accounts for the largest share of the computational work in the various M-SVM algorithms. For example, the one-versus-one method has to construct $m(m-1)/2$ binary SVM classifiers, where each one is trained on data points from any two classes $i$ and $j$. Hence, $m(m-1)/2$ quadratic programming problems, each with $l_i + l_j$ variables, have to be solved. So the computational time is too long for large-scale problems. In the sphere classification method, only $m$ quadratic programming problems need to be solved, so the computing time of classification is reduced.
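The difference in problem counts can be made concrete with a few lines of arithmetic. The sketch below counts the quadratic programs and their variable counts for the one-versus-one scheme and for the sphere method. It assumes that each pairwise problem has one variable per training point of the two classes involved and that each sphere problem has one variable per point of its own class. The function names and the class sizes in the demo are hypothetical and are not taken from the paper's tables.

```python
def one_versus_one_problems(class_sizes):
    """Variable counts of the pairwise QPs: one problem per unordered class pair."""
    m = len(class_sizes)
    return [class_sizes[i] + class_sizes[j]
            for i in range(m) for j in range(i + 1, m)]


def sphere_problems(class_sizes):
    """Variable counts of the sphere QPs: one problem per class."""
    return list(class_sizes)


if __name__ == "__main__":
    sizes = [10, 20, 30, 40]  # hypothetical class sizes, for illustration only
    pairwise = one_versus_one_problems(sizes)
    spheres = sphere_problems(sizes)
    print(len(pairwise), "pairwise problems with sizes", pairwise)
    print(len(spheres), "sphere problems with sizes", spheres)
```

For the four source models considered in this paper ($m = 4$), this gives six pairwise problems against four sphere problems, and each sphere problem is also smaller than any pairwise problem that involves the same class.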

In the hyperplane SVM algorithm, the original classification system is broken when a new class is added. The new class has to be compared with the previous ones, and many quadratic programming problems have to be recalculated to obtain new support vectors. In the sphere classification method, no such repetition is needed, since the addition of a new class does not influence the previous classes. Only one more quadratic programming problem, for the new class, is necessary. So the sphere-shaped M-SVM makes it easy to extend the classifier with new classes of data.

III. SIMULATION RESULTS

We have applied the above approach to predict the multiplicity of the EEG source models. A four-concentric-shell structure with conductivity values of (0.33, 1, 0.0042, 0.33) S/m, respectively representing the brain, cerebrospinal fluid, skull, and scalp, is used as the head model, and the relative radii of the shells are (0.8, 0.85, 0.92, 1). Four types of source models, namely the single dipole source, two simultaneously active dipole sources, the disc source, and the line source [7], as shown in Fig. 2, were used in our experiments.

The single dipole model has six parameters, $(x, y, z, m_x, m_y, m_z)$, where the first three position parameters denote the dipole location in the $x$, $y$, $z$ directions, and the last three moment parameters determine the orientation and strength of the dipole. Accordingly, the two-dipole model has 12 parameters. The disc source is identified by seven parameters $(x, y, z, m_x, m_y, m_z, r)$, where $r$ is the radius of the disc. The line source is determined by nine parameters $(x_1, y_1, z_1, m_x, m_y, m_z, x_2, y_2, z_2)$, where the first and last three position parameters determine the locations of the begin and end dipole sources.

TABLE I. NUMBER OF PATTERNS IN EACH SAMPLE SET

TABLE II. TEST RESULTS FOR THREE TEST SETS

To obtain the training and testing examples, random dipole location vectors are generated whose three components are independent and uniformly distributed, and the dipole moments are then generated randomly from the zero-mean, unit-variance Gaussian distribution. The radius of the disc and the length of the line are allowed to vary randomly from zero to half of the brain radius. The orientations of the dipoles are always perpendicular to the disc and to the line.
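The sampling procedure can be sketched as follows for three of the four source models. This is an illustration under stated assumptions, not the authors' generator: locations are drawn uniformly per component and redrawn until they fall inside the brain shell of relative radius 0.8 (the rejection step is an assumption), the parameter ordering (location, then moment, then disc radius) is assumed, and the line source with its perpendicularity constraint is omitted. All names are hypothetical.

```python
import math
import random

BRAIN_RADIUS = 0.8  # relative radius of the innermost (brain) shell


def random_location(rng, radius=BRAIN_RADIUS):
    """Uniform components, redrawn until the point lies inside the brain sphere."""
    while True:
        p = [rng.uniform(-radius, radius) for _ in range(3)]
        if math.sqrt(sum(c * c for c in p)) < radius:
            return p


def random_moment(rng):
    """Zero-mean, unit-variance Gaussian moment components."""
    return [rng.gauss(0.0, 1.0) for _ in range(3)]


def single_dipole(rng):
    """Six parameters: location (3) followed by moment (3)."""
    return random_location(rng) + random_moment(rng)


def two_dipoles(rng):
    """Twelve parameters: two independent single dipoles."""
    return single_dipole(rng) + single_dipole(rng)


def disc_source(rng):
    """Seven parameters: a dipole plus a disc radius in [0, BRAIN_RADIUS / 2]."""
    return single_dipole(rng) + [rng.uniform(0.0, BRAIN_RADIUS / 2.0)]


if __name__ == "__main__":
    rng = random.Random(0)
    print(single_dipole(rng))
    print(disc_source(rng))
```

Each parameter vector would then be passed to the forward solver to produce one scalp potential pattern. The sketch does not check whether a disc centered near the shell boundary extends outside the brain.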

Once the source models were formed, we solved the EEG forward problem for each model [8] to generate the respective pattern sample set over 138 measurement points, corresponding to 138-channel measured EEGs. In this way, we obtained four sample sets, one for training and the other three for testing, as shown in Table I. Each pattern in these sets is a 138-dimensional point.

In the training stage, appropriate parameters of the classifier have to be chosen by trial and error. Here, we trained minimal hypersphere SVMs with a Gaussian kernel. The key influencing factors are the variance $\sigma^2$ of the Gaussian and the penalty parameter $C$. Experiments show that $\sigma$ has a great influence on the classification results while $C$ has little influence. On a 1.5-GHz Pentium PC, the training time for the 6000 patterns is 24 s. Table II gives the classification results on the test sets.

As can be seen, the classification results are excellent and the time needed scales almost linearly with the sample size, which supports dynamic analysis of EEG. Further analysis shows that, in most misclassifications, test patterns with a small radius from the disc model or a small length from the line model are taken to be patterns from the single dipole model. In fact, the disc source with zero radius is equivalent to a dipole source, and the same is true for the line source with zero length. So the result is consistent with this situation.

TABLE III. TEST RESULTS FOR NOISY DATA

To further check the validity of the classifiers when the input is corrupted by noise, we added 10-dB normally distributed noise to the test data and tested again. The results are also encouraging, as shown in Table III (the time is nearly the same as in Table II).
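One way to corrupt a potential pattern with noise at a prescribed level is sketched below. It assumes that "10-dB" denotes a signal-to-noise ratio of 10 dB, defined as ten times the base-10 logarithm of the ratio of mean signal power to noise variance; the text does not state the definition, so this is an interpretation. The helper names `add_noise` and `measured_snr_db` are hypothetical, and the sine wave in the demo is only a stand-in for a real scalp potential vector.

```python
import math
import random


def add_noise(potentials, snr_db, rng):
    """Add zero-mean Gaussian noise whose variance gives the requested SNR in dB."""
    power = sum(v * v for v in potentials) / len(potentials)
    noise_std = math.sqrt(power / 10.0 ** (snr_db / 10.0))
    return [v + rng.gauss(0.0, noise_std) for v in potentials]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean vector and its noisy version."""
    signal = sum(c * c for c in clean)
    error = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal / error)


if __name__ == "__main__":
    rng = random.Random(1)
    clean = [math.sin(0.1 * i) for i in range(20000)]  # stand-in signal
    noisy = add_noise(clean, 10.0, rng)
    print(measured_snr_db(clean, noisy))
```

On a single 138-dimensional pattern the realized SNR fluctuates around the target, because the noise power is estimated from only 138 samples.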

IV. CONCLUSION

A multiclass classifier based on the sphere-shaped SVM is used to classify different EEG source models according to the scalp potentials, and the experimental results show high efficiency. Compared with the hyperplane-based M-SVM classifier, the scale of the optimization problem is smaller. The proposed method is applicable to large-scale multiclass problems and is easy to extend to new classes.

ACKNOWLEDGMENT

This work was supported by the Ph.D. Foundation of Hebei Education Department under Grant B2004107, and by the Natural Science Foundation of Hebei Province under Grants E2005000024 and 603073.

REFERENCES

[1] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.

[2] K. R. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkopf, “An introduction to kernel-based learning algorithms,” IEEE Trans. Neural Netw., vol. 12, no. 2, pp. 181–201, Mar. 2001.

[3] C. W. Hsu and C. J. Lin, “A comparison of methods for multi-class support vector machines,” IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 415–425, Mar. 2002.

[4] Z. J. Koles, “Trends in EEG source localization,” Electroencephalogr. Clin. Neurophysiol., vol. 106, pp. 127–137, 1998.

[5] B. Scholkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson, “Estimating the support of a high dimensional distribution,” Microsoft Research, Microsoft Corporation, TR99-87, Nov. 1999.

[6] D. Tax and R. Duin, “Data domain description by support vectors,” in Proc. ESANN, M. Verleysen, Ed. Brussels, Belgium: D. Facto, 1999, pp. 251–256.

[7] M. Sonmez, M. Sun, C. C. Li, and R. J. Sclabassi, “A hierarchical decision module based on multiple neural networks,” in Proc. IEEE Int. Conf. Neural Networks, Houston, TX, Jun. 1997, pp. 238–241.

[8] M. Sun, “An efficient algorithm for computing multishell spherical volume conductor models in EEG dipole source localization,” IEEE Trans. Biomed. Eng., vol. 44, no. 12, pp. 1243–1252, Dec. 1997.

Manuscript received June 8, 2004.
