
Computer Methods and Programs in Biomedicine 60 (1999) 183–196

Context related artefact detection in prolonged EEG recordings

Maarten van de Velde a,*, I. Robert Ghosh b, Pierre J.M. Cluitmans a

a Eindhoven University of Technology, Medical Electrical Engineering Group, PO Box 513, 5600 MB Eindhoven, The Netherlands
b Department of Clinical Neurophysiology, St. Bartholomew's Hospital, London, UK

Received 30 September 1998; received in revised form 12 February 1999; accepted 15 February 1999

Abstract

The need for reliable detection of artefacts in raw and processed EEG is widely acknowledged. Although different EEG analysis systems have been described, only few general applicable artefact recognition techniques have emerged. This paper tackles the problem of artefact detection in seven 24 h EEG recordings in the intensive care unit. ICU recordings have received less attention than, e.g. epilepsy monitoring, although recordings in this environment present an interesting application area. The EEG data used here was recorded during the difficult circumstances of an explorative ICU study. The data set includes a diverse set of EEG patterns, as well as EEG artefacts. The study investigates objective artefact detection methods based on statistical differences between signal parameters, using time-varying autoregressive modelling (AR) and Slope detection. In addition to matching the performance of artefact detection against two human observers, the study focuses on the optimal settings for context incorporation by testing the algorithms for different time windows and epoch lengths. Results indicate that a relatively short period (20–40 s) provides sufficient context information for the methods used. The combined AR and Slope detection parameters yielded good performance, detecting approximately 90% of the artefacts as indicated by the consensus score of the human observers. © 1999 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: EEG; ICU; Artefact detection; Validation; Amplitude analysis; Autoregressive modelling

www.elsevier.com/locate/cmp

1. Introduction

The occurrence of artefacts in the EEG hinders the reliable use of automatic analysis techniques. Pre-processing by manual screening and marking of artefacts is a time-consuming and tedious task, especially in prolonged recordings, though the viewing of events detected by automation and subsequent confirmation of artefact is not so onerous for human observers. A major problem in computerised processing of the EEG is the non-stationary behaviour of the non-artefactual signal and the fact that some types of artefact can resemble EEG activity. In addition, a waveform of exactly similar morphology may be correctly categorised as artefactual in one record, and non-artefactual in another. Such waveforms must, therefore, be assessed in clinical context.

* Corresponding author. Tel.: +31-40-2473288; fax: +31-40-2466508.

0169-2607/99/$ - see front matter © 1999 Elsevier Science Ireland Ltd. All rights reserved.

PII: S0169-2607(99)00013-9

From a practical point of view, the problem of non-stationarity and artefact identification actually may lie in the basic differences between human screening and screening by a computer. Visual evaluation is usually performed on relatively long segments of 10–60 s where artefacts are observed in relation to the ongoing signal. On the other hand, a computerised screening process should always be based on EEG features obtained from a stationary signal, which requires the use of short epochs of only 1–2 s [1]. Somewhat longer stationary epochs may be found when using adaptive segmentation of EEGs, but in general the segments are still rather short when compared to human screening (e.g. [2–4]).

The signal's behaviour may be modelled by analysing the behaviour of features during segment transitions, thus incorporating the temporal context of the EEG. We can then apply constraints (rules) to restrict the permitted sequence of segments, and identify distinct segments accordingly [5,6]. A drawback of these methods is the amount of heuristics involved in feature selection and the difficulties in composing an optimal set of rules [7]. An alternative approach to artefact detection is the comparison of parameters to thresholds that are derived from statistics of a preceding EEG period. For instance, Flooh et al. [8] took a short period as referential context, using an amplitude threshold calculated as sixfold the average amplitude in the preceding 10 s. A relatively long context period was used in a study by Brunner et al. [9], where the median of spectral power was calculated over 3 min for the detection of muscle artefact in sleep recordings.

The present study will further explore the concept of temporal context in relation to artefact detection, using two complementary detection methods. A time-varying autoregressive (AR) model will target EEG-like artefacts, where in particular the identification of low frequency artefacts is expected [10,11]. Detection of artefacts in the higher frequency range is performed by Slope analysis (first derivative), which has been successful for instance in the detection of muscle artefact [12,13]. Temporal context is modelled for both methods by reference to the EEG period immediately preceding the test-epochs, where detection of significant changes is based on statistical principles for variability tracking (AR) and outlier detection (Slope). Different lengths for the context period are investigated.

2. Methods

2.1. Autoregressive modelling

Auto-regressive (AR) modelling of discrete time series consists of computing the coefficients that represent the correlation of a discrete time series $s(n)$ with the preceding samples at sampling times $(n-1)$ to $(n-p)$,

$$s(n) = \phi_0 + \sum_{k=1}^{p} \phi_k\, s(n-k) + e(n) \qquad (1)$$

where $\phi_0$ represents the DC component of $s(n)$, $\phi_1,\ldots,\phi_p$ are the AR model coefficients, and $e(n)$ is the residual error. The order $p$ determines the number of unknown variables in the model.

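An optimal solution can be found for a signal period of length $N$ by minimising the residual errors, which can be performed with the ordinary least squares (OLS) method [14]. The $N$ equations that are used in the calculation are first written in vector notation:

$$S = \boldsymbol{\phi}_0 + Z\boldsymbol{\phi} + e \qquad (2)$$

$S$ and $e$ are vectors of $N$ elements,

$$\boldsymbol{\phi}_0 = \begin{bmatrix} \phi_0 \\ \vdots \\ \phi_0 \end{bmatrix}, \quad
Z = \begin{bmatrix} 0 & \cdots & 0 \\ s(1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ s(N-1) & \cdots & s(N-p) \end{bmatrix}, \quad
\boldsymbol{\phi} = \begin{bmatrix} \phi_1 \\ \vdots \\ \phi_p \end{bmatrix}$$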

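The OLS method consists of minimising the quadratic cost function $J = e^{T}e = \sum e^2$ through the following set of equations:

$$e = S - Z\boldsymbol{\phi} - \boldsymbol{\phi}_0 \qquad (3)$$

and consequently,

$$J = [S - Z\boldsymbol{\phi} - \boldsymbol{\phi}_0]^{T}\,[S - Z\boldsymbol{\phi} - \boldsymbol{\phi}_0]$$

and

$$\frac{dJ}{d\boldsymbol{\phi}} = -2\,Z^{T}[S - Z\boldsymbol{\phi} - \boldsymbol{\phi}_0] = 0$$

The minimum for $J$ is found for

$$Z^{T}Z\,\boldsymbol{\phi} = Z^{T}(S - \boldsymbol{\phi}_0)$$

resulting in the least squares estimate for the model coefficients [14]:

$$\boldsymbol{\phi} = (Z^{T}Z)^{-1}Z^{T}(S - \boldsymbol{\phi}_0) \qquad (4)$$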
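The least squares estimate of Eq. (4) can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names `solve_linear` and `fit_ar` are hypothetical, the DC term is estimated jointly with the AR coefficients rather than supplied beforehand, and only samples with a complete history of p predecessors are used, instead of the zero-padded rows of Z.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ar(signal, order):
    """OLS fit of s(n) = phi0 + sum_k phi_k * s(n-k) + e(n).

    Assumption of this sketch: phi0 is estimated together with the
    coefficients, and rows start at n = order (no zero padding).
    Returns (phi0, [phi_1, ..., phi_p], residuals).
    """
    rows = [[1.0] + [signal[n - k] for k in range(1, order + 1)]
            for n in range(order, len(signal))]
    targets = signal[order:]
    dim = order + 1
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(dim)]
           for i in range(dim)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(dim)]
    beta = solve_linear(xtx, xty)
    residuals = [t - sum(b * v for b, v in zip(beta, r))
                 for r, t in zip(rows, targets)]
    return beta[0], beta[1:], residuals
```

Calling `fit_ar(epoch, 5)` on the samples of one 1 s epoch would return the DC estimate, five coefficients and the residual series from which mean and standard deviation of the error can be derived.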

2.1.1. Optimal model estimation and order selection

An adequate fit of the original signal is characterised by a residual error term that has the statistical properties of an independent white-noise process. This can be checked by testing for normality [15] using the Shapiro–Wilk statistic [16]. This test is more powerful than other alternatives, and provides a sensitive measure for non-normality [17,18]. Minimum power of the residual error process is guaranteed by the OLS method, and the actual values can be calculated from Eq. (3).

Higher model orders will generally result in smaller errors, but apart from the expense of increasing computing time, a problem of over-fitting exists [19,20]. In order to find a compromise, Akaike's information criterion (AIC: [21]) includes the log-likelihood of the normalised error while penalising increasing orders $p$:

$$\mathrm{AIC}(p) = \ln\frac{\sigma_e^2}{R_0} + \frac{2p}{N} \qquad (5)$$

$\sigma_e^2$   error variance
$R_0$   autocorrelation, or power, of $s(n)$
$N$   length in samples of EEG period

The best model order $p$ minimises the AIC for $1 \le p < 3\sqrt{N}$, which is a practical test range [19]. Previous investigations in EEG reveal optimal model orders between $p=2$ and $p=15$ (e.g. [22,23]). Relatively low orders $p=5$ or 6 have been reported consistently in various types of EEG [24–26].
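Order selection with Eq. (5) can be sketched as follows. The helper names `aic` and `select_order` are hypothetical, and the residual variances per candidate order are assumed to have been computed beforehand by fitting one AR model per order to the same EEG period.

```python
import math


def aic(error_variance, signal_power, order, n_samples):
    """Eq. (5): log of the normalised error variance plus an order penalty."""
    return math.log(error_variance / signal_power) + 2.0 * order / n_samples


def select_order(error_variances, signal_power, n_samples):
    """Return the candidate order with the smallest AIC.

    error_variances maps each candidate order p to the residual
    variance of the AR(p) fit on one period of n_samples samples.
    """
    return min(
        error_variances,
        key=lambda p: aic(error_variances[p], signal_power, p, n_samples),
    )
```

The penalty term only outweighs the gain in fit once an additional coefficient stops reducing the error variance appreciably, which is why the criterion tends to settle on modest orders.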

2.1.2. EEG variability detection

Theoretically, the AR coefficients describe the EEG signal in each epoch. However, the requirements of independence and normality of the error term will most likely not be met during abnormal and artefact periods. The AR model will try to adapt to changes and any disturbance, which will be reflected in both the coefficients and the residuals. Therefore, using coefficients or errors alone is not enough for detection of changes in the EEG [26,27].

We calculate the multidimensional vector $F_i$ in every epoch of 1 s, which is assumed to be stationary [1]. The vector $F_i$ consists of the AR coefficients $\phi_1,\ldots,\phi_p$ and includes the mean $\mu_{err}$ and standard deviation $\sigma_{err}$ of the residual errors, normalised relative to 1 mV. All vector components are weighted equally in the model. Initial explorations in the current data set confirmed that the model's individual components were approximately of the same order of magnitude (varying below 5 mV, normal EEG) (also see e.g. [28]).

Variability tracking is performed on two consecutive EEG periods, shifting forward in time, designated context window and test window (see Fig. 1). In both periods, the variance of the Euclidean distances between $F_i$ and their average is calculated. A high variance is expected when artefacts are encountered, for which the statistical significance is examined by comparing test variance to context variance. For two independent normal processes, the ratio of variances $u_2^2/u_1^2$ follows an F-distribution, having $N_2-1$ and $N_1-1$ degrees of freedom for test and context period respectively. The one-sided $100(1-\alpha)$ percentage upper-confidence limit is found from a standard table for the F-distribution [29]:

$$\frac{u_2^2}{u_1^2} \le f_{\alpha,\,N_1-1,\,N_2-1} \qquad (6)$$

The length of the test window is fixed at $N_2 = 10$ [s], context length is varied for $N_1 = 10, 20, 40, 80, 160$ [s], corresponding to $f_{0.01,\,N_1-1,\,N_2-1} = 5.35, 4.81, 4.57, 4.40$, and $4.31$ respectively. This approach incorporates all parameters of the AR estimation and tests Eq. (6) at significance level $\alpha = 0.01$.
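The test of Eq. (6) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the names `spread` and `variability_changed` are hypothetical, the variance is taken as the sum of squared Euclidean distances to the window average divided by N−1, and the critical values are the tabulated ones quoted above for a 10 s test window.

```python
import math

# Critical values quoted in the text for alpha = 0.01 and a 10 s test
# window, keyed by context length N1 in epochs (seconds).
F_CRITICAL = {10: 5.35, 20: 4.81, 40: 4.57, 80: 4.40, 160: 4.31}


def spread(vectors):
    """Variance of the vectors around their average (assumed definition:
    sum of squared Euclidean distances to the centroid over N - 1)."""
    n = len(vectors)
    dim = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / n for d in range(dim)]
    return sum(math.dist(v, centroid) ** 2 for v in vectors) / (n - 1)


def variability_changed(context, test):
    """True when the test/context variance ratio exceeds the tabulated
    limit, i.e. when Eq. (6) is violated."""
    if len(context) not in F_CRITICAL:
        raise ValueError("no tabulated critical value for this context length")
    ratio = spread(test) / spread(context)  # context spread must be non-zero
    return ratio > F_CRITICAL[len(context)]
```

Each element of `context` and `test` would be one per-epoch vector of AR coefficients plus residual mean and standard deviation; both windows then slide forward through the recording.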

2.2. Detection of short transients

Another statistical approach to EEG validation is based on the assumption that the occurrence of artefacts is reflected in changing statistical properties of amplitude parameters. In the current study, we used the Slope parameter to target short transients. This parameter is simple to implement, yet very sensitive to high-frequency artefact (see, e.g. [30–32]).

A straightforward statistical implementation has been used here. In each epoch, the maximum Slope (1st derivative) is calculated, between all pairs of successive samples, resulting in the Slope histogram over a context window of epochs. The histogram is expected to follow a normal distribution during normal data conditions [33]. Now we can set a high amplitude threshold at $(\mu + 3\sigma)$ based on the mean ($\mu$), and standard deviation ($\sigma$) for outlier detection in the test window (see Fig. 2). The confidence interval $\langle -\infty,\ \mu + 3\sigma \rangle$ defines the range in which the parameter values are considered normal. In a normal distribution, this range encompasses 99.9% probability of the distribution function, therefore promising high specificity (few false detections).

The unit epoch length for processing was chosen at 1 s, identical to the autoregressive method. Apart from accepting this epoch length as stationary [1], 1 s is also optimal for the accuracy of detection, e.g. as shown in muscle artefact [34].

The Slope detection process was performed analogous to the AR approach: the reference histogram was obtained from the context window, for context lengths of $N = 10, 20, 40, 80$ and $160$ [s]. For increasing numbers of $N$, the precision of the threshold estimate $(\mu + 3\sigma)$ increases, which should lead to improved hypothesis testing.
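The Slope outlier rule can be sketched as below. The function names are hypothetical, and two details are assumptions of this sketch rather than statements from the text: the slope is taken as the absolute difference between successive samples (not divided by the sampling interval), and the sample standard deviation is used.

```python
import statistics


def max_slope(epoch):
    """Largest absolute difference between successive samples of one epoch."""
    return max(abs(b - a) for a, b in zip(epoch, epoch[1:]))


def slope_outliers(context_epochs, test_epochs, k=3.0):
    """Flag test epochs whose maximum slope exceeds mean + k * std of the
    maximum slopes observed in the context window."""
    slopes = [max_slope(e) for e in context_epochs]
    threshold = statistics.mean(slopes) + k * statistics.stdev(slopes)
    return [max_slope(e) > threshold for e in test_epochs]
```

With a 20 s context, `context_epochs` would hold twenty 1 s epochs and the threshold would be recomputed each time the window shifts.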

2.3. Data set

The data used here are EEG registrations as measured in a feasibility study in the intensive care unit (ICU) in Kuopio University Hospital, Finland. The recordings were approved by the Medical Ethics Committee; informed assent was obtained from the patients' relatives. Five patients (male, age range 19–78 years) were included in this study; two were monitored twice, resulting in seven 24 h recordings. These data are publicly available from the fully annotated data library (DL) that was acquired in an international collaboration, the IMPROVE DL [35]. The EEG data in the DL present a wide range of patterns, and may be considered reasonably representative of EEG recordings in ICU.

The EEG investigations were restricted to two channels, as only globally representative cerebral changes were being assessed; these were C3-P3 and C4-P4 (10-20 system). As a minimal set, these parasagittal derivations are also known as showing the least number of artefacts in a clinical setting [36]. Standard Ag–AgCl type electrodes were used. Electrode impedance was kept low, and electrodes were reapplied when checks or sustained artefacts suggested deterioration. The input amplitude range was ±200 mV, at a sampling frequency of 100 Hz, using a 2nd order low-pass filter at 25 Hz cut-off frequency. A comprehensive review of procedures and technical details is given by Thomsen et al. [37].

2.4. Evaluation

2.4.1. Visual artefact assessment

Two experienced human observers were involved in the visual screening of all data, which was performed on a high-resolution computer display, showing only one channel in pages of 10 s. This means that both channels C3 and C4 were scored, without bias of the other channel. The observers were well trained in artefact assessment of clinical EEGs and worked independently through the data, browsing through the data page by page. Artefact-pages were classified as moderate artefact or severe artefact, and scored accordingly by a button-push. The evaluation included on average 7,500 pages per channel per patient, amounting to a total of more than 100,000 pages.

Fig. 1. Variability tracking: an autoregressive model of order $p$ is fitted every $i$th epoch, yielding vectors $F_i$ that consist of AR coefficients, mean $\mu_{err}$ and standard deviation $\sigma_{err}$ of the residual errors (the arrows depict the $(p+2)$ dimensional vectors $F_i$ and their average $C(N)$). The variance of the Euclidean distances between $F_i$ and $C(N)$ is calculated. This procedure is performed in both the context window and the test window, to detect significant changes in signal variability.

Fig. 2. Detection of slope outliers: In every $i$th epoch the maximum slope is calculated, resulting in a distribution with mean $\mu_s$ and standard deviation $\sigma_s$ in the context window. The threshold of $(\mu_s + 3\sigma_s)$ is then used in the test window to detect short transients.

Scoring was performed according to the following guidelines:

No score was given to pages showing a distinct EEG signal, allowing for minor artefacts (i.e. very short duration or low amplitude, e.g. minor 50 Hz/muscle activity).

Moderate artefact was scored in signal pages showing: artefacts of relatively small amplitude, total artefact time less than 1 s, or showing presence of only one or two short electrode-pop artefacts.

Severe artefact was assigned to pages other than above: large amplitude artefacts, 50 Hz interference and muscle activity of larger amplitudes (twice the background amplitude).

In general, artefact scoring is not a sharply defined procedure; indeed, these guidelines were designed to allow for some subjectivity while trying to capture most of the artefacts. In view of the amount of data, the exercise was kept relatively simple, while obtaining accurate artefact markings.

2.4.2. Performance measures

The above procedure and methods allow us to evaluate the performance of both observers and computer in percentages of time, in brief:

Sensitivity is defined as the percentage of true artefacts (true according to the observer) that are marked correctly by the detection algorithm, indicating the detection power of the method. Positive prediction is the accompanying measure of probability that indicates the percentage of automatic markings that are considered by the observer(s) to be true artefacts.

A comparable measure is specificity, to assess the reliability of leaving unmarked those pages that do not contain any artefacts, i.e. a low false detection rate results in a high specificity.

Subjective differences in interpretation of lengthy phenomena may be magnified in the performance measures. However, this evaluation is objective in view of the question: what average length of EEG context is adequate for detection of artefacts?
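The three measures can be computed from page-wise markings as in the following sketch. The function name `performance` and the representation of markings as equal-length lists of booleans (one entry per 10 s page) are assumptions made for illustration.

```python
def performance(observer, detector):
    """Page-wise sensitivity, positive prediction and specificity (in %).

    observer and detector are equal-length sequences of booleans,
    True meaning the page was marked as artefact. Each denominator is
    assumed to be non-zero.
    """
    pairs = list(zip(observer, detector))
    tp = sum(1 for o, d in pairs if o and d)          # artefact, detected
    fn = sum(1 for o, d in pairs if o and not d)      # artefact, missed
    fp = sum(1 for o, d in pairs if not o and d)      # false detection
    tn = sum(1 for o, d in pairs if not o and not d)  # valid, left unmarked
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "positive_prediction": 100.0 * tp / (tp + fp),
        "specificity": 100.0 * tn / (tn + fp),
    }
```

Because artefact pages are rare relative to valid pages, the count of true negatives dominates, so specificity stays high even when positive prediction is modest.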

3. Results

3.1. Evaluation by human observers

3.1.1. Matching the observers' artefact scoring

The observers marked approximately 1,000 artefacts (80 min per channel) in each of the patients, an average 7% of total recording time. Observer #1 was the most critical of the two, and scored significantly more artefact pages than observer #2, especially in recordings 34, 35 and 36. This is obvious also from the inter-observer comparison as depicted in Fig. 3: #2 scored less than 60% of the artefacts of #1 in those recordings. The agreement score, or consensus, represents the sensitivity towards the other observer's scoring. Mean consensus was 76%, which includes all artefact markings. The differences in artefact assessment were mostly caused by the subjective interpretation of 50 Hz interference. Some lengthy periods of this type of artefact were marked by observer #1 as severe (recording 34, 36) or moderate (35) and were not marked by observer #2 because of relatively distinct EEG patterns. When correcting for these periods, the agreement score reached well over 80% (correction not shown in graph). Lengthy periods of serious distortion in patient 38, including 50 Hz interference, were marked by both observers.

The general agreement of the observers as well as the subjective interpretation of signals and guidelines is further illustrated by comparing the scores for severe artefact periods. The resulting higher overlap of the observers' markings is also indicated in Fig. 3. This effect was largest in the scores for patient 35. In this patient, only 2% of the recording was scored as artefact by both observers, whereas an extra 2 h of 50 Hz interference in channel C3 was marked only by observer #1 (adding 4% artefact time).

The consensus about valid EEG periods was very high: typically, 95–99% of the unmarked periods by one observer were accorded by the other. In part this is also explained by the low occurrence of artefacts, relative to the length of the recordings. The number of markings that did not match was rather small compared to the 7,500 pages in an average recording.

3.1.2. Artefacts and patients

A previous exploration of the data set had resulted in an initial classification of artefacts. The annotations had been made on a 1 min time scale. Artefact occurrence was found to consist of: sustained artefacts (71%), brief electrode artefacts (21%), 50 Hz interference (6%), and scalp muscle potentials (2%). The absence of eye movement artefacts and the relative paucity of scalp EMG potentials reflected the chosen electrode derivations and the medication or pathologically obtained state of the patients. Nursing and medical interventions and patient coughing were responsible for 78% of the artefacts. Most artefacts resolved rapidly without attending to electrodes [38]. Although no direct comparison could be performed because of different methodology, the observers of the current study acknowledged those earlier findings. The current study focussed on the aspects of time resolution of artefact detection using a higher resolution for scoring. Therefore, scores and derived measures are necessarily different (also see Refs. [37,39]).

The patients had been admitted to the ICU based on the diagnosis of multiple organ failure (definitions in Ref. [40]). Recordings 32 and 34 were of the same patient (age 69), showing a generally attenuated EEG; the patient eventually died 7 days after the second recording. Recordings 33 and 36 were of a cardiac patient (age 78), without gross abnormalities in the EEG. A presumed drug effect resulted in a burst-suppression (BS) pattern in patient 35 (age 19), who received a loading dose of thiopental before the recording. His ICU diagnosis was status epilepticus, suspected encephalitis, and the EEG generally showed high-amplitude, irregular patterns. Neither of the observers scored the BS pattern as artefact, nor did the automatic methods. Patient 37 (age 39) showed low amplitude EEG (diagnosis: meningitis Escherichia coli, hydrocephalus, septic shock). He died 10 days after the recording. Patient 38 (age 29) did not show any grossly abnormal EEG features.

Fig. 3. Inter-observer comparison: the consensus or agreement-score for marking of artefacts in the different patients (recordings 32/34, and 33/36 are the same patient). Consensus was high for severe artefacts.

3.2. Performance of automatic methods

3.2.1. Autoregressive modelling

Order selection and model validation. Before starting the evaluation of the variability tracking method, the autoregressive model was examined for optimal order and normality of residual errors. These analyses were performed in the first 3 min of every recording, testing both channels. Fig. 4 shows the grand-averaged data for the AIC, showing a minor, but obvious inclination towards AR order 5 as optimal. Therefore, this order was used in all subsequent calculations.

Fig. 4. Optimal order estimation for the AR model using the Akaike information criterion (AIC).

The normality test was performed using a C-translation of Royston's implementation of the Shapiro–Wilk test [41]. Overall, 80% of the error series was accepted as statistically normal ($\alpha = 0.05$). The normality was lower in patients 35 and 36 where the recording started with artefact periods. This further validates the inclusion of the error parameters in the AR variability tracking.

AR detection results. The detection of statistically different EEG pages was performed by testing the F-statistic as described in the methods section. The average variance of the AR-vectors in the artefactual pages was significantly higher than in the unmarked pages, but the method proved to be rather insensitive to artefact detection in general. Two observations were made: (1) the method was most successful for artefacts of higher amplitude, and (2) the detected artefacts were generally in the lower frequency range, and were often spread over several pages. Therefore, the performance measures in terms of time were found to underestimate the true detection power, since the variability tracking only detected the start of multiple-page artefacts. For instance, the variability in a prolonged 50 Hz signal reduces to zero. This problem could be solved by an algorithm that halts the context window until the artefact is over. However, this was found a rather intricate addition to the current model. Moreover, this would invalidate the investigation of different context lengths: the period between artefacts often did not allow reinstating the context model.

Fig. 5 shows the performance for detection of artefact onset of the AR method versus the consensus of the observers. The consensus incorporated all artefacts that were marked by both observers, regardless of being moderate or severe. The average sensitivity reached over 50%, only for context lengths of 10 and 20 s. The positive prediction for these contexts was significantly different from the neighbouring series (ANOVA, $\alpha = 0.01$). The increasing overlap for 40, 80 and 160 s context was obvious also from increasing, high non-significance.

AR detection of only the severe artefacts was characterised by approximately 20–40% higher sensitivity values. However, the corresponding predictive accuracy was below 10%.

3.2.2. Slope detection

Slope detection was very successful in the ICU EEG data. The results are indicated in Fig. 6, showing detection performance versus the consensus of the observers.

The sensitivity showed acceptable high values in all patients, except in patient 38. In this patient, the relatively long periods of (consensus) interference artefact resulted in a very wide distribution of Slope values, causing insensitive threshold calculation.

Overall, Slope detection performance was not different for artefact onset alone: foremost, the method detected short duration, transient artefacts. Average sensitivity was highest when using a 20 s context length (76%), rising to 84% when excluding patient 38.


The variance of positive prediction increased with longer context windows. At the same time, average prediction decreased: the performance did not improve. However, no statistical significance was found.

3.2.3. Combined AR and Slope detection

The results of the combined methods are given in Table 1, for a context of 20 s. Selection of 20 s context was based on the observations above: an AR sensitivity of 51% (at an acceptable 0.3 true artefact prediction rate), and highest Slope performance. The detection process was generally characterised by Slope detection of high frequency artefacts and AR detection of lower frequency artefacts. We can see that the average sensitivity has increased to 89%, which is 5% higher than the average indicated in Fig. 6 using Slope detection alone. In the individual patients, the AR method contributed a 2–10% improvement to detection power.

The specificity of detection was generally very high: 93–99% of valid EEG pages was left unmarked by both the Slope and the AR method.
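One plausible reading of the combination, shown here only as a sketch, is a page-wise union: a page counts as detected when either method marks it. The text does not spell out the combination rule, so this rule and the name `combine_detections` are assumptions.

```python
def combine_detections(ar_marks, slope_marks):
    """Page-wise union of two detectors (assumed combination rule).

    Both arguments are equal-length sequences of booleans, one per page.
    """
    if len(ar_marks) != len(slope_marks):
        raise ValueError("both detectors must cover the same pages")
    return [bool(a or s) for a, s in zip(ar_marks, slope_marks)]
```

Under a union rule, sensitivity can only stay equal or rise relative to either method alone, while false detections of both methods accumulate, which matches the direction of the reported changes.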

4. Discussion

Signal monitoring in ICU frequently presents a good mix of biologic, technologic, and extrinsic artefacts [42]. Validation of EEG data acquired during such difficult conditions is imperative for automatic analysis and incorporation into routine practice [43].

The current study aimed at detection of all artefacts in the EEG subset of the IMPROVE data, focussing on context resolution. The methods were based on statistical rules, designed for objective detection of outlier phenomena in the EEG. Two observers were involved in scrutinising the 24 h recordings at a 10 s time resolution. Observer 1 scored a total percent artefact time of 7.7%, observer 2 scored 5.7% as artefact.

Subjective interpretation is a general problem in EEG evaluation studies [44]. For instance, small artefacts in delta frequency range amid a (normal) background of larger amplitude can be underestimated even by experienced observers [11]. Therefore, the consensus score of observers was used to test the performance of automatic algorithms. The performance measures were defined to reflect the percentages of time of correct detection.

In general, the detection was highly specific, partly as a result of using performance definitions in terms of time, in combination with a low occurrence of artefacts. We acknowledged in retrospect that valid EEG periods were sufficiently left unmarked by the automatic methods, i.e. implying high specificity.

Fig. 5. Artefact detection using time-varying autoregressive variability tracking. Ellipses indicate the $(\mu + \sigma)$ probability-contours (mean + standard deviation) for each series. (*) denotes a significant difference in positive prediction for 10 s, 20 s context lengths.

Fig. 6. Performance of detection for the Slope amplitude method: detection versus context lengths. Ellipses indicate the $(\mu + \sigma)$ probability-contours (excluding the outlier values of patient 38). No change in performance was observed beyond 80 s context length.

The results also show that the Slope parameter detected most of the artefacts, and indicate that long context lengths were not needed for the investigated data set. The time-varying autoregressive variability tracking method was only relatively successful. Nevertheless, when using a combination of both methods, AR contributed up to 10% sensitivity by detecting low frequency artefacts. The overall performance reached 89% sensitivity and 53% positive prediction. This latter figure implies that approximately half of the automatic markings do not indicate artefacts. However, positive prediction is somewhat adversely influenced because of the consensus data from the human observers, which may also have excluded some possible true artefacts. In addition, it would seem sensible to err towards high sensitivity (at a cost of lesser positive prediction); this would allow observers to visually analyse events detected by automation, and categorise them as artefact/non-artefact. This would be in the knowledge that very few artefacts were missed by automation. If the aim eventually were to develop event detection as opposed to artefact detection, the positive prediction would be greatly increased.

Based on the current findings, especially the EEG-like deviations found by AR variance detection may be defined as events rather than artefacts. Therefore, in clinical recordings event detection not only includes the artefacts, but also may highlight the most interesting parts of the recording. As a discriminating method, higher AR-variability scores will more likely indicate (low frequency) artefacts. Interestingly, AR based analysis combined with variance testing was also used in an early method by Vachon et al. (1978) [45]. They used an F-ratio of only the error variances, calculated within the residual array of the AR model (1 s-epoch, $p=5$). At significance levels $\alpha = 0.05$ and $0.10$, they concluded that the detected non-stationary waveforms also needed additional pattern recognition. The current approach incorporated all parameters and residuals of the AR estimation and tested Eq. (6) at significance level $\alpha = 0.01$, while incorporating longer context periods.

Context related detection was implemented here as a history based detection, therefore still different from human screening. Human screening often also involves going back in the data, which influences decision about the EEG being artefactual or not. In the current implementation, the automatic methods were designed for objective on-line processing, testing for statistical significance. As an illustration, Figs. 1 and 2 represent true data from the current study. Both figures indicate automatically detected EEG events in the test window that were not marked by the observers, while clearly displaying deviating phenomena in the EEG.

Table 1

Artefact detection using a context length of 20 s: slope detection and autoregressive variability tracking combined

Overall32Patients 383736353433

9887979479Sensitivity (%) 8989 75

53Pos. prediction (%) 49 58505761 50 49

Artefacts often occur in more channels simultaneously, therefore a detected event (or candidate artefact) is usually checked visually in all channels displayed together. This was also observed in the current data set, but not incorporated in the algorithms or the evaluation. Combining channels has been described by various authors (e.g. [4,46,47]), implementing such spatial (cross-channel) processing mainly for the identification of eye-artefacts using rule-based systems. Another recently described system [48] used artificial neural networks to pre-process EEG features, and discriminated between (eye-) artefacts, muscle artefacts and electrode artefacts in an additional knowledge-based stage. The system correctly identified 90% of artefacts in the initial evaluation. Unfortunately, the system was not evaluated in a large clinical data set, and temporal context was not evaluated systematically.

The current study provides some starting points for choosing the optimal length of the context periods in automatic analysis. Optimal context was concluded to be as short as 20–40 s.

Acknowledgements

This project was supported by the Co-operation Centre of the Brabant Universities, project 94CH. We are also very grateful for the co-operation with colleagues from the IMPROVE project: Dr P. Prior, Dr C.E. Thomsen and Mr R. Pottinger.

Appendix A. Nomenclature

Autoregressive model

s(n)    discrete time signal
N    length in samples of EEG period
p    order of autoregressive (AR) model
S    signal vector
e    residual error vector
Σ    summation
AR coefficients (indexed 1, …, p)
AR vector (coefficients)
Z`
matrix of p times N elements
JD    quadratic cost vector (of error power)
σ_e^2    error amplitude variance
R0    autocorrelation, or power, of s(n)
F_i    vector of the AR coefficients 1, …, p, mean μ_err and standard deviation σ_err of the residual errors
N1, N2    number of epochs in context, test window respectively
C(N1), C(N2)    average of F_i
u_1^2, u_2^2    variance of F_i (Euclidean distance to C(N1), C(N2))
f_{α, N1−1, N2−1}    significance of the ratio of variances u_1^2, u_2^2

Slope distribution

−∞    minus infinity
μ_s    mean
σ_s    standard deviation

References

[1] J.A. McEwen, G.B. Anderson, Modeling the stationarity and gaussianity of spontaneous electroencephalographic activity, IEEE Trans. Biomed. Eng. 22 (1975) 361–369.

[2] B.H. Jansen, A. Hasman, R. Lenten, Piecewise EEG analysis: an objective evaluation, Int. J. Biomed. Comput. 12 (1981) 17–27.

[3] G. Bodenstein, W. Schneider, C.V.D. Malsburg, Computerized EEG pattern classification by adaptive segmentation and probability-density-function classification. Description of the method, Comput. Biol. Med. 15 (1985) 297–313.

[4] A. Varri, K. Hirvonen, J. Hasan, P. Loula, V. Hakkinen, A computerized analysis system for vigilance studies, Comp. Meth. Progr. Biomed. 39 (1992) 113–124.

[5] V. Jagannathan, J.R. Bourne, B.H. Jansen, J.W. Ward, Artificial intelligence methods in quantitative electroencephalogram analysis, Comput. Prog. Biomed. 15 (1982) 249–258.

[6] B.H. Jansen, B.M. Dawant, Knowledge-based approach to sleep EEG analysis - A feasibility study, IEEE Trans. Biomed. Eng. 36 (1989) 510–518.

[7] B.H. Jansen, Quantitative analysis of electroencephalograms: is there chaos in the future?, Int. J. Biomed. Comput. 27 (1991) 95–123.

[8] E. Flooh, E. Korner, G. Ladurner, H. Lechner, EEG-Nachtschlafableitungen: auswertung mittels automatischer Datenanalyse (EEG-night-sleep-recordings: automatic analysis. In German), Z. EEG-EMG 13 (1982) 157–160.

[9] D.P. Brunner, R.C. Vasko, C.S. Detka, J.P. Monahan, C.F. Reynolds III, D.J. Kupfer, Muscle artifacts in the sleep EEG: automated detection and effect on all-night EEG power spectra, J. Sleep Res. 5 (1996) 155–164.

[10] B.H. Jansen, J.R. Bourne, J.W. Ward, Identification and labelling of EEG graphic elements using autoregressive spectral estimates, Comput. Biol. Med. 12 (1982) 97–106.

[11] J.S. Barlow, Artifact processing (rejection and minimization) in EEG data processing, in: F.H. Lopes da Silva, W.H. Storm van Leeuwen (Eds.), Handbook of Electroencephalography and Clinical Neurophysiology, Revised edition, Vol. 3B: Applications of Analytical Techniques, Elsevier, Amsterdam, 1986, pp. 15–62.

[12] J.S. Barlow, Muscle spike artifact minimization in EEGs by time-domain filtering, Electroenceph. Clin. Neurophysiol. 55 (1983) 487–491.

[13] J.S. Barlow, Automatic elimination of electrode-pop artifacts in EEGs, IEEE Trans. Biomed. Eng. 33 (1986) 517–521.

[14] V. Strejc, Least squares parameter estimation, Automatica 16 (1980) 535–550.

[15] D.A. Pierce, Testing normality in autoregressive models, Biometrika 72 (1985) 293–297.

[16] S.S. Shapiro, M.B. Wilk, An analysis of variance test for normality (complete samples), Biometrika 52 (1965) 591–611.

[17] S. Shapiro, M.B. Wilk, H.J. Chen, A comparative study of various tests for normality, Am. Stat. Ass. J. 63 (1968) 1343–1372.

[18] R. Bender, B. Schultz, A. Schultz, I. Pichlmayr, Testing the gaussianity of the human EEG during anaesthesia, Meth. Inf. Med. 31 (1992) 56–59.

[19] J. Makhoul, Linear prediction: a tutorial review, Proc. IEEE 63 (1975) 561–580.

[20] G.E.P. Box, G.M. Jenkins, Time series analysis, forecasting and control, Revised edition, Holden-Day, London, 1976.

[21] H. Akaike, A new look at the statistical model identification, IEEE Trans. Autom. Control 19 (1974) 716–723.

[22] C.W. Anderson, E.A. Stolz, S. Shamsunder, Multivariate autoregressive models for classification of spontaneous electroencephalographic signals during mental tasks, IEEE Trans. Biomed. Eng. 45 (1998) 277–286.

[23] S. Cerutti, D. Liberati, G. Avanzini, S. Franceschetti, Panzica, Classification of the EEG during neurosurgery. Parametric identification and Kalman filtering compared, J. Biomed. Eng. 8 (1986) 244–254.

[24] L.H. Zetterberg, Estimation of parameters for a linear difference equation with application to EEG analysis, Math. Biosciences 5 (1969) 205–226.

[25] B.H. Jansen, J.R. Bourne, J.W. Ward, Autoregressive estimation of short segment spectra for computerized EEG analysis, IEEE Trans. Biomed. Eng. 28 (1981) 630–638.

[26] J. Pardey, S. Roberts, L. Tarassenko, A review of parametric modeling techniques for EEG-analysis, Med. Eng. Physics 18 (1996) 2–11.

[27] F.D.J. Dunstan, R.W. Marshall, The detection of artefacts in EEG series, Stat. Med. 10 (1991) 1719–1731.

[28] S. Cerutti, D. Liberati, P. Mascellani, Parameter extraction in EEG processing during riskful neurosurgical operations, Signal Proc. 9 (1985) 25–35.

[29] D.C. Montgomery, G.C. Runger, Applied statistics and probability for engineers, Wiley, New York, 1994.

[30] M. Scherg, Simultaneous recording and separation of early and middle latency auditory evoked potentials, Electroenceph. Clin. Neurophysiol. 54 (1982) 339–341.

[31] H. Hinrichs, H.J. Heinze, M.R. Gaab, Neurophysiologisches monitoring bei neurochirurgischen gefäßoperationen: spezifische technische anforderungen und deren umsetzung (Neurophysiological monitoring of neurosurgical vessel-operations: technical specification and implementation. In German), Z. EEG-EMG 23 (1992) 195–202.

[32] H. Hinrichs, H. Feistner, H.J. Heinze, A trend-detection algorithm for intraoperative EEG monitoring, Med. Eng. Physics 18 (1996) 626–631.

[33] P.J.M. Cluitmans, J.W. Jansen, J.E.W. Beneken, Artefact detection and removal during auditory evoked potential monitoring, J. Clin. Mon. 9 (1993) 112–120.

[34] M. van de Velde, G. van Erp, P.J.M. Cluitmans, Muscle artefact detection in the normal human awake EEG, Electroenceph. Clin. Neurophysiol. 107 (1998) 149–158.

[35] I. Korhonen, J. Ojaniemi, K. Nieminen, M. van Gils, A. Heikela, A. Kari, Building the IMPROVE data Library, IEEE Eng. Med. Biol. 16 (1997) 25–32.

[36] B. Schultz, R. Bender, A. Schultz, I. Pichlmayr, Reduktion der anzahl von EEG-ableitungen für ein routinemäßiges monitoring auf der intensivstation (Electroencephalographic monitoring in the ICU - Reduction of the number of recorded channels. In German), Biomed. Technik 37 (1992) 194–199.

[37] C.E. Thomsen, J. Gade, K. Nieminen, R.M. Langford, I.R. Ghosh, K. Jensen, M. van Gils, A. Rosenfalck, P.F. Prior, S. White, Collecting EEG signals in the IMPROVE data library, IEEE Eng. Med. Biol. 16 (1997) 33–40.




[38] I.R. Ghosh, P.F. Prior, S.R. White, J. Gade, K. Jensen, R.M. Langford, A. Rosenfalck, C.E. Thomsen, Artefact assessment in prolonged EEG-polygraphic recordings in intensive care, Electroenceph. Clin. Neurophysiol. (In press).

[39] M. van Gils, A. Rosenfalck, S. White, P. Prior, J. Gade, L. Senhadji, C.E. Thomsen, I.R. Ghosh, R.M. Langford, K. Jensen, Signal processing in prolonged EEG recordings during intensive care, IEEE Eng. Med. Biol. 16 (1997) 56–63.

[40] K. Nieminen, R.M. Langford, C.J. Morgan, J. Takala, A. Kari, A clinical description of the IMPROVE data library, IEEE Eng. Med. Biol. 16 (1997) 21–24.

[41] P. Royston, Shapiro-Wilk W test and its significance level. Algorithm AS R94, Appl. Stat. 44 (1995) 4.

[42] D.W. Klass, The continuing challenge of artifacts in the EEG, Am. J. EEG Technol. 35 (1995) 239–269.

[43] P. Prior, The rationale and utility of neurophysiological investigations in clinical monitoring for brain and spinal cord ischaemia during surgery and intensive care, Comp. Meth. Prog. Biomed. 51 (1996) 13–27.

[44] G.W. Williams, H.O. Luders, A. Brickner, M. Goormastic, D.W. Klass, Interobserver variability in EEG interpretation, Neurology 35 (1985) 1714–1719.

[45] B. Vachon, B. Dubuisson, D. Samson-Dollfus, Etude automatique de l'EEG: une methode de detection des non-stationnarites (Automatic EEG processing: a method for detection of non-stationarities. In French), Int. J. Biomed. Comput. 9 (1978) 147–162.

[46] T. Pietila, S. Vapaakoski, U. Nousiainen, A. Varri, H. Frey, V. Hakkinen, Y. Neuvo, Evaluation of a computerized system for recognition of epileptic activity during long-term EEG recording, Electroenceph. Clin. Neurophysiol. 90 (1994) 438–443.

[47] M. Nakamura, T. Sugi, A. Ikeda, R. Kagigi, H. Shibasaki, Clinical application of automatic integrative interpretation of awake background EEG: quantitative interpretation, report making, and detection of artifacts and reduced vigilance level, Electroenceph. Clin. Neurophysiol. 98 (1996) 103–112.

[48] J. Wu, E.C. Ifeachor, E.M. Allen, W.K. Wimalaratna, N.R. Hudson, Intelligent artefact identification in electroencephalography signal processing, IEE Proc. Sci. Meas. Technol. 144 (1997) 193–201.
