8/8/2019 Context Related Artefact Detection in Prolonged EEG_IMP
Computer Methods and Programs in Biomedicine 60 (1999) 183–196
Context related artefact detection in prolonged EEG recordings
Maarten van de Velde a,*, I. Robert Ghosh b, Pierre J.M. Cluitmans a
a Eindhoven University of Technology, Medical Electrical Engineering Group, PO Box 513, 5600 MB Eindhoven, The Netherlands
b Department of Clinical Neurophysiology, St. Bartholomew's Hospital, London, UK
Received 30 September 1998; received in revised form 12 February 1999; accepted 15 February 1999
Abstract
The need for reliable detection of artefacts in raw and processed EEG is widely acknowledged. Although different EEG analysis systems have been described, only a few generally applicable artefact recognition techniques have emerged. This paper tackles the problem of artefact detection in seven 24 h EEG recordings in the intensive care unit. ICU recordings have received less attention than, e.g. epilepsy monitoring, although recordings in this environment present an interesting application area. The EEG data used here was recorded during the difficult circumstances of an explorative ICU study. The data set includes a diverse set of EEG patterns, as well as EEG artefacts. The study investigates objective artefact detection methods based on statistical differences between signal parameters, using time-varying autoregressive modelling (AR) and Slope detection. In addition to matching the performance of artefact detection against two human observers, the study focuses on the optimal settings for context incorporation by testing the algorithms for different time windows and epoch lengths. Results indicate that a relatively short period (20–40 s) provides sufficient context information for the methods used. The combined AR and Slope detection parameters yielded good performance, detecting approximately 90% of the artefacts as indicated by the consensus score of the human observers. © 1999 Elsevier Science Ireland Ltd. All rights reserved.
Keywords: EEG; ICU; Artefact detection; Validation; Amplitude analysis; Autoregressive modelling
www.elsevier.com/locate/cmp
1. Introduction
The occurrence of artefacts in the EEG hinders the reliable use of automatic analysis techniques. Pre-processing by manual screening and marking of artefacts is a time-consuming and tedious task, especially in prolonged recordings, though the viewing of events detected by automation and subsequent confirmation of artefact is not so onerous for human observers. A major problem in computerised processing of the EEG is the non-stationary behaviour of the non-artefactual signal and the fact that some types of artefact can resemble EEG activity. In addition, a waveform of exactly the same morphology may be correctly categorised as artefactual in one record, and non-artefactual in another. Waveforms must, therefore, be assessed in their clinical context.

* Corresponding author. Tel.: +31-40-2473288; fax: +31-40-2466508.
0169-2607/99/$ - see front matter © 1999 Elsevier Science Ireland Ltd. All rights reserved.
PII: S0169-2607(99)00013-9

From a practical point of view, the problem of non-stationarity and artefact identification actually may lie in the basic differences between human screening and screening by a computer. Visual evaluation is usually performed on relatively long segments of 10–60 s where artefacts are observed in relation to the ongoing signal. On the other hand, a computerised screening process should always be based on EEG features obtained from a stationary signal, which requires the use of short epochs of only 1–2 s [1]. Somewhat longer stationary epochs may be found when using adaptive segmentation of EEGs, but in general the segments are still rather short when compared to human screening (e.g. [2–4]).
The signal's behaviour may be modelled by analysing the behaviour of features during segment transitions, thus incorporating the temporal context of the EEG. We can then apply constraints (rules) to restrict the permitted sequence of segments, and identify distinct segments accordingly [5,6]. A drawback of these methods is the amount of heuristics involved in feature selection and the difficulties in composing an optimal set of rules [7]. An alternative approach to artefact detection is the comparison of parameters to thresholds that are derived from statistics of a preceding EEG period. For instance, Flooh et al. [8] took a short period as referential context, using an amplitude threshold calculated as sixfold the average amplitude in the preceding 10 s. A relatively long context period was used in a study by Brunner et al. [9], where the median of spectral power was calculated over 3 min for the detection of muscle artefact in sleep recordings.
The present study will further explore the concept of temporal context in relation to artefact detection, using two complementary detection methods. A time-varying autoregressive (AR) model will target EEG-like artefacts, where in particular the identification of low frequency artefacts is expected [10,11]. Detection of artefacts in the higher frequency range is performed by Slope analysis (first derivative), which has been successful for instance in the detection of muscle artefact [12,13]. Temporal context is modelled for both methods by reference to the EEG period immediately preceding the test-epochs, where detection of significant changes is based on statistical principles for variability tracking (AR) and outlier detection (Slope). Different lengths for the context period are investigated.
2. Methods
2.1. Autoregressive modelling
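Auto-regressive (AR) modelling of discrete time series consists of computing the coefficients that represent the correlation of a discrete time series s(n) with the preceding samples at sampling times (n−1) to (n−p),

$$ s(n) = \varphi_0 + \sum_{k=1}^{p} \varphi_k \, s(n-k) + e(n) \tag{1} $$

where $\varphi_0$ represents the DC component of s(n), $\varphi_1,\ldots,\varphi_p$ are the AR model coefficients, and e(n) is the residual error. The order p determines the number of unknown variables in the model.

An optimal solution can be found for a signal period of length N by minimising the residual errors, which can be performed with the ordinary least squares (OLS) method [14]. The N equations that are used in the calculation are first written in vector notation:

$$ S = \boldsymbol{\varphi}_0 + Z\boldsymbol{\varphi} + e \tag{2} $$

S and e are vectors of N elements,

$$ \boldsymbol{\varphi}_0 = \begin{bmatrix} \varphi_0 \\ \vdots \\ \varphi_0 \end{bmatrix}, \quad
Z = \begin{bmatrix} 0 & \cdots & 0 \\ s(1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ s(N-1) & \cdots & s(N-p) \end{bmatrix}, \quad
\boldsymbol{\varphi} = \begin{bmatrix} \varphi_1 \\ \vdots \\ \varphi_p \end{bmatrix} $$

The OLS method consists of minimising the quadratic cost function $J = e^T e = \sum e^2$ through the following set of equations:

$$ e = S - Z\boldsymbol{\varphi} - \boldsymbol{\varphi}_0 \tag{3} $$

and consequently,

$$ J = [S - Z\boldsymbol{\varphi} - \boldsymbol{\varphi}_0]^T [S - Z\boldsymbol{\varphi} - \boldsymbol{\varphi}_0] $$

and

$$ \frac{dJ}{d\boldsymbol{\varphi}} = -2 Z^T [S - Z\boldsymbol{\varphi} - \boldsymbol{\varphi}_0] = 0 $$

The minimum for J is found for

$$ Z\boldsymbol{\varphi} = S - \boldsymbol{\varphi}_0 $$

resulting in the least squares estimate for the model coefficients [14]:

$$ \hat{\boldsymbol{\varphi}} = (Z^T Z)^{-1} Z^T (S - \boldsymbol{\varphi}_0) \tag{4} $$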
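The estimate of Eq. (4) can be illustrated with a minimal sketch in plain Python. It is not the implementation used in the study. The function names (`solve_linear`, `fit_ar_ols`, `simulate_ar`) are hypothetical, the DC term is assumed to be the sample mean when it is not supplied, and the simulated test signal is invented for illustration only.

```python
import random


def solve_linear(a, b):
    """Solve the square system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ar_ols(s, p, phi0=None):
    """Return (phi0, phi, residuals) following Eqs. (2)-(4)."""
    if phi0 is None:
        phi0 = sum(s) / len(s)  # assumption: DC term taken as the sample mean
    # Row n of Z holds s(n-1) ... s(n-p); samples before the start count as zero.
    z = [[s[n - k] if n - k >= 0 else 0.0 for k in range(1, p + 1)]
         for n in range(len(s))]
    target = [v - phi0 for v in s]  # S - phi0
    ztz = [[sum(row[j] * row[k] for row in z) for k in range(p)]
           for j in range(p)]
    zty = [sum(row[j] * t for row, t in zip(z, target)) for j in range(p)]
    phi = solve_linear(ztz, zty)  # solves (Z^T Z) phi = Z^T (S - phi0)
    residuals = [t - sum(c * v for c, v in zip(phi, row))
                 for row, t in zip(z, target)]  # Eq. (3)
    return phi0, phi, residuals


def simulate_ar(coeffs, n_samples, seed=0):
    """Hypothetical test signal: AR process driven by unit Gaussian noise."""
    rng = random.Random(seed)
    s = []
    for n in range(n_samples):
        past = sum(c * s[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        s.append(past + rng.gauss(0.0, 1.0))
    return s
```

Fitting `fit_ar_ols(simulate_ar([0.6, -0.3], 2000), 2)` should return coefficients close to the simulated ones; the residual list corresponds to e in Eq. (3).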
2.1.1. Optimal model estimation and order selection
An adequate fit of the original signal is characterised by a residual error term that has the statistical properties of an independent white-noise process. This can be checked by testing for normality [15] using the Shapiro–Wilk statistic [16]. This test is more powerful than other alternatives, and provides a sensitive measure for non-normality [17,18]. Minimum power of the residual error process is guaranteed by the OLS method, and the actual values can be calculated from Eq. (3).

Higher model orders will generally result in smaller errors, but apart from the expense of increasing computing time, a problem of over-fitting exists [19,20]. In order to find a compromise, Akaike's information criterion (AIC: [21]) includes the log-likelihood of the normalised error while penalising increasing orders p:

$$ \mathrm{AIC}(p) = \ln\frac{\sigma_e^2}{R_0} + \frac{2p}{N} \tag{5} $$

$\sigma_e^2$: error variance
$R_0$: autocorrelation, or power, of s(n)
$N$: length in samples of EEG period

The best model order p minimises the AIC for $1 \le p < 3N$, which is a practical test range [19]. Previous investigations in EEG reveal optimal model orders between p=2 and p=15 (e.g. [22,23]). Relatively low orders p=5 or 6 have been reported consistently in various types of EEG [24–26].
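As a small sketch of how Eq. (5) selects an order, the Python fragment below evaluates the AIC for a set of candidate orders and returns the minimiser. The helper names `aic` and `best_order` are hypothetical, and the residual variances in the usage note are invented numbers, not results from the study.

```python
import math


def aic(error_variance, r0, p, n_samples):
    """Eq. (5): log of the normalised error variance plus the order penalty."""
    return math.log(error_variance / r0) + 2.0 * p / n_samples


def best_order(error_variances, r0, n_samples):
    """Return the order p with the lowest AIC.

    error_variances maps each candidate order p to the residual variance
    obtained when fitting an AR model of that order.
    """
    return min(error_variances,
               key=lambda p: aic(error_variances[p], r0, p, n_samples))
```

For example, with illustrative variances `{1: 0.50, 2: 0.30, 3: 0.298, 4: 0.297}`, `r0 = 1.0` and `n_samples = 100`, the gain in fit beyond order 2 no longer outweighs the penalty term, so `best_order` returns 2.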
2.1.2. EEG variability detection
Theoretically, the AR coefficients describe the EEG signal in each epoch. However, the requirements of independence and normality of the error term will most likely not be met during abnormal and artefact periods. The AR model will try to adapt to changes and any disturbance, which will be reflected in both the coefficients and the residuals. Therefore, using coefficients or errors alone is not enough for detection of changes in the EEG [26,27].

We calculate the multidimensional vector $\Phi_i$ in every epoch of 1 s, which is assumed to be stationary [1]. The vector $\Phi_i$ consists of the AR coefficients $\varphi_1,\ldots,\varphi_p$ and includes the mean $\mu_{err}$ and standard deviation $\sigma_{err}$ of the residual errors, normalised relative to 1 μV. All vector components are weighted equally in the model. Initial explorations in the current data set confirmed that the model's individual components were approximately of the same order of magnitude (varying below 5 μV, normal EEG) (also see e.g. [28]).
Variability tracking is performed on two consecutive EEG periods, shifting forward in time, designated 'context window' and 'test window' (see Fig. 1). In both periods, the variance of the Euclidean distances between $\Phi_i$ and their average is calculated. A high variance is expected when artefacts are encountered, for which the statistical significance is examined by comparing test variance to context variance. For two independent normal processes, the ratio of variances $u_2^2/u_1^2$ follows an F-distribution, having $N_2-1$ and $N_1-1$ degrees of freedom for test and context period respectively. The one-sided $100(1-\alpha)$ percentage upper-confidence limit is found from a standard table for the F-distribution [29]:

$$ \frac{u_2^2}{u_1^2} \le f_{\alpha,\,N_1-1,\,N_2-1} \tag{6} $$

The length of the test window is fixed at $N_2$ = 10 [s], context length is varied for $N_1$ = 10, 20, 40, 80, 160 [s], corresponding to $f_{0.01,\,N_1-1,\,N_2-1}$ = 5.35, 4.81, 4.57, 4.40, and 4.31 respectively. This approach incorporates all parameters of the AR estimation and tests Eq. (6) at significance level $\alpha$ = 0.01.
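The test of Eq. (6) can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, the variance is taken as the sample variance of the distances to the window average, and the critical values are simply the five tabulated figures quoted above rather than values computed from the F-distribution.

```python
import math
import statistics

# Critical values f(0.01) quoted in the text for a test window of 10 epochs,
# keyed by the context length N1 (number of 1 s epochs).
F_CRIT_001 = {10: 5.35, 20: 4.81, 40: 4.57, 80: 4.40, 160: 4.31}


def distance_variance(vectors):
    """Sample variance of the Euclidean distances to the window average."""
    dim = len(vectors[0])
    centre = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    distances = [math.dist(v, centre) for v in vectors]
    return statistics.variance(distances)


def variability_changed(context_vectors, test_vectors):
    """True when the test/context variance ratio exceeds the limit of Eq. (6)."""
    n1 = len(context_vectors)
    if len(test_vectors) != 10 or n1 not in F_CRIT_001:
        raise ValueError("only the tabulated window lengths are supported")
    ratio = distance_variance(test_vectors) / distance_variance(context_vectors)
    return ratio > F_CRIT_001[n1]
```

Each element of `context_vectors` and `test_vectors` stands for one per-epoch vector (AR coefficients plus residual mean and standard deviation). A context window with zero variance would cause a division by zero; a practical implementation would need to guard against that case.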
2.2. Detection of short transients
Another statistical approach to EEG validation is based on the assumption that the occurrence of artefacts is reflected in changing statistical properties of amplitude parameters. In the current study, we used the Slope parameter to target short transients. This parameter is simple to implement, yet very sensitive to high-frequency artefact (see, e.g. [30–32]).

A straightforward statistical implementation has been used here. In each epoch, the maximum Slope (1st derivative) is calculated, between all pairs of successive samples, resulting in the Slope histogram over a 'context window' of epochs. The histogram is expected to follow a normal distribution during normal data conditions [33]. Now we can set a high-amplitude threshold at ($\mu+3\sigma$) based on the mean ($\mu$), and standard deviation ($\sigma$) for outlier detection in the 'test window' (see Fig. 2). The confidence interval $\langle -\infty, \mu+3\sigma \rangle$ defines the range in which the parameter values are considered normal. In a normal distribution, this range encompasses 99.9% probability of the distribution function, therefore promising high specificity (few false detections).

The unit epoch length for processing was chosen at 1 s, identical to the autoregressive method. Apart from accepting this epoch length as stationary [1], 1 s is also optimal for the accuracy of detection, e.g. as shown in muscle artefact [34].

The Slope detection process was performed analogous to the AR approach: the reference histogram was obtained from the context window, for context lengths of N=10, 20, 40, 80 and 160 [s]. For increasing numbers of N, the precision of the threshold estimate ($\mu+3\sigma$) increases, which should lead to improved hypothesis testing.
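A minimal Python sketch of the Slope outlier test is given below. The function names are hypothetical, and two details are assumptions rather than statements from the text: the slope is taken as the absolute difference between successive samples (per sample, without scaling by the sampling interval), and the standard deviation is the sample standard deviation.

```python
import statistics


def max_slope(epoch):
    """Largest absolute difference between successive samples in one epoch."""
    return max(abs(b - a) for a, b in zip(epoch, epoch[1:]))


def slope_threshold(context_epochs):
    """Mean plus three standard deviations of the per-epoch maximum slopes."""
    slopes = [max_slope(epoch) for epoch in context_epochs]
    return statistics.mean(slopes) + 3 * statistics.stdev(slopes)


def detect_transients(context_epochs, test_epochs):
    """Flag each test epoch whose maximum slope exceeds the context threshold."""
    limit = slope_threshold(context_epochs)
    return [max_slope(epoch) > limit for epoch in test_epochs]
```

With a context of ten short epochs whose maximum slopes range from 1.0 to 1.9, the threshold lies near 2.4, so a test epoch with a jump of 10 is flagged while one with a jump of 1.2 is not.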
2.3. Data set
The data used here are EEG registrations as measured in a feasibility study in the intensive care unit (ICU) in Kuopio University Hospital, Finland. The recordings were approved by the Medical Ethics Committee; informed assent was obtained from the patients' relatives. Five patients (male, age range 19–78 years) were included in this study; two were monitored twice, resulting in seven 24 h recordings. This data is publicly available from the fully annotated data library (DL) that was acquired in an international collaboration, the IMPROVE DL [35]. The EEG data in the DL presents a wide range of patterns, and may be considered reasonably representative of EEG recordings in ICU.

The EEG investigations were restricted to two channels, as only globally representative cerebral changes were being assessed; these were C3-P3 and C4-P4 (10-20 system). As a minimal set, these parasagittal derivations are also known as showing the least number of artefacts in a clinical setting [36]. Standard Ag–AgCl type electrodes were used. Electrode impedance was kept low, and electrodes were reapplied when checks or sustained artefacts suggested deterioration. The input amplitude range was ±200 μV, at a sample frequency of 100 Hz, using a 2nd order low-pass filter at 25 Hz cut-off frequency. A comprehensive review of procedures and technical details is given by Thomsen et al. [37].
2.4. Evaluation
2.4.1. Visual artefact assessment
Two experienced human observers were involved in the visual screening of all data, which was performed on a high-resolution computer display, showing only one channel in pages of 10 s. This means that both channels C3 and C4 were scored, without bias of the other channel. The observers were well trained in artefact assessment of clinical EEGs and worked independently through the data, browsing through the data page by page. Artefact-pages were classified as 'moderate artefact' or 'severe artefact', and scored accordingly by a button-push. The evaluation included on average 7,500 pages per channel per patient, amounting to a total of more than 100,000 pages.

Fig. 1. Variability tracking: an autoregressive model of order p is fitted every ith epoch, yielding vectors $\Phi_i$ that consist of the AR coefficients, the mean $\mu_{err}$ and standard deviation $\sigma_{err}$ of the residual errors (the arrows depict the (p+2) dimensional vectors $\Phi_i$ and their average C(N)). The variance of the distances between $\Phi_i$ and C(N) is calculated. This procedure is performed in both the context window and the test window, to detect significant changes in signal variability.

Fig. 2. Detection of slope outliers: In every ith epoch the maximum slope is calculated, resulting in a distribution with mean $\mu_s$ and standard deviation $\sigma_s$ in the context window. The threshold of ($\mu_s+3\sigma_s$) is then used in the test window to detect short transients.

Scoring was performed according to the following guidelines:
• No score was given to pages showing a distinct EEG signal, allowing for minor artefacts (i.e. very short duration or low amplitude, e.g. minor 50 Hz/muscle activity).
• Moderate artefact was scored in signal pages showing: artefacts of relatively small amplitude, total artefact time less than 1 s, or showing presence of only one or two short electrode-pop artefacts.
• Severe artefact was assigned to pages other than above: large amplitude artefacts, 50 Hz interference and muscle activity of larger amplitudes (twice the background amplitude).

In general, artefact scoring is not a sharply defined procedure; indeed, these guidelines were designed to allow for some subjectivity while trying to capture most of the artefacts. In view of the amount of data, the exercise was kept relatively simple, while obtaining accurate artefact markings.
2.4.2. Performance measures
The above procedure and methods allow us to evaluate the performance of both observers and computer in percentages of time, in brief:

Sensitivity is defined as the percentage of true artefacts ('true' according to the observer) that are marked correctly by the detection algorithm, indicating the detection power of the method. Positive prediction is the accompanying measure of probability that indicates the percentage of automatic markings that are considered by the observer(s) to be true artefacts.

A comparable measure is specificity, to assess the reliability of leaving unmarked those pages that do not contain any artefacts, i.e. a low false detection rate results in a high specificity.

Subjective differences in interpretation of lengthy phenomena may be magnified in the performance measures. However, this evaluation is objective in view of the question: what average length of EEG context is adequate for detection of artefacts?
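The three measures can be written out as a short Python sketch, assuming that observer and detector scores are available as one boolean per 10 s page (True meaning 'artefact'). The function name `performance` and the page lists in the usage note are hypothetical.

```python
def performance(observer, detector):
    """Return sensitivity, positive prediction and specificity in percent.

    observer and detector are equally long sequences of booleans,
    one entry per page, where True marks an artefact page.
    """
    pairs = list(zip(observer, detector))
    tp = sum(1 for o, d in pairs if o and d)          # artefact, detected
    fn = sum(1 for o, d in pairs if o and not d)      # artefact, missed
    fp = sum(1 for o, d in pairs if not o and d)      # false detection
    tn = sum(1 for o, d in pairs if not o and not d)  # valid EEG, left unmarked
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "positive_prediction": 100.0 * tp / (tp + fp),
        "specificity": 100.0 * tn / (tn + fp),
    }
```

For instance, with four artefact pages of which three are detected, and one false detection among eight valid pages, the sketch gives 75% sensitivity, 75% positive prediction and 87.5% specificity. The function does not handle recordings without any artefact pages or without any detections, where the ratios are undefined.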
3. Results
3.1. Evaluation by human observers
3.1.1. Matching the observers' artefact scoring
The observers marked approximately 1,000 artefacts (80 min per channel) in each of the patients, an average 7% of total recording time. Observer #1 was the most critical of the two, and scored significantly more artefact pages than observer #2, especially in recordings 34, 35 and 36. This is obvious also from the inter-observer comparison as depicted in Fig. 3: #2 scored less than 60% of the artefacts of #1 in those recordings.

The agreement score, or consensus, represents the sensitivity towards the other observer's scoring. Mean consensus was 76%, which includes all artefact markings. The differences in artefact assessment were mostly caused by the subjective interpretation of 50 Hz interference. Some lengthy periods of this type of artefact were marked by observer #1 as 'severe' (recording 34, 36) or 'moderate' (35) and were not marked by observer #2 because of relatively distinct EEG patterns. When correcting for these periods, the agreement score reached well over 80% (correction not shown in graph). Lengthy periods of serious distortion in patient 38, including 50 Hz interference, were marked by both observers.

The general agreement of the observers as well as the subjective interpretation of signals and guidelines is further illustrated by comparing the scores for severe artefact periods. The resulting higher overlap of the observers' markings is also indicated in Fig. 3. This effect was largest in the scores for patient 35. In this patient, only 2% of the recording was scored as artefact by both observers, whereas an extra 2 h of 50 Hz interference in channel C3 was marked only by observer #1 (adding 4% artefact time).

The consensus about valid EEG periods was very high: typically, 95–99% of the unmarked periods by one observer were accorded by the other. In part this is also explained by the low occurrence of artefacts, relative to the length of the recordings. The number of markings that did not match was rather small compared to the 7,500 pages in an average recording.
3.1.2. Artefacts and patients
A previous exploration of the data set had resulted in an initial classification of artefacts. The annotations had been made on a 1 min time scale. Artefact occurrence was found to consist of: sustained artefacts (71%), brief electrode artefacts (21%), 50 Hz interference (6%), and scalp muscle potentials (2%). The absence of eye movement artefacts and the relative paucity of scalp EMG potentials reflected the chosen electrode derivations and the medication or pathologically obtained state of the patients. Nursing and medical interventions and patient coughing were responsible for 78% of the artefacts. Most artefacts resolved rapidly without attending to electrodes [38]. Although no direct comparison could be performed because of different methodology, the observers of the current study acknowledged those earlier findings. The current study focussed on the aspects of time resolution of artefact detection using a higher resolution for scoring. Therefore, scores and derived measures are necessarily different (also see Refs. [37,39]).

The patients had been admitted to the ICU based on the diagnosis of multiple organ failure (definitions in Ref. [40]). Recordings 32 and 34 were of the same patient (age 69), showing a generally attenuated EEG; the patient eventually died 7 days after the second recording. Recordings 33 and 36 were of a cardiac patient (age 78), without gross abnormalities in the EEG. A presumed drug effect resulted in a burst-suppression (BS) pattern in patient 35 (age 19), who received a loading dose of thiopental before the recording. His ICU diagnosis was status epilepticus, suspected encephalitis, and the EEG generally showed high-amplitude, irregular patterns. Neither of the observers scored the BS pattern as artefact, nor did the automatic methods. Patient 37 (age 39) showed low amplitude EEG (diagnosis: meningitis Escherichia coli, hydrocephalus, septic shock). He died 10 days after the recording. Patient 38 (age 29) did not show any grossly abnormal EEG features.

Fig. 3. Inter-observer comparison: the consensus or agreement-score for marking of artefacts in the different patients (recordings 32/34, and 33/36 are the same patient). Consensus was high for severe artefacts.

3.2. Performance of automatic methods
3.2.1. Autoregressive modelling
Order selection and model validation. Before starting the evaluation of the variability tracking method, the autoregressive model was examined for optimal order and normality of residual errors. These analyses were performed in the first 3 min of every recording, testing both channels. Fig. 4 shows the grand-averaged data for the AIC, showing a minor, but obvious inclination towards AR order 5 as optimal. Therefore, this order was used in all subsequent calculations. The normality test was performed using a C-translation of Royston's implementation of the Shapiro–Wilk test [41]. Overall, 80% of the error series was accepted as statistically normal ($\alpha$=0.05). The normality was lower in patients 35 and 36 where the recording started with artefact periods. This further validates the inclusion of the error parameters in the AR variability tracking.

Fig. 4. Optimal order estimation for the AR model using the Akaike information criterion (AIC).

AR detection results. The detection of statistically different EEG pages was performed by testing the F-statistic as described in the methods section. The average variance of the AR-vectors in the artefactual pages was significantly higher than in the unmarked pages, but the method proved to be rather insensitive to artefact detection in general. Two observations were made: (1) the method was most successful for artefacts of higher amplitude, and (2) the detected artefacts were generally in the lower frequency range, and were often spread over several pages. Therefore, the performance measures in terms of time were found to underestimate the true detection power, since the variability tracking only detected the start of multiple-page artefacts. For instance, the variability in a prolonged 50 Hz signal reduces to zero. This problem could be solved by an algorithm that halts the context window until the artefact is over. However, this was found a rather intricate addition to the current model. Moreover, this would invalidate the investigation of different context lengths: the period between artefacts often did not allow reinstating the context model.

Fig. 5 shows the performance for detection of artefact onset of the AR method versus the consensus of the observers. The consensus incorporated all artefacts that were marked by both observers, regardless of being 'moderate' or 'severe'. The average sensitivity reached over 50% only for context lengths of 10 and 20 s. The positive prediction for these contexts was significantly different from the neighbouring series (ANOVA, $\alpha$=0.01). The increasing overlap for 40, 80 and 160 s context was obvious also from increasing, high non-significance.

AR detection of only the severe artefacts was characterised by approximately 20–40% higher sensitivity values. However, the corresponding predictive accuracy was below 10%.

3.2.2. Slope detection
Slope detection was very successful in the ICU EEG data. The results are indicated in Fig. 6, showing detection performance versus the consensus of the observers.

The sensitivity showed acceptable high values in all patients, except in patient 38. In this patient, the relatively long periods of (consensus) interference artefact resulted in a very wide distribution of Slope values, causing insensitive threshold calculation.

Overall, Slope detection performance was not different for artefact onset alone: foremost, the method detected short duration, transient artefacts. Average sensitivity was highest when using a 20 s context length (76%), rising to 84% when excluding patient 38.

The variance of positive prediction increased with longer context windows. At the same time average prediction decreased: the performance did not improve. However, no statistical significance was found.

3.2.3. Combined AR and Slope detection
The results of the combined methods are given in Table 1, for a context of 20 s. Selection of 20 s context was based on the observations above: an AR sensitivity of 51% (at an acceptable 0.3 true artefact prediction rate), and highest Slope performance. The detection process was generally characterised by Slope detection of high frequency artefacts and AR detection of lower frequency artefacts. We can see that the average sensitivity has increased to 89%, which is 5% higher than the average indicated in Fig. 6 using Slope detection alone. In the individual patients, the AR method contributed a 2–10% improvement to detection power.

The specificity of detection was generally very high: 93–99% of valid EEG pages was left unmarked by both the Slope and the AR method.
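The text does not spell out how the two sets of markings were merged. As an assumption for illustration only, the Python sketch below marks a page when either method marked it, and shows how the sensitivity of the combination can exceed that of each method alone. Function names and page lists are hypothetical.

```python
def combine_markings(ar_pages, slope_pages):
    """Mark a page when either detector marked it (assumed union rule)."""
    if len(ar_pages) != len(slope_pages):
        raise ValueError("both detectors must score the same pages")
    return [a or s for a, s in zip(ar_pages, slope_pages)]


def sensitivity(consensus, marked):
    """Percentage of consensus artefact pages that were marked."""
    hits = sum(1 for c, m in zip(consensus, marked) if c and m)
    return 100.0 * hits / sum(consensus)
```

With the illustrative lists `consensus = [True, True, True, True, False, False]`, `slope = [True, True, True, False, False, False]` and `ar = [False, False, True, True, False, True]`, the Slope markings alone reach 75% sensitivity, the AR markings 50%, and the union 100%, at the cost of one additional false detection on the last page.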
4. Discussion
Signal monitoring in ICU frequently presents a good mix of biologic, technologic, and extrinsic artefacts [42]. Validation of EEG data acquired during such difficult conditions is imperative for automatic analysis and incorporation into routine practice [43].

The current study aimed at detection of all artefacts in the EEG subset of the IMPROVE data, focussing on context resolution. The methods were based on statistical rules, designed for objective detection of outlier phenomena in the EEG. Two observers were involved in scrutinising the 24 h recordings at a 10 s time resolution. Observer 1 scored a total percent artefact time of 7.7%, observer 2 scored 5.7% as artefact.

Subjective interpretation is a general problem in EEG evaluation studies [44]. For instance, small artefacts in delta frequency range amid a (normal) background of larger amplitude can be underestimated even by experienced observers [11]. Therefore, the consensus score of observers was used to test the performance of automatic algorithms. The performance measures were defined to reflect the percentages of time of correct detection.

In general, the detection was highly specific, partly affected by using performance definitions in terms of time, in combination with a low occurrence of artefacts. We acknowledged in retrospect that valid EEG periods were sufficiently left unmarked by the automatic methods, i.e. implying high specificity.

Fig. 5. Artefact detection using time-varying autoregressive variability tracking. Ellipses indicate the ($\mu+\sigma$) probability-contours (mean+standard deviation) for each series. (*) denotes a significant difference in positive prediction for 10 s, 20 s context length.

Fig. 6. Performance of detection for the Slope amplitude method: detection versus context lengths. Ellipses indicate the ($\mu+\sigma$) probability-contours (excluding the outlier values of patient 38). No change in performance was observed beyond 80 s context length.

The results also show that the Slope parameter detected most of the artefacts, and indicate that long context lengths were not needed for the investigated data set. The time-varying autoregressive variability tracking method was only relatively successful. Nevertheless, when using a combination of both methods, AR contributed up to 10% sensitivity by detecting low frequency artefacts. The overall performance reached 89% sensitivity and 53% positive prediction. This latter figure implies that approximately half of the automatic markings do not indicate artefacts. However, positive prediction is somewhat adversely influenced because of the consensus data from the human observers, which may also have excluded some possible true artefacts. In addition, it would seem sensible to err towards high sensitivity (at a cost of lesser positive prediction); this would allow observers to visually analyse events detected by automation, and categorise them as artefact/non-artefact. This would be in the knowledge that very few artefacts were missed by automation. If the aim eventually were to develop 'event detection' as opposed to 'artefact detection', the positive prediction would be greatly increased.

Based on the current findings, especially the EEG-like deviations found by AR variance detection may be defined as 'events' rather than artefacts. Therefore, in clinical recordings event detection not only includes the artefacts, but also may highlight the most interesting parts of the recording. As a discriminating method, higher AR-variability scores will more likely indicate (low frequency) artefacts. Interestingly, AR based analysis combined with variance testing was also used in an early method by Vachon et al. (1978) [45]. They used an F-ratio of only the error variances, calculated within the residual array of the AR model (1 s-epoch, p=5). At significance levels $\alpha$=0.05 and 0.10, they concluded that the detected non-stationary waveforms also needed additional pattern recognition. The current approach incorporated all parameters and residuals of the AR estimation and tested Eq. (6) at significance level $\alpha$=0.01, while incorporating longer context periods.

Context related detection was implemented here as a history based detection, therefore still different from human screening. Human screening often also involves going back in the data, which influences decision about the EEG being artefactual or not. In the current implementation, the automatic methods were designed for objective on-line processing, testing for statistical significance. As an illustration, Figs. 1 and 2 represent true data from the current study. Both figures indicate automatically detected EEG 'events' in the test window that were not marked by the observers, while clearly displaying deviating phenomena in the EEG.

Table 1
Artefact detection using a context length of 20 s: slope detection and autoregressive variability tracking combined
Overall 32 Patients 38 37 36 35 34 33
98 87 97 94 79 Sensitivity (%) 89 89 75
53 Pos. prediction (%) 49 58 50 57 61 50 49

Artefacts often occur in several channels
simultaneously; a detected event (or candidate
artefact) is therefore usually checked visually in all
channels displayed together. This was also
observed in the current data set, but cross-channel
information was not incorporated in the algorithms
or the evaluation.
Combining channels has been described by vari-
ous authors (e.g. [4,46,47]), implementing such
spatial (cross-channel) processing mainly for the
identification of eye-artefacts using rule-based sys-
tems. Another recently described system [48] used
artificial neural networks to pre-process EEG fea-
tures, and discriminated between (eye-) artefacts,
muscle artefacts and electrode artefacts in an ad-
ditional knowledge-based stage. The system cor-
rectly identified 90% of artefacts in the initial
evaluation. Unfortunately, the system was not
evaluated in a large clinical data set, and temporal
context was not evaluated systematically.
The current study provides some starting points
for choosing the optimal length of the context
periods in automatic analysis. Optimal context
was concluded to be as short as 20-40 s.
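The sensitivity and positive prediction figures of the kind reported in Table 1 can be computed from epoch-wise labels, as in the following minimal Python sketch. It assumes the usual definitions (sensitivity: share of observer-marked artefact epochs that were detected; positive prediction: share of detected epochs that were also marked by the observers) and an epoch-by-epoch comparison. The function name and the example labels are hypothetical and are not data from this study.

```python
def sensitivity_and_ppv(detected, reference):
    """Return (sensitivity %, positive prediction %) for epoch-wise labels.

    detected, reference: equal-length sequences of booleans, one per epoch,
    where True means "artefact" according to the algorithm (detected) or
    the observers' consensus (reference).
    """
    if len(detected) != len(reference):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(detected, reference))
    true_pos = sum(1 for d, r in pairs if d and r)
    false_neg = sum(1 for d, r in pairs if not d and r)
    false_pos = sum(1 for d, r in pairs if d and not r)

    # Guard against empty denominators (no marked or no detected epochs).
    marked = true_pos + false_neg
    flagged = true_pos + false_pos
    sensitivity = 100.0 * true_pos / marked if marked else float("nan")
    ppv = 100.0 * true_pos / flagged if flagged else float("nan")
    return sensitivity, ppv


if __name__ == "__main__":
    # Made-up labels for eight epochs, for illustration only.
    detected = [True, True, True, False, True, True, False, False]
    reference = [True, True, False, False, True, False, False, True]
    print(sensitivity_and_ppv(detected, reference))
```

A low positive prediction alongside a high sensitivity, as in Table 1, means that many detected epochs were not marked by the observers. This fits the observation that the detector also flags deviating EEG phenomena that are not artefacts.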
Acknowledgements
This project was supported by the Co-operation
Centre of the Brabant Universities, project 94CH.
We are also very grateful for the co-operation
with colleagues from the IMPROVE project: Dr
P. Prior, Dr C.E. Thomsen and Mr R. Pottinger.
Appendix A. Nomenclature
Autoregressive model
s(n)                   discrete time signal
N                      length in samples of EEG period
p                      order of autoregressive (AR) model
S                      signal vector
e                      residual error vector
Σ                      summation
1,...,p                AR coefficients
                       AR vector (coefficients)
Z                      matrix of p times N elements
JD                     quadratic cost vector (of error power)
σ_e^2                  error amplitude variance
R0                     autocorrelation, or power, of s(n)
F_i                    vector of 1,...,p, mean μ_err, standard
                       deviation σ_err of the residual errors
N_1, N_2               number of epochs in context, test
                       window respectively
C(N_1), C(N_2)         average of F_i
u_1^2, u_2^2           variance of F_i (Euclidean distance to
                       C(N_1), C(N_2))
f_{α, N_1-1, N_2-1}    significance of the ratio of variances
                       u_1^2, u_2^2
Slope distribution
-∞                     minus infinity
μ_s                    mean
σ_s                    standard deviation
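To make the autoregressive quantities listed above more concrete, the following minimal Python sketch builds a per-epoch feature vector consisting of the p AR coefficients followed by the mean and standard deviation of the residual errors. It is a sketch under stated assumptions: it uses the autocorrelation (Yule-Walker) method with the Levinson-Durbin recursion, which is swapped in here for brevity and is not necessarily the estimator used in this paper, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def autocorr(s, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of the sequence s."""
    n = len(s)
    return [sum(s[i] * s[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, p):
    """Solve the Yule-Walker equations for an AR(p) model.

    Returns (a, err) with the prediction convention
    s_hat(n) = a[0]*s(n-1) + ... + a[p-1]*s(n-p)
    and err the final prediction error power. Requires r[0] > 0.
    """
    a = []
    err = r[0]
    for m in range(1, p + 1):
        acc = r[m] - sum(a[j] * r[m - 1 - j] for j in range(m - 1))
        k = acc / err  # reflection coefficient of order m
        a = [a[j] - k * a[m - 2 - j] for j in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


def epoch_features(s, p):
    """Feature vector: p AR coefficients, residual mean, residual std."""
    a, _ = levinson_durbin(autocorr(s, p), p)
    residuals = [s[n] - sum(a[k] * s[n - 1 - k] for k in range(p))
                 for n in range(p, len(s))]
    return a + [mean(residuals), pstdev(residuals)]


if __name__ == "__main__":
    # Synthetic alternating signal, for illustration only.
    epoch = [1.0, -1.0] * 20
    print(epoch_features(epoch, p=2))
```

Feature vectors of this kind, one per epoch, are what a context versus test-window variance comparison would operate on.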
References
[1] J.A. McEwen, G.B. Anderson, Modeling the stationarity
and gaussianity of spontaneous electroencephalographic
activity, IEEE Trans. Biomed. Eng. 22 (1975) 361-369.
[2] B.H. Jansen, A. Hasman, R. Lenten, Piecewise EEG
analysis: an objective evaluation, Int. J. Biomed. Comput.
12 (1981) 17-27.
[3] G. Bodenstein, W. Schneider, C.V.D. Malsburg,
Computerized EEG pattern classification by adaptive
segmentation and probability-density-function classification.
Description of the method, Comput. Biol. Med. 15 (1985)
297-313.
[4] A. Varri, K. Hirvonen, J. Hasan, P. Loula, V. Hakkinen,
A computerized analysis system for vigilance studies,
Comp. Meth. Progr. Biomed. 39 (1992) 113-124.
[5] V. Jagannathan, J.R. Bourne, B.H. Jansen, J.W. Ward,
Artificial intelligence methods in quantitative
electroencephalogram analysis, Comput. Prog. Biomed. 15 (1982)
249-258.
[6] B.H. Jansen, B.M. Dawant, Knowledge-based approach
to sleep EEG analysis - A feasibility study, IEEE Trans.
Biomed. Eng. 36 (1989) 510-518.
[7] B.H. Jansen, Quantitative analysis of
electroencephalograms: is there chaos in the future?, Int. J. Biomed.
Comput. 27 (1991) 95-123.
[8] E. Flooh, E. Korner, G. Ladurner, H. Lechner,
EEG-Nachtschlafableitungen: auswertung mittels
automatischer Datenanalyse (EEG-night-sleep-recordings:
automatic analysis. In German), Z. EEG-EMG 13 (1982)
157-160.
[9] D.P. Brunner, R.C. Vasko, C.S. Detka, J.P. Monahan,
C.F. Reynolds III, D.J. Kupfer, Muscle artifacts in the
sleep EEG: automated detection and effect on all-night
EEG power spectra, J. Sleep Res. 5 (1996) 155-164.
[10] B.H. Jansen, J.R. Bourne, J.W. Ward, Identification and
labelling of EEG graphic elements using autoregressive
spectral estimates, Comput. Biol. Med. 12 (1982) 97-106.
[11] J.S. Barlow, Artifact processing (rejection and
minimization) in EEG data processing, in: F.H. Lopes da Silva,
W.H. Storm van Leeuwen (Eds.), Handbook of
Electroencephalography and Clinical Neurophysiology,
Revised edition, Vol. 3B: Applications of Analytical
Techniques, Elsevier, Amsterdam, 1986, pp. 15-62.
[12] J.S. Barlow, Muscle spike artifact minimization in EEGs
by time-domain filtering, Electroenceph. Clin.
Neurophysiol. 55 (1983) 487-491.
[13] J.S. Barlow, Automatic elimination of electrode-pop
artifacts in EEGs, IEEE Trans. Biomed. Eng. 33 (1986)
517-521.
[14] V. Strejc, Least squares parameter estimation,
Automatica 16 (1980) 535-550.
[15] D.A. Pierce, Testing normality in autoregressive models,
Biometrika 72 (1985) 293-297.
[16] S.S. Shapiro, M.B. Wilk, An analysis of variance test for
normality (complete samples), Biometrika 52 (1965)
591-611.
[17] S. Shapiro, M.B. Wilk, H.J. Chen, A comparative study of
various tests for normality, Am. Stat. Ass. J. 63 (1968)
1343-1372.
[18] R. Bender, B. Schultz, A. Schultz, I. Pichlmayr, Testing
the gaussianity of the human EEG during anaesthesia,
Meth. Inf. Med. 31 (1992) 56-59.
[19] J. Makhoul, Linear prediction: a tutorial review, Proc.
IEEE 63 (1975) 561-580.
[20] G.E.P. Box, G.M. Jenkins, Time series analysis,
forecasting and control, Revised edition, Holden-Day, London,
1976.
[21] H. Akaike, A new look at the statistical model
identification, IEEE Trans. Autom. Control 19 (1974) 716-723.
[22] C.W. Anderson, E.A. Stolz, S. Shamsunder, Multivariate
autoregressive models for classification of spontaneous
electroencephalographic signals during mental tasks,
IEEE Trans. Biomed. Eng. 45 (1998) 277-286.
[23] S. Cerutti, D. Liberati, G. Avanzini, S. Franceschetti,
F. Panzica, Classification of the EEG during neurosurgery.
Parametric identification and Kalman filtering compared,
J. Biomed. Eng. 8 (1986) 244-254.
[24] L.H. Zetterberg, Estimation of parameters for a linear
difference equation with application to EEG analysis,
Math. Biosciences 5 (1969) 205-226.
[25] B.H. Jansen, J.R. Bourne, J.W. Ward, Autoregressive
estimation of short segment spectra for computerized
EEG analysis, IEEE Trans. Biomed. Eng. 28 (1981)
630-638.
[26] J. Pardey, S. Roberts, L. Tarassenko, A review of
parametric modeling techniques for EEG-analysis, Med. Eng.
Physics 18 (1996) 2-11.
[27] F.D.J. Dunstan, R.W. Marshall, The detection of
artefacts in EEG series, Stat. Med. 10 (1991) 1719-1731.
[28] S. Cerutti, D. Liberati, P. Mascellani, Parameter
extraction in EEG processing during riskful neurosurgical
operations, Signal Proc. 9 (1985) 25-35.
[29] D.C. Montgomery, G.C. Runger, Applied statistics and
probability for engineers, Wiley, New York, 1994.
[30] M. Scherg, Simultaneous recording and separation of
early and middle latency auditory evoked potentials,
Electroenceph. Clin. Neurophysiol. 54 (1982) 339-341.
[31] H. Hinrichs, H.J. Heinze, M.R. Gaab,
Neurophysiologisches monitoring bei neurochirurgischen
gefäßoperationen: spezifische technische anforderungen und deren
umsetzung (Neurophysiological monitoring of neurosurgical
vessel-operations: technical specification and
implementation. In German), Z. EEG-EMG 23 (1992) 195-202.
[32] H. Hinrichs, H. Feistner, H.J. Heinze, A trend-detection
algorithm for intraoperative EEG monitoring, Med. Eng.
Physics 18 (1996) 626-631.
[33] P.J.M. Cluitmans, J.W. Jansen, J.E.W. Beneken, Artefact
detection and removal during auditory evoked potential
monitoring, J. Clin. Mon. 9 (1993) 112-120.
[34] M. van de Velde, G. van Erp, P.J.M. Cluitmans, Muscle
artefact detection in the normal human awake EEG,
Electroenceph. Clin. Neurophysiol. 107 (1998) 149-158.
[35] I. Korhonen, J. Ojaniemi, K. Nieminen, M. van Gils, A.
Heikela, A. Kari, Building the IMPROVE data Library,
IEEE Eng. Med. Biol. 16 (1997) 25-32.
[36] B. Schultz, R. Bender, A. Schultz, I. Pichlmayr,
Reduktion der anzahl von EEG-ableitungen für ein
routinemäßiges monitoring auf der intensivstation
(Electroencephalographic monitoring in the ICU -
Reduction of the number of recorded channels. In German),
Biomed. Technik 37 (1992) 194-199.
[37] C.E. Thomsen, J. Gade, K. Nieminen, R.M. Langford,
I.R. Ghosh, K. Jensen, M. van Gils, A. Rosenfalck, P.F.
Prior, S. White, Collecting EEG signals in the IMPROVE
data library, IEEE Eng. Med. Biol. 16 (1997) 33-40.
[38] I.R. Ghosh, P.F. Prior, S.R. White, J. Gade, K. Jensen,
R.M. Langford, A. Rosenfalck, C.E. Thomsen, Artefact
assessment in prolonged EEG-polygraphic recordings in
intensive care, Electroenceph. Clin. Neurophysiol. (In
press).
[39] M. van Gils, A. Rosenfalck, S. White, P. Prior, J. Gade,
L. Senhadji, C.E. Thomsen, I.R. Ghosh, R.M. Langford,
K. Jensen, Signal processing in prolonged EEG
recordings during intensive care, IEEE Eng. Med. Biol. 16
(1997) 56-63.
[40] K. Nieminen, R.M. Langford, C.J. Morgan, J. Takala, A.
Kari, A clinical description of the IMPROVE data
library, IEEE Eng. Med. Biol. 16 (1997) 21-24.
[41] P. Royston, Shapiro-Wilk W test and its significance
level. Algorithm AS R94, Appl. Stat. 44 (1995) 4.
[42] D.W. Klass, The continuing challenge of artifacts in the
EEG, Am. J. EEG Technol. 35 (1995) 239-269.
[43] P. Prior, The rationale and utility of neurophysiological
investigations in clinical monitoring for brain and spinal
cord ischaemia during surgery and intensive care, Comp.
Meth. Prog. Biomed. 51 (1996) 13-27.
[44] G.W. Williams, H.O. Luders, A. Brickner, M.
Goormastic, D.W. Klass, Interobserver variability in EEG
interpretation, Neurology 35 (1985) 1714-1719.
[45] B. Vachon, B. Dubuisson, D. Samson-Dollfus, Etude
automatique de l'EEG: une méthode de détection des
non-stationnarités (Automatic EEG processing: a method for
detection of non-stationarities. In French), Int. J. Biomed.
Comput. 9 (1978) 147-162.
[46] T. Pietila, S. Vapaakoski, U. Nousiainen, A. Varri, H.
Frey, V. Hakkinen, Y. Neuvo, Evaluation of a
computerized system for recognition of epileptic activity during
long-term EEG recording, Electroenceph. Clin.
Neurophysiol. 90 (1994) 438-443.
[47] M. Nakamura, T. Sugi, A. Ikeda, R. Kagigi, H.
Shibasaki, Clinical application of automatic integrative
interpretation of awake background EEG: quantitative
interpretation, report making, and detection of artifacts
and reduced vigilance level, Electroenceph. Clin.
Neurophysiol. 98 (1996) 103-112.
[48] J. Wu, E.C. Ifeachor, E.M. Allen, W.K. Wimalaratna,
N.R. Hudson, Intelligent artefact identification in
electroencephalography signal processing, IEE Proc. Sci.
Meas. Technol. 144 (1997) 193-201.