Accident Analysis and Prevention 60 (2013) 298–304
Contents lists available at ScienceDirect: Accident Analysis and Prevention
Journal homepage: www.elsevier.com/locate/aap

Recognising safety critical events: Can automatic video processing improve naturalistic data analyses?

Marco Dozza a,*, Nieves Pañeda González a,b

a Chalmers University of Technology, Göteborg S-412 96, Sweden
b University of Oviedo, Escuela Politécnica de Ingeniería de Gijón, Gijón-Asturias 33203, Spain

Article history: received 3 October 2012; received in revised form 13 December 2012; accepted 11 February 2013.

Keywords: traffic safety; naturalistic data analysis; driver reaction; video processing; near-crash; safety-critical event

Abstract

New trends in research on traffic accidents include Naturalistic Driving Studies (NDS). NDS are based on large-scale data collection of driver, vehicle, and environment information in the real world. NDS data sets have proven to be extremely valuable for the analysis of safety critical events such as crashes and near-crashes. However, finding safety critical events in NDS data is often difficult and time consuming. Safety critical events are currently identified using kinematic triggers, for instance searching for deceleration below a certain threshold signifying harsh braking. Due to the low sensitivity and specificity of this filtering procedure, manual review of video data is currently necessary to decide whether the events identified by the triggers are actually safety critical. Such a reviewing procedure is based on subjective decisions, is expensive and time consuming, and is often tedious for the analysts. Furthermore, since NDS data is growing exponentially over time, this reviewing procedure may not be viable any more in the very near future.
This study tested the hypothesis that automatic processing of driver video information could increase the correct classification of safety critical events from kinematic triggers in naturalistic driving data. Review of about 400 video sequences recorded from the events, collected by 100 Volvo cars in the euroFOT project, suggested that drivers' individual reactions may be the key to recognizing safety critical events. In fact, whether an event is safety critical or not often depends on the individual driver. A few algorithms, able to automatically classify driver reaction from video data, were compared. The results presented in this paper show that the state-of-the-art subjective review procedures to identify safety critical events from NDS can benefit from automated objective video processing. In addition, this paper discusses the major challenges in making such video analysis viable for future NDS and new potential applications for NDS video processing. As new NDS such as SHRP2 are now providing the equivalent of five years of one-vehicle data each day, the development of new methods, such as the one proposed in this paper, seems necessary to guarantee that these data can actually be analysed.

© 2013 Elsevier Ltd. All rights reserved.

* Corresponding author at: Department of Applied Mechanics, SE-412 96, Gothenburg, Sweden. Tel.: +46 317723621. E-mail address: [email protected] (M. Dozza).

1. Introduction

More than 1.2 million people die on the roads in traffic accidents every year (World Health Organization, 2009). Countermeasures to reduce traffic accidents include intelligent safety systems able to avoid (or mitigate) accidents by automatically detecting safety critical events and intervening before they develop into crashes (Bishop, 2005). Today we lack a commonly established definition of what a safety critical event is, even though safety system algorithms continuously improve their ability to automatically recognise and anticipate safety critical events (Lai and Chung-Ming, 2010). As an example, the 100-Car Naturalistic Driving Study (Dingus et al., 2006a) defines safety critical events as follows:

Crash: situations in which there is physical contact between the subject vehicle and another vehicle, fixed object, pedestrian, cyclist, or animal.

Near-crash: situations requiring a rapid, severe, evasive manoeuvre to avoid a crash.

Incident: situations requiring an evasive manoeuvre of lesser magnitude than a near-crash.

Although the definitions of safety critical events proposed by Dingus may be debatable, safety critical events were proven to be valid surrogates for crash analyses according to a recent study from the Virginia Tech Transportation Institute (Guo et al., 2010).

0001-4575/$ – see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.aap.2013.02.014

Safety critical events are not only difficult to define; they are also rare and hard to predict. The most credited method for the analysis of safety critical events is the collection of naturalistic driving data (Dingus et al., 2006b; Sayer et al., 2011; Malta et al., 2012). Naturalistic driving studies (NDS) continuously collect data in the real world in order not to miss any safety critical events. In NDS, vehicles are equipped with cameras, radars, and other sensors to log as much information as possible about the driver, the vehicle, and the environment. Unlike accident databases and crash investigations, data from NDS is not limited to the consequences of an accident but captures the full chain of events leading to the accident, the driver behaviour, and the environmental context (Hanowski et al., 2007; Dozza, 2012). However, identifying safety critical events in NDS may be difficult. To date, safety critical events are identified by looking for extreme values of vehicle dynamics, e.g., high lateral or longitudinal accelerations, with kinematic triggers (Dingus et al., 2006a; Batelle, 2007; Lee et al., 2011). Kinematic triggers mimic intelligent safety systems but use lower and less sophisticated thresholds. As a consequence, kinematic triggers have very low specificity, i.e. a high number of events are erroneously recognised as safety critical, when tuned for a low rate of false negative events in order not to miss any true events (Malta et al., 2012). Despite being time consuming, expensive, and tedious, manual review of video sequences is the present solution to the low specificity of kinematic triggers (Faber et al., 2012).
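Such a kinematic trigger can be sketched in a few lines. The threshold value, sampling rate, and signal layout below are illustrative assumptions, not euroFOT specifics:

```python
import numpy as np

def kinematic_trigger(accel_long, threshold=-0.5, fs=10):
    """Flag candidate safety critical events where longitudinal
    acceleration (in g) drops below a harsh-braking threshold.

    accel_long : 1-D array of longitudinal acceleration samples (g)
    threshold  : deceleration threshold in g (illustrative value)
    fs         : sampling frequency in Hz (the CAN bus was logged at 10 Hz)

    Returns the sample indices where the trigger fires.
    """
    accel_long = np.asarray(accel_long, dtype=float)
    return np.flatnonzero(accel_long < threshold)

# Hypothetical 3-s trace at 10 Hz: normal driving, then harsh braking.
trace = np.concatenate([np.full(20, -0.1), np.full(10, -0.7)])
hits = kinematic_trigger(trace)
print(len(hits))       # number of triggered samples
print(hits[0] / 10.0)  # time (s) at which the trigger first fires
```

Exactly because such a threshold comparison carries no context, every candidate it produces must still be reviewed, which is the problem this paper addresses.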

While reviewing video sequences to find safety critical events, analysts often try to establish some empathic link with the driver to understand whether the driver experienced the event as being safety critical. In fact, different drivers, such as sensation seekers (Jonah et al., 2001), may exhibit high decelerations even under normal driving conditions. In addition, video information from outside the vehicle is generally not enough to distinguish between normal driving and safety critical events. On the contrary, fast reactions and surprised or scared expressions on the drivers' faces – referred to as oops reactions in Victor et al. (2010) – are more credible indicators of safety critical events.

This study tested the hypothesis that automatic processing of driver video could increase the correct classification of safety critical events extracted by kinematic triggers from naturalistic driving data. To test this hypothesis, this study developed and tested algorithms to automatically recognise oops reactions from video sequences collected in the euroFOT project from 100 Volvo cars. Previous work from Kobayashi (2007) and Molinero et al. (2009) supports the underlying assumption that driver reaction is a specific indicator of a safety critical event.

2. Methods

2.1. The euroFOT data set

This study analysed data from the euroFOT project, a large European project involving 28 partners across the continent, which collected the largest naturalistic data set of vehicle data currently available in Europe (Schoch et al., 2012). The data set used in this study was collected from 100 Volvo cars in Gothenburg, Sweden, during one year. Data was collected continuously from the CAN bus (10 Hz), GPS (1 Hz), video images (10 Hz), and an eye tracker (60 Hz). A total of four cameras were installed in each of the instrumented cars: two with a forward and backward view, respectively, one located under the steering wheel to record the pedals and feet movements, and, finally, one camera located under the rear-view mirror, pointed at the driver. The drivers were all volunteers, driving their own cars, who had signed a consent form prior to participation in the euroFOT study. This consent form guaranteed the drivers' rights in accordance with legal and privacy requirements set by the Swedish government. Due to privacy and ethical issues, data was only accessible by authorised personnel and stored in locked rooms at SAFER (the Vehicle and Traffic Safety Centre at Chalmers).

Fig. 1. Steps for the evaluation of algorithms to automatically recognize oops reactions.

2.2. Extraction of test and validation samples from euroFOT

Several thousand events were identified as potentially safety critical from the Volvo Cars data set using kinematic triggers in the euroFOT project. Video sequences from these events were then reviewed to classify the events as safety critical or non-safety critical. This study used 11 of the 30 positive events (available at the time this analysis was performed) and 22 false events as a training sample for the algorithm. A validation sample was also created by merging the remaining 19 positive events with a random selection of 96 negative events from the events identified as potentially safety critical by the kinematic triggers.

2.3. Evaluation of algorithms to automatically recognize oops reactions

Different algorithms were developed to automatically recognize oops reactions (Victor et al., 2010) from the video data in the test and validation samples. Oops reactions were reviewed and analysed in several video sequences from events identified using the same triggers as in the 100-Car Naturalistic Driving Study (Dingus et al., 2006a) on the euroFOT data. This analysis helped specify an oops reaction as a fast movement, visible in 5–10 frames, involving the trunk, arms, and sometimes the head. Fig. 1 shows the 5 steps followed throughout this study to recognize oops reactions and test the performance of our algorithms.

2.3.1. Step 1 – video preparation (cropping and frame filtering)

One safety critical event (true positive event) and two normal driving events (true negative events) were extracted for 11 different drivers (a total of 33 events). Events were 2 s long. The two negative events were picked at 2 s and 4 s before the triggered time and verified to be true negatives. Each frame was reduced in size to avoid superfluous information from inside the vehicle (Fig. 2). Flashes from the eye tracker infrared illuminators were removed and a mask was applied to the window area (Fig. 2). This training sample was used for testing and searching for methods to recognise driver reaction within a limited collection of data.
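The cropping and masking in this step might look like the following sketch; the box coordinates and array layout are assumptions, since the paper gives no exact values:

```python
import numpy as np

def prepare_frame(frame, crop_box, window_box):
    """Crop a video frame to the driver area and mask the window region.

    frame      : 2-D grayscale array (H x W)
    crop_box   : (top, bottom, left, right) of the region to keep
    window_box : (top, bottom, left, right) of the window area to mask,
                 given in the cropped frame's coordinates

    All coordinates are illustrative; the paper does not specify them.
    """
    t, b, l, r = crop_box
    cropped = frame[t:b, l:r].astype(float)
    wt, wb, wl, wr = window_box
    cropped[wt:wb, wl:wr] = 0.0  # blank out the side-window area
    return cropped

# Hypothetical 480x640 frame: keep a 300x400 driver region and
# zero out a 100x100 window patch in its top-right corner.
frame = np.ones((480, 640))
out = prepare_frame(frame, (100, 400, 200, 600), (0, 100, 300, 400))
print(out.shape)              # cropped size
print(out[:100, 300:].sum())  # masked area contributes nothing
```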

2.3.2. Step 2 – recognising silhouettes in the training sample

Based on Victor et al. (2010), and on the observation of driver reaction in safety critical events, two algorithms were developed and tested on the training sample to distinguish positive from negative events.

The first algorithm was based on the optical flow (Horn and Schunck, 1981), which estimates the speed of three-dimensional object motion by distinguishing changes in brightness in two-dimensional images. The public code developed by Sun et al. (2010) was implemented on the training sample. The main objective was to assess whether it would be possible to estimate driver body motion by evaluating changes in brightness in the image collection. A peak in the second derivative of the average speed of the optic flow was used to identify driver reaction (Fig. 3).

Fig. 2. Image preprocessing. The top-right area comprises the vehicle window, which was cut. Pixels outside the box were also cut.
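The peak-detection step can be sketched as follows, assuming the per-frame average optical-flow speed has already been computed (the paper used the Horn–Schunck implementation published by Sun et al.); the trace and the threshold here are synthetic:

```python
import numpy as np

def reaction_frame(avg_speed, threshold=1.0):
    """Locate a driver reaction as a peak in the second derivative of
    the average optical-flow speed across frames.

    avg_speed : per-frame mean flow magnitude (one value per frame)
    threshold : minimum |second derivative| counted as a reaction
                (illustrative; the paper tunes this on training data)

    Returns the frame index of the strongest peak, or None.
    """
    d2 = np.diff(np.asarray(avg_speed, dtype=float), n=2)
    peak = int(np.argmax(np.abs(d2)))
    return peak + 1 if np.abs(d2[peak]) >= threshold else None

# Synthetic 2-s sequence at 10 Hz: slow drift in body motion,
# then a sudden jump at frame 12 standing in for an oops reaction.
speeds = np.concatenate([np.linspace(0.1, 0.2, 12), np.full(8, 3.0)])
print(reaction_frame(speeds))
```

As Fig. 3 notes, small instinctive movements (a driver poking his nose) produce similar peaks, so the threshold has to be chosen carefully.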

The second algorithm analysed the third derivative of the pixel values. The standard deviation of this derivative showed fast torso movement by the driver as a white silhouette (Fig. 4) in all positive events in the training sample.

Different movements by the driver, such as manoeuvring the steering wheel, were also recognised by this algorithm (Fig. 5). However, only true events showed the full silhouette of the driver. Other algorithms, based on statistical methods such as analysis of variance, were tested on the training sample without success. More specifically, these algorithms failed to distinguish events from steering manoeuvres and are therefore not discussed further in this paper.

Fig. 3. Example from the optical flow based algorithm. The peak in the 2nd derivative of the average speeds can identify the driver reaction in the true positive event. However, the same algorithm also picks up a driver poking his nose in one of the negative events. A fine threshold is necessary to distinguish these two events.

2.3.3. Step 3 – recognising oops reactions from the training sample

The graphical information presented in Figs. 4 and 5 requires an intermediate step to convert it into numerical information in order to facilitate automatic detection. Three procedures were tested: mean, harmonic mean, and Gray Level Co-occurrence Matrix (GLCM) properties. The purpose of these procedures is to identify the spread of pixels needed to reproduce the silhouette of the driver as a result of a safety critical event. The dispersion of pixels over the image can be estimated by the mean of the distribution of the sum of standard deviation values along rows, which is expected to be higher during a safety critical event. Unlike the mean, the harmonic mean is not affected by outliers. It was regarded as an alternative to the mean in case manoeuvres and position changes during normal driving generate white areas in the resulting image. GLCM is a statistical texture-analysis procedure that considers the spatial distribution of pixels over the image (Hall-Beyer, 2007). GLCM calculates how often pairs of different combinations of pixel intensities occur in an image; this information can then be used to identify the driver silhouette.
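The mean and harmonic-mean scores can be sketched as follows (the GLCM procedure is omitted here). The toy images illustrate why the harmonic mean is less sensitive to the localised white areas produced by steering manoeuvres; image sizes, values, and the epsilon guard are illustrative assumptions:

```python
import numpy as np

def row_sum_stats(sil, eps=1e-9):
    """Convert a silhouette image into scalar scores, as in Step 3:
    the mean and the harmonic mean of the per-row sums of the
    standard-deviation image. High values suggest a full driver
    silhouette.

    eps guards the harmonic mean against all-zero rows (an
    assumption; the paper does not describe how zeros are handled).
    """
    rows = np.asarray(sil, dtype=float).sum(axis=1) + eps
    mean = rows.mean()
    hmean = len(rows) / np.sum(1.0 / rows)
    return mean, hmean

# A full silhouette lights up most rows; a steering manoeuvre lights
# up only a few rows near the wheel. The harmonic mean punishes the
# mostly-dark rows of the manoeuvre image far more than the mean does.
full_sil = np.ones((10, 10))
steering = np.zeros((10, 10))
steering[7:9, :] = 5.0
m_full, h_full = row_sum_stats(full_sil)
m_steer, h_steer = row_sum_stats(steering)
print(h_full > h_steer)  # harmonic mean separates the two cases
```

Note that the plain means of the two images are nearly identical, which mirrors the paper's finding that the mean criterion struggles with steering manoeuvres.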

2.3.4. Step 4 – applying a fusion mask

The three procedures applied to identify the driver silhouette, together with the optical flow algorithm, defined four potential algorithms to be tested on a validation data set. The use of a fusion mask, able to capture changes of intensity in the area of the driver's torso, was also evaluated for the mean procedure.

2.3.5. Step 5 – evaluating the algorithms on the validation data set

A validation data set comprising 93 events, 74 negative (normal driving) events and 19 positive (safety critical) events, was extracted from the euroFOT database. Unevenly lit events were manually excluded from this data set (for example, sequences resulting from driving through a tunnel). Specificity and sensitivity from the classification of the validation data set were calculated for each of the algorithms. Furthermore, receiver operating characteristic (ROC) curves were used to evaluate the trade-off between specificity and sensitivity (Swets, 1979) while tuning the algorithms. The area under a ROC curve (AUC) estimates the overall accuracy (Hanley and McNeil, 1983). However, since preserving true positives in this context is more important than eliminating true negatives, a higher AUC does not necessarily indicate a better algorithm. To overcome this limitation, minimum requirements for the algorithms to (1) reduce by at least 40% the number of candidates for safety critical events and (2) keep more than 80% of the true positive events were defined as success factors and used to compare the algorithms.

Fig. 4. Images from the standard deviation of the third derivative of pixel values. A clear white silhouette of the driver is shown in the safety critical event (third frame). No movement in the negative events results in black images.

Fig. 5. Image from the standard deviation of the third derivative of pixel values. A clear white silhouette of the driver is shown in the safety-critical event (third frame). Steering manoeuvres in the negative events (first and second frame) result in white shades around the steering wheel.

3. Results

3.1. Performance in the training sample

The algorithm based on the harmonic mean exhibited the best performance in recognising true and false events from the training sample (Table 1). Additionally, the algorithm based on optical flow showed high sensitivity and specificity. However, the optical flow algorithm had a very high demand on computation time, which resulted in slow processing (200 s of processing time for 1 s of video). Since slow computation was considered unacceptable for application to large data sets, this algorithm was not investigated further. Algorithms based on the mean had some issues with discriminating between safety critical events and normal driving manoeuvres (e.g. when drivers turned the steering wheel more than 180 degrees). In fact, such manoeuvres also generated peaks in the distribution of the sum of standard deviation values which was used to identify safety critical events. The GLCM-based algorithm showed the worst performance, because the energy from the GLCM was generally lower, and the contrast higher, during true positive events.

Table 1
Performance of the four algorithms in the training sample.

Algorithm        Sensitivity (%)   Specificity (%)
Harmonic mean    100               100
Optical flow     91                95
Mean             82                91
GLCM props.      64                64

3.2. Performance in the validation data set

The best level of accuracy based on the area under the ROC curve was achieved using the mean criterion together with the fusion mask, which considered the changes of intensity in the area of the driver's torso (0.8471 AUC; Table 2). However, according to the minimum requirements, i.e., reducing the number of negative events by at least 40% while preserving 80% of the true events, the best results were obtained using the mean criterion unmasked (84.21% sensitivity and 79.73% specificity; see Fig. 6). This reduces the number of candidates for safety critical events from 93 to 31, while 16 of the 19 positives remain. Finally, all algorithms required similar processing time (Table 2).

Table 2
Performance of the four algorithms in the validation sample.

Algorithm                     Sensitivity   Specificity   AUC      Trip time/process time
Harmonic mean                 0.8421        0.6622        0.8417   0.11
Mean (without mask)           0.8421        0.7973        0.8364   0.08
Mean (fusion mask)            0.8421        0.6486        0.8471   0.09
GLCM contrast (fusion mask)   0.8421        0.7568        0.8350   0.09
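As a consistency check, the headline figures for the unmasked mean criterion follow directly from the confusion counts quoted in the text (19 positives of which 16 remain, 74 negatives, 31 of 93 candidates kept):

```python
# Reproducing the reported validation figures from the counts in
# the text: 16 of 19 positives kept, 31 of 93 candidates remaining,
# 74 negatives in total.
tp, fn = 16, 3                        # positives kept / missed
candidates, total = 31, 93
fp = candidates - tp                  # remaining candidates that are negatives
tn = 74 - fp                          # negatives correctly discarded

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
reduction = 1 - candidates / total    # fraction of candidates removed

print(round(sensitivity, 4))  # matches the reported 84.21%
print(round(specificity, 4))  # matches the reported 79.73%
print(reduction > 0.40)       # meets the 40% reduction requirement
```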

4. Discussion

4.1. Contribution to scholarly knowledge

Throughout this paper, it has been shown that driver reaction may be the key to pinpointing safety critical events from NDS in order to understand accident causation and evaluate intelligent vehicle systems. To date, recognition of safety critical events from NDS requires a subjective estimation of accident risk because objective definitions based on kinematic triggers have very poor performance (Dingus et al., 2006a; Faber et al., 2012). This paper explored the objective alternative of automatically recognising sudden motion of the driver's torso from video data in order to identify safety critical events. An objective definition of what a safety critical event actually is, such as the one proposed in this study, may increase the reliability as well as the time and cost efficiency of analyses based on safety critical events. Furthermore, this objective definition may help compare results and harmonise analyses across different NDS, supporting research on accident causation and the development of intelligent systems.

The algorithms tested in this study successfully identified safety critical events from NDS data, showing for the first time the potential of using video processing of NDS data to simplify current analyses and possibly enable new ones. In this respect, this paper can be regarded as a feasibility study showing the importance for future research of focusing on automatic extraction of salient information from NDS video. An algorithm able to correctly identify oops reactions may also be adapted to recognise secondary tasks (Blanco et al., 2006), since the video data preprocessing and the confounding situations should be essentially the same. This may open up other applications of image processing to traffic safety analysis. For instance, video data could be used to evaluate and quantify the use of nomadic devices inside the vehicle (Walsh et al., 2008), which is obviously relevant for distraction and inattention issues (Hancock et al., 2003). Other obvious applications include recognition of drowsy driving (Wierwille and Ellsworth, 1994) or estimation of out-of-position postures for airbag deployment (Hault-Dubrulle et al., 2011).

Fig. 6. A ROC curve obtained by applying the mean criterion to the images from the standard deviation of the 3rd derivative with different thresholds. The straight lines over the image indicate the minimum requirements (reducing by 40% the number of negatives while keeping at least 80% of the positives).

This paper focuses on video post-processing, which is sufficient for a posteriori analysis of data. However, running video processing in real time to extract salient information would also improve threat assessment in existing active safety systems. Another advantage of real-time processing is the possibility of accessing more accurate data, sampled at higher frequencies, and using this data to extract and save salient information without storing the whole data stream. Finally, real-time in-vehicle processing enables distributed computing and may significantly reduce post-processing time.

4.2. Preprocessing of video data from NDS

Video data from NDS requires preprocessing before its quality is sufficient to determine safety critical events. In this study, we found it necessary to (1) eliminate image flickering originating from other pieces of equipment, such as the infrared illuminators in the eye tracking system, and (2) eliminate parts of the image which were not relevant to the study and may have included fellow passengers (e.g. to the right of the driver) or disturbance from the surrounding environment (e.g. traffic seen through the side windows). NDS video sequences are collected in a variety of situations, e.g., darkness, glare, heavy rain, etc. The weather and sudden lighting changes, such as when entering a tunnel, challenge the camera's ability to rapidly adapt to the new illumination. Furthermore, shadows projected from trees, poles, and buildings into the car may also influence the camera and result in continuous changes in the contrast and brightness of the image. Thus, in general, before any algorithm to recognise salient information from video is applied, a preprocessing procedure should be implemented. Such a procedure is intended to select and improve the quality of the video, and possibly categorise the video (e.g. video captured in darkness vs video captured in daylight) to provide sets of more homogeneous data for the subsequent processing.

Even after preprocessing, some confounding situations may still hinder the recognition of safety critical events from body movement. For instance, this study found that bumps in the road, sharp curves, and unevenly lit frame sequences may interfere with correct classification or produce false positive events. The next versions of this algorithm should overcome these issues. For instance, a bump in the road implies a peak in the car's vertical acceleration, which would not generally happen during an oops reaction. Such information is available on the CAN bus and could be used to disambiguate these situations.
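One way to use the CAN-bus vertical acceleration for this purpose is sketched below; the threshold, window size, and signal names are illustrative assumptions, not part of the paper's method:

```python
import numpy as np

def suppress_bumps(reaction_frames, accel_z, window=3, z_thresh=0.4):
    """Discard candidate oops reactions that coincide with a peak in
    vertical acceleration (a road bump), as suggested in the text.

    reaction_frames : frame indices flagged by the video algorithm
    accel_z         : vertical acceleration (g), one sample per frame
    window          : frames around the candidate to inspect
    z_thresh        : |accel_z| above this counts as a bump (illustrative)
    """
    accel_z = np.asarray(accel_z, dtype=float)
    kept = []
    for f in reaction_frames:
        lo, hi = max(0, f - window), min(len(accel_z), f + window + 1)
        if np.max(np.abs(accel_z[lo:hi])) < z_thresh:
            kept.append(f)  # no bump nearby: keep the candidate
    return kept

# Two candidates: one during smooth driving, one right on a bump.
accel_z = np.zeros(40)
accel_z[30] = 0.9                         # bump at frame 30
print(suppress_bumps([10, 30], accel_z))  # only the smooth candidate survives
```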

4.3. Future developments

The best algorithm proposed in this study failed to correctly identify three positive events. However, for all three events the driver's silhouette was visible after Step 2, even if the threshold used in Step 3 to classify events was not sufficiently low to recognize it. This finding suggests that the main limitation of this algorithm lies in its ability to correctly recognise the driver's silhouette after processing the 2 s video, and not in correctly recognising the driver's reaction. Therefore, a dynamic threshold in Step 3 would probably be sufficient to reach 100% sensitivity.

In addition, the validation sample used in this study – especially for true positive events – was limited. A larger sample – comprising random events from normal driving as well – may be fundamental for a complete and more conclusive validation of the video processing algorithms. A philosophical concern may arise in this context when it comes to the nature of NDS data. Naturalistic data is data collected in the real world without interfering with the driver's daily routine. However, ethical concerns require the drivers to be aware of the collection. Such knowledge may change driver behaviour and may have influenced our analysis by preventing drivers from exhibiting their natural reactions. However, considering that the driver behaviour captured by NDS often includes instinctive actions such as poking a nose, it seems unlikely that drivers could control their spontaneous reactions in safety critical situations just because they are monitored.

The algorithms described in this paper may also be improved by including more sophisticated and established image processing techniques. For instance, the scale-invariant feature transform (Lowe, 2004) or fuzzy vector quantisation with linear discriminant analysis (Gkalelis et al., 2008) may be used to (1) better recognise image silhouettes from our best performing algorithm, (2) replace our algorithm for capturing oops reactions by looking at how specific body landmarks move across frames, and (3) integrate facial expression recognition (Ng and Cheung, 2004; Yeasin et al., 2004).

Making video processing algorithms dynamic and adaptive may also improve classification of safety critical events. In fact, the environment (e.g. lighting) and the driver (e.g. fatigue) may change during a trip or across trips. Such changes may require the thresholds in our algorithms to change as well. A moving window able to continuously compare the current frames with older frames may be used to adapt and dynamically adjust the algorithm threshold and possibly enable real time classification. However, the equation that such dynamic adaptation should follow is not obvious and would require extensive testing due to the extreme variety of lighting conditions in the real world.

The results presented in this paper also took into account the computational time required by each algorithm. The importance of this aspect may be diminished in the future if new technology becomes available. However, computation time may be an important factor to consider when choosing among algorithms with equal performance. In addition, NDS data sets are growing almost exponentially in projects such as SHRP2, thus future technologies will also need to face significantly larger data sets.
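The moving-window adaptation discussed above can be sketched as follows. This is an illustrative Python sketch only: the window length, the minimal baseline size, and the k-sigma decision rule are assumptions for the example, not parameters derived from this study.

```python
from collections import deque


class AdaptiveTrigger:
    """Sketch of a moving-window threshold: a per-frame motion score is
    flagged as a candidate driver reaction when it exceeds the mean of
    the recent window by k standard deviations. All parameter values
    are illustrative assumptions."""

    def __init__(self, window=50, k=3.0):
        self.scores = deque(maxlen=window)  # most recent per-frame scores
        self.k = k

    def update(self, score):
        """Return True if `score` is anomalous w.r.t. the recent window."""
        flagged = False
        if len(self.scores) >= 10:  # wait for a minimal baseline
            mean = sum(self.scores) / len(self.scores)
            var = sum((s - mean) ** 2 for s in self.scores) / len(self.scores)
            std = var ** 0.5
            flagged = score > mean + self.k * max(std, 1e-6)
        self.scores.append(score)
        return flagged
```

Because the threshold is recomputed from the recent window at every frame, the trigger drifts with gradual changes in lighting or driver posture while still reacting to sudden spikes, which is the behaviour the moving-window idea aims for.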

The long-term vision of this study is that information captured on video alone will be used to identify safety critical events. Although the current algorithms are still far from this vision, there is a great potential for improvement. For instance, video information from the driver’s feet, which is actually available in the euroFOT project, may also be combined with the video of the driver for easier recognition of safety critical events. Nevertheless, integrating vehicle kinematics in order to eliminate events potentially occurring when the vehicle is stationary appears to be the next obvious step for improving the algorithms. In addition, driver controls may also be considered to improve classification performance, for instance to recognise swerving, harsh braking, or similar manoeuvres.
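The kinematic integration suggested above could be as simple as discarding video-triggered events whose timestamps coincide with near-zero vehicle speed. The sketch below assumes a hypothetical event list and speed log; it does not reflect the euroFOT data format.

```python
def filter_stationary(events, speed_log, min_speed_kmh=5.0):
    """Sketch: discard video-triggered events recorded while the vehicle
    was (nearly) stationary.

    `events` is a list of event timestamps (s) and `speed_log` maps a
    timestamp to vehicle speed in km/h; both structures and the 5 km/h
    cut-off are illustrative assumptions."""
    return [t for t in events if speed_log.get(t, 0.0) >= min_speed_kmh]
```

Events with no matching speed sample default to 0 km/h and are discarded, which errs on the side of removing ambiguous triggers rather than passing them to manual review.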


5. Conclusions

NDS are increasingly popular and promise revolutionary insights into accident causation and driver behaviour. Over the years, video data has proven to be essential for NDS and enabled most of the safety analyses. The amount of video data collected from NDS is growing exponentially, creating new demands for automatic processing. Nowadays, video data from NDS needs to be viewed by human eyes before it can serve safety analyses. This paper shows that automatic processing of video is feasible, desirable, and possibly necessary for the analyses of on-going NDS.

The results presented in this paper exemplify how automatic processing of video data can increase correct classification of safety critical events in NDS. More specifically, sudden movements of the driver’s torso are proposed as the key to automatically recognise safety critical events from the video data. In this respect, this paper offers a new objective definition of safety critical events which is not anchored on subjective definitions as in previous studies (Dingus et al., 2006a; Olson et al., 2009).
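As an illustration of the torso-movement idea, a minimal frame-differencing sketch is given below. It is a hypothetical simplification for exposition, not the algorithm evaluated in this paper; the threshold value and the flat-list frame representation are assumptions.

```python
def motion_score(prev_frame, frame):
    """Mean absolute pixel difference between two consecutive grey-scale
    frames (given as flat lists of 0-255 intensities). Sudden torso
    movements produce large scores; a still driver produces scores
    near zero."""
    n = len(frame)
    return sum(abs(a - b) for a, b in zip(prev_frame, frame)) / n


def detect_reaction(frames, threshold=20.0):
    """Return indices of frames whose motion score exceeds `threshold`.
    The threshold is an illustrative assumption."""
    return [i for i in range(1, len(frames))
            if motion_score(frames[i - 1], frames[i]) > threshold]
```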

The procedure presented in this paper may also decrease the cost of analyses by limiting the need for manual review of incorrectly classified safety critical events. Furthermore, if video collection continues to grow at the current speed, manual review will become unsustainable unless procedures such as the one proposed in this paper can be used to optimise, or substitute, the human labour.

One of the novelties of this paper is its effort to use automatic video processing to extract information from naturalistic video data. However, in other areas such as medicine (Schreiber and Haslwanter, 2004), military (Boult et al., 2001), sport (Li et al., 2010), and surveillance (Stringa and Regazzoni, 2000), algorithms able to extract movement from videos are already applied despite their known limitations in terms of robustness and tolerance to error (Geer, 2004). Future research on naturalistic video processing should investigate the extent to which the algorithms already developed in these areas could help analyse naturalistic video to extract information about the environment surrounding the ego-vehicle; for instance by providing a video based estimate of (1) traffic density; (2) number, type, distance and kinematics of road users in the vicinity; (3) geometry and nature of the surrounding infrastructures, e.g., traffic signs and road obstructions; and (4) visibility level.

In the last few years, new facial recognition algorithms have been developed and used for different applications; for instance, Jabon et al. used facial expressions to identify human errors (Jabon et al., 2011a) and unsafe driver behaviour (Jabon et al., 2011b) in a simulator. Despite the increased complexity of real world data compared to a controlled environment such as a simulator, future applications of facial recognition algorithms to naturalistic video would greatly support analysis by giving insight into the driver state, activity, and possible impairment.

This paper also elucidates (1) the main requirements for video preprocessing, and (2) the main potential confounding situations to discard when trying to objectively recognise safety critical events from naturalistic video data. In fact, no matter which advanced algorithm will be developed in the future, the first, elementary – but still crucial – step is to “clean” the video data so that it can be used for automatic video processing. This cleaning procedure can be automatic and includes at least the following three steps: (1) quality checks (e.g. discarding blurred or black video sequences), (2) filtering (e.g. making lighting consistent across frames), and (3) cropping (e.g. selecting the part of the frame representing the driver).
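The three cleaning steps above could be prototyped as follows for a single grey-scale frame. The quality thresholds, the brightness normalisation, and the crop box are placeholder assumptions, not values used in this study.

```python
def clean_frame(frame, target_mean=128.0, crop=None):
    """Sketch of the three cleaning steps for one grey-scale frame,
    given as a list of rows of 0-255 intensities.

    Returns the cleaned frame, or None if the frame fails the quality
    check. `crop` is an optional (top, bottom, left, right) box."""
    pixels = [p for row in frame for p in row]
    # Step 1: quality check - reject black or washed-out frames.
    mean = sum(pixels) / len(pixels)
    if mean < 5 or mean > 250:
        return None
    # Step 2: filtering - shift brightness towards a common mean so
    # lighting is roughly consistent across frames.
    offset = target_mean - mean
    frame = [[min(255, max(0, p + offset)) for p in row] for row in frame]
    # Step 3: cropping - keep only the region showing the driver.
    if crop is not None:
        top, bottom, left, right = crop
        frame = [row[left:right] for row in frame[top:bottom]]
    return frame
```

A real pipeline would of course also need a blur check (e.g. on edge energy) and a driver-region crop calibrated per vehicle, but the three-stage structure is the same.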

In conclusion, this paper presents a first attempt to define a safety critical event objectively by introducing automatic processing of video to the analysis of naturalistic driving data. Automatic video processing is not only beneficial; it may be indispensable considering the increasing size of naturalistic video data. Furthermore, when successful, video processing outperforms the current manual coding and may provide new ways to cluster and enrich (Guha and Ward, 2012) naturalistic data sets, making possible new analyses which are currently untenable.

Acknowledgements

This study was supported by the Swedish National Strategic Funds for Transportation and the European project euroFOT. The authors would also like to acknowledge the annotation work from Anna Johansson, Joachim Persson Rukin, and Christopher Skaloud. This work was carried out at SAFER Vehicle and Traffic Safety Centre at Chalmers University, Gothenburg, Sweden.

References

Batelle, 2007. Final report: Evaluation of the Volvo intelligent initiative field operational test.

Bishop, R., 2005. Intelligent Vehicle Technology and Trends. Artech House, Norwood, MA, USA, Chapters 6–8.

Blanco, M., Biever, W.J., Gallagher, J.P., Dingus, T.A., 2006. The impact of secondary task cognitive processing demand on driving performance. Accident Analysis and Prevention 38 (5), 895–906.

Boult, T.E., Micheals, R.J., Gao, X., Eckmann, M., 2001. Into the woods: visual surveillance of noncooperative and camouflaged targets in complex outdoor settings. Proceedings of the IEEE 89 (10), 1382–1402.

Dingus, T.A., Klauer, S.G., Neale, V.L., Petersen, A., Lee, S.E., Sudweeks, J., Perez, M.A., Hankey, J., Ramsey, D., Gupta, S., Bucher, C., Doerzaph, Z.R., Jermerland, J., Knipling, R.R., 2006a. The 100-car naturalistic driving study – phase II: results of the 100-car field experiment. Technical Report DOT HS 810 593.

Dingus, T.A., Neale, V.L., Klauer, S.G., Petersen, A.D., Carroll, R.J., 2006b. The development of a naturalistic data collection system to perform critical incident analysis: an investigation of safety and fatigue issues in long-haul trucking. Accident Analysis and Prevention 38 (6), 1127–1136.

Dozza, M., 2012. What factors influence drivers’ response time for evasive maneuvers in real traffic? Accident Analysis and Prevention.

Faber, F., Jonkers, E., Ljung Aust, M., Benmimoun, M., Regan, M., Jamson, S., Dobberstain, J., 2012. euroFOT D6.2 – Analysis Methods.

Geer, D., 2004. Will gesture-recognition technology point the way? Computer 37 (10), 20–23.

Gkalelis, N., Tefas, A., Pitas, I., 2008. Combining fuzzy vector quantization with linear discriminant analysis for continuous human movement recognition. IEEE Transactions on Circuits and Systems for Video Technology 18 (11), 1511–1521.

Guha, T., Ward, R.K., 2012. Learning sparse representations for human action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 34 (8), 1576–1588.

Guo, F., Klauer, S.G., McGill, M.T., Dingus, T.A., 2010. Evaluating the relationship between near-crashes and crashes: can near-crashes serve as a surrogate safety metric for crashes? Technical Report.

Hall-Beyer, M., 2007. The GLCM tutorial home page (http://www.fp.ucalgary.ca/mhallbey/tutorial.htm) (accessed 18.09.12).

Hancock, P.A., Lesch, M., Simmons, L., 2003. The distraction effects of phone use during a crucial driving maneuver. Accident Analysis and Prevention 35 (4), 501–514.

Hanley, J.A., McNeil, B.J., 1983. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148 (3), 839–843.

Hanowski, R.J., Hickman, J.S., Wierwille, W.W., Keisler, A., 2007. A descriptive analysis of light vehicle-heavy vehicle interactions using in situ driving data. Accident Analysis and Prevention 39 (1), 169–179.


Hault-Dubrulle, A., Robache, F., Drazetic, P., Guillemot, N., Morvan, N., 2011. Determination of pre-impact occupant postures and analysis of consequences on injury outcome – part II: biomechanical study. Accident Analysis and Prevention 43 (1), 75–81.

Horn, B.K.P., Schunck, B.G., 1981. Determining optical flow. Artificial Intelligence 17 (1–3), 185–203.

Jabon, M.E., Ahn, S.J., Bailenson, J.N., 2011a. Automatically analyzing facial-feature movements to identify human errors. IEEE Intelligent Systems 26 (2), 54–63.

Jabon, M.E., Bailenson, J.N., Pontikakis, E., Takayama, L., Nass, C., 2011b. Facial-expression analysis for predicting unsafe driving behavior. IEEE Pervasive Computing 10 (4), 84–95.

Jonah, B.A., Thiessen, R., Au-Yeung, E., 2001. Sensation seeking, risky driving and behavioral adaptation. Accident Analysis and Prevention 33 (5), 679–684.

Kobayashi, Y., 2007. The emosign – analyzing the emotion signature in human motion. In: IEEE International Conference on Systems, Man and Cybernetics, Vols 1–8, pp. 2096–2101.

Lai, T., Chung-Ming, H., 2010. Forwards: a map-free intersection collision-warning system for all road patterns. IEEE Transactions on Vehicular Technology 59 (7), 3233–3248.

Lee, S.E., Simons-Morton, B.G., Klauer, S.E., Ouimet, M.C., Dingus, T.A., 2011. Naturalistic assessment of novice teenage crash experience. Accident Analysis and Prevention 43 (4), 1472–1479.

Li, H.J., Tang, J.H., Wu, S., Zhang, Y.D., Lin, S.X., 2010. Automatic detection and analysis of player action in moving background sports video sequences. IEEE Transactions on Circuits and Systems for Video Technology 20 (3), 351–364.

Lowe, D.G., 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60 (2), 91–110.

Malta, L., Ljung Aust, M., Freek, F., Metz, B., Saint Pierre, G., Benmimoun, M., Schäfer, R., 2012. euroFOT D6.4 – Final results: impacts on traffic safety.

Molinero, A., Evdorides, H., Naing, C., Kirk, A., Tecl, J., Barrios, J.M., Simon, M.C., Phan, V., Hermitte, T., 2009. Accident causation and pre-accidental driving situations. Part 2. In-depth accident causation analysis. TRACE, Project No. 027763.

Ng, J., Cheung, H., 2004. Dynamic local feature analysis for face recognition. Biometric Authentication, Proceedings 3072, 234–240.

Olson, R.L., Hanowski, R.J., Hickman, J.S., Bocanegra, J., 2009. Driver distraction in commercial vehicle operations. Technical Report FMCSA-RRR-09-042 (Final Report).

Sayer, J., Leblanc, D., Bogard, S., Funkhouser, D., Bao, S., Buonarosa, M.L., Blankespoor, A., 2011. Integrated vehicle-based safety systems field operational test. Technical Report DOT HS 811 482 (Final Report).

Schoch, S., Guidotti, L., Csepinszky, A., Metz, B., Tadei, R., Tesauri, F., Burzio, G., Schwertberger, W., Val, C., Krishnakumar, R., Selpi, Johansson, E., Gustafsson, D., Obojski, M.-A., Grundler, W., Kadlubek, S., Kessler, C., Hagleitner, W., 2012. Deliverable D5.3 – Final delivery of data and answers to questionnaires.

Schreiber, K., Haslwanter, T., 2004. Improving calibration of 3D video oculography systems. IEEE Transactions on Biomedical Engineering 51 (4), 676–679.

Stringa, E., Regazzoni, C.S., 2000. Real-time video-shot detection for scene surveillance applications. IEEE Transactions on Image Processing 9 (1), 69–79.

Sun, D.Q., Roth, S., Black, M.J., 2010. Secrets of optical flow estimation and their principles. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2432–2439.

Swets, J.A., 1979. ROC analysis applied to the evaluation of medical imaging techniques. Investigative Radiology 14 (2), 109–121.

Victor, T., Bärgman, J., Hjälmdahl, M., Kircher, K., Svanberg, E., Hurtig, S., Gellerman, H., Moeschlin, F., 2010. Sweden-Michigan naturalistic field operational test (SeMiFOT) phase 1: final report. SAFER Vehicle and Traffic Safety Centre at Chalmers.

Walsh, S.P., White, K.M., Hyde, M.K., Watson, B., 2008. Dialling and driving: factors influencing intentions to use a mobile phone while driving. Accident Analysis and Prevention 40 (6), 1893–1900.

Wierwille, W.W., Ellsworth, L.A., 1994. Evaluation of driver drowsiness by trained raters. Accident Analysis and Prevention 26 (5), 571–581.

World Health Organization, 2009. Global Status Report for Road Safety. WHO Press.

Yeasin, M., Bullot, B., Sharma, R., 2004. From facial expression to level of interest: a spatio-temporal approach. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2, 922–927.