Real-Time Physiological Signal Acquisition and Analysis for
the Development of a Wearable Driver Assistance System
THESIS
Submitted in partial fulfillment
of the requirements for the degree of
DOCTOR OF PHILOSOPHY
by
RAJIV RANJAN SINGH
Under the Supervision of
Prof. Rahul Banerjee
BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE
PILANI (RAJASTHAN) INDIA
2014
CERTIFICATE
This is to certify that the thesis entitled "Real-Time Physiological Signal Acquisition and
Analysis for the Development of a Wearable Driver Assistance System" and submitted by
RAJIV RANJAN SINGH ID No. 2001PHXF419P for award of Ph.D. of the Institute
embodies original work done by him under my supervision.
Signature in full of the Supervisor: ---------------------
Name in capital block letters: RAHUL BANERJEE
Designation: Professor, Computer Science & I.S.
Date: January 14, 2014
ACKNOWLEDGMENTS
First and foremost, I would like to thank my supervisor, Professor Rahul Banerjee. Without
his invaluable support and guidance, my thesis work would not have been possible. I am very
grateful for his patience, motivation, enthusiasm, and immense knowledge that, taken
together, make him a phenomenal advisor.
I would like to thank Prof. V. N. Waliwadekar who inspired me in choosing research and
teaching as my career options. I had been fortunate to receive guidance from Prof. L. K.
Maheshwari and Prof. B. R. Natarajan. I also acknowledge the kind support from our
Director Prof. G. Raghurama, Deputy Director (Research) Prof. R. N. Saha, Deputy Director
(Off-Campus) Prof. G. Sundar, Dean (Academic and Resource Planning) Prof Sundar S.
Balasubramaniam and Dean (Academic Research and Development) Prof. S. K. Verma. I
would also like to extend my appreciation to my doctoral advisory committee (DAC)
members, Prof. S. Gurunarayanan and Prof. Surekha Bhanot for their continued support and
useful advice. My thanks also go to Prof. Anu Gupta (HOD, Department of Electrical and
Electronics Engineering), Prof. J. P. Misra, Prof. Sudeept Mohan, Prof. V. K. Chaubey and
Dr. Navneet Gupta.
I would like to thank present and past members of Embedded Controller and Application
Centre (ECAC) lab and Centre for Software Development (now SDET Unit) at BITS Pilani
for their friendship and the constructive discussions we had. Special thanks go to Mr. Sailesh
Conjeti, working with whom was a sheer pleasure. I also wish to acknowledge the help of
Vamsidhar, Jitin, Shrikumar, Prasanth and Partheesh who assisted me in the course of data
collection.
I would like to thank all my wonderful friends in Pilani for the great and joyous moments we
shared together. It would be a long list to mention all of the friends that I am indebted to but
Dr. B. K. Rout is a special mention. I gratefully thank each one of them.
I wish to thank members of my family, especially my grandfather Late Babu Ram Bachan
Singh and my father Late Shri Harihar Prasad Singh, who would have been happy to see me
completing my doctoral work. It is their blessings that made this happen.
My mother, Mrs. Ram Sawari Devi, who always struggled to educate all my siblings and me,
has been a major source of inspiration in this journey of mine.
My deepest gratitude goes to my extended family: my sisters Sandhya and Nisha, my brother
Sanjeev, my brothers-in-law Mr. Sanjay Kumar Singh, Mr. Pradeep Kumar Singh and Mr. Alok.
I also had the blessings of Mr. Yamuna Prasad Singh, Mrs. Rani Singh, Mr. Baikunth Singh
and Mr. Neelkanth Singh which I gratefully acknowledge.
I have no words to thank enough my wife, Mrs. Reena Singh, who always stood beside me
like a rock whenever I had a difficult time. She even sacrificed her career to help me grow.
My son Partheesh never complained about my not being able to spend enough time with him.
My little daughter Alankrita, whose smile makes my day, proved to be an angel by just being
there as only she could.
Rajiv Ranjan Singh
ABSTRACT
The work presented here is part of a long-term research project that aims to create a
wearable driver assistance system (WDAS) that could help prevent the loss of lives and the
fatal injuries caused by road accidents. In particular, this work focuses on the real-time
acquisition and analysis of physiological signals sensed non-invasively from automotive
drivers by means of body-mounted sensors. Real-time data acquired in this way could be used
for timely detection of physiological states of the driver that may otherwise lead to
unsafe driving.
The methodology included identification of an exhaustive set of features or attributes
which, as per the literature and collectable primary data, could lead to determination of the
most significant parameters that meaningfully and credibly indicate the affective state of a
driver. Using a set of shortlisted parameters such as Heart Rate (HR), Heart Rate Variability
(HRV), Skin Conductance (SC) level, blood oxygen saturation, also known as Saturation of
peripheral Oxygen (SpO2), and respiration rate, a set of real-time data collection experiments
was designed to provide the primary data for this research. Accordingly, for over a year,
several experiments were conducted on different drivers in pre-driving, in-driving and
post-driving states, with appropriate sensors mounted on their bodies with their consent. In
the next stage, the data collected in this manner was cleaned, duly formatted and thereafter
subjected to appropriate methods of analysis. The entire process resulted not only in the
extraction of appropriate feature sets but also in the identification of a very small subset of
parameters, the real-time sensing of which would allow creation of an architectural
framework that would pave the way for actually building a cost-effective and robust wearable
computing system for vehicular drivers.
In this context, a driver-profile analysis based on the Cox Proportional Hazard model firmly
established that the 'Current Physiological State (CPS)' was the most important predictor with
highest hazard ratio. In the next phase, driver's affective state detection was performed by
modeling the given problem as a multiclass problem. This analysis was performed with (i) a
3-Class model with seven different neural network configurations and (ii) a 4-Class model
with six different neural network configurations. Subsequently, a multi-stage verification was
performed by employing multi-turn driver data apart from single-turn data, which took care
of the intra- as well as inter-subject variability aspects. Additionally, the effects of stressful
events and incidents on driver's stress-level have been comprehensively analyzed using
stress-trends detection approaches with the help of Trigg's Tracking Variable (TTV) methods.
Finally, the thesis proposes a fine-grained, bio-inspired ubiquitous computing architecture
around a wearable driver assistance system.
While driverless cars are clearly nearing their entry into mainstream driving, particularly
in countries where their high costs may not matter much, in the rest of the world it may
take a while before they completely take over. As a consequence, the work presented here
remains relevant not only now but possibly for the foreseeable future as well.
TABLE OF CONTENTS
------------------------------------------------------------------------------------------------
ACKNOWLEDGEMENT i
ABSTRACT iii
TABLE OF CONTENTS v
LIST OF SYMBOLS x
LIST OF ABBREVIATIONS xi
LIST OF TABLES xiv
LIST OF FIGURES xvi
CHAPTER 1: INTRODUCTION 1
1.1. A Little Insight 1
1.2. Background 2
1.3. Significance of Wearable Computing Approach 7
1.4. Problem Statement and Scope of the targeted research 7
1.5. About the Organization of the Rest of the Thesis 8
CHAPTER 2: LITERATURE REVIEW 9
2.1. Principal Problems and Candidate Solutions 9
2.2. Focus of the Work 10
2.3. The Advanced Driver Assistance System (ADAS) 11
2.3.1 ADAS Definition 11
2.3.2 ADAS Classifications 12
2.3.3 ADAS Functions and Enabling Technologies 15
2.4 The Current State of the Art: Driver's Inattention, Fatigue and Stress Monitoring 20
2.4.1 Computer Vision based Driver's Inattention Detection Techniques 23
2.4.2 Physiological Sensors based Stress Level and Fatigue Monitoring Techniques 24
2.4.3 Hybrid Techniques for Stress Level and Fatigue Monitoring 25
2.5 Wearable Driver Assistance Systems: A Need Analysis 27
2.6 Wearable Sensing Parameters and their Effects on Autonomic Nervous System (ANS) 29
2.6.1 Human Physiology, The Nervous System and Stress 29
2.6.2 Heart Rate (HR) and Heart Rate Variability (HRV) 31
2.6.2.1 Electrocardiography (ECG) 32
2.6.2.2 Photoplethysmography (PPG) 33
2.6.2.3 HRV Measurement Techniques 34
2.6.3 Blood Pressure (BP) 35
2.6.4 Galvanic Skin Response (GSR) 36
2.6.5 Respiration 37
2.6.6 Electromyography (EMG) 38
2.6.7 Electroencephalography (EEG) 38
2.6.8 Blood Oxygen Saturation (SpO2) 40
2.7 Models for Stress-Level Analysis 41
2.7.1 Fisher Projection and Linear Discriminant Analysis (LDA) 41
2.7.2 Support Vector Machines (SVM) 42
2.7.3 Bayesian Networks (BNs) 43
2.7.4. Artificial Neural Networks (ANNs) 43
2.7.5 Neuro-Fuzzy Systems 44
2.8 Enabling Technologies for WDAS Design 45
2.8.1 Wearable Biosensors and Sensing Parameters for Driver Stress Monitoring 45
2.8.2 Processing Requirements and Elements 47
2.8.3 Communication Elements 47
2.8.4 Storage Devices 48
2.8.5 Power Provisioning 49
2.8.6 Alarm and Warning Actuators 50
2.8.7 Wearable Fabrics 50
2.8.8 Application and System Software 50
2.9 Impact of the Literature Review on Identification of Next Steps 51
CHAPTER 3: PHYSIOLOGICAL SIGNAL: DATA COLLECTION AND PROCESSING 52
3.1 Steps Involved and their Significance 53
3.2 Sensor Selection 53
3.3 Sensors Employed for Data Collection 55
3.3.1 Galvanic Skin Response (GSR) Sensor 56
3.3.2 Pulse Oximetry Sensor 56
3.3.3 Respiration Sensor 57
3.4 Data Collection: Requirements and Processes 57
3.4.1 Data Collection Protocol 58
3.4.2 Data Acquisition Scenarios 60
3.5 Processing the Data acquired from Real-Time Signals 65
3.5.1 Data Analysis Strategies and Mechanisms 66
3.5.2 Manual Observation 66
3.5.3 Preliminary Statistical Analysis 67
3.5.4 Challenges faced in Signal Preprocessing 69
3.5.5 Approach for Physiological Signal Processing 70
3.5.5.1 Normalization and Spike Removal 70
3.5.5.2 Galvanic Skin Response Signal Processing 70
3.5.5.2.1 Signal Decomposition 71
3.5.5.2.2 Peak and Point of Onset Detection 71
3.5.5.3 Photoplethysmography Signal Processing 72
3.5.5.3.1 Motion Artifact Removal 72
3.5.5.3.2 Instantaneous Heart Rate Extraction 72
3.6 Extracting Features from Physiological Signals 73
3.6.1 Methods of Feature Extraction 73
3.6.2 Statistical Features 73
3.6.3 Galvanic Skin Response (GSR) Syntactic Features 74
3.6.4. Photoplethysmogram (PPG) Syntactic Features 76
3.6.5 Heart Rate Variability (HRV) Features Derived from PPG 78
3.6.5.1 HRV Spectral Features using Lomb Periodogram 78
3.6.5.2 HRV Statistical Features 79
3.7 Statistical Significance of Extracted Features 80
3.8 Feature Selection 82
3.8.1 Shape-based Feature Selection 82
3.8.2 Hybrid Approach: Filter and Wrapper based 84
3.9 Conclusions 87
CHAPTER 4: DRIVER-PROFILE ANALYSIS 88
4.1 Profiling and its Significance 88
4.2 Requirement for Profiling 90
4.3 The COX Proportional Hazard (PH) Model 92
4.4 Predictors for Unified Cox PH Driver Stress Model 92
4.5 Results: COX PHM based Driver-Profile Analysis 96
4.6 Conclusions 99
CHAPTER 5: BIOSIGNAL-ASSISTED STRESS ANALYSIS 100
5.1 Affective State Detection using ANN Classifiers 100
5.1.1 Classification Approaches 104
5.1.2 Performance Measures for Classifier Evaluation 105
5.1.3 Employing Unsupervised Learning for Affective State Monitoring 108
5.1.4 Employing Supervised Learning for Affective State Monitoring 110
5.1.5 Evaluation of Neural Network Architectures 113
5.1.6. Results of 3-Class Affective State Classification 119
5.1.6.1 Training and Learning Function Evaluation 127
5.1.6.2 Identification of an Optimum Classifier for a 3-Class Affective State Detection 128
5.1.7. Methodology adopted for 4-Class Affective State Classification 133
5.1.7.1 Methodology for Single-Turn Affective State Analysis 135
5.1.7.2 Results: Single-Turn Affective State Analysis 136
5.1.7.3 Methodology for Multi-Turn Affective State Analysis 144
5.1.7.4 Results: Multi-Turn Affective State Analysis 144
5.2 Real-Time Trend Analysis and Detection Methods 146
5.2.1 Need for an online approach and the proposed novelty 146
5.2.2 The Trigg's Statistical Approach 147
5.2.3 The Shape Based Feature Weight Allocation 148
5.2.3.1 Classification of Trend Shapes 148
5.2.3.2 Feature Weight Allocation 149
5.2.3.3 Trigg's Tracking Variable (TTV) Calculation 149
5.2.3.4 Segment Weight Calculation for TTV Analysis 150
5.2.3.5 Optimal Threshold Selection using the Desirability Function Approach 150
5.2.3.6 Results of Segment Weight based Stress-Trend Detection 152
5.2.4 Neural Network based Regression Model for Stress-Trend Detection 153
5.2.4.1 Result of Neural Network based Regression Model for Stress-Trend Detection 154
5.3 Conclusions 156
CHAPTER 6: A PROPOSED ARCHITECTURE FOR THE RESULTANT WEARABLE DRIVER ASSISTANCE SYSTEM 157
6.1 About the Architecture of the Overall Envisioned Ubiquitous Computing Environment 157
6.2 Identification of Constituent Elements of the Resultant System 159
6.3 The Proposed System Design 164
6.4 Possible Implementation Approaches 167
CHAPTER 7: CONCLUSION 169
7.1 Principal Contributions of the Thesis 169
7.2 Limitations of the Work Done 170
7.3 A Comparison with Relevant Contemporary Works 171
7.4 Future Scope 173
REFERENCES 174
LIST OF PUBLICATIONS AND PRESENTATIONS 186
APPENDICES 188
BRIEF BIOGRAPHY OF THE CANDIDATE 193
BRIEF BIOGRAPHY OF THE SUPERVISOR 194
LIST OF SYMBOLS
β Regression Coefficient
r(t) Risk Factor
xm Input Signal (m-th)
wk Synaptic weight of neuron 'k'
uk Linear combiner output due to the input signals
bk Bias or offset
φ(∙) Activation function or squashing function or transfer function
yk Output signal of the Neuron
ʋk Net Input
hardlim Hard Limit Function
purelin Linear Function
logsig Log-Sigmoid Function
tansig Tan-Sigmoid Function
a Slope Parameter
tp True Positive
tn True Negative
fp False Positive
fn False Negative
P(a) Relative observed agreement among the classes
P(e) Probability that agreement is due to chance
di Individual Desirability
Individual Response
ttv Trigg's Tracking Variable
Dt Array of Features
α Smoothing Constant
mad Mean Absolute Deviation
ut Predicted Value
et Error in Prediction
St Smoothened Error
LIST OF ABBREVIATIONS
ABS Anti-lock Braking System
ACC Adaptive Cruise Control
ADAS Advanced Driver Assistance Systems
ANFIS Adaptive Neuro-Fuzzy Inference System
ANN Artificial Neural Networks
ANS Autonomic Nervous System
ASIC Application Specific Integrated Circuit
ASR Automatic Speech Recognition
ATIS Advanced Traveler Information Systems
AVCS Advanced Vehicle Control Systems
AVNN Average NN Interval
BA Brake Assist Systems
BP Blood Pressure
BN Bayesian Network
BSW Blind Spot Warning
CPU Central Processing Unit
CNS Central Nervous System
CSLI Curve and Speed Limit Information
DIL Driver's Inattentiveness Level
DSM Driver Status Monitoring System
DVS Dynamic Voltage Scaling
ECG Electrocardiography
EDA Electrodermal Activity
EEG Electroencephalography
EFuNN Evolving Fuzzy Neural Network
EMG Electromyography
EOG Electrooculography
ETSC European Transport Safety Council
EU European Union
FCW Forward Collision Warning
FIS Fuzzy Inference System
FMT Fiber Meshed Transducers
FP False Positive
FPGA Field Programmable Gate Arrays
FN False Negative
GA Genetic Algorithms
GPS Global Positioning System
GSM Global System for Mobile Communications
GSR Galvanic Skin Response
HCI Human Computer Interaction
HFP High Frequency Power
HR Heart Rate
HRV Heart Rate Variability
IRTE Institute of Road Traffic Education
ITS Intelligent Transportation Systems
IVIS In-vehicle information systems
LCA Lane Change Assistant
LDA Linear Discriminant Analysis
LDW Lane Departure Warning
MEMS Micro Electromechanical Systems
PGA Parallel Genetic Algorithms
PSD Power Spectral Density
LF/HF Ratio Low Freq. / High Freq. Ratio
LFP Low Frequency Power
LHW Local Hazard Warning
LKA Lane Keeping Assistant
NHTSA National Highway Traffic Safety Administration
NMVCCS National Motor Vehicle Crash Causation Survey
NN Normal-to-Normal Interval
NV Night Vision
OCA Obstacle and Collision Avoidance
PAS Parking Assist System
PERCLOS Percentage Eyelid Closure
pNN50 Ratio of NN50 and the total number of NN Intervals
pNN20 Ratio of NN20 and the total number of NN Intervals
PNS Peripheral Nervous System
PPG Photoplethysmography
PSNS Parasympathetic Nervous System
RAM Random Access Memory
RCW Rear-end Collision Warning
rMSSD Square Root of the Mean of the Sum of the Squares of differences between adjacent NN Intervals
ROM Read Only Memory
ROR Run-off-road
RSP Respiration
TN True Negative
TP True Positive
TPW Tyre Pressure Warning
SDNN Standard Deviation of NN Interval
SH Smart Headlamps
SNS Sympathetic Nervous System
SoNS Somatic Nervous System
SpO2 Blood Oxygen Saturation
SVM Support Vector Machines
UK United Kingdom
USA United States of America
VLFP Very Low Frequency Power
VLSI Very Large Scale Integration Technologies
WDAS Wearable driver assistance system
WHO World Health Organization
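
As an illustration of how the time-domain HRV abbreviations listed above (AVNN, SDNN, rMSSD, pNN50, pNN20) relate to a series of NN intervals, a minimal Python sketch follows. It is not taken from the thesis: the function name `hrv_time_domain`, the millisecond units, the use of the sample (n - 1) standard deviation for SDNN, and the example data are all assumptions made for illustration. NN50 and NN20 are taken here as the counts of adjacent NN-interval differences exceeding 50 ms and 20 ms, and the ratios follow the definitions above by dividing by the total number of NN intervals.

```python
import math


def hrv_time_domain(nn_ms):
    """Time-domain HRV statistics from NN intervals given in milliseconds.

    Hypothetical helper for illustration only; names and conventions
    are assumptions, not the thesis implementation.
    """
    n = len(nn_ms)
    if n < 2:
        raise ValueError("need at least two NN intervals")

    # AVNN: average NN interval
    avnn = sum(nn_ms) / n

    # SDNN: standard deviation of NN intervals (sample form, n - 1; assumed)
    sdnn = math.sqrt(sum((x - avnn) ** 2 for x in nn_ms) / (n - 1))

    # Differences between adjacent NN intervals
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]

    # rMSSD: square root of the mean of the squared adjacent differences
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # NN50 / NN20: adjacent differences larger than 50 ms / 20 ms
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    nn20 = sum(1 for d in diffs if abs(d) > 20)

    return {
        "AVNN": avnn,
        "SDNN": sdnn,
        "rMSSD": rmssd,
        # Ratios relative to the total number of NN intervals
        "pNN50": nn50 / n,
        "pNN20": nn20 / n,
    }


if __name__ == "__main__":
    # Made-up NN intervals (ms), purely for demonstration
    example = [800, 810, 790, 850, 845]
    for name, value in hrv_time_domain(example).items():
        print(f"{name}: {value:.3f}")
```

Some references divide NN50 by the number of adjacent differences rather than by the number of NN intervals; for long recordings the two conventions differ only marginally.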
LIST OF TABLES
Table No. Title Page No.
1.1. Fatalities by Road User Category 3
2.1 Classification based on Information Systems and Intervening Systems 13
2.2 Various ADAS Classification Reported in Literature 14
2.3 ADAS Functions and Technologies 17
2.4 Wearable Sensing and Data Acquisition Modules used for Driver Stress Analysis 28
2.5 List of Parameters of Measurement / Sensing alongwith Related Use 46
3.1 List of Physiological Signals, Sensors and Sensing Parameters of Measurements alongwith their Related Use 54
3.2 Data Collection Scenarios 62
3.3 Two Way ANOVA Analysis 68
3.4 Feature Extraction Methods 73
3.5 Statistical Features 74
3.6 Syntactic GSR Features 76
3.7 PPG Syntactic Features 77
3.8 HRV Statistical Features 79
3.9 Statistical Significance of Individual Features Extracted Using a 10-Second Time Window 81
3.10 Shape Based Feature Selection Method 83
3.11 Extracted Features and their Selection 86
4.1 Driver Profile Data Acquired Through Questionnaire and Experimenter's Observations 96
4.2 Description of Predictors for COX PHM 97
4.3 Results of COX Proportional Hazard Model 99
5.1 Confusion Matrix or Contingency Table of a Binary Classifier 106
5.2 Classifier Performance Measures for Multiclass Classifiers 107
5.3 KSOM Configuration and Architecture 109
5.4 Questionnaire and Observations 111
5.5 Stress-Level Assessment for Individual Scenarios for a 3-Class Model 111
5.6 Stress-Trend Markers and their Weights 112
5.7 Classifier Performance Parameter: Precision 120
5.8 Classifier Performance Parameter: Sensitivity 121
5.9 Classifier Performance Parameter: Specificity 122
5.10 Classifier Performance Parameter: gmean-1 123
5.11 Classifier Performance Parameter: gmean-2 124
5.12 Classifier Performance Parameter: f-measure 125
5.13 Comparative Results of Neural Network Classifier Evaluation 126
5.14 Classifier Performance Evaluation based on Unified Desirability Measure 127
5.15 Evaluation of the Performance of the Neural Network Learning and Training Algorithms 128
5.16 Affective State Detection using Layer Recurrent Network 129
5.17 ROC Analysis of the Drivers Affective State 130
5.18 Optimum Window-Size Selection for Single Turn Drives 137
5.19 Classifier Performance Measure for the 4-Class Classifier 138
5.20 Producer and User Accuracy of a Classifier 141
5.21 Individual Class Accuracies: Producer's and User's Accuracy 142
5.22 Individual Class Accuracies: Producer's and User's Accuracy of Two Classifiers 143
5.23 Multi-Turn Analysis considering Individual Averages 145
5.24 Algorithm Pseudocode for TTV calculation 148
5.25 Results of Segment Weight based Stress-Trend Detection 152
5.26 Optimum Classifier for Stress-Trend Detection 155
6.1 A Possible List of Medical Grade Microcontroller / System-On-Chip Families for WDAS Design 162
6.2 A Possible List of Communication Elements for WDAS Design 164
7.1 Comparative Analysis of Proposed Approach against Existing Approaches for Driver Stress Detection 172
LIST OF FIGURES
Figure No. Title Page No.
1.1 Comparison of Fatality Rate in USA and India between 1997 - 2007 4
2.1 ADAS Functional Levels and Drivers Behavioral Model 16
2.2 Human Physiology and Reflex-Control leading to Stress 32
2.3 A Typical ECG Waveform 33
2.4 Heart Rate Variability (HRV) Features 35
2.5 Typical placement location of Wearable Biosensors alongwith Sensing Module 46
3.1 Biosignal based Pattern Recognition: Functional Block Diagram 52
3.2 Experimental setup for sensing and computing of chosen parametric data using body-mounted sensors 56
3.3 Sensor Configuration for data collection under (a) Rest Scenarios (Pr-dr and Po-dr) and (b) Driving Scenario (Rx-dr, By-dr and Rt-dr). 60
3.4 Satellite route map of Relaxed Driving (Rx-dr) Scenario. 61
3.5 Satellite route map of Busy Driving (By-dr) Scenario 63
3.6 Satellite route map of Intracampus-return Driving (Rt-dr) Scenario 63
3.7 Timeline Chart 65
3.8 Clean Signals sampled during Pre-driving Scenario 69
3.9 Noisy Signals sampled during Drive with Motion Artifacts and Sensor Errors 69
3.10 Galvanic Skin Response (GSR) Syntactic Features 75
3.11 Galvanic Skin Response Syntactic Features during Busy Driving 75
3.12 PPG Syntactic Features extracted under Relaxed Driving 77
3.13 Lomb Periodogram of Instantaneous Heart Rate Time Series 79
3.14 Feature Selection Techniques Adopted 85
4.1 Survival Analysis Plot of Drivers 98
5.1 A Generic Nonlinear Neural Network Model 101
5.2 (a) A Hard Limit Function 102
5.2 (b) A purelin Function and a Piecewise-Linear Function 103
5.2 (c) A Log-Sigmoid Function 103
5.2 (d) A Signum Function and a Tan-Sigmoid Function 104
5.3 Classification Methods 105
5.4 (a) KSOM Weight Vectors 109
5.4 (b) Unified Distance Matrix 109
5.5 Driving Scenarios Route Map. 113
5.6 Single Layer Perceptron Neural Network Model 114
5.7 Multilayer Perceptron Neural Network Model 115
5.8 Cascade Forward Backpropagation Neural Network Model 115
5.9 Feed Forward Distributed Time-Delay Neural Network Model. 116
5.10 Elman Backpropagation Neural Network Model 117
5.11 Layer Recurrent Neural Network Model 117
5.12 Non-Linear Autoregressive with Exogenous Inputs Neural Network Model. 118
5.13 ROC Curves for Affective State Detection using Layer Recurrent Neural Networks 130
5.14(a) Boxplots of Neural Network Classifiers Performance: (a) Precision 131
5.14(b) Boxplots of Neural Network Classifiers Performance: (b) Sensitivity 131
5.14(c) Boxplots of Neural Network Classifiers Performance: (c) Specificity 132
5.15 Layer Recurrent Neural Network Architecture for Affective State Detection 132
5.16 Feed Forward Time Delay Neural Network Model 133
5.17 Single-Turn Analysis: Window Size Selection 137
5.18 Boxplots of Performance Measures for Single Turn Drives 139
5.19 Feature Shapes and Feature Weight Allocation 150
5.20 Optimum Threshold Identification using Desirability Function. 151
5.21 Stress-Trends Detected 153
5.22 Stress-Trend Analysis Data: MSE 155
6.1 Functional blocks of the Pervasive Computing Environment of the Vehicle and the Wearable Computer 158
6.2 Architectural Framework of the Proposed Wearable Driver Assistance System 165
6.3 Hardware Building Blocks of the Proposed Wearable Driver Assistance System 166
6.4 Affective State Detection: Complete Logical Flow 167
6.5 Intelligent Inference Engine: Logical Flow 168
Chapter 1
Introduction
Life is a precious gift, yet many lives are lost around the world due to road accidents. In
addition, many more people suffer injuries of varying severity that such accidents inflict
upon drivers and passengers. Such events not only cause trauma to those affected but also
affect the lives of their loved ones. Apart from their deep emotional and psychological
impact, severe road accidents can create financial difficulties for the person or family,
which only adds to the seriousness of the situation. Such accidents also have implications
for governments around the world, including damage to road infrastructure, litigation, and
loss of manpower and revenue. This work is part of a larger research project that aims to
build a system that could reduce such occurrences.
1.1 A Little Insight
Several factors contribute to road accidents, including vehicular factors, environmental
factors and human errors. Vehicular factors include the vehicle parameters that influence
driving, such as application of brakes and steering wheel maneuvers. Environmental factors
include road conditions, intersections, lane changing, and vehicles to the rear and / or
front. Human factors are centered around the vehicular drivers, who may be affected by
distractions, drowsiness, inattention, slow reflexes etc. In order to minimize accidents, we
must devise measures which take all these contributing factors into account by sensing,
processing the sensed data and taking the necessary corrective actions. With technological
advancement it has become possible to develop miniature sensing devices which may be used
to sense appropriate parametric data. A pervasive computing infrastructure created to meet
all these requirements would be of great help in avoiding accidents (Yang and Wang, 2007).
The technological advancements of the recent past have made it possible to miniaturize
devices to an extent that they can be worn unobtrusively by drivers. A pervasive computing
environment would be helpful in saving the lives of automotive drivers and their passengers.
Use of body-mounted non-invasive physiological sensors may help to a great extent in
identifying the impact of various kinds of physical and mental fatigue on drivers. Apart from
sensing, this requires a local processing unit, such as a wearable computer, which can not
only collect data but also process it locally to alert the driver in time. We collected
real-time data from such sensors mounted on a set of vehicular drivers and analyzed it using
various pattern recognition techniques. This part of the research forms the initial but
significant phase of the design and development of a specific class of wearable computing
systems that could prevent the loss of human lives due to common road accidents, under the
project BITS-LifeGuard (Banerjee, 2005). This wearable computer shall have wireless
communication capability and the capability to continually monitor the relevant critical
data¹, and would alert the driver in time so as to enable him / her to take the necessary
action. This requires timely alert generation and consequent feedback, as well as some
post-processing of volumes of data for adaptive use and improved efficiency of the system.
It is this context that the rest of the discussion in this chapter builds upon.
1.2 Background
Every year, more than 1.2 million people die in road accidents worldwide, and
approximately 50 million are injured (WHO, 2009). Developing countries, comprising the
low-income and middle-income groups, have been reported to witness the highest share of
fatal accidents, accounting for over 90% of all road accident deaths reported around the
world. Nearly half of the deaths occurred in the Asia-Pacific region alone. According to the
World Health Organization's (WHO) 2009 report, an alarming 105,725 fatalities and over 2
million disabilities resulted from road accidents in India alone in the year 2007. On
average, approximately 1,275,000 persons are grievously injured on Indian roads every year,
and almost 10% of the world's total road accident fatalities occur here². Although 88
countries, with an overall population of 1.6 billion, were able to reduce the deaths on
their roads between 2007 and 2010, another 88 countries saw an increase in road traffic
deaths (WHO, 2013). The WHO (2013) report highlighted that middle-income countries have
shown the highest road traffic fatality rates, particularly in the African Region. About
1.24 million deaths still occur annually (WHO, 2013). A comparative analysis of fatalities
by road user category reported in some countries, shown in Table 1.1, reveals that
4-wheeler drivers along with their passengers, and pedestrians, have been the most
vulnerable population, particularly during the years 2006-2007 (WHO, 2009) and 2009-2010
(WHO, 2013).
More than 42,000 people were killed every year on European Union (EU) roads and
about 20% of the road transport crashes were attributed to the driver's fatigue (ETSC, 2001).
1 obtained through a variety of input mechanisms including sensors
2 Institute of Road Traffic Education (IRTE). (India). Citing Internet sources. URL http://www.irte.com.
In March 2000, the UK Government’s Department for the Environment, Transport and the
Regions set a target of achieving a 40% reduction in the numbers of people killed or seriously
injured in road accidents by 2010 (RoSPA, 2001).
Table 1.1: Fatalities by Road User Category (Source: WHO, 2013 & 2009 Report)

| Country | Year | Drivers / Occupants† (4-wheeled cars and/or light vehicles) | Pedestrians | Passengers (4-wheeled cars and/or light vehicles) | Riders, motorized (2- or 3-wheelers) | Cyclists | Others* (includes data for drivers/passengers of heavy trucks [1] / buses [2]; unspecified [3]) |
|---|---|---|---|---|---|---|---|
| Korea | 2007 | 26% | 37% | 11% | 21% | 5% | N/A |
| Korea | 2010 | 25%† | 38% | N/A | 20% | 5% | 3%* + 9% [1] |
| Japan | 2006 | 28% | 32% | 9% | 18% | 13% | N/A |
| Japan | 2010 | 31%† | 35% | N/A | 18% | 16% | <1%* |
| China | 2006 | 5% | 26% | 17% | 28% | 9% | 14%* |
| China | 2010 | 6% | 25% | 17% | 35% | 10% | 2%* + 5% [1] |
| Australia | 2007 | 49% | 13% | 21% | 15% | 3% | N/A |
| Australia | 2010 | 47% | 13% | 21% | 16% | 3% | <1%* |
| India | 2006 | N/A | 13% | 15% | 27% | 4% | 29%* + 11% [3] |
| India | 2009 | 16%† | 9% | N/A | 32% | 5% | 17%* + 13% [1] + 8% [2] |
| USA | 2006 | 51% | 11% | 21% | 11% | 2% | 4%* |
| USA | 2009 | 50% | 12% | 20% | 13% | 2% | 2% [1] + <1% [2] |
| Canada | 2006 | 54% | 13% | 22% | 7% | 3% | 1%* |
| Canada | 2009 | 49% | 14% | 20% | 9% | 2% | 3%* + 3% [1] + <1% [2] |
| UK | 2006 | 36% | 21% | 19% | 19% | 4% | 1%* |
| UK | 2010 | 33% | 22% | 15% | 22% | 6% | 1%* + 1% [1] + <1% [2] |
| France | 2007 | 43% | 12% | 16% | 25% | 3% | 1%* |
| France | 2010 | 42% | 12% | 15% | 24% | 4% | 2%* + 1% [1] + <1% [2] |
| Germany | 2007 | 43% | 14% | 15% | 18% | 10% | 1%* |
| Germany | 2010 | 37% | 13% | 14% | 19% | 10% | 1%* + 5% [1] + 1% [2] |
| Russian Federation | 2007 | 36% | 36% | 28% | 2% | N/A | N/A |
| Russian Federation | 2010 | 28% | 33% | 25% | 7% | 2% | <1%* + 3% [1] + <1% [2] |

*Others: Some countries like India classify the fatalities according to the vehicle or road user "at fault" rather
than who died and also some deaths of road users were unreported.
Transportation researchers have observed that over 73% of road accidents are attributable
to the degraded physical fitness and mental alertness of the driver at the time of the
accident, often resulting from on-road stress and fatigue (RoSPA, 2001). Driver fatigue,
which is caused by insufficient sleep, tiredness, drowsiness etc., was identified as one of
the prime areas to be focused upon to attain the casualty-reduction target set for 2010
(RoSPA, 2001). In a review study at the University of New South Wales, Australia, it was
found that over 20% of road accident fatalities were attributable to driver fatigue, stress
and drowsiness (Williamson et al., 2005).
According to the U.S. National Highway Traffic Safety Administration’s (NHTSA)
report, in the USA alone a total of 37,261 people were killed and another 2.35 million people
were injured in road crashes in 2008 (N.H.T.S.A., 2008). Among these, 64% were drivers,
27% passengers and the remaining 9% comprised 4% motorcyclists, 3% pedestrians and 2%
pedalcyclists. Contributing factors included improper lane keeping or running off the road
(around 24.1%), driving too fast or in excess of the conditions imposed (21.5%), alcohol,
drugs or medication (14.3%) and inattention (9.4%). In another recent NHTSA report, the
National Motor Vehicle Crash Causation Survey (NMVCCS) data collected at crash scenes
between 2005 and 2007 established that over 95% of single-vehicle run-off-road (ROR)
crashes had critical reasons related to drivers (Liu and Ye, 2011). The most frequent
driver-related critical reasons were driver performance errors (27.7%), followed by driver
decision errors (25.4%), critical non-performance errors (22.5%) and recognition errors
(19.8%). For single-vehicle ROR crashes, critical reasons were attributed to the vehicle in
only 1% of cases and to the environment in only 1.1%. The findings also showed that the
driver's inattention, fatigue and hurriedness were the most influential factors (Liu and Ye,
2011).
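As a quick arithmetic check on the figures quoted above, the short sketch below adds up the four driver-related categories and compares the total with the "over 95%" statement. The variable names are illustrative, and the residual share is simply what the quoted figures leave unaccounted for; it is not a number taken from the cited report.

```python
# Percentages of single-vehicle ROR crashes by critical reason,
# as quoted above from Liu and Ye (2011).
driver_reasons = {
    "performance errors": 27.7,
    "decision errors": 25.4,
    "critical non-performance errors": 22.5,
    "recognition errors": 19.8,
}
vehicle_share = 1.0      # "only 1%"
environment_share = 1.1  # "only 1.1%"

# Total share of crashes with a driver-related critical reason.
driver_share = round(sum(driver_reasons.values()), 1)

# Share covered by the driver, vehicle and environment figures together.
accounted = round(driver_share + vehicle_share + environment_share, 1)

# Whatever the quoted figures leave unexplained (not broken down in the text).
residual = round(100.0 - accounted, 1)

print(driver_share, accounted, residual)
```

The driver-related categories sum to 95.4%, which is consistent with the statement that over 95% of these crashes had driver-related critical reasons.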
Figure 1.1: Comparison of Fatality Rate in USA and India between 1997 - 2007.
A comparative depiction in Fig. 1.1 represents the fatalities per million persons in USA
and India between 1997 and 2007, based on the data presented by N.H.T.S.A. (2008) and
Mohan (2009). Although the graph suggests that India observes fewer fatal road accidents
per unit population than the USA, this may be imprecise: unlike the USA, India has not
adopted a proper method of data collection, many accidents are not recorded, and
coordination between agencies is missing, which results in unreported deaths. The graph
nevertheless leads to an interesting observation about the emerging trends. Unlike in the US
case, where fatal accidents are on the decline, in India the trend is exactly the opposite. This
may, perhaps, be explained in the light of (i) recent increase in the percentage of vehicles per
year, and (ii) laxity and/or ignorance in observing and enforcing traffic rules.
Financially, the overall global losses were estimated at US$ 518 billion, costing
governments over 3% of their gross national product (WHO, 2009). To reduce fatalities and
economic loss, a driver-assisting machine capable of recognizing affective state and
responding interactively needs to be built. This machine should be trained and evaluated
using real-time data collected from drivers in their local driving environment.
In real-life traffic situations, driving becomes stressful due to the frequent occurrence of
events and incidents, which requires that the driver is relaxed and has a good reflex
response (James and Nahl, 2003). The events are sequential manoeuvres like stopping for a
light, changing lanes, putting on the brakes etc., whereas the incidents are frequent but
unpredictable occurrences like near-misses, frustration due to overtaking or not being
allowed to pass etc. These events and incidents give rise to extreme physiological reactions,
emotional reactions and irrational thoughts, leading to stress. Unacceptable levels of stress,
fatigue and on-road distraction deteriorate the driver’s performance and may lead to loss of
concentration, risk assessment capability and vehicular control, often inviting road accidents
(Lisetti and Nasoz, 2004). Researchers in the past have extracted parameters from biosignals
to measure emotion (Lisetti and Nasoz, 2004; Katsis et al., 2008), stress level (Healey and
Picard, 2005; Rigas et al., 2012), fatigue (Ji et al., 2006) and the affective state (Katsis et al.,
2011). They have used the terms "affective state", "emotional state" and "sentic state"
interchangeably to assess the mental and physical stress experienced by vehicular drivers
(Riener et al., 2009).
Researchers have identified that this issue of ever-increasing fatality rates and economic
losses can be addressed through the development and deployment of context-aware driver
assistance systems capable of predicting accidents and alerting the driver proactively. The
Information and Communication Technology for Mobility (ICT for Mobility) programme for
Intelligent Vehicle Systems in 2010 identified that the Strategic Research Agenda for future
research on Intelligent Vehicles “...should focus on highly integrated and price worthy
solutions for driver assistance systems to reach wide deployment and achieve increased
traffic safety and efficiency and reduce environmental impact…” (European Commission,
2010).
The BITS-LifeGuard research initiative at Birla Institute of Technology and Science,
Pilani, India aims to enhance driver safety by designing a custom wearable computing fabric
which can prevent the loss of precious lives by providing fast yet credible real-time alerts
to the drivers and their coupled cars³ (Banerjee, 2005; Singh et al., 2010, 2011, 2013a). The
present work is part of this ongoing project, which aims to integrate the vehicular
infrastructure with the body-worn wearable computer (Banerjee, 2005).
Over the past decade, researchers have suggested different approaches to solve
such problems. The first such approach was chosen by some of the major automobile
manufacturers, wherein vehicle-mounted sensors were used primarily to assist the drivers.
Typical sensors include steering wheel and lateral position sensors, infrared (IR) cameras,
image sensors etc., as used by automobile manufacturers like Toyota, Nissan, Volvo,
Mercedes-Benz and Saab (Dong et al., 2011). Saab’s Driver Attention Warning System
uses IR cameras to monitor the driver's drowsiness and distraction. Toyota's Driver Monitoring
System uses near-IR cameras to track head position as well as drowsiness. Mercedes-Benz's
Attention Assist system uses only vehicle parameters such as the speed, longitudinal and
lateral acceleration, angle of the steering wheel, brake pedal etc. to create a driver profile,
detect drowsiness and warn the driver through both visual and audio alerts. Volvo's Driver
Alert Control system monitors the car’s movements on the road to identify tired and
non-concentrating drivers.
In contrast, the second school of researchers opted for a vehicle-independent approach to
identifying suboptimal physiological and mental status from the viewpoint of safe driving.
These researchers chose a body-worn computing systems approach, using physiological
sensors to monitor stress, fatigue, emotions etc. (Healey and Picard, 2005; Lisetti and Nasoz,
2004; Katsis et al., 2008; Katsis et al., 2011; Singh et al., 2013a).
3 The term coupled car refers here to a car that is wirelessly coupled with a specific driver's wearable computer
using Bluetooth®.
1.3 Significance of the Wearable Computing Approach
A major reason for choosing the wearable computing approach as the principal approach in
the overall ubiquitous computing solution under the current project is that most of the
vehicle-mounted solutions are found in mid-range and high-cost vehicles. Vehicle-mounted
systems are still out of reach of the majority of car owners and drivers in most of the
developing world. Therefore, an approach was required which was independent of the type,
make or cost of the vehicle, and this is where the wearable computing approach offered
itself as a viable option.
1.4 Problem Statement and Scope of the targeted research
It was envisaged to design and develop a Wearable Driver Assistance System (WDAS) to be
worn by vehicular drivers for assessing their stress levels. The proposed wearable device,
combining physiological sensing, local computing for stress, fatigue and stress-trend
detection, and communication facilities, will help in identifying alarming situations so that
road accidents can be avoided.
It appears that an exhaustive and more practical approach to saving lives from road
accidents may have to involve a ubiquitous computing infrastructure with three major
element types: (a) vehicular computing system(s), (b) wearable computing system(s) and
(c) Intelligent Transportation Systems (ITS). However, the focus of this thesis has been
limited to the wearable computing system aspects alone. It is in this context that the
following sections and chapters discuss the relevant details, issues and available solution
approaches, the work done by us, and the consequent recommendations for use of the
resultant work under the future scope.
The present work involves the collection of real-time physiological signal data from 20
automotive drivers in 5 different semi-urban scenarios (2 relaxed, 3 driving), extraction of
features from the collected physiological signals, modeling of the stress classes as (a) a
3-Class⁴ problem and (b) a 4-Class⁵ problem, and application of pattern recognition
techniques to the extracted feature vectors to identify the different stress classes. In the
absence of any credible driving simulator able to test driving under hazardous⁶ conditions,
we focused on collecting primary data in non-hazardous situations only. We had to depend
on real-life data collection under normal, moderately strenuous and relatively low-risk
environments.
4 The term 3-Class here refers to the stress classes or levels to be classified as Relaxed, Moderate and Stressed.
5 The term 4-Class here refers to the stress classes or levels to be classified as Level-1 to Level-4.
6 The term hazardous conditions may refer to situations like unanticipated movement of either a vehicle or a
pedestrian in the path of a speeding vehicle, heavy rains, heavy snowfall, maneuvering vehicles through narrow
hilly terrains etc. Typically, such hazardous condition based data is not obtained through real-life tests. Instead,
where available, hazard simulators are used.
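The chain described above (signal windows, feature extraction, classification into stress classes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two features (window mean and standard deviation), the nearest-centroid classifier and the sample values are placeholders, not the features, classifiers or data used in this thesis. Only the three class names come from footnote 4.

```python
from statistics import mean, pstdev

# Class names taken from footnote 4; everything else here is hypothetical.
THREE_CLASS = ("Relaxed", "Moderate", "Stressed")

def extract_features(window):
    """Illustrative feature vector: mean and population std of one signal window."""
    return (mean(window), pstdev(window))

def train_centroids(labelled_windows):
    """Average the feature vectors of each class (nearest-centroid training)."""
    sums = {}
    for label, window in labelled_windows:
        f_mean, f_std = extract_features(window)
        acc = sums.setdefault(label, [0.0, 0.0, 0])
        acc[0] += f_mean
        acc[1] += f_std
        acc[2] += 1
    return {label: (a / n, b / n) for label, (a, b, n) in sums.items()}

def classify(window, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    f_mean, f_std = extract_features(window)
    return min(
        centroids,
        key=lambda lbl: (f_mean - centroids[lbl][0]) ** 2
        + (f_std - centroids[lbl][1]) ** 2,
    )

# Synthetic heart-rate-like windows, invented purely for this example.
training = [
    ("Relaxed", [60, 62, 61, 63]),
    ("Moderate", [75, 78, 80, 77]),
    ("Stressed", [95, 100, 98, 102]),
]
centroids = train_centroids(training)
print(classify([61, 60, 62, 63], centroids))
print(classify([97, 99, 101, 96], centroids))
```

A 4-Class variant would only change the label set (Level-1 to Level-4, per footnote 5); the structure of the sketch stays the same.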
1.5 About the Organization of the Rest of the Thesis
Having laid out the genesis of the work in this chapter, the remaining chapters delve into
the details of the specific problem chosen and systematically evolve a viable solution. In
particular, Chapter 2 reviews the broad field of wearable computing devices, with specific
reference to the technologies and techniques most relevant to wearable driver assistance
systems. Chapter 3 discusses the methodology for data collection, including the sensory
setup, scenarios and signal processing. Chapter 4 explains the Driver-Profile Analysis used
to understand different behavioral factors. Chapter 5 presents the methodologies for
detection of the Affective States and relevant Stress-Trends. Chapter 6 presents the
recommended biosignal-based architecture for the BITS-LifeGuard System. Finally,
Chapter 7 documents the principal contributions of the work, compares it with other
contemporary works, and discusses its limitations along with the possible scope for future
work.
Chapter 2
Literature Review
The majority of road accidents are caused by reasons like drivers’ inattention, fatigue,
stress, poor health, lack of sleep etc. Thus, this problem may be termed driver-centric.
This chapter discusses the current state of the art of Advanced Driver Assistance Systems
(ADAS) and their enabling technologies, which need to handle aspects like monitoring of the
driver's inattention, fatigue and stress.
2.1 Principal Problems and Candidate Solutions
It is well known that road accidents may be caused by a wide variety of reasons, some of
which may have nothing to do with factors like the driver's physical health, mental health,
alertness to respond and awareness of the relevant traffic laws. However, the scope of this
work has been limited to factors like degraded reflexes, stress and fatigue, since these have
proven potential to cause road accidents.
Limiting the scope of the work in such a manner would thus help in carrying out an
in-depth analysis of real-world primary data collectable through well-designed experiments,
apart from addressing a specific subset of issues arising out of variations in measurable
physiological parameters.
Consequent to the above referred approach and scope, a desirable step is identification of
an optimal subset of physiological parameters out of an exhaustive set that could lead to
credible yet computationally efficient as well as cost-effective indication of deterioration of a
driver's state that could potentially lead to unsafe driving (Healey and Picard, 2005).
Once the right subset of parameters is identified as indicated above, the next enabling
factor to be considered is how to make appropriate and trustworthy use of the continuously
or periodically sensed data related to these parameters. In effect, this amounts to
conceptualizing a functional organization that faithfully represents the basic building blocks
enabling those functions, as well as the interactions between those blocks. Once such a
functional block diagram is duly refined and validated, it leads to a partitioning of
functionalities into those best implemented in hardware, firmware or software. This, in turn,
paves the way for the high-level (architectural) and low-level (structural) design of a
complete system able not only to collect data from sensors in real time and process it, but
also to use it quickly to identify a distinct shift in the driver's state that matters for safe
driving. Subsequently, varying kinds of appropriate precautionary alerts (Singh et al., 2013a)
could be triggered in the event of no timely improvement in the driver's state. This trigger
could be used for actuating a service or a physical system as the case might require.
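The idea of triggering an alert only when the driver's state shows no timely improvement can be sketched as a simple run-length check over a stream of stress estimates. This is a hypothetical illustration: the function name, the numeric stress scale, the threshold and the grace period are all assumptions and are not taken from the system described in this thesis.

```python
def alert_indices(levels, threshold, grace):
    """Return the sample indices at which an alert fires.

    An alert fires once when the estimated stress level has stayed above
    `threshold` for more than `grace` consecutive samples, i.e. when the
    driver's state has not improved within the grace period. The run is
    reset as soon as the level drops back to or below the threshold.
    """
    alerts = []
    run = 0  # length of the current above-threshold run
    for i, level in enumerate(levels):
        if level > threshold:
            run += 1
            if run == grace + 1:
                alerts.append(i)
        else:
            run = 0
    return alerts

# Hypothetical stress estimates on an arbitrary 0-4 scale.
stream = [1, 3, 3, 1, 3, 3, 3, 3, 1]
print(alert_indices(stream, threshold=2, grace=2))
```

In this example the first excursion above the threshold recovers within the grace period and raises nothing, while the second persists and fires a single alert. In a real system the returned index would be the point at which a service or actuator is invoked.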
In the related literature, there is noticeable evidence of a shift in the approach being
adopted by researchers around the globe. Known as driver assistance systems (Golias et al.,
2002), such solutions have been further divided into vehicle-mounted and body-mounted
sensing approaches.
One of these variants, termed Advanced Driver Assistance Systems (ADAS) (Carsten
and Nilsson, 2001), has been proposed by the Intelligent Transportation Systems (ITS)
research communities as a possible solution to mitigate the occurrence of road accidents.
After over two decades of substantial advancements in ADAS research, three infrastructural
perspectives seem to have evolved as the major directions along which most of the work is
progressing. Known as Vehicular Infrastructure, Environmental Infrastructure and
Driver-Centric Infrastructure respectively, these have gradually emerged as complementary
elements which collectively promise to offer a complete solution to the entire range of
problems, with both the vehicle and the driver as major beneficiaries (Wen et al., 2011).
In the following sections, the discussion builds upon the basics of ADAS, its functions
and associated enabling technologies followed by critical review of contemporary
developments and associated methodologies as applicable to monitoring driver’s mental and
physical health.
2.2 Focus of the Work
The focus of the present work is on the driver-centric infrastructure based approach. Such
approaches are characterized by sensing direct or indirect elements of a physiological kind.
Identifying instances that reflect fatigue, stress, likely health issues or inattention of
vehicular drivers requires continuous monitoring of the parameters relevant to safe driving.
Whenever a risky driving pattern is observed, an appropriate alert, trigger, or corrective or
preventive action is initiated.
While our focus here is on non-medical grade assistive devices, this work does derive
quite a few benefits and inspirations from work done in varying domains, including those
related to medical monitoring research. There do exist commercially available data logging
systems⁷ which collect the data obtained from stress and fatigue monitoring of the driver,
although not in the form of a wearable device. In such systems, the data collected is usually
processed in an offline manner. As a consequence, there is an undeniable need for the
development of Wearable Driver Assistance Systems (WDAS). The associated enabling
technologies, in such a case, include the models of computation for the affective state and
stress-trend detection. The chapter concludes with a discussion on a proposed biosignal-based
wearable computing architecture for WDAS.
7 Thought Technology's Data Loggers. http://www.thoughttechnology.com/hardware.htm.
2.3 The Advanced Driver Assistance System (ADAS)
Advanced Driver Assistance Systems (ADAS) have become an essential component of
many kinds of commercial vehicles to enhance their overall safety⁸. During the 1980s and
1990s, they were known as Advanced Vehicle Control Systems (AVCS), a group of products
and technologies which could control the movement of the vehicle, assist the driver in
controlling the vehicle, and provide "high-bandwidth" hazard information in particular
(Shladover, 1993). It was envisioned that the AVCS development path would have three
overlapping development areas over a period of time: (a) driver-warning and perceptual
enhancement systems, (b) driver control assistance systems, and (c) fully-automated control
systems. At each of the three stages of development, the AVCS was expected to involve
vehicle-to-vehicle and vehicle-to-infrastructure interaction. Gradually, AVCS came to be
termed ADAS, since all of the technological infrastructure and control mechanisms are
eventually meant to assist the driver.
2.3.1 ADAS Definition
ADAS have been defined in several ways by different researchers depending upon the
context of developing an application. Gietelink et al. (2006) defined an ADAS as a vehicle
control system that uses environment sensors (e.g. radar, laser, vision) to improve driving
comfort and traffic safety by assisting the driver in recognizing and reacting to potentially
dangerous traffic situations. Tsugawa (2006) defined ADAS as driver assistance systems in
which a mechanism or system covers part of the sequence of tasks (recognition, decision
making, operation) that a driver performs while driving a motor vehicle. Li et al. (2012)
define ADAS as those automation systems which support drivers by strengthening their
sensing ability, warning in case of error, and reducing the controlling efforts of drivers. Such
systems are built to support the human drivers and not to replace them.
8 Although there have been significant developments in the area of autonomous / self-guided / driverless
vehicles including those designed by researchers at Google, CMU and INRIA, as of this writing the share of such
vehicles in the world is less than even 0.05%. Thus, the driver-centric approaches, as complementary to the other
referred approaches, remain both relevant and significant.
Lu et al. (2005) discussed the technical feasibility of five key ADAS functions:
(a) Enhanced navigation, (b) Speed assistance, (c) Collision avoidance, (d) Intersection
support and (e) Lane keeping, which were considered adequate substitutes for
infrastructure-related measures. Their analysis suggests that integrating speed assistance
and navigation may reduce the need for several other systems, as these two are capable of
achieving most of the safety effects at minimal cost. Systems based on vehicle-to-vehicle
communication and vehicle positioning are the most promising among the other
technologies. However, these systems perform a broad variety of complex functions and tasks
and are built to assist the drivers. Therefore, it is difficult to understand their overall
characteristics just by defining their functions. The next section discusses their classification,
with some relevant examples, to clarify these characteristics.
2.3.2 ADAS Classifications
Classification of ADAS is a complex issue because it involves functional requirements,
the infrastructure support needed, the implementation methodology adopted, human-machine
interface needs, evaluation and maintenance of subsystems etc. Therefore, ADAS have been
classified and categorized in several ways. Kantowitz and Moyer (2000) studied the human
factors issues pertaining to drivers and emphasized the need to integrate the following three
types of in-vehicle information (in-car electronics) directly perceived by drivers: (a) safety
and collision avoidance, (b) advanced traveler information systems (ATIS), and
(c) convenience and entertainment. They mentioned the immediate need to address two
human factors and safety issues: (i) integrating warning systems and (ii) integrating all
in-vehicle information systems (IVIS). The basic issues involved were driver overload,
message prioritization and overlap, false alarms, display modality, voice activation, and
timely generation of guidelines and standards for making this information available to
designers. The driver overload issue focuses on developing metrics and marking thresholds
to avoid any risk of diverting the driver's attention from the main task, i.e. safe driving. Any
conflict arising in the process is resolved through message prioritization and by minimizing
the overlap of critical information with its non-critical counterpart. The threshold for driver
tolerance had to be set carefully so as to avoid the possibility of false alarms. Except for
support in the form of emergency assistance or accessibility to the driver, voice activation or
voice triggering was not considered significant in most other forms. This is justified by the
fact that understanding as well as generating auditory information affects the mental
workload of drivers during driving tasks, thereby affecting their motor-related functions
(Kantowitz and Moyer, 2000).
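Message prioritization of the kind discussed above can be illustrated with a small priority queue in which critical messages are always presented before non-critical ones. This is only a sketch under assumptions: the numeric priority scale, the function name and the example messages are hypothetical and do not come from Kantowitz and Moyer (2000).

```python
import heapq

def order_messages(messages):
    """Order in-vehicle messages so that the most critical come first.

    `messages` is an iterable of (priority, text) pairs, where a lower
    priority number means a more critical message. Messages with equal
    priority keep their arrival order.
    """
    # The arrival sequence number breaks ties between equal priorities.
    heap = [(priority, seq, text) for seq, (priority, text) in enumerate(messages)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]

# Hypothetical messages competing for the driver's attention.
pending = [
    (2, "traffic update"),
    (0, "collision warning"),
    (1, "lane departure"),
    (2, "incoming call"),
]
print(order_messages(pending))
```

Presenting one message at a time in this order is one simple way to keep a safety warning from being overlapped by convenience or entertainment information.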
Carsten and Nilsson (2001) evaluated two main areas of ADAS, viz. information systems,
which interact with the driver, and intervening systems, which interact with the vehicle. They
categorized such systems into four different classes, as shown in Table 2.1. They
concluded that for information systems such as navigation systems, a standardized
performance assessment system could be designed. In contrast, they argued, for intervening
and warning systems, a structured process-oriented approach was inappropriate.
Table 2.1: Classification based on Information Systems and Intervening Systems
(Carsten and Nilsson, 2001)

| S.N. | Classification | Tasks to be Performed | Examples |
|------|----------------|-----------------------|----------|
| 1. | In-Vehicle Information Systems (IVIS) | Driving-assistance | Navigation systems, Traffic and Road conditions providing systems, Vision Enhancement Systems. |
| 2. | Feedback and Warning Systems | Reduction in driver-errors | Intelligent Speed Adaptation Systems (advisory), Longitudinal Collision Warning Systems, Lane Departure Systems, Lane-change Assistant Systems |
| 3. | Vehicle Control Intervention Systems | Vehicular-control, drivers can override the systems | Adaptive Cruise Control (ACC), Stop and Go, Intelligent Speed Adaptation Systems (Intervening Facilities) etc. |
| 4. | Autonomous Driving Systems | Used as driver-replacement, hence no overriding allowed | Autopilot Systems, Autonomous Vehicles etc. |

Golias et al. (2002) reported in their review study the criteria-based classifications of
ADAS proposed over a period of time, as listed in Table 2.2. In contrast to the traditional
classifications, which consider system or user oriented approaches, they proposed an
alternative scheme based on the systems’ impact on road safety and traffic efficiency,
designated as 'high' or 'low'. The systems found to have 'high' impact on both road safety
and traffic efficiency are (a) state of the road surface systems, (b) adaptive cruise control
systems, (c) lane change and merge collision avoidance systems, and (d) vision enhancement
systems.
In their analysis, Golias et al. (2002) examined systems which were either driver-support
systems or vehicle-support systems⁹.
9 The evaluation of the driver and vehicle support systems was carried out based on the criteria (a) estimation of
impact on road safety and (b) estimation of impact on traffic efficiency. This categorization and evaluation
pointed out that 40% of the systems considered are expected to have high safety and low traffic efficiency
impact, while only 15% are expected to have both impacts as high.
Table 2.2: Various ADAS Classifications Reported in Literature (Golias et al., 2002)

| S.N. | Classification Criteria | Task Performed / Examples |
|------|-------------------------|---------------------------|
| 1. | Technologies Used | IT, Wireless Communications etc. |
| 2. | Subsystems Used | Autonomous in-vehicle, supported by GPS/GSM communication, linked with road infrastructure systems. |
| 3. | Vehicle Type | Passenger car, truck, bus etc. |
| 4. | Road Network Type | Motorway, interurban, urban |
| 5. | Distinct Phases in Accident Process | Pre-crash, crash, post-crash etc. |
| 6. | Type of user / drivers | Individual driver, professional driver, fleet owner, elderly drivers, etc. |
| 7. | Levels of Driver Tasks | Strategic (route/mode choice, etc.); Tactical (vehicle maneuvering, etc.); Operational (steering, accelerator handling, etc.) |
| 8. | Levels of Driver Subtasks | Perception (seeing, hearing, feeling, etc.); Decision (for the various actions); Action (execution) |
| 9. | Human and Machine Interface | Provision of plain information, advisory / warning messages, communication with the environment, capability of proceeding to a specific action. |

The driver support systems consist of (a) Driver information, (b) Driver perception, (c)
Driver convenience, and (d) Driver monitoring. For instance, navigation routing as well as
real-time traffic and traveler information constitute part of driver information systems.
Elements that help build driver perception systems include vision enhancement, parking and
reversing aid, state of the road surface systems etc. Automated transactions, driver
identification, hands-free and remote control etc. form part of driver convenience
systems. Similarly, driver monitoring systems include the functions of driver vigilance
monitoring and driver health monitoring.
The vehicle support systems consist of (a) General vehicle control, (b)
Longitudinal and lateral control, (c) Collision avoidance, and (d) Vehicle monitoring. The
term general vehicle control relates to aspects like automatic stop and go, platooning etc.
Speed and adaptive cruise control as well as road and lane departure / change / merge
collision avoidance are the basic features of longitudinal and lateral control. The collision
avoidance systems make use of rear-end collision avoidance, obstacle and pedestrian
detection and intersection collision warning. Vehicle monitoring involves tachographs,
alerting systems, as well as diagnostic systems.
Lu et al. (2005) categorized ADAS as measures to counteract traffic accidents based on
active and passive measures. They categorized such systems into one of the three approaches:
(i) measures related to change in human behavior
(ii) vehicle-related measures
(iii) physical road infrastructure related measures.
Passive safety measures aim to mitigate the consequences of an accident once it has
happened, and active safety measures aim to avoid accidents. Active safety systems are
required to interact much more with the driver than passive safety systems, creating a closed
loop between driver, vehicle, and the environment.
2.3.3 ADAS Functions and Enabling Technologies
Lindgren et al. (2008) at the University of Technology, Gothenburg (Sweden), while
investigating the socio-cultural aspects of design requirements for Advanced
Driver Assistance Systems (ADAS), identified four levels of support that such a system can
offer to an automotive driver. In contrast, the IHRA Working Group on ITS (2011) described
a behavioural model of drivers by identifying three levels of driver assistance for detection,
judgment and operation tasks. However, in both of these works some of the levels have
overlapping functions, which can be reorganized as shown in Figure 2.1.
During conventional driving, no ADAS functions are needed: the drivers themselves
sense the driving conditions, monitor the behavioral feedback from the vehicle, identify risks,
take appropriate decisions and control the vehicle accordingly.
Level I – This is the basic level, where the ADAS performs the task of detection, i.e. it
senses the driving environment using sensors like a night vision camera and presents the
information to the drivers using head-up displays. These systems act as perception enhancing
systems rather than warning systems.
Level II – This level acts as the hazard assessment and warning level by gathering
information from the driving environment and vehicle kinetics. The information collected at
Level I may be used to assess critical hazard situations and warn the drivers about them.
Examples of Level II ADAS are Collision Warning Systems like Forward Collision
Warning (FCW), Rear-end Collision Warning (RCW) and Lane Departure Warning
(LDW) systems.
Figure 2.1: ADAS Functional Levels and Drivers Behavioural Model
Level III – At this level of intervention, the ADAS, in addition to warning the driver of an
imminent danger, intelligently indicates how to cruise through the situation.
Level IV – At this intervention level, the ADAS overrides the driver’s control, taking
partial or full control. The Level IV ADAS provides the highest degree of automation
for controlling the vehicle. Such systems may be used in two driving situations, namely
normal driving and abnormal driving.
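The four functional levels described above can be summarized as a small data structure. The sketch below is an illustration only: the enumeration names and the `capabilities` helper are hypothetical, and treating the levels as strictly cumulative (each level includes the functions of the lower ones) is a simplifying assumption suggested by, but not stated in, the descriptions of Level II and Level III.

```python
from enum import IntEnum

class AdasLevel(IntEnum):
    """Hypothetical names for the four functional levels described above."""
    PERCEPTION = 1  # Level I: senses the environment and presents information
    WARNING = 2     # Level II: assesses hazards and warns the driver
    GUIDANCE = 3    # Level III: also indicates how to handle the situation
    CONTROL = 4     # Level IV: takes partial or full control of the vehicle

def capabilities(level):
    """Return what a system at `level` does, assuming cumulative levels."""
    return {
        "presents_information": level >= AdasLevel.PERCEPTION,
        "warns_driver": level >= AdasLevel.WARNING,
        "suggests_action": level >= AdasLevel.GUIDANCE,
        "controls_vehicle": level >= AdasLevel.CONTROL,
    }

# Example: a Forward Collision Warning system is described as Level II.
print(capabilities(AdasLevel.WARNING))
```

Under this reading, a Level II system such as Forward Collision Warning presents information and warns, but neither suggests a manoeuvre nor intervenes in vehicle control.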
Table 2.3 lists the popular ADAS functions, their intended use and examples of the
technologies involved, grouped by the levels they represent. It is based on the literature
presented by Carsten and Nilsson (2001), Golias et al. (2002), Tango et al. (2006), Lindgren
and Chen (2006), Lindgren et al. (2008) and IHRA Working Group on ITS (2011).
Table 2.3: ADAS Functions and Technologies

| S.N. | Functions / Technologies | Description | Sensors / Systems Used |
|------|--------------------------|-------------|------------------------|
| | LEVEL I | | |
| 1. | Night Vision (NV) | Using camera techniques visual images are captured in dark light conditions. Images are displayed to the driver using monitors or head up displays. | Near or far infrared camera which uses thermal imaging techniques. |
| 2. | Smart Headlamps (SH) | These headlamps are pre-programmed to automatically dim when oncoming traffic is detected and automatically adjust height to compensate the aim of the headlamps when driving with heavy loads. | Halogen and Xenon light sources |
| 3. | Lane Departure Warning (LDW) | Captures the lane departure event when certain thresholds (like distance, time to lane crossing) are violated and warns the driver accordingly. Chances of false alarms are prominent. | Camera captures lane markings whereas acoustic, optic or haptic feedback is provided for warning |
| 4. | Local Hazard Warning (LHW) | Hazard occurring at a farther distance in front of the vehicle and not visible to the driver will be sensed and warn accordingly. | Appropriate communication channels are used. |
| 5. | Forward Collision Warning (FCW) | FCW systems measure distance, angular position and relative speed of the car and obstacles ahead and warn the driver about a potential collision. | Laser or microwave radar sensors |
| 6. | Tyre Pressure Warning (TPW) | TPW System measures a wheel’s rotational speed relative to the other wheels to detect dangerously low air pressure in the tires and warn drivers. | Wheel speed sensors |
| 7. | Lane Keeping Assistant (LKA) | Extended version of LDW, detects lane departure and warns the driver if a defined trajectory is violated. (can completely take over the steering task of the vehicle). | Camera system for detection and steering wheel actuators for warning. |

It can be seen from Table 2.3 that there is an overlapping relationship between
the ADAS functions from Level II to Level III and from Level III to Level IV, from the
viewpoint of the tasks performed and the actuation level guaranteed.
Development of ADAS for modern-day vehicles will require several other enabling
technologies besides those mentioned in Table 2.3. The typical enabling technology
requirements for ADAS development, as listed and discussed by Shladover (1993), include
sensors, communication, computation, electromechanical actuators, software and systems and
some special tools, and this list holds true to date.
Table 2.3: ADAS Functions and Technologies (Continued)
S. N. | Functions / Technologies | Description | Sensors / Systems Used
LEVEL III
8. Blind Spot Warning (BSW) | Detects and warns the driver if a vehicle / cyclist / pedestrian is present in the so-called "blind spot" area during lane change and/or overtaking manoeuvres, by means of a camera placed in the left rear-mirror. | Infrared Sensor Camera (Passive Systems)
9. Curve and Speed Limit Information (CSLI) | CSLI informs the driver about speed limits and the recommended speed when approaching curves and can be combined with ACC to automatically correct the speed for dangerous curves. | Digital maps, image processing and communication systems help in retrieving the information.
10. Brake Assist (BA) systems | Interprets a quick depression of the brake pedal as an emergency braking action and complements the applied braking power if the driver has not applied enough power on the brake pedal. | Included in various ABS systems to optimize the vehicle's braking capacity and shorten the stopping distance.
11. Lane Change Assistant (LCA) | Works closely with the "Blind Spot" detection system during a dangerous lane change process to warn the driver. It can either just warn using a light or provide haptic feedback at the steering wheel. | Warning Light and Haptic Interface.
12. Anti-lock Braking System (ABS) | ABS prevents the vehicle's wheels from locking up and skidding during hard braking, or during normal braking on icy surfaces, by modulating the brake pressure. | Wheel speed sensors detect brake lock up.
13. Parking Assist System (PAS) | PAS uses rear and front sensors to detect obstructions and notifies the driver about objects (pedestrians, vehicles etc.) close to the vehicle while parking by measuring distance. | Acoustic signal generation device and camera vision systems.
14. Driver Status Monitoring (DSM) systems | Two broad categories of DSM systems which can sense, warn and suggest corrective actions. (i) Driver Impairment Monitoring: impairment due to stress, fatigue, alcohol or drug abuse, inattention or various diseases. The driver's physiological status, such as drowsiness, level of attention, eye movements, heart rate and other parameters, is used to identify stressful and abnormal conditions or risks. (ii) Driver Vigilance Monitoring: monitors and warns using sensors for the vehicle's lateral position, steering wheel position, driver behavior, and eyelid movements. | Physiological sensors (ECG, EMG, EOG, GSR, Respiration, PPG etc.), camera based sensors, vehicle sensors etc.
Table 2.3: ADAS Functions and Technologies (Continued)
S. N. | Functions / Technologies | Description | Sensors / Systems Used
LEVEL IV
15. Adaptive Cruise Control (ACC) / "Stop and Go" | ACC adaptively maintains a safe distance between the current vehicle and the vehicle in front by adjusting the speed, considering the individual preferences of both drivers. "Stop and Go" considers the specific requirements of individual drivers in the urban environment; for example, in a traffic queue it drives the vehicle automatically by providing timely stops and small movements. | Radar based technology
16. Obstacle and Collision Avoidance (OCA) | OCA systems automatically intervene and take control over the vehicle in hazardous situations to avoid accidents. They provide an extended functionality compared to the FCW. | Multi-modal sensing and actuation
17. Platooning | Several vehicles form a platoon to follow each other and are connected electronically (e.g., by means of communication). A following vehicle is driven automatically, requiring complex systems design. |
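The FCW entry in Table 2.3 describes a warning derived from the distance and relative speed of the obstacle ahead. One common way to turn those two measurements into a warning decision is a time-to-collision (TTC) check. The sketch below is illustrative only: the function names and the 2.5 s threshold are assumptions made for this example and are not taken from the thesis or from any particular FCW product.

```python
def time_to_collision(gap_m, closing_speed_mps):
    """Return the time to collision in seconds.

    gap_m: distance to the obstacle ahead, in metres.
    closing_speed_mps: speed at which the gap shrinks, in m/s
        (positive when the vehicle approaches the obstacle).
    Returns None when the gap is not closing.
    """
    if closing_speed_mps <= 0:
        return None
    return gap_m / closing_speed_mps


def fcw_should_warn(gap_m, closing_speed_mps, ttc_threshold_s=2.5):
    """Warn when the time to collision falls below a threshold.

    The 2.5 s default is a hypothetical value chosen for illustration.
    """
    ttc = time_to_collision(gap_m, closing_speed_mps)
    return ttc is not None and ttc < ttc_threshold_s


if __name__ == "__main__":
    # 30 m gap closing at 15 m/s gives a 2 s time to collision.
    print(time_to_collision(30.0, 15.0))
    print(fcw_should_warn(30.0, 15.0))
```

A fixed threshold of this kind is also one reason why false alarms arise: the same TTC may be harmless for an attentive driver and critical for a fatigued one.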
Sensing devices including those required for detecting range, lane change, location, road
friction, longitudinal and lateral acceleration, linear and angular displacement, visual
information, and driver alertness etc. pose a great challenge for the design communities.
Vehicle-to-vehicle as well as vehicle-to-infrastructure information sharing requirements
further need short, medium and long-range communication links. Reliable, low cost, robust
operation and actuation will further require efficient computational and electro-mechanical
elements like processors and actuators. The hardware infrastructure must be supported by
appropriate HCI provisioning and efficient software design by ensuring fault tolerance and
efficient task-scheduling. Stringent testing is an integral part of the process involved. In order
to achieve a perfect design, the developed devices must be evaluated by collecting data from
all sources like accidents, vehicle, driver and road etc. The data thus collected could then be
modeled and tested by carrying out simulation as well as field testing.
It is evident from the foregoing discussion that drivers have to interact with the vehicle
control elements, environment and other factors which influence driving. Drivers make
decisions accordingly to maneuver the vehicle to avoid dangerous situations. ADAS assist
drivers to enhance their safety as well as comfort by the enabling technologies discussed.
However, the enormous complexities involved with respect to handling a number of ADAS
functions along with driving tasks in different driving environments may degrade their
decision-making process and affect their performance by inducing behaviors like distraction,
inattention, frustration, stress, fatigue etc. These behavioral issues require that human factors
be incorporated in the design to minimize such effects by developing Driver Status
Monitoring (DSM) systems.
McCall and Trivedi (2006), while investigating the human centered design aspects of
driver monitoring systems, identified that there is an essential need to develop algorithms
capable of understanding the driver's intent and attention. A human-centric design approach
is essential for ensuring the usability and safety aspects. Miller and Huang (2002) observed
that the Driver Assist System (DAS) should augment the driving process and issue cautionary
warnings without causing any distraction. False alarms have the potential not only to
cause distrust but also to trigger panic reactions, which can reduce the overall effectiveness
of the system. The human-centric factors which should be considered while designing such
DAS include the driver's social environment, the country's norms, driver behavior etc. It
seems that the above factors may be incorporated into such DSM systems if the
decision-making modules are trained on naturalized data [10] collected from real-time
field tests. All kinds of driving maneuvers result from either psychological or
physiological reactions to the several stimuli experienced by drivers. The significance
of the psychophysiological parameters should not, therefore, be understated.
2.4 The Current State of the Art: Driver's Inattention, Fatigue and Stress
Monitoring
In the literature, the interpretation of the term fatigue has been much debated. In a review
of the psychophysiological parameters of driver fatigue, Lal and Craig (2001) defined fatigue
as the transitory period between wakefulness and sleep which can lead to sleep if not
interrupted. Fatigue can be either mental or physical: mental fatigue is related to
psychological parameters, whereas physical fatigue is experienced due to muscular
parameters (Lal and Craig, 2001). They argued that among the psychophysiological
parameters, the theta [11] and delta [12] activity of electroencephalography (EEG) signals
were found to be the most promising as compared to EMG, EOG and HR. They also suggested
that monitoring psychological traits such as anxiety and negative mood, self-reported
fatigue measures [13], and physiological parameters such as EEG simultaneously would lead
to better driver fatigue management.
[10] The data collected under real-time driving conditions with respect to specific road conditions, traffic rules, traffic density of a specific country.
[11] Theta brain activity: related with conscious sleep towards drowsiness.
[12] Delta brain activity: related with deep sleep and waking state.
Matthews (2002) discussed a transactional ergonomic model for driver stress and fatigue
in which environmental stressors [14] and personality factors [15] influence the cognitive
stress processes [16], which in turn result in subjective outcomes [17] such as tiredness, and
performance outcomes [18] such as impairment of psychomotor control. He has suggested some
transactional design guidelines for minimizing the effects of stress and fatigue: (a) recognize
transactional safety issues, (b) distinguish different stress reactions qualitatively, (c) design
for stress explicitly, (d) design for variability in workload, (e) work at the individual level
if possible, and (f) direct interventions towards explicit criteria.
Williamson and Chamberlain (2005) reviewed the approaches for detection and control of
fatigue, focusing mainly on the methodologies that reflect fatigue-related measures. Their
analysis identified three approaches adopted by driver fatigue warning devices:
(a) the driver's current state, especially eye and eyelid movements
such as percentage eyelid closure (PERCLOS) and pupil tracking, and physiological state
changes involving drowsiness detection using EEG; (b) driver performance, with a focus on
the vehicle's behavior including lateral position and headway; and (c) hybrid, a combination of
the driver's current state and performance.
A refreshing perspective on the causes of the events that might lead to accidents may be
found in the work of Young and Regan (2007). Basing their theory on possible distractions,
they argued that interaction with in-vehicle devices causes potential distractions, since driving
performance, such as the ability to maintain speed, throttle control and lateral position on the
road, gets impaired. Sources of distraction include HCI-design complexity,
secondary task [19] related operations, and driving environment and characteristics [20]. When
the difficulty of the secondary and/or driving tasks increases, degradation in driving
performance becomes more pronounced. Older drivers as well as young novice drivers were found
to be more susceptible to distraction when engaged in secondary tasks while driving than
experienced or middle-aged drivers. Such distractions also impair drivers' visual search
patterns, reaction times and decision-making processes, and can increase the likelihood of a
collision.
[13] Self Reported Measures: factors perceived by drivers themselves affecting their mental and physical fatigue.
[14] Environmental Stressors: such as bad weather, poor visibility, poor road conditions, traffic jams etc.
[15] Personality Factors: like aggressiveness, frustration in judging some driving events, fatigue proneness etc.
[16] Cognitive Stress Processes: appraisal of external demands and personal control, choice and regulation of coping etc.
[17] Subjective Outcomes: anxiety, anger, tiredness etc.
[18] Performance Outcomes: impairment of psychomotor control and changes in speed.
[19] Secondary Tasks: such as interacting with hand-held or hands-free devices like mobile phones, route guidance systems etc.
[20] Driving environment and characteristics include age and driving experience level etc.
Dong et al. (2011) surveyed the driver inattention problem and classified it into two
main categories: distraction and fatigue. They used a task-based definition, defining
distraction as a situation in which drivers can pay attention, but their attention is divided
between the primary task of driving and some secondary task like answering a phone, looking at
the navigation system or at any attractive object or advertisement. According to them, fatigue is
a situation in which the driver loses concentration due to energy drain and cannot pay sufficient
attention to driving. They reported that commercial inattention-measuring tools perform well
only in constrained driving conditions, not in real driving conditions, and that their
performance measures cannot be evaluated for scientific purposes. They studied
five types of measures available in the scientific literature for detecting driver inattention:
(1) Subjective Report measures; (2) Driver Biological measures; (3) Driver Physical
measures [21]; (4) Driving Performance measures; and (5) Hybrid measures (combination of
all). They observed that hybrid measures are more reliable since they accurately detect
the driver's inattention and minimize the number of false alarms. They suggested combining the
data obtained from the following three distinct sources: (1) driver physical variables;
(2) driving performance variables; and (3) information from the In-Vehicle Information
Systems (IVIS) [22]. Characteristics of the driving environment such as road type, weather
conditions, and traffic density may additionally be considered.
Wen et al. (2011) emphasized that designs based on driver-oriented cognition models
will enhance the existing advanced driver assistant systems (ADASs), starting a new trend
of designing cognitive vehicles, which will relieve drivers' burdens, worries, and frustrations
and minimize accidents. Normally, driving includes four subtasks: (i) long term plans,
(ii) momentary stimuli, (iii) decision making, and (iv) actions. They categorized driver
cognition research aspects into (a) environmental stimuli-response based modeling and
recognition, (b) driver's physiological and psychological status recognition, and (c) driver's
decision and reaction recognition. Physiological status can be estimated by monitoring the
driver's reaction times, which may be delayed for some stimuli because drivers become less
sensitive when fatigued. It is important to overcome the challenges involved in a proper
analysis that covers other psychophysiological issues such as (a) differentiating each driver's
personality and psychological status; (b) analyzing abnormal driving behavior and
diagnosing its causes, such as fatigue, alcohol, drugs, or heart attack; and (c) evaluating the
effect of countermeasures such as loud music, caffeine, or perfume to cure fatigue and
affect emotions. They identified three main difficulties: (i) collecting effective
information from stimuli-response and decision-action models [23]; (ii) finding a reliable model
to measure the human mind, as identifying the stimulus which triggers a particular behavior
is cumbersome; and (iii) the scarcity of sophisticated experiments which can
effectively identify stimuli for drivers having different tolerance levels.
[21] Driver Physical Measures or Variables: eye closure duration, blink frequency, nodding frequency, fixed gaze, and frontal face pose etc.
[22] In-Vehicle Information Systems (IVIS) may include traffic information and/or guidance systems, mobile phones, vehicle diagnostics and/or warning systems and emergency help systems (Available Online: http://www.umtri.umich.edu/our-focus/vehicle-information-systems).
Therefore, the problem of driver's inattention [24], distraction, stress and fatigue level
monitoring can be categorized in the following three broad categories:
1) Computer Vision based Driver's Inattention Detection Techniques
2) Physiological Sensors based Stress Level and Fatigue Monitoring Techniques
3) Hybrid Techniques for Stress Level and Fatigue Monitoring
2.4.1 Computer Vision based Driver's Inattention Detection Techniques
Ji et al. (2004) developed a nonintrusive prototype computer vision system for real-time
monitoring of drivers. They acquired video images of drivers using remotely located
charge-coupled-device cameras equipped with active infrared illuminators. Several visual cues
characterizing the level of alertness of a person were extracted, such as eyelid movement,
gaze movement, head movement, and facial expression. Finally, they developed a Bayesian Network
(BN) based probabilistic model to model human fatigue and to predict fatigue based on the
visual cues obtained. The developed system was validated with subjects of different ethnic
backgrounds, genders, and ages; with/without glasses; and under different illumination
conditions, and was found to be reasonably robust, reliable, and accurate in a real-life
environment.
Bergasa et al. (2006) developed a nonintrusive prototype computer vision system
consisting of an active IR illuminator and the required software algorithms to characterize the
fatigue level of drivers in real driving conditions. They calculated six visual parameters
from the driver's images: percent eye closure (PERCLOS), eye closure duration, blink
frequency, nodding frequency, face position, and fixed gaze. They implemented a fuzzy
classifier to merge all these parameters into a single driver's inattentiveness level (DIL). Their
system can be viewed as a drowsiness detection system as well.
[23] Due to the fact that many environmental stimuli inputs do not always generate the corresponding behavior and can be easily interrupted or changed during driving.
[24] Driver's inattention has three forms: (a) distraction, (b) looked but did not see and (c) sleepy or fell asleep (Wang et al., 1996).
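To make the PERCLOS and blink-frequency parameters concrete, the following minimal sketch computes them from a sequence of per-frame eye-openness values. It is not the algorithm of Bergasa et al. (2006): the normalized openness scale (0 = fully closed, 1 = fully open), the 0.2 closure threshold and the function names are assumptions made purely for illustration.

```python
def perclos(eye_openness, closed_threshold=0.2):
    """Percentage of frames in which the eye counts as closed.

    eye_openness: per-frame openness values, assumed normalized to
        the range 0 (fully closed) to 1 (fully open).
    closed_threshold: openness at or below which a frame is treated
        as closed (a hypothetical value for this sketch).
    """
    if not eye_openness:
        raise ValueError("eye_openness must not be empty")
    closed = sum(1 for v in eye_openness if v <= closed_threshold)
    return 100.0 * closed / len(eye_openness)


def blink_count(eye_openness, closed_threshold=0.2):
    """Count open-to-closed transitions in the frame sequence."""
    count = 0
    was_closed = False
    for v in eye_openness:
        is_closed = v <= closed_threshold
        if is_closed and not was_closed:
            count += 1
        was_closed = is_closed
    return count


if __name__ == "__main__":
    frames = [1.0, 0.9, 0.1, 0.0, 0.8, 0.15, 1.0, 1.0, 0.9, 0.05]
    print(perclos(frames))      # share of closed frames, in percent
    print(blink_count(frames))  # number of closure events
```

Dividing the closure count by the window duration would give a blink frequency; a real system would also need to distinguish short blinks from long closures, which this sketch does not attempt.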
Liang et al. (2007) performed a simulator-based experiment in which the participants
interacted with In-Vehicle Information Systems (IVIS) to collect data for detecting cognitive
distraction of drivers. They used the drivers' eye movements and driving performance data to
train and test both Support Vector Machine (SVM) and Logistic Regression models to
develop a real-time approach. Three different characteristics of the model were investigated:
how distraction was defined, which data were input to the model, and how the input data
were summarized. Although eye movement data performed better than driving
performance data, they recommended using both eye and driving measures as
inputs to a distraction-detection algorithm as the more viable approach.
2.4.2 Physiological Sensors based Stress Level and Fatigue Monitoring Techniques
Lisetti and Nasoz (2004) established that physiological signals like Galvanic Skin
Response (GSR), Blood Oxygen Saturation (SpO2), Electrocardiogram (ECG) and
Photoplethysmogram (PPG) signals can be used in assessing startle as well as instantaneous
stress. Healey and Picard (2005) collected physiological data (ECG, EMG, skin conductance
and respiration) during real-world driving tasks to determine three levels of driver stress (low,
medium and high). They found that for most drivers, skin conductivity and heart rate metrics
are closely correlated with driver stress level. They established that a physiological signal
based metric of driver stress can help manage noncritical in-vehicle information systems and
could also provide a continuous measure of how different road and traffic conditions affect
the driver's stress in future cars. Katsis et al. (2008) adopted a bio-signal (facial EMG, ECG,
respiration and electrodermal activity) processing approach to recognize four emotional
states (high stress, low stress, euphoria, and disappointment) of car-racing drivers in a
simulated driving environment. In another work, they classified four affective states
(high stress, low stress, dysphoria and euphoria) in a simulated car racing environment
with the same set of biosignals, which established that biosignals are effective sensing
parameters (Katsis et al., 2011).
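As an illustration of what "heart rate metrics" can look like in practice, the sketch below derives two commonly used quantities from a list of R-R intervals: the mean heart rate and RMSSD, a short-term heart rate variability measure. These particular metrics and the function names are assumptions chosen for this example; the studies cited above may have used different features.

```python
import math


def mean_heart_rate_bpm(rr_ms):
    """Mean heart rate in beats per minute from R-R intervals in ms."""
    if not rr_ms:
        raise ValueError("rr_ms must not be empty")
    mean_rr = sum(rr_ms) / len(rr_ms)
    return 60000.0 / mean_rr


def rmssd_ms(rr_ms):
    """Root mean square of successive R-R interval differences, in ms."""
    if len(rr_ms) < 2:
        raise ValueError("at least two R-R intervals are required")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    rr = [800, 810, 790, 800]       # hypothetical R-R intervals in ms
    print(mean_heart_rate_bpm(rr))  # beats per minute
    print(rmssd_ms(rr))             # variability in ms
```

In a wearable setting the R-R intervals would first have to be extracted from the ECG or PPG signal, which is outside the scope of this sketch.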
Khushaba et al. (2011) employed electroencephalogram (EEG), electrooculogram (EOG),
and electrocardiogram (ECG) signals for detecting driver drowsiness in a simulated driving
test environment. They developed an efficient fuzzy mutual-information (MI)-based wavelet
packet transform (FMIWPT) feature-extraction method for classifying the driver drowsiness
state into one of the predefined drowsiness levels. The extracted features correlated with the
different drowsiness levels, achieving a classification accuracy of 95%–97% on average across
all subjects.
It may be interesting to note here that there are works where biosignals have been used
for identification of mental stress other than in a driving scenario. Choi et al. (2012)
developed a wearable sensor platform to monitor a number of physiological correlates of
mental stress. The system consists of wireless sensor nodes attached to physiological sensors
and a holster unit consisting of data processing unit, a sensor hub (integrating a GPS, an
accelerometer, a real-time clock), and a Lithium-polymer battery. They extracted features
from the signals collected besides proposing a new spectral feature that estimates the balance
of the autonomic nervous system by combining information from the power spectral density
of respiration and heart rate variability. The device was validated by exposing the subjects
to two psychophysiological conditions: mental stress and relaxation. A logistic regression
model was able to discriminate between these two mental states with a success rate of 81%
across subjects.
Thus, it can be seen that biosignals or physiological signals may be used as credible
indicators useful for monitoring driver's stress in various driving conditions.
2.4.3 Hybrid Techniques for Stress Level and Fatigue Monitoring
Malta et al. (2009) studied driver behavior under hazardous scenarios by utilizing
brake pedal force, speed, and speech signals to detect incidents in a real-world driving
database of 373 drivers. Analysis of the results addressed the individuality in driver
behaviors, the multimodality [25] of driver reactions, and the detection of potentially
dangerous locations. They identified 25 potentially hazardous scenes in the database, which
were hand labeled and categorized, and a detection feature based on the joint histograms of
behavioral signals and their time derivatives was satisfactorily applied to the indication of
anomalies in driving behavior. Of the 25 scenes, 17 were observed due to brake pedal reactions
whereas 11 were due to verbal reactions, with true positive (TP) rates of 100% and
54% and false positive (FP) rates of 4.1% and 6.4%, respectively. They recommended
that future analysis of driving behavior signal processing must consider the individuality of
driver reactions and the integration of multimodal responses to hazards.
Yang et al. (2009) analyzed the performance of 12 subjects during a simulated driving
condition to study the driver–vehicle interaction characteristics. The drivers were subjected to
a series of stimulus-response tasks and had to perform routine driving tasks under either
partially sleep-deprived or non-sleep-deprived conditions. The results demonstrated that sleep
deprivation had a greater effect on rule-based [26] than on skill-based [27] cognitive functions.
The drivers' performance in responding to unexpected disturbances degraded, while they were
robust enough to continue routine driving tasks such as lane tracking, vehicle following, and
lane changing. They suggested that driving performance in rule-based tasks such as stopping
at traffic signals should be investigated further for the effective design of drowsy-driver
detection systems. Skill-based tasks, which cover most driving tasks, may be used to provide
early indicators of drowsy driving; deterioration of such tasks may also indicate the existence
of other driving impairments such as inebriation, motion sickness, stress, or inattention.
[25] Multimodality refers to the set of data consisting of images, driving behavior, location and speech signals.
Subsequently, Malta et al. (2011) proposed a
method for estimating a driver's spontaneous frustration in the real world by integrating
information about the environment, the driver's emotional state, and the driver's responses in
a single model. Drivers interacted with an automatic speech recognition (ASR) system to
retrieve and play music while the data was being collected in a vehicle instrumented with
several sensors. They used a Bayesian Network (BN) to combine knowledge on the
driving environment assessed through data annotation, speech recognition errors, the driver's
emotional state (frustration), and the driver's responses measured through facial expressions,
physiological condition, and gas- and brake-pedal actuation. The results showed an overall
true positive rate of 80% and a false positive rate of 9%, i.e., the system
correctly estimates 80% of the frustration and, when drivers are not frustrated, makes
mistakes 9% of the time. They suggested including ASR systems and gas- and brake-pedal
sensors in future studies on emotion recognition and interface design. They argued that
automatic quantization of frustration levels will provide more satisfactory results than a
simple manual selection.
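The true positive and false positive rates quoted above follow the usual confusion-matrix definitions, TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The short sketch below computes both from raw counts. The counts in the usage example are hypothetical numbers chosen only to reproduce rates of 80% and 9%; they are not data from Malta et al. (2011).

```python
def detection_rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate).

    tp: frustrated segments correctly flagged as frustrated
    fn: frustrated segments that were missed
    fp: non-frustrated segments wrongly flagged as frustrated
    tn: non-frustrated segments correctly left unflagged
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes need at least one sample")
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr


if __name__ == "__main__":
    # Hypothetical counts: 100 frustrated and 100 non-frustrated segments.
    tpr, fpr = detection_rates(tp=80, fn=20, fp=9, tn=91)
    print(f"TPR = {tpr:.0%}, FPR = {fpr:.0%}")
```

Reporting the two rates together matters for a driver assistance system, since a high detection rate bought at the cost of frequent false alarms would erode the driver's trust.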
In a recent study, Das et al. (2012) studied alcohol-induced driving impairment
through vehicle-based sensor signals in a driving simulator. They collected steering wheel
movement sensor data from 108 drivers with and without alcohol-induced impairment to
differentiate between driving conditions. Various quantitative measures of steering wheel
movement, including simple statistics such as mean and standard deviation and nonlinear
dynamic invariant measures such as entropy and the Lyapunov exponent, were extracted, and
their performance was evaluated using cluster separation indices such as the Fisher linear
discriminant and the Gamma index. In the next step, several instances of genetic algorithms
(GA) were run and combined to evolve a parallel genetic algorithm (PGA) to cluster the drivers
into subgroups and to examine the within-group separation. It was observed that the nonlinear
invariant measures captured the characteristics of the signal better than the simple
statistics.
[26] Real-Time tasks and tracking tasks with unexpected disturbances.
[27] when drivers were sleep-deprived.
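The kind of feature comparison described for the steering wheel data can be sketched as follows. The example computes simple statistics and a histogram-based Shannon entropy for a signal, and a one-dimensional Fisher separation ratio between two groups of feature values. The Shannon entropy is used here only as a simple stand-in; the entropy measure, the features and the exact separation index used by Das et al. (2012) may differ, and all function names are hypothetical.

```python
import math
import statistics


def simple_stats(signal):
    """Return the mean and population standard deviation of a signal."""
    return statistics.mean(signal), statistics.pstdev(signal)


def shannon_entropy(signal, bins=8):
    """Histogram-based Shannon entropy of a signal, in bits."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal carries no uncertainty
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in signal:
        # The maximum value is clamped into the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(signal)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def fisher_ratio(group_a, group_b):
    """Squared mean difference divided by the summed variances.

    Larger values mean the feature separates the two groups better.
    Raises ZeroDivisionError if both groups have zero variance.
    """
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    spread = statistics.pvariance(group_a) + statistics.pvariance(group_b)
    return mean_diff ** 2 / spread


if __name__ == "__main__":
    # Hypothetical feature values for sober and impaired drivers.
    sober = [1.0, 2.0, 3.0]
    impaired = [4.0, 5.0, 6.0]
    print(fisher_ratio(sober, impaired))
    print(shannon_entropy([0, 1, 2, 3], bins=4))
```

A feature with a higher Fisher ratio would be preferred when clustering drivers, which mirrors the comparison between nonlinear measures and simple statistics reported in the study.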
Thus, it can be concluded that the driver's inattention, stress and fatigue could be
monitored using hybrid techniques among the other techniques reviewed.
2.5 Wearable Driver Assistance Systems: A Need Analysis
The foregoing section dealt with the ADAS definitions, functional characteristics,
classifications and enabling technologies and other related issues. The foregoing analysis
suggests that by its very nature an ADAS is driver-centric. Most importantly, drivers'
inattention and distraction were measured with the help of visual information and computer
vision techniques, whereas their psychological and physiological factors were measured with
the help of physiological sensors like ECG, GSR, Respiration, EMG etc.
Lisetti and Nasoz (2004) used the SenseWear Armband wearable computer to collect
physiological signals (galvanic skin response, heart rate, temperature) from the autonomic
nervous system and mapped them to certain emotions such as sadness, anger, fear, surprise,
frustration, and amusement. They suggested that multimodal affective intelligent user
interfaces will become a reality in telemedicine, driving safety, and learning once
the research is fully mature. Healey and Picard (2005) envisaged the development of an
integrated vehicular or body-worn sensor configuration for calculating the driver's stress
level in real-time. In an approach towards affective state recognition in automotive drivers
using on-road experimentation, they investigated the use of bioelectric signals to infer the
driver's internal state. Katsis et al. (2008) chose bio-signal processing over
vision-based and speech-based methods to recognize the four emotional states of car-racing
drivers in a simulated environment, designing a wearable system under the AUBADE
project. They further demonstrated this wearable system's applicability to affective state
monitoring in a simulated car racing environment, which established that biosignals are
effective sensing parameters (Katsis et al., 2011). Their choice was mainly motivated by the
fact that vision-based algorithms require a compatible illumination source to function
well, which may not naturally be available during night-time driving. Use of an artificial
illumination source would add to the distraction of the driver and is better avoided.
Speech-based methods have been successfully adopted in hospital-level distress
monitoring systems, but in automobiles these systems are susceptible to malfunctioning
due to noise and traffic.
However, if we compare the respective prototypes used in these projects (Table 2.4), it is
evident that none of them can be considered a complete wearable device: some are
commercially available data logging systems not suitable for wearing, while others are
too poorly designed to be considered truly wearable.
Table 2.4: Wearable Sensing and Data Acquisition Modules used for Driver Stress Analysis
Research Works | Features | Sensors Used | Real-Time / Simulated Driving | Wearable Module (Commercial / Custom) | Remarks / Shortcomings
Lisetti and Nasoz (2004) | Six emotions: sadness, anger, surprise, fear, frustration, and amusement | Galvanic skin response, heart rate, and temperature | Simulated | Commercial (The SenseWear™ Armband from Body Media Inc.) | Armband is a wearable device.
Healey and Picard (2005) | Overall Stress Level Analysis | ECG, SpO2, Respiration, GSR | Real-time | Commercial (Procomp Infinity, Thought Technology) | The device is not wearable.
Katsis et al. (2008, 2011) | Emotion and Affective State Recognition of Car racing drivers | Surface EMG, ECG, Respiration, EDA | Simulated | Custom Designed | Wearable (limited to car racing only)
Traditionally, the wearable prototypes developed so far have concentrated on monitoring the
health status of a person by embedding the typical elements of an embedded system, such as
sensors (physiological), processors, wireless channels etc., with or without an operating system.
Pantelopoulos and Bourbakis (2010) surveyed wireless health monitoring systems (WHMS) and
categorized them as follows: (a) systems based on a microcontroller board or on
custom-designed platforms, which use wired communication to collect the physiological data;
(b) systems based on smart textiles, which integrate the biosensors and processing elements
on a vest or jacket; (c) mote-based Body Area Networks (BAN), formed with the help of tiny
mote sensor nodes, where a single mote collects one or more physiological signals and
transmits them wirelessly to a central node or base station; (d) WHMS based on commercial
Bluetooth® sensors and cell phones, which attach biosensors and exploit the computing power
of the cell phone processor; and (e) other types of WHMS that integrate the biosensors on a glove.
However, in the case of a WDAS, a viable solution could combine a microcontroller and
smart textiles for building the device. Since drivers are busy most of the time with
tasks demanding attention and maneuvering, it would be inappropriate to use cell-phone-based
devices while driving for safety reasons, whereas the use of a glove may make the
driver uncomfortable. Therefore, the WDAS could be designed either in the form of a jacket
or as a lightweight wrist-worn device, as discussed above.
In most cases the wearable device does not perform the intelligent signal
processing task itself; instead, this is done offline or on a separate processing unit. In
driving conditions, however, the delays in offline computation, alarm generation and
communication between the wearable device and another processing unit would put drivers at risk.
Although it has been suggested that some of these devices could be used for driver stress
and health monitoring, the driver-centric human factors design requirements are missing
from those systems. Therefore, it is important that a WDAS be developed by incorporating
select features of a wearable health monitoring device as well as by considering human
factors, local processing, robustness, fault tolerance, one-to-one communication and
mechanisms for proactive alert generation to assist the drivers.
2.6 Wearable Sensing Parameters and their Effects on the Autonomic Nervous
System (ANS)
The physiological signal parameters which may affect the autonomic nervous system of
drivers, leading to stress and fatigue, could be monitored using wearable sensors. The
following subsections present the parameters to be sensed, their origin and how they can be
utilized to develop a wearable driver assistance system.
2.6.1 Human Physiology, The Nervous System and Stress
Reflexes are automatic, instinctive, unlearned physiologic reactions to a stimulus. In
other words, a reflex is an involuntary bodily response, made without conscious control, to a
stimulus arising within or outside the body (Ebneshahidi, 2009). Reflexes maintain nearly
constant conditions in the internal environment of the body, a state that physiologists call
homeostasis (Guyton and Hall, 2006). Autonomic reflexes maintain physiologic equilibrium
through complex biological mechanisms that involve all organs and tissues of the body and act
via the autonomic nervous system to offset disrupting changes. Autonomic reflexes regulate
functions such as heart rate, breathing rate, blood pressure, and digestion. They also carry
out automatic actions such as swallowing, sneezing, coughing, and vomiting. In addition,
reflexes maintain balance and posture (for example, spinal reflexes control the trunk and limb
muscles), and brain reflexes, which involve a reflex center in the brain stem, control
functions such as eye movement.
The human nervous system coordinates the voluntary and involuntary actions of the human
body by transmitting signals that ensure the smooth functioning of the various systems in the
body. It is responsible for the regulation of rapid events such as muscular contractions and
the secretions of most of the glands in the body. Sensory receptors, such as visual receptors
in the eyes, auditory receptors in the ears, tactile receptors on the body surface, or other
kinds of receptors, initiate the activities of the nervous system (Guyton and Hall, 2006). The
nervous system mainly consists of two parts, the central nervous system (CNS) and the
peripheral nervous system (PNS). The CNS consists of the brain and spinal cord and contains
the majority of the nervous system; it integrates the information received from all parts of
the body to coordinate and control the motor activities. The PNS consists mainly of nerves
(long fibers) and ganglia²⁸ outside of the brain and spinal cord. The PNS connects the CNS to
every other part of the body such as limbs and organs. The PNS is divided into the somatic
nervous system (SoNS) and the autonomic nervous system (ANS). The SoNS is responsible for
voluntary control of body movements via skeletal muscles. It also transmits the sensory
information from the sensory receptors of the entire body surface to the CNS.
The autonomic nervous system (ANS) controls, partially or completely, most of the visceral
functions of the body, such as arterial pressure, gastrointestinal motility, gastrointestinal
secretion, urinary bladder emptying, sweating, body temperature, and many other activities
(Guyton and Hall, 2006). The ANS acts very rapidly, i.e. within seconds, and regulates those
bodily activities which are beyond conscious control. The ANS consists of two subsystems
which act in opposition to each other, an excitatory sympathetic nervous system (SNS) and an
inhibitory parasympathetic nervous system (PSNS). The SNS is the dominant system during
physical or psychological stress; for example, an increased pulse, or heart rate, is
characteristic of this state of arousal. The PSNS is dominant during relaxation (periods of
relative safety and stability) and maintains a lower degree of physiological arousal and a
decreased heart rate (Appelhans and Luecken, 2006).
28 Ganglia: a neural structure consisting of a collection of cell bodies or neurons.
Stress reflects some kind of emotional unease. One can experience anything from low-grade
feelings of emotional unrest to intense emotional turmoil. A person may feel stressed either
in direct response to external situations or events, or due to internal emotional processes
and attitudes (McCraty, 2006). Emotional stress can take varied forms such as feelings of
agitation, worry, and anxiety; anger, judgmentalness, and resentment; discontentment and
unhappiness; insecurity and self-doubt.
Mental or physical stress can excite the sympathetic system to provide extra activation of
the body in states of stress; this is called the sympathetic stress response or alarm
reaction (Guyton and Hall, 2006). The reaction arises from the phenomenon of mass discharge,
in which large portions of the sympathetic nervous system discharge at the same time. This
enables the human body to perform vigorous muscle activity that would not otherwise be
possible, through the combined effect of increased arterial pressure, blood flow and mental
activity, among others. The SNS is strongly activated in many emotional states such as rage
and in fight-or-flight situations. Therefore, sensing parameters which reflect the sympathetic
responses of drivers in driving situations will be of great help in assessing stress and
emotions. Figure 2.2 depicts the organization of the human nervous system and the reflex
control pathways that lead to the stress response.
2.6.2 Heart Rate (HR) and Heart Rate Variability (HRV)
Heart rate variability (HRV) is defined as the spontaneous fluctuation in sinus rate (the
variation in the time interval between heart beats, also known as the beat-to-beat interval)
due to internal and external body processes (Kristal-Boneh et al., 1995). Fundamentally, HRV
is derived from heart rate (HR), which is the number of heart beats per minute (BPM). HRV is
usually measured as the standard deviation of all cardiac cycle lengths²⁹ about their mean
over a given period, most commonly 5 minutes. HRV reflects the time-varying influence of the
autonomic nervous system (ANS) and its components, i.e. the sympathetic and parasympathetic
systems, on cardiac function. The two autonomic branches regulate the lengths of time between
consecutive heartbeats, or the interbeat intervals, with faster heart rates corresponding to
shorter interbeat intervals and vice versa (Appelhans and Luecken, 2006).
Two popular heart activity measurement techniques have been reported in the literature,
electrocardiography (ECG) and photoplethysmography (PPG), of which ECG is the most commonly
used.
29 R-R intervals for normal sinus beats
2.6.2.1. Electrocardiography (ECG)
The ECG measures heart activity, heart rate (HR) and heart rate variability (HRV) by
detecting voltages on the surface of the skin resulting from heartbeats. The ECG waveform is
characterized by five waves, namely the P, Q, R, S, and T waves. ECG recordings include
measurements of the durations of these waves, from which various inferences can be drawn. ECG
can give a precise estimate of instantaneous heart rate by detecting the sharp R-wave peaks.
Detection of successive R-wave peaks (within the QRS complex) is used to calculate inter-beat
intervals (IBI). Physiologists have found that there exists a correlation between the cardiac
nervous activity and the immediate R-R interval. A typical ECG waveform is shown in
Figure 2.3.
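The step from R-wave peaks to inter-beat intervals and heart rate can be sketched as follows. This is a minimal illustration, not the detector used in this work: the threshold-plus-refractory-period rule, the function names and the synthetic spike train standing in for an ECG are all assumptions made for the example.

```python
import math

def detect_r_peaks(signal, fs, threshold, refractory_s=0.25):
    """Crude R-peak detector (illustrative): local maxima above `threshold`
    at least `refractory_s` seconds after the previously accepted peak."""
    min_gap = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if is_local_max and signal[i] > threshold:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return peaks

def inter_beat_intervals_ms(peaks, fs):
    """Time between successive R peaks, in milliseconds."""
    return [(b - a) * 1000.0 / fs for a, b in zip(peaks, peaks[1:])]

def mean_heart_rate_bpm(ibi_ms):
    """Mean heart rate in beats per minute, from the mean IBI."""
    return 60000.0 / (sum(ibi_ms) / len(ibi_ms))

# Synthetic stand-in for an ECG (assumption): one narrow spike every 0.8 s
# riding on a slow baseline wander, sampled at 250 Hz for 10 s.
fs = 250
beat_centres = range(100, 10 * fs, 200)  # 200 samples = 0.8 s between beats
ecg = []
for i in range(10 * fs):
    baseline = 0.05 * math.sin(2 * math.pi * 0.3 * i / fs)
    spikes = sum(math.exp(-0.5 * ((i - c) / 3.0) ** 2) for c in beat_centres)
    ecg.append(baseline + spikes)

peaks = detect_r_peaks(ecg, fs, threshold=0.5)
ibi_ms = inter_beat_intervals_ms(peaks, fs)
hr_bpm = mean_heart_rate_bpm(ibi_ms)
print(peaks[:3], ibi_ms[:3], round(hr_bpm, 1))
```

A real ECG would need band-pass filtering and an adaptive threshold before a rule this simple becomes usable; the sketch only shows how the IBI series and HR follow once the R peaks are located.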
Figure 2.2: Human Physiology and Reflex-Control leading to Stress
[Block diagram: the human nervous system regulates rapid events like muscular contractions and
glandular secretions and divides into the central nervous system (CNS: brain and spinal cord)
and the peripheral nervous system (PNS: nerves and groups of neurons outside the brain and
spinal cord). The PNS divides into the somatic nervous system (SoNS: voluntary control of the
skeletal muscles) and the autonomic nervous system (ANS: involuntary control of smooth muscle,
gland activity and cardiac muscle; used for stress detection). The ANS divides into the
parasympathetic system (PSNS: relaxed activity, body maintenance such as food digestion) and
the sympathetic system (SNS: dominant in emergency, fight or flight situations), which
produces the sympathetic stress response or alarm reaction.]
The HR normally lies in the range of 60 to 100 BPM, and 72 BPM is usually taken as a
typical normal value. Associated terms include tachycardia, bradycardia, and arrhythmia.
Tachycardia means a fast heart rate. It is an abnormal condition in which HR exceeds 100 BPM
(for example, about 150 BPM). Tachycardia is caused by increased body temperature, stimulation
of the heart by the sympathetic nerves, or toxic conditions of the heart (Guyton and Hall,
2006). Bradycardia means a slow heart rate. It is a condition in which HR falls below 60 BPM,
and it is commonly found in athletes. Arrhythmia is a condition in which the heart beats are
not evenly spaced. Arrhythmia can result from certain circulatory conditions that alter the
strengths of the sympathetic and parasympathetic nerve signals to the heart sinus node.
Whenever a heart block occurs, one or more of the basic features of the ECG waveform will be
altered or missing. For example, a P-R interval greater than 0.2 sec can suggest a block at
the atrioventricular (A-V) node (Guyton and Hall, 2006).
Figure 2.3: A Typical ECG Waveform
2.6.2.2. Photoplethysmography (PPG)
Photoplethysmography (PPG) is a non-invasive optical measurement technique that detects
blood volume changes in the microvascular bed of tissue, yielding a peripheral pulse known as
the blood volume pulse (BVP) or PPG pulse, which is synchronized to the heart beat (Allen,
2007). The PPG technique has been used in many clinical applications to build medical devices
such as pulse oximeters to measure oxygen saturation, digital beat-to-beat blood pressure
measurement systems to measure blood pressure and cardiac output, vascular diagnostic devices
for detecting peripheral vascular disease, and devices for assessing autonomic function.
A PPG device consists of optoelectronic components: a light source which emits light and
a photodetector which receives the light reflected from the skin. With each heart beat, blood
is forced through the blood vessels, producing an engorgement of the peripheral vessels under
the light source and modifying the amount of light reflected to the photodetector. This
reflectance gives a relative measure of the amount of blood in the capillaries, from which
heart rate can be derived. Relaxation, sleeping and vegetative states generally reduce
cellular needs, and hence the blood flow reaching the cells. Exercise, responding to
stressors, and even just standing up may create greater cellular needs for oxygen and blood
nutrients (Guyton and Hall, 2006).
The PPG or BVP sensors can be placed anywhere on the body where the capillaries are
close to the surface of the skin, but peripheral locations such as the fingers are recommended
for studying emotional responses (Allen, 2007). PPG sensors do not require gels or adhesives,
but the PPG reading is very sensitive to variations in placement and to motion artifacts.
2.6.2.3. HRV Measurement Techniques
The IBI series is used to compute three classes of HRV measures: statistical, geometrical
and frequency-based. The geometrical measures are seldom reported in the literature as they
provide less precise measures of HRV; hence the statistical (time-domain) and spectral
(frequency-domain) methods are the two popular techniques (Appelhans and Luecken, 2006).
1) Time-domain or Statistical Methods:
Time-domain or statistical analysis applies variance-based calculations to the IBI values
to yield numerical estimates of HRV in temporal units, e.g. in milliseconds (Appelhans and
Luecken, 2006). Each QRS complex in a continuous ECG record is detected, and the
normal-to-normal (NN) intervals, i.e. the intervals between adjacent QRS complexes resulting
from normal sinus beats, or the instantaneous heart rate are then determined (Malik et al.,
1996). Some of the time-domain features extracted from the NN intervals are:
(a) AVNN and SDANN: AVNN is the average of all NN intervals; SDANN is the standard deviation
of the averages of NN intervals in all 5 min segments of the entire recording.
(b) SDNN: This is the standard deviation of all NN intervals, i.e. the square root of the
variance.
(c) rMSSD: This is the square root of the mean of the squares of the differences between
adjacent NN intervals, one of the most commonly derived features.
(d) pNN50: This is derived by dividing NN50 by the total number of NN intervals. Here, NN50
is the number of pairs of successive NN intervals that differ by more than 50 ms.
(e) pNN20: This is the ratio of NN20 to the total number of NN intervals, where NN20 is the
number of pairs of successive NN intervals that differ by more than 20 ms.
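The time-domain features listed above reduce to a few lines of arithmetic on the NN series. The sketch below is illustrative only: the function name and the six sample intervals are hypothetical, the sample (n - 1) form of the standard deviation is an assumed choice, and SDANN is omitted because it needs a recording spanning several 5 min segments.

```python
import math

def time_domain_hrv(nn_ms):
    """Return AVNN, SDNN, rMSSD (ms) and pNN50, pNN20 (%) for NN intervals in ms."""
    n = len(nn_ms)
    avnn = sum(nn_ms) / n
    # Sample standard deviation (n - 1 in the denominator): an assumed convention.
    sdnn = math.sqrt(sum((x - avnn) ** 2 for x in nn_ms) / (n - 1))
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    nn20 = sum(1 for d in diffs if abs(d) > 20)
    return {
        "AVNN": avnn,
        "SDNN": sdnn,
        "rMSSD": rmssd,
        # Following the definition above: counts divided by the number of NN intervals.
        "pNN50": 100.0 * nn50 / n,
        "pNN20": 100.0 * nn20 / n,
    }

# Hypothetical NN intervals in milliseconds, for illustration only.
nn_example = [800, 810, 790, 850, 795, 805]
features = time_domain_hrv(nn_example)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Some implementations divide NN50 and NN20 by the number of successive differences rather than by the number of intervals; for long recordings the two conventions give nearly identical values.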
2) Frequency-domain or Spectral Methods:
In this method spectral features are extracted from short-term HRV recordings. Power
spectral density (PSD) analysis provides the basic information on how the power (i.e. the
variance) is distributed as a function of frequency (Malik et al., 1996). Algorithms such as
the Fast Fourier Transform decompose the HRV signal into its individual spectral components,
which the PSD analysis groups into three distinct bands, resulting in the following features:
(a) VLFP: This represents the power in the very low frequency range (approx. ≤ 0.04 Hz) and
is commonly excluded from analysis as its physiological significance is less well defined.
(b) LFP: This is the spectral power in the low frequency range (0.04–0.15 Hz). This reflects a
combination of sympathetic and parasympathetic ANS responses.
(c) HFP: This is the spectral power in the high frequency range (0.15–0.40 Hz). This
reflects the vagal modulation of cardiac activity.
(d) LF/HF Ratio: The LF/HF power ratio is used as an index for assessing sympatho-vagal
balance.
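A rough sketch of how these band powers can be obtained from an NN series is given below. It is not the procedure used in this work: the 4 Hz linear resampling, the plain periodogram computed with a direct discrete Fourier transform (in place of an FFT library), the function names and the synthetic NN series are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math

def resample_nn(nn_ms, fs=4.0):
    """Linearly interpolate the NN series onto an evenly spaced time grid."""
    beat_times, t = [], 0.0
    for nn in nn_ms:
        t += nn / 1000.0          # each interval is stamped at the beat that ends it
        beat_times.append(t)
    n_out = int((beat_times[-1] - beat_times[0]) * fs)
    out, j = [], 0
    for i in range(n_out):
        ti = beat_times[0] + i / fs
        while beat_times[j + 1] < ti:
            j += 1
        frac = (ti - beat_times[j]) / (beat_times[j + 1] - beat_times[j])
        out.append(nn_ms[j] + frac * (nn_ms[j + 1] - nn_ms[j]))
    return out

def band_powers(nn_ms, fs=4.0):
    """Periodogram power (ms^2) in the VLF, LF and HF bands, plus the LF/HF ratio."""
    x = resample_nn(nn_ms, fs)
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]     # remove the DC component
    bands = {"VLF": (0.0, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.40)}
    power = {name: 0.0 for name in bands}
    df = fs / n                   # frequency resolution in Hz
    for k in range(1, int(0.40 / df) + 1):
        f = k * df
        xk = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        psd = 2.0 * abs(xk) ** 2 / (fs * n)   # one-sided periodogram, ms^2/Hz
        for name, (lo, hi) in bands.items():
            if lo < f <= hi:
                power[name] += psd * df
    power["LF/HF"] = power["LF"] / power["HF"]
    return power

# Synthetic 5 min NN series (assumption): 800 ms mean with a strong 0.10 Hz (LF)
# oscillation and a weaker 0.25 Hz (HF) oscillation.
nn_series, t = [], 0.0
while t < 300.0:
    nn = (800.0 + 50.0 * math.sin(2 * math.pi * 0.10 * t)
          + 10.0 * math.sin(2 * math.pi * 0.25 * t))
    nn_series.append(nn)
    t += nn / 1000.0

powers = band_powers(nn_series)
print({name: round(value, 1) for name, value in powers.items()})
```

Because the synthetic series is dominated by its 0.10 Hz component, the LF power comes out well above the HF power; on real data, windowing and averaged estimators such as Welch's method are usually preferred over a raw periodogram.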
Besides these features, other spectral features could be extracted; these are discussed
in Section 3.6.5. Figure 2.4 summarizes the HRV-based features derived using both the time-
and frequency-domain methods.
Figure 2.4: Heart Rate Variability (HRV) Features
[Diagram: HRV features fall into two groups. Time-domain method (statistical features: useful
in analyzing interbeat changes in HR, reflect emotions like frustration, boredom etc.): AVNN
(SDANN), SDNN, rMSSD, pNN20, pNN50. Frequency-domain method (spectral features: robust to
missed heart beats, carry information on the balance of the parasympathetic and sympathetic
nervous systems): VLFP, LFP, HFP, LF/HF ratio.]
2.6.3 Blood Pressure (BP)
Blood pressure (BP) is the force exerted by the blood against any unit area of the blood
vessel wall (Guyton and Hall, 2006). It is measured in millimeters of mercury (mm Hg); for
example, when the force exerted is sufficient to push a column of mercury against gravity up
to a level 50 mm high, the BP is said to be 50 mm Hg. Two numbers are commonly used to
describe the BP of a person: the systolic pressure, which is the higher and first number, and
the diastolic pressure, which is the lower and second number. During systole the heart
chambers contract to drive blood into the aorta and pulmonary artery, whereas during diastole
the heart chambers relax and fill with blood between two contractions. A third measure of BP,
the pulse pressure, has become an important health indicator; it is the difference between the
systolic and diastolic pressures and indicates the stiffness of the blood vessel walls. The BP
of a normal subject is considered to be less than 120/80 mm Hg (systolic/diastolic).
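As a small worked example of the quantities just defined, the sketch below computes the pulse pressure and checks a reading against the 120/80 mm Hg figure quoted above. The function names and the sample reading are hypothetical, and the check is illustrative only, not a clinical classification.

```python
def pulse_pressure(systolic_mmhg, diastolic_mmhg):
    """Pulse pressure = systolic pressure - diastolic pressure (mm Hg)."""
    if systolic_mmhg <= diastolic_mmhg:
        raise ValueError("systolic pressure must exceed diastolic pressure")
    return systolic_mmhg - diastolic_mmhg

def is_below_normal_limit(systolic_mmhg, diastolic_mmhg):
    """True when the reading is below 120/80 mm Hg, the figure quoted in the text."""
    return systolic_mmhg < 120 and diastolic_mmhg < 80

reading = (118, 76)  # hypothetical systolic/diastolic reading in mm Hg
pp = pulse_pressure(*reading)
normal = is_below_normal_limit(*reading)
print(pp, normal)
```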
Large changes in blood flow can result from either increased or decreased sympathetic
nerve stimulation of the peripheral blood vessels. Inhibition of sympathetic activity greatly
dilates the blood vessels and can increase the blood flow twofold or more, whereas very strong
sympathetic stimulation can constrict the blood vessels so much that blood flow occasionally
decreases to as low as zero for a few seconds despite high arterial pressure (Guyton and Hall,
2006). Therefore negative emotional states like anxiety, frustration, anger, fear and
anticipation of pain can bring about elevations in heart rate and/or blood pressure. The
positive emotional states of excitement, joy, and interest can also bring about elevated
cardiovascular responses.
Middle-aged and elderly adults with high systolic pressure are at greater risk of heart,
kidney, and circulatory complications. Older people (between 50 and 59 years of age) with
elevated systolic pressure may face heart events and stroke events even when their diastolic
pressure is normal, a condition known as isolated systolic hypertension. Similarly, high
diastolic pressure is a strong predictor of heart attack and stroke in young adults. For
people who are over 45 years old, every 10 mm Hg increase in pulse pressure increases the
stroke risk by 11%, cardiovascular disease by 10%, and overall mortality by 16% (Guyton and
Hall, 2006).
2.6.4 Galvanic Skin Response (GSR)
Galvanic Skin Response (GSR) is a measure of the skin's conductance between two
electrodes. Skin conductance is a measure of autonomic nervous system activity and a
potential indicator of stress levels. The GSR sensor electrodes are placed on either the palm
or the foot of the subject, and a small electrical current is passed through them to measure
the changes in skin resistance caused by the ionic sweat produced by the sweat glands. The
resistance of the skin is usually large, approximately 1 MΩ, and hence the conductance
(reciprocal of resistance) is measured in micromhos or microsiemens (μS). The skin conductance
response varies linearly with arousal ratings, making it suitable for measuring states of
anger, fear, anxiety and stress, as well as startle responses and responses to sudden stimuli.
This measure has also been studied to assess the stress levels of automotive drivers and
aircraft pilots (Healey and Picard, 2005; Roscoe, 1992).
The skin conductance signal has two basic components: tonic and phasic. The tonic skin
conductance represents the baseline level of skin conductance when stimuli are absent. The
tonic component varies with time depending on the psychological state and autonomic
regulation of a person. It rises while a person anticipates or performs mental arithmetic,
vigilance or attention tasks, and social tasks. The phasic skin conductance (also known as
GSRs) represents the changes observed in skin conductance due to the presence of external
stimuli or events. These time-related changes are observed in response to discrete
environmental stimuli like sights, sounds, smells, etc. Skin conductance has been studied in
several contexts such as stress level analysis, lie detection, and the analysis of social
empathy or embarrassment (Lisetti & Nasoz, 2004; Healey and Picard, 2005).
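The resistance-to-conductance conversion and the tonic/phasic split described above can be sketched as follows. Treating a centred moving average as the tonic level is a simplifying assumption of this example (not a method taken from the cited studies), and the function names and the synthetic trace are hypothetical.

```python
def conductance_microsiemens(resistance_ohm):
    """Conductance is the reciprocal of resistance; 1 S = 1e6 microsiemens."""
    return 1e6 / resistance_ohm

def split_tonic_phasic(sc_us, window):
    """Approximate the tonic level by a centred moving average (assumption);
    the phasic component is whatever remains after subtracting it."""
    half = window // 2
    tonic = []
    for i in range(len(sc_us)):
        lo, hi = max(0, i - half), min(len(sc_us), i + half + 1)
        tonic.append(sum(sc_us[lo:hi]) / (hi - lo))
    phasic = [s - t for s, t in zip(sc_us, tonic)]
    return tonic, phasic

# A skin resistance of 1 megaohm corresponds to a conductance of 1 microsiemens.
baseline_us = conductance_microsiemens(1e6)

# Synthetic trace (assumption): flat 2.0 uS baseline with a brief 0.5 uS response
# at samples 50 to 54.
trace = [2.0] * 100
for i in range(50, 55):
    trace[i] += 0.5

tonic, phasic = split_tonic_phasic(trace, window=21)
peak_index = phasic.index(max(phasic))
print(baseline_us, peak_index, round(max(phasic), 3))
```

The window length controls how slowly the estimated tonic level is allowed to move; a window that is too short absorbs part of each response into the baseline.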
2.6.5 Respiration
Respiration is an activity characterized by breathing in and out, i.e. inhalation and
exhalation. In simpler words, it is the activity of taking in oxygen from inhaled air and
releasing carbon dioxide by exhalation. A commonly used sensor is the Hall effect respiration
sensor, which consists of two magnets embedded inside an elastic tube and measures the
expansion and contraction of the chest cavity around the diaphragm to capture the breathing
activity (Picard and Healey, 1997). During inhalation the elastic stretches, which separates
the magnets and produces a current; during exhalation the sensor returns to its baseline
state. The amount of stretch in the elastic is measured as a voltage change, producing the
respiration waveform, from which the amplitude (proportional to the depth of the subject's
breath) and the respiration rate can be calculated.
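Deriving the respiration rate and breath amplitude from such a waveform can be sketched as below. Counting upward crossings of the mean level is one simple approach assumed for this example; the function names and the sinusoidal stand-in for the sensor voltage are hypothetical.

```python
import math

def respiration_rate_bpm(signal, fs):
    """Breaths per minute, counted as upward crossings of the signal mean."""
    mean = sum(signal) / len(signal)
    centred = [v - mean for v in signal]
    crossings = sum(1 for a, b in zip(centred, centred[1:]) if a < 0 <= b)
    duration_min = len(signal) / fs / 60.0
    return crossings / duration_min

def breath_amplitude(signal):
    """Peak-to-peak amplitude of the waveform, in the units of the signal."""
    return max(signal) - min(signal)

# Synthetic sensor voltage (assumption): 0.25 Hz breathing (15 breaths per minute),
# 0.4 V swing around a 1.0 V offset, sampled at 25 Hz for 60 s.
fs = 25
resp = [1.0 + 0.4 * math.sin(2 * math.pi * 0.25 * i / fs + math.pi / 4)
        for i in range(60 * fs)]

rate = respiration_rate_bpm(resp, fs)
amplitude = breath_amplitude(resp)
print(round(rate, 1), round(amplitude, 2))
```

On real recordings the waveform would normally be smoothed first, since noise near the mean level produces spurious crossings.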
Both physical activity and emotional arousal cause faster and deeper respiration, while
peaceful rest and relaxation are reported to lead to slower and shallower respiration.
Sudden, intense or startling stimuli can cause a momentary cessation of respiration, and
negative emotions have been reported to cause irregularities in the respiration pattern. The
respiration signal can also be used to assess physical activities such as talking, laughing,
sneezing and coughing (Picard and Healey, 1997).
2.6.6 Electromyography (EMG)
Electromyography (EMG) is a measure of muscle activity, which generates tiny electrical
pulses due to the contraction of muscle fibers. These tiny pulses are detected and amplified
using an EMG sensor, resulting in a constantly varying signal, as the muscle fibers contract
at different rates within the recording area. The amplitude of the EMG signal is proportional
to the strength of contraction of the muscle fibers, which depends on the force required to
perform the movement. The EMG sensor consists of three electrodes: two are placed along the
axis of the muscle of interest and the third, a ground electrode, is placed off axis. EMG can
be used to study facial expression, gestural expression, emotional valence and emotional
stress, as muscle activity increases during such activities. The EMG sensors are placed
according to the area of interest, for example on the face (known as facial EMG) or on the
shoulder. EMG can help diagnose:
- some causes of muscle weakness or paralysis,
- muscle or motor problems, such as involuntary muscle twitching,
- sensory problems, such as numbness, tingling or pain, and
- nerve damage or injury.
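Since the EMG amplitude tracks contraction strength, a common way to summarize it is a windowed root-mean-square (RMS) envelope. The sketch below assumes non-overlapping windows and uses a hypothetical alternating-sign sequence in place of a recorded EMG; neither choice comes from the text above.

```python
import math

def rms_envelope(emg, window):
    """RMS amplitude over consecutive non-overlapping windows of `window` samples."""
    envelope = []
    for start in range(0, len(emg) - window + 1, window):
        segment = emg[start:start + window]
        envelope.append(math.sqrt(sum(v * v for v in segment) / window))
    return envelope

# Synthetic stand-in for an EMG (assumption): low-amplitude activity at rest
# followed by high-amplitude activity during a contraction.
rest = [0.1 * (-1) ** i for i in range(200)]
contraction = [1.0 * (-1) ** i for i in range(200)]
envelope = rms_envelope(rest + contraction, window=100)
print([round(v, 3) for v in envelope])
```

The envelope rises when the contraction starts, which is the property that makes EMG usable as an indicator of muscle tension.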
2.6.7 Electroencephalography (EEG)
The Central Nervous System (CNS) consists of nerve cells and of glia cells, which are
located between neurons. Each nerve cell consists of axons³⁰, dendrites³¹, and cell bodies³²
(Sanei and Chambers, 2008). The nerve cells respond to stimuli and transmit information over
long distances. The main activity in the CNS relates to the synaptic currents transferred
across the junctions (called synapses) between axons and dendrites, or between the dendrites
of different cells. Synaptic excitation of the dendrites results in a flow of current, which
is measured to obtain an EEG signal. In simple terms, brain waves are created when brain cells
(neurons) are activated (Sanei and Chambers, 2008). These brain waves are electrical
potentials which can be measured using an electroencephalograph (EEG).
EEG electrodes are placed on the scalp to record the EEG signals, whose magnitude and
frequency depend upon the degree of mental activity of the brain. There are five different
brain waves, categorized by their frequencies: alpha, beta, theta, delta and gamma waves.
Although none of these waves is ever emitted alone, the state of consciousness of the
individual may make one frequency more pronounced than the others.
30 a cylinder which transmits electrical impulses.
31 connected to either the axons or dendrites of other cells and receive / relay signals from / to other nerves.
32 connected to either axons or dendrites of other cells and receive or relay the electrical impulses.
1) Alpha:
The frequency of alpha waves lies between 8 and 13 Hz, and these waves indicate both a
relaxed awareness and inattention (Sanei and Chambers, 2008). Alpha is the most prominent
rhythm among the several waves generated by brain activity. The alpha wave is a waiting or
scanning pattern produced by the visual centers of the brain. It is reduced or eliminated by
opening the eyes, by hearing unfamiliar sounds, and by anxiety or mental concentration. Alpha:
- indicates an empty mind rather than a relaxed one,
- indicates a mindless state rather than a passive one, and
- requires the presence of other frequencies, beta and theta, before the usual
description of alpha becomes true.
2) Beta:
The beta wave frequencies lie within the range of 14 to 26 Hz. Activity above 26 Hz is
mostly known as a high-level beta wave and corresponds to the panic state of a human (Sanei
and Chambers, 2008). The beta wave is usually considered the waking rhythm of the brain; it is
found in normal adults and is associated with:
- active thinking
- active attention
- focus on the outside world, and
- solving concrete problems.
3) Theta:
Theta waves lie within the frequency range of 4 to 7 Hz. Theta waves are indicative of
consciousness slipping towards drowsiness (Sanei and Chambers, 2008). Theta has been
associated with:
- access to unconscious material,
- creative inspiration and
- deep meditation.
A theta wave is often accompanied by other frequencies and seems to be related to the
level of arousal. Therefore the theta waves can act as a good indicator of the drowsiness state
of the automotive drivers.
4) Delta:
Delta waves lie within the frequency range of 0.5 to 4 Hz. Delta waves are primarily
associated with deep sleep; in the waking state, they were thought to indicate physical
defects in the brain (Sanei and Chambers, 2008). Signal processing is required to filter out
artifact signals caused by the large muscles of the neck and jaw, which can be mistaken for a
genuine delta response.
5) Gamma:
Frequencies above 30 Hz (mainly up to 45 Hz) correspond to the gamma range, sometimes
called the fast beta wave. Although the amplitudes of these rhythms are very low and their
occurrence is rare, their detection can be used for confirmation of certain brain diseases
(Sanei and Chambers, 2008).
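The five bands described above can be collected into a simple lookup. This is only a sketch: the band edges are the ones quoted in this section, the handling of the shared 4 Hz edge and of the gaps between bands (for example 7 to 8 Hz) is an assumption of the example, and the names are hypothetical.

```python
# Band edges in Hz as quoted in this section; gamma is given there as
# "above 30 Hz, mainly up to 45 Hz".
EEG_BANDS_HZ = [
    ("delta", 0.5, 4.0),
    ("theta", 4.0, 7.0),
    ("alpha", 8.0, 13.0),
    ("beta", 14.0, 26.0),
    ("gamma", 30.0, 45.0),
]

def eeg_band(freq_hz):
    """Return the name of the band containing `freq_hz`, or None if the
    frequency falls in a gap between the quoted ranges. The first matching
    band wins, so 4.0 Hz is reported as delta (an arbitrary tie-break)."""
    for name, low, high in EEG_BANDS_HZ:
        if low <= freq_hz <= high:
            return name
    return None

for f in (2.0, 5.0, 10.0, 20.0, 28.0, 40.0):
    print(f, eeg_band(f))
```

For drowsiness monitoring, the band of interest would be theta, as noted above.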
2.6.8 Blood Oxygen Saturation (SpO2)
Pulse oximetry is one of the significant clinical applications of PPG measurement
techniques for patient monitoring. The devices used to obtain arterial blood oxygen
saturation (SpO2) readings are known as pulse oximeters, and they can also be used to
calculate HR. Pulse oximeters are used in different clinical settings, including hospital,
outpatient, sports medicine, domiciliary, and veterinary settings (Allen, 2007). Pulse
oximeters determine SpO2 non-invasively by shining red and then near-infrared light through
vascular tissue, with rapid switching between the two wavelengths. At each wavelength a
pulsatile (AC) signal is obtained, superimposed on a DC component that relates to the tissues
and the average blood volume. The amplitudes of the red and near-infrared AC signals are
sensitive to changes in SpO2 because of the differences in the light absorption of oxygenated
hemoglobin (HbO2) and reduced hemoglobin (Hb) at these two wavelengths. The ratio of the AC
to DC amplitudes at the two wavelengths is used to calculate SpO2. It is assumed that the
pulsatile component of the PPG signal results from the arterial blood volume changes with
each heartbeat. Proper signal processing techniques are needed, as the output of the device
is sensitive to motion artifacts and noise.
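The AC/DC amplitude ratio mentioned above is often written as a "ratio of ratios" R. The sketch below computes R and then maps it to a saturation value with a linear calibration; the coefficients 110 and 25 are a commonly quoted textbook approximation used here purely as an assumption, since real oximeters rely on empirically determined calibration curves. The function names and sample amplitudes are hypothetical.

```python
def ratio_of_ratios(ac_red, dc_red, ac_ir, dc_ir):
    """R = (AC_red / DC_red) / (AC_ir / DC_ir)."""
    return (ac_red / dc_red) / (ac_ir / dc_ir)

def spo2_estimate(r, a=110.0, b=25.0):
    """Illustrative linear calibration SpO2 ~ a - b * R, clamped to 0..100 %.
    The default coefficients are an assumption, not a device calibration."""
    return max(0.0, min(100.0, a - b * r))

# Hypothetical AC and DC amplitudes at the red and near-infrared wavelengths.
r = ratio_of_ratios(ac_red=0.01, dc_red=1.0, ac_ir=0.02, dc_ir=1.0)
spo2 = spo2_estimate(r)
print(r, spo2)
```

Normalizing each AC amplitude by its DC level removes the dependence on light intensity and tissue thickness, which is why the ratio rather than the raw amplitude is used.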
Among the several sensing parameters, selecting appropriate sensors which can be suitably
placed on the driver's body without obstructing the driving task will help in the design of a
WDAS. In particular, sensors like PPG, GSR and BP could possibly be integrated into a single
wearable device. Sensors like EEG and ECG use multiple electrodes and could cause discomfort
to drivers due to their placement locations, so they should be selected carefully. However,
certain ECG configurations with a minimum number of electrodes may be selected.
2.7 Models for Stress-Level Analysis
Continuous monitoring of a driver’s affective state in a real-time driving scenario is a
challenging task. The methodology involves physiological data collection, preprocessing,
feature extraction coupled with feature selection and transformation, and finally
classification of these features into a related stress or fatigue state. Healey and Picard
(2005) developed an automatic stress recognition algorithm using linear discriminant analysis
(LDA) with the features extracted from physiological data as inputs. They achieved a
recognition rate of over 97% and a good correlation with their video-based stress metric. Ji
et al. (2006) proposed a real-time non-intrusive fatigue monitor based on information
collected from physiological sensors and the subject’s environment. Their dynamic Bayesian
network accounted for both temporal and dynamic aspects of human fatigue. Katsis et al.
(2008) designed a wearable system for assessing the emotional state of car racing drivers
using physiological signals collected in simulated environments. The maximum predictive
ability of their system was 79.3% when they used a support vector machine (SVM) for
classification. Lisetti et al. (2004) and Haag et al. (2004) successfully employed artificial
neural networks (ANNs) in the classification of emotions, valence and arousal states of
subjects with classification rates of over 90%. The following subsections discuss different
classification methods used for stress level analysis.
To model the non-linearities³³ observed in the extracted features accurately, a
comprehensive analysis of the available machine learning paradigms is required in order to
choose the one with optimal predictive ability and sensitivity.
2.7.1 Fisher Projection and Linear Discriminant Analysis (LDA)
Healey and Picard (2005) extracted 22 physiological features, consisting of 9 statistical
features from the EMG, respiration, heart rate and skin conductance signals, 4 spectral
features from respiration, 8 skin conductance orienting response features and 1 HRV feature,
to create one feature vector for each of a total of 112 data segments. These 112 feature
vectors were used to train and test a recognition algorithm in which the training vectors
were used to create a Fisher projection matrix and a linear discriminant.
33 Majority of the physiological signals like EEG, ECG, blood flow, human gait, etc. are characterized by
complex dynamics including both non-stationarities and non-linearities (Popivanov and Mineva, 1999).
In statistics, pattern recognition and machine learning, Fisher linear discriminant
analysis (LDA) is used to find a linear combination of features which characterizes or
separates two or more classes of objects or events. LDA is preferred in cases where the
within-class frequencies are unequal, and its performance has been examined on randomly
generated test data (Balakrishnama and Ganapathiraju, 1998). In the class-dependent
transformation, the ratio of between-class variance to within-class variance is maximized so
that the classes are adequately separable. In the class-independent transformation, the ratio
of overall variance to within-class variance is maximized by considering each class as a
separate class against all other classes. The class-independent transformation is preferred
whenever generalization is of prime concern, while the class-dependent type is preferred for
good discrimination among the classes (Balakrishnama and Ganapathiraju, 1998).
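The idea of maximizing between-class separation relative to within-class scatter can be illustrated with the classical two-class Fisher discriminant, whose projection direction is w = Sw^-1 (m1 - m0). The sketch below is not the 22-feature recognizer of Healey and Picard: it uses two hypothetical features (mean heart rate in BPM and skin conductance in μS), invented sample values, and a midpoint threshold chosen for simplicity.

```python
def mean_vector(rows):
    """Component-wise mean of a list of 2-D feature vectors."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def fisher_discriminant(class0, class1):
    """Two-class Fisher discriminant for 2-D features: w = Sw^-1 (m1 - m0)."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    sw = [[0.0, 0.0], [0.0, 0.0]]          # within-class scatter matrix
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    sw[a][b] += d[a] * d[b]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))   # midpoint of projected means
    return w, threshold

def classify(x, w, threshold):
    """Return 1 (high stress) if the projection exceeds the threshold, else 0."""
    return 1 if dot(w, x) > threshold else 0

# Hypothetical training data: (mean HR in BPM, skin conductance in microsiemens).
low_stress = [(65, 2.0), (70, 2.5), (68, 2.2), (72, 2.1)]
high_stress = [(85, 4.0), (90, 4.5), (88, 5.0), (92, 4.2)]

w, threshold = fisher_discriminant(low_stress, high_stress)
labels_low = [classify(x, w, threshold) for x in low_stress]
labels_high = [classify(x, w, threshold) for x in high_stress]
print(labels_low, labels_high)
```

With more than two features or classes, the same construction generalizes to a projection matrix built from the between-class and within-class scatter matrices.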
2.7.2 Support Vector Machines (SVM)
SVMs were first proposed by Vapnik and are based on statistical learning theory; they are
used for pattern classification and to infer nonlinear relationships between variables
(Vapnik, 1995; Cristianini and Taylor, 2000). Support Vector Machines (SVMs) have been used
by transportation researchers in several ways, such as emotion recognition of car racing
drivers using biosignals (Katsis et al., 2008; Katsis et al., 2011) and cognitive distraction
detection using drivers’ eye movements (camera-based eye tracking) and driving performance
and/or vehicle dynamics data (Liang et al., 2007; Tango and Botta, 2013). Several other
pattern classification problems where SVMs have been successfully applied include face
recognition, object recognition, handwritten character and digit recognition, text and speech
recognition, speaker recognition, protein classification and information retrieval
(Dong et al., 2011; Tango and Botta, 2013).
An SVM first transforms a given input data set into a higher dimensional space using a
kernel function. Optimization methods are then used to identify a hyperplane that separates
the transformed data with minimum error and maximum margin. The hyperplane is determined by
the support vectors, a set of training instances lying on the class boundaries. Finally,
the hyperplane is mapped back to the input space to obtain the decision boundaries used for
classifying new instances. When the data are not separable, misclassified instances are
penalized during the optimization (Dong et al., 2011; Tango and Botta, 2013).
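The role of the higher dimensional space can be seen in a small hand-built example. The
sketch below is not an SVM training procedure: the feature map, the weight vector and the
four sample points are assumptions chosen for illustration. A real SVM would learn the
weights by margin maximization and would normally use a kernel function rather than an
explicit map.

```python
# XOR-like data is not linearly separable in 2-D, but becomes separable after
# an explicit (hypothetical) feature map that appends the product x1 * x2.

def feature_map(x1, x2):
    return (x1, x2, x1 * x2)

def decide(x1, x2, w=(0.0, 0.0, 1.0), b=0.0):
    # Linear decision rule in the transformed 3-D space: sign(w . z + b).
    z = feature_map(x1, x2)
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    return 1 if score >= 0 else -1

samples = [((1, 1), 1), ((-1, -1), 1), ((1, -1), -1), ((-1, 1), -1)]
for (x1, x2), label in samples:
    print((x1, x2), label, decide(x1, x2))
```

In the transformed space the plane z3 = 0 separates the two classes, and every sample lies
at distance 1 from it; mapped back to the input space this plane corresponds to a nonlinear
boundary along the coordinate axes.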
The advantages of SVM-based classification include: (a) it is fairly insensitive to the
curse of dimensionality, (b) it is efficient enough to handle very large-scale problems in
both samples and variables, (c) it can generate both linear and nonlinear models, with
efficient computation even for nonlinear models, and (d) it can extract information from
noisy data without the need for prior knowledge before training. However, SVMs have certain
disadvantages: (a) the computational complexity grows with the volume of training data, and
(b) the “classical” application of SVMs concerns a binary classification task. Overall, SVM
has established itself as a useful classifier for human cognition[34] tasks.
2.7.3 Bayesian Networks (BNs)
A BN model is a graphical representation of uncertain knowledge that is used to infer
high-level activities from observed data or variables (Ji et al., 2004; Malta et al., 2011).
A BN consists of nodes, which are domain variables taking either discrete or continuous
values, and arcs, which represent probabilistic relationships between parent and child
nodes. BNs have been used to model human fatigue using visual cues[35] (Ji et al., 2004),
as well as by considering the degradation of driving performance caused by drowsiness
(Yang et al., 2009). They have also been applied to the analysis of driver's frustration
using multiple sensing modalities such as speech, physiological signals, facial data, and
gas- and brake-pedal actuation (Malta et al., 2011), and to the detection of driver's
stress events (Rigas et al., 2012).
Dong et al. (2011) list certain advantages of BNs that make them well suited for describing
human behavior: (a) information from different sources and at different levels of
abstraction can be represented because of the hierarchical structure of BNs, which can
also capture probabilistic relationships, (b) BNs reveal the relationships that generate
the model predictions, i.e. a BN is a computational as well as a knowledge representation
model, and (c) BNs can handle situations with missing data, using the probabilistic
dependence network to incorporate new evidence as it is added. However, creating a correct
and stable BN model is difficult: it requires extensive computational capability and a
large amount of training data, which is a disadvantage.
[34] The term human cognition has been used here in the limited sense of driver's
inattentiveness, distraction etc.
[35] Facial features, eyelid features etc.
2.7.4 Artificial Neural Networks (ANNs)
Artificial Neural Network (ANN) based models emulate the decision making paradigm used by
the human brain. ANNs have been a preferred classifier in applications where the training
features are non-linear in nature and the decision boundary is best modeled as a non-linear
function in the feature space. ANNs work reliably with noisy data and have proven useful
for both categorical and continuous features (Ali and Wasimi, 2009). A typical ANN
architecture consists of computing elements known as nodes, nets or units, which transmit
and receive signals through their interconnections. Each node is characterized by
associated weights, which are multiplied by the incoming signals to calculate a weighted
sum; this sum acts as the activation of the node and is passed through an activation
function (Patel et al., 2011; Tango and Botta, 2013). Proper selection of an NN
architecture, activation function and learning rule helps in designing an NN classifier.
NNs are trained using either unsupervised or supervised learning methods.
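The weighted-sum-and-activation computation of a single node can be sketched as follows.
The sigmoid activation, the bias term and the numeric inputs are assumptions made for
illustration; the text above does not prescribe a particular activation function.

```python
import math

def sigmoid(a):
    # A common choice of activation function (assumed here for illustration).
    return 1.0 / (1.0 + math.exp(-a))

def node_output(inputs, weights, bias=0.0):
    # Weighted sum of the incoming signals, followed by the activation function.
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(activation)

# Hypothetical inputs and weights for one node.
print(node_output([1.0, 2.0], [0.5, -0.25]))
print(node_output([1.0, 1.0], [2.0, 2.0]))
```

A full network stacks many such nodes in layers, and a learning rule adjusts the weights
from training data.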
ANN models have been utilized to monitor driver's fatigue using HRV features (Patel et
al., 2011) and to detect driver's distraction using visual techniques (Tango and Botta,
2013).
The advantages of ANNs include: (a) the ability to classify targets without prior knowledge
of patterns in the data, i.e. even in the absence of an exact input–output relationship,
(b) the ability to generalize (identify similar patterns with reasonable accuracy, which is
useful for real-world, noisy, distorted or incomplete data), and (c) the ability to model
nonlinear or complex problems more accurately than linear techniques (Dong et al., 2011).
The limitations of ANNs are (a) their long training process, and (b) the difficulty of
determining an optimal boundary when handling real-life data, owing to the ambiguous
nature of such data (Wahab et al., 2009).
2.7.5 Neuro-Fuzzy Systems
The fuzzy inference system (FIS) supports linguistic concept modeling by means of fuzzy
rule expressions, which is considered to be close to human expert natural language. A fuzzy
system manages the uncertain knowledge and infers high-level behaviors from the observed
data. Besides this, due to their universal approximator nature they can also be used for
knowledge induction processes (Bergasa et al., 2006). Fuzzy logic was introduced to handle
vagueness and uncertainty in data. FIS has been used to monitor driver's vigilance or
inattentiveness level using visual fatigue behaviors such as ocular and face pose measures etc.
(Bergasa et al., 2006).
The limitation of ANNs in determining an optimal boundary when handling real-life data
has been compensated for by the use of adaptive neuro-fuzzy systems (ANFIS) (Wahab et al.,
2009). Neuro-fuzzy hybrid systems have been developed to solve the problem of uncertainty
and vagueness in data by using the learning techniques of neural networks for the learning
and identification of fuzzy model parameters (Wahab et al., 2009). These hybrid systems
offer strong generalization ability and fast learning from large amounts of data. Fuzzy
sets represent fuzzy concepts, and fuzzy rules link the fuzzy concepts of the input space
with the fuzzy concepts of the output space.
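How a fuzzy rule links input concepts to an output concept can be sketched with triangular
membership functions. The membership breakpoints, the variable names and the single rule
"IF heart rate is high AND skin conductance is high THEN stress is high" are hypothetical
choices for illustration; in a neuro-fuzzy system such parameters would be learned from
data rather than set by hand.

```python
def triangular(x, left, peak, right):
    # Degree of membership (0..1) of x in a triangular fuzzy set.
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def rule_stress_high(hr_bpm, gsr_us):
    # Hypothetical fuzzy sets "HR is high" and "GSR is high".
    hr_high = triangular(hr_bpm, 70, 100, 130)
    gsr_high = triangular(gsr_us, 2, 8, 14)
    # Fuzzy AND taken as the minimum of the two membership degrees.
    return min(hr_high, gsr_high)

print(rule_stress_high(85, 8))   # partial firing of the rule
print(rule_stress_high(60, 8))   # rule does not fire
```

The firing strength returned here would normally be combined with the strengths of other
rules and defuzzified to produce a crisp output.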
Hybrid neuro-fuzzy systems such as the evolving fuzzy neural network (EFuNN) and the
adaptive network-based fuzzy inference system (ANFIS) have been used for driver behavior
modeling (Wahab et al., 2009) and emotion recognition (Katsis et al., 2008).
These are the representative models that have so far dominated the literature in the
current context.
2.8 Enabling Technologies for WDAS Design
WDAS will include current state-of-the-art technologies driven by recent advancements in
very large scale integration (VLSI) technologies, micro electromechanical systems (MEMS)
and possibly Bio-MEMS. Such devices can monitor not only the driver's affective or
psychophysiological state but also the vital sign parameters. In addition, they can measure
the physiological changes caused by sudden stimuli (stress events or stress trends) which
may be encountered during the course of driving. Such a WDAS should have provision for
alerting the driver locally and, in case of an accident, for alerting the main vehicular
computer to hand over control of the vehicle (Banerjee, 2005).
ADAS-specific enabling technologies may serve as the building blocks for the
development of products and systems (Shladover, 1993).
2.8.1 Wearable Biosensors and Sensing Parameters for Driver Stress Monitoring
Several wearable sensors have been used by researchers to monitor the physiological
parameters of automotive drivers. Table 2.5 summarizes the physiological parameters of
measurements and their related use from the literature.
Sensor selection is the most important part of the development process for wearable
devices used in health monitoring, caregiving and stress-related parameter detection.
Sensors appropriate to a particular application must be selected by considering their
electrical characteristics such as impedance, power consumption and sampling frequency, and
mechanical characteristics such as size and robustness. The selection should also consider
environmental issues such as extreme weather conditions, and the sensor's ability to
operate for long hours. The reliability and operability of sensors under wearable
conditions not only influence the design but also facilitate the acceptability of these
sensors in such situations.
A typical location of wearable biosensors is shown in Figure 2.5. The criteria for sensor
selection depend on parameters such as sensing range, resolution, response time, size and
weight, power requirement, voltage / current requirement, calibration method, humidity /
moisture constraints, temperature range, noise (EMI, magnetic field etc.), placement of
sensors, requirement of an external unit, cost per unit, body safety, and availability in
India.
Table 2.5: List of Parameters of Measurement / Sensing along with Related Use
S. N. | Physiological Signal | Parameters to be extracted | Related Use
1. | Electrocardiograph (ECG or EKG) | QRS Complex Width; RR Distance; QT Interval | Presence of a heart block; Heart Rate (directly); HR Variation
2. | Electroencephalograph (EEG) | Alpha, beta, theta and gamma waves | Neural status
3. | Pulse Oximeter | Heart Rate; SpO2 | A sudden change in HR; Reduction in blood oxygenation (urgent medical intervention)
4. | Body Motion | Accelerometer (3-axis); Gyroscope (1-axis); Surface EMG | Orientation and movement of each body segment; Angular Velocity (limb position if accelerometer data is used); Identification of motor tasks (muscle activity, numbness etc.)
5. | Blood Pressure | Pressure; Pulse | Hypertension; Healthiness
6. | Galvanic Skin Response (GSR) | Sweat Gland Activity | Sudden fear, stress etc.
7. | Body Temperature | Skin Temperature | Fatigue and healthiness etc.
8. | Respiration | Respiration Rate | Breathing activity and healthiness etc.
Figure 2.5: Typical placement location of Wearable Biosensors alongwith Sensing Module
2.8.2 Processing Requirements and Elements
The processing elements of the proposed WDAS have to perform three key functions: sensing,
processing and communicating. To sense multimodal physiological signals, they should have
sufficient analog-to-digital (A/D) channels, general purpose as well as digital signal
processing cores, and a sufficient number of communication interfaces such as RS232, USB
and I2C. Owing to the mobile nature of wearable computers, the processing elements must
consume little power during computation and external communication, and must have minimal
heat dissipation (Baber et al., 1999). In such real-time safety-critical systems,
preprocessing is required at the wearable computer itself to assess whether an emergency
has arisen. Selection of the processing core is influenced by robustness, weight, power
consumption, word length, storage etc. Most modern microcontrollers possess these features.
Other typical requirements are resilience to noise and vibration, small footprint, fast
and versatile handling of input and output, efficient interrupt capabilities, fail-safe
features such as watchdog timers for automatic recovery in the event of system lock-up and
brown-out protection for recovery from power supply anomalies, and on-board features for
serial and parallel communications (Baber et al., 1999).
Identification of the operating system (OS), application software, peripherals etc. would
enable specification of the processing demands, which may, in turn, help in selecting a
particular hardware configuration (Baber et al., 1999). Traditionally, wearable devices
have been developed around general purpose microcontrollers for performing the data
collection, computing and communicating functions. Reconfigurable hardware such as field
programmable gate arrays (FPGAs) was seen as an alternative for real-time,
computation-intensive tasks in wearable devices, offering higher performance, flexibility
and energy efficiency than a general purpose CPU (Plessl et al., 2003). In spite of the
initial work by Plessl et al. (2003), which involved options such as FPGAs, hybrid CPUs
(FPGA + CPU) and ASIC-on-demand (which hosts computationally intensive functions on
reconfigurable hardware), not much progress indicating the emergence of reconfigurable
computers has been made as of this writing.
2.8.3 Communication Elements
Physiological parameters sensed via body-worn sensors could be transmitted through
wired connections to the processing elements, whereas the wireless communication links may
be used to relay the locally processed data to a back end system including those belonging to
recovery agencies.
The communication channels available in the market are categorized by transmission range
as short-range, medium-range or long-range. In a typical WDAS, a hybrid communication
medium consisting of short- and medium-range communication devices may be useful.
Short-range communication channels such as Bluetooth and ZigBee are required to establish
two-way communication with one of the vehicular computers, either to report the driver's
health condition or to let it take control of the vehicle, or with a nearby backend server
in the pervasive computing environment. Medium-range communication is needed to send
information to a nearby agency (e.g. highway patrolling staff) so that the wearer can be
helped in time. Long-range communication channels such as 2G/3G/4G may establish a two-way
connection between the wearer and hospitals or remote monitoring centres in case an
accident takes place in spite of the assistance provided by the wearable computer. A WDAS
with such hybrid communication channels will draw more power from its battery.
These wireless technologies pose a great challenge for universal compatibility and
acceptability, because each technology has its own limitations with respect to bandwidth,
channels, interference, line-of-sight and communication range, which greatly influence the
user’s privacy and security requirements. However, their proper selection would enable a
designer to (a) categorize the device with respect to its power requirement, intended
application domain, interference or hazards critical to a particular disease etc., and
(b) select an OS which supports a particular communication technology / protocol.
2.8.4 Storage Devices
The storage elements relevant for such systems include registers, random access memory
(RAM), read only memory (ROM), caches, flash memories etc. Data storage can be achieved
using processor specific registers at the CPU level, commonly available in RISC processors.
On-chip internal memories in the form of RAMs and ROMs as well as flash memories are
usual requirements for such systems. RAMs are used to store the run time variables, stacks
and also function as buffers for various forms of processing. ROMs are used to store boot-up
programs, application programs, initialization data, look-up tables, certain codes for RTOSes
and address pointers for specific subroutines etc. Flash based devices are helpful in storing
non-volatile results of processing, compressed form of processed physiological data and can
also act as temporary storage for certain data as a result of fast processing. Additionally,
voluminous data can be stored on SD-MMC cards based devices as a form of secondary
storage instead of disk drives.
2.8.5 Power Provisioning
In wearable devices, low power consumption is one of the critical requirements. Batteries
used in such devices must therefore be able to operate for long hours without the need for
replacement or frequent recharging. However, batteries are bulky and provide a limited
amount of energy. Alternatively, power sources such as solar cells, polymer cells or
body-generated power could be selected, and recent advancements in fuel cell technologies
intended for portable electronic devices may replace batteries in the near future. Another
important option is to use body power itself, extracted from body heat, breathing
(respiration), blood pressure, arm motion, and physical activities such as walking,
exercise and typing. However, this adds an extra burden from the design point of view
because of the number of transducers and other devices needed to extract a sizable amount
of power, and because of acceptability issues arising from ergonomic considerations, form
factor etc. Therefore, selection of a proper power provisioning option for designing such a
WDAS will be driven by the architectural requirements, which must provide for placement of
specific transducers if the original source of power is other than electrical in nature.
Additionally, some power sources or devices can be identified more quickly, easily and
efficiently for certain life-critical applications like WDAS owing to the associated
design choices and peripheral support.
Although very little information can be found in the literature regarding the power
consumed by different hardware units, Anliker et al. (2004) recommended some methods to
reduce power consumption based on their observations while developing the AMON wearable
prototype. They inferred that:
- The hardware units that degrade battery life include the wireless radio during data
  streaming, the analog signal conditioning unit that uses amplification and filtering
  circuits, the CPU, and some specific sensor systems such as the BP pump and valves.
- The software modules that consume more power are encryption / decryption engines and
  DSP algorithms for non-linear dynamics, wavelets, morphological analysis etc.
- The battery life of single-parameter (e.g. ECG) monitoring systems varies between 24
  and 48 hours, whereas in multi-parameter monitoring systems it may vary from 6 hours to
  more than 20 hours depending upon the operation of different modules.
Low power consumption can be achieved by (a) providing discrete power provisioning
modules as implemented by Anliker et al. (2004), and (b) dynamic voltage scaling (DVS),
which manages the power dynamically by keeping unused modules in sleep mode through
software control (Jejurikar et al., 2004). The use of ASICs as a hardware solution to
reduce power consumption significantly is recommended for analog units, encryption /
decryption engines and some specific sensor signal conditioning circuits (Anliker et al.,
2004).
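The effect of keeping unused modules asleep can be estimated with a simple duty-cycle
calculation. The sketch below is a back-of-the-envelope model, not a measurement: the
battery capacity, the module names and the current figures are hypothetical, and battery
life is approximated as capacity divided by average current.

```python
def average_current_ma(modules):
    # modules: list of (active_mA, sleep_mA, duty_cycle), with duty_cycle in 0..1
    return sum(active * duty + sleep * (1.0 - duty)
               for active, sleep, duty in modules)

def battery_life_hours(capacity_mah, modules):
    return capacity_mah / average_current_ma(modules)

# Hypothetical modules: a radio (40 mA when active) and a CPU plus sensors (10 mA).
always_on = [(40.0, 0.0, 1.0), (10.0, 0.0, 1.0)]
duty_cycled = [(40.0, 0.0, 0.25), (10.0, 0.0, 1.0)]   # radio active 25% of the time

print(battery_life_hours(1000.0, always_on))
print(battery_life_hours(1000.0, duty_cycled))
```

With these assumed figures, streaming continuously gives 20 hours on a 1000 mAh battery,
while activating the radio a quarter of the time gives 50 hours, which shows why the radio
is listed above among the units that most degrade battery life.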
2.8.6 Alarm and Warning Actuators
Drivers must be given warnings or alerts by means of vibratory actuators, audio-visual
actuators etc. However, the exact choice would depend on the actual situation, including
aspects such as the abilities or disabilities of the driver and the presence or absence of
factors such as ambient light, background noise and the terrain. In a driving simulator
based experiment, Lee et al. (2004) found that haptic warnings were preferred to auditory
warnings on several dimensions including trust, overall benefit to driving, and annoyance.
They suggested that non-standard warning modes (e.g., haptic cues from a vibrating seat)
and warning strategies (e.g., a graded warning) need to be considered to promote
appropriate use and acceptance. In the WDAS case, vibratory and auditory alerts could be
the preferred methods.
2.8.7 Wearable Fabrics
The WDAS could be designed as either a fabric-based or a fabric-less wearable computer.
The research community has so far experimented with wearable fabrics such as resistive,
inductive and capacitive Fiber Meshed Transducers[36] (FMTs) (Wijesiriwardana et al.,
2004). Piezoelectric fabric has also been used in some applications. Fabric-based systems
should be designed to suit different age groups such as babies, children, adolescents,
youth and the elderly. Each age group has different requirements and likings, along with
some common features. For example, adolescents are more fashion conscious and like
aesthetically designed and attractive devices or gadgets, whereas the requirements of
elderly people vary with the health-related problems that grow with age, and the design
should take care of these aspects.
[36] Resistive FMTs: used for wearable respiratory information and activity monitoring.
Inductive FMTs: used for monitoring kinematical movements of the body; worn on sleeves,
gloves and leggings of garments by properly knitting metallic wires.
Capacitive FMTs: used for detecting ECG; can also be operated as touch switches.
2.8.8 Application and System Software
In addition to the hardware components described above, the WDAS must have reliable,
fault-tolerant system software. Since the system is life-critical, a hard real-time
operating system (RTOS) must complement the software requirements for such devices.
Application software for tasks such as data collection and conversion, signal processing
and pattern analysis, communicating, alerting and control must be duly supported to
enhance the functionality of the WDAS.
2.9 Impact of the Literature Review on Identification of Next Steps
The literature review presented so far focused primarily on stress monitoring of
automotive drivers using physiological signals. It could be noticed that a Wearable Driver
Assistance System designed using the necessary enabling technologies, such as wearable
biosensors and processing and communicating elements, could potentially help avoid road
accidents caused by a deterioration in the ability to drive safely that is reflected in
changes in select physiological signals. Consequently, it became evident that the driver's
affective state, sentic state or emotional state (Riener et al., 2009) must be analyzed to
effectively monitor the stress level by means of physiological signals. It thus became
clear that, in order to properly estimate the driver's stress level, sensory data had to be
collected under real-life driving conditions and duly analyzed. Since affective states
cannot be quantified, it was found necessary to identify certain stress levels that could
help in indicating thresholds beyond which it may be unsafe to drive a vehicle. Such an
exercise could also help in classifying the various stress levels of relevance. This
strategy assumed even more significance since no field data was available for ready use in
Indian road conditions.
Chapter 3
Physiological Signal: Data collection and Processing
It is important to collect physiological signals from actual vehicular drivers and
carefully analyze the visible and invisible patterns of behavior in the collected data
before embarking on the design of a Wearable Driver Assistance System (WDAS). Such data is
best collected either directly from sensors mounted non-intrusively on the body of the
driver or from sensors embedded in the environment of the driver's vehicle.
It is in this context that framing a data collection environment and designing an
appropriate set of experiments become important. These should allow direct or indirect
sensing of relevant physiological signals so that the data embedded in the signals can be
duly extracted, processed to eliminate noise and bias where applicable, and analyzed to
identify patterns of interest. Figure 3.1 details the necessary steps for identification
of pattern classes from biosignals.
Fig. 3.1 Biosignal based Pattern Recognition: Functional Block Diagram
Life-threatening as well as health-critical parameters are sensed from physiological
signals. Emotions such as frustration, anger and fear may also contribute to changes in
the autonomic nervous system of drivers. Road conditions as well as vehicular parameters
also make the decision-making process complex for drivers. These factors contribute to the
degree of sensitivity needed from a WDAS. In such sensitive applications, the presence of
a multimodal sensor-compute infrastructure is mandatory. In multimodal sensing, data is
usually collected from two or more sensors at the same time, and it is common knowledge
that human perception and cognition are fundamentally multimodal in nature (Pantic and
Rothkrantz, 2003). This helps in effective sensing and quantification of parameters
ranging from the vehicle’s condition and road condition to the driver’s mental and
physical state, the environment of operation etc. (Benoit et al., 2006).
3.1 Steps Involved and their Significance
A typical data collection methodology involves steps such as: (a) identification of
sensory parameters, (b) sensor and equipment selection, (c) scenario identification,
(d) data collection protocol design, (e) subject[37] identification, (f) subject training,
and (g) data collection.
Local acquisition of significant samples is extremely important in such situations.
Availability of a verifiable reflex level estimation / assessment model is one of the
enabling factors in the credible diagnosis of an approaching risk condition. Since our
approach for stress detection is dependent on biosignals, we initially included four
physiological signals, namely ECG, PPG, GSR and respiration, for the purpose of analyzing
the stress levels of automotive drivers. We judiciously chose to exclude EEG because it
requires headgear and a number of electrodes (as many as 32), which a driver would not
like to wear inside a vehicle unless he / she is a car racing driver. ECG was subsequently
excluded as well because of the problems faced while collecting data in driving scenarios;
the placement of ECG sensors was also one of the constraints in their selection.
[37] Subjects in this case are the vehicular drivers.
3.2 Sensor Selection
Possibly the most appropriate approach for WDAS involves the use of wearable biosensors in
a body-mounted network of sensor-compute nodes forming a body area network. Identifying
appropriate sensors and identifying their placement are equally significant.
In order to estimate the stress level of the driver, researchers have used a common set of
physiological sensors such as EEG, EMG, ECG, GSR, SpO2 and respiration in applications
ranging from telemedicine, bio-health informatics, affective state recognition and
biofeedback to rehabilitation. Table 3.1 lists various physiological signals, sensor
types, sensor locations, minimum and optimum sampling frequencies, the parameters which
could be extracted from them, and their related use in wearable physiological monitoring
applications (Singh, 2007).
Table 3.1: List of Physiological Signals, Sensors and Sensing Parameters of
Measurements along with their Related Use
S.N. | Signals | Sensor Type | Sensor Location | Minimum Sampling Frequency (Hz) | Optimum Sampling Frequency (Hz) | Extractable Parameters | Inference
1. | ECG | Disposable Electrode | Chest, Hand (e.g. Einthoven I/II, Goldberger, Wilson recording) | 256 | 1024 | QRS Complex Width; RR Distance; QT Interval | Indicates presence of blockage in heart (arteries); Allows to compute Heart Rate; HR Variation
2. | EEG | Sintered Ag/AgCl electrodes | Scalp (along the international 10/20 electrodes system) | 256 | 256 | Alpha, beta, theta and gamma waves | Neural status
3. | PPG | Pulse Oximeter / BVP Sensor | Finger, earlobe | 64 | 256 | Heart Rate; SpO2 | A sudden change in HR; Reduction in blood oxygenation (urgent medical indicator)
4. | EOG | Sintered Ag/AgCl electrodes | Vertical/horizontal/diagonal eye | 128 | 256 | Eyelid movement | Sleepiness / Gaze Detection
5. | EMG | Disposable electrode | Face, Hand, Leg | 256 | 2048 | Surface EMG | Identification of motor tasks (muscle activity, numbness etc.)
6. | BP | BP Cuff / PPG Sensor | Wrist / Finger | - | - | Pressure; Pulse | Hypertension; Healthiness
7. | GSR | Finger electrode | Hand, foot, forehead | 32 | 32 | Sweat Gland Activity | Sudden fear, stress etc.
8. | Body Temp. | - | - | - | - | Skin Temperature | Fatigue and healthiness etc.
9. | RSP | Belt/Nose flow sensor | Thorax, abdominal | 32 | 32 | Respiration Rate | Breathing activity and healthiness etc.
ECG: Electrocardiograph; EEG: Electroencephalograph; EMG: Electromyography; BP: Blood Pressure; GSR: Galvanic Skin Response;
RSP: Respiration (Source: Singh, R. R., 2007).
Heart Rate (HR) and Heart Rate Variability (HRV) can be derived from ECG and PPG signals,
which, along with other physiological signals, are considered to be direct indicators of
the stress level of an individual. Driving might be affected by sudden and unexpected
events such as unanticipated pedestrian crossing, abrupt lane change by another vehicle,
unruly bikers, and sudden appearance from behind blind spots[38]. Healey and Picard (2005)
used physiological signals to detect the stress of automotive drivers. The GSR signal has
been widely used by researchers for assessing the startle response as well as for
indicating the occurrence of sudden stress (Lisseti and Nasoz, 2004).
In multi-modal[39] sensing approaches too, signals like GSR and PPG have been extensively
used for stress detection in varying operating conditions. In a different context, Zhai et
al. (2003) applied SVM classifiers and showed a strong correlation between emotional
states and the corresponding physiological signals, namely GSR, BVP (PPG) and pupil
diameter (PD). Benoit et al. (2006) followed a multimodal strategy to study the effect of
stress on a person's mind in a driving simulator using video data involving factors like
blinking of eyes, yawning and head rotations, as well as physiological signals, viz. ECG
and GSR. Our sensor selection was therefore guided by the fact that both the PPG signal
and the electrocardiograph (ECG) signal can be used for HR and HRV analysis. Since the
SpO2 sensor uses the photoplethysmography technique, it can also be used to extract HR,
percentage SpO2 and breathing rate (Asada et al., 2003).
3.3 Sensors Employed for Data Collection
As shown in Figure 3.2 (a), four physiological sensors were used initially: a body-worn
clip-on Nonin pulse oximeter for PPG and SpO2 signals, GSR Velcro electrodes, an
abdominal respiration belt containing a respiration sensor, and a Lead-II ECG sensor
(having +ve, -ve and GND leads). These sensors continuously communicated the acquired data
to Mind Media BV’s NeXus-10 device[40] and, through it, to an associated research
workbench installed on an HP Compaq Tablet PC via IEEE 802.15.1 (Bluetooth).
The collected signals were displayed at run-time for the experimenter’s reference using
BioTrace+, a biofeedback monitoring software compatible with the NeXus-10 device. For
offline processing, the collected data was later converted into time-series format for the
necessary signal processing with the help of MATLAB.
While collecting data during the driving experiments, it was found that the ECG signal was
not properly acquired because of the movement of the drivers' hands, and the respiration
signal was too noisy. The PPG and GSR signals were comparatively stable with respect to
hand movements but were recorded with some motion artifacts.
[38] A common practice in Indian metropolitan cities.
[39] The term multi-modal here refers to multiple modes of sensing physiological signals;
for example, the PPG sensor uses optical sensing whereas the GSR sensor measures the
skin's conductance by passing weak electrical signals.
[40] Nexus-10 Wireless Monitoring and Biofeedback System; Available Online: http://www.mindmedia.nl/CMS/
Fig. 3.2: Experimental setup for sensing and computing of chosen parametric data using
body-mounted sensors.
We therefore excluded the ECG signal and continued our data collection with only three
signals, as shown in Figure 3.2 (b), i.e. using a pulse oximeter (Device B), a GSR sensor
(Device D) and a respiration sensor (Device C). We chose PPG over ECG because it has been
shown that the PPG signal can be used to derive certain physiological parameters and can
serve as an alternative to the ECG signal[41] (Asada et al., 2003; Lu et al., 2008).
3.3.1 Galvanic Skin Response (GSR) Sensor
The GSR level, or skin conductance (SC) level, is a sensitive indicator of arousal and is
expressed in micro-mho or micro-Siemens (it increases when the arousal level increases and
decreases during relaxation). Because of the NeXus-10's[42] ultra-high 24-bit resolution,
changes as small as 0.001 micro-mho can be detected in the range from 0.1 to 1000
micro-mho. This sensor requires Ag-AgCl finger electrodes. The polarity of the electrodes
is not important.
3.3.2 Pulse Oximetry Sensor
Both pulse oximetry and blood volume pulse (BVP) sensors use photoplethysmography
techniques (Citation). The pulse oximetry sensor gives two outputs, namely the SpO2 pulse and
the percentage SpO2. The SpO2 pulse is quite similar to a BVP pulse waveform, which can be
used to calculate the HR. During each heartbeat, blood flows through the arteries and blood
vessels. As the blood flow increases, the amplitude of the BVP signal increases accordingly.
The height of the BVP signal peak indicates the relative blood flow, which correlates with the
level of vasodilatation / vasoconstriction at that point. The distance between the peaks can
be used to obtain the absolute heart rate (HR) or interbeat interval (IBI). The BVP is measured
in millivolts (mV).
41 Although clinically relevant signal parameters still require ECG signals.
42 NeXus-10 Biofeedback Monitoring Device. User Manual for the BioTrace+ Software. Version 1.1, Mind
Media B. V. Netherlands, 2004-2006.
3.3.3 Respiration Sensor
This sensor measures the relative expansion of the abdomen or thorax during inhalation
and exhalation. It consists of an elastic belt that is worn around the body and a sensor part.
This sensor can be used to calculate the respiration rate parameter. A respiration rate of 6
would mean 6 inhalation-exhalation cycles per minute, or 10 seconds per cycle.
3.4 Data Collection: Requirements and Processes
Considering the requirements of the proposed wearable driver assistance system, in order to
devise a robust algorithm for on-road stress monitoring, it is imperative to analyze
physiological data collected under real-time⁴³ driving scenarios for possible intrinsic patterns
correlating the driver’s behavior under stressful situations with his affective state. This will
enable us to detect incremental changes in the emotions as well as the stress level (reflex
level) of drivers. Features derived from physiological signals, if tracked adaptively, will help
in identifying alarming situations. This requirement guided the design of our
experimental setup, explained earlier, and of the five carefully designed data acquisition
scenarios (two relaxed and three real-time driving scenarios). The advantages of real-time
data collection in different driving scenarios over simulated environments are: (a) training a
classifier on real-time data makes it robust to noise, motion artifacts, device errors etc.; (b)
the effects of factors such as the environment, the vehicle’s characteristics and the driver’s
physiological condition can be considered; and (c) the correlation of stressful events is more
accurate than in simulated conditions (Singh et al., 2011).
While real-time data were collected from 20 professional drivers in all, in the first phase
only a subset (seven) of these was used for initial data collection. The experiments were
conducted in a variety of urban and rural locations through typical regional terrain
comprising stretches of expressways, national highways, country roads and sandy patches
of connecting un-tarred streets in the semi-arid zone of Shekhawati in the desert state of
Rajasthan in India. In the next phase of the experiment we collected data from an additional 13
drivers, making 20 in all. To ensure that the data collected on the first occasion
were indeed representative and correctly acquired, we verified the data quality,
integrity and consistency by repeating the experiments multiple times (typically 3
iterations) with the same driver, the same vehicle and the same stretches of terrain.
43 The term real-time has been commonly used by Intelligent Transportation and biomedical researchers to
mean what otherwise could be called real-life (Ji et al., 2004; Healey and Picard, 2005; Bergasa et al., 2006;
Fairclough, 2009; Tango and Botta, 2013). In contrast, in computing domains this term usually refers to
the time bound within which a time-sensitive operation should be completed.
3.4.1 Data Collection Protocol
Several local professional taxi drivers with varying driver, vehicle and
terrain profiles were contacted. Initially the drivers were skeptical about the experimental
procedure and its use, but later some of them became curious and gave their consent
once the purpose of the research work and its possible outcome had been explained to them. A
nominal amount was also paid to them because they were working for travel agencies
and had to spare their time as well as bear the fuel charges. The travel agency owners were
also duly informed about the work. The data collection protocol is explained below:
Step I: The entire process was explained to the drivers, including the sequence of steps they
would have to go through as well as the way the sensors would be attached to their body.
They were also asked to sign a consent form.
Step II: Upon their arrival, the drivers were asked to relax for about 5 minutes so that pre-
driving data corresponding to their relaxed state could be duly collected.
Step III: Once they appeared relaxed, the GSR, oximetry and respiration sensors
were attached to their body and data were acquired for about 10 minutes. The data
thus acquired are termed the pre-driving (Pr-dr) data in the following
discussion.
Step IV: Immediately after completion of Step III, drivers were asked to get inside the vehicle
with all sensors duly attached to their body and were asked to drive on a pre-defined
campus road which usually witnesses low traffic and stretches over about 4.2
kilometers. Since this stretch of driving could be performed without any visible stress,
we have termed this state as relaxed driving (Rx-dr) state.
Step V: Next, the drivers were asked to leave the campus premises and drive through a
stretch of about 5.5 kilometers covering busy market areas and continuing onto a small
stretch of highway. We labeled this state as the busy driving (By-dr) state.
Step VI: Subsequently, the drivers were asked to retrace the route back to the university
campus but, once inside, to take a different route to the driving origin, to avoid familiarity
with the route taken during relaxed driving. This return journey, labeled as the
return driving (Rt-dr) state, typically covered about 2.5 kilometers.
Step VII: Lastly, upon reaching the origin, drivers were asked to switch off the engine but
remain seated in the vehicle for another 5 minutes during which the post driving
(Po-dr) data was collected.
The experimenter, while interacting with the drivers, made certain assumptions during the
data collection experiments. The experiments were conducted either in the
morning or in the evening, as both the drivers and the experimenter were available during these
periods. Drivers reported being comparatively relaxed during the morning hours (8AM–11AM),
as in most cases they came directly from home, whereas during the evening
hours⁴⁴ (3PM–7PM) they would have been subjected to a certain degree of stress due to
their work routines and driving during the day. However, this
difference was not considered in the present analysis. In addition, the effects of certain
lifestyle parameters such as prior sleep, alcohol and caffeine intake were not included
in the study. To account for the influence of these parameters, which
contribute to the GSR and PPG baseline signals, normalizing procedures were applied to the
GSR and PPG signals, as discussed in Section 3.5.5.1. The drivers were given certain relevant
instructions by the experimenter, which included the following:
• Any kind of prior substance abuse by drivers must be reported to the experimenter.
• Handling instructions for the sensors as well as the data acquisition equipment were given
to the drivers, since sudden hand movements while turning the steering wheel left and
right might introduce motion artefacts.
• The speed limit instructions involved maintaining a speed of 30-35 Kmph inside
the university campus (maximum permissible limit 40 Kmph) and of 40-45 Kmph for busy
driving.
• The drivers were advised to drive in their normal manner;
however, the experimenter observed their behavior carefully and noted any
change in driving style (e.g. from normal / calm to aggressive and vice versa in the
course of driving).
• The experimenter gave appropriate instructions about the driving routes to follow
while seated next to the driver.
• Some other relevant instructions were given as and when required.
All the scenarios are described in detail in the next subsection.
44 The experimenter could not collect data beyond 7 PM due to practical reasons.
3.4.2 Data Acquisition Scenarios
The steps mentioned in Section 3.4.1 were carried out as part of a carefully structured
set of five scenarios, each of which corresponds to one of the steps defined above. Figure 3.3
gives a glimpse of data collection in the relaxed and driving scenarios, whereas Table 3.2 lists
the relevant scenario-related information, including route length, drive time, traffic density
etc.
Fig. 3.3 Sensor Configuration for data collection under (a) Rest Scenarios (Pr-dr and Po-dr) and
(b) Driving Scenario (Rx-dr, By-dr and Rt-dr).
(a) Pre-driving (Pr-dr): In order to assess the relative changes observed in the bodily
parameters of drivers in stressful situations, we need reference data. Such data are best
acquired when the subject is relatively relaxed. We therefore chose the morning hours⁴⁵ in most
cases as the time to invite a given driver. Although most of the
drivers were found to be sufficiently relaxed upon arrival, some of them did exhibit a certain
degree of anxiety before driving with the body-mounted sensors for the first time. In such
cases, the experimenter tried to reduce this anxiety by way of a brief
conversation with them prior to mounting the sensors on their bodies, and then proceeded
to collect samples as mentioned in Step III. This reference data could thus be compared with
the data collected in the other scenarios, and the relative changes observed could be used for
further analysis.
45 While it is generally true that most of the people (drivers included) are usually relaxed in the morning hours
after a good night's sleep, there might be a small percentage of aberrations as well. For instance drivers suffering
from certain forms of addiction, insomnia or intense family stress etc. may not be actually relaxed as assumed.
In this scenario, data was collected over a sampling period of 8-10 minutes as shown in
Table 3.2. This scenario corresponds to the initial affective state of the driver before
commencement of the driving experiment.
(b) Relaxed-driving (Rx-dr): Since we wanted to observe the incremental changes in the
drivers' physiological state, our first driving scenario used a route with lower
traffic volume (as shown in Table 3.2) and familiar terrain, with known routes, type and
average speed of traffic, and type and locations of speed breakers, potholes etc.
Accordingly, the drivers were asked to drive through this terrain for about 9-10 minutes
(Table 3.2) maintaining an appropriate speed (observed speeds were typically in the range of
30-40 kilometers per hour).
Fig. 3.4 Satellite route map of Relaxed Driving (Rx-dr) Scenario.
The route within the campus was however carefully chosen to include several sharp turns
(as many as 8-9), areas of higher than average pedestrian traffic and relatively low vehicular
traffic. The path followed by drivers in the relaxed driving (Rx-dr) scenario has been
illustrated in Figure 3.4.
(c) Busy-driving (By-dr): Next, the drivers were asked to move out of campus towards areas
characterized by a relatively busy traffic pattern for about 9 minutes, as shown in Table 3.2,
where they could not exceed a speed limit of 45 Kmph. The drivers drove through semi-urban
busy roads including a busy market area, sandy patches, busy four-way crossings and un-tarred
country roads. An increased stress level was observed due to traffic congestion and pollution
(Table 3.2). The path followed by drivers during this phase is shown in Figure 3.5 as an
overlay on the Google Maps Terrain View.
On some days, fewer stress-trends were observed during By-dr than during Rx-dr (as shown in
Table 3.2) because of low traffic volumes on the busy roads, whereas the relaxed driving route
had a fixed pattern of trend markers owing to its many sharp turns and speed breakers, besides
pedestrians and cyclists. However, the stress-trends in By-dr have relatively more influence on
the stress level than those in the Rx-dr scenario.
(d) Return-driving (Rt-dr): As soon as we reached the campus gate again, we returned to
the starting point via a different route, shown in Figure 3.6. This phase continued for
about 4 minutes (Table 3.2). The relatively low amplitudes observed in the GSR signal in
this scenario suggested that the return route within the campus was
characterized by low stress and mental workload.
Table 3.2. Data Collection Scenarios (Source: Singh et al., 2013b)
Scenarios | Location | Route Length | Time (min.) | Speed-Limit (Kmph) | Stress-Trends | Traffic Density: Ped. (per m2) | Traffic Density: 2-Wh./Bi. (per m2) | Traffic Density: 4-Wh. (per m2)
Pr-dr | Lab. | - | 10 | - | - | - | - | -
Rx-dr | Driving | ~4.2 kms | 7 - 9 | 35 (Max. 40) | 18 - 22 | 0 - 0.6 | 0 - 0.3 | 0 - 0.1
By-dr | Driving | ~5.5 kms | 7 - 10 | 40 - 45 | 15 - 25 | 0.6 - 1.0 | 0.3 - 0.6 | 0.1 - 0.2
Rt-dr | Driving | ~2.5 kms | 3 - 4 | 30 - 35 | 10 - 15 | 0 - 0.4 | 0 - 0.2 | 0 - 0.05
Po-dr | Lab. | - | 5 | - | - | - | - | -
Legends: Pr-dr: Pre-driving; Rx-dr: Relax Driving; By-dr: Busy Driving; Rt-dr: Return Driving; Po-dr: Post Driving;
Lab.: Laboratory; min: Minutes; Kmph: Kilo meters per hour; Ped.: Pedestrian Count; 2-Wh.: Two Wheeler
Count; Bi.: Bicycle Count; 4-Wh.: Four Wheeler Count
Fig. 3.5 Satellite route map of Busy Driving (By-dr) Scenario.
Fig. 3.6 Satellite route map of Intracampus-return Driving (Rt-dr) Scenario.
(e) Post-driving (Po-dr): In the last phase, as soon as the vehicle stopped, the drivers were
asked to remain in their seat with the engine switched off and make themselves comfortable for
about 5 minutes. We took this set of observations in the last phase hoping to find
some evidence of a return to the original relaxed pre-driving state; however, our
observations do not necessarily indicate that pattern.
In all the driving scenarios (2-4), care was taken to ensure that the drivers did not feel undue
discomfort while driving with the sensors mounted on their bodies during the data
collection process, although some initial anxiety and discomfort due to the body-worn sensor
setup was noticed.
Salient observations and strategies followed during the data collection process described
above are as follows:
• The total driving time for each driver was nearly 24 minutes, covering a distance
of approximately 11.5 kilometers (Singh et al., 2013).
• The number of driving hours could not be extended because of the limited instrumentation
support as well as the need to replace the batteries after each session⁴⁶.
• During on-road driving, several factors contribute to stress, including repeated
distractions and stressful events such as negotiating sharp or circular turns as well as left
or right turns, driving through busy market areas with high vehicle and pedestrian
density, handling bad stretches of streets, negotiating ill-designed or ill-marked speed
breakers, abrupt lane changes by a neighboring vehicle, jaywalkers etc. Henceforth in
this thesis such on-road events are referred to as stress-trends. Automatic detection
of such stress-trends would enable the wearable computer to activate and respond during
accident-prone spells of driving. To account for these stress-trend markers, a
person assisting the experimenter helped in annotating the time-stamps of the
stress-trends.
• Unlike roads, highways and expressways in the developed world, many towns and cities
in the developing world have no separate lanes for bikers and cyclists and no pathways
for pedestrians. It is not uncommon to find people ignoring common norms such as
crossing at clearly marked zebra crossings, underpasses or foot overbridges, even where
these are present across regular in-town roads, highways etc. This adds to the stress of
driving on such roads. It was in this context that our observations included identifying the
pedestrian density in select segments along the routes chosen.
• Traffic density was also recorded in each phase of observation throughout the
entire data collection cycle (shown in Table 3.2).
46 It was observed that each data collection session took nearly an hour to complete the entire process involving
all the five scenarios discussed, and after an hour the batteries were completely drained.
These observations helped us devise a timeline chart to label the data for the
different stages of analysis. For affective state recognition, we classified the data
collected under Pr-dr and Po-dr as the Relaxed affective state, under Rx-dr and the latter half of
Rt-dr as the Moderate affective state, and under By-dr and the first half of Rt-dr as the Stressed
affective state (as shown in the Stress Graph in Fig. 3.7). This annotation of the driver’s
affective state is justified by our observation of the different parameters of the external driving
environment, including traffic density, pedestrian density, traffic congestion and pollution, as
shown in Fig. 3.7. Out of the 20 drivers selected for data collection, we carefully chose 9
drivers for training the classifiers for physiological monitoring. The data collected from these
selected drivers had minimal sensor errors, motion artefacts and corrupt data segments.
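The labeling rule described above can be written as a small lookup. The following is a minimal sketch, assuming that the first and latter halves of Rt-dr are split at the midpoint of the recording; the function name, argument names and label strings are illustrative and not taken from the thesis software.

```python
# Scenario codes and labels follow the text; the midpoint split of Rt-dr is an
# assumption about how "first half" and "latter half" were delimited.
RELAXED, MODERATE, STRESSED = "Relaxed", "Moderate", "Stressed"


def affective_label(scenario, t, duration):
    """Return the affective-state label for time t (seconds) within a scenario.

    duration is the total length of that scenario's recording in seconds; it is
    only used to split Rt-dr into its first and latter halves.
    """
    if scenario in ("Pr-dr", "Po-dr"):
        return RELAXED
    if scenario == "Rx-dr":
        return MODERATE
    if scenario == "By-dr":
        return STRESSED
    if scenario == "Rt-dr":
        return STRESSED if t < duration / 2 else MODERATE
    raise ValueError(f"unknown scenario: {scenario}")
```

For a 4-minute return drive, for example, a sample at 60 seconds would be labeled Stressed and a sample at 180 seconds Moderate.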
The original data, as collected during the real-time data collection phases, contained several
kinds of noise, including motion artifacts, sensor errors and corrupt
data segments. Section 3.5.5 explains how these data were made suitable for further
analysis by systematic removal of some of these elements of noise.
Fig. 3.7. Timeline Chart
3.5 Processing the Data acquired from Real-Time Signals
The collected signals were cleaned using manual selection, commercial-software-based
artifact rejection and signal processing appropriate to the individual
signals, as explained in the following subsections.
3.5.1 Data Analysis Strategies and Mechanisms
The data analysis methods must ensure that the signals represent the characteristic patterns
of interest for assessing the stress level of drivers. Therefore, the following steps and
requirements were identified as part of the data acquisition and analysis strategy:
• A preliminary statistical analysis of the collected signals should be done to understand
whether the raw data have representative statistical significance.
• The necessary preprocessing, filtering, motion artifact removal etc. must be carried out to
remove erroneous information present in the data.
• Feature extraction methods for preparing a database for classifier
training have to be identified.
• The statistical significance of the extracted features must be established.
• The necessary feature selection methods must be adopted to select only relevant signals
for classifier training.
• Pattern recognition classifiers or models should be identified by
considering the characteristics of the data.
• Training and evaluation on the prepared data should be carried out for the selected
population by considering the intra- as well as inter-subject variability.
The subsequent sections present the data analysis methodology outlined above, together
with the need for and significance of each step.
3.5.2 Manual Observation
The data collected under each of the five scenarios was converted into a time-series data
format for necessary signal processing procedures in an offline workstation using
MATLAB®. We computed simple statistical parameters on the data collected from 14 drivers
initially and observed the following variations manually (Singh and Banerjee, 2010) during
the field survey:
1) GSR:
Three major variations were observed in the GSR signals: sudden or
abrupt changes, increased GSR levels and decreased GSR levels.
Sudden or abrupt changes were caused by:
o sudden application of brakes (e.g. a pedestrian unexpectedly crossing a busy road, or
a cyclist losing concentration and suddenly coming into the path of the vehicle)
Increased GSR levels were caused by:
o badly designed / built speed-breakers, potholes etc.
o taking a turn at relatively high speed, or situations requiring the negotiation of sharp turns
o overtaking of / by a vehicle without signals
Decreased GSR levels were noticed during:
o driving at a constant speed
o the relaxed recording state (when asked to relax).
2) SpO2:
An increase in PPG pulse height was observed when the data acquired during pre-driving
state was compared to the one collected during relaxed driving state, whereas a very slight
decrease in pulse height was observed from relaxed driving to busy driving. The percentage
SpO2 varied between 96% - 98% in all the experiments.
3) Respiration:
As this sensor mainly measures abdominal respiration activity, the signal changed
abruptly during activities such as coughing and sneezing. Motion artifacts were marked
during the appropriate stages of the driving scenarios.
3.5.3 Preliminary Statistical Analysis
For the purpose of preliminary analysis, we extracted statistical parameters like mean,
standard deviation, minimum and maximum value of the GSR, Respiration, %SpO2 value
and PPG pulse signals for each scenario (Singh and Banerjee, 2010). Near-uniformity
between the driver’s data-collection and sampling durations was ensured by limiting the
analyzed amount of the actual data for each scenario as: Pre-driving = 9 minutes, Relaxed-
driving = 9 minutes, Busy-driving = 8 minutes, Intra-campus Return-driving = 3 minutes and
Post-driving = 5 minutes.
The analysis of variance (ANOVA) was chosen as the statistical method because it allowed
us to test whether the observed differences between groups were significantly large. We
specifically employed two-way ANOVA on the mean values of each signal to analyze whether the
driver and scenario factors had any significant effect. The data for GSR, Respiration and
%SpO2 showed statistically significant results at the 5% significance level, whereas the PPG
pulse showed insignificant results. Further, we performed one-way ANOVA to validate our
finding, which showed that the differences in the mean values of the GSR, Respiration and SpO2
signals were indeed statistically significant.
Although the scenarios were not found to be significant in the initial analysis (Table 3.3
below), the situation changed in the subsequent analysis using one-way
multivariate analysis of variance (one-way MANOVA).
Table 3.3: Two-Way ANOVA Analysis (Source: Singh and Banerjee, 2010)
Signals | Drivers | Scenarios
GSR | Significant | Not significant
RSP | Significant | Not significant
%SpO2 | Significant | Not significant
PPG pulse | Not significant | Not significant
The one-way MANOVA returned an estimate of the dimension ‘d’ of the space
containing the group (scenario) mean values by testing the null hypothesis that the means for
each scenario were the same n-dimensional multivariate vector. Also, it was possible to
reasonably establish that any difference observed in the other parameters (signals and drivers)
was due to random chance. The one-way MANOVA returned d = 0, which
indicates that there was no evidence to reject this hypothesis.
Based on the data extracted from a carefully chosen population comprising a significant
sample size⁴⁷ of 67,200 each for GSR, %SpO2 and Respiration, as well as 268,800 samples
of PPG, this work tries to estimate select characteristics relevant to the BITS Life-Guard
Wearable Computer. A very large sample size was consciously chosen for data collection and
analysis. For instance, for the five scenarios (Pr-dr, Rx-dr, By-dr, Rt-dr and Po-dr),
the chosen sample sizes were 134,400; 120,960; 107,520; 40,320 and 67,200
respectively. This large sample size allowed us to satisfy the condition of consistency (Singh
and Banerjee, 2010).
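The sample counts quoted above can be cross-checked with simple arithmetic. The sketch below is one reading that reproduces them, not necessarily the calculation used in this work. It assumes sampling rates of 32 Hz for GSR, %SpO2 and Respiration and 128 Hz for PPG (the GSR and PPG rates are those stated in Section 3.5.5; the other two are assumptions), and a single recording of 10, 9, 8, 3 and 5 minutes for the five scenarios. All names in the code are illustrative.

```python
# Assumed sampling rates in Hz. Only the GSR and PPG rates appear in the text;
# the %SpO2 and Respiration rates are assumptions.
RATES_HZ = {"GSR": 32, "%SpO2": 32, "Respiration": 32, "PPG": 128}

# Assumed analyzed duration of each scenario in minutes.
MINUTES = {"Pr-dr": 10, "Rx-dr": 9, "By-dr": 8, "Rt-dr": 3, "Po-dr": 5}


def samples_per_signal(rates, minutes):
    """Samples of each signal over all scenarios of one recording."""
    total_seconds = 60 * sum(minutes.values())
    return {name: rate * total_seconds for name, rate in rates.items()}


def samples_per_scenario(rates, minutes):
    """Samples of all signals combined within each scenario."""
    per_second = sum(rates.values())
    return {name: per_second * 60 * m for name, m in minutes.items()}


if __name__ == "__main__":
    print(samples_per_signal(RATES_HZ, MINUTES))
    print(samples_per_scenario(RATES_HZ, MINUTES))
```

Under these assumptions each per-signal count covers 35 minutes of recording, and each per-scenario count is the sum over all four signals.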
Also, the unbiasedness condition, which is required for ideal statistical inference, was
reasonably satisfied. This is because the representative samples of the three kinds of targeted
population (long-distance drivers, short-distance drivers and casual drivers) collected so
far during field tests are nearly equally distributed (Long-distance = 35.71%, Short-distance =
35.71% and Casual = 28.57%) (Singh and Banerjee, 2010).
47 These sample sizes were obtained for the preliminary statistical analysis only out of the data collected as
described in Section 3.4.2.
3.5.4 Challenges faced in Signal Preprocessing
In both the relaxed states (Pr-dr and Po-dr), the signals received were at their expected levels
and exhibited the expected structural properties, as shown in Figure 3.8. However, noticeable
differences in signal levels were found in all phases of driving. Motion artifacts due to
driving-related tasks such as hand movements, jerky motion of the vehicle, sudden braking
and diversions were reflected as noisy data for short periods of time, shown
as bad segments in Figure 3.9. In addition, some of the bad segments might also indicate
the presence of noise due to sensor errors.
Figure 3.8: Clean Signals sampled during Pre-driving Scenario
Figure 3.9 Noisy Signals sampled during Drive with Motion Artifacts and Sensor Errors
3.5.5 Approach for Physiological Signal Processing
Preprocessing of physiological signals collected in real-time scenarios is necessary
because of the occurrence of high-frequency noise, motion artifacts and sensor errors. For the
present work, we used a manual artifact rejection technique, in which only those
segments of the signals collected from each subject that were free of motion artifacts and
sensor errors were used (BioTrace+ Manual V1.1, 2004-2006). The PPG signal was
downsampled from 128 Hz to 32 Hz to make its sampling rate compatible with that of the GSR
signal, which is sampled at 32 Hz. For multimodal signal analysis, the signals
must be brought to a compatible sample rate to allow the necessary signal processing tasks
(Oppenheim, 2006). A time window of 10 seconds, resulting in 320 samples each of GSR and PPG,
was chosen because the spread of a GSR signal is available over a 10-second time
frame.
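The rate conversion and windowing described above can be illustrated as follows. This is a minimal sketch under stated assumptions: the source does not specify the decimation method, so block averaging over 4 samples is used here as a simple stand-in, and the windows are assumed to be consecutive and non-overlapping. The function names are illustrative.

```python
def downsample_by_averaging(signal, factor=4):
    """Reduce the sample rate by an integer factor (128 Hz -> 32 Hz for factor 4).

    Each output sample is the mean of `factor` consecutive input samples, which
    acts as a crude anti-aliasing step. Trailing samples that do not fill a
    complete block are dropped.
    """
    n_blocks = len(signal) // factor
    return [sum(signal[i * factor:(i + 1) * factor]) / factor
            for i in range(n_blocks)]


def split_into_windows(signal, fs=32, window_seconds=10):
    """Split a signal into consecutive, non-overlapping windows.

    With fs = 32 Hz and 10-second windows, each window holds 320 samples.
    An incomplete final window is discarded.
    """
    size = fs * window_seconds
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]
```

For example, 20 seconds of PPG sampled at 128 Hz (2,560 samples) would yield 640 samples at 32 Hz, i.e. two windows of 320 samples each.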
3.5.5.1 Normalization and Spike Removal
To compensate for subject-specific physiological signal baselines and lifestyle-
dependent factors, a min-max normalization technique was used (Benoit et al., 2009).
Using Eqn. 3.1, the collected data were min-max normalized using the maxima and minima
of the physiological signals recorded between 30 and 60 seconds of the Pre-driving scenario. The
first 30 seconds of data were discarded under the assumption that this was the time taken by the
driver to become familiar with the experimental setup.
(3.1)
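The body of Eqn. 3.1 is not reproduced in this transcript. The sketch below therefore assumes the standard min-max form, x_norm = (x - x_min) / (x_max - x_min), with x_min and x_max taken from the 30-60 second baseline of the Pre-driving recording. The function names and the 32 Hz default rate are illustrative assumptions.

```python
def pre_driving_baseline(pre_driving, fs=32, start_s=30, end_s=60):
    """Return the baseline segment (30-60 s by default) of a Pre-driving signal."""
    return pre_driving[start_s * fs:end_s * fs]


def min_max_normalize(signal, baseline):
    """Min-max normalize a signal using the extrema of a baseline segment.

    Values outside the baseline range map outside [0, 1]; this is expected,
    since driving data may exceed the levels seen at rest.
    """
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline is constant; cannot normalize")
    return [(x - lo) / (hi - lo) for x in signal]
```

Because the extrema come from the resting baseline rather than from the signal being normalized, the same scaling can be applied to every scenario of a given driver.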
In the next step, the normalized physiological signals were filtered and pre-processed to
remove signal noise, motion artefacts and sensor errors prior to feature extraction. A one-
dimensional median filter of size 3 was used to remove signal spikes and impulse noise.
Sections 3.5.5.2 and 3.5.5.3 describe the methodology adopted for extracting features from
the normalized and filtered physiological signals.
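A size-3 median filter replaces each sample by the median of itself and its two neighbors, which removes isolated spikes while leaving step changes intact. The following is a minimal sketch; the edge handling (repeating the first and last samples) is an assumption, since the source does not state how the boundaries were treated.

```python
from statistics import median


def median_filter(signal, size=3):
    """One-dimensional median filter with an odd window size.

    Edges are handled by repeating the first and last samples, which is an
    assumption made for this sketch.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    n = len(signal)
    filtered = []
    for i in range(n):
        # Clamp neighbor indices to the valid range to replicate the edges.
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - half, i + half + 1)]
        filtered.append(median(window))
    return filtered
```

A single-sample spike such as the 9 in [1, 1, 9, 1, 1] is removed, whereas a sustained step from 0 to 5 passes through unchanged.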
3.5.5.2 Galvanic Skin Response Signal Processing
The GSR signal is a bio-electric physiological signal controlled by the sympathetic activity
of the human nervous system (Gorini and Rival, 2008). It is a function of sweat gland
activity. Whenever a startle response, sudden fear, anxiety etc. occurs,
the GSR signal morphology changes. In real-time driving, such events would lead to stress.
3.5.5.2.1 Signal Decomposition
The GSR signal comprises two components: phasic and tonic. Activity sustained
over a period of time is reflected in the skin conductance level (SCL), known as the tonic
component, whereas the response to a stimulus over a short duration is reflected in the skin
conductance response (SCR), known as the phasic component (Schmidt and Walach, 2000).
In the present case, the tonic component was extracted by low-pass filtering the normalized
signal using a Butterworth filter of order 3 with a cut-off frequency of 0.16 Hz. The
phasic component, comprising the frequency components in the band from 0.16 Hz to
2.1 Hz, was extracted using a Butterworth band-pass filter of order 3 and was
appropriately corrected for the time delays observed.
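The band split can be illustrated with a deliberately simplified stand-in. The sketch below does not implement the order-3 Butterworth filters used in this work; it uses a first-order (single-pole) low-pass filter instead, so that the example needs only the standard library. The tonic estimate is the signal low-passed at 0.16 Hz, and the phasic estimate is the difference between the signal low-passed at 2.1 Hz and the tonic estimate. No delay correction is applied, and all names are illustrative.

```python
import math


def single_pole_lowpass(signal, cutoff_hz, fs=32.0):
    """First-order IIR low-pass filter (a stand-in, not a Butterworth filter)."""
    if not signal:
        return []
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = [signal[0]]
    for x in signal[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out


def decompose_gsr(signal, fs=32.0, tonic_cutoff_hz=0.16, phasic_upper_hz=2.1):
    """Approximate tonic (below 0.16 Hz) and phasic (0.16-2.1 Hz) components."""
    tonic = single_pole_lowpass(signal, tonic_cutoff_hz, fs)
    smoothed = single_pole_lowpass(signal, phasic_upper_hz, fs)
    phasic = [s - t for s, t in zip(smoothed, tonic)]
    return tonic, phasic
```

A first-order filter has a much gentler roll-off than an order-3 Butterworth filter, so the separation between the two components is correspondingly less sharp; the sketch only shows how the two cut-off frequencies partition the signal.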
3.5.5.2.2. Peak and Point of Onset Detection
A sudden rise in skin conductance is recorded due to sympathetic nervous system
activity whenever ions fill the skin's sweat glands (Healey and Picard, 2005). This sudden
rise may occur in response to some stimulus and is characterized by skin conductivity features,
which need to be extracted from the phasic component of the GSR signal. The present algorithm
requires that the response onsets and peaks be detected in order to record the changes in
signal morphology. Ktonas’ 7-point Lagrangian interpolation algorithm, which uses the 1st
and 2nd derivatives of the GSR signal, was used for peak detection (Zhai et al., 2005).
The 1st and 2nd derivatives, given by Eqn. 3.2 and Eqn. 3.3 respectively, correspond to the
i-th time instance, where the GSR signal is denoted by
gsr and fs is the sampling frequency.
(3.2)
(3.3)
Each GSR segment was divided into 5 sub-segments of 2 seconds each, with 64 sample
points per sub-segment. In each sub-segment, the points at which the second derivative was
minimum and which corresponded to a zero-crossing of the 1st derivative were analyzed as
possible peak coordinates. Additionally, a point was classified as a peak only if the GSR phasic
value at that point exceeded a threshold of 0.4 µSiemens and the absolute value of
GSR exceeded the adaptive threshold GSRthresh (Eqn. 3.4), calculated from the 10-second
window.
(3.4)
After a peak was detected, the Point of Onset was taken as the first point within 3.2 seconds
to the left of the peak at which the 1st derivative of the GSR just crossed a threshold of 0.002,
i.e. close to a zero-crossing. Once the peaks are detected, the algorithm
checks for multiple peaks, i.e. peaks that are less than 0.5 s apart, and eliminates them,
keeping only the highest one based on the absolute value of GSR. Such multiple peaks
would otherwise appear as if they were separate stimuli or stress responses, and are therefore
discarded.
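The elimination of multiple peaks can be sketched as a single pass over the detected peak positions. This is an illustrative reading of the rule, assuming that each candidate is compared with the most recently retained peak and that the sampling rate is 32 Hz; the function and parameter names are not taken from the thesis software.

```python
def merge_close_peaks(peak_indices, gsr, fs=32, min_gap_s=0.5):
    """Keep only the highest peak among peaks less than min_gap_s apart.

    peak_indices are sample positions into gsr. Peak height is judged by the
    absolute GSR value, as described in the text.
    """
    min_gap = min_gap_s * fs
    kept = []
    for idx in sorted(peak_indices):
        if kept and idx - kept[-1] < min_gap:
            # Too close to the last retained peak: keep the higher of the two.
            if abs(gsr[idx]) > abs(gsr[kept[-1]]):
                kept[-1] = idx
        else:
            kept.append(idx)
    return kept
```

At 32 Hz the 0.5-second limit corresponds to 16 samples, so two candidates 10 samples apart are merged into one, while a candidate 40 samples later is retained as a separate response.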
3.5.5.3 Photoplethysmography Signal Processing
Owing to its non-invasive nature, the PPG signal has been
established as an alternative to the ECG signal in both clinical and non-clinical
applications. The morphology of the pulsatile component of the PPG signal has been used to
extract certain clinically significant parameters such as pulse, HR and heart rate variability
(HRV) (Asada et al., 2003; Linder et al., 2006). Additional spectral and statistical
features relevant to human stress can be derived from the time series of the instantaneous
heart rate obtained from the PPG signal.
3.5.5.3.1. Motion Artifact Removal
A one-dimensional median filter with an order of 4 samples was used for pre-processing the
PPG signal. This was followed by geometric reconstruction of lost peaks by sub-segment
replacement, using the cross-correlation detection method of Weng et al. (2005). The
reconstructed PPG signal was then de-trended to remove any sensor baseline drift.
3.5.5.3.2. Instantaneous Heart Rate Extraction
Instantaneous heart rate can be extracted by adaptively detecting the systolic peaks and
diastolic troughs in the filtered PPG signal. Peaks are the local maxima of the PPG signal and
correspond to the systolic values, whereas troughs are the local minima and correspond to the
diastolic values; both coincide with zero-crossings of the 1st derivative of the PPG signal
(Linder et al., 2006). Multiple peaks were eliminated by fixing the minimum peak-to-peak and
trough-to-trough interval at 16 samples (0.5 seconds). PPG syntactic features were extracted
using the identified peak and trough coordinates as explained in Section 3.5.5.4.3.
The technique proposed by Linder et al. (2006) was utilized to derive the heart rate (HR)
signal required for heart rate variability (HRV) analysis from the PPG signal. The formula
given in Eqn. 3.5 was used to derive the instantaneous heart rate time series from the
peak-to-peak interval. This method uses the difference between consecutive peaks in digital
time coordinates.
(3.5)
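The conversion from peak-to-peak intervals to an instantaneous heart rate series can be illustrated with the short sketch below. It follows the definition of PPGIHR used in this chapter (60 divided by the time difference between two consecutive peaks); the function name and the default sampling rate of 32 Hz (inferred from 16 samples corresponding to 0.5 seconds) are assumptions made for illustration.

```python
def instantaneous_hr(peak_samples, fs=32):
    """Return heart rate in beats per minute for each pair of consecutive peaks.

    peak_samples: sample indices of the detected systolic peaks, in increasing order.
    fs: sampling frequency in Hz (assumed value).
    """
    return [60.0 * fs / (b - a) for a, b in zip(peak_samples, peak_samples[1:])]
```

For example, peaks that are 32 samples apart at 32 Hz are one second apart and give 60 beats per minute.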
3.6 Extracting Features from Physiological Signals
Features are condensed representations of patterns that retain the salient information with
minimal loss of significant information. A feature may represent any structural characteristic,
a transform, or some kind of structural description or graph extracted from an input pattern.
Feature extraction reduces a large volume of input data to a set of features for further
processing to complete a desired task (Ciaccio et al., 1993).
3.6.1 Methods of Feature Extraction
Two main types of feature extraction methods are in use, representing (i) statistical
characteristics and (ii) syntactic descriptions of features, as discussed by Ciaccio et al.
(1993); they can be further subdivided into four categories, as shown in Table 3.4 below.
Among these, the present work used the three most common feature extraction methods, namely
(a) statistical, (b) syntactic and (c) transform based, as described in the subsequent
sections (Ciaccio et al., 1993).
Table 3.4: Feature Extraction Methods
S. No. | Feature Extraction Methods | Examples
1. | Non-transformed structural characteristics | moments, power, phase information, and model parameters
2. | Transformed structural characteristics | frequency spectra and subspace mapping methods
3. | Structural descriptions | formal languages and their grammars, parsing techniques, and string matching techniques
4. | Graph descriptors | attributed graphs, relational graphs, and semantic networks
3.6.2 Statistical Features
Physiological signals can be classified as random signals, as there is always some degree
of uncertainty involved in their occurrence (Lessard, 2006), i.e. they are characterized by
their stochastic nature. Hence their statistical characteristics are important for the present
analysis. Moreover, statistical features can be computed easily in real-time scenarios
(Picard et al., 2001).
In statistics, the general method of obtaining a measure of a distribution is to calculate its
moments. The 1st moment about the origin is often referred to as the “Mean” or “Average
Value”, which is a measure of central tendency or the centroid. Higher moments are generally
taken about some point other than the origin; the 2nd moment about the mean is the variance
of the distribution, which is a measure of dispersion. Using these statistical parameters, the
statistical features were extracted for both the GSR and PPG signals, with an additional
feature for the PPG, as shown in Table 3.5.
Table 3.5: Statistical Features
S. No. Feature Name Description / Formula
1. Mean
2. Signal Energy
3. Time Duration
4. Bandwidth
5. Time-Bandwidth Product
6. Dimensionality
3.6.3 Galvanic Skin Response (GSR) Syntactic Features
Syntactic features are derived from the geometry of the signals. When extracted, this
structural information helps in classification and description. The features provide
contextual information about the signals with respect to the stimulus-based responses observed
in them. The SCRs are characterized by four parameters: amplitude (peak), latency, rise time,
and half recovery time, as below:
(a) Amplitude: for an event-related GSR, the difference between the skin conductance
level at the time the response was evoked and the skin conductance at the peak of the
response.
(b) Latency: time between the stimulus and the onset of the event-related GSR peak
(value should be about three seconds or less).
(c) Rise Time: time between the onset of the event-related GSR and the peak of the
response (typical value between 1 – 3 sec).
(d) Half Recovery Time: time between the peak of the response and the time after the
peak when the conductance returns to an amplitude that is one-half the amplitude of
the peak (typical value between 2 – 10 sec).
Fig 3.10 Galvanic Skin Response (GSR) Syntactic Features.
These structural details as discussed above were used to extract the corresponding
syntactic features as shown in Fig. 3.10 for a 10 sec. window.
Fig. 3.11. Galvanic Skin Response Syntactic Features during Busy Driving
Section 3.5.5.2.2 explained the method used to identify the peaks and points of onset.
These peaks and points of onset were used to extract the syntactic features characterizing
the signal's morphology, as described in Table 3.6 with their corresponding mathematical
formulations and their clinical significance as discussed by Schmidt and Walach (2000);
Healey and Picard (2005); Soleymani et al. (2008). Fig. 3.11 shows some of the features
extracted from a GSR signal segment corresponding to a busy driving scenario.
Table 3.6: Syntactic GSR Features
S. No. | Feature Name | Description / Formula | Clinical Significance
1. | GSR Peak Rise Time Sum (GPRTS) | Peak Rise Time = Time of Occurrence of Peak - Time of Point of Onset | GPRTS is the sum of response durations. It is an indirect indicator of the response time of the subject.
2. | GSR Peak Amplitude Sum (GPAS) | Peak Amplitude = GSR value at Peak - GSR value at Point of Onset | GPAS is an indicator of the intensity of stress observed by the driver.
3. | GSR Half-Recovery Sum (GPHRS) | Half-Recovery Time = Time of Occurrence of Half Amplitude - Time of Occurrence of Peak | GPHRS is an indicator of recovery time after occurrence of a stressor.
4. | GSR Peak Energy Sum (GPES) | Peak Energy = 0.5 * Peak Amplitude * Peak Rise Time | GPES is an indicator of the intensity of stress experienced by the subject. Higher the GPES value, greater is the stress experienced by the driver.
5. | GSR Rise Rate Average (GRRA) | Average Rise Rate = Sum Average of 1st derivative of points with 1st derivative > +ve Threshold (0.025) | GRRA is calculated from Tonic GSR. It gives an indication of how the Global GSR level is varying as time progresses.
6. | GSR Decay Rate Average (GDRA) | Average Decay Rate = Sum Average of 1st derivative of points with 1st derivative < -ve Threshold (-0.025) | GDRA is calculated from Tonic GSR. It is an indirect measure of the relaxation pattern experienced by the driver.
7. | GSR Percentage Decay (GSRPD) | GSR Percentage Decay = Percentage of Time samples in given segment with 1st derivative < 0. | GSRPD is related to tonic component of GSR and reflects on the global stress level driver is experiencing.
8. | GSR No. of Peaks | Number of peaks in a given segment. | GNP is an indicator of the number of stressors experienced by the driver in a 10-second segment.
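The peak-based entries of Table 3.6 can be illustrated with the sketch below, which computes GPRTS, GPAS, GPES and the number of peaks from paired onset and peak indices. It is a hedged illustration only: the function name, the pairing of each peak with exactly one onset, and the default sampling rate of 32 Hz are assumptions, and the half-recovery and rate features are omitted for brevity.

```python
def gsr_peak_features(gsr, onsets, peaks, fs=32):
    """Sum-type GSR syntactic features for one segment.

    gsr: list of GSR samples; onsets, peaks: paired sample indices (onset i
    belongs to peak i); fs: sampling frequency in Hz (assumed value).
    """
    rise = [(p - o) / fs for o, p in zip(onsets, peaks)]   # peak rise times (s)
    amp = [gsr[p] - gsr[o] for o, p in zip(onsets, peaks)]  # peak amplitudes
    energy = [0.5 * a * r for a, r in zip(amp, rise)]       # triangular peak energy
    return {
        "GPRTS": sum(rise),
        "GPAS": sum(amp),
        "GPES": sum(energy),
        "GNP": len(peaks),
    }
```

A segment without detected peaks simply yields zero for all four features.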
3.6.4. Photoplethysmogram (PPG) Syntactic Features
The following syntactic features were extracted from each 10-second segment, as shown
in Table 3.7 with their clinical significance as investigated by Shamir et al. (1999);
Hjortskov et al. (2004); Ryoo et al. (2005); Weng et al. (2005); Linder et al. (2006). Fig.
3.12 shows the syntactic features extracted during a relaxed driving scenario.
Table 3.7: PPG Syntactic Features
S. No. | Feature Name | Description / Formula | Clinical Significance
1. | Pulse Height | Average of (Value of PPG peak - Value of PPG trough) in a segment | The pulse height is proportional to pulse pressure and is useful in the analysis of loss of blood pressure and constriction of the arterioles perfusing the dermis.
2. | PPG Rise Time (PPGRT) | Average of (Time of Peak - Time of Preceding Trough) in a segment | PPGRT is a measure of the general circulatory performance and indicates normal metabolic activity during the systolic phase.
3. | PPG Fall Time (PPGFT) | Average of (Time of Trough - Time of Preceding Peak) in a segment | PPGFT is a measure of the general circulatory performance and indicates normal metabolic activity during the diastolic phase.
4. | PPG Cardiac Period (PPGCP) | Average of Period of PPG signal in a segment | PPGCP is useful in deriving the heart rate variability parameters.
5. | PPG Instantaneous Heart Rate (PPGIHR) | 60 / (Time Difference between two consecutive peaks) | PPGIHR is an indicator of mental stress.
Fig. 3.12. PPG Syntactic Features extracted under Relaxed Driving
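The averages listed in Table 3.7 can be computed from the detected trough and peak coordinates as in the following sketch. It assumes, for illustration only, that the segment starts with a trough and that troughs and peaks strictly alternate; the function name and the 32 Hz default sampling rate are likewise assumptions.

```python
def ppg_syntactic_features(ppg, troughs, peaks, fs=32):
    """Average pulse height, rise time, fall time and cardiac period of a segment.

    troughs[i] is assumed to precede peaks[i], and peaks[i] to precede troughs[i + 1].
    """
    def mean(values):
        return sum(values) / len(values)

    heights = [ppg[p] - ppg[t] for t, p in zip(troughs, peaks)]
    rise = [(p - t) / fs for t, p in zip(troughs, peaks)]       # trough -> peak
    fall = [(t - p) / fs for p, t in zip(peaks, troughs[1:])]   # peak -> next trough
    period = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]   # peak -> next peak
    return {
        "PulseHeight": mean(heights),
        "PPGRT": mean(rise),
        "PPGFT": mean(fall),
        "PPGCP": mean(period),
    }
```

At least two peaks and two troughs are needed in the segment for the fall time and cardiac period to be defined.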
3.6.5 Heart Rate Variability (HRV) Features Derived from PPG
HRV features are found to be a selective and sensitive measure of stress caused by both
physical and mental workload (Appelhans and Luecken, 2006). A decreasing HRV indicates
abnormality in autonomic nervous system function and is a sign of imminent deterioration
of a subject's reflexes (Hjortskov et al., 2004). In other words, it indicates a likely
inability of a person to respond rationally to certain stimuli, such as stress trends, in time.
The spectral features of HRV are more robust over short durations than the statistical
features (Zhai et al., 2005). Since a 10-second segment was considered in the present
application, the spectral features were therefore extracted in addition to the time domain
statistical features.
3.6.5.1. HRV Spectral Features using Lomb Periodogram
The power spectrum of the heart rate was calculated from the instantaneous heart rate time
series derived from the raw PPG signal. The Lomb periodogram was used to extract the
spectral features so that the data corresponding to heart beats are duly represented without
any loss of significant information (Healey and Picard, 2005). Additionally, this method is
robust to missing heart beats (Laguna et al., 1998). The following HRV spectral features were
extracted:
(a) HF Power (HFP): This band reflects parasympathetic (vagal) tone and fluctuations. The
parasympathetic nervous system modulates heart rate effectively at frequencies between
0 and 0.5 Hz, whereas the sympathetic nervous system modulates only frequencies below 0.1 Hz
(Hjortskov et al., 2004). The frequency range for HFP is 0.15-0.4 Hz.
(b) LF Power (LFP): This band reflects both sympathetic and parasympathetic tone. The
frequency range for LFP is 0.04-0.15Hz.
(c) VLF Power (VLFP): For short duration observations VLFP is observed to fairly
represent various negative emotions, worries, rumination etc. But usually a long period of
measurement is required for proper analysis of the band (Partin et al., 2006). The frequency
range for VLFP is 0.003-0.04Hz.
(d) Total Power (TP): The TP is the net effect of all possible physiological parameters
contributing to HR variability that can be detected in 10-second recordings; however,
sympathetic tone is considered the primary contributor.
(e) LF/HF ratio (LFHF): It is an indicator of the balance between sympathetic and
parasympathetic activity. A decrease in this ratio might indicate either an increase in
parasympathetic or a decrease in sympathetic activity. As observed in Healey and Picard
(2005), increased stress causes an increase in sympathetic activity and hence in the ratio.
(f) PPG Respiratory Rate (PPGRSP): Partin et al. (2006) observed that an increased
mental stress condition resulted in increased HR and respiration rate. The point of
maximum power in frequency range of 0.10-0.25 Hz corresponds to the frequency of
the respiratory cycle which is an indicator of driver mental stress caused by strenuous
driving tasks.
These features have been shown in Fig. 3.13.
Fig. 3.13. Lomb Periodogram of Instantaneous Heart Rate Time Series
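To illustrate how band powers can be obtained from an unevenly sampled heart rate series, the sketch below evaluates the classical Lomb periodogram directly at a grid of frequencies and sums it over the LF and HF bands given above. It is an assumption-laden sketch rather than the routine used in this work: the function names, the frequency step of 0.005 Hz, the midpoint summation and the absence of variance normalisation are all illustrative choices.

```python
import math


def lomb_power(t, x, freq_hz):
    """Classical Lomb periodogram power (not divided by the signal variance)."""
    w = 2.0 * math.pi * freq_hz
    mean = sum(x) / len(x)
    y = [v - mean for v in x]
    # Time offset tau that makes the sine and cosine terms orthogonal.
    tau = math.atan2(sum(math.sin(2 * w * ti) for ti in t),
                     sum(math.cos(2 * w * ti) for ti in t)) / (2.0 * w)
    c = [math.cos(w * (ti - tau)) for ti in t]
    s = [math.sin(w * (ti - tau)) for ti in t]
    yc = sum(a * b for a, b in zip(y, c))
    ys = sum(a * b for a, b in zip(y, s))
    cc = sum(a * a for a in c)
    ss = sum(a * a for a in s)
    cos_term = yc * yc / cc if cc > 0 else 0.0
    sin_term = ys * ys / ss if ss > 0 else 0.0
    return 0.5 * (cos_term + sin_term)


def band_power(t, x, f_lo, f_hi, df=0.005):
    """Approximate the power in [f_lo, f_hi) by a midpoint sum over a frequency grid."""
    n = int(round((f_hi - f_lo) / df))
    return df * sum(lomb_power(t, x, f_lo + (k + 0.5) * df) for k in range(n))


def hrv_spectral_features(t, hr):
    """LF power, HF power and LF/HF ratio of a heart rate series hr sampled at times t (s)."""
    lfp = band_power(t, hr, 0.04, 0.15)
    hfp = band_power(t, hr, 0.15, 0.40)
    return {"LFP": lfp, "HFP": hfp, "LFHF": lfp / hfp}
```

Because the periodogram is evaluated at the actual beat times, no resampling of the heart rate series onto a uniform grid is required.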
3.6.5.2. HRV Statistical Features
The time domain statistical features of HRV, shown in Table 3.8, are useful in analyzing
the interbeat changes in the HR and emotions like frustration, boredom etc. (Malik et al.,
1996; Giakoumis et al., 2010).
Table 3.8: HRV Statistical Features
S. No. | Feature Name | Description / Formula
1. | AVNN | Mean of all NN intervals
2. | SDNN | Standard deviation of all NN intervals
3. | rMSSD | RMS of the sequential differences of the IBI calculated for the whole trial
4. | pNN20 | Percentage of the number of sequential IBI differences that are over 20 ms
5. | pNN50 | Percentage of the number of sequential IBI differences that are over 50 ms
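The definitions in Table 3.8 translate directly into a few lines of code, as the sketch below shows for a list of NN (inter-beat) intervals in milliseconds. The function name is hypothetical, and the use of the sample (n - 1) standard deviation for SDNN is an assumption, since the table does not specify the estimator.

```python
import math
import statistics


def hrv_time_features(nn_ms):
    """Time domain HRV features from NN intervals given in milliseconds."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]  # sequential IBI differences
    n = len(diffs)
    return {
        "AVNN": statistics.mean(nn_ms),
        "SDNN": statistics.stdev(nn_ms),  # sample standard deviation (assumption)
        "rMSSD": math.sqrt(sum(d * d for d in diffs) / n),
        "pNN20": 100.0 * sum(abs(d) > 20 for d in diffs) / n,
        "pNN50": 100.0 * sum(abs(d) > 50 for d in diffs) / n,
    }
```

At least two intervals are required, since all features except AVNN depend on differences between successive beats.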
After these feature extraction routines, each signal segment produces a feature vector
(Ciaccio et al., 1993) comprising an array of 39 features. The concatenated matrix of
feature vectors, consisting of all the features extracted during the drive time, was used for
the further analysis described in the sections below.
3.7 Statistical Significance of Extracted Features
In order to establish the statistical validity of these extracted features, it is necessary
to perform a statistical significance test. This ensures that the extracted features exhibit a
significant relationship with the stress level of drivers. In addition, it must also be
established that these features do not merely represent a chance population. Therefore, the
Analysis of Variance (ANOVA) significance test was performed on the extracted features,
considering between-scenario differences over all subjects. The primary goal of this test was
to compare the statistical significance of individual features for distinguishing the
individual stress levels. Table 3.9 shows the degrees of freedom (df), the F-value (F) and the
corresponding significance value (p-value) for all the 39 features considered in the analysis.
It can be noticed that the majority of the features showed statistically significant results
for the subjects included in the analysis at the significance level p < 0.05, except certain
features, namely GTD, GFDA, PPGFM and PPGTD. However, this test was performed only to
establish the statistical significance of the features. The final set of features selected for
classifier training is discussed in Section 3.8, which uses a hybrid feature selection method.
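For reference, the F-value reported by a one-way ANOVA for a single feature can be computed as in the sketch below, where each group holds the values of that feature under one scenario. This is a generic textbook computation offered as an illustration; the function name and the grouping of data by scenario are assumptions, and the p-value (which requires the F distribution) is not computed here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of feature values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the individual values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

A large F-value indicates that the feature differs more between scenarios than it varies within a scenario.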
Table 3.9: Statistical Significance of Individual Features Extracted Using a 10-Second Time Window
No. Feature Name Abbreviation F-value Significance (p-value) Significant (Yes/No)
Degrees of Freedom (dF) (Within Groups = 18, Between Groups = 4420)
GSR Statistical Features
1 GSR Mean GM 164.0416 0.00 Yes
2 GSR Energy GE 57.2818 1.37E-185 Yes
3 GSR Time Duration GTD 0.40464 0.98744 No
4 GSR Bandwidth GB 7.1005 2.78E-18 Yes
5 GSR Time Bandwidth Product GTBP 6.9812 6.94E-18 Yes
6 GSR Dimensionality GD 6.9817 6.91E-18 Yes
GSR Syntactic Features
7 GSR Peak Rise Time Sum GPRTS 26.2825 2.88E-84 Yes
8 GSR Peak Amplitude Sum GPAS 24.2973 1.77E-77 Yes
9 GSR Peak Energy Sum GPES 24.7941 3.52E-79 Yes
10 GSR Half Recovery Sum GHRS 10.8432 5.37E-31 Yes
11 GSR First Derivative Average GFDA 0.007799 1.00 No
12 GSR Rise Rate Avg. GRRA 27.1883 2.36E-87 Yes
13 GSR Decay Rate Avg. GDRA 22.2816 1.48E-70 Yes
14 GSR % Decay GPD 32.3563 7.82E-105 Yes
15 GSR No. of Peaks GNP 34.7819 5.93E-113 Yes
PPG Syntactic Features
16 PPG Rise Time PPGRT 5.698 1.14E-13 Yes
17 Pulse Height Min. PPGPHmin 205.8929 0.00 Yes
18 Pulse Height Max. PPGPHmax 264.218 0.00 Yes
19 PPG Fall Time PPGFT 3.6822 2.20E-07 Yes
20 Cardiac Period PPGCP 100.8541 4.0774E-313 Yes
21 Inst. HR PPGIHR 113.2411 0.00 Yes
HRV Spectral Features derived from PPG
22 PPG Spectral HR PPGSHR 72.4626 6.85E-232 Yes
23 Respiration Rate RSP 3.9566 3.33E-08 Yes
24 V. Low Freq. Power VLFP 6.5868 1.41E-16 Yes
25 Low Freq. Power LFP 14.8034 1.09E-44 Yes
26 High Freq. Power HFP 14.7192 2.14E-44 Yes
27 LF/HF Ratio LFHF 9.1683 2.93E-25 Yes
HRV Statistical Features derived from PPG
28 AVNN AVNN 99.4715 2.59946E-309 Yes
29 SDNN SDNN 57.3681 7.36E-186 Yes
30 rMSSD rMSSD 55.8967 2.93E-181 Yes
31 pNN20 pNN20 8.4224 1.00E-22 Yes
32 pNN50 pNN50 27.428 3.60E-88 Yes
PPG Statistical Features
33 PPG Mean PPGM 16.6033 6.26E-51 Yes
34 PPG Energy PPGE 114.4605 0.00 Yes
35 PPG First Moment PPGFM 1.4499 0.09822 No
36 PPG Time Duration PPGTD 1.4499 0.09882 No
37 PPG Bandwidth PPGB 244.3783 0.00 Yes
38 PPG Time Bandwidth Product PPGTBP 242.2473 0.00 Yes
39 PPG Dimensionality PPGD 242.2473 0.00 Yes
Legend:
Feature Names: GSR- Galvanic Skin Response; PPG- Photoplethysmography; dF - Degree of Freedom
3.8 Feature Selection
Feature selection is necessary for the identification of those components of a pattern or a set
of features which actually represent a given pattern (Ciaccio et al., 1993).
3.8.1 Shape-based Feature Selection
Online trend analysis has been useful in physiological monitoring and can provide early
warnings, severity assessments and decision support in clinical scenarios. It involves
examining time-series data of a reference variable and identifying clinically significant
increasing or decreasing patterns (Melek et al., 2005). Melek et al. (2005) emphasized the
need to analyze the patterns observed in a signal in order to understand the course of a
variable. The primary objective here is to select those features which show a significant
pattern in a sequence of time-ordered data under different environments. This may enable us
to predict the environment as well as any repetitive patterns observed (Haimowitz et al., 1993).
The patterns in features may be concave or convex, monotonically increasing or
decreasing, linear, etc. These patterns can help us infer the kind of transitions that are
taking place in the driver’s affective state while driving. For example, a feature may exhibit
a concave pattern as stress first increases and subsequently decreases; if such a pattern is
tracked during testing, we can predict the transitions which occurred in the driver's
affective state.
To analyze and identify significant trend-patterns observed in the selected features, a 160
sec data window was selected from each scenario. The variation in the mean value of features
with increasing stress level (Pr-dr to By-dr) and decreasing stress level (from By-dr to Po-dr)
was manually observed. The significant trend-shapes noticed were (a) increasing, (b)
decreasing, (c) concave, (d) convex and (e) linear. These observations were used to develop a
feature weight allocation algorithm for stress-trend detection as discussed in Section 5.2.3
(Singh et al., 2011).
During long distance driving and route planning, analyzing the features extracted from
physiological data of the drivers for significant trends indicating patterns in stress level will
help identify routes which are most driver friendly and have least impact on his affective
state. Monitoring the time-series of these features can help infer variations in driver’s stress
while driving.
It can be observed from Table 3.10, which shows the shape based feature selection
method, that features which exhibited significant changes in their mean level across
different scenarios gather higher feature weights than less significant features. A critical
threshold of 3.0, representing about 55% - 60% significance level, was chosen to evaluate the
features. Features with weights of 3.0 or above are hence classified as significant, and those
below are discarded while forming the feature mask.
Table 3.10: Shape Based Feature Selection Method
S. No. Feature Name Shape Observed P(S) (Pattern) Wt. Significant?
1 GM CC 66 4 Yes
2 GE CC 66 4 Yes
3 GTD LIN 77 0 No
4 GB NC -- 1 No
5 GTBP CC 55 3 Yes
6 GD CC 55 3 Yes
7 GPRTS CC 100 5 Yes
8 GPAS CC 100 5 Yes
9 GPES CC 100 5 Yes
10 GHRS CC 88 4 Yes
11 GFDA NC -- 1 No
12 GRRA CC 55 4 Yes
13 GDRA MINC 55 3 Yes
14 GPD CC 50 2 No
15 GNP CC 100 5 Yes
16 PPGRT CV 50 2 No
17 PPGPHmin CV 77 4 Yes
18 PPGPHmax MINC 55 3 Yes
19 PPGFT CC 66 3 Yes
20 PPGCP CV 66 5 Yes
21 PPGIHR CC 66 5 Yes
22 PPGSHR CC 66 4 Yes
23 RSP CC 88 4 Yes
24 VLFP NC -- 3 Yes
25 LFP CV 65 3 Yes
26 HFP NC -- 2 No
27 LFHF CV 55 3 Yes
28 AVNN CV 66 4 Yes
29 SDNN CC 88 4 Yes
30 rMSSD CC 88 4 Yes
31 pNN20 CC 100 5 Yes
32 pNN50 CC 100 5 Yes
33 PPGM MINC 55 3 Yes
34 PPGE MINC 66 3 Yes
35 PPGFM NC -- 2 No
36 PPGTD NC -- 2 No
37 PPGBW MINC 55 3 Yes
38 PPGTBP MINC 55 3 Yes
39 PPGD MINC 55 3 Yes
Legend:
Feature Names: GSR- Galvanic Skin Response; PPG- Photoplethysmography; IBI- Inter Beat Interval
Shapes Observed (SO): CC- Concave; LIN- Linear; MINC- Monotonically Increasing; CV- Convex; NC- No Conclusion
P(S) = Probability of a Shape = No. of Drivers exhibiting a particular shape / Total number of drivers; Wt.- Weights
Table 3.10 indicates the total of 31 features selected using the feature weight allocation
algorithm. However, to obtain an optimal feature set for classifier training, conventional
feature selection methods were also applied, in the form of the hybrid approach discussed
next.
3.8.2 Hybrid Approach: Filter and Wrapper based
It is imperative to select from a feature set the critical features that are relevant to a
typical classification model. In this context, the features which produce lower
misclassification rates for a classification model satisfy the criterion for selection
(Duda et al., 2006).
Broadly, features can be selected using either (a) feature selection methods such as filter
based and wrapper based feature selection, or (b) dimensionality reduction techniques
such as principal component analysis (PCA), self organizing maps etc. We preferred feature
selection methods over dimensionality reduction methods because, as observed by Kim et al.
(2008), in such an application it is important to preserve the origin, domain and value of
the features for analysis. It has also been noticed that dimensionality reduction may lead to
loss of the affective relevance of the features.
It was decided to use the filter and wrapper based feature selection methods for faster
execution without affecting the intrinsic properties of the collected data, while at the same
time achieving a better recognition rate and avoiding problems of overfitting (Blum et al.,
1997). Filter based methods evaluate feature subsets by their information content, e.g.,
interclass distance, statistical dependence or information-theoretic measures, whereas
wrapper based methods use a classifier to evaluate subsets by their predictive accuracy on
test data, estimated by statistical resampling or cross-validation.
The filter based feature selection method uses a variance filter and an entropy filter by
placing either a variance or an entropy value on the feature's centroid (mean). Entropy in this
case describes the dispersion of feature values within the boundary of the centroid. This
process finds those feature sets that meet a minimum size and have relatively higher centroid
variability over the given boundary (Blum et al., 1997).
The wrapper based feature selection method was a combination of sequential forward
selection (SFS) and sequential backward selection (SBS). In SFS, features are sequentially
added to an empty candidate set until the addition of further features does not decrease the
criterion, whereas in SBS, features are sequentially removed from a full candidate set until
the removal of further features increases the criterion (Ciaccio et al., 1993; Gutiérrez-Osuna,
2002). Both SFS and SBS have a disadvantage: in SFS, features that become superfluous once
other features are added cannot be removed, and in SBS, discarded features that would become
helpful after other features are discarded cannot be re-included (Ciaccio et al., 1993).
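The greedy nature of SFS, and the reason an added feature can never be removed again, can be seen in the following sketch. It is a generic illustration under stated assumptions: the function name is hypothetical, and `criterion` stands for any user-supplied function returning an error-like score to be minimized (for example, a cross-validated misclassification rate).

```python
def sequential_forward_selection(features, criterion):
    """Greedy SFS: add one feature at a time while the criterion keeps decreasing.

    features: iterable of feature names.
    criterion: callable mapping a list of feature names to a score to minimize.
    """
    selected, remaining = [], list(features)
    best = float("inf")
    while remaining:
        # Evaluate every candidate extension of the current set and pick the best one.
        score, candidate = min((criterion(selected + [f]), f) for f in remaining)
        if score >= best:
            break  # no further decrease in the criterion: stop adding features
        selected.append(candidate)
        remaining.remove(candidate)
        best = score
    return selected
```

SBS follows the same pattern in reverse, starting from the full set and removing the feature whose removal gives the lowest criterion value.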
Features with entropy below the 15th percentile were removed by the entropy filter. The
variance filter removed features with profile variance below the 10th percentile. The
feature vector matrix was passed through both these filters separately; the resulting feature
masks⁴⁸ were subsequently ANDed together to get a new feature set. This process was repeated
for the data of all the drivers considered in this study. The features selected for all the
drivers were tabulated, and the 26 features whose scores were 70% and above were picked.
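The combination of the two filters can be sketched as below, taking per-feature variance and entropy scores as given inputs. The function names and the simple rank-based percentile rule are assumptions made for illustration; how the entropy of each feature is estimated is not specified here and is left outside the sketch.

```python
def percentile_mask(scores, pct):
    """True for features whose score is at or above the pct-th percentile cut.

    The cut is taken by a simple rank rule on the sorted scores (assumption).
    """
    ranked = sorted(scores)
    cut = ranked[min(len(ranked) - 1, int(len(ranked) * pct / 100))]
    return [s >= cut for s in scores]


def filter_feature_mask(variances, entropies, var_pct=10, ent_pct=15):
    """AND together the variance-filter mask and the entropy-filter mask."""
    var_mask = percentile_mask(variances, var_pct)
    ent_mask = percentile_mask(entropies, ent_pct)
    return [v and e for v, e in zip(var_mask, ent_mask)]
```

A feature survives only if it passes both filters, which mirrors the ANDing of the two masks.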
In the wrapper based approach, both the SFS and SBS feature selection methods yielded a
number of features, which were compared with the 26 features obtained from the filter based
approach; finally, 27 features with a score of 70% and above were selected.
In addition to the above results, to ensure that no clinically significant feature was lost,
we added an ad-hoc feature mask created on the basis of the clinical significance found in
the literature. Finally, around 30 features were found to be significant after these routines
(shown in italics and boldface in Table 3.11). These selected features are symbolically
combined to express interrelationships between the observed bio-signals and the driver’s
affective state. The concatenated feature vector matrix consisting of these features was used
for further analysis. The overall feature selection techniques are shown in Fig. 3.14.
Fig. 3.14. Feature Selection Techniques Adopted
48 Feature mask was created to select clinically significant features found in literature.
Table 3.11: Extracted Features and their Selection
No. | Feature Name | Abbreviation | Description / Formula | A | B | C | D | F
GSR Statistical Features
1 | GSR Mean | GM | where xi is signal value and N is number of samples | √ | √ | √ | √ | √
2 | GSR Energy | GE | where fs is signal sampling frequency | √ | √ | √ | √ | √
3 | GSR Time Duration | GTD | | √ | × | √ | × | √
4 | GSR Bandwidth | GB | | × | √ | √ | × | √
5 | GSR Time Bandwidth Product | GTBP | | × | √ | × | × | ×
6 | GSR Dimensionality | GD | | × | √ | √ | × | √
GSR Syntactic Features
7 | GSR Peak Rise Time Sum | GPRTS | Peak Rise Time = Time of Occurrence of Peak - Time of Point of Onset | √ | √ | √ | √ | √
8 | GSR Peak Amplitude Sum | GPAS | Peak Amplitude = GSR value at Peak - GSR value at Point of Onset | × | × | × | √ | √
9 | GSR Peak Energy Sum | GPES | Peak Energy = 0.5 * Peak Amplitude * Peak Rise Time | × | √ | √ | √ | √
10 | GSR Half Recovery Sum | GHRS | Half-Recovery Time = Time of Occurrence of Half Amplitude - Time of Occurrence of Peak | × | × | × | × | ×
11 | GSR First Derivative Average | GFDA | Average First Derivative = Average of the First Derivative observed in the given segment | × | × | × | √ | √
12 | GSR Rise Rate Average | GRRA | Average Rise Rate = Sum Average of 1st derivative of points with 1st derivative > Positive Threshold (0.025) | × | × | √ | × | √
13 | GSR Decay Rate Average | GDRA | Average Decay Rate = Sum Average of 1st derivative of points with 1st derivative < Negative Threshold (-0.025) | × | √ | × | × | ×
14 | GSR % Decay | GPD | GSR Percentage Decay = Percentage of Time samples in given segment with 1st derivative < Zero (0). | √ | √ | √ | × | √
15 | GSR No. of Peaks | GNP | Number of peaks in a given segment. | × | √ | × | √ | √
PPG Syntactic Features
16 | PPG Rise Time | PPGRT | Average of (Time of Peak - Time of Preceding Trough) in a segment | √ | × | √ | √ | √
17 | Pulse Height Min. | PPGPHmin | Maximum and Minimum of (Value of PPG peak - Value of PPG trough) in a segment | √ | √ | √ | √ | √
18 | Pulse Height Max. | PPGPHmax | Maximum and Minimum of (Value of PPG peak - Value of PPG trough) in a segment | √ | √ | √ | √ | √
19 | PPG Fall Time | PPGFT | Average of (Time of Trough - Time of Preceding Peak) in a segment | √ | √ | × | √ | √
20 | Cardiac Period | PPGCP | Average of Period of PPG signal in a segment | × | × | √ | × | ×
21 | Inst. HR | PPGIHR | 60 / (Time Difference between two consecutive peaks) | √ | × | √ | √ | √
HRV Spectral Features derived from PPG
22 | PPG Spectral HR | PPGSHR | 60 * Frequency maximum in range of 0.5-2.5 Hz in HRV spectrum | √ | √ | √ | √ | √
23 | Respiration Rate | RSP | 60 * Frequency maximum in range of 0.1-0.25 Hz in HRV spectrum | × | √ | √ | √ | √
24 | V. Low Freq. Power | VLFP | Power in range of 0.003-0.04 Hz in HRV spectrum | √ | × | × | × | ×
25 | Low Freq. Power | LFP | Power in range of 0.04-0.15 Hz in HRV spectrum | √ | √ | × | √ | √
26 | High Freq. Power | HFP | Power in range of 0.15-0.4 Hz in HRV spectrum | √ | √ | × | √ | √
27 | LF/HF Ratio | LFHF | LF Power / HF Power | √ | √ | √ | √ | √
HRV Statistical Features derived from PPG
28 | AVNN | AVNN | Mean of all NN intervals | √ | √ | √ | √ | √
29 | SDNN | SDNN | Standard deviation of all NN intervals | √ | √ | √ | √ | √
30 | rMSSD | rMSSD | RMS of the sequential differences of the IBI calculated for the whole trial | √ | × | √ | √ | √
31 | pNN20 | pNN20 | % of the number of sequential IBI differences that are over 20 ms | √ | × | √ | √ | √
32 | pNN50 | pNN50 | % of the number of sequential IBI differences that are over 50 ms | √ | × | × | × | ×
PPG Statistical Features
33 | PPG Mean | PPGM | Expression same as GM | √ | √ | × | √ | √
34 | PPG Energy | PPGE | Expression same as GE | √ | √ | √ | √ | √
35 | PPG First Moment | PPGFM | PPG Mean (PPGM) about the origin | √ | × | × | × | ×
36 | PPG Time Duration | PPGTD | Expression same as GTD | √ | √ | √ | × | √
37 | PPG Bandwidth | PPGB | Expression same as GB | × | × | × | × | ×
38 | PPG Time Bandwidth Product | PPGTBP | | √ | × | × | × | ×
39 | PPG Dimensionality | PPGD | | √ | √ | × | × | √
Legend: GSR- Galvanic Skin Response; PPG- Photoplethysmography; IBI- Inter Beat Interval; A: Filter based
method; B: SFS; C: SBS; D: Literature; F: Final Selection; √: Selected; ×: Not Selected. The abbreviations shown
in italics indicate the features which were actually selected after the feature selection algorithm.
3.9 Conclusions
Thus, in essence, these experiments led us to a set of 30 significant features, extracted
such that no data or pattern of significance was lost. The subsequent chapters utilize these
results for the identification of appropriate classification techniques and consequent
design choices.
Chapter 4
Driver-Profile Analysis
Driver-profiling provides several meaningful inputs which could be exploited by the
designers of modern driver assistance systems, including those belonging to the relatively
advanced category. Carsten and Nilsson (2001) suggested that driver assistance systems may
be built by considering the functional safety, human machine interface (HMI) and traffic
safety aspects. These systems may include one or several components such as night vision,
pedestrian crossing detection, automatic cruise control, lane change detection, collision
avoidance and parking assistance, to name a few. Golias et al. (2002) recommended that a
driver centric assistance system may involve drowsiness detection, behavioral monitoring,
and stress and health monitoring for the driver's overall safety. In a wearable driver assist
system (WDAS), such profiling of the driver's relevant behavioral patterns is thus very
useful.
Development of such a driver monitoring system requires the inclusion of algorithms that
consider human centered design aspects and are capable of understanding the driver’s intent
and attention (McCall et al., 2004). However, different drivers have different personality
attributes, dexterity in coping with stress, driving styles, human machine compatibility
factors etc., which such a system must take into account in its design. The driver assist
system should augment the driving process without distracting the driver and warn him / her
only when absolutely necessary.
Driving may be influenced by several factors like the driver's mental and physical state
before commencement as well as during driving, his / her personality attributes, comfort level
due to make and model of a vehicle, adaptability in using the driving- assistance devices and
technologies etc.
4.1 Profiling and its Significance
In a generic sense, "profiling⁴⁹ is a method of recording a person's behavior and analyzing
psychological characteristics in order to predict or assess their ability in a certain sphere or to
identify a particular group of people" (WordWeb).
Human factors⁵⁰ are the major contributing factors in road traffic accidents. These include
driving behavior such as speeding, drinking and driving, and traffic law violations, and
impaired skills such as inattention, fatigue, physical disabilities and impaired sensory
perception (Nabi et al., 2005). Driver behavior patterns such as impatience, time urgency and
hostility have been characterized as the Type A Behavior Pattern (TABP), which puts a
majority of the driving population at risk of meeting with an accident (Nabi et al., 2005).
This indicates that a driver's psychological characteristics and / or personality traits may
be a contributing factor in road accidents.
49 WordWeb Definition. Available Online: http://wordweb.info/
In the limited context of the present work for wearable computing systems design for
vehicular drivers, a typical driver profile might refer to a full set or a sub-set of the following
attributes of a given driver:
• Driver's attitude with respect to adhering to traffic rules, accepted social norms and
harmonious co-existence in the event of long and stressful driving situations.
• Driver's ability to comprehend and respond to situations arising out of the sudden
appearance of a potential hazard, erroneous driving by other drivers, unlawful road-crossing
by pedestrians or vehicles, partial vehicular failure etc.
• Driver's physical and cognitive ability to act and react in different circumstances with
respect to eyes, ears, hands and feet coordination.
• Habits, addiction or prescription involving use of substances that could influence the
sensory, cognitive and muscular activities.
• Driver's ability to cope with non-driving related / external causes of mental stress,
preoccupation, distraction etc.
In the presented research, we have carefully restricted ourselves to a relatively narrow list
of attributes that have been used to profile the sample drivers who have participated in real-
time data collection while driving. The following driving profile parameters have been
selected in this study:
a) Initial Affective State (IAS): Driver’s stress and fatigue level just before the
commencement of the experiment.
b) Current Physiological State (CPS): Five physiological states which reflect the stress
accumulated over the course of driving. It is a dynamic variable and is influenced by
the driver's physiological response to external stimuli such as any of the many
stress-trends observed.
c) Driver Age (DA): Reflects stress level experienced by drivers of different age groups.
The DA has been normalized between the minimum and maximum age considered to
be safe for driving, i.e. 18 and 60 years respectively.
d) Driver Group (DG): Casual, short distance and long distance drivers will have varied
task performance and ability to cope with stress and fatigue. Driver centric design
requires recognizing differences in situation awareness, attention and human error.
e) Driving Style (DS): Distinguishes between the stress profile of a calm or mature driver
and that of an aggressive driver as perceived by the passengers (decided by Experimenters 1
and 2).
f) Vehicle Configuration (VC): Factors like ease of handling, good suspension design,
vehicle noise and vibration, loading effect, absence / presence of roll, and an ergonomically
designed cockpit that considers driver comfort and safety play a major role in making
driving stress-free.
g) Human Machine Compatibility Factor (HCF): Anxiety and discomfort among the
drivers, including fear of detection of unknown abnormalities and alcoholism, were
reflected in an increase in stress level.
50 Typically, in certain countries including India, there exists a set of factors which create additional
circumstances that add to the risk of driving. These may include unauthorized use of road-segments by
oversized vehicles carrying more load than what is allowed, simultaneous use of the roads by cattle and
motorists, presence of stray animals in certain road-segments (like bullocks etc.), unmarked under-construction /
under-repair areas of roads, unmarked and ill-designed speed breakers, presence of major potholes and pasting
of paper posters etc. right on the traffic signs and boards.
4.2 Requirement for Profiling
Survival analysis is used to model the distribution of survival times. Survival time is defined
as the time taken to reach an event or end-point (Fox, 2002). Such data are also known as
time-to-event data, transition data or duration data. Survival data divide a life-course domain
into several mutually exclusive states between which individuals may move (Jenkins, 2005). The
distribution of survival time data is often found to be censored. By the term 'censored data'
we mean incomplete data collected in the course of a study, in which the information for a few
subjects may not have been collected up to a terminal point such as the end of an event or
death. As per Cox (1972), for an individual who is censored or "lost" from the population, the
time to "failure" is not observed directly; it is known only that the time to failure is
greater than the censoring time. In simple terms, for censored data the observations are made
only for a partial duration (Singer and Willet, 1993; Houggard, 1999). For this reason, the
standard statistical techniques cannot be applied directly, and survival models are applied
instead (Bewick et al., 2004).
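To make the role of censoring concrete, the following sketch computes a Kaplan-Meier survival estimate. The Kaplan-Meier estimator is a standard non-parametric technique that is not described in this chapter and is used here only for illustration; the function name and the durations and event flags are hypothetical and are not data from this study.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until the event or until censoring (e.g. hours of driving).
    observed:  True if the event was seen, False if the subject was censored.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)   # subjects still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all subjects that share the same time stamp.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering the curve.
        at_risk -= removed
    return curve


# Hypothetical example: five subjects, two of them censored.
for t, s in kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False]):
    print(f"t = {t}: S(t) = {s:.2f}")
```

A censored subject still counts in the risk set up to its censoring time, which is why simply discarding such subjects, or treating their censoring times as event times, would bias the estimate.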
In survival analysis, the relationship between survival and one or more predictors, also
known as covariates, is studied (Fox, 2002). The survival data may be collected from a
population sample through a cross-section sample survey with retrospective questions about the
respondents' areas of interest (Jenkins, 2005). In such surveys the respondents provide
information about their spells of interest using a retrospective recall method. For example,
in the current context the drivers were asked to rate the driving time after which they would
feel stressed under five different physiological states, i.e. the time to a transition from
one state to another.
There are two popular regression models, the proportional hazard (PH) model and the
accelerated failure time (AFT) model. PH models describe the effects of covariates on a
hazard function, whereas in AFT models the covariates act directly on the time via a scale
factor (Houggard, 1999). The main advantage of the PH model is that it provides an estimate
based on an arbitrary hazard function rather than requiring a parametric model like the AFT
model. It is also easy to accommodate time-dependent covariates in PH models. Therefore, in
our Driver-Profile Analysis we chose the most suitable time-dependent covariates or
predictors by considering the maximum time each driver would drive before reporting being
stressed. The following sections describe the Cox PH model, its parameters such as the
predictors, and the results obtained.
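For reference, the two model families can be written side by side. This is a sketch in common textbook notation rather than the notation of the cited authors: h_0(t) and S_0(t) denote an unspecified baseline hazard and baseline survival function, T_0 is a baseline survival time distributed according to S_0, and β and γ are illustrative coefficient vectors.

```latex
% Proportional hazard (PH): covariates scale the hazard multiplicatively.
h(t \mid X) = h_0(t)\,\exp(\beta^{T} X)

% Accelerated failure time (AFT): covariates rescale the time axis.
S(t \mid X) = S_0\!\left(t\,\exp(-\gamma^{T} X)\right)
\quad\Longleftrightarrow\quad
\log T = \gamma^{T} X + \log T_0
```

In the PH form the baseline hazard can be left unspecified, whereas the AFT form is usually fitted by assuming a parametric distribution for T_0.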
Driver profile includes training levels, knowledge of traffic rules, fitness to drive etc.
Driving style and driving environment of an Indian driver may differ significantly from those
observed in developed countries (Mohan, 2009). In order to incorporate these potentially
stress-contributing variables into our stress-detection framework, this work investigated the
applicability of hazard models and found that the Cox Proportional Hazard Model (Cox
PHM) is an established and popular choice for such problems. Researchers have used the Cox PH
model to analyze several kinds of survival data, such as the risk of affective and stress
related disorders among human service professionals (Wieclaw et al., 2006), stomach cancer
data (Moghimi-Dehkordi et al., 2008), general cardiovascular disease (CVD) risk and the risk of
individual CVD events such as coronary, cerebrovascular and peripheral arterial disease and
heart failure (D'Agostino et al., 2008), and fetal and infant death (Platt et al., 2004).
In the context of automotive drivers, Lagarde et al. (2004) applied Cox's PH regression with
time-dependent covariates to compute the hazard ratios for certain life events and thereby
estimate the relative risks of all serious accidents and of at-fault serious accidents. Vadeby
et al. (2010) collected data from physiological indicators (eyelid blink) and driving data
indicators (lane departure) in a driving simulator experiment to study sleepiness and
impairment of driving performance respectively. They used statistical parameters derived from
the driver's eye blink movements and the vehicle's lateral deviation and acceleration to fit
Cox PH models. The study established that a model combining blink based indicators and driving
behavior based indicators performs better than a model which uses only blink data, suggesting
that an effective sleepiness warning system can be designed to avoid lane departure.
Motivated by these studies, we performed a profile analysis on all the participant drivers
using parameters characterizing their temperament, personality attributes and driving style.
4.3 The COX Proportional Hazard (PH) Model
Cox PH models are capable of modeling the survival data distribution non-parametrically while
establishing a parametric relationship between survival time and the values of the predictors
or covariates (Wayne, 2006).
The regression model discussed by Cox (1972) is given in eqn. (4.1):
\[ h(t \mid X) = h_b(t)\,\exp(\beta^{T} X) \tag{4.1} \]
where X = scaled value of covariates (predictors);
β = regression coefficients of corresponding covariates, and
h_b(t) = baseline hazard function.
The evaluation parameters reported after a Cox PH fit are described below:
a) Regression Coefficient (β): Coefficient estimates.
b) Standard Error (se): Standard error in estimating β. The inverse of the Hessian matrix,
evaluated at the estimate of β, can be used as an approximate variance-covariance
matrix for the estimate, and used to produce approximate standard errors for the
regression coefficients.
c) z-statistic (z): The ratio of each regression coefficient (β) to its standard error
(se), a Wald statistic which is asymptotically standard normal under the hypothesis
that the corresponding β is zero.
d) p-value (p): The p-value for each β, an indicator of its statistical significance.
e) Hazard Ratio (HR): exp(β), which indicates the relative risk of the event based on a
comparison of event rates.
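The relationships among these quantities can be sketched in a few lines. The function below is a hypothetical helper, not the software used in this work; it assumes a two-sided Wald test and a normal-approximation 95% confidence interval. The inputs are the rounded CPS estimates from Table 4.3, so small differences from the tabulated values are to be expected.

```python
import math


def wald_summary(beta, se, z_crit=1.96):
    """Derive the Wald z, two-sided p-value, hazard ratio and C.I. from beta and se."""
    z = beta / se
    # Two-sided p-value under a standard normal null:
    # 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    hr = math.exp(beta)
    # The interval is built on the log-hazard scale and then exponentiated.
    ci = (math.exp(beta - z_crit * se), math.exp(beta + z_crit * se))
    return {"z": z, "p": p, "hr": hr, "ci": ci}


# Rounded CPS estimates as reported in Table 4.3.
print(wald_summary(beta=2.897, se=0.389))
```

A confidence interval that contains 1 corresponds to a coefficient that is not significant at the chosen level.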
4.4 Predictors for Unified Cox PH Driver Stress Model
To understand drivers' behavior and traits, questionnaire based data have been used by
many researchers. Lagarde et al. (2004) analyzed driving behavior questionnaire data from a
French cohort study and found that marital separation or divorce was the predominant factor
for serious road accidents when compared to other life events such as a child leaving home, an
important purchase or hospitalization of the partner. The covariates included
annual mileage as a driver and three time-dependent covariates: occupational category
each year, age and alcohol consumption. Reimer et al. (2006) collected self-reported
questionnaire data and found significant relationships among six driving behavior measures:
accidents, speeding, velocity, passing, weaving between traffic, and behavior at stop signs
during a driving simulation scenario.
Therefore, a self-reporting driver-behavior centric questionnaire was devised (shown in
Table A.1, Appendix A), and responses were collected from drivers during data acquisition to
create the driver profile tabulated in Table 4.1. This profile data was converted into seven
predictors: IAS, CPS, DA, DG, DS, VC and HCF. The predictors chosen are explanatory
variables which are important with respect to driver stress. They were designed considering
individual differences in human, vehicle and affective characteristics observed while
characterizing the individual profiles of our subject drivers. The description of these individual
stress predictors and their corresponding evaluation / scaling methodology is tabulated in
Table 4.2. The predictor values were scaled because points based systems, which have been
suggested to clinicians for estimating risk, simplify the computation of the proportional
hazard model and produce more reliable and extendable results (Sullivan et al., 2004).
The predictors chosen in the present model for driver stress recognition are described
below:
1) Initial Affective State (IAS):
The experiment was performed either in the morning session or in the evening
session, and drivers reported that they could not quantify the degree of their stress level. It
was also observed that during evening drives the drivers were tired due to the various work
routines and activities performed during the day, whereas in the morning they were comparatively
relaxed, as in most cases they came directly from home without any specific stress.
Hence we defined IAS in terms of a 'relaxed' state and a 'tired' state, referring to the stress
and fatigue level just before the commencement of the experiment, on a binary scale: relaxed
as 0.00 and tired as 1.00.
2) Current Physiological State (CPS):
We defined CPS as the driver's current physiological state, which reflects the level
of stress accumulated due to the mental and physical workload during the course of driving.
We asked each driver for the maximum time he is capable of driving without
overstressing himself under five different physiological states identified as Relaxed (Low),
Moderate (Mod.), Moderate with Stress-Trends (Mod.+ ST), Stressed and Stressed with
Stress-Trends (Stressed + ST). The average projected drive time (APDT) for each state was
calculated and normalized to obtain the CPS scale, which again lies between 0 and 1. The
following CPS values were obtained:
Relaxed (Low): 0.0; Mod.: 0.209; Mod.+ ST: 0.403; Stressed: 0.776; Stressed + ST: 1.0
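The CPS values listed above can be reproduced from the APDT figures with the scaling given in Table 4.2, CPS = (Max. APDT - APDT Value) / (Max. APDT - Min. APDT). The sketch below uses hypothetical names; the only assumption beyond the tables is that the Stressed APDT is taken as 2.125 hours, the unrounded mean of column S4 in Table 4.1, which Table 4.2 reports rounded to 2.13.

```python
# APDT per physiological state, in hours (from Tables 4.1 and 4.2).
APDT_HOURS = {
    "Relaxed (Low)": 3.425,
    "Mod.": 3.075,
    "Mod.+ ST": 2.75,
    "Stressed": 2.125,      # unrounded mean of column S4 in Table 4.1
    "Stressed + ST": 1.75,
}


def scale_cps(apdt_by_state):
    """Map each state's APDT onto [0, 1]; the longest drive time maps to 0."""
    longest = max(apdt_by_state.values())
    shortest = min(apdt_by_state.values())
    return {
        state: (longest - value) / (longest - shortest)
        for state, value in apdt_by_state.items()
    }


for state, cps in scale_cps(APDT_HOURS).items():
    print(f"{state}: {cps:.3f}")
```

A shorter projected drive time therefore yields a CPS closer to 1, i.e. a more stressed state.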
3) Driver Age (DA):
Comparative studies based on driver age by transportation researchers (ETSC, 2001)
reported significant differences in accident rates and in the levels of stress experienced
during driving as age progresses. Studies by transportation researchers (Horberry et al., 2006)
concluded that the level of driving skills (beginners / experienced), driver age and driving
environment play a very significant role in determining whether a particular driver is
susceptible to stress. Di Milia et al. (2011) reported that young and agile drivers are more
prone to accidents and rule-violations whereas middle aged drivers were found to be more
disciplined and calm while driving. In most countries including India the lower limit for the
legal driving age is 18 years. To develop a machine which caters to a wide range of age groups,
stress susceptibility needs to be categorized by age; consequently the driver age
(DA) has been considered as a predictor. The age of a driver was normalized on a scale
between 0.00 and 1.00 using the formula in Eqn. 4.2, and the value obtained
was used as a predictor in the model developed.
\[ X_{age} = \frac{\text{Driver Age} - \text{Minimum Age (18)}}{\text{Maximum Age (60)} - \text{Minimum Age (18)}} \tag{4.2} \]
4) Driver Group (DG):
The large variability observed in automotive drivers, due to differences in their mental
representation and workload, results in varied task performance and ability to cope with stress
and fatigue. In the present application, a driver centric design approach demands that we
recognize the differences in situation awareness, attention and human error which
arise because of this variability. Among the drivers considered in the study, three broad
driver groups were observed depending upon the distance travelled by each of them per day
in kilometers. They are:
• casual (CD - less than 20 kilometers drive per day);
• professional short distance (SD - between 20 and 150 kilometers drive per day), and
• professional long distance (LD - greater than 150 kilometers drive per day).
For each group, the APDT was calculated and normalized to get an average
DG value of CD = 0.976, LD = 0.0 and SD = 0.938, which was used to fit the model.
5) Driving Style (DS):
Studies have shown that fatigue proneness increases in drivers
with an aggressive driving style where thrill seeking and overtaking effects are more
prominent (Katsis et al., 2008). Mature drivers who prefer a calm and composed driving style
and strictly drive within permissible speed limits exhibit a healthier stress profile. Therefore,
DS has been selected as a predictor which distinguishes between the stress profile of a calm
or mature driver and that of an aggressive driver from the passenger's point of view (decided
by Experimenters 1 and 2).
DS was calculated on a scale between 0.00 and 1.00 from the ratings given by
Experimenters 1 and 2 on a scale of 1 to 8, where '1' represents the
calmest driver and '8' represents the most aggressive profile, using the formula in Eqn. 4.3.
\[ DS = \frac{1}{7}\left(\frac{Ex_1 + Ex_2}{2} - 1\right) \tag{4.3} \]
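The normalizations in Eqn. 4.2 and Eqn. 4.3 are simple enough to sketch directly. The function names below are hypothetical; the age limits of 18 and 60 years and the rating scale of 1 to 8 are taken from the text.

```python
def scale_age(age, min_age=18, max_age=60):
    """Eqn. 4.2: map the driver's age onto [0, 1] between the two age limits."""
    return (age - min_age) / (max_age - min_age)


def scale_driving_style(ex1, ex2):
    """Eqn. 4.3: average the two experimenter ratings (1 to 8) and map onto [0, 1]."""
    return ((ex1 + ex2) / 2 - 1) / 7


# Driver D7 in Table 4.1: age 47, experimenter ratings 7 and 6.
print(scale_age(47), scale_driving_style(7, 6))
```

The sketch does not clamp its inputs, so an age outside the 18 to 60 range would fall outside the 0 to 1 scale.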
6) Vehicle Configuration (VC):
We could not find literature on the classification of vehicles in the Indian scenario;
however, Choo et al. (2004) categorized nine vehicle types and related them to travel
attitude, personality, lifestyle, mobility, and demographic variables individually to identify
the choice of vehicles people would like to drive. We chose only three vehicle
configurations (VC) commonly driven by professional drivers on Indian roads, namely hatchback,
sedan and all terrain, which are similar to the small, mid-sized and pickup types respectively
in their categorization.
Drivability of a vehicle plays a major role in the evaluation of driving stress. Factors like ease
of handling, good suspension design, vehicle noise and vibration and loading effect
contribute to this classification. Ergonomic cockpit design considering driver comfort and
safety plays a major role in making the driving experience stress-free. Therefore, VC was
calculated on a scale between 0.00 and 1.00 from a comfort rating on a scale of 1-5 reported by
the drivers, which was then normalized to fit the scale.
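The VC scaling can be sketched as follows. The exact formula is not stated in the text; the two steps below, (rating - 1) / 4 followed by a complement, are inferred from the values listed in Table 4.2 and should be read as an assumption. The function name is hypothetical.

```python
def scale_vc(avg_comfort):
    """Map an average comfort rating (1 to 5) onto a VC value in [0, 1].

    Assumed steps, inferred from Table 4.2:
    1. normalize the comfort rating: (rating - 1) / 4
    2. take the complement, so that lower comfort gives a higher VC value
    """
    normalized_comfort = (avg_comfort - 1) / 4
    return 1 - normalized_comfort


# Average comfort levels from Table 4.2: sedan, hatchback, all-terrain.
for vehicle, comfort in (("SD", 4.6), ("HB", 4.0), ("AT", 2.8)):
    print(vehicle, round(scale_vc(comfort), 2))
```

Under this reading, VC increases as the reported comfort decreases, which keeps it aligned with the other predictors, where higher values indicate greater stress susceptibility.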
7) Human Machine Compatibility Factor (HCF):
The HCF reflects the anxiety and discomfort of drivers due to their unfamiliarity with the
sophisticated instrumentation involved in the data collection module. It was observed that
drivers who underwent the experiment multiple times were more comfortable than first
timers. These considerations guided the design of our questionnaire for drivers and of the
observations noted by Experimenters 1 and 2; the responses were later tabulated in
Table 4.1. The drivers who reported that they were comfortable with the machine were given
a predictor value of 0.0 and those who reported being uncomfortable with the wearable unit
were given 1.0.
Table 4.1. Driver Profile Data Acquired Through Questionnaire and Experimenter's
Observations
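Columns from the Driver-Profile Analysis (questionnaire): Drivers; Age; Exp; DG; Vehicle Type
(Comfort Level): SD, HB, AT; Projected Drive Time (Hours): S1 to S5.
Columns from the Experimenter's Observations after Test Drive: TE; TDT (min); IAS; DS: Ex1, Ex2;
VC; HCF.
Drivers Age Exp DG SD HB AT S1 S2 S3 S4 S5 TE TDT IAS Ex1 Ex2 VC HCF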
D1 58 37 CD 5 4 2 3.0 2.5 2.5 2.0 1.5 6.00 PM 22 T 1 1 HB 0
D2 34 15 LD 5 3 1 5.0 5.0 4.5 4.0 3.5 9.00 AM 25 R 1 2 SD 1
D3 37 17 LD 5 4 2 5.5 5.5 5.0 4.5 4.0 6.00 PM 22 T 1 1 SD 0
D4 28 8 SD 5 4 4 3.5 3.5 3.0 2.5 2.0 8.30 AM 23 R 2 2 AT 1
D5 26 7 SD 4 5 4 3.5 3.0 2.5 2.0 1.5 3.00 PM 26 T 2 2 AT 1
D6 44 14 CD 5 4 1 2.0 1.5 1.5 1.0 0.5 7.00 PM 24 T 3 3 SD 1
D7 47 26 SD 5 5 1 3.5 3.5 3.0 2.5 1.5 8.45 AM 25 T 7 6 HB 0
D8 23 4 LD 5 4 1 3.0 2.5 2.0 1.5 1.0 6.45 PM 26 T 4 4 SD 1
D9 47 12 SD 5 4 1 3.0 2.5 2.5 1.5 1.5 7.00 PM 28 T 2 3 SD 1
D10 22 4 SD 4 4 4 3.0 2.5 2.0 1.5 1.0 7.00 PM 23 T 7 7 AT 0
D11 24 5 SD 4 4 5 2.5 2.0 2.0 1.0 1.0 6.45 PM 23 T 7 8 AT 1
D12 34 10 LD 5 4 2 4.0 3.5 3.0 2.5 2.0 6.30 PM 24 T 2 2 SD 1
D13 21 3 SD 4 4 5 3.0 2.5 2.0 1.5 1.0 6.30 PM 22 T 7 6 AT 1
D14 40 20 LD 5 4 2 4.0 3.5 3.5 2.5 2.5 5.10 PM 20 T 2 3 SD 1
D15 29 10 LD 5 4 1 4.5 4.5 4.0 3.5 3.5 4.00 PM 20 T 2 3 SD 0
D16 28 6 SD 3 3 4 2.5 2.0 1.5 0.5 0.5 11.0 AM 21 R 6 6 AT 1
D17 40 12 SD 4 3 5 3.0 2.5 2.5 1.5 1.5 4.30 PM 22 T 6 7 AT 0
D18 32 9 SD 5 4 4 3.5 3.0 2.5 2.0 1.5 1.30 PM 22 T 2 3 AT 0
D19 25 5 CD 4 5 2 3.5 3.5 3.0 2.5 2.0 7.45 PM 23 T 8 7 HB 0
D20 34 8 SD 5 4 5 3.0 2.5 2.5 2.0 1.5 6.45 PM 26 T 3 2 AT 1
Avg 33.6 11.6 - 4.6 4 2.8 3.4 3.1 2.8 2.1 1.8 - 23 - - - - -
Legend: Exp. - Driving Experience; SD (Vehicle Type) - Sedan; HB - Hatch-back; AT - All-Terrain; S1 - Relaxed (Low);
S2 - Moderate; S3 - Moderate + Stress Trends; S4 - Stressed; S5 - Stressed + Stress Trends; DG - Driver Group;
LD - Long Distance Driver; CD - Casual Driver; SD (Driver Group) - Short Distance Driver; TE - Time of Experiment;
TDT - Total Drive Time; IAS - Driver's Initial Affective State (R - Relaxed, T - Tired); DS - Driving Style;
Ex1 - Experimenter 1; Ex2 - Experimenter 2; VC - Vehicle Configuration; HCF - Human Machine Compatibility Factor
4.5 Results: COX PHM based Driver-Profile Analysis
The results obtained from the Cox PH model fit are tabulated in Table 4.3, which lists the
regression coefficients (β), standard error, p-value, z-statistic, hazard ratio (HR) and the
95% confidence interval (C.I.). It can be inferred that CPS, DG and HCF are the three most
important predictors. The empirical relationship for the risk factor with reference to β0 is
given by the expression in equation (4.4):
(4.4)
The survival plot shown in Fig. 4.1 indicates that the survival probability starts decreasing
from the mid-point if the drivers continue driving after three hours, and that they start
feeling overstressed from four hours onwards. In the initial hours of the drive the survival
probability is high, i.e. there is little indication of stress. After a drive of between 2.5
and 3 hours, there is a 50% likelihood that the drivers may start feeling stressed. After three
hours of driving the survival probability decreases rapidly, leaving them over-stressed from
four hours onwards.
Table 4.2: Description of Predictors for COX PHM
Predictor: Evaluation + Scaling Methodology (Scale of 0 to 1)
Initial Affective State (IAS): It is an experimenter inferred parameter on the basis of the time of the experiment and
the driver's activity in the preceding hours.
Binary Scale: Relaxed: 0.00 and Tired: 1.00
Current Physiological State (CPS): Average Projected Drive Time (APDT) in hours: Relaxed: 3.425; Mod.: 3.075; Mod.+
ST: 2.75; Stressed: 2.13; Stressed + ST: 1.75.
\[ CPS = \frac{\text{Max. APDT} - \text{APDT Value}}{\text{Max. APDT} - \text{Min. APDT}} \]
CPS Value: Relaxed (Low): 0.0; Mod.: 0.209; Mod.+ ST: 0.403; Stressed: 0.776;
Stressed + ST: 1.0
Driver Age (DA): Max. Driver Age: 60 years; Min. Legal Driver Age (India): 18 years
\[ DA = \frac{\text{Driver Age} - \text{Minimum Age (18)}}{\text{Maximum Age (60)} - \text{Minimum Age (18)}} \]
Driver Group (DG): APDT and Normalized Score for physiological states 1 to 5:
Group   APDT (1 2 3 4 5)       Normalized Score (1 2 3 4 5)
CD      2.8 2.5 2.3 1.8 1.3    1.0 1.0 1.0 0.9 0.99
LD      4.3 4.1 3.7 3.1 2.7    0.0 0.0 0.0 0.0 0.0
SD      3.1 2.7 2.4 1.7 1.3    0.83 0.88 0.97 1.0 1.0
Average DG Value: CD: 0.976; LD: 0.0; SD: 0.938
Driving Style (DS): Inferred from Experimenter 1 and Experimenter 2 ratings on a scale of 1 to 8 where '1'
corresponds to calmest profile and '8' corresponds to most aggressive profile.
\[ DS = \frac{1}{7}\left(\frac{Ex_1 + Ex_2}{2} - 1\right) \]
Vehicle Configuration (VC): Average Comfort Level (scale of 1-5): SD: 4.6; HB: 4.0; AT: 2.8
Normalized Comfort Level (scale of 0-1): SD: 0.9; HB: 0.75; AT: 0.45
VC Values (scale of 0-1): SD: 0.1; HB: 0.25; AT: 0.55
Human Compatibility Factor (HCF): Binary Scale: Compatible: 0.00 and Not Compatible: 1.00
Figure 4.1: Survival Analysis Plot of Drivers
For each of the predictors described in Table 4.2 we adopted a methodology for
evaluating and then normalizing the points assigned in Table 4.1, and the corresponding
values were assigned on a 0-1 scale. These covariates were fitted in the model against the
projected drive time of each driver for all five physiological states.
From Table 4.3 we can observe that the p-values indicate a statistically significant effect at
the 5% level for IAS, CPS, DG, DS and HCF, whereas DA (p = 0.10) and VC (p = 0.27) are not
significant and their 95% confidence intervals include a hazard ratio of 1. The results also
demonstrate that CPS is the most dominant factor influencing the driver's relative risk while
driving, thus supporting the premise of this work. Among all the predictors, CPS is the only
dynamic variable which changes through the drive for a particular combination of driver and car.
The other predictors may be considered as initialization parameters which reflect the
driver's stress susceptibility for this combination. However, DG and HCF also have a certain
degree of influence on the design of a wearable stress monitoring system. The Cox PH
regression analysis suggests that monitoring the CPS (HR: 18.1162; 95% C.I.: 8.31 - 39.51) is
of paramount importance for a driver centric safety mechanism. These real-life
observations re-affirm our hypothesis that driver behavior and stress levels can largely be
inferred from the current affective state of the driver. Therefore, designing a WDAS
requires further analysis of physiological data. We thus performed the affective state
and stress-trends based analyses, which depend upon the CPS.
Table 4.3. Results of COX Proportional Hazard Model
Predictor β-value Error p-Value z – Statistic Hazard Ratio 95% C. I. Rank
IAS 1.062 0.334 1.493E-03 3.176 2.8927 1.48 - 5.65 VII
CPS 2.897 0.389 1.082E-13 7.431 18.1162 8.31 - 39.51 I
DA 1.145 0.701 1.025E-01 1.633 3.1410 0.77 - 12.76 V
DG 1.717 0.490 4.620E-04 3.502 5.5715 2.09 - 14.86 II
DS 1.567 0.449 4.862E-04 3.488 4.7914 1.95 - 11.77 IV
VC 1.064 0.957 2.661E-01 1.112 2.8991 0.43 - 19.66 VI
HCF 1.635 0.296 3.246E-08 5.527 5.1307 2.84 - 9.27 III
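As an illustration of how the coefficients in Table 4.3 could be used, the sketch below compares the hazard of one scaled driver profile against a reference profile. It assumes the standard Cox form of Eqn. (4.1), in which the baseline hazard cancels when two profiles are compared; it is not a reconstruction of Eqn. (4.4). The function name and the example profiles are hypothetical.

```python
import math

# Regression coefficients (beta) as reported in Table 4.3.
BETA = {
    "IAS": 1.062, "CPS": 2.897, "DA": 1.145, "DG": 1.717,
    "DS": 1.567, "VC": 1.064, "HCF": 1.635,
}


def relative_hazard(profile, reference):
    """Hazard of `profile` relative to `reference`.

    Both arguments map predictor names to scaled values in [0, 1];
    predictors missing from a profile are treated as 0.
    """
    score = sum(
        beta * (profile.get(name, 0.0) - reference.get(name, 0.0))
        for name, beta in BETA.items()
    )
    return math.exp(score)


# Same driver and vehicle, differing only in physiological state.
relaxed = {"IAS": 1.0, "CPS": 0.0, "DG": 0.938}
stressed = {"IAS": 1.0, "CPS": 0.776, "DG": 0.938}
print(relative_hazard(stressed, relaxed))
```

Because only CPS differs between the two profiles, the result depends on the CPS coefficient alone, which mirrors the observation that CPS is the one predictor that changes during a drive.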
4.6 Conclusions
It may be concluded that the Driver-Profile Analysis reflects the influence of the identified
predictors on drivers' behavioral characteristics with respect to estimating their stress levels.
The Cox PHM based survival analysis indicated that the Current Physiological State (CPS) of
drivers must be monitored carefully, as this predictor resulted in a significantly high hazard
ratio. The Driver-Profile Analysis further revealed that the driver group (DG) and the
Human Machine Compatibility Factor (HCF) must be carefully considered while designing a
device for stress-monitoring. Building on this result, the rest of the thesis focuses on the
Affective State and Stress-Trend analysis methods for the stress-level identification of
automotive drivers.
Chapter 5
Biosignal-Assisted Stress Analysis
In this chapter affective state and stress-trend analysis are discussed in view of the
significance of driver stress as established in the previous chapter. The current
physiological state (CPS), which has a very high hazard ratio when compared with the other
predictors, indicates that to analyze the driver stress level, biosignal based analysis methods
such as affective state monitoring and stress-trend detection should be further explored.
A popular approach for monitoring a driver's physical and mental stress involves
"affective state" monitoring, also known as "emotional state" or "sentic state"
monitoring (Riener et al, 2009). Research has shown that discrete driving events and
incidents observed during driving (James and Nahl, 2003) also contribute to the stress level;
these have been termed stress-trends (Singh et al., 2011) and stress events (Rigas et al.,
2012). Therefore, in this chapter the methodologies⁵¹ adopted and the results obtained for
affective state monitoring and stress-trend detection are discussed.
5.1 Affective State Detection using ANN Classifiers
In order to build the proposed wearable driver assistance system (WDAS), it is necessary
to develop algorithms that assess the affective state of the drivers while
simultaneously considering the effect of stress-trends that may degrade their performance. In
pattern recognition systems a number of classifiers⁵² have been employed. These include
Decision Trees (DTs), Discriminant Analyzers, Bayesian Networks (BNs), Support Vector
Machines (SVMs), Artificial Neural Networks (ANNs), etc. The
functioning of the human brain has inspired the decision making mechanisms of Artificial Neural
Network models (Jain et al., 1996; Jain et al., 2000). In applications where the
training features are non-linear and the decision boundaries are modeled as a
non-linear function in the feature space, ANNs are a preferred classifier. One of the merits
of ANNs is their ability to classify and predict well even in the presence of noisy data (Ali
et al., 2009; Duda et al., 2006). ANNs have also been found suitable for training on both
categorical and continuous features. In addition, neural network models are used in the
analysis, prediction and classification of time series data (Jain et al, 2000), as appropriate
in the present case.
51 Most of the work presented in this chapter has been published in two journal papers (Singh et al., 2013a and
Singh et al., 2013b).
52 It has been discussed in detail in Section 2.7, Chapter 2.
In this work, the physiological data was collected in real-time driving scenarios and is
therefore likely to be corrupted by noise, sensor errors and motion artefacts.
We therefore employed the required signal processing algorithms, filtering and methods for
removing the motion artefacts as discussed in Section 2.7. However, due to
the dynamic real-time driving scenario and the interaction of the body-worn
physiological sensor systems with the human body, some amount of noise may still be
present in the data. ANNs are well suited to handling
such noisy dynamic systems with an acceptable classification rate and training speed.
ANNs employ a connectionist approach to compute the interconnection weights and bias
parameters. A generic neural network model has been shown in Fig. 5.1 (Haykin, 2001). The
neuron model for a neuron 'k' consists of (a) synapses or connecting links (j) for a signal xj,
characterized by their own weights or strengths (wkj); (b) an adder acting as a linear combiner
to sum the input signals weighted by their respective synapses; and (c) an activation function
or squashing function or a transfer function to limit the amplitude of the output signal to a
permissible range.
Figure 5.1. A Generic Nonlinear Neural Network Model
Mathematically a neuron k is described by the following pair of equations:
\[ u_k = \sum_{j=1}^{m} w_{kj} x_j \tag{5.1} \]
\[ y_k = \varphi(u_k + b_k) \tag{5.2} \]
where,
x1, x2, x3,....,xm = Input Signals
wk1, wk2, wk3,....,wkm = Synaptic weights of neuron 'k'
uk = Linear combiner output due to the input signals
bk = bias or offset, which increases or decreases the net input of the activation function
depending on its polarity (it applies an affine transformation to uk, giving the net input
ʋk = uk + bk)
φ(∙) = activation function or squashing function or transfer function
yk = Output signal of the neuron
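The neuron model described above can be sketched as a short function. This is a minimal illustration with hypothetical names and example values, not the network implementation used in this work.

```python
import math


def neuron_output(x, w, b, phi):
    """Forward pass of a single neuron k.

    x:   input signals x_1 ... x_m
    w:   synaptic weights w_k1 ... w_km
    b:   bias b_k
    phi: activation (squashing) function
    """
    u = sum(w_j * x_j for w_j, x_j in zip(w, x))  # linear combiner output u_k
    v = u + b                                      # net input v_k
    return phi(v)                                  # output signal y_k


def logsig(v):
    """Log-sigmoid activation with unit slope parameter."""
    return 1.0 / (1.0 + math.exp(-v))


# Hypothetical inputs and weights for a three-input neuron.
print(neuron_output(x=[0.2, 0.5, 0.1], w=[0.4, -0.3, 0.8], b=0.1, phi=logsig))
```

Passing the activation function as an argument makes it easy to swap in any of the transfer functions discussed next.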
An activation function in a neural network calculates the layer’s output for its net input
(ʋk). This output is fed into subsequent layers as input. These activation functions may be
either a linear or non-linear function of the net input (ʋk). Some of the most commonly used
transfer functions evaluated are described as below (Hagan et al., 1996):
(a) The Threshold Function, also called the Heaviside Function or the Hard Limit (hardlim)
Transfer Function, is shown in Fig. 5.2 (a).
Figure 5.2 (a) A Hard Limit Function
(b) The Linear Function, also known as the "purelin" function, has an output equal to its
input. A purelin function along with a piecewise-linear function is shown in
Fig. 5.2 (b).
Figure 5.2 (b) A purelin Function and a Piecewise-Linear Function
(c) The Log-sigmoid function takes any input value between -∞ and +∞ and limits the
output according to Eqn. 5.3, where 'a' is a slope parameter. This function has several
variants and is used in multilayer networks trained using backpropagation algorithms.
\[ \varphi(v_k) = \frac{1}{1 + \exp(-a v_k)} \tag{5.3} \]
Figure 5.2 (c) A Log-Sigmoid Function
The hard limit, piecewise-linear and log-sigmoid functions shown in Figures 5.2 (a) - 5.2 (c)
range from 0 to +1. In certain cases, however, activation functions which are antisymmetric
with respect to the origin and range from -1 to +1 are used (Haykin, 2001). A Tan-Sigmoid
function can be defined as per Eqn. 5.4. A Signum function and a Tan-Sigmoid function are
shown in Figure 5.2 (d).
\[ \varphi(v_k) = \tanh(v_k) = \frac{e^{v_k} - e^{-v_k}}{e^{v_k} + e^{-v_k}} \tag{5.4} \]
Figure 5.2 (d) A Signum Function and a Tan-Sigmoid Function
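The transfer functions discussed in this section can be sketched as plain Python functions. The names follow the hardlim and purelin labels used above; logsig, tansig and signum are assumed names, the hyperbolic tangent is assumed as the Tan-Sigmoid form, and hardlim is assumed to return 1 at a net input of exactly zero.

```python
import math


def hardlim(v):
    """Threshold (Heaviside) function: 1 for v >= 0, otherwise 0."""
    return 1.0 if v >= 0 else 0.0


def purelin(v):
    """Linear function: the output equals the net input."""
    return v


def logsig(v, a=1.0):
    """Log-sigmoid with slope parameter a; output lies in (0, 1).

    No overflow guard is included, so very large negative a * v will raise.
    """
    return 1.0 / (1.0 + math.exp(-a * v))


def tansig(v):
    """Tan-sigmoid, taken here as the hyperbolic tangent; output lies in (-1, 1)."""
    return math.tanh(v)


def signum(v):
    """Signum function: -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


for v in (-2.0, 0.0, 2.0):
    print(v, hardlim(v), purelin(v), logsig(v), tansig(v), signum(v))
```

The log-sigmoid and tan-sigmoid are smooth and differentiable everywhere, which is what makes them usable with gradient-based training such as backpropagation.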
5.1.1 Classification Approaches
The three broad categories of classification approaches are (a) unsupervised, (b)
supervised and (c) reinforcement learning, each of which assigns the datasets to one of the
possible classification states, as shown in Figure 5.3 (Ali et al., 2009).
In the present stress-classification problem, unsupervised and supervised learning
approaches have been used. Supervised learning is adopted because it is well suited to
modeling and controlling dynamic systems.
Figure 5.3: Classification Methods
Feature vectors extracted from the real-time data, as discussed in Sections 3.6 and 3.8.2,
are used to train the classifier. If trained properly, the machine can determine the driver’s
affective level in a real environment and assess the relative risk the driver is facing, utilizing
the context of the driver’s situation. In the present context both unsupervised and supervised
algorithms using neural networks have been applied.
In the following sub-section we discuss the evaluation parameters on which
different classifiers are analyzed for their appropriateness.
5.1.2. Performance Measures for Classifier Evaluation
In a classification process the input instances are mapped into the predicted classes as
output (Fawcett, 2006). This exercise results in a confusion matrix (also known as a
contingency table), which is used to calculate the performance measures of a classifier. Table
5.1 shows the confusion matrix for a binary classifier⁵³, consisting of the
number of (a) true positives (tp), (b) true negatives (tn), (c) false positives (fp), and (d) false
negatives (fn). These values help in calculating several performance metrics, such as the
precision (also known as predictive ability), sensitivity and specificity of a binary
classifier.
Precision is the measure of exactness or fidelity, i.e. the fraction of instances predicted
as positive that are actually positive (Eq. 5.5).
$$\text{Precision} = \frac{tp}{tp + fp} \tag{5.5}$$
53 A binary classifier classifies the input instances into only two classes, e.g. true or false.
Classification Methods (content of Figure 5.3)
Supervised Learning
• Input (training data) and output (target data) both labeled
• Classifier function for discrete output
• Regression function for continuous output
Unsupervised Learning
• Finds hidden structures in unlabeled data
• Forms natural clusters based on similarity
• No error or reward signal
Reinforcement Learning
• Trial-and-error based approach
• Involves a sequence of steps
• Decisions at each stage affect the decisions taken at the next steps
Table 5.1: Confusion Matrix or Contingency Table of a Binary Classifier
| Hypothesized Class (Predicted Class) | Actual Class (True Class): Positive | Actual Class (True Class): Negative | Row Total |
|---|---|---|---|
| Positive | True Positives (tp) | False Positives (fp) | tp + fp (total number of subjects with positive test) |
| Negative | False Negatives (fn) | True Negatives (tn) | fn + tn (total number of subjects with negative test) |
| Column Total | tp + fn (total number of subjects with given condition) | fp + tn (total number of subjects without given condition) | |
Sensitivity is the ability of a test to correctly identify positive results (Eq. 5.6). It is
also called recall or the true positive rate.
$$\text{Sensitivity} = \frac{tp}{tp + fn} \tag{5.6}$$
Specificity is the ability of a test to correctly identify negative results (Eq. 5.7).
$$\text{Specificity} = \frac{tn}{tn + fp} \tag{5.7}$$
The other important performance measures are:
(5.8)
(5.9)
When the values of precision, sensitivity and specificity of a classifier are ambiguous,
their interpretation becomes difficult. In such situations, another set of measures, known as
the F-measure and the G-mean, is used for evaluating classifier performance.
The F-measure assesses the accuracy of a test by computing the weighted average of
precision and sensitivity (Sokolova and Lapalme, 2009). A high F-measure value (Eq. 5.10)
indicates that both precision and sensitivity are high. The G-mean values (Eqs. 5.11 and
5.12) measure the balanced performance of a classifier between sensitivity, specificity and
precision by maximizing the performance accuracy of a classifier (Gu et al., 2009).
(5.10)
(5.11)
(5.12)
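The binary measures discussed above can be computed from the four confusion-matrix counts, as in the sketch below. The precision, sensitivity and specificity follow the definitions given in the text. The forms used for the F-measure (balanced harmonic mean) and for the two G-means (geometric mean of sensitivity with precision, and of sensitivity with specificity) are commonly used definitions and are an assumption here, since the exact forms of Eqs. 5.10 to 5.12 are not legible in this copy. The counts in the usage line are hypothetical.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Performance measures of a binary classifier from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)      # recall, true positive rate
    specificity = tn / (tn + fp)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Assumed form: balanced F-measure (harmonic mean of precision and sensitivity)
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
        # Assumed forms of the two G-means
        "gmean_1": math.sqrt(sensitivity * precision),
        "gmean_2": math.sqrt(sensitivity * specificity),
    }

m = binary_metrics(tp=40, fp=10, fn=10, tn=40)  # hypothetical counts
```

The sketch does not guard against empty rows or columns, for which the ratios are undefined.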
Receiver operating characteristics (ROC) curves have been used to visualize and select a
classifier based on the classifier's performance (Fawcett, 2006). ROC curves are plotted
with the false positive rate on the X-axis and the true positive rate on the Y-axis.
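A ROC curve can be traced by sweeping a decision threshold over the classifier scores, as sketched below. The function names, scores and labels are hypothetical, and the sketch assumes that all scores are distinct (tied scores would need to be grouped before a point is emitted).

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs for a threshold sweep."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the threshold one instance at a time, from the highest score down.
    for _, label in sorted(zip(scores, labels), key=lambda pair: -pair[0]):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

pts = roc_points([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])  # hypothetical scores and labels
area = auc(pts)
```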
In multiclass classification problems, it is necessary to account for the individual class
results for efficient interpretation. Therefore, to understand the quality of classification, let
us consider an individual class Ci, where i = 1,2,3,....,n and n is the number of classes.
Table 5.2: Classifier Performance Measures for Multiclass Classifiers
| Evaluation Metric | Formula | Importance |
|---|---|---|
| Precision | | Measure of exactness or fidelity (correctly identified instances of a relevant subset) |
| Sensitivity | | The ability of a test to correctly identify positive results |
| Specificity | | The ability of a test to correctly identify negative results |
| Accuracy | | Overall classification accuracy |
| Area Under the ROC Curve | | Trade-off parameter between sensitivity and specificity. AUC values range between 0 and 1.0 (Fawcett, 2006). For multiclass classification, the averaged AUC is computed by considering the one-against-all configuration (Ferri et al., 2009) (i.e. a c-dimensional classifier as c 2-dimensional classifiers). |
| Kappa Statistics | κ = (P(a) − P(e)) / (1 − P(e)), where P(a) = relative observed agreement among the classes and P(e) = probability that agreement is due to chance | Measure of inter-observer reliability. Kappa coefficient (Landis and Koch, 1977): ≤ 0 = poor, 0.01 - 0.20 = slight, 0.21 - 0.40 = fair, 0.41 - 0.60 = moderate, 0.61 - 0.80 = substantial, and 0.81 - 1.0 = almost perfect. |
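The Kappa statistic listed in Table 5.2 can be computed directly from a confusion matrix, as in the following sketch. It assumes the standard Cohen's kappa, with P(a) taken as the fraction of instances on the diagonal and P(e) estimated from the row and column totals. The matrix in the usage line is hypothetical.

```python
def cohen_kappa(cm):
    """Cohen's kappa for a square confusion matrix given as a list of rows."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    # P(a): relative observed agreement (diagonal entries)
    p_a = sum(cm[i][i] for i in range(k)) / total
    # P(e): agreement expected by chance, from the row and column marginals
    p_e = sum(sum(cm[i]) * sum(cm[r][i] for r in range(k)) for i in range(k)) / total ** 2
    return (p_a - p_e) / (1.0 - p_e)

kappa = cohen_kappa([[20, 5], [10, 15]])  # hypothetical 2-class matrix
```

For this matrix P(a) = 0.7 and P(e) = 0.5, so kappa is approximately 0.4, which falls in the "fair" band of Landis and Koch.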
The different performance measures are computed using micro-averaging or macro-averaging
techniques for multiclass classifiers (Sokolova and Lapalme, 2009). In the present
problem, which is modeled as a 3-Class and a 4-Class problem, the macro-averaging techniques
have been used to define the classifier performance measures explained in Table 5.2 (Sokolova
and Lapalme, 2009), with two additional performance measures: the area under the ROC curve
and the Kappa statistic.
5.1.3 Employing Unsupervised Learning for Affective State Monitoring
There are several approaches to unsupervised learning, which include clustering, blind signal
separation and neural network models. Clustering uses methods such as k-means, mixture
models and hierarchical clustering. The blind signal separation methods use feature extraction
techniques for dimensionality reduction, such as Principal Component Analysis, Independent
Component Analysis, Non-negative Matrix Factorization and Singular Value Decomposition.
The neural network models for clustering include the Self Organizing Map (SOM) and
Adaptive Resonance Theory (ART) (Haykin, 2001).
The SOM is a topographic organization in which nearby locations in the map represent
inputs with similar properties. The method employed uses Kohonen Self Organized
Maps (KSOM) for cluster analysis to find intrinsic patterns in the data set in an unsupervised
fashion. In the process, each element in the data set is annotated with a cluster index
signifying the data cluster to which it belongs. The KSOM is capable of spatially
organizing the data set using the techniques of spatial concentration and tuning using learning
vector quantization methods (Kohonen, 1990).
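The self-organizing principle described above can be sketched in a few lines. This is a simplified illustration and not the configuration used in this work: it assumes a rectangular grid, sequential (online) updates, and linearly decaying learning rate and neighbourhood width. All parameter values and function names are hypothetical.

```python
import numpy as np

def train_som(data, rows=10, cols=3, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM and return the node weight vectors, shape (rows*cols, d)."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    weights = rng.uniform(data.min(axis=0), data.max(axis=0), size=(rows * cols, d))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(n)]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decreasing learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best matching unit
            grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))           # neighbourhood function
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def assign_clusters(data, weights):
    """Annotate each data vector with the index of its nearest node."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)
```

Nodes that are neighbours on the grid are pulled towards similar inputs, which is what produces the topographic ordering.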
In the present application, the Principal Component Analysis (PCA) method has been
adopted to reduce the 39-attribute feature vector to its first two principal
components after normalization. In the process of dimensionality reduction, the principal
components that contribute less than 10% of the total variation in the data set were
eliminated (Singh et al., 2012). The extracted principal components are then clustered
using KSOMs with the network architecture described in Table 5.3.
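The dimensionality-reduction step can be sketched as follows. The sketch assumes z-score normalization and a PCA computed via the singular value decomposition. The function name `pca_reduce` and the default threshold argument are illustrative, with the 10% figure taken from the text.

```python
import numpy as np

def pca_reduce(features, min_variance_ratio=0.10):
    """Project normalized features onto components explaining >= min_variance_ratio."""
    x = np.asarray(features, dtype=float)
    std = x.std(axis=0)
    std[std == 0] = 1.0                       # avoid division by zero for constant columns
    z = (x - x.mean(axis=0)) / std            # z-score normalization (assumed)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)           # fraction of total variance per component
    keep = ratio >= min_variance_ratio
    return z @ vt[keep].T, ratio
```

The second return value lists the variance fraction of every component, so the effect of the threshold can be inspected before the reduced data is passed to the clustering stage.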
Fig. 5.4 (a) shows a distinct 2D topographic map of data clusters formed for a sample
subject driver. The unified distance matrix (U-matrix) approach is adopted in the analysis of
SOMs for visualizing the high-dimensional data in a 2D plane (Kohonen, 1990). Fig. 5.4 (b)
represents the corresponding unified distance matrix obtained on training the SOM to cluster
the principal components of the extracted data vectors. The groups of lighter colors (closely
spaced nodes) correspond to a cluster, while the darker hexagons (distant nodes) correspond
to the separation boundaries. The nodes annotated as 'I' in Fig. 5.4 (b) predominantly contain
data points collected during the initial relaxed state, i.e. the Low Stress State.
Table 5.3: KSOM Configuration and Architecture (Source: Singh et al., 2012)
| S. N. | Parameter | Characteristic Adopted |
|---|---|---|
| a. | Dimensions | 10 X 3 nodes |
| b. | Topology | Hexagonal Topology |
| c. | Distance Function | Euclidean Distance Weight Function |
| d. | Weight Function | Negative Distance Weight Function |
| e. | Transfer Function | Competitive Transfer Function |
| f. | Learning Function | Batch Self Organizing Map Learning Function |
| g. | Adaptation | Sequential Order Incremental Training |
| h. | Training Function | Unsupervised Bias Training; monotonically decreasing Learning Rate |
| i. | Cluster Identification | |
Fig 5.4: (a) KSOM Weight Vectors and (b) Unified Distance Matrix
(Source: Singh et al., 2012)
The cluster of points marked as 'II' comprises data points collected under relaxed-driving,
i.e. the Medium Stress State, and Cluster 'III' contains data vectors obtained during busy
driving, i.e. the High Stress State. The clustering using SOM attained an average predictive
ability of 81.60%, a sensitivity of 79.68% and a specificity of 89.83%. SOMs can thus help
establish topological relationships between the collected data and the observed stress states,
and can hence prove to be a powerful tool for driver stress monitoring.
5.1.4 Employing Supervised Learning for Affective State Monitoring
In the present stress-classification problem, a supervised learning approach has been
adopted. Supervised learning methods model and control dynamic systems effectively, and
they are proven to be good classification and prediction methods for noisy data. Since there
is no specific approach to selecting a classifier, it was
decided to evaluate a number of neural network architectures for stress level classification. A
typical stress-classification model contains an adaptive interconnection of artificial neurons
conforming to a computational model to classify a number of selected features into one of the
target classes.
In this work it was observed during data collection that drivers reported varied levels of
stress, indicating that the present problem is a multiclass
problem (Sokolova and Lapalme, 2009). Therefore, in the first phase, based on a self-reported
maximum voting scheme, it was found that the stress levels can be modeled as a 3-Class
classification problem. This analysis divides the whole dataset according to the timeline
chart (Fig. 3.7), resulting in Relaxed, Moderate and Stressed states. In the next phase
of analysis, to include certain other stress-contributing effects, the dataset was modeled as a
4-Class classification problem. This method uses class labels based on a segmented road⁵⁴,
resulting in stress levels from Level 1 to Level 4 (explained later in this section).
(a) A 3-Class Classification Model: The drivers' affective state qualitatively reflects the stress
experienced by them under the different scenarios of data acquisition depicted in the stress graph
(Fig. 2). A maximum voting scheme was employed to understand the stress level faced by
drivers under each of the scenarios (discussed in Table 5.5). This helped in categorizing the
physiological data into one of the 3-Class labels. The data under Pr-dr and Po-dr was
categorized as the Relaxed (R) affective state. The Moderate (M) affective state was assigned to
the data under Rx-dr and the latter half of Rt-dr. Finally, the Stressed (S) affective state was
assigned to the data collected under By-dr and the first half of Rt-dr.
54 The term segmented road here refers to different sections of the road labeled as segments based on difficulties
associated with them with respect to driving through the segment.
In the maximum voting scheme, drivers had to respond to a questionnaire that
included questions about their driving in the experiment, the scenarios, and how stressful or
comfortable they felt. Some of the questions are listed in Table 5.4. A response
form was used to collect the individual driver's responses. The drivers had to rate the stress
level they experienced in the driving scenarios using a 6-point Likert Semantic Scale
(1 - Least Stress to 6 - Most Stress). These responses were processed to identify the stress
levels of the individual scenarios. Table 5.5 gives a glimpse of the methodology adopted to
tabulate the individual fractions of responses based on the observations made by all the 20
drivers. It was felt that a 3-Class label derived from the 6-point scale would be more suitable,
as a 6-class classifier would be statistically weak because each individual class would likely
be represented by too few input instances in the collected data.
Table 5.4: Questionnaire and Observations (Source: Singh et al., 2013a)
| S. No. | Questions asked from Drivers | S. No. | Experimenter's Observation |
|---|---|---|---|
| 1. | Rate the scenario according to the stress experienced (Relaxed / Moderate / Stressed?) | 1. | Time of experiment and total drive time. |
| 2. | Average distance driven per day | 2. | Driving Style (Calm / Aggressive?) |
| 3. | How comfortable you are while driving a Sedan / Hatchback / All Terrain Vehicle | 3. | Which vehicle type? |
| 4. | Driving Experience | 4. | Comfortable with the equipment? |
Table 5.5: Stress-Level Assessment for Individual Scenarios for a 3-Class Model
(Source: Singh et al., 2013a)
| Stress Scale | Low: 1 | Low: 2 | Moderate: 3 | Moderate: 4 | Stressed: 5 | Stressed: 6 | Maximum Voting Scheme |
|---|---|---|---|---|---|---|---|
| Pre-driving | 100% | - | - | - | - | - | Low |
| Relaxed-driving | - | 40% | 60% | - | - | - | Moderate |
| Busy-driving | - | - | 10% | 20% | 55% | 15% | Stressed |
| Return-driving | - | - | 50% | 25% | 25% | - | Moderate (50%) + Stressed (50%) |
| Post-driving | 30% | 60% | 10% | - | - | - | Low |
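The voting step summarized in Table 5.5 can be illustrated with a short sketch. The response fractions are transcribed from the table. The grouping of the 6-point scale into three labels (1-2 Low, 3-4 Moderate, 5-6 Stressed) is an assumption read from the table header, and the names `RESPONSES`, `GROUPS` and `vote` are hypothetical.

```python
# Percentage of drivers choosing each Likert score 1..6, transcribed from Table 5.5.
RESPONSES = {
    "Pre-driving":     [100, 0, 0, 0, 0, 0],
    "Relaxed-driving": [0, 40, 60, 0, 0, 0],
    "Busy-driving":    [0, 0, 10, 20, 55, 15],
    "Return-driving":  [0, 0, 50, 25, 25, 0],
    "Post-driving":    [30, 60, 10, 0, 0, 0],
}

# Assumed grouping of the scale into the three class labels (zero-based indices).
GROUPS = {"Low": (0, 1), "Moderate": (2, 3), "Stressed": (4, 5)}

def vote(percentages):
    """Return the label with the largest share of responses, and all shares."""
    totals = {label: sum(percentages[i] for i in idx) for label, idx in GROUPS.items()}
    return max(totals, key=totals.get), totals

labels = {scenario: vote(p)[0] for scenario, p in RESPONSES.items()}
```

A plain majority vote labels return-driving as Moderate, whereas Table 5.5 splits that scenario equally between Moderate and Stressed, so the table reflects an additional judgment that this sketch does not model.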
Besides the affective state class labeling as discussed above, major stress-inducing
driving events like sudden brakes, sharp turns, rough road-patches etc. were annotated
simultaneously by a secondary experimenter during real-time data collection. These events
have been analyzed separately in Section 5.2.
(b) A 4-Class Classification Model: The second approach adopted to analyze the affective
state considers the segmentation of the road. Since the data collected was of two types
namely (a) affective state data and (b) stress-trends data, the feedback as well as driver's
perceptions were considered to model the affective states as a 4-Class problem. This helped
in annotating the class labels as Level-1, Level-2, Level-3 and Level-4.
During data collection, two types of observations were carried out: (a) data collected
before and after driving and (b) data collected during driving. The data collected during the
pr-dr and po-dr scenarios has been labeled as stress Level-1. The other three classes,
Level-2 to Level-4, were labeled according to the difficulty level observed during driving on
a particular road segment, shown as underlined numbers (2, 3 and 4) in Figure 5.5,
reflecting the corresponding affective states. A scale of 2 was assigned to the routes where
minimum pedestrian density and driving effort were observed. The routes where slightly
higher traffic and more people were noticed were given a scale of 3. The routes under the busy
driving scenario were assigned a scale of 4 in this semi-urban setup. It was decided to avoid a
scale of 5, which would represent a very busy highway with voluminous traffic over longer
stretches, as would typically be observed in a
metropolitan city. Table 5.6 presents the annotated stress-trend data observed as event
markers during the driving experiments only. The weight scores, with the abbreviated names of
the corresponding stress-trends, are shown in Figure 5.5 at the appropriate locations.
Table 5.6. Stress-Trend Markers and their Weights (Source: Singh et al., 2013b)
| Stress-Trend Markers | Abbreviations | Weight Score |
|---|---|---|
| Left Turn | LT | 1 = less (low); 2 = more than 1 (medium) and 3 = greater than 2 (slightly high effort) |
| Right Turn | RT | 2 = approx. same as 2 of LT (medium) and 4 = slightly higher effort required than the 3 of LT |
| Left-to-Right Circle | LRC | 1 = less (low); 2 = approx. same as 2 of LT (medium) and 4 = slightly higher effort required than the 3 of LT |
| Speed Breaker | SB | 1 = less (low); 2 = more than 1 (medium) and 3 = greater than 2 (slightly high effort) |
Figure 5.5. Driving Scenarios Route Map.
5.1.5 Evaluation of Neural Network Architectures
A typical ANN architecture consists of an input layer, a hidden layer and an output layer
(Haykin, 2001). The input layer is fed with an input vector of selected features which
represent the affective state of the automotive driver. The hidden layer models the non-
linearities present in the data. In contrast, the output layer represents the predicted output
classes as a result of the classification process that in the present case will represent the
predicted affective state of the automotive driver.
There are two broad categories of neural network architectures, viz. (i) Feed-forward
Neural Networks and (ii) Recurrent Neural Networks. For the present work, 7 variants of
neural network architectures, comprising four feed-forward and three recurrent networks,
have been evaluated as described below (Singh et al., 2013a):
(i) Feed-forward Neural Network: These neural networks have one-way connections from the
input to the output layers. They are used in prediction and pattern recognition problems
(Haykin, 2001). The feed-forward networks evaluated include the following:
a) Single Layer Perceptron Neural Network (SLPNN): In this network the first layer is
the input layer, to which the feature vector is fed, and the second layer is the output layer itself.
An SLPNN is shown in Fig. 5.6. SLPNN are considered to be a simple feed-forward
network.
Figure 5.6. Single Layer Perceptron Neural Network Model.
b) Multi-Layer Perceptron (1-Hidden Layer) Neural Network (MLP1NN): The MLP’s
fully-connected network structure enables the pattern of activation in a particular layer at
each time step to influence its behavior in the next time step. Due to the inherent overlap
observed between the moderate and stressed affective state data of drivers, the classifier should
be capable of handling non-linearities. An MLPNN configuration is shown in Fig. 5.7.
It has been proven that a feed-forward network such as an MLP with one hidden layer can fit any
finite input-output mapping function (Haykin, 2001). The feed-forward network is therefore
capable of handling non-linear mapping problems like affective state recognition. An MLP with
a sufficient number of neurons in the hidden layer is proven to be capable of learning
any kind of non-linearly separable classification problem like the present one (Haykin,
2001).
Figure 5.7. Multilayer Perceptron Neural Network Model.
c) Cascade Forward Backpropagation Network (CASFBNN): CASFBNN are similar to
MLPs, but each layer also has connections from the input and from every previous layer.
These networks train faster because each neuron is trained independently, but they suffer
from overfitting when the training data is noisy (Schetinin, 2003). This network was chosen
for evaluation to achieve a faster training time with an acceptable
classification rate. A CASFBNN configuration is shown in Fig. 5.8.
Figure 5.8. Cascade Forward Backpropagation Neural Network Model.
d) Feed Forward Distributed Time-Delay Neural Network (FFDTDNN): FFDTDNN are
similar to feed-forward multilayer perceptrons (MLPs) but differ in that the inputs
to a node contain not only the immediate outputs of previous nodes but also their outputs from
some previous time steps, realized using tapped-delay lines. Such a network has a finite dynamic
response to time series input data but suffers from a very high training time requirement
(Ibrahim, 2010). These networks were chosen for evaluation to account for the latency in
response to the stimuli observed while driving. The influence of past observations on the
present affective state of the driver can be accounted for by the delay elements
embedded in this network. An FFDTDNN configuration is shown in Fig. 5.9.
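The tapped-delay line mentioned above can be illustrated with a minimal sketch that turns a sample stream into the delayed input vectors a time-delay layer would see. The function name and the number of delays are hypothetical.

```python
from collections import deque

def tapped_delay_windows(signal, delays):
    """Return [x(t), x(t-1), ..., x(t-delays)] for every t with a full history."""
    buffer = deque(maxlen=delays + 1)   # oldest sample is dropped automatically
    windows = []
    for sample in signal:
        buffer.appendleft(sample)       # newest sample goes to the front
        if len(buffer) == delays + 1:
            windows.append(list(buffer))
    return windows

windows = tapped_delay_windows([1, 2, 3, 4], delays=2)
```

Each window pairs the current sample with its recent past, which is how a delayed physiological response can still influence the present classification.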
Figure 5.9. Feed Forward Distributed Time-Delay Neural Network Model.
(ii) Dynamic or Recurrent Neural Networks: These are context-aware neural networks with dynamic
neurons, memory and recurrent feedback connections. They are used for time series prediction and
non-linear dynamic problems. These networks are sensitive, capable of attractor dynamics
and adapt to past inputs (Haykin, 2001). The recurrent networks evaluated include the
following:
a) Elman back-propagation neural networks (ELMBNN): Elman networks are feed-forward
networks with a recurrent layer. Such a recurrence simplifies the learning process for
complicated networks because it allows the networks to remember states from the past. In
addition to the hidden layer, the network has a context layer that copies the output of the
hidden layer and uses it as an extra input signal in the next time step (Grüning, 2007). They
were chosen to check whether a reduced complexity in the design significantly affects the
predictive ability for the present problem. The Elman network, shown in Fig. 5.10, is commonly
a two-layer network with feedback from the first-layer output to the first-layer input. This
recurrent connection allows the Elman network to both detect and generate time-varying patterns.
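The role of the context layer can be made concrete with a small forward-pass sketch. It assumes a tanh hidden layer and a linear output layer. The weight shapes and values in any call are hypothetical, and training is not shown.

```python
import numpy as np

def elman_forward(inputs, w_in, w_ctx, b_hidden, w_out, b_out):
    """Run an Elman network over a sequence and return the output at each time step."""
    context = np.zeros(w_ctx.shape[0])          # context layer starts empty
    outputs = []
    for x in inputs:
        # Hidden layer sees the current input and the previous hidden state.
        hidden = np.tanh(w_in @ x + w_ctx @ context + b_hidden)
        outputs.append(w_out @ hidden + b_out)  # linear output layer
        context = hidden.copy()                 # copy the hidden output for the next step
    return np.array(outputs)
```

Because of the context layer, two identical inputs presented at consecutive time steps generally produce different outputs.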
Figure 5.10. Elman Backpropagation Neural Network Model.
b) Layer recurrent neural networks (LRNN): LRNN are similar to feed-forward
networks, but each layer has a recurrent connection with a tapped delay associated with it.
Such feedback provides a moving-window analysis and is useful in evaluating an instance
correctly, because the output of such networks depends not only on the current input but also
on previous states (Liu and Wang, 2008). The recurrence involved in the design of such a
network would enable the design of a dynamically stable stress classifier. As explained for the
distributed time-delay network above, the delay elements can account
for the latency in response observed when the driver is subjected to stressful situations
during the driving task. An LRNN is shown in Fig. 5.11.
Figure 5.11. Layer Recurrent Neural Network Model.
c) Non-linear Autoregressive Network with Exogenous Inputs (NARXNN) [22]: This network is
capable of predicting the values of one time series given its past values, the feedback inputs
and a second time series called the exogenous time series. It is found to be less sensitive
to long-term dependencies and to maintain a good learning rate and generalization performance.
Here the hidden units from previous states are considered as additional inputs to the next state
(Haykin, 2001). This network was chosen for evaluation to enable faster learning
rates for the present application. A NARXNN is shown in Fig. 5.12.
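The input structure of a NARX model can be sketched by building its regression vectors: each target y(t) is paired with past values of the same series and past values of the exogenous series. The function name and the delay orders `ny` and `nu` are hypothetical. The non-linear mapping itself, i.e. the network, is not shown.

```python
def narx_regressors(y, u, ny, nu):
    """Pair each y(t) with [y(t-1..t-ny), u(t-1..t-nu)] as (inputs, target) tuples."""
    if len(y) != len(u):
        raise ValueError("y and u must have the same length")
    start = max(ny, nu)                 # first time step with a full history
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - i] for i in range(1, ny + 1)]   # autoregressive terms
        past_u = [u[t - i] for i in range(1, nu + 1)]   # exogenous terms
        rows.append((past_y + past_u, y[t]))
    return rows

rows = narx_regressors([1, 2, 3, 4], [10, 20, 30, 40], ny=2, nu=1)
```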
Figure 5.12. Non-Linear Autoregressive with Exogenous Inputs Neural Network Model.
Most of the neural networks enunciated above utilize back propagation technique for
learning. This method utilizes the gradient of error criterion with respect to weights for a
given input by propagating it through the network. It is a variation of gradient search which
employs the least squares criterion for optimization.
Neural network training involves selection of (a) the input, hidden and output layers in a
particular architecture, (b) a learning method, (c) a training method and (d) a stopping criterion.
In order to train the selected neural network configurations, the feature vector was divided in
the ratio of 60:20:20 for training, validation and testing respectively. The Levenberg-Marquardt
Backpropagation algorithm was selected as the learning function, whereas the
Gradient Descent Non-linear Optimization search method and Bias Learning function were
used as the training algorithm, as discussed in Section 5.1.6.1. The stopping criterion was
based on the mean squared error (MSE), with a threshold of 0.05 observed during the training.
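The data division and the stopping test can be sketched as follows. The 60:20:20 ratio and the MSE threshold of 0.05 are taken from the text. Shuffling the instances before the split is an assumption, and the function names are hypothetical.

```python
import random

def split_60_20_20(samples, seed=0):
    """Split samples into training, validation and test sets in a 60:20:20 ratio."""
    items = list(samples)
    random.Random(seed).shuffle(items)          # assumed: instances are shuffled first
    n_train = len(items) * 60 // 100
    n_val = len(items) * 20 // 100
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

def mean_squared_error(targets, outputs):
    """Mean squared error between target and network output values."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def should_stop(targets, outputs, threshold=0.05):
    """Stopping criterion: training stops once the MSE is at or below the threshold."""
    return mean_squared_error(targets, outputs) <= threshold
```

For time-ordered physiological data a contiguous (unshuffled) split may be preferable, which would only require removing the shuffle line.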
An activation function in a neural network calculates the layer’s output from its net input.
This output is fed into subsequent layers as input (Haykin, 2001). The activation function
used in the present application for the hidden layer is the tan-sigmoid (tansig) function, as
shown in Eqn. 5.13. For a very high value of x, the node sends maximum excitation, i.e. 1.
$$f(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{5.13}$$
where x = combined input to node
The activation function for the output layer was selected as the linear (purelin) function.
In the present 3-Class problem, the networks were trained using a feature vector formed by
employing a fixed 'window-size' of 10 seconds. The number of neurons in the hidden layer of
each network configuration was fixed at 15, based on an initial observation that the
networks generalized with minimum mean squared error (MSE) at this size.
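The layer arrangement described above (tansig hidden layer with 15 neurons, purelin output layer, three classes) can be sketched as a forward pass. The input size of 39 features, the random weights and the arg-max decision rule are assumptions made only for illustration. In practice the weights come from training.

```python
import numpy as np

def tansig(x):
    """Tan-sigmoid transfer function, numerically equal to tanh(x)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer network: tansig hidden layer followed by a purelin output layer."""
    hidden = tansig(w_hidden @ x + b_hidden)
    return w_out @ hidden + b_out               # purelin: output equals net input

rng = np.random.default_rng(0)
n_features, n_hidden, n_classes = 39, 15, 3     # 39 inputs is an assumed feature count
w_hidden = rng.normal(scale=0.1, size=(n_hidden, n_features))
w_out = rng.normal(scale=0.1, size=(n_classes, n_hidden))
x = rng.normal(size=n_features)                 # hypothetical feature vector

scores = mlp_forward(x, w_hidden, np.zeros(n_hidden), w_out, np.zeros(n_classes))
predicted_class = int(np.argmax(scores))        # assumed decision rule
```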
5.1.6. Results of 3-Class Affective State Classification
The performance of these neural network classifiers may be evaluated using commonly used
evaluation parameters such as precision, sensitivity, specificity and the receiver operating
characteristic (ROC) curves. These metrics are used to evaluate the desirability of a particular
classifier over another in a multi-parameter optimization problem. The following steps were
involved in obtaining the individual classifier's performance:
• An individual 3-Class confusion matrix for each of the drivers was extracted as an
output of the classification exercise for each of the classifiers.
• The classification parameters such as precision, sensitivity, specificity, gmean-1,
gmean-2 and f-measure were calculated using the macro-averaging
techniques (Sokolova and Lapalme, 2009) from the individual confusion matrices of the
drivers (total 19) for a particular network.
• The mean of these parameters over all the 19 drivers for each of the networks was
tabulated as the final performance measure indices shown in Table 7.
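The steps above can be sketched as follows for precision, sensitivity and specificity. The sketch assumes the orientation of Table 5.1 (rows are predicted classes, columns are actual classes) and macro-averaging as the unweighted mean of the per-class values. The function names and any matrices passed to them are hypothetical.

```python
def macro_metrics(cm):
    """Macro-averaged measures from a confusion matrix cm[predicted][actual]."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    sums = {"precision": 0.0, "sensitivity": 0.0, "specificity": 0.0}
    for i in range(k):
        tp = cm[i][i]
        fp = sum(cm[i]) - tp                         # predicted as class i, but not class i
        fn = sum(cm[r][i] for r in range(k)) - tp    # class i, predicted as another class
        tn = total - tp - fp - fn
        sums["precision"] += tp / (tp + fp)
        sums["sensitivity"] += tp / (tp + fn)
        sums["specificity"] += tn / (tn + fp)
    return {name: value / k for name, value in sums.items()}

def mean_over_drivers(per_driver):
    """Average each measure over the per-driver results."""
    names = per_driver[0].keys()
    return {n: sum(m[n] for m in per_driver) / len(per_driver) for n in names}
```

Each class is treated one-against-all, so every class contributes equally to the macro average regardless of how many instances it has.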
Table 5.7 shows the predictive abilities (precision) for all the 19 drivers for the selected
neural network classifiers.
Table 5.7: Classifier Performance Parameter: Precision
Driver Index SLPNN MLP1NN CASFBNN FFDTDNN ELMBNN LRNN NARXNN
D1 0.867435 0.897407 0.904667 0.90829 0.940881 0.928676 0.86708
D2 0.564292 0.889275 0.895079 0.874531 0.899933 0.889081 0.848838
D3 0.642089 0.881544 0.83906 0.881175 0.912553 0.886617 0.793743
D4 0.753142 0.863682 0.858773 0.845426 0.877934 0.861494 0.795292
D5 0.785277 0.901687 0.879435 0.839494 0.8658 0.89175 0.820616
D6 0.763044 0.862346 0.887887 0.814908 0.882685 0.852816 0.765476
D7 0.697654 0.859543 0.878336 0.88235 0.88703 0.897977 0.760028
D8 0.704313 0.81924 0.849182 0.813633 0.854389 0.852086 0.725918
D9 0.746935 0.833807 0.890762 0.867215 0.904032 0.874013 0.740868
D10 0.754515 0.832889 0.841482 0.815645 0.85633 0.888845 0.790103
D11 0.671847 0.873807 0.870161 0.818722 0.905885 0.886671 0.841709
D12 0.651894 0.852931 0.839443 0.810655 0.854395 0.902755 0.728849
D13 0.651763 0.871976 0.872229 0.859206 0.895309 0.903434 0.761433
D14 0.844627 0.932187 0.922323 0.938334 0.948943 0.963292 0.935921
D15 0.802281 0.852747 0.766895 0.69473 0.863338 0.862759 0.755518
D16 0.763837 0.915317 0.909324 0.921214 0.94177 0.940216 0.896884
D17 0.861157 0.955179 0.916677 0.963704 0.966009 0.946443 0.879104
D18 0.755027 0.887278 0.863022 0.835528 0.853716 0.906411 0.79178
D19 0.719744 0.852893 0.868735 0.822269 0.855065 0.817593 0.787236
Average 0.736888 0.875565 0.871235 0.853001 0.892947 0.892259 0.804547
It can be observed that the ELMBNN and the LRNN classifiers perform best, with mean
precision values of 89.29% and 89.23% respectively, and that the maximum individual predictive
ability for both ELMBNN and LRNN is above 96%. This indicates that the results obtained are
strong for a classification exercise.
Table 5.8: Classifier Performance Parameter: Sensitivity
Driver Index SLPNN MLP1NN CASFBNN FFDTDNN ELMBNN LRNN NARXNN
D1 0.854531 0.905806 0.913528 0.898972 0.939201 0.923757 0.837122
D2 0.401361 0.891858 0.893984 0.86375 0.898424 0.888547 0.842772
D3 0.620798 0.866429 0.844467 0.882015 0.917336 0.87491 0.78935
D4 0.723404 0.873236 0.850223 0.856004 0.883494 0.825788 0.79959
D5 0.782995 0.903443 0.871976 0.839636 0.871925 0.894527 0.823323
D6 0.725136 0.819749 0.862432 0.795969 0.870054 0.840904 0.715007
D7 0.670036 0.86297 0.886141 0.878255 0.871281 0.894964 0.762242
D8 0.7083 0.816349 0.850417 0.814979 0.845836 0.847759 0.727382
D9 0.740155 0.83461 0.891839 0.845329 0.907291 0.879961 0.719417
D10 0.766543 0.847089 0.851055 0.829556 0.869598 0.905675 0.770633
D11 0.66952 0.876142 0.875442 0.820364 0.904751 0.889779 0.847241
D12 0.606489 0.806962 0.841336 0.81329 0.852576 0.891319 0.683318
D13 0.625965 0.845638 0.868344 0.84695 0.895387 0.896485 0.758244
D14 0.824691 0.943061 0.921257 0.946497 0.951835 0.966055 0.925405
D15 0.810083 0.847639 0.708895 0.695229 0.868053 0.862499 0.762084
D16 0.765595 0.910304 0.912097 0.91991 0.934022 0.935578 0.885627
D17 0.857627 0.964438 0.924855 0.962244 0.965792 0.947213 0.883426
D18 0.738224 0.890945 0.862408 0.831424 0.84329 0.904721 0.79226
D19 0.707092 0.851888 0.861666 0.817633 0.852859 0.807512 0.780405
Average 0.715713 0.871503 0.868019 0.850421 0.891737 0.888313 0.794992
In Table 5.8, the sensitivity values for the drivers are shown. It can be seen that for both the
ELMBNN and LRNN the mean is close to 89%, with a maximum value of approximately
96.60% in both cases. This indicates that among the drivers who were
stressed, almost 89% have been correctly identified.
Table 5.9: Classifier Performance Parameter: Specificity
Driver Index SLPNN MLP1NN CASFBNN FFDTDNN ELMBNN LRNN NARXNN
D1 0.940768 0.959473 0.960757 0.956823 0.972979 0.968688 0.936999
D2 0.702616 0.949922 0.951093 0.940087 0.950471 0.948783 0.928335
D3 0.845861 0.939685 0.925523 0.948393 0.960594 0.945794 0.906473
D4 0.880835 0.939896 0.931809 0.931133 0.948082 0.925965 0.909404
D5 0.90785 0.956789 0.944862 0.932035 0.940173 0.953582 0.922984
D6 0.875021 0.921154 0.935666 0.907337 0.940702 0.927778 0.87581
D7 0.849973 0.935151 0.943639 0.94153 0.94237 0.950067 0.885142
D8 0.8657 0.914473 0.929671 0.914045 0.928887 0.927955 0.87336
D9 0.880541 0.9213 0.949187 0.927804 0.954865 0.940867 0.870012
D10 0.884447 0.922529 0.92591 0.91523 0.934426 0.949704 0.892249
D11 0.847548 0.944496 0.940187 0.914388 0.956191 0.947908 0.9301
D12 0.843865 0.916817 0.92489 0.920362 0.933384 0.951783 0.866854
D13 0.848388 0.933941 0.941777 0.931742 0.952803 0.953829 0.895668
D14 0.927864 0.975383 0.968233 0.973802 0.983719 0.983455 0.968776
D15 0.920906 0.934074 0.873083 0.865381 0.937951 0.939364 0.902427
D16 0.893609 0.959262 0.960968 0.964683 0.970383 0.970289 0.947477
D17 0.939191 0.980999 0.964497 0.985194 0.984944 0.975831 0.949053
D18 0.891065 0.950953 0.939183 0.928943 0.932536 0.958467 0.911446
D19 0.865697 0.92934 0.935847 0.915588 0.932527 0.913894 0.900545
Average 0.874302 0.941349 0.939304 0.932342 0.95042 0.949158 0.909111
Table 5.9 shows the specificity values of the drivers for the selected neural network classifier
configurations. It may be observed that, out of the total population of drivers who were
considered to be stress-free, only 95% were indeed so. This leaves the remaining 5% of
drivers about whom nothing can be explicitly concluded with confidence. The figure of 95%
was arrived at from the individual mean values of specificity for the ELMBNN and LRNN
classifiers.
Table 5.10: Classifier Performance Parameter: gmean-1
Driver Index SLPNN MLP1NN CASFBNN FFDTDNN ELMBNN LRNN NARXNN
D1 0.968803 0.984141 0.973559 0.96814 0.973387 0.984141 0.978779
D2 0.43073 0.954327 0.954882 0.956234 0.931104 0.965473 0.944831
D3 0.913794 0.94 0.922735 0.967267 0.962581 0.958107 0.926521
D4 0.907409 0.946862 0.932755 0.940579 0.96814 0.944759 0.932755
D5 0.97613 0.976537 0.971698 0.980951 0.947878 0.981132 0.966466
D6 0.86164 0.925213 0.918923 0.896421 0.931334 0.933444 0.895833
D7 0.835937 0.923089 0.921782 0.922278 0.953206 0.945566 0.871806
D8 0.912503 0.926762 0.943119 0.943119 0.951994 0.924442 0.919943
D9 0.923899 0.947291 0.974372 0.938144 0.958763 0.959234 0.914862
D10 0.886523 0.936457 0.916927 0.951876 0.934282 0.958238 0.895073
D11 0.838158 0.951365 0.939266 0.893668 0.963084 0.94119 0.940554
D12 0.887227 0.92566 0.913717 0.953206 0.93478 0.958373 0.887689
D13 0.943119 0.958967 0.958385 0.9439 0.958385 0.958967 0.947968
D14 0.946894 0.974474 0.974887 0.964217 0.994937 0.979798 0.970049
D15 0.931069 0.936784 0.876738 0.862113 0.921879 0.936179 0.923762
D16 0.906447 0.952394 0.979002 0.968085 0.973559 0.968475 0.941812
D17 0.962976 0.973329 0.973559 1.000000 0.989418 0.973559 0.973387
D18 0.948122 0.963624 0.948504 0.959184 0.953483 0.96896 0.958814
D19 0.887022 0.933935 0.962289 0.936894 0.957045 0.968085 0.951476
Average 0.887811 0.949011 0.94511 0.944541 0.955749 0.958322 0.93381
In Table 5.10, the gmean-1 values have been shown. It can be observed that the LRNN
performs best among the choices considered. This indicates that the sensitivity and precision
values are well balanced for this classifier, the LRNN being the best choice and the ELMBNN
the next best classifier.
Table 5.11: Classifier Performance Parameter: gmean-2
Driver Index SLPNN MLP1NN CASFBNN FFDTDNN ELMBNN LRNN NARXNN
D1 0.973063 0.985354 0.975984 0.970666 0.974702 0.985354 0.980042
D2 0.448649 0.962463 0.964132 0.967244 0.945244 0.974228 0.955613
D3 0.928057 0.9516 0.93006 0.968998 0.968486 0.966092 0.938471
D4 0.922409 0.953459 0.944049 0.944379 0.971444 0.955868 0.944049
D5 0.97613 0.980762 0.976035 0.980951 0.955283 0.984026 0.968107
D6 0.88463 0.933059 0.935806 0.915679 0.94042 0.945787 0.912549
D7 0.863318 0.935059 0.93124 0.938268 0.957001 0.95722 0.892221
D8 0.935577 0.947295 0.95651 0.95651 0.958554 0.943584 0.940325
D9 0.946722 0.952053 0.980599 0.950805 0.967223 0.969385 0.935736
D10 0.917542 0.956585 0.938478 0.964465 0.951472 0.97055 0.925801
D11 0.861773 0.958227 0.943316 0.91084 0.970698 0.950885 0.949124
D12 0.90703 0.939722 0.927884 0.956909 0.946709 0.960272 0.907616
D13 0.952814 0.967013 0.96555 0.954256 0.96555 0.967013 0.956558
D14 0.939052 0.974132 0.97321 0.963533 0.994937 0.978787 0.967601
D15 0.929981 0.9337 0.852347 0.854197 0.92133 0.934029 0.920703
D16 0.920114 0.959342 0.983033 0.972405 0.977733 0.973846 0.950143
D17 0.967274 0.973329 0.976426 1.000000 0.989418 0.976426 0.97492
D18 0.952857 0.963624 0.954442 0.964761 0.958078 0.970541 0.963271
D19 0.911988 0.951432 0.966702 0.950669 0.963517 0.974282 0.958088
Average 0.902052 0.956748 0.951358 0.95187 0.961989 0.965167 0.94426
In Table 5.11, the gmean-2 values have been shown. It can be observed that the LRNN again
performs best among the choices considered. This indicates that the precision and specificity
values are well balanced for this classifier, the LRNN being the best choice and the ELMBNN
the next best classifier.
Table 5.12: Classifier Performance Parameter: f-measure
Driver Index SLPNN MLP1NN CASFBNN FFDTDNN ELMBNN LRNN NARXNN
D1 0.96875 0.984127 0.973545 0.968085 0.973262 0.984127 0.978723
D2 0.333333 0.954315 0.954774 0.955665 0.930693 0.965174 0.944724
D3 0.912821 0.938776 0.922222 0.967033 0.962567 0.957895 0.926316
D4 0.907216 0.946809 0.932642 0.939891 0.968085 0.944162 0.932642
D5 0.975845 0.976526 0.971698 0.980769 0.947867 0.981132 0.966184
D6 0.861538 0.924731 0.917073 0.895522 0.931217 0.933333 0.895833
D7 0.834951 0.923077 0.921466 0.921569 0.95288 0.945274 0.871795
D8 0.910891 0.925373 0.943005 0.943005 0.951872 0.923858 0.919192
D9 0.92233 0.946809 0.974359 0.938144 0.958763 0.959184 0.914573
D10 0.885057 0.935673 0.916667 0.951807 0.934132 0.958084 0.892655
D11 0.835821 0.951351 0.938547 0.893617 0.962963 0.941176 0.940541
D12 0.8867 0.925373 0.913706 0.95288 0.934673 0.957895 0.885714
D13 0.943005 0.958763 0.958333 0.94359 0.958333 0.958763 0.947917
D14 0.946341 0.974359 0.974874 0.964103 0.994924 0.979798 0.97
D15 0.931034 0.936709 0.874016 0.861925 0.921739 0.93617 0.923729
D16 0.90625 0.952381 0.978947 0.968085 0.973545 0.968421 0.941799
D17 0.962963 0.972973 0.973545 1.000000 0.989362 0.973545 0.973262
D18 0.947917 0.962963 0.948454 0.959184 0.953368 0.96875 0.958763
D19 0.886598 0.933333 0.962162 0.936842 0.956989 0.968085 0.951351
Average 0.882072 0.948654 0.944739 0.944301 0.955644 0.958149 0.933459
In Table 5.12, the f-measure values have been shown. Here, the f-measure refers to the harmonic
mean of the precision and sensitivity of the classifiers, so a high value indicates that both
parameters are high. It can be observed that the LRNN performs best among the choices
considered, whereas the ELMBNN is again the next best classifier.
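As a minimal illustrative sketch (not the code used in this work), the following Python
snippet computes a harmonic-mean f-measure and a geometric-mean score from one precision and
sensitivity pair. The function names and the sample values are hypothetical and are chosen
only to show that the harmonic mean never exceeds the geometric mean.

```python
import math

def f_measure(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (0 if both are 0)."""
    total = precision + sensitivity
    return 0.0 if total == 0 else 2 * precision * sensitivity / total

def geometric_mean(a, b):
    """Geometric mean of two rates, as used by gmean-style measures."""
    return math.sqrt(a * b)

# Hypothetical values, for illustration only.
p, s = 0.9, 0.6
f = f_measure(p, s)        # harmonic mean
g = geometric_mean(p, s)   # geometric mean
print(f, g)
```

Both measures penalize an imbalance between the two rates, the harmonic mean more strongly.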
In Table 5.13, the mean values of all the evaluation parameters have been shown for the
chosen neural network classifiers. In terms of the three cardinal performance measures, namely
precision, sensitivity and specificity, it may be seen that the ELMBNN performs better than the
LRNN classifier. However, a closer look reveals that the difference in performance is very
small. Therefore, the other performance measures, namely the f-measure, gmean-1 and gmean-2
values, which are considered balanced performance measures, also need to be examined. These
values are also shown in Table 5.13. It may be observed that for these three balanced
measures the LRNN performs better than the ELMBNN classifier.
Table 5.13: Comparative Results of Neural Network Classifier Evaluation
S. No. Networks Evaluated Precision Sensitivity Specificity gmean-1 gmean-2 f-measure
1. SLPNN 0.73689 0.71571 0.87430 0.88781 0.90205 0.88207
2. MLP1NN 0.87556 0.87150 0.94135 0.94901 0.95675 0.94865
3. CASFBNN 0.87124 0.86802 0.93930 0.94511 0.95136 0.94474
4. FFDTDNN 0.85300 0.85042 0.93234 0.94454 0.95187 0.94430
5. ELMBNN 0.89295 0.89174 0.95042 0.95575 0.96199 0.95564
6. LRNN 0.89226 0.88831 0.94916 0.95832 0.96517 0.95815
7. NARXNN 0.80455 0.79499 0.90911 0.93381 0.94426 0.93346
The ambiguities present in the classifier performance measures depicted in Table 5.13
necessitate the identification of an ideal classifier. This classifier should not only have a
higher average performance over the population considered but should also behave
consistently, i.e. the dispersion in each performance parameter should be minimal.
To satisfy this need, a normalized, scale-invariant performance measure has been defined as
the Standardized First-Order Moment, which is the ratio of the mean of an individual
performance measure to its standard deviation. These values were further used
to formulate a desirability measure for the classifier's performance as defined in Eq. 5.14. Table
5.14 shows the standardized first-order moments of the individual parameters and the overall
desirability of the classifiers. It may be noticed that the LRNN has the highest desirability
amongst all the classifiers considered and hence could be the preferred classifier.
(5.14)
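The body of Eq. 5.14 is not legible in this transcript. As a hedged numerical check, the
sketch below takes three rows of standardized first-order moments from Table 5.14 and
compares their geometric mean with the tabulated desirability. The agreement suggests, but
does not prove, that the desirability measure is the geometric mean of the six standardized
moments; the helper name geometric_mean is hypothetical.

```python
import math

# Standardized first-order moments (mean / standard deviation) copied from
# Table 5.14: precision, sensitivity, specificity, gmean-1, gmean-2, f-measure.
moments = {
    "SLPNN": [9.279947, 6.73612, 16.76195, 7.531016, 7.878069, 6.343945],
    "ELMBNN": [24.8986, 24.33995, 54.64387, 48.78659, 55.86717, 48.65318],
    "LRNN": [24.93126, 22.13845, 53.25593, 56.70211, 68.39605, 56.36939],
}
# Desirability values copied from the last column of Table 5.14.
tabulated = {"SLPNN": 8.563516, "ELMBNN": 40.46697, "LRNN": 43.11769}

def geometric_mean(values):
    """Geometric mean computed through logarithms for numerical stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

computed = {name: geometric_mean(row) for name, row in moments.items()}
for name in moments:
    print(name, round(computed[name], 4), tabulated[name])
```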
Table 5.14: Classifier Performance Evaluation based on Unified Desirability Measure
(Columns Precision to f-measure are Standardized 1st Order Moment Measures)
S. No. Networks Evaluated Precision Sensitivity Specificity gmean-1 gmean-2 f-measure Desirability Measure
1. SLPNN 9.279947 6.73612 16.76195 7.531016 7.878069 6.343945 8.563516
2. MLP1NN 25.32339 20.72087 49.24547 49.97429 62.66337 49.51076 39.85285
3. CASFBNN 24.30765 18.57107 45.0339 33.68312 31.37097 33.05949 29.86937
4. FFDTDNN 14.36038 14.31623 34.5907 28.91506 30.30583 28.82392 23.75342
5. ELMBNN 24.8986 24.33995 54.64387 48.78659 55.86717 48.65318 40.46697
6. LRNN 24.93126 22.13845 53.25593 56.70211 68.39605 56.36939 43.11769
7. NARXNN 13.54908 12.38576 31.20354 30.42353 39.4763 30.08845 23.96057
5.1.6.1 Training and Learning Function Evaluation
A neural network iteratively adapts its synaptic weights to enhance the
classification performance while performing supervised learning based classification. With the
objective of minimizing the mean squared classification error of the network, these iterative
changes are governed by learning rules which search for optimized
synaptic weights (Møller, 1993). There is a multitude of learning functions, which influence
the individual weights and biases of a network locally, in addition to training functions, which
influence a network globally. In the initial training phase, several training and learning
functions compatible with the layer recurrent neural network architecture were selected for the
present application. The results observed during this analysis are provided in Table 5.15.
It can be noticed from Table 5.15 that the Levenberg-Marquardt (LM) backpropagation
algorithm for a layer recurrent network was more suitable than the other alternatives, Scaled
Conjugate Gradient being the second choice, when both the correct classification rate and the
MSE observed were compared. This algorithm is a blend of the gradient descent and
Gauss-Newton iteration methods (Marquardt, 1963). It is more robust than the Gauss-Newton
method when minimizing non-linear functions from poorly initialized parameters. When far
from the optimum, the LM optimizer acts more like a gradient-descent method, and when close
to the optimum it acts like the Gauss-Newton method.
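The blending behaviour described above can be illustrated with a minimal sketch, assuming a
toy two-parameter curve fit rather than a neural network. The model y = a*exp(b*x), the data,
the damping schedule (factor of 10) and all names are hypothetical; the damping term lam*I
follows Levenberg's original form. A large lam gives short gradient-descent-like steps, and a
small lam gives Gauss-Newton-like steps.

```python
import math

def model(a, b, x):
    return a * math.exp(b * x)

def sum_sq_error(a, b, xs, ys):
    return sum((y - model(a, b, x)) ** 2 for x, y in zip(xs, ys))

def lm_fit(xs, ys, a, b, lam=1e-2, iters=100):
    """Fit y = a*exp(b*x) with a basic Levenberg-Marquardt loop."""
    err = sum_sq_error(a, b, xs, ys)
    for _ in range(iters):
        # Accumulate J^T J (2x2) and J^T r (2x1) at the current parameters.
        h11 = h12 = h22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            ja = e             # d(model)/da
            jb = a * x * e     # d(model)/db
            r = y - a * e      # residual
            h11 += ja * ja
            h12 += ja * jb
            h22 += jb * jb
            g1 += ja * r
            g2 += jb * r
        # Damped normal equations: (J^T J + lam * I) * delta = J^T r,
        # solved for the 2x2 case with Cramer's rule.
        d11, d22 = h11 + lam, h22 + lam
        det = d11 * d22 - h12 * h12
        da = (g1 * d22 - h12 * g2) / det
        db = (d11 * g2 - h12 * g1) / det
        new_err = sum_sq_error(a + da, b + db, xs, ys)
        if new_err < err:
            a, b, err = a + da, b + db, new_err
            lam /= 10.0    # step accepted: move towards Gauss-Newton
        else:
            lam *= 10.0    # step rejected: move towards gradient descent
        if err < 1e-20:
            break
    return a, b

# Noise-free synthetic data generated with a = 2.0 and b = -0.5.
xs = [0.5 * i for i in range(9)]
ys = [2.0 * math.exp(-0.5 * x) for x in xs]
a_fit, b_fit = lm_fit(xs, ys, a=1.0, b=0.0)
print(a_fit, b_fit)
```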
Table 5.15: Evaluation of the Performance of the Neural Network Learning and
Training Algorithms
S. No. TA GDM: CR (%) GDM: MSE GD: CR (%) GD: MSE
1. BFG 88.90 0.0560 88.90 0.0585
2. GDA 87.00 0.0778 83.30 0.0752
3. LM 90.30 0.0504 91.20 0.0473
4. RO 92.10 0.212 88.90 0.206
5. RP 86.60 0.0771 88.00 0.0560
6. SCG 90.30 0.0572 90.30 0.0573
Legend:
BFG: BFGS quasi-Newton backpropagation; GDA: Gradient descent with adaptive learning rate
backpropagation; LM: Levenberg-Marquardt backpropagation; RO: Random order incremental training with
learning functions; RP: Resilient backpropagation algorithm (Rprop); SCG: Scaled Conjugate Gradient;
GDM: Gradient descent with momentum weight and bias learning function; GD: Gradient descent weight and
bias learning function; TA: Training Algorithm; MSE: Mean Squared Normalized Error Performance Function;
CR: Classification Rate
5.1.6.2 Identification of an Optimum Classifier for a 3-Class Affective State
Detection
It was found that the optimum classifier based on the average performance of the
calculated metrics was the LRNN classifier, which also exhibited consistent behavior as discussed in
Section 5.1.6.1. However, to understand the dispersion across the selected driver population,
the mean values of the three performance measures have been studied further and are provided
in Table 5.16 for all the 19 drivers. The predictive ability (or precision), sensitivity and
specificity values were found to be 89.23%, 88.83% and 94.92% respectively. The
corresponding standard deviations were 3.58%, 4.01% and 1.78% respectively. Such low
standard deviations suggest that the present classification result may be applicable to a larger
population of drivers. This also confirms that the LRNN classifier can be used to obtain
consistent classification performance for the selected population.
Another important method to evaluate a classifier's performance is a graphical analysis
method that involves the Receiver Operating Characteristics (ROC) curve. Such curves have
been used extensively by the machine learning and data mining professionals. This curve is a
plot between the true positive rate (sensitivity) and the false positive rate (1 - specificity) of a
classifier (Fawcett, 2006). The area under the ROC curve (AUC) is an extension to this
method which is used to analyze the accuracy of a particular classification model. The AUC
value is computed to obtain a numerical value and the test performance is rated as: excellent
(AUC > 0.9), good (0.8 < AUC < 0.9), fair (0.6 < AUC < 0.8) and failed (below 0.6). Figure
5.13 depicts the ROC plots of an LRNN classifier, for the present 3-Class classification
problem, for a randomly selected driver. The three stress-classes have been indicated as
Relaxed, Moderate and Stressed. It may be noticed from the ROC plot that the Relaxed and
Stressed classes exhibit an excellent separation between the classes while a fair class
representation is visible for the Moderate stress-class.
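A minimal sketch of the AUC computation and the rating scale quoted above is given below,
assuming a one-versus-rest treatment of a single stress-class. The function names, the scores
and the labels are hypothetical, and the handling of values exactly at 0.9, 0.8 and 0.6 is an
assumption because the quoted scale leaves those boundaries open.

```python
def roc_auc(scores, labels):
    """Trapezoidal area under the ROC curve for binary labels (1 = positive)."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Sweep the decision threshold from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    i = 0
    while i < len(ranked):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        tpr, fpr = tp / pos, fp / neg
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return area

def rate_auc(auc):
    """Map an AUC value to the rating scale quoted in the text."""
    if auc > 0.9:
        return "excellent"
    if auc > 0.8:
        return "good"
    if auc >= 0.6:   # boundary handling is an assumption
        return "fair"
    return "failed"

# Hypothetical classifier scores for one class against the rest.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(auc, rate_auc(auc))
```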
Table 5.16: Affective State Detection using Layer Recurrent Network
Driver Index Precision Sensitivity Specificity
D1 92.87% 92.38% 96.87%
D2 88.91% 88.85% 94.88%
D3 88.66% 87.49% 94.58%
D4 86.15% 82.58% 92.60%
D5 89.17% 89.45% 95.36%
D6 85.28% 84.09% 92.78%
D7 89.80% 89.50% 95.01%
D8 85.21% 84.78% 92.80%
D9 87.40% 88.00% 94.09%
D10 88.88% 90.57% 94.97%
D11 88.67% 88.98% 94.79%
D12 90.28% 89.13% 95.18%
D13 90.34% 89.65% 95.38%
D14 96.33% 96.61% 98.35%
D15 86.28% 86.25% 93.94%
D16 94.02% 93.56% 97.03%
D17 94.64% 94.72% 97.58%
D18 90.64% 90.47% 95.85%
D19 81.76% 80.75% 91.39%
Average 89.23% 88.83% 94.92%
Fig. 5.13. ROC Curves for Affective State Detection using Layer Recurrent Neural Networks
In Table 5.17, different metrics extracted after an ROC analysis have been tabulated for
each of the three target affective state classes. The analysis shows that the Relaxed and
Stressed classes perform better.
Table 5.17: ROC Analysis of the Drivers Affective State
Metrics Relaxed Moderate Stressed
Area under ROC 0.99232 0.87705 0.93928
Standard Error 0.00782 0.02999 0.02163
Standardized AUC 62.9197 12.5736 20.3126
Discriminant Threshold 0.2927 0.4787 0.3841
95% Confidence Interval (Lower) 0.97698 0.81827 0.89689
95% Confidence Interval (Upper) 1.00000 0.93582 0.98167
Comments Excellent Good Excellent
The results obtained for the three cardinal performance parameters, namely precision,
sensitivity and specificity, were used to plot the box-plots shown in Fig. 5.14(a), 5.14(b)
and 5.14(c) respectively for all the classifiers. The box-plot is a statistical tool that
represents the dispersion (spread) and skewness of numerical data through their quartiles. It
can be observed from the plots that even though the upper quartiles of all the classifiers
other than ELMBNN are close to or below that of the LRNN, the median of the LRNN
classifier for all three evaluation parameters, i.e. precision, sensitivity and specificity, is
higher than the medians of the other classifiers. In a boxplot such as Fig. 5.14, the
difference between the upper quartile and the median (i.e. upper quartile - median) is
referred to here as the median deviation. On closer analysis, it may be observed that the
median deviation of the LRNN classifier is the smallest among all the classifiers considered.
Thus, its performance can be regarded as consistent, as very few higher values deviate
from the median. This confirms that the LRNN classifier performs better than the other selected
classifiers with respect to different classifier performance metrics and supports the analysis
performed earlier.
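The median deviation defined above can be computed as in the following sketch, which uses the
LRNN precision values of Table 5.16. The quartile convention (the default "exclusive" method
of Python's statistics.quantiles) is an assumption and may differ slightly from the one used
to draw Fig. 5.14.

```python
import statistics

# LRNN precision (%) for drivers D1 to D19, copied from Table 5.16.
precision = [92.87, 88.91, 88.66, 86.15, 89.17, 85.28, 89.80, 85.21, 87.40,
             88.88, 88.67, 90.28, 90.34, 96.33, 86.28, 94.02, 94.64, 90.64,
             81.76]

# Quartile cut points: lower quartile, median, upper quartile.
q1, q2, q3 = statistics.quantiles(precision, n=4)

# "Median deviation" as defined in the text: upper quartile minus median.
median_deviation = q3 - q2
print(q2, q3, median_deviation)
```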
Fig. 5.14: Boxplots of Neural Network Classifiers Performance: (a) Precision (b) Sensitivity
(c) Specificity
The architecture of the proposed LRNN with the above selected training and
learning functions is shown in Fig. 5.15.
Fig. 5.15. Layer Recurrent Neural Network Architecture for Affective State Detection
5.1.7. Methodology adopted for 4-Class Affective State Classification
The classification of the annotated data into pre-defined affective states, i.e. Level-1 to
Level-4, is achieved using six different NN classifiers. Four of these classifiers, namely the
MLP1NN, CASFBNN, ELMBNN and LRNN, are the same as those discussed in Section 5.1.5 for the
3-Class problem. Two configurations of the feed forward time-delay neural network (TDNN)
were additionally included. TDNNs are similar to MLPs, but the inputs to a node also
contain some previous time steps, realized using tapped-delay lines, besides the immediate
outputs of previous nodes. These networks can learn precise weight patterns from imprecisely
prepared training data (Lang and Waibel, 1990) and train faster because the tapped delay
line appears only at the input without any feedback loops. These networks were chosen
because they are suitable for time series data prediction, the time series in this case being the
selected features representing the present affective state of the driver. We evaluated two separate
configurations of this network, FFTD-D1NN and FFTD-D2NN, with two separate time delays
d1 and d2. An FFTDNN network has been shown in Fig. 5.16.
Fig. 5.16. Feed Forward Time Delay Neural Network Model
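The tapped-delay line at the input of a TDNN can be sketched as follows, assuming a single
feature sampled at uniform time steps. The function name, the delay value and the sample
values are hypothetical; the sketch only shows how each input vector combines the current
sample with its delayed copies.

```python
def tapped_delay_inputs(series, delay):
    """Return input vectors [x(t), x(t-1), ..., x(t-delay)] for each valid t."""
    return [
        [series[t - k] for k in range(delay + 1)]
        for t in range(delay, len(series))
    ]

# Hypothetical feature values at consecutive time steps.
features = [0.1, 0.2, 0.3, 0.4, 0.5]
windows = tapped_delay_inputs(features, delay=2)
print(windows)
```

Because the delayed copies are taken only from the input sequence, no feedback path is needed.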
The feature vector was divided in the ratio of 60:20:20 for training, validation and testing
respectively. The training algorithm, learning algorithm and stopping
criterion selected were the same as those of the 3-Class problem discussed in Section 5.1.5. The
output of this 4-Class classification problem was a set of confusion matrices of size
4x4. These confusion matrices were used to calculate the classifier performance measures
listed in Table 5.2.
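A 60:20:20 division can be sketched as below. The text does not state how samples were
assigned to the three subsets, so the random shuffle, the seed and the function name are
assumptions made only for illustration.

```python
import random

def split_60_20_20(samples, seed=0):
    """Shuffle the samples and split them 60:20:20 (train, validation, test)."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = (len(items) * 6) // 10
    n_val = (len(items) * 2) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# Hypothetical sample indices standing in for feature vectors.
train, val, test = split_60_20_20(range(100))
print(len(train), len(val), len(test))
```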
In the 3-Class classification problem the networks were trained with a fixed window size
and a fixed number of neurons in the hidden layers. However, better performance
can be achieved by varying these parameters. A proper window size may be selected by
varying the sliding window while training the networks, which in turn optimizes the
performance of neural network classifiers (Frank et al., 2001). Similarly, a group of networks
can be trained, each with a different fixed number of neurons, in what is known as the "fixed"
approach, which is suitable for offline computation despite the longer training time
(Kaastra and Boyd, 1996). The final decision is made in favour of the network with the lowest
error. Therefore, in the present 4-Class problem, a variable window size and a variable
number of neurons have been considered. However, to avoid overfitting and to keep the
comparison of several classifier combinations simple, a single hidden layer has been
considered.
The input feature vectors, extracted as described in the feature extraction and selection
algorithms in Sections 3.6 and 3.8, were arranged in a concatenated matrix along with the target
vector representing the stress-classes and were fed to the selected classifier configurations. The
expected output of the classification process is a confusion matrix of size 4x4, which can
be further processed to identify the interrelationships between the extracted biosignal features
and the driver's stress-classes representing the affective state. To analyze the results of this
multiclass problem, the following challenging tasks must be addressed:
- selecting a proper window size
- selecting the number of neurons in the hidden layer, and
- identifying an optimum classifier which can recognize the given classes from the data
which is dependent on several parameters.
In order to address these requirements the following steps have been implemented for
training each of the six selected neural network configurations:
(a) varying the non-overlapping window sizes from 5 seconds to 30 seconds
(b) varying the number of neurons from 5 to 30, and
(c) training and extracting evaluation metrics for each combination.
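Steps (a) to (c) amount to a search over a grid of configurations, which can be sketched as
follows. The step of 5 for the neuron counts, the sampling rate and the helper names are
assumptions; the window sizes follow the values later listed in Table 5.18, and the training
itself is left out.

```python
from itertools import product

def non_overlapping_windows(signal, fs_hz, window_s):
    """Split a sampled signal into consecutive windows of window_s seconds."""
    size = int(fs_hz * window_s)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

window_sizes_s = range(5, 31, 5)   # 5, 10, ..., 30 seconds
neuron_counts = range(5, 31, 5)    # assumed step of 5: 5, 10, ..., 30 neurons
classifiers = ["MLP1NN", "CASFBNN", "ELMBNN", "LRNN", "FFTD-D1NN", "FFTD-D2NN"]

# Every (classifier, window size, neuron count) combination to be trained.
grid = list(product(classifiers, window_sizes_s, neuron_counts))
print(len(grid))

# Hypothetical 1 Hz signal of 12 samples cut into 5-second windows.
example = non_overlapping_windows(list(range(12)), fs_hz=1, window_s=5)
```

Each entry of grid would then be trained and scored, and the resulting metrics stored per
combination for the desirability analysis.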
It was also envisaged that the expected solution must be uniformly applicable to a larger
driver population. Therefore, a two-fold analysis of the affective state was considered,
namely a single-turn analysis and a multi-turn analysis. Although data from 19 drivers was
considered for the 3-Class classification problem, data from only 14 drivers was considered
for the single-turn analysis of the present 4-Class problem. This was due
to misinterpretation by the experimenter while assigning the affective state annotations,
which were originally collected assuming three classes. In addition, the data of some
drivers was lost due to machine errors, and some was deliberately skipped so that the
analysis could be correlated with the stress-trend analysis phase, for which some of the
stress-trend data was also missing.
5.1.7.1 Methodology for Single-Turn Affective State Analysis
In the single-turn analysis, the feature vector was formed with the features extracted from
the real-time data of the drivers who participated once in the data collection experiment. The
expected outcome of this analysis is the identification of an optimal neural network
which exhibits consistent performance with minimum inter-observer variability. Once
identified, the selected network can serve as a benchmark classifier applicable across a
varied population, if ported to the system's inference engine.
The single-turn analysis involved the following steps:
- the average values of the classifier performance measures such as precision,
sensitivity and specificity were extracted for each window size (5 to 30) and for six
different numbers of neurons (5 to 30), for each of the classifiers.
- using the formula shown in Eqn. 5.15, the individual desirability (Derringer and
Suich, 1980) measure was calculated from the averages of each classifier
performance measure.
- the individual desirability was used to obtain an optimum configuration of the
window size, the number of neurons in the hidden layer and the classifier.
- using the formula shown in Eqn. 5.16, an overall desirability was obtained for each
window size from the individual desirabilities obtained earlier, which can
indicate the optimal window size.
The maximum of the individual overall desirabilities is tabulated in Table 5.18.
(5.15)
(5.16)
where
r = user defined value (r = 1, desirability increases linearly)
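The bodies of Eqn. 5.15 and Eqn. 5.16 are not legible in this transcript. As a hedged
illustration, the sketch below implements the standard larger-is-better desirability of
Derringer and Suich (1980) and the geometric-mean overall desirability. The lower and upper
bounds (low, high) and the function names are hypothetical, and the exact form used in this
work may differ.

```python
import math

def individual_desirability(y, low, high, r=1.0):
    """Larger-is-better desirability in [0, 1]; r = 1 gives a linear increase."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** r

def overall_desirability(ds):
    """Geometric mean of the individual desirabilities."""
    if any(d == 0.0 for d in ds):
        return 0.0   # one fully undesirable response makes the whole undesirable
    return math.exp(sum(math.log(d) for d in ds) / len(ds))

# Hypothetical mean precision, sensitivity and specificity of one configuration,
# scaled between assumed bounds of 0.5 (unacceptable) and 1.0 (ideal).
ds = [individual_desirability(v, low=0.5, high=1.0) for v in (0.78, 0.78, 0.94)]
overall = overall_desirability(ds)
print(ds, overall)
```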
5.1.7.2 Results: Single-Turn Affective State Analysis
The classification exercise, involving the training of the selected NN configurations with the
selected input feature vector and the 4-Class target, resulted in confusion matrices of size 4x4.
Since the training was performed using different window sizes and different numbers of neurons,
the corresponding confusion matrices were used to compute the classifier performance measures
such as precision, sensitivity, specificity, area under the ROC curve (AUC), the kappa
statistic and the classification accuracy. The next step was to identify the optimum window
size for each of the configurations selected. The three cardinal performance measures, namely
precision, sensitivity and specificity, were first used to obtain the individual desirabilities for
maximizing the response. Finally, the overall desirability for a particular window size was
calculated using the formula in Eqn. 5.16.
Figure 5.17 shows the plot between the number of neurons and the individual
desirabilities obtained for all the six neural network configurations for each of the window
sizes (5 to 30) trained. A visual inspection of the plots indicates that the CASFBNN classifier
performs optimally, with MLP1NN being the second choice and FFTD-D1NN the third
choice. The optimal number of neurons in the hidden layer was found to be 25,
followed by 30, when the individual desirabilities were compared across the window sizes.
Further, to select a suitable configuration of window size, number of
neurons and classifier, the overall desirabilities along with the three cardinal performance
measures could be compared. This analysis resulted in a total of six possible choices, one for
each of the window sizes, as shown in Table 5.18, based on the overall desirability.
Fig. 5.17. Single-Turn Analysis: Window Size Selection
All the six classification performance measures were tabulated in Table 5.19 for all the 14
drivers to further analyze the individual performance of the six possible configurations
presented in Table 5.18. When the average values of each performance measure are compared
across the configurations in Table 5.19, it is evident that two possible
configuration choices exist: (a) CASFBNN (window size, WS = 25 and no. of neurons,
N = 25), and (b) MLP1NN (WS = 30 and N = 25). This observation is also consistent
with the overall desirability and the three cardinal performance measures of precision,
sensitivity and specificity shown in Table 5.18.
Table 5.18. Optimum Window-Size Selection for Single Turn Drives
Window Size (Seconds); Overall Desirability; No. of Neurons in Hidden Layer; Optimum Classifier;
Mean Precision; Mean Sensitivity; Mean Specificity
5 0.99957 30 CASFBNN 71.20% 69.80% 91.35%
10 0.96078 30 FFTD-D1 72.58% 70.69% 91.90%
15 0.98911 25 CASFBNN 72.94% 72.59% 92.10%
20 0.98248 30 MLP1NN 75.85% 72.83% 92.10%
25 0.98455 25 CASFBNN 77.94% 78.20% 93.73%
30 0.97329 25 MLP1NN 76.01% 75.38% 92.96%
Table 5.19. Classifier Performance Measure for the 4-Class Classifier
Drivers; columns 1-6: Precision; columns 7-12: Sensitivity. Within each group the configurations
are, in order: CAS (W = 5, N = 30), FFTDD1 (W = 10, N = 30), CAS (W = 15, N = 25),
CAS (W = 20, N = 30), CAS (W = 25, N = 25), MLP1 (W = 30, N = 25)
D1 79.90% 77.93% 77.30% 83.11% 81.05% 86.11% 81.02% 76.41% 77.70% 83.11% 80.41% 86.78%
D2 72.21% 74.95% 75.61% 77.39% 80.21% 87.63% 52.93% 70.89% 75.43% 76.89% 82.06% 88.39%
D3 79.03% 80.14% 74.11% 78.45% 83.84% 67.16% 70.05% 82.03% 75.94% 77.93% 80.47% 65.78%
D4 53.98% 63.04% 64.97% 84.36% 75.48% 69.34% 61.79% 55.13% 64.37% 83.36% 77.72% 68.62%
D5 75.69% 82.77% 71.30% 77.10% 81.67% 74.42% 61.81% 81.93% 69.37% 71.63% 83.98% 76.42%
D6 66.40% 63.91% 76.03% 67.16% 78.94% 73.15% 66.99% 62.66% 75.96% 67.76% 80.00% 70.03%
D7 72.92% 58.40% 77.26% 77.37% 71.48% 72.10% 29.57% 56.47% 77.61% 77.61% 72.25% 73.69%
D8 67.32% 78.89% 77.12% 77.13% 82.71% 76.85% 60.89% 79.21% 75.05% 75.40% 81.75% 73.96%
D9 69.07% 74.08% 66.78% 72.08% 75.93% 77.58% 34.09% 73.17% 66.69% 71.10% 76.72% 76.94%
D10 74.24% 72.05% 67.11% 63.39% 75.49% 74.53% 60.75% 68.85% 67.27% 55.45% 76.12% 75.06%
D11 55.45% 72.46% 65.49% 77.83% 72.41% 76.35% 64.84% 68.32% 63.91% 50.77% 72.69% 76.81%
D12 75.71% 79.57% 74.60% 79.18% 78.12% 76.51% 57.28% 77.16% 74.71% 78.53% 77.71% 74.92%
D13 77.74% 68.67% 77.74% 73.87% 71.04% 73.48% 63.64% 67.51% 77.36% 73.36% 71.12% 70.74%
D14 70.52% 73.20% 70.99% 75.43% 82.79% 78.94% 59.57% 73.12% 70.10% 76.22% 81.77% 77.14%
Average 70.73% 72.86% 72.60% 75.99% 77.94% 76.01% 58.95% 70.92% 72.25% 72.79% 78.20% 75.38%
Table 5.19 (Continued). Classifier Performance Measure for the 4-Class Classifier
Drivers; columns 1-6: Specificity; columns 7-12: AUC. Within each group the configurations
are, in order: CAS (W = 5, N = 30), FFTDD1 (W = 10, N = 30), CAS (W = 15, N = 25),
CAS (W = 20, N = 30), CAS (W = 25, N = 25), MLP1 (W = 30, N = 25)
D1 95.29% 94.50% 94.43% 96.21% 95.69% 96.50% 87.95% 88.77% 89.46% 92.50% 95.83% 90.00%
D2 87.23% 92.14% 91.81% 92.87% 93.94% 96.30% 68.77% 74.78% 50.00% 63.33% 60.48% 81.25%
D3 92.06% 94.66% 93.13% 93.76% 95.04% 92.26% 77.74% 82.17% 74.83% 70.48% 81.35% 93.55%
D4 89.98% 87.90% 90.48% 95.27% 93.53% 90.00% 50.00% 50.00% 56.19% 82.09% 81.13% 50.00%
D5 89.27% 94.33% 90.46% 91.50% 94.65% 92.86% 58.44% 67.11% 50.00% 50.00% 69.92% 63.86%
D6 91.03% 89.35% 93.41% 90.38% 94.01% 91.02% 53.56% 50.00% 79.44% 52.22% 72.50% 50.00%
D7 76.77% 87.22% 93.11% 93.89% 91.93% 91.26% 67.26% 50.00% 72.01% 88.04% 74.53% 63.97%
D8 89.16% 94.07% 93.03% 92.58% 94.96% 92.82% 77.12% 91.52% 82.19% 73.97% 83.59% 86.39%
D9 78.78% 92.16% 89.03% 91.55% 92.79% 93.08% 56.63% 72.53% 50.00% 70.05% 70.33% 80.09%
D10 89.37% 91.42% 90.03% 85.01% 92.93% 92.55% 60.83% 65.30% 50.00% 50.00% 63.56% 57.92%
D11 90.75% 91.14% 89.63% 85.30% 92.22% 93.36% 50.00% 55.02% 50.00% 50.00% 65.18% 69.05%
D12 87.64% 93.88% 93.61% 94.35% 93.83% 93.39% 79.22% 78.68% 94.58% 85.61% 76.66% 76.60%
D13 90.32% 91.57% 93.82% 92.89% 91.60% 92.40% 80.17% 76.07% 77.95% 77.70% 69.81% 78.79%
D14 88.64% 92.54% 91.63% 93.09% 95.14% 93.60% 71.42% 88.42% 83.84% 80.04% 96.72% 79.25%
Average 88.31% 91.92% 91.97% 92.05% 93.73% 92.96% 66.75% 70.74% 68.61% 70.43% 75.83% 72.91%
Table 5.19 (Continued). Classifier Performance Measure for the 4-Class Classifier
Drivers; columns 1-6: Kappa; columns 7-12: Classification Accuracy. Within each group the
configurations are, in order: CAS (W = 5, N = 30), FFTDD1 (W = 10, N = 30),
CAS (W = 15, N = 25), CAS (W = 20, N = 30), CAS (W = 25, N = 25), MLP1 (W = 30, N = 25)
D1 78.08% 75.53% 74.56% 82.71% 80.02% 84.10% 84.60% 82.95% 81.94% 87.96% 86.05% 88.73%
D2 65.79% 66.97% 67.00% 69.35% 74.44% 84.36% 75.51% 76.64% 76.54% 77.87% 81.44% 88.75%
D3 73.61% 76.78% 69.80% 73.31% 78.97% 64.90% 81.43% 83.41% 78.38% 81.08% 85.23% 75.34%
D4 42.64% 51.28% 59.85% 79.26% 71.04% 57.84% 59.22% 67.39% 71.90% 85.22% 79.12% 69.74%
D5 69.09% 76.37% 61.88% 66.07% 77.55% 69.50% 78.20% 83.19% 73.42% 76.47% 84.04% 78.21%
D6 59.53% 55.59% 71.68% 59.25% 74.41% 63.21% 71.43% 68.44% 79.63% 70.49% 81.44% 73.75%
D7 66.51% 47.75% 70.78% 73.53% 65.65% 63.63% 75.94% 62.95% 78.44% 80.80% 75.00% 73.49%
D8 57.09% 74.52% 70.71% 68.81% 78.83% 69.43% 68.25% 81.30% 78.74% 77.10% 84.62% 77.91%
D9 59.88% 67.45% 55.35% 64.85% 69.87% 71.01% 70.77% 76.21% 66.89% 74.34% 77.78% 78.67%
D10 62.18% 64.06% 58.50% 37.74% 70.21% 68.51% 72.65% 74.56% 70.20% 52.63% 78.89% 77.33%
D11 39.06% 63.89% 56.94% 43.54% 66.79% 71.93% 55.76% 74.79% 69.57% 63.64% 76.04% 80.00%
D12 71.13% 74.10% 71.46% 75.32% 73.71% 71.55% 79.57% 81.66% 79.61% 82.46% 81.32% 80.00%
D13 73.55% 63.88% 73.62% 69.47% 62.75% 66.84% 80.94% 74.68% 81.29% 78.45% 72.83% 76.62%
D14 61.11% 67.95% 64.31% 70.27% 78.98% 73.12% 71.43% 76.61% 73.94% 78.23% 84.69% 80.49%
Average 62.81% 66.15% 66.18% 66.68% 73.09% 70.00% 73.26% 76.06% 75.75% 76.19% 80.61% 78.50%
Figure 5.18: Boxplots of Performance Measures for Single Turn Drives
In order to understand the dispersion in the data, the boxplots for all the six classifier
performance measures have been plotted in Figure 5.18 for the selected classifier
configurations presented in Tables 5.18 and 5.19. It can be clearly seen that the interquartile
range (IQR) of the CASFBNN (WS = 25 and N = 25) configuration is symmetric about the
median for all the classifier performance measures except for the AUC measure. In addition,
the median for this configuration lies above those of all the other configurations. When the
MLP1NN (WS = 30 and N = 25) configuration is compared, it is noticed that
inconsistencies exist due to the variable IQRs and, in some cases, due to the presence of outliers.
Hence, it can be concluded that for the single-turn analysis the CASFBNN (WS = 25 and
N = 25) configuration is the best choice, followed by the MLP1NN (WS = 30 and N = 25)
configuration.
In the preceding discussion, it was observed that the average overall classification accuracy
ranged from about 73% to 81% for the selected configurations. In machine learning problems,
classification accuracies of 90% and above are considered a good performance measure for a
binary class (2-class) problem, which is a commonly used classification design. In some
literature it is considered that a classification accuracy of 50% in a two-class problem is
equivalent to an accuracy of 25% in a four-class problem (Townsend et al., 2006), both being
the chance level, provided the number of items to be classified in each class is equal (or
balanced). However, little evidence of this exists in the literature. The present problem, being
a multiclass (4-Class) problem as discussed in Section 5.1.4, cannot be treated as a binary
class problem, as each of the class levels (Level-1 to Level-4) has its own characteristics and
existence in the considered feature map. Hence, in such cases, the selection of performance
metrics needs careful attention to minimize the bias towards the most-represented class.
Congalton et al. (1991) proposed a method for estimating the individual class accuracies
for a multiclass problem. In this method, the producer's and user's accuracies, which
represent the individual class accuracies and are sometimes also known as partial accuracies,
are computed from the confusion matrix obtained from a classification exercise, as shown in
Table 5.20. The Producer's Accuracy is the probability that a feature which actually belongs
to a particular class is correctly classified into that class, whereas the User's Accuracy is
the probability that a feature categorized in a particular stress class actually belongs to
that class. Alternatively, the producer's accuracy is related to the omission error
(100 - producer's accuracy) and the user's accuracy to the commission error
(100 - user's accuracy).
Table 5.20: Producer and User Accuracy of a Classifier
Hypothesized Class     Actual Class (True Class)
(Predicted Class)      A        B        C        Row Total
A                      D        E        F        D+E+F
B                      G        H        I        G+H+I
C                      J        K        L        J+K+L
Column Total           D+G+J    E+H+K    F+I+L
Producer's Accuracy:   A = D / (D+G+J)    B = H / (E+H+K)    C = L / (F+I+L)
User's Accuracy:       A = D / (D+E+F)    B = H / (G+H+I)    C = L / (J+K+L)
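The computation in Table 5.20 can be illustrated with a minimal Python sketch. It assumes the same layout as the table (rows are predicted classes, columns are actual classes). The function name `class_accuracies` and the counts in `cm` are hypothetical and are not taken from the experiments reported here.

```python
def class_accuracies(cm):
    """Overall, producer's and user's accuracies from a confusion matrix.

    cm[i][j] = number of items predicted as class i whose actual class is j
    (rows: predicted class, columns: actual class, as in Table 5.20).
    """
    n = len(cm)
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    total = sum(row_totals)
    # Producer's accuracy: diagonal entry divided by the column (actual class) total.
    producers = [cm[k][k] / col_totals[k] if col_totals[k] else float("nan")
                 for k in range(n)]
    # User's accuracy: diagonal entry divided by the row (predicted class) total.
    users = [cm[k][k] / row_totals[k] if row_totals[k] else float("nan")
             for k in range(n)]
    overall = sum(cm[k][k] for k in range(n)) / total
    return overall, producers, users


# Hypothetical 3-class counts, for illustration only.
cm = [[40, 10, 0],
      [0, 30, 10],
      [0, 0, 30]]
overall, producers, users = class_accuracies(cm)
```

For these made-up counts, class A has a producer's accuracy of 40/40 but a user's accuracy of 40/50, which shows how the two measures can differ for the same class.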
The average overall, producer's and user's accuracies for the selected configurations
(window size, number of neurons and classifier) are shown in Table 5.21 for all the 14
drivers. It can be observed that the CASFBNN (WS = 25 and N = 25) configuration performs
better, with an overall accuracy of 80.61%. Although 94.90%, 74.18% and 75.15% of the features
(producer's accuracies) have been correctly classified as Level-1, Level-3 and Level-4
respectively, only 89.46%, 72.70% and 74.99% (user's accuracies) of the features assigned to
these classes actually belong to them. For the Level-2 stress class, the producer's accuracy
is 67.54% whereas the user's accuracy is 75.65%, a difference of 8.11 percentage points. We
can also observe that for all the configurations considered there is a certain degree of
misclassification in all the stress classes. The standard deviation of each class for the
CASFBNN (WS = 25 and N = 25) configuration is also small, indicating that the classification
results can be generalized to a suitable population.
Table 5.21. Individual Class Accuracies: Producer's and User's Accuracy
Classifier          Statistic  Overall Accuracy         Producer's Accuracy                  User's Accuracy
(Window Size;                  (Max.; Min.)             Level 1  Level 2  Level 3  Level 4   Level 1  Level 2  Level 3  Level 4
No. of Neurons)
CASFBNN (5; 30)     Average    73.26% (84.60%; 55.76%)  89.87%   62.04%   65.25%   65.75%    88.69%   61.51%   59.16%   67.26%
                    Std. Dev.  8.19                     5.97     11.16    13.69    10.63     9.15     15.92    13.67    16.21
FFTD-D1 (10; 30)    Average    76.06% (83.41%; 62.95%)  87.25%   67.48%   62.88%   67.98%    94.04%   52.40%   64.81%   65.92%
                    Std. Dev.  6.34                     8.08     10.04    8.33     11.41     3.53     17.23    16.57    14.51
CASFBNN (15; 25)    Average    75.75% (81.94%; 66.89%)  89.75%   66.52%   66.24%   67.90%    89.38%   68.00%   61.40%   70.21%
                    Std. Dev.  4.75                     7.93     8.61     10.50    9.46      8.30     11.43    12.74    7.35
MLP1NN (20; 30)     Average    76.19% (87.96%; 52.63%)  89.78%   70.27%   67.26%   76.65%    87.94%   66.68%   68.25%   68.31%
                    Std. Dev.  9.06                     10.90    17.90    8.26     10.19     13.11    17.41    6.64     19.92
CASFBNN (25; 25)    Average    80.61% (86.05%; 72.83%)  94.90%   67.54%   74.18%   75.15%    89.46%   75.65%   72.70%   74.99%
                    Std. Dev.  4.15                     3.64     12.06    8.63     6.53      5.93     11.26    10.59    10.31
MLP1NN (30; 25)     Average    78.50% (88.75%; 69.74%)  93.49%   71.24%   70.37%   68.95%    89.14%   67.61%   67.79%   76.97%
                    Std. Dev.  5.25                     3.99     11.32    16.02    9.26      7.84     14.68    16.51    15.50
In Table 5.22, the producer's and user's accuracies of the two configurations CASFBNN
(WS = 25 and N = 25) and MLP1NN (WS = 30 and N = 25) have been compared for all the
14 drivers individually.
Table 5.22. Individual Class Accuracies: Producer's and User's Accuracy of Two Classifiers
Drivers  Classification  CASFBNN (Window Size = 25, Neurons = 25)  MLP1NN (Window Size = 30, Neurons = 25)
         Accuracy        Level-1 Level-2 Level-3 Level-4           Level-1 Level-2 Level-3 Level-4
D1 Producer's 100.00% 86.67% 63.64% 73.91% 100.00% 83.33% 83.33% 77.78%
User's 97.37% 76.47% 58.33% 89.47% 93.55% 76.92% 83.33% 93.33%
D2 Producer's 91.43% 61.90% 80.00% 87.50% 96.67% 91.67% 85.71% 76.47%
User's 82.05% 92.86% 83.33% 70.00% 90.63% 91.67% 90.00% 81.25%
D3 Producer's 94.59% 75.00% 87.50% 78.26% 96.88% 70.00% 33.33% 68.42%
User's 94.59% 78.95% 58.33% 90.00% 100.00% 46.67% 40.00% 76.47%
D4 Producer's 100.00% 50.00% 82.35% 69.57% 95.65% 80.00% 45.45% 56.25%
User's 86.84% 75.00% 82.35% 66.67% 68.75% 80.00% 35.71% 90.00%
D5 Producer's 91.89% 72.22% 84.00% 78.57% 90.32% 62.50% 83.33% 61.54%
User's 89.47% 92.86% 75.00% 78.57% 87.50% 83.33% 68.18% 66.67%
D6 Producer's 94.44% 66.67% 73.68% 80.95% 84.38% 80.00% 66.67% 61.54%
User's 87.18% 77.78% 77.78% 77.27% 84.38% 50.00% 61.54% 84.21%
D7 Producer's 92.11% 52.38% 71.43% 70.00% 92.86% 76.92% 63.64% 55.00%
User's 92.11% 73.33% 65.22% 58.33% 81.25% 83.33% 77.78% 52.38%
D8 Producer's 94.87% 78.57% 74.07% 83.33% 96.88% 75.00% 76.92% 58.62%
User's 94.87% 57.89% 90.91% 83.33% 93.94% 64.29% 52.63% 85.00%
D9 Producer's 93.33% 70.00% 72.22% 68.18% 96.00% 64.71% 65.00% 84.62%
User's 84.85% 73.68% 65.00% 83.33% 88.89% 73.33% 72.22% 73.33%
D10 Producer's 89.47% 62.50% 75.00% 75.00% 89.66% 53.85% 84.62% 70.00%
User's 89.47% 76.92% 71.43% 66.67% 83.87% 58.33% 64.71% 93.33%
D11 Producer's 94.12% 61.54% 68.97% 65.00% 90.91% 61.54% 76.47% 76.47%
User's 82.05% 53.33% 86.96% 68.42% 90.91% 66.67% 68.42% 81.25%
D12 Producer's 92.31% 71.43% 80.00% 68.75% 91.18% 53.33% 84.62% 76.92%
User's 94.74% 88.24% 63.16% 64.71% 96.88% 66.67% 64.71% 71.43%
D13 Producer's 100.00% 50.00% 54.17% 80.00% 93.94% 66.67% 58.33% 75.00%
User's 79.49% 73.33% 65.00% 66.67% 93.94% 61.54% 87.50% 40.00%
D14 Producer's 100.00% 86.67% 71.43% 73.08% 93.55% 77.78% 77.78% 66.67%
User's 97.30% 68.42% 75.00% 86.36% 93.55% 43.75% 82.35% 88.89%
Average Producer's 94.90% 67.54% 74.18% 75.15% 93.49% 71.24% 70.37% 68.95%
User's 89.46% 75.65% 72.70% 74.99% 89.14% 67.61% 67.79% 76.97%
Standard Deviation Producer's 0.0364 0.1206 0.0863 0.0653 0.0399 0.1132 0.1602 0.0926
User's 0.0593 0.1126 0.1059 0.1031 0.0784 0.1468 0.1651 0.1550
Overall Average (Std. Dev.) 80.61% (0.0451) 78.50% (0.0525)
It is observed that among the individual class levels, Level-1, which represents a low
stress level, has been classified best, reaching 100% classification for some of the drivers
in both configurations. Although the Level-2 class shows misclassifications for both
configurations, the Level-3 and Level-4 classes perform better for the CASFBNN (WS = 25 and
N = 25) configuration than for the MLP1NN (WS = 30 and N = 25) configuration. Hence, it can
finally be concluded that the CASFBNN configuration with window size = 25 and number of
neurons = 25 is the optimal configuration for the presented 4-Class problem.
5.1.7.3 Methodology for Multi-Turn Affective State Analysis
In the multi-turn analysis, six drivers were asked to participate in the real-time data
collection experiment multiple times, depending upon their availability. The drivers had to
follow the same route, based on the assumption that they experience the same level of stress
while driving multiple times. This analysis also tries to estimate, for an individual driver,
the best suited NN configuration with the least intra-observer variability and with high
reliability. An unbiased analysis is supported by the fact that each driver chose their own
time for the data collection experiment, and the number of turns completed by each driver
also varied for the real-time data acquisition.
In order to obtain an optimum classifier, the desirability function based approach cannot
be adopted because the number of turns in the data collection experiment varied for each
driver. In such a situation, the individual feature vectors for the respective turns must be
trained separately and the performance analyzed on a comparative basis. Therefore, for this
analysis only four cardinal classifier performance measures, namely precision, sensitivity,
specificity and classification accuracy, were computed. The configurations were the same as
in the single-turn analysis, i.e. window sizes from 5 to 30 against six values of the number
of neurons from 5 to 30, for each of the six classifier configurations. Comparing variables
such as the window size and the number of neurons in the hidden layer for each classifier
yields the best fit among the current set of classifier configurations for the multi-turn
analysis.
5.1.7.4 Results: Multi-Turn Affective State Analysis
The data collected over the turns completed by the six drivers who participated in the
multi-turn analysis experiment were used for training, and the resulting averages of the
performance measures are shown in Table 5.23.
Table 5.23. Multi-Turn Analysis considering Individual Averages
Drivers (No. of Turns)  Performance Measures  CASFBNN (WS=5; N=30)  FFTD-D1NN (WS=10; N=30)  CASFBNN (WS=15; N=25)  MLP1NN (WS=20; N=30)  CASFBNN (WS=25; N=25)  MLP1NN (WS=30; N=25)
(each classifier column reports Avg. and STD)  Avg. STD  Avg. STD  Avg. STD  Avg. STD  Avg. STD  Avg. STD
D1 (5) Precision 68.21% 7.80 68.12% 9.70 65.81% 12.94 73.90% 7.72 75.10% 8.48 69.87% 16.65
Sensitivity 64.86% 9.28 66.64% 8.99 63.55% 15.86 71.61% 8.87 72.86% 12.86 69.61% 16.86
Specificity 90.25% 2.89 91.50% 2.47 89.57% 4.99 92.22% 2.41 92.24% 4.09 91.48% 5.31
Class. Accu. 76.80% 6.04 74.40% 7.36 75.50% 8.36 79.98% 7.26 78.38% 10.44 76.30% 15.99
D2 (5) Precision 74.44% 11.51 72.27% 6.25 74.95% 9.29 81.52% 2.60 76.02% 2.86 74.62% 10.84
Sensitivity 73.04% 10.73 69.03% 9.25 74.46% 9.55 81.19% 2.93 76.38% 2.93 73.96% 12.50
Specificity 92.27% 3.42 91.47% 2.36 92.87% 2.69 94.57% 0.80 93.19% 0.84 92.16% 3.84
Class. Accu. 76.82% 9.91 74.88% 5.60 78.30% 7.84 83.25% 2.33 78.64% 2.50 76.17% 11.34
D3 (5) Precision 71.71% 5.06 71.89% 3.57 77.63% 3.15 74.17% 5.18 74.93% 3.38 76.15% 3.96
Sensitivity 71.26% 5.37 70.65% 3.53 77.64% 3.71 73.18% 7.14 74.30% 3.55 74.40% 2.96
Specificity 91.77% 1.62 92.06% 0.75 93.40% 1.03 92.42% 2.04 92.74% 1.30 93.05% 0.94
Class. Accu. 75.03% 5.06 75.89% 2.42 80.01% 2.67 77.52% 5.17 77.86% 4.08 78.61% 2.67
D4 (4) Precision 71.38% 7.59 71.89% 6.22 75.35% 3.02 78.66% 0.48 78.99% 4.77 74.06% 6.25
Sensitivity 69.20% 6.96 71.63% 7.24 75.63% 1.70 76.33% 1.19 77.82% 3.87 74.45% 7.27
Specificity 93.17% 3.60 92.05% 1.74 92.76% 0.32 93.03% 0.64 93.76% 1.11 92.99% 1.18
Class. Accu. 80.01% 12.18 75.97% 5.04 77.69% 1.12 79.79% 1.82 81.36% 3.37 78.16% 3.83
D5 (3) Precision 63.13% 9.66 74.19% 3.84 74.49% 6.87 75.74% 2.94 68.71% 22.52 77.52% 14.29
Sensitivity 62.79% 9.31 72.72% 4.34 72.77% 8.83 75.17% 2.69 66.94% 26.33 78.32% 14.24
Specificity 89.20% 3.27 92.48% 1.53 91.81% 2.66 92.68% 0.95 89.85% 8.60 95.42% 4.53
Class. Accu. 67.28% 9.44 76.87% 4.22 75.30% 8.08 77.48% 3.08 68.57% 26.37 86.01% 14.25
D6 (2) Precision 59.83% 9.29 63.14% 1.08 64.87% 15.79 73.56% 9.05 79.02% 0.12 77.16% 5.68
Sensitivity 59.01% 8.15 57.89% 6.75 65.32% 15.05 73.52% 8.15 79.66% 0.49 73.63% 5.09
Specificity 87.93% 3.28 87.26% 2.96 90.66% 3.89 92.39% 2.83 94.05% 0.06 92.56% 2.17
Class. Accu. 63.79% 10.81 65.07% 4.77 71.55% 11.43 76.70% 8.78 81.58% 0.20 78.43% 6.62
In this analysis, the maximum values of the average performance measures have been
highlighted in the table, indicating their significance. Certain observations, although they
may not be unique, are summarized below:
• The window size can be fixed between 15 and 30, with 25 being the most suitable. A window
size of 25 was also optimal for the single-turn analysis.
• The number of neurons in the hidden layers can be fixed as either 25 or 30, which is
consistent with the single-turn analysis.
• The CASFBNN classifier performs better than the others, with MLP1NN being the second
choice, which again confirms the analysis performed in the single-turn case.
• Since, for each turn in the multi-turn analysis, the values lie close to their means (i.e.
the standard deviations are small), the number of turns has little effect on the results
obtained.
• The classification accuracy for the CASFBNN (WS = 25; N = 25) configuration is close to
80%, which is similar to the single-turn analysis.
Therefore, it can be concluded that for the multi-turn analysis, when only the overall
cardinal performance measures are compared, the CASFBNN classifier could be an optimal
choice for the affective state analysis for a 4-Class classification problem as discussed.
However, parameters such as the window size and the number of neurons in the hidden layer
should be selected properly to achieve a better classification performance in the proposed
scenario.
5.2 Real-Time Trend Analysis and Detection Methods
In the foregoing sections, the methodology adopted for affective state detection using neural
networks categorized stress into (a) 3-Classes: Relaxed, Moderate and Stressed, and
(b) 4-Classes: Level-1 to Level-4. In addition to the affective states, stress-trends were
defined and carefully annotated during the data collection experiment discussed in Section
3.4.2. These stress-trend markers reflect the time stamps of certain stress-inducing driving
events and incidents (Nahl et al., 2003) observed during driving, such as sharp left turns,
sharp right turns, busy market areas, bad road stretches, circular turns, speed breakers,
unanticipated pedestrian crossings, abrupt lane changes by another vehicle, jaywalkers etc.
In a continuous driving setup, besides the affective states, the stress-trends could have a
cumulative effect on the driver's mental state, which must be detected and accounted for to
assess the overall stress level. Therefore, in the following subsections, stress-trend
detection algorithms using real-time trend detection methods are discussed.
5.2.1 Need for an online approach and the proposed novelty
Adaptive detection of incremental changes in the emotions as well as the fatigue or
stress-level of drivers during real-time driving may help reduce road accidents (Singh et
al., 2011). To achieve this, the change in physiological features due to stress-trends must
be tracked to identify alarming situations. Therefore, data collected from drivers during
real-time experiments have been analyzed, rather than data from a simulation-based
experiment, as data collected in real time are expected to correlate effectively with the
stress-levels (Singh et al., 2011; Singh et al., 2013a). Whenever a stress-trend occurs, a
change in the physiological signal base-level, morphology and other parameters could be
noticed. This change is significant as it may reflect a quantitative measure of the amount
of stress experienced by the subject.
5.2.2 The Trigg's Statistical Approach
Trigg's statistical method has been a popular approach to detect changes in signal patterns
in real-time tracking applications. In physiological computing, this method has been used to
identify significant signal-trends for blood pressure and heart rate monitoring of patients
(Melek et al., 2005). Ping Yang (2009) applied this approach for trend change detection and
pattern recognition of physiological signals of intensive care unit (ICU) patients. This
approach resulted in acceptable outcomes with data that had many sudden changes and both
high and low degrees of noise. Since ICU patients need careful attention from nurses and
doctors for long durations, their workload could be reduced if an alarming situation could
be notified to them by a machine which adaptively tracks the vital parameters of a person
using this approach. Therefore, a similar approach could be applicable to driver stress level
detection based on physiological signals. It has been assumed that automatic detection of
the stress-trends would enable the proposed WDAS to activate and respond in accident prone
spells of driving (Singh et al., 2011). Additionally, these recorded stress-trends could
also be used to quantify the emotional toll of traffic trouble spots, which could help
prioritize road improvements as proposed by Healey and Picard (2005).
In Trigg's statistical method, a Trigg's Tracking Variable (TTV) is calculated, which is a
signal detection index. The TTV is computed from the difference between the actual value of
a feature and the value predicted using the exponentially weighted moving average (EWMA) of
the previous values. The absolute value of the TTV indicates the significance of the change
observed. In the TTV calculation algorithm, a value of +1 is assigned to the TTV if there is
100% certainty that a feature is increasing and a value of -1 is assigned if there is 100%
certainty that a feature is decreasing. The design parameters for TTV calculation are the
exponential smoothing constant 'α' and the number of samples of the observed signal included
in the exponential weighting. The smoothing constant 'α', which can have values between 0.0
and 1.0, determines the number of control observations to be included in the EWMA
(Cembrowski et al., 1975). The algorithm for calculating Trigg's tracking variable is
illustrated in Hope et al. (1973) and is represented in the pseudocode shown in Table 5.24.
The algorithm shown in Table 5.24 is used to extract a tracking vector comprising the TTV
values of individual features. This tracking vector adaptively tracks the incremental
changes in the feature values and is therefore used as the input vector to train the machine
for stress-trend detection using the following two techniques: (a) a shape based feature
weight allocation algorithm, and (b) a neural network based regression model, described in
the subsequent sections.
Table 5.24. Algorithm Pseudocode for TTV calculation
Algorithm ttv := Trigg's Tracking Variable
Inputs
  d[t] := array of feature values observed
  α := smoothing constant between 0.0 and 1.0, which determines the time constant for exponential weighting
Output
  ttv[t] := Trigg's tracking variable
Begin:
Initialization
  u[t-1] := v_av ; exponentially weighted avg. of 1st monitoring segment
  s[t-1] := v_av / 100 ; initial smoothed error in prediction for the 1st segment
  mad[t-1] := v_av / 10 ; initial mean absolute deviation
Main Routine
  S1: u[t] := α*d[t] + (1-α)*u[t-1] ; predicted value for present segment
  S2: e[t] := d[t] - u[t-1] ; error in prediction
  S3: s[t] := α*e[t] + (1-α)*s[t-1] ; smoothed error
  S4: mad[t] := α*|e[t]| + (1-α)*mad[t-1] ; mean absolute deviation
  S5: ttv[t] := s[t] / mad[t] ; Trigg's tracking variable for the t-th segment
Updation
  u[t-1] := u[t];
  s[t-1] := s[t];
  mad[t-1] := mad[t];
End
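The pseudocode of Table 5.24 can be sketched in Python as below. This is a minimal illustration, not the implementation used in this work. The function name `triggs_tracking`, the argument `v_init` (standing in for the first-segment average v_av, assumed positive) and the two test ramps are assumptions made for the example. The smoothed error is updated from the previous smoothed error, as in the standard form of Trigg's method.

```python
def triggs_tracking(values, alpha, v_init):
    """Return the list of Trigg's tracking variable (TTV) values.

    values : sequence of observed feature values d[t]
    alpha  : smoothing constant, 0.0 < alpha < 1.0
    v_init : average of the first monitoring segment (assumed positive)
    """
    u_prev = v_init            # exponentially weighted average
    s_prev = v_init / 100.0    # initial smoothed error
    mad_prev = v_init / 10.0   # initial mean absolute deviation
    ttv = []
    for d in values:
        e = d - u_prev                                      # error in prediction
        u = alpha * d + (1 - alpha) * u_prev                # updated EWMA
        s = alpha * e + (1 - alpha) * s_prev                # smoothed error
        mad = alpha * abs(e) + (1 - alpha) * mad_prev       # mean absolute deviation
        ttv.append(s / mad if mad != 0 else 0.0)            # guard against division by zero
        u_prev, s_prev, mad_prev = u, s, mad
    return ttv


# Hypothetical test signals: a steadily rising and a steadily falling feature.
rising = [10 + i for i in range(31)]
falling = [40 - i for i in range(31)]
ttv_up = triggs_tracking(rising, alpha=0.3, v_init=10.0)
ttv_down = triggs_tracking(falling, alpha=0.3, v_init=40.0)
```

For the rising ramp the TTV moves towards +1 and for the falling ramp towards -1, in line with the interpretation of the TTV given in the text.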
5.2.3 The Shape Based Feature Weight Allocation
In this algorithm, a 160 sec window was selected from the feature vector matrix in each of
the five scenarios. For each driving scenario, the segment in which the concentration of
annotated stress-trends was highest was chosen, reflecting an estimate of the feature value
under constant stress. The scenarios were indexed in the order of their occurrence as 1 -
Pre-driving, 2 - Relaxed-driving, 3 - Busy-driving, 4 - Return-driving and 5 - Post-driving
(Singh et al., 2011).
5.2.3.1 Classification of Trend Shapes
It was observed that stress progressively increased from the start of the driving towards
Busy-driving and decreased during the Return-driving and Post-driving, resulting in
trend-shapes. Manual observation of the feature shapes resulted in the following categories
of significant trend-shapes:
1) Concave / Convex Trend: This trend indicates that the corresponding feature is of high
significance level as it attains either a peak or trough in a high stress scenario. This also
shows that such a feature reflects the transient and scenario-dependent component of the
signals collected. Such features possessing these shapes were assigned with a weight of 5.
2) Monotonically Increasing / Decreasing Trend: This trend correlates to the long term
effects of stress and fatigue indicating the global component of the signals collected. It can be
easily visualized that this steady increase or decrease observed over time has lesser
significance than concave / convex trends. Therefore, these features were assigned a weight
of 3 for this category of shapes.
3) Linear / Other Trends: A linear trend is an indication of very little change in the stress
level of drivers, and features with these shapes carry very little information about the
stress state of the driver or the scenario of operation. Therefore, these features were
assigned zero weight.
5.2.3.2 Feature Weight Allocation
In order to allocate weights to a particular feature, the percentage of each trend shape
observed for a particular driver was calculated. If a trend shape is observed in at least 50% of
the drivers, that shape is allotted to the corresponding feature. This process is repeated for
all the trend shapes and all the features for a particular driver, as shown in Table 3.10 in
Section 3.8.1. In the next step, the "weight-sum" was calculated for a particular feature as
the cumulative sum of the individual weights of the trend-shape observed (concave / convex: 5,
monotonically increasing / decreasing: 3, others: 0) for each driver. The final "feature
weight" allotted to a feature was the quotient of the weight-sum divided by the total number
of drivers. These calculated weights for some of the extracted features, along with their
trend shapes, are shown in Fig. 5.19.
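As a minimal sketch of the weight-sum and feature-weight computation described above, the following Python fragment uses the shape weights given in the text (5, 3 and 0). The shape labels, the function name `feature_weight` and the example list of shapes are hypothetical and serve only as illustration.

```python
# Weights per trend shape, as stated in the text.
SHAPE_WEIGHTS = {"concave_convex": 5, "monotonic": 3, "other": 0}


def feature_weight(shapes_per_driver):
    """Feature weight = (sum of shape weights over drivers) / (number of drivers)."""
    weight_sum = sum(SHAPE_WEIGHTS[shape] for shape in shapes_per_driver)
    return weight_sum / len(shapes_per_driver)


# Hypothetical trend shapes of one feature observed for four drivers.
shapes = ["concave_convex", "concave_convex", "monotonic", "other"]
w = feature_weight(shapes)  # (5 + 5 + 3 + 0) / 4
```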
5.2.3.3 Trigg's Tracking Variable (TTV) Calculation
As discussed in Section 5.2.2, the parameters of TTV analysis are the smoothing constant
'α' and the confidence interval between the TTV lower control limit (TTVl) and the TTV upper
control limit (TTVu). The smoothing constant 'α' determines the time constant for exponential
weighting as well as the number of control observations to be included in the exponential
weighting (Cembrowski et al., 1975). A smoothing constant value of 0.3 was chosen, for which
the number of observations included in the smoothing was found to be 6. Hence, the first six
segments were discarded as transients and not considered in the analysis. For α = 0.3, the
90% confidence interval gave the TTVl as -0.63 and the TTVu as +0.63, which included all the
observed significant trends. The algorithm for calculating Trigg's tracking variable is
illustrated in Table 5.24 (Cembrowski et al., 1975).
Figure 5.19: Feature Shapes and Feature Weight Allocation (Source: Singh et al., 2011)
5.2.3.4 Segment Weight Calculation for TTV Analysis
In the next step, the "Segment Weight" is defined as the cumulative sum of the
feature-weights of those features whose TTV values were found to violate the control limits.
The feature-weights used for this calculation were found using the 'Shape based Feature
Weight Allocation Algorithm' described in Section 5.2.3.2, whereas the TTV value of each
feature for the given segment was obtained using the TTV calculation method described in
Section 5.2.3.3. The segment weight is initialized to zero; whenever the TTV value for a
particular feature violates the control limits, i.e. the 90% confidence interval, the
corresponding feature weight is added to the segment weight.
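The segment weight calculation can be sketched as follows. The control limits of -0.63 and +0.63 are those reported in Section 5.2.3.3. The function name `segment_weight`, the feature names and the numerical values are hypothetical.

```python
def segment_weight(ttv_by_feature, feature_weights, lower=-0.63, upper=0.63):
    """Sum the weights of all features whose TTV lies outside [lower, upper]."""
    weight = 0.0  # the segment weight starts at zero
    for name, ttv in ttv_by_feature.items():
        if ttv < lower or ttv > upper:  # control limit violated
            weight += feature_weights[name]
    return weight


# Hypothetical TTV values and feature weights for one segment.
ttv_values = {"feature_a": 0.80, "feature_b": -0.20, "feature_c": -0.70}
weights = {"feature_a": 4.0, "feature_b": 3.0, "feature_c": 2.5}
seg_w = segment_weight(ttv_values, weights)  # feature_a and feature_c violate the limits
```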
5.2.3.5 Optimal Threshold Selection using the Desirability Function Approach
In order to classify a particular segment as a 'Stressful' segment, a critical threshold
value must be identified for the segment weights. A proper threshold selection will minimize
the false alarms while at the same time maximizing the true alarms. This could be achieved by
the desirability function approach, often used for optimization of multiple response
processes (Derringer et al., 1980). The desirability of a response takes values between 0.0
and 1.0, where 0.0 corresponds to a completely undesirable value of the response and 1.0
corresponds to the ideal response value. In such one-sided transformations, upper and lower
limits of the responses were appropriately chosen. The individual desirability functions and
the overall desirability for each control value were calculated using the formulae in Eqn.
5.15 (Derringer et al., 1980). The control value corresponding to the maximum desirability
was chosen as the threshold.
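A hedged sketch of this threshold selection is given below. Eqn. 5.15 is not reproduced in this section, so the code assumes the standard one-sided Derringer and Suich transformations (larger-is-better for true alarms, smaller-is-better for false alarms) combined through a geometric mean. All function names, the response limits and the response values are hypothetical.

```python
def d_larger_is_better(y, low, high, r=1.0):
    """One-sided desirability: 0 at or below `low`, 1 at or above `high`."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** r


def d_smaller_is_better(y, low, high, r=1.0):
    """One-sided desirability: 1 at or below `low`, 0 at or above `high`."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - low)) ** r


def best_threshold(responses, true_limits, false_limits):
    """responses maps threshold -> (true alarm %, false alarm %)."""
    best, best_d = None, -1.0
    for thr, (true_pct, false_pct) in sorted(responses.items()):
        d_true = d_larger_is_better(true_pct, *true_limits)
        d_false = d_smaller_is_better(false_pct, *false_limits)
        overall = (d_true * d_false) ** 0.5  # geometric mean of the two desirabilities
        if overall > best_d:
            best, best_d = thr, overall
    return best, best_d


# Made-up responses for three candidate thresholds (not the thesis data).
responses = {20: (70.0, 60.0), 24: (60.0, 40.0), 27: (40.0, 30.0)}
threshold, score = best_threshold(responses, (0.0, 100.0), (0.0, 100.0))
```

With an exponent of 1 the transformation is linear between the chosen limits, so the selected threshold is the one that best balances a high true alarm percentage against a low false alarm percentage.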
It was observed that the threshold of segment weight lies in the range of 18 to 27. An
alarm for the i-th segment is considered a true alarm if there was a manually recorded
stress-trend marker in its vicinity, i.e. in segments [i-2, i+2]; otherwise that alarm was
considered a false alarm. The grand means of the percentages of true and false alarms
calculated for each individual threshold were used as responses in this method. In Eqn. 5.15,
the value of the desirability exponent was chosen as '1' for a linear increase of the
desirability function. The overall desirability was calculated from the individual
desirabilities for each threshold value. The threshold value with the maximum desirability
was chosen as the optimum threshold.
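The true/false alarm rule can be sketched as follows, assuming that alarms and markers are both expressed as segment indices. The function name `classify_alarms` and the example indices are hypothetical. The tolerance of two segments on either side follows the [i-2, i+2] rule stated in the text.

```python
def classify_alarms(alarm_segments, marker_segments, tolerance=2):
    """Split alarms into true and false alarms.

    An alarm at segment i is a true alarm if a manually recorded marker
    lies in segments [i - tolerance, i + tolerance].
    """
    markers = set(marker_segments)
    true_alarms, false_alarms = [], []
    for i in alarm_segments:
        if any((i + k) in markers for k in range(-tolerance, tolerance + 1)):
            true_alarms.append(i)
        else:
            false_alarms.append(i)
    return true_alarms, false_alarms


# Hypothetical alarm and marker positions (segment indices).
alarms = [3, 10, 18, 25]
markers = [4, 12, 30]
true_alarms, false_alarms = classify_alarms(alarms, markers)
true_pct = 100.0 * len(true_alarms) / len(alarms)
```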
Figure 5.20: Optimum Threshold Identification using Desirability Function.
(Source: Singh et al., 2011)
5.2.3.6 Results of Segment Weight based Stress-Trend Detection
During the data collection experiment, the average number of stress-trend markers recorded
was 42. The thresholds for true and false alarms were found to be 34 and 15 respectively
when analyzing the individual numbers of true and false alarms detected. The number 34
corresponds to an 80% successful detection rate of true alarms. Fig. 5.20 shows that an
optimal threshold value of 24 was found when individual desirabilities were calculated for
each threshold value, with the overall desirability reaching a maximum of 0.549. The
detection efficiency for a threshold of 24 can be validated by crosschecking with the events
annotated during the drive. Table 5.25 shows the percentages of true and false alarms along
with the stress-trends detected.
Table 5.25: Results of Segment Weight based Stress-Trend Detection
(Source: Singh et al., 2011)
Driver Index Detected (%) True Alarms (%) False Alarms (%)
D1 82 (Max.) 59 40
D2 64 45 55
D3 54 42 57
D4 75 65 34
D5 71 60 39
D6 76 66 33
D7 68 50 50
D8 77 70(Max.) 29(Min.)
Avg.(Approx.) 71 58 42
It can be observed from Table 5.25 that the present algorithm resulted in successful
detection of approximately 71% of the stress-trends recorded, which is equivalent to 80% for
8 drivers. In this analysis, the results of only 8 drivers have been included, as this
algorithm was developed in the initial phases of the work, when 9 drivers participated in
the experiment and some recorded events went missing for one driver. It can also be observed
that the average true alarms detected are approximately 58% of the total alarmable trends,
whereas the remaining 42% are false alarms. This result indicates that although the success
rate of the present algorithm is not optimal, it could be improved. For more optimal
solutions, many factors must be considered, such as the inclusion of more stress markers,
more scenarios and possibly a larger data set. Care should be taken while recording the data
so that they are free of errors of judgment, synchronization errors, machine errors and
missing stress-trends. The stress-trend markers recorded and the stress-trends detected
using the above approach for three sample drivers are shown in Fig. 5.21.
Figure 5.21: Stress-Trends Detected (Source: Singh et al., 2011)
5.2.4 Neural Network based Regression Model for Stress-Trend Detection
This section discusses a neural network based regression model for instant-by-instant
tracking of alarmable trends during real-time driving, which are often attributed to
instantaneous reflexes and stimuli caused by stress-trends. Typical stress-trends observed
along the driving route were shown in Fig. 5.5 in Section 5.1.4, whereas the annotation of
stress-trend markers and the rationale behind the weighing methodology were provided in
Table 5.6.
The TTV value can be used to determine and track stress-trends in a feed-forward manner
through online monitoring of variations in the observed signals, as its absolute value
indicates the significance of the change observed. The mathematical description of the TTV
calculation in the form of a pseudocode algorithm is presented in Table 5.24. It was
discussed that the value of the exponential smoothing constant 'α' is crucial in determining
the time constant for exponential weighting. The value of 'α' depends upon the number of
observations 'n' in a segment through the relationship α = 2/(n+1). A large value of 'α' is
more sensitive to changes in patterns while estimating the current value of observations.
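The relationship α = 2/(n+1) can be checked with a short Python sketch. The function name `smoothing_constant` is hypothetical. The range n = 2 to 11 is an assumption, chosen because the resulting values appear to match, up to rounding, the ten 'α' values listed in Table 5.26.

```python
def smoothing_constant(n):
    """Smoothing constant for n observations in a segment: alpha = 2 / (n + 1)."""
    return 2.0 / (n + 1)


# Ten alpha values for n = 2, 3, ..., 11, rounded to four decimals.
alphas = [round(smoothing_constant(n), 4) for n in range(2, 12)]

# Inverse relation: number of observations implied by a given alpha.
n_for_alpha_03 = 2.0 / 0.3 - 1  # about 5.67, i.e. roughly six observations
```

The inverse relation is consistent with the six observations reported for α = 0.3 in Section 5.2.3.3.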
In this method, a tracking vector based on the TTV values of individual features was
generated for 10 different values of 'α'. If the value of 'α' is modulated properly, tracking
the changes in the physiological signal at different time-scales becomes easier. A smaller
'α' value enables a larger time-window to be considered for analyzing the incremental
changes. These individual tracking vectors help in adaptive tracking of the incremental
changes observed in the feature values and are then used as the input vectors to train the
neural network classifiers chosen for stress-trend detection.
Four neural network configurations were chosen by using the similar approach followed
for affective state detection as discussed in Section 5.1.7. Six different numbers of neurons
(5 - 30) in the hidden layers for all ‘α’ values were chosen. The networks were trained as a
regression problem using a stopping criterion of the mean square error value (MSE) of 0.0.
5.2.4.1 Result of Neural Network based Regression Model for Stress-Trend Detection
It can be observed from Fig. 5.22 that for all the classifier configurations chosen, the MSE
values settled close to 0 when the number of neurons was greater than 15. However, the
CASFBNN classifier produced the minimum MSE.
In order to obtain an optimum classifier, the average values of MSE and R-square over all
the 14 drivers (the reason for selecting 14 drivers in this study has been discussed in
Section 5.1.7) were compared for each 'α'. It can be observed that the CASFBNN classifier
gave the best MSE and R-square values for six of the ten 'α' values, as shown in Table 5.26.
The number of neurons in the hidden layers can be chosen as either 25 or 30 for the selected
configuration. The value of the smoothing constant 'α' could be used as the tuning parameter
for stress-trend analysis. The time required to determine the stress-trends corresponds to
the chosen 'α' value. In a very dynamic scenario, such as a high stress scenario for
affective state detection, 'α' has to be tuned higher. Similarly, in a low stress scenario it
has to be tuned lower. Therefore, the present analysis could be concluded with the
observation that the CASFBNN classifier along with a suitable value of 'α' will give the best
results.
Fig. 5.22. Stress-Trend Analysis Data: MSE
Table 5.26 Optimum Classifier for Stress-Trend Detection
(each classifier cell gives MSE followed by the R-Value in parentheses)
S.N. | Alpha  | CASFBNN             | MLP1NN              | FFTD-D1NN           | FFTD-D2NN           | No. of Neurons | Optimum Classifier
-----|--------|---------------------|---------------------|---------------------|---------------------|----------------|-------------------
1.   | 0.6670 | 0.005505 (0.997998) | 0.011999 (0.995686) | 0.007465 (0.997007) | 0.0084 (0.996921)   | 25             | CASFBNN
2.   | 0.5000 | 0.012149 (0.995332) | 0.00812 (0.996949)  | 0.012331 (0.995857) | 0.010361 (0.996349) | 30             | MLP1NN
3.   | 0.4000 | 0.005247 (0.997968) | 0.016963 (0.993788) | 0.011403 (0.995845) | 0.008055 (0.996903) | 25             | CASFBNN
4.   | 0.3330 | 0.015521 (0.994667) | 0.009856 (0.996518) | 0.022055 (0.991536) | 0.008408 (0.997135) | 25             | FFTD-D2NN
5.   | 0.2850 | 0.009346 (0.996387) | 0.019854 (0.992524) | 0.014407 (0.994964) | 0.017403 (0.994163) | 20             | CASFBNN
6.   | 0.2500 | 0.024556 (0.989857) | 0.008048 (0.996968) | 0.015601 (0.994842) | 0.010163 (0.996148) | 25             | MLP1NN
7.   | 0.2220 | 0.005128 (0.998126) | 0.01084 (0.996142)  | 0.015407 (0.994937) | 0.012306 (0.996271) | 30             | CASFBNN
8.   | 0.2000 | 0.008923 (0.99697)  | 0.016781 (0.993525) | 0.016469 (0.994497) | 0.019968 (0.992459) | 30             | CASFBNN
9.   | 0.1818 | 0.012568 (0.995154) | 0.018661 (0.992761) | 0.020861 (0.99242)  | 0.010468 (0.99607)  | 30             | FFTD-D2NN
10.  | 0.1667 | 0.011979 (0.995846) | 0.01306 (0.994865)  | 0.024377 (0.990805) | 0.02079 (0.992207)  | 25             | CASFBNN
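The 'Optimum Classifier' column of Table 5.26 can be cross-checked with a short script. The sketch below assumes that the optimum classifier for each 'α' is simply the one with the lowest average MSE; the selection in this work may also have weighed the R-Value. The MSE values are copied from the table, while the variable and function names are illustrative only.

```python
from collections import Counter

CLASSIFIERS = ("CASFBNN", "MLP1NN", "FFTD-D1NN", "FFTD-D2NN")

# alpha -> average MSE per classifier (same order as CLASSIFIERS), from Table 5.26
MSE_BY_ALPHA = {
    0.6670: (0.005505, 0.011999, 0.007465, 0.0084),
    0.5000: (0.012149, 0.00812, 0.012331, 0.010361),
    0.4000: (0.005247, 0.016963, 0.011403, 0.008055),
    0.3330: (0.015521, 0.009856, 0.022055, 0.008408),
    0.2850: (0.009346, 0.019854, 0.014407, 0.017403),
    0.2500: (0.024556, 0.008048, 0.015601, 0.010163),
    0.2220: (0.005128, 0.01084, 0.015407, 0.012306),
    0.2000: (0.008923, 0.016781, 0.016469, 0.019968),
    0.1818: (0.012568, 0.018661, 0.020861, 0.010468),
    0.1667: (0.011979, 0.01306, 0.024377, 0.02079),
}


def best_classifier(mse_row):
    """Return the name of the classifier with the smallest MSE in one table row."""
    idx = min(range(len(mse_row)), key=mse_row.__getitem__)
    return CLASSIFIERS[idx]


# Winner per alpha, and how often each classifier wins overall.
winners = {alpha: best_classifier(row) for alpha, row in MSE_BY_ALPHA.items()}
tally = Counter(winners.values())
```

Under this minimum-MSE criterion, the tally reproduces the last column of the table.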
5.3 Conclusions
In this chapter, the results and analysis of the affective state detection of automotive drivers
using unsupervised as well as supervised learning methods have been presented. Kohonen Self
Organizing Maps (KSOM) were used as the unsupervised learning method, whereas a number of
ANN classifiers were employed as the supervised learning method.
Out of the classifiers employed, it was observed that the LRNN classifier performed
optimally for the 3-Class problem, identifying three affective states labeled as 'Relaxed',
'Moderate' and 'Stressed'. In comparison, the CASFBNN classifier performed optimally for
the 4-Class problem and identified four stress levels, 'Level-1' to 'Level-4', using single- as
well as multi-turn analysis. In addition, a stress-trend analysis using Trigg's Tracking
Variable (TTV) Vectors was also performed using a feature-weight allocation algorithm and
a neural network based regression model.
The results reflect that the LRNN classifier classified the three classes better in terms of
the performance indices considered, namely precision (89.23%), sensitivity (88.83%) and
specificity (94.92%). However, the CASFBNN classifier configuration (with window size = 25
and number of neurons in the hidden layer = 25) provides a fair representation of the four
class labels, with a precision of 77.94%, a sensitivity of 78.20% and a specificity of 93.73%
on the same performance indices.
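For reference, the three performance indices can be computed from a multiclass confusion matrix in a one-vs-rest manner, as sketched below. The matrix used in the usage note is hypothetical and is not data from this work; macro-averaging over the classes is also an assumption, since the averaging scheme is not restated here.

```python
def per_class_metrics(cm):
    """One-vs-rest precision, sensitivity and specificity per class.

    cm[i][j] = number of samples of actual class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                          # actual k, predicted otherwise
        fp = sum(cm[r][k] for r in range(n)) - tp     # predicted k, actually otherwise
        tn = total - tp - fn - fp
        metrics.append({
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
        })
    return metrics


def macro_average(metrics, key):
    """Unweighted mean of one index over all classes (assumed averaging scheme)."""
    return sum(m[key] for m in metrics) / len(metrics)


# Hypothetical 3-class example (rows: actual, columns: predicted).
example_cm = [[8, 1, 1],
              [2, 7, 1],
              [0, 2, 8]]
example_metrics = per_class_metrics(example_cm)
```

For the first class of this hypothetical matrix, precision and sensitivity are both 0.8 and specificity is 0.9.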
It has been observed in the literature that as the number of classes increases, the
classification accuracy decreases. However, the impact of the individual class labels cannot
be ignored in such diverse driving conditions.
Therefore, it is inferred that the CASFBNN classifier would be a suitable choice for
the proposed WDAS.
Chapter 6
A Proposed Architecture for the Resultant Wearable Driver Assistance
System
In the present chapter, a proposed⁵⁶ pervasive computing architecture has been discussed.
This comprises the elements of a wearable driver assistance system (WDAS) as well as the
computational, storage and communication elements of a vehicular computer, envisaged as
the main building blocks. In a life-threatening situation, communication of life-critical
information requires that the related tasks be initiated, tracked and terminated in an
appropriate manner. The key functions of paramount importance are extraction of the
relevant and critical information, authentication, data integrity and, most importantly,
provision of security to drivers as well as passengers.
As discussed in Section 2.4, such a problem can be tackled either (a) by using wearable
computers alone or (b) by adopting a hybrid approach. In the hybrid approach, sensors
embedded in the vehicle's environment could be used together with a WDAS, that is, a
fabric-based flexible wearable garment worn on the driver's body. The vital sign and fitness
parameters thus sensed and extracted could be helpful in generating the necessary alerts to
the drivers. In order to obtain additional help from external recovery agencies when an
accident takes place, the system could exploit this information and use an appropriate
communication medium. A larger pervasive computing environment consisting of the sensing
elements, communication devices for the WDAS as well as for the in-vehicle and outside
environments, processing elements etc. could be realized to solve this complex problem.
6.1 About the Architecture of the Overall Envisioned Ubiquitous Computing
Environment
The proposed BITS-LifeGuard ubiquitous computing environment, shown in Fig. 6.1, consists
of two components: (a) the WDAS and (b) the vehicular computer. In both of these units,
intra-vehicular and inter-vehicular communication channels, backup memories and backup
processing facilities have been envisioned to enhance robustness. A number of physiological
sensors may help in collecting the vital sign information of a driver, which could be further
analyzed by applying select machine learning algorithms to identify certain patterns of
interest. The system will alert the driver if the driver shows signs of being inattentive or
stressed, depending upon the patterns observed. Additionally, an SOS could be sent to the
vehicular computer via the external communication unit (ECU) of the WDAS when the driver,
even after being alerted, does not take appropriate action. For intra-WDAS communication,
no encryption is needed once the authentication succeeds.
56 This chapter presents a possible logical application of the outcome of the work done. As such, proposing an
architecture was not within the scope of the targeted work. However, an attempt has been made here in order to
help indicate one of the many possible ways in which such devices could be physically built in future.
Figure 6.1: Functional blocks of the Pervasive Computing Environment of the Vehicle and the
Wearable Computer
In the likely scenario of several vehicles and their WDAS forming a personal area
network (PAN) in the vicinity of a host vehicle, secure communication can be achieved by
providing both authentication and encryption to avoid unwanted access. The short distance
communication units (SDCUs) of the vehicle's environment communicate with the ECUs of
the WDAS, while the long distance communication unit (LDCU) will communicate with
outside agencies such as the recovery agencies. In this case, a robust encryption and
authentication mechanism is needed so that misleading information does not reach the
recovery agencies. Human intervention is needed in all the interactions discussed above
whenever such a situation arises.
Additionally, the vehicular computer should process and interpret the GPS and GIS data and
initiate the necessary action after receiving the SOS signal. In the event of an
accident, the system would be able to automatically inform the rescue agency or highway
staff in addition to the nearest trauma control center or hospital.
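The escalation behaviour described in this section (alert the driver first, and send an SOS to the vehicular computer only if the driver does not respond) could be expressed as a small decision function. The sketch below is a hypothetical illustration: the names `Action` and `next_action` and the limit of three alerts are assumptions and are not part of the proposed design.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    ALERT_DRIVER = "alert_driver"
    SEND_SOS = "send_sos"          # SOS from the WDAS ECU to the vehicular computer


def next_action(state_is_risky, alerts_issued, driver_responded, max_alerts=3):
    """Decide the next WDAS action.

    state_is_risky   -- True if an inattentive or stressed pattern was detected
    alerts_issued    -- number of alerts already given for the current episode
    driver_responded -- True if the driver took appropriate action
    max_alerts       -- assumed number of alerts before escalation (hypothetical)
    """
    if not state_is_risky or driver_responded:
        return Action.NONE
    if alerts_issued < max_alerts:
        return Action.ALERT_DRIVER
    return Action.SEND_SOS
```

On receiving the SOS, the vehicular computer would then take over the notification of outside agencies, which lies beyond the WDAS boundary.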
While the above description applies to the overall vision related to the project, the scope of
the work presented here is strictly limited to the wearable element alone. As a consequence,
in the following sections an attempt shall be made to evolve and present an expanded view of
the architecture of the wearable computing part of the referred environment, such that the
treatment ends at the boundary that separates the WDAS from the rest of the environment.
6.2 Identification of Constituent Elements of the Resultant System
Based on the above referred architectural framework, the constituent elements of the
proposed WDAS system would fall in the following categories: (a) body mounted sensors (b)
analog building blocks (signal conditioning blocks) and analog to digital converter (c)
processing elements (d) storage elements (e) actuation blocks (f) communication elements
and (g) power provisioning block (Conjeti et al., 2012).
As discussed in the foregoing chapters, the physiological sensors could be selected by
considering their locations on the driver's body. For the proposed WDAS, the results
discussed in Chapter 5 indicate that the GSR and PPG sensors could be the two possible
choices. The PPG signal based features correlate with the ECG signal based features,
reflecting the stress patterns of drivers. The PPG sensor measures the blood volume flowing
through the peripheral tissues, captured either at the finger or at the ear-lobe, resulting in
a PPG pulse. The GSR signal based features reflect the instantaneous and startle responses
of the autonomic nervous system (ANS). In the literature, two locations for the GSR
electrodes have been identified: (a) at the index and middle fingers and (b) at the toes. This
selection leads to two possible solutions: (i) a wrist-worn device and (ii) a body-hugging
device. In the wrist-worn device⁵⁷, a special mechanical assembly could be designed to place
the PPG sensor, similar to the AMON device (Anliker et al., 2004). Since the drivers would
feel uncomfortable while wearing gloves, the GSR sensor could be placed in the shoes or
socks, touching the toes appropriately. In a body-hugging device, a number of sensors,
including a 1-lead to 3-lead ECG which provides more robust CVD detection (Park and
Jayaraman, 2003; Paradiso et al., 2005), could be included along with respiration,
temperature, activity etc. sensors besides the PPG and GSR sensors. A GSR sensor with
Bluetooth might be integrated in many forms including bracelets, wristwatches etc.,
depending upon the comfort level and personal preferences (Poh et al., 2010). The GSR and
PPG sensors may even be a part of either a wrist-band or the sleeves of a full-sleeved jacket.
57 Additionally, the wrist-worn device can host some other optional sensors like a BP cuff, temperature, activity
etc. by appropriate placement and integration.
The necessary signal conditioning for individual sensors will be achieved in the form of
amplifier blocks, filtering blocks etc. These circuits deliver an appropriate analog signal
corresponding to the sensor readings, which could be converted to digital form by
employing an analog-to-digital converter (ADC). The ADC should meet the requirements
with respect to its sampling rate, resolution, conversion time and accuracy. The
choice of an ADC depends upon the processor selected, as it may be available either on-chip
or off-chip. In the case of an off-chip ADC, the appropriate interfacing requirements of the
processor should be met.
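The resolution and sampling-rate requirements mentioned above can be quantified with two standard relations: the size of one ADC step (LSB) is the reference voltage divided by 2^N for an N-bit converter, and the sampling rate must be at least twice the highest signal frequency of interest (Nyquist criterion). In the sketch below, the 12-bit resolution follows the ADCs listed in Table 6.1, whereas the reference voltages and function names are assumptions made for illustration.

```python
def adc_lsb_volts(v_ref, bits):
    """Voltage represented by one least-significant bit of an ideal ADC."""
    return v_ref / (2 ** bits)


def quantize(voltage, v_ref, bits):
    """Ideal ADC output code for an input voltage, clamped to the valid range."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))


def min_sampling_rate_hz(highest_signal_freq_hz):
    """Nyquist lower bound on the sampling rate for a band-limited signal."""
    return 2.0 * highest_signal_freq_hz


# Example with an assumed 3.3 V reference and a 12-bit converter.
lsb = adc_lsb_volts(3.3, 12)    # roughly 0.8 mV per step
```

In practice, the sampling rate is usually chosen well above the Nyquist bound to ease the anti-aliasing filter design.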
The processing elements in the context of the WDAS have to perform various tasks, including
the conversion of sensed physiological signals from analog to digital form, digital signal
processing tasks such as feature extraction, read-and-write operations on memory,
machine learning tasks to recognize the patterns of interest, and generation of alarms and
alerts to drivers, besides the wired and wireless communication tasks involving serial
communication interfaces. These requirements pose a great challenge for the selection of
processors among the available choices.
The choice of processor for the development of such a device should meet the generic
requirements of low power, high performance, low cost etc. Among the many available
microcontroller families from Intel, Freescale, Microchip, ARM etc., the current
trends indicate that the ARM architecture is prevalent in embedded applications due to a
large variety of peripheral support.
Among the several available ARM processor families, the ARM Cortex™-M3 of the Cortex
series has been utilized to map a medical application, namely EEG seizure detection, with
microwatt power consumption on an SOC embedded platform (Sridhara et al., 2011). Such an
SOC has the potential to map other sensors, like a hazardous gas sensor, besides providing
accelerators for tasks such as encryption and arithmetic operations, and supporting
frequency domain signal processing algorithms implemented on the chip.
Recently, ARM introduced the ARM Cortex™-M4 processor architecture, which has been
designed to address digital signal control applications and seems suitable for wearable
medical grade applications. The characteristic features of the ARM Cortex™-M4
embedded processor include high-efficiency signal processing functionality, low power, low
cost and ease of use. The targeted applications include motor control, automotive, power
conversion and management, embedded audio, communications, industrial automation and
medical devices. Leveraging these benefits, several vendors have come out with their SOC
designs for medical grade applications. Table 6.1 compares several ARM Cortex™-M4
processor families along with the Texas Instruments OMAP 35xx series of
processors based on the ARM Cortex™-A8⁵⁸ core, which are suitable for medical device
development. Although a WDAS of this kind shall not be a medical device, its intrinsic
features overlap several key characteristics of medical wearables. It is in this context,
keeping wearer safety in view, that the recommendations have been made here.
Had separate processor cores been used, the system would have required
multiple variants of storage elements to serve as external cache, RAM, ROM, EEPROM etc.
However, recent trends indicate that quite a few SOCs have adequate built-in resource
provisions that fit the requirements of the proposed WDAS.
The primary purpose of actuators in such a WDAS is to alert the drivers in situations where
they may not be able to anticipate a dangerous situation leading to an accident. Several
forms of actuators exist, including vibratory, auditory, visual etc. (Lee et al., 2004). In
the case of a wrist-worn device, a vibratory and auditory actuator may be a good choice. In
the case of a body-hugging device, the vibratory actuators must be positioned in such a way
that they do not create any health issue; for example, locations close to the heart or in the
vicinity of seat-belts must be avoided. It would be appropriate to place such a device on
either the upper or the lower arm of the driver so that it is minimally intrusive. It should
have a very low (ideally no) possibility of tickling or creating an irritating sensation that
could lead to the kind of distraction having the potential to adversely affect driving
safety. Care must be taken while issuing auditory alerts. The sound, preferably a voice clip,
must be played for a short duration as an alert and it should not annoy or startle the wearer.
A number of popular short range communication protocols exist, such as Bluetooth, Zigbee
and Wi-Fi, which can serve as the vehicles for communication between the wireless
components of the WDAS as well as between the WDAS and the vehicular computer. In
addition, the voice / data capabilities offered by 2.5G/3G/4G cellular networks would be
provided within the vehicular environment for intimation of relevant information, in the
event of need, to pre-programmed agencies and people.
58 Cortex-M4 Processor. Available Online: http://www.arm.com/products/processors/cortex-m/cortex-m4-processor.php.
Table 6.1: A Possible List of Medical Grade Microcontroller / System-On-Chip Families for WDAS Design
Manufacturer / MCU Family / Architecture / Core | Max. Freq. (MHz) | Memory | Data Size (bits) | ADC / DAC (bits) | Serial Ports | Temp. Range (°C) | Operating Voltage (V) | DSP Support (MDU/MAC) | Additional Resources | Applications
------------------------------------------------|------------------|--------|------------------|------------------|--------------|------------------|-----------------------|-----------------------|----------------------|-------------
Atmel SAM4L Cortex™-M4 Flash MCU | 48 | 128 KB - 256 KB Flash | 32 | 12 / 10 | UART, SPI, I2C | -40 to +85 | 1.68 to 3.6 | Yes | Timers, LCD, hardware cryptography, USB host and device, QTouch technology | Sensors and Detectors, Medical Meters, Remote Controls, Toys, Wireless Devices etc.
Atmel SAM4S Cortex™-M4 Flash MCU | 120 | Up to 2 MB Flash; 160 KB SRAM; 2 KB Cache | 32 | 12 / 12 | UART, SPI, I2C, SSC, SD / eMMC | -40 to +85 | 1.62 to 3.6 | Yes | Timers, PWMs, LCD, hardware cryptography, USB host and device, QTouch technology | Medical, Industrial control, Industrial automation, M2M, Smart grid, Consumer and computing peripherals, Embedded audio
Infineon XMC4000 Cortex™-M4 | 80-120 | 128 KB - 1 MB Flash; 160 KB SRAM; 4 KB Cache | 32 | 12 / 12 | UART, SPI, I2C, I2S, SD / eMMC, CAN | -40 to +125 | 3.3 | Yes | Timers, PWMs, Delta Sigma Demodulators, USB host and device, Ethernet, Touch Interface and LED Matrix, Math Coprocessor | Motor Control, Position Detection, IO Devices, HMI, Solar Inverters, SMPS, Sense & Control, PLC, UPS, Light Networks
Freescale K20 USB MCUs Cortex™-M4 | 50-120 | 32 KB - 1 MB Flash | 32 | 12 / 12 | UART, SPI, I2C, I2S, SD / eMMC, CAN | -40 to +105 | 1.7 - 3.6 | Yes | Timers, PWMs, USB host and device, Ethernet, Touch Interface, Encryption Hardware Accelerator | Wearable Wireless Healthcare Patch
TI OMAP 35xx Cortex™-A8 | 720 - 1000 | 64 KB RAM, 112 KB ROM | 64/32 | - | UART, SPI, I2C, SD/MMC/SDIO | -40 to +105 | 1.8 - 3.0 | Integrated TI 64x DSP | Timers, PWMs, USB host and device, Camera Interface, Color and Monochrome Display Interface | Automotive, industrial automation, enterprise and mobile consumer, medical
LPC4000 | 120 | 64-512 KB Flash, 24-96 KB RAM | 32 | 12/10 | UART, SPI, I2C, SD/MMC/SDIO, CAN, SSP | -40 to +85 | 3.6 | Yes | Timers, PWMs, LCD, USB, Ethernet | Motor control and power management, industrial automation and robotics, medical, automotive accessories, embedded audio
LPC4300 | 204 | 0-1024 KB Flash, 104-282 KB RAM | 32 | 12/10 | UART, SPI, I2C, SD/MMC/SDIO, CAN, SSP | -40 to +85 | 3.6 | Yes | Timers, PWMs, LCD, USB, Ethernet | -do-
All the communication involved in the WDAS environment as well as between the WDAS
and the vehicular computer would be suitably encrypted and would not proceed without
device-to-device authentication (both outside the scope of the present work).
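Although the authentication and encryption mechanisms are outside the scope of the present work, the idea of device-to-device authentication can be indicated with a simple challenge-response exchange based on a pre-shared key and HMAC-SHA256 from the Python standard library. This is a minimal sketch under stated assumptions (a key already provisioned on both devices, hypothetical function names) and not the mechanism of the proposed system; it authenticates a device but does not by itself encrypt the data.

```python
import hashlib
import hmac
import secrets


def new_challenge(n_bytes=16):
    """Verifier side: generate a fresh random challenge (nonce)."""
    return secrets.token_bytes(n_bytes)


def respond(shared_key, challenge):
    """Prover side: prove knowledge of the key without transmitting it."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key, challenge, response):
    """Verifier side: recompute the expected response and compare in constant time."""
    expected = respond(shared_key, challenge)
    return hmac.compare_digest(expected, response)


# Example: the WDAS and the vehicular computer hold the same (assumed) pre-shared key.
key = secrets.token_bytes(32)
challenge = new_challenge()
authenticated = verify(key, challenge, respond(key, challenge))
```

A fresh challenge for every session prevents a recorded response from being replayed later.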
Table 6.2: A Possible List of Communication Elements for WDAS Design
(Source: Lee et al., 2007)
Communication Standards   | Range (m) | Freq. Band                | Max. Data Rate             | Sensitivity (dBm) | Application Areas
--------------------------|-----------|---------------------------|----------------------------|-------------------|------------------
Bluetooth (IEEE 802.15.1) | 10-100    | 2.4 GHz                   | 1-3 Mbps                   | 0 - 10            | Cordless mouse, keyboard, and hands-free headset and mobile phones
Zigbee (IEEE 802.15.4)    | 10-75     | 868 MHz; 915 MHz; 2.4 GHz | 20 kbps; 40 kbps; 250 kbps | (-25) - 0         | smart meters, home automation and remote controls
UWB (IEEE 802.15.3)       | 10        | 3.1 - 10.6 GHz            | 110 Mbps - 480 Mbps        | -41.3 /MHz        | high-bandwidth multimedia networks
Wi-Fi (IEEE 802.11/a/b/g) | 100       | 2.4 GHz; 5 GHz            | 54 Mbps                    | 15 - 20           | large data transfer, computer-to-computer
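A rough feasibility check of the standards in Table 6.2 against the raw data rate of the two selected sensors can be sketched as follows. The maximum data rates are taken from the table (for Bluetooth the lower figure of 1 Mbps is used), whereas the sampling rate of 256 Hz, the 50% headroom for protocol overhead and all names are assumptions made only for illustration.

```python
# Maximum data rates in kbps, taken from Table 6.2.
MAX_RATE_KBPS = {
    "Bluetooth": 1000,
    "Zigbee (2.4 GHz)": 250,
    "UWB": 110000,
    "Wi-Fi": 54000,
}


def raw_payload_kbps(channels, sample_rate_hz, bits_per_sample):
    """Uncompressed sensor payload, ignoring protocol overhead."""
    return channels * sample_rate_hz * bits_per_sample / 1000.0


def feasible_links(required_kbps, headroom=0.5):
    """Standards whose usable rate (max rate x assumed headroom) covers the requirement."""
    return [name for name, rate in MAX_RATE_KBPS.items()
            if required_kbps <= rate * headroom]


# Two channels (PPG and GSR), 12-bit samples, assumed 256 Hz sampling rate.
required = raw_payload_kbps(channels=2, sample_rate_hz=256, bits_per_sample=12)
candidates = feasible_links(required)
```

Under these assumptions the raw payload is only a few kbps, so the data rate does not rule out any of the listed standards; range and power consumption would then decide the choice.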
Many small and portable devices use a battery as their power source. Since batteries
have limited capacity, the operating lifetime of such devices is also limited. In addition,
batteries add weight and volume to the overall design of such systems. In life-critical
devices such as a WDAS, two power sources are mandatory: (a) a main power source and (b)
an auxiliary power source. The auxiliary power could be utilized in case the main power
fails, just as is commonly provisioned for safety-critical equipment on board aircraft.
The main power source, such as rechargeable batteries, could be complemented by auxiliary
power sources based on piezoelectric, mechanical vibration, thermoelectric and solar
harvesting. However, solar cells cannot be used in the WDAS case, as they require direct
sunlight for charging. Human power could serve as an alternative power scavenging
technique, as suggested by Starner (1996).
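The limited operating lifetime imposed by the battery can be estimated to a first order from the battery capacity and the average current drawn, taking duty cycling between active and sleep modes into account. All the figures in the sketch below (currents, duty cycle, capacity and usable fraction) are hypothetical and are meant only to show the calculation.

```python
def average_current_ma(active_ma, sleep_ma, duty_cycle):
    """Average current when the device is active for a fraction `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must lie in [0, 1]")
    return duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma


def battery_life_hours(capacity_mah, avg_current_ma, usable_fraction=0.8):
    """First-order lifetime estimate; usable_fraction is an assumed derating factor."""
    return capacity_mah * usable_fraction / avg_current_ma


# Hypothetical example: 20 mA active, 0.5 mA asleep, active 10% of the time.
avg_ma = average_current_ma(active_ma=20.0, sleep_ma=0.5, duty_cycle=0.1)
hours = battery_life_hours(capacity_mah=500.0, avg_current_ma=avg_ma)
```

Such an estimate also indicates how long an auxiliary source would have to bridge a failure of the main source.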
Any such system requires both system software and application software, that is, an
embedded real-time operating system along with the appropriate device drivers and
one or more appropriate embedded application programs.
6.3 The Proposed System Design
Consequent to the discussion in the foregoing sections, a possible recommended architectural
framework and a tentative design have evolved, as shown in Fig. 6.2 and Fig. 6.3 respectively.
Figure 6.2: Architectural Framework of the Proposed Wearable Driver Assistance System
As can be seen from Fig. 6.2, three basic tasks, namely sensing, processing and
actuating or communicating, form the basic architecture of an embedded system like a
WDAS. With a proper selection of the necessary sensors, processing elements, storage
devices, coprocessing elements, and wired and wireless communication elements, along with
the vibratory and auditory actuators, the WDAS could be realized. A more detailed design
involving the required basic hardware building blocks, indicated in Fig. 6.3, reflects that a
number of peripherals are needed to meet the WDAS design requirements. A general purpose
microcontroller with all the necessary peripherals like input and output ports, specific serial
communication devices, analog-to-digital converters, digital-to-analog converters etc. may be
used for sensing and interfacing needs. An integrated digital signal processor (DSP) may be
used to off-load digital signal processing tasks from the processor. Encryption hardware may
be included either separately as an accelerator or as part of the SOC design itself.
Although displays are not a mandatory element of the proposed design, in order to avoid
distraction, they can optionally be included to inform the drivers about certain
readings. Several serial bus protocols like I2C, I2S, USB etc. may be used to interface
devices such as external flash memories, speakers or microphones and a USB human interface
device (HID).
Figure 6.3: Hardware Building Blocks of the Proposed Wearable Driver Assistance System
The complete logical flow for the affective state detection of drivers has been shown in Fig.
6.4.
Figure 6.4: Affective State Detection: Complete Logical Flow
As shown above, an intelligent inference engine is needed to take care of the signal
processing and decision making tasks based on the neural network training and the results
obtained for affective state detection. Fig. 6.5 shows the logical flow to implement such an
inference engine, which may be implemented either on an integrated DSP or on a separate
coprocessing unit attached to the microcontroller bus.
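The general shape of such an inference engine (buffer a window of samples, extract features, classify) is sketched below. It does not reproduce the flow of Fig. 6.5: the class name, the two features (mean and standard deviation) and the threshold rule standing in for the trained neural network are all assumptions made for illustration, and the default window of 25 samples merely mirrors the window size quoted for the CASFBNN configuration in Section 5.3.

```python
from collections import deque
from statistics import mean, pstdev


class InferenceEngine:
    """Sliding-window feature extraction followed by a pluggable classifier."""

    def __init__(self, classify, window_size=25):
        self.classify = classify                    # callable: features -> label
        self.window = deque(maxlen=window_size)     # keeps only the latest samples

    def push(self, sample):
        """Add one sample; return a label once the window is full, else None."""
        self.window.append(sample)
        if len(self.window) < self.window.maxlen:
            return None
        features = (mean(self.window), pstdev(self.window))
        return self.classify(features)


def threshold_classifier(features, limit=0.7):
    """Placeholder for the trained network (hypothetical rule on the window mean)."""
    return "Stressed" if features[0] > limit else "Relaxed"


engine = InferenceEngine(threshold_classifier, window_size=3)
labels = [engine.push(x) for x in (0.9, 0.9, 0.9, 0.1, 0.1)]
```

Keeping the classifier as a pluggable callable would allow the same loop to run whether the trained network is hosted on an integrated DSP or on a separate coprocessing unit.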
6.4 Possible Implementation Approaches
The proposed WDAS would comprise body-mounted sensors, communication devices,
actuators, and appropriate computing elements. The computing elements around which the
WDAS will be built might be realized slightly differently in the various candidate approaches:
(a) A combination of a set of microcontrollers (each preferably in system-on-chip format)
and an electronic textile (e-fabric) based system design approach with an embedded operating
system and integrated special purpose application software.
(b) A combination of a reconfigurable computing element such as a Field Programmable Gate
Array (FPGA) and e-fabric, as well as appropriate corresponding hardware and software
elements.
(c) A combination of microcontroller, FPGA, e-fabric and appropriate hardware and
software elements. In such systems, FPGAs may be used for certain additional tasks which
could not be readily implemented on the microcontrollers.
Figure 6.5: Intelligent Inference Engine: Logical Flow
Out of these approaches, the first one is recommended in this work. The resultant physical
system (WDAS) should be created in the form of a body-hugging vest or an all-weather
full-sleeved jacket with select body-hugging areas like the wrist, chest etc.
Chapter 7
Conclusion
The principal focus of this work has been to perform an analysis for stress level detection of
vehicular drivers involving minimal sensing parameters while extracting a large set of
relevant physiological features, towards the development of a wearable driver assistance
system. In order to achieve this goal, as discussed in Section 3.4, the first step was the
collection of physiological data from automotive drivers using body mounted sensors under
real-life conditions. In the next step, signal processing was carried out to remove noise as
well as motion artefacts, followed by extraction and selection of relevant features. This was
followed by analysis and identification of driver profiles for the purpose of understanding
their behavioral traits. Subsequently, neural network classifiers were employed to classify
the drivers' stress into the respective categories of affective states. In addition, a novel
feature-weight allocation algorithm as well as a neural network based regression model were
developed for the detection of stress-trends, as presented in Section 5.2. Finally, a
bio-inspired ubiquitous computing architecture has been proposed in the context of the
BITS-LifeGuard Wearable Computing environment.
7.1 Principal Contributions of the Thesis
The contributions of the thesis can be summarized as below:
• Real-life primary data was collected so as to obtain a credible view of the typical
constraints and environmental settings observed in Indian driving conditions. While
secondary data was carefully looked into in the initial stages of this work, it soon
became obvious that such data did not faithfully represent the driving conditions and
environments prevalent in this part of the world, which are vastly different from those
of western countries.
• Unlike the simulated driving conditions reported in most of the literature, this work
almost exclusively used real-life field test data and is thus free from several
possible errors that are known to be commonly present in existing simulation
models. In turn, this allowed the establishment of a more reliable correlation of the
stress-level experienced by the drivers under varying conditions.
• This work involved the extraction of 39 statistical, structural and spectral features from
the physiological signals, whereas the next best work reported in the literature utilized
only 22 features (Healey and Picard, 2005). Consideration of such a large number of
features reflects a more faithful and fine-grained representation of the concerned
environments in terms of the autonomic nervous system responses, which are used to
extract the level of stress of the subject at any given instant of time.
• In the context of Driver-Profile Analysis, the use of the Cox proportional hazard model
firmly established the significance of the 'current physiological state (CPS)', as it came
out to be the most important predictor with the highest hazard ratio.
• The affective state detection involved modeling the given problem as a multiclass
problem by performing two comparative analyses: (i) a 3-Class model using seven
different neural network configurations and (ii) a 4-Class model using six different
neural network configurations. Thus, the work was able to establish consistent
performance by way of having attempted a large number of classifier
configurations.
• A new and comprehensive approach of multi-stage verification was followed by
employing multi-turn driver data in the analysis in addition to single-turn data. Thus,
the work accounts for the intra- as well as inter-subject variability.
• The effects of stressful events and incidents on the driver's stress-level have been
comprehensively analyzed using stress-trend detection approaches with the help of
Trigg's Tracking Variable (TTV). This is significant since it resulted in a unique
approach that involved the use of the TTV vector to develop a feature-weight allocation
algorithm as well as to model a neural network based regression problem for detecting
stress-trends.
In addition, the thesis has culminated in a fine-grained, bio-inspired ubiquitous computing
architecture, which has been proposed to serve as the blueprint of driver-centric wearable
driver assistance systems in the near future.
7.2 Limitations of the Work Done
This particular work does not consider the drivers other than those who regularly drive cars
or taxis for substantial distances and time often involving rural, semi-rural and highway
roads. This led to a limitation in the sense that all the real-life test data that was collected over
a long period of time could not involve women drivers as well as those whose daily driving
run is anywhere less than 60 kilometers per day on an average.
The second limitation of this work stems from the fact that the research team did not have
access to professional driving simulators having built-in hazard-simulation capability. This
resulted in lack of primary data which could have been otherwise collected for the purpose of
capturing driver's reflex levels and their corresponding response quality in the event of
unexpected appearance of a hazardous condition.
The line of approach chosen in the course of this research attempted to strike a balance
between achievable and useful results within the available period of time which excluded the
possibilities like evolving a formal analytical framework and rigorous mathematical modeling
which would have been as important as the bias demonstrated by this work towards
experimental analysis. This is not to say that the presented work is any less rigorous; but to
simply acknowledge the fact that a more formal approach has been avoided in favor of
emphasis on experiments and semi-formal analysis.
7.3 A Comparison with Relevant Contemporary Works
Healey and Picard (2005) used physiological features to identify three levels of driver stress
(low, medium and high) with an accuracy of 97.4% using Fisher projection and a linear
discriminant classifier, involving 3 drivers on different driving days. Katsis et al. (2008)
obtained overall classification rates of 79.3% for SVM and 76.7% for ANFIS classifiers in
classifying four emotional states of car racing drivers, viz. high, low, disappointment and
euphoria. Patel et al. (2011) extracted heart rate variability features to classify the early
onset of fatigue into two classes, alert and fatigued, with an accuracy of 90%. In comparison,
the present work first classified the stress of 19 drivers into three stress-levels using an
LRNN classifier, obtaining the three cardinal measures of precision, sensitivity and
specificity as 89.23%, 88.83% and 94.92% respectively. The second analysis involved 14
drivers and classified the stress-levels into four different levels using the CASFBNN
classifier, computing performance measures such as precision, sensitivity, specificity,
classifier accuracy, area under the ROC curve and the kappa statistic. While some of these
figures may not initially seem impressive, they are still significant, since the performance
of a classifier depends upon several parameters such as the data collection scenarios and
methods, the identified stress classes and the performance metrics, amongst other things.
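Of the performance measures named above, classifier accuracy and the kappa statistic can both be derived from the confusion matrix. The sketch below uses the standard definition of Cohen's kappa, i.e. the observed agreement corrected for the agreement expected by chance; the function name is illustrative and the matrix in the usage note is hypothetical, not data from this work.

```python
def accuracy_and_kappa(cm):
    """Classifier accuracy and Cohen's kappa from a confusion matrix.

    cm[i][j] = number of samples of actual class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total x column total) / total^2.
    expected = sum(
        sum(cm[i]) * sum(cm[r][i] for r in range(n)) for i in range(n)
    ) / (total ** 2)
    if expected == 1.0:
        # Kappa is undefined when all samples fall into one class; 0.0 is reported here.
        return observed, 0.0
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 2-class example.
acc, kappa = accuracy_and_kappa([[20, 5], [10, 15]])
```

For this hypothetical matrix the accuracy is 0.7 while kappa is 0.4, which shows why kappa is a stricter measure than accuracy when some agreement is expected by chance.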
While several recent works like Healey and Picard (2005), Katsis et al. (2008) and Patel
et al. (2011) have made use of several sensor types (GSR, ECG, PPG, Respiration, EMG
etc.), the present work establishes, on the basis of the primary data collected and its
analysis, that comparable results can be obtained by using just two types of sensors,
namely PPG and GSR (Singh et al., 2013a). Consequently, the resultant architecture
presented allows the design and building of a less complex, smaller, energy-efficient yet
reliable system at significantly lower cost. Table 7.1 attempts to summarize some of the
above referred points as well as a few additional points of comparison with relevant
contemporary works.
Table 7.1: Comparative Analysis of Proposed Approach against Existing Approaches
for Driver Stress Detection
Authors | Objective | Physiological Signals Used | Subjects | Scenario | Classifier Employed | Performance | Comments
--------|-----------|----------------------------|----------|----------|---------------------|-------------|---------
Healey and Picard (2005) | To determine driver's overall stress-level | ECG, EMG, GSR, RSP: 22 features | 3 | Real-time driving | Fisher Projection and Linear Discriminant Analysis | Overall accuracy 97.4% | 3 stress class: low, medium and high
Katsis et al. (2008) | To evaluate the emotional states of car racing drivers | ECG, EMG, GSR, RSP: 12 features | 10 | Simulated (Laboratory) | Support Vector Machines (SVM) and Adaptive Neuro Fuzzy System (ANFIS) | Overall classification rates 79.3% (SVM), 76.7% (ANFIS) | 4 emotional states: high, low, disappointment and euphoria
Patel et al. (2011) | To detect early onset of fatigue on drivers | ECG based Heart Rate Variability Analysis: 1 feature | 12 | Simulated (Laboratory) | Feed Forward Neural Network (without feedback) | Accuracy 90% | 2 states: alert and fatigue
This Work (3-Class problem) | To determine driver's overall stress-level | PPG, GSR and derived features for HRV Analysis: 39 features | 19 | Real-time driving | Neural Network (Exhaustive Evaluation using 7 configurations including both feed forward backpropagation and recurrent) | Predictive Ability (Precision): 89.23%; Sensitivity: 88.83%; Specificity: 94.92% | 3 stress class: relaxed, moderate and stressed
This Work (4-Class problem) | To determine driver's overall stress-level | PPG, GSR and derived features for HRV Analysis: 39 features | 14 | Real-time driving | Neural Network (Exhaustive Evaluation using 6 configurations including both feed forward backpropagation and recurrent) | Predictive Ability (Precision): 77.94%; Sensitivity: 78.20%; Specificity: 93.73% | 4 stress class: Level-1 to Level-4
Legend: ECG: Electrocardiogram; EMG: Electromyogram; GSR: Galvanic Skin Response; PPG: Photoplethysmogram; RSP: Respiration; HRV: Heart Rate Variability.
It may be of interest to note here that the present work takes into account a
larger driver population, with varied age groups and driving categories, to ensure variability
in the data, as well as the intra- and inter-subject variability examined through single-turn
and multi-turn analysis. In essence, the presented results assume greater significance than
those of contemporary works, not only in terms of lower design complexity but also in terms of
their applicability to a significantly wider range of vehicular drivers.
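For readers comparing the precision, sensitivity and specificity figures in Table 7.1, the following is a minimal sketch of how such measures are commonly derived for a multi-class classifier from its confusion matrix, treating each class in turn as "positive" (one-vs-rest). The confusion matrix values, the function names and the use of macro-averaging are illustrative assumptions; they are not taken from the experiments reported in this work.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """One-vs-rest metrics; cm[i][j] = samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]                                # class k correctly identified
        fn = sum(cm[k]) - tp                         # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp    # other classes predicted as k
        tn = total - tp - fn - fp                    # everything else
        results.append({
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
        })
    return results


def macro_average(metrics: List[Dict[str, float]]) -> Dict[str, float]:
    """Unweighted mean of each metric over all classes (an assumed averaging choice)."""
    return {key: sum(m[key] for m in metrics) / len(metrics) for key in metrics[0]}


# Hypothetical 3-class confusion matrix (rows: true relaxed, moderate, stressed).
example_cm = [
    [45, 4, 1],
    [5, 40, 5],
    [0, 6, 44],
]

summary = macro_average(per_class_metrics(example_cm))
for name, value in summary.items():
    print(f"{name}: {100 * value:.2f}%")
```

Specificity is typically higher than sensitivity in such multi-class settings because, for each class, the true negatives pool the samples of all remaining classes.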
7.4 Future Scope
This work could be further expanded and improved in future along the following lines:
a) inclusion of a larger sample population cutting across all genders, terrains, vehicle types
and age groups.
b) use of a combination of vehicle-mounted and body-mounted sensors, so as to improve both
the usability and the overall reliability of the resultant driver assistance systems. In
other words, instead of vehicle-only (VDAS) or wearer-only (WDAS) approaches, a hybrid
approach is likely to be both more reliable and more user-friendly, and is therefore likely
to emerge as a strong candidate direction of research in the near future.
c) evolution of a rigorous analytical framework, a complementary direction of work
which could be of particular value in determining proof of correctness as
well as the complexity involved, prior to moving to the prototype stage.
d) identification of an all-weather e-fabric material able to survive washing or
cleaning, another important direction of research that would
eventually help build WDAS units of practical utility on a mass scale.
Several recent advances promise a paradigm shift towards increased adoption and
availability of driverless road transport vehicles, particularly in the categories of light
commercial vehicles (LCVs), cars and taxis. Efforts like the Google Self-Driving Car and INRIA's
driverless taxis are good examples of what might become a trend in time to come. Even
commercial vehicle companies like General Motors have indicated that they expect to roll out
driverless cars on a commercial scale by 2020. However, in spite of all of these
developments, it is extremely unlikely, in view of the production costs and complexity as
well as the degree of availability, that driverless cars will become the mainstream
vehicles in most economies of the world for at least the next two decades. As a consequence, the
research done as part of this project, and in many others that target a reduction in road
accidents by providing driver-centric solutions, remains both relevant and significant.
REFERENCES
Ali, A.B.M.S., and Wasimi, S.A. (2009). Data Mining: Methods and Techniques, India
Edition. Cengage Learning, New Delhi, (Chapter 5).
Allen, J. (2007). Photoplethysmography and its application in clinical physiological
measurement. Physiological Measurement, vol. 28, no. 3, pp. R1–R39.
Anliker, U., Ward, J. A., Lukowicz, P., Troster, G., Dolveck, F., Baer, M., Keita, F.,
Schenker, E. B., Catarsi, F., Coluccini, L., Belardinelli, A., Shklarski, D., Alon, M., Hirt,
E., Schmid, R., and Vuskovic, M. (2004). AMON: A Wearable Multiparameter Medical
Monitoring and Alert System. IEEE Transactions On Information Technology in
Biomedicine, vol. 8, no. 4, pp. 415-425.
Appelhans, B. M. and Luecken, L. J. (2006). Heart Rate Variability as an Index of Regulated
Emotional Responding. Review of General Psychology, vol. 10, no. 3, pp. 229–240.
Asada, H. H., Shaltis, P., Reisner, A., Rhee, S., and Hutchinson, R. C. (2003). Mobile
Monitoring with Wearable Photoplethysmographic Biosensors. IEEE Engineering in
Medicine and Biology Magazine (Special Issue on Wearable Sensors / Systems and Their
Impact on Biomedical Engineering), vol. 22, no. 3, pp. 28-40.
Atmel ARM based Microcontrollers. Available Online:
http://www.atmel.com/products/microcontrollers/arm/.
Baber, C., Haniff, D. J., and Woolley, S. I. (1999). Contrasting paradigms for the development of
wearable computers. IBM Systems Journal, vol. 38, no. 4, pp. 551-565.
Balakrishnama, S., and Ganapathiraju, A. (1998). Linear Discriminant Analysis- A Brief
Tutorial. Institute for Signal and Information Processing, Department of Electrical and
Computer Engineering, Mississippi State University, pp. 1-8.
Banerjee, Rahul. (2005). From Research to Classroom: A Course in Pervasive Computing.
IEEE Pervasive Computing, vol. 4, no. 3, pp. 83-86.
Benoit, A., Bonnaud, L., Caplier, A., Ngo, P., Lawson, L., Trevisan, D. G., Levacic, V.,
Mancas, C. and Chanel, G. (2009). Multimodal focus attention and stress detection and
feedback in an augmented driver simulator. Personal and Ubiquitous Computing, vol.
13, no. 1, pp. 33–41.
Bergasa, L. M., Nuevo, J., Sotelo, M. A., Barea, R., and Lopez, M. E. (2006). Real-Time
System for Monitoring Driver Vigilance. IEEE Transactions on Intelligent Transportation
Systems, vol. 7, no. 1, pp. 63-77.
Bewick, V., Cheek, L., and Ball, J. (2004). Statistics review 12: Survival analysis. Critical
Care, vol. 8, pp. 389-394.
BioTrace+ Manual V1.1, (2004-2006). User Manual for the BioTrace+ Software. Version
1.1, Mind Media B. V. Netherlands, 2004-2006.
Carsten, O.M.J. and Nilsson, L. (2001). Safety Assessment of Driver Assistance Systems,
European Journal of Transport and Infrastructure Research, vol. 1, no. 3, pp. 225–243.
Cembrowski, G.S., Westgard, J.O., Eggert, A.A., & Toren, E.C. (1975). Trend Detection in
Control Data: Optimization and Interpretation of Trigg’s Technique for Trend Analysis.
Clinical Chemistry, vol. 21, no. 10, pp. 1396–1405.
Choi, J., Ahmed, B., and Gutierrez-Osuna, R. (2012). Development and Evaluation of an
Ambulatory Stress Monitor Based on Wearable Sensors. IEEE Transactions on
Information Technology in Biomedicine, vol. 16, no. 2, pp. 279-286.
Choo, S. and Mokhtarian, P. L. (2004). What Type of Vehicle Do People Drive? The Role of
Attitude and Lifestyle in Influencing Vehicle Type Choice, Transportation Research Part
A: Policy and Practice, vol. 38, no. 3, pp. 201–222.
Ciaccio, E. J., Dunn, S. M. and Akay, M. (1993). Biosignal Pattern Recognition And
Interpretation Systems, Part 2 of 4: Methods for Feature Extraction and Selection. IEEE
Engineering in Medicine and Biology Magazine, vol. 12, no. 4, pp. 106-113.
Conjeti, S., Singh, R. R., and Banerjee, R. (2012). Bio-inspired Wearable Computing
Architecture and Physiological Signal Processing for On-road Stress Monitoring. In:
Proceedings of IEEE-EMBS International Conference on Bio-medical and Health
Informatics (BHI 2012), Hong Kong, Shenzhen, China, Jan 5-7, pp. 479 - 482.
Cortex-M4 Processor. Available Online:
http://www.arm.com/products/processors/cortex-m/cortex-m4-processor.php.
Cox, D. R. (1972). Regression Models and Life-Tables, Journal of the Royal Statistical
Society. Series B (Methodological), vol. 34, no. 2, pp. 187-220.
Cristianini, N. and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and
Other Kernel-Based Learning Methods. Cambridge, U.K.: Cambridge Univ. Press.
D'Agostino, R. B., Vasan, R. S., Pencina, M. J., Wolf, P. A., Cobain, M., Massaro, J. M. and
Kannel, W. B. (2008). General Cardiovascular Risk Profile for Use in Primary Care: The
Framingham Heart Study. Circulation, vol. 117, no. 6, pp. 743–753.
Das, D., Zhou, S., and Lee, J. D. (2012). Differentiating Alcohol-Induced Driving Behavior
Using Steering Wheel Signals. IEEE Transactions on Intelligent Transportation Systems,
vol. 13, no. 3, pp. 1355-1368.
Derringer, G. and Suich, R. (1980). Simultaneous Optimization of Several Response
Variables. Journal of Quality Technology, vol. 12, no. 4, pp. 214-219.
Di Milia, L., Smolensky, M. H., Costa, G., Howarth, H. D., Ohayon, M. M. and Philip, P.
(2011). Demographic factors, fatigue, and driving accidents: An examination of the
published literature. Accident Analysis and Prevention, vol. 43, no. 2, pp. 516–532.
Dong, Y., Hu, Z., Uchimura, K., and Murayama, N. (2011). Driver Inattention Monitoring
System for Intelligent Vehicles: A Review. IEEE Transactions on Intelligent
Transportation Systems, vol. 12, no. 2, pp. 596-614.
Duda, R.O., Hart, P.E., and Stork, D.G. (2006). Pattern Classification, Second Edition, Wiley
India, New Delhi, (Chapter 6).
European Commission. (2010). 7th Framework Programme ICT for Transport Future
Directions: i2010- intelligent car initiative. European Commission, Brussels, Belgium.
European Transport Safety Council (ETSC). (2001). The Role of Driver Fatigue in
Commercial Road Transport Crashes. Technical Report, ISBN: 90-76024-09-X.
European Transport Safety Council, Rue du Cornet 34, B-1040 Brussels. Available
Online: http://www.etsc.eu/oldsite/drivfatigue.pdf
Fairclough, S. H. (2009). Fundamentals of physiological computing. Interacting with
Computers. vol. 21, no. 1, pp. 133–145.
Fox, J. (2002). Cox proportional-hazards regression for survival data. An R and S-PLUS
companion to applied regression, pp. 1-18. Available Online:
http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
Giakoumis, D., Vogiannou, A., Kosunen, I., Moustakas, K., Tzovaras, D., and Hassapis, G.
(2010). Identifying Psychophysiological Correlates of Boredom and Negative Mood
Induced During HCI. In: B-Interface, Valencia, Spain, pp. 3-12.
Gietelink, O., Ploeg, J., De Schutter, B., and Verhaegen, M. (2006). Development of
advanced driver assistance systems with vehicle hardware-in-the-loop simulations.
Vehicle System Dynamics, vol. 44, no. 7, pp. 569–590.
Golias, J., Yannis, G. and Antoniou, C. (2002). Classification of driver assistance systems
according to their impact on road safety and traffic efficiency, Transport Reviews, vol.
22, no. 2, pp. 179-196.
Gorini, A. and Riva, G. (2008). The potential of Virtual Reality as anxiety management
tool: a randomized controlled study in a sample of patients affected by Generalized
Anxiety Disorder. Trials, vol. 9, no. 1, pp. 1-9. Available Online:
http://www.trialsjournal.com/content/9/1/25.
Gutiérrez-Osuna, R. (2002). Pattern Analysis for Machine Olfaction: A Review. IEEE
Sensors Journal, vol. 2, no. 3, pp. 189- 202.
Guyton, A. C., and Hall, J. E. (2006). Textbook of Medical Physiology. Eleventh Edition,
Elsevier Saunders.
Hagan, M. T., Demuth, H. B., and Beale, M. H. (1996). Neural Network Design. PWS
Publishing Company, Boston, USA.
Haykin, S. (2001). Neural Networks, Second Edition., Pearson Education, New Delhi,
(Chapters 3–6).
Healey, J. A. and Picard, R. W. (2005). Detecting Stress During Real-World Driving Tasks
Using Physiological Sensors. IEEE Transactions on Intelligent Transportation Systems,
vol. 6, no. 2, pp. 156-166.
Hjortskov, N., Rissén, D., Blangsted, A. K., Fallentin, N., Lundberg, U., and Søgaard, K. (2004).
The effect of mental stress on heart rate variability and blood pressure during computer
work. European Journal of Applied Physiology, vol. 92, no. 1-2, pp. 84–89.
Horberry, T., Anderson, J., Regan, M. A., Triggs, T. J. and Brown, J. (2006). Driver
Distraction: The Effects of Concurrent In-vehicle Tasks, Road Environment Complexity
and Age on Driving performance, Accident Analysis and Prevention, vol. 38, no. 1, pp.
185–191.
Hougaard, P. (1999). Fundamentals of Survival Data. Biometrics, vol. 55, no. 1, pp. 13-22.
Institute of Road Traffic Education (IRTE), India. Citing Internet Sources. Available Online:
http://www.irte.com.
International Harmonized Research Activities (IHRA) Working Group on ITS. (2011).
Design Principles for Advanced Driver Assistance Systems: Keeping Drivers In-the-
Loop. Document No. ITS-19-07. Available Online:
http://www.unece.org/fileadmin/DAM/trans/doc/2011/wp29/ITS-19-07e.pdf
Jain, A. K., Duin, R. P. W., and Mao, J. (2000). Statistical pattern recognition: A review.
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1,
pp. 4-37.
Jain, A. K., Mao, J., and Mohiuddin, K. M. (1996). Artificial neural networks: A
tutorial. Computer, vol. 29, no. 3, pp. 31-44.
James, L., and Nahl, D. (2003). Dealing with stress and pressure in the vehicle. Taxonomy of
driving behavior: Affective, cognitive, sensorimotor. In: Driving lessons: Exploring
systems that make traffic safer, J. Peter Rothe, (Ed.). (Edmonton, Canada: University of
Alberta Press, 2003), Chapter 3, pp. 21-50.
Jejurikar, R., Pereira, C., and Gupta, R. (2004). Leakage aware dynamic voltage scaling for
real-time embedded systems. In: Proceedings of the 41st Design Automation Conference,
pp. 275-280.
Jenkins, S. P. (2005). Survival Analysis. Unpublished manuscript, Institute for Social and
Economic Research, University of Essex, Colchester, UK. Available Online:
http://www.iser.essex.ac.uk/teaching/degree/stephenj/ec968/pdfs/ec968lnotesv6.pdf
Ji, Q., Zhu, Z., and Lan, P. (2004). Real-Time Nonintrusive Monitoring and Prediction of
Driver Fatigue. IEEE Transactions on Vehicular Technology, vol. 53, no. 4, pp. 1052-
1068.
Kantowitz, B. H. and Moyer, M. J. (2000). Integration of Driver In-Vehicle ITS Information,
In: Proceedings of Intelligent Transportation Society of America (ITSA) 9th Annual
Meeting and Exposition. Available Online:
http://www-nrd.nhtsa.dot.gov/departments/Human%20Factors/driver-distraction/pdf/28.pdf.
Katsis, C. D., Katertsidis, N., Ganiatsas, G., and Fotiadis, D. I. (2008). Toward Emotion
Recognition in Car-Racing Drivers: A Biosignal Processing Approach. IEEE
Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, vol. 38, no.
3, pp. 502–512.
Katsis, C. D., Goletsis, Y., Rigas, G., & Fotiadis, D.I. (2011). A Wearable System for the
Affective Monitoring of Car Racing Drivers during Simulated Conditions. Transportation
Research Part C: Emerging Technologies, vol. 19, no. 3, pp. 541-551.
Khushaba, R. N., Kodagoda, S., Lal, S., and Dissanayake, G. (2011). Driver Drowsiness
Classification Using Fuzzy Wavelet-Packet-Based Feature-Extraction Algorithm. IEEE
Transactions on Biomedical Engineering, vol. 58, no. 1, pp. 121-131.
Kohonen, T. (1990). The Self-Organizing Map. Proceedings of the IEEE, vol. 78, no. 9,
pp. 1464-1480.
Kristal-Boneh, E., Raifel, M., Froom, P., Ribak, J. (1995). Heart rate variability in health and
disease. Scandinavian Journal of Work Environment & Health, vol. 21, no. 2, pp. 85 - 95.
Lagarde, E., Chastang, J. F., Gueguen, A., Pellicer, M. C., Chiron, M. and Lafont, S. (2004).
Emotional Stress and Traffic Accidents: The Impact of Separation and Divorce,
Epidemiology, vol. 15, no. 6, pp. 762-766.
Laguna, P., Moody, G.B., and Mark, R.G. (1998). Power spectral density of unevenly
sampled data by least-square analysis: performance and application to heart rate signals.
IEEE Transactions on Biomedical Engineering, vol. 45, no. 6, pp. 698- 715.
Lal, S. K.L., and Craig, A. (2001). A critical review of the psychophysiology of driver
fatigue. Biological Psychology, vol. 55, no. 3, pp. 173–194.
Lee, J. D., Hoffman, J. D., and Hayes, E. (2004). Collision warning design to mitigate driver
distraction. In: Proceedings of the SIGCHI conference on Human factors in computing
systems, pp. 65-72.
Lee, J. S., Su, Y. W., and Shen, C. C. (2007). A Comparative Study of Wireless Protocols:
Bluetooth, UWB, ZigBee, and Wi-Fi. In: Proceedings of the 33rd Annual Conference of
the IEEE Industrial Electronics Society (IECON), Nov. 5-8, Taipei, Taiwan, pp. 46-51.
Lessard, C.S. (2006). Signal Processing of Random Physiological Signals, Morgan &
Claypool, San Rafael, California, (Chapter 3).
Li, L., Wen, D., Zheng, N. N., and Shen, L. C. (2012). Cognitive Cars: A New Frontier for
ADAS Research. IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 1,
pp. 395-407.
Liang, Y., Reyes, M. L. and Lee, J. D. (2007). Real-Time Detection of Driver Cognitive
Distraction Using Support Vector Machines. IEEE Transactions on Intelligent
Transportation Systems, vol. 8, no. 2, pp. 340-350.
Linder, S. P., Wendelken, S. M., Wei, E., & McGrath, S. P. (2006). Using the Morphology of
Photoplethysmogram Peaks to Detect Changes in Posture. Journal of Clinical Monitoring
and Computing, vol. 20, no. 3, pp. 151-158.
Lindgren, A., Chen, F., Jordan, P. W., and Zhang, H. (2008). Requirements for the design of
advanced driver assistance systems - The differences between Swedish and Chinese
drivers. International Journal of Design, vol. 2, no. 2, pp. 41-54.
Lindgren, A. and Chen, F. (2006). State of the Art Analysis: An Overview of Advanced
Driver Assistance Systems (ADAS) and Possible Human Factors Issues. In: Proceedings
of the Swedish Human Factors Network (HFN) Conference, Linköping, Sweden, pp. 38-
50.
Lisetti, C. L. and Nasoz, F. (2004). Using noninvasive wearable computers to recognize
human emotions from physiological signals. EURASIP Journal on Applied Signal
Processing, vol. 11, pp. 1672–1687.
Lisetti, C. L. and Nasoz, F. (2005). Affective Intelligent Car Interfaces with Emotion
Recognition. In: Proceedings of 11th International Conference on Human Computer
Interaction, Las Vegas, NV, USA, pp. 1-10.
Lu, M., Wevers, K. and van der Heijden, R. (2005). Technical Feasibility of Advanced Driver
Assistance Systems (ADAS) for Road Traffic Safety, Transportation Planning and
Technology, vol. 28, no. 3, pp. 167-187.
Lu, S., Zhao, H., Ju, K., Shin, K., Lee, M., Shelley, K., and Chon, K. H. (2008). Can
Photoplethysmography Variability Serve as an Alternative Approach to obtain Heart Rate
Variability Information? Journal of Clinical Monitoring and Computing, vol. 22, no. 1,
pp. 23–29, DOI: 10.1007/s10877-007-9103-y.
Malik, M., Bigger, J. T., Camm, A. J., Kleiger, R. E., Malliani, A., Moss, A. J. and Schwartz,
P. J. (1996). Heart rate variability: Standards of measurement, physiological
interpretation, and clinical use. European Heart Journal, vol. 17, no. 3, pp. 354-381.
Malta, L., Miyajima, C., Kitaoka, N., and Takeda, K. (2011). Analysis of Real-World
Driver’s Frustration. IEEE Transactions on Intelligent Transportation Systems, vol. 12,
no. 1, pp. 109-118.
Malta, L., Miyajima, C., and Takeda, K. (2009). A Study of Driver Behavior Under Potential
Threats in Vehicle Traffic. IEEE Transactions on Intelligent Transportation Systems, vol.
10, no. 2, pp. 201-210.
Matthews, G. (2002). Towards a transactional ergonomics for driver stress and fatigue.
Theoretical Issues in Ergonomics Science, vol. 3, no. 2, pp. 195-211.
Mccall, J.C., Achler, O., Trivedi, M. M., and Fastrez, J. H. P. (2004). A Collaborative
Approach for Human-Centered Driver Assistance Systems. In: Proceedings of 7th
International IEEE Annual Conf. on Intelligent Transportation Systems, pp. 663-667.
Mccall, J.C., and Trivedi, M. M. (2006). Video-Based Lane Estimation and Tracking for
Driver Assistance: Survey, System, and Evaluation. IEEE Transactions on Intelligent
Transportation Systems, vol. 7, no. 1, pp. 20-37.
Melek, W. W., Lu, Z., Kapps, A., & Fraser, W. D. (2005). Comparison of Trend Detection
Algorithms in the Analysis of Physiological Time-Series Data. IEEE Transactions On
Biomedical Engineering, vol. 52, no. 4, pp. 639–651.
Miller, R. and Huang, Q. (2002). An Adaptive Peer-to-Peer Collision Warning System. In:
Proceedings of 55th IEEE Vehicular Technology Conference (VTC Spring 2002), vol. 1,
pp. 317-321.
Moghimi-Dehkordi, B., Safaee, A., Pourhoseingholi, M. A., Fatemi, R., Tabeie, Z., and Zali,
M. R. (2008). Statistical Comparison of Survival Models for Analysis of Cancer Data.
Asian Pacific Journal of Cancer Prevention, vol. 9, no. 3, pp. 417-420.
Mohan, D. (2009). Road Accidents in India, IATSS Research, vol. 33, no. 1, pp. 75-79.
Nabi, H., Consoli, S. M., Chastang, J. F., Chiron, M., Lafont, S. and Lagarde, E. (2005).
Type A Behavior Pattern, Risky Driving Behaviors, and Serious Road Traffic Accidents:
A Prospective Study of the GAZEL Cohort. American Journal of Epidemiology, Vol.
161, No. 9, pp. 864-870. DOI: 10.1093/aje/kwi110.
National Highway Traffic Safety Administration (N.H.T.S.A.). (2008). Traffic Safety Facts
2008: A Compilation of Motor Vehicle Crash Data from the Fatality Analysis Reporting
System and the General Estimates System. US Department of Transportation, Report No.
DOT HS 811 170, Washington DC.
NeXus-10 Biofeedback Monitoring Device. User Manual for the BioTrace+ Software.
Version 1.1, Mind Media B. V. Netherlands, 2004-2006.
Nexus-10 Wireless Monitoring and Biofeedback System. Available Online:
http://www.mindmedia.nl/CMS/
Oppenheim, A. V. (2006). Discrete-Time Signal Processing, Second Edition. Pearson
Education India.
Pantelopoulos, A. and Bourbakis, N. G. (2010). A Survey on Wearable Sensor-Based Systems
for Health Monitoring and Prognosis. IEEE Transactions on Systems, Man, and
Cybernetics – Part C: Applications and Reviews, vol. 40, no. 1, pp. 1–12.
Pantic, M. and Rothkrantz, L. J. M. (2003). Toward an Affect-Sensitive Multimodal Human–
Computer Interaction. Proceedings of the IEEE, vol. 91, no. 9, pp. 1370-1390.
Paradiso, R., Loriga, G., Taccini, N., Gemignani, A., & Ghelarducci, B. (2005). Wealthy - a
wearable health-care system: new frontier on e-textile. Journal of Telecommunications
and Information Technology, vol. 4, pp. 105-113.
Park, S., and Jayaraman, S. (2003). Enhancing the quality of life through wearable
technology. IEEE Engineering in Medicine and Biology Magazine, vol. 22, no. 3, pp. 41-
48.
Partin, D. L., Sultan, M. F., Thrush, C. M., Prieto, R., & Wagner, S. J. (2006). Monitoring
Driver Physiological Parameters for Improved Safety. SAE Technical Paper. In: 2006
SAE World Congress Detroit, Michigan, 2006-01-1322.
Patel, M., Lal, S. K. L., Kavanagh, D. and Rossiter, P. (2011). Applying neural network
analysis on heart rate variability data to assess driver fatigue. Expert Systems with
Applications, vol. 38, no. 6, pp. 7235–7242.
Picard, R. W., and Healey, J. (1997). Affective wearables. Personal Technologies, vol. 1, no.
4, pp. 231-240.
Picard, R. W., Vyzas, E., and Healey, J. (2001). Toward Machine Emotional Intelligence:
Analysis of Affective Physiological State. IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 23, no. 10, pp. 1175-1191.
Platt, R. W., Joseph, K. S., Ananth, C. V., Grondines, J., Abrahamowicz, M., and Kramer, M.
S. (2004). A Proportional Hazards Model with Time-dependent Covariates and Time-
varying Effects for Analysis of Fetal and Infant Death, American Journal of
Epidemiology, vol. 160, no. 3, pp. 199-206.
Plessl, C., Enzler, R., Walder, H., Beutel, J., Platzner, M., Thiele, L. and Tröster, G. (2003).
The case for reconfigurable hardware in wearable computing. Personal and Ubiquitous
Computing, vol. 7, no. 5, pp. 299–308.
Poh, M. Z., Swenson, N. C., & Picard, R. W. (2010). A wearable sensor for unobtrusive,
long-term assessment of electrodermal activity. IEEE Transactions on Biomedical
Engineering, vol. 57, no. 5, pp. 1243-1252.
Popivanov, D., and Mineva, A. (1999). Testing procedures for non-stationarity and non-
linearity in physiological signals. Mathematical biosciences, vol. 157, no. 1, pp. 303-320.
Reimer, B., D’Ambrosio, L. A., Coughlin, J. F., Kafrissen, M. E. and Biederman, J. (2006).
Using Self Reported Data to Assess the Validity of Driving Simulation Data, Behavior
Research Methods, vol. 38, no. 2, pp. 314-324.
Riener, A., Ferscha, A., and Aly, M. (2009). Heart on the road: HRV analysis for monitoring
a driver's affective state. In: Proceedings of the 1st International Conference on
Automotive User Interfaces and Interactive Vehicular Applications, (AutomotiveUI,
2009), pp. 99-106.
Rigas, G., Goletsis, Y., and Fotiadis, D. I. (2012). Real-Time Driver’s Stress Event Detection.
IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 1, pp. 221-234.
Roscoe, A. H. (1992). Assessing pilot workload. Why measure heart rate, HRV and
respiration? Biological Psychology, vol. 34, no. 2, pp. 259-287.
Ryoo, D. W., Kim, Y. S., & Lee, J. W. (2005). Wearable Systems for Service based on
Physiological Signals. In: IEEE Engineering in Medicine and Biology 27th Annual
Conference, Shanghai, China, pp. 2438-2440.
Sanei, S. and Chambers, J. A. (2008). EEG Signal Processing. John Wiley & Sons Ltd,
Wiley.com, England.
Schmidt, S., and Walach, H. (2000). Electrodermal Activity (EDA) – State-of-the-Art
Measurement and Techniques for Parapsychological Purposes. The Journal of
Parapsychology, vol. 64, no. 2, pp. 139-163.
Shamir, M., Eidelman, L. A., Floman, Y., Kaplan, L., and Pizov, R. (1999). Pulse
oximetry plethysmographic waveform during changes in blood volume. British Journal of
Anaesthesia, vol. 82, no. 2, pp. 178-181.
Singer, J. D. and Willett, J. B. (1993). It's About Time: Using Discrete-Time Survival
Analysis to Study Duration and the Timing of Events. Journal of Educational Statistics,
vol. 18, no. 2, pp. 155-195.
Singh, R. R., and Banerjee, R. (2005). A Communication-Architecture for Life-Critical Data
Transfer in the BITS-LifeGuard Wearable Computing Environment. In: Web Proceedings
12th IEEE International Conference on High Performance Computing (HiPC 2005), 18-
21, December, Goa, India. Available Online:
http://www.hipc.org/hipc2005/posters/rajivsingh.pdf.
Singh, R. R. (2007). Preventing Road Accidents with Wearable Biosensors and Innovative
Architectural Design. In: Proceedings of 2nd ISSS National Conference On MEMS,
Microsensors, Smart Materials, Structures And Systems (ISSS MEMS-2007), CEERI
Pilani, India, pp. 1-8.
Singh, R. R., and Banerjee, R. (2010). Multi-parametric Analysis of Sensory Data collected
from Automotive Drivers for Building a Safety-Critical Wearable Computing System. In:
Proceedings of 2nd International Conference on Computer Engineering and Technology
(ICCET 2010), Chengdu, China, pp. VI-355 - 360.
Singh, R. R., Conjeti, S., and Banerjee, R. (2011). An Approach for Real-Time Stress-Trend
Detection Using Physiological Signals in Wearable Computing Systems for Automotive
Drivers. In: Proceedings of the 14th International IEEE Annual Conference on
Intelligent Transportation Systems (ITSC 2011), The George Washington University,
Washington, DC, USA, pp. 1477 - 1482.
Singh, R. R., Conjeti, S., and Banerjee, R. (2012). Bio-signal based On-road Stress
Monitoring for Automotive Drivers. In: Proceedings of Eighteenth National Conference
on Communications (NCC 2012), IIT Kharagpur, India, pp. 1 - 5.
Singh, R. R., Conjeti, S. and Banerjee, R. (2013a). A comparative evaluation of neural
network classifiers for stress level analysis of automotive drivers using physiological
signals. Biomedical Signal Processing and Control, vol. 8, no. 6, pp. 740-754.
Singh, R. R., Conjeti, S. and Banerjee, R. (2013b). Assessment of Driver Stress from
Physiological Signals collected under Real - Time Semi - Urban Driving Scenarios.
International Journal of Computational Intelligence Systems. (ahead-of-print, online
version published on 12 Nov 2013), pp. 1-15, DOI: 10.1080/18756891.2013.864478.
Smolensky, M. H., Di Milia, L., Ohayon, M. M., & Philip, P. (2011). Sleep disorders,
medical conditions, and road accident risk. Accident Analysis and Prevention, vol. 43, no.
2, pp. 533–548.
Soleymani, M., Chanel, G., Kierkels, J. J. M., & Pun, T. (2008). Affective Ranking of Movie
Scenes Using Physiological Signals and Content Analysis. In: Proceedings of the 2nd
ACM workshop on Multimedia semantics, pp. 32-39.
Sridhara, S. R., DiRenzo, M., Lingam, S., Lee, S. J., Blázquez, R., Maxey, J., Ghanem, S., Lee,
Y. H., Abdallah, R., Singh, P. and Goel, M. (2011). Microwatt Embedded Processor
Platform for Medical System-on-Chip Applications. IEEE Journal of Solid-State Circuits,
vol. 46, no. 4, pp. 721-730.
Starner, T. (1996). Human Powered Wearable Computing. IBM Systems Journal, vol. 35, no.
3, pp. 618-629.
Sullivan, L. M., Massaro, J. M. and D’Agostino, R. B. (2004). Presentation of Multivariate
Data for Clinical Use: The Framingham Study Risk Score Functions, Statistics in
Medicine. vol. 23, no. 10, pp. 1631–1660, DOI: 10.1002/sim.1742.
The Royal Society for the Prevention of Accidents (RoSPA). (2001). Driver Fatigue and
Road Accidents: A Literature Review and Position Paper. Edgbaston, United Kingdom.
Available Online: http://www.rospa.com/roadsafety/info/fatigue.pdf
Thought Technology's Data Loggers. Thought Technology's Limited, Available online:
http://www.thoughttechnology.com/hardware.htm.
Tsugawa, Sadayuki. (2006). Trends and Issues in Safe Driver Assistance Systems: Driver
Acceptance and Assistance for Elderly Drivers. IATSS RESEARCH, vol. 30, no. 2, pp. 6 -
18.
Vadeby, A., Forsman, A., Kecklund, G., Akerstedt, T., Sandberg, D. and Anund, A. (2010).
Sleepiness and Prediction of Driver Impairment in Simulator Studies using a Cox
Proportional Hazard Approach. Accident Analysis and Prevention, vol. 42, no. 3, pp. 835-
841.
Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. New York, NY, USA:
Springer-Verlag.
Wahab, A., Quek, C., Tan, C. K., and Takeda, K. (2009). Driving Profile Modeling and
Recognition Based on Soft Computing Approach. IEEE Transactions on Neural
Networks, vol. 20, no. 4, pp. 563-582.
Wayne, W. D. (2006). Biostatistics: A Foundation for Analysis in the Health Sciences, Ch.
12, (Wiley India).
Wen, D., Yan, G., Zheng, N. N., Shen, L. C., and Li L. (2011). Toward Cognitive Vehicles.
IEEE Intelligent Systems Magazine, vol. 26, no. 3, pp. 76-80.
Weng, J., Ye, Z., and Weng, J. (2005). An Improved Pre-processing Approach for
Photoplethysmographic Signal. In: Proceedings of IEEE Engineering in Medicine and
Biology 27th Annual Conference, Shanghai, China, pp. 41-44.
Wieclaw, J., Agerbo, E., Mortensen, P. B. and Bonde, J. P. (2006). Risk of Affective and
Stress Related Disorders among Employees in Human Service Professions, Occupational
and Environmental Medicine. vol. 63, no. 5, pp. 314–319.
Wijesiriwardana, R., Mitcham, K. and Dias, T. (2004). Fibre-Meshed Transducers Based
Real Time Wearable Physiological Information Monitoring System, In: Proceedings of
Eighth IEEE International Symposium on Wearable Computers (ISWC’04), Arlington,
VA, USA, vol. 1, pp. 40-47.
Williamson, A., and Chamberlain, T. (2005). Review of on-road driver fatigue monitoring
devices. NSW Injury Risk Management Research Centre, University of New South
Wales, Australia.
World Health Organization (WHO), (2013). Global Status Report on Road Safety 2013:
Supporting a Decade of Action. Geneva, Available Online:
http://www.who.int/violence_injury_prevention/road_safety_status/2013/en/.
World Health Organization (WHO), (2009). Global Status Report on Road Safety: Time for
Action. Geneva, Available Online:
www.who.int/violence_injury_prevention/road_safety_status/2009.
Yang, J. H., Mao, Z. H., Tijerina, L., Pilutti, T., Coughlin, J. F., and Feron, E. (2009).
Detection of Driver Fatigue Caused by Sleep Deprivation. IEEE Transactions on Systems,
Man, and Cybernetics-Part A: Systems and Humans, vol. 39, no. 4, pp. 694-705.
Yang, L., and Wang, F. Y. (2007). Driving into Intelligent Spaces with Pervasive
Communications. IEEE Intelligent Systems, vol. 22, no. 1, pp. 12-15.
Young, K. and Regan, M. (2007). Driver distraction: A review of the literature. In: I. J.
Faulks, M. Regan, M. Stevenson, J. Brown, A. Porter & J.D. Irwin (Eds.). Distracted
driving. Sydney, NSW: Australasian College of Road Safety, pp. 379-405.
LIST OF PUBLICATIONS AND PRESENTATIONS
Journals:
Singh, R. R., Conjeti, S. and Banerjee, R. (2013b). Assessment of Driver Stress from
Physiological Signals collected under Real-Time Semi-Urban Driving Scenarios.
International Journal of Computational Intelligence Systems. Online version
published on 12 Nov 2013, pp. 1-15, DOI: 10.1080/18756891.2013.864478.
(Co-published by Taylor & Francis and Atlantis Press, Indexed in Scopus and Science
Citation Index Expanded).
Singh, R. R., Conjeti, S. and Banerjee, R. (2013a). A comparative evaluation of neural
network classifiers for stress level analysis of automotive drivers using physiological
signals. Biomedical Signal Processing and Control, vol. 8, no. 6, pp. 740-754.
(Elsevier, Indexed in Scopus and Science Citation Index Expanded).
Conferences:
Singh, R. R., Conjeti, S. and Banerjee, R. (2012). Bio-signal based On-road Stress
Monitoring for Automotive Drivers. In Proceedings of Eighteenth National
Conference on Communications (NCC 2012), IIT Kharagpur, India, February 3-5,
pp. 1-5, doi: 10.1109/NCC.2012.6176845.
Conjeti, S., Singh, R. R., and Banerjee, R. (2012). Bio-inspired Wearable Computing
Architecture and Physiological Signal Processing for On-road Stress Monitoring. In
Proceedings of IEEE-EMBS International Conference on Bio-medical and Health
Informatics (BHI 2012), Hong Kong, Shenzhen, China, Jan 5-7, pp. 479-482,
doi: 10.1109/BHI.2012.6211621.
Singh, R. R., Conjeti, S. and Banerjee, R. (2011). An Approach for Real-Time Stress-
Trend Detection Using Physiological Signals in Wearable Computing Systems for
Automotive Drivers. In Proceedings of The 14th International IEEE Annual
Conference on Intelligent Transportation Systems (ITSC 2011), The George
Washington University, Washington, DC, USA, 05-07 October, pp. 1477-1482.
Singh, R. R. and Banerjee, R. (2010). Multi-parametric Analysis of Sensory Data
collected from Automotive Drivers for Building a Safety-Critical Wearable
Computing System. In Proceedings of The 2nd International Conference on Computer
Engineering and Technology (ICCET 2010), Chengdu, China, 16-18 April,
pp. V1-355 - V1-360, doi: 10.1109/ICCET.2010.5486110.
Singh, R. R. (2007). Preventing Road Accidents with Wearable Biosensors and
Innovative Architectural Design. In Proceedings of The 2nd ISSS National
Conference On MEMS, Microsensors, Smart Materials, Structures And Systems
(ISSS MEMS-2007), CEERI Pilani, India, 16-17 November, pp. 1-8.
Singh, R. R. and Banerjee, R. (2005). A Communication-Architecture for Life-
Critical Data Transfer in the BITS-LifeGuard Wearable Computing Environment. In
the 12th IEEE International Conference on High Performance Computing (HiPC
2005), Goa, India, 18-21 December, pp. 1-5. Available on Web Proceedings of HiPC
2005 Posters (http://www.hipc.org/hipc2005/posters/rajivsingh.pdf).
Invited Talk:
Title: Architecting the BITS-LifeGuard Wearable Computing System: A Bio-signal-based
approach for Stress-level Monitoring of Automotive Drivers.
Venue: Durham Hall, Center for Embedded Systems for Critical Applications at Bradley
Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg,
USA on 11th October 2011.
APPENDICES
Appendix A
Table A.1: Questionnaire for Driver-Profile Analysis
Note: This form was filled in by the experimenter, who asked each driver the questions
in the local language (Hindi).
To be filled by Experimenter after Test Driving:
Time of Experiment: _____________________________
Total Drive Time: ____hrs ____min
Driver Initial Affective State: _______________________
Driving Style Adopted:
                    Calm      Aggressive
  Experimenter 1
  Experimenter 2
Vehicle Configuration: Sedan / Hatchback / All Terrain
Was the driver compatible with the sensor configuration?
Compatible Not Compatible
Remarks (if any): _________________________________
_______________________________________________
_______________________________________________
________________________________________________
Questionnaire for Driver-Profile Analysis
Name:
Age: _____years Gender: M/F
Driving Since: _____years
Driver Group: Casual Short Distance Long Distance
Average Distance driven per day: ____ kilometers
How comfortable are you driving:
  Vehicle Type      Comfort Level
                    1-Low    2    3    4    5-High
  Sedan
  Hatchback
  All Terrain
Projected Drive Time:
  Driving Situation         Scenario                                Drive Time (in hours)
  Relaxed Driving (Low)     Intra-Campus
  Moderate Stress           City Main Road
  Moderate + S.T.           City Road + Market Area
  Stressed                  Main Road during Peak Hours
  Stressed + S.T.           Highways Connecting Cities with
                            Long Stretches of Market Areas
Appendix B: Samples of Consent Forms Signed by the Drivers
Appendix C: Photographs of Drivers Who Participated in Data Collection
(Relaxed and Driving States)
Note: Faces have been masked here in order to protect the privacy of the subjects.
BRIEF BIOGRAPHY OF THE CANDIDATE
Rajiv Ranjan Singh is currently working with the Department of Electrical and Electronics
Engineering / Instrumentation at the Birla Institute of Technology and Science (BITS), Pilani,
as a Doctoral Scholar working in the area of Wearable Computing under the "Faculty
Development Programme".
His research interests include Biomedical Signal Processing, Human-Centric
Design for Intelligent Transportation Systems (ITS), Pattern Recognition and
Wearable Embedded Systems.
Besides teaching (CS/EEE/INSTR courses), he also oversees the activities of the
Embedded Controller and Application Centre (ECAC Lab).
Rajiv is an Associate Member of the Institution of Engineers (India), Kolkata and a
Student Member of IEEE.
BRIEF BIOGRAPHY OF THE SUPERVISOR
Dr. Rahul Banerjee is a Professor of Computer Science at the Birla Institute of Technology &
Science, Pilani. He holds a PhD in Computer Science & Engineering and his research
interests lie in the areas of Computer Networking, Cloud Computing, Wearable Computing
and Pervasive / Ubiquitous Computing (including what is sometimes known as Cyber-
Physical Systems).
He has participated in several funded research projects in the area of computer
networking. These include projects funded by the European Commission on Next Generation
Networking involving IPv6 (with partners in France, Spain, Switzerland, Denmark and
Luxembourg); by the Govt. of India on Technology-enabled Learning, IPv6 and Mobile Ad-hoc
Networks; by the Govt. of France (with partners in France, India, China and South Korea)
on IPv6-enabled Low-Power Wireless Sensor Networking; and by select industries.
He has also served as a Reviewer for several IEEE / ACM journals and magazines,
including IEEE Transactions on Computers, IEEE Internet Computing, IEEE
Communications and IEEE Transactions on ITS, as well as for many international
conferences held around the world. He has published several papers and technical
reports and has written two books.
He is a Member of the IEEE, ACM, ISTE and ISCA.