  • AN ANALYSIS OF INJURY SEVERITIES OF LARGE-TRUCK CRASHES

    By

    XIAOYU ZHU

    A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

    OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

    UNIVERSITY OF FLORIDA

    2011

  • 2

    © 2011 Xiaoyu Zhu

  • 3

    To my parents

  • 4

    ACKNOWLEDGMENTS

    I would like to take this opportunity to thank my parents. They have always done

    their best to provide me with opportunities to pursue my goals. Their endless support and

    encouragement have led me through every step in my life.

    I would also like to thank the faculty members at the University of Florida (UF),

    who have provided me with a great deal of knowledge and skills during my Ph.D. study.

    I would like to thank my advisor, Dr. Siva Srinivasan, for constantly being a source of

    inspiration. He provided not only the foundation of this research but also approaches to

    conducting successful research, which will be a lifelong benefit. I would like to thank Dr. Lily

    Elefteriadou for being an outstanding example of a woman professor, Dr. Scott

    Washburn for his positive attitude towards life and work, and Dr. Yafeng Yin for his

    constant enthusiasm and innovation in research. I would also like to thank Dr.

    Chunrong Ai and Dr. Trevor Park for their helpful comments from various perspectives,

    which placed this research in a broader context.

    Special thanks go to Neng Fan for his support and companionship in both my life

    and study during these four years. I would also like to thank all my friends and

    colleagues for making my life in Gainesville enjoyable.

  • 5

    TABLE OF CONTENTS

    page

    ACKNOWLEDGMENTS .......... 4

    LIST OF TABLES .......... 8

    LIST OF FIGURES .......... 10

    ABSTRACT .......... 11

    CHAPTER

    1 INTRODUCTION .......... 13

      1.1 Background: Traffic Safety and Large Trucks .......... 13
      1.2 Objectives of the Research .......... 15
      1.3 Organization of the Document .......... 16

    2 LITERATURE REVIEW .......... 17

      2.1 Research on Large-Truck Crashes .......... 17
      2.2 Research on Modeling Injury-Severity of Automobile Crashes .......... 21
        2.2.1 Injury of Interest .......... 22
        2.2.2 Levels of Injury Severity .......... 23
        2.2.3 Data Sources .......... 24
        2.2.4 Modeling Method .......... 25
          2.2.4.1 Treatment of ordinal .......... 25
          2.2.4.2 Incorporating interdependencies among the injuries of all persons involved in the crash .......... 29
        2.2.5 Explanatory Factors .......... 30
          2.2.5.1 Crash characteristics .......... 30
          2.2.5.2 Vehicle characteristics .......... 32
          2.2.5.3 Driver and occupant characteristics .......... 33
          2.2.5.4 Environmental characteristics .......... 35
          2.2.5.5 Roadway characteristics .......... 36
          2.2.5.6 Occupant protection .......... 38
      2.3 Contribution of this Dissertation .......... 39

    3 DATA .......... 41

      3.1 Data Source and Raw LTCCS Data Characteristics .......... 41
      3.2 Sample Formation Procedure .......... 42
        3.2.1 Selecting Cases .......... 42
        3.2.2 Cleaning and Consistency Checking .......... 42
        3.2.3 Variable Selection .......... 43
          3.2.3.1 Crosstab check .......... 43

  • 6

          3.2.3.2 Classification analysis .......... 44
          3.2.3.3 Missing data .......... 44
      3.3 Sample Characteristics .......... 47

    4 MODELS FOR CRASH LEVEL INJURY .......... 48

      4.1 Sample Data .......... 48
      4.2 Methodology .......... 51
      4.3 Empirical Results .......... 52
        4.3.1 Crash-level Variables .......... 53
        4.3.2 Truck-level Variables .......... 55
        4.3.3 Car-level Variables .......... 58
      4.4 Contributions .......... 65

    5 THE PANEL HETEROSKEDASTIC ORDERED PROBIT MODEL FOR OCCUPANT-LEVEL INJURY SEVERITY STUDY .......... 67

      5.1 An Exploratory Analysis of Occupant-level and Crash-level Injury Severities .......... 67
      5.2 Sample Data .......... 75
      5.3 Methodology .......... 76
      5.4 Empirical Results .......... 79
        5.4.1 Truck Occupants .......... 81
        5.4.2 Car Drivers .......... 83
        5.4.3 Car Passengers .......... 85
      5.5 Contributions .......... 93

    6 THE PANEL HETEROSKEDASTIC ORDERED GENERALIZED EXTREME VALUE MODEL IN INJURY SEVERITY STUDY .......... 95

      6.1 Background .......... 95
      6.2 Methodology .......... 97
      6.3 Empirical Result .......... 99
        6.3.1 Truck Occupants .......... 101
        6.3.2 Car Drivers .......... 102
        6.3.3 Car Passengers .......... 104
        6.3.4 Application and Sensitivity Testing .......... 105
      6.4 Contributions .......... 108

    7 SUMMARY AND CONCLUSIONS .......... 121

      7.1 Contributions of the Dissertation .......... 122
        7.1.1 Methodological Contributions .......... 122
        7.1.2 Empirical Contributions .......... 123
      7.2 Directions for Further Research .......... 126

    APPENDIX DESCRIPTIONS FOR THE VARIABLES .......... 128

    LIST OF REFERENCES .......... 136

  • 7

    BIOGRAPHICAL SKETCH .......... 141

  • 8

    LIST OF TABLES

    Table page

    3-1 Sample size for each category. .......................................................................... 47

    4-1 Cross tabulation of police-determined and researcher-determined injury severity levels. .................................................................................................... 50

    4-2 Empirical model results: effects of crash-level variables. ................................... 62

    4-3 Empirical model results: effects of truck-level variables. .................................... 63

    4-4 Empirical model results: effects of car-level variables. ....................................... 64

    5-1 List of possible combinations of occupant's injury .............................................. 74

    5-2 Factors affecting the injury severity of truck occupants. ..................................... 89

    5-3 Factors affecting the injury severity of car drivers. .............................................. 90

    5-4 Factors affecting the injury severity of car passengers. ...................................... 91

    6-1 Model comparison. ........................................................................................... 110

    6-2 Standard deviation of intra-vehicle correlation term. ......................................... 110

    6-3 Factors affecting the injury severity of truck occupants. ................................... 111

    6-4 Factors affecting the injury severity of car drivers. ............................................ 112

    6-5 Factors affecting the injury severity of car passengers. .................................... 113

    6-6 List of sensitivity test. ........................................................................................ 114

    6-7 Sensitivity test: effect of airbags on the injury severity of truck drivers in truck-car head on crashes. ........................................................................................ 114

    6-8 Sensitivity test: effect of truck driver behavior on the injury severity of truck drivers in truck-car head on crashes. ................................................................ 115

    6-9 Sensitivity test: effect of seatbelts on the injury severity of truck drivers in truck-car head on crashes. ............................................................................... 115

    6-10 Sensitivity test: effect of crash type on the injury severity of truck drivers. ....... 116

    6-11 Sensitivity test: effect of airbag deployment on the injury severity of car drivers in truck-car head on crashes. ................................................................ 116

  • 9

    6-12 Sensitivity test: effect of airbag availability on the injury severity of car drivers in truck-car head on crashes. ........................................................................... 117

    6-13 Sensitivity test: effect of alcohol on the injury severity of car drivers in truck-car head on crashes. ........................................................................................ 117

    6-14 Sensitivity test: effect of crash type on the injury severity of car drivers. .......... 118

    6-15 Sensitivity test: effect of airbag deployment on the injury severity of car passengers in truck-car head on crashes. ........................................................ 118

    6-16 Sensitivity test: effect of airbag availability on the injury severity of car passengers in truck-car head on crashes. ........................................................ 119

    6-17 Sensitivity test: effect of car driver distraction on the injury severity of car passengers in truck-car head on crashes. ........................................................ 119

    6-18 Sensitivity test: effect of truck driver DUI on the injury severity of car passengers in truck-car head on crashes. ........................................................ 120

    A-1 Injury severity characteristics............................................................................ 128

    A-2 Crash characteristics (Crash level). .................................................................. 128

    A-3 Crash characteristics (Vehicle level). ................................................................ 129

    A-4 Truck driver characteristics. .............................................................................. 131

    A-5 Car driver characteristics. ................................................................................. 132

    A-6 Truck characteristics. ........................................................................................ 133

    A-7 Car characteristics. ........................................................................................... 133

    A-8 Truck occupant characteristics. ........................................................................ 134

    A-9 Car occupant characteristics. ........................................................................... 134

    A-10 Carrier characteristics. ...................................................................................... 135

  • 10

    LIST OF FIGURES

    Figure page

    5-1 A cross-tabulation of cumulative injury severity and highest injury severity ........ 69

    5-2 Distribution of the cumulative injury severities by highest injury severity ............ 69

    5-3 Cross tabulations of cumulative injury cost against the highest injury severity ... 70

    5-4 Cross tabulation of average injury severity against number of occupants by HIC value ............................................................................................................ 73

    5-5 Distribution of injury severity levels by occupant type ......................................... 75

  • 11

    Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

    AN ANALYSIS OF INJURY SEVERITIES OF LARGE-TRUCK CRASHES

    By

    Xiaoyu Zhu

    May 2011

    Chair: Sivaramakrishnan Srinivasan Major: Civil Engineering

    Traffic crashes have become one of the largest public health problems in the world

    and will remain one of the most pressing transportation issues in the future. The

    importance of trucking to freight logistics and, consequently, its impact on the economic

    well-being of a nation are well acknowledged. Crashes need to be studied in order to

    improve the safety of the transportation system, educate drivers,

    enhance carrier operations, and reduce incident costs. Data from the Large Truck

    Crash Causation Study (LTCCS) are used in the empirical analysis.

    The goal of developing econometric models of injury severity in large-truck crashes is

    accomplished in a three-step procedure. The first step

    examines the relationship between injury severity and a large

    number of interdependent explanatory factors using a crash-level sample. Injury

    severity is modeled using both police-reported and researcher-determined scales. The

    results indicate strong impacts of several crash-, truck-, and car-level variables on

    the severity of the crashes.

  • 12

    The second step moves to an occupant-level study because the

    highest injury severity cannot fully represent the severity of the whole crash. A

    methodology is developed for incorporating the effects of common unobserved factors (error

    correlations) affecting the injury severity of all persons involved in the same vehicle and

    crash. Both the intra-crash and intra-vehicle correlations are confirmed to

    be important in the second step.

    A more advanced and flexible model structure is explored in the last step.

    This approach is attractive as it recognizes the ordered nature of the choice alternatives

    and, at the same time, is not constrained by the "proportional odds" or "parallel lines"

    restrictions of the ordered probit. The results indicate that variables that are not

    significant in the ordered probit model may still affect injury severity. The

    significant driver behavior variables also differ across

    roles (truck occupant, car driver, and car passenger).

    In summary, advanced and flexible methodologies for studying occupant-level injury

    severity are developed and compared in this dissertation. The results and

    implications are useful to travelers, transportation engineers, and

    policy makers.

  • 13

    CHAPTER 1 INTRODUCTION

    This chapter motivates the need to study large-truck crashes and outlines the

    objectives of the dissertation. The organization of the rest of this document is also

    presented.

    1.1 Background: Traffic Safety and Large Trucks

    According to the World Health Organization, more than a million people are killed

    on the world's roads each year (Leonard, 2004). In the United States, 26,689 occupants

    (drivers and passengers) died and an additional 2,120,000 were injured in the

    5,811,000 crashes in 2008, according to the National Highway Traffic Safety

    Administration (NHTSA, 2009). These numbers highlight that traffic safety is

    one of the critical public-health and transportation problems in the world.

    Among all motor-vehicle traffic crashes, this study focuses on crashes

    involving large trucks. A large truck is defined as a commercial vehicle weighing more

    than 10,000 lbs. The importance of trucking to freight logistics and, consequently, its

    impact on the economic well-being of a nation are well acknowledged in the literature.

    Specifically, based on the 2007 Commodity Flow Survey results, among all modes,

    trucks moved 74.3% of all freight by value, 67.2% by weight, and 40% by ton-miles

    (USDOT/BTS, 2004). These large volumes of truck traffic, the unique operating

    characteristics of the trucks and drivers, and the design and weight of trucks have

    resulted in large numbers of crashes, injuries, and fatalities.

    In 2005, over 5,000 people died and an additional 114,000 were injured in the

    442,000 large-truck crashes in the United States. Approximately 12% of all traffic

    fatalities involved a large-truck crash (NHTSA, 2006). In 2007, 413,000 large trucks

    were involved in traffic crashes resulting in 4,808 fatalities, or 12% of all traffic

    fatalities (NHTSA, 2008).

  • 14

    Large trucks account for approximately 4% of all vehicles

    but about 8% of vehicles in fatal crashes. Of the fatalities that resulted from

    crashes involving large trucks, 75% were occupants of other vehicles. In addition to the

    above cross-sectional statistics, the time-series trends reported by Lyman and Braver

    (2003) are also illuminating. Based on aggregate data from 1975 to 1999, these authors

    find that the involvement of large trucks in fatal crashes per truck vehicle mile traveled

    has decreased. However, with a corresponding increase in the volume of truck travel,

    the involvement per unit population has not seen the same declining trend. Thus, there

    is continued public concern about large-truck crashes.

    Of all the injury and fatal crashes in 2008, 566,554 (34%) were single-vehicle

    crashes and 1,097,463 (66%) were multi-vehicle crashes. This indicates that in at least

    two-thirds of the injury and fatal crashes, more than one person is involved (even

    single-vehicle crashes can have more than one person involved). Among all crashes with at

    least one injury, there are, on average, 1.29 persons injured or killed per crash.

    These statistics indicate that a large number of crashes involve more than one person,

    in many cases from multiple vehicles. Despite these results, the vast majority of the

    literature on crash severity has focused only on the highest level of severity rather

    than on the injuries sustained by the different persons involved in the crash. Arguably,

    one of the major reasons for this state-of-practice approach is that the highest severity

    of the crash is more reliably recorded than the severities sustained by individual

    persons (Chang and Mannering, 1999).
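
    The percentage shares quoted above can be checked from the two crash counts. The short sketch below only reproduces that arithmetic; the variable names are illustrative and the counts are the ones cited in the text.

```python
# Crash counts quoted in the text for 2008 injury and fatal crashes.
single_vehicle = 566_554
multi_vehicle = 1_097_463

total = single_vehicle + multi_vehicle
single_share = single_vehicle / total
multi_share = multi_vehicle / total

print(f"total injury and fatal crashes: {total:,}")
print(f"single-vehicle share: {single_share:.1%}")
print(f"multi-vehicle share:  {multi_share:.1%}")
```

    Because every multi-vehicle crash involves at least two drivers, the multi-vehicle share is a lower bound on the share of crashes involving more than one person.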

  • 15

    The above statistics underscore the need to study large-truck crashes

    in order to improve the safety of the transportation system. The results from such studies

    will be valuable for transportation policy, improvement of carrier operations, and

    incident-cost reduction. The broad goal of this dissertation is to contribute towards that end.

    Specifically, data from a recent, nationally-representative sample of large-truck crashes

    will be analyzed to determine the factors affecting the injury severity of these crashes.

    1.2 Objectives of the Research

    The objective of this study is to develop econometric models of injury severity in

    large-truck crashes. The models developed will facilitate the evaluation of a variety of

    countermeasures from the standpoints of transportation control, roadway design, traffic

    operations, and carrier management aimed at improving safety. Despite the importance

    of truck safety, research on the relative magnitudes of the influences of

    the various factors affecting the injury severity of such crashes is limited. The models

    estimated in this study will include a comprehensive set of explanatory factors, including

    the characteristics of the crash, vehicle, truck carrier, and occupants.

    This dissertation will also contribute methodologically to the literature on injury-

    severity modeling. Two important enhancements will be incorporated. First, the use of

    advanced, flexible structures such as the Ordered Generalized Extreme Value (OGEV)

    model will be explored to replace the simpler and restrictive Ordered-Probit models

    conventionally used in the injury-severity literature. Second, the effects of common

    unobserved factors (error correlations) affecting the injury severity of all persons

    involved in the same crash will be incorporated in the models developed in this

    dissertation. Most research to date either ignores this effect (even though many crashes

    involve more than one person) or focuses on the injury to one particular person (such as

    the car driver) involved in the crash.

  • 16

    The contributions of this dissertation are discussed

    further in Chapter 2.
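
    As a rough illustration of the second enhancement, a textbook ordered-response model with a shared crash-level error term can be written as below. This is a generic sketch rather than the specification estimated in this dissertation: the symbols (person i in crash q, thresholds \mu_k, common term u_q with variance \sigma_u^2) and the normality and unit-variance assumptions are introduced here for illustration only.

```latex
\begin{align}
  y^{*}_{qi} &= \beta^{\top} x_{qi} + u_q + \varepsilon_{qi},
  & u_q &\sim N(0, \sigma_u^{2}), \quad \varepsilon_{qi} \sim N(0, 1), \\
  y_{qi} &= k \quad \text{if } \mu_{k-1} < y^{*}_{qi} \le \mu_{k},
  & \mu_0 &= -\infty, \quad \mu_K = +\infty, \\
  \operatorname{Corr}\!\left(y^{*}_{qi}, y^{*}_{qj}\right)
  &= \frac{\sigma_u^{2}}{\sigma_u^{2} + 1}, & i &\neq j .
\end{align}
```

    Under these assumptions, setting \sigma_u = 0 recovers independent ordered-probit models for each person, which corresponds to ignoring the common unobserved factors.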

    1.3 Organization of the Document

    A brief synthesis of the relevant literature is presented in Chapter 2. A detailed

    description of the data and the sample-formation procedure is outlined in Chapter 3.

    Chapter 4 presents the results of the crash-level ordered-probit models for injury

    severity. The empirical results capturing the effects of several explanatory factors,

    including driver, vehicle, crash, environment, and carrier characteristics, are discussed. These models

    serve as the basis for more advanced specifications. Chapter 5 discusses how the

    highest level of injury sustained may not be a comprehensive descriptor of the overall

    severity of the crash and presents a methodology to simultaneously model the severity

    sustained by all persons involved in the crash. In Chapter 6, we continue the

    analysis of the injury severity of each occupant with a panel heteroskedastic Ordered

    Generalized Extreme Value (OGEV) model, which relaxes the constraints of the ordered probit

    model. Chapter 7 summarizes the key contributions of this dissertation and discusses

    directions for future research.

  • 17

    CHAPTER 2 LITERATURE REVIEW

    This chapter presents a synthesis of literature relevant to the dissertation's

    objective of modeling injury-severity in large-truck crashes. The rest of this chapter is

    organized as follows. A review of past research on large-truck crashes is presented in

    Section 2.1. A summary of studies on modeling injury-severity in automobile crashes, in

    general, is presented in Section 2.2. Significant emphasis is placed on the modeling

    methods employed and the key empirical results. Section 2.3 positions the dissertation

    in the context of past research by identifying the gaps in knowledge and the

    contributions of this study.

    2.1 Research on Large-Truck Crashes

    A brief synthesis of literature on large-truck (gross vehicle weight rating greater

    than 10,000 pounds) crashes, with particular focus on the analysis of injury-severity of

    such crashes is presented in this section of the chapter.

    The work of Khattak and colleagues (Duncan et al., 1998; Khattak et al.,

    2002) and of Chang and Mannering (1999) is most directly related to our efforts.

    Duncan et al. (1998) examined the injury severity in the case of rear-end collisions

    between heavy trucks and passenger cars. The focus was on modeling the injury to the

    passenger-car occupants because they nearly always sustain more severe

    injuries than truck drivers in crashes with heavy trucks. Ordered-probit models

    were developed using the Highway Safety Information System (HSIS) data from North

    Carolina for the years 1993-1995. The results indicate that higher speeds (and speed

    differentials), darkness, and grade increase the severity of the injury. Females and

    drunk drivers were estimated to sustain more severe injuries than males and

    non-drunk drivers, respectively.

  • 18

    Snowy or icy road conditions and traffic congestion were found

    to decrease injury severity compared to dry road and free-flow

    traffic conditions, respectively. Finally, the car being struck in the rear was found to lead to more

    severe injuries than the truck being struck in the rear.
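
    To make the ordered-probit structure concrete, the sketch below computes the category probabilities of a generic ordered probit for two values of the latent injury propensity. It is a minimal illustration, not the model of Duncan et al. (1998): the thresholds, the four severity labels, and the size of the "darkness" shift are hypothetical values chosen only to show the mechanics.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF via the error function (handles +/- infinity)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(xb, cutpoints):
    """Return P(y = k) for each ordered category k.

    xb        -- latent injury propensity (linear predictor), hypothetical value
    cutpoints -- increasing thresholds separating adjacent severity levels
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [std_normal_cdf(b - xb) for b in bounds]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


# Hypothetical thresholds for four severity levels (none, minor, serious, fatal).
cutpoints = [0.0, 1.0, 2.0]

# Hypothetical linear predictors: the same crash in daylight versus darkness,
# where darkness is assumed to add 0.4 to the latent propensity.
daylight = ordered_probit_probs(0.5, cutpoints)
darkness = ordered_probit_probs(0.9, cutpoints)

for label, probs in (("daylight", daylight), ("darkness", darkness)):
    print(label, [round(p, 3) for p in probs])
```

    A positive coefficient shifts probability mass away from the lowest severity level and towards the highest one. Because one coefficient vector applies at every threshold, the sketch also shows the "parallel lines" restriction that the abstract says the later models avoid.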

    Khattak et al. (2002) used the HSIS data from North Carolina for the years 1996-

    1998 to examine the injury severity of single large-truck crashes. In particular, the intent

    was to examine the differences between rollover and non-rollover crashes. Using

    ordered-probit models, the authors found that rollover leads to more severe injuries in

    single-truck crashes. Further, dangerous driving behaviors such as drug or alcohol use,

    speeding, and not wearing seat belts increase the injury severity. Crashes that result in

    fire are also estimated to have a greater injury severity. In the same study, the authors

    went on to examine the factors affecting the rollover of trucks in single-truck crashes.

    The researchers found that rollovers are more likely to happen at a right, left, or U-turn

    and on a curved road. Trucks with longer trailers are more likely to roll over. Reckless

    driving has the largest influence on increasing rollover propensity. These factors may

    also be construed as affecting injury severity, as rollover crashes were

    established to be more severe than non-rollover crashes.

    Chang and Mannering (1999) modeled vehicle occupancy and the most severe

    injury sustained by an occupant of the vehicle using data from the state of

    Washington. The need to model vehicle occupancy simultaneously with injury severity

    was motivated by the observation that the possibility of a severe injury increases with

    the number of persons in the vehicle. Nested-logit models were developed with

    occupancy as the upper-level nest and injury severity as the lower-level nest.

  • 19

    Unlike the previous efforts discussed, Chang and Mannering adopted an unordered discrete-choice

    structure to model injury severity. The authors segmented the data into truck-involved

    and non-truck-involved crashes and demonstrated the statistical and empirical validity

    of such segmentation. For example, the results indicate that higher speeds are strongly

    associated with more severe crashes when trucks are involved (the effect was

    insignificant in the case of non-truck crashes). Similarly, the effects of turning

    movements (right turn and left turn) of the vehicles on crash severity were also

    found to be different. Consistent with expectations, the results also indicated that

    multi-occupant vehicles in truck-involved crashes result in significantly more severe injuries.

    Overall, these authors argue that counter-measures aimed at reducing the severity of

    truck-involved crashes could be different from those aimed at reducing the severity of

    non-truck crashes.

    In contrast to the previous three studies, which examined the level of injury

    severity, other researchers have focused on fatal crashes involving large trucks.

    Braver et al. (1996) examined the effect of roadway geometry, weather, and other

    factors on the incidence of fatal large truck-car crashes. Defiance of traffic control

    devices, curves, and slippery roadway conditions were some of the conditions found to

    be associated with fatal crashes.

    Campbell (1991) examined the impact of driver age on the involvement in fatal

    crashes. Based on nationally-representative data for the years 1980-1984, the author

    developed estimates for the risk of involvement in fatal crashes as the number of fatal

    crashes per hundred million vehicle miles. The analysis indicates that younger drivers

    (age < 27 years) are over-involved in fatal crashes. Further, the relative risk of very

    young drivers (less than 21 years of age) was found to be about six times the overall

    risk for all drivers.
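
    The rate and relative-risk measures described above reduce to simple arithmetic. The following is a minimal sketch; the function names are illustrative, and the crash counts and mileage figures are hypothetical placeholders rather than values from Campbell (1991).

```python
def involvement_rate(fatal_crashes: float, vehicle_miles: float) -> float:
    """Fatal-crash involvements per hundred million vehicle miles."""
    if vehicle_miles <= 0:
        raise ValueError("vehicle_miles must be positive")
    return fatal_crashes / vehicle_miles * 1e8


def relative_risk(group_rate: float, overall_rate: float) -> float:
    """Ratio of a group's involvement rate to the overall rate."""
    return group_rate / overall_rate


# Hypothetical inputs chosen only to illustrate the arithmetic.
overall = involvement_rate(fatal_crashes=500, vehicle_miles=1.0e10)
young = involvement_rate(fatal_crashes=30, vehicle_miles=1.0e8)
print(overall, young, relative_risk(young, overall))
```

    With these made-up inputs the younger group's rate is roughly six times the overall rate, which mirrors the order of magnitude quoted in the text but is not a reproduction of the study's data.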

    Golob et al. (1987) examined the severity (both injury severity and incident

    duration) of truck-involved freeway accidents. About 9000 crashes from the years 1983-

    1984 were obtained from TASAS (Traffic Accident Surveillance and Analysis System)

    database maintained by the California Department of Transportation. All data were

    from the Los Angeles area. Based on the number of fatalities per accident, the

    "hit-object" type crashes were found to be most dangerous (0.025 fatalities per accident).

    "Rear end" and "other type" (other than hit-object, side-swipe, broad-side, and overturn)

    of crashes were also very dangerous (0.021 fatalities per accident).

    It is useful to mention here that past studies have also examined other aspects of

    large-truck safety (other than injury severity). For example, research undertaken by

    Blower et al. (1993) examined the factors affecting the crash propensities (or the risk of

    being involved in a crash) and showed that the truck crash rate is significantly affected by

    truck configuration, location (rural or urban), traffic density, and time of day. Hallmark

    (2009) focused on the incidence of a specific type of crash – the lane-departure

    crashes. Using logistic regressions, the authors found that such crashes were more

    likely to happen when the driver is fatigued, upset, distracted, or unfamiliar with the

    roadway. More generally, driver fatigue has been recognized as an important factor

    affecting truck crashes. Based on a survey conducted in New Zealand, Gander et al.

    (2006) found that 7.6% of crashes were fatigue-related. The duration of the

    most recent sleep period was considered as a measurement of fatigue. In consideration

    of the effect of driver fatigue, the hours of service (HOS) of commercial drivers are

    regulated by the Federal Motor Carrier Safety Administration (FMCSA) in the United

    States. Commercial motor vehicle (CMV) drivers are limited to 11 cumulative hours

    driving in a 14-hour period, which must then be followed by a rest period of no less than

    10 consecutive hours. Drivers employed by carriers in "daily operation" may not drive

    more than 70 hours within any period of 8 consecutive days (NHTSA, 2008). Although

    the primary intent of this dissertation research is on injury-severity (conditional on a

    crash) and not on the risk of a crash happening, insights from studies discussed above

    are useful, and appropriate explanatory variables (such as fatigue) will be included in our

    models.

    Overall, the literature on the modeling of injury-severity of large-truck crashes

    appears to be limited. Past studies have focused on specific types of crashes (such as

    rollover or rear-end) or on specific injury-severity levels (such as fatal crashes). Also,

    the methods employed are rather simplistic. In this context, the intent of this dissertation

    is to present a comprehensive analysis of the injury severity of all types of crashes

    involving large trucks. Flexible econometric structures and a comprehensive empirical

    specification will be developed.

    2.2 Research on Modeling Injury-Severity of Automobile Crashes

    Although few studies have analyzed injury-severity in large-truck crashes, the

    body of literature on modeling injury severity, in general, is extensive. A substantial

    fraction of these are focused on automobile crashes and these are discussed in the rest

    of this section. It is envisioned that methodological and empirical insights from these

    past studies will inform our research on large-truck crashes. It is also useful to

    acknowledge that the injury severity of motorcycle, pedestrian, and bicycle crashes has

    also been studied in the past. To limit the scope of our literature review, we do not

    present an extensive discussion of these studies. However, advanced methods used to

    model such crashes are discussed wherever appropriate.

    Table 2-1 summarizes the key features of several studies in the literature on

    modeling injury-severity of automobile crashes. Five important features are discussed in

    separate sub-sections: (1) The Injury of Interest, (2) Levels of Injury Severity, (3) Data

    Source, (4) Modeling Method, and (5) Explanatory Factors. The first three studies listed

    in the table are the ones that explicitly focus on large-truck crashes (also discussed in

    Section 2.1).

    2.2.1 Injury of Interest

    Automobile crashes could potentially involve one or more vehicles, and each

    vehicle could have one or more occupants (including the driver of the vehicles). All

    these occupants could have different levels of injury severity. Correspondingly, there

    are differences in the injury severity of interest across the studies presented in

    Table 2-1.

    In the simplest case, some studies have defined the overall severity of the crash

    as the most-severe injury sustained by any person involved in the crash. Alternately,

    others define the injury-severity of each vehicle as the most severe injury sustained by

    any person in that vehicle (Chang and Mannering (1999), Chang and Wang (2006) and

    Milton et al. (2008)). Some studies have focused specifically on the injury sustained by

    the driver of the vehicles (Kockelman and Kweon, 2002, Eluru and Bhat, 2007,

    Yamamoto et al., 2008, Wang and Abdel-Aty, 2008, Delen et al., 2006 and Xie et al.,

    2009). Others have focused on specific occupants such as driver and front seat

    occupant (Newgard, 2008), and front seat and rear seat passengers (Shimamura et al.,

    2005). At the other end of the spectrum are studies that have examined the injury

    severity of each of the occupants involved in the crash (O'Donnell and Connor, 1996,

    Kuhnert et al., 2000, Khattak et al., 2003, Chang and Wang, 2006 and Eluru et al.,

    2009).

    The focus on the most-severe injury is appropriate from the standpoint of data, as

    the injury sustained by every person involved in the crash is often not accurately

    recorded (the most severe injury is generally well recorded; Chang and Mannering,

    1999). At the same time, models of the injury sustained by every occupant involved in

    the crash (subject to data availability) present a comprehensive description of the

    overall severity of crashes. In light of the above discussions, this dissertation will

    develop models at both the crash-level (most severe injury) and the occupant level. The

    data available support such an effort and are described in detail in the next chapter.
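
    The alternative definitions of the injury of interest can be illustrated by aggregating occupant-level records in different ways. This is a minimal sketch: the record layout, identifiers, and integer severity codes (larger meaning more severe) are hypothetical and do not reflect the structure of any particular crash database.

```python
from collections import defaultdict

# Hypothetical occupant records: (crash_id, vehicle_id, role, severity),
# with severity coded as a non-negative integer where larger is more severe.
occupants = [
    ("c1", "v1", "driver", 0),
    ("c1", "v1", "passenger", 2),
    ("c1", "v2", "driver", 3),
    ("c2", "v1", "driver", 1),
]


def max_severity_by(records, key_fn):
    """Return the most severe injury within each group defined by key_fn."""
    worst = defaultdict(int)  # default 0 is the least severe code
    for rec in records:
        key = key_fn(rec)
        worst[key] = max(worst[key], rec[3])
    return dict(worst)


# Crash-level outcome: most severe injury among all persons in the crash.
crash_level = max_severity_by(occupants, lambda r: r[0])
# Vehicle-level outcome: most severe injury within each vehicle.
vehicle_level = max_severity_by(occupants, lambda r: (r[0], r[1]))
# Driver-level outcome: injury sustained by the driver of each vehicle.
driver_level = {(c, v): s for c, v, role, s in occupants if role == "driver"}
```

    The occupant-level analysis corresponds to using the records directly, without any aggregation.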

    2.2.2 Levels of Injury Severity

    Injury severity is recorded on an ordinal scale. The number of categories used in

    modeling ranges from two (high or low, Ouyang et al., 2002) to seven (no injury, minor,

    moderate, serious, severe, critical, and non-survivable, Newgard, 2008). Four and five

    categories are more common. The "KABCO" scale is the most commonly used (for

    example, Duncan et al., 1998), with "K" being the most severe category representing a

    fatal crash, "A" representing incapacitating injury, "B" representing non-incapacitating

    injury, "C" being minor injury, and "O" representing the least severe category (no

    injury/property damage only). Most state and national crash databases use this scale.

    Accordingly, four- and five-level ordinal scales are most commonly used in the models

    for injury severity.
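
    One way to encode the KABCO scale for modeling is as an ordered integer type. The sketch below follows the category definitions given above; the four-level recoding, which merges the two most severe categories, is a hypothetical example of collapsing the scale and is not taken from any of the cited studies.

```python
from enum import IntEnum


class Kabco(IntEnum):
    """KABCO levels coded so that a larger value means a more severe injury."""
    O = 0  # no injury / property damage only
    C = 1  # minor injury
    B = 2  # non-incapacitating injury
    A = 3  # incapacitating injury
    K = 4  # fatal


def collapse_to_four(level: Kabco) -> int:
    """Hypothetical four-level recoding that merges A and K into one top level."""
    return min(int(level), 3)


codes = sorted(Kabco[letter] for letter in "KABCO")
print([c.name for c in codes])
```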

    The dataset used in this research provides two measures of injury severity. The

    first measure is based purely on police records and the second derived from additional

    hospital data and interviews (further details provided in the next chapter). The first

    measure uses a four-level scale whereas the second uses a three-level scale. Models

    will be developed using each of these measures.

    2.2.3 Data Sources

    Most research in injury-severity modeling has used data from national- and state-

    level sources. For instance, the Crashworthiness Data System (CDS) from the National

    Automotive Sampling System (NASS) was used by Wang and Kockelman (2005) and

    Newgard (2008). The General Estimates System (GES) from the National Automotive

    Sampling System (NASS) was used by Delen et al. (2006) and Eluru et al. (2009).

    Chang and Mannering (1999) and Yamamoto et al. (2008) used the state-level

    Washington State Highway Accident Records Database. Non-U.S. data such as the

    Linked Accident Database from Japan (Shimamura et al., 2005) and a French accident

    database (Lapparent, 2008) have also been used.

    The data to be used in this study come from the Large Truck Crash Causation

    Study (LTCCS – discussed in more detail in Chapter 3). The database assembled by

    this study augments the conventional crash-data obtained from police reports in several

    ways. For instance, additional data related to "human factors" such as the fatigue,

    illness, and distraction of the drivers were collected. Historical records on the safety of

    the drivers, vehicles, and carriers (past violations and citations) involved in the crashes

    were also obtained and added to the crash data obtained from the police accident

    reports. Thus, the database available for this study would enable the development of a

    richer empirical specification.

    2.2.4 Modeling Method

    Almost all past research on injury-severity modeling has employed

    statistical/econometric methods. Exceptions include the Classification and Regression

    Tree (CART) method used by Kuhnert et al. (2000) and Chang and Wang (2006) and

    Artificial Neural Networks used by Delen et al. (2006). While the CART and Neural

    Network methods can help establish very flexible and non-linear relationships, their

    value as descriptive models is limited because it can be extremely difficult to interpret

    the marginal effects of various factors as implied by the estimated relationships. This

    ability to obtain physical meanings of the parameters is important from the standpoint of

    identifying appropriate counter-measures to reduce crash severity. Further, such

    methods, unlike the statistical approaches, often do not provide measures of statistical

    significance (i.e., p-values). Finally, methods like CART can be difficult to apply

    with an increasing number of explanatory variables. This dissertation focuses on the use of

    econometric methods. In the rest of this section, such methods used in past research

    for injury-severity modeling are discussed in detail.

    There are two fundamental issues in the modeling of injury-severity of crashes: (1)

    the treatment of the ordinal, injury-severity variable in the modeling process and (2) the

    incorporation of interdependencies (error correlations) among the injury-severity of the

    different persons involved in the same crash.

    2.2.4.1 Treatment of ordinal

    As discussed in Section 2.2.2, injury severity is generally recorded on an ordinal

    scale. Most commonly, the following five-level scale in the increasing order of injury

    severity is used: Property-damage only, Possible Injury, Non Incapacitating Injury,

    Incapacitating Injury, and Killed (Fatal). Thus, ordered-response discrete-choice models

    (ordered probit or ordered logit) are appropriate for the analysis of such data. In fact,

    this has been a popular approach for modeling injury severity in general (for example,

    Kockelman and Kweon, 2001; O'Donnell and Connor, 1996).

    In this approach, the observed, ordinal, injury-severity level is related to an

    unobserved (latent), continuous injury propensity, which is then related to a vector of

    explanatory variables corresponding to the crash via a linear-in-parameters

    specification.
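
    The ordered-response structure described above can be sketched numerically. The example below assumes an ordered probit (standard normal error term); the function names, coefficient values, and thresholds are hypothetical and are used only to show how the latent propensity and the thresholds jointly determine the probability of each injury level.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(x, beta, thresholds):
    """Probabilities of each ordinal level for one observation.

    Latent propensity: y* = beta . x + eps, with eps ~ N(0, 1).
    Level k is observed when the propensity falls between the k-th pair of
    adjacent thresholds, with implicit outer thresholds of -inf and +inf.
    """
    index = sum(b * xi for b, xi in zip(beta, x))
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf(c - index) for c in cuts]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]


# Hypothetical two-variable specification with four injury levels.
probs = ordered_probit_probs(x=[1.0, 0.0], beta=[0.4, -0.3],
                             thresholds=[0.0, 1.0, 2.0])
print(probs)
```

    Only one set of coefficients is estimated regardless of the number of injury levels, which is the source of the parsimony noted in the text.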

    Wang and Kockelman (2005) and O'Donnell and Connor (1996) employed the

    heteroscedastic variant of the ordered response model (the conventional ordered-

    response model assumes homoskedasticity, or equal variances). Thus, their

    specifications allowed the variance in the error term to vary systematically as a function

    of certain exogenous factors such as speed, vehicle type, vehicle weight, time of day

    and occupant characteristics.
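
    A common way of writing the heteroscedastic ordered-response model is shown below. The notation is an assumption introduced here for illustration (the cited studies may use different symbols or a different variance function): x_i denotes the explanatory variables, z_i the variables affecting the variance, mu_k the thresholds, and Phi the standard normal distribution function.

```latex
y_i^{*} = \beta' x_i + \varepsilon_i, \qquad
\varepsilon_i \sim N(0, \sigma_i^{2}), \qquad
\sigma_i = \exp(\gamma' z_i)

P(y_i = k) = \Phi\!\left(\frac{\mu_k - \beta' x_i}{\sigma_i}\right)
           - \Phi\!\left(\frac{\mu_{k-1} - \beta' x_i}{\sigma_i}\right)
```

    Setting gamma to zero gives a constant unit variance and recovers the conventional homoskedastic model.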

    In general, the maximum-likelihood procedure is used for model estimation.

    However, Xie et al. (2009) adopted the Bayesian inference procedure. The authors

    compared the Bayesian and the non-Bayesian methods and concluded that the results

    are similar with large samples.

    There are several advantages to using the ordered-response models. Such

    models explicitly recognize the ordering in the levels of injury severity. The specification

    is parsimonious (fewer parameters as there is only one propensity function to estimate).

    Finally, the interpretations are straightforward. Generally, a positive coefficient on a

    variable implies that the corresponding explanatory factor is associated with more

    severe crashes and a negative coefficient implies that the corresponding variable is

    associated with less severe crashes.

    A key shortcoming of the ordered-response model is that it is restrictive in

    capturing the effects of explanatory variables on the different levels of injury severity.

    Specifically, if a factor is estimated to increase the probability of the most-severe injury

    (i.e., fatal injury), then the specification implies that the same factor necessarily

    decreases the probability of the least-severe injury (often this is the "no-injury"

    category). However, this may not always be true. For instance, it has been shown that

    airbags decrease the likelihood of fatal injuries in the event of a crash. At the same time,

    the deployment of airbags can also cause minor injuries and hence decrease the

    likelihood of least-severe injuries. Simple ordered-response models cannot capture this

    effect. Researchers have attempted alternate approaches to address this issue and

    these are discussed in the rest of this section.
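
    The restriction described above can be verified numerically. The sketch below assumes an ordered probit with fixed, hypothetical thresholds and shows that any change in the propensity moves the probabilities of the two extreme categories in opposite directions.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def extreme_probs(index: float, thresholds):
    """P(lowest level) and P(highest level) in an ordered probit."""
    p_lowest = normal_cdf(thresholds[0] - index)
    p_highest = 1.0 - normal_cdf(thresholds[-1] - index)
    return p_lowest, p_highest


thresholds = [0.0, 1.0, 2.0]             # hypothetical fixed thresholds
before = extreme_probs(0.5, thresholds)  # propensity without the factor
after = extreme_probs(0.2, thresholds)   # a factor that lowers the propensity

# The fatal-injury probability falls, so the no-injury probability must rise.
print(before, after)
```

    Under this structure, a factor such as airbag deployment cannot be estimated to reduce both the fatal-injury and the no-injury probabilities at once.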

    The ordered-response models can be directly extended to address this issue by

    allowing for variable threshold parameters. Eluru et al. (2008) have applied such a

    "mixed generalized ordered logit model" to study injury severity of pedestrians and

    bicyclists. Another extension of the conventional ordered-response structure is the

    Partial Proportional Odds Model (Wang and Abdel-Aty, 2008). This approach also

    allows for the coefficients on the explanatory variables to be different across the injury-

    levels. However, relative to the simpler ordered-response models, the interpretations of

    the parameters from these advanced models are not straightforward.
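
    The effect of allowing thresholds to vary can be sketched as follows. This is a simplified illustration and not the specification of Eluru et al. (2008) or Wang and Abdel-Aty (2008): the thresholds are shifted linearly by a single indicator variable, the ordering of the thresholds is checked rather than guaranteed by construction, and all numerical values are hypothetical.

```python
import math


def logistic_cdf(z: float) -> float:
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def generalized_ordered_logit_probs(index, base_thresholds, shifts, z):
    """Ordered logit whose thresholds shift with an indicator variable z."""
    cuts = [m + d * z for m, d in zip(base_thresholds, shifts)]
    if any(a >= b for a, b in zip(cuts, cuts[1:])):
        raise ValueError("thresholds must remain strictly increasing")
    cdf = [0.0] + [logistic_cdf(c - index) for c in cuts] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


base = [0.0, 1.0, 2.0]     # hypothetical thresholds when z = 0
shifts = [-0.5, 0.0, 0.5]  # hypothetical threshold shifts when z = 1
without = generalized_ordered_logit_probs(0.5, base, shifts, z=0)
with_factor = generalized_ordered_logit_probs(0.5, base, shifts, z=1)
print(without, with_factor)
```

    With these values the factor lowers the probabilities of both the least severe and the most severe levels, which a fixed-threshold model cannot represent.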

    Another approach to address the above issue, while still having easily

    interpretable structures, is to use unordered specifications such as the nested-logit

    model (for instance, Savolainen and Mannering, 2007, and Abdel-Aty et al., 2003).

    Such models require the specification of a utility function corresponding to each

    alternative (unlike the ordered-probit models which use a single propensity function and

    fixed thresholds) and, hence, overcome the restrictive empirical specification.

    However, the use of nested-logit structure implies that the ordering of the injury-severity

    levels is ignored. The nested-logit models require that each alternative belong to only

    one nest, and hence, the error-correlations between adjacent levels of injury severity

    cannot all be effectively captured. A particular extension of the nested-logit model that is

    appropriate in the context of ordered alternatives is the Ordered Generalized Extreme

    Value (OGEV) model, but this structure has not been applied to injury-severity modeling.

    The most flexible error-correlations across the choice alternatives can be captured by

    the use of mixed-logit models (Train, 2009). With the exception of Milton et al. (2008),

    these methods have not been applied to injury-severity analysis. It also appears that the

    primary motivation for the above researchers to use the mixed-logit model is to capture

    heterogeneous impacts of explanatory variables on the injury severity by using random

    coefficients rather than to capture flexible error correlations.
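
    The nested-logit structure discussed above can be sketched with the standard two-level formula. The alternative names, nest assignment, utilities, and scale parameters below are hypothetical; each alternative belongs to exactly one nest, which is the restriction noted in the text.

```python
import math


def nested_logit_probs(utilities, nests, scales):
    """Choice probabilities for a two-level nested logit.

    utilities: dict alternative -> systematic utility
    nests:     dict nest name -> list of alternatives (each in one nest only)
    scales:    dict nest name -> scale (dissimilarity) parameter in (0, 1]
    """
    inclusive = {}
    for nest, alts in nests.items():
        lam = scales[nest]
        inclusive[nest] = math.log(sum(math.exp(utilities[a] / lam) for a in alts))
    denom = sum(math.exp(scales[n] * inclusive[n]) for n in nests)
    probs = {}
    for nest, alts in nests.items():
        lam = scales[nest]
        p_nest = math.exp(lam * inclusive[nest]) / denom
        within = sum(math.exp(utilities[a] / lam) for a in alts)
        for a in alts:
            probs[a] = p_nest * math.exp(utilities[a] / lam) / within
    return probs


# Hypothetical three-level example with injury and fatal sharing a nest.
utilities = {"no injury": 0.0, "injury": -0.5, "fatal": -2.0}
nests = {"none": ["no injury"], "harm": ["injury", "fatal"]}
probs = nested_logit_probs(utilities, nests, {"none": 1.0, "harm": 0.6})
print(probs)
```

    When every scale parameter equals one, the expression collapses to the multinomial logit, in which no error correlation between alternatives is allowed.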

    A third approach to capturing the ordinality among the alternatives is to model the

    ordered choice as a sequence of binary choices where each binary choice involves

    choosing a specific level relative to higher (or lower) levels. Dissanayake and Lu (2002)

    used such a sequential binary approach. Two sequential model structures were

    estimated: one in which the injury severity varied from the lowest level to the highest and

    the second in which the severity varied from the highest to the lowest. However,

    these researchers assumed that the binary choices were independent. Yamamoto et al.

    (2008) pointed out that the error terms are correlated, especially between successive

    levels, and developed an improved structure that (partially) accommodates correlations

    among the successive levels.
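
    The sequential binary structure with independent stages can be sketched as a chain of conditional probabilities. The example below assumes binary logit stages ordered from the lowest level to the highest; the function names and index values are hypothetical, and the correlation between successive stages discussed above is not represented.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def sequential_binary_probs(stage_indices):
    """Probabilities of K+1 ordered levels from K independent binary stages.

    logistic(stage_indices[k]) is the probability of stopping at level k,
    given that all lower levels have already been passed.
    """
    probs = []
    remaining = 1.0
    for index in stage_indices:
        p_stop = logistic(index)
        probs.append(remaining * p_stop)
        remaining *= 1.0 - p_stop
    probs.append(remaining)  # highest level: every stage was passed
    return probs


# Three hypothetical stages give four injury-severity levels.
print(sequential_binary_probs([0.8, 0.2, -0.5]))
```

    Reversing the order of the stages (highest level first) gives the second structure mentioned in the text; the two orderings generally imply different probabilities.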

    2.2.4.2 Incorporating interdependencies among the injuries of all persons involved in the crash

    Most of the injury-severity models can be classified as "single-equation" models

    and these do not consider the correlations among the injuries sustained by all persons

    in the same crash or in the same vehicle. A few studies have used bivariate ordered-

    response structures to account for correlations between two persons involved in the same

    crash. For example, Hutchinson (1986) analyzed the severity of injuries sustained by

    the driver and front-seat passenger simultaneously. Yamamoto and Shankar (2004)

    applied the model to the driver and the most severely injured person in the vehicle. Ouyang

    et al. (2002) studied rear-end crashes involving trucks and cars. The injury levels (on a

    binary scale) associated with both vehicles were estimated simultaneously.

    Most recently, Eluru et al. (2009) used a copula-based approach to accommodate

    the dependence in injury-severity propensities among the multiple occupants of the

    same vehicle (the dependencies among the different vehicles in the same crash were,

    however, not considered). Copulas are functions that generate stochastic-dependence

    relationships (i.e., a multivariate distribution) among random variables with given

    marginal distributions.
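
    The way a copula links given marginal distributions into a joint distribution can be sketched for two ordinal outcomes. The Farlie-Gumbel-Morgenstern (FGM) copula is used here only because it has a simple closed form; it is an assumption made for illustration and is not necessarily the form adopted by Eluru et al. (2009). The marginal probabilities and the dependence parameter are hypothetical.

```python
def fgm_copula(u: float, v: float, theta: float) -> float:
    """Farlie-Gumbel-Morgenstern copula C(u, v); theta must lie in [-1, 1]."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def joint_ordinal_probs(cum1, cum2, theta):
    """Joint probabilities of two ordinal outcomes with given marginals.

    cum1, cum2: cumulative marginal probabilities, each ending in 1.0.
    Each cell is obtained by differencing the copula over the rectangle
    defined by adjacent cumulative probabilities.
    """
    c1 = [0.0] + list(cum1)
    c2 = [0.0] + list(cum2)
    table = []
    for j in range(1, len(c1)):
        row = []
        for k in range(1, len(c2)):
            cell = (fgm_copula(c1[j], c2[k], theta)
                    - fgm_copula(c1[j - 1], c2[k], theta)
                    - fgm_copula(c1[j], c2[k - 1], theta)
                    + fgm_copula(c1[j - 1], c2[k - 1], theta))
            row.append(cell)
        table.append(row)
    return table


# Hypothetical three-level marginals for two occupants of the same vehicle.
table = joint_ordinal_probs([0.6, 0.9, 1.0], [0.5, 0.8, 1.0], theta=0.5)
print(table)
```

    The row and column sums of the table reproduce the marginal probabilities whatever the value of theta, and theta equal to zero gives independence.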

    While mixed-models and error-component structures have been routinely used in

    other fields (such as economics and travel-demand modeling) to estimate correlated

    models, it appears that such methods have had limited applications in the context of

    injury-severity analysis. This dissertation will contribute to the literature by estimating

    such advanced econometric models for injury-severity.

    2.2.5 Explanatory Factors

    The last major column in Table 2-1 identifies the primary explanatory factors used

    in past injury-severity models. These factors are classified into the following six

    categories: (1) crash characteristics, (2) driver/occupant characteristics, (3) vehicle

    characteristics, (4) environment characteristics, (5) roadway characteristics, and (6)

    occupant protection characteristics. Each of these is discussed in the rest of this

    section.

    2.2.5.1 Crash characteristics

    The characteristics of the crash that were estimated to influence injury severity

    include time of day (such as peak or off-peak and daylight or dark), crash type (frontal,

    rear-end, rollover, or other), and other crash descriptors (number of vehicles involved,

    at-fault driver, harmful event/causation).

    NHTSA (2008) reports that the period from midnight to 3 a.m. on Saturdays and

    Sundays was the deadliest of all 3-hour periods with more fatal crashes than any other

    time period. Kockelman and Kweon (2002) also find that crashes on

    Friday, Saturday, and Sunday during late night (midnight to 4 a.m.) are more severe and

    that late night on Sundays was the most dangerous time. Analysis results from Chang and

    Mannering (1999) suggest that the night time was more dangerous and rush-hour

    crashes were less severe. Eluru and Bhat (2007) report that crashes occurring between

    6 a.m. and 7 p.m. were less severe than crashes during other time periods. Eluru et al. (2009)

    also report that 12 a.m. to 6 a.m. was the most dangerous time. Friday afternoons were

    also shown to have more crashes but fewer fatal crashes in NHTSA (2008). It is

    important to note that the time-of-day of the crash could be reflective of traffic volumes,

    speeds, and lighting conditions. All the above results indicate that crashes during darker

    and less-congested time periods are more severe than crashes during brighter and

    more-congested times.

    Chang and Mannering (1999) found that summer was the most dangerous season

    followed by spring and autumn. Many other studies did not have statistically significant

    effects of the season possibly because these effects are captured by variables

    describing the road-surface condition (for instance wet/icy/snowy conditions could be

    reflective of winter conditions). It is useful to note that Chang and Mannering did not

    control for road-surface conditions.

    Crash type is another important factor affecting injury severity. Head-on crashes

    and crashes with a stationary object were most dangerous and the vehicle being struck

    received higher injury severity relative to the striking vehicle (Eluru et al., 2007).

    Kockelman and Kweon (2002) as well as Duncan et al. (1998) report that rollover

    crashes can result in more severe injuries. Other types of crashes such as the angle

    and sideswipe have not been extensively examined. Crashes that lead to fire are found

    to result in more severe injuries as would be expected. As hazardous cargo can lead to

    fire in the event of a crash, countermeasures aimed at improving the safety of trucks

    carrying such cargo become more important.

    The number of vehicles involved in the crash was also important from the

    standpoint of injury severity. NHTSA (2008) estimates that multi-vehicle crashes were

    more dangerous using aggregate data and this result was also supported by

    econometric models estimated by Chang and Mannering (1999). Yamamoto and

    Shankar (2004) also report that more passengers in the vehicle increase the severity of

    the most severe injury in the accident. At the same time, these researchers also find

    that increasing number of passengers in the vehicle decreases the injury severity of the

    driver. Such results suggest that focusing on the most-severe injury or the injury

    sustained by one of the occupants is not adequate. Rather, the injuries sustained by all

    persons involved in the crash must be studied to have a comprehensive understanding

    of the crash severity.

    Vehicle movement at the time of crash is also a factor influencing injury severity.

    Crashes while negotiating curves and passing other vehicles were shown to have

    higher probabilities of fatalities compared with other kinds of movements such as

    turning and merging (NHTSA, 2008). Among turning movements, left turns might be

    particularly critical because of the possibility of conflicts with opposing streams of traffic

    (for instance the study on left-turn crashes at intersections by Wang and Abdel-Aty

    (2008)). The impact point has also been found to determine injury severity. Based on

    research by Delen et al. (2006), occupants in the vehicle that is struck are more likely to

    sustain severe injuries compared to the occupants in the striking vehicle.

    2.2.5.2 Vehicle characteristics

    The age, size, engine, and other characteristics of the vehicle(s) involved in the

    crash affect the injury severity.

    The research by Kockelman and Kweon (2002) indicates that drivers of light- and

    heavy-duty trucks and minivans are better protected against injuries. Yamamoto and

    Shankar (2004) report that drivers of large trucks sustain less-severe injuries. Khattak

    and Rocha (2003) studied sport utility vehicles (SUVs) and found that SUVs were more

    likely to roll over, but their protective effect exceeded the harmful effect caused by rollover.

    Therefore, compared with passenger-car occupants, SUV occupants have less-severe injuries

    in the event of a crash. Eluru and Bhat (2007) showed that drivers in sedans were more

    likely to be severely injured compared to others in two-vehicle crashes. According to

    Kockelman and Kweon (2002), in two-vehicle crashes, heavy-duty trucks result in more

    severe injury for the driver of the partner vehicle. Consistent results were achieved from

    other research focusing on effect of vehicle types. In general, it appears that occupants

    of heavier vehicles often have less-severe injuries. At the same time, if heavier vehicles

    are involved in the crash, the overall severity of the crash could be higher because of

    greater injuries to the occupants of the other vehicle(s) involved in the crash.

    Some other factors such as vehicle age have also been studied. For example,

    Yannis et al. (2005) report that older vehicles (age > 35 years) are involved in more

    severe crashes. Khattak et al. (2003) report higher injury-severity to be associated with

    large trucks manufactured before 1992.

    2.2.5.3 Driver and occupant characteristics

    As already discussed, many studies have focused on the injury sustained by the

    drivers of vehicles. NHTSA (2008) estimates that drivers comprised 63% of the total

    persons injured or killed in crashes and passengers accounted for only 28%. Therefore,

    the substantive focus on drivers seems appropriate.

    The age of the drivers and the vehicle occupants has been found to be strongly

    related to injury severity. Wang and Abdel-Aty (2008) reported that very young

    (age ≤ 19) and young drivers (19 < age ≤ 24) were more likely to sustain severe injuries.

    Based on fatality and injury rates per 100,000 population (NHTSA, 2008), men aged

    21-24 and women aged 16-20 had the highest fatality rates, while both males and

    females aged 16-20 had the highest injury rates. Eluru et al. (2009) report that younger

    drivers (16-20) are less likely to be severely injured compared with drivers over 65 years

    of age. Overall, there appears to be a non-linear effect of age on injury severity with the

    youngest and the oldest being susceptible to more severe injuries (arguably for very

    different reasons) compared to the middle-aged.

    Eluru et al. (2009) also examined the effect of driver age on the injuries sustained

    by other occupants in the vehicle. Drivers over 45 years old were estimated to be

    associated with more severe injury to front seat passengers, but the driver's age did not

    significantly affect the rear passenger. Children less than 5 years old were less likely to

    be severely injured when seated in the rear, and passengers over 65 are more likely to be

    severely injured.

    Gender is also expected to influence injury severity. Based on aggregate,

    national-level crash data from 2007 (NHTSA, 2008), men had a significantly higher rate of

    being killed or severely injured compared to women. However, econometric models

    developed by Eluru and Bhat (2007) and Kockelman and Kweon (2002) suggest that

    men suffer less severe injuries compared to women, after controlling for several other

    factors that affect crash severity.

    Alcohol is another important factor studied. There were 12,998 alcohol-impaired

    driving fatalities in 2007 which accounted for 32% of all traffic fatalities for the year.

    Among the fatal crashes occurring from midnight to 3 a.m., 65% involved alcohol-

    impaired driving. All studies on alcohol (Duncan et al., 1998 and Eluru and Bhat, 2007)

    confirmed that drivers under the influence of alcohol were likely to be more severely injured

    than others, and Eluru et al. (2009) estimated that the alcohol consumption of the

    drivers also affected the injury severity of the passengers. It is useful to note here that

    alcohol records for passengers are not required in Police Accident Reports (PAR).

    Driver fatigue is another critical aspect that affects injury severity. Gander et al.

    (2006) reported that 41%-71% of the truck crashes were related to fatigue. Srinivasan

    (2003) estimated that fatigued drivers were five times as likely to be seriously injured and

    faced a 30% lower chance of experiencing a property damage only (PDO) event.

    Sleepy drivers were more likely to be involved in more-severe crashes (Khattak et al.,

    2003). Fatigue and sleeping habits can be expected to be even more important

    attributes of truck drivers compared to car drivers.

    Speeding was a main cause of crashes and contributed to 31% of all fatal

    crashes in 2007 (NHTSA, 2008). Among drivers involved in fatal crashes, young males

    were found to be more often involved in speeding, and speeding was clearly a deadly

    combination with alcohol.

    Data on drivers' history related to traffic violations have also been studied. Drivers

    with a violation history faced increases of about 15% and 22% in the probabilities of moderate

    and severe injuries, respectively (Srinivasan, 2003).

    For all the passengers in a vehicle, the seating position is also important.

    O'Donnell and Connor (1996) found that the left-rear seating position is the most

    dangerous seating position. Newgard (2008) concluded that seat position has a strong

    correlation with the passenger's age.

    2.2.5.4 Environmental characteristics

    Environmental factors affecting injury severity include weather and light conditions.

    In general bad weather has been found to be associated with lower injury severity

    (Duncan et al., 1998 and Yamamoto and Shankar, 2004). This is possibly because

    drivers are inherently more cautious and do not speed during bad weather (Eluru and

    Bhat, 2007). Weather conditions are also reflective of the road surface conditions (for

    example, rainy weather also leads to slippery road surface). The snow intensity and

    wind gust speed were studied by Khattak and Knapp (2001). The results indicated that

    higher wind gusts during snow events tended to result in more-severe injuries. The

    negative effect of snowfall intensity on injury level was attributed to the greater snow

    accumulation under higher snowfall intensity acting as an attenuator.

    The effects of light conditions are more complex than those of weather. Wang and Kockelman

    (2005) concluded that lack of light decreased the severity in one-vehicle crashes, but

    increased the injury severity in two-vehicle crashes. Eluru and Bhat (2007) found drivers

    to be less severely injured at dusk or in the dark with lighting, for the same reason as in

    bad weather. Huang et al. (2008) found that crashes at night were more serious

    than those during daytime and that poor street lighting could increase the odds of a

    severe crash by about 69%.

    Given the strong interdependencies between weather, lighting conditions, and

    driving behavior (drivers are inherently more cautious during bad weather and bad

    lighting conditions and perhaps the converse is true under good weather and good

    lighting conditions), additional research on effectively disentangling the marginal effects

    of these correlated factors is needed.

    2.2.5.5 Roadway characteristics

    Roadway characteristics include roadway design, location, surface condition,

    traffic control and traffic volume.

    Because speed is an important factor in severe injury and the actual speed is

    seldom recorded, the speed limit is often used as an important proxy variable in

    injury-severity studies. As is commonly accepted, roads with higher speed limits have a higher

    injury severity. The medium-to-high speed limit (26-64 mph) was most dangerous and

    the high speed limit (>65 mph) caused more severe injury than low speed limit roads

    (45 mph) was

    considered to be dangerous to car drivers at intersections and to both car and truck

    drivers on straight sections.

    The speed limit is also indicative of the location of the crash, such as an interstate

    highway or a state road. High speeds occur on roads in good condition, mainly highways,

    and lower speeds may occur on rural roads and highway ramps. Chang and

    Mannering (1999) estimated that rural areas were more dangerous than urban areas.

    Furthermore, road classification, curves, and uphill or downhill grades also influenced the injury

    severity in combination with the speed limit. Wang and Kockelman (2005)

    found that crashes on curves were more severe, while uphill grades increased the injury level

    in one-vehicle crashes but decreased it in two-vehicle crashes. Road signs, such as

    flashing lights, electronic displays of traffic

    conditions, and camera-warning signs, also affected the traveling speed. The effectiveness of these speed control methods has been

    studied (Alicandri and Warren, 2003; Sarasua et al., 2006).

    Another crucial factor in injury severity is the traffic condition, which is usually a temporal

    factor. Traffic varies significantly between the peak and off-peak hours of a day. Night time

    has less traffic and higher speeds. Absent special occasions, such as

    congestion caused by an incident, a sporting event, or a work zone, the traffic condition is indicated by the time of

    the crash. Special traffic conditions that disrupt traffic also affect the probability of

    crashes and injuries, but these records are not available in most police-reported data.

    Higher truck percentages on the roadway were shown to decrease the injury

    severity of accidents (Milton et al., 2008), and this was explained by the slowing effect of

    the sheer number of trucks on travel speeds.

    2.2.5.6 Occupant protection

    Occupant protection mainly refers to seat belts and airbags.

    NHTSA (2008) estimated that 15,147 lives were saved in 2007 by the use of seat

    belts. Seat belt use was studied by Eluru and Bhat (2007), which indicated that men,

    younger individuals (age

    which an airbag was not deployed. This was because airbags deployed when the vehicle

    was struck violently and the occupant was subjected to a severe shock.

    2.3 Contribution of this Dissertation

    The discussions thus far clearly highlight the value of improving truck safety and

    summarize the studies conducted to date on identifying appropriate countermeasures. In light of the

    above discussions, the following are the main empirical and methodological

    contributions of this dissertation.

    Empirical Contributions:

    The focus of this research is on large-truck crashes. Despite the unique aspects of

    truck-crashes and the importance of safe trucking to the economy of the country,

    research on the factors affecting injury-severity has been limited. This study contributes

    by using a recently-assembled national-level database to develop econometric models

    to relate injury-severity to a wide variety of explanatory factors. A rich empirical

    specification will be developed that incorporates the marginal effects of several

    explanatory factors including crash characteristics, driver characteristics, vehicle

    characteristics, environment and roadway characteristics.

    Methodological Contributions:

    Almost all past research in injury-severity modeling has used relatively simple and

    restrictive methods for analysis. For example, a key shortcoming of past research is

    that the interdependencies among the injuries sustained by the different persons

    involved in the same crash have largely been ignored. These will be explicitly

    incorporated in our models, leading to more comprehensive descriptions of the

    overall severity of multi-person, multi-vehicle crashes. A second key shortcoming is that the

    popularly used ordered-response models are restrictive in capturing the effects of

    explanatory variables on the different levels of injury severity. Specifically, if a factor is

    estimated to increase the probability of the most-severe injury (i.e., fatal injury), then the

    specification implies that the same factor necessarily decreases the probability of the

    least-severe injury (often this is the "no-injury" category). This research will use

    advanced methods such as the OGEV models to develop flexible empirical structures.
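
    The restriction described above can be illustrated with a minimal sketch of an ordered logit with five severity levels. The threshold values and the size of the shift in the latent index are hypothetical and chosen only for illustration; they are not estimates from this dissertation.

```python
import math


def logistic_cdf(z):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(index, thresholds):
    """Return P(y = k) for each ordered level k.

    index      : value of the latent propensity x'beta
    thresholds : increasing cut points separating the ordered levels
    """
    cumulative = [logistic_cdf(t - index) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical cut points: four thresholds give five severity levels
# (no injury, possible, non-incapacitating, incapacitating, fatal).
thresholds = [-1.0, 0.0, 1.0, 2.0]

base = ordered_logit_probs(0.0, thresholds)     # baseline latent index
shifted = ordered_logit_probs(0.8, thresholds)  # a factor raises the index

# The single latent index forces opposite movements at the two extremes.
fatal_rises = shifted[-1] > base[-1]
no_injury_falls = shifted[0] < base[0]
```

    Because all levels share one latent index, any factor that raises the probability of the highest level must lower the probability of the lowest one, which is the rigidity that more flexible structures are meant to relax.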

    CHAPTER 3 DATA

    In this chapter, we first present an overview of the source and characteristics of

    the LTCCS data. Then, the detailed procedures for data cleaning and variable

    reduction are described, and a description of the sample data is presented at the end of the

    chapter.

    3.1 Data Source and Raw LTCCS Data Characteristics

    This study uses data from the Large Truck Crash Causation Study (LTCCS).

    These data represent a sample of large-truck crashes that occurred between April 2001

    and December 2003. Data on approximately a thousand crashes were collected from 24

    sites in 17 states. Each crash in the LTCCS sample involved at least one large truck

    and resulted in at least one injured person. These data were collected by the Federal Motor

    Carrier Safety Administration (FMCSA) and the National Highway Traffic Safety

    Administration (NHTSA) of the U.S. Department of Transportation (DOT).

    USDOT/FMCSA (2006) and Hedlund and Blower (2006) provide an overview of the

    data. The full data and related documentation are available for download from the

    LTCCS website at: http://ai.fmcsa.dot.gov/ltccs/default.asp.

    The raw database has over a thousand data fields and is organized into 58

    different files. The data sample includes 1,070 cases with 2,284 vehicles (including

    1,141 large trucks and 1,043 passenger vehicles) and 3,014 occupants (including

    drivers and passengers). The crashes resulted in 251 fatalities and 1,499 injuries. Injury

    severity is categorized into no injury, possible injury, non-incapacitating injury,

    incapacitating injury, and killed.

    Elements that influence the severity are drawn from 43 data files and categorized

    into Injury Severity, Crash Description, Characteristics of the Occupant, Characteristics

    of the Driver (demographics, fatigue, history of crashes/violations, behavior, health,

    drugs/alcohol), Characteristics of the Vehicle (physical attributes, cargo, history of

    crashes/violations), Characteristics of the Environment (roadway features, weather) and

    Characteristics of the Carrier.

    Furthermore, the dataset can be divided into occupant, vehicle, truck, crash,

    and carrier levels to better describe and analyze the factors, since the data were collected by

    several departments focusing on different fields and some records are not complete

    for all the crashes.

    3.2 Sample Formation Procedure

    Extensive data processing was undertaken which involved cleaning, consistency

    checks, and variable selection (this was particularly important given the significant

    correlations observed among the different variables in the database) using the SPSS

    statistical software.

    3.2.1 Selecting Cases

    Some cases in the raw data are from the pilot phase of the project or were found not to

    meet the selection criteria. RATWeight is used as a weight to produce nationally

    representative estimates, and zero-weight cases should be dropped first. After dropping

    107 zero-weight cases, 963 cases remain for further processing.
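
    The case-selection step can be sketched as follows. This is an illustration only: the study used SPSS, and the records, the `severity` field, and the helper names below are hypothetical. Only the variable name RATWeight comes from the LTCCS data.

```python
def drop_zero_weight(cases, weight_key="RATWeight"):
    """Keep only the cases that carry a positive sampling weight.

    A missing weight is treated as zero here, which is an assumption.
    """
    return [c for c in cases if c.get(weight_key, 0) > 0]


def weighted_share(cases, field, weight_key="RATWeight"):
    """Weighted share of each value of `field` among the given cases."""
    total = sum(c[weight_key] for c in cases)
    shares = {}
    for c in cases:
        shares[c[field]] = shares.get(c[field], 0.0) + c[weight_key] / total
    return shares


# Hypothetical toy records, not LTCCS data.
cases = [
    {"id": 1, "RATWeight": 0.0, "severity": "K"},
    {"id": 2, "RATWeight": 12.5, "severity": "B"},
    {"id": 3, "RATWeight": 37.5, "severity": "A"},
    {"id": 4, "RATWeight": 0.0, "severity": "B"},
]

kept = drop_zero_weight(cases)
shares = weighted_share(kept, "severity")
```

    Applied to the raw sample, the same filter would take the 1,070 cases down to the 963 cases reported above.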

    3.2.2 Cleaning and Consistency Checking

    Because the raw data contain uninformative variables and inconsistent cases,

    cleaning and consistency checking are applied during sample formation to keep

    the data valid and consistent.

    Sample cleaning starts with a frequency test for all factors, because a variable

    is uninformative if over 99% of its observations take the same value. Therefore, for

    each category, we test the frequencies at the corresponding level, e.g., injury severity at the

    occupant level, crash and environment description at the crash level, characteristics of the

    occupant for car and truck occupants separately, characteristics of the vehicle and driver

    at the car and truck levels, and characteristics of the carrier for available carrier records, and

    drop the binary variables that are present in less than 1% or more than 99% of cases.
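
    The frequency screen can be sketched as below, assuming binary variables coded 0/1 with None for unrecorded values. The variable names and data are hypothetical, and the original screening was done in SPSS rather than with this code.

```python
def is_near_constant(values, low=0.01, high=0.99):
    """True if the share of 1s among recorded values is below `low` or above `high`."""
    observed = [v for v in values if v is not None]
    if not observed:
        return True  # nothing recorded, so the variable carries no information
    share = sum(observed) / len(observed)
    return share < low or share > high


def screen_variables(table):
    """Drop near-constant binary variables from a {name: list of 0/1} table."""
    return {name: col for name, col in table.items() if not is_near_constant(col)}


# Hypothetical binary variables over 200 cases.
table = {
    "rare_flag": [1] + [0] * 199,      # present in 0.5% of cases
    "wet_road": [1] * 40 + [0] * 160,  # present in 20% of cases
    "always_on": [1] * 200,            # present in 100% of cases
}

screened = screen_variables(table)
```

    Only `wet_road` survives the screen in this toy table, since the other two variables fall outside the 1% to 99% range.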

    The consistency checking includes two steps. First, we check variables within

    each category, such as crash type, driver and occupant number, time of day and

    daylight. Second, we check across the category, for example, aggregated vehicle

    record and total number of vehicles, crash type across crash and vehicle level, number

    of vehicles and crash type. The records are found to be mostly consistent, and the number

    of inconsistent cases is not large enough to affect the analysis results. After correcting

    the apparent inconsistencies that could be resolved from variables recorded in detail, we

    dropped the remaining 10 inconsistent cases. The sample size at each level is shown in

    Table 3-1, and the variables retained from the raw data after cleaning and consistency checking are

    listed in the Appendix.

    3.2.3 Variable Selection

    The number of variables is limited by the sample size. Variable selection can

    merge correlated variables and reduce the number of variables in the model, so as

    to simplify the modeling procedure and reduce the run time.

    3.2.3.1 Crosstab check

    If two or more variables describe the same or related aspects of a subject, they are

    expected to be correlated. Taking the roadway traffic (a categories) and the restriction (b

    categories) as an example, the restriction on the road is correlated with the

    traffic condition. Merging the two variables into one and recombining the values

    sensibly can reduce the a × b possible combinations to c categories (c < a × b). The same situation exists for

    other variables, such as seatbelt use, eye vision, and carrier status.
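
    The cross-tabulate-then-merge idea can be sketched as follows. The category labels and the recoding map are hypothetical and serve only to show how two correlated variables collapse into a single variable with fewer values.

```python
from collections import Counter


def crosstab(rows, var_a, var_b):
    """Count the cases for every observed combination of two variables."""
    return Counter((r[var_a], r[var_b]) for r in rows)


def merge_variables(rows, var_a, var_b, recode, new_name):
    """Add one combined variable whose value is looked up from the pair of values."""
    return [{**r, new_name: recode[(r[var_a], r[var_b])]} for r in rows]


# Hypothetical records: traffic has 2 values, restriction has 2 values.
rows = [
    {"traffic": "light", "restriction": "none"},
    {"traffic": "light", "restriction": "none"},
    {"traffic": "heavy", "restriction": "none"},
    {"traffic": "heavy", "restriction": "lane_closed"},
    {"traffic": "light", "restriction": "lane_closed"},
]

counts = crosstab(rows, "traffic", "restriction")

# Hypothetical recoding: 2 x 2 = 4 combinations collapse to 3 categories.
recode = {
    ("light", "none"): "free_flow",
    ("heavy", "none"): "congested",
    ("light", "lane_closed"): "restricted",
    ("heavy", "lane_closed"): "restricted",
}

merged = merge_variables(rows, "traffic", "restriction", recode, "road_state")
merged_values = sorted({r["road_state"] for r in merged})
```

    In practice, the cell counts from the cross tabulation would guide which combinations are sparse enough to be combined.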

    3.2.3.2 Classification analysis

    When more than three correlated or similar variables are involved, cross tabulation

    would have to be repeated several times, so classification analysis can be applied instead to reduce

    the variables more easily. In the cleaned dataset of crash characteristics, seven variables are

    used to describe weather conditions before and at the time of the crash (ENVRain,

    ENVSnow, ENVFog, AFTRain, AFTSnow, AFTFog, and Weather). The hierarchical cluster

    command can merge these seven binary variables into clusters using distance or similarity

    measures. The cluster output implies that the variables ENVRain, ENVFog, AFTRain, and

    AFTFog can be aggregated into one group and ENVSnow and AFTSnow into another

    group. For continuous variables such as driver height and weight, even after grouping the

    cases into bins of ten centimeters and ten kilograms, there are still more than six groups for these two

    variables. The K-means cluster method can be used to group the cases into clusters

    using the distribution of these two variables.
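
    A minimal sketch of hierarchical clustering applied to binary variables is given below. It is a stand-in for the SPSS command, not a reproduction of it: the simple-matching distance, the single-linkage rule, and the toy data are all assumptions made for illustration. Only the variable names come from the dataset.

```python
from itertools import combinations


def mismatch(u, v):
    """Share of cases in which two binary variables disagree."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def cluster_variables(columns, n_clusters):
    """Agglomerative single-linkage clustering of variables (not cases)."""
    clusters = [{name} for name in columns]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(
                mismatch(columns[a], columns[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


# Hypothetical indicators over eight cases, not LTCCS records.
columns = {
    "ENVRain": [1, 1, 1, 0, 0, 0, 0, 0],
    "AFTRain": [1, 1, 1, 0, 0, 0, 0, 0],
    "ENVFog": [1, 1, 0, 0, 0, 0, 0, 0],
    "AFTFog": [1, 1, 0, 0, 0, 0, 0, 0],
    "ENVSnow": [0, 0, 0, 0, 0, 1, 1, 1],
    "AFTSnow": [0, 0, 0, 0, 0, 1, 1, 0],
}

groups = sorted(sorted(c) for c in cluster_variables(columns, 2))
```

    With this toy data the rain and fog indicators fall into one group and the snow indicators into another, mirroring the grouping reported in the text.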

    3.2.3.3 Missing data

    There were several variables (particularly those describing driver and vehicle

    characteristics) of potential interest that had missing values for a significant fraction of

    the cases. Further, it was also found that these values were "systematically missing" for

    many of the variables. For example, the value of certain variables could be more likely

    to be missing for crashes with greater severity. Alternatively, the value of other variables

    could be more likely to be missing for crashes with lesser severity. Consequently,

    simply removing the cases with missing values would skew the sample (in addition to

    reducing the sample size). Therefore, we chose to retain all cases, and indicator

    variables were created to explicitly identify cases with missing values for each of the

    variables of interest. These indicator variables were also included in the model

    specifications and some of these turned out to be statistically significant (discussed

    further in the next section).
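
    The missing-value indicator approach can be sketched as follows. The variable name `cargo_weight`, the `_missing` suffix, and the fill value of 0 are hypothetical choices for illustration, not the coding used in this study.

```python
def add_missing_indicator(rows, var, fill_value=0):
    """Flag missing values of `var` and fill them so that no case is dropped.

    Adds `<var>_missing` (1 if the value was missing, else 0). The filled
    value is a placeholder; its effect is absorbed by the indicator when
    both enter the model specification.
    """
    flagged = []
    for r in rows:
        is_missing = r.get(var) is None
        new_row = dict(r)
        new_row[var + "_missing"] = 1 if is_missing else 0
        if is_missing:
            new_row[var] = fill_value
        flagged.append(new_row)
    return flagged


# Hypothetical records with one missing cargo weight.
rows = [
    {"case": 1, "cargo_weight": 12.0},
    {"case": 2, "cargo_weight": None},
    {"case": 3, "cargo_weight": 7.5},
]

flagged = add_missing_indicator(rows, "cargo_weight")
```

    All three cases are retained, and the second one is identifiable through `cargo_weight_missing`, so its effect on injury severity can be estimated rather than discarded.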

    Missing data are common in this dataset. To deal with them, it is

    best to start by identifying the type of missingness, since each type requires a different method to process

    the data.

    Harrell (2001) classified missing data into three types: missing completely at

    random (MCAR), missing at random (MAR), and informative missing (IM).

    Missing completely at random (MCAR) refers to the situation in which data elements

    are missing for reasons that are unrelated to any characteristics or responses of the

    subject. Examples include a subject omitting the response to a question for reasons

    unrelated to the response she would have made or to her characteristics (for example, overlooking the question), or

    data being lost because of a mistake in the experiment. This kind of missing data typically occurs in

    small numbers.

    Missing at random (MAR) does not mean that data elements are missing at

    random, but that the probability that a value is missing depends on values of variables that

    were actually measured and does not depend on the unobserved data. An example is

    that large-truck weight is not always recorded because not all carriers provide this

    information, but the carrier information itself is recorded.

    Informative missing (IM) refers to elements that are more likely to be missing when the true values of

    the variable are systematically higher or lower. For example, a heavily injured driver

    would have difficulty in helping record the driving information, such as speed, emotion,

    or road familiarity. IM is also called nonignorable nonresponse.

    In the studied dataset, a large number of variables have values recorded as

    unknown. When the variable records the crash type, location, or some other

    existing condition, the unknown values are most commonly missing completely at random (MCAR).

    Such unknown values are few and ignorable. Usually, these

    unknown values are combined with other low-frequency values into a new value. If the

    unknown values make up less than 1% of cases, assigning them to any group will not affect the result significantly.

    Unknown values making up over 1% of cases can be recoded to other values by linear interpolation or

    the mean. This procedure reduces the number of values for each variable.

    Another kind of "unknown" or missing record is not MCAR and is nonignorable.

    For example, for alcohol presence at the crash level, 11.9% of cases are reported as unknown. This may be caused

    by the inability to obtain a record from severely injured people or by carelessness when

    recording a no-injury crash. In the crash reports, there are also large numbers of missing records,

    such as cargo weight and driver behavior, recorded as unknown. For these

    types of missing data, we keep unknown as a value and model the impact of unknown

    on the injury response.

    The same unknown situation can arise in the records of individual age, gender,

    speed limit, etc. But if the count of unknowns for the variable is not large, it is not

    necessary to keep unknown as a special value. For a low-frequency (2%)

    unknown value, we consider it ignorable and replace it with the mean,

    median, or linear trend.

    3.3 Sample Characteristics

    The final "reduced" dataset assembled for this analysis was organized into

    three major files: Crash-level data (including highest level of injury severity in the crash,

    crash type, and environment variables), Vehicle-level data (characteristics of the trucks

    and cars such as age, body-type, cargo, occupancy levels, and deficiencies), and

    Driver-level data (characteristics of the truck and car drivers including demographics,

    fatigue, health, and behavior). The effects of all these different variables were examined

    during the statistical-modeling procedure.

    At the crash level, we have 953 cases in the estimation sample. For the occupant

    injury study, we select 918 representative crashes, which have at most 3 trucks, at most

    3 cars, and at most 5 occupants per vehicle (including the driver). In total, there are 2374

    occupants in the selected sample, including 1038 truck drivers, 145 truck occupants,

    818 car drivers, and 373 car passengers.

    Table 3-1. Sample size for each category.

    Variable category         Level            Sample size (cases)
    Injury severity           Occupant         2699
    Crash                     Vehicle          2056
                              Crash            953
    Truck and truck driver    Truck            1108
    Car and car driver        Car              941
    Truck occupant            Truck occupant   151
    Car occupant              Car occupant     499
    Carrier                   Truck carrier    733

    CHAPTER 4 MODELS FOR CRASH LEVEL INJURY

    The intent of this chapter is to present a comprehensive analysis of the "injury

    severity" of all types of crashes involving large trucks. Because this is the first step of

    this dissertation, as many variables as possible are estimated to understand the

    relationship between injury severity and a vast number of inter-dependent explanatory

    factors using crash level data. The injury severity of a crash is defined as the highest

    level of severity among all those injured in the crash.

    4.1 Sample Data

    There are 953 crashes in the estimation sample. The data were collected from

    both police accident reports and additional sources such as site investigation,

    interviews, and review of medical records conducted by the LTCCS team. Each crash

    was investigated by a two-person team comprising a trained researcher and a state

    truck inspector (see USDOT/FMCSA, 2006 for an overview). Correspondingly, there are two

    measures of injury severity available for each crash: (1) a measure determined from the

    police accident reports (referred to as PAR) and (2) a measure determined by the

    LTCCS researchers based on medical records and case narratives (referred to as RES

    for researcher-determined).

    A cross tabulation of the highest injury severity level of the 953 crashes

    determined from each of PAR and RES is presented in Table 4-1. In the case of PAR,

    the severity was measured on a four-item scale: possible injury (14%),

    non-incapacitating injury (32%), incapacitating injury (32%), and killed (22%). In the case of

    RES, the severity was measured on a three-item scale: non-incapacitating injury (48%),

    incapacitating injury (29%), and killed (23%). This is because the crashes in the LTCCS

    were sampled such that there is at least one injured person on the RES-scale.

    The cross-tabulation indicates that fatal crashes are recorded almost identically by

    both measures (approximately 22% each). However, discrepancies are observed in the

    case of non-fatal crashes. About 12% of all crashes (119 crashes) were classified

    as level C (possible injury) by PAR but as level B (non-incapacitating injury) by RES, indicating under-estimation of the

    injury severity by the police reports (consistent with the findings of others such as Tsui

    et al., 2009). At the same time, 103 crashes classified as level A (incapacitating injury) by the PAR were

    classified as level B by the RES, indicating an over-estimation by the police reports.

    Given these discrepancies, it is useful to compare models estimated using each of

    these severity measures. Such an effort is undertaken in this study.

    It is also important to note that the sampling of the crashes was not purely random.

    Therefore, weights have been calculated to scale the sample to be nationally

    representative (the RATWEIGHT variable described in the user's manual by USDOT,

    2006). Table 4-1 also presents the weighted percentages of the injury-severity levels

    according to the two measures (the last row and last column). The results indicate that

    crashes with higher levels of severity (particularly fatal crashes) have been

    oversampled.

    Table 4-1. Cross tabulation of police-determined (PAR, rows) and researcher-determined (RES, columns) injury severity levels.

    PAR \ RES                   Possible   Non-incapacitating   Incapacitating   Killed   Total   Percentage   Weighted
                                Injury     Injury               Injury                                         Percentage
    Possible Injury             0          119                  15               0        134     14.06        14.39
    Non-incapacitating Injury   0          239                  61               2        302     31.69        31.32
    Incapacitating Injury       0          103                  192              10       305     32.00        45.90
    Killed                      0          1                    5                206      212     22.25        8.40
    Total                       0          462                  273              218
    Percentage                             48.48                28.65            22.88
    Weighted Percentage                    55.00                36.52            8.48
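
    The margins and the extent of agreement between the two measures can be recomputed from the cell counts of Table 4-1, as in the sketch below. The counts are copied from the table; the function and variable names are illustrative.

```python
# Rows are PAR levels, columns are RES levels, both ordered as:
# possible injury, non-incapacitating injury, incapacitating injury, killed.
counts = [
    [0, 119, 15, 0],
    [0, 239, 61, 2],
    [0, 103, 192, 10],
    [0, 1, 5, 206],
]


def summarize(matrix):
    """Return margins and the number of crashes on, above, and below the diagonal."""
    size = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(row[j] for row in matrix) for j in range(size)]
    agree = sum(matrix[i][i] for i in range(size))
    res_higher = sum(matrix[i][j] for i in range(size) for j in range(i + 1, size))
    res_lower = sum(matrix[i][j] for i in range(size) for j in range(i))
    return row_totals, col_totals, agree, res_higher, res_lower


row_totals, col_totals, agree, res_higher, res_lower = summarize(counts)
total = sum(row_totals)
agreement_share = agree / total  # share of crashes coded identically by PAR and RES
```

    Cells above the diagonal are crashes that RES rated as more severe than PAR, and cells below the diagonal are crashes that RES rated as less severe.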

    4.2 Methodology

    Injury severity is generally recorded on an ordinal scale. Most commonly, a five-level

    scale in increasing order of injury severity is used: Property-damage only, Possible

    Injury, Non-incapacitating Injury, Incapacitating Injury, and Killed (Fatal). Thus, an

    ordered-response discrete-choice model is appropriate for the analysis of such data. In

    fact, this has been a popular approach for modeling injury severity in general (for

    example, Kockelman and Kweon, 2001). In this approach, for any crash n, the observed,

    ordinal, injury-severity level