estimation of statewide origin-destination truck flows ...docs.trb.org/prp/15-5463.pdf · 12...

Estimation of Statewide Origin-Destination Truck Flows Using Large Streams of GPS Data: An Application for the Florida Statewide Model

Akbar Bakhshi Zanjani
Graduate Research Assistant
Department of Civil & Environmental Engineering, University of South Florida
Tel: (813) 285-1831; Email: [email protected]

Abdul R. Pinjari (Corresponding Author)
Associate Professor
Department of Civil & Environmental Engineering, University of South Florida, ENB 118
4202 E. Fowler Ave., Tampa, FL 33620
Tel: (813) 974-9671; Email: [email protected]

Mohammadreza Kamali
Graduate Research Assistant
Department of Civil & Environmental Engineering, University of South Florida
Tel: (813) 713-1327; Email: [email protected]

Aayush Thakur
Senior Travel Demand Forecaster
Cambridge Systematics
Tel: (303) 357-4668; Fax: (303) 446-9111; Email: [email protected]

Jeffrey Short
Senior Research Associate
American Transportation Research Institute
Tel: (770) 432-0628; Email: [email protected]

Vidya Mysore
Freight Analysis and Modeling Specialist
Federal Highway Administration, Resource Center
Tel: (404) 562-3929; Email: [email protected]

S. Frank Tabatabaee
Systems Transportation Modeler
Florida Department of Transportation, Systems Planning Office
Tel: (850) 414-4931; Email: [email protected]

Word count: 7,388 words + 1 table × 250 + 4 figures × 250 = 8,638 equivalent words
Submission date: Aug 1, 2014
Submission for presentation and publication consideration at the 94th TRB Annual Meeting.
Statewide Travel Demand Forecasting Joint Subcommittee of ADA10 and ADB40



ABSTRACT
This paper investigates the use of large streams of truck GPS data from the American Transportation Research Institute (ATRI) for the estimation of statewide freight truck flows in Florida. To this end, first, the raw GPS data streams comprising over 145 million GPS records were used to derive a database of more than 1.2 million truck trips starting and/or ending in Florida. The paper sheds light on the extent to which these trips derived from the GPS data capture observed truck traffic flows in Florida. This includes insights on (a) the truck type composition, (b) the proportion of the truck traffic flows covered by the data, and (c) geographical differences in the coverage. The paper applies origin-destination matrix estimation (ODME) methodology to use the GPS data in combination with observed truck traffic volumes at different locations within and outside Florida to derive an origin-destination (OD) table of truck flows within, into, and out of the state. The procedures, implementation details, and experiences discussed in the paper are expected to be useful to a number of transportation planning agencies who are considering the use of GPS data for freight travel demand modeling.


1 INTRODUCTION
Accelerated growth in the volume of freight shipped on American highways has led to a significant increase in truck traffic, influencing traffic operations, safety, and the condition of highway infrastructure. Traffic congestion in turn has impeded the speed and reliability of freight movements. As freight movement continues to grow nationwide, appropriate planning and decision making processes are necessary to mitigate these impacts. However, a main challenge in establishing these processes is the lack of adequate data on freight movements such as detailed origin-destination (OD) demand data. As traditional data sources on freight movement are either inadequate or no longer available, new data sources must be investigated.

A recently available source of data on nationwide freight flows is based on a joint venture by the American Transportation Research Institute (ATRI) and the Federal Highway Administration to develop and test a national system for monitoring freight performance measures on freight-significant corridors in the nation (1). This data, obtained from trucking companies who use GPS technologies to remotely monitor their trucks, provides an unprecedented amount of data on freight truck movements in North America. Such truck GPS data potentially can be used to support planning, operation, and management processes associated with freight movements.

ATRI’s truck GPS data have been used for a variety of freight performance measurement and planning applications in the U.S., including the measurement of truck speeds on major freight corridors in the nation, truck speed reliability measurements, identification of truck flow bottlenecks, and analysis of truck parking issues. In addition, the data provides an opportunity to develop OD demand data for large geographical regions such as truck flows between urban areas, megaregional flows, statewide truck flows, and even nationwide truck flows. Since a majority of freight being shipped across the U.S. is via the truck mode, OD data and analysis of truck flows across multiple jurisdictions at a large geographical scope can significantly improve freight planning at all levels of the government.

It is important to note, however, that while ATRI’s truck GPS data comes from a large sample of trucks in North America, it is not necessarily a census of all trucks from any region. Before applying the data to estimate OD flows for any region, it is important to understand the extent to which the data covers truck flows in the region and the nature of trucks in the data (e.g., the truck type composition and the types of businesses served). As such, additional information and procedures must be employed to weight the sample of OD trip flows derived for a study area from the ATRI data to represent the population of heavy truck flows within, to, and from the study area. The weighting process is required not only for inflating the sample to the population but also for ensuring that the spatial distribution of the resulting truck flows is representative of the truck flows in the study area. One approach to do this is Origin-Destination Matrix Estimation (ODME), which involves combining the sample OD trip flows derived from the ATRI data with other sources of information on truck flows observed at various links of the highway network in the study area to estimate a full OD flow matrix representing the population of truck flows in the study area.

This paper demonstrates the use of ATRI’s truck GPS data in combination with other observed data on truck traffic flows to estimate a statewide OD table of truck flows within, into, and out of Florida. In doing so, the paper sheds light on the extent to which ATRI data captures the observed truck traffic flows in Florida. This includes insights on (a) the types of trucks (e.g., heavy trucks and medium trucks) present in the data, (b) the geographical coverage of the data in Florida, and (c) the proportion of the truck traffic flows in the state covered by the data. Similar procedures can be used to evaluate ATRI data in terms of its coverage of truck flows in other states as well as the entire nation. Further, the paper applies an ODME methodology to use ATRI’s GPS data for estimation of OD truck flows over large geographical regions (and describes the practical aspects in doing so). The data is being considered for use in regional and statewide freight travel demand models by a number of transportation planning agencies in the U.S. and Canada. To aid these agencies, the paper documents lessons learned from using the data for estimating statewide truck OD flows. Finally, since the truck flow OD tables estimated in this research are primarily for validating and calibrating the freight component of the Florida statewide travel demand model (FLSWM), this paper sheds light on using ATRI’s GPS data for statewide freight travel demand modeling purposes.

Section 2 briefly describes ATRI’s truck GPS data used in this study and the procedure used to convert it into truck trips and a corresponding OD table. Section 3 provides an assessment of the data in terms of its truck type composition, its coverage of truck flows in Florida, and geographical differences in the coverage. Section 4 presents the ODME methodology, the inputs and assumptions for the ODME procedure used in this study, and the results and their validation. Section 5 concludes the paper.

2 DESCRIPTION OF ATRI’S RAW GPS DATA AND ITS CONVERSION INTO A TRUCK TRIP OD TABLE

2.1 ATRI’s GPS Data
To derive truck OD flow tables for Florida, the research team worked with Florida-specific raw GPS data from ATRI for four months (March, April, May, and June) in 2010. Specifically, for each of these four months, all trucks from ATRI’s database that were in Florida at any time during the month were extracted. Subsequently, all GPS records of those trucks were extracted for the entire month, as they traveled within Florida as well as in other parts of North America. This allows the examination of truck movements within Florida as well as truck flows into (and out of) Florida from (to) other locations. The number of GPS records for each month was over 35 million, summing up to over 145 million records for the four months.

Each GPS record contained information on its spatial (latitude/longitude) and temporal (date/time) location along with a unique truck ID that did not change across all the GPS records of the truck for a certain time period varying from a day to over a month (at least two weeks for most trucks in the data). In addition to this information, a portion of the GPS data contained spot speeds (i.e., the instantaneous speeds) of the truck, while the remaining portion of the database did not. These two types of data were separately delivered, presumably because they come from different truck fleets with different GPS technologies. The frequency (i.e., ping rate) of the GPS data streams varied considerably, with the interval between consecutive records ranging from a few seconds to over an hour.

Information about individual trucks, such as the commodity, the weight or volume carried, or the type of truck, was not available. Since the data was collected originally for measuring truck travel speeds on freight-significant corridors, the data comprises predominantly tractor-semitrailer combinations or larger trucks (or heavy trucks) that tend to travel on such corridors. Such trucks can be categorized as class 8 to class 13 of FHWA’s vehicle classification scheme.


2.2 Conversion of GPS Data into a Truck Trip OD Table
The raw GPS data must be converted into a truck trip format before it can be used for most transportation modeling and planning purposes, including OD table estimation. The algorithm for converting the GPS data into truck trips is briefly described here. Pinjari et al. (2) provide full details on the procedure and the rationale behind it.

(1) Sort GPS data for each truck ID into time series, in the order of date & time of GPS records.
(2) Identify potential trip-ends (origins/destinations) based on travel speed between consecutive records (calculated using spatial movement and time gap between consecutive records).
    a. If travel speed between consecutive GPS records was less than 5 mph, the truck was assumed to be at rest (i.e., at a stop). The 5 mph speed cut-off was verified using data with spot speeds and based on the literature (3-5).
    b. Not all truck stops are valid trip-ends. A truck stop was considered to be a trip-end if the stop dwell-time (i.e., stop duration) was greater than a minimum dwell-time buffer value. Stops with a dwell-time shorter than the buffer were considered to be insignificant stops, including traffic signal stops, congestion stops, fueling stops, etc. At the beginning of the algorithm, a 30-minute minimum dwell-time buffer was used to identify truck trip-ends. Stops of less than 30-minute duration were considered to be intermediate stops not intended for pickup/delivery.
    c. Combine very small trips (< 1 mile trip length) with preceding trips or eliminate them, because most such small movements were within large establishments.
    d. Eliminate poor quality trips based on data quality issues such as consecutive GPS records with large time gaps or with unrealistically high travel speeds.
(3) Eliminate trip-ends in rest areas and other locations that are unlikely to be pickup/delivery stops
    a. by overlaying trip ends on a geographic file of rest areas, wayside parking stops, and similar locations,
    b. by eliminating stops within close proximity (800 feet) of interstate highways, most of which are likely to be rest areas or wayside parking stops, and
    c. by joining consecutive trips ending and beginning at such stops.
(4) Find circular (i.e., circuitous) trips and break each of them into multiple valid trips.
    a. Trips with a ratio of air distance to network distance of less than 0.7 were considered circular trips, with a high likelihood of a valid intermediate trip-end that was missed due to counting only stops with at least 30-min dwell-time as valid trip-ends.
    b. Use raw GPS data between the origin and destination of circular trips to split them into an appropriate number of shorter, non-circular trips by allowing smaller dwell-time buffers at the destinations. For this, implement step 2 with a smaller dwell-time buffer (15 minutes) and go through steps 3 and 4 to find any remaining circular trips. Repeat the process with a dwell-time buffer of 5 minutes to split remaining circular trips.
(5) Conduct additional quality checks and eliminate trips that do not satisfy quality criteria.

Using the above-described procedure, a total of over 2.7 million truck trips were derived using ATRI’s Florida-specific raw GPS data of over 145 million records from four months in 2010. Over 1.2 million of these trips were either within Florida or had one end in Florida. The trip end locations of the trips derived from the above procedure were overlaid on a traffic analysis zone (TAZ) layer of the FLSWM to aggregate the trips into TAZ-to-TAZ OD flows.

3 ASSESSMENT OF ATRI’S TRUCK GPS DATA AND ITS COVERAGE OF TRUCK TRAFFIC IN FLORIDA

3.1 Truck Type Composition in ATRI Data
The major sources of ATRI’s data are large trucking fleets, which typically comprise tractor-semitrailer combinations that predominantly serve the purpose of long distance freight hauling. However, a close observation of the data, through following several trucks on Google Earth and examining travel characteristics of individual trucks, suggested that a small proportion of trucks in the data were more likely to be medium trucks (e.g., single unit trucks and straight/box trucks) that predominantly serve the purpose of local delivery and distribution in urban areas. Although small in number, these trucks were observed to make a large number of short trips, most likely for local delivery and distribution, which are not of primary interest for FLSWM. Since the data does not provide information on the vehicle classification of each individual truck, heuristics were developed to classify the trucks into heavy trucks and medium trucks utilizing the travel characteristics of individual trucks over extended time periods (i.e., at least two weeks).

The 2.7 million truck trips derived from four months of ATRI’s GPS data corresponded to 169,714 unique truck IDs. From these trucks, those that did not make at least one trip of 100 miles in a two-week period and those that made more than 5 trips per day were assumed to be medium trucks that are used predominantly for local delivery and distribution, and were removed from further consideration. These comprise 4.7% of the trucks (7,936 trucks) in the database that made 13.5% of all trips extracted from the database. After removing these trucks, over 2.34 million trips extracted from GPS data of over 161,776 unique truck IDs were considered as trips made by heavy trucks that predominantly carry freight. These trips were further used for OD matrix estimation.

Note: It is not necessarily the case that only heavy trucks carry freight over long distances while only medium trucks serve the purpose of local delivery and distribution. Further research is needed to identify the composition of the trucking fleet in the ATRI data and the purposes served by those trucks.

3.2 What Proportion of Heavy Truck Traffic Flows in Florida Is Captured in ATRI Data?
To address this question, truck traffic flows in one week of ATRI’s truck GPS data were compared with observed truck traffic volumes from Telemetered Traffic Monitoring (TTM) sites in Florida for that week. This section describes the procedure and results from this analysis.
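The heavy/medium screening rule from Section 3.1 can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function names, the input layout (a list of trip lengths per truck ID plus a count of days observed), and the choice to apply the 5-trips-per-day threshold to the average over the whole observation window are all assumptions made here. Only the 100-mile and 5-trips-per-day thresholds come from the text.

```python
# Thresholds quoted in Section 3.1; how they are applied below is an assumption.
MIN_LONG_TRIP_MILES = 100.0
MAX_TRIPS_PER_DAY = 5.0


def is_medium_truck(trip_miles, days_observed):
    """Return True if a truck ID looks like a local-delivery (medium) truck.

    trip_miles: lengths in miles of all trips derived for one truck ID.
    days_observed: number of days the truck ID was tracked (hypothetical
    input; the paper works with periods of at least two weeks).
    """
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    has_long_trip = any(miles >= MIN_LONG_TRIP_MILES for miles in trip_miles)
    trips_per_day = len(trip_miles) / days_observed
    return (not has_long_trip) or trips_per_day > MAX_TRIPS_PER_DAY


def heavy_truck_ids(trips_by_truck, days_by_truck):
    """Return the sorted truck IDs that pass the screening (assumed heavy)."""
    return sorted(
        truck_id
        for truck_id, trip_miles in trips_by_truck.items()
        if not is_medium_truck(trip_miles, days_by_truck[truck_id])
    )


# Made-up example data, not taken from the ATRI database.
trips_by_truck = {
    "T1": [250.0, 30.0, 12.0],    # one long haul, few trips
    "T2": [8.0] * 90,             # many short trips, no long haul
    "T3": [120.0] + [5.0] * 100,  # a long haul, but about 7 trips per day
}
days_by_truck = {"T1": 14, "T2": 14, "T3": 14}
heavy = heavy_truck_ids(trips_by_truck, days_by_truck)
print(heavy)
```

In this made-up example only T1 is retained: T2 never makes a 100-mile trip, and T3 exceeds the trips-per-day threshold.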

One week of ATRI data (May 9-15, 2010) was used to derive weekly ATRI truck traffic volumes at the TTM sites. Generating data on weekly ATRI truck traffic volumes at each TTM location required counting the number of times the trucks in ATRI data crossed the location in the week. To do so, the truck trips generated from the procedure discussed earlier were isolated for the week of May 9-15, 2010. For each of these trips, given the origin and destination, a sample of en-route GPS records between the trip-ends (sampled at a 5-minute interval) were map-matched to the FLSWM highway network using the network analyst tool in ArcGIS. The map-matching algorithm snaps the GPS points to the nearest roadway links and also determines the shortest path between consecutive GPS points. Since intermediate GPS points between the trip-ends were sampled at only a 5-minute interval, this procedure results in a sufficiently accurate route for the trip. The output from this process was an ArcGIS layer containing the travel routes for all trips generated from ATRI’s one-week truck GPS data. This ArcGIS layer was intersected with another layer of FLSWM network containing the TTM stations. This helped estimate the number of ATRI truck trips crossing each TTM station (i.e., the volume of ATRI trucks crossing TTM stations).
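The counting step at the end of this procedure can be sketched in plain Python, setting aside the GIS work. The sketch assumes the map-matching has already produced, for each trip, an ordered list of network link IDs, and that each TTM station sits on exactly one link. The link IDs, station names, and the function `count_station_crossings` are hypothetical; the paper performs this step as a layer intersection in ArcGIS.

```python
from collections import Counter


def count_station_crossings(trip_routes, station_links):
    """Count how many times matched trip routes traverse each station link.

    trip_routes: iterable of routes, each a list of link IDs for one trip.
    station_links: dict mapping station name -> link ID (one link per
    station is an assumption of this sketch).
    """
    link_to_station = {link: name for name, link in station_links.items()}
    counts = Counter({name: 0 for name in station_links})
    for route in trip_routes:
        for link in route:
            station = link_to_station.get(link)
            if station is not None:
                counts[station] += 1
    return counts


# Made-up routes and stations for illustration only.
trip_routes = [[1, 2, 3], [2, 3, 4], [5]]
station_links = {"TTM_A": 2, "TTM_B": 4, "TTM_C": 9}
crossings = count_station_crossings(trip_routes, station_links)
print(dict(crossings))
```

Stations that no route touches keep a count of zero, which matters later when coverage is expressed as a share of the observed volume at each site.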

The truck traffic volumes derived from ATRI data were compared with observed volumes of heavy trucks (of class 8 to 13) extracted from FDOT’s TTM data for the same week. Figure 1 shows the average daily heavy truck volumes at over 200 TTM locations in Florida (top map in the figure). Only 160 of these locations had traffic count data for all 7 days in the specific week under consideration. Therefore, only these locations were selected for comparing the ATRI truck traffic volumes with observed truck traffic volumes. The bottom map in Figure 1 shows the results of this comparison for TTM locations with observed heavy truck traffic volumes greater than 1,000 per day (locations with smaller volumes are not shown for clarity of presentation). Clearly, at no single location does the ATRI data provide 100% coverage of the observed heavy truck volume. However, the data does provide some coverage of the heavy truck traffic at all locations. At most of these locations, at least 5% of the observed heavy truck volumes are captured in ATRI data.

Table 1 shows these results aggregated by highway facility type. Note from the third column that the bulk of heavy truck traffic counts (65.6%) are observed on freeways and expressways, which represent only 18.1% of the 160 TTM sites considered in this analysis. The last row in the fourth column shows that a total of 163,467 ATRI truck crossings were counted at the 160 TTM locations. Note from the same column that the distribution of these ATRI truck traffic counts across different facility types is similar to the distribution of observed truck counts across facility types in the third column. This result suggests that the ATRI data provides a representative coverage of heavy truck flows through different facility types in the state. The last column expresses truck traffic counts from ATRI data as a percentage of observed heavy truck traffic counts at the TTM locations. Overall, it can be concluded that the truck trip OD table derived from ATRI data (in 2010) provides about 10% coverage of heavy truck flows observed in Florida. This result is useful in many ways. For example, the OD table derived from ATRI data can be weighted (10-fold) to create a seed matrix for use as an input into the ODME process.

FIGURE 1 (a) Observed heavy truck traffic flows at different TTM sites in Florida; (b) Percentage of observed heavy truck (classes 8-13) volumes represented by ATRI data at TTM sites in Florida during May 9–15, 2010.

TABLE 1 Coverage of Heavy Truck Traffic Volumes in Florida in ATRI Data (for One Week from May 9 to 15, 2010)

Facility Type | No. of TTM Traffic Counting Stations | Observed Truck Traffic Volumes (Class 8-13), May 9-15, 2010 | Truck Traffic Volumes in ATRI Data, May 9-15, 2010 | % Coverage
Freeways & Expressways | 29 (18.1%) | 1,063,765 (65.6%) | 111,608 (68.3%) | 10.5%
Divided Arterials | 64 (40.0%) | 333,791 (20.6%) | 30,472 (18.6%) | 9.1%
Undivided Arterials | 52 (32.5%) | 101,066 (6.2%) | 6,969 (4.3%) | 6.9%
Collectors | 8 (5.0%) | 42,164 (2.6%) | 5,127 (3.1%) | 12.2%
Toll Facilities | 7 (4.4%) | 80,493 (5.0%) | 9,291 (5.7%) | 11.5%
Total | 160 | 1,621,279 | 163,467 | 10.1%
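The coverage percentages in Table 1 follow directly from the two volume columns, and the arithmetic can be checked with a few lines of Python. The volumes below are copied from the table; the variable and function names are illustrative only.

```python
# (observed heavy truck volume, ATRI truck volume) by facility type, from Table 1.
table1 = {
    "Freeways & Expressways": (1_063_765, 111_608),
    "Divided Arterials": (333_791, 30_472),
    "Undivided Arterials": (101_066, 6_969),
    "Collectors": (42_164, 5_127),
    "Toll Facilities": (80_493, 9_291),
}


def coverage_pct(observed, atri):
    """ATRI volume as a percentage of the observed volume."""
    return 100.0 * atri / observed


coverage_by_type = {
    facility: round(coverage_pct(observed, atri), 1)
    for facility, (observed, atri) in table1.items()
}
total_observed = sum(observed for observed, _ in table1.values())
total_atri = sum(atri for _, atri in table1.values())
overall_coverage = round(coverage_pct(total_observed, total_atri), 1)
implied_expansion = total_observed / total_atri  # close to the 10-fold weight

print(coverage_by_type)
print(total_observed, total_atri, overall_coverage, round(implied_expansion, 2))
```

The overall ratio of observed to ATRI volume comes out slightly under 10, which is consistent with the 10-fold weight the paper uses for the seed matrix.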

3.3 Geographical Coverage of ATRI Data in Florida
The bottom map in Figure 1 sheds light on the geographical differences in the extent to which ATRI data captures observed heavy truck volumes in the state. Specifically, it can be observed that the coverage in the southern part of Florida (within Miami) and the southern stretch of I-75 (i.e., in and below Tampa) is lower than the coverage in the northern and central Florida regions.

To further assess the statewide geographical coverage of ATRI data in detail, the OD table derived from ATRI data was aggregated into trip productions (i.e., the number of trips beginning in a zone) and trip attractions (i.e., the number of trips ending in a zone) for TAZs in the FLSWM. The trip attractions and productions were then plotted on a GIS layer of FLSWM TAZs within Florida. These maps are not presented in the paper to conserve space (see (2) for the maps), but the findings are briefly discussed here. It was observed that the Everglades region in south Florida and some TAZs in northwest Florida had zero trip productions and/or attractions in the OD table derived from ATRI data. This could be due to two reasons: (1) low penetration of ATRI data in those TAZs, or (2) those TAZs did not have heavy truck trip generation in reality. It is reasonable to expect the Everglades region in Florida to have little to no truck trip generation. To investigate zero truck trip generation among the northwestern TAZs in the state, we examined Figure 1 (top map) for observed heavy truck flows in the TTM data. Except along the I-10 corridor, the northwest region of the state does not have high truck traffic volumes. This suggests that the zero trip generations in ATRI data for several TAZs in the northwest part of Florida are a reasonable representation of truck flows in that region.
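The aggregation of an OD table into trip productions and attractions is a pair of row and column sums. A minimal sketch, assuming the OD table is held as a dictionary keyed by (origin zone, destination zone), is given below. The zone labels and function names are hypothetical, and the zero-generation check flags zones with neither productions nor attractions, which is one of several ways to read "zero trip productions and/or attractions".

```python
from collections import defaultdict


def productions_attractions(od_flows):
    """Sum an OD table into trip productions and attractions per zone.

    od_flows: dict mapping (origin_zone, destination_zone) -> number of trips.
    """
    productions = defaultdict(float)
    attractions = defaultdict(float)
    for (origin, destination), trips in od_flows.items():
        productions[origin] += trips
        attractions[destination] += trips
    return dict(productions), dict(attractions)


def zones_without_trips(all_zones, productions, attractions):
    """Zones with no trips beginning or ending in them (assumed definition)."""
    return sorted(
        zone for zone in all_zones
        if productions.get(zone, 0) == 0 and attractions.get(zone, 0) == 0
    )


# Made-up OD table over four zones.
od_flows = {("Z1", "Z2"): 40, ("Z2", "Z1"): 35, ("Z1", "Z3"): 5}
prods, attrs = productions_attractions(od_flows)
empty_zones = zones_without_trips(["Z1", "Z2", "Z3", "Z4"], prods, attrs)
print(prods, attrs, empty_zones)
```

The same sums can be repeated after mapping TAZs to counties to obtain the county-level totals used to look for spatial bias.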
The TAZ-level trip productions and attractions were further aggregated to the county level to identify any potential spatial biases in ATRI data. It was observed that Duval (Jacksonville), Polk, Orange (Orlando), Miami-Dade (Miami), and Hillsborough (Tampa) counties, in that order, had the highest truck trip generation. It is reasonable that counties within major metropolitan areas in the state have the highest heavy truck trip generation. Whereas Polk County is expected to have a high truck trip generation due to the presence of several freight distribution centers, it is interesting that the county had higher truck trip generation than the Tampa, Orlando, and Miami regions. Further, the truck trip generation in Southeast Florida (Miami, Broward and Palm Beach counties) was smaller than that in Polk County. Recall from the first paragraph in this section that the ATRI data coverage of heavy truck flows in the southern part of Florida was lower than in other locations in Florida. These trends are likely to be a manifestation of spatial biases in the data. To address such spatial biases, the ODME process combines the truck trip flows derived from the ATRI data with observed heavy truck traffic volumes at different locations in the state.

4 ORIGIN DESTINATION MATRIX ESTIMATION OF STATEWIDE TRUCK FLOWS
ODME is a procedure used to update an existing matrix of OD flows using information on traffic volumes observed at various locations in the transportation network (6). The method has been used widely for passenger travel demand estimation and to a relatively small extent for freight demand estimation (7). However, estimation of reliable OD matrices from traffic count data is a challenging exercise, since the observed OD data (from typical approaches such as establishment surveys or roadside interviews) is often limited and the ODME procedures can lead to multiple non-unique OD matrices that may provide equally good fit to observed traffic volumes. ATRI’s GPS data provides unprecedented amounts of observed truck OD flow data, which offers an opportunity to estimate truck OD flow matrices in a potentially more reliable manner. ODME can be used to factor the seed matrix derived from ATRI data in such a way that the resulting estimated OD matrix, when assigned to the highway network, closely matches observed heavy truck counts at various locations on the network. Recent attempts at estimating truck OD flows using this data include studies by Bernardin et al. (3) and Bernardin and Short (8).
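The idea of adjusting a seed matrix so that assigned flows match link counts can be illustrated with a toy problem. The sketch below is not the algorithm in Cube Analyst Drive. It assumes a fixed link-usage matrix `A` (in practice this comes from a user equilibrium assignment and changes with the flows), assumes squared-error forms for both the count mismatch and the departure from the seed, and solves the bounded problem with a basic projected gradient descent. The numbers, the function name `estimate_od`, and the parameters `weight`, `step`, and `iters` are all illustrative.

```python
def estimate_od(A, b, x0, lower, upper, weight=0.01, step=0.1, iters=2000):
    """Toy ODME: minimize ||A x - b||^2 + weight * ||x - x0||^2 within bounds.

    A[l][k] is the share of OD pair k's flow using count link l, b holds the
    observed counts, and x0 is the seed flow vector (flattened OD matrix).
    """
    n_links, n_od = len(A), len(x0)
    x = list(x0)
    for _ in range(iters):
        # Residual between assigned and observed counts on each link.
        resid = [
            sum(A[l][k] * x[k] for k in range(n_od)) - b[l]
            for l in range(n_links)
        ]
        # Gradient of the count term plus the seed-matrix penalty.
        grad = [
            2.0 * sum(A[l][k] * resid[l] for l in range(n_links))
            + 2.0 * weight * (x[k] - x0[k])
            for k in range(n_od)
        ]
        # Gradient step, then projection back onto the lower/upper bounds.
        x = [
            min(max(x[k] - step * grad[k], lower[k]), upper[k])
            for k in range(n_od)
        ]
    return x


# Two OD pairs; link 1 carries pair 1 only, link 2 carries both pairs.
A = [[1.0, 0.0], [1.0, 1.0]]
b = [15.0, 40.0]    # observed truck counts (made up)
x0 = [10.0, 20.0]   # seed flows (made up)
lower = [0.5 * v for v in x0]
upper = [2.0 * v for v in x0]
x_hat = estimate_od(A, b, x0, lower, upper)
assigned = [sum(A[l][k] * x_hat[k] for k in range(2)) for l in range(2)]
print(x_hat, assigned)
```

With a small penalty weight the estimate moves close to the flows that reproduce the counts (about 15 and 25 here), while a larger weight would hold it nearer the seed.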
4.1 The ODME Methodology
The specific ODME procedure used in this research, embedded in the Cube Analyst Drive software, is an optimization problem that tries to minimize a function of the difference between observed traffic counts and estimated traffic counts (from the estimated OD matrix) and the difference between the seed matrix and the estimated OD matrix, as below:

\[
\hat{X} = \arg\min_{X} J(X) = F(AX - b) + G(X - X_0) \quad \text{subject to } X \ge 0 \text{ and } X_{lower} \le X \le X_{upper} \tag{1}
\]

In this optimization problem, $X$ is the OD matrix to be estimated, $X_0$ is the seed OD matrix, $G$ is a function measuring the distance between the estimated matrix and the seed matrix, $b$ is a vector of observed counts at different locations in the study area, $A$ is the route choice probability matrix obtained from assignment of OD flows in $X$ on the network using the user equilibrium method, $AX$ is a vector of estimated traffic counts, and $F$ is a function measuring the difference between estimated and observed traffic counts. The procedure attempts to arrive at an OD flow matrix $X$ in such a way that the resulting traffic volumes ($AX$) match closely with observed traffic flows ($b$). At the same time, the procedure avoids overfitting to observed traffic flows by including the term $G(X - X_0)$ so that the estimated matrix has a structure similar to that of the seed matrix. $X_{lower}$ and $X_{upper}$ are boundaries (lower and upper bounds) within which the estimated matrix should fall. The analyst can use these boundary constraints to set lower and upper bounds on the estimated matrix, relative to the seed matrix.

The estimated OD matrix may be evaluated by comparing estimated heavy truck traffic volumes and observed heavy truck traffic volumes at different locations within and outside Florida (for a set of validation data that was not used for ODME). Specifically, a root mean square error (RMSE) can be evaluated as:

\[
RMSE = \frac{1}{C_{avg}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( V_i - C_i \right)^2} \tag{2}
\]

where $V_i$ is the estimated truck volume on link $i$, $C_i$ is the observed truck volume on link $i$, $C_{avg}$ is the average heavy truck traffic count over the entire set of observations, and $N$ is the total number of truck counting locations. In addition to comparing observed and estimated traffic volumes, it is important to assess the reasonableness of the estimated OD matrix in different ways. Aggregating the OD matrix to a coarser spatial resolution and examining the spatial distribution of flows, examining the trip productions and trip attractions for each aggregate spatial zone, and examining the trip length distribution of the estimated OD matrix in comparison to the seed OD matrix are different ways of assessing the estimated OD matrix.

4.2 Inputs for ODME
The primary inputs to the ODME procedure are the seed OD matrix (derived from ATRI data), a highway network for the study area along with information on the travel times and capacity of each link in the network (extracted from FLSWM), and observed truck traffic volumes on different links in the network. In addition, OD flow matrices corresponding to travel other than freight truck flows, namely non-freight truck travel and passenger travel (both extracted from FLSWM), were provided as inputs to generate realistic travel conditions in the network.

Seed matrix: This is a matrix of FLSWM TAZ-to-TAZ truck trip flows derived from ATRI’s truck GPS data. Specifically, the OD matrix of heavy truck flows obtained from 4 months (122 days) of ATRI data was divided by 122 to obtain the seed matrix for an average day and then multiplied by 10 to account for the fact that the data represents 10% of observed truck traffic flows in the state.
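The expansion step described above can be sketched as follows. This is a minimal illustration, not the study's code: the function name, the dictionary layout, and the example trip counts are hypothetical, and a single uniform coverage share of 10% is assumed for every OD pair.

```python
def expand_seed(trip_counts, n_days=122, coverage=0.10):
    """Convert trips observed over n_days into estimated average daily flows.

    trip_counts maps (origin_taz, destination_taz) -> number of GPS trips.
    coverage is the assumed share of all truck flows captured by the sample.
    """
    return {od: count / n_days / coverage for od, count in trip_counts.items()}


# Hypothetical trip counts for three OD pairs over the 122-day period.
raw_counts = {(101, 205): 61, (101, 330): 244, (205, 101): 12}
seed = expand_seed(raw_counts)
for od, flow in sorted(seed.items()):
    print(od, round(flow, 2))
```

Dividing by 122 and then by 0.10 is the same operation as dividing by 122 and multiplying by 10. Because the factor is uniform, any OD pair whose true coverage differs from 10% is over- or under-expanded by this step.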

In FLSWM, Florida, other states in the U.S., and Canada are divided into 6,242 TAZs, with 5,403 of these zones in Florida. Therefore, the seed matrix has a total of about 39 million OD pairs. The 2.34 million trips extracted from four months of ATRI data for this OD matrix were between only 0.41 million of the 39 million OD pairs. The remaining 38.5 million OD pairs in the seed matrix were zero-cells (i.e., they had no trips). This is an important issue to address because most ODME methods used in practice result in zero trips for OD pairs that began as zero-cells in the seed matrix. A common approach to address this issue is to introduce a small positive number (say, 0.01) for zero-cells in the seed matrix that the analyst believes should have trip flows. To assess which zero-cells in the seed matrix were expected to have trip flows, the structure of the seed matrix was examined by aggregating it to the county level in Florida and the state level outside Florida.
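A minimal sketch of this aggregation check is given below. The sparse dictionary representation, the function names, and the toy zone-to-district mapping are assumptions made for illustration; only the zone count (6,242) comes from the text.

```python
from collections import defaultdict


def aggregate_od(od_flows, zone_to_district):
    """Sum TAZ-to-TAZ flows into district-to-district flows.

    A district is a county inside Florida or a state outside Florida.
    """
    agg = defaultdict(float)
    for (origin, dest), trips in od_flows.items():
        agg[(zone_to_district[origin], zone_to_district[dest])] += trips
    return dict(agg)


def zero_pairs(agg_flows, districts):
    """List district pairs that carry no trips after aggregation."""
    return [(o, d) for o in districts for d in districts
            if agg_flows.get((o, d), 0.0) == 0.0]


# Size of the full TAZ-level matrix: 6,242 x 6,242 = 38,962,564 cells.
total_pairs = 6242 ** 2

# Toy example with four zones (hypothetical IDs and flows).
zone_to_district = {1: "Polk", 2: "Polk", 3: "Hillsborough", 4: "Georgia"}
taz_flows = {(1, 3): 5.0, (2, 3): 2.5, (3, 4): 1.0}
county_flows = aggregate_od(taz_flows, zone_to_district)
empty = zero_pairs(county_flows, ["Polk", "Hillsborough", "Georgia"])
print(county_flows)
print(len(empty), "district pairs have no trips")
```

Storing only the non-zero cells keeps the matrix manageable, since roughly 99% of the TAZ-level cells are empty.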

Out of a total of 67×67 (4,489) county-to-county OD pairs in Florida, the seed OD matrix derived from ATRI data had trips for 79.4% (3,564) of the OD pairs. The remaining 20.6% of OD pairs did not have trips. A closer examination suggested that some rural counties in northwest Florida and a few rural counties in the southwest (such as in the Everglades) have a higher occurrence of zero trip flows to/from other counties in Florida and other states. Combining this information with the earlier discussion (section 4.3) on the geographical coverage of ATRI data, it can be concluded that the zero truck flows in the seed matrix to/from counties in the northwestern and southwestern parts of Florida are likely because these counties may not actually have truck flows


to/from a large number of locations. Further, considering that the seed matrix was derived from four months of GPS data (which is a large amount of data), if some OD pairs at a county-level resolution did not have any trip exchanges, it was considered reasonable to assume that those OD pairs may not have truck flows in reality. On the other hand, for OD pairs with both ends outside Florida, 350 of the 2,500 (50×50) state-to-state OD pairs did not have any trips in the OD table. Since the data is Florida-centric, the seed OD matrix is not necessarily a good representation of truck flows between OD pairs outside Florida.

Observed truck traffic volumes: This data was gathered for several locations within and outside Florida for the same four-month duration for which the seed matrix was available, and was used in the form of average daily truck traffic. Since the OD matrix to be estimated includes truck flows between Florida and other states, it was considered important to include truck traffic counts outside Florida as well. Data on truck traffic counts in Florida was obtained from Florida Department of Transportation (FDOT)’s TTM sites at over 200 locations on Florida’s highway network (counts were obtained separately for each direction, for a total of 413 traffic counts). Of the 413 heavy truck counts at different locations in Florida, data from 365 locations (i.e., input stations) were used in the ODME, while data from the remaining 48 locations (i.e., validation stations) were kept aside for validation.

For Georgia, truck traffic counts from Georgia Automated Traffic Recorder (ATR) locations were obtained from the Georgia Department of Transportation (GDOT). For all other states, FHWA’s vehicle travel information system (VTRIS) database was utilized to obtain truck traffic counts on highway network locations. Since the FLSWM network is not very detailed outside Florida, only 635 of the VTRIS and ATR locations fell on FLSWM highway network links. Therefore, except for Florida and Georgia, states in the southeast such as Alabama, Mississippi, Louisiana, and South Carolina had very few locations from which observed truck traffic count data was used in ODME. Tennessee, Kentucky, and North Carolina did not have any traffic counting locations; this will likely have a bearing on the results. Out of 635 heavy truck counts at different locations outside Florida, data from 598 locations were used in ODME while data from the remaining 37 locations were kept aside for validation.
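The split of count stations into input and validation sets can be sketched as below. The paper does not state how the validation stations were chosen, so the random draw, the fixed random seed, and the integer station IDs here are all assumptions made for illustration.

```python
import random


def split_stations(station_ids, n_validation, seed=0):
    """Hold out n_validation stations at random; the rest feed the ODME."""
    rng = random.Random(seed)
    validation = set(rng.sample(sorted(station_ids), n_validation))
    inputs = [s for s in station_ids if s not in validation]
    return inputs, sorted(validation)


# 413 directional counts in Florida, 48 of them held out (hypothetical IDs).
florida_ids = list(range(413))
inputs, validation = split_stations(florida_ids, 48)
print(len(inputs), "input stations,", len(validation), "validation stations")
```

Keeping the held-out stations entirely out of the estimation is what allows the validation RMSE to reveal over-fitting to the input counts.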
4.3 Evaluation of Different Assumptions for ODME
The ODME procedure was run several times to evaluate different assumptions on the OD matrix, including assumptions of upper/lower bounds on the number of trips estimated between OD pairs (i.e., $X_{lower}$ and $X_{upper}$) and assumptions on zero-cells in the seed matrix.

Among the assumptions on upper/lower bounds on the OD matrix, the extent of the upper bound did not influence the results as long as the bound was large enough. Imposing small upper bounds led to a poor fit of the estimated OD matrix (estimated traffic volumes, to be precise) to observed traffic counts. Therefore, no upper bound was imposed on the OD matrix. The lower bounds, however, had considerable influence on the estimated matrix. When lower bounds were removed, the estimated truck traffic volumes matched very well with observed truck traffic volumes at locations from which traffic count data was used for ODME. Specifically, the RMSE value between estimated and observed truck traffic volumes was very small (<10%). However, in this scenario, the trip length distribution of the estimated OD matrix was skewed toward a much greater share of shorter trips than that in the seed matrix. There are two possible reasons for this: (1) the seed matrix was biased toward long-distance trips, and combining the seed matrix with the observed traffic counts reduced the bias by increasing the proportion of short trips, or (2) the estimated OD matrix is over-fitting to observed traffic counts.


When we closely examined the estimated OD matrix, no trips were estimated between Florida and some southeastern states that had no observed traffic counts from the VTRIS data. For example, no OD pair between North Carolina and Florida and between Tennessee and Florida had any trips in the estimated OD matrix (recall that we could not use any observed truck traffic counts from Tennessee and North Carolina), whereas the seed matrix did have trip flows between Florida and those states. This suggests that the estimated OD matrix is likely an artifact of over-fitting to the observed traffic volumes. This was reflected in a deteriorated fit of the estimated truck traffic volumes to observed truck traffic volumes from the locations kept aside for validation.

When the lower bound was set equal to the seed matrix, the estimated OD matrix was very close to the seed matrix in its trip length distribution. However, the RMSE between the estimated and observed traffic volumes was high (60%). This was because the seed matrix was not being modified meaningfully by the ODME procedure. As a middle ground between the above two scenarios, we explored different lower bounds ranging from 0.1 to 0.9 times the seed matrix. Of all these, lower bounds set to 0.7 times the seed matrix provided the most reasonable results. Note that the seed matrix derived from the ATRI data was inflated 10-fold to recognize that ATRI data represented 10% of the observed heavy truck flows in the state (at an aggregate level). However, the data does not necessarily represent 10% of heavy truck flows at every location. In some locations, the data might represent more or less than 10%. Therefore, setting a lower bound of 0.7 allows for the possibility that the actual heavy truck trip flows might be less than the 10-fold inflated number of heavy truck trips derived from ATRI data. This scenario provided reasonable results, with an RMSE value of 20% for input stations and 38% for validation stations, while also allowing trips from (and to) all states to (and from) Florida.

Among the assumptions on zero-cells, keeping the zero-cells as is provided better results both in terms of validation measures against observed heavy truck counts and in terms of the reasonableness of the spatial distribution of truck flows. For instance, altering all zero-cells to 0.01 resulted in high RMSE values unless the lower bounds were removed on all cells. However, removing the lower bounds on all cells, as discussed earlier, led to over-fitting of the estimated heavy truck traffic volumes to observed truck traffic volumes.
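The effect of the lower bound can be illustrated with a toy bound-constrained estimation. This sketch is not the Cube Analyst Drive algorithm. It assumes squared-error forms for both F and G, a fixed route choice matrix A (no equilibrium re-assignment), and a plain projected gradient descent; the three-OD-pair network, the weight, and the step size are hypothetical.

```python
def odme_toy(A, b, seed, lower_factor, weight=0.1, step=0.1, iters=5000):
    """Minimize ||A x - b||^2 + weight * ||x - seed||^2 with x >= lower_factor * seed.

    Uses projected gradient descent: take a gradient step, then clip each
    OD flow back up to its lower bound.
    """
    x = list(seed)
    lower = [lower_factor * s for s in seed]
    for _ in range(iters):
        resid = [sum(a * xi for a, xi in zip(row, x)) - bi
                 for row, bi in zip(A, b)]
        grad = [2.0 * sum(A[k][j] * resid[k] for k in range(len(A)))
                + 2.0 * weight * (x[j] - seed[j]) for j in range(len(x))]
        x = [max(lo, xj - step * g) for xj, g, lo in zip(x, grad, lower)]
    return x


def count_rmse(A, b, x):
    """RMSE between assigned and observed counts, divided by the mean count."""
    est = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    mse = sum((e - c) ** 2 for e, c in zip(est, b)) / len(b)
    return mse ** 0.5 / (sum(b) / len(b))


# Two count locations, three OD pairs; OD pair 1 passes both locations.
A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
b = [120.0, 100.0]           # observed counts (hypothetical)
seed = [100.0, 50.0, 80.0]   # seed flows, which overshoot both counts

for factor in (0.0, 0.7, 1.0):
    x = odme_toy(A, b, seed, factor)
    print(factor, [round(v, 1) for v in x], round(count_rmse(A, b, x), 3))
```

In this toy case the seed overshoots the counts, so a lower bound equal to the seed leaves the matrix unchanged and the count fit poor, while looser bounds let the flows shrink toward the counts. Unlike the methods the text describes, this additive update would not keep zero-cells at zero, so it says nothing about the zero-cell assumptions.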
4.4 ODME Results for One Set of Assumptions
This section presents results from the following set of assumptions in ODME: (1) no upper bounds but a lower bound of 0.7 times the seed matrix on the estimated OD matrix, and (2) zero-cells in the seed matrix assumed to truly represent zero truck flows.

The seed matrix had trips between nearly 0.41 million OD pairs, of which 0.18 million OD pairs had both ends in Florida. The same OD pairs have trips in the estimated OD matrix (due to the assumption on zero-cells). The seed matrix contained a total of 69,025 daily heavy truck trips that started and/or ended in Florida, while the estimated OD matrix resulted in a total of 104,587 trips. The daily mileage of estimated trips with at least one end in Florida was over 27 million miles; 26.6% of these miles (i.e., over 7 million miles) were due to trips within Florida.

Figure 2 shows a comparison of estimated truck traffic volumes (from user equilibrium assignment of the estimated OD matrix) and observed heavy truck traffic volumes in Florida’s TTM data. Blue dots in the figure are for locations for which TTM data was used in the ODME process, while the red dots are for locations for which TTM data was kept aside for validation. Dots on the 45-degree line indicate a perfect fit between estimated and observed truck volumes, while the dots that fall between the two dotted lines correspond to locations with less than 25%


difference. A table embedded in the figure with RMSE values for different ranges of daily observed truck volumes shows that the estimated truck volumes match reasonably well with the observed volumes, especially at locations with daily truck volumes higher than 1,000 trucks.
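An RMSE breakdown by volume range, like the one reported with Figure 2, can be sketched as follows. The sketch assumes that RMSE is the root mean squared difference divided by the mean observed count and expressed as a percentage. It also assumes half-open ranges and that each range is normalized by its own mean count; the paper does not spell out either choice, and the example volumes are hypothetical.

```python
def pct_rmse(estimated, observed):
    """Root mean squared error as a percentage of the mean observed count."""
    n = len(observed)
    c_avg = sum(observed) / n
    mse = sum((v - c) ** 2 for v, c in zip(estimated, observed)) / n
    return 100.0 * mse ** 0.5 / c_avg


def rmse_by_range(estimated, observed, edges=(20, 100, 500, 1000, 7000)):
    """Compute percent RMSE separately for each observed-volume range [lo, hi)."""
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        pairs = [(v, c) for v, c in zip(estimated, observed) if lo <= c < hi]
        if pairs:
            est, obs = zip(*pairs)
            result[(lo, hi)] = pct_rmse(est, obs)
    return result


# Hypothetical daily truck volumes at four stations.
observed = [40, 300, 1000, 4000]
estimated = [50, 330, 1100, 3800]
for volume_range, value in rmse_by_range(estimated, observed).items():
    print(volume_range, round(value, 1))
```

Percent RMSE tends to be larger in the low-volume ranges because the same absolute error is divided by a smaller mean count, which is consistent with the pattern reported for Figure 2.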

Figure 3 shows the trip length distributions of the trips in the seed and estimated OD matrices, for trips with at least one end in Florida. The distribution of trips in the estimated OD matrix closely follows that of the seed matrix derived from the ATRI data, although the estimated OD matrix has a slightly greater proportion of shorter trips.

County-level trip productions and attractions for both the seed and estimated OD matrices were also examined (see Figure 4 for county-level trip productions in the seed and estimated matrices). As discussed earlier, the seed matrix shows lower than expected trip generation in the south Florida region (especially in and around Miami) and along the southern stretch of I-75 beginning from the Tampa region (when compared to that in Polk County). The estimated OD matrix, due to its use of additional information (observed heavy truck traffic volumes in the state), addresses this issue to a certain extent. Counties in southeast Florida and Hillsborough County have higher trip generation in the estimated matrix than in the seed matrix.
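The trip length comparison shown in Figure 3 can be sketched as a binned percentage distribution computed for each matrix. The 50-mile bin width, the 2,000-mile cap, the function name, and the example flows and distances are assumptions made for illustration.

```python
def trip_length_shares(od_flows, distance, bin_width=50.0, max_len=2000.0):
    """Return the percentage of trips falling in each trip-length bin.

    od_flows maps (origin, destination) -> trips; distance maps the same
    keys to miles. The last bin collects trips longer than max_len.
    """
    n_bins = int(max_len // bin_width)
    totals = [0.0] * (n_bins + 1)
    for od, trips in od_flows.items():
        idx = min(int(distance[od] // bin_width), n_bins)
        totals[idx] += trips
    grand_total = sum(totals)
    if grand_total == 0:
        return totals
    return [100.0 * t / grand_total for t in totals]


# Hypothetical flows and zone-to-zone distances in miles.
distance = {(1, 2): 30.0, (1, 3): 120.0}
seed_flows = {(1, 2): 3.0, (1, 3): 1.0}
estimated_flows = {(1, 2): 4.0, (1, 3): 1.0}

seed_shares = trip_length_shares(seed_flows, distance)
est_shares = trip_length_shares(estimated_flows, distance)
print(seed_shares[0], est_shares[0])  # share of trips shorter than 50 miles
```

Comparing the two lists bin by bin shows whether the estimation has shifted trips toward shorter or longer distances relative to the seed.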

In the seed matrix, around 75% of the trips starting or ending in Florida stayed within Florida, while the estimated matrix adjusts this share to 82%. At the county level, the seed matrix shows Polk County as one of the major origins/destinations for trips from/to other counties. The estimated matrix makes adjustments to this trend for Miami-Dade, Palm Beach, and Broward Counties. Specifically, the estimated matrix shows greater flows between these three counties. Also, the estimated OD matrix shows a smaller proportion of flows between Hillsborough and Miami-Dade Counties than the seed OD matrix does. While one would expect a greater amount of flow between these two counties, the observed truck traffic volumes on major highways between them are not high enough to support this notion.

[Figure 2: scatter plot. X-axis: observed heavy truck volumes per day (0 to 7,000). Y-axis: estimated heavy truck volumes per day (0 to 7,000). Legend: all input stations, validation stations, 45-degree line, 25% error lines.]

FIGURE 2 Observed vs. estimated heavy truck counts per day at different locations in Florida.

Table embedded in Figure 2:

Observed truck counts per day    RMSE for input stations    RMSE for validation stations
20-100                           70% (64 locations)         105% (11 locations)
100-500                          47% (167 locations)        93% (14 locations)
500-1000                         30% (38 locations)         49% (6 locations)
1000-7000                        11% (96 locations)         25% (17 locations)
Total                            20% (365 locations)        38% (48 locations)

[Figure 3: chart. X-axis: trip length/distance between TAZs (miles), in bins from <10 to 1950-2000. Y-axis: percentage of trips with at least one end in Florida (0 to 35). Legend: ATRI data (seed trips), estimated trips.]

FIGURE 3 Trip length distributions of truck trips in seed and estimated OD matrices.

FIGURE 4 County-level trip productions in seed and estimated OD matrices.

5 CONCLUSIONS AND FUTURE RESEARCH
This paper presents an investigation of the use of large streams of truck GPS data (available from ATRI) in combination with other observed data on truck traffic flows to estimate a statewide OD table of truck flows within, into, and out of Florida. In doing so, the paper sheds light on the extent to which ATRI data captures the observed truck traffic flows in Florida. This includes insights on (a) the types of trucks (e.g., heavy trucks and medium trucks) in the data, (b) the proportion of the truck traffic flows in the state captured in the data, and (c) the geographical differences in the coverage. Similar procedures can be used to evaluate ATRI data in terms of its coverage of truck flows in other states as well as the entire U.S. Further, the paper applies an ODME methodology to use ATRI’s GPS data for estimation of OD truck flows over large geographical regions. The data is being considered for use in regional and statewide freight travel demand modeling by a number of transportation planning agencies in the U.S. and Canada. To aid these agencies, the paper provides a description of the procedures implemented and a detailed discussion of implementation issues (e.g., evaluation of different assumptions) in using the data for estimating statewide truck OD flows.

Over 145 million records of ATRI’s truck GPS data during four months (April-June, 2010) were used for this research. The raw GPS data streams were converted into a database of over 2.7 million truck trips. It is known that ATRI data predominantly comprises tractor-semitrailer combinations (or heavy trucks). However, a close examination of the data in this research (via tracing the land uses served by some trucks in Google Earth and examining the truck-level travel characteristics) suggests that the data has a small proportion of trucks that are likely to be single-unit trucks or straight trucks that do not necessarily haul freight over long distances. The paper devised simple rules to divide the data into two categories: (1) long-haul trucks or heavy trucks, and (2) short-haul trucks or medium trucks.

ATRI’s truck GPS data represents a large sample of truck flows within, coming into, and going out of Florida. However, the sample is not a census of all trucks traveling in the state. To evaluate the coverage of ATRI data in Florida, truck traffic flows implied by one week of ATRI’s truck GPS data were compared with truck count data at over 160 locations in the state. At an aggregate level, the 2010 ATRI data was found to provide 10% coverage of heavy truck flows observed in Florida. Further, geographical differences in the coverage were examined. It is worth noting that the ATRI data has grown significantly between 2010 and the present date; therefore, it can be assumed that current coverage in Florida is greater than 10%.

The OD tables derived from the ATRI data were combined with observed truck traffic volumes at different locations within and outside the state to derive an OD table that is representative of the freight truck flows within, into, and out of the state. The ODME method was employed to achieve this. A variety of different assumptions were evaluated before arriving at a set of defensible assumptions for deriving the OD tables in this research. The resulting OD tables provided acceptable validation results when the estimated truck traffic volumes were compared with observed truck traffic volumes. In addition, the estimated OD matrix was subjected to a variety of reasonableness checks. Therefore, the OD tables derived in this research can be used for statewide freight modeling in many ways, including the validation and calibration of highway freight modeling components in FLSWM.

The ODME procedure in this study can be improved in different ways. First, utilizing more robust data on observed truck traffic volumes in several southeastern states can potentially help improve the ODME results. For example, there was little to no traffic count information for states such as Tennessee and North Carolina. Filling such data gaps can potentially help in better


estimating the truck flows into and out of the state. Second, the ODME procedure itself can be improved in different ways: (a) by allowing different constraints (lower/upper bounds) that are specific to different OD pairs (the constraints in this study were uniform across all OD pairs due to software limitations), (b) by exploring different weighting schemes for expanding the seed matrix, and (c) by improving the traffic assignment procedure based on observed route choice patterns of trucks from GPS data.

ACKNOWLEDGEMENTS
This research was funded by FDOT. The opinions, findings, and conclusions expressed in this publication are those of the authors and not necessarily those of FDOT or USDOT. Thanks to Vince Bernardin and Arun Kuppam for sharing their experiences working with ATRI data. Vipul Modi provided valuable assistance with the use of Cube software. Ramachandran Balakrishna provided helpful insights on the theoretical and practical aspects of ODME.

