
UNDERSTANDING BICYCLE SHARING SYSTEMS: USER AND SYSTEM INSIGHTS FROM THREE CITIES IN NORTH AMERICA

by

Seyed Ahmadreza FAGHIH IMANI

Department of Civil Engineering and Applied Mechanics

MCGILL UNIVERSITY

Montreal, Canada

November 2016

A Dissertation

submitted to McGill University

in partial fulfillment of the requirements of the degree of

DOCTOR OF PHILOSOPHY

© Ahmadreza FAGHIH IMANI 2016

ABSTRACT

In recent years, there has been increasing attention on bicycle-sharing systems (BSS) as a viable and sustainable mode of transportation for short-distance trips. A bicycle-sharing system is a public transportation service within a service area in which bicycles are available to rent for a short period. The many benefits of BSS have led to their recent growth around the world. However, because BSS have been adopted relatively recently, there is very little research exploring how people consider these systems within the existing transportation alternatives. Further, it is of substantial interest for transportation professionals to identify the factors contributing to the demand for these systems (arrivals and departures). The main objective of this dissertation is to examine bicycle-sharing systems from both system and user perspectives to provide useful insights on how these innovative public systems influence urban transportation.

Within the system perspective, the dissertation addresses three research questions. First, the research quantifies the influence of meteorological data, temporal characteristics, bicycle infrastructure, land use and built environment attributes on BSS usage (arrivals and departures) at the station level using a multilevel linear mixed modeling approach. The modeling approach explicitly recognizes the repeated observations of arrivals and departures and the dependencies associated with bicycle flows originating at the same station. The results demonstrate the negative impact of adverse weather conditions on BSS usage. The BSS is used more during the PM period than at other times of the day. While BSS usage falls during weekends, Friday and Saturday nights are positively related to arrival and departure rates. BSS usage is expected to decrease with distance from the central business district (CBD). Points of interest, population and job density near a station significantly influence the arrival and departure rates. The results indicate that adding a station has a stronger impact on usage than increasing station capacity.

Second, we explore the presence of spatial and temporal correlations across BSS usage dimensions by developing panel spatial error and panel spatial lag models. It is possible that the bicycle arrival and departure rates of one BSS station are interconnected with the bicycle flow rates of neighboring stations. It is also plausible that the arrival and departure rates in one time period are influenced by the arrival and departure rates of earlier time periods for that station and neighboring stations. The model estimation results clearly highlight the presence of spatial and temporal dependency in BSS stations’ arrival and departure rates. Further, a hold-out sample validation exercise demonstrates the improved accuracy of the models with spatial and temporal interactions.

Finally, we examine the presence and influence of BSS infrastructure installation decision endogeneity on the “true” impact of BSS infrastructure on BSS usage through the development of a 3-dimensional panel multi-level mixed ordered logit model (3PMMOL). The proposed approach formulates a measurement equation that accounts for the installation process and relates it to the usage equations. The model estimates clearly show that models that neglect the installation process over-estimate BSS infrastructure impacts. The results clearly indicate that bicycle-sharing infrastructure is not randomly allocated in the urban region. The estimated BSS installation model results can be used as a guiding template to model BSS installation decisions as a function of existing land-use and built environment attributes.

In terms of the user perspective, the dissertation studies BSS behavior at a trip level to examine bicyclists’ destination preferences. Specifically, the decision process involved in identifying destination locations after picking up a bicycle at a BSS station is studied. The analyses are presented in two chapters. First, a random utility maximization approach in the form of a multinomial logit model (MNL) is developed that assumes an individual who picks up a bicycle at one of the stations makes a destination station choice based on a host of attributes including the individual’s characteristics, the time period of the day and destination station attributes. The developed model will allow bicycle-sharing system operators to better plan their services by examining the impact of travel distance, land use, built environment and access to public transportation infrastructure on users’ destination preferences. Using the estimated model, we generate utility profiles as a function of distance and various other attributes, allowing us to visually represent the trade-offs that individuals make in their decision process.
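As a minimal illustration of the multinomial logit structure described above, and not the model estimated in this dissertation, the following Python sketch converts destination station utilities into choice probabilities. The function names, the two attributes (distance and capacity), the coefficient values `beta_distance` and `beta_capacity`, and the three candidate stations are all hypothetical and serve only to show how a utility profile translates into probabilities.

```python
import math

def mnl_probabilities(utilities):
    """Return multinomial logit choice probabilities for a list of utilities."""
    # Subtracting the maximum utility improves numerical stability
    # and leaves the resulting probabilities unchanged.
    u_max = max(utilities)
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

def station_utility(distance_km, capacity, beta_distance=-0.8, beta_capacity=0.02):
    """Hypothetical linear-in-parameters utility of one candidate destination."""
    return beta_distance * distance_km + beta_capacity * capacity

# Three hypothetical candidate destinations: (distance in km, number of docks).
candidates = [(0.5, 15), (1.5, 30), (3.0, 20)]
utilities = [station_utility(d, c) for d, c in candidates]
probabilities = mnl_probabilities(utilities)

for (distance, capacity), p in zip(candidates, probabilities):
    print(f"distance={distance} km, capacity={capacity}: P={p:.3f}")
```

With these assumed coefficients, the negative distance coefficient outweighs the capacity effect, so the nearest station receives the highest probability. This is the kind of trade-off that a utility profile makes visible.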

Second, in order to account for heterogeneity in BSS trips, a Finite Mixture Multinomial Logit (FMMNL) model is proposed that probabilistically assigns trips to different segments and estimates a segment-specific destination choice model for each segment. Unlike the traditional MNL destination choice model, an FMMNL model can consider the effect of attributes that are fixed across destinations, such as user or origin attributes, in the decision process. We validate our models using hold-out samples and compare our proposed FMMNL model results with the traditional MNL model results. The proposed FMMNL model provides better results in terms of goodness of fit measures, explanatory power and prediction performance.
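The finite mixture idea can be sketched in a few lines of Python, under stated assumptions: each trip has a probability of belonging to each segment, and the overall destination probability is the membership-weighted average of the segment-specific MNL probabilities. In an estimated model the membership scores would be functions of trip-level attributes such as user or origin characteristics. Here the scores, the utilities, and the function names are hypothetical numbers and labels chosen only for illustration.

```python
import math

def softmax(values):
    """Logit transformation of scores or utilities into probabilities."""
    v_max = max(values)
    exp_v = [math.exp(v - v_max) for v in values]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def fmmnl_probabilities(segment_scores, segment_utilities):
    """Mix segment-specific MNL probabilities by segment membership.

    segment_scores: one membership score per segment (hypothetical values).
    segment_utilities: for each segment, the utilities of all destinations.
    """
    membership = softmax(segment_scores)
    n_alternatives = len(segment_utilities[0])
    mixed = [0.0] * n_alternatives
    for weight, utilities in zip(membership, segment_utilities):
        for j, p in enumerate(softmax(utilities)):
            mixed[j] += weight * p
    return membership, mixed

# Two hypothetical segments choosing among three destinations:
# segment 1 is distance-sensitive, segment 2 is indifferent.
segment_scores = [0.0, 1.0]
segment_utilities = [[-0.4, -1.2, -2.4], [-0.1, -0.1, -0.1]]
membership, mixed = fmmnl_probabilities(segment_scores, segment_utilities)
print(membership, mixed)
```

Because segment membership is probabilistic, no trip is assigned to a single segment outright; each trip contributes to every segment in proportion to its membership probability.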

The dissertation also investigates the influence of sample size on the system and user perspectives. Specifically, a systematic evaluation of the impact of sample size on model estimates, inference measures and predictive performance is proposed. The analyses show that sample size has a stronger impact on the estimated parameters than on prediction performance. The results suggest a minimum sample size of three days of data for weekday models and two days of data for weekend models in order to examine BSS demand, and 5000 trips for weekday and weekend models in order to examine users’ destination choice. The results are intended to serve as minimum sample size requirements when analyzing BSS and would help analysts decide on the sample size needed to examine BSS accurately in the absence of any system-level guidelines.

The bicycle-sharing systems analyzed in this dissertation are from three North American cities: the BIXI system in Montreal, Canada, and the DIVVY system in Chicago and the CitiBike system in New York City, both in the United States. Overall, the dissertation results demonstrate the distinct travel behavior of users with annual membership compared to users with temporary passes. Further, it is shown that BSS is used for daily commute to and from work. Moreover, the results indicate distinct travel patterns for BSS during weekends. The results also highlight the importance of BSS system attributes such as station capacity and location.

The dissertation makes several policy recommendations. First, in order to increase BSS usage, it is more beneficial to have a large number of stations with smaller capacity than a few stations with high capacity. Interestingly, it is shown that this effect is the opposite for daily customers; they prefer stations with a large number of docks. Thus, BSS operators will need to carefully balance the requirements of members and daily customers. For instance, during high tourist activity months, a reallocation plan to reduce the number of stations but increase per-station capacity, especially on weekends, might be beneficial. Second, the dissertation results highlight that bicycle facilities such as bike lanes increase BSS usage. Hence, investments in bicycle facilities will help to attract more individuals toward BSS and cycling in general.

SUMMARY

In recent years, bicycle-sharing systems (BSS) have stood out as viable and sustainable alternatives for short-distance trips. A BSS is a public transportation service that offers bicycles for short-term rental within a defined service area. The many benefits offered by BSS explain their recent growth around the world. Nevertheless, because BSS are relatively recent, there is to date little research examining how individuals incorporate these systems into their transportation alternatives. In addition, transportation professionals have a strong interest in identifying the factors that influence the demand for these systems (in terms of station arrivals and departures). The main objective of this dissertation is to examine BSS from a system perspective as well as from the user perspective, in order to contribute to our understanding of these innovative systems and of how they influence our urban transportation systems.

From the system perspective, this dissertation addresses three questions. First, it quantifies the influence of meteorological data, temporal characteristics, bicycle infrastructure, land use and the urban environment on BSS usage (in terms of station arrivals and departures) using multilevel linear mixed models. This approach explicitly recognizes the repeated observations of arrivals and departures and the dependencies associated with bicycle flows originating from the same station. The results demonstrate the negative influence of adverse weather conditions on BSS usage. BSS are used more in the afternoon than during other periods of the day. On weekends, usage decreases, but Friday and Saturday evenings are associated with higher usage. Usage rates are expected to decline with distance from the city center. The presence of points of interest, population and job density around stations all influence station arrival and departure rates. Our results also show that adding new stations has a greater impact on usage rates than increasing the capacity of existing stations.

Second, the presence of spatio-temporal correlations in BSS usage is investigated by developing panel spatial error and panel spatial lag models. It is possible that the arrival and departure rates of a station are interconnected with the bicycle flows of neighboring stations. It is also plausible that arrival and departure rates during one time period are influenced by the flows during earlier time periods, for the station of interest as well as for neighboring stations. The results of these analyses clearly demonstrate the existence of spatio-temporal dependencies in BSS usage rates. Moreover, a validation exercise on a hold-out sample excluded from model estimation demonstrates the accuracy gains of models that include spatio-temporal dependencies.

Finally, we explore the presence and influence of the endogeneity of the BSS infrastructure installation decision on the “true” impact of BSS infrastructure on BSS usage, through the development of a three-dimensional panel multi-level mixed ordered logit model (3PMMOL). The proposed approach formulates a measurement equation that accounts for the installation process and relates it to the usage equations. The model clearly shows that models that neglect the installation process overestimate the impact of BSS infrastructure. The results also show that BSS infrastructure is not randomly allocated across the study region. The results can be used as a template for modeling BSS installation decisions as a function of land use and built environment attributes.

From the user perspective, this dissertation studies BSS behavior at the level of individual trips in order to examine bicyclists’ destination choices. Specifically, it studies the decision process involved in choosing a destination after picking up a bicycle at a BSS station. The analyses are presented in two chapters. First, a random utility maximization approach takes the form of a multinomial logit (MNL) model that assumes the individual’s choice depends on individual characteristics as well as the time period of the day and the attributes of the destination station. The developed model will allow BSS operators to better plan their services by examining the impact of trip distance, land use, the urban environment and access to public transit on users’ destination choices. Using the developed model, we generate utility profiles as a function of trip distance and various other factors, which allow us to visually represent the trade-offs that individuals make in their decision processes.

Second, in order to accommodate the heterogeneity of BSS trips, a Finite Mixture Multinomial Logit (FMMNL) model is proposed that probabilistically assigns trips to different segments and estimates a destination choice model specific to each segment. Unlike the traditional destination choice MNL model, an FMMNL model can include the effects of attributes that are fixed across destinations (such as the characteristics of users or origin stations). We validate the FMMNL model using hold-out samples and compare it with the traditional MNL model. The FMMNL model performs better in terms of goodness-of-fit measures, explanatory power and prediction performance.

This dissertation also investigates the influence of sample size on the user and system perspectives. Specifically, a systematic evaluation of the impact of sample size on model estimates, inference measures and predictive performance is proposed. The analyses show that the impact of sample size on the estimated parameters is greater than its impact on predictive performance. The results suggest a minimum sample size of three days of data for weekday models and two days of data for weekend models in order to examine BSS demand, and 5000 trips for weekday and weekend models in order to examine users’ destination choices. The results are intended to serve as minimum sample size requirements for studying BSS and can help analysts make better decisions in the absence of system-level guidelines.

This dissertation examines the BSS of three North American cities: BIXI in Montreal, Canada; DIVVY in Chicago; and CitiBike in New York City, in the United States. Overall, the results of this thesis demonstrate that users with annual passes behave differently from temporary users. Further, it is shown that BSS are used for commuting between work and home. It is also shown that BSS have different usage patterns on weekends. The importance of BSS attributes such as station capacity and location is also highlighted.

This thesis also makes policy recommendations. First, in order to increase BSS usage rates, it is preferable to have a large number of small stations rather than a few large stations. Interestingly, it is also shown that, from the point of view of daily customers, larger stations are preferable. It is therefore important for BSS operators to find a balance between the needs of the system and those of daily customers. For example, during tourist seasons, it might be preferable to reduce the number of stations but increase their capacities, especially on weekends. Second, this thesis shows that the presence of BSS infrastructure has a positive effect on BSS usage rates. It can therefore be said that investments in this infrastructure will tend to attract more users toward BSS and toward cycling in general.

ACKNOWLEDGEMENTS

I would like to take this opportunity to acknowledge those from whom I was fortunate enough to receive support, help and encouragement throughout the journey of my doctoral education.

I consider myself one of the luckiest students to have Prof. Naveen Eluru as my supervisor. Naveen: I am deeply indebted to you for your invaluable advice, continuous support and, most importantly, friendship in the past four years. Words cannot express the gratitude I have for the profound impact you have had on my development both professionally and personally. This dissertation would not have been completed without your invaluable help.

I am very thankful to Prof. Marianne Hatzopoulou for adopting me as her student during my time in Toronto and for her always kind encouragement and support. I also want to thank Prof. Ahmed El-Geneidy for his help and guidance in our co-authored paper and for cheering me up every time I had a chance to meet him. I also would like to extend my gratitude to my master’s supervisor Prof. Luis Amador for giving me confidence when I most needed it and for his support during the transition time before my doctoral education.

I would like to convey my gratitude to Prof. Luc Chouinard for being my co-supervisor over the past months. I would also like to express my sincere gratitude to Prof. Matthew Roorda, Prof. Luis Miranda-Moreno, Prof. Ahmed El-Geneidy, and Prof. Kevin Manaugh for serving on my dissertation committee and for their valuable suggestions which improved the quality and clarity of the final dissertation. Any mistakes that remain are mine. I would like to thank Prof. Rajesh Paleti, Prof. Karthik Konduri and Prof. Abdul Pinjari for giving me the opportunity to work with them on several research projects and broadening my research perspectives.

I also want to thank Jorge Sayat for always being kind when responding to my tedious computer issues. Special thanks to my colleagues at McGill University for making a friendly and scientific workplace: Fred, Maryam, Sabreena, Shams, Vince, Tim and Junshi.

I would like to acknowledge my family for their constant love and support. First, to Golnaz, my best friend: I know your love, help, guidance and patience deserve more than one line on this page so the whole dissertation is dedicated to you. Then, many thanks to my parents, my sisters and my brothers especially Mohammadreza who has helped me a lot over the past few months. Finally, special thanks to my precious friends all around the world who have brought joy and happiness to my life: Hossein, Ashkan, Saeed, Arian, Mehrdad, Iman, Amirpooyan, Amirpasha, Goli, Shohreh, Nilou, Babak, Mahdi, Mohammad K, Mohammad S, Abolfazl, Arash, Amirhosein, Mahya, Ali, Dena, Lina, Sahel, Sayeh, Kamran, Nastaran and Mazhar.

Last but not least, this research was made possible by the financial support I received from the McGill University MEITA award and the North American Regional Science Council (NARSC) Benjamin H. Stevens Graduate Fellowship in Regional Science.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
AUTHOR CONTRIBUTIONS

CHAPTER 1 INTRODUCTION
1.1 Bicycle Sharing Systems
1.1.1 History of Public Bicycles
1.2 Analysis of Bicycle-Sharing Systems
1.2.1 System Perspective Studies
1.2.2 User Perspective Studies
1.3 Dissertation in Context
1.3.1 Determinants of BSS Usage
1.3.2 Spatial-Temporal Correlation of Bicycle Usage
1.3.3 Role of Bicycle-Sharing System Infrastructure
1.3.4 Users Destination Choice Preferences
1.3.5 BSS Trips Heterogeneity
1.3.6 Data Employed for BSS Analysis
1.4 Objectives of the Dissertation
1.5 Outline of the Dissertation

CHAPTER 2 HOW LAND-USE AND URBAN FORM IMPACT BICYCLE FLOWS
2.1 Introduction
2.2 Data
2.2.1 Independent Variable Generation
2.3 Visual Representation of BIXI flows
2.4 Analysis and Discussion
2.4.1 Linear Mixed Models
2.4.2 Model Fit Measures
2.4.3 Results
2.4.4 Model Validation
2.5 Policy Exercise
2.6 Summary

CHAPTER 3 INCORPORATING THE IMPACT OF SPATIO-TEMPORAL INTERACTIONS ON BICYCLE SHARING SYSTEM DEMAND
3.1 Introduction
3.2 Data
3.2.1 Data Assembly and Exogenous Variable Generation
3.2.2 Sample Characteristics
3.3 Methodology
3.4 Analysis and Discussion
3.4.1 Model Fit Measures
3.4.2 Model Estimation Results
3.4.3 Model Validation
3.5 Summary

CHAPTER 4 DETERMINING THE ROLE OF BICYCLE SHARING SYSTEM INFRASTRUCTURE ON USAGE
4.1 Introduction
4.2 BSS Infrastructure Measure
4.3 Methodology
4.3.1 Model Structure
4.3.2 Model Estimation
4.4 Data
4.4.1 Sample Formation
4.4.2 Dependent Variable Generation
4.4.3 Independent Variables Considered
4.5 Empirical Analysis
4.5.1 Model Fit Measures
4.5.2 Bicycle-Sharing Infrastructure Installation Model
4.5.3 Arrivals and Departures Model
4.5.4 Endogeneity and Unobserved Heterogeneity
4.5.5 Elasticity Effects
4.6 Summary

CHAPTER 5 ANALYSING BICYCLE SHARING SYSTEM USER DESTINATION CHOICE PREFERENCES
5.1 Introduction
5.2 Data
5.2.1 Sample Formation
5.2.2 Independent Variable Generation
5.2.3 Descriptive Analysis
5.3 Analysis and Discussion
5.3.1 Multinomial Logit Model
5.3.2 Results
5.3.3 Validation and Elasticity Analysis
5.4 Summary

CHAPTER 6 A FINITE MIXTURE MODELING APPROACH TO EXAMINE USERS’ DESTINATION PREFERENCES
6.1 Introduction
6.2 Data
6.2.1 Sample Formation
6.2.2 Independent Variables Generation
6.3 Methodology
6.4 Analysis and Discussion
6.4.1 Model Estimation Results
6.4.2 Model Validation
6.5 Summary

CHAPTER 7 EXAMINING THE IMPACT OF SAMPLE SIZE IN THE ANALYSIS OF BICYCLE SHARING SYSTEMS
7.1 Introduction
7.2 Research Methodology
7.2.1 Data Source
7.2.2 Sample Formation
7.2.3 Independent Variable Generation
7.3 Analysis and Discussion
7.3.1 Evaluation Measures
7.3.2 Evaluation Results
7.4 Summary

CHAPTER 8 CONCLUSIONS AND FUTURE RESEARCH
8.1 Summary of Chapters
8.2 How Land-Use and Urban Form Impact Bicycle Flows
8.3 Incorporating the Impact of Spatio-Temporal Interactions on Bicycle Sharing System Demand
8.4 Determining the Role of Bicycle Sharing System Infrastructure on Usage
8.5 Analysing Bicycle Sharing System User Destination Choice Preferences
8.6 A Finite Mixture Modeling Approach to Examine Users’ Destination Preferences
8.7 Examining the Impact of Sample Size in the Analysis of Bicycle Sharing Systems
8.8 Recommendations for Future Research

REFERENCES

LIST OF FIGURES

Figure 1-1. BSS Station Locations

Figure 2-1. BIXI stations in Montreal Island

Figure 2-2. Spatial Distribution of Average Arrival and Departure Rates in Peak hours

Figure 3-1. Average Hourly Arrival and Departure Rates for Members in Peak Hours

Figure 3-2. Average Hourly Arrival and Departure Rates for Daily Customers in Peak Hours

Figure 4-1. The variation of BSS infrastructure for an average sized TAZ in Montreal

Figure 4-2. Bicycle-Sharing Infrastructure Index in Montreal

Figure 4-3. Three dimensional panel mixed multi-level ordered logit (3PMMOL) framework

Figure 5-1. Chicago’s Bicycle-Sharing System (Divvy)

Figure 5-2. Bicycle-Sharing Trips in Chicago’s Divvy System

Figure 5-3. Probability of Choosing a Station in Chicago’s Divvy System

Figure 5-4. The Variation of Utility as a function of Distance, Bike Route Length and Station Capacity

Figure 7-1. Map of CitiBike System in New York City

LIST OF TABLES

Table 2-1. Descriptive Summary of Sample Characteristics

Table 2-2. Model Estimation Results

Table 2-3. Validation Results

Table 2-4. Elasticity Effects for Arrival and Departure Rates*

Table 3-1. Descriptive Summary of Sample Characteristics

Table 3-2. Summary of Estimated Models

Table 3-3. Estimates of Spatial Lag Model with Temporal and Spatial Lagged Variables

Table 3-4. Validation Results for Spatial Lag Model with Temporal and Spatial Lagged Variables

Table 4-1. Descriptive Summary of Sample Characteristics

Table 4-2. Model Fit Measures

Table 4-3. 3PMMOL Model Estimation Results

Table 4-4. Elasticity Effects (and its Standard Deviation) for TAZ Arrival and Departure Rates

Table 5-1. Descriptive Summary of Sample Characteristics

Table 5-2. Model Estimation Results

Table 6-1. Descriptive Summary of Sample Characteristics

Table 6-2. Segments Characteristics

Table 6-3. Segmentation Component of FMMNL Models

Table 6-4. Destination Choice Component of FMMNL Models

Table 6-5. Models Validation Results

Table 7-1. Summary of Recent Literature on BSS

Table 7-2. Descriptive Summary of Base Samples Characteristics

Table 7-3. Arrivals Models Estimation Results - Weekdays

Table 7-4. Departures Models Estimation Results - Weekdays

Table 7-5. Arrivals Models Estimation Results - Weekends

Table 7-6. Departures Models Estimation Results - Weekends

Table 7-7. Destination Choice Models Estimation Results - Weekdays

Table 7-8. Destination Choice Models Estimation Results - Weekends

AUTHOR CONTRIBUTIONS

The dissertation contains empirical studies from six full length journal articles. Among these six articles, five have already been accepted and published in several journals and the other article is under review for publication in a peer-reviewed journal. Ahmadreza Faghih-Imani, as the primary author for all the manuscripts, contributed to the articles in terms of data preparation, model estimations and writing. The respective co-authors have contributed to the manuscripts by sharing their valuable insights and editing the manuscripts. The publication details of the manuscripts are presented here.

Chapter 2 is based on the article: Faghih-Imani A., N. Eluru, A. El-Geneidy, M. Rabbat, and U. Haq. How land-use and urban form impact bicycle flows: Evidence from the bicycle-sharing system (BIXI) in Montreal. Journal of Transport Geography, Vol. 41, 2014, pp. 306-314.

Chapter 3 is based on the article: Faghih-Imani A., and N. Eluru. Incorporating the impact of spatio-temporal interactions on bicycle sharing system demand: A case study of New York CitiBike system. Journal of Transport Geography, Vol. 54, 2016, pp. 218-227.

Chapter 4 is based on the article: Faghih-Imani A., and N. Eluru. Determining the role of bicycle sharing system infrastructure on usage. The paper is accepted for publication in Transportation Research Part A: Policy and Practice.

Chapter 5 is based on the article: Faghih-Imani A., and N. Eluru. Analyzing bicycle sharing system user destination choice preferences: An investigation of Chicago's Divvy system. Journal of Transport Geography, Vol. 44, 2015, pp. 53-64.

Chapter 6 is based on the article: Faghih-Imani A., and N. Eluru. A finite mixture modeling approach to examine New York City bicycle sharing system (CitiBike) users’ destination preferences. The paper is under review in Applied Geography.

Chapter 7 is based on the article: Faghih-Imani A., and N. Eluru. Examining the impact of sample size in the analysis of bicycle sharing systems. The paper is accepted for publication in Transportmetrica A: Transport Science.

CHAPTER 1 INTRODUCTION

1.1 Bicycle Sharing Systems

In recent years, there has been growing attention on bicycle-sharing systems (BSS) as an alternative and complementary mode of transportation. A bicycle-sharing system is a public transportation service within a service area in which bicycles are available for renting for a short period. These systems are recognized to have traffic and health benefits including flexible mobility, physical activity and support for multimodal transport connections (Shaheen et al., 2010). A bicycle-sharing system is intended to provide convenience because individuals can use the service without the costs and responsibilities associated with owning a bicycle for short trips. Furthermore, a bicycle-sharing system frees individuals from the need to secure their bicycles; bicycle theft is a common problem in urban regions (Rietveld and Daniel, 2004; van Lierop et al., 2015). At the same time, the decision to make a trip can be made in a short time frame providing an instantaneously accessible alternative for a one-way or a round trip.

Bicycle-sharing systems can enhance accessibility to public transportation systems by improving last mile connectivity (Jäppinen et al., 2013). Moreover, the implementation of a BSS in a city can motivate new segments of society to cycle, resulting in an increase in the overall bicycling mode share (Buck et al., 2013). BSS have also helped to encourage the public perception of cycling as an everyday travel mode, thus broadening the cycling demographic (Goodman et al., 2014). BSS are in tune with millennials’ proclivity for shared transportation systems (Davis et al., 2012; Dutzik and Baxandall, 2013). Moreover, installing bicycle-sharing systems promotes active transportation that can enhance physical activity levels and thus improve health outcomes for bicyclists (Fuller et al., 2011; Fishman et al., 2015b). Further, earlier research efforts provide evidence that BSS were successful in improving driver awareness of cyclists and consequently increased safety for cyclists (Murphy and Usher, 2015).

By installing bicycle-sharing systems, cities aim to induce a modal shift to cycling and, subsequently, to decrease traffic congestion and air pollution. Travel behavior data from the United States provide significant evidence in support of BSS installation in urban areas. According to data from the 2009 National Household Travel Survey (NHTS), about 37.6% of the trips by private vehicles in the United States are less than 2 miles long. The NHTS data also indicate that about 73.6% of bicycle trips in the US are less than 2 miles long. Even if only a small proportion of the short private vehicle trips (around dense urban cores) is substituted with BSS trips, the substitution offers substantial benefits to individuals, cities and the environment. Thus, it is not surprising that more than 1000 cities around the world have installed or plan to install a bicycle-sharing system (Meddin and DeMaio, 2016). A well designed and planned bicycle-sharing system can play a complementary role to existing public transportation infrastructure. This dissertation sheds light on the recently emerged bicycle-sharing systems by examining three different systems in North America.

1.1.1 History of Public Bicycles

The first bicycle-sharing system was introduced in the 1960s in the Netherlands (DeMaio, 2009; Shaheen et al., 2010). Since then, there have been four generations of these systems. The first generation was “white bicycles” or free bicycles available in different locations around the city. The idea was simple: a person would pick up one of the bicycles, which were typically painted in bright colors and unlocked, ride it to his or her destination and leave it there for the next possible user. It was free and without any time constraint. This program failed because of many stolen and vandalized bicycles. In the 1990s, a second-generation coin-deposit system was introduced as a result of the experience of the first generation of bicycle-sharing systems. Locked bicycles could be borrowed with a small deposit, which was usually refunded on return. Unfortunately, this did not eliminate the issue of bicycle theft due to user anonymity (Shaheen et al., 2010). Also, no time limit for the use of bicycles resulted in excessively long rental periods for borrowed bicycles. The third generation system added transaction kiosks to docking stations to solve the problem of user anonymity. People could rent a bicycle for only a limited amount of time. These systems became relatively successful around the world. Fourth generation systems, also called demand-responsive multimodal systems, have been built on the success of the third generation, while also improving docking stations, bicycle redistribution and integration with other transport modes (DeMaio, 2009; Shaheen et al., 2010). The bicycle-sharing systems studied in this dissertation all belong to the latest generation of bicycle-sharing systems.

The current chapter provides an overview of the dissertation including a detailed review of earlier research efforts on BSS. The remainder of the chapter is organized as follows. Section 1.2 presents a review of recent literature on BSS. Section 1.3 discusses the different dimensions of BSS analysis conducted as part of the dissertation. The specific research objectives of the dissertation are outlined in Section 1.4. Finally, Section 1.5 presents the dissertation structure and an overview of the remaining chapters.

Analysis of Bicycle-Sharing Systems

Although bicycle-sharing systems are becoming common around the world, there are relatively few studies exploring the factors affecting shared bicycle flows and usage. Fishman et al. (2013), after an extensive literature review, concluded that in order to better understand and maximize the effectiveness of bicycle-sharing programs, the evaluation of current performance of bicycle-sharing systems is crucial. With the growing installation of BSS infrastructure across the world, there is a substantial interest in understanding how these systems impact the urban transportation system. There are two perspectives to examine such influences: (1) system level – what are the determinants of BSS usage and (2) user level – what encourages individuals to use these systems.

System Perspective Studies

Under the systems perspective, there have been several studies devoted to examining factors affecting BSS flows and usage. A subset of these studies conducted feasibility analyses, proposing different bicycle-sharing programs for different cities (for example, see Gregerson et al., 2010). These studies typically aim to identify potential locations for stations and to estimate BSS flows and usage considering socio-demographic and land-use variables (such as population and job density) as well as topological and meteorological parameters for the proposed locations. Krykewycz et al. (2010) estimated demand for a proposed bicycle-sharing program in Philadelphia using observed bicycle flow rates in European cities. Over the past few years, quantitative studies have employed observed bicycle usage data to capture the determinants of BSS usage (Gebhart and Noland, 2013; Nair et al., 2013; O’Brien et al., 2013; Rixey, 2013; Rudloff and Lackner, 2014; Zhao et al., 2014). In these studies, usage is usually characterized as arrivals (depositing bicycles) and departures (removal of bicycles). These studies typically examine the influence of transportation network infrastructure (such as length of bicycle facilities, streets and major roads), land use and urban form (such as presence of metro and bus stations, restaurants, businesses and universities), meteorological data (such as temperature and humidity) and temporal characteristics (such as time of day, day of the week and month) on BSS usage. They employ various levels of aggregation, both temporally (hourly, daily or monthly usage) and spatially (station level or traffic analysis zone (TAZ) level).

For example, several studies demonstrated that increasing BSS infrastructure (number of stations and capacity) or increasing bicycle routes around stations increases BSS usage (Buck and Buehler, 2012; Wang et al., 2015). The impact of land use and urban form attributes on BSS usage was also investigated. Studies found that stations in areas with higher job or population density, or stations with a higher number of points of interest (such as restaurants, retail stores and universities) in their vicinity, experience higher arrivals and departures (Rixey, 2013; Wang et al., 2015). Moreover, several studies investigated and identified socio-demographic disparities in the use of bicycle-sharing systems (Ogilvie and Goodman, 2012; Goodman and Cheshire, 2014). Furthermore, the relationship between BSS and other public transportation systems, such as subway or bus transit systems, was also examined by several research efforts (Nair et al., 2013; González et al., 2015). The analyses of temporal attributes of BSS showed that peak arrivals and departures are observed during the evening peak hours, while weekdays tend to have higher rates of usage than weekends. The results underscored the use of BSS for the daily commute to and from work, especially by annual members (O’Brien et al., 2014; Murphy and Usher, 2015). Several studies analyzed the impact of weather conditions (such as temperature and humidity) on the usage of the BSS (Gebhart and Noland, 2014; Mahmoud et al., 2015). These studies, as expected, concluded that BSS usage is lower under adverse weather conditions (presence of rain or low temperature).

Another stream of studies concentrates on operational issues of BSS including identifying problematic stations (stations that are full or empty) and the efficiency of operator rebalancing programs. For example, Fricker and Gast (2016) studied the effect of the randomness of user decisions on the number of problematic stations. Vogel and Mattfeld (2011) and Bouveyron et al. (2015) developed cluster analysis and found different categories of stations within the bicycle-sharing systems. Further, several studies analyzed various methods for optimizing bicycle rebalancing operations and repositioning trucks’ routing schemes (Vogel and Mattfeld, 2011; Nair et al., 2013; Raviv et al., 2013; Pfrommer et al., 2014; Forma et al., 2015).

User Perspective Studies

The studies focussed on the user perspective contribute to the literature by studying user behavior in response to bicycle-sharing systems. Bachand-Marleau et al. (2012) conducted a survey in Montreal, Canada, and found that convenience and having a BSS station close to the home location are the main motivators for individuals to use BSS. Further, Fishman et al. (2015a) examined the factors influencing BSS membership in two major cities in Australia (Melbourne and Brisbane) and identified riding frequency, age and proximity to a docking station as significant contributing factors for membership. Several studies highlighted the differences between the preferences of BSS short-term users and BSS annual members towards the use of the system (Lathia et al., 2012; Buck et al., 2013). Lathia et al. (2012) analyzed the effect of opening the London bicycle-sharing system to casual users on system usage. Their study showed that allowing casual users to use the system resulted in increased BSS usage on weekends and an overall usage increase at several stations. The impact of BSS on cyclists’ safety and the prevalence of helmet use among BSS users were investigated by several studies (Kraemer et al., 2012; Graves et al., 2014; Fishman and Schepers, 2016). Murphy and Usher (2015) employed data from a survey conducted in Dublin, Ireland, and underlined the gender gap and income equity issues regarding accessing and using BSS. Further, the impact of BSS on mode choice and modal shift to BSS was analyzed in several studies (Buck et al., 2013; Martin and Shaheen, 2014; Murphy and Usher, 2015). These studies found that BSS mostly substituted trips made by public transport and by walking. However, the overall impact of BSS on increasing active transportation and reducing car use, and thus air pollution, was found to be positive (Fishman et al., 2014; Fishman et al., 2015b). Studies also found that BSS users, in general, prefer shorter trips, all else being equal (Mahmoud et al., 2015). Further, research efforts demonstrated that BSS users prefer to use existing bicycle facilities such as bicycle lanes and have a higher interest in stations closer to the transit system, such as subway stations (González et al., 2016).

Dissertation in Context

As is evident from the literature review, the bicycle-sharing system literature is still in its infancy and there are several dimensions that are unexplored in earlier research. The overall objective of this dissertation is to examine bicycle-sharing systems from both user and system perspectives to provide useful insights on how these innovative public systems influence urban transportation. The analyses in this dissertation examine bicycle-sharing systems directly as standalone systems and not within the context of an overall travel demand framework; thus, the study is not designed to capture the impact of BSS on trip generation or mode choice. The primary reason for this is the lack of established literature on BSS. The lack of data that specifically considers BSS within the broader travel demand context is another reason.

The bicycle-sharing systems analyzed in this dissertation are from three North American cities: the BIXI system in Montreal, Canada, and the DIVVY system in Chicago and the CitiBike system in New York City in the United States. We selected Montreal since its system was one of the first major public bicycle-sharing systems in North America, while Chicago and New York are the latest major cities in the US to have installed a bicycle-sharing system. The location of BSS stations around the three cities and the position of the cities in North America are illustrated in Figure 1.1. New York has the densest and Chicago the least dense system among the three cities. The BSS in these cities are studied in this dissertation as mature and successful systems to provide insights on how BSS are integrated within transportation systems.

In the subsequent discussion, various analyses left unexplored in earlier BSS studies are presented and the dissertation research is explained along six main directions.

Determinants of BSS Usage

Demand modeling plays an important role in determining the required capacity, and hence the success of new bicycle-sharing systems and/or the success of expanding an existing system. Over the past few years there have been several studies devoted to examining factors affecting bicycle-sharing flows and usage. However, by using aggregated monthly or yearly flow rates, earlier studies fail to capture the impact of variables that change in the short term; i.e., at an hourly level (such as variations in weather and time-of-day effects). Neglecting the presence of such variations usually reduces the applicability of the results obtained. Moreover, examining bicycle flows at an hourly level (or a short time frame) allows the analyst to provide the operators with bicycle demand profiles including excess and shortage information. The current dissertation contributes to literature by determining the effect of meteorological data, temporal characteristics, bicycle infrastructure, land use and urban form attributes on hourly bicycle arrival and departure rates at the station level using observed usage data. Specifically, the influence of various factors on arrival and departure rates at the BSS station level are quantified using a general statistical modeling technique that can be adopted by BSS planners and operators.

Spatial-Temporal Correlation of Bicycle Usage

While examining BSS usage, it is important to consider the spatial and temporal correlation of bicycle usage at the station level. It is expected that bicycle arrival and departure rates of one BSS station are potentially correlated with the bicycle flow rates of nearby stations. Also, the arrival and departure rates in one time period can be influenced by the arrival and departure rates of earlier time periods for that station and its nearby stations. Few studies have analyzed the effect of neighbouring stations in a bicycle-sharing system. Rudloff and Lackner (2014) employed count models to analyze demand profiles of the Citybike Wien system in Vienna, Austria. They incorporated the effect of neighbouring stations in the modelling framework through dummy variables indicating whether each of the three closest stations is full or empty. Several research efforts focused on the prediction of BSS usage in the near future (Froehlich et al., 2009; Kaltenbrunner et al., 2010; Borgnat et al., 2011; Giot and Cherrier, 2014; Han et al., 2014) by employing time series analysis considering temporal and meteorological variables. However, studies that accounted for neighbouring stations have neglected the temporal correlation, while studies that considered the temporal correlation have ignored the effect of neighbouring stations or of land use and the built environment. This dissertation develops and estimates econometric models that incorporate the influence of observed and unobserved spatial-temporal correlation on bicycle arrival and departure rates for a bicycle-sharing system.

Role of Bicycle-Sharing System Infrastructure

Earlier research efforts, while providing useful insights on system level usage patterns, ignored the BSS infrastructure installation decision process; i.e., BSS operators installed the infrastructure based on an expectation of system usage. The aforementioned research studies overlooked this when they considered usage as a dependent variable and employed BSS infrastructure as an independent variable. Thus, in the models developed, the unobserved factors influencing the measured dependent variable (BSS usage) also strongly influence one of the independent variables (BSS infrastructure). This is a classic violation of the most basic assumption in econometric modeling; i.e., that the error component in the model is not correlated with any of the exogenous variables (Greene, 2012, p. 63). The model estimates obtained under this erroneous assumption are very likely to over-estimate the impact of BSS infrastructure. To obtain an accurate estimate of the “true” impact of BSS infrastructure on BSS usage, it is critical to consider infrastructure installation and usage decisions as an interconnected process. The issue is analogous to the very well documented residential self-selection endogeneity issue tied closely to travel behavior outcomes (see Bhat and Guo, 2007; Cao et al., 2009). To correctly characterize the decision processes at hand, it is necessary to treat the bicycle-sharing infrastructure installation itself as a dependent variable, modeled simultaneously with the usage patterns. The current dissertation proposes a joint econometric framework that remedies this drawback.

Users’ Destination Choice Preferences

While cities are supporting bicycle-sharing systems as a more sustainable transport mode, due to their relatively recent adoption, there is very little research exploring how people consider these systems within existing transportation alternatives. Understanding individuals’ decision processes in the adoption and usage of bicycle-sharing systems will enable bicycle-sharing system operators and analysts to enhance their service offerings. It can be assumed that an individual who picks up a bicycle at one of the stations makes a destination station choice based on a host of attributes, including the individual’s age and gender, the time period of the day and destination attributes such as distance from the origin station, points of interest, bicycle infrastructure, land use and built environment variables. The decision process is studied using a random utility maximization approach in which individuals choose the destination that offers them the highest utility from the universal choice set of stations in the study region. There have been several location choice studies in the traditional travel demand literature that adopt a random utility maximization approach for understanding destination/location preferences (Chakour and Eluru, 2014; Waddell et al., 2007; Sivakumar and Bhat, 2007). In this dissertation, we extend earlier work on destination choice behavior to the newly developed BSS by examining BSS behavior at a trip level to identify the factors contributing to cyclists’ destination preferences.
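As a minimal sketch of the random utility maximization idea described above, the following Python fragment computes multinomial logit choice probabilities over a small set of candidate destination stations. The station names, the two attributes and the coefficient values are hypothetical and serve only to illustrate the mechanics; they are not estimates from this dissertation.

```python
import math

def utility(distance_km, job_density, beta_distance=-0.8, beta_jobs=0.3):
    """Systematic utility of a destination station.

    The coefficients are hypothetical placeholders, not estimated values.
    """
    return beta_distance * distance_km + beta_jobs * job_density

def mnl_probabilities(utilities):
    """Multinomial logit probabilities from a dict of systematic utilities."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    exp_u = {name: math.exp(u - top) for name, u in utilities.items()}
    total = sum(exp_u.values())
    return {name: value / total for name, value in exp_u.items()}

# Hypothetical choice set: three destination stations seen from one origin.
stations = {
    "station_A": utility(distance_km=0.5, job_density=1.0),
    "station_B": utility(distance_km=2.0, job_density=1.0),
    "station_C": utility(distance_km=2.0, job_density=3.0),
}
for name, p in mnl_probabilities(stations).items():
    print(name, round(p, 3))
```

With these illustrative coefficients, a nearer station receives a higher probability than a farther one with the same job density, which is the kind of trade-off a destination choice model is meant to quantify.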

BSS Trips Heterogeneity

The typical destination choice frameworks implicitly assume that the influence of exogenous factors on destination preferences is constant across the entire population. To illustrate this, consider the destination choice behavior of two users (U1 and U2) with the same attributes except for the trip start time period. U1 starts her trip in the AM time period while U2 starts her trip in the PM time period. Now let us consider the influence of “job density” and “restaurant density” in the vicinity of the destination alternatives on destination station preferences. U1, beginning her trip in the AM period, is more likely to be positively affected by “job density” while being minimally affected by “restaurant density”. U2, on the other hand, is likely to be positively influenced by “restaurant density” and minimally (or even negatively) affected by “job density”. This illustrates how, depending on the trip start time, the impact of exogenous variables can be substantially different across users. The illustration provided is a case of one variable (trip start time) moderating the influence of other variables (“job density” and “restaurant density”). The reader will recognize that multiple variables might moderate the influence of a reasonably large set of exogenous variables. If such distinct profiles of exogenous variable effects across users are not considered, a restrictive assumption is imposed: that all exogenous variables have the same effect on the destination choice process of every user. This dissertation develops an econometric framework that addresses this restrictive homogeneity assumption in examining BSS user destination choice preferences.

Data Employed for BSS Analysis

BSS operators provide system availability data to users on their websites. Through relatively simple scripting exercises, it is possible to build a database of bicycle availability across the stations of a system. The data thus obtained can provide a glimpse of how BSS usage varies across the day. More recently, in addition to the system availability information, BSS operators have released trip information for BSS users containing details such as origin and destination stations and the start and end time of each trip. Station usage may also be obtained from this information by aggregating the trips originating from or destined to each station.

An important question in the process of developing models for BSS analysis is how much data to select for the model estimation sample. As opposed to the traditional transportation literature, where sample sizes are quite limited, in the context of BSS, information is available for every minute over multiple days and months. Hence, the selection of an appropriate sample for BSS analysis is critical, since the size of the sample influences the complexity of the modelling process. Employing large samples requires substantial data preparation and model run times. For example, one month of data for a BSS with 300 stations results in 216,000 records of hourly arrivals or departures and hundreds of trips. The processing of usage or trip data and the preparation of station level variables, including built environment attributes and other variables such as weather characteristics or temporal attributes, is substantially time-consuming. In addition to data preparation, a very large sample significantly increases model run times. On the other hand, employing a smaller sample than appropriate would result in inaccurate and possibly even biased model estimates, affecting the planning process. Hence, it would be useful to understand the sample size requirements for examining bicycle-sharing systems. Besides, the data are not always available; knowing the appropriate sample size prior to collecting data would be beneficial. Due to the relative infancy of BSS, there is little to no guidance on the amount of data necessary for the analysis. The current dissertation investigates the impact of sample size on the analysis of BSS to provide insights on minimum sample size requirements.
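The record count quoted above follows directly from the panel structure (stations by hours by days). A small Python sketch makes the arithmetic explicit; the function name is illustrative, and a 30-day month is assumed.

```python
def hourly_records(stations, days, hours_per_day=24):
    """Number of station-hour observations for one flow direction."""
    return stations * days * hours_per_day

# 300 stations observed hourly for an assumed 30-day month.
print(hourly_records(stations=300, days=30))  # 216000
```

The same function shows how quickly the sample grows: five months of data for the same system would already give more than a million station-hour records per flow direction.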

Objectives of the Dissertation

The dissertation aims to provide insights on unexplored dimensions of bicycle-sharing systems based on observed usage data from three cities in North America: Montreal, Canada; Chicago, USA; and New York, USA. The following are the specific research objectives of this dissertation.

The first objective is to examine the influence of meteorological data, temporal characteristics, bicycle infrastructure, land use and built environment attributes on BSS usage using a multilevel linear mixed modeling approach that explicitly recognizes the dependencies associated with bicycle flows originating at the same station. The model results obtained are validated using operational data compiled from the year after the data used to fit the model.

The second objective is to address spatial-temporal correlation of bicycle arrival and departure rates at BSS stations. Towards this end, the dissertation focuses on three models: a) a simple model without considering the spatial-temporal correlation; b) a model with the observed spatial-temporal correlation; c) a model with both observed and unobserved spatial-temporal correlation. A comparison between these modelling frameworks demonstrates the advantages of including the spatial-temporal correlation in the BSS demand modelling effort.

The third objective is to propose a joint econometric model accounting for the impact of endogeneity and common unobserved heterogeneity on the BSS usage patterns. We formulate a multi-level joint econometric framework. The framework considers the bicycle-sharing infrastructure installation process (a one-time process) while allowing for a multi-level analysis of arrivals and departures (repeated observations). We consider an ordered representation for all the dependent variables yielding a three dimensional panel ordered formulation. Specifically, we adopt a panel multi-level mixed (or random parameters) ordered logit model.

The fourth objective is to study bicycle-sharing system behavior at a trip level to analyze cyclists’ destination preferences. The objective of the proposed research effort is to evaluate the impact of socio-demographics, built environment, bicycle infrastructure and bicycle-sharing system on the trip making behavior after picking up a bicycle at a BSS station using a random utility framework in the form of a Multinomial Logit Model (MNL).

The fifth objective is to extend the traditional destination choice models by developing a Finite Mixture Multinomial Logit (FMMNL) model that accounts for heterogeneity in BSS trips when examining BSS users’ destination choice preferences based on a broad set of exogenous variables (such as temporal and weather characteristics, trip attributes, users’ attributes and origin and destination characteristics). Unlike the traditional destination choice MNL model, an FMMNL model can consider the effect of attributes that are fixed across destinations, such as user or origin attributes, in the decision process. The performance of the proposed FMMNL model and the traditional MNL model is evaluated in terms of goodness of fit measures, explanatory power and prediction performance.
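The general structure of a finite mixture logit can be sketched as follows: each trip belongs to a latent segment with some probability, each segment has its own destination utilities, and the overall destination probability is the share-weighted average of the segment-specific logit probabilities. The Python fragment below is only an illustration under these assumptions; the two segments, the membership scores and the utilities are hypothetical and do not reproduce the specification estimated in this dissertation.

```python
import math

def softmax(scores):
    """Logit probabilities from a list of scores (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fmmnl_probabilities(segment_scores, segment_utilities):
    """Mix segment-specific MNL probabilities using segment membership shares.

    segment_scores: one membership score per latent segment; in an applied
        model these would depend on attributes that do not vary across
        destinations (e.g. trip start time or user attributes).
    segment_utilities: one list of destination utilities per segment.
    """
    shares = softmax(segment_scores)
    mixed = [0.0] * len(segment_utilities[0])
    for share, utilities in zip(shares, segment_utilities):
        for j, p in enumerate(softmax(utilities)):
            mixed[j] += share * p
    return mixed

# Hypothetical example: two segments, two destinations
# (an office area and a restaurant area).
am_trip = fmmnl_probabilities(
    segment_scores=[2.0, 0.0],              # AM trip: mostly segment 1
    segment_utilities=[[1.5, 0.0],          # segment 1 favours the office area
                       [0.0, 1.5]],         # segment 2 favours the restaurant area
)
print([round(p, 3) for p in am_trip])
```

Because the membership scores can depend on user or origin attributes, such attributes influence the destination probabilities even though they are constant across the destination alternatives.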

The sixth objective is to investigate the impact of sample size on BSS analysis. Specifically, this dissertation proposes a systematic evaluation of the impact of sample size on model estimates, inference measures and predictive performance for both perspectives of BSS analysis: system usage and user destination choice.

The first three objectives of this dissertation examine the BSS from the system perspective, while the fourth and fifth objectives analyze BSS behaviour at a trip level. The last objective uses the models from both perspectives to investigate the data required for BSS analysis. The two main limitations of the first objective (neglecting the spatial dependency in arrivals and departures, and ignoring the endogeneity of the BSS infrastructure installation decision) are addressed in objectives two and three. The fifth objective extends the model of the fourth objective to account for BSS trip heterogeneity and for origin and user attributes. Figure 1.2 illustrates the relation between the different objectives of this dissertation.

Outline of the Dissertation

The remainder of the dissertation is organized in seven additional chapters as follows:

In Chapter 2, using data compiled from minute-by-minute readings of bicycle availability at all 410 stations on the BIXI website between April and August 2012, we examine the determinants of bicycle-sharing demand in Montreal. The compiled BIXI database is augmented with meteorological data, temporal characteristics, bicycle infrastructure, land use and built environment attributes, allowing us to examine the influence of these factors on bicycle-sharing system demand. This chapter employs a multilevel linear mixed modeling approach that explicitly recognizes the dependencies associated with bicycle flows originating at the same station. The model results obtained are validated using operational data compiled from 2013. Further, we compute elasticity estimates of various attributes to illustrate the applicability of the developed model for policy analysis.

The major objective of Chapter 3 is to accommodate spatial and temporal effects (observed and unobserved) in modeling bicycle demand, employing data from New York City’s bicycle-sharing system (CitiBike) for the month of September 2013; i.e., the peak month of usage in 2013. Towards this end, Spatial Error and Spatial Lag models that accommodate the influence of spatial and temporal interactions are estimated. The exogenous variables for these models are drawn from BSS infrastructure, transportation network infrastructure, land use, points of interest and meteorological and temporal attributes.

Chapter 4 proposes a joint econometric model (a repeated observation based panel multi-level mixed ordered logit model) that accommodates the impact of endogeneity and common unobserved heterogeneity on BSS usage patterns. The proposed model is estimated using data compiled from the Montreal bicycle-sharing system, BIXI, from April to August 2012. To further examine the advantages of the proposed model framework, elasticity impacts for a host of policy variables are computed.

Chapter 5 evaluates the impact of socio-demographics, built environment, bicycle infrastructure and bicycle-sharing system on the users’ destination decision making behavior using a multinomial logit model (MNL). The proposed quantitative analysis is conducted employing BSS trip data for Chicago’s Divvy bicycle-sharing system from July to December 2013. To illustrate the applicability of the proposed framework for planning purposes, destination station choice probability prediction is undertaken. Finally, a trade-off analysis to illustrate the relationship between important attributes affecting the destination choice process is also undertaken.

Chapter 6 proposes a Finite Mixture Multinomial Logit (FMMNL) model that accommodates unobserved heterogeneity by probabilistically assigning trips to different segments and estimating segment-specific destination choice models for each segment. Using data from New York City bicycle-sharing system (CitiBike) for 2014, we develop separate models for members and non-members. We validate our models using hold-out samples and compare our proposed FMMNL model results with the traditional MNL model results.

Chapter 7 performs a systematic evaluation of the impact of sample size on model estimates, inference statistics and predictive performance. Towards this end, we evaluate the BSS data from two perspectives: a) system usage (which factors influence hourly arrival and departure rates at the station level, based on the model developed in Chapter 2) and b) user destination choice (which factors contribute to users’ choice of destination station, based on the model developed in Chapter 5). The analyses are conducted using data from the CitiBike system in New York City.

Finally, Chapter 8 concludes the dissertation by summarizing the results and findings and identifying possible directions for future research on bicycle-sharing systems.

Figure 1.1. BSS Station Locations

Figure 1.2. The Link between Dissertation Chapters

HOW LAND-USE AND URBAN FORM IMPACT BICYCLE FLOWS

Introduction

Demand modeling plays an important role in determining the required capacity, and hence the success of new bicycle-sharing systems and/or the success of expanding an existing system. Over the past few years, there have been several studies devoted to examining factors affecting bicycle-sharing flows and usage. The objective of this chapter is similar to these previous studies. Earlier studies, however, by using aggregated monthly or yearly flow rates, fail to capture the impact of variables that change in the short term; i.e., at an hourly level (such as variations in weather and time-of-day effects). Neglecting the presence of such variations usually reduces the applicability of the results obtained. Moreover, examining bicycle flows at an hourly level (or a short time frame) allows the analyst to provide the operators with bicycle demand profiles including excess and shortage information. A more recent research effort, Hampshire et al. (2013), studied the influence of bicycle infrastructure attributes and land-use characteristics on bicycle flows using aggregated hourly arrival and departure rates at the sub-city district (SCD) level in Barcelona and Seville, Spain. They highlighted that bicycle station density, the average capacity of stations in the SCD and the number of points of interest in SCD are important contributors to arrival and departure rates. Contrary to the previously mentioned literature, while Hampshire et al. (2013) used a fine temporal dimension, their study fails to capture fine-grained spatial effects because the station flows studied are aggregated at the SCD level.

In this chapter, using data compiled from minute-by-minute readings of bicycle availability at all 410 stations on the BIXI website between April and August 2012, we attempt to examine the determinants of bicycle-sharing demand in Montreal. BIXI in Montreal is a mature system that offers a unique opportunity for understanding the factors influencing its flows and usage. The BIXI database compiled is augmented with meteorological data, temporal characteristics, bicycle infrastructure, land use and built environment attributes allowing us to examine the influence of these factors on bicycle-sharing system demand. Specifically, we quantify the influence of various factors on arrival and departure flows at the bicycle sharing station level using a general statistical modeling technique that other regions can adopt. We employ a multilevel linear mixed modeling approach that explicitly recognizes the dependencies associated with bicycle flows originating at the same station. The model results obtained are validated using operational data compiled from 2013 (one year after the data used to fit the model). Further, we compute elasticity estimates of various attributes to illustrate the applicability of the developed model for policy analysis. The estimated models will allow us to predict changes to the demand profiles (arrivals and departure flows) allowing us to examine the influence of changes to the system – capacity reallocation or new station installation.

The rest of the chapter is organized as follows. Section 2.2 explains the data compilation and sample formation in detail. Section 2.3 presents a visual representation of BIXI flows. The statistical model employed in this chapter and the model estimation results are discussed in Section 2.4. Section 2.5 discusses a policy exercise. Finally, Section 2.6 concludes and summarizes the chapter.

Data

Montreal is the second largest Census Metropolitan Area (CMA) in Canada and one of the largest cities in North America. Montreal is a distinctive city in North America with a well integrated public transportation system and a diverse and relatively dense urban form; thus, it is one of the least car dependent cities in North America. Further, the city has some of the best cycling infrastructure in North America, with more than 600 km of recreational and on-street cycle paths. The bicycle mode share in the city of Montreal is slightly more than 2%, while in the bicycle friendly neighbourhood of Plateau-Mont-Royal the share increases to about 8.6%. Further, in the city of Montreal, about 61% of trips are short enough to be substituted by bicycle trips (Vélo Québec, 2010). Given this potential for cycling, Montreal’s bicycle-sharing system, BIXI (a word formed by combining bicycle and taxi), was installed in 2009. The service began with 3000 bicycles and 300 stations and expanded to 410 stations with more than 4000 bicycles in 2012. Montreal’s BIXI system has been one of the most successful bicycle-sharing systems in the world, with an annual ridership of more than three million trips, although it does not operate during the winter months due to harsh weather conditions.

For this study, the hourly arrival and departure rates are obtained from minute-by-minute BIXI bicycle availability data for all stations in service (410 stations) between April and August 2012. Figure 2.1 shows the location of BIXI stations on the Montreal Island. It is important to note that, due to severe winter conditions in Montreal, the BIXI season starts on April 15th and ends on November 15th of each year.

A sample formation exercise was necessary to obtain the arrival and departure rates from the bicycle availability data for every station. The raw data saved from the BIXI website provided information on the number of bicycles available at each station for every minute. The raw data was processed to generate minute-by-minute bicycle arrival and departure rates for every station. The arrival and departure rates obtained are not necessarily due to customer-based bicycle flows. It is important to note that bicycle-sharing system operators frequently perform rebalancing operations, removing bicycles from stations that are full and refilling the docks of empty stations. Unfortunately, the occurrence of rebalancing operations is not indicated in the minute-by-minute data available, and so it is not possible to directly distinguish whether the addition (removal) of bicycles is due to customers or operators. So, we adopt a heuristic mechanism to arrive at the “true” arrivals and departures. We identify spikes of bicycle availability (or removal) in the data compiled to differentiate between customer flows and operator flows. For this purpose, we aggregate the flow rate data temporally up to a 5-minute level to capture the effect of rebalancing operations. Specifically, we assume that a rebalancing operation has occurred if the 5-minute arrival/departure rate is greater than the 99th percentile arrival/departure for that station. When such a trigger is identified, the actual bicycle flow for this 5-minute period is obtained by averaging the bicycle flow rates of the two earlier 5-minute periods and the remainder of the flow is allocated to the rebalancing operation (a slight variant of this approach is employed in Hampshire et al., 2013). After correcting for rebalancing operations, hourly arrival and departure rates for every station are obtained by aggregating this 5-minute bicycle flow data.
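The rebalancing heuristic described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, arrivals and departures are inferred from net minute-to-minute changes in availability (so simultaneous arrivals and departures within one minute cancel out), and the replacement value is assumed to be the mean of the two preceding, already corrected, 5-minute flows.

```python
from statistics import quantiles


def five_minute_flows(availability):
    """Turn minute-by-minute bicycle counts into 5-minute arrival/departure totals."""
    arrivals, departures = [], []
    for start in range(1, len(availability), 5):
        arr = dep = 0
        for i in range(start, min(start + 5, len(availability))):
            change = availability[i] - availability[i - 1]
            if change > 0:
                arr += change      # bicycles added to the station
            else:
                dep -= change      # bicycles removed from the station
        arrivals.append(arr)
        departures.append(dep)
    return arrivals, departures


def correct_rebalancing(flows):
    """Split 5-minute flows into customer flows and operator (rebalancing) flows.

    A period is flagged when its flow exceeds the station's 99th percentile.
    Assumption: the customer flow of a flagged period is the mean of the two
    preceding corrected periods; the remainder is attributed to the operator.
    """
    threshold = quantiles(flows, n=100)[98]  # 99th percentile cut point
    customer = list(flows)
    operator = [0] * len(flows)
    for i, flow in enumerate(flows):
        if i >= 2 and flow > threshold:
            customer[i] = min(flow, (customer[i - 1] + customer[i - 2]) / 2)
            operator[i] = flow - customer[i]
    return customer, operator
```

In this sketch the function would be applied separately to the arrival series and the departure series of each station, after which the corrected 5-minute flows can be summed to hourly rates.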

Although the BIXI season starts April 15th every year, only a subset of the stations begins functioning within the first ten days of the season. Hence, from the 2012 BIXI data, we removed the month of April and restricted our sample to the four months of May, June, July and August. Subsequently, to obtain a reasonable sample size, we randomly select two days for every station in our database. The arrival/departure rates in overnight hours (1 AM to 6 AM) are very low. Thus, we aggregate the bicycle flow rates in the overnight time period as one record, generating 20 records for every day (one for the period 1 AM to 6 AM, and one for each remaining hour of the day). Further, to account for the influence of station capacity on bicycle flows, we normalized our dependent variable (arrivals or departures at a station) by station capacity. The final sample consisted of 16400 records (20 hours × 2 days × 410 stations) of normalized arrival and departure rates at the station level. The data sample compiled is well distributed across the four months (percentages of May, June, July and August range between 22.4 and 26 percent) and across all 7 days of the week (daily shares range from 12.8 to 15.6 percent). To be sure, the data sample employed in our analysis forms a small share of the entire data compiled. If the objective were to estimate a linear regression model, a large sample size would not be an issue. However, in this chapter, we estimate a linear mixed model (described in Section 2.4.1) whose structure results in longer model run times for larger samples. Further, employing very large samples for model estimation might result in data over-fit and inflated parameter significance. Two separate models are developed to examine the arrival rates and departure rates at every station.
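As an illustration of the record structure described above, the following sketch collapses 24 hourly flows for one station-day into 20 capacity-normalized records. The function name is hypothetical, and summing the five overnight hours into one record is an assumption (the text only states that they are aggregated).

```python
def daily_records(hourly_flows, capacity):
    """Collapse 24 hourly flows (index 0 = midnight to 1 AM) into 20 records.

    Assumption: the overnight hours (1 AM to 6 AM) are summed into one record.
    Each record is divided by station capacity, as in the normalized
    dependent variable described in the text.
    """
    if len(hourly_flows) != 24:
        raise ValueError("expected 24 hourly values")
    overnight = sum(hourly_flows[1:6])  # 1 AM to 6 AM as a single record
    records = [hourly_flows[0], overnight] + list(hourly_flows[6:])
    return [flow / capacity for flow in records]


# 20 records per day, 2 sampled days, 410 stations
sample_size = 20 * 2 * 410
```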

Independent Variable Generation

The independent variables considered in our analysis can be categorized into three groups: (1) weather, (2) temporal and (3) spatial variables. Weather variables include hourly temperature, relative humidity and the hourly weather condition, represented as a dummy variable indicating whether or not it is raining. The temporal variables considered aim to capture time-of-day and day-of-the-week effects. Specifically, the day is divided into four periods: morning (6AM-10AM), mid-day (10AM-3PM), PM (3PM-7PM) and evening (7PM-12AM). The influence of weekend vs. weekday was also taken into account. Further, to account for young individual users in the downtown core of Montreal, we included a Friday and Saturday night dummy variable to test for a possible increase in BIXI usage during these periods compared to other times.
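The temporal indicators can be generated mechanically from the hour and the day of the week. The sketch below is illustrative only: the function and key names are hypothetical, and the Friday and Saturday night indicator is assumed to cover the evening period (7PM onward), since the exact hours are not stated in the text.

```python
def temporal_dummies(hour, weekday):
    """Return 0/1 temporal indicators.

    hour: 0-23 (start of the hourly record); weekday: 0 = Monday ... 6 = Sunday.
    Assumption: "Friday and Saturday night" means the evening period (7PM onward).
    """
    return {
        "am": int(6 <= hour < 10),
        "midday": int(10 <= hour < 15),
        "pm": int(15 <= hour < 19),
        "evening": int(hour >= 19),
        "weekend": int(weekday >= 5),
        "fri_sat_night": int(weekday in (4, 5) and hour >= 19),
    }
```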

To examine the spatial determinants influencing bicycle usage at each station, two classes of spatial variables were used: a) Bicycle infrastructure and b) Land-use and built environment variables. The bicycle infrastructure variables included are at both the traffic analysis zone (TAZ) level and the buffer level. A 250-meter buffer around each station was found to be an appropriate walking distance considering the distances between BIXI stations. Bicycle infrastructure variables were used to examine the effect of cycling facilities on the bicycle demand and usage of the bicycle-sharing system. The length of bicycle facilities (bicycle lanes and bicycle paths) in the buffer was calculated to capture the impact of placing BIXI stations near bicycle facilities on the usage of the bicycle-sharing system. Moreover, the length of minor roads (local streets and collectors) and major roads (arterials and highways) in the buffer were calculated to identify cyclist preference of routes. The number and capacity of BIXI stations in the 250-meter buffer were computed to capture the effect of neighbouring stations.
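The buffer-level station variables can be approximated without GIS software by using great-circle distances. The sketch below counts neighbouring stations and their total capacity within 250 meters of a station. The station dictionaries, field names and the decision to exclude the station itself from its own buffer are assumptions made for illustration; the study's actual GIS procedure is not described at this level of detail.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def neighbours_in_buffer(station, stations, radius_m=250.0):
    """Count other stations, and sum their capacity, within radius_m of station."""
    count = capacity = 0
    for other in stations:
        if other is station:
            continue  # assumption: a station is not its own neighbour
        dist = haversine_m(station["lat"], station["lon"], other["lat"], other["lon"])
        if dist <= radius_m:
            count += 1
            capacity += other["capacity"]
    return count, capacity
```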

Land-use and built environment characteristics are the other group of variables considered in our analysis. To study the influence of the central business district (CBD), the distance from each station to the CBD was computed. The walkscore corresponding to every station was also generated (the walkscore is a walkability index based on the distance to amenities such as grocery stores and restaurants; see Carr et al., 2011 and http://www.walkscore.com/ for more information). The presence of metro and bus stations near a BIXI station and the length of bus lines in the 250-meter buffer were generated to examine the influence of public transit on bicycle arrival and departure rates. We also considered three types of points of interest near each station: (1) the number of restaurants (including coffee shops and bars), (2) the number of other commercial enterprises and (3) a categorical variable indicating whether or not the BIXI station is near a university. The TAZ-level variables considered in our analysis include the population density and job density of the TAZ associated with each BIXI station. To illustrate the data compiled, we provide a descriptive summary of the sample in Table 2.1.

Visual Representation of BIXI flows

In order to better understand the spatial and temporal variation of bicycle usage in the BIXI system, we represent the bicycle arrival and departure rates of every station visually using a geographic information system. For this purpose, the bicycle flows of every station in every day of June were considered. To conserve space, we mainly focus on the AM and PM time periods in our visualization exercise. We compute the average hourly arrival and departure flows at every station for the AM and PM time periods. The patterns are presented in Figure 2.2. Several interesting observations can be made from the results. First, we can see that flows are much higher for the BIXI system during the PM period. One plausible explanation for the trend is that employed individuals might find it easier to bicycle home since they are presumably not in as much of a rush as when going to work in the morning. These individuals might decide to arrive at work using less strenuous modes (such as bus or metro). Furthermore, people might also consider riding the BIXI as a useful exercise after work or might make short trips within the CBD — for instance, going from work to a restaurant. It is also possible that during the evening peak hour the population using BIXI includes students and other individuals without the typical schedule (e.g., workers in restaurants and coffee shops and non-workers). Second, the higher concentration of arrival rates in CBD in the morning peak hour confirms the use of bicycle-sharing system for daily commute purposes. Third, the results indicate that bicycle flows are more spatially widespread in the evening peak compared to morning peak. Overall, the visualization provides a brief overview of bicycle flows in Montreal using the BIXI system.

Analysis and Discussion

Linear Mixed Models

The most common methodology employed to study continuous dependent variables such as arrival and departure flows is the linear regression model. However, the traditional linear regression model is not appropriate to study data with multiple repeated observations. In our empirical analysis, we observe the arrivals and departures at the same station at an hourly level for each station. Hence to recognize this, we employ a multilevel linear model that explicitly recognizes the dependencies associated with the bicycle flow variable originating from the same BIXI station. Specifically, we employ a linear mixed modeling approach that builds on the linear regression model while incorporating the influence of repeated observations from the same station. The linear mixed model collapses to a simple linear regression model in the absence of any station specific effects. A brief description of the linear mixed model is provided below.

Let q = 1, 2, …, Q be an index to represent each station, d = 1, 2, …, D be an index to represent the various days on which data was collected and t = 1, 2, …, 20 be an index for hourly data collection period. The dependent variable (arrival or departure rate over station capacity) is modeled using a linear regression equation which, in its most general form, has the following structure:

$$ y_{qdt} = \beta' X_{qdt} + \varepsilon_{qdt} \tag{2.1} $$

where $y_{qdt}$ is the normalized arrival or departure rate (the dependent variable), X is an L×1 column vector of attributes and β is the corresponding L×1 column vector of model coefficients. The random error term, ε, is assumed to be normally distributed across the dataset.

The error term may consist of three components of unobserved factors: a station component, a day component and an hour-of-the-day component. Due to the substantial size of the data and the number of independent variables considered in our study, it is prohibitively burdensome, in terms of run time, to estimate the combined influence of the three components simultaneously. Thus, we consider the station and the time-of-day to be related common unobserved effects. In this structure, the data can be visualized as 20 records for each station-day combination, for a total of 820 station-day combinations (410 stations × 2 days). Estimating a full 20 × 20 covariance matrix is computationally intensive while providing very little intuition. Hence, we parameterize the covariance matrix (Ω). To obtain a parsimonious specification, we assume a first-order autoregressive moving average correlation structure with three parameters σ, ρ and φ as follows:

$$
\Omega = \sigma^2
\begin{pmatrix}
1 & \phi & \phi\rho & \cdots & \phi\rho^{18} \\
\phi & 1 & \phi & \cdots & \phi\rho^{17} \\
\phi\rho & \phi & 1 & \cdots & \phi\rho^{16} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\phi\rho^{18} & \phi\rho^{17} & \phi\rho^{16} & \cdots & 1
\end{pmatrix}
\tag{2.2}
$$

Here σ² represents the error variance of ε, φ represents the common correlation factor across time periods and ρ represents the dampening parameter that reduces the correlation with time. The correlation parameters φ and ρ, if significant, highlight the impact of station-specific effects on the dependent variables. The models are estimated in SPSS using the Restricted Maximum Likelihood (REML) approach, which differs slightly from the maximum likelihood (ML) approach. The REML approach estimates the parameters by computing the likelihood function on a transformed dataset. The approach is commonly used for linear mixed models (Harville, 1977).
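To make the parameterized covariance structure concrete, the sketch below builds a 20 × 20 matrix under the usual first-order autoregressive moving average form, in which the diagonal equals the error variance and the off-diagonal entry for two periods k hours apart equals the variance times φρ^(k−1). This functional form and the function name are assumptions based on the verbal description of the three parameters; they are not taken from the estimation software output.

```python
def arma11_covariance(variance, phi, rho, size=20):
    """Build a size x size ARMA(1,1)-type covariance matrix as nested lists.

    Assumed form: diagonal = variance; off-diagonal (i, j) =
    variance * phi * rho ** (|i - j| - 1).
    """
    return [
        [
            variance if i == j else variance * phi * rho ** (abs(i - j) - 1)
            for j in range(size)
        ]
        for i in range(size)
    ]


# Hypothetical parameter values, for illustration only
omega = arma11_covariance(variance=2.0, phi=0.5, rho=0.5, size=4)
```

With φ = 0 the matrix becomes diagonal, which corresponds to the case in which the mixed model collapses to a simple linear regression.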

Model Fit Measures

In our study, two model frameworks were estimated for arrivals and departures: (1) a linear regression model and (2) a linear mixed model. The final model selection was based on the restricted log-likelihood and Bayesian Information Criterion metrics. Our model estimation process was guided by considerations of parsimony and intuitiveness. The two model frameworks were compared using the log-likelihood ratio (LLR) test. For the arrivals model, the LLR test statistic was significant at any reasonable level of significance (the LLR test-statistic value was 3632, significantly higher than the corresponding chi-square value for two additional degrees of freedom (φ and ρ)). Similarly, for the departures model, the LLR test statistic was significant at any reasonable level of significance (the LLR test-statistic value was 3491). The LLR test comparisons clearly highlight the suitability of the mixed modeling approach employed in our analysis for examining the determinants of BIXI usage in Montreal.
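The comparison with the chi-square distribution can be checked by hand, because with two degrees of freedom the chi-square survival function has the closed form exp(−x/2). The sketch below is a minimal illustration with hypothetical function names; the log-likelihood values in the example are invented so that their difference reproduces the reported arrivals statistic of 3632, and they are not the values estimated in the study.

```python
from math import exp, log


def llr_statistic(ll_restricted, ll_unrestricted):
    """Log-likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (ll_unrestricted - ll_restricted)


def llr_test_2df(statistic, alpha=0.05):
    """Test a statistic against a chi-square distribution with 2 degrees of freedom.

    For 2 degrees of freedom: critical value = -2 ln(alpha), p-value = exp(-x / 2).
    """
    critical = -2.0 * log(alpha)
    p_value = exp(-statistic / 2.0)
    return critical, p_value, statistic > critical


# Hypothetical log-likelihoods chosen only to reproduce a statistic of 3632
stat = llr_statistic(ll_restricted=-10000.0, ll_unrestricted=-8184.0)
critical, p_value, reject = llr_test_2df(stat)
```

The 5% critical value is about 5.99, so statistics in the thousands reject the linear regression model at any conventional significance level.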

Results

In this section, we discuss the results of linear mixed model estimation to understand the different effects of meteorological, spatial and temporal elements on the bicycle usage in the BIXI bicycle-sharing system. It must be noted that we considered several specifications but only the statistically significant results for arrival and departure rates are presented in Table 2.2.

Weather variables

As expected, there is a positive correlation between temperature and the arrival and departure rates. Further, to examine the non-linear impact of temperature on usage, we have included the square of the temperature variable in the specification; however, the impact turned out to be insignificant. On the other hand, humidity has a negative impact on the arrival and departure rates. People are less likely to ride a bicycle in rainy or very humid time periods. However, the rainy weather variable is not significant for the arrival rate model. This might be explained by the idea that the weather has a stronger effect on the decision of taking out a bicycle than on returning it.

Temporal Variables

People tend to bicycle more on weekdays than on weekends, as highlighted by the negative coefficient of the weekend variable. The time-of-day variables need to be interpreted carefully due to the presence of interaction effects with the population density and university variables. Nevertheless, we clearly observe that the BIXI system is used more during the PM period than at other times of the day. The likelihood of using bicycle-sharing systems increases on Friday and Saturday nights, which may reflect the presence of young users in the downtown core of Montreal during these periods.

Bicycle Infrastructure Variables

In this section, the results for parameters related to bicycling infrastructure variables are explained. The bicycle flows and usage of the bicycle-sharing system increase when there are more bicycle facilities (bicycle lanes, bicycle paths, etc.) near a BIXI station (in agreement with the findings of Buck and Buehler, 2012). While the length of minor roads in a 250-meter buffer of each station is associated with a positive impact on arrival and departure rates, the length of major roads has a negative effect. The results indicate that BIXI usage is more likely to occur in densely populated neighborhoods. The impacts of the number of BIXI stations and the BIXI capacity in a 250-meter buffer need to be examined in combination. At first glance, it might seem unintuitive that the impact of capacity on BIXI usage is negative. However, the result recognizes that as the number of stations increases we simultaneously increase the capacity. Hence, the estimates obtained are the overall effect of adding stations as well as capacity. In fact, the magnitude of the capacity coefficient is almost 25 times smaller than the positive impact associated with the number of BIXI stations, highlighting that adding more stations with a capacity of 10-15 (the typical size in Montreal) is likely to increase BIXI usage more than adding a few large stations. The result provides an indication that adding stations with very large capacity is not as productive for arrivals and departures as adding smaller stations.

Land Use and Built Environment Variables

It is expected that the arrival and departure rates decrease when a BIXI station is located farther from the CBD. This is supported by the negative coefficient of the distance-to-the-CBD variable. BIXI users combine their trips with the metro more often than with other modes of transport (Bachand-Marleau et al., 2012); this is also reflected in the positive impact of the presence of metro stations near BIXI stations in the results (similar results can be seen in Nair et al., 2013). In general, the number of restaurants in the vicinity of a BIXI station increases the usage of that station (similar to the findings of Hampshire et al., 2013; Wang et al., 2015). While the presence of this type of business has a negative impact on the departure rate of a BIXI station in the AM period, it intuitively has a positive influence on both arrival and departure rates in the PM period, reinforcing the attraction of bicycle-sharing systems for restaurant customers. The number of all other commercial enterprises in the 250-meter buffer of each station is associated with a negative impact during the PM and evening time periods. The coefficient associated with the presence of a university campus on a BIXI station’s arrival rate has, interestingly, opposite signs in the AM and PM periods. BIXI stations near universities are more likely to experience a higher volume of bicycles arriving in the AM than in the PM. In the departure rates model, the negative coefficient for the AM period has a similar explanation, while the university variable does not have a significant influence in the PM period. This is plausible since students and teachers tend to have more flexible schedules and usually do not have a fixed time for the end of a work day. The effects of population and job density are incorporated in both models at the TAZ level. BIXI stations in TAZs with higher population density tend to have higher arrival and departure rates (see Rixey, 2013 and Wang et al., 2015 for similar results). The opposite signs of job density in the AM and PM periods in the arrival rate model highlight the likely use of bicycle-sharing systems for daily work commute trips.

Model Validation

The model estimation results for arrival and departure rates were validated using data from May 2013 (one year after the data used to fit the model). The bicycle availability data was compiled from minute-by-minute readings from the BIXI system for all the stations in May 2013. The same data compilation process described in sample preparation for model estimation (see Section 2.2) was repeated to compute bicycle arrival and departure rates. The model developed in Section 2.4.3 was used to generate predictions of bicycle arrivals and departures, and the predictions were compared with the observed values in the validation dataset. Specifically, we calculated two error metrics to evaluate model prediction performance: a) Root Mean Square Error (RMSE) and b) Mean Absolute Error (MAE). Furthermore, we computed the absolute error as a percentage of station capacity and examined the number of stations with less than 5% error, between 5 and 10% error, between 10 and 15% error, between 15 and 20% error, between 20 and 25% error, and greater than 25% error. These measures were computed for the entire sample as well as for specific time periods of the day. The validation exercise results are presented in Table 2.3. Overall, the predicted arrival and departure rates are reasonably close to the observed rates, with an absolute error of around 1.8 bicycles per hour. The results indicate that for about 90% of the records the error in prediction is within 20%. The fit for the arrival model is slightly better than the fit for the departure model. In terms of time of day, the performance of the model in the PM period is relatively inferior to its performance in other time periods. However, the results are satisfactory considering the larger rates of arrival and departure in the PM period. The validation highlights the predictive ability of the proposed framework to examine BIXI system bicycle flows (arrivals and departures).
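The validation measures described above are simple to compute once observed and predicted flows are paired. The sketch below is illustrative: the function names are hypothetical, and the convention that an error falling exactly on a boundary (for example 5%) belongs to the higher bin is an assumption.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between paired observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def error_bins(observed, predicted, capacities):
    """Count records by absolute error as a percentage of station capacity.

    Bins: <5%, 5-10%, 10-15%, 15-20%, 20-25%, >=25%.
    Assumption: a value exactly on a boundary goes to the higher bin.
    """
    edges = [5, 10, 15, 20, 25]
    counts = [0] * (len(edges) + 1)
    for o, p, c in zip(observed, predicted, capacities):
        pct = 100.0 * abs(o - p) / c
        counts[sum(pct >= edge for edge in edges)] += 1
    return counts
```

Applying the same functions to subsets of records (AM, PM and so on) gives the period-specific measures.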

Policy Exercise

To better illustrate the magnitude of effects of variables on the use of BIXI system we computed the elasticity effects for both arrival and departure models by computing the percentage change of arrival/departure rate due to changes to the exogenous variables.

In this part, we focus on the following variables: 1) increasing the length of bicycle facilities by 10% in the 250-meter buffer; 2) increasing the number of stations in the buffer without increasing the capacity in the buffer, i.e., we reallocate capacity to add a new station; 3) increasing the station capacity by the average station size (19); and 4) increasing the number of restaurants by 50% of average number in the 250-meter buffer. The elasticity effects are computed as a percentage difference in arrivals and departures relative to the base case. The measures generated are presented in Table 2.4.
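The elasticity computation amounts to predicting the rate under a base case and under a modified scenario and reporting the percentage difference. The sketch below uses a plain linear predictor with invented coefficients and attribute names; none of the numbers correspond to the estimates in Table 2.2 or the results in Table 2.4.

```python
def predict(coefficients, attributes):
    """Linear prediction: sum of coefficient * attribute value."""
    return sum(coefficients[name] * value for name, value in attributes.items())


def elasticity_effect(coefficients, base, scenario):
    """Percentage change in the predicted rate relative to the base case."""
    base_rate = predict(coefficients, base)
    new_rate = predict(coefficients, scenario)
    return 100.0 * (new_rate - base_rate) / base_rate


# Hypothetical coefficients and attribute values, for illustration only
coefficients = {"intercept": 0.1, "bike_facility_km": 0.05}
base = {"intercept": 1, "bike_facility_km": 2.0}
scenario = dict(base, bike_facility_km=2.0 * 1.10)  # +10% bicycle facility length
change = elasticity_effect(coefficients, base, scenario)
```

In this invented example a 10% increase in facility length raises the predicted rate by about 5%; with the estimated model, the same calculation would be repeated for each scenario and averaged over stations.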

The following observations can be made from the results presented. First, an increase in the bicycle infrastructure variables (length of bicycle facilities, stations and/or capacity) leads to an increase in usage of the bicycle-sharing system, as expected, since the presence of infrastructure plays a great role in cyclists’ decision to use such a system. These effects are marginally higher for departures than for arrivals. Second, and more strikingly, we see that increasing the number of stations