learning by driving: productivity improvements by new...
TRANSCRIPT
Learning by Driving: Productivity Improvements by New York City Taxi Drivers1
Kareem Haggag
University of Chicago
Brian McManus
University of North Carolina
Giovanni Paci
Columbia University
June 2014
Abstract:
We study learning by doing (LBD) by New York City taxi drivers, who have substantial
discretion over their driving strategies and receive compensation closely tied to their success in
finding customers. The breadth of the data, which cover each individual fare during 2009, allows
us to document significant learning while controlling for unobservable factors that often
constrain the LBD literature. In addition, we use the data’s detail on fares’ locations and times to
show that new drivers lag behind experienced drivers to a greater extent in more difficult
situations, and that new drivers benefit from both general and neighborhood-specific experience.
Keywords: learning by doing, specific human capital, taxis, big data
JEL Codes: D24, D83, J24, L92
1 Contact information: [email protected], [email protected], and [email protected]. We thank Ray Fisman, Jonas Hjort, Supreet Kaur, Kiki Pop-Eleches, Bernard Salanie, Chad Syverson, James Roberts, and several seminar audiences for helpful comments.
1
1. Introduction
Learning by Doing (LBD) is a widely studied economic phenomenon, with potential impacts on
macroeconomic growth, market structure evolution, and individuals’ wage dynamics. Prior
studies of LBD have examined productivity improvements by individual workers, as well as
“organizational learning” which describe more complex improvements which may not be
embodied in a single agent. While some exceptions exist, the empirical literature generally
focuses on manufacturing settings, with LBD captured through reductions in unit cost as a firm
or worker accumulates experience. In these cases, while production improvements may be strong
and clear, it can be difficult to learn which agents are improving their capabilities, what activities
are being improved, and how strongly individual agents are encouraged to find improvements.
We provide new evidence on LBD that addresses some of these gaps in the literature. We
use a highly detailed dataset of New York City (NYC) yellow taxi drivers to study how drivers
make improvements overall, how their performance varies across measurably different
situations, and how general and specific experience makes different contributions to driver
performance. Taxi service in NYC is characterized by two features that make it an especially
interesting setting for studying LBD. First, drivers have substantial discretion in how they search
for new customers, and their compensation is very closely tied to the earnings they collect
through customers’ fares. Drivers’ agency in their own production makes them similar to
entrepreneurs, searching for the next fruitful business opportunity. Second, drivers operate in an
environment that lends itself to a relatively clean study of learning. While other LBD studies
often contend with isolating productivity improvements from confounding factors such as scale
economies, improvements in inputs, and shifts in input prices, by contrast all taxi drivers
essentially use the same capital equipment, work in the same demand and cost environment, and
charge the same prices.
We perform our analysis using data on all 171 million NYC yellow taxi rides during the
full 2009 calendar year. Our main analysis features a sample of 7,664 drivers who worked an
average of 176 shifts over the year. Among these drivers we identify 3,298 as “new” based on
their first day in the 2009 data (after March 1st), and other criteria. Our main measure of overall
experience is driver shifts (i.e. full days of work), although it is also possible to use finer
divisions such as the driver’s cumulative number of drop-offs. We measure driver output through
fare earnings per hour. By incorporating the activity of experienced drivers together with new
2
ones, we are able to control for variation in the earning opportunities across each individual hour
of our dataset; this allows us to eliminate bias due to selection effects of new drivers into lower-
demand shifts. We find that a new driver’s productivity is about 8.1% less than an experienced
driver’s when the new driver is in his first shift. A new driver’s productivity increases by 0.19%
for each 10% increase in the number of shifts, and by a driver’s 70th shift he earns as much as an
experienced driver. New drivers’ earnings continue to grow, and by the new driver’s 120th shift
his average earnings are 1% above those of the average driver who entered the industry before
2009.
Our most novel results come from the opportunity to observe the time and location of
every pick-up and drop-off of each ride during 2009. We use census boundaries to divide New
York City’s five boroughs into a collection of 220 neighborhoods, and further divide the week
into 168 hour-long blocks. For each neighborhood and each hour of the week, we use a sample
of 17,463 experienced drivers (i.e. all experienced drivers not included in our analysis sample) to
compute the expected earnings of a driver following a drop-off in a particular setting. These
statistics are useful in describing whether certain situations represent “easy” or “hard” earnings
opportunities. We find that new drivers earn considerably less in more difficult situations.
Following a drop-off in one of the most difficult location-hours, a driver in his first shift will
earn 14% less than an experienced driver in the same situation, but brand new drivers earn only
1.5% less than experienced drivers in the easiest settings. Despite the large gap in initial earnings
in difficult situations, new drivers’ earnings improve fastest there and erase the gap with
experienced drivers after 136 shifts.
The time- and location-specific data also allow us to compute a new driver’s share of
experience in each neighborhood and portion of the day (and the intersection of these). We
estimate that the importance of location-specific experience is comparable to general experience
for new drivers’ earnings. A new driver whose experience is relatively concentrated (75th
percentile) within a neighborhood has expected earnings that are 4% greater than new drivers
with a smaller share (25th percentile) of experience in a given neighborhood. Moreover, time-
specific experience at a location explains a substantial share of the benefit from local experience.
Our analyses of task difficulty and general versus specific experience rely on two simple
but powerful identification opportunities. First, we observe new drivers visiting a variety of
locations a large number of times. On average, a driver makes 23 drop-offs per shift. Relative to
3
the labor literature, which relies on relatively infrequent job switches to disentangle the effects of
specific versus general human capital, we observe substantially more chances to see how a driver
performs in different tasks. The second opportunity comes from the randomness introduced to
task selection through customer-selected destinations. While drivers are in control of the
neighborhoods in which they look for new customers, they do not control the selected
destination. The customer drops the driver into some location which could be easy or hard, or
within which the driver has little or substantial prior experience. We contrast this with the usual
endogeneity concerns about the motivation for workers to switch jobs within their careers, which
may bias estimates of job-specific human capital in studies that use labor panel data.
Our results join a large empirical literature on LBD; see Thompson (2010 and 2012) for
surveys of the field. Several recent papers have grappled with the same issues that interest us.
Jovanovic and Nyarko (1995) consider potentially-bounded learning by individual agents in a
variety of occupations. Hatch and Mowry (1998) and Hendel and Spiegel (2014) use extremely
fine data on production improvements in manufacturing, and they discuss the distinction among
improvements that coincide with deliberate changes to production processes, small “tweaks” to
production, and passive learning which occurs automatically. Our consideration of
heterogeneous tasks is related to Levitt et al. (2013), who use a variety of measures on
production improvements within a single auto manufacturer. More common in the literature are
studies that aggregate activity across diverse settings and offer some controls for this diversity
without directly studying it; examples include Kellogg’s (2011) study of relationship-specific
LBD across a variety of Texas oilfields, and Thompson’s (2001) estimation of separate ship yard
productivities within an investigation of aggregated learning across yards. Examples of research
that allow for differences in learning speeds across agents include Rockart and Dutt’s (2013)
study of equity-underwriting projects, and Pisano et al.’s (2001) study of hospitals’
improvements in cardiac surgery.
Our paper also relates to a labor literature that studies general and (firm- or task-) specific
experience by workers. Similar to our exercise, Shaw and Lazear (2008) use direct output
measures to study learning at the individual-worker level. They find that new installers of
automobile windshields have steep productivity-tenure curves that are attributed to both learning
and effort, although Shaw and Lazear cannot distinguish between general and specific
experience in their exercise. Also closely related to our paper is Ost (2014), who studies how
4
elementary school teachers’ general and grade-specific experience affect classroom productivity
as measured through reading and math test scores. Additional prior studies of task-specific
experience include Gathmann and Schonberg (2010), who use German panel data which span
many occupations, and Clement et al. (2007), who study variation in forecasting accuracy in a
cross-section of financial analysts.
The rest of our paper is organized as follows. In Section 2 we provide an overview of the
NYC taxi market, including the various activities that are tracked by our primary data source, the
NYC Taxi and Limousine Commission (TLC). In Section 3 we provide an overview of the data.
Section 4 contains our empirical analysis, and Section 5 concludes.
2. The New York City taxi market
We study driving activity in New York City’s yellow taxicabs. These taxis serve customers who
hail from the street, plus taxi queues at airports, train stations, and hotels. They are not permitted
to accept customers in any other way. Other types of for-hire vehicles (e.g. town cars and
limousines) serve the market for pre-arranged and radio-dispatched transportation. During our
sample period of 2009, there were 13,237 yellow cab taxi licenses (“medallions”) and about
40,000 additional for-hire vehicles in New York City. Each medallion corresponds to a single
yellow taxi, which may be controlled by an owner-operator, a firm with a fleet of cabs, or an
authorized leasing agent. The taxis are operated by drivers licensed by New York’s Taxi and
Limousine Commission; the number of drivers increased from 46,409 to 48,521 during 2009.
The change in number of drivers combines the separate effects of attrition, the entry of new
drivers, and the potential return of experienced drivers who ended extended absences from the
market.
Taxi drivers enter the market through a variety of avenues. Based on conversations we
have had with taxi fleet managers, it appears that very few drivers are “seasonal” in the sense
that they take multi-month breaks from driving a taxi before returning to the job. Schaller (2006)
reports that, in 1991, 20% of TLC license applicants drove professionally in their previous job,
with 44% of applicants having ever held a professional driving position (in New York or
anywhere in the world). Restaurants and food service firms provide the second-largest share
(15%) of previous employment. To obtain a TLC license, drivers are required to: hold a valid
DMV license; attend either a 24-hour or 80-hour Taxicab School course; and pass tests on New
5
York geography, TLC rules, and English language proficiency. New and experienced drivers are
likely to differ on personal characteristics and performance. Schaller (2006) reports that the
median experienced driver is about ten years older than a typical newly-licensed driver, and
experienced drivers receive fewer complaints for service problems such as refusing passengers,
overcharging, treating passengers rudely or abusively, or driving unsafely. Driver experience is
also negatively correlated with the number of accidents and traffic violations (Schneider, 2010).
Taxi earnings and costs are structured so that it is in the driver’s interest to maximize fare
earnings during a shift. Drivers keep all fares and tips. Fares are accrued as a function of ride
distance and duration, and may include surcharges for nighttime rides, peak weekday rides, and
destinations at airports or outside of New York City’s five boroughs. Drivers’ costs vary
depending on their ownership of the taxis they drive. All drivers pay for their own gas ($5,000-
$10,000 per year), annual TLC fees ($100), and DMV/TLC fines for driving infractions. Drivers
who lease their vehicles will pay a per-day or per-week flat fee; these fees were about $100 per
day in 2009. Owner-operators pay for annual maintenance and repair ($4,000-$10,000/year),
insurance ($7,000-$13,000/year), and licensing fees ($1,000/year). On average, a driver’s take-
home pay is 57% of revenue, with the rest divided among expenses paid by the driver and taxi
owner (Schaller 2006). Driver earnings vary with the time of day and day of the week, so some
shifts are more lucrative than others. Experienced drivers are generally sorted into the more
valuable shifts; we document this pattern below. In our conversations with fleet managers we
have learned that this is largely due to seniority rules for the assignment of drivers to shifts.
3. Data
3.1 Sample construction
The data we use in analysis are derived from a fare-by-fare database of yellow taxi activity from
the full 2009 calendar year. These data are collected by the TLC as part of an effort to monitor
the activity of taxis and their drivers in the NYC area. The database includes all fares for licensed
NYC cabs, even if one or both endpoints of a fare occur outside of the city’s five boroughs.
We begin with a full database of 171 million observed fares received by 41,256 drivers.
Each observation includes a unique driver-specific index, the longitude, latitude, date, and time
of the ride’s pick-up and drop-off, and payments to the driver. Date and time are recorded to the
6
second, and longitude and latitude are recorded through on-board GPS.2 The total payment is
broken down into the fare, surcharges, tip when the payment is via credit card, and MTA tax.
The combination of driver identification codes and specific ride data allow us to construct a
complete history of each individual driver’s activity during 2009. Following Farber (2005), we
define a shift as a succession of rides with breaks between rides no longer than 5 hours. We then
track each driver’s total fare earnings per shift, the shift’s duration, and running counts of the
driver’s numbers of drop-offs in each neighborhood and neighborhood-time of day combination.
We also construct statistics within selected hours of a shift. The measures include for each
driver: fare earnings in the hour, percentage of time spent idle or slack (i.e. without a passenger),
the number of miles driven, the number of fares collected, and the average driving speed.
Unfortunately we do not have additional information about the drivers’ personal characteristics,
employer’s identity, contract terms, or costs.
We clean and organize the data in order to conduct analysis of new versus experienced
drivers. Our primary metric for experience is the number of shifts worked. We remove drivers
and shifts that are unlikely to represent the production efforts of a regular driver in the market. In
particular, we drop drivers who are associated with fewer than 100 fares, shifts associated with
more than one unique car identifier, and shifts that were shorter than 2 hours or longer than 15
hours. This reduces the dataset to 156 million observations. 86% of the data reduction comes
from the elimination of very long shifts, and 9% is due to drivers who use multiple cars in a
single shift.
Next, we separate drivers into groups by their level of experience. We identify 27,519
drivers who first appeared in the data on or before January 15, 2009. Of these drivers, we retain
21,819 who worked at least 100 shifts between January 1 and December 31, and additionally
worked at least 20 shifts before March 1. These drivers are likely to have pre-2009 experience in
the market, while also working in the market with enough frequency to maintain their stock of
knowledge. To maintain tractability of the data, we select a 20% random sample of these drivers
as our “experienced” drivers in the analysis described below.
2 Longitude and latitude are reported to one-foot precision, but the presence of tall buildings and other technical challenges is likely to distort these values somewhat. Small errors in location data will not affect our estimates, which use areas of several city blocks as the finest descriptors of taxi location.
7
From the collection of drivers who fail the criteria for experienced drivers, we identify a
subset that are likely to have entered the NYC taxi workforce in 2009 as new or inexperienced,
and work with sufficient frequency to indicate an intention to function as a full-time taxi driver
in the market. To identify these new drivers, we isolated the sample to 5,310 drivers that were
first observed in the data on or after March 1, 2009. Among these drivers, we select the 3,298
who worked at least 50 shifts between their entry date and December 31, 2009. It is possible that
some of these drivers have prior experience in the NYC taxi market, but we cannot measure the
size of this effect in the data. To address the possibility that some of these drivers are seasonal
workers, we have performed additional analysis (discussed below) on drivers who are subject to
stricter inclusion criteria; our results are largely unchanged. In addition, we have experimented
with a variety of thresholds for the first day of work by new drivers and the total number of shifts
driven, and again the results are unaffected. To complete the selection of drivers and shifts for
analysis, we limit both the experienced and new driver samples to shifts that start between March
1 and December 31, 2009.
We use the longitude and latitude information in the fare database to identify the NYC
geographic region in which each pick-up and drop-off occurs. We divide the market in two ways.
First, we use the boundaries of Public Use Micro Areas (PUMAs) to identify 59 regions within
the city’s five boroughs. PUMAs represent a fairly coarse division of the city; for example one
PUMA is defined by the portion of Manhattan west of Central Park and approximately bounded
by the Park’s north and south edges. Second, we use the boundaries of 220 Neighborhood
Tabulation Areas (NTAs), which are mostly nested within PUMAs and correspond more closely
to conventional views of NYC neighborhoods. For example, the Lincoln Square area is a distinct
NTA within the Manhattan PUMA described above.
We use the geography data in two ways. First, we use a random sample of experienced
drivers3 to construct statistics on driver performance within each NTA-hour and PUMA-hour
combination within a week (i.e. for each region a separate measure is constructed for each of
7×24=168 unique hours in the week). Using the fare-level data, we calculate each experienced
driver’s total earnings during the 60 minutes following a drop-off in a specified region-hour. We
then average these earnings across all drivers in the same region-hour, thereby computing a
3 We use data from all experienced drivers who are not within the 20% sample that we employ in our main analysis.
8
measure of how locations and times may vary in the earning opportunities available for drivers.4
Some passenger-selected drop-off locations may take the driver to a part of the city where (at a
given time of day) it is especially easy to find the next customer, or perhaps find customers who
are likely to request rides to high-earning areas. With these measures of average earning within
regions and time, we can characterize some situations as “easy” or “hard” production
opportunities, and investigate how new driver performance varies with task difficulty.
Our second use of the geography data is to construct measures of location and location-
time specific experience by new drivers. For all new drivers who are included in our analysis
sample, we maintain a running count of the number of drop-offs that the driver has experienced
in each PUMA and NTA. We also divide the day into four six-hour blocks (11pm-5am, 5am-
11am, 11am-5pm, and 5pm-11pm), and record each driver’s experience in a PUMA or NTA
within each time block. With these drop-off counts, we are able to track the evolution of a
driver’s general and specific experience during the 2009 calendar year. For example, if we are
considering a new driver’s performance following a particular drop-off, we may ask what share
of the driver’s experience comes from dropping-off passengers in the same neighborhood and
during the same portion of the day.
3.2 Summary statistics
Our analysis features two populations of drivers. All of our analysis includes activity by the
3,298 new drivers who meet the selection criteria described above. We also use a 20% random
sample of 4,366 experienced drivers, who act as a comparison group in much of the empirical
analysis. We present summary statistics on the drivers’ productivity in Table 1, separately
reporting activity for experienced drivers, new drivers overall, and new drivers in their first 20
shifts. As in our analysis below, all summary statistics correspond to shifts between March 1 and
December 31, 2009.
We examine several types of productivity variables, including average earnings per hour
and earnings within a specified hour of the shift. We generally focus on the 60 minutes following
4 Before constructing this measure, we first drop all rides that are not followed by at least 4 more rides within the driver’s shift (17.9% of observations), so as not to characterize locations by endogenous shift-ending behavior. We also set the driver’s fare-earnings to missing if equal to zero (to coincide with our rule in the regression analysis). Finally, after constructing the NTA/Hour averages, we drop all NTA/Hours with fewer than 10 observations over the entire sample period (this corresponds to the bottom 10% of the distribution of NTA/Hour observations).
9
a driver’s third drop-off of a shift. For convenience we refer to this period as “hour-R3” in our
text and tables. Hour-R3 roughly overlaps with a driver’s second hour of work. For the median
driver this hour begins at minute 55 of the shift, and drivers at the 25th and 75th percentiles begin
at minutes 39 and 81, respectively. This hour is a microcosm of earning opportunities yet less
likely to be affected by considerations about when to stop working or whether to take long mid-
shift breaks, which may differ between new and experienced drivers.5 This also allows us to look
directly at the pick-up and drop-off times and locations of the driver’s final fare starting
immediately before hour-R3, as well as the pick-up time and location of the next fare. In
computing earnings in the hour following a drop-off, we omit observations in which a driver has
no customers for 60+ minutes. We infer that the driver is on a break, and is not attempting to find
new customers. This rule affects 5% of our observations, and has no substantial effect on the
results described below.
The earnings statistics in Table 1 show that experienced and new drivers collect about the
same value in fares per hour (approximately $26), but the median new driver in his first 20 shifts
earns $1.43 less per hour (5.5%) than the average experienced driver. Similar differences emerge
across experienced and new drivers when we compare other measures of driver activity. Drivers
in their first 20 shifts spend a larger fraction of hour-R3 without a passenger, and the slack time
between their third drop-off and the next pick-up is longer, on average. These differences have
an influence on numbers of miles driven and customers served. Average ride duration is greater
for experienced drivers, despite these drivers taking more customers per hour. Average travel
speed, which is approximately the same across drivers of all experience levels, could capture
both drivers’ ability to navigate congested city streets and the types of passengers and locations
favored by the driver. Any differences displayed among drivers in this table, however, could be
influenced by variation in market conditions when drivers of different experience levels are
working.
In addition to differences in productivity, in Panel B of Table 1 we display some
differences in working practices by new and experienced drivers. We observe new drivers
working fewer shifts per person than experienced ones, but this is largely an artifact of our data
5 An advantage of this variable is that it allows us to partially separate learning from drivers’ habituation to driving conditions, such as sitting for many hours, which may be less likely to be important early in the shift.
10
construction procedure. (Experienced drivers work in the market from March to December, but
new drivers are added gradually during this period.) New and experienced drivers work a similar
proportion of days following their first appearance in the analysis sample. Some notable
differences exist in the number of cars associated with each driver. While our data do not provide
information on drivers’ relationship to cab owners, we observe that experienced drivers are
170% more likely to be paired with a single taxi during the sample period. This suggests that the
proportion of owner-operators and long-term lessees is larger in this population. Taxi ownership
may provide greater flexibility for drivers to choose their own schedules, which can affect
earning opportunities. In Figure 1’s Panel A we display the proportion of pick-ups in our analysis
sample by new drivers across different hours of the day on Tuesdays and (separately) Saturdays,
and we see some variation in when new drivers are more likely to be working.6 The Figure’s
Panel B displays the average number of pick-ups per hour during the same days, and it appears
that new drivers are most common in the market when overall demand is relatively low. These
differences in selection into shift hours highlight the importance of controlling for earning
opportunities across days and hours of the sample period.
In Figure 2 we provide an illustration of how earning opportunities vary across the city,
and to what extent they vary across four different hours on Tuesday and Saturday. We separate
the NTA-hour earnings statistics into quartiles. The first quartile (lightest shading) represents the
lowest average earnings ($19.08 on average) in the hour following a drop-off, and the fourth
quartile (darkest shading) contains the highest average earnings ($33.83). In constructing the
quartiles, we weight the NTA-hour statistics by the frequency with which they occur in the fare-
level data for the new and experienced drivers of the analysis sample. Some NTAs (largely on
Staten Island) are unshaded because fewer than 10 rides terminated in those neighborhoods
during an hour of the week. Lower Manhattan is at its most lucrative around rush hours and on
weekend evenings, while the Bronx contains many neighborhoods in the lowest-earning quartile.
The eastern portions of Brooklyn and Queens have relatively high average earnings, likely
because of their proximity to John F. Kennedy airport, from which the TLC mandated a flat fare
of $45 in 2009.
6 Our use of the analysis sample, with all qualifying new drivers but only 20% of qualifying experienced drivers, inflates the proportion of new drivers relative to the actual share in the market.
11
Some of our empirical analysis relies on an assumption that drivers are randomized into
locations across the city based on the requested destinations of their customers. While drivers
can control where they pick up customers, they do not determine the drop-off location.7 Contrary
to our assumption, if drivers gain expertise in selecting customers who are likely to request rides
to more lucrative drop-off locations, then some of their measured improvements in fares will be
due to the ability to avoid arriving in difficult situations rather than performing well within them.
(This would be a different sort of learning than interests us.) We evaluate our randomization
assumption by investigating whether new and experienced drivers have different outcomes in the
mean earnings of a destination (measured at the NTA-hour and PUMA-hour level). Our results,
which we report in Appendix Table A1, show that once we account for the date, hour, and NTA
in which a ride begins, the expected earnings of the destination are uncorrelated with a driver’s
stock of experience.
4. Empirical analysis
We estimate a set of reduced-form econometric models to satisfy our research objectives. The
models are reduced-form in the sense that they do not model drivers’ choices and how these vary
with experience, but instead we measure how market outcomes (e.g. wages) vary across drivers
with different amounts of experience. It is beyond the scope of this paper to perform a direct
analysis of drivers’ choices, but this is a topic we are studying in ongoing research.
Our first goal is to document overall learning dynamics, and also investigate how
estimating learning can be affected by sample selection or functional form assumptions. We
perform this analysis with a variety of measures of driver productivity, and across a variety of
situations that vary in difficulty. The second goal is to use the data’s geographic detail to
estimate the relative importance of general and specific experience on a driver’s productivity.
We perform our main analysis with a fairly simple econometric framework. Let yit be
driver i’s productivity during shift t. The production measure could be (a function of) fare per
hour during t, fare during a specified hour of shift t, or a measure of the driver’s idle time. The
7 While drivers’ refusals of fares may happen occasionally in practice (and contrary to our assumptions here), this activity is against TLC regulations and can result in the driver receiving a punishment. During our sample period, refusal punishment was $200-$350 for a first offense, $350-500 and a possible 30-day license suspension for a second offense, and a mandatory license revocation for a third offense. The TLC received about 2,000 formal complaints per year in 2009 and 2010.
12
new driver’s experience at shift t is Eit. We generally assume that E is the current shift number
for a new driver, but in some specifications E is the cumulative number of drop-offs that a driver
has completed as of t. We also calculate the share of all drop-offs that occur within each
location-time block combination as of shift t, and we extend E to include these shares. We
measure the impact of E on y with the function g(E;θ), where θ is our main parameter vector of
interest. In some specifications we let g be the natural log of a scalar E, while in others g is a set
of dummy variables for different levels of driver experience. We control for market-level shocks
over time with the fixed effect αh. In our primary preferred models we allow for a different αh
for each distinct hour in the dataset, i.e. for each hour from 12AM March 1 through 11PM
December 31 2009, and we also explore some coarser specifications. αh accounts for demand
and supply fluctuations over time that may influence all drivers’ earnings during t. This may
include regular variation in demand (e.g. daily rush hours), idiosyncratic variation in demand
(e.g. weather), and seasonal effects. In addition to market-level shocks, some driver-level
characteristics during t may also affect production. These factors, which we collect in the vector
Xit, can include the length of a driver’s shift (which may affect how the driver takes breaks) and
characteristics of recent pick-up locations (for analysis of activity in the following hour). In some
specifications we include driver fixed effects or interactions between αh and recent pick-up
locations. Finally, we allow the error term εit to account for additional driver-shift level
unobservables in production. This term contains individual driver variation in production from
period- and experience-specific averages. Driver production and learning are likely to be
correlated within driver over time, so we cluster ε at the driver level during estimation. We
combine the components described above into the econometric model:
yit = g(Eit;θ) + Xitβ + αh + εit. (1)
Across our specifications below we employ a variety of assumptions on y, g, E, X, and α. These
changes should be clear in context, and we implement them to demonstrate the robustness of our
central results on LBD.
4.1 Overall measures of learning
We now consider a variety of models that describe learning by new drivers. We attempt to
demonstrate a few different ideas with this collection of models. First, we explore how sample
13
selection issues may result in biased LBD estimates. Second, we evaluate the benefits from
taking a nonparametric approach to learning dynamics. Third, we assess the robustness of our
results with a collection of related productivity measures and model restrictions.
In our main analysis we focus on hour-R3. We do this in order to examine productivity
differences in a way that limits concerns about shift length, mid-shift breaks, and controls for
time of day. We estimate (1) with y specified as the log of total fare earnings during hour-R3 of a
shift. In Table 2 we provide results in which:
g(E;θ) = θ0 + θ1log(Eit)
for new drivers while g = 0 for all experienced drivers. To focus on sample selection issues, we
gradually build to our preferred specification. In Specification 1 we report results for new drivers
only, and without controls for date and time. Without these controls for market conditions, we
find a large effect of experience on productivity. Specification 1’s results imply that a 10%
increase in a driver’s number of shifts leads to an increase of 0.36% in hour-R3 earnings. Once
the full set of date-by-hour fixed effects is included, to control for driver selection across shift
start times, the learning rate falls to 0.27% in specification 2. The addition of experienced
drivers, in specification 3, provides an estimate of the production gap for brand new drivers
(8.1%) and reveals a considerably lower rate of earnings improvement, now 0.19% for each 10%
increase in the number of shifts worked. Using the full sample of drivers allows us to obtain
better estimates of the time-specific factors that affect shift productivity outcomes. For example,
when we include new drivers only, we have no observations during April that include both
brand-new drivers and those with 100 or more shifts of experience. Inference on productivity
improvements relies on tying the differences between brand new and more seasoned drivers later
in 2009 back to the earlier parts of the year. It is preferable to benchmark the productivity of new
drivers to experienced ones who are working during the same hours of the year. In specification
4 of Table 2 we include driver-specific fixed effects, but due to computational limits we restrict
αh to contain a different fixed effect for each day in the panel and (separately) each hour in the
day. This measure of purely within-driver improvements yields an estimate of learning (0.16%
per 10% increase in shifts) that is similar to the model with a richer set of time controls but
possibly affected by driver attrition. Economically and statistically significant learning occurs
within drivers. See Appendix Table A2 for a collection of additional models that examine how
new drivers’ starting dates and other sample construction assumptions affect the results. 14
The log(E) specification, while common in the literature, imposes a parametric restriction
on the effect of experience on production, and with our large volume of data we can assess
whether the restriction is a poor fit for the market. In Figure 3 we provide estimates from a
model in which we allow a separate dummy variable for every 10 shifts of a driver’s experience
for shifts 0-100, and a dummy for every 20 shifts for shifts 101-200. (Parameter estimates from
this model are provided in Appendix Table A3.) In addition to this less parametric approach to
learning, we provide in Figure 3 the estimated production improvements implied by specification
3 on Table 2. We observe several interesting results from this Figure. Production improvements
among new drivers slow considerably after the first 30 shifts, and beyond the 100th shift driver
productivity is no different from that of a new driver with 200+ shifts of experience. The results
in the appendix also reveal that seasoned new drivers are more productive per-hour than
experienced drivers. This may be due to new drivers being more energetic or capable, on
average, than the older cohort. Macroeconomic fluctuations may also play a role in shifting the
population of workers who select-in to taxi driving. Finally, when we compare the learning
curves from the parametric and nonparametric approaches to learning, we find very little
difference between the two approaches. This provides reasonable support for moving forward
with more concise, parametric models in the analysis we discuss below.
In Table 3 we report a collection of robustness tests. These tests address a few reasonable
concerns about our results: the drivers’ choices during the first hour of a shift have a substantial
effect on hour-R3 earnings, the focus on hour-R3 earnings is not representative of driver
production, and that experience measures other than shift count may provide different answers.
In specifications 1 and 2 we address concerns about the earlier activity. In specification 1 we
include a control for average earnings in the NTA-hour combination where the final fare prior to
hour-R3 originates. While this earnings measure has a substantial effect on hour-R3 earnings, it
does not have a large impact on our estimate of learning. We find similar results in specification
2, where we add an interaction between the date-by-hour fixed effect and the starting NTA of the
driver’s third fare. In specifications 3-5 we use alternative measures of driver production: the
hour following the first drop-off (hour-R1 on the Table), the hour following the eighth drop-off
(hour-R8), and average fare per hour for the full shift. All measures show positive and significant
improvements to driver production, although some magnitudes are slightly different from the
hour-R3 estimates. Finally, in specification 6 we include a count of a driver’s total number of
15
prior drop-offs as the measure of experience, and again we find significant evidence of learning.
The magnitude of the learning estimate (0.025) is larger than in our preferred specifications in
Table 2. This parameter estimate may include some upward bias due to reverse causality, as
more productive drivers will be able to generate more drop-offs per shift or hour.
The improvements in driver earnings leave open the question of what drivers might be
doing to improve their outcomes. While we do not directly observe drivers’ choices during a
shift, we track how several production outcomes vary with experience. These outcomes, which
we described above, are considered in Table 4. Specifications 1 and 2 track improvements in a
critical driver activity: reducing the amount of time between customers. Specification 1, in which
we consider the share of hour-R3 without passengers, shows that new drivers start with a
significantly larger fraction of hour-R3 idle, and then significantly improve in this outcome with
experience. Likewise, the logged slack time between the driver’s third drop-off and his next
customer (i.e., the first fare of hour-R3) is significantly greater for new drivers, but this falls
steadily with experience. Given these results, it is sensible that more experienced drivers are
observed to transport passengers a greater number of miles (specification 3). We find that new
drivers have lower-mileage trips, on average, which may be due to the types of neighborhoods in
which they search for customers. One possible explanation for improvements in drivers’ earnings
is faster travel while passengers are in the car, and an initial examination of the data in
specification 5 shows that this may be true: brand new drivers appear to travel about 9% slower
than experienced drivers. Speed differences between new and experienced drivers fall somewhat
once we add a control for the driver’s average miles per ride (which could be influenced by
longer trips on higher-speed roads), but significant differences remain. While we do not have
direct evidence on drivers’ decisions to avoid congestion or construction, the results of
specification 6 suggest that experienced drivers may navigate NYC streets more efficiently than
new drivers.
4.2 Productivity, learning, and task difficulty
One strength of our data is that we can examine how new drivers’ production lags behind
experienced drivers’ across a variety of situations, and also how new drivers make progress in
different settings. For this analysis, we separate the city’s NTA-hour (of week) combinations into
quartiles by the average earnings of drivers in the 60 minutes following a drop-off. “Difficult”
16
NTA-hour combinations will have relatively low earnings in the hour following a drop-off.
While a separate difficulty statistic can be calculated for each NTA-hour combination, the
distribution of drop-offs across these is far from uniform (i.e. midtown Manhattan drop-offs are
substantially more common than Staten Island). To address this issue, we construct the quartiles
at the level of the (full) fare data rather than the NTA-hour combination. While the full data have
observations split into equal fourths before we apply the sample restrictions and cleaning steps
described above in Section 3, the analysis sample departs from this weighting somewhat.
We estimate a regression model that allows new drivers to have different expected
earnings at the start of their career in each difficulty quartile, experience different rates of
improvement in each quartile, and allows experienced drivers to vary in their average earnings
across difficulty quartiles as well. The base model is:
log(hour-R3 fare𝑖𝑡) = ��1{𝑄𝑖𝑡 = 𝑞} × 1{𝑖 new} × �θ0𝑞 + θ1𝑞 log(𝐸𝑖𝑡)��4
𝑞=1
+ � 1{𝑄𝑖𝑡 = 𝑞}3
𝑞=1
µ𝑞 + 𝑋𝑖𝑡β + αℎ + ε𝑖𝑡.
The variable Qit contains the difficulty quartile (q) in which driver i drops-off his third customer,
starting hour-R3. 1{.} is the indicator function. New drivers have a different intercept (θ0q) and
learning coefficient (θ1q) for each possible difficulty quartile. These parameters capture,
respectively, the productivity lag of brand new drivers across different quartiles and the rate at
which new drivers’ production improves across quartiles as they accumulate experience (E).
Experienced drivers in the bottom three quartiles have earnings that differ by µq from the
(excluded) earnings of experienced drivers in the easiest quartile. The control variables (X),
market condition fixed effects (α), and error term (ε) are all defined as in the models discussed
above.
We report the results of this analysis in Table 5. In specification 1 we specify αh to
include date-by-hour fixed effects to control for market conditions. We find that driver
productivity, as measured through hour-R3 earnings, is 14% lower for brand new drivers
operating in the most difficult quartile than it is for experienced drivers in the same quartile.
Experienced drivers in the most difficult quartile, in turn, have earnings that are 26% below
those of experienced drivers in the easiest quartile. Note that this difference within experienced
17
drivers is substantially larger than difference between new and experienced drivers in overall
earnings reported on Table 2’s preferred specification 3. New drivers in the middle two quartiles
start their driving careers 10% and 8% below the productivity of experienced drivers, while
drivers in the easiest quartile are just 1.5% below experienced drivers. While new drivers
perform substantially worse than experienced drivers in difficult situations, their performance
improves in these settings more quickly than in the three easier quartiles. In fact, the more
difficult a quartile is for new drivers, the faster earnings improve.
One possible concern about this analysis is that savvy drivers will make choices within
their first three fares that reduce the chance of starting hour-R3 in a difficult location. If drivers
are non-randomly assigned to neighborhoods, then part of the earnings penalty of a difficult
neighborhood will be due to below-average drivers sorting into those areas. We address this
concern by adding to X the logged average earnings of the NTA of the driver’s third pick-up
location. This variable accounts for the final choice the driver could make before hour-R3, and
drivers who find better neighborhoods will be distinguished by picking up their last passengers in
high-earning areas. In specification 2 of Table 5 we find that hour-R3 earnings are increasing in
the value of the third fare’s pick-up location, as expected, but very little changes in the way in
which brand new drivers lag experienced ones or the way in which the new drivers’ earnings
improve with experience. This provides some evidence in favor of our assumption that drop-off
locations, conditional on pick-up locations, are a credible randomizing device for selecting the
neighborhoods reached by drivers. We extend our analysis in specification 3, where we replace
the control variable for third-fare pick-up earnings with a separate neighborhood fixed effect
which is fully interacted with our date-by-hour fixed effects. This allows different neighborhoods
to vary in the magnitude in which their expected earnings carry-over to driver activity in the
following hour. Again we find very little movement in the estimated parameters. New drivers lag
experienced drivers by substantially more in difficult settings, but new drivers also improve
relatively quickly in these settings. The coefficient estimates of specification 3 imply that new
drivers in hardest-quartile (Q1) situations require 136 shifts to eliminate the productivity
difference between themselves and experienced drivers; in the second-easiest quartile (Q3)
drivers require 78 shifts.8
8 For new drivers in Q1 situations, we solve -0.137 + 0.0279×log(t’) = 0 for t’ to obtain the number of shifts until convergence for new and experienced drivers’ expected hour-R3 earnings.
18
4.3 General and specific experience
Our final set of analyses employs separate measures of driver experience across locations and
location-time combinations. While there exists a substantial literature on general versus specific
human capital, which we describe briefly in the Introduction, to our knowledge this literature
relies on relatively small datasets, fairly coarse measures of specific experience, and difficult-to-
evaluate or -test identification arguments. (Consider the example of workers who are switched
between jobs, and the potential questions about which workers are selected for switching.) By
contrast, we have the randomizing device of customers asking for drop-offs in a variety of
neighborhoods, which themselves vary in their earning opportunities and (perhaps) optimal
strategies for finding customers.
We employ several measures of specific experience, and we contrast these with “general”
experience as measured through a driver’s cumulative number of drop-offs. These experience
measures are tied to the location and (in some cases) time of the drop-off for a new driver’s third
customer in shift t. As of this drop-off, the driver is observed to have accumulated dt total drop-
offs through the present customer, with dlt drop-offs in the same location l (PUMA or NTA) as
the customer. In addition, we divide the day into six time blocks (b) which run 11pm-5am, 5am-
11am, 11am-5pm, and 5pm-11pm, and we compute the driver’s cumulative number of drop-offs
that occur through shift t in location l during block b: dlbt. We intend the blocks to capture
different neighborhood-specific patterns in demand that correspond to different times of day. We
utilize these drop-off counts by calculating the share, slt = dlt/dt, that has occurred in location l
through shift t; we also calculate the share that has occurred in the same location and time block,
slbt = dlbt/dt. Finally, to limit colinearity concerns, we construct the share of total activity that
occurs within a PUMA or NTA but outside of the current time block. Among new drivers, the
mean value of slt is 17% for the hour-R3 PUMA, and 8% for the hour-R3 NTA. Within a portion
of the day, the mean values of slbt are 8% for PUMAs and 4% for NTAs. (The shares of within-
location activity outside of the current time block are roughly the same respective magnitudes,
due to the common practice of twelve-hour shifts for drivers.) We consider the impact of these
experience measures by gradually layering them into models of driver earnings during hour-R3.
In all models we interact date-by-hour fixed effects with an additional set of indicators for the
pick-up location of the driver’s third fare. To account for the possibility that drivers accumulate
19
less experience in neighborhoods that are more difficult, we include the log of mean earnings for
the NTA-hour of ride 3’s drop-off.
We report in Table 6 our results on general and specific experience. Our first
specification is an adaptation of Table 3’s specification 6. In addition to augmenting the prior
model’s fixed effects and controlling for the current NTA-hour’s mean earnings, we add slt at the
PUMA level. The coefficient value on slt implies that moving from the 25th percentile (0.099) of
slt to the 75th percentile (0.255) would provide a 2.7% increase in the expected earnings of a new
driver during hour-R3. We introduce slbt in specification 2 in order to analyze specific experience
at a finer level. We find that time-specific experience has a substantially greater effect on
earnings than experience in the same neighborhood outside of the current time block, although
this latter effect is also significantly positive. Moving from the 25th percentile (0.034) of slbt to
the 75th percentile (0.126) provides a 2.5% increase in hour-R3 earnings, while the same increase
in experience outside of block b in location l provides a 0.6% increase in earnings. Thus, it
appears that specification 1’s result is largely driven by prior experience that occurred in the
same portion of the day as shift t’s third ride. In specifications 3 and 4 we repeat this analysis
while using (tighter) NTA boundaries to define locations; our results are qualitatively similar to
specifications 1 and 2. When we group together all experience within an NTA, we find that our
coefficient estimate (0.148) implies that moving from the 25th percentile (0.032) to the 75th
percentile (0.098) of slt would provide an increase in earnings of 1%. Relative to the PUMA-
level result, the smaller magnitude of the NTA-level result largely is due to the lower shares of
experience drivers have in NTAs compared to PUMAs; the coefficient estimates for the two
geographies are very similar (0.174 for PUMAs and 0.148 for NTAs). As in specification 2, we
find in specification 4 that the benefit from NTA-specific experience is primarily due to
experience in the same time block. Finally, we note that the impact of overall experience (total
drop-offs) is fairly consistent across specifications, and our point estimates are also quite similar
to Table 3’s specification 6, which included no measures of specific experience.
Specifications 1-4 show some decline in the importance of specific experience as we
move from a relative broad geography (PUMAs) to a narrow one (NTA). This suggests a limit to
the value of highly-localized experience for taxi drivers, at least when we measure productivity
through the full hour of fares following shift t’s third ride. We address these ideas in the
remaining specifications of Table 6. First, we examine splitting the driver’s share of experience
20
between the drop-off’s NTA and the portion of the NTA’s PUMA that is outside of the NTA
itself. We report in specification 5 that the share of experience in the surrounding PUMA has a
significantly larger impact on earnings than does the share in the drop-off’s NTA. This may be
due to drivers’ tendencies to search across NTAs following a drop-off, or the sequence of rides
that occur in the full hour following a drop-off. Drivers are more likely to end a ride and start a
subsequent ride in the same PUMA than the same NTA (66% vs. 50%), so PUMA-level
experience may be especially useful for the full hour of customers. In the remaining
specifications of Table 6 we turn to an alternative measure of productivity – slack time following
ride 3 – to confirm our earlier results on specific experience as well as to focus on an outcome
that is more precisely limited to the driver’s circumstances following ride 3. In specification 6 we
report that within-PUMA experience, both inside and outside of the current time block, has a
significant impact on reducing slack time. A new driver at the 75th percentile of slbt will require
about 1 fewer minute to find his next customer than a driver at the 25th percentile of slbt. We
conclude the analysis by repeating specification 5 while using slack time as a measure of driver
productivity. Again we find that experience within a PUMA but outside of the current NTA
matters more than experience in the drop-off’s NTA. Since this dependent variable focuses on
outcomes that immediately follow ride 3, it appears that drivers benefit more from having
experience in the neighborhoods that surround a drop-off than in the drop-off NTA itself.
5. Discussion and conclusions
We have described learning patterns by New York City yellow taxi drivers while controlling for
a wide variety of potential factors that can confound empirical studies of learning by doing. We
find that economically and statistically significant learning occurs among taxi drivers, but these
effects may be overstated in the absence of controls for task selection or time-specific demand
conditions. In addition to studying overall learning, we provide evidence on driver performance
across measurably heterogeneous situations. We find that the performance of new drivers lags
that of experienced drivers by a greater margin in hard situations than easy ones, but
performance gaps are eliminated relatively quickly in hard circumstances. Finally, we document
the importance of specific versus general experience. We find that overall and neighborhood-
specific experience are both important in improvements to driver productivity.
21
We present our results with two main contributions relative to the prior literature. First,
the taxi drivers in our study are able to choose their own production strategies, and through fare-
based compensation they are rewarded directly for strategies that are more successful. This is
novel relative to the manufacturing settings that are commonly studied in the LBD literature,
where workers’ actions may be relatively constrained by production line conventions and rigid
pay practices. Second, the TLC’s fare-level “big data” allow us the opportunity to examine an
unusually large number of economic agents moving though a precisely documented production
environment. The large population of agents allows us benchmark new drivers’ production
relative to experienced drivers’ contemporaneous efforts, which is often impossible in prior
studies that examine single firms or one-time production processes. The location- and time-
specific data on each fare allow us to quantify which production tasks are relatively difficult, as
well as track neighborhood-specific experience to compare with a driver’s total experience in the
market.
Our data and analysis leave open several important questions about learning in general
and performance by NYC taxi drivers specifically. First, we are unable to observe how drivers
may selectively refuse fares to certain neighborhoods, or attempt to charge a price above the
appropriate meter fare. While either type of activity is prohibited by the TLC and can result in
the loss of a hack license, we cannot gauge the frequency of this activity in the market and
whether it is correlated with driver experience. Second, the absence of driver-characteristic data
prevents us from describing which types of drivers learn most quickly, and whether learning is
affected by the driver’s social circle or the organizational arrangements of the medallion owner.
Third, our reduced-form analysis prevents us from studying the precise mechanisms of driver
learning or their welfare benefits. Further work in this area could include descriptions of the
strategies drivers choose for finding new customers, how drivers update their information on
earning opportunities as they gain experience, and the welfare value of improvements to drivers’
information. Finally, additional analysis is needed to assess the validity of our empirical
arguments (on selection, etc.) and results outside of the NYC taxi market.
22
References
Clement, Michael B., Lisa Koonce, and Thomas J. Lopez. 2007. “The Roles of Task-specific Forecasting Experience and Innate Ability in Understanding Analyst Forecasting Performance.” Journal of Accounting and Economics 44 (3): 378–398.
Gathman, Christina and Uta Schonberg. 2010. “How General is Human Capital? A Task-Based Approach.” Journal of Labor Economics 28 (1): 1-49.
Hatch, Nile and David Mowry. 1998. “Process Innovation and Learning by Doing in Semiconductor Manufacturing.” Management Science 44 (11): 1461-1477.
Hendel, Igal and Yossi Spiegel. 2014. “Small Steps for Workers, a Giant Leap for Productivity.” American Economic Journal: Applied Economics 6 (1): 73-90.
Jovanovic, Boyan, and Yaw Nyarko. 1995. “A Bayesian Learning Model Fitted to a Variety of Empirical Learning Curves.” Brookings Papers on Economic Activity. Microeconomics 1995: 247-305.
Kellogg, R. 2011. “Learning by Drilling: Interfirm Learning and Relationship Persistence in the Texas Oilpatch.” The Quarterly Journal of Economics 126 (4): 1961–2004.
Levitt, Steven D., John List, and Chad Syverson. 2013. “Toward an Understanding of Learning by Doing: Evidence from an Automobile Assembly Plant.” Journal of Political Economy 121 (4): 643–681.
Ost, Ben. 2013. “How Do Teachers Improve? The Relative Importance of Specific and General Human Capital.” American Economic Journal: Applied Economics, forthcoming.
Pisano, GP. 2001. “Organizational Differences in Rates of Learning: Evidence from the Adoption of Minimally Invasive Cardiac Surgery.” Management Science 47 (6): 752–768
Schaller, B. 2006. “The New York City Taxicab Fact Book.” Schaller Consulting, Mars (March).
Schneider, Henry S. 2010. “Moral Hazard in Leasing Contracts: Evidence from the New York City Taxi Industry.” Journal of Law and Economics 53 (4): 783-805.
Shaw, Kathryn and Edward Lazear. 2008. “Tenure and Output.” Labour Economics 15: 710-724.
Rockart, Scott and Nilanjana Dutt. 2013. “The Rate and Potential of Capability Development Trajectories.” Strategic Management Journal, forthcoming.
Thompson, Peter. 2001. “How Much Did the Liberty Shipbuilders Learn? New Evidence for an Old Case Study.” Journal of Political Economy 109 (1): 103–137.
23
———. 2010. “Learning by Doing.” Chapter 11 in the Handbook of Economics of Technical Change, edited by Bronwyn H. Hall and Nathan Rosenberg, North-Holland/Elsevier.
———. 2012. “The Relationship Between Unit Cost and Cumulative Quantity and the Evidence for Organizational Learning-by-Doing.” The Journal of Economic Perspectives 26 (3): 203–224.
24
Figure 1: Proportion of New Drivers & Total Rides by Hour of Day
Panel A: Proportion of Pick-Ups by New Driver, by Each Hour of Day0
.1.2
.3.4
Pro
port
ion
of P
ick−
Ups
by
New
Driv
ers
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23Hour of the Day
Tuesday
0.1
.2.3
.4P
ropo
rtio
n of
Pic
k−U
ps b
y N
ew D
river
s
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23Hour of the Day
Saturday
Panel B: Total Number of Pick-Ups, by Each Hour of Day
010
0000
2000
0030
0000
4000
00N
umbe
r of
Pic
k−U
ps
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23Hour of the Day
Tuesday0
1000
0020
0000
3000
00N
umbe
r of
Pic
k−U
ps
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23Hour of the Day
Saturday
Notes : Sample corresponds to all rides by the full sample of New Drivers (March 1 - December 31) and the20% sample of Experienced Drivers (January 1 - December 31, 2009). Each bar corresponds to one hourof each day of the week (i.e. 52 hours across the year) Panel A: For each hour of the day, we compute theproportion of pick-ups in that hour that were by new drivers. Panel B: Each column reports the total countof pick-ups within each hour of the day, for a given day of the week). These counts are computed in thesample of all new drivers & 20% of all experienced drivers.
Figure 2: NTA/DOW/HH-Level Difficulty Measures by Location of Drop-Off
1234No data
Tuesday 8am
1234No data
Tuesday 3pm
1234
Tuesday 10pm
1234No data
Saturday 10pm
Notes : Sample corresponds to 80% of Experienced Drivers (January 1 - December 31, 2009). Difficulty isdefined by a NTA/Day-of-the-Week/Hour-of-the-Day specific measure of the total fare earnings in the 60minutes following a drop-off. This variables is split into quartiles, weighted by the number of rides. Thefourth quartile corresponds to the “easiest” quartile (i.e. highest average fare). 8am corresponds to the hourfalling between 8am and 9am (and similarly for the other reported hours). The maps exclude NTAs 199,299, 399, 498, and 499 which straddle multiple PUMAs (for the empirical analysis, these 5 NTAs are brokeninto 30 distinct NTAs that are nested within PUMAs).
Figure 3: Evaluation of parameteric (log) vs non-parametric (dummy) approaches
−.0
6−
.04
−.0
20
.02
Coe
ffici
ent (
Dep
Var
: Log
(Hou
r−R
3 F
ares
)
*1_10
*11_20
*21_30
*31_40
*41_50
*51_60
*61_70
*71_80
*81_90
*91_100*101_120*121_140*141_160*161_180*181_200
New Driver: # of Shifts Since Starting
Dummies log(E)
Notes : The red line corresponds to Table 2, Specification 3, and uses a natural log specification for g(E)(extrapolating the intercept + log coefficient to shifts 5, 15, 25,...,170,190). The blue line corresponds toAppendix Table 1, Specification 3, and uses a collection of dummy variables for g(E) (minus the coefficienton “Experienced”). “Hour-R3 Fares” is the sum of all fares within the first 60 minutes following the 3rddrop-off of a shift (including fractional fares for those rides that straddle the end of the hour). Samplecorresponds to all rides by the full sample of New Drivers (March 1 - December 31) and the 20% sample ofExperienced Drivers (January 1 - December 31, 2009).
Table 1: Summary Statistics
Experienced Drivers New Drivers New Drivers, Shift≤20
N=894,178 N=456,407 N=61,519Mean Median SD Mean Median SD Mean Median SD
Panel A: Shift-Level Summary Statistics
Avg. Hourly Fare Earnings in Shift 26.04 26.13 5.58 25.80 25.85 5.31 24.70 24.70 5.30Fare Earnings in Hour-R3 25.30 26.38 11.50 25.21 26.14 11.57 23.97 24.50 12.06Percent of Hour-R3 Spent Idle 48.73 45.39 22.84 49.33 47.33 23.04 52.66 50.97 22.98Slack Time (Min.) after Ride 3 13.31 6.55 24.77 13.44 6.55 23.61 14.66 6.55 23.67Miles Travelled in Hour-R3 6.19 5.93 3.55 6.12 5.85 3.54 5.81 5.50 3.59Trips in Hour-R3 2.83 3.00 1.39 2.82 3.00 1.38 2.71 2.88 1.36Avg. Ride Duration (Min.) in Hour-R3 12.31 10.74 7.82 12.20 10.61 7.61 11.87 10.22 7.94Speed Per Ride (MPH) in Hour-R3 12.71 11.36 5.81 12.71 11.35 5.90 12.80 11.44 5.90
Panel B: Driver-Level Summary StatisticsExperienced Drivers New Drivers
N=4,366 N=3,298
Driver Shifts 204.80 213.00 56.16 138.39 132.00 60.19Share of Days Active 0.70 0.73 0.17 0.67 0.70 0.20Number of Cars Per Driver 9.90 2.00 17.93 17.88 7.00 20.92Driver in Single Car? 0.46 0.00 0.50 0.17 0.00 0.38
Notes : Sample contains shifts between March 1, 2009 and December 31, 2009 (Full Sample of New Drivers, 20% Sample of Experienced Drivers). Ncorresponds to the main fare variable (Fare earnings in hour 2); some counts are smaller due to missing data.
Table 2: Driver Experience and Productivity Improvements
Specification: 1 2 3 4Dependent Variable: log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares)Includes Experienced Drivers? N N Y YDriver Fixed Effects?: N N N YTime Fixed Effects: None Date X Hour Date X Hour Date,Hour(no
interaction)
New Driver -0.0811∗∗∗
(0.00536)
New×log(Shift) 0.0361∗∗∗ 0.0269∗∗∗ 0.0187∗∗∗ 0.0160∗∗∗
(0.00158) (0.00198) (0.00123) (0.00110)
N 435,103 435,103 1,281,286 1,281,286r2 0.006 0.201 0.166 0.179∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01
Notes : Robust standard errors clustered at the driver level, in parentheses. “Hour-R3 Fares” is the sum of all fares within the first 60 minutesfollowing the 3rd drop-off of a shift (including fractional fares for those rides that straddle the end of the hour). Sample contains shifts between March1, 2009 and December 31, 2009 (Full Sample of New Drivers, 20% Sample of Experienced Drivers). All models include controls for the driver’s shiftlength (in hours) and shift length squared.
Table 3: Robustness of Productivity Improvements
Specification: 1 2 3 4 5 6Dependent Variable: log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R1 Fares) log(Hour-R8 Fares) log(Avg Hr Fares) log(Hour-R3 Fares)Fixed effects: Date X Hour Date X Hour X Date X Hour Date X Hour Date X Hour Date X Hour
Starting NTA
New Driver -0.0787∗∗∗ -0.0777∗∗∗ -0.1127∗∗∗ -0.0571∗∗∗ -0.0531∗∗∗ -0.1802∗∗∗
(0.00536) (0.00549) (0.00610) (0.00530) (0.00451) (0.00801)
New×log(Shift) 0.0183∗∗∗ 0.0178∗∗∗ 0.0250∗∗∗ 0.0140∗∗∗ 0.0137∗∗∗
(0.00124) (0.00127) (0.00139) (0.00125) (0.00105)
New×log(Drop-offs) 0.0247∗∗∗
(0.00110)
log(Starting NTA Hr Earn) 0.3168∗∗∗
(0.01153)
N 1,260,232 1,260,384 1,261,516 1,238,539 1,350,558 1,281,286r2 0.168 0.307 0.169 0.121 0.194 0.167∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01
Notes : Robust standard errors clustered at the driver level, in parentheses. “Hour-R3 Fares” is the sum of all fares within the first 60 minutesfollowing the 3rd drop-off of a shift (including fractional fares for those rides that straddle the end of the hour). “Hour-R1” & “Hour-R8” are the 60minutes that follow the 1st & 8th drop-offs respectively. “Avg Hr Fares” is the sum of all fares within a shift, divided by the shift length (in hours).“Starting NTA” denotes the pick-up NTA for the 3rd ride of the shift. “Starting NTA Hr Earn” is a difficulty measure for the pick-up location/time ofthe 3rd ride of the shift – specifically, it is the average fare earnings (in the 80% sample of experienced drivers) in the 60 minutes following a drop-offin that NTA/Day-of-the-Week/Hour-of-the-Day. All models include controls for the driver’s shift length (in hours) and shift length squared. Date XHour fixed effects in columns 1, 2, & 6 correspond to the Date X Hour of the ride 3 drop-off, and to the ride 1 drop-off for column 3, ride 8 drop-offfor column 4, and ride 1 pick-up for column 5. Sample contains shifts between March 1, 2009 and December 31, 2009 (Full Sample of New Drivers,20% Sample of Experienced Drivers).
Table 4: Changes to Activities and Outcomes
Specification: 1 2 3 4 5 6Dependent Variable: Share of Hour-R3 log(Slack Time log(Hour-R3 Miles) log(Hour-R3 Avg log(Hour-R3 Avg log(Hour-R3 Avg
Spent Idle Between Fares) Miles Per Ride) Speed) Speed)
New Driver 1.6040∗∗∗ 0.0988∗∗∗ -0.1398∗∗∗ -0.0640∗∗∗ -0.0912∗∗∗ -0.0646∗∗∗
(0.19391) (0.00831) (0.00826) (0.00750) (0.00645) (0.00450)
New×log(Shift) -0.4111∗∗∗ -0.0190∗∗∗ 0.0287∗∗∗ 0.0118∗∗∗ 0.0166∗∗∗ 0.0117∗∗∗
(0.04635) (0.00197) (0.00217) (0.00203) (0.00180) (0.00119)
log(Hour-R3 Avg Miles Per 0.4154∗∗∗
Ride) (0.00456)
N 1,276,952 1,292,360 1,286,127 1,286,127 1,283,152 1,283,152r2 0.312 0.131 0.078 0.031 0.211 0.593∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01
Notes : Robust standard errors clustered at the driver level, in parentheses. “Share of Hour-R3 Spent Idle” is the share of the first 60 minutes followingthe 3rd drop-off of a shift spent without a fare. “Slack Time Between Fares” is the number of minutes between the 3rd drop-off and the 4th pick-up.“Hour-R3 Miles” is the sum of all miles driven within the first 60 minutes following the 3rd drop-off of a shift (including fractional amounts for thoserides that straddle the end of the hour). “Hour-R3 Avg Miles Per Ride” is “Hour-R3 Miles” divided by the number of rides within the hour. “Hour-R3Avg Speed” is “Hour-R3 Miles” divided by the number of minutes spent with a fare in that hour. All models include controls for the driver’s shiftlength (in hours) and shift length squared. Sample contains shifts between March 1, 2009 and December 31, 2009 (Full Sample of New Drivers, 20%Sample of Experienced Drivers).
Table 5: Productivity across Different Difficulties
Specification: 1 2 3Dependent Variable: log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares)
New× Q1 Location -0.1388∗∗∗ -0.1363∗∗∗ -0.1370∗∗∗
(0.01138) (0.01139) (0.01207)
New× Q2 Location -0.0950∗∗∗ -0.0938∗∗∗ -0.0962∗∗∗
(0.00871) (0.00870) (0.00903)
New× Q3 Location -0.0791∗∗∗ -0.0782∗∗∗ -0.0801∗∗∗
(0.00692) (0.00691) (0.00720)
New× Q4 Location -0.0146∗ -0.0140∗ -0.0151∗
(0.00757) (0.00757) (0.00789)
New× Q1 × log(Shifts) 0.0288∗∗∗ 0.0283∗∗∗ 0.0279∗∗∗
(0.00264) (0.00264) (0.00279)
New× Q2 × log(Shifts) 0.0208∗∗∗ 0.0206∗∗∗ 0.0206∗∗∗
(0.00200) (0.00200) (0.00207)
New× Q3 × log(Shifts) 0.0183∗∗∗ 0.0182∗∗∗ 0.0184∗∗∗
(0.00159) (0.00159) (0.00166)
New× Q4 × log(Shifts) 0.0069∗∗∗ 0.0068∗∗∗ 0.0068∗∗∗
(0.00181) (0.00181) (0.00188)
Q1 Location -0.2639∗∗∗ -0.2534∗∗∗ -0.2791∗∗∗
(0.00475) (0.00481) (0.00511)
Q2 Location -0.1088∗∗∗ -0.1029∗∗∗ -0.1065∗∗∗
(0.00350) (0.00354) (0.00380)
Q3 Location -0.0570∗∗∗ -0.0532∗∗∗ -0.0573∗∗∗
(0.00270) (0.00272) (0.00296)
log(Starting NTA Hr Earn) 0.2153∗∗∗
(0.01168)
N 1,256,715 1,256,715 1,256,715r2 0.175 0.175 0.314∗p < 0.10, ∗∗
p < 0.05, ∗∗∗p < 0.01
Notes : Robust standard errors clustered at the driver level, in parentheses. “Hour-R3 Fares” is the sum ofall fares within the first 60 minutes following the 3rd drop-off of a shift (including fractional fares for thoserides that straddle the end of the hour). Difficulty is defined by a NTA/Day-of-the-Week/Hour-of-the-Dayspecific measure of the total fare earnings in the 60 minutes following a drop-off. This variables is split intoquartiles, weighted by the number of rides. The fourth quartile corresponds to the “easiest” quartile (i.e.highest average fare). “Starting NTA Hr Earn” is a difficulty measure for the pick-up location/time of 3rdride of the shift – specifically, it is the average fare earnings (in the 80% sample of experienced drivers)in the 60 minutes following a drop-off in that NTA/Day-of-the-Week/Hour-of-the-Day. All models includecontrols for the driver’s shift length (in hours) and shift length squared. Specifications 1 & 2 include fixedeffects for Date X Hour, while specification 3 includes fixed effects for Date X Hour X Starting NTA. Samplecontains shifts between March 1, 2009 and December 31, 2009 (Full Sample of New Drivers, 20% Sample ofExperienced Drivers), and only those shifts for which a difficulty measure is recorded for both the pick-upand drop-off of Ride 3.
Table 6: Location-specific Experience and Productivity
Specification: 1 2 3 4 5 6 7Dependent Variable: log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares) Slack Time (Min.) Slack Time (Min.)
After Ride 3 After Ride 3
New Driver -0.2200∗∗∗ -0.2323∗∗∗ -0.1997∗∗∗ -0.2113∗∗∗ -0.2140∗∗∗ 5.3441∗∗∗ 4.9936∗∗∗
(0.00849) (0.00882) (0.00849) (0.00899) (0.00854) (0.15803) (0.15257)
New×log(Drop-offs) 0.0258∗∗∗ 0.0277∗∗∗ 0.0257∗∗∗ 0.0275∗∗∗ 0.0246∗∗∗ -0.4506∗∗∗ -0.3781∗∗∗
(0.00112) (0.00119) (0.00113) (0.00122) (0.00113) (0.02030) (0.01969)
New×Share of rides in 0.1739∗∗∗
PUMA (0.00920)
New×Share of rides in 0.0700∗∗∗ -10.5398∗∗∗
PUMA outside 6HH (0.01948) (0.36114)
New×Share of rides in 0.2738∗∗∗ -11.2401∗∗∗
PUMA within 6HH (0.01697) (0.33908)
New×Share of rides in 0.2793∗∗∗ -16.3009∗∗∗
PUMA outside NTA (0.01431) (0.34517)
New×Share of rides in 0.1482∗∗∗ 0.0853∗∗∗ -6.3399∗∗∗
NTA (0.01357) (0.01354) (0.29893)
New×Share of rides in -0.0022NTA outside 6HH (0.03214)
New×Share of rides in 0.2615∗∗∗
NTA within 6HH (0.02489)
log(Ending NTA Hr Earn) 0.9154∗∗∗ 0.9154∗∗∗ 0.9315∗∗∗ 0.9320∗∗∗ 0.9102∗∗∗ -35.1837∗∗∗ -34.9228∗∗∗
(0.01423) (0.01423) (0.01415) (0.01415) (0.01427) (0.37720) (0.37787)
N 1,256,850 1,256,850 1,256,715 1,256,850 1,256,850 1,267,726 1,267,726r2 0.316 0.317 0.316 0.316 0.317 0.304 0.304∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01
Notes : Robust standard errors clustered at the driver level, in parentheses. “Hour-R3 Fares” is the sum of all fares within the first 60 minutesfollowing the 3rd drop-off of a shift (including fractional fares for those rides that straddle the end of the hour). “Ending NTA Hr Earn” is a difficultymeasure for the pick-up location/time of 3rd ride of the shift – specifically, it is the average fare earnings (in the 80% sample of experienced drivers)in the 60 minutes following a drop-off in that NTA/Day-of-the-Week/Hour-of-the-Day. All models include fixed effects for Date X Hour X StartingNTA, as well as controls for the driver’s shift length (in hours) and shift length squared. Sample contains shifts between March 1, 2009 and December31, 2009 (Full Sample of New Drivers, 20% Sample of Experienced Drivers). “6HH” denotes the 6-Hour block (11pm-5am, 5am-11am, 11am-5pm,5pm-11pm) combination (e.g. Friday, 5am-10am).
A Appendix
Table A.1: Tests of Randomization Assumption (Transition between Pick-Up & Drop-Off)Specification: 1 2 3 4 5 6Dependent Variable: NTA/DOW/HH Difficulty for Drop-off (Ride 3) PUMA/DOW/HH Difficulty for Drop-off (Ride 3)
Mean Mean Mean Log(Mean) Mean Log(Mean)
Fixed effects: None Date X Hour Date X Hour X Date X Hour X Date X Hour X Date X Hour XStarting NTA Starting NTA Starting NTA Starting NTA
New Driver -0.8811∗∗∗ -0.0507∗∗∗ 0.0045 0.0002 -0.0043 0.0002(0.08676) (0.01193) (0.01041) (0.00042) (0.01009) (0.00042)
New×log(Shift) 0.1732∗∗∗ 0.0103∗∗∗ -0.0014 -0.0001 -0.0003 -0.0001(0.01975) (0.00284) (0.00244) (0.00010) (0.00235) (0.00010)
Constant 26.9845∗∗∗ 26.9301∗∗∗ 26.9276∗∗∗ 3.2831∗∗∗ 26.7856∗∗∗ 3.2831∗∗∗
(0.03664) (0.00279) (0.00182) (0.00007) (0.00197) (0.00007)
N 1,320,911 1,320,911 1,320,911 1,320,911 1,325,465 1,321,066r2 0.001 0.831 0.876 0.874 0.895 0.874∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01
Notes : Robust standard errors clustered at the driver level, in parentheses. Sample contains shifts betweenMarch 1, 2009 and December 31, 2009 (Full Sample of New Drivers, 20% Sample of Experienced Drivers),and only those shifts for which a difficulty measure is recorded for both the pick-up and drop-off of Ride 3.
Table A.2: Baseline Analysis with Alternative Driver Samples
Specification: 1 2 3 4Dependent Variable: log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares)Includes Experienced Drivers? N N Y YDriver Fixed Effects?: N N N YTime Fixed Effects: None Date X Hour Date X Hour Date,Hour(no
interaction)Panel A: Sample: Drivers Observed in Every Month (Mar-Dec)
New Driver -0.0512∗∗∗
(0.01037)
New×log(Shift) 0.0279∗∗∗ 0.0327∗∗∗ 0.0126∗∗∗ 0.0134∗∗∗
(0.00302) (0.00953) (0.00216) (0.00203)
N 108,919 108,919 792,321 792,321r2 0.004 0.239 0.161 0.173
Panel B: Sample: Drivers Observed in Every Month After Entry
New Driver -0.0968∗∗∗
(0.00625)
New×log(Shift) 0.0416∗∗∗ 0.0283∗∗∗ 0.0213∗∗∗ 0.0188∗∗∗
(0.00185) (0.00233) (0.00143) (0.00130)
N 320,476 320,476 1,017,913 1,017,913r2 0.007 0.209 0.168 0.177∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01
Notes : Robust standard errors clustered at the driver level, in parentheses. “Hour-R3 Fares” is the sum ofall fares within the first 60 minutes following the 3rd drop-off of a shift (including fractional fares for thoserides that straddle the end of the hour). All models include controls for the driver’s shift length (in hours)and shift length squared. Sample contains shifts between March 1, 2009 and December 31, 2009 (Full Sampleof New Drivers, 20% Sample of Experienced Drivers).
Table A.3: Non-parametric Approach to Driver Experience
Specification: 1 2 3Dependent Variable: log(Hour-R3 Fares) log(Hour-R3 Fares) log(Hour-R3 Fares)Includes Experienced Drivers? N N YIncluded fixed effects: None Date X Hour Date X Hour
New×Shift 1-10 -0.1398∗∗∗ -0.1025∗∗∗ -0.0733∗∗∗
(0.00889) (0.00929) (0.00659)
New×Shift 11-20 -0.1031∗∗∗ -0.0740∗∗∗ -0.0474∗∗∗
(0.00863) (0.00858) (0.00629)
New×Shift 21-30 -0.0900∗∗∗ -0.0662∗∗∗ -0.0435∗∗∗
(0.00863) (0.00824) (0.00630)
New×Shift 31-40 -0.0736∗∗∗ -0.0520∗∗∗ -0.0327∗∗∗
(0.00853) (0.00783) (0.00615)
New×Shift 41-50 -0.0679∗∗∗ -0.0473∗∗∗ -0.0307∗∗∗
(0.00851) (0.00753) (0.00607)
New×Shift 51-60 -0.0658∗∗∗ -0.0415∗∗∗ -0.0281∗∗∗
(0.00852) (0.00731) (0.00606)
New×Shift 61-70 -0.0623∗∗∗ -0.0393∗∗∗ -0.0264∗∗∗
(0.00853) (0.00725) (0.00614)
New×Shift 71-80 -0.0615∗∗∗ -0.0345∗∗∗ -0.0236∗∗∗
(0.00859) (0.00712) (0.00614)
New×Shift 81-90 -0.0572∗∗∗ -0.0273∗∗∗ -0.0171∗∗∗
(0.00848) (0.00690) (0.00602)
New×Shift 91-100 -0.0432∗∗∗ -0.0173∗∗ -0.0082(0.00859) (0.00686) (0.00612)
New×Shift 101-120 -0.0413∗∗∗ -0.0177∗∗∗ -0.0104∗
(0.00815) (0.00630) (0.00565)
New×Shift 121-140 -0.0317∗∗∗ -0.0153∗∗ -0.0092(0.00832) (0.00611) (0.00565)
New×Shift 141-160 -0.0141∗ -0.0116∗∗ -0.0071(0.00823) (0.00593) (0.00559)
New×Shift 161-180 -0.0018 -0.0120∗∗ -0.0092∗
(0.00787) (0.00562) (0.00542)
New×Shift 181-200 0.0123 -0.0046 -0.0033(0.00750) (0.00583) (0.00569)
Experienced -0.0202∗∗∗
(0.00587)
N 435,103 435,103 1,281,286r2 0.006 0.201 0.166∗p < 0.10, ∗∗
p < 0.05, ∗∗∗p < 0.01
Notes : Robust standard errors clustered at the driver level, in parentheses. “Hour-R3 Fares” is the sum ofall fares within the first 60 minutes following the 3rd drop-off of a shift (including fractional fares for thoserides that straddle the end of the hour). All models include controls for the driver’s shift length (in hours)and shift length squared. Sample contains shifts between March 1, 2009 and December 31, 2009 (Full Sampleof New Drivers, 20% Sample of Experienced Drivers).