Qualifying Times for the Boston Marathon

This article was downloaded by: [Ondokuz Mayis Universitesine] On: 12 November 2014, At: 08:14 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK CHANCE Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/ucha20 Qualifying Times for the Boston Marathon Richard L. Smith a , Scott Powers b & Jessi Cisewski c a The University of North Carolina at Chapel Hill and director of the Statistical and Applied Mathematical Sciences Institute b Stanford University c Carnegie Mellon University Published online: 16 Sep 2014. To cite this article: Richard L. Smith, Scott Powers & Jessi Cisewski (2014) Qualifying Times for the Boston Marathon, CHANCE, 27:3, 25-33 To link to this article: http://dx.doi.org/10.1080/09332480.2014.965628 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. 
Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions



Qualifying Times for the Boston Marathon

Richard L. Smith, Scott Powers, and Jessi Cisewski

Table 1—Boston Marathon Qualifying Times (up to 2012)

Patriots’ Day of 1897 marked the first of what is now the oldest annual marathon in the world: the Boston Marathon. The Boston Athletic Association (BAA) operates this marathon in which runners from around the world strive to participate. The Boston Marathon is not only historically appealing for runners, but is also considered a grueling course, topped off with “Heartbreak Hill” after mile 20 of 26.2. An extra twist to the allure of the Boston Marathon is that runners have to satisfy the BAA qualifying time standards in a certified marathon within about a year and a half of the desired Boston Marathon. Qualifying standards for the Boston Marathon were first established in the 1970s, when they were viewed as a means of controlling the size of the field while continuing to allow established marathon runners to participate. They have since been revised a number of times.

Table 1 contains the qualifying times used for the 2011 Boston Marathon (QT1), which had been in place since 2003. An additional 59 seconds is accepted for each standard. For instance, a man aged 18–34 could have run up to 3:10:59 and still been accepted into the race. Also, while race organizers may publish both “gun” and “chip” times, the chip time (representing net time between crossing the start and finish lines) is accepted as the basis of Boston qualifying times. The chip time is based on a small computer chip attached to the runner’s vest, which is recorded by sensors at the start, the finish, and several intermediate points.
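The qualifying rule described above can be sketched in a few lines of Python. This is only an illustration: the standards are copied from Table 1, while the names `STANDARDS_MIN`, `standard_seconds`, and `qualifies` are hypothetical and do not come from the BAA or from the authors. The age passed in is assumed to be the runner's age on the day of the Boston Marathon.

```python
# Qualifying standards from Table 1, in whole minutes, keyed by the
# lower bound of each age group (valid up to the 2012 race).
STANDARDS_MIN = {
    "M": {18: 190, 35: 195, 40: 200, 45: 210, 50: 215, 55: 225,
          60: 240, 65: 255, 70: 270, 75: 285, 80: 300},
    "F": {18: 220, 35: 225, 40: 230, 45: 240, 50: 245, 55: 255,
          60: 270, 65: 285, 70: 300, 75: 315, 80: 330},
}
CUSHION_SEC = 59  # extra seconds accepted on top of each standard


def standard_seconds(sex, age):
    """Return the Table 1 standard in seconds for a runner's sex and age."""
    if age < 18:
        raise ValueError("Table 1 starts at age 18")
    # Pick the highest age-group lower bound that does not exceed the age.
    lower = max(a for a in STANDARDS_MIN[sex] if a <= age)
    return STANDARDS_MIN[sex][lower] * 60


def qualifies(sex, age, hours, minutes, seconds):
    """True if a chip time meets the standard plus the 59-second cushion."""
    chip = hours * 3600 + minutes * 60 + seconds
    return chip <= standard_seconds(sex, age) + CUSHION_SEC
```

Under these assumptions, `qualifies("M", 30, 3, 10, 59)` is true and `qualifies("M", 30, 3, 11, 0)` is false, matching the 3:10:59 example in the text.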

The 2011 Boston Marathon registration opened October 18, 2010, and all of the 20,000 spots for qualifiers were filled in about eight hours. This led to speculation that the BAA would impose substantially tighter qualifying times for future years. In addition, The Wall Street Journal (October 13, 2010) published a commentary suggesting the current 30-minute disparity between qualifying times for men and women was too generous to the women.

In light of these speculations, we contacted the BAA to express our interest in assisting with this process and proposed an analysis for modifying the Boston qualifying times. After initial consultations with Jack Fleming and Mike Pieroni of the BAA, we agreed to assess the impact of any QT changes of 2, 5, 10, 15, 20, and 30 minutes. The main objective of our study was to estimate the total number of potential entrants, for each sex and age group, under each of these changes.

As described in the following account, we did this in stages, starting with large races that sent many qualifiers to the Boston Marathon and then extending the calculation via a random sampling exercise from the remaining races in their database.

Data

To verify qualifying claims, the BAA requests race results from all official marathon events in the United States and Canada, as well as major overseas events. These have been compiled in the form of Excel spreadsheets that include each runner’s name and home city, sex, age or date of birth, and finish time in the race (both gun and chip times; we used the chip time whenever possible). For the 2010 Boston Marathon, which at the time of our study was the most recent race with complete data available, the BAA had results from 641 “feeder races,” that is, races from which at least one runner had qualified for Boston. The vast majority of these races were in the United States and Canada, though a few were outside the official qualification window (mostly runners who had entered an earlier Boston Marathon and claimed a medical deferment). While the United States and Canada data are essentially complete, it would be extremely difficult to estimate all possible qualifiers from the rest of the world. Therefore, the decision was made to exclude races outside the United States and Canada from our analysis; by our own estimate, only about 4% of runners in the 2010 race qualified from outside those two countries.

Editor’s Note: This article is based on the talk, “Completing the Results of the 2013 Boston Marathon,” presented by Richard Smith at the 2013 New England Symposium on Statistics in Sports, held September 21, 2013, at the Harvard University Science Center. The talk is available to view at www.youtube.com/watch?v=cRUfrkbXUNQ.

Age            Men           Women

18–34          3hrs 10min    3hrs 40min
35–39          3hrs 15min    3hrs 45min
40–44          3hrs 20min    3hrs 50min
45–49          3hrs 30min    4hrs 00min
50–54          3hrs 35min    4hrs 05min
55–59          3hrs 45min    4hrs 15min
60–64          4hrs 00min    4hrs 30min
65–69          4hrs 15min    4hrs 45min
70–74          4hrs 30min    5hrs 00min
75–79          4hrs 45min    5hrs 15min
80 and over    5hrs 00min    5hrs 30min


Table 2—Estimated Totals by Age Group of All Male and Female Marathon Runners in All Races and Numbers of Qualifiers Under 2011 Qualifying Standards

An additional resource we used was the extensive file of marathon statistics posted at www.marathonguide.com, which includes nearly all marathons in the USA and Canada, but only a very incomplete listing of marathons in the rest of the world.

Other limitations of our study include the following:

• Not all race results make clear the distinction between marathon runners and other competitors (e.g., wheelchair racers). In particular, some of the reported times by marathon runners are under two hours. Since the current world record for a marathon is 2:03:23, set by Wilson Kipsang of Kenya in the 2013 Berlin Marathon, such reported performances cannot be correct. In cases where it was possible to distinguish, we removed all competitors who were not identified as marathon runners. Where we could not make that distinction, we used the results as given.

• In cases where exact date of birth was recorded, we used that information to compute ages of runners on the day of the Boston Marathon (April 19, 2010). When no birthdates were supplied, we used a statistical estimate of ages. For example, if a 54-year-old man ran a marathon between 3:35 and 3:45 in July 2009, we estimated there was a 0.75 chance he would be 55 before April 19, 2010, and so counted him as 0.75 of a qualifier.

• Our method of counting Boston qualifiers does not distinguish cases where one runner may have run multiple qualifying times. However, in a later section, we propose a method of adjusting the estimate to account for this problem.
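The fractional counting in the age limitation above can be illustrated with a short sketch. The article does not spell out the statistical estimate of ages, so the following assumes, purely for illustration, that birthdays are uniformly distributed over the year; the function name `prob_ages_up` is hypothetical.

```python
from datetime import date

BOSTON_2010 = date(2010, 4, 19)  # race day used for age-group assignment


def prob_ages_up(race_day, boston_day=BOSTON_2010):
    """Chance that a runner has a birthday between race_day and boston_day.

    Assumption (not from the article): birthdays are uniform over a
    365-day year, so the chance is the elapsed fraction of a year.
    """
    days = (boston_day - race_day).days
    return min(max(days / 365.0, 0.0), 1.0)


# A 54-year-old who raced in mid-July 2009 moves into the 55-59 group
# with probability of roughly three quarters under this assumption.
weight = prob_ages_up(date(2009, 7, 19))
```

With a race on July 19, 2009 the elapsed time is 274 days, which gives a weight close to the 0.75 quoted in the text.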

For the initial calculation, the BAA sent us complete results for the 37 top feeder races, that is, the races that sent the most qualifiers to Boston (this includes the Boston Marathon itself).

For the second stage of our analysis, we supplemented that data set in two ways. We asked for complete results from five additional races (Honolulu 2008 and 2009, Disney World 2009 and 2010, Los Angeles 2009) for which the number of Boston qualifiers was low, though the races were large (all well over 10,000). From the remaining feeder races, we removed those outside the United States and Canada, races outside the qualification window for the 2010 race, and all races for which the fraction of Boston qualifiers was less than 5%.

For the sampling exercise (though not for the subsequent estimation of total Boston qualifiers), we also eliminated races with fewer than 500 runners total. This left us with a “population of races” consisting of 237 feeder races that were included in the sampling exercise.

While it would have been computationally feasible to analyze data from all the races, the time taken to prepare and transfer the data sets was a significant constraint; therefore, we agreed with the BAA that they would send us just 25 of those 237 races for our analysis. We selected the 25 by simple random sampling, in addition to the 42 selected earlier.

Age Group    Total Runners           Boston Qualifiers
             M          F            M         F

18–34        126,780    126,861      12,004    12,782
35–39         67,271     48,298       6,630     5,787
40–44         70,890     45,339       8,305     6,594
45–49         60,644     34,101      10,994     6,783
50–54         45,712     22,033       7,541     3,420
55–59         27,609     10,453       4,378     1,533
60–64         15,799      4,686       3,204       649
65–69          6,270      1,584       1,167       192
70–74          2,488        397         560        52
75–79            601        113          84         9
80–99            148         28          21         6
All ages     424,210    293,893      54,888    37,809
Total        718,106 (M and F)       92,695 (M and F)


Calculations

For the 42 races selected (37 top feeders plus the 5 additional races chosen for their large field sizes), an exact count of Boston qualifiers was taken by sex and age group for the current qualifying times and for each of the possible alternatives given above. For the 25 randomly sampled races, the total number of qualifiers was multiplied by 5.59 for men and 5.89 for women. The rationale is as follows. The total number of male runners in the 25 sampled races is 30,768, while the total number of male runners in all 237 races in the population of races is 172,114, with corresponding numbers for women as 19,752 female runners in the 25 sampled races and 116,305 in all 237 races. Thus, if we multiply the number of qualifying men by 5.59 = 172,114/30,768 and qualifying women by 5.89 = 116,305/19,752, the estimates of the total number of qualifiers are the sums of those from the 42 races and the adjusted random sample of 25 races.
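As a rough sketch of the scaling step just described, the code below reproduces the two multipliers from the runner totals quoted in the text and combines an exact count with a scaled sample count. The function names are hypothetical, and the qualifier counts passed to `estimate_total_qualifiers` in any usage are placeholders, since the article does not report the per-stage qualifier counts.

```python
# Runner totals quoted in the text (taken from www.marathonguide.com tables).
SAMPLED_RUNNERS = {"M": 30_768, "F": 19_752}        # the 25 sampled races
POPULATION_RUNNERS = {"M": 172_114, "F": 116_305}   # all 237 races


def scale_factor(sex):
    """Ratio of population runners to sampled runners for one sex."""
    return POPULATION_RUNNERS[sex] / SAMPLED_RUNNERS[sex]


def estimate_total_qualifiers(sex, exact_count_42, sampled_count_25):
    """Exact count from the 42 races plus the scaled-up sample count."""
    return exact_count_42 + scale_factor(sex) * sampled_count_25
```

Here `scale_factor("M")` rounds to 5.59 and `scale_factor("F")` to 5.89, the multipliers given above.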

There are a couple of caveats to this procedure. In particular, the counts of total runners (30,768; 172,114; 19,752; 116,305) are based on the tables published at www.marathonguide.com. In many cases, the totals on there do not correspond exactly to those in the official results held by the BAA. Nevertheless, we believe this is a fairly minor source of error. Regardless, it needs to be emphasized that the 25 randomly sampled races are a sample of the total number of races, and therefore, there is sampling error associated with the results. The principle is analogous to forecasting the result of an election: Polls taken before or during the election are based on a sample of voters, and there is always some margin of error associated with the sample.

Results

The total number of finishers in all 37 races broken down by age group is displayed in Table 2. Combining the 42 races for which we had complete results with the estimates for the rest of the feeder races, we display in Table 2 our estimates for the total number of marathon runners and the total number of qualifiers for the 2010 race.

This exercise was then repeated for a number of adjustments to the qualifying times, with QT1 representing the qualifying times as of 2010 and QT1-2, QT1-5 through QT1-30 representing adjustments of the standards by 2, 5, 10, 15, 20, and 30 minutes. The results are expressed in Table 3 as percentages of total runners who are Boston qualifiers in each age and sex group. Total numbers of qualifiers are easily calculated by reversing this process. For example, for the M18–34 age group under QT1-10, we estimate 5.3% of 126,780 runners, or about 6,700.

Overall, there is some discrepancy between the percentages of males and females who qualify for Boston. Females have a higher percentage of qualifiers in the 18–34 age group, but this is reversed in some older age groups. However, the overall discrepancies (between males and females) are small. Within the 18–34 age group, there are more women qualifiers than men, but there are also more women runners in total in that age range. The percentage of women qualifiers is higher than men (10.1 against 9.5), but it’s not a very big discrepancy. If the women’s qualifying time were adjusted by as little as 2 minutes (the men’s remaining the same), the balance would again shift in favor of men (9.5% of men, 8.9% of women). Therefore, any change in the relative M/F qualifying times to correct for the discrepancy would have to be very slight (less than 2 minutes) and hardly worth making. If there were to be a real tightening of the qualifying times (10 minutes or more), the advantage would again shift back to the men.

These results point toward the difference between M18-34 and F18-34 qualifying times remaining at 30 minutes. Among all the age groups under 60, the 45–49 age group (M and F) has the highest percentage of qualifiers. This may be the consequence of the 2003 revision of the qualifying times, when the 45–49 standards were changed from 3:25/3:55 to 3:30/4:00. Maybe something like 3:27/3:57 would be more consistent with the surrounding age groups, if the rest of the standards remained the same. Also, women in the 60–79 age range get a bit of a raw deal: there are far fewer of them (than men in the same age groups), but also a substantially lower percentage of Boston qualifiers.

Table 3—Percentage of Qualifiers for All Races with QT Adjustments

Age Group   QT1          QT1-2        QT1-5        QT1-10       QT1-15       QT1-20       QT1-30
            M     F      M     F      M     F      M     F      M     F      M     F      M     F

18–34       9.5   10.1   8.2   8.9    6.7   7.0    5.3   5.1    3.6   3.5    2.7   2.5    1.4   1.2
35–39       9.9   12.0   8.4   10.6   6.8   8.4    4.8   6.0    3.6   4.3    2.2   3.0    0.9   1.5
40–44       11.7  14.5   10.0  13.0   8.2   10.7   5.9   7.7    4.3   5.4    3.1   3.8    1.0   1.7
45–49       18.1  19.9   15.6  17.9   12.6  14.9   9.3   11.6   6.8   8.2    5.0   6.1    2.4   3.0
50–54       16.5  15.5   14.6  14.1   11.9  12.1   8.4   8.7    5.9   6.3    3.9   4.3    1.7   1.8
55–59       15.9  14.7   14.1  13.3   11.8  11.3   8.6   9.0    6.0   6.3    4.1   4.2    1.7   2.2
60–64       20.3  13.8   18.1  12.4   15.3  11.2   11.5  9.0    8.8   7.5    6.7   5.9    3.5   3.7
65–69       18.6  12.1   17.0  11.7   14.6  9.8    12.1  8.5    10.0  6.7    7.9   5.9    4.5   3.4
70–74       22.5  13.1   20.8  12.6   17.4  12.1   13.6  9.8    11.5  7.6    8.8   6.8    5.2   5.0
75–79       14.0  8.0    13.6  8.0    12.8  7.1    11.5  6.2    9.8   4.4    8.3   3.5    6.2   2.7
80–99       14.2  21.4   13.5  21.4   12.2  21.4   12.2  21.4   11.5  21.4   8.1   21.4   7.4   21.4

Table 4 shows what these results mean for the total number of qualifiers under each of the proposed revisions, aggregated over all age groups. For example, a 5-minute reduction in qualifying times would result in about a 29% reduction in the number of male qualifiers and a slightly smaller (28%) reduction in the number of female qualifiers. This information was valuable to the BAA in determining how much of a reduction in the number of entrants might be expected with different adjustments.
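The percentage reductions quoted above can be recomputed from the published Table 4 totals, as in the sketch below. The function name `percent_reduction` is hypothetical. Because Table 4 is itself rounded, the recomputed figures (roughly 29% for men and between 27% and 28% for women at 5 minutes) may differ slightly in the last digit from those in the text.

```python
# Tightening of the standard in minutes, matching the Table 4 columns.
QT_OFFSETS = [0, 2, 5, 10, 15, 20, 30]

# Estimated qualifiers from Table 4 (QT1, QT1-2, ..., QT1-30).
TABLE4 = {
    "Male": [54886, 47734, 39048, 28907, 20889, 14886, 6808],
    "Female": [37809, 33715, 27426, 20271, 14391, 10300, 4944],
}


def percent_reduction(group, minutes):
    """Percentage drop in qualifiers relative to QT1 for a given tightening."""
    base = TABLE4[group][0]
    tightened = TABLE4[group][QT_OFFSETS.index(minutes)]
    return 100.0 * (1 - tightened / base)
```

Calling `percent_reduction("Male", 10)` or any other tabulated offset gives the corresponding figure for the remaining columns.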

Table 4—Estimated Total Runners Under 2011 Qualifying Times (QT1) and by Various Adjustments of the Standards

          All Runners   QT1      QT1-2    QT1-5    QT1-10   QT1-15   QT1-20   QT1-30

Male      424,210       54,886   47,734   39,048   28,907   20,889   14,886   6,808
Female    293,895       37,809   33,715   27,426   20,271   14,391   10,300   4,944
All       718,105       92,695   81,449   66,474   49,178   35,280   25,186   11,752

Table 5—Number of Multiple Qualifiers from Sample of 115 Runners

Number of Qualifying Races   Number of Runners with That Number of Qualifying Races   Product

1                            54                                                        54
2                            10                                                        20
3                            11                                                        33
4                            4                                                         16
5                            0                                                         0
6                            0                                                         0
7                            0                                                         0
8                            1                                                         8
Total                        80                                                        131

Table 6—95% Confidence Intervals for Qualifiers

Group     Lower Bound   Estimate   Upper Bound

Male      51,500        54,886     59,800
Female    35,300        37,809     41,300
Total     86,800        92,695     101,000

The Problem of Multiple Qualifiers

The results given here do not account for the possibility that the same runner may run multiple qualifying times. This was addressed by another sampling exercise, as follows.

We sampled 115 runners from the results of the 2010 Boston Marathon. For each runner, we used the data at www.marathonguide.com to trace all the qualifying times he or she had run in qualifying races for the 2010 Boston Marathon. For 35 of the runners, we were unable to trace any qualifying time. We suspect most or all of these were either overseas runners who had qualified in overseas races or charity runners who did not qualify at all. Among the 80 runners for whom we could trace a qualifying time, the counts were as in Table 5. In other words, 54 runners had exactly one qualifying time, 10 had two qualifying times, 11 had three qualifying times, etc.

Overall, the 80 runners had run a total of 131 qualifying times, a ratio of 131/80 ≈ 1.6. If this ratio held up over all runners, the implication would be that we should reduce our estimated numbers of qualifying runners by a factor of 1.6. For instance, under current qualifying standards, the estimated total numbers of qualifiers would become 54,886/1.6 = 34,304 men and 37,809/1.6 = 23,631 women, for a total of 57,935.
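The arithmetic of this correction can be traced in a short sketch using the Table 5 counts. The variable names are illustrative only, and the factor is rounded to 1.6 as in the text rather than kept at the unrounded 131/80.

```python
# Table 5: number of qualifying races -> number of sampled runners.
TABLE5 = {1: 54, 2: 10, 3: 11, 4: 4, 5: 0, 6: 0, 7: 0, 8: 1}

runners = sum(TABLE5.values())                           # traced runners
qualifying_times = sum(k * n for k, n in TABLE5.items()) # total qualifying runs
ratio = qualifying_times / runners                       # times per runner

FACTOR = 1.6  # the text rounds the ratio to one decimal place

# Deflate the QT1 estimates (Table 4) by the multiplicity factor.
men_adjusted = round(54886 / FACTOR)
women_adjusted = round(37809 / FACTOR)
total_adjusted = men_adjusted + women_adjusted
```

Using the unrounded ratio instead of 1.6 would give somewhat smaller adjusted totals, which is one more reason to treat the result as a rough guide.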

This is an admittedly crude result. We didn’t distinguish between men and women in our sample, let alone the different age groups. Also, it’s unlikely that the www.marathonguide.com results are complete. However, it does give a rough guide to the extent by which our earlier results could be affected by the issue of multiple qualifiers.

Estimating the Sampling Error

We already made the analogy between our sampling method and the way pollsters conduct an election poll. In election polls, it’s common to quote a margin of error, which is supposed to represent the error in the poll due to sampling. For example, in a poll based on a sample of size 1,000, the margin of error is usually quoted as plus or minus 3%. A more precise statement is that this is a 95% confidence interval.
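The familiar plus or minus 3% can be checked with the standard normal-approximation formula for a proportion. The sketch below assumes the worst case of a 50/50 split and a simple random sample; the function name `margin_of_error` is illustrative.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% interval for a sample proportion.

    Assumes simple random sampling and the normal approximation;
    p = 0.5 gives the widest (most conservative) margin.
    """
    return z * math.sqrt(p * (1 - p) / n)


poll_margin = margin_of_error(1000)  # about 0.031, i.e. roughly 3 points
```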

We have performed a rough calculation of the margin of error involved in these samples. This was achieved by bootstrapping. From among the same 25 races selected at random in the second stage of the procedure, we resampled by selecting races at random, with replacement.


This gives us a bootstrap sample of 25 races, for which we calculated total qualifiers by the same procedure as before. The bootstrap procedure was repeated 10,000 times and the 0.025 and 0.975 quantiles of the bootstrap sample were used to define a 95% confidence interval. Results are shown in Table 6.
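A minimal sketch of a race-level percentile bootstrap of this kind follows. It is not the authors' code: the per-race data are randomly generated stand-ins (the article does not publish them), the function names are hypothetical, and the sketch assumes that the scale factor is recomputed from the runner totals of each resample, which the article does not specify.

```python
import random


def total_estimate(races, population_runners):
    """Ratio estimate: sampled qualifiers scaled by population/sample runners."""
    runners = sum(r for r, _ in races)
    qualifiers = sum(q for _, q in races)
    return qualifiers * population_runners / runners


def bootstrap_ci(races, population_runners, reps=10_000, seed=0):
    """Percentile 95% interval from resampling whole races with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        total_estimate(rng.choices(races, k=len(races)), population_runners)
        for _ in range(reps)
    )
    lower = estimates[int(0.025 * reps)]
    upper = estimates[int(0.975 * reps) - 1]
    return lower, upper


# Hypothetical stand-in data: (total runners, qualifiers) for 25 races.
_rng = random.Random(1)
fake_races = []
for _ in range(25):
    n = _rng.randint(500, 3000)
    fake_races.append((n, round(n * _rng.uniform(0.05, 0.20))))

lo, hi = bootstrap_ci(fake_races, population_runners=172_114, reps=2000)
```

Resampling whole races, rather than individual runners, keeps the race as the sampling unit, which matches the way the 25 races were drawn.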

These ranges give a rough idea of how accurate our estimates are. However, it should be pointed out (as with election polls) that they account only for the error associated with drawing a sample and not with other sources of error. They do not take into account that we have omitted overseas races, that some of the results we used may contain wheelchair racers, or the other non-random sources of error we have pointed out. We also ignored the multiplicity correction described earlier. Despite these caveats, we believe the results can be interpreted as something of a reality check on the results we have given.

The Current Qualifying Time Standards

The BAA did decide to set new qualifying time standards that took effect with the 2013 Boston Marathon. The new qualifying times were 5 minutes below the old ones for all age groups and sexes and eliminated the 59-second cushion (so the new qualifying times are 5 minutes 59 seconds, or 5.9833 minutes, faster than the old ones).

In addition, the organizers introduced a four-tier entry system, whereby runners who beat the (new) qualifying standard by 20 minutes or more got to enter first, then those who beat the qualifying standard by 10 minutes or more, then those who beat the qualifying standard by 5 minutes or more, and finally everyone else who beat the qualifying standard. If the race was oversubscribed, entrants would be accepted according to the degree by which they beat the qualifying standard. For example, in 2012 (the first time the four-tier entry system was used, though still based on the old qualifying times) everyone who beat their qualifying standard by 1 minute 14 seconds or better was accepted for the race. In 2013, the new standards went into effect and everyone who applied for the race with a legitimate qualifying time was accepted.

After the 2013 race ended with two bombs exploding, organizers expected a substantial increase in the number of entrants, partly because of runners who were unable to finish the 2013 race (who were accepted without being required to re-qualify), but also because of a general upsurge in interest as runners demonstrated their support for the race. They therefore increased the total size of the field to 36,000 (from the previous 25,000). Even so, not every qualified runner was accepted; everyone who beat their age- and sex-adjusted qualifying time by 1 minute 38 seconds or more was accepted. (One of the authors of this article beat the relevant qualifying time by 1 minute 54 seconds and was therefore accepted.)

Given the fluctuation in numbers of entrants that occurs naturally from year to year, even in the absence of special circumstances such as those that followed the 2013 bombings, it is not possible to say definitively how accurate our predictions of future numbers of entrants were. However, we can make some comparisons.

The top two (blue and green) plots in Figure 1 show our projected numbers derived from Tables 3 and 4, plotted against the projected change in qualifying time, separately for men and women, but with all ages combined. On a log-linear scaling, the fit is remarkably close to a straight line. To be specific, each 1-minute reduction in qualifying time is expected to result in a 6.4% reduction in entrants for either men or women. For a 5.9833-minute reduction in qualifying time, this calculation projects a 33% reduction in number of qualified entrants. We do not know whether this was the reduction in field size the organizers were aiming for, but it seems to be in the right ballpark.
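The log-linear relationship can be explored with an ordinary least-squares fit of log counts against minutes of tightening, using the Table 4 totals. This is a sketch under stated assumptions: the authors do not describe exactly how their line was fitted, so the per-minute rate it produces need not reproduce their 6.4% exactly. The function names are hypothetical.

```python
import math

# Tightening in minutes and estimated qualifiers, from Table 4.
OFFSETS = [0, 2, 5, 10, 15, 20, 30]
TABLE4 = {
    "Male": [54886, 47734, 39048, 28907, 20889, 14886, 6808],
    "Female": [37809, 33715, 27426, 20271, 14391, 10300, 4944],
}


def loglinear_slope(xs, counts):
    """Least-squares slope of ln(count) regressed on xs."""
    ys = [math.log(c) for c in counts]
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


def per_minute_reduction(slope):
    """Fractional drop in qualifiers for each 1-minute tightening."""
    return 1 - math.exp(slope)


def projected_reduction(rate_per_minute, minutes):
    """Compound a per-minute reduction over a given tightening."""
    return 1 - (1 - rate_per_minute) ** minutes


male_rate = per_minute_reduction(loglinear_slope(OFFSETS, TABLE4["Male"]))
# Compounding the article's 6.4% over 5 min 59 s gives about one third.
article_projection = projected_reduction(0.064, 5 + 59 / 60)
```

Compounding the quoted 6.4% over 5.9833 minutes gives a reduction of about 33%, consistent with the projection in the text.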

Even though we cannot directly compare entrants in the same year under the old and new qualifying standards, a loose proxy for this is to look at the numbers of entrants who beat each of the intermediate qualifying times under the four-tier entry system. Specifically, the organizers released the following information for the 2012 race:

• 23,521 applications were received during the two weeks of registration

• 20,081 applicants were accepted, having beaten the (same as 2011) qualifying standard by 1 minute 14 seconds or better (subject to verification of qualifying performance)

• Among those, 14,780 beat their qualifying standard by 5 minutes or better

• An additional 277 qualifiers who had run 10 or more consecutive Boston Marathons also were accepted

Figure 1. Numbers of estimated and actual qualifiers by time increment. (Vertical axis: Number of Qualifiers; horizontal axis: Time Below 2011 Qualifying Standard.)


For the 2014 race, by which time the new standards were in effect:

• 25,654 applications were received during the two weeks of registration.

• 22,679 applicants were accepted, having beaten the qualifying standard by 1 minute 38 seconds or better (subject to verification of qualifying performance).

• 5,450 applicants beat their qualifying time by between 1 minute 38 seconds and 5 minutes.

• 5,879 applicants beat their qualifying time by between 5 and 10 minutes.

• 6,866 applicants beat their qualifying time by between 10 and 20 minutes.

• 4,484 applicants beat their qualifying time by 20 minutes or better.

• An additional 5,625 runners who were 2013 Boston Marathon starters who reached the half marathon checkpoint or later but did not have the opportunity to cross the finish line on Boylston Street on April 15 were invited to participate in the 2014 race, some of whom were qualified. Of those runners, 4,615 registered for 2014, 1,649 of whom were qualifiers.

• In addition, 402 qualifiers who had an active streak of 10 or more consecutive Boston Marathon completions also entered during a pre-registration period and were accepted.

For the purpose of the following discussion, we ignore those who were accepted by virtue of having not finished the 2013 race or because they had completed 10 or more consecutive Boston Marathons.

The above numbers have been translated into numbers of race entrants who beat the 2011 qualifying standard by various margins and are plotted on Figure 1 (2014 entrants in red, 2012 entrants in cyan).

For these numbers as well, the log-linear fit is good, but the slope is slightly different from the estimates computed from Tables 3 and 4. Specifically, according to the straight lines fitted to the actual data from 2012 (2014 in parentheses), a 1-minute reduction in qualifying time corresponds to an 8.2% (8.7%) reduction in number of entrants, and a 5.9833-minute reduction in qualifying time corresponds to a 40% (42%) reduction in number of entrants.
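One plausible way to carry out the translation into margins under the 2011 standard is sketched below. The article does not give the details, so this is an assumption: each 2014 tier boundary is shifted by the 5 minute 59 second tightening, and tier counts are accumulated from the fastest tier down. The 2012 margins are taken as reported, although it is unclear whether they were measured with or without the 59-second cushion. All names are hypothetical.

```python
# The new standard is 5 min 59 s faster than the 2011 one.
TIGHTENING_MIN = 5 + 59 / 60

# 2014 accepted applicants by margin under the NEW standard:
# (lower edge of the tier in minutes, number of applicants).
TIERS_2014 = [(1 + 38 / 60, 5450), (5, 5879), (10, 6866), (20, 4484)]


def cumulative_vs_2011(tiers, shift):
    """Return sorted (minutes under the 2011 standard, entrants at least that fast)."""
    points = []
    running = 0
    # Walk from the fastest tier down, accumulating counts.
    for lower_edge, count in sorted(tiers, reverse=True):
        running += count
        points.append((lower_edge + shift, running))
    return sorted(points)


POINTS_2014 = cumulative_vs_2011(TIERS_2014, TIGHTENING_MIN)

# 2012 used the 2011 standard directly, so no shift is applied.
POINTS_2012 = [(1 + 14 / 60, 20081), (5.0, 14780)]
```

The cumulative 2014 counts come out as 22,679, 17,229, 11,350, and 4,484, the first of which matches the number of accepted applicants reported above.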

The one factor that our analysis cannot take into account is what proportion of marathon runners who are qualified for the Boston Marathon actually decide to enter. It is entirely plausible that runners who have beaten the qualifying time by a large margin are less concerned about taking their “one and only chance” to run the Boston Marathon. In other words, the proportion of qualified runners who enter the race is smaller among faster runners. This is in the right direction to explain the differences among the slopes of the various straight lines in Figure 1.

Estimated Changes in the Age-Sex Distribution of Runners in the Boston Marathon

Here we extend the preceding discussion to consider also the age and sex categories. The changes in total qualifiers envisioned from our results imply not only an approximately 33% reduction in Boston Marathon qualifiers based on the actually adopted 5 minutes 59 seconds tightening of the qualifying times, but also a slight redistribution of qualifiers among the different age and sex categories. We demonstrate here that it is also possible to see the signal of that change in the observed race results.

To illustrate, consider Table 7. Based on Table 4, the expected proportion of women under the old (2011) qualifying time is 37809/92695=0.408, or 40.8%, while based on the new qualifying times, it is 27426/66474, or 41.3%. We can compare this with the actual number of finishers. The data for this comparison come from finishers in the 2010 and 2011 Boston Marathons to represent the old qualifying times and the 2013 Boston Marathon to represent the new qualifying times. The three race results are compiled in the file TIM.txt at www.unc.edu/~rls/boston.html and include runners in the 2013 Boston Marathon who were unable to finish because of the bombs (the vast majority of whom would have finished had they been able to reach the finish line).

Table 7—Expected and Observed Percentages of Women Under Old and New Qualifying Times

            Old     New

Expected    40.8    41.3
Observed    42.1    42.8

Table 8—Expected and Observed Percentages of Runners of Age<45 Under Old and New Qualifying Times

            Old     New

Expected    56.2    54.9
Observed    59.4    57.7


Table 9—Changes in Expected and Observed Percentages from Old to New Qualifying Times for Each Age-Sex Category

For both old and new qualifying times, the observed percentage of women is higher than the expected percentage under our tables. We have speculated a little about the reason for this, but one possible explanation is that, even 50 years after Roberta Gibb became the first woman to unofficially complete the Boston Marathon, women are still relative newcomers to the sport (especially in the older age groups) and consequently more likely to take up their place in the race if they qualify. As a result, it may not be reasonable to directly compare the observed and expected percentages in the different age and sex categories.

However, it does make sense to look at the changes in percentages between the old and new qualifying standards, which presumably would be much less affected by differential selection effects. From this point of view, Table 7 shows good agreement: Our calculations predicted an increase of 0.5 percentage points in the share of women, and we actually observed an increase of 0.7.

We can make similar comparisons of the age distribution (see Table 8). For example, the expected proportion of qualifiers aged under 45 went from 56.2% under the old qualifying times to 54.9% under the new ones. (The calculation under the new qualifying times differs slightly from Table 3 because it is based on the 5 minute 59 second increment and not exactly 5 minutes as in the QT1-5 column of Table 3.) The observed change was from 59.4% to 57.7%. In this case, it looks as though younger runners are more likely to take up their place in the race (which possibly, again, reflects a bias toward runners new to the marathon), but the change in percentage (a drop of 1.3 percentage points expected, 1.7 observed) is still quite close.
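The comparison of changes, rather than levels, in Tables 7 and 8 can be reproduced in a few lines. This is only a sketch using the percentages printed in the two tables; the dictionary layout and the names are illustrative assumptions.

```python
# Percentages (old, new) as printed in Tables 7 and 8.
tables = {
    "women":    {"expected": (40.8, 41.3), "observed": (42.1, 42.8)},
    "age < 45": {"expected": (56.2, 54.9), "observed": (59.4, 57.7)},
}


def change(old_new):
    """New minus old, in percentage points, rounded to one decimal place."""
    old, new = old_new
    return round(new - old, 1)


changes = {
    group: {kind: change(values) for kind, values in rows.items()}
    for group, rows in tables.items()
}

for group, row in changes.items():
    print(f"{group}: expected {row['expected']:+.1f}, "
          f"observed {row['observed']:+.1f}")
```

A positive value is a gain in share from the old to the new standards; the under-45 share falls in both the expected and the observed figures, so its changes come out negative.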

We can extend this comparison to all age-sex categories (see Table 9). Here, we have computed the change in the percentage of runners in each of the 22 age-sex categories, both expected based on our calculations and

                    Expected              Observed
Age              Men       Women       Men       Women
18–34           -0.093    -0.551      -0.183    -0.193
35–39           -0.305    -0.113      -0.487     0.528
40–44           -0.310     0.087      -0.725    -0.596
45–49           -0.318     0.392      -0.868    -0.286
50–54            0.040     0.324      -0.032     0.357
55–59            0.110     0.156       0.588     0.444
60–64            0.168     0.110       0.380     0.249
65–69            0.146     0.036       0.495     0.120
70–74            0.057     0.021       0.076     0.011
75–79            0.031     0.003       0.075     0.002
80 and over      0.006     0.003       0.034     0.011


REMEMBERING LU LINGZI

by Scott Evans

The Boston Marathon is the world’s oldest annual marathon and a “feel good” race. The city takes a holiday to celebrate this extraordinary cultural event and allow its residents to participate in the tradition on Patriots Monday. Runners from all over the world come to challenge and compete against themselves. Winning is not defined by where you finish among fellow runners, but by doing your best. Spectators line the streets for 26 miles and cheer for the runners at the end of the pack with as much enthusiasm as they do for the elite runners at the front of the pack. Despite extreme fatigue and dehydration, runners feed off the crowd’s energy and encouragement to push themselves to limits they may otherwise not reach.

This traditionally happy day turned tragic in April of 2013. Instead of a day of people running for a cause (e.g., raising money for medical research or to help impoverished people) as so many do, people were running for their lives due to a bombing attack near the finish line. The city and local areas had an unprecedented shutdown while law enforcement searched for the people responsible for the attack. Two suspects were eventually identified; one was killed during a shoot-out with law enforcement, while the other was apprehended after a long manhunt.

Throughout the ordeal, the attitude of the city was “Boston Strong.” David Ortiz, the Boston Red Sox’s most famous player, delivered a speech in which he said, “This is our !@#$ city. And nobody is going to dictate our freedom.” The Red Sox went on to win the World Series that year.

The Boston Strong spirit lives on. The city and marathon rallied in 2014. Instead of fear reducing marathon participation, a record 35,775 runners participated.

The attack in 2013 tragically took the lives of three race spectators. One of them was Lingzi Lu, a graduate student in statistics at Boston University. Lu was an ambitious, talented, and promising student with a powerful work ethic and vibrant personality. She had just passed the first part of the master’s comprehensive exam.

Boston University awarded Lu a posthumous degree at the 2013 commencement. The university also established the Lingzi Lu Scholarship Fund, which endows two scholarships for graduate students, each providing an annual stipend and full tuition for up to two years. Now, thanks to generous support from more than 1,700 people across the globe, the scholarship fund exceeds $1 million.

In addition, the Lingzi Lu Memorial Award has been created by the American Statistical Association (ASA) in partnership with the International Chinese Statistical Association (ICSA) in remembrance of Lu. Through the award, the ASA and ICSA support the studies of similarly dedicated people who are in master’s programs in statistics or who have recently earned their master’s degree in statistics. Thus, the award honors not only the memory of Lu, but also the talented statistician she would have become. For more information about this award, visit http://bit.ly/VjJ8Gd.


Figure 2. Histogram of bootstrapped correlation coefficients: central 95% in red

About the Authors

Richard L. Smith is professor of statistics at The University of North Carolina at Chapel Hill and director of the Statistical and Applied Mathematical Sciences Institute. His main research interests are in environmental statistics, including climate change and air pollution. He ran his first marathon in 1978 and has completed the Boston Marathon 11 times.

Scott Powers is a PhD student in the department of statistics at Stanford University. He earned his BS in mathematics and mathematical decision sciences in 2011 from The University of North Carolina at Chapel Hill. He also has worked on statistical analyses in baseball and volleyball.

Jessi Cisewski is a visiting assistant professor in the department of statistics at Carnegie Mellon University. She earned her PhD in 2012 from the department of statistics and operations research at The University of North Carolina at Chapel Hill. She enjoys running and completed the Twin Cities Marathon in 2010.

as observed in the 2010, 2011, and 2013 race results. In this case, it is rather hard to decide whether the observed and expected percentages match, since there are some cells where they differ substantially. However, one way to assess the overall agreement between the observed and expected percentage changes in Table 9 is to compute their correlation. Specifically, we computed the standard Pearson correlation coefficient between the 22 “expected” percentage changes in Table 9 and the 22 “observed” percentage changes. The result is a correlation of 0.527, which seems to imply substantial agreement between the expected and observed values.
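The correlation calculation can be sketched directly from the values printed in Table 9. The lists below are transcribed from the table (men first, then women, youngest to oldest age group); the function name `pearson` is our own, and the snippet is an illustration rather than the authors' code.

```python
import math

# Table 9, ages 18-34 through "80 and over": 11 values for men, then 11 for women.
expected = [
    -0.093, -0.305, -0.310, -0.318, 0.040, 0.110, 0.168, 0.146, 0.057, 0.031, 0.006,
    -0.551, -0.113, 0.087, 0.392, 0.324, 0.156, 0.110, 0.036, 0.021, 0.003, 0.003,
]
observed = [
    -0.183, -0.487, -0.725, -0.868, -0.032, 0.588, 0.380, 0.495, 0.076, 0.075, 0.034,
    -0.193, 0.528, -0.596, -0.286, 0.357, 0.444, 0.249, 0.120, 0.011, 0.002, 0.011,
]


def pearson(x, y):
    """Standard Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson(expected, observed)
print(f"Pearson correlation: {r:.3f}")
```

Because the table entries are rounded to three decimals, the result should be close to, though not necessarily identical with, the reported 0.527.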

Is this, in fact, a statistically significant correlation? Standard statistical tests based on correlation coefficients, such as Fisher’s z-transformation, do not apply in this rather complicated sampling framework, so we have used a simple bootstrap method to evaluate the uncertainty in the estimated 0.527 correlation coefficient. Specifically, for each of 5,000 bootstrap samples, we resampled with replacement from the entire 69,923-runner database in the TIM.txt file. For each bootstrapped sample, we recomputed Table 9 and hence the corresponding bootstrapped correlation coefficient. A histogram of bootstrapped correlation coefficients is shown in Figure 2. As shown in red, 95% of the bootstrapped correlation coefficients lie between 0.28 and 0.68, which is therefore an approximate 95% confidence interval for this coefficient. Note that the confidence interval does not include 0. Therefore, we conclude that the observed correlation between the expected and observed percentage changes is statistically significant.
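The bootstrap procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the layout of TIM.txt is not given here, so each runner is represented by a hypothetical `(period, category)` pair, and the four categories, the expected changes, and the runner counts are made-up toy data.

```python
import math
import random
from collections import Counter


def pearson(x, y):
    """Standard Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def pct_changes(runners, categories):
    """Change (new minus old) in each category's percentage share of the field."""
    old = Counter(cat for period, cat in runners if period == "old")
    new = Counter(cat for period, cat in runners if period == "new")
    n_old, n_new = sum(old.values()), sum(new.values())
    return [100 * (new[c] / n_new - old[c] / n_old) for c in categories]


def bootstrap_interval(runners, categories, expected, n_boot=5000, seed=0):
    """Percentile interval (central 95%) for the expected-vs-observed correlation."""
    rng = random.Random(seed)
    corrs = []
    for _ in range(n_boot):
        # Resample the whole pooled database with replacement.
        sample = rng.choices(runners, k=len(runners))
        corrs.append(pearson(expected, pct_changes(sample, categories)))
    corrs.sort()
    return corrs[int(0.025 * n_boot)], corrs[int(0.975 * n_boot) - 1]


# Hypothetical toy data: four categories instead of the article's 22.
categories = ["A", "B", "C", "D"]
expected = [-2.0, -1.0, 1.0, 2.0]  # made-up expected changes in percentage points
runners = (
    [("old", c) for c, k in zip(categories, (600, 600, 400, 400)) for _ in range(k)]
    + [("new", c) for c, k in zip(categories, (540, 560, 440, 460)) for _ in range(k)]
)

lo, hi = bootstrap_interval(runners, categories, expected, n_boot=200)
print(f"Approximate 95% interval for the correlation: ({lo:.2f}, {hi:.2f})")
```

As in the text, the whole pooled set of runners is resampled, so the sizes of the old and new fields vary from one bootstrap sample to the next. An interval that excludes 0 would be read as a significant correlation.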

This is admittedly a mild test, since it only establishes that the changes in the age-sex distribution are different from random, but given the small percentage changes predicted by our model and the considerable variation in the composition of the field from year to year (in other words, a lot of random variation), we consider it an achievement that we can still detect the effect of the changes in the qualifying standards on the percentages of women and younger runners in the 2013 race.

Conclusions

As the community of marathon runners continues to grow, the demand for an entry into the historic Boston Marathon will increase with it. This makes the issue of establishing appropriate Boston Marathon qualifying times even more crucial due to its impact on a large number of runners. In this work, we presented a procedure for modifying the previously established qualifying times by analyzing the effect of a variety of reductions by age and sex. A specified population of marathons that were feeders for the 2010 Boston Marathon was used for this analysis.

The BAA made the decision to change the qualifying times for the 2013 Boston Marathon. They reduced the qualifying times by 5 minutes 59 seconds across the board and introduced a new tier structure for entries. According to our estimates, the changes should have resulted in an approximate 33% reduction in the number of entrants, compared with what would have happened if the qualifying times had not changed. Two comparisons based on subsequent events, one tracking the numbers of entrants in different qualifying groups for the 2012 and 2014 races, the other based on differences in the distribution of finishers across age groups, both showed that our estimates worked well in practice. The 2014 race, with 32,458 starters and 31,926 finishers, was the second-largest Boston Marathon in the history of the race, behind the 1996 centenary race.

In the future, it is likely that increasing pressure on the field size in the Boston Marathon will require further adjustments to the qualifying standards. Other races such as the New York and London marathons also use qualifying times as a means of ensuring guaranteed entry for fast runners. The methodology employed here provides a systematic means of fine-tuning the qualifying times to achieve certain goals both for the overall field size and the distribution across sex and age groups.

[Figure 2 axes. Horizontal: Correlation Between Observed and Expected Frequency Changes. Vertical: Frequency.]
