studying the effects of aging in major league baseball

Post on 30-Dec-2015



Studying the Effects of Aging in Major League Baseball

Phil Birnbaum
www.philbirnbaum.com

Aging patterns in baseball
- How do players age?
- Is it different for hitters and pitchers?
- If you have a good player who's 31, how much do you expect him to decline over the next few years?
- Want a result like: "hitters decline X% between age 31 and 35"

Studies
- Bill James' classic aging study in the "1982 Baseball Abstract"
- Work by Tom Tango
- Academic studies: Jim Albert, Ray C. Fair, and others
- (This presentation is based mostly on Tango, with a bit of James)

Previous findings
- The best batters peak at 27; that's when most of the major awards are won (James)
- Different skills peak at different times: speed early, HRs mid-career, BBs late (Tango)

A naive look
- What's the average performance of the various age cohorts?
- Fairly similar, it turns out, except at the extremes

[Figure: Average Batting vs. Age. Horizontal axis: age, 18 to 45. Vertical axis: runs per game (RC27), 0 to 6.]

[Figure: Average Pitching vs. Age. Horizontal axis: age, 18 to 45. Vertical axis: Component ERA, 0 to 6.]

A naive look
- Statistical illusion
- Curve traces different groups of players
- Players at 25 are a cross-section of the league
- Players at 40 are former superstars
- The players at 40 were much better players when they were 25

Example
- Age 27: Player A: 6.00 … Player B: 5.00 … Player C: 4.00. Average: 5.00
- Age 35: Player A: 5.50 … Player B: 4.50 … Player C: released. Average: 5.00
- Age 40: Player A: 5.00 … Player B: retired … Player C: released. Average: 5.00
- All players decline with age, but the mean is still 5.00
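
The illusion in the example above can be checked with a few lines of Python. This is only a sketch: the three careers are the slide's figures, while the `careers` dictionary and the `cohort_mean` helper are names introduced here for illustration.

```python
# Runs-per-game figures from the example slide. None marks a player who
# is no longer in the league at that age (released or retired).
careers = {
    "Player A": {27: 6.00, 35: 5.50, 40: 5.00},
    "Player B": {27: 5.00, 35: 4.50, 40: None},
    "Player C": {27: 4.00, 35: None, 40: None},
}


def cohort_mean(age):
    """Average over only those players still active at the given age."""
    active = [c[age] for c in careers.values() if c[age] is not None]
    return sum(active) / len(active)


# Every individual declines, yet each age cohort averages the same value,
# because the weaker players drop out of the sample first.
for age in (27, 35, 40):
    print(age, cohort_mean(age))
```

The cohort average stays flat only because the set of players being averaged shrinks toward the best ones.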

Paired seasons
- "Paired seasons" method
- Find all players who were 28 in season X
- See how they did in season X+1 (weight the average by playing time)
- The average difference reflects the effects of aging from 28 to 29
- Career path obtained by chaining (multiplying) single-year effects
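
One way the paired-seasons steps might be coded is sketched below. Several details are assumptions, since the slide does not specify them: each pair is weighted by playing time in season X+1, the single-year effect is expressed as a ratio of weighted totals so that it can be chained by multiplication, and the sample `records` are invented purely to exercise the functions.

```python
from collections import defaultdict


def paired_season_effects(records):
    """records: iterable of (player, age, playing_time, rate) tuples.

    Returns {age: weighted rate at age+1 divided by weighted rate at age},
    using only players who appear at both ages. Weighting by the second
    season's playing time is an assumption made for this sketch.
    """
    by_key = {(p, age): (pt, rate) for p, age, pt, rate in records}
    num = defaultdict(float)
    den = defaultdict(float)
    for (player, age), (_, rate) in by_key.items():
        following = by_key.get((player, age + 1))
        if following is None:
            continue  # no paired season, so the player is left out
        weight, next_rate = following
        num[age] += weight * next_rate
        den[age] += weight * rate
    return {age: num[age] / den[age] for age in num}


def chain(effects, start_age):
    """Multiply single-year effects into a career path relative to start_age."""
    level = 1.0
    path = {start_age: level}
    age = start_age
    while age in effects:
        level *= effects[age]
        age += 1
        path[age] = level
    return path


# Hypothetical data, not taken from the presentation.
records = [
    ("a", 28, 600, 5.0), ("a", 29, 600, 4.5), ("a", 30, 500, 4.0),
    ("b", 28, 400, 4.0), ("b", 29, 200, 4.0),
]
effects = paired_season_effects(records)
print(chain(effects, 28))
```

The resulting path is relative to the starting age, so dividing every value by the maximum would express it relative to peak instead.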

[Figure: Paired seasons: Batting. Horizontal axis: age, 19 to 46. Vertical axis: RC27 relative to peak, 0 to 1.2.]

[Figure: Paired seasons: Pitching. Horizontal axis: age, 18 to 45. Vertical axis: ERA relative to peak, 0 to 5.]

Paired seasons: results biased
- The paired-seasons method shows big declines as players age
- But it suffers from a bias: selective sampling
- Players who were "lucky" in season X (large positive error term) get more playing time in season X+1
- Those "lucky" players will show bigger declines
- So big declines are over-represented

Example
- Three 37-year-olds, all of whom have skill of .250 this year, .240 next year
- This year, due to chance, they hit .200, .250, .300 respectively
- The .200 guy is forced to retire
- The .250 guy plays half time next year and loses 10 points (.250 → .240)
- The .300 guy plays full time next year and loses 60 points (.300 → .240)
- The weighted average loss is 43 points, not 10 points
- The decline is very much overestimated
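
The 43-point figure can be reproduced as a playing-time-weighted average. The sketch below assumes each surviving player hits exactly his true skill of .240 next year, and represents playing time as a share of a full season (0.0, 0.5, 1.0); the variable and function names are illustrative.

```python
# (observed average this year, share of full playing time next year)
players = [(0.200, 0.0), (0.250, 0.5), (0.300, 1.0)]
TRUE_NEXT = 0.240  # every player's actual skill next year


def weighted_decline(players, next_skill):
    """Average drop from this year's observed figure, weighted by next
    year's playing time. The retired player has weight 0 and drops out."""
    total_weight = sum(w for _, w in players)
    return sum(w * (obs - next_skill) for obs, w in players) / total_weight


measured = weighted_decline(players, TRUE_NEXT)
print(round(measured * 1000), "points measured; the true decline is 10")
```

The lucky .300 hitter carries two thirds of the weight, which is what pulls the measured decline so far above the true 10 points.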

How can we eliminate this bias?
- Can try to estimate the "true" talent of the three players
- Regressing to the mean
  - The .200 guy is "probably" .220
  - The .250 guy is "probably" .250
  - The .300 guy is "probably" .280
- Now the third guy declines only 40 points, not 60
- Average decline: 30 points
- More accurate than previous estimate of 43 points
- If we regressed "perfectly" (all players to their talent of .250) we'd get the right answer (10 pts)
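
The regressed estimates above are consistent with moving each observed average 40% of the way toward the group mean of .250. That fraction is inferred here from the numbers (.200 becomes .220, .300 becomes .280) rather than stated in the presentation. A short sketch, with illustrative names, that reproduces the 30-point result and the "perfect" 10-point result:

```python
def regress(observed, mean, fraction):
    """Move an observed figure the given fraction of the way to the mean."""
    return observed + fraction * (mean - observed)


def weighted_decline(estimates, next_skill):
    """Playing-time-weighted drop from each estimate to next year's skill."""
    total_weight = sum(w for _, w in estimates)
    return sum(w * (est - next_skill) for est, w in estimates) / total_weight


# (observed average this year, share of full playing time next year)
observed = [(0.200, 0.0), (0.250, 0.5), (0.300, 1.0)]
MEAN, TRUE_NEXT = 0.250, 0.240

for fraction in (0.0, 0.4, 1.0):
    estimates = [(regress(obs, MEAN, fraction), w) for obs, w in observed]
    decline = weighted_decline(estimates, TRUE_NEXT)
    print(f"regress {fraction:.0%}: {decline * 1000:.0f} points")
```

No regression gives the biased 43 points, 40% gives 30, and full regression to the true talent recovers 10.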

Regressing season X
- How much to regress?
- Need to do some research to figure that out
- Can probably get a theoretical lower bound from binomial (multinomial) distribution
- For now, consider 10% and 30%
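
As a rough illustration of where a binomial argument would start, the sketch below computes how much a batting average varies from chance alone. It is not the presentation's calculation: treating each at-bat as an independent trial and using 500 at-bats for a full season are both assumptions made here.

```python
import math


def binomial_sd(p, n):
    """Standard deviation of an observed success rate over n independent
    trials, each with true success probability p."""
    return math.sqrt(p * (1.0 - p) / n)


# A true .250 hitter over an assumed 500 at-bats.
sd = binomial_sd(0.250, 500)
print(f"luck alone: about {sd * 1000:.0f} points of batting average (1 SD)")
```

Comparing this luck-only spread with the spread actually observed across players is one possible route to a regression amount, though the presentation leaves the method open.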

[Figure: Batting, regressed 10%. Horizontal axis: age, 19 to 46. Vertical axis: percentage of peak, 0 to 1.2.]

[Figure: Batting, regressed 30%. Horizontal axis: age, 19 to 46. Vertical axis: percentage of peak, 0 to 1.2.]

[Figure: Pitching, regressed 10%. Horizontal axis: age, 17 to 45. Vertical axis: percentage of peak, 0 to 5.]

[Figure: Pitching, regressed 30%. Horizontal axis: age, 17 to 45. Vertical axis: percentage of peak, 0 to 5.]

Conclusions
- Results sensitive to how much we regress
- Getting correct estimates of aging using the paired-seasons method depends on solving the selective sampling problem and/or figuring out how much to regress
- Alternative: can fit curves to careers (Albert, Fair)
- But this method requires a long career, which means only the most successful players are analyzed
- Some selective sampling issues there too

References
- "Looking For the Prime," 1982 Bill James Baseball Abstract, p. 191
- Tom Tango, http://tangotiger.net/agepatterns.txt
- Tom Tango, "Forecasting Pitchers – Adjacent Seasons," http://www.tangotiger.net/adjacentPitching.html
- Ray C. Fair, "Estimated Age Effects in Baseball," http://www.bepress.com/jqas/vol4/iss1/1/
