Forecasting a Tennis Match at the Australian Open
Tristan Barnett, Stephen Clarke, Alan Brown
Introduction
Match Predictions
  Markov Chain Model
  Collecting Data
  Exponential Smoothing
  Combining Player Statistics
Real Time Predictions
  Combining Sheets from Markov Chain Model
  Bayesian Updating Rule
Excel Computer Demonstration
Markov Chain Model
Modelling a game of tennis
Recurrence Formula: P(a,b) = pP(a+1,b) + (1-p)P(a,b+1)
Boundary Conditions: P(a,b) = 1 if a=4, b ≤ 2
P(a,b) = 0 if b=4, a ≤ 2
where for player A:
p = probability of winning a point on serve
P(a,b) = conditional probability of winning the game when the score is (a,b)
$P(3,3) = \frac{p^2}{p^2 + (1-p)^2}$
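The recurrence and boundary conditions above can be sketched in Python as follows. This is a minimal illustration rather than the authors' Excel sheet: the function name `game_win_prob` and the use of integer point counts (0, 1, 2, 3 for 0, 15, 30, 40) are assumptions made for the example. For p = 0.6 it gives about 0.74 at (0,0) and 0.69 at (3,3), in line with Table 1.

```python
from functools import lru_cache


def game_win_prob(p, a=0, b=0):
    """Probability that server A wins the game from point score (a, b).

    a and b count points won in the game (0, 1, 2, 3 stand for 0, 15, 30, 40).
    p is the probability that A wins any single point on serve.
    """

    @lru_cache(maxsize=None)
    def P(a, b):
        # Boundary conditions from the slide.
        if a == 4 and b <= 2:
            return 1.0
        if b == 4 and a <= 2:
            return 0.0
        # Closed form at deuce (3,3), which stops the recursion.
        if a == 3 and b == 3:
            return p**2 / (p**2 + (1 - p) ** 2)
        # Recurrence: win the next point with probability p, lose it otherwise.
        return p * P(a + 1, b) + (1 - p) * P(a, b + 1)

    return P(a, b)


if __name__ == "__main__":
    labels = ["0", "15", "30", "40"]
    for a in range(4):
        row = [f"{game_win_prob(0.6, a, b):.2f}" for b in range(4)]
        print(f"A at {labels[a]:>2}:", " ".join(row))
```

Each printed row fixes A's score and runs through B's scores 0, 15, 30, 40.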
Markov Chain Model
Table 1: The conditional probabilities of player A winning the game from various score lines for p = 0.6

                     B score
A score     0      15     30     40     Game
0           0.74   0.58   0.37   0.15   0
15          0.84   0.71   0.52   0.25   0
30          0.93   0.85   0.69   0.42   0
40          0.98   0.95   0.88   0.69
Game        1      1      1

Similarly:
  sheet for player B serving
  sheets for a set (from sheets of a game)
  sheet for a match (from sheets of a set)
Collecting Data
The ATP tour matchfacts: http://www.atptennis.com/en/media/rankings/matchfacts.pdf
Collecting Data
$f_i = a_i b_i + (1 - a_i) c_i$
$g_i = a_{av} d_i + (1 - a_{av}) e_i$
where the percentage for player $i$:
$f_i$ = points won on serve
$g_i$ = points won on return
$a_i$ = 1st serves in play
$b_i$ = points won on 1st serve
$c_i$ = points won on 2nd serve
$d_i$ = points won on return of 1st serve
$e_i$ = points won on return of 2nd serve
where the percentage for average player on the ATP tour:
$a_{av}$ = 1st serves in play = 58.7%
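As a small illustration of the two formulas above, the sketch below computes the serve and return percentages from the match-facts components. The function names and the sample player figures are hypothetical; only the 58.7% tour average for first serves in play comes from the slide.

```python
A_AV = 0.587  # ATP tour average proportion of 1st serves in play (from the slide)


def points_won_on_serve(a_i, b_i, c_i):
    """f_i: weight 1st- and 2nd-serve win rates by how often each serve is played."""
    return a_i * b_i + (1 - a_i) * c_i


def points_won_on_return(d_i, e_i, a_av=A_AV):
    """g_i: weight return win rates by the tour-average 1st-serve-in rate."""
    return a_av * d_i + (1 - a_av) * e_i


if __name__ == "__main__":
    # Hypothetical player: 60% first serves in, wins 75% of 1st-serve points
    # and 50% of 2nd-serve points; wins 30% / 50% of points returning 1st / 2nd serves.
    f_i = points_won_on_serve(0.60, 0.75, 0.50)
    g_i = points_won_on_return(0.30, 0.50)
    print(f"f_i = {f_i:.4f}, g_i = {g_i:.4f}")
```

The return formula uses the tour-average first-serve rate because the opponent serving is unknown when the statistic is compiled.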
Exponential Smoothing
$F_i^t = F_i^{t-1} + [1 - (1-\alpha)^n][f_i^t - F_i^{t-1}]$
$G_i^t = G_i^{t-1} + [1 - (1-\alpha)^n][g_i^t - G_i^{t-1}]$
where:
For player $i$ at period $t$
$F_i^t$ = smoothed average of the percentage of points won on serve after observing $f_i^t$
$G_i^t$ = smoothed average of the percentage of points won on return of serve after observing $g_i^t$
Initialised for average ATP tour player
$F_i^0$ = the ATP average of percentage of points won on serve
$G_i^0$ = the ATP average of percentage of points won on return of serve
$n$ = number of matches played since period $t-1$
$\alpha$ = smoothing constant
When $n = 1$, $[1 - (1-\alpha)^n] = \alpha$, as expected
When $n$ becomes large, $[1 - (1-\alpha)^n] \to 1$, as expected
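The smoothing update can be sketched as a single function that applies to either the serve or the return statistic. The function name `smooth` and the sample numbers are assumptions for illustration; the slides do not state the value of the smoothing constant.

```python
def smooth(previous, observed, alpha, n=1):
    """One exponential-smoothing step.

    previous : smoothed value at period t-1
    observed : newly observed percentage at period t
    alpha    : smoothing constant, between 0 and 1
    n        : number of matches played since period t-1
    """
    weight = 1 - (1 - alpha) ** n  # equals alpha when n == 1, tends to 1 as n grows
    return previous + weight * (observed - previous)


if __name__ == "__main__":
    # Hypothetical values: smoothed serve percentage 62%, new observation 70%.
    print(smooth(0.62, 0.70, alpha=0.2, n=1))
    print(smooth(0.62, 0.70, alpha=0.2, n=50))
```

With a large `n` the old smoothed value carries almost no weight, so the result sits very close to the new observation.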
Combining Player Statistics
$f_{ij} = f_t + (f_i - f_{av}) - (g_j - g_{av})$
$g_{ji} = g_t + (g_j - g_{av}) - (f_i - f_{av})$
where:
For the combined player statistics
$f_{ij}$ = percentage of points won on serve for player $i$ against player $j$
$g_{ji}$ = percentage of points won on return for player $j$ against player $i$
For the tournament averages
$f_t$ = percentage of points won on serve
$g_t$ = percentage of points won on return of serve
For the ATP tour averages
$f_{av}$ = percentage of points won on serve
$g_{av}$ = percentage of points won on return of serve
Since $f_t + g_t = 1$, $f_{ij} + g_{ji} = 1$ for all $i, j$ as required
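A short sketch of the combination step, checking that the two results sum to one. The function name `combine` and all numeric inputs are hypothetical values chosen for the example, not figures from the talk.

```python
def combine(f_i, g_j, f_t, g_t, f_av, g_av):
    """Return (f_ij, g_ji) for server i against receiver j.

    f_i, g_j   : player i's serve percentage and player j's return percentage
    f_t, g_t   : tournament averages on serve and on return (f_t + g_t = 1)
    f_av, g_av : ATP tour averages on serve and on return
    """
    f_ij = f_t + (f_i - f_av) - (g_j - g_av)
    g_ji = g_t + (g_j - g_av) - (f_i - f_av)
    return f_ij, g_ji


if __name__ == "__main__":
    # Hypothetical inputs: an above-average server against an above-average returner.
    f_ij, g_ji = combine(f_i=0.68, g_j=0.40, f_t=0.62, g_t=0.38, f_av=0.63, g_av=0.37)
    print(f"f_ij = {f_ij:.2f}, g_ji = {g_ji:.2f}, sum = {f_ij + g_ji:.2f}")
```

The player-specific adjustments enter the two formulas with opposite signs, so they cancel in the sum and only the tournament averages remain.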
Combining Sheets
The equation for the probability of player A winning a best-of-5 set match from (e,f) in sets, (c,d) in games, (a,b) in points, player A serving.
$$
\begin{aligned}
P''(a,b:c,d:e,f) ={} & P(a,b)\,P'_B(c+1,d)\,P''(e+1,f) \\
& + P(a,b)\,[1 - P'_B(c+1,d)]\,P''(e,f+1) \\
& + [1 - P(a,b)]\,P'_B(c,d+1)\,P''(e+1,f) \\
& + [1 - P(a,b)]\,[1 - P'_B(c,d+1)]\,P''(e,f+1)
\end{aligned}
$$
where for player A:
$P''(a,b:c,d:e,f)$ = probability of winning the match from (a,b:c,d:e,f)
$P'_B(c,d)$ = probability of winning the set from (c,d) when player B is serving
$P''(e,f)$ = probability of winning the match from (e,f)
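The combining equation above only needs five numbers read from the game, set and match sheets, so it can be sketched as one function. The function and parameter names are assumptions for illustration, and the sample probabilities are hypothetical rather than values from the spreadsheets.

```python
def match_prob_from_point(p_game, set_if_win_game, set_if_lose_game,
                          match_if_win_set, match_if_lose_set):
    """Probability that player A wins the match from the current point.

    p_game            : P(a,b), A wins the current game (A serving)
    set_if_win_game   : P'_B(c+1,d), A wins the set after winning this game
    set_if_lose_game  : P'_B(c,d+1), A wins the set after losing this game
    match_if_win_set  : P''(e+1,f), A wins the match after winning this set
    match_if_lose_set : P''(e,f+1), A wins the match after losing this set
    """
    return (
        p_game * set_if_win_game * match_if_win_set
        + p_game * (1 - set_if_win_game) * match_if_lose_set
        + (1 - p_game) * set_if_lose_game * match_if_win_set
        + (1 - p_game) * (1 - set_if_lose_game) * match_if_lose_set
    )


if __name__ == "__main__":
    # Hypothetical sheet values for a single score line.
    print(match_prob_from_point(0.74, 0.60, 0.40, 0.80, 0.50))
```

Setting `match_if_win_set` to 1 and `match_if_lose_set` to 0 reduces the expression to the probability of winning the current set, which is a quick sanity check on the four terms.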
Bayesian Updating Rule
$\theta_i^t = \frac{M}{M+n}\,\mu_i + \frac{n}{M+n}\,\varphi_i^t$
where:
$\theta_i^t$ = updated percentage of points won on serve at time $t$ for player $i$
$\mu_i$ = initial percentage of points won on serve for player $i$
$\varphi_i^t$ = actual percentage of points won on serve at time $t$ for player $i$
$n$ = number of points played
$M$ = expected points to be played
When $n = 0$, $\theta_i^0 = \mu_i$ as expected
When $M \to 0$, $\theta_i^t \to \varphi_i^t$
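The updating rule can be sketched as a weighted average of the pre-match estimate and the in-match observation. This assumes the rule has the form θ = (Mμ + nφ) / (M + n), which agrees with the two limiting cases stated on the slide; the function name `bayes_update` and the sample numbers are hypothetical.

```python
def bayes_update(mu, phi, n, M):
    """Updated serve percentage after n points of the match.

    mu  : initial (pre-match) percentage of points won on serve
    phi : actual percentage of points won on serve so far in the match
    n   : number of points played
    M   : expected number of points to be played (weight given to the prior)
    Assumes M + n > 0.
    """
    return (M * mu + n * phi) / (M + n)


if __name__ == "__main__":
    # Hypothetical values: prior 65%, observed 70% after 50 points, M = 150.
    print(bayes_update(mu=0.65, phi=0.70, n=50, M=150))
```

A larger `M` makes the estimate move more slowly away from the pre-match figure as points are played.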
Computer Demonstration ISF3.XLS
2003 Australian Open Quarter Final
El Aynaoui versus Roddick
Computer Demonstration ISF4.XLS
Chance of winning current   Point   Game   Set   Match
El Aynaoui                  30%     10%    50%   68%
Roddick                     70%     90%    50%   32%
End of 1st set
[Figure: Chances of winning match (0% to 100%) for El Aynaoui and Roddick against number of points played (0 to 60). Markers indicate: game to El Aynaoui, game to Roddick, set to El Aynaoui.]
Computer Demonstration
End of match
[Figure: Chances of winning match (0% to 100%) for El Aynaoui and Roddick against number of points played (0 to 400). Markers indicate: game to El Aynaoui by breaking serve, game to Roddick by breaking serve, set to El Aynaoui, set to Roddick.]