Poisson Prediction

To predict how many goals a team will score in a match we can use the Poisson distribution.  The Poisson distribution is a probability distribution that describes how many times a discrete random event occurs in a fixed interval, assuming the events happen independently at a constant average rate.  It applies to many kinds of count data, for example the number of cars arriving at a red light, the number of customers entering a shop per hour, or the number of goals scored in a match.  The formula is written as

P(k) = (lambda^k) * (e ^ -lambda) / k!

where P(k) is the probability of k goals and lambda is the average number of goals per game.  e is Euler's number (a known constant, approximately 2.718) and k! is k factorial (e.g. 3! = 3*2*1 = 6).  It's just mathematical notation.
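
As a quick illustration, the formula translates almost word for word into code. This is only a minimal sketch: the function name poisson_pmf and the example average of 1.5 goals are my own choices for illustration, not part of the data used in this post.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the average per game is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Hypothetical example: a team averaging 1.5 goals per game.
for goals in range(4):
    print(goals, round(poisson_pmf(goals, 1.5), 3))
```

Summing the function over all possible values of k gives 1, as it should for a probability distribution.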

So we have quite a simple way of calculating the probability that a team will score 1 goal in a game, or 2 or 3.  And we can calculate likewise for their opponent.  We are then calculating the probabilities of football results!  The only parameter of the equation that we don't know is lambda.  However, we can estimate it from the team's average goals per match.

First, to test the formula, we can look at the number of goals each team scored in all games in the 2013/14 season.  I am using the usual data that I have collected on the English, Spanish, German, Italian, Turkish, Scottish, Dutch, Belgian, Greek and Portuguese top divisions.  The total number of matches used is 4232, so this should be a large enough sample to give a good long-run average.


Poisson distribution (lambda = 1.348) of goals per team per match vs actual observed distribution.  All top European matches for the 2013/14 season.

To understand the chart: the Poisson formula predicts that a team will score 0 goals in 26% of matches, 1 goal in 35%, etc.  (These predictions are based on a Poisson distribution with a mean of 1.348.)  The actual percentage of matches with each goal count is plotted beside the predictions.  I think it shows quite a good fit.
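
The predicted percentages can be reproduced with a few lines of code. This is a sketch that uses only the lambda of 1.348 quoted above; the observed frequencies are not listed in the text, so only the predicted side of the chart is computed here. The names LAMBDA_ALL and predicted are my own.

```python
import math

LAMBDA_ALL = 1.348  # average goals per team per match, as quoted in the text

def poisson_pmf(k, lam):
    """Poisson probability of exactly k goals given mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Predicted share of matches in which a team scores 0, 1, ..., 7 goals.
predicted = {k: poisson_pmf(k, LAMBDA_ALL) for k in range(8)}
for k, p in predicted.items():
    print(f"{k} goals: {p:.1%}")
```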

Now to make this useful we can break it down by team.  Teams generally score more at home than away, so we break it down by home and away too.  Now that we have narrowed it down by team and by home/away, we have a much, much smaller sample.  This means the actual and predicted outcomes don't look as close.  However, because of the Law of Large Numbers, we would expect these gaps to become smaller if we add more data.
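
To get a feel for the sample size effect, here is a small simulation. It is a sketch on simulated goal counts, not on my real match data: it draws Poisson counts with Knuth's multiplication method and measures the largest gap between the simulated and predicted share of 0 to 5 goals. The sample sizes, the seed and the function names are assumptions made for illustration.

```python
import math
import random

def poisson_pmf(k, lam):
    """Poisson probability of exactly k goals given mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def sample_poisson(lam, rng):
    """Draw one Poisson count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def max_gap(n_matches, lam, rng):
    """Largest gap between simulated and predicted share for 0 to 5 goals."""
    goals = [sample_poisson(lam, rng) for _ in range(n_matches)]
    return max(
        abs(goals.count(k) / n_matches - poisson_pmf(k, lam))
        for k in range(6)
    )

rng = random.Random(0)  # fixed seed so the run is repeatable
for n_matches in (57, 570, 5700):  # hypothetical sample sizes
    print(n_matches, round(max_gap(n_matches, 2.211, rng), 3))
```

On most runs the gap shrinks as the number of simulated matches grows, although any single small sample can land close to the prediction by chance.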

Poisson distribution (lambda=2.211) of goals for Man Utd per home match vs actual observed distribution.  Includes seasons 11/12, 12/13 and 13/14.


So to use this to predict a match, let's take Man Utd's first match of the 2014/15 season on 2014-08-16, vs Swansea.  We look at Man Utd's home record and Swansea's away record.  Man Utd have scored an average of 2.211 goals at home and have conceded an average of 1.035.  Swansea have scored an average of 1.000 goals away and conceded an average of 1.509.  These averages are taken over the last three seasons, so that is Man Utd's last 57 home games and Swansea's last 57 away games.  There is a trade-off here: look back too far and you are using irrelevant data, but don't look back far enough and you won't have enough matches to calculate a genuine long-run average.

I'll average 2.211 (average scored by Man Utd at home) and 1.509 (average conceded by Swansea away) and use the result, 1.860, as the estimate of lambda for P(h) for the home team (Man Utd).  This is an estimate of the expected number of goals scored by the home team.  I'll generate the estimate of lambda for the away team similarly: the average of 1.000 (scored by Swansea away) and 1.035 (conceded by Man Utd at home), which gives 1.018.
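
The arithmetic behind the two lambda estimates is short enough to write out. This sketch simply averages the figures quoted above; the variable names are my own, and taking a plain average of "scored" and "conceded" is the simple choice made in this post rather than the only possible one.

```python
# Averages quoted in the text (last three seasons, 57 games each).
man_utd_home_scored = 2.211
man_utd_home_conceded = 1.035
swansea_away_scored = 1.000
swansea_away_conceded = 1.509

# Expected home goals: what the home team scores vs what the away team concedes.
lambda_home = (man_utd_home_scored + swansea_away_conceded) / 2
# Expected away goals: what the away team scores vs what the home team concedes.
lambda_away = (swansea_away_scored + man_utd_home_conceded) / 2

print(f"lambda (home) = {lambda_home:.4f}")
print(f"lambda (away) = {lambda_away:.4f}")
```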

The table below lists all possible scoreline combinations (up to 7 goals per team).  Then all that is left to do is calculate the probability of each scoreline based on the lambda estimates.  Just plug lambda into the Poisson formula and out come the probabilities for each possible score.

home goals (h)away goals (a)outcomelambda (home)lambda (away)Probability of h home goalsProbability of a away goalsP(h) * P(a)Sum
00Draw1.8601.0180.1560.3610.056
0.225
11Draw1.8601.0180.2890.3680.106
22Draw1.8601.0180.2690.1870.050
33Draw1.8601.0180.1670.0630.011
44Draw1.8601.0180.0780.0160.001
55Draw1.8601.0180.0290.0030.000
66Draw1.8601.0180.0090.0010.000
77Draw1.8601.0180.0020.0000.000
10Man Utd1.8601.0180.2890.3610.105
0.570
20Man Utd1.8601.0180.2690.3610.097
21Man Utd1.8601.0180.2690.3680.099
30Man Utd1.8601.0180.1670.3610.060
31Man Utd1.8601.0180.1670.3680.061
40Man Utd1.8601.0180.0780.3610.028
32Man Utd1.8601.0180.1670.1870.031
41Man Utd1.8601.0180.0780.3680.029
50Man Utd1.8601.0180.0290.3610.010
42Man Utd1.8601.0180.0780.1870.015
51Man Utd1.8601.0180.0290.3680.011
60Man Utd1.8601.0180.0090.3610.003
43Man Utd1.8601.0180.0780.0630.005
52Man Utd1.8601.0180.0290.1870.005
61Man Utd1.8601.0180.0090.3680.003
70Man Utd1.8601.0180.0020.3610.001
53Man Utd1.8601.0180.0290.0630.002
62Man Utd1.8601.0180.0090.1870.002
71Man Utd1.8601.0180.0020.3680.001
54Man Utd1.8601.0180.0290.0160.000
63Man Utd1.8601.0180.0090.0630.001
72Man Utd1.8601.0180.0020.1870.000
64Man Utd1.8601.0180.0090.0160.000
73Man Utd1.8601.0180.0020.0630.000
65Man Utd1.8601.0180.0090.0030.000
74Man Utd1.8601.0180.0020.0160.000
75Man Utd1.8601.0180.0020.0030.000
76Man Utd1.8601.0180.0020.0010.000
01Swansea1.8601.0180.1560.3680.057
0.204
02Swansea1.8601.0180.1560.1870.029
03Swansea1.8601.0180.1560.0630.010
12Swansea1.8601.0180.2890.1870.054
04Swansea1.8601.0180.1560.0160.003
13Swansea1.8601.0180.2890.0630.018
05Swansea1.8601.0180.1560.0030.001
14Swansea1.8601.0180.2890.0160.005
23Swansea1.8601.0180.2690.0630.017
06Swansea1.8601.0180.1560.0010.000
15Swansea1.8601.0180.2890.0030.001
24Swansea1.8601.0180.2690.0160.004
07Swansea1.8601.0180.1560.0000.000
16Swansea1.8601.0180.2890.0010.000
25Swansea1.8601.0180.2690.0030.001
34Swansea1.8601.0180.1670.0160.003
17Swansea1.8601.0180.2890.0000.000
26Swansea1.8601.0180.2690.0010.000
35Swansea1.8601.0180.1670.0030.001
27Swansea1.8601.0180.2690.0000.000
36Swansea1.8601.0180.1670.0010.000
45Swansea1.8601.0180.0780.0030.000
37Swansea1.8601.0180.1670.0000.000
46Swansea1.8601.0180.0780.0010.000
47Swansea1.8601.0180.0780.0000.000
56Swansea1.8601.0180.0290.0010.000
57Swansea1.8601.0180.0290.0000.000
67Swansea1.8601.0180.0090.0000.000
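
The scoreline probabilities can also be generated in code rather than in a spreadsheet. This is a rough sketch that assumes the rounded lambda values of 1.860 and 1.018 and treats the two teams' goal counts as independent, so each scoreline probability is the product of the two Poisson probabilities. The names score_probs and most_likely are my own, and the last decimal can differ slightly from the table because of rounding.

```python
import math

LAMBDA_HOME = 1.860  # Man Utd at home
LAMBDA_AWAY = 1.018  # Swansea away
MAX_GOALS = 7        # same cut-off as the table

def poisson_pmf(k, lam):
    """Poisson probability of exactly k goals given mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Probability of every scoreline (home goals, away goals) up to 7-7.
score_probs = {
    (h, a): poisson_pmf(h, LAMBDA_HOME) * poisson_pmf(a, LAMBDA_AWAY)
    for h in range(MAX_GOALS + 1)
    for a in range(MAX_GOALS + 1)
}

most_likely = max(score_probs, key=score_probs.get)
print("most likely scoreline:", most_likely)
print("probability:", round(score_probs[most_likely], 3))
```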

The probabilities in the last two columns are calculated using the basic probability laws below.

P(A and B) = P(A) * P(B) where A and B are independent (here we assume each team's goal count is independent of the other's).

P(A or B) = P(A) + P(B) where A and B are mutually exclusive (as they are in our case).

So in percentage terms this gives a 20.4% chance that Swansea will win, 22.5% for a draw and a 57.0% chance that Man Utd will win.  (We only calculated up to 7 goals in our table above.  The remaining 0.1% is the probability that either team scores more than 7 goals.)
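
The three outcome probabilities are just sums over the scoreline grid, which the following sketch reproduces. It assumes the same rounded lambdas (1.860 and 1.018), independent goal counts and the 7-goal cut-off; the variable names are my own.

```python
import math

LAMBDA_HOME, LAMBDA_AWAY, MAX_GOALS = 1.860, 1.018, 7

def poisson_pmf(k, lam):
    """Poisson probability of exactly k goals given mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

home_win = draw = away_win = 0.0
for h in range(MAX_GOALS + 1):
    for a in range(MAX_GOALS + 1):
        p = poisson_pmf(h, LAMBDA_HOME) * poisson_pmf(a, LAMBDA_AWAY)
        if h > a:
            home_win += p   # Man Utd win
        elif h == a:
            draw += p
        else:
            away_win += p   # Swansea win

# Probability mass outside the grid (either team scoring more than 7).
leftover = 1.0 - (home_win + draw + away_win)

print(f"Man Utd {home_win:.1%}, draw {draw:.1%}, Swansea {away_win:.1%}")
print(f"not covered by the table: {leftover:.2%}")
```

Raising MAX_GOALS shrinks the leftover further, but it is already small enough here not to change the picture.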
