help with a mathematical model using the Poisson equation


ngro

Hi, I recently read a paper from the Department of Statistics at Trinity College Dublin called "Creating a Profitable Betting Strategy for Football by Using Statistical Modelling." The work is very interesting: it explains how to build a mathematical model that uses the Poisson distribution to predict the outcome of a game, and it also covers basic concepts about gambling in general. I recommend it to anyone interested in the topic. My problem is that the mathematics is quite complex, and I would like someone to help me. The idea is this:

"In Maher’s model, he suggested that when team i plays at home against team j and the score is (x_ij, y_ij), X_ij and Y_ij are independent Poisson random variables with means αβ and δγ respectively. The parameters represent the strength of the home team’s attack (α), the weakness of the away team’s defence (β), the strength of the away team’s attack (δ), and the weakness of the home team’s defence (γ). He finds that a reduced model with δ_i = kα_i, γ_i = kβ_i for all i is the most appropriate of several models he investigates. Thus, the quality of a team’s attack and a team’s defence depends on whether it is playing at home or away. Home ground advantage (1/k) applies with equal effect to all teams.

The English Premier League consists of 20 teams. The three lowest placed teams will be relegated to Division 1 and three top teams will be promoted to the league from Division 1 after each season. Dixon and Coles applied Maher’s model in their article, where they used all four divisions in the model. They also included cup matches in the analysis and thus obtained a measurement for the difference in relative strengths between divisions. We ignore cup matches in our study. Dixon and Coles had 185 identifiable parameters, because of the number of divisions they dealt with. In our basic model, we use only 41 parameters: an attack and a defence parameter for each team, and a common home advantage parameter. We set Arsenal’s attack parameter to zero as our base parameter.

For clarity, we use slightly different notation than in Maher’s paper. We assume that the number of goals scored by the home team has a Poisson distribution with mean λ_HOME and the number of goals scored by the away team has a Poisson distribution with mean λ_AWAY. One match is seen as a bivariate Poisson random variable where the goals are events, which occur during this 90-minute time interval. The mean λ_HOME reflects the quality of the home attack, the quality of the away defence, and the home advantage. The mean λ_AWAY reflects the quality of the away attack and the quality of the home defence. These are specific to each team’s past performance. The mean of the Poisson distribution has to be positive, so we say that the logarithm of the mean is a linear combination of its factors.

[Equation image: 84824288.png]

This is a simplified equation because in reality there would be team-specific attack and defence parameters, so in the English Premier League, where 20 teams are competing, the number of z_i’s would be 41. This is an example of a log-linear model, which is a special case of the generalized linear models. The theory of generalized linear models was obtained from the books by McCullagh and Nelder (1983), and by Dobson (1990). We can estimate the values of the parameters above by the method of maximum likelihood, assuming an independent Poisson distribution for Y. From now on, we refer to this whole process as Poisson regression.

Eq. 2.2 gives us the expected number of goals scored for both teams in a particular match. Using these values in our bivariate Poisson distribution, we can obtain the probabilities for home win, draw and away win in the following way:

[Equation image: 32075805.png]

P(Home win) = sum of the score probabilities over all combinations where h > a
P(Draw) = sum of the score probabilities over all combinations where h = a
P(Away win) = sum of the score probabilities over all combinations where h < a"

As you can see, it is quite complicated. My biggest problem is the estimation of the parameters, because the paper does not show how the maximum likelihood method is actually carried out. Does anyone know how this is done? What steps should be followed? Is there a way to do it that is not super complicated? I understand that what I am asking is quite involved, but maybe someone is interested in the topic and would like to help. Thanks.

Here is the link to the full PDF: www.ylikerroin.com/file/Complete.pdf
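On the question of how the maximum likelihood estimates can be obtained, here is a minimal Python sketch of one simple route. It assumes the multiplicative form of the model (home mean = home advantage × home attack × away defence weakness, away mean = away attack × home defence weakness), which becomes the log-linear form after taking logarithms. It is not the procedure from the paper, which the quoted text does not show. It just cycles through the parameters and sets each one to the value that makes its likelihood derivative zero while the others are held fixed. The team names, results, and function names are made up for illustration.

```python
# Toy results: (home team, away team, home goals, away goals). Made-up data.
matches = [
    ("A", "B", 2, 1), ("B", "A", 1, 1),
    ("A", "C", 3, 0), ("C", "A", 0, 2),
    ("A", "D", 1, 0), ("D", "A", 1, 2),
    ("B", "C", 2, 2), ("C", "B", 1, 0),
    ("B", "D", 1, 1), ("D", "B", 0, 1),
    ("C", "D", 2, 1), ("D", "C", 1, 1),
]


def fit_poisson_strengths(matches, sweeps=500):
    """Fit attack, defence-weakness and home-advantage factors by cycling
    through the Poisson likelihood equations (hypothetical helper)."""
    teams = sorted({team for m in matches for team in m[:2]})
    attack = {t: 1.0 for t in teams}
    defence = {t: 1.0 for t in teams}   # larger value = weaker defence
    home_adv = 1.0

    # Observed totals that the fitted means have to reproduce.
    goals_for = {t: 0 for t in teams}
    goals_against = {t: 0 for t in teams}
    for home, away, hg, ag in matches:
        goals_for[home] += hg
        goals_against[away] += hg
        goals_for[away] += ag
        goals_against[home] += ag
    total_home_goals = sum(hg for _, _, hg, _ in matches)

    for _ in range(sweeps):
        # Attack of t: goals scored by t divided by the "exposure" it faced.
        for t in teams:
            expo = sum(home_adv * defence[a] for h, a, hg, ag in matches if h == t)
            expo += sum(defence[h] for h, a, hg, ag in matches if a == t)
            attack[t] = goals_for[t] / expo
        # Defence weakness of t: goals conceded by t divided by its exposure.
        for t in teams:
            expo = sum(attack[a] for h, a, hg, ag in matches if h == t)
            expo += sum(home_adv * attack[h] for h, a, hg, ag in matches if a == t)
            defence[t] = goals_against[t] / expo
        # Common home advantage: observed home goals over expected without it.
        home_adv = total_home_goals / sum(
            attack[h] * defence[a] for h, a, hg, ag in matches
        )

    # Fix the scale: the reference team's attack is 1, i.e. 0 on the log scale.
    ref = attack[teams[0]]
    attack = {t: v / ref for t, v in attack.items()}
    defence = {t: v * ref for t, v in defence.items()}
    return attack, defence, home_adv


def expected_goals(home, away, attack, defence, home_adv):
    """Return the fitted (home mean, away mean) for one fixture."""
    return home_adv * attack[home] * defence[away], attack[away] * defence[home]


attack, defence, home_adv = fit_poisson_strengths(matches)
print(expected_goals("A", "B", attack, defence, home_adv))
```

Fixing one team's attack at 1 plays the same role as setting one attack parameter to zero in the log-linear version. With real data one would normally hand the same model to a GLM (Poisson regression) routine in a statistics package rather than code the iteration by hand.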

Re: help with a mathematical model using the Poisson equation

Hi ngro, welcome to the forum! It is a tough task to help you on this topic; it is indeed too complicated, at least for me. I downloaded the full paper, and in my humble opinion it makes for relaxing afternoon reading rather than a system that will make you rich. I think the author admits it himself to some degree; look at page 42, chapter 3.2.1:

As the lambdas vary from match to match, there is no direct way to test the validity of the Poisson assumption (no replicates). However, we can assess whether the assumption holds in an average sense.
So, that paper is his Master's thesis, based on data from the four most recent seasons available at that time, and he probably intended to impress his professors with his mathematical ideas and knowledge rather than to make it usable in the real betting world. Someone should ask him whether he has been using that strategy in the decade since he published his work. ;)

Back on topic: from equations 2.1 and 2.2 that you provided above, I see the prediction is based on the relative strengths of home attack vs. away defense and away attack vs. home defense. That approach strongly reminds me of the work of another professor/student team, described on the Understanding Uncertainty blog in more understandable language: http://plus.maths.org/content/os/issue52/risk/index

They did well for that one matchday; however, would it work in the long run? I don't think so. A well-known issue with Poisson is that it treats events as independent and distributes them in the same way in each calculation; however, goals in a match cannot be considered fully independent. A goal often changes the pace of a match, and the way a match develops can differ from one match to another even if the Poisson inputs are the same.

Additionally, when estimating the inputs for Poisson, both your method and the Understanding Uncertainty method use averages from past matches, which also adds error. As the season progresses, a team's form tends toward its average for that season, but you need both teams to perform at their average level at the same time, which is difficult to expect. If either team underperforms or overperforms, the actual result will not follow your calculation!

I do not want to discourage you, though. I am also a fan of statistics, and that is what I mostly use for my betting analysis; but I am not a fan of an overly sophisticated approach, unless it is for fun or out of curiosity. I believe a simpler formula would work just as well as the calculation presented above, provided your estimation of the input parameters is good enough!

Anyway, good luck with your research, and enjoy your stay at the forum! It would be nice to see your results here if you create a system out of this! :)
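To make the independence assumption concrete: under the model quoted in the first post, the probability of a given score is the product of two Poisson probabilities, and the home win, draw and away win probabilities are sums over the grid of possible scores. Below is a minimal Python sketch of that calculation. The means 1.6 and 1.1, the cut-off of 10 goals per team, and the function names are assumptions chosen for illustration, not values from the paper.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals for a Poisson variable with this mean."""
    return exp(-mean) * mean**k / factorial(k)


def outcome_probabilities(mean_home, mean_away, max_goals=10):
    """Sum independent Poisson score probabilities into home/draw/away.

    Scores above max_goals per team are ignored, so the three numbers add up
    to slightly less than 1.
    """
    p_home = p_draw = p_away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            # Independence: joint probability is the product of the two pmfs.
            p = poisson_pmf(h, mean_home) * poisson_pmf(a, mean_away)
            if h > a:
                p_home += p
            elif h == a:
                p_draw += p
            else:
                p_away += p
    return p_home, p_draw, p_away


# Illustrative means only; in the model they would come from the fitted
# attack, defence and home advantage parameters.
p_home, p_draw, p_away = outcome_probabilities(1.6, 1.1)
print(p_home, p_draw, p_away)
```

If goals are not really independent, as argued above, it is exactly the product inside the double loop that stops being valid, while the summation over scores stays the same.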
