
Rating and Poisson distribution


Charon84


I’m familiar enough with using Poisson for total goals (I tend to use the spread prices as my starting point), but I'm not sure how you would factor the other stats into that. That said, I can see two broad approaches:

  • mash various numbers together, then apply Poisson to that value
  • run various numbers through Poisson, then mash the results together to get your single rating

Either way, when “mashing” (probably not the correct technical term) you can weight the individual values as you see fit, e.g. a x 1 plus b x 0.8 plus c x 0.2.
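The two approaches can be sketched in Python. This is only an illustration under assumptions: the three input values are made up, the weights are the 1 / 0.8 / 0.2 from the example (divided by their total so they act as proportions), and over 2.5 goals is just a convenient market to compare on.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def prob_over(line, lam):
    """P(count > line) for a Poisson count, e.g. line=2.5 means 3 or more."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(math.floor(line) + 1))

# Hypothetical goal expectations a, b, c for one match, from three different inputs.
inputs = [2.9, 2.6, 3.2]
raw_weights = [1.0, 0.8, 0.2]
weights = [w / sum(raw_weights) for w in raw_weights]  # rescale to sum to 1

# Approach 1: mash the inputs into one expectation, then apply Poisson once.
blended_lam = sum(w * x for w, x in zip(weights, inputs))
p_mash_then_poisson = prob_over(2.5, blended_lam)

# Approach 2: apply Poisson to each input, then mash the probabilities.
p_poisson_then_mash = sum(w * prob_over(2.5, x) for w, x in zip(weights, inputs))

print(f"blended expectation: {blended_lam:.2f}")
print(f"P(over 2.5), mash then Poisson: {p_mash_then_poisson:.3f}")
print(f"P(over 2.5), Poisson then mash: {p_poisson_then_mash:.3f}")
```

The two results are usually close but not identical, so it is worth picking one ordering and sticking with it.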


1 hour ago, Charon84 said:

I don't understand the critical part, "mashing" (lol). I'm not an Excel wizard or a statistician.

Me neither, hence my use of the word “mashing”! To give you a simple example, if I had two ways of calculating the goals expectation for a team or player, I might find that the average of the two was more predictive than either one, so I might calculate that and then use Poisson to arrive at indicative odds for overs/unders etc. On the other hand, I might favour a different mix of the two, e.g. 70/30.

Or, for bookings, you might calculate the expected figure by using each team’s average x 0.35 plus the referee’s average x 0.3 (0.35 + 0.35 + 0.3, adding up to 1 or 100%) and use that to work out the odds for various bookings totals.

I’m not sure how you bring other stats into the equation in a way that can utilise Poisson unless you can translate possession etc. into a goals value. xG would obviously be fit for purpose. For instance, you could use 80% of a team's actual goals and 20% of their expected goals if you thought that was more accurate.

Essentially you give each input a weighting and end up with an equation that might look like =(a2*0.5)+(b2*0.3)+(c2*0.2). There’s your mashing!
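Outside Excel, the same weighting can be written as a small helper. This is a rough sketch: the function name `mash`, the bookings averages and the cell values are all made up for illustration, and the check that the weights add up to 1 is my own addition.

```python
def mash(values, weights):
    """Weighted sum of the inputs; the weights must be proportions summing to 1."""
    if len(values) != len(weights):
        raise ValueError("need one weight per value")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights should add up to 1 (100%)")
    return sum(v * w for v, w in zip(values, weights))

# Hypothetical bookings averages (cards per match) for the two teams and the referee.
home_avg, away_avg, referee_avg = 3.6, 4.2, 4.0
expected_bookings = mash([home_avg, away_avg, referee_avg], [0.35, 0.35, 0.30])

# Same shape as the spreadsheet formula =(a2*0.5)+(b2*0.3)+(c2*0.2).
a2, b2, c2 = 1.4, 1.6, 1.1  # made-up cell values
rating = mash([a2, b2, c2], [0.5, 0.3, 0.2])

print(f"expected bookings: {expected_bookings:.2f}")
print(f"blended rating: {rating:.2f}")
```

The expected bookings figure can then be fed into Poisson in the same way as a goals expectation.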


  • 2 weeks later...

@harry_rag I have been experimenting with Excel: just the 'normal' calculations and a Poisson distribution based on home and away goals (data is from this season and the last two). Over/Under 2.5 are the odds from Bet365, and TOO2.5/TOU2.5 are the true odds (over/under 2.5) based on the Excel output.

1. How do you identify a potentially good bet (based solely on this output) when you also have to account for the overround? The difference between the Bet365 pair and the true odds pair is around 5% juice.

2. Do I understand correctly that "shots on target" can also contribute to the "Home Attacking" rating (and thus to the Poisson distribution)? Let's say Arsenal average 5 shots on target per match and the league average is 4; then the shots-on-target rating is 1.25. If I say this contributes 20% to the Home Attack rating, can I take 20% of this value (0.25) plus 80% of the goal rating of 1.10 from Excel (0.88) for a total Home Attack rating of 1.13? I ask because the Poisson goal distribution would then be based partly on shots on target and partly on actual goals, instead of on actual goals only.

 

 

Home         Away            Over 2.5  Under 2.5  TOO2.5  TOU2.5
Leicester    Leeds           1,57      2,37       1,37    4,63
Aston Villa  Southampton     1,72      2,10       1,57    2,87
Burnley      Chelsea         1,90      1,90       2,23    1,84
Newcastle    Brighton        2,20      1,66       2,46    1,69
Norwich      Brentford       2,20      1,66       2,07    1,96
Wolves       Crystal Palace  2,30      1,61       2,69    1,60
Liverpool    West Ham        1,44      2,75       1,78    2,41
Watford      Arsenal         1,90      1,90       1,92    2,15
Man City     Man United      1,57      2,37       1,86    2,23
Tottenham    Everton         1,80      2,00       1,96    2,10
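Both questions come down to a couple of lines of arithmetic, sketched here in Python rather than Excel. The function names are made up for illustration, and the odds are typed with decimal points instead of commas.

```python
def overround(odds):
    """Bookmaker margin: sum of the implied probabilities minus 1."""
    return sum(1.0 / o for o in odds) - 1.0

def blended_rating(goal_rating, sot_rating, sot_weight):
    """Mix a shots-on-target rating into a goals-based rating."""
    return sot_weight * sot_rating + (1.0 - sot_weight) * goal_rating

# Bet365 over/under 2.5 prices for Leicester v Leeds.
margin = overround([1.57, 2.37])

# Arsenal example from question 2: 5 shots on target against a league average of 4.
sot_rating = 5 / 4
home_attack = blended_rating(goal_rating=1.10, sot_rating=sot_rating, sot_weight=0.20)

print(f"overround: {margin:.1%}")
print(f"home attack rating: {home_attack:.2f}")
```

The same `overround` call can be run on any pair in the table to see how much margin sits in each market.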

Edited by Charon84

12 minutes ago, Charon84 said:

I have been experimenting with Excel: just the 'normal' calculations and a Poisson distribution based on home and away goals (data is from this season and the last two).

I'd be wary of using data that goes so far back, especially if you are attaching equal weight to all the data. If I remember correctly, Kevin Pullein usually recommends using about a season's worth of data, e.g. a team's last 38 games. I'll double check what he says in his book when I get a chance.

16 minutes ago, Charon84 said:

How do you identify a potentially good bet (based solely on this output) when you also have to account for the overround? The difference between the Bet365 pair and the true odds pair is around 5% juice.

You just compare your view of the "true" odds to the prices available. If the latter are bigger than the former, then you have a potential bet. It's up to you whether to bet whenever actual > true or to apply a minimum edge (e.g. for my anytime goalscorer system I only bet when I can get my true odds plus a 10% edge on top).
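As a rough sketch of that comparison, using three prices from the table earlier in the thread. The function names are made up, and reading "true odds plus a 10% edge" as "price at least 1.10 times the true odds" is my interpretation.

```python
def edge(book_odds, true_odds):
    """Edge as a fraction: 0.10 means the price is 10% bigger than the true odds."""
    return book_odds / true_odds - 1.0

def is_bet(book_odds, true_odds, min_edge=0.0):
    """True when the available price beats the true odds by at least min_edge."""
    return edge(book_odds, true_odds) >= min_edge

# (selection, Bet365 odds, own "true" odds), with decimal points instead of commas.
candidates = [
    ("Leicester v Leeds over 2.5", 1.57, 1.37),
    ("Leicester v Leeds under 2.5", 2.37, 4.63),
    ("Burnley v Chelsea under 2.5", 1.90, 1.84),
]

for name, book, fair in candidates:
    print(f"{name}: edge {edge(book, fair):+.1%}, "
          f"bet at any edge: {is_bet(book, fair)}, "
          f"bet at 10% edge: {is_bet(book, fair, 0.10)}")
```

Raising `min_edge` cuts the number of qualifying bets, so the threshold is a trade-off between safety margin and turnover.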

 

20 minutes ago, Charon84 said:

Do I understand correctly that "shots on target" can also contribute to the "Home Attacking" rating (and thus to the Poisson distribution)? Let's say Arsenal average 5 shots on target per match and the league average is 4; then the shots-on-target rating is 1.25. If I say this contributes 20% to the Home Attack rating, can I take 20% of this value (0.25) plus 80% of the goal rating of 1.10 from Excel (0.88) for a total Home Attack rating of 1.13? I ask because the Poisson goal distribution would then be based partly on shots on target and partly on actual goals, instead of on actual goals only.

You could do that if you believe it will give you a more accurate metric on which to base your forecasts. I'm sure you could find a better input than actual goals for/against (especially if using data that is "too old") but I'm not convinced SoT is of much value. Some things that I think might be worth considering including in the mix:

  • The spread prices; these will almost certainly be more accurate than what you currently have.
  • The fixed odds prices; 50% each of your price and the bookie's price is likely to improve the accuracy. If you can still get a price better than what you end up with then it's more likely to offer genuine value.
  • The xG (expected goals) numbers. I think some sort of blend of actual and expected goals is more likely to be of use than just actual or actual with SoT.

@harry_rag Thanks.

1. I thought I was using recent data with just two seasons (lol). What is the name of the book you refer to?

2. I certainly will create a system with an edge added, just to account for some variance.

3a. Why do you think spread prices are more accurate? I didn't know about spread prices before, so I read some information about them, but I still don't understand how this will blend into the mix.

3b. So with fixed odds you mean something like this; Leicester (see example above) Bet365 odds 1,57 and True Odds 1,37 means 1,47 odds? How does that help out? When it's 'value' in the first place, it will also be 'value' after using this method (but less).

3c. Where do you get good xG data? Understat? FootyStats? Something else?

I'm very grateful for your help in getting me on my way!


6 minutes ago, Charon84 said:

1. I thought I was using recent data with just two seasons (lol). What is the name of the book you refer to?

https://shop1.racingpost.com/products/rpbetft

I think the season before last is too far back, unless you were using some sort of decay for the older data (e.g. at its simplest, weights of 15% for the oldest season, 30% for last season and 65% for the current one, rescaled so that they total 100%). Even then, I'd just want to establish a sample size and stick to it, be it the last 20, 38 or 50 games.
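A minimal sketch of both ideas, season decay and a fixed sample size. The goals-per-game figures are made up; the season weights are the ones quoted, which total 110%, so the helper divides by the total weight rather than assuming the weights sum to 1.

```python
def decayed_average(values, weights):
    """Weighted average that divides by the total weight, so the weights
    do not have to sum to exactly 1."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def last_n_average(per_game_values, n):
    """Plain average over the most recent n games (the list is oldest first)."""
    recent = per_game_values[-n:]
    return sum(recent) / len(recent)

# Hypothetical goals per game for one team: oldest season, last season, current season.
goals_per_game = [1.2, 1.5, 1.8]
season_weights = [0.15, 0.30, 0.65]  # as quoted; they total 1.10
decayed = decayed_average(goals_per_game, season_weights)

# Hypothetical goals scored in a team's recent matches, oldest first.
recent_goals = [0, 2, 1, 3, 1, 0, 2, 2]
last_five = last_n_average(recent_goals, 5)

print(f"decayed goals per game: {decayed:.3f}")
print(f"last 5 games average: {last_five:.2f}")
```

For a real rating the window would be 20, 38 or 50 games as suggested, not the 5 used to keep the example short.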

10 minutes ago, Charon84 said:

2. I certainly will create a system with an edge added, just to account for some variance.

Yes, you need to arrive at your view of the "true" odds, then establish a margin that you are happy to bet with and that actually generates a worthwhile number of bets. I started out thinking about laying goalscorers at my true odds less a 10% edge but was hardly ever getting matched, so it was profitable in theory but not worth the effort in practice.

17 minutes ago, Charon84 said:

3a. Why do you think spread prices are more accurate? I didn't know about spread prices before, so I read some information about them, but I still don't understand how this will blend into the mix.

To be continued...


Well, I'm probably lazy, but I've used the spread prices for years as the basis for my fixed odds betting on loads of sports and markets. It's a lot easier than building your own predictive models! You can bet your life that their prices will be more accurate than anyone's first tentative attempt at building their own based on simple data and Poisson.

Why might they be accurate? Well, the spread firms have to offer two-way action in terms of buying and selling. What might make them less accurate? Their knowledge of which way people prefer to bet may skew the price up or down. Take tomorrow's Peterborough v Man City game: total goals can be sold at 3.75 and bought at 3.85, with a midpoint of 3.8. In the past I might have taken the midpoint as the true value, but I suspect the sell price would be a better starting point.

If they think a game is in for 3.8 goals and you think it's 2.9, then there's probably something wrong with your ratings! As you start out, I'm pretty sure you'd do better if you took note of the spread prices rather than paid them no heed.
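To show how a spread price can become fixed odds, here is a minimal sketch that treats the total goals quote as the mean of a single Poisson distribution. That is a simplifying assumption of mine, not something the spread firms publish, and the function name is made up.

```python
import math

def prob_over(line, lam):
    """P(total > line) when the total is Poisson with mean lam."""
    under = sum(math.exp(-lam) * lam ** k / math.factorial(k)
                for k in range(math.floor(line) + 1))
    return 1.0 - under

sell, buy = 3.75, 3.85  # Peterborough v Man City total goals spread
midpoint = (sell + buy) / 2

for label, lam in (("sell", sell), ("midpoint", midpoint)):
    p_over = prob_over(2.5, lam)
    print(f"{label}: expectation {lam:.2f}, "
          f"over 2.5 fair odds {1 / p_over:.2f}, "
          f"under 2.5 fair odds {1 / (1 - p_over):.2f}")
```

Starting from the sell price rather than the midpoint gives slightly bigger fair odds for overs and slightly smaller ones for unders.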

32 minutes ago, Charon84 said:

3b. So with fixed odds you mean something like this; Leicester (see example above) Bet365 odds 1,57 and True Odds 1,37 means 1,47 odds? How does that help out? When it's 'value' in the first place, it will also be 'value' after using this method (but less).

Unless you have proved the accuracy of your own ratings I suspect you would do better by only betting where the odds offer (say) a 10% edge over the average odds rather than just over your own odds. The bookies know a lot more than you, for the time being at least!
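Using the Leicester v Leeds over 2.5 prices as a worked example, a rough sketch of the difference between the two thresholds. The function names are made up, and averaging the odds directly (rather than the implied probabilities) simply follows the 1,47 arithmetic in the question.

```python
def averaged_odds(own_odds, book_odds, own_weight=0.5):
    """Blend of your own price and the bookie's price, weighted 50/50 by default."""
    return own_weight * own_odds + (1.0 - own_weight) * book_odds

def required_price(reference_odds, min_edge=0.10):
    """Smallest price that still offers min_edge over the reference odds."""
    return reference_odds * (1.0 + min_edge)

own, book = 1.37, 1.57  # own "true" odds and the Bet365 price

avg = averaged_odds(own, book)       # the 1,47 from the question
needed_vs_own = required_price(own)  # 10% over your own odds
needed_vs_avg = required_price(avg)  # 10% over the averaged odds

print(f"averaged odds: {avg:.2f}")
print(f"bet vs own odds: {book >= needed_vs_own} (need {needed_vs_own:.3f})")
print(f"bet vs averaged odds: {book >= needed_vs_avg} (need {needed_vs_avg:.3f})")
```

On these numbers the price clears the bar against your own odds but not against the average, which is the extra caution being suggested.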

Another mantra that has some merit is: if you think an overs bet is value, check your figures. If you still think it's value, assume you've made a mistake. The theory is that any value is far more likely to come from betting unders because the majority of casual punters simply don't like to bet that way.

37 minutes ago, Charon84 said:

3c. Where do you get good xG data? Understat? FootyStats? Something else?

I don't use it myself but, from what I've read, I'd favour it over other metrics such as shots on target. You'll have to google it for views and sources, though if I find a worthwhile link I'll post it.


54 minutes ago, Charon84 said:

@harry_rag Thanks.

3b. So with fixed odds you mean something like this; Leicester (see example above) Bet365 odds 1,57 and True Odds 1,37 means 1,47 odds? How does that help out? When it's 'value' in the first place, it will also be 'value' after using this method (but less).

EDGE = 1.57 / 1.37 = 1.146

Kelly stake = (edge - 1) / (book odds - 1) = (1.146 - 1) / (1.57 - 1) = 0.146 / 0.57 = 25.6% of bankroll
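The same Kelly calculation as a short Python sketch. The function names are mine; the first version is the textbook form (b * p - q) / b and the second is the edge shortcut used in the post, and the two agree.

```python
def kelly_fraction(book_odds, true_odds):
    """Fraction of bankroll to stake; 0 when there is no edge."""
    p = 1.0 / true_odds  # own estimate of the probability of winning
    b = book_odds - 1.0  # net winnings per unit staked
    return max(0.0, (b * p - (1.0 - p)) / b)

def kelly_from_edge(book_odds, true_odds):
    """Shortcut form: (edge - 1) / (book odds - 1), with edge = book / true."""
    edge = book_odds / true_odds
    return (edge - 1.0) / (book_odds - 1.0)

stake = kelly_fraction(1.57, 1.37)
print(f"Kelly stake: {stake:.1%} of bankroll")
print(f"shortcut form: {kelly_from_edge(1.57, 1.37):.1%}")
```

Full Kelly assumes the true odds are right, so many people stake only a fraction of the figure it gives.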

Edited by Valiant Thor
Forgot the kelly stake

https://www.racingpost.com/sport/liverpool-and-leeds-served-sevengoal-premier-league-feast-with-much-to-digest/450882

Here's one article on the subject of xG. I may have read a suggestion that this site was as good as any in terms of freely available xG stats. (I think there are differing methods of calculation).

https://projects.fivethirtyeight.com/soccer-predictions/premier-league/


On 2/28/2022 at 9:33 PM, harry_rag said:

If they think a game is in for 3.8 goals and you think it's 2.9 then there's probably something wrong with your ratings! As you start out I'm pretty sure you'd do better if you took note of the spread prices rather than paid them no heed.

How do I calculate the number of goals I expect? Is it home strength (home attack rating * away defence rating * home goals average) + away strength (away attack rating * home defence rating * away goals average)? That gives the following (first number is my calculation, second is the spread midpoint). Many are way off (these calculations are based on data from this season only).

My calculation  Spread midpoint
4,432366        3.2
3,927636        2.75
1,659885        2.6
2,659768        2.4
2,291417        2.3
1,654608        2.2
2,830698        3.25
3,655371        2.6
3,537423        3.05
2,748091        2.85
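The formula in the question, written out as a short Python sketch. All the ratings and averages below are made up, and I am assuming the "goals average" terms are the league's average home and away goals per game, with ratings expressed relative to 1.0.

```python
def expected_goals(attack_rating, defence_rating, goals_average):
    """Expected goals for one side: own attack x opponent defence x venue average."""
    return attack_rating * defence_rating * goals_average

# Hypothetical league averages: home and away goals per game.
home_goals_avg, away_goals_avg = 1.5, 1.2

# Hypothetical ratings (1.0 = league average).
home_xg = expected_goals(attack_rating=1.2, defence_rating=1.1, goals_average=home_goals_avg)
away_xg = expected_goals(attack_rating=0.9, defence_rating=0.8, goals_average=away_goals_avg)
total_goals = home_xg + away_xg

print(f"home {home_xg:.2f} + away {away_xg:.2f} = total {total_goals:.2f}")
```

Because the ratings multiply, two teams that are both well above average quickly produce extreme totals like the 4,43 in the first row.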

 

Just flicking through the chapter on total goals in that book I mentioned. Kev's view is that you should give about 75% of the weight to what happens in a typical match involving teams of that "strength" and 25% to what has happened in games involving the teams in question (I'm guessing he'd divide a league up into different tiers in terms of team strength). I suspect your figures would immediately look more realistic if you "mashed" 25% of your team specific figures with 75% using the league average.

The logic is that teams involved in high/low scoring games in the past will continue to be higher/lower than the average but not by as much (regression to the mean perhaps). So by basing a rating on two higher than average teams with no reference to what happens on average you end up with an extreme figure that is unrealistic for betting purposes.
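A rough sketch of that 25/75 "mash", applied to the ten totals posted above. The league average of 2.8 goals per game is purely an assumption for illustration, and using a flat league average is a simplification of weighting by matches between teams of similar strength.

```python
def shrink(team_specific, league_average, team_weight=0.25):
    """Pull a team-specific figure towards the league average."""
    return team_weight * team_specific + (1.0 - team_weight) * league_average

def mean_abs_error(xs, ys):
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Figures from the earlier post, with decimal points instead of commas.
my_totals = [4.432366, 3.927636, 1.659885, 2.659768, 2.291417,
             1.654608, 2.830698, 3.655371, 3.537423, 2.748091]
spread_mids = [3.2, 2.75, 2.6, 2.4, 2.3, 2.2, 3.25, 2.6, 3.05, 2.85]

LEAGUE_AVERAGE = 2.8  # assumed goals per game, not taken from the thread

shrunk = [shrink(t, LEAGUE_AVERAGE) for t in my_totals]
mae_before = mean_abs_error(my_totals, spread_mids)
mae_after = mean_abs_error(shrunk, spread_mids)

print(f"average gap to the spread before: {mae_before:.2f}")
print(f"average gap to the spread after:  {mae_after:.2f}")
```

Under that assumed league average the shrunk totals sit closer to the spread midpoints on average, though the result depends on the league figure chosen.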


12 hours ago, Charon84 said:

Going to order the book. Read same type of book for betting on horses while ago...good stuff.

Personally I wouldn't bother; it's not really what you're looking for (junk for the masses).

12 hours ago, Charon84 said:

 Additionally; I like messing around with data and Excel.

THIS IS A BETTER BOOK FOR WHAT YOU WANT
With this you'll learn how to set up in Excel some of the more common models you need, in an easy-to-follow, plain English format (not math heavy).
Distribution models:
Poisson
Zero Inflated Poisson
Negative Binomial
Geometric
Uniform
(unfortunately no Bivariate Poisson)
It also shows you how to set up chi-squared tests for each model to check which one is performing best at the present time (models are dynamic rather than static).
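For anyone wanting to try the chi-squared check outside Excel, here is a minimal sketch for the plain Poisson case. The sample of match totals is made up, the function names are mine, and I do not know how the book itself sets the test up, so treat the pooling of 5+ goals as one reasonable choice.

```python
import math
from collections import Counter

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def chi_square_poisson(goal_totals, max_bin=5):
    """Chi-squared statistic for a Poisson fit, pooling totals of max_bin or more."""
    n = len(goal_totals)
    lam = sum(goal_totals) / n  # fit the Poisson mean from the sample
    observed = Counter(min(g, max_bin) for g in goal_totals)
    expected = [n * poisson_pmf(k, lam) for k in range(max_bin)]
    expected.append(n - sum(expected))  # pooled tail: max_bin or more goals
    stat = sum((observed.get(k, 0) - e) ** 2 / e for k, e in enumerate(expected))
    return stat, lam, expected

# Made-up sample: number of matches that finished with 0, 1, 2, ... total goals.
counts = {0: 8, 1: 18, 2: 25, 3: 22, 4: 14, 5: 8, 6: 5}
goal_totals = [g for g, c in counts.items() for _ in range(c)]

stat, lam, expected = chi_square_poisson(goal_totals)
print(f"fitted mean: {lam:.2f}, chi-squared statistic: {stat:.2f}")
```

Running the same statistic for each candidate distribution on the same data gives a like-for-like comparison; the smaller the statistic, the closer the fit.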

 

Sorry to rain on your parade but...
Poisson didn't work in the 80s when I was first experimenting with it, and it still doesn't work now. (Don't even go there for CS prediction with any of them.)
With sports betting you first have to think about it logically...
1] If a simple distribution method that every man and his dog can use could find value bets, those bets would become over-bet, and hence there would be no value.
2] Bookmakers have better models and more information than you could ever have, so they would adjust accordingly; therefore no value.
On a less negative note:
Some models are better at pointing you in the right direction than others (which is what they should be used for, and not as "set in stone" predictions).
Of the above, I would rank them in predictive order for W-D-L success:
Zero Inflated Poisson
Uniform
Poisson

Enjoy your model building

ATB
VT :ok

 


9 hours ago, Valiant Thor said:

Some models are better at pointing you in the right direction than others (which is what they should be used for, and not as "set in stone" predictions).
Of the above, I would rank them in predictive order for W-D-L success:
Zero Inflated Poisson
Uniform
Poisson

Any thoughts on the best distribution(s) and/or reading matter specifically from an anytime goalscorer perspective?


14 minutes ago, harry_rag said:

Any thoughts on the best distribution(s) and/or reading matter specifically from an anytime goalscorer perspective?

I must admit it's not a subject I've come across much in my reading.
It's a very select market, that's for sure.
It depends on what data you have and how much of it you have.
Goalscorers can go an age between goals and then have a run of a lot all at once, hence a lot of outliers (not good).
So I suppose linear weighting would be the way to go, but that works best on high-scoring games: baseball, NFL, basketball etc.
Monte Carlo simulation is another track which would fit the data, but due to the lack of goals scored, even over a season, there's not enough data.

Take a look at THIS SITE (Tony's a top sports modeller).
It's an Aussie football site, but there could be a few models on player score ratios somewhere in there that you can adapt your data to fit.
In the meantime I'll have a look on a few of my portable drives to see if I have anything vaguely related to CG scorer models :unsure

 

ATB
VT :ok

 

 

 

 


@harry_rag

I'd had a couple of bottles of Marques de Caseres last night so didn't really have my thinking head on.
Basically what you want to predict is the probability of a binary (yes/no) event occurring.
Logistic regression would be the easiest model to look at (IMO).

P(scores) = 1 / (1 + a * X^b)
where X is the bookie's fractional odds (e.g. 3 for 3/1) and a and b are fitted parameters that give the best fit to the model

e.g.
Say the book gives 4.0 (3/1) for Player A to score anytime.
If the bookie's odds for scoring were 100% correct then the model would be P(A scores) = 1 / (1 + X) = 1 / (1 + 3) = 1/4 = 25%
But by putting your past data into a log reg model it may give you factors of 1.1 for a and 1.3 for b
This would then give the model P(A scores) = 1 / (1 + 1.1 * 3^1.3)
P(A scores) = 1 / (1 + 1.1 * 4.17) = 1 / 5.59 = 18%
This would mean the bookies overestimate the chance of Player A scoring at 25%, against his actual chance of 18%, giving a negative edge.
Ex Value = 3 * 0.18 - 1 * 0.82 = 0.54 - 0.82 = -0.28 or -28%

There are plenty of vids on YouTube to help set up log reg.
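The worked example translates directly into Python. This sketch only evaluates the model with the illustrative a = 1.1 and b = 1.3; it does not fit them from data, and the function names are made up. The second function shows why this counts as logistic regression: the same probability comes out of a logistic curve applied to ln(a) + b * ln(X).

```python
import math

def p_scores(fractional_odds, a=1.0, b=1.0):
    """P(scores) = 1 / (1 + a * X^b), with X the fractional odds (3 for 3/1)."""
    return 1.0 / (1.0 + a * fractional_odds ** b)

def p_scores_logistic(fractional_odds, a=1.0, b=1.0):
    """Same model written as a logistic function of the log of the odds."""
    z = math.log(a) + b * math.log(fractional_odds)
    return 1.0 / (1.0 + math.exp(z))

def expected_value(p, fractional_odds, stake=1.0):
    """Average profit per bet: win the odds with probability p, else lose the stake."""
    return p * fractional_odds * stake - (1.0 - p) * stake

x = 3.0  # 3/1, i.e. 4.0 in decimal odds
p_book = p_scores(x)                 # bookie's odds taken at face value
p_model = p_scores(x, a=1.1, b=1.3)  # illustrative fitted parameters
ev = expected_value(p_model, x)

print(f"implied by the odds: {p_book:.0%}, model: {p_model:.1%}, EV: {ev:+.1%}")
```

With real data, a and b would come from regressing scored / did not score on the log of the odds, which is what the log reg tutorials cover.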

ATB
VT:ok

 


 

Edited by Valiant Thor
posted itself whilst doing draft

1 hour ago, Valiant Thor said:

I'd had a couple of bottles of Marques de Caseres last night so didn't really have my thinking head on

:lol Thanks for the replies. I'll have a look at that. I embarked on a data gathering exercise (now up to around 1100 players) of players within certain price parameters. That's turned into the system I'm currently trying out in "Systems and Strategy". Basically I'm using the spread prices to arrive at "fair" and "back" (fair +10% edge) odds. It does involve the use of the dreaded Poisson but one has to start somewhere!


13 minutes ago, harry_rag said:

It does involve the use of the dreaded Poisson but one has to start somewhere!

There's not that much wrong with Poisson; it does what it says on the tin, so to speak.
It's just that people tend to take its result as "carved in stone" and are surprised when everything doesn't turn out hunky dory.
That's why models should be checked with other methods (chi etc.) for validity.

Zero Inflated Poisson is a better model for football IMO, as it takes into account the additional number of 0-goal outcomes that occur.
Maybe that model would be better suited to AGS, as there are more times they have 0 goals than 1+ goals.
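For reference, a minimal sketch of what zero inflation changes. The parameters are made up (0.4 expected goals for a player and a 10% extra chance of a blank), and the name `extra_zero` is mine.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, lam, extra_zero):
    """Zero-inflated Poisson: with probability extra_zero the count is forced
    to 0, otherwise it follows an ordinary Poisson with mean lam."""
    base = (1.0 - extra_zero) * poisson_pmf(k, lam)
    return extra_zero + base if k == 0 else base

# Hypothetical player: 0.4 expected goals and a 10% extra chance of no goals.
lam, extra_zero = 0.4, 0.10

p_scores_poisson = 1.0 - poisson_pmf(0, lam)
p_scores_zip = 1.0 - zip_pmf(0, lam, extra_zero)

print(f"P(scores), Poisson: {p_scores_poisson:.1%}")
print(f"P(scores), zero-inflated: {p_scores_zip:.1%}")
```

The extra zeros lower the chance of scoring at least once, so the fair anytime goalscorer odds come out bigger than under plain Poisson.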

G'luck
VT :ok

