
A football rating system, discussions, ideas



Re: A football rating system, discussions, ideas

Right - going to push this along a bit now I've got some time. Just a reminder of where we are (finishing Ch 2) - and what is coming...

Intro - Ratings Lists, Metrics
1. Generating a rating list - Least Squares solution
2. Generating a rating list - A simplified LS solution + another solution that approximates a LS solution method.
3. Creating data for backtesting
4. Using a rating list as a prediction model
5. Assessing the Model - How good is it?
6. Improving the Model - Fine tuning, combining different metrics, starting again with a new metric (that's what I normally do!!)
7. Finally, the model versus the Bookie!

So, to round off chapter 2...


Re: A football rating system, discussions, ideas

So, we have a basic performance metric here (goals won by, adjusted for home advantage). We also have several versions of a LS solution using this metric - each later version making some assumptions (same number of games played etc.) and making the calculations a little simpler. In addition, (my) experience tends to suggest that these simplified versions perform a little better than the pure matrix versions.

Is there anything else we want to add? Well - I think most people would agree that more recent performances should outweigh earlier ones - so we'll look at this next.

At present, using the formulas we have to create ratings, you have to fix a point in the past to act as an 'anchor' for all the ratings that follow. You can move the position of this anchor as often as you like - say you want to produce a rating for a team based on the last 19 games - just keep changing the starting point, give all the teams a start rating, and use the formulas to come up with a new rating. I've not really experimented much with this method - and I'm sure there is some useful work to be done there.

Instead, I adopt the very common approach of applying a moving average, or more correctly an exponential moving average, to the ratings. There are different ways of doing this - could be a fair discussion on this later. But they all have the advantage that once your ratings are established, a subsequent rating is determined with more emphasis given to the last rating - less to earlier ratings. It sounds very complicated - but actually the computations become even easier - as all that needs to be known to update a rating is the previous one, the opponent's previous rating, and the new result. We don't need to keep count of matches played etc. It is important however to 'run the ratings in' - a couple of seasons is more than adequate for most situations I feel.

So, here at the end of our LS ratings approach journey, is the most simplified way to incorporate all the features we have discussed - I'm sure this approach is very familiar to ratings fans!! - a simple result that can follow from all the earlier results...

New Rating = Old Rating + 1/n (Actual Result - Expected Result)

with

Expected Result = Difference in old ratings, with home advantage added.

Now the choice of value for n is very significant - but it is a choice, so I'm sure there will be discussion on this. It is linked to how important you would like the more recent games to be - the old class v form debate again... I tend to use n as about 20 - but later on I'll explain a bit more what n is. I have a spreadsheet to put up now that uses this formula to generate a ratings list for the EPL 1993-2000 that we can use for testing...
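For anyone who prefers code to spreadsheets, the update rule above could be sketched roughly like this. It is only an illustration: the function names are made up, the rating is updated from the home side's point of view, and the 0.4 default is the home advantage figure used elsewhere in this thread rather than anything fixed by the formula itself.

```python
def expected_result(home_rating, away_rating, home_advantage=0.4):
    """Expected goal margin for the home side: rating gap plus home advantage."""
    return home_rating - away_rating + home_advantage


def update_home_rating(home_rating, away_rating, goal_diff, n=20, home_advantage=0.4):
    """New Rating = Old Rating + 1/n * (Actual Result - Expected Result).

    goal_diff is home goals minus away goals; n controls how quickly
    recent results override older ones (larger n = slower change).
    """
    expected = expected_result(home_rating, away_rating, home_advantage)
    return home_rating + (goal_diff - expected) / n


# Illustrative figures only: a 10.5-rated home side draws with a 9.0-rated visitor.
# Expected margin is 1.9, so the home rating drops by 1.9 / 20.
print(round(update_home_rating(10.5, 9.0, 0), 3))
```

Because only the two old ratings and the new result go in, nothing else needs to be stored between matches.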


Re: A football rating system, discussions, ideas

Chapter 3 - Creating data for back-testing.

First a quote from Mr Computer himself...

On two occasions I have been asked, 'Pray, Mr Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - Charles Babbage

With those wise words - I suppose it's from here on in that the greatest care must be taken. Anyone can use the formulas from Chapter 2 - taking the spreadsheet I give here - and come up with ratings. However, a lot of thought should go into the choice of metric - and whether the assumptions that we have made about the choice of LS solution can cope with what we are asking of the ratings system.

We should also think carefully about the data we are going to use, and how best to use it. I freely admit to being a subscriber to the www.football-data.co.uk website run by PL member 'Joe'. A fantastic database of past results, but more importantly for here, bookies' odds for past matches. Without that site there couldn't really be a strategy and systems page here, so I hope it is ok to mention this site. Anyway, I shall be using the past results from that database, which of course are freely available at many places on the internet - but will never post up the odds for each game - instead recommending any readers here to purchase the access rights themselves. (Mods - If I'm wrong posting the link here - please let me know...no offence intended to anyone, just highlighting what I believe to be an essential website.)

As to how best to use the data you have - the above website gives results and odds from 1993 to the present day. The spreadsheet I give below generates ratings from 1993 - 2000 (running the ratings in for the first two years). We'll analyse these - then test out our conclusions on data from 2000 - 2005. An arbitrary choice, but certainly one that has to be made to give integrity to, and confidence in, any findings.


Re: A football rating system, discussions, ideas

Hi MO, with regard to the 4-3 compared with 1-0 situation. This boils down to the goals expectancy for a particular game. Teams that make a habit of 1-0's are possibly going to be involved in games that are less goal laden. If you use a LS to obtain likely goal supremacies, then a team that is say 0.5 of a goal superior to another will have slightly different % win expectations if the game is going to have say 2.5 goals compared to say 3.1.

When that goal came, imo, isn't particularly relevant (the timing's just a discrete occurrence of a goal probability distribution). And I'd be surprised if early goals are anything more than transient trends that don't repeat. A bit like teams drawing lots of games one season & returning to normal the next.

I've looked at using shots on target etc & correcting the score for unlucky teams, but I'm far from convinced it's a useful exercise. Missed chances & great saves aren't unusual, that's why a soccer game usually averages less than 2.5 goals per game. Losing teams may have lots of shots simply because their opponents are allowing them to attack more often. If a team was genuinely unlucky, subsequent games will probably show you this & the unlucky game will lose its significance. Presumably you'd also have to downgrade teams that scored from fewer shots, which would hit teams with clinical finishers.

I've also gone the home/away route & again wonder if home or away specialists aren't just a transient trend that doesn't persist, given credence with the benefit of hindsight? Most home specialist systems I've seen have been backfitted to death. Getting 38 home games to analyse takes twice as long as getting 38 home/away games. By going venue specific you lose out hugely on the recentness of the results & I think this kills the potential benefit.

Finding a metric that beats the books is going to be very difficult. You really need a book to "take a view" or deliberately price a game wrongly because of anticipated punter demand to get an edge. That imo is where most value bets are to be found. Quarbs in spread betting for example.

Good stuff

D.
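The point about goal supremacy and total goals can be illustrated with a rough sketch. This assumes each side's goals follow independent Poisson distributions, which is a common simplification and not something D states; the function names and the 15-goal cut-off are also just choices made for the example.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals for a Poisson-distributed score."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def match_probabilities(supremacy, total_goals, max_goals=15):
    """Home/draw/away probabilities for a given supremacy and total goals.

    Assumes independent Poisson scores with means chosen so that
    home_mean - away_mean = supremacy and home_mean + away_mean = total_goals.
    """
    home_mean = (total_goals + supremacy) / 2
    away_mean = (total_goals - supremacy) / 2
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Same 0.5 goal supremacy, two different goal expectancies.
for total in (2.5, 3.1):
    home, draw, away = match_probabilities(0.5, total)
    print(f"total {total}: home {home:.3f}, draw {draw:.3f}, away {away:.3f}")
```

Under these assumptions the draw becomes less likely as the goal expectancy rises, so the same supremacy figure translates into slightly different win percentages.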


Re: A football rating system, discussions, ideas

Hi there Mr. O, great to see you back in the driving seat, have missed your postings. Your new rating formula looks logical and very easy to follow in this format. As you say, the class v form debate comes to mind again, and it becomes a question of individual taste as to just what constitutes recent performances and how far back you wish to go in order to try and produce reasonable ratings. At first glance, twenty looks fairly reasonable, but then you suddenly realise that this figure constitutes nearly half a season. Is this leaving us with a base rating that may be too old? There will be many that argue in favour of something like the last six games to provide form guidance, so I think that a happy medium will need to be reached prior to commencement.

Btw I like the quotation by Mr. Cabbage, "If you put the wrong figures into the machine, will the right ones come out?". I wish that I had thought of this in my NN thread, as I really think it says it all.

Looking forward to your spreadsheet, which I note there is a problem with at the moment. Can't help with that I'm afraid! Cheers. ;)

On second thoughts, is it not possible to zip it up into two separate files?


Re: A football rating system, discussions, ideas

New Rating = Old Rating + 1/n (Actual Result - Expected Result) with Expected Result = Difference in old ratings, with home advantage added.

So using the above equation, does that mean that you remove the (1 / (NumA + 1)) part, and replace it with the 1/n part, so it is not linked to the number of games played?

So for a 3-0 win and old ratings of 10 and 8, team A's new rating

= 10 + 0.05 * ((3 - 0) - (10 - 8 + 0.4))
= 10 + 0.05 * 0.6
= 10.03

Is this correct?

Also, before this game occurred, would the prediction for the match be

= 10 - 8 + 0.4
= 2.4

Is this correct?

Many thanks in advance. Eagerly watching as this thread unfolds. Still a belter in my book. :ok

Re: A football rating system, discussions, ideas

Hi Muppet,

Yes - your calculations look right to me.

As to what n means, and what value you should choose - well it is a choice, and I prefer to take a value around 20 - gives good results for most of my work anyway. My original thinking was taken from the previous coefficient being 1/(number of games played + 1), so I like to think of half an EPL season, or 19 games. But I'm not sure this reasoning is sound now :( I've just looked at a piece of maths I'd ignored for many years now - I knew I'd benefit from this thread too!! - and I might need to look into n a bit more...more on this later...

However - the good news is :D in practice - because the value of n is just a subjective choice - it can be played around with a bit to produce ratings. Of course - a better approach is to understand what effect n has on the ratings - so here is a little spreadsheet for everyone to show this. I hadn't ever bothered making this spreadsheet before - but it's quite fun!!

The spreadsheet shows you the weighting effect of changing n - and how much previous games contribute to the new rating. The new factor for me to think about is the number of games in the rating period you want to affect the new rating, let's call this m. It's this value I want to be 19, and it's interesting to see what happens as you change the value of n. When n=m I think it's just what I've always been using - but it doesn't have to be this way. Anyway - I've got something to look into a bit further. As far as the thread goes - we can still carry on with the same formula above, choosing a value of n - but there may be some refinements we can make later - exciting!!

To use the spreadsheet, just experiment with changing the values of m and n in the yellow boxes. Any red part of the spreadsheet is to be ignored after a calculation has been performed. The results show how much weighting a previous game has on the new rating - most recent at the top.

Here's the maths for those interested...

New Rating = Old Rating + (1/n) (Actual Result - Expected Result)

Now with R_k being the rating after k games played, OppRat_k the opponent's rating after k games, p_k the actual result in the kth game, and A_k the value of p_k + OppRat_(k-1):

R_1 = R_0 + (1/n) (p_1 - (R_0 - OppRat_0))
    = ((n-1)/n) * R_0 + (1/n) * A_1

R_2 = ((n-1)/n) * R_1 + (1/n) * A_2
    = ((n-1)/n)^2 * R_0 + ((n-1)/n) * (1/n) * A_1 + (1/n) * A_2

In general,

R_m = ((n-1)/n)^m * R_0 + (1/n) * A_m + ((n-1)/n) * (1/n) * A_(m-1) + ... + ((n-1)/n)^(m-k) * (1/n) * A_k + ...

Just to note, as m increases - that is, as the number of games included in the rating period increases - the contribution of R_0, the initial rating, tends to zero. This shows that the choice of initial ratings is arbitrary, and after a short time the rating system settles in.

Anyway, it's all programmed into the spreadsheet - interested in any comments on the values of m and n.
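The weightings in the general formula can also be checked with a few lines of code. This is a sketch of the maths above and not a copy of the spreadsheet, which may treat m differently; the function names and the sample values are invented for the illustration.

```python
def game_weights(n, m):
    """Weights from the expansion of R_m, most recent game first.

    The game played j matches ago carries weight (1/n) * ((n-1)/n)**j,
    and the initial rating R_0 carries ((n-1)/n)**m.
    """
    decay = (n - 1) / n
    weights = [(1 / n) * decay ** age for age in range(m)]
    initial_weight = decay ** m
    return weights, initial_weight


def iterate_rating(r0, a_values, n):
    """Apply R_k = ((n-1)/n) * R_(k-1) + (1/n) * A_k game by game (oldest first)."""
    rating = r0
    for a in a_values:
        rating = ((n - 1) / n) * rating + a / n
    return rating


weights, initial_weight = game_weights(n=20, m=19)
print("weight on most recent game:", round(weights[0], 4))
print("weight left on the start rating:", round(initial_weight, 4))

# The step-by-step update and the expanded formula should agree.
a_values = [11.0, 9.5, 12.0, 10.2]  # made-up A_k values, oldest first
w, w0 = game_weights(n=20, m=len(a_values))
closed_form = w0 * 10.0 + sum(wt * a for wt, a in zip(w, reversed(a_values)))
print(abs(closed_form - iterate_rating(10.0, a_values, 20)) < 1e-12)
```

The game weights and the start-rating weight always add up to 1, which is why the start rating fades out as more games are played.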

conv_14.xls


Re: A football rating system, discussions, ideas

I'm emailing a copy of the ratings spreadsheet to Datapunter - let's hope he can squeeze the 1.73Mb file onto here :ok When it's up - I'll explain what it does - and doesn't....there's a lot of 'tidying up' and assumptions made with this particular rating list - and I'd hate to think anyone goes off immediately and uses that as a basis for betting lots of money :( A lot to say yet....


Re: A football rating system, discussions, ideas

Done it :D Had to remove some 'blank' sheets.

These ratings are very 'rough'. Absolutely no thought has been put into promoted teams, or teams going down and then coming back in again with their previous ratings etc. There is a lot like that I would tidy up here before analysing these. Still - it shows the basic way of producing a rating list. The ratings in red are during a 2 season 'run in' period - and I wouldn't use those to analyse how good the ratings are. Something to play around a bit with anyway...

Corrected version

conv_16.zip


Re: A football rating system, discussions, ideas

I can get it fine my end. As you say, it needs some thought as to teams being demoted and then promoted to the EPL a few years later, picking up the same rating. Also, shouldn't column K, the rating adjust, connect to column G, the prediction, as this includes the 0.4 home advantage? At the mo it just looks at the difference in the raw ratings.


Re: A football rating system, discussions, ideas

also shouldn't column K, the rating adjust, connect to column G, the prediction, as this includes the 0.4 home advantage. at the mo it just looks at the difference in the raw ratings?
Absolutely Muppet - my mistake I'm afraid. :( Thanks. I'll correct the download above - or else anyone can manually change it simply like this: the cell K3 should read (1/20)*(H3-G3). Then copy K3 and paste into the whole of col K.

Re: A football rating system, discussions, ideas In theory it should be possible to upload files up to 2Mb , but practically anything above about 800 - 900 Kb won't work. That is plenty for most purposes btw. If there is need for bigger files send them to me and i can use the backdoor. :ok


Re: A football rating system, discussions, ideas

Thanks Datapunter - will remember that for later ;)

Right - going to hack on here... A little change to the running order first...

Intro - Ratings Lists, Metrics
1. Generating a rating list - Least Squares solution
2. Generating a rating list - A simplified LS solution + another solution that approximates a LS solution method.
3. Creating data for backtesting
4. The rating list as a prediction model - How good is the model?
5. The model versus the Bookie!!
6. Improving the Model - Fine tuning, combining different metrics, starting again with a new metric (that's what I normally do!!)


Re: A football rating system, discussions, ideas

Chapter 4 - The rating list as a prediction model - How good is the model?

Prediction is very difficult - especially if it's about the future - Niels Bohr

Obviously not a betting man...

Now we're at an exciting stage with our model. The Big question - are the ratings any good? Now, I'm certainly not a statistician, and my requirements from a ratings system are fairly modest. (profit!!!!) Anyway, the next section will be short on statistical rigour, but sound on general principles. I'll leave the big stats boys to argue over the finer points here - statistics as a discipline seems like one big argument to me anyway.

Key words here are - keep it simple.

First principle - we're going to leave the bookies well out of this for a bit. We know they've got a good model - we first need to check out ours for soundness.

Next - a very crude check for the reasonableness of the ratings. I like to do this before I spend a lot of time tidying up the ratings - ie promoted teams distorting ratings. If the model passes this stage - it's worth carrying on - if not - a new metric is needed and it's back to post #1 :(

Right, some assumptions and ball park figures - please don't tell me how inaccurate I am here - rough is ok... EPL tends to be fairly constant in its ratios of homes:draws:aways. It might be a bit different for other divisions and leagues - but for EPL the following are close...

Homes...45% to 50%
Draws...around 30%
Aways...25% to 30%

If the ratings list has any credibility, the ratio of predictions has to be around these figures.

So, I first of all sort our list of ratings - see spreadsheet attached below.

Homes - playing it safe, I take the top 40% as home predictions. That's about the first 760 matches in the sorted list. That works out roughly at a ratings prediction of 0.60 or better.

Aways - take the bottom 20%. That's matches from about 1520 and down. That's ratings of -0.20 or less.

Draws - I take a 20% band from 50% to 70%. Matches 950 to 1330. In this case that's 0.40 to 0.00 (stretching just a little). That's an encouraging start, given that the prediction ratings add 0.40 for home advantage.

The idea here is to be very conservative - no overlaps with the boundaries. A little stretch here and there to make the boundaries nicer - why not? It's all very rough at this stage... That gives this:

         Prediction         Number   Correct   %      Target %
Homes    0.60 and over      766      483       63.1   50
Draws    0.00 to 0.40       407      132       32.4   30
Aways    -0.20 and below    398      151       37.9   30

Now to me, that's a very encouraging result, certainly worth pursuing. Time to tidy up the rating list;)
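A rough sketch of how a table like the one above could be tallied from a list of predictions and results. The band boundaries are the ones quoted in the post; the function name, the input layout and the sample matches are invented for the example and are not real data.

```python
def band_strike_rates(matches, home_min=0.60, draw_range=(0.00, 0.40), away_max=-0.20):
    """Count selections and correct calls in each prediction band.

    matches: iterable of (prediction, home_goals, away_goals), where
    prediction is the rating difference with home advantage added.
    Predictions falling between the bands are ignored, as in the post.
    Returns {band: (number, correct, percent_correct)}.
    """
    tallies = {"Homes": [0, 0], "Draws": [0, 0], "Aways": [0, 0]}
    for prediction, home_goals, away_goals in matches:
        if prediction >= home_min:
            band, correct = "Homes", home_goals > away_goals
        elif draw_range[0] <= prediction <= draw_range[1]:
            band, correct = "Draws", home_goals == away_goals
        elif prediction <= away_max:
            band, correct = "Aways", home_goals < away_goals
        else:
            continue  # falls in a gap between bands: no selection
        tallies[band][0] += 1
        tallies[band][1] += int(correct)
    return {
        band: (number, correct, 100.0 * correct / number if number else 0.0)
        for band, (number, correct) in tallies.items()
    }


# Made-up matches purely to show the mechanics.
sample = [(0.85, 2, 0), (0.70, 1, 1), (0.30, 0, 0), (0.10, 2, 1), (-0.35, 0, 1), (0.50, 1, 0)]
for band, (number, correct, pct) in band_strike_rates(sample).items():
    print(f"{band}: {number} selections, {correct} correct, {pct:.1f}%")
```

Leaving the gaps between bands unselected mirrors the "no overlaps with the boundaries" idea in the post.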

conv_19.xls


Re: A football rating system, discussions, ideas

I can see now why you prefer n=20. Eager to see if you get prices the same way as me.

Just going back a step Mr O (as I know I regularly do): what if all league clubs were applied to the ratings method, starting all clubs off on the same value, eg 2? After the first season the top club in the EPL may be on 2.5 and the 3 demoted teams at the bottom may average 0.5. Would it not be sensible to then restart the ratings for that season with say all the EPL on 10 and the E1 on 2.5 - 0.5 = 2 less than the EPL, so that's 10 - 2 = 8? The same could be done for the other divisions, and any new team entering the league could be given an average of the demoted teams from E3.

By doing this the leagues, after settling for a season or two, would show some overlap, hence allowing promoted and demoted teams to keep their points with them for the following season. By choosing 4 divs, it also gives about 20,000 games ready for back testing.


Re: A football rating system, discussions, ideas

Muppet77

The approach I was thinking of with promoted teams is to look back at the data and take an average of how promoted teams fared in their first season in whichever league, going back several seasons, then use this as a base value. I think it's relevant to the league as well, as the gap from 1st to Premier is much bigger than say 3rd to 2nd.


Re: A football rating system, discussions, ideas

Hi Muppet,

I certainly encourage you to try those kinds of ideas out for yourself - and with 20,000 games of backdata you should be able to come up with a test and be very confident of whether or not this is a viable approach. It's an appealing idea to have all the clubs included in the same rating list - and I'd like to see something like this work. Good luck with this and please post up any findings :ok

I'd suggest you compute all the ratings in the way you describe - then test the predictions in two categories - first with all matches predicted - then excluding from the list any games where a promoted/relegated team is involved - perhaps until a half season has been played, to let the rating run in.

In fact, I'm about to do a similar thing here - by 'tidying up' the rating list here, I'm going to reset each promoted team to a rating of 10 - an arbitrary choice, but roughly in the right area I'd say. Anything else comes under fine tuning of the system for me - and I don't normally spend much time on those matters. And then I'll test the system again with the matches excluded as above - hoping to get similar or better results - though it doesn't always work that way I'm afraid :(

btw - You might have a small problem holding all the data in the spreadsheet in the way I did it - a very cumbersome way I admit. I've started learning VBA to limit the amount of data that has to be held in the spreadsheet - for instance we only need the previous rating for each team now, and don't need to store all the previous values as the spreadsheet does. It's also a much more convenient way of manipulating and analysing all the data you produce.
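The "only keep the previous rating" idea could look something like the sketch below, here in Python rather than VBA. Several things are assumptions and not taken from the spreadsheet: the away side is moved by the same adjustment in the opposite direction, any team not seen before starts on 10, and all names and the sample results are hypothetical.

```python
def run_ratings(results, n=20, home_advantage=0.4, start_rating=10.0):
    """Run the update rule over results in date order, storing one rating per team.

    results: iterable of (home_team, away_team, home_goals, away_goals).
    Assumption: the away team's rating changes by minus the home adjustment.
    Returns (current ratings dict, list of per-match predictions/adjustments).
    """
    ratings = {}
    history = []
    for home, away, home_goals, away_goals in results:
        home_old = ratings.setdefault(home, start_rating)
        away_old = ratings.setdefault(away, start_rating)
        prediction = home_old - away_old + home_advantage
        adjustment = ((home_goals - away_goals) - prediction) / n
        ratings[home] = home_old + adjustment
        ratings[away] = away_old - adjustment
        history.append((home, away, prediction, adjustment))
    return ratings, history


def reset_promoted(ratings, promoted_teams, value=10.0):
    """Put each promoted team back on a fixed rating before a new season."""
    for team in promoted_teams:
        ratings[team] = value


# Hypothetical results, only to show the flow.
ratings, history = run_ratings([("Alpha", "Beta", 3, 0), ("Beta", "Gamma", 1, 1)])
reset_promoted(ratings, ["Gamma"])
print({team: round(value, 3) for team, value in ratings.items()})
```

Only the dictionary of current ratings is carried forward; the history list is kept here just so the predictions can be analysed afterwards.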


Re: A football rating system, discussions, ideas Well, here is the summary for our rating list after the list has been tidied up.

n is 20

         Prediction         Number   Correct   %      Target %
Homes    0.60 and over      661      414       62.6   50
Draws    0.00 to 0.40       343      107       31.2   30
Aways    -0.20 and below    348      131       37.6   30

The number of matches rated dropped from 1900 to 1632 after excluding games where a newly promoted team was involved and had not yet played half a season's worth of games. In practice, I cut out any game played with a newly promoted team before Jan 1st - actually losing a fraction more games.

I always look at the draw potential of a new system - simply as it's the easiest to see at a glance whether there is a potential profit to be had over the bookies. Most bookies go about 3.20 the draw, meaning you need about a 31.2% strike rate to break even. Bang on here!! - in fact as stated right at the start of the thread, this metric is probably just not quite good enough...

Anyway - we can analyse these results a bit further - and produce estimated odds for more of the boundary ranges. All that to come...
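The break-even figure is just the reciprocal of the decimal odds, which a couple of lines can confirm. The function names are made up for the illustration, and level stakes at a single fixed price are assumed.

```python
def break_even_strike_rate(decimal_odds):
    """Percentage of winning bets needed to break even at the given decimal odds."""
    return 100.0 / decimal_odds


def profit_per_unit_stake(strike_rate_pct, decimal_odds):
    """Average profit per 1 unit staked, for a given strike rate at fixed odds."""
    return strike_rate_pct / 100.0 * decimal_odds - 1.0


print(break_even_strike_rate(3.20))                 # draw priced at 3.20
print(round(profit_per_unit_stake(31.2, 3.20), 4))  # strike rate from the table above
```

At 3.20 the break-even rate comes out at 31.25%, so a 31.2% strike rate sits a whisker below it.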

Re: A football rating system, discussions, ideas

Hi there Mr. Onemore, I'm still here, tin hat and all! Had some trouble downloading your EPL spreadsheet, all others were OK. Turned out to be my firewall protection system putting the block on. Thought that I had been barred!

Looks like you have had some work with this spreadsheet, judging by the info produced in the workings. Not had much time to number crunch, but I note that our friend Mr. Muppett appears to have been beavering away behind the scenes, and it will no doubt be interesting to see what is produced there. Works like a little good un!

Once the ratings are ascertained, is it not possible to reduce the size of the spreadsheet so as to include the rating of each match on a separate basis, with something like:- H Old Rating, A Old Rating, Goal Diff, Rating Adjustment, H New Rating, A New Rating, so as each rating is completed, your match prediction sheet can then be updated. Food for thought? Cheers and keep up the good work fella! ;)


Re: A football rating system, discussions, ideas

Alright boss, done masses of crunching. Once you have sorted the ratings into order, how do you decide which to classify as those that should be homes? You say the top 40%?
Yes - initially it's just this simple, crude test, to see if the ratings are worth pursuing. I go top 40% of the sorted list as a home selection, 50% to 70% as a draw selection, and 80% and above as an away selection. If you put a table up like the above, we can see how you're getting on...
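The sort-and-slice step described above might be sketched as follows. The percentage cut-offs are the ones quoted; the function name and the use of integer arithmetic for the positions are choices made for the example.

```python
def selection_bands(predictions):
    """Split predictions into home/draw/away selections by position in the sorted list.

    Sorts from highest to lowest, then takes the top 40% as homes,
    positions 50%-70% as draws and 80% onwards as aways.
    """
    ranked = sorted(predictions, reverse=True)
    total = len(ranked)
    home_end = total * 40 // 100
    draw_start, draw_end = total * 50 // 100, total * 70 // 100
    away_start = total * 80 // 100
    return {
        "home": ranked[:home_end],
        "draw": ranked[draw_start:draw_end],
        "away": ranked[away_start:],
    }


# Ten made-up predictions; the lowest value in each band gives its rough boundary.
bands = selection_bands([0.9, -0.3, 0.2, 0.7, 0.0, 1.1, 0.4, -0.1, 0.6, 0.3])
print({name: (min(values), max(values)) for name, values in bands.items()})
```

Reading off the lowest and highest prediction in each band is one way to arrive at boundary figures like "0.60 and over".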

Re: A football rating system, discussions, ideas

A couple of quick thoughts on dealing with promoted teams.

You can deal with promoted/relegated teams by fitting a regression line between the number of points they gained in the previous season and the number of points they got as a promoted/relegated team. You can improve it by adding money spent as a variable, especially for a move to the Prem. Playing around with the points totals also helps the correlation. 86 points gained in one season in the Championship isn't always as impressive as 86 points gained in another.

Alternatively, you can set ratings to an arbitrary figure, run some results & then stick these improved, but likely still incorrect, ratings back in at the start. Continue until the final ratings converge. Iteration. Works for the NFL, haven't tried it for soccer.

Alternatively, one flaw of LS is that you're never sure when the final ratings have reached their true proportions. You might have the teams in the correct ratings order, but the gaps between ratings are too large. Less of a problem as you get more results but still a problem. However, the talent in divisions is reasonably similar, season on season. So you need to take the final ratings for a mature, complete season, work out the average and SD of the ratings for that league & compare those figures with your incomplete league. If these figures differ appreciably, especially the SD, then you need to reframe them or any odds you derive will be similarly false.

This last suggestion works extremely well when combined with either of the first two suggestions.

D.
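One possible reading of "reframing" the ratings is a simple linear rescale so that the incomplete league ends up with the same average and SD as the mature season. That interpretation is an assumption, D may well do it differently, and the function name and figures below are purely illustrative.

```python
import statistics


def rescale_ratings(ratings, target_mean, target_sd):
    """Linearly rescale ratings to a target mean and (population) SD.

    Keeps the teams in the same order and only stretches or shrinks
    the gaps between them. ratings: dict of team -> rating.
    """
    values = list(ratings.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # All ratings identical: nothing to stretch, just move to the target mean.
        return {team: target_mean for team in ratings}
    return {
        team: target_mean + (rating - mean) * target_sd / sd
        for team, rating in ratings.items()
    }


# Made-up early-season ratings whose gaps look too wide, pulled back to
# an assumed mature-season mean of 10.0 and SD of 0.5.
early = {"Alpha": 11.8, "Beta": 10.4, "Gamma": 9.1, "Delta": 8.7}
print({team: round(value, 2) for team, value in rescale_ratings(early, 10.0, 0.5).items()})
```

The target mean and SD would come from the final ratings of a complete season in the same division, as described in the post.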

