
Tennis rating/info/assessment system



Re: Tennis rating/info/assessment system

This is going to take a wee bit longer :lol The ATP actually has over 20000 players, you know (probably due to the inclusion of challengers). You'd think the names would be unique, but they are not, not even combined with country. Gonna have to get creative :lol The WTA, on the other hand, only has 1285 players :unsure


Re: Tennis rating/info/assessment system

Only 10 players called "Rodriguez J"; the first names are unique though. 8 called "Lee J" and 8 called "Gonzalez J" :lol Got about 35 players where the same last name, first name and country occur twice :wall Have decided to use a date range to make overlapping names unique, more later ;)
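The date-range idea could look roughly like the sketch below. It is only an illustration: the `Player` fields, the `find_players` helper and the sample rows are all made up here, and it assumes each player record can carry an active-from and active-to date.

```ruby
require "date"

# Hypothetical record layout; field names are assumptions, not the poster's schema.
Player = Struct.new(:id, :last_name, :first_name, :country, :active_from, :active_to)

# Return the players that match the name/country key AND were active on the
# given match date. The date range is what separates otherwise identical names.
def find_players(players, last_name:, first_name:, country:, on:)
  players.select do |pl|
    pl.last_name == last_name &&
      pl.first_name == first_name &&
      pl.country == country &&
      (pl.active_from..pl.active_to).cover?(on)
  end
end

# Invented sample data: two different players sharing name and country.
players = [
  Player.new(1, "Doe", "John", "USA", Date.new(1985, 1, 1), Date.new(1994, 12, 31)),
  Player.new(2, "Doe", "John", "USA", Date.new(2003, 1, 1), Date.new(2009, 12, 31))
]

found = find_players(players, last_name: "Doe", first_name: "John",
                              country: "USA", on: Date.new(2008, 3, 1))
puts found.map(&:id).inspect
```

If two careers with the same name and country overlap in time, this still returns more than one row, so the caller would need another tie-breaker such as date of birth.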


Re: Tennis rating/info/assessment system

Got access to the ATP profiles, so all that info is available to make a search specific. The problem only arises when the name is all you have to go on, so it will be fine in most cases. For example, some bookies only provide the last name and the first letter of the first name. Those would be the tricky ones. Do we need all of them? Don't know if we need all of them, but that's an aspect of data processing: first you build a data collection and then see what comes out of it. I'll have more info later this week.
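As a rough sketch of the "last name plus first initial" problem: index the full player list by the abbreviated key and treat any key with more than one candidate as ambiguous. The `short_key` and `candidates` helpers, the key format and the sample rows (ids and first names) are assumptions for illustration only.

```ruby
# Build the abbreviated key a bookmaker feed might use, e.g. "rodriguez j".
# The exact "Lastname I" format is an assumption.
def short_key(last_name, first_name)
  "#{last_name} #{first_name[0]}".downcase
end

# Look up all player ids that share an abbreviated name; empty if unknown.
def candidates(index, bookie_name)
  index.fetch(bookie_name.downcase.strip, [])
end

# Invented sample rows: [id, last name, first name].
rows = [
  [101, "Rodriguez", "Juan"],
  [102, "Rodriguez", "Jorge"],
  [103, "Smith", "Alan"]
]

index = Hash.new { |hash, key| hash[key] = [] }
rows.each { |id, last, first| index[short_key(last, first)] << id }

["Rodriguez J", "Smith A", "Nobody X"].each do |name|
  ids = candidates(index, name)
  status = case ids.size
           when 0 then "unknown"
           when 1 then "unique"
           else "ambiguous, needs more info"
           end
  puts "#{name}: #{ids.inspect} (#{status})"
end
```

Only the ambiguous keys need the extra profile data (country, date range, and so on); the unique ones can be matched on the name alone.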


  • 4 months later...

Re: Tennis rating/info/assessment system

No, shit happened, life took over, had to spend time on other things. If there's a follow-up on this it will be in a while, after I've got my own stuff sorted. Which implies I have / will have a database at some point (which can be shared), just not in a position to put any date on it.


Re: Tennis rating/info/assessment system

OK. I did write a program in 2007 to download the tennis-data stuff and save it in MySQL, but I never got round to saving the ATP-like data, which I thought would be necessary. I also never got round to thinking about how to do the prediction; I got sidetracked into soccer and then real work. Do you know if it is possible to get historical match stats anywhere?


Re: Tennis rating/info/assessment system

The only actual source would be the stats from the ATP site (but you'd need to collect them match by match). Still not clear what you mean by "historically for each match"? 1st serve % of past head-to-head matches? Average 1st serve % over the past 10 matches of each player? Other?


Re: Tennis rating/info/assessment system

In case you don't know: on the ATP site, look up a player's activity; next to each match is a link labelled "stats". It is only updated after a tournament has completed; for recent matches in the same tournament the data is inside the ATP scoreboard.


Re: Tennis rating/info/assessment system

The data is on the ATP site, as you suggested. I'm sure last time I looked (about June 2007) this data was only available for the current year. So now all I need to do is to scrape it all into MySQL and merge with the odds data from tennis-data.co.uk. That's quite a lot of work for me, and I won't have time until April. Are you doing any tennis betting?


Re: Tennis rating/info/assessment system

So now all I need to do is to scrape it all into mySQL and merge with the odds data from tennis-data.co.uk. That's quite a lot of work for me
Tell me about it :lol Not betting at the moment, it just ends up taking too much time; in other words, too much other stuff to do at the moment. If you haven't got time until April, make sure you check here first, I really should have something by then. Not promising though :$ :lol

Re: Tennis rating/info/assessment system

I have the ability and also now the time, post wedding, to build a statistical model which will tell you the variables that contribute to one person winning over another. I had a go on my own and after a couple of hundred bets my Yield was -1%. Annoyingly close. The sort of thing I would look to do with the data is like the stuff I was doing with the premier league. I used to be PlopPlop around here for anyone that remembers. I've some very powerful statistical software at my disposal. As an idea my list of variables in my previous attempt was:

MATCH_REF ATP COUNTRY LOCATION TOURNAMENT DATE SERIES COURT SURFACE ROUND BEST OF PLY1 PLY2 RANK_PLY1 RANK_PLY2 PTS_PLY1 PTS_PLY2

1_PLY1 1_PLY2 2_PLY1 2_PLY2 3_PLY1 3_PLY2 4_PLY1 4_PLY2 5_PLY1 5_PLY2 SETS_PLY1 SETS_PLY2 COMMENT

B365_PLY1 B365_PLY2 CB_PLY1 CB_PLY2 EX_PLY1 EX_PLY2 GB_PLY1 GB_PLY2 IW_PLY1 IW_PLY2 PS_PLY1 PS_PLY2 SB_PLY1 SB_PLY2 B&W_PLY1 B&W_PLY2 UB_PLY1 UB_PLY2

NATIONALITY_PLY1 NATIONALITY_PLY2 AGE_IN_DAYS_PLY1 AGE_IN_DAYS_PLY2 PLAYING_IN_HOME_COUNTRY_PLY1 PLAYING_IN_HOME_COUNTRY_PLY2 TENNIS_CORNER_ID_PLY1 TENNIS_CORNER_ID_PLY2 H2H TENNIS CORNER URL MAX_ODDS_PLY1 MAX_ODDS_PLY2

H2H_WIN_PCT_PLY1 AVG_S_PER_MATCH_PLY1 AVG_G_PER_MATCH_PLY1 AVG_G_PER_SET_PLY1 AVG_S_WON_IN_3_SET_MATCHES_PLY1 AVG_S_WON_IN_5_SET_MATCHES_PLY1 AVG_G_WON_IN_3_SET_MATCHES_PLY1 AVG_G_WON_IN_5_SET_MATCHES_PLY1 AVG_G_PER_SET_IN_3_SET_MATCHES_PLY1 AVG_G_PER_SET_IN_5_SET_MATCHES_PLY1 SETTIEBREAK_PLY1 3SETTIEBREAK_PLY1 5SETTIEBREAK_PLY1

H2H_WIN_PCT_PLY2 AVG_S_PER_MATCH_PLY2 AVG_G_PER_MATCH_PLY2 AVG_G_PER_SET_PLY2 AVG_S_WON_IN_3_SET_MATCHES_PLY2 AVG_S_WON_IN_5_SET_MATCHES_PLY2 AVG_G_WON_IN_3_SET_MATCHES_PLY2 AVG_G_WON_IN_5_SET_MATCHES_PLY2 AVG_G_PER_SET_IN_3_SET_MATCHES_PLY2 AVG_G_PER_SET_IN_5_SET_MATCHES_PLY2 SETTIEBREAK_PLY2 3SETTIEBREAK_PLY2 5SETTIEBREAK_PLY2

SURFACE_H2H_WIN_PCT_PLY1 SURFACE_AVG_S_PER_MATCH_PLY1 SURFACE_AVG_G_PER_MATCH_PLY1 SURFACE_AVG_G_PER_SET_PLY1 SURFACE_AVG_S_WON_IN_3_SET_MATCHES_PLY1 SURFACEAVG_S_WON_IN_5_SET_MATCHES_PLY1 SURFACEAVG_G_WON_IN_3_SET_MATCHES_PLY1 SURFACEAVG_G_WON_IN_5_SET_MATCHES_PLY1 SURFACEAVG_G_PER_SET_IN_3_SET_MATCHES_PLY1 SURFACEAVG_G_PER_SET_IN_5_SET_MATCHES_PLY1 SURFACESETTIEBREAK_PLY1 SURFACE3SETTIEBREAK_PLY1 SURFACE5SETTIEBREAK_PLY1

SURFACE_H2H_WIN_PCT_PLY2 SURFACEAVG_S_PER_MATCH_PLY2 SURFACEAVG_G_PER_MATCH_PLY2 SURFACEAVG_G_PER_SET_PLY2 SURFACEAVG_S_WON_IN_3_SET_MATCHES_PLY2 SURFACEAVG_S_WON_IN_5_SET_MATCHES_PLY2 SURFACEAVG_G_WON_IN_3_SET_MATCHES_PLY2 SURFACEAVG_G_WON_IN_5_SET_MATCHES_PLY2 SURFACEAVG_G_PER_SET_IN_3_SET_MATCHES_PLY2 SURFACEAVG_G_PER_SET_IN_5_SET_MATCHES_PLY2 SURFACESETTIEBREAK_PLY2 SURFACE3SETTIEBREAK_PLY2 SURFACE5SETTIEBREAK_PLY2

NO_H2HS DAYS_SINCE_LAST_H2H NO_SURFACE_H2HS DAYS_SINCE_LAST_SURFACE_H2H

H2H_3SET_WINPCT_PLY1 H2H_3SET_WINPCT_PLY2 H2H_5SET_WINPCT_PLY1 H2H_5SET_WINPCT_PLY2 SURFACE_H2H_3SET_WINPCT_PLY1 SURFACE_H2H_3SET_WINPCT_PLY2 SURFACE_H2H_5SET_WINPCT_PLY1 SURFACE_H2H_5SET_WINPCT_PLY2

NO_3SET_MATCHES NO_5_SET_MATCHES SURFACE_3SET_MATCHES SURFACE_5SET_MATCHES

PLY1_CHANGE IN RANK (-1 MONTH) PLY1_CHANGE IN RANK (-2 MONTHS) PLY1_CHANGE IN RANK (-6 MONTHS) PLY1_CHANGE IN RANK (-12 MONTHS) PLY2_CHANGE IN RANK (-1 MONTH) PLY2_CHANGE IN RANK (-2 MONTHS) PLY2_CHANGE IN RANK (-6 MONTHS) PLY2_CHANGE IN RANK (-12 MONTHS)

WINNER

This would also give me a good excuse to get back into the groove of things and get a presence back on PL. Let me know if this assistance is what you are after............
And THAT'S it? :lol :lol

Re: Tennis rating/info/assessment system

Seems far too much info in that list to me. It would overwhelm most of the prediction tools people use (e.g. neural networks).
I have to disagree. The more, the better. It takes little effort for a decent stats package to figure out the best variables in the list.

Re: Tennis rating/info/assessment system

Is there any extra info on tennis corner - or is it all derivable from the ATP data?
It depends how fast you want updates. The ATP site isn't very fast; you need sites with livescores and results for fast updates, then you can add data from the ATP as soon as it's available, if needed. I'm not using tenniscorner as I found several other sites that are easier to access and parse. So does it have additional data? At the moment I don't know (but I don't think so). The most complete looks to be www.coretennis.com; another fairly good one is www.tennisinsight.com (moving to a small subscription).

Re: Tennis rating/info/assessment system

I have managed to do the basic info: the results from tennis-data (using Ruby/Rails migrations, which are brilliant). Now I need to get the extra data from the ATP site. Does anyone have a list of the ATP player codes and the tournament numbers?

Just worked out I can scrape the player numbers from http://www.atpworldtour.com/en/players/matchfacts/. And tournament numbers from http://www.atpworldtour.com/5/en/vault/archive.asp?year=2008&caltype=full Then I can get ATP stats for matches by building the URL.

Now what about the analysis? First, what data is useful? I will have all match stats for previous matches, so moving averages of these would be quite easy. But what about the quality of the opposition? A 50% return game % is better against Roddick than against Betty Stove, so would a ratings approach be better? If so, what ratings would you try to calculate? Serve and return to start with. What about nerve (including tie break %, break points converted %, etc.)? Or consistency?
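For what it's worth, a moving average over previous matches could be sketched as below. The `trailing_average` helper and the sample numbers are made up for illustration; the one deliberate choice is that each average only uses matches played before the current one, so the value is something you could have known before the match started.

```ruby
# For each match, average the stat over up to `n` PREVIOUS matches.
# The current match is excluded; the first match has no history, so it gets nil.
def trailing_average(values, n)
  values.each_index.map do |i|
    window = values[[0, i - n].max...i]
    window.empty? ? nil : window.sum.to_f / window.size
  end
end

# Invented return game % for four consecutive matches of one player.
return_game_pct = [20.0, 30.0, 25.0, 35.0]

puts trailing_average(return_game_pct, 2).inspect
```

This does nothing about the quality of the opposition; it is just the plain moving average mentioned above.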


Re: Tennis rating/info/assessment system

First, what data is useful? I will have all match stats for previous matches, so moving averages of these would be quite easy. But what about the quality of the opposition? A 50% return game % is better against Roddick than against Betty Stove, so would a ratings approach be better? If so, what ratings would you try to calculate? Serve and return to start with. What about nerve (including tie break %, break points converted %, etc.)? Or consistency?
All I can say at this point is... excellent question :lol and exactly the type of thing I'd like to discuss on this thread. May I suggest we start by simply listing what data is available and how we define that data, which should provide some clarity and avoid confusion. (I'll edit a full list and paste it here as a file.) I'll kick off with some obvious ones:

Player data:
1) Last name
2) First name
3) Date of birth
4) (Age), derived from date of birth and current date
5) Gender
6) Playing hand: right / left / single or double backhand / more ???
7) Professional since
8) Height
9) Weight
10) ...

Match data:
1) Tournament
2) Season
3) Round
4) Players
5) ...

So far OK, nothing special. Now it gets a little more tricky, and this is why we need uniform definitions:

Number of 1st serves: total number of 1st serves a player had in the match.
Successful 1st serves: number of 1st serves that went in.
1st serve percentage: successful 1st serves as a percentage of the total number of 1st serves.
1st serve points won: number of points won on successful 1st serves.
1st serve points won percentage: 1st serve points won as a percentage of the total number of successful 1st serves.

Basically a reference list of what's available and how it's defined, so everyone can check we're talking about the same thing. The last one illustrates the need: 1st serve points won percentage is clearly defined relative to the number of successful 1st serves, so it cannot be confused with a percentage of the total number of 1st serves, which would give a different figure. Get it? OK, what else, and how do you define it...?
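To make the point about denominators concrete, here is a small sketch using the definitions above. The `pct` helper, the variable names and the numbers are invented for illustration.

```ruby
# Percentage of `part` in `whole`, rounded to one decimal; nil if `whole` is 0.
def pct(part, whole)
  whole.zero? ? nil : (100.0 * part / whole).round(1)
end

# Invented match totals for one player.
first_serves_total     = 80 # all 1st serves attempted
first_serves_in        = 50 # successful 1st serves
first_serve_points_won = 40 # points won on successful 1st serves

# 1st serve percentage: successful 1st serves / total 1st serves.
first_serve_pct = pct(first_serves_in, first_serves_total)

# 1st serve points won percentage: points won / successful 1st serves.
first_serve_won_pct = pct(first_serve_points_won, first_serves_in)

# Same numerator over the WRONG denominator (total 1st serves) gives a
# different figure, which is exactly the confusion the definitions avoid.
mistaken_pct = pct(first_serve_points_won, first_serves_total)

puts "1st serve %: #{first_serve_pct}"
puts "1st serve points won %: #{first_serve_won_pct}"
puts "mistaken (total 1st serves as denominator): #{mistaken_pct}"
```

Storing the raw counts and deriving the percentages like this keeps the definition in one place.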

Re: Tennis rating/info/assessment system

Well, the ATP data can all be worked out from these numbers (all integers, not percentages, which can be derived from them):

StatType.create(:name=>"aces")
StatType.create(:name=>"double faults")
StatType.create(:name=>"serve points")
StatType.create(:name=>"first serve in")
StatType.create(:name=>"first serve won")
StatType.create(:name=>"second serve won")
StatType.create(:name=>"break points")
StatType.create(:name=>"break points saved")
StatType.create(:name=>"service games played")
StatType.create(:name=>"return points")
StatType.create(:name=>"first return points")
StatType.create(:name=>"first return won")
StatType.create(:name=>"second return won")
StatType.create(:name=>"break points")
StatType.create(:name=>"break points won")
StatType.create(:name=>"return games played")

For example:

service games % = (service games - break points + break points saved) / service games

So we have all this for each match, as well as the match data on tennis-data. If we start off with a plyr1 service rating we could, for example, calculate:

plyr1 serve game % for last 10 matches
plyr2 return game % for last 10 matches

and also for the opponents in these last 10 matches:

plyr1 average return game % of opponents (when match was played) in last 10 matches
plyr2 average service game % of opponents (when match was played) in last 10 matches

From these we should be able to work out if plyr1 will win his serve. We could do the same thing for plyr2 serves. Then some 'bottle' rating could be calculated from tie-breaks, break points, close sets and matches. And we still have other data like ATP rating, surface, etc.
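A minimal sketch of the service games formula above, plus the "last 10 matches" averages. The method names, hash keys and sample figures are assumptions for illustration; "break points" here means break points faced on serve.

```ruby
# Service games won % from integer counts, following the formula in the post:
# (service games - break points faced + break points saved) / service games.
def service_games_won_pct(service_games:, break_points:, break_points_saved:)
  return nil if service_games.zero?

  breaks_conceded = break_points - break_points_saved
  100.0 * (service_games - breaks_conceded) / service_games
end

# Plain mean; nil when there is no history yet.
def mean(values)
  values.empty? ? nil : values.sum.to_f / values.size
end

# Invented single match: 10 service games, 4 break points faced, 2 saved.
hold_pct = service_games_won_pct(service_games: 10, break_points: 4, break_points_saved: 2)

# Invented history for plyr1: own serve game % and the opponent's return
# game % at the time each match was played.
history = [
  { serve_game_pct: 80.0, opp_return_game_pct: 20.0 },
  { serve_game_pct: 90.0, opp_return_game_pct: 30.0 },
  { serve_game_pct: 70.0, opp_return_game_pct: 10.0 }
]
last10 = history.last(10)

plyr1_serve_game_pct      = mean(last10.map { |m| m[:serve_game_pct] })
plyr1_opp_return_game_pct = mean(last10.map { |m| m[:opp_return_game_pct] })

puts hold_pct
puts plyr1_serve_game_pct
puts plyr1_opp_return_game_pct
```

How the two averages are combined into a serve rating is left open here, as it is in the post.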

