Author: Jeff Lischer
Date: 09:37:49 02/05/01
On February 05, 2001 at 10:13:01, Andrew Dados wrote:

>On February 05, 2001 at 09:49:27, Günther Simon wrote:
>
>>On February 04, 2001 at 13:58:11, Andrew Dados wrote:
>>
>>>I decided to find out 'true chance' of draw outcome from real games.
>>>The below is summarized output from my twic game files.
>>>Rdiff means difference of players ratings in the (range, range+25).
>>>Interestingly around range of 0 draws approaches 50% and better won about 25%.
>>>
>>>Now if you take 2 tosses of coin you'll get total score of 2 heads in 25%, 1
>>>head in 50% and 0 heads in 25%...
>>>
>>> files: 176 games: 170464 decisive (counted) : 128803
>>>
>>>Rdiff+   games   %draws   %better won
>>>=======================================
>>>0        14480   48.82    25.67
>>>25       15365   46.57    31.45
>>>50       15784   44.60    35.52
>>>75       15429   41.80    39.62
>>>100      14092   38.68    44.26
>>>125      12121   35.08    49.18
>>>150       9926   31.92    54.47
>>>175       8129   28.34    58.89
>>>200       6478   25.39    63.71
>>>225       4707   22.82    67.94
>>>250       3443   20.65    69.82
>>>275       2575   18.45    73.67
>>>300       1879   15.59    77.38
>>>325       1370   15.62    78.83
>>>350        970   10.52    85.46
>>>375        687    9.32    86.90
>>>400        453    7.06    89.18
>>>
>>>Is one chess game statistically equivalent of 2 coin tosses? :)
>>>
>>>-Andrew-
>>
>>There is something in these statistics which gives me headaches
>>because it has not much to do with chess.
>>The point is the mass of draws "played" in just a few moves, sometimes
>>even in zero moves!! There are several reasons for this behaviour, like
>>knowing the opponent very well, sharing at least prizes, being exhausted
>>after a long tournament and some others...
>>But will these games not falsify the statistics?!
>>I presume that all kinds of huge collections of chess games nowadays contain
>>a terrible lot of such "games". (For my original Bigbase of CB I know this.)
>>For myself I decided a long time ago to delete all drawn games under move 9
>>or 10 in my database, well knowing that this is just a kind of difficult
>>compromise.
>>But the question is how can I trust the statistics, e.g. in showing the
>>winning percentage of different opening variations, or can't I trust
>>statistics of that kind anyway?
>
>You have a point here indeed. For me however the question is not really
>'exactness' of game scores, but 'trend'.
>
>I am curious how can I include draws in statistical modelling to get them close
>to reality. No matter how I do it I get much higher 'confidence' than without
>them (See Bruce's concerns in '60-40' thread somewhere below). I am becoming
>convinced that just existence of 3-state score raises confidence of performance
>ratings above usually applied 'win-loss' model.
>
>-Andrew-

I'm not sure if I understand you correctly, but the presence of draws definitely increases the 'confidence' of performance ratings. If you score 50% in a 100 game match, the maximum standard error (and therefore the lowest confidence) would be for the case of +50 =0 -50. The standard error actually goes to zero for the other extreme: +0 =100 -0.

For a while now, I've been using the following very simple scheme to account for the presence of draws if I have absolutely no information about the percentage of draws. It's the equivalent of playing 2 games for every game actually played, each of them won with probability We, where We is the win expectancy. Then score 2 wins as 1 win, 2 losses as 1 loss, and 1 win/1 loss as a draw. This is equivalent to:

%Win  = We^2
%Draw = 2*We*(1-We)
%Loss = (1-We)^2

For We = 0.5 it gives %Draws/%Wins of 50%/25%, whereas for We = 0.8 it gives 32%/64%. The data above give %Draws/%Wins of about 50%/25% for Rdiff=0 and 21%/70% for Rdiff=250 (where We is roughly 0.8).

Overall, this simple approach gives a reasonable fit to the data, but it seems to overestimate the %Draws as We goes up. I think it might give a better fit to computer-computer results. It would always be better to base calculations on the actual draw percentage if that data is available.
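
For anyone who wants to check the numbers, here is a minimal Python sketch of the two points above: the standard error of a match score with draws, and the "two games per game" draw model. The function names are made up for illustration, and the standard error assumes independent games and uses the population variance of the per-game score (win = 1, draw = 0.5, loss = 0); that is one common choice, not necessarily the exact formula used in the post.

```python
import math


def score_standard_error(wins, draws, losses):
    """Standard error of the mean per-game score (win=1, draw=0.5, loss=0).

    Assumption: games are independent and the population variance
    (divide by n) is used.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("need at least one game")
    mean = (wins + 0.5 * draws) / n
    # E[X^2] for X in {1, 0.5, 0}: a win contributes 1, a draw 0.25.
    mean_sq = (wins + 0.25 * draws) / n
    # Clamp tiny negative values caused by floating-point rounding.
    variance = max(mean_sq - mean * mean, 0.0)
    return math.sqrt(variance / n)


def two_game_model(we):
    """Return (p_win, p_draw, p_loss) for win expectancy `we`.

    Each real game is treated as two independent games won with
    probability `we`: two wins count as a win, two losses as a loss,
    and one of each as a draw.
    """
    if not 0.0 <= we <= 1.0:
        raise ValueError("win expectancy must be between 0 and 1")
    return we * we, 2.0 * we * (1.0 - we), (1.0 - we) ** 2


# Same 50% score over 100 games, with different numbers of draws.
for w, d, l in [(50, 0, 50), (25, 50, 25), (0, 100, 0)]:
    print(f"+{w} ={d} -{l}: SE = {score_standard_error(w, d, l):.4f}")

# Model predictions for the two win expectancies discussed in the post.
for we in (0.5, 0.8):
    p_win, p_draw, p_loss = two_game_model(we)
    print(f"We={we}: win {p_win:.0%}, draw {p_draw:.0%}, loss {p_loss:.0%}")
```

Note that the model keeps the expected score at We, since We^2 + 0.5 * 2*We*(1-We) = We, so it only redistributes the results among wins, draws and losses.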
Last modified: Thu, 15 Apr 21 08:11:13 -0700