Computer Chess Club Archives


Subject: Re: Statistical data about draws and rating differences

Author: Jeff Lischer

Date: 09:37:49 02/05/01



On February 05, 2001 at 10:13:01, Andrew Dados wrote:

>On February 05, 2001 at 09:49:27, Günther Simon wrote:
>
>>On February 04, 2001 at 13:58:11, Andrew Dados wrote:
>>
>>>
>>>I decided to find out 'true chance' of draw outcome from real games.
>>>The below is summarized output from my twic game files.
>>>Rdiff means difference of players ratings in the (range, range+25).
>>>Interestingly, around an Rdiff of 0 the draw rate approaches 50% and the better player won about 25%.
>>>
>>>Now if you take 2 tosses of coin you'll get total score of 2 heads in 25%, 1
>>>head in 50% and 0 heads in 25%...
>>>
>>>
>>> files: 176      games: 170464   decisive (counted) : 128803
>>>
>>>Rdiff+  games   %draws  %better won
>>>=======================================
>>>0       14480   48.82   25.67
>>>25      15365   46.57   31.45
>>>50      15784   44.60   35.52
>>>75      15429   41.80   39.62
>>>100     14092   38.68   44.26
>>>125     12121   35.08   49.18
>>>150     9926    31.92   54.47
>>>175     8129    28.34   58.89
>>>200     6478    25.39   63.71
>>>225     4707    22.82   67.94
>>>250     3443    20.65   69.82
>>>275     2575    18.45   73.67
>>>300     1879    15.59   77.38
>>>325     1370    15.62   78.83
>>>350     970     10.52   85.46
>>>375     687     9.32    86.90
>>>400     453     7.06    89.18
>>>
>>>Is one chess game statistically equivalent of 2 coin tosses? :)
>>>
>>>-Andrew-
>>
>>There is something in these statistics that gives me headaches,
>>because it does not have much to do with chess.
>>The point is the mass of draws "played" in just a few moves, sometimes
>>even in zero moves!! There are several reasons for this behaviour, like
>>knowing the opponent very well, at least sharing prizes, being exhausted
>>after a long tournament, and some others...
>>But won't these games falsify the statistics?!
>>I presume that all kinds of huge chess game collections nowadays contain
>>an awful lot of such "games". (I know this for my original Bigbase of CB.)
>>For myself, I decided a long time ago to delete all drawn games under move 9
>>or 10 from my database, knowing well that this is just a kind of difficult
>>compromise.
>>But the question is: how can I trust statistics that show, e.g., the
>>winning percentage of different opening variations, or can't I trust
>>statistics of that kind anyway?
>
>You have a point here indeed. For me, however, the question is not really the
>'exactness' of game scores, but the 'trend'.
>
>I am curious how I can include draws in statistical modelling to get it close
>to reality. No matter how I do it, I get much higher 'confidence' than without
>them (see Bruce's concerns in the '60-40' thread somewhere below). I am becoming
>convinced that the mere existence of a 3-state score raises the confidence of
>performance ratings above that of the usually applied 'win-loss' model.
>
>-Andrew-

I'm not sure if I understand you correctly, but the presence of draws definitely
increases the 'confidence' of performance ratings. If you score 50% in a 100
game match, the maximum standard error (and therefore the lowest confidence)
would be for the case of +50 =0 -50. The standard error actually goes to zero
for the other extreme: +0 =100 -0.
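
As a rough illustration of that point, here is a minimal Python sketch of the standard error of the mean score per game, scoring a win as 1, a draw as 0.5 and a loss as 0. The function name score_standard_error is my own, and I am assuming independent games and the plain population variance (no n-1 correction).

```python
import math

def score_standard_error(wins, draws, losses):
    """Standard error of the mean score per game (win=1, draw=0.5, loss=0).

    Assumes independent games and uses the population variance.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # E[x^2] for per-game scores 1, 0.5 and 0
    mean_sq = (wins + 0.25 * draws) / n
    variance = max(mean_sq - mean * mean, 0.0)  # guard against tiny negatives
    return math.sqrt(variance / n)

# Three 100-game matches that all score 50%
print(score_standard_error(50, 0, 50))   # no draws: largest standard error
print(score_standard_error(25, 50, 25))  # half draws: smaller
print(score_standard_error(0, 100, 0))   # all draws: zero
```

With no draws the result is 0.05 (5 percentage points), and it shrinks to zero as the share of draws grows.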

For a while now, I've been using the following very simple scheme to account for
the presence of draws when I have absolutely no information about the percentage
of draws. It's the equivalent of playing 2 games for every game actually played,
then scoring 2 wins as 1 win, 2 losses as 1 loss, and 1 win/1 loss as a draw.
Where We is the win expectancy, this is equivalent to: %Win = We^2, %Draw =
2*We*(1-We), %Loss = (1-We)^2.
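
In case it helps, the scheme can be sketched in a few lines of Python. The name two_game_model is just something I made up for illustration; the formulas are the ones given above.

```python
def two_game_model(we):
    """Split a win expectancy `we` (0..1) into win/draw/loss probabilities
    by treating each real game as two independent virtual games."""
    win = we * we                # both virtual games won
    draw = 2.0 * we * (1.0 - we) # one won, one lost (either order)
    loss = (1.0 - we) ** 2       # both virtual games lost
    return win, draw, loss

for we in (0.5, 0.8):
    win, draw, loss = two_game_model(we)
    # The expected score win + draw/2 reproduces `we`
    print(we, round(win, 4), round(draw, 4), round(loss, 4))
```

Note that win + draw/2 always equals We, so the scheme changes only the spread of results, not the expected score.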

For We = 0.5 it gives %Draws/%Wins of 50%/25%, whereas for We = 0.8 it gives
32%/64%. The data above give %Draws/%Wins of about 49%/26% for Rdiff=0 and
21%/70% for Rdiff=250, where the observed score of the better player
(%Wins + %Draws/2) is about 80%.

Overall, this simple approach gives a reasonable fit to the data but it seems to
overestimate the %Draws as We goes up. I think it might give a better fit to
computer-computer results.
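
To make the comparison concrete, here is a small Python sketch that takes a few rows from Andrew's table, derives We from the observed results as %Wins + %Draws/2 (my assumption for how to line the model up with the data), and compares the model's draw percentage with the observed one. The variable and function names are mine.

```python
# (Rdiff, %draws, %better won) copied from a few rows of the table above
rows = [
    (0, 48.82, 25.67),
    (100, 38.68, 44.26),
    (250, 20.65, 69.82),
    (400, 7.06, 89.18),
]

def model_draw_pct(we):
    """Draw percentage predicted by the two-game scheme for win expectancy `we`."""
    return 100.0 * 2.0 * we * (1.0 - we)

comparison = []
for rdiff, draw_pct, win_pct in rows:
    # Observed score of the better player, used as the win expectancy
    we = (win_pct + draw_pct / 2.0) / 100.0
    comparison.append((rdiff, we, draw_pct, model_draw_pct(we)))

for rdiff, we, observed, predicted in comparison:
    print(f"Rdiff={rdiff:3d}  We={we:.3f}  observed={observed:5.2f}  model={predicted:5.2f}")
```

For these rows the model's draw percentage comes out above the observed one each time, and the gap is clearly wider at Rdiff=250 than at Rdiff=0.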

It would always be better to base calculations on the actual draw percentage if
that data is available.




Last modified: Thu, 15 Apr 21 08:11:13 -0700
