Computer Chess Club Archives


Subject: Re: Dummy Cadaques Tournament (Long)

Author: Dave Gomboc

Date: 12:20:53 01/28/00


On January 28, 2000 at 14:44:48, Christophe Theron wrote:

>On January 28, 2000 at 13:01:05, Michael Neish wrote:
>
>>On January 28, 2000 at 07:27:54, Enrique Irazoqui wrote:
>>
>>>There is a degree of uncertainty, but I don't think you need 1000 matches of 200
>>>games each to have an idea of who is best.
>>
>>I have to agree with what Christophe says insofar as you need to play a
>>certain number of games before you can determine, to a certain (known)
>>degree of accuracy, what the rating difference is between two programs.
>>You will never know exactly of course, hence the standard deviation
>>figures given next to the Elo ratings of human chess players, which are
>>sometimes overlooked.  I haven't read Elo's book, but from what I know
>>of the Elo system he must have taken all this probability stuff into
>>account when he formulated it, so meaningless it is not.  In fact, it is
>>the core of the entire system.
>>
>>If the rating difference between two programs is quite small, say less than
>>35 points, then I'm afraid you will definitely need a lot of games to sort it
>>out from the results alone.  A 20-game match solves nothing.  Christophe,
>>if you're reading this, could you tell us what is the minimum Elo difference
>>that a 20-game match can estimate to a good degree of confidence?
>
>
>With a 20-game match, you can determine whether prog A is 77 Elo points above
>prog B with 80% confidence.
>
>If the programs are closer in Elo, a 20-game match is not enough.
>
>I have replied to Enrique and given a complete table in my answer; you might
>find it very interesting.
>
>The key point is that as the Elo difference gets smaller, the number of games
>to play increases tremendously.

...and 80% confidence is terrible.  95% is usually used.  There'd have to be an
enormous strength difference for a 20-game match to be reasonably conclusive.
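
For anyone who wants to play with the numbers, here is a rough sketch of the arithmetic. It is not Christophe's method, and its figures need not match his table. It assumes the standard logistic Elo curve, ignores draws (every game is a win or a loss), and uses a one-sided normal approximation, which is crude for only 20 games. The function names are made up for illustration.

```python
import math
from statistics import NormalDist


def expected_score(elo_diff):
    """Expected score per game for the stronger side (logistic Elo curve)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score):
    """Inverse of expected_score: Elo difference implied by a score fraction."""
    return 400.0 * math.log10(score / (1.0 - score))


def min_detectable_elo(games, confidence):
    """Elo equivalent of the smallest match score that two equal programs
    would exceed by chance with probability (1 - confidence).

    Assumption: no draws, so the per-game variance at equal strength is 0.25.
    """
    z = NormalDist().inv_cdf(confidence)   # one-sided z value
    sigma = math.sqrt(0.25 / games)        # std. dev. of the score fraction
    threshold = 0.5 + z * sigma
    if threshold >= 1.0:
        return math.inf                    # even a clean sweep is not enough
    return elo_from_score(threshold)


def games_needed(elo_diff, confidence):
    """Games at which the expected score margin reaches z standard deviations
    (same no-draw, normal-approximation assumptions as above)."""
    margin = expected_score(elo_diff) - 0.5
    if margin <= 0:
        raise ValueError("elo_diff must be positive")
    z = NormalDist().inv_cdf(confidence)
    return math.ceil(0.25 * (z / margin) ** 2)


if __name__ == "__main__":
    for conf in (0.80, 0.95):
        print(f"20 games, {conf:.0%}: about "
              f"{min_detectable_elo(20, conf):.0f} Elo")
    for diff in (77, 35, 20):
        print(f"{diff} Elo at 95%: about {games_needed(diff, 0.95)} games")
```

Under these assumptions, going from 80% to 95% roughly doubles the Elo gap a 20-game match can resolve, and the number of games needed grows roughly with the inverse square of the Elo difference.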

Dave



Last modified: Thu, 15 Apr 21 08:11:13 -0700
