Author: Christophe Theron
Date: 18:04:37 12/18/00
On December 18, 2000 at 17:43:43, Severi Salminen wrote:
>On December 18, 2000 at 10:48:49, Jorge Pichard wrote:
>
>>On December 18, 2000 at 09:55:42, Severi Salminen wrote:
>>
>>>>I agree with you that 24 games isn't enough, but 200 games are not really
>>>>necessary if one of the two programs reaches a lead of more than 7 games,
>>>>at which point I will stop the match. Most likely this won't happen, since
>>>>these two programs have been too evenly matched so far.
>>>
>>>I don't understand. Where do you get that 7? Are you saying that the result
>>>104-96 is significant? Or, even worse, 16-8 (this means nothing in practice)?
>>>Why not 8, 25 or 10056? I think there is no point in stopping when the
>>>difference reaches some fixed value. There _is_ a point in running a match
>>>with many games (500+). The closer the two programs are, the more games you
>>>need to show the true difference. Also, the learning abilities of both
>>>programs have to be taken into account. The chess community still seems to
>>>lack the knowledge of how to measure the strength difference between two
>>>programs...
>>>
>>>Severi
>>
>>Okay I will run this tourney up to 200 games, and will post the result as soon
>>as the tourney is over, or will Email the PGN games to anybody interested.
>
>That begins to sound interesting. A 200-game match still has some margin of
>error, but we'll see a lot from that result. I'm looking forward to the
>results. It's not often that someone runs a 200+ game match here in CCC, thanks!
>
>Severi
Over 200 games, the margin of error for 80% reliability is +/-3.5%.
For 70% reliability it's +/-3.0%.
If a program wins the 200-game match with 53.5% (107-93) or more, you can say
with 80% reliability that it is stronger than its opponent.
If it wins with only 53% (106-94), you can say it is better, but only with 70%
reliability.
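
For anyone who wants to play with these numbers, here is a rough sketch of one way to estimate the reliability of a match result. It is not necessarily the method behind the figures above: it assumes a normal approximation, two programs of equal strength as the hypothesis to beat, and a draw rate you supply yourself (0 by default), so its output will not match the quoted percentages exactly. The function names and the draw_rate parameter are made up for the illustration.

```python
import math

def score_std_error(games, draw_rate=0.0):
    """Standard error of the score fraction for two equally strong programs.

    Assumption: each game is independent, a draw happens with probability
    draw_rate, and wins and losses split the rest evenly. The per-game
    variance of the score (1, 0.5 or 0) is then (1 - draw_rate) / 4.
    """
    return math.sqrt((1.0 - draw_rate) / 4.0 / games)

def confidence_stronger(points, games, draw_rate=0.0):
    """One-sided confidence that the program scoring `points` is stronger.

    Uses the normal approximation: how many standard errors the observed
    score fraction lies above 50%, converted with the normal CDF.
    """
    z = (points / games - 0.5) / score_std_error(games, draw_rate)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

if __name__ == "__main__":
    for points in (106, 107, 110):
        conf = confidence_stronger(points, 200)
        print(f"{points}-{200 - points}: confidence {conf:.3f}")
```

Raising draw_rate shrinks the standard error, so the same score becomes more convincing when many games are drawn.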
You see that when the programs are very close, you need a very large number of
games to determine which is the better one.
On the other hand, if there is a significant difference before you reach 200
games, it is possible to say which is better without playing all 200 games.
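
To get a feeling for how fast the required number of games grows, here is a small sketch. It assumes the same kind of simplification as any back-of-the-envelope estimate: a normal approximation, a one-sided test against "both programs are equal", and a draw rate you pick yourself (0 by default). The function name and parameters are hypothetical, chosen only for this example.

```python
import math
from statistics import NormalDist

def games_needed(score_fraction, confidence, draw_rate=0.0):
    """Approximate games needed before a given score is convincing.

    score_fraction: expected score of the better program (e.g. 0.535).
    confidence:     desired one-sided confidence (e.g. 0.80).
    draw_rate:      assumed probability of a draw per game.
    """
    if score_fraction <= 0.5:
        raise ValueError("score_fraction must be above 0.5")
    z = NormalDist().inv_cdf(confidence)            # normal quantile
    per_game_sd = math.sqrt((1.0 - draw_rate) / 4.0)
    edge = score_fraction - 0.5                     # lead over an even score
    # Require edge >= z * per_game_sd / sqrt(n) and solve for n.
    return math.ceil((z * per_game_sd / edge) ** 2)

if __name__ == "__main__":
    for score in (0.60, 0.535, 0.51):
        print(f"{score:.1%} score: about {games_needed(score, 0.80)} games")
```

Halving the lead over 50% multiplies the number of games by about four, which is why near-equal programs need such long matches.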
Christophe