Computer Chess Club Archives


Subject: Re: Comet A.96 - Wcrafty15.20 20 games blitz match

Author: Dann Corbit

Date: 09:13:16 10/20/98


On October 20, 1998 at 10:37:36, Nouveau wrote:
>On October 20, 1998 at 01:36:22, Jouni Uski wrote:
>
>>Here's result for 20 games match with 60/5 time limit (under Winboard):
>>
>>Comet    0.5 0 1 0 0 0 1 1 0 1 0 0 0.5 0 0.5 1 0.5 0 1 0   = 8
>>Wcrafty  0.5 1 0 1 1 1 0 0 1 0 1 1 0.5 1 0.5 0 0.5 1 0 1   = 12
>>
>>So they are very close to each other in playing strength.
>>
>>Jouni
>
>12-8 is very close ??????????
>
>When can we say : Crafty is better than Comet ? 18-2 ?
>
>I don't understand this statistical stuff : I can't imagine a 12-8 result in a
>match between 2 GMs with a conclusion like "They are very close in playing
>strength".
>
>Why do we need hundreds, maybe thousands of games between computers to evaluate
>relative strength, when few dozens are more than needed for human GMs ?
Any strong conclusion from a single match is faulty.  Comet could be 500 points
above Crafty, or 500 points below.  Both extremes are statistically very
unlikely, but a single set of 20 games demonstrates very little either way.
The international chess bodies like FIDE have definitely got it right, both in
the way they perform evaluations using the Elo method and in requiring a long
period of excellent results to become a GM.  I think that, in general,
statistics is not a strong point of chess programmers.  Surely there are some
who are experts, but I see a lot of very strange statements.
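
To put rough numbers on how little a 12-8 result pins down, here is a minimal
sketch in Python. It is not part of the original post. It assumes the usual
logistic Elo curve (expected score 1 / (1 + 10^(-d/400))) and a normal
approximation for the error of the mean score, which is crude at 20 games. The
win/draw/loss counts are read off the score line quoted above, and the function
names are made up for illustration.

```python
import math


def elo_from_score(p, eps=1e-6):
    """Rating difference implied by an expected score p (logistic Elo curve)."""
    p = min(max(p, eps), 1.0 - eps)  # keep the logarithm finite at 0 and 1
    return 400.0 * math.log10(p / (1.0 - p))


def match_summary(wins, draws, losses, z=1.96):
    """Score fraction, implied Elo difference, and an approximate 95% interval.

    Assumption: per-game results are independent and the mean score is
    approximately normal, which is only a rough guide for short matches.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Variance of a single game result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    lo, hi = score - z * se, score + z * se
    return score, elo_from_score(score), elo_from_score(lo), elo_from_score(hi)


# Wcrafty's side of the quoted match: 10 wins, 4 draws, 6 losses = 12/20.
score, elo, elo_lo, elo_hi = match_summary(10, 4, 6)
print(f"score {score:.2f}, Elo diff {elo:+.0f}, "
      f"approx. 95% interval {elo_lo:+.0f} to {elo_hi:+.0f}")
```

Under these assumptions the point estimate is around +70 Elo for Crafty, but
the interval runs from somewhat below zero to well over +200, so the match is
consistent both with equal strength and with a clear gap.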

In any scientific community, an experiment [read "match"] must be repeatable
before any sort of conclusion can be reached. (Does anyone remember the name
'Pons'?)



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.