Author: Dann Corbit
Date: 09:13:16 10/20/98
On October 20, 1998 at 10:37:36, Nouveau wrote:

>On October 20, 1998 at 01:36:22, Jouni Uski wrote:
>
>>Here's result for 20 games match with 60/5 time limit (under Winboard):
>>
>>Comet   0.5 0 1 0 0 0 1 1 0 1 0 0 0.5 0 0.5 1 0.5 0 1 0 = 8
>>Wcrafty 0.5 1 0 1 1 1 0 0 1 0 1 1 0.5 1 0.5 0 0.5 1 0 1 = 12
>>
>>So they are very close to each other in playing strength.
>>
>>Jouni
>
>12-8 is very close ??????????
>
>When can we say : Crafty is better than Comet ? 18-2 ?
>
>I don't understand this statistical stuff : I can't imagine a 12-8 result in a
>match between 2 GM with a conclusion like "They are very close in playing
>strength".
>
>Why do we need hundreds, maybe thousands of games between computers to evaluate
>relative strength, when few dozens are more than needed for human GMs ?

Any strong conclusion from a single match is faulty. Comet could be 500 points above Crafty, or 500 points below. Both of these are statistically very unlikely, but really, very little has been demonstrated at this point from a single set of games.

The international chess bodies like FIDE have definitely got it right in the way that they perform evaluations using the Elo method, and also in requiring a long period of excellent results to become a GM.

I think, in general, statistics is not a strong point of chess programmers. Surely there are some who are experts, but I see a lot of very strange statements.

In any scientific community, an experiment [read "match"] must be repeatable before any sort of conclusion can be reached. (Does anyone remember the name 'Pons'?)
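To put rough numbers on how little a 20-game match shows, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the games are treated as independent, the interval uses a normal approximation (crude at 20 games), and the rating gap comes from the standard logistic Elo expectation formula. The names `CRAFTY_SCORES`, `elo_diff` and `score_interval` are made up for this example.

```python
import math
from statistics import mean, stdev

# Crafty's per-game scores, copied from the match result quoted above.
CRAFTY_SCORES = [0.5, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0,
                 1.0, 1.0, 0.5, 1.0, 0.5, 0.0, 0.5, 1.0, 0.0, 1.0]


def elo_diff(score_fraction):
    """Rating gap implied by an expected score, using the logistic Elo model.

    Only defined for 0 < score_fraction < 1.
    """
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


def score_interval(scores, z=1.96):
    """Return (low, point, high) for the mean score.

    Assumption: normal approximation with independent games; z=1.96
    corresponds to roughly 95% coverage.
    """
    p = mean(scores)
    se = stdev(scores) / math.sqrt(len(scores))  # standard error of the mean
    return p - z * se, p, p + z * se


low, p, high = score_interval(CRAFTY_SCORES)
print(f"score fraction: {p:.3f}  (approx. 95% interval {low:.3f} to {high:.3f})")
print(f"implied Elo gap: {elo_diff(p):+.0f}  "
      f"(interval {elo_diff(low):+.0f} to {elo_diff(high):+.0f})")
```

Under these assumptions the interval on Crafty's score fraction runs from about 40% to about 80%, so it straddles 50%: the 12-8 result is compatible with Crafty being clearly stronger and also with Comet being slightly stronger. Because the standard error shrinks only with the square root of the number of games, narrowing that range takes many more games.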