Author: Robert Hyatt
Date: 05:07:53 10/21/98
On October 21, 1998 at 04:37:38, Nouveau wrote:
>
>On October 20, 1998 at 12:13:16, Dann Corbit wrote:
>
>>On October 20, 1998 at 10:37:36, Nouveau wrote:
>>>On October 20, 1998 at 01:36:22, Jouni Uski wrote:
>>>
>>>>Here's result for 20 games match with 60/5 time limit (under Winboard):
>>>>
>>>>Comet   0.5 0 1 0 0 0 1 1 0 1 0 0 0.5 0 0.5 1 0.5 0 1 0 = 8
>>>>Wcrafty 0.5 1 0 1 1 1 0 0 1 0 1 1 0.5 1 0.5 0 0.5 1 0 1 = 12
>>>>
>>>>So they are very close to each other in playing strength.
>>>>
>>>>Jouni
>>>
>>>12-8 is very close ??????????
>>>
>>>When can we say : Crafty is better than Comet ? 18-2 ?
>>>
>>>I don't understand these statistical stuff : I can't imagine a 12-8 result in a
>>>match between 2 GM with a conclusion like "They are very close in playing
>>>strength".
>>>
>>>Why do we need hundreds, maybe thousands of games between computers to evaluate
>>>relative strength, when few dozens are more than needed for human GMs ?
>>Any strong conclusion from a single match is faulty. It could be that Comet is
>>500 points above Crafty, or 500 points below (although both of these are
>>statistically very unlikely, really, very little has been demonstrated at this
>>point from a single set of games).
>
>Just imagine : the match between Kasparov and Chirov takes place and the result
>is : Kasparov-Chirov = 12-8.
>Maybe Kasparov is 500 points above Chirov or 500 points below...Show me any
>chess magazine that would print such an affirmation.
>I know, those chess journalists don't have a clue on science and stats ;o)
>
>> The international chess bodies like FIDE
>>have definitely got it right in the way that they perform evaluations using the
>>ELO method. Also, in requiring a long period of excellent results to become a
>>GM.
>
>Can someone make the math for this : a player has a 2600 level but no rating,
>how many games against a 2500 opposition does he need to reach 2600 ?
>

Easy here: one game. His rating would be 2700 after that one game, since the
first N games use the usual "TPR" type calculation.

>> I think, in general, statistics is not a strong point of chess programmers.
>> Surely there are some who are experts, but I see a lot of very strange
>>statements.
>>
>>In any scientific community, an experiment [read "match"] must be repeatable
>>before any sort of conclusion can be reached. (Does anyone remember the name
>>'Pons'?)
>

"Repeatability" is not really a requirement imposed by statistics... that is
what the "normal" curve (and the central limit theorem) is all about: the fact
that repeated tests can and will produce different results.

>That's true if we consider that chess is science...has the "community" a strong
>agreement on this ?
>
>Jeff
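
To put rough numbers on the 12-8 result discussed above, here is a minimal sketch in Python. It is an illustration under stated assumptions, not anyone's official method: it assumes the standard logistic Elo curve (400-point scale) and a normal approximation for the mean score, which is crude for only 20 games. The names `crafty` and `elo_diff` are made up for this example.

```python
import math
import statistics

# Crafty's per-game scores from the 20-game match quoted above.
crafty = [0.5, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0.5, 1, 0.5, 0, 0.5, 1, 0, 1]

def elo_diff(score):
    """Rating difference implied by an expected score.

    Assumption: logistic Elo curve, E = 1 / (1 + 10 ** (-d / 400)).
    """
    return -400.0 * math.log10(1.0 / score - 1.0)

n = len(crafty)
mean = sum(crafty) / n                             # 12 / 20 = 0.6
# Standard error of the mean score, from the sample standard deviation.
std_err = statistics.stdev(crafty) / math.sqrt(n)

# Rough 95% interval on the score (normal approximation), mapped to Elo.
low = mean - 1.96 * std_err
high = mean + 1.96 * std_err

print(f"score {mean:.2f}, implied difference {elo_diff(mean):+.0f} Elo")
print(f"rough 95% interval: {elo_diff(low):+.0f} to {elo_diff(high):+.0f} Elo")
```

Under these assumptions the point estimate is about +70 Elo for Crafty, but the interval runs from roughly -68 to +237 Elo. It includes zero, so a single 20-game match cannot separate "very close" from "clearly better", and a rerun of the same match could easily give a different score.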
Last modified: Thu, 15 Apr 21 08:11:13 -0700