Author: Uri Blass
Date: 22:34:17 08/29/05
On August 29, 2005 at 18:51:53, Dann Corbit wrote:

>On August 29, 2005 at 18:21:33, Peter Berger wrote:
>
>>On August 29, 2005 at 10:40:54, Kurt Utzinger wrote:
>>
>>>On August 29, 2005 at 06:36:42, Jorge Pichard wrote:
>>>
>>>>   Engine   Score
>>>>1: Spike10  16/26  1=010==1===110=110110=11==
>>>>2: Zappa    10/26  0=101==0===001=001001=00==
>>>
>>> After only 26 games and a winning score of 61 %
>>> it's too early for such a statement I think.
>>> Kurt
>>
>>That's true. But only barely.
>>
>>Assuming that everything is set up properly, games are independent events ( aka
>>no learning) and that white and black have same likeliness to win (just for sake
>>of correctness, I am actually pretty sure this doesn't make a major difference),
>>
>>the result is good enough to claim that Spike is better with 90% confidence. And
>>only one more win in the following game would have been enough for 95%
>>confidence in fact ;) .
>>
>>How do you feel about this one?
>>
>>A 1 1 1 1 1
>>B 0 0 0 0 0
>>
>>More games needed? Not if you can live with 97% confidence .
>
>Of course, if we recall the Cadaques tournament of some years ago, it started as
>a whitewash for Junior, but Junior eventually lost (possibly due to learning so
>your statement above may apply).
>
>>Hmm, let's go back to the imagined 17/27 from Spike. We need more games?
>>
>>OK. Let's look at this result:
>>
>>Wins: 12
>>Loss: 5
>>Draws: 100000
>>
>>Better? Worse? No, the same.
>
>I don't put much credence in any result of less than 30 games.
>After 30 games, then you get a lot more plausibility.

I disagree.

If you see 20-0, you do not need to wait for 30 games to be practically sure that the winner is better.

On the other hand, I saw a result of 18-12 in a Nunn match where the side that scored 18-12 later dropped to less than 50%.

I think it may be better to continue until the difference in the score is 10 points and stop there. The idea of a fixed number of games is bad.

The point is that if the difference is big, you do not want to waste too much time on testing, and if the difference is small, it is less important to avoid mistakes in deciding which version is stronger.

Uri
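
To put rough numbers on the two ideas in this thread, here is a small Python sketch. It is not anybody's actual test setup. It assumes independent games with fixed win/draw/loss probabilities (no learning, no colour effect), and the function names and the example probabilities are made up for illustration. `sign_test_p` is the one-sided sign test that ignores draws; `play_until_lead` is the "stop at a 10 point lead" rule; `wrong_pick_probability` is the standard gambler's-ruin formula for how often that rule would stop in favour of the weaker side.

```python
import random
from math import comb


def sign_test_p(wins, losses):
    """Chance of at least `wins` wins in wins+losses decisive games
    if both engines were equal. Draws are ignored (assumption)."""
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


def play_until_lead(p_win, p_draw, lead=10, max_games=1_000_000, rng=None):
    """Simulate games for engine A until one side leads by `lead` points.

    A win moves the score difference by +1, a loss by -1, a draw by 0.
    Returns (final_difference, games_played). `max_games` is only a
    safety cap for the simulation, not part of the proposed rule.
    """
    rng = rng or random.Random()
    diff = 0
    games = 0
    while abs(diff) < lead and games < max_games:
        r = rng.random()
        if r < p_win:
            diff += 1
        elif r >= p_win + p_draw:
            diff -= 1
        games += 1
    return diff, games


def wrong_pick_probability(p_win, p_loss, lead=10):
    """Chance that the match stops with A at -lead, given A's per-game
    win and loss probabilities (gambler's ruin, symmetric barriers).
    If A is the stronger engine this is the chance of a wrong verdict."""
    if p_loss == 0:
        return 0.0
    if p_win == 0:
        return 1.0
    return 1.0 / (1.0 + (p_win / p_loss) ** lead)


if __name__ == "__main__":
    print("5-0   :", sign_test_p(5, 0))
    print("11-5  :", sign_test_p(11, 5))
    print("20-0  :", sign_test_p(20, 0))
    # Hypothetical engine A: 35% wins, 40% draws, 25% losses.
    print("wrong pick:", wrong_pick_probability(0.35, 0.25))
    print("one match :", play_until_lead(0.35, 0.40, rng=random.Random(1)))
```

Under these assumptions, 5-0 gives 1/32 (about 0.03, the 97% quoted above), 11 wins against 5 losses gives about 0.105 (the 90% figure), and 20-0 is roughly one in a million. The sketch also shows why draws do not matter for either calculation: they never move the score difference.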
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.