Author: Heinz van Kempen
Date: 02:42:45 10/18/05
On October 18, 2005 at 02:09:42, billiau wrote:

>Hi,
>
>I noticed Fruit 2.2 standard did not play against the strong Deep Fritz8.
>
>I agree, we need a lot of games against a lot of opponents due to the way CEGT
>do the tests (different opponents, different hardware...).
>
>I am a bit surprised by the Spike match result (compared with blitz results).
>The other ones do not seem so bad for the moment.
>
>I think it's too early to reject this setting.
>This setting should not be considered like the other ones.
>We only changed the history pruning threshold (that's all).
>
>I know it's a lot of work to test these programs.
>Please, continue this good work to be sure we don't lose something great.
>
>G. Billiau

Hi,

you are right, it is much too early to draw conclusions, yet it is done again and again. Even worse, when there are surprising results, there are insinuations that the test conditions might be flawed, that something is wrong when there isn't. People just do not really understand that many games are needed.

We have another example. I continued the Spike match, and this time Fruit 2.2 Uri is leading by 7.5 to 0.5. No use asking why, it just happens. These are the usual statistical fluctuations. Run ten matches between Spike and Fruit and you will get all kinds of results. So easy.

Best Regards
Heinz
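The point about running ten matches and getting all kinds of results can be tried out with a small simulation. What follows is only a rough sketch under stated assumptions, not a model of the actual CEGT testing: the two engines are treated as exactly equal in strength, every game is independent, and the win and draw probabilities (0.35 and 0.30) and the 8-game match length are invented for illustration. The function names play_match and simulate_matches are hypothetical.

```python
import random


def play_match(games, p_win, p_draw, rng):
    """Return engine A's score over `games` games (win=1, draw=0.5, loss=0)."""
    score = 0.0
    for _ in range(games):
        r = rng.random()
        if r < p_win:
            score += 1.0          # A wins
        elif r < p_win + p_draw:
            score += 0.5          # draw
        # otherwise A loses and scores nothing
    return score


def simulate_matches(matches=10, games=8, p_win=0.35, p_draw=0.30, seed=1):
    """Simulate several short matches between two engines.

    The probabilities are assumptions chosen for illustration; with
    p_win == 1 - p_win - p_draw the two engines are equally strong.
    """
    rng = random.Random(seed)
    return [play_match(games, p_win, p_draw, rng) for _ in range(matches)]


if __name__ == "__main__":
    games = 8
    for i, score in enumerate(simulate_matches(games=games), start=1):
        print(f"match {i}: {score} - {games - score}")
```

Changing the seed or raising the number of games per match shows how much the score of a short match can swing even when neither engine is stronger, and how the swings shrink relative to the match length as more games are played.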
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.