Author: Graham Banks
Date: 02:37:22 10/18/05
On October 18, 2005 at 04:56:59, Uri Blass wrote:
>On October 18, 2005 at 03:18:00, Graham Banks wrote:
>
>>On October 18, 2005 at 02:08:06, Uri Blass wrote:
>>
>>>From Heinz van Kempen's words:
>>>>The majority of CEGT testers is not so keen on testing personalities, but Fruit
>>>>is a special case.
>>>
>>>I do not see that fruit is a special case based on the list.
>>>Based on looking at the list it seems that the only special case is chessmaster.
>>>
>>>Fruit has only one personality in the list except the default and I guess that
>>>it is not going to have more than it when chessmaster has 10 personalities in
>>>the list except the default.
>>>
>>>Of course it is the testers choice what to test but
>>>I wonder what is the reason that they prefer testing chessmaster.
>>>
>>>I counted 10 different personalities except the default and it is not clear if
>>>even one of them is stronger than the default when the possible error in the
>>>default's rating is 23 elo points.
>>>
>>>13 CM10th Milan 2.3 2679
>>>14 CM10th Pestilence 2678
>>>15 CM10th Behemoth 2676
>>>16 CM10th Cell 2676
>>>20 CM10th Imperator 2665
>>>21 CM10th Default 2664
>>>26 CM10th Berean 5.54 2650
>>>27 CM10th Steadfast 2643
>>>30 CM10th Behemoth II 2634
>>>34 CM10th D1Meandros 2628
>>>35 CM10th Yoda 2.5 2627
>>>
>>>Uri
>>
>>
>>Hi Uri,
>>
>>one thing I can tell you with 100% certainty is that all of these CM10th
>>settings are better than the default CM10th settings as the time control gets
>>longer. I can provide proof of this if you require it.
>
>I do not know if you are correct and I doubt if you have enough games against
>different opponents to prove it(I explain later in this post why I doubt if it
>can be correct).
>
>Unfortunately CEGT are not very interesting in comparison between different time
>control and I see only one chessmaster in 4/40 time control so we even have no
>evidence that all these personalities improve relative to the default when the
>time control is 40/40 relative to 40/4.
>
>>
>>When I joined CEGT, I was asked to run the 6+6 tournament I'm running involving
>>6 CM10th settings and 6 top engines.
>>As I was also restructuring my CM10th Showdown tournament at this point in time,
>>I offered to rerun it as a CEGT tournament.
>>
>>Note also that only the one setting of any program is included in one of the
>>rating lists.
>>
>>I feel it's a little bit like sour grapes to start questioning the worth of CEGT
>>now that Fruit 2.2 Uri isn't performing as well as hoped.
>>
>>Regards, Graham.
>
>It is not related.
>
>I also did not suggest that the CEGT will stop testing.
>I did not claim that no testing is better than testing but only that I do not
>understand the choice of the CEGT.
>
>Of course the CEGT like the SSDF is free to test what they want and if the SSDF
>will also prefer to test 10 different personalities of one program it is their
>right and I will not suggest them to stop testing because of it.
>
>It is not the first time that I do not understand the choice of CEGT.
>
>I also did not understand the choice to do small number of blitz games relative
>to long time control.
>
>The choice of blitz of 4/40 also seemed to me not very good and I thought that
>testers will prefer 2/40 for comparison with 40/40 but I read that some testers
>even prefered slower time control in the blitz games that is simply against all
>the idea of blitz games.
>
>The idea of blitz games is to compare between long time control and blitz to see
>if there are programs that are probably better in blitz.
>
>It may be possible to try to speculate from it about longer time control.
>
>As far as I know we usually see relatively small difference between 4/40 and
>40/40 and it may suggest that the difference in time control should be more than
>1:10 in order to see big difference so if there is no significant difference
>between CM default and other CM personality at 40/40 then I do not think that
>there is going to be a significant difference between CM default and other CM
>personality in a slower time control by factor of 2 or 3.
>
>Uri

Note the time control used and the performance of the default settings in relation to 40/30 on my machine.

THE GREAT CM10th SHOWDOWN!

Athlon XP1900+
128mb hash each
3,4,5 men tablebases
Ponder on
No opening books
78 rounds (2 cycles) at 90 mins + 30 secs

Standings after Round 32

20.5 - D1 Meandros
20.0 - WoDra
19.5 - GL
19.5 - Milan 2.6
19.0 - SoFar 2
19.0 - Tsunami
18.5 - Cell
18.5 - Clown 1.01
17.5 - Beast
17.5 - Milan 2.4
17.5 - Milan 1.5
17.5 - Milan 2.3
17.5 - R2D2
17.5 - Emperor
17.0 - Undertaker
16.5 - Behemoth
16.5 - C3PO
16.5 - Berean 5.54
16.0 - Milan 2.5
16.0 - R1X
16.0 - D1 Pyr
15.5 - Wrath
15.5 - Scorpion
15.5 - D2 Alos
15.5 - Steadfast
15.0 - Milan 2.1
15.0 - Salamander
14.5 - Berean 5.53
14.5 - SoFar
14.5 - Yoda 2.7
14.5 - Darth Vader
14.5 - Schumacher
14.5 - Juggernaut
14.5 - Predator
13.5 - Medusa
12.5 - Myrddin
12.5 - Solomon
12.0 - Default
12.0 - Cobra
11.0 - Vegeta 2d
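
For anyone who wants to check the arithmetic behind this exchange, here is a rough Python sketch. It makes two assumptions that are not from the thread: it compares each CEGT rating quoted above only against the default's 23-point error margin (the other personalities' own margins are not given, so they are ignored), and it uses the standard logistic Elo formula to turn a tournament score into a rating difference against the average of the field. The function names are made up for illustration, and none of this is claimed to be how CEGT computes its list.

```python
import math

# Ratings copied from the CEGT list quoted earlier in this thread.
CM10TH_RATINGS = {
    "Milan 2.3": 2679,
    "Pestilence": 2678,
    "Behemoth": 2676,
    "Cell": 2676,
    "Imperator": 2665,
    "Default": 2664,
    "Berean 5.54": 2650,
    "Steadfast": 2643,
    "Behemoth II": 2634,
    "D1Meandros": 2628,
    "Yoda 2.5": 2627,
}

# Error margin quoted for the default's rating (Elo points).
DEFAULT_ERROR_MARGIN = 23


def clearly_stronger_than_default(ratings, margin, baseline="Default"):
    """Names whose rating beats the baseline by more than the margin.

    Assumption: only the baseline's margin is considered; the other
    entries' own uncertainty is unknown here and is ignored.
    """
    base = ratings[baseline]
    return [name for name, rating in ratings.items()
            if name != baseline and rating - base > margin]


def elo_diff_from_score(points, games):
    """Elo difference implied by a score, via the standard logistic formula.

    The result is relative to the average opposition, not an absolute rating.
    Requires 0 < points < games.
    """
    fraction = points / games
    return -400 * math.log10(1 / fraction - 1)


if __name__ == "__main__":
    print(clearly_stronger_than_default(CM10TH_RATINGS, DEFAULT_ERROR_MARGIN))
    # Default scored 12.0 out of 32 in the Showdown standings above.
    print(round(elo_diff_from_score(12.0, 32), 1))
```

On the quoted list the biggest gap over the default is 15 points (Milan 2.3), which is inside the 23-point margin, so the first check comes back empty. The second number is only a crude indication after 32 of 78 rounds and says nothing about the error bars on the Showdown result.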
Last modified: Thu, 15 Apr 21 08:11:13 -0700