Author: Uri Blass
Date: 03:05:48 10/18/05
On October 18, 2005 at 05:59:08, Graham Banks wrote:
>On October 18, 2005 at 05:49:10, Uri Blass wrote:
>
>>On October 18, 2005 at 05:37:22, Graham Banks wrote:
>>
>>>On October 18, 2005 at 04:56:59, Uri Blass wrote:
>>>
>>>>On October 18, 2005 at 03:18:00, Graham Banks wrote:
>>>>
>>>>>On October 18, 2005 at 02:08:06, Uri Blass wrote:
>>>>>
>>>>>>From Heinz van Kempen's words:
>>>>>>>The majority of CEGT testers is not so keen on testing personalities, but Fruit
>>>>>>>is a special case.
>>>>>>
>>>>>>I do not see that fruit is a special case based on the list.
>>>>>>Based on looking at the list it seems that the only special case is chessmaster.
>>>>>>
>>>>>>Fruit has only one personality in the list except the default and I guess that
>>>>>>it is not going to have more than it when chessmaster has 10 personalities in
>>>>>>the list except the default.
>>>>>>
>>>>>>Of course it is the testers choice what to test but
>>>>>>I wonder what is the reason that they prefer testing chessmaster.
>>>>>>
>>>>>>I counted 10 different personalities except the default and it is not clear if
>>>>>>even one of them is stronger than the default when the possible error in the
>>>>>>default's rating is 23 elo points.
>>>>>>
>>>>>>13 CM10th Milan 2.3 2679
>>>>>>14 CM10th Pestilence 2678
>>>>>>15 CM10th Behemoth 2676
>>>>>>16 CM10th Cell 2676
>>>>>>20 CM10th Imperator 2665
>>>>>>21 CM10th Default 2664
>>>>>>26 CM10th Berean 5.54 2650
>>>>>>27 CM10th Steadfast 2643
>>>>>>30 CM10th Behemoth II 2634
>>>>>>34 CM10th D1Meandros 2628
>>>>>>35 CM10th Yoda 2.5 2627
>>>>>>
>>>>>>Uri
>>>>>>Uri
>>>>>
>>>>>
>>>>>Hi Uri,
>>>>>
>>>>>one thing I can tell you with 100% certainty is that all of these CM10th
>>>>>settings are better than the default CM10th settings as the time control gets
>>>>>longer. I can provide proof of this if you require it.
>>>>
>>>>I do not know if you are correct and I doubt if you have enough games against
>>>>different opponents to prove it(I explain later in this post why I doubt if it
>>>>can be correct).
>>>>
>>>>Unfortunately CEGT are not very interesting in comparison between different time
>>>>control and I see only one chessmaster in 4/40 time control so we even have no
>>>>evidence that all these personalities improve relative to the default when the
>>>>time control is 40/40 relative to 40/4.
>>>>
>>>>>
>>>>>When I joined CEGT, I was asked to run the 6+6 tournament I'm running involving
>>>>>6 CM10th settings and 6 top engines.
>>>>>As I was also restructuring my CM10th Showdown tournament at this point in time,
>>>>>I offered to rerun it as a CEGT tournament.
>>>>>
>>>>>Note also that only the one setting of any program is included in one of the
>>>>>rating lists.
>>>>>
>>>>>I feel it's a little bit like sour grapes to start questioning the worth of CEGT
>>>>>now that Fruit 2.2 Uri isn't performing as well as hoped.
>>>>>
>>>>>Regards, Graham.
>>>>
>>>>It is not related.
>>>>
>>>>I also did not suggest that the CEGT will stop testing.
>>>>I did not claim that no testing is better than testing but only that I do not
>>>>understand the choice of the CEGT.
>>>>
>>>>Of course the CEGT like the SSDF is free to test what they want and if the SSDF
>>>>will also prefer to test 10 different personalities of one program it is their
>>>>right and I will not suggest them to stop testing because of it.
>>>>
>>>>It is not the first time that I do not understand the choice of CEGT.
>>>>
>>>>I also did not understand the choice to do small number of blitz games relative
>>>>to long time control.
>>>>
>>>>The choice of blitz of 4/40 also seemed to me not very good and I thought that
>>>>testers will prefer 2/40 for comparison with 40/40 but I read that some testers
>>>>even prefered slower time control in the blitz games that is simply against all
>>>>the idea of blitz games.
>>>>
>>>>The idea of blitz games is to compare between long time control and blitz to see
>>>>if there are programs that are probably better in blitz.
>>>>
>>>>It may be possible to try to speculate from it about longer time control.
>>>>
>>>>As far as I know we usually see relatively small difference between 4/40 and
>>>>40/40 and it may suggest that the difference in time control should be more than
>>>>1:10 in order to see big difference so if there is no significant difference
>>>>between CM default and other CM personality at 40/40 then I do not think that
>>>>there is going to be a significant difference between CM default and other CM
>>>>personality in a slower time control by factor of 2 or 3.
>>>>
>>>>Uri
>>>
>>>
>>>
>>>Note the time control used and the performance of the default settings in
>>>relation to 40/30 on my machine.
>>>
>>>THE GREAT CM10th SHOWDOWN!
>>>
>>>Athlon XP1900+
>>>128mb hash each
>>>3,4,5 men tablebases
>>>Ponder on
>>>No opening books
>>>78 rounds (2 cycles) at 90 mins + 30 secs
>>>
>>>
>>>Standings after Round 32
>>>
>>>20.5 - D1 Meandros
>>>20.0 - WoDra
>>>19.5 - GL
>>>19.5 - Milan 2.6
>>>19.0 - SoFar 2
>>>19.0 - Tsunami
>>>18.5 - Cell
>>>18.5 - Clown 1.01
>>>17.5 - Beast
>>>17.5 - Milan 2.4
>>>17.5 - Milan 1.5
>>>17.5 - Milan 2.3
>>>17.5 - R2D2
>>>17.5 - Emperor
>>>17.0 - Undertaker
>>>16.5 - Behemoth
>>>16.5 - C3PO
>>>16.5 - Berean 5.54
>>>16.0 - Milan 2.5
>>>16.0 - R1X
>>>16.0 - D1 Pyr
>>>15.5 - Wrath
>>>15.5 - Scorpion
>>>15.5 - D2 Alos
>>>15.5 - Steadfast
>>>15.0 - Milan 2.1
>>>15.0 - Salamander
>>>14.5 - Berean 5.53
>>>14.5 - SoFar
>>>14.5 - Yoda 2.7
>>>14.5 - Darth Vader
>>>14.5 - Schumacher
>>>14.5 - Juggernaut
>>>14.5 - Predator
>>>13.5 - Medusa
>>>12.5 - Myrddin
>>>12.5 - Solomon
>>>12.0 - Default
>>>12.0 - Cobra
>>>11.0 - Vegeta 2d
>>
>>No proof.
>>
>>number of games is not enough.
>>
>>The same program can score 12/32 in one tournament and 20/32 in another
>>tournament even without changing the time control.
>>
>Not under these conditions if you look - "no books"

You may be right, but these are not the CEGT conditions: CEGT games start from predefined book positions.
A program that is better with no book at a longer time control may not be better when a small book is used.

You have some evidence to support the opinion that the Chessmaster default may be relatively worse at long time controls, but not enough to prove it, even if we want only 95% certainty.

Uri
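
To put a rough number on the "95% certainty" point, here is a minimal sketch of a normal-approximation interval for a score such as the Default's 12/32. It is only an illustration under stated assumptions: games are treated as independent with a constant result distribution, and the function name `score_interval` and its `draw_rate` parameter are hypothetical, since the thread does not give the actual number of draws.

```python
import math

def score_interval(points, games, draw_rate=0.0, z=1.96):
    """Approximate interval for the expected score fraction per game.

    Assumptions (not from the thread): games are independent and share
    one result distribution. draw_rate is an assumed fraction of drawn
    games; z=1.96 corresponds to roughly 95% coverage.
    """
    p = points / games
    # A draw rate d requires win rate p - d/2 >= 0 and loss rate 1 - p - d/2 >= 0.
    if not 0.0 <= draw_rate <= 2.0 * min(p, 1.0 - p):
        raise ValueError("draw_rate is inconsistent with the score")
    # Variance of one result in {0, 0.5, 1} with mean p and draw rate d.
    variance = p * (1.0 - p) - draw_rate / 4.0
    half_width = z * math.sqrt(variance / games)
    return p - half_width, p + half_width

# Default's score in the quoted standings: 12 points from 32 games.
low, high = score_interval(12, 32)
print(f"12/32 with no draws assumed: {low:.3f} to {high:.3f}")
print("50% score inside the interval:", low < 0.5 < high)
```

With no draws assumed, the interval for 12/32 runs from roughly 0.21 to 0.54, so an expected score of 50% is not excluded. Assuming some draws narrows the interval; the check can be repeated with any `draw_rate` one considers plausible.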
Last modified: Thu, 15 Apr 21 08:11:13 -0700