Computer Chess Club Archives


Subject: Proof

Author: Graham Banks

Date: 02:37:22 10/18/05


On October 18, 2005 at 04:56:59, Uri Blass wrote:

>On October 18, 2005 at 03:18:00, Graham Banks wrote:
>
>>On October 18, 2005 at 02:08:06, Uri Blass wrote:
>>
>>>From Heinz van Kempen's words:
>>>>The majority of CEGT testers is not so keen on testing personalities, but Fruit
>>>>is a special case.
>>>
>>>I do not see that Fruit is a special case based on the list.
>>>Looking at the list, the only special case seems to be Chessmaster.
>>>
>>>Fruit has only one personality in the list besides the default, and I guess
>>>that it is not going to get more, whereas Chessmaster has 10 personalities in
>>>the list besides the default.
>>>
>>>Of course it is the testers' choice what to test, but
>>>I wonder why they prefer testing Chessmaster.
>>>
>>>I counted 10 different personalities besides the default, and it is not clear
>>>that even one of them is stronger than the default, given that the possible
>>>error in the default's rating is 23 Elo points.
>>>
>>>13 CM10th Milan 2.3 2679
>>>14 CM10th Pestilence 2678
>>>15 CM10th Behemoth 2676
>>>16 CM10th Cell 2676
>>>20 CM10th Imperator 2665
>>>21 CM10th Default 2664
>>>26 CM10th Berean 5.54 2650
>>>27 CM10th Steadfast 2643
>>>30 CM10th Behemoth II 2634
>>>34 CM10th D1Meandros 2628
>>>35 CM10th Yoda 2.5 2627
>>>
>>>Uri
>>
>>
>>Hi Uri,
>>
>>one thing I can tell you with 100% certainty is that all of these CM10th
>>settings are better than the default CM10th settings as the time control gets
>>longer.  I can provide proof of this if you require it.
>
>I do not know if you are correct, and I doubt that you have enough games against
>different opponents to prove it (I explain later in this post why I doubt that
>it can be correct).
>
>Unfortunately CEGT is not very interested in comparing different time
>controls, and I see only one Chessmaster at the 4/40 time control, so we have no
>evidence that all these personalities improve relative to the default when the
>time control is 40/40 rather than 40/4.
>
>>
>>When I joined CEGT, I was asked to run the 6+6 tournament I'm running involving
>>6 CM10th settings and 6 top engines.
>>As I was also restructuring my CM10th Showdown tournament at this point in time,
>>I offered to rerun it as a CEGT tournament.
>>
>>Note also that only the one setting of any program is included in one of the
>>rating lists.
>>
>>I feel it's a little bit like sour grapes to start questioning the worth of CEGT
>>now that Fruit 2.2 Uri isn't performing as well as hoped.
>>
>>Regards, Graham.
>
>It is not related.
>
>I also did not suggest that the CEGT should stop testing.
>I did not claim that no testing is better than testing, only that I do not
>understand the choice of the CEGT.
>
>Of course the CEGT, like the SSDF, is free to test what it wants, and if the
>SSDF also prefers to test 10 different personalities of one program, that is
>its right and I will not suggest that it stop testing because of it.
>
>It is not the first time that I do not understand the choice of CEGT.
>
>I also did not understand the choice to play a small number of blitz games
>relative to long time control games.
>
>The choice of 4/40 for blitz also seemed not very good to me. I thought that
>testers would prefer 2/40 for comparison with 40/40, but I read that some
>testers even preferred a slower time control for the blitz games, which simply
>goes against the whole idea of blitz games.
>
>The idea of blitz games is to compare long time control with blitz to see
>if there are programs that are probably better in blitz.
>
>It may be possible to speculate from that about longer time controls.
>
>As far as I know we usually see a relatively small difference between 4/40 and
>40/40, which may suggest that the ratio between time controls has to be more
>than 1:10 in order to see a big difference. So if there is no significant
>difference between CM default and another CM personality at 40/40, then I do not
>think that there is going to be a significant difference between CM default and
>another CM personality at a time control slower by a factor of 2 or 3.
>
>Uri
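
For reference, the error-margin argument quoted above can be checked directly against the CEGT figures in the quote. This is only a rough sketch: it applies the 23-point margin to the default's rating alone, since the quote gives no margins for the other personalities, and the function name `classify` is made up for illustration.

```python
# Ratings copied from the CEGT list quoted above.
ratings = {
    "Milan 2.3": 2679, "Pestilence": 2678, "Behemoth": 2676, "Cell": 2676,
    "Imperator": 2665, "Default": 2664, "Berean 5.54": 2650,
    "Steadfast": 2643, "Behemoth II": 2634, "D1Meandros": 2628,
    "Yoda 2.5": 2627,
}
ERROR_MARGIN = 23  # quoted possible error in the default's rating

def classify(rating, baseline=ratings["Default"], margin=ERROR_MARGIN):
    """Label a rating relative to the default, using only the default's margin."""
    diff = rating - baseline
    if diff > margin:
        return "above"
    if diff < -margin:
        return "below"
    return "within"

for name, rating in ratings.items():
    print(f"{name:12s} {rating - ratings['Default']:+4d}  {classify(rating)}")
```

On these numbers no personality is above the default by more than the quoted margin, which is the point being argued in the quote. It says nothing about longer time controls.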



Note the time control used and the performance of the default settings in
relation to 40/30 on my machine.

THE GREAT CM10th SHOWDOWN!

Athlon XP1900+
128mb hash each
3,4,5 men tablebases
Ponder on
No opening books
78 rounds (2 cycles) at 90 mins + 30 secs


Standings after Round 32

20.5 - D1 Meandros
20.0 - WoDra
19.5 - GL
19.5 - Milan 2.6
19.0 - SoFar 2
19.0 - Tsunami
18.5 - Cell
18.5 - Clown 1.01
17.5 - Beast
17.5 - Milan 2.4
17.5 - Milan 1.5
17.5 - Milan 2.3
17.5 - R2D2
17.5 - Emperor
17.0 - Undertaker
16.5 - Behemoth
16.5 - C3PO
16.5 - Berean 5.54
16.0 - Milan 2.5
16.0 - R1X
16.0 - D1 Pyr
15.5 - Wrath
15.5 - Scorpion
15.5 - D2 Alos
15.5 - Steadfast
15.0 - Milan 2.1
15.0 - Salamander
14.5 - Berean 5.53
14.5 - SoFar
14.5 - Yoda 2.7
14.5 - Darth Vader
14.5 - Schumacher
14.5 - Juggernaut
14.5 - Predator
13.5 - Medusa
12.5 - Myrddin
12.5 - Solomon
12.0 - Default
12.0 - Cobra
11.0 - Vegeta 2d
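
For anyone who wants to put rough numbers on these standings, here is a minimal sketch that turns a score after 32 rounds into an Elo difference against the field average, with an approximate 95% interval. It assumes one game per round, uses the standard logistic Elo formula, and, because the win/draw/loss split is not listed here, uses p(1-p) as an upper bound on the per-game variance. The function names are made up for illustration.

```python
import math

GAMES_PLAYED = 32  # "Standings after Round 32", assuming one game per round

def elo_diff(score_fraction):
    """Elo difference implied by a score fraction p, with 0 < p < 1."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

def elo_interval(points, games=GAMES_PLAYED, z=1.96):
    """Return (low, mid, high) Elo differences for a score of points/games."""
    p = points / games
    # Upper bound on the standard error: draws would make it smaller.
    se = math.sqrt(p * (1.0 - p) / games)
    low = max(p - z * se, 1e-6)
    high = min(p + z * se, 1.0 - 1e-6)
    return elo_diff(low), elo_diff(p), elo_diff(high)

# Scores taken from the standings above.
for name, points in [("D1 Meandros", 20.5), ("Milan 2.3", 17.5), ("Default", 12.0)]:
    low, mid, high = elo_interval(points)
    print(f"{name:12s} {mid:+6.0f} Elo  (approx. {low:+.0f} to {high:+.0f})")
```

With only 32 games per setting the intervals come out very wide, so the order in the table can still change a lot before round 78.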





Last modified: Thu, 15 Apr 21 08:11:13 -0700
