Computer Chess Club Archives

Subject: Re: PLEASE can someone test CM 6,7,8000 with other programs?

Author: Chessfun
Date: 13:22:49 04/02/01
On April 02, 2001 at 16:13:57, John Merlino wrote:

>On April 02, 2001 at 15:17:05, stuart taylor wrote:
>
>>On April 02, 2001 at 14:07:58, John Merlino wrote:
>>
>>>On April 02, 2001 at 06:37:22, stuart taylor wrote:
>>>
>>>>On April 01, 2001 at 13:37:31, Jorge wrote:
>>>>
>>>>>On April 01, 2001 at 06:31:46, stuart taylor wrote:
>>>>>
>>>>>>On April 01, 2001 at 01:39:10, Jorge wrote:
>>>>>>
>>>>>>>On April 01, 2001 at 00:21:52, Lin Harper wrote:
>>>>>>>
>>>>>>>>On March 31, 2001 at 22:15:53, stuart taylor wrote:
>>>>>>>>
>>>>>>>>>I'm longing to get to the bottom of this, and to know exactly where each of
>>>>>>>>>CM6000, CM7000 and CM8000 stand, in relation to each other, as well as compared
>>>>>>>>>to a selection of other programs, and how the three CMs compare in their
>>>>>>>>>comparisons to a few other programs.
>>>>>>>>>  Can someone do this one time, despite any difficulties involved?
>>>>>>>>>S.Taylor
>>>>>>>>     I've got CM6K and CM8K, no other recent programs. I've only got one
>>>>>>>>  computer, and that's the problem most people have. It's not good playing
>>>>>>>>  two programs against each other, with ponder off on both programs, IMO,
>>>>>>>>  because ponder on is the default on them all (I think), and that's the
>>>>>>>>  only way to give a program its full rein. Different programs will not
>>>>>>>>  necessarily be handicapped equally with ponder off. That's where auto
>>>>>>>>  232 comes in.
>>>>>>>
>>>>>>>Yes, I have 2 computers (PentIII 667 and Athlon 500, 128 Ram) and have both
>>>>>>>programs 6K and 8K installed. I don't have Auto232 though, so I play some of the
>>>>>>>games by hand.
>>>>>>
>>>>>>That could do a good job, I think, if you play all play all 8 games, but 4 on
>>>>>>each computer (2x Black and 2x White), and switch computers, so that each has
>>>>>>the same benefits and the same slight handicaps. That will also show how much
>>>>>>the slight speed differences affect each, if at all.
>>>>>>S.Taylor
>>>>>All right, I'm curious to find out too. Let me know which settings (Default? for
>>>>>both 6K and 8K) and Time control I can use. I will post the games here.
>>>>
>>>>That would be wonderful!
>>>>I think that tournament timings or thereabouts would be most interesting to know
>>>>about. Other settings, whatever is both strongest, and most equal to each other,
>>>>then rotate for the other half of the games.
>>>>  If it can be an actual miniature tournament, that would be excellent, by adding
>>>>one, two or three other of the high level programs and making it all play all,
>>>>so we can see how both CM's compare in their handling of each other program.
>>>> But a simple match will also be great. And if it is 8 games each, that gives a
>>>>very good chance for getting a good idea of things, as each computer can have
>>>>twice white and twice black for each of the CM's.(6K and 8K)
>>>> Four games each (in this way) is also good, especially if it is too much work,
>>>>and you're adding other programs (even one).
>>>>thanks,
>>>>S.Taylor
>>>
>>>To get ANY reasonable conclusion from a tournament between two programs, I
>>>suspect you would need at LEAST 20 games (and others here -- the more
>>>statistically aware of us -- would say that you probably need closer to 50-100
>>>games).
>>>
>>>Eight games will prove nothing, even if the score is 8-0 for the winner.
>>>
>>>Sadly, the only way to compare CM6000 vs. CM8000 is manually WITH TWO MACHINES.
>>>Running both of these programs on the same machine gives a CPU advantage to
>>>CM6000. I doubt many people have the time and the hardware for this kind of
>>>testing, so getting the required number of games will take quite a long time.
>>>
>>>jm
>>
>>I feel convinced that CM6K vs. CM8K vs. 3 other top programs all play all 8
>>times, in a sensible way, like what I believe Jorge is in a situation to do
>>(i.e. half & half) would indicate things very clearly, if the results are
>>clearly tilted one way or the other. It will also be possible to understand much
>>more from studying the games, and the nature of the results than merely looking
>>for cold statistics and nothing else.
>>  An 8-0 score is pretty much conclusive. 5-3 may not be. I think 6.5-1.5 looks
>>almost conclusive, whereas 6-2 is not yet.
>>  And even if after 8x4 (all play all)[32 games each, but not just any old
>>games] the results are quite close, a lot of other factors could be seen, which
>>would give a very good idea of where things stand. Certainly something to talk
>>about.
>>  And above all, it will be MUCH MUCH better than the darkness we're in now! And
>>I don't recall anyone on this board disagreeing that CM6K is stronger than CM8K.
>>That's where things stand at present, but perhaps we can see something just a
>>little bit clearer.
>>  Even a short match would be better than nothing. It will STILL be somewhat
>>ambiguous, but, I strongly believe, a bit less so.
>>S.Taylor
>
>You may think that "6.5-1.5 looks almost conclusive", but it really,
>statistically, does not allow for any conclusions. Dr. Hyatt has said that,
>given enough games (or a string of bad luck), even a 10-0 score really proves
>nothing! Look at this post (and thread):
>
>http://www.chessusa.com/forums/1/message.shtml?160782
>
>Now, truthfully I would honestly be concerned about a 10-0 score. :-) But I would
>still say that, statistically, it proves nothing because the total test sample
>is too small.
>
>However, you are correct in saying that ANALYZING the games CAN prove to be more
>useful than mere hard results. But, this will take even MORE time, of course. To
>analyze a tournament in which 5 engines played 32 games each, that would be
>analyzing 80 games. Quite time consuming both in CPU time and in human scrutiny
>of the analysis results.
>
>I'm not saying that it shouldn't be done; I'm just questioning the quality of
>the data against the amount of time it will take to acquire it.
>
>On the completely other side of the discussion (which is about the "Selective
>Search" setting) I am currently near the end of a 16-game tournament between
>CM8000 SS=6 and CM8000 SS=12 (with both personalities using a 32MB hash table)
>at tournament time controls. Right now, SS=12 is leading by 7.0-6.0 (+3 =8 -2).
>So, once again, there is still no statistical data that can show that SS=12 is
>better than SS=6, despite the fact that, in many cases, it solves test suite
>problems faster.
>
>As for your comment ("I don't recall anyone on this board disagreeing that CM6K
>is stronger than CM8K."), I also do not recall anybody stating with any
>certainty that CM6K WAS stronger than CM8K. People have expressed their opinions
>(on both sides of the argument), but nobody has shown any clear evidence one way
>or another.
>
>If this manual, two-machine tournament can be organized and completed (and I
>really don't see the need for the other three engines, as we're really only
>concerned about CM6K vs. CM8K), then finally we MIGHT have some useful data.
>Until then, we are all just speculating....
>
>jm
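
One way to put numbers on the sample-size argument in the quoted post is to ask how often two equally strong programs would produce a given score by chance. The Python below is only a rough sketch under assumptions that are not from the thread: games are independent, the draw rate is fixed, and color and opening effects are ignored. The names score_distribution and tail_probability are made up for illustration.

```python
from fractions import Fraction


def score_distribution(games, draw_rate=Fraction(0)):
    """Exact distribution of one side's score, in half-points,
    for a match between two equally strong programs.

    Assumption (not from the thread): every game is independent and
    has the same draw probability; wins and losses split the rest evenly.
    """
    win = loss = (1 - draw_rate) / 2
    dist = {0: Fraction(1)}  # half-points scored so far -> probability
    for _ in range(games):
        nxt = {}
        for half_points, p in dist.items():
            for gain, q in ((2, win), (1, draw_rate), (0, loss)):
                if q:
                    key = half_points + gain
                    nxt[key] = nxt.get(key, Fraction(0)) + p * q
        dist = nxt
    return dist


def tail_probability(games, score, draw_rate=Fraction(0)):
    """Chance that a named side scores at least `score` points
    out of `games` when both programs are equally strong."""
    threshold = int(round(score * 2))  # work in half-points
    dist = score_distribution(games, draw_rate)
    return sum(p for hp, p in dist.items() if hp >= threshold)


if __name__ == "__main__":
    for score in (5, 6, 6.5, 8):
        p = tail_probability(8, score, draw_rate=Fraction(1, 2))
        print(f"{score}-{8 - score} or better in 8 games: {float(p):.4f}")
```

With no draws at all, tail_probability(8, 8) is Fraction(1, 256). Raising the assumed draw rate makes lopsided scores between equal programs rarer still. This only measures how surprising a score would be if the programs were equal; it says nothing about how large any real strength difference is, which is where the larger game counts mentioned above come in.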


John, correct me if I'm wrong, but I believe there were test games played
by CM8K v either CM7K or CM6K. Can you supply more details of those games:
scores, machines, times, etc.?

Sarah.
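
The game counts quoted above (5 engines, 8 games per pairing, 32 games each, 80 in total) follow from ordinary all-play-all arithmetic. A minimal sketch, with round_robin_games as a hypothetical helper name:

```python
def round_robin_games(engines, games_per_pairing):
    """Return (games played by each engine, total games) for an
    all-play-all event where every pair meets `games_per_pairing` times."""
    pairings = engines * (engines - 1) // 2       # distinct pairs of engines
    per_engine = (engines - 1) * games_per_pairing
    total = pairings * games_per_pairing
    return per_engine, total


if __name__ == "__main__":
    print(round_robin_games(5, 8))  # (32, 80)
    print(round_robin_games(2, 8))  # (8, 8): the plain CM6K vs. CM8K match
```

Each game involves two engines, which is why the total is 80 rather than 5 x 32 = 160.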