Author: Chessfun
Date: 13:22:49 04/02/01
On April 02, 2001 at 16:13:57, John Merlino wrote:

>On April 02, 2001 at 15:17:05, stuart taylor wrote:
>
>>On April 02, 2001 at 14:07:58, John Merlino wrote:
>>
>>>On April 02, 2001 at 06:37:22, stuart taylor wrote:
>>>
>>>>On April 01, 2001 at 13:37:31, Jorge wrote:
>>>>
>>>>>On April 01, 2001 at 06:31:46, stuart taylor wrote:
>>>>>
>>>>>>On April 01, 2001 at 01:39:10, Jorge wrote:
>>>>>>
>>>>>>>On April 01, 2001 at 00:21:52, Lin Harper wrote:
>>>>>>>
>>>>>>>>On March 31, 2001 at 22:15:53, stuart taylor wrote:
>>>>>>>>
>>>>>>>>>I'm longing to get to the bottom of this, and to know exactly where each of
>>>>>>>>>CM6000, CM7000 and CM8000 stand, in relation to each other, as well as compared
>>>>>>>>>to a selection of other programs, and how the three CMs compare in their
>>>>>>>>>comparisons to a few other programs.
>>>>>>>>> Can someone do this one time, despite any difficulties involved?
>>>>>>>>>S.Taylor
>>>>>>>> I've got CM6K and CM8K, no other recent programs. I've only got one
>>>>>>>> computer, and that's the problem most people have. It's not good playing
>>>>>>>> two programs against each other, with ponder off on both programs, IMO,
>>>>>>>> because ponder on is the default on them all (I think), and that's the
>>>>>>>> only way to give a program its full rein. Different programs will not
>>>>>>>> necessarily be handicapped equally with ponder off. That's where auto
>>>>>>>> 232 comes in.
>>>>>>>
>>>>>>>Yes, I have 2 computers (PentIII 667 and Athlon 500, 128 Ram) and have both
>>>>>>>programs 6K and 8K installed. I don't have Auto232 though, so I play some of the
>>>>>>>games by hand.
>>>>>>
>>>>>>That could do a good job, I think, if you play all play all 8 games, but 4 on
>>>>>>each computer (2x Black and 2x White), and switch computers, so that each has
>>>>>>the same benefits and the same slight handicaps. That will also show how much
>>>>>>the slight speed differences affect each, if at all.
>>>>>>S.Taylor
>>>>>All right, I'm curious to find out too. Let me know which settings (Default? for
>>>>>both 6K and 8K) and Time control I can use. I will post the games here.
>>>>
>>>>That would be wonderful!
>>>>I think that tournament timings or thereabouts would be most interesting to know
>>>>about. Other settings, whatever is both strongest, and most equal to each other,
>>>>then rotate for the other half of the games.
>>>> If it can be an actual miniature tournament, that would be excellent, by adding
>>>>one, two or three other of the high level programs and making it all play all,
>>>>so we can see how both CM's compare in their handling of each other program.
>>>> But a simple match will also be great. And if it is 8 games each, that gives a
>>>>very good chance for getting a good idea of things, as each computer can have
>>>>twice white and twice black for each of the CM's (6K and 8K).
>>>> Four games each (in this way) is also good, especially if it is too much work,
>>>>and you're adding other programs (even one).
>>>>thanks,
>>>>S.Taylor
>>>
>>>To get ANY reasonable conclusion from a tournament between two programs, I
>>>suspect you would need at LEAST 20 games (and others here -- the more
>>>statistically aware of us -- would say that you probably need closer to 50-100
>>>games).
>>>
>>>Eight games will prove nothing, even if the score is 8-0 for the winner.
>>>
>>>Sadly, the only way to compare CM6000 vs. CM8000 is manually WITH TWO MACHINES.
>>>Running both of these programs on the same machine gives a CPU advantage to
>>>CM6000. I doubt many people have the time and the hardware for this kind of
>>>testing, so getting the required number of games will take quite a long time.
>>>
>>>jm
>>
>>I feel convinced that CM6K vs. CM8K vs. 3 other top programs all play all 8
>>times, in a sensible way, like what I believe Jorge is in a situation to do
>>(i.e. half & half) would indicate things very clearly, if the results are
>>clearly tilted one way or the other. It will also be possible to understand much
>>more from studying the games, and the nature of the results than merely looking
>>for cold statistics and nothing else.
>> An 8-0 score is pretty much conclusive. 5-3 may not be. I think 6.5-1.5 looks
>>almost conclusive, whereas 6-2 is not yet.
>> And even if after 8x4 (all play all)[32 games each, but not just any old
>>games] the results are quite close, a lot of other factors could be seen, which
>>would give a very good idea of where things stand. Certainly something to talk
>>about.
>> And, above all, it will be MUCH MUCH better than the darkness we're in now! And
>>I don't recall anyone on this board disagreeing that CM6K is stronger than CM8K.
>>That's where things stand at present, but perhaps we can see something just a
>>little bit clearer.
>> Even a short match would be better than nothing. It will STILL be somewhat
>>ambiguous, but I strongly believe, a bit less so.
>>S.Taylor
>
>You may think that "6.5-1.5 looks almost conclusive", but it really,
>statistically, does not allow for any conclusions. Dr. Hyatt has said that,
>given enough games (or a string of bad luck), even a 10-0 score really proves
>nothing! Look at this post (and thread):
>
>http://www.chessusa.com/forums/1/message.shtml?160782
>
>Now, truthfully I would honestly be concerned about a 10-0 score. :-) But I would
>still say that, statistically, it proves nothing because the total test sample
>is too small.
>
>However, you are correct in saying that ANALYZING the games CAN prove to be more
>useful than mere hard results. But, this will take even MORE time, of course. To
>analyze a tournament in which 5 engines played 32 games each, that would be
>analyzing 80 games. Quite time consuming both in CPU time and in human scrutiny
>of the analysis results.
>
>I'm not saying that it shouldn't be done; I'm just questioning the quality of
>the data against the amount of time it will take to acquire it.
>
>On the completely other side of the discussion (which is about the "Selective
>Search" setting) I am currently near the end of a 16-game tournament between
>CM8000 SS=6 and CM8000 SS=12 (with both personalities using a 32MB hash table)
>at tournament time controls. Right now, SS=12 is leading by 7.0-6.0 (+3 =8 -2).
>So, once again, there is still no statistical data that can show that SS=12 is
>better than SS=6, despite the fact that, in many cases, it solves test suite
>problems faster.
>
>As for your comment ("I don't recall anyone on this board disagreeing that CM6K
>is stronger than CM8K."), I also do not recall anybody stating with any
>certainty that CM6K WAS stronger than CM8K. People have expressed their opinions
>(on both sides of the argument), but nobody has shown any clear evidence one way
>or another.
>
>If this manual, two-machine tournament can be organized and completed (and I
>really don't see the need for the other three engines, as we're really only
>concerned about CM6K vs. CM8K), then finally we MIGHT have some useful data.
>Until then, we are all just speculating....
>
>jm

John, correct me if I'm wrong, but I believe there were test games played by
CM8K v either CM7K or CM6K. Can you supply more details of those game scores,
machines, times, etc.?

Sarah.
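For anyone who wants to check the arithmetic quoted above (80 games for a 5-engine all-play-all at 8 games per pairing, and 7.0-6.0 from +3 =8 -2), here is a rough Python sketch. The function names are made up for illustration, and the interval uses a plain normal approximation, which is an assumption on my part and only a crude guide for samples this small. It is not the method anyone in the thread used.

```python
import math


def round_robin_games(engines: int, games_per_pairing: int) -> int:
    """Total games when every engine meets every other engine."""
    pairings = engines * (engines - 1) // 2
    return pairings * games_per_pairing


def score_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (points, low, high) for one side of a match.

    low/high bound the score fraction using a normal approximation
    (assumption: games are independent; z=1.96 is roughly 95%).
    """
    n = wins + draws + losses
    points = wins + 0.5 * draws
    mean = points / n
    # Per-game results are 1, 0.5 or 0, so the second moment is
    # (wins * 1 + draws * 0.25) / n.
    second_moment = (wins + 0.25 * draws) / n
    variance = max(second_moment - mean ** 2, 0.0)
    half_width = z * math.sqrt(variance / n)
    return points, mean - half_width, mean + half_width


if __name__ == "__main__":
    print(round_robin_games(5, 8))
    points, low, high = score_interval(3, 8, 2)
    print(points, round(low, 3), round(high, 3))
```

If the interval for the score fraction straddles 0.5, the result is compatible with two equally strong programs. Note that for a clean sweep such as 8-0 the observed variance is zero, so this approximation collapses to a zero-width interval and tells you nothing useful; an exact method would be needed there.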
Last modified: Thu, 15 Apr 21 08:11:13 -0700