Author: Peter Fendrich
Date: 16:28:01 12/19/02
On December 19, 2002 at 02:35:47, Bruce Moreland wrote:

>On December 19, 2002 at 00:58:30, Omid David Tabibi wrote:
>
>>Based on the presented data:
>>
>>Isn't it clear that vrfd R=3 is superior to std R=2 ?
>
>No, but it is likely.
>
>The Neishtadt suite is an odd choice since it contains a great many checkmate
>combinations. I don't accept that this is a primary component of chess program
>strength. I accept that VR=3 did better than R=2 on this test set, since the
>number of solutions found was greater in less time.
>
>There is a table that shows that ECM required less nodes to get to depth D, but
>there is no correct solution data given. I question this. You took pains to
>present this data in other cases, but it is absent here. Those numbers would
>have been very interesting.
>
>WCS is another strange suite, and everything said about the Neishtadt suite can
>be said here. There appear to be at least 150 mates in the suite. Everything
>said about the Neishtadt results can be said about these results.
>
>The mates from the CAP data are the same kind of thing.
>
>It is as if you've decided what VR=3 can do best, and you are matching it
>against what R=2 is not known to do well. For some reason, you found three
>suites loaded up with mates, and provided solution data. Solution data is not
>provided for ECM, a harder suite that contains fewer direct mates.
>
>The most compelling evidence is the autoplay match where VR=3 scored 68.5%.
>These games are not available online. I was going to check to see if the
>programs got into a rut and played the same game over and over again, but I
>can't do that.
>
>Assuming that they played 100 unique games, the question remains as to whether
>68.5% proves anything. You can say, of course it does, but the real answer has
>to do with statistics. There is no way that a "real" scientific journal would
>accept "of course it does" as an answer -- they'd want the math. You don't
>provide the math.
>
>What are the odds that this result was due to chance? The paper does not say,
>and unless I wish to speculate, I can draw no conclusion from this other than
>that it seems obvious that there is better than a 50% chance that VR=3 is better
>than R=2.
>
>Match result math is rarely if ever done in the computer chess field. Figuring
>out how to do this would be a *great* JICGA article, and it's amazing that
>nobody has felt the need to do this until now. Being able to make positive
>statements about match scores would be worth something, you'd think, but 40
>years into computer chess research nobody has published this.

I did, some 15-20 years ago, in a couple of articles in the Swedish "PLY" that later became the basis for the SSDF testing.

A year or so ago you posted a question about how to interpret results with very few games. In another thread I posted a new theory for this as an answer: "Match results - a complete(!) theory (long)". I also made a program for this, which can be found at Dann's ftp site.

/Peter
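
A rough sketch of the kind of calculation asked for above. It is not the theory from the "Match results" thread or the program at Dann's ftp site, only a textbook normal approximation. It assumes the 100 games Bruce assumes, independent games, and equal strength as the null hypothesis. The number of draws is not given, so the per-game variance is set to its maximum of 0.25 (the no-draw case), which can only overstate the chance probability. The function name chance_probability is made up for this illustration.

```python
import math


def chance_probability(score_fraction, games):
    """Return (z, p): how surprising a match score is between equal programs.

    Assumptions (not from the paper): independent games, equal strength as
    the null hypothesis, and a normal approximation. Draws are unknown, so
    the per-game variance is taken at its maximum of 0.25 (no draws); any
    draws would shrink the variance and make the real p smaller.
    """
    std_error = 0.5 / math.sqrt(games)        # sqrt(0.25 / games)
    z = (score_fraction - 0.5) / std_error    # distance from 50% in std errors
    p = 0.5 * math.erfc(z / math.sqrt(2))     # one-sided upper tail of N(0, 1)
    return z, p


# The autoplay match discussed above: 68.5% over an assumed 100 games.
z, p = chance_probability(0.685, 100)
print(f"z = {z:.2f}, one-sided p = {p:.6f}")
```

Under these assumptions the score sits 3.7 standard errors above 50%, which puts the chance explanation at roughly 1 in 10,000. That says nothing about whether the 100 games were unique, which is the other half of Bruce's objection.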