Author: Peter Kappler
Date: 11:37:08 07/02/99
On July 02, 1999 at 12:34:56, Paul Richards wrote:

>On July 02, 1999 at 02:43:37, Peter Kappler wrote:
>
>>>I have both F5.32 and Hiarcs 7.32; I like them both; I do not know which
>>>one will ultimately score higher on SSDF; but those engine v engine results
>>>prove nothing.
>>>
>>
>>Prove nothing?? Absurd.
>
>Not at all.
>

While perhaps not as optimal as a 2-computer match, I think it's a gross overstatement to suggest that a single-computer match is worthless.

>
>>>Nobody is alleging an anti-Fritz conspiracy. The results are dubious;
>>>that's why he is saying it.
>>>
>>>eric
>>
>>
>>You both are *saying* this, neither of you are *explaining* why.
>
>Probably because we both assumed you had also read recent threads like "Some
>thoughts on engine vs. engine testing". In a nutshell this sort of testing is
>unreliable because we know there are some problems with running both engines on
>one machine, and we can infer that there are an unknown number of similar
>possible issues. For example, the time allocation code for a program will be
>based on the assumption that the program thinks while the opponent's clock is
>running, which is not the case in engine-engine testing. Correctly allocating
>time for making moves is non-trivial, and it's unlikely that programmers design
>or test for engine-engine conditions. Hash tables are another issue, and some
>of the machines used for these engine-engine tests don't have enough RAM to
>allow optimal hash tables for both programs. We also don't have the source code
>for the Chessbase interface to know what other resource allocation issues there
>might be. In short the gold standard for such tests is clearly running the
>programs on two separate machines. When this is done Hiarcs and Fritz appear
>very close in strength. As a result it's not impressive when someone does an
>engine-engine test that shows one to be far superior to the other, and even less
>impressive when they keep doing it despite being told why single machine
>engine-engine tests cannot be taken seriously.

Thanks for the explanation. I agree that a 2-computer match is the optimal test environment. I don't agree that a single computer match is worthless - I think it is still an excellent indicator of the relative playing strength of the programs.

It would be interesting to play ~200 games with each setup and compare the results.

--Peter
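
For anyone who wants to try the ~200-game comparison suggested above, here is a rough sketch of how the two setups could be compared once the games are played. It converts a match score into an Elo difference using the standard logistic Elo formula and attaches an approximate 95% interval from the per-game variance. The function names (`elo_diff`, `match_summary`) are made up for illustration, and the win/draw/loss counts in the usage lines are invented placeholders, not real results from any match.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_summary(wins, draws, losses, z=1.96):
    """Return (score, elo, elo_low, elo_high) for one engine's match result.

    The interval uses a normal approximation on the score fraction,
    which is only a rough guide for a few hundred games.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points per game).
    var = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    se = math.sqrt(var / n)
    # Clamp so the logarithm in elo_diff stays defined.
    low = max(score - z * se, 1e-6)
    high = min(score + z * se, 1.0 - 1e-6)
    return score, elo_diff(score), elo_diff(low), elo_diff(high)


# Placeholder counts only (hypothetical, not measured):
one_machine = match_summary(wins=80, draws=60, losses=60)
two_machines = match_summary(wins=70, draws=60, losses=70)
print("one machine : score %.3f, Elo %+.0f (%+.0f to %+.0f)" % one_machine)
print("two machines: score %.3f, Elo %+.0f (%+.0f to %+.0f)" % two_machines)
```

If the intervals from the two setups overlap heavily, 200 games per setup would not be enough to say the single-machine result differs from the two-machine one.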
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.