Computer Chess Club Archives

Subject: Re: Hiarcs7.32 DISGRACED Fritz5.32 in 20 games under tournaments controls!

Author: Paul Richards

Date: 10:25:59 06/28/99

On June 27, 1999 at 23:45:33, Mark Young wrote:

>You are correct and is why I made such a statement. Any programmer can say my
>program needs to ponder, my program needs X amount of hash etc. This may be
>true to some extent, but the question is how much a change is this going to
>make, rating gain or loss.

Given the disparity between two-machine testing and what people excitedly
report here day after day from their one-machine engine-engine tests, it
seems to make quite a difference.  Perhaps you don't believe Bob's and Ed's
comments regarding pondering and time allocation.  According to them,
those two factors alone would be enough to significantly hurt performance.
There are
certainly other unknowns.  Hardware can be an issue even with separate
machines (see Eugene Nalimov's post regarding L2 cache), and who knows
what's going on with a single machine.  The point is that we have not
just one but an unknown number of variables potentially affecting the
validity of single-machine testing.  Therefore you can't run a series,
see similar results, and declare that one-machine testing is OK.  The fact
that similar results were achieved proves nothing that can be generalized.
One engine may be handicapped in one respect while the other has problems
in another, and by chance the results "appear" similar, assuming the
sample size is even large enough to be statistically significant.  One
test does not control for an unknown number of variables.
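
To put a rough number on the sample-size point, here is a small Python
sketch of how wide the uncertainty on a 20-game match is.  This is my own
illustration, not anything from Bob's or Ed's posts.  It assumes the usual
logistic Elo formula and a normal approximation for the error, and the
9 wins / 6 draws / 5 losses result is made up purely for the example.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    # Clamp so a 0% or 100% score does not divide by zero or take log(0).
    score = min(max(score, 1e-6), 1.0 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo estimates for one match.

    Uses a normal approximation on the mean score per game, which is
    itself only rough for a sample as small as 20 games.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score, with outcomes 1, 0.5 and 0.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_diff(mean - z * se),
            elo_diff(mean),
            elo_diff(mean + z * se))

# Hypothetical 20-game result: 9 wins, 6 draws, 5 losses (60% score).
low, mid, high = match_interval(9, 6, 5)
print(f"Elo estimate {mid:+.0f}, approx. 95% range {low:+.0f} to {high:+.0f}")
```

With a result like that the range runs from below zero to well above it,
so the match by itself cannot even say which engine is stronger, let alone
whether one-machine and two-machine results "agree".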

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.