Computer Chess Club Archives

Subject: Re: None of these tests are truly scientific!

Author: KarinsDad

Date: 15:46:52 01/26/99

On January 26, 1999 at 17:52:25, Bruce Moreland wrote:

>
>On January 26, 1999 at 16:25:17, KarinsDad wrote:
>
>>I'm glad that you are running other programs against the control. At what time
>>controls are you running the programs, on what type and speed of processor, and
>>what are your matching criteria?
>
>One minute per move, you choose the processor, and a match is scored if you'd
>play the move at the end of the minute.

I would prefer slower time controls. I think that the main indicator is nodes
per second times number of seconds, that is, average total nodes per move. I
realize that this is difficult to estimate for Bionic. However, you guys have
been doing this for a long time, and I think you could come up with an
"educated guess".
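
A rough sketch of the node-budget idea above: total nodes per move is nodes per second times seconds per move, so the time a different machine needs to reach the same budget follows by division. The function names and the speed figures are hypothetical and purely illustrative, not measurements of any program.

```python
def nodes_per_move(nodes_per_second: float, seconds_per_move: float) -> float:
    """Total nodes searched for one move at a given speed and time."""
    return nodes_per_second * seconds_per_move


def seconds_to_match(target_nodes: float, nodes_per_second: float) -> float:
    """Seconds per move needed on another machine to reach target_nodes."""
    if nodes_per_second <= 0:
        raise ValueError("nodes_per_second must be positive")
    return target_nodes / nodes_per_second


# Hypothetical figures: a reference machine at 200,000 nodes/s for 180 s,
# and a slower test machine at 60,000 nodes/s.
reference_nodes = nodes_per_move(200_000, 180)
test_seconds = seconds_to_match(reference_nodes, 60_000)
print(f"{reference_nodes:,.0f} nodes per move -> {test_seconds:.0f} s on the test machine")
```

This only equalizes raw node counts. As noted below, SMP and hash size change the search itself, so equal nodes is still an approximation.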

I understand your practicality issue. However, I'd rather take one of the games
that Robert checked and run it at as close to the same number of nodes per move
as I could (and yes, all of this is questionable due to the search changes from
running a program with SMP vs. no SMP, different hash sizes, etc.) than run all
of the games at very short durations.

Statistically, neither sample set (one game run at more exacting times, or
multiple games run at quick times) is large enough or accurate enough to be
considered scientific. Any results you get, no matter how you do it, have to be
taken with a grain of salt.
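
One way to put a number on "not large enough" is a confidence interval on the match rate. The sketch below uses the Wilson score interval, which is my choice of technique and not something anyone in this thread proposed. The counts are made up for illustration, and the interval treats each position as an independent trial, which moves within a single game are not.

```python
import math


def wilson_interval(matches: int, positions: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a match rate (z = 1.96)."""
    if positions <= 0 or not 0 <= matches <= positions:
        raise ValueError("need 0 <= matches <= positions and positions > 0")
    p = matches / positions
    denom = 1 + z * z / positions
    centre = (p + z * z / (2 * positions)) / denom
    half = z * math.sqrt(p * (1 - p) / positions + z * z / (4 * positions ** 2)) / denom
    return centre - half, centre + half


# Hypothetical counts: the same 70% match rate from one game's worth of
# positions and from ten times as many.
for matches, positions in [(42, 60), (420, 600)]:
    low, high = wilson_interval(matches, positions)
    print(f"{matches}/{positions}: {low:.2f} to {high:.2f}")
```

The interval narrows as the number of positions grows, which is the trade-off between a few slow games and many fast ones.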

My way would (rough guess) take 7 (?) games * 10 minutes per move per side (due
to a slower speed system) * 120 moves per game (for both sides combined; this is
an average, I did not look it up) * the number of programs tested (say 6), or
about 5 weeks of round-the-clock running if one person did it all. However, if
you gave a different game to each of 7 individuals (all of whom had all 6
programs to test with), it would take about 5 days. This would give you a better
(but still not perfect) set of data than 1 minute per move, IMO.
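
The estimate above can be checked with a few lines. This is only a sketch of the arithmetic in the post: the function names are mine, and it assumes the machines run around the clock, which is the reading under which the 5-week and 5-day figures come out.

```python
def total_minutes(games: int, minutes_per_move: int, moves_per_game: int, programs: int) -> int:
    """Total machine time in minutes to analyze every move with every program."""
    return games * minutes_per_move * moves_per_game * programs


def to_days(minutes: float, testers: int = 1) -> float:
    """Elapsed days if the work is split evenly and machines run 24 hours a day."""
    return minutes / (60 * 24) / testers


minutes = total_minutes(games=7, minutes_per_move=10, moves_per_game=120, programs=6)
print(minutes)                       # total minutes of machine time
print(to_days(minutes))              # days for one person
print(to_days(minutes, testers=7))   # days with one game per tester
```

With the figures from the post this gives 50,400 minutes, or 35 days for one person and 5 days when split across 7 testers.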

However, I am not doing the tests, so I'm not trying to tell you how to do it.
Just my opinion.

>
>I am flexible about the processor because I didn't want to split hairs over
>whether a P5/133 is X% slower than a P6/200 or whatever.  I figured that a few
>people might run this on Crafty using different hardware, and that might
>show us what effect this had on match rate.
>
>This is a little too multivariate to make a good controlled experiment, but
>people will have reservations, possibly the same people, no matter what attempts
>are made to control the experiment better.  I don't think it is possible to
>control it perfectly, so if you try to do so, people will point out the flaws
>anyway.

I agree. No matter what you do, people (like myself above :) ) will point out
the "flaws" (I prefer to think of them as alternatives).

Good luck with your tests!

KarinsDad

>
>bruce

Re: None of these tests are truly scientific! Don Dailey 11:31:51 01/27/99
- Re: None of these tests are truly scientific! KarinsDad 15:06:03 01/27/99

Last modified: Thu, 15 Apr 21 08:11:13 -0700
