Author: Dave Gomboc
Date: 01:00:00 09/28/99
On September 28, 1999 at 02:25:24, Ratko V Tomic wrote:

>>>If they picked 10 top programs (incl. e.g. Rebel 10, Hiarcs 7, not just 7.32
>>>from CB) and distributed time on the fastest machines equally, you still
>>>get the same overall info on the strength improvement on that hardware,
>>>just covering the wider spectrum of programs, but playing the same total
>>>number of games on the fast hardware. Nothing useful is gained by giving
>>>all the fast hardware to the 4 CB programs, in effect deciding before the
>>>cycle even started who will get the top 4 spots.
>>
>>My point is they won't publish entries unless 100 games have been played. So
>>maybe they could have played 60 games with 10 programs on 450s, and we wouldn't
>>know squat about how much improvement to expect until the next list, because no
>>450 results would have been published this time around.
>>
>
>The rules are either not right or they're being applied with a lack
>of common sense. Suppose they did publish results with each of the
>top 8-10 programs playing an equal number of games as now, except that all
>programs had equal average hardware. In that case each would have,
>taking your example, 60 games instead of 100 on K2-450, the rest on
>slower machines. The obvious drawback is that it makes the uncertainty of
>the K2-450 improvement for the 4 CB programs slightly larger (the uncertainty
>increases as 1/sqrt(N) as N drops, where N is the number of games). But
>in return it makes the certainty for other manufacturers' products
>significantly greater (compared to the much greater guesswork in extrapolating
>from the lower speeds). And more importantly, regarding the fairness of
>the tests, it doesn't skew the list by willfully handing the top 4 spots
>to one company before the competition even started. And finally, since the
>total number of games played on 450's remains the same, while using
>the larger sample of programs, it improves the estimate of the
>average (across all programs) improvement on the fast hardware.
>
>I can't see how anyone (but CB) could weigh the single "con"
>(in effect the absence of the preferred treatment of the CB programs)
>more heavily than all the "pros" of the equal average hardware tests.
>
>Although one might argue that some tests, warts and all, are still
>better than no tests, one can also say that the illusion of objectivity
>and the scientific aura it creates in the public mind about the relative
>strength of the programs may drive some competitors out of business,
>be it by making them appear worse in a scientific-sounding evaluation,
>or by denying them exposure if they refuse to play against the stacked
>deck (as some have done). Having the facts wrong may be worse than having
>no facts. And having fewer competing manufacturers is certainly worse.

While it may seem an arbitrary measure, they use 100 games as a cutoff
because they believe that results based on fewer games are simply unreliable
(a rough sketch of the arithmetic is below). As far as I know, they have
followed this convention for the entire time that the list has existed.
Certainly, it has been the rule since I first heard about the list, years
ago. I don't expect them to (indeed, I expect them not to!) change their
responsible reporting habits.

The onus is on businesses to also report SSDF findings accurately. A heavy
burden, I know, but some companies have done this well in the past, so it
isn't impossible. It bothers me that people threaten the SSDF with lawsuits
if they publish games or results of program vs. program testing.
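[Editor's note: to put rough numbers on the 1/sqrt(N) point quoted above, here
is a minimal Python sketch. It is not the SSDF's actual margin-of-error
calculation; the per-game score deviation of 0.4 and the 50% expected score
are assumptions, and the Elo conversion uses the standard logistic rating
curve.]

    import math

    def elo_error(n_games, score_sigma=0.4, expected_score=0.5):
        # Standard error of the mean score after n_games; this is the
        # 1/sqrt(N) scaling mentioned in the quoted post.
        se_score = score_sigma / math.sqrt(n_games)
        # The Elo difference implied by an expected score p is
        # D(p) = -400 * log10(1/p - 1); propagate the score error
        # through the slope dD/dp = 400 / (ln(10) * p * (1 - p)).
        p = expected_score
        slope = 400.0 / (math.log(10) * p * (1.0 - p))
        return slope * se_score

    for n in (60, 100):
        print(f"{n} games: about {elo_error(n):.0f} Elo points of standard error")

[With those assumed numbers, 60 games comes out to roughly a 36-point
standard error and 100 games to roughly 28 points, which is the sort of gap
a 100-game cutoff is presumably meant to keep in check.]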
Would it be good if the SSDF played tit-for-tat, and threatened lawsuits when
the results they publish are not reported properly? It may seem like the
current list is terribly pro-CB, but I think they simply took the four best
programs they had on 200MMXs, and started with them. Sure, Hiarcs 7 (DOS) and
Hiarcs 7.32 (CB/Win) are really close to each other, but I think there's an
obvious preference for Windows applications in the buying public. Perhaps
H7.32 was slightly higher than H7 on 200MMX machines anyway.

I remember beta-testing Rebel 8 for Schroder BV: Ed had a contest for 10
people on the internet, so I wrote in that hi, I was a student, I was into
computers, I liked chess a lot, and because of all that I was interested in
computer chess and would be interested in doing this. Somehow I was one of the
lucky people, thanks to Ed and his team, and perhaps that is one of the
reasons I am still around. I wrote a review (a preview, I suppose) of Rebel 8
at the time (which perhaps is still online at his site, or perhaps not). In
it, I said that I would be surprised if Rebel did not debut at the top of the
list. This turned out to be a big understatement: the first time it was on the
list, Rebel 8 had a crushing lead over every other program. It's just the way
it was.

I think that the SSDF is specifically interested in the fair testing of
different chess software packages. This is why they exist. It was an important
counterpoint to the incredible (incredibly bullshit!) claims made by hardware
manufacturers in the past. It is understandable that we do not always agree
with some of the decisions they make about how to conduct their hobby.
However, it is their organization, their work and effort, and I think that
they are best placed to decide what information they want to know. Accusing
them of selling out to corporate interests, without better than (IMO flimsy)
circumstantial evidence, is not going to convince me that it is true.

Dave