Computer Chess Club Archives


Subject: Re: Nimzo99 MMX - Hiarcs 6 P90 SSDF game 12/20 1-0 Now: 10 - 2

Author: Mark Young

Date: 22:29:47 05/29/99


On May 29, 1999 at 23:11:00, Dave Gomboc wrote:

>On May 29, 1999 at 22:11:57, Melvin S. Schwartz wrote:
>
>>
>>On May 29, 1999 at 15:59:32, Dave Gomboc wrote:
>>
>>>On May 29, 1999 at 14:05:29, Melvin S. Schwartz wrote:
>>>
>>>>
>>>>On May 29, 1999 at 11:16:22, Dave Gomboc wrote:
>>>>
>>>>>On May 29, 1999 at 10:09:02, Melvin S. Schwartz wrote:
>>>>>
>>>>>>
>>>>>>I don't understand how you can seriously give credence to this match when you
>>>>>>are running Nimzo on superior hardware. The advantage of Nimzo on a Pentium 200
>>>>>>MMX is not to be taken lightly. Regardless, Hiarcs 6 has been superseded by Hiarcs 7, and
>>>>>>the hash table size in Hiarcs 7 is much larger than what you listed for Hiarcs 6. My
>>>>>>main point is that when testing chess programs, you should test them on the SAME
>>>>>>type of computer.
>>>>>>
>>>>>>Regards,
>>>>>>Mel
>>>>>
>>>>>No, he shouldn't.  He should report the speed of the processor and the version
>>>>>of the software, just as he has.
>>>>
>>>>If you support this kind of testing, good luck on trying to get meaningful
>>>>evaluations. I think you're getting into more of a hypothetical circumstance
>>>>here with uneven testing.
>>>
>>>Are you suggesting that more meaningful evaluations are achieved by a closed
>>>group of 8 or so "newest version" programs on "latest" hardware?  How will you
>>>understand how good these are relative to older programs without
>>>intergenerational competition?  Besides, the larger player pool due to the
>>>increased number of hardware/software combinations will provide more reliable
>>>relative ratings for even the latest programs on the latest hardware.
>>>
>>>>>"Hiarcs 6, P90", "Hiarcs 7, P200MMX", and "Hiarcs 7, K2-450" are all different
>>>>>entities that can be expected to have significantly different ratings.  That a
>>>>>newer hardware/software combination exists does not make it invalid or even
>>>>>useless to assess the strength of an older one.
>>>>
>>>>I believe Nimzo 99 is a newer program than Hiarcs 6. If that is the case, it
>>>>would further support uneven testing. How many people would be interested in how
>>>>Hiarcs 6 does against..as opposed to Hiarcs 7 against...?. Furthermore, who is
>>>>still selling Hiarcs 6???
>>>
>>>It doesn't matter if anyone is still selling it or not.  Hiarcs 6/P90 still
>>>provides an important performance benchmark for comparison.
>>>
>>>Even if this were not true, it is still the case that not everybody upgrades
>>>their hardware and software every year.
>>>
>>>>I'm not saying there is absolutely no purpose in testing outdated software, but
>>>>rather time and testing could be put to better use.
>>>
>>>I think the use it is being put to is very good.  The person has a spare P90
>>>lying around, he might as well get some SSDF testing in.
>>>
>>>>Mel>
>>>
>>>Dave
>>
>>I will refer you to a posting by Robert Hyatt on 5/29 with the heading "Re:
>>Uneven hardware". I will quote word for word, you can look at the posting if you
>>don't believe what I quote:
>>
>>"If program A on hardware B beats program D on hardware E - does that say much
>>about A compared to B? This belies the principles of science - you have to have
>>a uniform platform for all participants to make any kind of judgement".
>>Now, does that make it any clearer, or do you think Mr. Hyatt is also wrong?
>>Hmmm?? By the way, my opinion posted on this matter was BEFORE I found Mr.
>>Hyatt's statement quoted above.
>>
>>Mel
>
>I am not claiming that "program A on hardware B beats program D on hardware E
>implies that program A is better than program D".  You are trying to reach such
>a conclusion, and that is why you're at odds with everyone about this.

You are correct again, Dave, and the SSDF's testing is not claiming this either. It
is just showing how strongly a new program performed against the SSDF's rating
pool, thus gaining a rating itself. In SSDF testing, program A plays programs B,
C, D, etc. on hardware A and B. So I disagree with how Melvin is applying and
understanding Dr. Hyatt's statement. An example for Melvin: suppose the SSDF
said that from now on it would test only Rebel 10 on PII 400 hardware, and
all other new programs on P200 hardware. In terms of the soundness of the
rating pool and of getting a rating for Rebel 10 on PII 400 hardware, this is
sound, though of course policy-wise it would piss off the other programmers.
But nowhere is the SSDF drawing the conclusion that Rebel 10 is also the best
program on equal hardware, or anything else that would make Dr. Hyatt's
statement apply to the SSDF's rating and testing method the way Melvin wants.
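
To make the rating-pool point concrete, here is a minimal sketch in Python. It
assumes the standard Elo expected-score formula; whether the SSDF computes its
list exactly this way is an assumption, not something stated in this thread.
The function names and any ratings you feed in are hypothetical. Each pool
entry is a program/hardware combination with its own rating, and a new entry
gets the rating at which its expected score against the opponents it actually
played equals the score it actually made.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of entry A against entry B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def performance_rating(opponent_ratings, score, lo=0.0, hi=4000.0):
    """Rating at which the expected total score equals the actual score.

    opponent_ratings: one rating per game played; each opponent is a
    program/hardware combination, e.g. "Hiarcs 6, P90" (hypothetical pool).
    score: points actually scored (win = 1, draw = 0.5, loss = 0).
    Solved by bisection, since the expected score rises with the rating.
    """
    for _ in range(100):
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, r) for r in opponent_ratings)
        if expected < score:
            lo = mid  # candidate rating too low to explain the score
        else:
            hi = mid  # candidate rating high enough, tighten from above
    return (lo + hi) / 2.0
```

Nothing in the calculation asks which hardware an opponent ran on. It only
needs that opponent's rating as a pool entry, which is why a match on uneven
hardware can still rate "Nimzo99, P200MMX" without saying anything about
Nimzo99 versus Hiarcs 6 on equal hardware.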



>
>Dave



Last modified: Thu, 15 Apr 21 08:11:13 -0700
