Computer Chess Club Archives



Subject: Re: Sorry Rolf - the winner is the winner.

Author: Uri Blass

Date: 10:52:52 09/13/02


On September 13, 2002 at 13:42:13, Rolf Tueschen wrote:

>On September 13, 2002 at 13:17:57, Uri Blass wrote:
>
>>On September 13, 2002 at 13:04:50, Rolf Tueschen wrote:
>>
>>>On September 13, 2002 at 12:20:36, Uri Blass wrote:
>>>
>>>>On September 13, 2002 at 11:25:44, David Dory wrote:
>>>>
>>>>>On September 13, 2002 at 09:20:26, Rolf Tueschen wrote:
>>>>>
>>>>><snip>
>>>>>>
>>>>>>Let's quickly compare human lists and computer rankings. The Elo method allows
>>>>>>calculating individual strength (performance) over the variable of age. In
>>>>>>CC, programs have no age at all, because almost every new version gets completely
>>>>>>new limbs and organs, so to speak. That means that you can't compare the old and
>>>>>>the new version. Or would you compare the embryo with M. vos Savant? We
>>>>>>remember the old saying "You can't compare apples with beans". Nevertheless CC
>>>>>>has had ranking lists for decades now, with the astonishing result that the newest
>>>>>>progs are on top and the oldest, on the weakest hardware, are at the bottom.
>>>>>>Big surprise!
>>>>>===================
>>>>>I agree with you 100% on this issue, Rolf: testing software on vastly unequal
>>>>>hardware is a total waste of time and an insult to the reader's intelligence,
>>>>>really.
>>>>
>>>>I disagree.
>>>>
>>>>It is not a waste of time to test programs on unequal hardware.
>>>>The better hardware does not always win, and you can learn from the results.
>>>>
>>>>Palm Tiger has a 50% score against Kallisto in spite of the fact that Kallisto
>>>>has a 486 and the Palm has significantly slower hardware.
>>>>
>>>>I think that it may be interesting to see other programs on slow hardware as
>>>>well, and not only Tiger 14.9, but the SSDF does not have unlimited time.
>>>>
>>>>I think that it is interesting to see how much rating programs gain from the new
>>>>hardware, and without testing programs on old hardware there is no way to know.
>>>>
>>>>You also need games against different opponents in order to generate a rating
>>>>list, so games with unequal hardware are needed.
>>>>
>>>>Uri
>>>
>>>
>>>This is not meant as aggressive, Uri, but excuse me, I must say that your final
>>>sentence disqualifies you as a tester. You cannot proceed this way. Testing and
>>>statistics are not a question of adding input here and there to get safe results.
>>>The bias alone from such intentionally implemented things invalidates your whole
>>>activity as a tester. This might be difficult to understand for laymen but it's
>>>still the truth.
>>
>>I do not understand what the problem is here.
>>
>>I think that the best thing to do is to have every pair of opponents play the
>>same number of games (unfortunately the SSDF cannot do that).
>>
>>The only things that can make the ratings misleading in that case are killer
>>books and learning to repeat wins, but hardware is not relevant to that problem.
>>
>>Uri
>
>I see that you have (?) little experience with statistics. The point is that you
>should define the whole design _in advance_. Only then do the results have real
>meaning. You simply can't take a few ancient progs when needed and at will and
>then "complete" your data. This is regarded as a gross miscarriage.
>
>The point is your argument that you need such matches to be able to calculate
>your results!
>
>Rolf Tueschen


I agree that it is better if everything is defined before the games are played,
and it is a disadvantage of the SSDF that there are no clear rules about which
games are going to be played, but I do not see how that is relevant to the
question of whether to play matches with unequal hardware.

The players in the SSDF are software+hardware and not only software.
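
To illustrate what treating software+hardware as the player means, here is a
minimal Python sketch. It assumes the standard logistic Elo formula; the function
names and the hardware labels are my own placeholders for illustration, not
anything the SSDF actually uses.

```python
import math


def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_diff_from_score(score_fraction):
    """Rating difference (A minus B) implied by A's score fraction in a match."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# A "player" is a (program, hardware) pair, not a program alone.
# The hardware labels below are illustrative placeholders.
player_a = ("Palm Tiger", "Palm")
player_b = ("Kallisto", "486")

# The 50% match score mentioned earlier in the thread.
diff = elo_diff_from_score(0.5)
print(player_a, "minus", player_b, "=", round(diff), "Elo")
```

With a 50% score the implied difference is 0 Elo, so the two software+hardware
combinations come out equal even though the hardware is unequal.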

Uri


