Computer Chess Club Archives


Subject: Re: CSS WM TEST - a technical view

Author: Rolf Tueschen

Date: 07:16:20 06/15/04


On June 15, 2004 at 10:07:26, Geert van der Wulp wrote:

>On June 15, 2004 at 09:39:20, Rolf Tueschen wrote:
>
>>On June 15, 2004 at 08:13:33, Geert van der Wulp wrote:
>>
>>>On June 15, 2004 at 07:58:42, Franz Hagra wrote:
>>>
>>>>>What is YOUR opinion about this? Should programs which solve 19 or 54 get the
>>>>>same rating? The test formula calculates for these programs' performances
>>>>>
>>>>>19 sol. --> 2.553
>>>>>55 sol. --> 2.649
>>>>>
>>>>>But Hagra attaches 2.600 to both. I wonder who else accepts this as serious :))
>>>>>Send in the clowns...
>>>>>
>>>>>Steve
>>>>
>>>>Hagra attaches the range 2550-2649 to both, using 2600 and sf=2 (two
>>>>significant figures) to simplify this, as is usual for measurement data in general.
>>>>
>>>>Hagra
>>>
>>>The fact that something is "common" to do does not mean that it is a good thing
>>>to do. Why do you believe that the ratings are accurate for the relative
>>>strengths of the programs up to 100 points? Why not 50, 25 or maybe 200?
>>>
>>>Geert
>>
>>
>>The answer is easy. Hagra is not following daydreams and wishes but a clear
>>mathematical necessity. From the formula above you cannot extract what you
>>seem to want. That is the easy answer to the question. Please ask
>>further questions if you don't understand.
>
>If you read my question, then maybe it was not clear that it was meant as a
>rhetorical question. My point is that obviously Hagra has NO clue what the
>uncertainty in the quoted rating numbers is. But this he already confessed in
>another post.

He's saying that the WMTest formula only allows statements about the
first two digits.
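
To make the two-digit point concrete, here is a minimal sketch of rounding to two significant figures. It assumes the quoted "2.553" and "2.649" mean 2553 and 2649 (a thousands separator), and round_sig is a hypothetical helper for illustration, not part of the WM test:

```python
def round_sig(value: int, sig: int = 2) -> int:
    """Round a positive integer to `sig` significant figures (halves round up)."""
    if value <= 0:
        raise ValueError("value must be positive")
    digits = len(str(value))
    # Size of one rounding step: 100 for a four-digit value with sig=2.
    step = 10 ** max(digits - sig, 0)
    return (value + step // 2) // step * step


# Both quoted performances fall in the same two-digit bucket.
for rating in (2553, 2649):
    print(rating, "->", round_sig(rating))
```

Under that assumption every value from 2550 to 2649 lands on 2600, which is the range Hagra gives.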


>
>Of course the advantage of having a bunch of chess programs play games against
>each other to determine their relative strength is that the uncertainty in the
>estimated relative playing strength will become smaller if more games are
>played.
>
>The advantage of having them analyse positions from human chess games is that we
>can see how good they are in analysing, a feature that is of most importance to
>most users.

Of course, but this was never in question here. These positions are wonderful! And
great fun to test. Our sole topic was whether these tests could have any meaning for
programmers, and the answer is no. Not because everyone here is so stubborn. It's
because the results don't help. Critics of the test say that the so-called Elo numbers
really do look good. But they are meaningless for tournament practice.
Believe it or not.
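
On Geert's point that more games shrink the uncertainty, a rough sketch may help. It assumes the standard logistic Elo relation between score and rating difference and a plain binomial standard error that ignores draws; elo_diff and elo_margin are illustrative names, not anything from the test or from a rating list:

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(score: float, games: int, z: float = 1.96) -> float:
    """Approximate half-width, in Elo, of a z-sigma interval around `score`."""
    # Binomial standard error of the score fraction (assumption: no draws).
    se = math.sqrt(score * (1.0 - score) / games)
    # Clamp so the logarithm stays defined for extreme scores.
    low = max(score - z * se, 1e-9)
    high = min(score + z * se, 1.0 - 1e-9)
    return (elo_diff(high) - elo_diff(low)) / 2.0


# The margin shrinks roughly with the square root of the number of games.
for n in (100, 400, 1600):
    print(n, "games: about +/-", round(elo_margin(0.5, n)), "Elo")
```

A fixed set of test positions has no such knob to turn, which is part of why the error of the quoted numbers stays unknown.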


>
>Geert



Last modified: Thu, 15 Apr 21 08:11:13 -0700
