Computer Chess Club Archives


Subject: Re: CSS WM TEST - truth is NOT hypnosis

Author: Dan Honeycutt

Date: 00:05:59 06/19/04


On June 19, 2004 at 02:38:14, Sandro Necchi wrote:

>On June 18, 2004 at 14:31:12, Steve Glanzfeld wrote:
>
>>On June 18, 2004 at 13:39:29, Rolf Tueschen wrote:
>>
>>>On June 18, 2004 at 12:59:43, Steve Glanzfeld wrote:
>>>
>>>>On June 18, 2004 at 09:47:55, Rolf Tueschen wrote:
>>>>
>>>>[...]
>>>>:-))) I can imagine how a blackout must have suddenly hit you. Has someone
>>>>turned the lights off while you were writing? Why in the world is it "HYPNOSIS"
>>>>??? when people believe the truth to be true?
>>>>
>>>>Steve
>>>
>>>
>>>As I told you - with your insults you can't expect to get answers. You showed
>>>very well that you have a reading difficulty because above I didn't write that
>>>_I_ believed that the ranking lists were "similar". That was a quote from
>>>Gurevich. Understood?
>>
>>But in fact he's right, they ARE similar!! Understood? Compare any rankings you
>>like...
>>
>>>As for all the other problems, I am certain that besides your reading
>>>difficulty you have even worse handicaps, because you don't seem able to
>>>grasp what is being discussed here. This test can't bring effective news;
>>>this is the main point.
>>
>>Why is this "the main point" suddenly?? You find new "main points" every day.
>>Don't you know that new engine versions are released every week? Testing them
>>DOES bring news, because there is no other estimation of their strength, yet. A
>>good test like the WM test can tell if it's a patzer or a potential top engine,
>>or what's different from the previous version of that engine...
>>
>>>_All_ the programmers I could read say more or less frankly that they can't work
>>>with _that_ test (100 positions). Because, surprise, to know a ranking place in
>>>that test or in other position tests, has no importance for their programming.
>>
>>Surprise: Computer chess test suites aren't intended only for use by chess
>>programmers. Actually they are intended mainly to be used by fans, chess
>>players, common program users, to be able to investigate the strength profile
>>(strengths and weaknesses) of chess programs, find estimated rankings when they
>>want to...
>
>Hi Steve,
>
>do not get me wrong; I am not against you at all.
>
>I will try to help people understand why the programmers are not interested in
>these test suites.
>
>>
>>Since some chess programmers have said that they aren't interested much in such
>>tests, this seems to be your main argument against it. But this argument is not
>>valid, because tests are made for thousands of users and fans (who do not use
>>tests to develop, but to TEST), and not as a developing tool for programmers.
>
>OK, you may have made the best test set, and I think chess fans will find it
>quite interesting to see whether their latest chess program performs well in
>this test set.
>
>This is a very nice tool and we all must thank you for it, but (unfortunately)
>it is not reliable for estimating a program's strength.
>
>We have seen quite often, nearly all the time, that modifying a chess engine to
>play better in those test sets is a drawback. I mean that most of the time one
>version of program X is stronger than another version of the same program that
>performs better in that test set.
>This means that in order to make a program stronger other things are more
>important.
>
>In reality this is explained if you consider the following:
>
>1. Finding the best move, which allows you to win in 30 moves instead of 60
>moves, does not bring you any Elo rating at all.
>2. Being able to play some !! moves along with many ? moves makes the program
>weaker: with 2 ? moves one will quite probably lose the game, while even with
>some !! moves one may not be able to win.
>
>This means that a chess program should be made stronger overall, rather than
>just able to solve some specific positions.
>
>So, summarizing: a program that performs better in the test set could be
>stronger, but not necessarily; most of the time it is not.
>
>This is why the chess programmers do not rely on these test sets.
>
>I am not saying that it is not possible to make a test set that can help to
>achieve what you are looking for, but it would probably have to be quite
>different, with a huge number of positions covering other issues as well.
>
>>
>>I'm sure it gives you BIG TROUBLE that the usually top-listed engines from
>>game-based rankings (Shredder, Fritz...) are also top in the WM test's results,
>>while engines which play weakly compared to these also rank badly
>>there :-))) It just works! Do you have sleepless nights now?
>>
>>Steve
>
>Sandro

Good try, Sandro, but I fear you're wasting your breath.  Shredder and Fritz do
better on these tests than does PatzerChess.  Q.E.D.

Dan H.





Last modified: Thu, 15 Apr 21 08:11:13 -0700
