Computer Chess Club Archives



Subject: Re: CSS WM TEST - truth is NOT hypnosis

Author: Sandro Necchi

Date: 00:14:55 06/19/04



On June 19, 2004 at 03:05:59, Dan Honeycutt wrote:

>On June 19, 2004 at 02:38:14, Sandro Necchi wrote:
>
>>On June 18, 2004 at 14:31:12, Steve Glanzfeld wrote:
>>
>>>On June 18, 2004 at 13:39:29, Rolf Tueschen wrote:
>>>
>>>>On June 18, 2004 at 12:59:43, Steve Glanzfeld wrote:
>>>>
>>>>>On June 18, 2004 at 09:47:55, Rolf Tueschen wrote:
>>>>>
>>>>>[...]
>>>>>:-))) I can imagine how a blackout must have suddenly hit you. Has someone
>>>>>turned the lights off while you were writing? Why in the world is it "HYPNOSIS"
>>>>>??? when people believe the truth to be true?
>>>>>
>>>>>Steve
>>>>
>>>>
>>>>As I told you - with your insults you can't expect to get answers. You showed
>>>>very well that you have a reading difficulty because above I didn't write that
>>>>_I_ believed that the ranking lists were "similar". That was a quote from
>>>>Gurevich. Understood?
>>>
>>>But in fact he's right, they ARE similar!! Understood? Compare any rankings you
>>>like...
>>>
>>>>To all the other problems I am certain that to your reading difficulty you have
>>>>even worse handicaps because you don't seem to be fit to get what is being
>>>>discussed here. This test can't bring effective news, this is the main point.
>>>
>>>Why is this "the main point" suddenly?? You find new "main points" every day.
>>>Don't you know that new engine versions are released every week? Testing them
>>>DOES bring news, because there is no other estimation of their strength, yet. A
>>>good test like the WM test can tell if it's a patzer or a potential top engine,
>>>or what's different from the previous version of that engine...
>>>
>>>>_All_ the programmers I could read say more or less frankly that they can't work
>>>>with _that_ test (100 positions). Because, surprise, to know a ranking place in
>>>>that test or in other position tests, has no importance for their programming.
>>>
>>>Surprise: Computerchess testsuites aren't intended only for the use by chess
>>>programmers. Actually they are intended mainly to be used by fans, chess
>>>players, common program users, to be able to investigate the strength profile
>>>(strengths and weaknesses) of chess programs, find estimated rankings when they
>>>want to...
>>
>>Hi Steve,
>>
>>do not get me wrong; I am not against you at all.
>>
>>I will try to help people understand why programmers are not interested in
>>these test suites.
>>
>>>
>>>Since some chess programmers have said that they aren't interested much in such
>>>tests, this seems to be your main argument against it. But this argument is not
>>>valid, because tests are made for thousands of users and fans (who do not use
>>>tests to develop, but to TEST), and not as a developing tool for programmers.
>>
>>OK, you may have made the best test set, and I think chess fans will find it
>>quite interesting to see whether their latest chess program performs well on
>>this test set.
>>
>>This is a very nice tool and we all must thank you for it, but unfortunately it
>>is not reliable for estimating a program's strength.
>>
>>We have seen quite often, nearly all the time, that modifying a chess engine to
>>play better on those test sets has a drawback. I mean that most of the time one
>>version of program X plays better than another version of the same program that
>>performs better on the test set.
>>This means that in order to make a program stronger other things are more
>>important.
>>
>>In reality this is explained if you consider the following:
>>
>>1. Finding the best move, which allows you to win in 30 moves instead of 60
>>moves, does not bring you any Elo rating at all.
>>2. Being able to play some !! moves and many ? moves makes the program
>>weaker, since with 2 ? moves it will quite probably lose the game, while with
>>some !! moves it may still not be able to win.
>>
>>This means that a chess program should be made stronger overall, rather than
>>able to solve some specific positions.
>>
>>So, summarizing: a program that performs better on the test set could be
>>stronger, but not necessarily; most of the time it is not.
>>
>>This is why the chess programmers do not rely on these test sets.
>>
>>I am not saying that it is not possible to make a test set that can help to
>>reach what you are looking for, but it would probably have to be quite different,
>>with a huge number of positions covering other issues as well.
>>
>>>
>>>I'm sure it gives you BIG TROUBLE that the usually top-listed engines from
>>>gamebased rankings (Shredder, Fritz...) are also top in the WM test's results,
>>>while engines which play weakly compared to these also rank badly
>>>there :-))) It just works! Do you have sleepless nights now?
>>>
>>>Steve
>>
>>Sandro
>
>Good try Sandro, but I fear you're wasting your breath.  Shredder and Fritz do
>better on these tests than does PatzerChess.  Q.E.D.
>
>Dan H.

Thanks.

I just want to help people who are willing to listen to understand.

Sandro





Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.