Computer Chess Club Archives


Subject: Re: CSS WM TEST - truth is NOT hypnosis

Author: Sandro Necchi

Date: 02:27:37 06/19/04


On June 19, 2004 at 04:40:23, Dan Honeycutt wrote:

>On June 19, 2004 at 03:14:55, Sandro Necchi wrote:
>
>>On June 19, 2004 at 03:05:59, Dan Honeycutt wrote:
>>
>>>On June 19, 2004 at 02:38:14, Sandro Necchi wrote:
>>>
>>>>On June 18, 2004 at 14:31:12, Steve Glanzfeld wrote:
>>>>
>>>>>On June 18, 2004 at 13:39:29, Rolf Tueschen wrote:
>>>>>
>>>>>>On June 18, 2004 at 12:59:43, Steve Glanzfeld wrote:
>>>>>>
>>>>>>>On June 18, 2004 at 09:47:55, Rolf Tueschen wrote:
>>>>>>>
>>>>>>>[...]
>>>>>>>:-))) I can imagine how a blackout must have suddenly hit you. Has someone
>>>>>>>turned the lights off while you were writing? Why in the world is it "HYPNOSIS"
>>>>>>>??? when people believe the truth to be true?
>>>>>>>
>>>>>>>Steve
>>>>>>
>>>>>>
>>>>>>As I told you - with your insults you can't expect to get answers. You showed
>>>>>>very well that you have a reading difficulty because above I didn't write that
>>>>>>_I_ believed that the ranking lists were "similar". That was a quote from
>>>>>>Gurevich. Understood?
>>>>>
>>>>>But in fact he's right, they ARE similar!! Understood? Compare any rankings you
>>>>>like...
>>>>>
>>>>>>Beyond your reading difficulty, I am certain that you have even worse
>>>>>>handicaps, because you don't seem to be able to grasp what is being
>>>>>>discussed here. This test can't bring any real news; this is the main point.
>>>>>
>>>>>Why is this "the main point" suddenly?? You find new "main points" every day.
>>>>>Don't you know that new engine versions are released every week? Testing them
>>>>>DOES bring news, because there is no other estimation of their strength, yet. A
>>>>>good test like the WM test can tell if it's a patzer or a potential top engine,
>>>>>or what's different from the previous version of that engine...
>>>>>
>>>>>>_All_ the programmers I could read say more or less frankly that they can't work
>>>>>>with _that_ test (100 positions). Because, surprise, to know a ranking place in
>>>>>>that test or in other position tests, has no importance for their programming.
>>>>>
>>>>>Surprise: Computerchess testsuites aren't intended only for the use by chess
>>>>>programmers. Actually they are intended mainly to be used by fans, chess
>>>>>players, common program users, to be able to investigate the strength profile
>>>>>(strengths and weaknesses) of chess programs, find estimated rankings when they
>>>>>want to...
>>>>
>>>>Hi Steve,
>>>>
>>>>do not get me wrong; I am not against you at all.
>>>>
>>>>I will try to help people understand why programmers are not interested in
>>>>these test suites.
>>>>
>>>>>
>>>>>Since some chess programmers have said that they aren't interested much in such
>>>>>tests, this seems to be your main argument against it. But this argument is not
>>>>>valid, because tests are made for thousands of users and fans (who do not use
>>>>>tests to develop, but to TEST), and not as a development tool for programmers.
>>>>
>>>>OK, you may have made the best test set, and I think chess fans will find it
>>>>quite interesting to see whether their latest chess program performs well on
>>>>this test set.
>>>>
>>>>This is a very nice tool and we must all thank you for it, but unfortunately it
>>>>is not reliable for estimating a program's strength.
>>>>
>>>>We have seen quite often, nearly all the time, that modifying a chess engine to
>>>>play better on these test sets has a drawback. I mean that most of the time one
>>>>version of program X is stronger than another version of the same program that
>>>>performs better on the test set.
>>>>This means that other things are more important for making a program stronger.
>>>>
>>>>This is explained if you consider the following:
>>>>
>>>>1. Finding the best move, which allows you to win in 30 moves instead of 60
>>>>moves, does not gain you any Elo rating at all.
>>>>2. Playing some !! moves and many ? moves makes the program weaker, because
>>>>with two ? moves one will quite probably lose the game, while with some !!
>>>>moves it still may not be able to win.
>>>>
>>>>This means that a chess program should be made stronger overall, rather than
>>>>merely able to solve some specific positions.
>>>>
>>>>So, to summarize: a program that performs better on the test set could be
>>>>stronger, but not necessarily; most of the time it is not.
>>>>
>>>>This is why chess programmers do not rely on these test sets.
>>>>
>>>>I am not saying that it is impossible to make a test set that can help to
>>>>achieve what you are looking for, but it would probably have to be quite
>>>>different, with a huge number of positions covering other issues as well.
>>>>
>>>>>
>>>>>I'm sure it gives you BIG TROUBLE that the usually top-listed engines from
>>>>>game-based rankings (Shredder, Fritz...) are also top in the WM test's results,
>>>>>while engines which play weakly compared to these also rank badly
>>>>>there :-))) It just works! Do you have sleepless nights now?
>>>>>
>>>>>Steve
>>>>
>>>>Sandro
>>>
>>>Good try, Sandro, but I fear you're wasting your breath.  Shredder and Fritz do
>>>better on these tests than does PatzerChess.  Q.E.D.
>>>
>>>Dan H.
>>
>>Thanks.
>>
>>I just want to help people who are willing to listen to understand.
>>
>>Sandro
>
>That's good.  But I'm afraid here the "willing to listen" is absent.  More like
>"my mind's made up, don't confuse me with the facts".
>
>Best.
>Dan H.

Dan, you are right. It has become a very rare attitude! Isn't it? :-)

Sandro




Last modified: Thu, 15 Apr 21 08:11:13 -0700