Computer Chess Club Archives


Subject: Re: CSS WM TEST - truth is NOT hypnosis

Author: Dan Honeycutt

Date: 01:40:23 06/19/04


On June 19, 2004 at 03:14:55, Sandro Necchi wrote:

>On June 19, 2004 at 03:05:59, Dan Honeycutt wrote:
>
>>On June 19, 2004 at 02:38:14, Sandro Necchi wrote:
>>
>>>On June 18, 2004 at 14:31:12, Steve Glanzfeld wrote:
>>>
>>>>On June 18, 2004 at 13:39:29, Rolf Tueschen wrote:
>>>>
>>>>>On June 18, 2004 at 12:59:43, Steve Glanzfeld wrote:
>>>>>
>>>>>>On June 18, 2004 at 09:47:55, Rolf Tueschen wrote:
>>>>>>
>>>>>>[...]
>>>>>>:-))) I can imagine how a blackout must have suddenly hit you. Has someone
>>>>>>turned the lights off while you were writing? Why in the world is it "HYPNOSIS"
>>>>>>??? when people believe the truth to be true?
>>>>>>
>>>>>>Steve
>>>>>
>>>>>
>>>>>As I told you - with your insults you can't expect to get answers. You showed
>>>>>very well that you have a reading difficulty because above I didn't write that
>>>>>_I_ believed that the ranking lists were "similar". That was a quote from
>>>>>Gurevich. Understood?
>>>>
>>>>But in fact he's right, they ARE similar!! Understood? Compare any rankings you
>>>>like...
>>>>
>>>>>As for the other problems, I am certain that besides your reading difficulty
>>>>>you have even worse handicaps, because you don't seem able to grasp what is
>>>>>being discussed here. This test can't bring effective news, this is the main point.
>>>>
>>>>Why is this "the main point" suddenly?? You find new "main points" every day.
>>>>Don't you know that new engine versions are released every week? Testing them
>>>>DOES bring news, because there is no other estimation of their strength, yet. A
>>>>good test like the WM test can tell if it's a patzer or a potential top engine,
>>>>or what's different from the previous version of that engine...
>>>>
>>>>>_All_ the programmers I could read say more or less frankly that they can't work
>>>>>with _that_ test (100 positions). Because, surprise, to know a ranking place in
>>>>>that test or in other position tests, has no importance for their programming.
>>>>
>>>>Surprise: Computer chess test suites aren't intended only for use by chess
>>>>programmers. Actually they are intended mainly to be used by fans, chess
>>>>players, common program users, to be able to investigate the strength profile
>>>>(strengths and weaknesses) of chess programs, find estimated rankings when they
>>>>want to...
>>>
>>>Hi Steve,
>>>
>>>do not get me wrong; I am not against you at all.
>>>
>>>I will try to help people understand why programmers are not interested in
>>>these test suites.
>>>
>>>>
>>>>Since some chess programmers have said that they aren't interested much in such
>>>>tests, this seems to be your main argument against it. But this argument is not
>>>>valid, because tests are made for thousands of users and fans (who do not use
>>>>tests to develop, but to TEST), and not as a development tool for programmers.
>>>
>>>OK, you may have made the best test set, and I think chess fans will find it
>>>quite interesting to see whether their latest chess program performs well in
>>>this test set.
>>>
>>>This is a very nice tool and we all must thank you for it, but unfortunately
>>>it is not a reliable way to estimate a program's strength.
>>>
>>>We have seen quite often, nearly all the time, that modifying a chess engine to
>>>play better in those test sets brings a drawback. I mean that most of the time
>>>a version of program X is stronger than another version of the same program
>>>that performs better in that test set.
>>>This means that in order to make a program stronger, other things are more
>>>important.
>>>
>>>In reality this is explained if you consider the following:
>>>
>>>1. Finding the best move, which allows you to win in 30 moves instead of 60
>>>moves, does not bring you any Elo rating at all.
>>>2. Being able to play some !! moves along with many ? moves makes the program
>>>weaker: with 2 ? moves it will quite probably lose the game, while with some
>>>!! moves it may still not be able to win.
>>>
>>>This means that a chess program should be made stronger overall, rather than
>>>just able to solve some specific positions.
>>>
>>>So, summarizing: a program that performs better in the test set could be
>>>stronger, but not necessarily; most of the time it is not.
>>>
>>>This is why the chess programmers do not rely on these test sets.
>>>
>>>I am not saying that it is not possible to make a test set that can help to
>>>achieve what you are looking for, but it would probably have to be quite
>>>different, with a huge number of positions covering other issues as well.
>>>
>>>>
>>>>I'm sure it gives you BIG TROUBLE that the usually top-listed engines from
>>>>game-based rankings (Shredder, Fritz...) are also top in the WM test's results,
>>>>while engines that play weakly compared to these also rank badly
>>>>there :-))) It just works! Do you have sleepless nights now?
>>>>
>>>>Steve
>>>
>>>Sandro
>>
>>Good try Sandro, but I fear you're wasting your breath.  Shredder and Fritz do
>>better on these tests than does PatzerChess.  Q.E.D.
>>
>>Dan H.
>
>Thanks.
>
>I just want to help people who are willing to listen to understand.
>
>Sandro

That's good.  But I'm afraid here the "willing to listen" is absent.  More like
"my mind's made up, don't confuse me with the facts".

Best.
Dan H.



Last modified: Thu, 15 Apr 21 08:11:13 -0700