Computer Chess Club Archives


Subject: Re: CSS WM TEST - a technical view

Author: Sune Fischer

Date: 15:56:41 06/16/04


On June 16, 2004 at 16:49:28, Steve Glanzfeld wrote:

>On June 15, 2004 at 17:28:38, Vincent Diepeveen wrote:
>
>>On June 15, 2004 at 16:26:09, Steve Glanzfeld wrote:
>>
>>>No normal program will choose an unusual move (i.e. a queen sac) "out of the
>>>blue" in a normal position. Except, the program is completely broken.
>>>
>>>You guys are arguing as if it would be DOWNRIGHT BAD when a chess program finds
>>>good moves (quickly)... I wonder what a chess program looks like, when it is
>>>based on that philosophy :)) Does it try to avoid the good moves? So, if there's
>>>a lack of success, the chances are good that we have found a major reason here
>>>:)
>
>>"I created a version that was tactical brilliant. It solved *everything* in the
>>testsuites. Then i started playing with it and it was hundreds of points weaker
>>in games." Stefan Meyer Kahlen a few months ago.
>
>No engine can solve everything in every testsuite. There are not only tactical
>tests, for example (big surprise eh? :)))
>
>>
>>So the answer to your question is: The version that scores hundreds of points
>>more onto testsuites is NOT the version to play with at tournaments, because in
>>testsuites all those patzermoves work as we know and they do not in tournaments.
>
>Again, don't you understand that those moves HAVE WORKED in games? :) These are
>World Champion's winning moves! What are you talking about "do not work in
>tournaments"...???
>
>Which program, in several versions, do you think ranks #2, #5 and #7 in the WM
>test results? Shredder! :)) Note, that the version ranking #2 has the same
>number of solutions as the leader. Ranks #1/3/6/8/9/10 are Fritz versions. Next
>best are CM versions, Hiarcs 9, and Deep Juniors. At the bottom of the list we
>find oldies and weaker freeware.
>
>So, we find the same engines in the top of that test's ranking list (from a
>total of 230 results in the currently available download), which we do as well
>find in many ranking lists based on games.
>
>I wonder why some people here have so much trouble understanding or accepting
>this. Strange.
>
>Steve

I think it has been explained to you already, but I'll give it another try.

The problem is that the implication
"higher test scores" => "stronger engine" is often false.

There are several reasons for that, I think, some of them already mentioned.

One of the biggest problems is that test positions are not really representative
of a real game.
It seems impossible to weight the different types of positions properly. Say,
for example, you have 10 king sac, 10 endgame and 10 midgame positions with
subtle moves.

Now you take two engines and they get 4, 8, 2 and 6, 5, 3 solutions
respectively, so both solve 14 of the 30.

The right kind of king sac can of course decide the game, but these positions
may occur rarely in games, so it might not be hugely important for practical
rating.
Of course the engine must also be able to get that kind of position on the
board in the first place.

The subtle midgame moves occur extremely frequently of course, but a few moves
worth 0.1 of a pawn each won't be enough to win a game.

Being excellent in the endgame won't help much if you can never survive the
midgame. And so on.

So even though the two engines may score the same, it says absolutely nothing
about which will be the better player.

Of course you can try to create a set of positions you think will be
representative and make a guess at some proper weighting.
But that's all it's going to be, basically guessing out of the blue.
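
To make the weighting point concrete, here is a minimal Python sketch using the 4, 8, 2 and 6, 5, 3 counts from the example above. The category names and both sets of weights are made up purely for illustration; nobody knows the "right" weights, which is exactly the problem.

```python
# Solutions out of 10 per category, taken from the example in the post.
CATEGORIES = ("king_sac", "endgame", "midgame")
engine_a = dict(zip(CATEGORIES, (4, 8, 2)))
engine_b = dict(zip(CATEGORIES, (6, 5, 3)))


def raw_score(results):
    """Plain test-suite score: every solved position counts the same."""
    return sum(results.values())


def weighted_score(results, weights):
    """Score after weighting each category by its guessed importance."""
    return sum(weights[c] * results[c] for c in CATEGORIES)


def preferred(weights):
    """Return which engine looks better under a given set of weights."""
    a = weighted_score(engine_a, weights)
    b = weighted_score(engine_b, weights)
    if a == b:
        return "tie"
    return "A" if a > b else "B"


# Hypothetical weightings (assumptions, not measured from real games).
tactics_heavy = {"king_sac": 3, "endgame": 1, "midgame": 1}
endgame_heavy = {"king_sac": 1, "endgame": 3, "midgame": 1}

if __name__ == "__main__":
    print("raw:", raw_score(engine_a), raw_score(engine_b))
    print("tactics-heavy weights prefer engine", preferred(tactics_heavy))
    print("endgame-heavy weights prefer engine", preferred(endgame_heavy))
```

Both engines have the same raw total, yet the two guessed weightings rank them in opposite order, so the ranking reflects the guess rather than playing strength.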

Even if one engine scores higher on all suites, there is still a chance it is
worse if, e.g., it is much too "trigger happy" and generally overestimates its
chances in, say, passed pawn endgames.
A good example of that is wac2, I think: high passed pawn values will usually
help the engine find the right sac, although the same "knowledge" can backfire
in other positions.

-S.


