Computer Chess Club Archives



Subject: Re: WM Test Position 55 is WRONG !!!!!!!!!!!!!! (now with DIAGRAMS)

Author: Eduard Nemeth

Date: 08:22:10 06/13/04



Hello all!

This position of the WM-Test is WRONG:

Anand - Shirov, Advanced Chess 2000

WM-Test 55:

[D]5b2/p2k1p2/P3pP1p/n2pP1p1/1p1P2P1/1P1KBN2/7P/8 w - - 0 1


Here Nxg5! wins.

BUT the real position from the game is different; see for yourself (first the game):

[Event "Leon Man+Comp"]
[Site "Leon"]
[Date "2000.06.04"]
[Round "2.1"]
[White "Anand, Viswanathan"]
[Black "Shirov, Alexei"]
[Result "1-0"]
[ECO "C11"]
[WhiteElo "2769"]
[BlackElo "2751"]
[PlyCount "77"]
[EventDate "2000.06.02"]
[Source "ChessBase"]
[SourceDate "2000.10.18"]

1. e4 e6 2. d4 d5 3. Nc3 Nf6 4. e5 Nfd7 5. Nce2 c5 6. c3 Nc6 7. f4 b5 8. a3
cxd4 9. Nxd4 Nxd4 10. cxd4 b4 11. a4 Qa5 12. Bd2 Be7 13. Nf3 O-O 14. Bb5 Nb6
15. b3 Ba6 16. Bxa6 Qxa6 17. a5 Nd7 18. Qe2 Nb8 19. Kf2 Qxe2+ 20. Kxe2 Nc6 21.
Rhc1 Rfc8 22. Ra2 Rc7 23. Rac2 Rac8 24. a6 Kf8 25. g4 Ke8 26. f5 Kd7 27. Bf4 g5
28. Be3 h6 29. f6 Bf8 30. Kd3 Na5 31. Rxc7+ Rxc7 32. Rxc7+ Kxc7 33. Nxg5 hxg5
34. Bxg5 Nxb3 35. h4 Na1 36. Bc1 Nb3 37. Be3 Na5 38. g5 Nc4 39. Bc1 1-0

Anand,V - Shirov,A, Leon Man+Comp 2000


[D]5b2/p1k2p2/P3pP1p/n2pP1p1/1p1P2P1/1P1KBN2/7P/8 w - - 0 1

Here, in the real game position, Bxg5 and Nxg5 now win --> BOTH MOVES.

The name of the test is also a JOKE, since position 55 is not from a real game!!

Eduard
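
For readers who want to check the difference between the two diagrams themselves, here is a minimal sketch in plain Python (no chess library). The helper names `expand_fen_board` and `diff_positions` are made up for this illustration; only the two FEN strings come from the post above.

```python
def expand_fen_board(fen):
    """Map square names (e.g. 'd7') to piece letters for one FEN string."""
    placement = fen.split()[0]          # first FEN field: piece placement
    board = {}
    for rank_index, rank in enumerate(placement.split("/")):
        rank_number = 8 - rank_index    # FEN lists rank 8 first
        file_index = 0
        for ch in rank:
            if ch.isdigit():
                file_index += int(ch)   # a digit means that many empty squares
            else:
                board["abcdefgh"[file_index] + str(rank_number)] = ch
                file_index += 1
    return board


def diff_positions(fen_a, fen_b):
    """Return {square: (piece_in_a, piece_in_b)} for squares that differ."""
    a, b = expand_fen_board(fen_a), expand_fen_board(fen_b)
    squares = sorted(set(a) | set(b))
    return {sq: (a.get(sq), b.get(sq)) for sq in squares if a.get(sq) != b.get(sq)}


WM_TEST_55 = "5b2/p2k1p2/P3pP1p/n2pP1p1/1p1P2P1/1P1KBN2/7P/8 w - - 0 1"
GAME_POSITION = "5b2/p1k2p2/P3pP1p/n2pP1p1/1p1P2P1/1P1KBN2/7P/8 w - - 0 1"

print(diff_positions(WM_TEST_55, GAME_POSITION))
# {'c7': (None, 'k'), 'd7': ('k', None)}
```

The only difference is the black king: d7 in the test position, c7 in the game position after 32...Kxc7.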


On June 12, 2004 at 11:32:03, Robert Hyatt wrote:

>On June 11, 2004 at 13:15:25, Ed Schröder wrote:
>
>>On June 11, 2004 at 07:29:20, Rolf Tueschen wrote:
>>
>>>On June 11, 2004 at 02:14:32, Ed Schröder wrote:
>>>
>>>>On June 09, 2004 at 10:13:30, Franz Hagra wrote:
>>>>
>>>>>Re3 is not winning at all - it's a draw (only in the original game does Black win).
>>>>>
>>>>>Re3 is the key to drawing the position, but it is not essential to play it as the
>>>>>first move - so the test position is clear in a logical, human sense, but not
>>>>>under test conditions, because the test only works correctly when Re3 is
>>>>>found as the first move!
>>>>>
>>>>>Rad8 also leads to a drawn position, like Re3 - so the TEST POSITION is not
>>>>>correct at all.
>>>>
>>>>[d]r3r1k1/1pq2pp1/2p2n2/1PNn4/2QN2b1/6P1/3RPP2/2R3KB b - -
>>>>
>>>>1..Re3 is a sound positional attacking move and according to my own brainchild
>>>>there is a difference of 0.25 in score between 1..Re3 and 1..Rad8. The position
>>>>IMO is a fine one to test the strategic insight of a chess program.
>>>>
>>>>My best,
>>>>
>>>>Ed
>>>
>>>Ed,
>>>
>>>you did NOT comment on the main finding Hagra has published here in
>>>http://www.talkchess.com/forums/1/message.html?369557!
>>>
>>>I translate a second time into English:
>>>
>>>a) machine FRITZ 8 on AMD 1400 gets a solution time of 1 sec and that means
>>>highest points for position no. 1 (which you gave thankfully above)
>>>
>>>b) machine FRITZ 8 on AMD 2800 gets a solution time of 480 sec!! So that it gets
>>>way worse points in position no. 1!!
>>>
>>>Here is my verbal explanation (all found by Hagra):
>>>
>>>a stronger [!] machine on better hardware (do you accept that or do you claim
>>>that AMD 1400 is STRONGER than AMD 2800?) is able to make a deeper [!!]
>>>calculation and therefore finds the variation with first Rad8 - NOT as a final
>>>solution, Ed! But as a variation, before it THEN comes back to Re3. Now, the
>>>point is that such behaviour is by no means a sign of weaker strength but of
>>>_better_ strength. But alas, Ed, the so-called "WM-Test" of Dr. Mikhail Gurevich
>>>gives the weaker machine more points than the stronger machine.
>>
>>This is a common problem with positions whose nature is positional. Take the start
>>position for example, the moves 1.e4, 1.d4, 1.c4 and 1.Nf3 will produce scores
>>that are very close to each other, in this case 4 moves with almost identical
>>scores. What you see is that chess engines tend to switch from 1.e4 to 1.d4
>>frequently and that the speed of the PC actually introduces a random element as
>>in this case with Fritz.
>>
>>The problem is unavoidable, the only good way to deal with random effects is to
>>increase the number of positional positions to be tested, at least to 100.
>>
>>The problem of randomness does not exist in clear tactical positions, there is
>>only one good move, once the combination is found the key move will stay and the
>>chess engine will never switch. But then what you are actually testing is SEARCH
>>and not positional knowledge.
>
>
>This shows that such tests are basically flawed.  The test should state "The
>time to solution is the time where the engine chooses the right move, and then
>sticks with it from that point forward, searching at least 30 minutes more..."
>
>That stops this kind of nonsensical "faster = worse" problem.  Because as is,
>the test simply is meaningless when changing nothing but the hardware results in
>a poorer result...
>
>
>
>
>>
>>
>>
>>>Question from Hagra and also myself: is this a reasonable test design if a
>>>stronger machine gets fewer points just because it looks deeper into the
>>>position? As you know the time for all machines per position is 20 minutes. And
>>>Gurevich defines the "stable holding of the once chosen move" [my verbal
>>>interpretation] as the best way to test the *analytical ability* of a machine.
>>>Do you now understand the contradiction in the test design of Dr. Mikhail
>>>Gurevich, dear Ed? Higher abilities get a worse result! Is that sound? Hopefully
>>>NOT.
>>
>>There is only one good way to test a chess engine: play games, and a lot of them.
>>Test suites are fun but only a surrogate; they are popular because they are a quick
>>way to estimate the strength of an engine.
>>
>>
>>>Hope this clarifies the problem we faced with the German "WM-Test" in CSS.
>>
>>There are better ways, in the case of the WM-test I suggest the following
>>change:
>>
>>1) Keep the current tactical positions, thinking time much lower, say 5-10
>>minutes.
>>
>>2) Take 100-200 positional positions to deal a bit with randomness, thinking
>>time much lower, say 1-2 minutes, and a different rating formula that skips the
>>time element: the only criterion is whether the move is found or not.
>>
>>My best,
>>
>>Ed
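
The stricter definition of "time to solution" that Robert Hyatt proposes in the quoted text (the time at which the engine picks the right move and then sticks with it, searching at least 30 minutes more) could be sketched roughly as follows. This is only an illustration: the function names, the log format (a chronological list of `(seconds, best_move)` pairs) and the sample log are assumptions made up here, not output from Fritz or from any real test run.

```python
def first_hit_time(best_move_log, key_move):
    """Naive rule: the first time the key move shows up as best move."""
    for seconds, move in best_move_log:
        if move == key_move:
            return seconds
    return None


def stable_solution_time(best_move_log, key_move, search_end, min_hold=1800.0):
    """Stricter rule: the time the engine last switched to the key move,
    counted only if it then held the move for at least `min_hold` seconds
    (30 minutes by default) until the search ended at `search_end`."""
    solved_at = None
    for seconds, move in best_move_log:
        if move == key_move:
            if solved_at is None:
                solved_at = seconds   # start of the current run on the key move
        else:
            solved_at = None          # engine switched away, reset
    if solved_at is None or search_end - solved_at < min_hold:
        return None
    return solved_at


# Hypothetical log for a fast machine: finds Re3, wanders to Rad8, comes back.
fast_log = [(1.0, "Re3"), (5.0, "Rad8"), (480.0, "Re3")]

print(first_hit_time(fast_log, "Re3"))                # 1.0
print(stable_solution_time(fast_log, "Re3", 2400.0))  # 480.0
print(stable_solution_time(fast_log, "Re3", 1000.0))  # None (held < 30 min)
```

Under the stricter rule, an early hit that the engine later abandons is not counted as a solution, so any other machine would have to be measured by the same "found and held" standard.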




Last modified: Thu, 15 Apr 21 08:11:13 -0700
