Computer Chess Club Archives

Subject: Re: Ups, text this time.

Author: Uri Blass
Date: 01:49:29 11/29/00
On November 29, 2000 at 02:53:28, Howard Exner wrote:

>On November 29, 2000 at 01:12:46, Ed Schröder wrote:
>
>>On November 28, 2000 at 20:15:16, Bruce Moreland wrote:
>>
>>>On November 28, 2000 at 17:16:25, Ed Schröder wrote:
>>>
>>>>On November 28, 2000 at 14:38:33, Robert Hyatt wrote:
>>>
>>>>>I personally don't feel very "safe" if my program is doing something good for
>>>>>the completely wrong reason(s) it found...  yes, I like to see it do the right
>>>>>thing, period.  But those "wrong reason" cases cause me to remember that for
>>>>>every right move, wrong reason, there will also be wrong move, wrong reason
>>>>>cases as well.
>>>>
>>>>Rebel from the start position will frequently switch from 1.d4 to
>>>>1.e4
>>>>
>>>>Does it play 1.e4 or 1.d4 for the wrong reason?
>>>
>>>There's no correct answer so this isn't the same thing.  A better case might be
>>>LCT I position 23:
>>>
>>>[D]8/5Bp1/4P3/6pP/1b1k1P2/5K2/8/8 w - - 0 1
>>>
>>>The key is Kg4 but fxg5 gets a similar score from my program, and it's random
>>>which one it will choose in any iteration.  It's seeing some of what is going
>>>on, but the problem is a little bit too hard, and it's hit or miss whether a
>>>given version will find this, find it and switch away, switch back and forth
>>>several times, or fail to find it.
>>>
>>>I would be dishonest if I said my program "solves" this under any conditions,
>>>although if I were reporting scores for LCT 1 I would have no problem with
>>>reporting a "success" for this one as long as the rules allowed for that.
>>>
>>>Some test suites try to get you to look at the PV and see that you are finding
>>>the move for the right reasons, but this is tedious.  It's easier to just do
>>>time until find-and-hold.
>>>
>>>I don't tune for test suites.  I test against ECM and LCT 1 every day, so I know
>>>that I'm not losing tactical zip, so I know that I'm not doing something
>>>drastically weird, and so I can see the long-term effects of my changes upon
>>>node rate and search depth.
>>>
>>>bruce
>>
>>Same here, switching between 1.fxg5 and 1.Kg4 and I tend to agree on
>>what you have said. In the end (19 plies) the score looks convincing
>>enough to keep 1.Kg4 on the next iteration but you never can be sure.
>>
>>Ed
>>
>>================
>>
>>Engine version   : Rebel Century 2.01
>>Hash table size  :  40 Mb
>>
>>8/5Bp1/4P3/6pP/1b1k1P2/5K2/8/8 w - -
>>
>>00:00  03.00  1.38  1.fxg5 Bd6 2.Bg6
>>00:01  04.00  1.43  1.fxg5 Bf8 2.Kf4 Bc5
>>00:01  07.00  1.88  1.fxg5 Ke5 2.h6 gxh6 3.gxh6 Kd6 4.Ke4 Bc3
>>00:02  09.00  2.00  1.fxg5 Ke5 2.h6 gxh6 3.gxh6 Kd6 4.Ke4 Bc3 5.Kf5 Bd4
>>00:05  11.00  2.00  1.fxg5 Ke5 2.h6 gxh6 3.gxh6 Kd6 4.Kg4 Bc3 5.Kg5 Be5 6.Kf5 Bd4 7.Kg6
>>00:07  12.00  2.24  1.fxg5 Ke5 2.Kg4 Be7 3.h6 gxh6 4.gxh6 Bf6 5.Kh5 Kd6 6.Kg6 Bd4 7.Kh7
>>00:10  13.00  2.23  1.fxg5 Ke5 2.h6 gxh6 3.gxh6 Kd6 4.Kg4 Bc5 5.Kh5 Bf2 6.Kg6 Bd4 7.Kh7 Be3
>>00:18  14.00  2.18  1.fxg5 Ke5 2.h6 gxh6 3.gxh6 Kf6 4.Ke4 Bf8 5.h7 Kg7 6.Bg6 Be7 7.Kd5+ Kh8 8.Bd3 Bf6
>>00:30  14.04  2.18  1.Kg4
>>00:33  14.04  2.35  1.Kg4 gxf4 2.Kxf4 Bf8 3.Kf5 Ke3 4.Kg6 Kf4 5.Be8 Kg4 6.Bb5 Kg3 7.Be2 Kh4 8.Kf7 Bb4
>>01:07  15.00  1.98  1.Kg4 Be7 2.fxg5 Ke5 3.h6 gxh6 4.gxh6 Kf6
>>01:15  15.01  2.20  1.fxg5 Ke5 2.h6 gxh6 3.gxh6 Kf6 4.Ke4 Ba3 5.h7+ Kg7 6.Bg6 Be7 7.Kd5 Kh8 8.Be4
>>01:53  16.00  2.20  1.fxg5 Ke5 2.h6 gxh6 3.gxh6 Kf6 4.Ke4 Ba3 5.h7 Kg7+ 6.Bg6 Be7 7.Ke3
>>03:25  17.00  2.30  1.fxg5 Ke5 2.h6 gxh6 3.gxh6 Kd6 4.Kg4 Bd2 5.Kh5 Ke7 6.Kg6
>>08:10  18.00  2.50  1.fxg5 Ke5 2.h6 gxh6 3.gxh6
>>12:27  18.01  2.50  1.Kg4
>>17:11  18.01  2.82  1.Kg4 Be7 2.Kf5 g4
>
>This is the correct continuation, as Be7 is the best defense for black and then
>white must play Kf5 in order to win. If programs latch onto this sequence their
>eval should continue to increase.
>
>>45:50  19.00  2.61  1.Kg4 Be7 2.Kf5 gxf4

But Rebel's evaluation did not increase; it dropped from 2.82 to 2.61.
The evaluation of fxg5 went up, so I am not sure that Rebel will keep the move.
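
A rough sketch of the "time until find-and-hold" measure Bruce mentions, applied to Ed's log above. This is Python, and the names (LOG, find_and_hold_time, count_switches, last_score_change) are only illustrative; none of the programs discussed here provide them. It looks only at the first move of each reported line.

```python
from typing import List, NamedTuple, Optional


class Iteration(NamedTuple):
    seconds: int   # elapsed time when the line was reported
    depth: str     # depth label as printed by the engine
    score: float   # evaluation in pawns
    move: str      # first move of the reported line


def clock_to_seconds(clock: str) -> int:
    """Convert an 'mm:ss' stamp to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


# (time, depth, score, first move), copied from the Rebel Century 2.01 log.
RAW = [
    ("00:00", "03.00", 1.38, "fxg5"), ("00:01", "04.00", 1.43, "fxg5"),
    ("00:01", "07.00", 1.88, "fxg5"), ("00:02", "09.00", 2.00, "fxg5"),
    ("00:05", "11.00", 2.00, "fxg5"), ("00:07", "12.00", 2.24, "fxg5"),
    ("00:10", "13.00", 2.23, "fxg5"), ("00:18", "14.00", 2.18, "fxg5"),
    ("00:30", "14.04", 2.18, "Kg4"), ("00:33", "14.04", 2.35, "Kg4"),
    ("01:07", "15.00", 1.98, "Kg4"), ("01:15", "15.01", 2.20, "fxg5"),
    ("01:53", "16.00", 2.20, "fxg5"), ("03:25", "17.00", 2.30, "fxg5"),
    ("08:10", "18.00", 2.50, "fxg5"), ("12:27", "18.01", 2.50, "Kg4"),
    ("17:11", "18.01", 2.82, "Kg4"), ("45:50", "19.00", 2.61, "Kg4"),
]
LOG: List[Iteration] = [
    Iteration(clock_to_seconds(t), d, s, m) for t, d, s, m in RAW
]


def find_and_hold_time(log: List[Iteration], key_move: str) -> Optional[int]:
    """Time at which key_move became best and stayed best to the end of the log.

    Returns None if the last reported line does not start with key_move.
    """
    hold_start: Optional[int] = None
    for it in log:
        if it.move == key_move:
            if hold_start is None:
                hold_start = it.seconds
        else:
            hold_start = None  # the program switched away; start over
    return hold_start


def count_switches(log: List[Iteration]) -> int:
    """Number of times the reported best move changed."""
    return sum(1 for a, b in zip(log, log[1:]) if a.move != b.move)


def last_score_change(log: List[Iteration]) -> float:
    """Score difference between the last two reported lines."""
    return round(log[-1].score - log[-2].score, 2)
```

On this log Kg4 is held from 12:27 onward, after three changes of best move, and the last score change is negative. The hold is only relative to the point where the log stops, so it says nothing about whether a deeper search would switch back.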


The main questions to ask, in order to form an opinion on whether the program
solved the problem for the right reason from a practical point of view, are:

1) Can the program win the game against every defence when playing at the same
time control that it used to solve the position? (We cannot be 100% sure about
this without trying all the possible defences, but we can have an opinion.)

2) Can the program avoid changing its mind to the wrong move later? (Again, we
cannot know this when the program does not see a forced mate, but we can have an
opinion.)

3) Is there a bug in the evaluation that helped the program find the right
move? (We cannot be sure about this, because there may be a bug that we have not
noticed, but we can have an opinion.)

If the answers to 1 and 2 are positive and the answer to 3 is negative, then
from a practical point of view the program solved the problem for the right
reason.

In the case of the first Nolot position, I believe that Gandalf solved it for the
right reason. (It did not find Kh2 in the PV, but if you play against it at the
same time control it is going to find it.)

I cannot be sure that it will not change its mind, but my impression is that it
will not.

Uri