Computer Chess Club Archives


Subject: Re: What approach do you use to handle castling/en passant for repetition?

Author: Michael White

Date: 09:20:07 03/03/00


On March 03, 2000 at 09:13:16, Michael White wrote:

>On March 01, 2000 at 18:26:44, Robert Hyatt wrote:
>
>>On March 01, 2000 at 14:31:40, blass uri wrote:
>>
>>>On March 01, 2000 at 12:31:52, Peter Kappler wrote:
>>>
>>>>On March 01, 2000 at 05:19:38, Tom Kerrigan wrote:
>>>>
>>>>>On February 29, 2000 at 17:56:25, Robert Hyatt wrote:
>>>>>
>>>>>>>I asked you to back up your argument and you gave me some random numbers. Now
>>>>>>>you are asking me if I need confirmation that divide-by-zero is bad. It's no
>>>>>>>secret that you are being extremely insulting to me.
>>>>>>Tom...  look at _your_ response.  A curt "run some tests and prove I am wrong."
>>>>>>My position is "you run some test and prove you are right."  I already gave one
>>>>>
>>>>>Scientific papers do not look like this:
>>>>>"Electrons are green. If you don't agree, prove me wrong."
>>>>>
>>>>>You told me that ignoring the EP square is a big mistake. But your argument is
>>>>>like the "scientific paper" above. (See the text I quoted.)
>>>>>
>>>>
>>>>I must disagree here.  He told you about a specific incident where this caused a
>>>>problem in Crafty.  Why do you keep insisting that he offer more proof?
>>>>
>>>>If you feel so strongly about this, conduct your own experiments to prove Bob
>>>>wrong.
>>>>
>>>>--Peter
>>>
>>>The question is not if ignoring the EP square can cause problems but what is the
>>>probability that it practically cause problems in games
>>>
>>>There is no doubt that it can cause problems but if it cause problems only in 1
>>>out of 10000 games then saying that it is a big mistake to ignore EP square is
>>>wrong and using the same time to fix other problems in programs is more
>>>important for rating.
>>>
>>>If it can cause problems in 1 out of 20 games then saying that it is a big
>>>mistake is right.
>>>
>>>The point is not that Tom claims that Bob is wrong but the fact that Bob did
>>>not give data about the practical question.
>>>
>>>It is possible to get a good estimate for the relevant probability by dividing
>>>the number of games when ignoring the EP square caused problems by the number of
>>>total relevant games
>>>
>>>Uri
>>
>>
>>The data would be difficult to obtain.  Who would want to play games with a
>>known bug, just to see how many games it screwed up?  Who would want to go
>>thru each of those games, move by move, to see where the EP problem actually
>>influenced a score vs when it didn't?
>>
>>Lot of work, zero return.  For bugs this simple, just fix them and go on.
>>
>>It takes a couple of lines of code.


(Here goes again.  Something about this message board/Netscape
 locks up my system when I try to post a long message.  The
 cursor gets slower and slower and the disk caches...
 must be censorship of Slow Typists Who Type Long Posts.)

Indeed, testing whether recognition of EP and castle status will
affect game outcome is a big job, when you can just "do the
right thing" and have no worries.

I wrote the requisite "couple lines of code" to estimate how
often the status differs between played moves and their
alternatives in the opening and early middle game.  Briefly,
I took the CAP publicly available database (cap.epd, eco_pri.epd,
and popular.epd) and added approx 500000 opening positions that
I have evaluated to depth 10 or more, sorted them alphabetically,
and got 1281466 unique positions in the first four EPD fields.

Next, I applied to the set the "couple lines of code":

# Input must be sorted; prints adjacent rows whose first three EPD fields match.
BEGIN{pos="";ac="";cs="";ep="";row=""}
(NF>=4) { if (pos==$1 && ac==$2 && cs==$3) {print row; print}
         pos=$1;ac=$2;cs=$3;ep=$4;row=$0}

The code locates all of the positions that are duplicated
except for a change in their castling status.  There were
230 lines (115 pairs).  Next, I changed the "&& cs==$3" in the
above to "&& ep==$4" and obtained a report of all positions
that were duplicated except for a change in their en passant
status.  There were 10732 lines (5366 pairs).
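
A rough Python sketch of the same kind of comparison, for anyone who
wants to repeat it without awk.  It assumes one EPD record per line
with the first four fields being placement, side to move, castling and
e.p. square.  Unlike the awk lines, which compare each row only with
the row before it in sorted order, it compares every pair of records
that share a placement and side to move, so its counts can differ.
The function name and the sample records are made up for illustration.

```python
from collections import defaultdict

def differing_pairs(epd_lines):
    """Count pairs of records with the same placement and side to move
    that differ only in castling rights, or only in the e.p. square."""
    groups = defaultdict(set)
    for line in epd_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed records
        pos, side, castling, ep = fields[:4]
        groups[(pos, side)].add((castling, ep))

    castle_only = ep_only = 0
    for variants in groups.values():
        ordered = sorted(variants)
        for i, (cs1, ep1) in enumerate(ordered):
            for cs2, ep2 in ordered[i + 1:]:
                if cs1 != cs2 and ep1 == ep2:
                    castle_only += 1
                elif cs1 == cs2 and ep1 != ep2:
                    ep_only += 1
    return castle_only, ep_only

# Hypothetical sample: one position recorded three ways.
sample = [
    "rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq - bm Nf3;",
    "rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq e6 bm Nf3;",
    "rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w - - bm Nf3;",
]
print(differing_pairs(sample))  # (1, 1)
```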

Unfortunately, different programs have different conventions for
reporting e.p. status, so these numbers represent an approximate
upper bound.  A programmer may choose to report e.p. status
for A) every two-hopper, B) a two-hopper with an enemy pawn at the
ready, or C) a two-hopper with an enemy pawn that can truly execute a
legal e.p. capture.  I think Bruce Moreland may use criterion B
in his search routine.  For EPD reporting, the criterion needs to
be standardized!  The PGN standard does not distinguish between
these cases, but it should.
The CAP data had
   CCR.18 with (-) and (e6) e.p. status
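
To make conventions A and B concrete, here is a small Python sketch.
It assumes the position is given as a FEN piece-placement field taken
right after the two-square pawn push, plus the square the pawn landed
on.  The helper names are hypothetical.  Convention C is left out
because it needs a full legal-move check (pins, checks), which is more
than a few lines.

```python
FILES = "abcdefgh"

def parse_placement(placement):
    """Turn a FEN piece-placement field into a {square: piece} dict."""
    board = {}
    for rank_index, row in enumerate(placement.split("/")):
        rank = 8 - rank_index  # FEN lists rank 8 first
        file_index = 0
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)  # run of empty squares
            else:
                board[FILES[file_index] + str(rank)] = ch
                file_index += 1
    return board

def ep_field(placement, dest, criterion):
    """E.p. field to report after a two-square pawn push landing on dest."""
    board = parse_placement(placement)
    pawn = board[dest]
    file, rank = dest[0], int(dest[1])
    # The target is the square the pawn skipped over.
    target = file + str(rank - 1 if pawn == "P" else rank + 1)
    if criterion == "A":
        return target  # every two-hopper
    if criterion == "B":
        enemy = "p" if pawn == "P" else "P"
        f = FILES.index(file)
        neighbours = [FILES[i] + str(rank) for i in (f - 1, f + 1) if 0 <= i < 8]
        # Report only if an enemy pawn stands beside the pushed pawn.
        return target if any(board.get(sq) == enemy for sq in neighbours) else "-"
    raise NotImplementedError("criterion C needs a full legality check")

after_e4 = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR"
print(ep_field(after_e4, "e4", "A"))  # e3
print(ep_field(after_e4, "e4", "B"))  # -
```

The same position gets two different e.p. fields depending on the
convention, which is how duplicates of the kind counted above can
arise when data from several programs is merged.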

Also, there is some ambiguity about reporting castling status for
test positions, because the information is sometimes not supplied.
Well, if it's a game, it is standardized, but if not, then what?
The CAP data had
   ECM.754 (kq) and ECM.754 (-) same position with ambiguous castle status.
   ECM.376 (-) and ECM.1314 (q) "" ""
   LK.95 (-) and TWGCG.03 (kq) "" ""
   LK.245 (-) and TWGCG.45 (kq) "" ""
   WCSAC.0400 (q) and KL.312 (-) "" ""
   ECM.760 (kq) and LK.247 (-) "" ""
   BK23 (kq) had a matching position with a castle ambiguity
   ECM.837 (k) and ECM.837 (-)
   ECM.834 with same position, different status
   Crafty2600 - Crafty-16_11 game with castle ambiguities
   ARCHANGL.001,003, etc. another game with castle ambiguities
and so on for a total of about 25 pairs or so.

Next I manually scanned the pairs for differences in score.  This is
not possible to program, because the EPD records were evaluated to
different depths, on different machines, with different hash table
sizes, and by different programs and program versions, etc.

Most of what I saw had differences no greater than 5 ce.  I saw no
pair with a difference larger than 100 ce including a change in sign.
I found 2 with a change of 50 ce or more, and one of them included
a sign change.  I can't remember if it was due to castle or ep
ambiguity.  If anyone wants the data, I can send it (1.7 MB).

My conclusion is: if you want to risk approx 1 bad eval per million nodes
then drop the ep hash and save your clock ticks.  In my program, I will
continue to hash e.p. according to criterion B above.
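
For readers wondering what "hashing e.p." amounts to, here is a minimal
Python sketch.  It assumes a Zobrist-style hash (the post does not say
which scheme is used), and every name in it is hypothetical.  The e.p.
key is mixed in only when the e.p. field is set, so whichever reporting
criterion decides when that field is set also decides when two
otherwise identical positions get different hash keys.

```python
import random

rng = random.Random(2000)  # fixed seed so the keys are reproducible
PIECE_KEYS = {(p, sq): rng.getrandbits(64)
              for p in "PNBRQKpnbrqk" for sq in range(64)}
SIDE_KEY = rng.getrandbits(64)
CASTLE_KEYS = {c: rng.getrandbits(64) for c in "KQkq"}
EP_FILE_KEYS = {f: rng.getrandbits(64) for f in "abcdefgh"}

def zobrist(pieces, white_to_move, castling, ep):
    """pieces: {square_index: piece_letter}; castling like 'KQkq' or '-';
    ep like 'e3' or '-' (already filtered by the chosen criterion)."""
    h = 0
    for sq, piece in pieces.items():
        h ^= PIECE_KEYS[(piece, sq)]
    if not white_to_move:
        h ^= SIDE_KEY
    for c in castling:
        if c in CASTLE_KEYS:  # ignores the '-' placeholder
            h ^= CASTLE_KEYS[c]
    if ep != "-":
        h ^= EP_FILE_KEYS[ep[0]]  # one key per e.p. file
    return h

# Kings on e1/e8, white pawn e4, black pawn d4; Black to move.
pieces = {4: "K", 60: "k", 28: "P", 27: "p"}
with_ep = zobrist(pieces, False, "-", "e3")
without_ep = zobrist(pieces, False, "-", "-")
print(with_ep != without_ep)  # True
```

Dropping the e.p. term makes the two keys equal, which is the shortcut
being weighed in this thread.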

Mike White




Last modified: Thu, 15 Apr 21 08:11:13 -0700
