Author: Michael White
Date: 09:20:07 03/03/00
On March 03, 2000 at 09:13:16, Michael White wrote:

>On March 01, 2000 at 18:26:44, Robert Hyatt wrote:
>
>>On March 01, 2000 at 14:31:40, blass uri wrote:
>>
>>>On March 01, 2000 at 12:31:52, Peter Kappler wrote:
>>>
>>>>On March 01, 2000 at 05:19:38, Tom Kerrigan wrote:
>>>>
>>>>>On February 29, 2000 at 17:56:25, Robert Hyatt wrote:
>>>>>
>>>>>>>I asked you to back up your argument and you gave me some random numbers. Now
>>>>>>>you are asking me if I need confirmation that divide-by-zero is bad. It's no
>>>>>>>secret that you are being extremely insulting to me.
>>>>>>Tom... look at _your_ response. A curt "run some tests and prove I am wrong."
>>>>>>My position is "you run some test and prove you are right." I already gave one
>>>>>
>>>>>Scientific papers do not look like this:
>>>>>"Electrons are green. If you don't agree, prove me wrong."
>>>>>
>>>>>You told me that ignoring the EP square is a big mistake. But your argument is
>>>>>like the "scientific paper" above. (See the text I quoted.)
>>>>>
>>>>
>>>>I must disagree here. He told you about a specific incident where this caused a
>>>>problem in Crafty. Why do you keep insisting that he offer more proof?
>>>>
>>>>If you feel so strongly about this, conduct your own experiments to prove Bob
>>>>wrong.
>>>>
>>>>--Peter
>>>
>>>The question is not if ignoring the EP square can cause problems but what is the
>>>probability that it practically cause problems in games
>>>
>>>There is no doubt that it can cause problems but if it cause problems only in 1
>>>out of 10000 games then saying that it is a big mistake to ignore EP square is
>>>wrong and using the same time to fix other problems in programs is more
>>>important for rating.
>>>
>>>If it can cause problems in 1 out of 20 games then saying that it is a big
>>>mistake is right.
>>>
>>>The point is not that Tom claims that Bob is wrong but the fact that Bob did
>>>not give data about the practical question.
>>>
>>>It is possible to get a good estimate for the relevant probability by dividing
>>>the number of games when ignoring the EP square caused problems by the number of
>>>total relevant games
>>>
>>>Uri
>>
>>
>>The data would be difficult to obtain. Who would want to play games with a
>>known bug, just to see how many games it screwed up? Who would want to go
>>thru each of those games, move by move, to see where the EP problem actually
>>influenced a score vs when it didn't?
>>
>>Lot of work, zero return. For bugs this simple, just fix them and go on.
>>
>>It takes a couple of lines of code.

(Here goes again. Something about this message board/Netscape locks up my system when I try to post a long message. The cursor gets slower and slower and the disk caches... must be censorship of Slow Typists Who Type Long Posts.)

Indeed, testing whether recognition of EP and castle status will affect game outcome is a big job, when you just need to "do the right thing" and have no worries. I wrote the requisite "couple lines of code" to estimate how often the status differs between played moves and their alternatives in the opening and early middle game.

Briefly, I took the publicly available CAP database (cap.epd, eco_pri.epd, and popular.epd), added approx 500000 opening positions that I have evaluated to depth 10 or more, sorted them alphabetically, and got 1281466 positions that are unique in the first four EPD fields. Next, I applied to the set the "couple lines of code":

    BEGIN { pos=""; ac=""; cs=""; ep=""; row="" }
    (NF >= 4) {
        if (pos == $1 && ac == $2 && cs == $3) { print row; print }
        pos = $1; ac = $2; cs = $3; ep = $4; row = $0
    }

The code locates all of the positions that are duplicated except for a change in their castling status. There were 230 lines (115 pairs). Next, I changed the "&& cs==$3" in the above to "&& ep==$4" and obtained a report of all positions that were duplicated except for a change in their en passant status. There were 10732 lines (5366 pairs).
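For anyone who would rather not depend on the input being sorted, here is a rough Python restatement of the same idea. It is only a sketch, not the script that produced the counts above: the function name pairs_differing_only_in and the field labels are made up for illustration. It groups lines by three of the first four EPD fields and keeps the groups in which the fourth field varies. Since the set is unique on the first four fields, lines that agree on placement, side to move and castling necessarily differ in the e.p. field, so the sketch asks the caller to name the field that is allowed to differ.

```python
from collections import defaultdict

# Labels for the first four EPD fields (names are illustrative only).
EPD_FIELDS = ("placement", "side", "castling", "ep")


def pairs_differing_only_in(lines, field):
    """Return groups of EPD lines that agree on the first four fields
    except for `field`, and that actually disagree on `field`."""
    vary = EPD_FIELDS.index(field)
    groups = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # same guard as the awk filter: need four fields
        # Key on the three fields that must match.
        key = tuple(p for i, p in enumerate(parts[:4]) if i != vary)
        groups[key].append(line)
    result = []
    for members in groups.values():
        values = {m.split()[vary] for m in members}
        if len(values) > 1:  # the chosen field really does vary
            result.append(members)
    return result
```

Calling pairs_differing_only_in(lines, "ep") reports positions duplicated except for e.p. status, and pairs_differing_only_in(lines, "castling") does the same for castle status. Unlike the awk version, a group may hold more than two lines if three or more records share a position.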
Unfortunately, different programs have different conventions for reporting e.p. status, so these numbers represent an approximate upper bound. A programmer may choose to report e.p. status for

A) every two-hopper (a pawn that has just advanced two squares),
B) a two-hopper with an enemy pawn at the ready, or
C) a two-hopper with an enemy pawn that can truly execute a legal e.p. capture.

I think Bruce Moreland may use criterion B in his search routine. For EPD reporting, the criteria need to be standardized! The PGN standard does not distinguish between these cases, but it should. The CAP data had CCR.18 with both (-) and (e6) e.p. status.

Also, there is some ambiguity about reporting castling status for test positions, because the information is sometimes not supplied. Well, if it's a game, it is standardized, but if not, then what? The CAP data had

ECM.754 (kq) and ECM.754 (-)       same position with ambiguous castle status
ECM.376 (-) and ECM.1314 (q)       "" ""
LK.95 (-) and TWGCG.03 (kq)        "" ""
LK.245 (-) and TWGCG.45 (kq)       "" ""
WCSAC.0400 (q) and KL.312 (-)      "" ""
ECM.760 (kq) and LK.247 (-)        "" ""
BK23 (kq) had a matching position with a castle ambiguity
ECM.837 (k) and ECM.837 (-)
ECM.834 with same position, different status
Crafty2600 - Crafty-16_11 game with castle ambiguities
ARCHANGL.001,003, etc. another game with castle ambiguities

and so on, for a total of about 25 pairs or so.

Next I manually scanned the pairs for differences in score. This cannot be automated, because the EPD records were evaluated to different depths, on different machines, with different hash table sizes, different programs, program versions, etc. Most of what I saw had differences no greater than 5 ce. I saw no pair with a difference larger than 100 ce including a change in sign. I found 2 with a change of 50 ce or more, and one of them included a sign change. I can't remember if it was due to castle or e.p. ambiguity. If anyone wants the data, I can send it (1.7 MB).
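To make the difference between reporting conventions A and B concrete, here is a small Python sketch, assuming standard FEN/EPD piece placement. The helper names parse_placement and ep_field_criterion_b are hypothetical and come from no particular program. Given a record written under convention A, it keeps the e.p. square only when an enemy pawn stands next to the pawn that just advanced two squares. Criterion C would also need a legality check (pins, king left in check), which this sketch does not attempt.

```python
FILES = "abcdefgh"


def parse_placement(placement):
    """Expand the piece-placement field into a dict {square: piece letter}."""
    board = {}
    for row_index, row in enumerate(placement.split("/")):
        rank = 8 - row_index  # the field lists rank 8 first
        file_index = 0
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)  # run of empty squares
            else:
                board[FILES[file_index] + str(rank)] = ch
                file_index += 1
    return board


def ep_field_criterion_b(placement, side, ep):
    """Return `ep` if a pawn of the side to move stands beside the
    double-pushed pawn (criterion B), otherwise '-'."""
    if ep == "-":
        return "-"
    board = parse_placement(placement)
    file_index = FILES.index(ep[0])
    if side == "w":
        # Black just pushed; its pawn is on rank 5, capturer is a white pawn.
        pawn_rank, capturer = "5", "P"
    else:
        # White just pushed; its pawn is on rank 4, capturer is a black pawn.
        pawn_rank, capturer = "4", "p"
    for delta in (-1, 1):
        f = file_index + delta
        if 0 <= f < 8 and board.get(FILES[f] + pawn_rank) == capturer:
            return ep
    return "-"
```

Normalizing the fourth field this way before comparing records would presumably remove the pairs that differ only because one program follows A and another follows B.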
My conclusion is: if you want to risk approx 1 bad eval per million nodes, then drop the e.p. hash and save your clock ticks. In my program, I will continue to hash e.p. according to criterion B above.

Mike White
Last modified: Thu, 15 Apr 21 08:11:13 -0700