Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Pawn Hash Collisions in Crafty

Author: Robert Hyatt

Date: 11:11:25 12/05/01

Go up one level in this thread


On December 05, 2001 at 07:49:39, David Rasmussen wrote:

>On December 05, 2001 at 06:42:52, Sune Fischer wrote:
>
>>On December 05, 2001 at 06:02:45, David Rasmussen wrote:
>>
>>>>Whether or not 32 bit is sufficient depends on which "resolution" you require.
>>>>If you need to distinguish between millions of different positions, then 32 bit
>>>>will create too many collisions, probably somewhere near the numbers you've been
>>>>talking. But if you "only" have a few thousand different positions (which is
>>>>typical if you hash pawns _only_) then on average you should have about 1%
>>>>chance of a collision in an entire game.
>>>>
>>>>Try and count how many different positions you get, this is really the key
>>>>factor.
>>>>
>>>
>>>First of all, we're both talking about hashing pawns only. And that is what I
>>>have been talking about all along. Now:
>>>
>>>I disagree. They key factor is how many actual collisions returning wrong scores
>>>(and other wrong data from the pawn hash table) you have, regardless of how many
>>>positions etc.
>>
>>But I showed you the math, the odds can be calculated and that is what I base my
>>opinion on.
>
>You showed me _some_ math. Whether it describes the situation or not remains to
>be shown.


Some points to ponder.

Assuming you search 1M nodes per second, you really only have to do a
full evaluation on 10% of them, typically.  The rest can be eliminated
by 'lazy eval' testing since if you are two queens down, positional scores
likely won't help.

This means that for 3 minutes of search, you will fully evaluate maybe 20M
positions.  Now, for those 20M positions, how many will contain _different_
pawn positions that you have never seen before?

I believe that number is very small, simply because of the testing I have
done in the past.



>
>>I think either you have way more than a few thousand positions (why won't you
>>count them?), or you have overlooked something else.
>>
>
>I will count them if you want me to (when I get around to it :), but I don't
>really see the point. Even if there is a lot of positions or even if I have
>overlooked something, the fact still remains that a bogus value is returned from
>the pawn evaluation when such a collision occur. Do you disagree with that? I'm
>not only talking about Chezzz here, but also about Crafty. I haven't heard of
>anybody else in here making the same experiment with their programs, and
>reporting the results. Maybe the results are similar for all programs that use
>32-bit keys. That is what I suspect.
>
>>>I know that there aren't that many pawn positions during the
>>>cause of a game, but I really don't care. The important thing is to measure how
>>>often the hashtable adds incorrectness that _would_ not have been there, had it
>>>not been for the pawn hashtable, i.e. a defect or, if you will, a bug of the
>>>pawn hashtable.
>>
>>This process is way more complicated, I suspect you get "collisions" due some
>>other parts of the code.
>
>What do you mean? The issue is quite simple here. Crafty is probing it's tables
>to see if there is a hit. If there is, good. We can save some time, by not
>calculating the scores and properties for the pawn structure; we just return the
>values from the pawn hashtable. But if we (in a debug version), before returning
>these values from the pawn hashtable, actually calculate them anyway as we would
>have if there weren't a hit, we can check if the calculated values matches that
>hashtable ones. They better, because in the release version we would have just
>returned those. But appx. once a second, the values _don't_ match. It is that
>simple. It has nothing to do with other parts of the code, and the experiment is
>testing exactly the invariant that makes (or should make) the pawn hashtable
>work, namely that a hit implies data about the correct position, not another one
>with same hash signature (a collision). The experiment doesn't even alter the
>way the program runs at all, it just makes it run a bit slower. So take your
>pick:


One easy test in Crafty is to save the "hashed" stuff and _still_ call
EvaluatePawns() to see if the computed stuff differs.  I have done this
often in debugging and I do not ever see any disagreement here.  Which
doesn't necessarily mean there are no "collisions" but that if there are,
the two different positions produce both the same signature _and_ the same
score (and all the other bitmaps that are hashed for other evaluation
terms).




>
>1) Crafty has pawn hash collisions, but they don't matter because they are so
>few (why?)
>
>2) Crafty has pawn hash collisions, and it makes it does matter performancewise
>somehow

(3) Crafty doesn't any pawn hash collisions to speak of...


>
>3) I have implemented the experiment described in a buggy way (why?)
>
>You cannot pick:
>
>1) There are no collisions because my math says there shouldn't be, but your
>experiment is correct
>
>2) Your experiment is wrong, it doesn't prove anything, even though you've
>implemented it correctly (well you can, but I'd like to see your arguments then)
>
>>If you insist on the empirical data (which is great actaully), then perhaps try
>>a simpler approach, like someone suggested use 64 bit and see how often the
>>reduced 32 bit keys will collide.
>>
>
>My approach is at least as simple as that approach. And more to the point. It
>tests exactly what we want to know, and implementation is 5 lines into
>evaluate.c in Crafty. What do you think you would get out of such an experiment
>that is better than my experiment?
>
>>>And whether this rate is "high" or "low", I don't know. Until I
>>>am sure it is not "too high", I will stick with a rate of zero. Your argument
>>>that there is about 1% chance of collision per game is all well and good. It
>>>just doesn't matter, since Crafty have more than 1 collision a second with appx.
>>>150 kn/s from the initial position. I am asking several questions:
>
>You don't comment on this. Why? It is pretty essential :)



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.