Computer Chess Club Archives



Subject: Re: Hash collisions with deep searches

Author: Vincent Diepeveen

Date: 10:56:17 12/23/02


On December 22, 2002 at 06:44:04, Frank Phillips wrote:

>On December 21, 2002 at 13:57:09, Vincent Diepeveen wrote:
>
>>On December 20, 2002 at 22:05:14, Walter Faxon wrote:
>>
>>>On December 20, 2002 at 05:40:35, Gerd Isenberg wrote:
>>>
>>>>On December 19, 2002 at 16:50:32, Dann Corbit wrote:
>>>>
>>>>>Suppose that we are searching at long time control, and the hash table rapidly
>>>>>fills up.  This can throw move ordering in the toilet, if we can't find
>>>>>transpositions any more.
>>>>>
>>>>>What sort of replacement scheme is best for very long searches (several
>>>>>minutes)?
>>>>
>>>>Hi Dann,
>>>>
>>>>I use a two table approach.
>>>>
>>>>A primary big one, where I replace one of eight with the lowest
>>>>timestamp/draft/someFlags priority. Exact entries become a bit more resistant
>>>>to overwriting, the same for the leftmost successors of every root move (PV
>>>>and refutation lines).
>>>>
>>>>A smaller secondary one, with an always replacement scheme.
>>>>
>>>>In general I don't store or probe in qsearch. I use IID in __every__ interior
>>>>node without a move hint from hash.
>>>>
>>>>Regards,
>>>>Gerd
>>>
>>>
>>>Hi, Gerd.
>>>
>>>Good idea about IID!
>>>
>>>I have no chess program but it seems to me that one uses hashing for two
>>>reasons:  (1) to save information that was expensive to compute but may not be
>>>immediately useful (closer to the root), and (2) to save information that wasn't
>>>very expensive to compute but may be useful very soon (close to the current
>>>node).  That's why a two-table approach (or a one-table, two types of entry
>>>approach) is so often preferred.
>>
>>Not really.
>>
>>First, most programs used a single-probe transposition table concept.
>>Then the two-table concept was invented, though 8-probe schemes existed
>>long before that.
>>
>>Yet recently cache lines (starting with DDR RAM and RDRAM) have grown so
>>much in length that you can implement a 4-probes-in-a-row concept
>>essentially for free, without risk.
>>
>>That obviously works a lot better than 2 slow accesses to RAM at random
>>positions.
>>
>
>I am confused.  Memory access for hash tables should be random - yes?
>Nevertheless, will not the following be together in memory?  There could be
>more elements within the structure, of course.
>
>
>typedef struct trans_record {
>	TransRecordT top;
>#if defined TWO_LEVEL_HASH_TABLE
>	TransRecordT bottom;
>#endif
>} HashRecordT;
>

The first reference to the hash table will most likely be random. It's sad,
but that's probably the truth. The other probes should be sequential.

Here is what Crafty 18.15 (from around June-August 2002,
correct me if I'm wrong there) is doing:

int HashProbe(...) {
  HASH_ENTRY *htable;
  ..
  htable = trans_ref_a + ((int) temp_hashkey & hash_maska);
  .. (probe 1)
  ..
  htable = trans_ref_b + ((int) temp_hashkey & hash_maskb);
  .. (probe 2)

Both trans_ref_a and trans_ref_b were allocated with their own special
malloc. That means each reference to trans_ref_a or trans_ref_b eats a
random cache line from memory, which is very expensive. I never managed to
measure it very precisely, but I'd gamble it is around 400 clocks or so to
get that cache line.

Then another *random* cache line costs another 400 clock ticks.

Reading sequential memory is faster. From Bob's posting here it's good to
know that the latest Crafty is doing that too.

Good to know he corrected that.

A single random read fetches a whole cache line of 64 bytes (DDR RAM) or
128 bytes (RDRAM). Now if you do not put your hash table at the start of a
cache line (which is of course hard to do when your entry size does not
divide 64 evenly), then 4 sequential probes of 16-byte hash entries means
you need 64 bytes.

That can be 1 cache line, or with bad luck 2 cache lines. But they are
sequential, and memory reads sequential cache lines far faster than random
ones.

In fact it is *very fast* reading sequential cache lines.
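To make the alignment point concrete, here is a small C sketch of allocating the table on a cache-line boundary so that a 4-entry (64-byte) bucket never straddles two lines. The entry layout and the helper name are my own illustration, not DIEP's or Crafty's actual code; aligned_alloc is C11 and requires the size to be a multiple of the alignment (so keep the entry count a multiple of 4 here).

```c
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>

#define CACHE_LINE 64        /* DDR-era line size; RDRAM fetches 128 */

typedef struct {             /* 16-byte entry, as discussed above */
    uint64_t key;            /* hash signature */
    uint32_t data;           /* packed move/score/bound flags */
    uint32_t draft;          /* remaining search depth */
} HashEntry;

/* Hypothetical helper: allocate the table aligned to a cache-line
   boundary, so every group of 4 sequential entries (64 bytes) sits
   inside one line. n_entries should be a multiple of 4 so the total
   size is a multiple of CACHE_LINE, as aligned_alloc requires. */
static HashEntry *alloc_table(size_t n_entries)
{
    return aligned_alloc(CACHE_LINE, n_entries * sizeof(HashEntry));
}
```

With this alignment, the 4 sequential probes described above cost one random line fetch instead of up to two.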

The 2-table concept (with a sequential implementation) is far superior to
a single probe.

However, it means that in effect you divide your total hash table size by 2.

There is a far better way to do it without giving up any of the hash
table's size: use a single table, compute an entry number like you do now,
and then probe all 4 sequential entries.

It is also less code than the 2 probes currently are in Crafty.

Bob 'unlooped' it in 18.15, but it is easy to loop:

htable = ..;
for( i = 0 ; i < N_HASHPROBES ; i++ ) {
  ..
  htable++;
}
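Filling in the elided parts, a minimal self-contained sketch of that looped single-table probe might look like this. The entry layout, the mask convention, and the function name are my own assumptions for illustration, not DIEP's or Crafty's actual code:

```c
#include <stddef.h>
#include <stdint.h>
#include <assert.h>

#define N_HASHPROBES 4       /* 2 for SDRAM, 4 for DDR, 8 for RDRAM */

typedef struct {             /* 16-byte entry */
    uint64_t key;            /* full signature, verified on probe */
    uint32_t data;
    uint32_t draft;
} HashEntry;

/* mask = table_size - 1, with table_size a power of two.
   Rounding the index down to a bucket boundary keeps all
   N_HASHPROBES probes inside one aligned 64-byte group. */
static HashEntry *hash_probe(HashEntry *table, size_t mask,
                             uint64_t hashkey)
{
    HashEntry *htable =
        table + (hashkey & mask & ~(size_t)(N_HASHPROBES - 1));
    for (int i = 0; i < N_HASHPROBES; i++, htable++)
        if (htable->key == hashkey)
            return htable;   /* hit: transposition found */
    return NULL;             /* miss: caller searches normally */
}
```

Only the first entry of the bucket is a random access; the remaining probes walk the same cache line(s) sequentially.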


I use the number 8 for N_HASHPROBES in DIEP. I'm not claiming that works
best for you either. If you have SDRAM I would limit it to 2; if you have
DDR RAM I would take 4; and if you have RDRAM I would go for 8.

That of course assumes 16 bytes per entry. In DIEP I regrettably didn't
manage to get my entry size smaller than 24 bytes, because I do not lock
the hash table (12 bytes are the hashkey signature, so I could easily make
it 20 bytes if I did lock the hash table). A trick to probe locklessly in
such a way that I do not lose those extra 4 bytes is something I would love
to discuss here.
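On the lockless point: one published approach is Bob Hyatt's lockless hashing trick for Crafty, which stores the key XORed with the data instead of taking a lock, so a torn write by another SMP thread shows up as a failed signature check rather than a corrupted entry. A minimal sketch, with a field layout that is purely illustrative (it does not by itself recover the 4 bytes Vincent asks about):

```c
#include <stdint.h>
#include <assert.h>

typedef struct {
    uint64_t check;   /* hashkey ^ data: validates the pair */
    uint64_t data;    /* move, score, bound, draft packed   */
} XorEntry;

/* Store without a lock: if another thread interleaves a write,
   check and data no longer match and the probe simply misses. */
static void xor_store(XorEntry *e, uint64_t hashkey, uint64_t data)
{
    e->data  = data;
    e->check = hashkey ^ data;
}

/* Probe: the entry is valid only if check ^ data reproduces the key. */
static int xor_probe(const XorEntry *e, uint64_t hashkey, uint64_t *data)
{
    if ((e->check ^ e->data) != hashkey)
        return 0;                 /* miss, or a torn SMP write */
    *data = e->data;
    return 1;
}
```

The price is that the signature field doubles as the integrity check, so the scheme trades a little extra XOR work per probe for not locking the table at all.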

>
>
>>>Reason (2) implies that some sort of hashing should be done even in quiescence
>>>search.  The problem with qsearch hashing nowadays is the memory caching
>>>penalty.  Nowadays, if you have to wait on a memory access that might only help
>>>a little, you might as well do the search directly.
>>>
>>>Maybe one might use a very small hash table (10K entries, say), reserved for
>>>qsearch nodes?  (Or nodes beyond a certain depth or with reduced material.)
>>>Particularly if you could freeze it all into the cache...
>>>
>>>I think an older ICCAJ paper detailed experiments with hash node type
>>>allocation, but without reference to memory caching penalties.
>>>
>>>-- Walter




Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.