Author: Robert Hyatt
Date: 10:14:43 01/15/98
On January 15, 1998 at 10:27:28, Stuart Cracraft wrote:

>The research I've read indicates that a two-tiered
>structure with nodes-based replacement for the one
>slot that most people check for subtree height is
>preferred.
>
>That is, the number of nodes that were searched
>for this subtree must be greater than the number of
>nodes recorded for the position already in the hash
>table.

I tried this when the discussion hit r.g.c.c... for me, it didn't work. If you don't do many extensions, this might pan out. But if you do, node counts will vary based on how many extensions you do, not necessarily on how much work was done below that position. I.e., I once counted such info in Cray Blitz when we were discussing this a while back in r.g.c.c, and found two sibling nodes where one had one node examined below it and the other had one million nodes examined, purely because of the extensions (singular and others) that I do/did in CB.

I found that depth-preferred seemed to work better for me at the time. Of course, it might have changed, but I tried it on two different programs (CB and Crafty) and depth-preferred was always just a little more efficient than nodes-preferred replacement.

>Recently, I replaced my single-tier depth-based
>replacement with single-tier nodes-based and
>experienced a small improvement.
>
>Later I will go to a two-tier algorithm so that new
>things always get stored.
>
>Also, I think we need to discuss rehashing. Does someone
>have this implemented? It seems like rehashing for say
>8 slots based on nodes-recorded would be better than
>single-tier since quite a few collisions could be handled
>without replacing information already at that slot.

Rehashing is better. In Cray Blitz, we used 8 slots. Basically, we took the hash key and divided it into three fields:

<lock><index><table_address>

<table_address> is just what you'd expect... the address of the initial table probe.
<index> is added to table_address 7 times to produce the other 7 entries in this "eight entry bucket" approach. <lock> is simply the "rest" of the hash signature.

This has two good features: (1) you get to choose between replacing one of eight entries, sort of an 8-way set-associative hashing algorithm; (2) using the <index> field tends to help avoid "chaining", if you know what that is all about.

This doesn't work well on a PC, however, because the 8 loads on the Cray were essentially free after the first one. On a PC, we have such poor memory bandwidth that the extra loads place a strain on the memory interface that will slow things down more than the reduced node count will offset.

Don Beal wrote a paper in the JICCA explaining this, but he only considered nodes searched. In that case, probing 4 addresses performed better than probing only one, or than the two-tiered approach. But *only* when the table is quite saturated. I avoid most of this by not hashing q-nodes, so I don't run into table saturation in normal circumstances...

>Does anyone have some good algorithms for this? Let's say
>I hash to index X and find it is filled. What is a good
>algorithm to choose another index to try? And it seems like
>that algorithm would have to be deterministic because the
>next time into this entry, I'd want to go search those
>alternate indexes/slots in the same order to find where I'd
>stored it.
>
>--Stuart

Use the Cray Blitz approach. The danger otherwise is that when you hash to entry X, you try X, X+n, X+2n, etc. If you later hash to Y, and either Y, Y+n, Y+2n, ... hits one of the X's above, the two probe sequences will "match" from that point forward. In the CB approach, n would likely be different for two different hash signatures, so the index between "bucket entries" would not be a constant, giving a better and more uniform distribution of entries.