Author: Vincent Diepeveen
Date: 13:42:02 10/17/02
On October 17, 2002 at 14:56:43, Robert Hyatt wrote:
>On October 17, 2002 at 12:01:37, Vincent Diepeveen wrote:
>
>>On October 17, 2002 at 06:20:11, Uri Blass wrote:
>>
>>>Today my repetition detection is not done based on hash tables and I plan to do
>>>it faster by hash tables.
>>
>>Why would this be faster, taking x86 cpu architecture into account?
>>
>>
>>>After every makemove I calculate the new hash key for the hash tables
>>
>>You don't yet calculate the Zobrist hash incrementally, by just
>>xor-ing the pieces you moved into the hashkey?
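
A minimal sketch of the incremental update referred to here, assuming a table of random 64-bit numbers indexed by piece and square. The names zobrist_table, hash_key, init_zobrist and hash_move are hypothetical and not taken from either program; the xorshift generator is only a stand-in for whatever random source an engine really uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes and names, for illustration only. */
enum { NUM_PIECES = 12, NUM_SQUARES = 64 };

static uint64_t zobrist_table[NUM_PIECES][NUM_SQUARES];
static uint64_t hash_key;   /* key of the current position */

/* Fill the table with pseudo-random numbers (xorshift64 as a stand-in PRNG). */
static void init_zobrist(uint64_t seed)
{
    uint64_t s = seed ? seed : 1;   /* xorshift state must be non-zero */
    for (int p = 0; p < NUM_PIECES; p++) {
        for (int sq = 0; sq < NUM_SQUARES; sq++) {
            s ^= s << 13;
            s ^= s >> 7;
            s ^= s << 17;
            zobrist_table[p][sq] = s;
        }
    }
}

/* Incremental update for a quiet move: xor the piece out of 'from'
   and into 'to'. Captures, castling etc. would need extra xors. */
static void hash_move(int piece, int from, int to)
{
    assert(piece >= 0 && piece < NUM_PIECES);
    assert(from >= 0 && from < NUM_SQUARES);
    assert(to >= 0 && to < NUM_SQUARES);
    hash_key ^= zobrist_table[piece][from];
    hash_key ^= zobrist_table[piece][to];
}
```

Because xor is its own inverse, calling hash_move with 'from' and 'to' swapped restores the previous key, so the same routine can serve for unmaking a quiet move.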
>>
>>>but I do not have an array of all the hash keys and I use a global variable
>>>__int64 zob to have the hash key.
>>
>>For the x86 architecture I'm using 2 x 'unsigned int' for the hashkey.
>>The reason is that it was faster than a single 'unsigned _int64'.
>>
>>Compilers are not so efficient yet, though Intel C++ might be doing this
>>more efficiently than others :)
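
A hedged sketch of what a key made of 2 x 'unsigned int' could look like. The HashKey struct and the helper names are hypothetical; the post does not show the actual layout. Whether this beats a single 64-bit integer depends on the compiler and target, so it is worth measuring rather than assuming.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a 64-bit key held as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} HashKey;

/* Xor a random key into the running key, one 32-bit half at a time. */
static void key_xor(HashKey *k, HashKey r)
{
    assert(k != NULL);
    k->lo ^= r.lo;
    k->hi ^= r.hi;
}

/* Two keys match only if both halves match. */
static int key_equal(HashKey a, HashKey b)
{
    return a.lo == b.lo && a.hi == b.hi;
}

/* Combine the halves when a single 64-bit value is needed. */
static uint64_t key_to_u64(HashKey k)
{
    return ((uint64_t)k.hi << 32) | k.lo;
}
```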
>>
>>>I plan to add an array zobkey[max_plies_of_game] for hash keys
>>>My question is what is faster:
>>
>>>1)Doing all the calculation on zob and after finishing them to do
>>>zobkey[hply]=zob;
>>
>>this is by far fastest of course.
>
>No it isn't, if zob is global. You have to update _both_ zob and zobkey[hply] in
>memory. If zob is local, the compiler might be able to keep it in a register for
>its scope and never do a write to memory until you write the value to zobkey...
There is no real difference between local and global unless you write to
other cache lines before finishing all modifications to the global zob.
To keep it in a single register (which would be the only advantage of making
it local) you cannot put makemove() in a separate function
at all (or you have to inline it everywhere).
That's not neat programming.
>
>>
>>>2)Doing all the calculations on zobkey[hply]
>>
>>Extra array references cause extra instructions, such as
>>indirectly accessing the array through [EAX].
>>
>>Doing all operations on a single register is way faster.
>>
>
>
>
>However that has nothing to do with his question. You load the value _once_ and
>then fiddle with it in a register. Then you store it back once.
You have to start with a value already. So the choice is simple.
Either you start with a value from an array (slow), or you do stuff
to a global variable, which means it only gets stored after you modify
other cache lines.
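
For reference, the two options under discussion can be sketched as below. This assumes the names zob, zobkey and hply from Uri's question; MAX_PLIES, make_option1, make_option2 and the from_key/to_key parameters are hypothetical stand-ins for the real makemove() and its table lookups. Both variants produce the same key, so the argument is only about the code the compiler generates for each.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_PLIES = 1024 };          /* hypothetical limit */

static uint64_t zob;                /* global running key (option 1) */
static uint64_t zobkey[MAX_PLIES];  /* key history indexed by hply */
static int hply;

/* Option 1: do all xors on the global, then store once into the history. */
static void make_option1(uint64_t from_key, uint64_t to_key)
{
    assert(hply + 1 < MAX_PLIES);
    zob ^= from_key;
    zob ^= to_key;
    zobkey[++hply] = zob;
}

/* Option 2: copy the previous key forward and do all xors on the array element. */
static void make_option2(uint64_t from_key, uint64_t to_key)
{
    assert(hply + 1 < MAX_PLIES);
    zobkey[hply + 1] = zobkey[hply];
    hply++;
    zobkey[hply] ^= from_key;
    zobkey[hply] ^= to_key;
}
```

Comparing the assembly output of the two functions for a given compiler settles the question for that compiler.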
>
>
>
>>>I guess that I am going to choose 1 because it is more simple and I guess that
>>>the difference in speed is less than 0.1% but I am interested to know what is
>>>faster.
>>
>>Well it should take very little system time in total anyway, but
>>working on 1 global variable is always faster than doing it by using
>>arrays.
>
>
>Vincent, this is simply wrong. You make a gross statement that has more
>exceptions than cases that it follows.
>For example:
>int x[2]; where that is a local array. It is more than possible that the two
>elements get put into registers and _no_ memory reads/writes are done at all.
>If you work with a _global_ value, the compiler loses some of its ability to
>optimize away writes back to the value. For local data, the compiler produces
No it doesn't, Bob. As you may know, the processor only stores a cache line
after you do writes to other cache lines.
The question raised was whether he should use arrays or a global variable.
In that case the answer is a global variable.
Of course I'm using local variables everywhere, but the principle is the
same: it's simply faster than fiddling with arrays.
If the question is whether indirect referencing is faster than direct
referencing, then the answer is direct referencing.
That was the question raised.
Apart from that, global variables only need to be copied into local variables
when you fiddle with a lot of different stuff at the same time, that is,
when you do writes to other cache lines *before* all modifications to the
global variable are done.
int a; // global

void function(void) {
    ..
    a ^= movelookup[move&mask];
    a ^= iscapture[move&capturemask];
    ..
    // other code
    ..
}
No way to beat that with a local variable.
>a dependency graph and knows when a value gets written back but never used
>again, and it eliminates the write back. For global variables this can't be done
>because the compiler can't tell who might use that value later on after this
>procedure has returned...
>
>
>
>
>
>>
>>>Doing all the calculations on zobkey[hply] seems to have one less arithmetic
>>>calculation but more array calls.
>>
>>Arithmetic is very cheap (exception: BSF and BSR vector instructions)
>>
>>In general you should assume that future processors (take the
>>McKinley as an example) will execute more instructions either in a bundle
>>or within a single clock. Memory will get slower. So instructions that
>>act upon a single register will be very fast unless they are complex
>>instructions like BSF. I even use multiplication without scruples in
>>preference to adding a single small local array!
>>
>>A hashtable is way slower because it eats more memory than a single
>>array [hply]. That O(1) lookup is of course way slower than doing
>>a lookup in that array of hash numbers, which is perhaps even
>>inside your L1 cache already.
>>
>>Also, in order to get a hashtable to work correctly you need a linked-list
>>hashtable. On paper that sounds cool perhaps, but it is slow as hell.
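
A minimal sketch of the array-based alternative to a repetition hashtable: scan the key history backwards for an equal key. The zobkey[] and hply names come from Uri's question; MAX_PLIES, is_repetition and the 'reversible' counter (plies since the last capture or pawn move, assumed to be tracked by the caller) are hypothetical additions, since the post does not say how far back the scan should go.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_PLIES = 1024 };          /* hypothetical limit */

static uint64_t zobkey[MAX_PLIES];  /* key of the position at each ply */

/* Returns 1 if the position at 'hply' already occurred earlier with the
   same side to move, looking back at most 'reversible' plies. */
static int is_repetition(int hply, int reversible)
{
    assert(hply >= 0 && hply < MAX_PLIES);
    int stop = hply - reversible;
    if (stop < 0)
        stop = 0;
    /* Step by 2: only positions with the same side to move can match. */
    for (int i = hply - 2; i >= stop; i -= 2) {
        if (zobkey[i] == zobkey[hply])
            return 1;
    }
    return 0;
}
```

The scan touches a handful of consecutive 8-byte entries, which is why it tends to stay within one or two cache lines.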
>>
>>>Uri
Last modified: Thu, 15 Apr 21 08:11:13 -0700