Author: Gerd Isenberg
Date: 15:05:44 09/16/03
Go up one level in this thread
On September 16, 2003 at 17:48:32, Gerd Isenberg wrote:
>Hi all,
>
>Sorry for continuing my boring, slightly off topic, private math lessons for
>autodidactical reasons ;-)
>
>First i want to apologize for suborn you recently, trying an erroneous routine
>for div/mod, e.g. for calculating hash indices of fixed sized (pawn hash)
>tables.
>
>unsigned int mod49981 (unsigned __int64 a)
>{
> unsigned __int64 m = (0x00014FAC00053EB1 * (a >> 32)
> + 0x00014FAC * (a & 0xffffffff));
> int modulus = (int)(a - (__int64)(m>>32)* 49981);
>
> if ( modulus < 0 ) // correct negative results
> modulus += 49981; // due to the remainder of 0.0003
> // oups, that's not enough
> return (unsigned int) modulus;
>}
>
>The idea is to do DIV by reciprocal fixed point multiplication like
>a * 1/49981, where 1/49981 is precalculated. The integer part of the quotient is
>multiplied "back" with the same constant 49981. The remainder is simply the
>difference from the divisor:
>
> a % 49981 := a - (a div 49981)*49981
>
>see Paul Hsieh's formula (modified to 64 bit):
>
> a % (k*2^t) == a - (((((int)(((2^64)+k-1)/k)) * (a) / (2^64))) & (-2^t))*k
>
>http://www.azillionmonkeys.com/qed/adiv.html
>
>Due to rounding error of the reciprocal constant,
>the above routine is _not_ a correct % 49981 one!
>
>Few "remainders" are by up to 4 greater then the maximum of 49980.
>For a "modulus"-function a further correction via
>
> if ( modulus >= 49981 ) // correct wrong results
> modulus -= 49981; // due to the remainder of 0.0003
>
>is necessary. For table-index calculation one may safe it, considering this in
>table allocation, e.g. to allocate 16-entry aligned 50000 instead of 49984,
>which is one to small!
>
>One may try odd "k" < 4096 and some "t" > 0 to avoid corrections at all.
>For pawn hashing i tried an odd mod-constant with low error margin of the
>reciprocal 64-bit constant, but closer to the allocated 64K (e.g. 64577) and
>don't care about correct modulus by doing a final AND 0x0ffff, with a slightly
>better hit rate:
>
unsigned int CPawnHashTable::getIndex(const CNode &node)
{
unsigned __int64 a = node.pawnBB(0) - node.pawnBB(1);
unsigned __int64 m = 0x000103cd3ddab652 * (a >> 32)
+ 0x000103cd * (a & 0xffffffff);
return (unsigned int)((m>>32)*645777 - a) & 0x0ffff;
^^^ oups, typo should be 64577
}
>
>IMHO on opteron the "optimized" assembler output should look like this:
>
mov rax, [node]
mov rcx, [rax.m_PawnBB + 0] ; white pawns
sub rcx, [rax.m_PawnBB + 8] ; -black pawns
mov rax, 0x000103cd3ddab652 ; (((2^64)+645777-1)/645777)
mul rcx
mov rax, 64577 ; oups, typo
mul rdx
sub rax, rax
and rax, 0x0ffff
>
>Second mul (mov,sub,and) may use shorter and faster 32-bit ones, but i'm not
>sure about possible penalties if using same gp-register 64- and 32-bit wise in a
>row. Note that mul on opteron is no longer vector path instructions, 64-bit mul
>has latency of 5 cycles, 32-bit mul 3-cycles. 64-bit div/mod instead is vector
>path and takes 71 (DIV) or 75 (IDIV) cycles, as reported in latest Software
>Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors.
>
>A last comment on incremental zobrist keys for pawn hashing in conjunction with
>my program design: My node structure is well aligned, an additional "int" will
>blow it up over some bounderies without any severe changes. I hash eval and
>store eval + flags in main hash. And i do some (very) lazy eval, where i
>conditionally don't even probe pawn hash.
>
>An alibi not to try zobrist keys for pawn hashing at all
>(and lack of time of course), sorry!
>
>If you are looking for other reciprocal 64-bit constants you may try and improve
>(due to "double" resolution) this routine:
>
>void findGoodReciprocalConstants()
>{
> double errorMargin = 0.0000001; // !? 0 is also fine here with double
> double c = 0x100000000;
> c = c * c - 1.0; // 2**64-1
> for (int i = 1; i < 0x10000; i += 2)
> {
> double fdiv = (c + i)/i;
> unsigned __int64 idiv = fdiv;
> double rdiv = (double)idiv;
> if ( fdiv >= rdiv - errorMargin && fdiv <= rdiv + errorMargin)
> printf("0x%08x%08x for mod(%5d)\n", (int)(idiv>>32), (int)idiv, i);
> }
>}
>
>For division/modulos by odd 64-bit constant less than 4096 a correction
>is not necessary.
>
>Regards,
>Gerd
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.