Author: Bas Hamstra
Date: 08:45:42 04/27/01
Go up one level in this thread
Dann,
Your posted version was slightly slower than mine. For 32 bits it is faster, but
not for 64 bits, because the needed 64 bits shifts cause extra overhead. So, so
far I have not found a faster version than my simple table lookup. Tricks to sum
up 2 32 bit counts are about as fast as my original. I think it is more or less
the limit for ui64. I checked out all your routines, for 32 bits my version can
be much improved upon, but so to see not for 64 bits.
(I used Borland to compare)
Best regards,
Bas.
On April 25, 2001 at 14:53:00, Dann Corbit wrote:
>On April 25, 2001 at 14:38:48, Bas Hamstra wrote:
>
>>On April 25, 2001 at 14:31:27, Bas Hamstra wrote:
>>
>>>inline int PopCnt(BB M)
>>>{ unsigned char *p = (unsigned char*) &M;
>>> return PopCnt8[p[0]]+PopCnt8[p[1]]+PopCnt8[p[2]]+PopCnt8[p[3]]
>>> +PopCnt8[p[4]]+PopCnt8[p[5]]+PopCnt8[p[6]]+PopCnt8[p[7]];
>>>}
>>>
>>>Last time I tested, on my Celeron the above PopCnt was way faster than Crafty's
>>>asm version. Am I overlooking something? Maybe because my compiler refuses to
>>>inline asm statements?
>>
>>More accurately: my compiler refuses to inline functions that contain asm
>>statements.
>
>What compiler are you using? Both Intel and Microsoft compilers will do it. If
>you are using Borland, time to upgrade. It is pathetic in comparison to any of
>the really good ones. Sad, because it used to compete effectively. Now you
>know why they are giving it away.
>
>This works fairly well (which is {I suspect}) what is in your table:
>
>static int inbits[256] = {
> 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
> 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
> 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
> 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
> 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
> 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
> 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
>};
>
>int Count(const BITBOARD B)
>{
> return inbits[(unsigned char) B] +
> inbits[(unsigned char) (B >> 8)] +
> inbits[(unsigned char) (B >> 16)] +
> inbits[(unsigned char) (B >> 24)] +
> inbits[(unsigned char) (B >> 32)] +
> inbits[(unsigned char) (B >> 40)] +
> inbits[(unsigned char) (B >> 48)] +
> inbits[(unsigned char) (B >> 56)];
>}
>
>You might want to time this in comparison to your routine. For that matter, I
>have a collection of bit counting routines found here:
>ftp://cap.connx.com/pub/bitcount/BITC.ZIP
>
>That might prove useful. A benchmark routine is included. They are for 32
>bits, but the techniques easily generalize to larger integers.
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.