Computer Chess Club Archives




Subject: Re: SSE2 bit[64] * byte[64] dot product

Author: Russell Reagan

Date: 21:51:40 07/19/04


On July 19, 2004 at 10:59:05, Anthony Cozzie wrote:

>On July 18, 2004 at 15:33:33, Gerd Isenberg wrote:
>>>I am guessing something like 50 cycles?  Really not that bad . . . probably
>>>close to the speed of a scan over attack tables.
>>14.45ns on a 2.2GHz Athlon64, ~32 cycles now.
>>Some minor changes, byte vector values (weights) 0..63, therefore only one
>>psadbw, no movd but two pextrw, final add with gp. Computed bit masks in two
>>xmm-registers (0x02:0x01). Some better instruction scheduling.
>If you would ship me the new code I would be much obliged.
> I am concentrating on parallel code right now, but once that is done I am going
>to do some serious work on my eval.  I want to prove Vincent wrong that a good
>eval cannot be done with bitboards :)
>32 cycles is _really_ good.  I think that on average rotated bitboard attack
>generation is 20 cycles, so that is 50 cycles / piece / mobility = 500 cycles
>(~250 ns on my computer) for all pieces, which is really not bad.  In fact, 32
>cycles is not that much slower than popcount!

I guess Gerd's code takes a bitboard and a piece-square lookup table and
produces the sum of the table entries for the squares whose bits are set, with
the entries for all other squares "masked" off? Is this correct?

Would this be the same basic idea?

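int GetScore (Bitboard b, int scores[])
{
    int score = 0;
    while (b) {
        int i = FirstOne(b);   /* index of a set bit in b */
        b ^= BitMask(i);       /* clear that bit */
        score += scores[i];    /* add the table entry for that square */
    }
    return score;
}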
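
For anyone who wants to compile the loop above, here is a minimal
self-contained sketch in C. The Bitboard typedef and the bodies of FirstOne
and BitMask are assumptions made for illustration (a real engine would use a
bit-scan instruction or its own helpers). GetScoreDot is a hypothetical name
added here to show the same sum written literally as a bit[64] * int[64] dot
product. It is a plain scalar loop, not Gerd's SSE2 code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;   /* assumption: one bit per square, bit 0..63 */

/* Hypothetical helper: index (0..63) of the lowest set bit.
   The result is undefined for b == 0, so that case is asserted away. */
static int FirstOne(Bitboard b)
{
    assert(b != 0);
    int i = 0;
    while (!(b & 1)) {
        b >>= 1;
        ++i;
    }
    return i;
}

/* Hypothetical helper: bitboard with only bit i set. */
static Bitboard BitMask(int i)
{
    return (Bitboard)1 << i;
}

/* Loop over the set bits only, as in the post. */
static int GetScore(Bitboard b, const int scores[64])
{
    int score = 0;
    while (b) {
        int i = FirstOne(b);
        b ^= BitMask(i);       /* clear the bit just found */
        score += scores[i];
    }
    return score;
}

/* Same sum as a dot product: bit i (0 or 1) times scores[i]. */
static int GetScoreDot(Bitboard b, const int scores[64])
{
    int score = 0;
    for (int i = 0; i < 64; ++i)
        score += (int)((b >> i) & 1) * scores[i];
    return score;
}
```

Both functions should return the same value for any bitboard. The first does
work proportional to the number of set bits, while the second always touches
all 64 entries, which is the shape a SIMD version would presumably take.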


Last modified: Thu, 07 Jul 11 08:48:38 -0700
