Computer Chess Club Archives




Subject: Re: SSE2 bit[64] * byte[64] dot product

Author: Anthony Cozzie

Date: 07:59:05 07/19/04


On July 18, 2004 at 15:33:33, Gerd Isenberg wrote:

>>I am guessing something like 50 cycles?  Really not that bad . . . probably
>>close to the speed of a scan over attack tables.
>14.45ns on a 2.2GHz Athlon64, ~32 cycles now.
>Some minor changes, byte vector values (weights) 0..63, therefore only one
>psadbw, no movd but two pextrw, final add with gp. Computed bit masks in two
>xmm-registers (0x02:0x01). Some better instruction scheduling.

If you would ship me the new code I would be much obliged.  I am concentrating
on parallel code right now, but once that is done I am going to do some serious
work on my eval.  I want to prove Vincent wrong in his claim that a good eval
cannot be done with bitboards :)

32 cycles is _really_ good.  I think that on average rotated bitboard attack
generation is 20 cycles, so attack generation plus the mobility dot product
comes to about 50 cycles per piece, or roughly 500 cycles (~250 ns on my
computer) for all pieces, which is really not bad.  In fact, 32 cycles is not
that much slower than popcount!
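
For anyone following along, here is roughly what the routine computes, as a plain C reference.  This is only a sketch and not Gerd's SSE2 code: the name dot_bits_bytes and the branchless mask trick are my own assumptions, and it makes no attempt to hit 32 cycles.  It just adds up weights[sq] for every set bit sq of the bitboard, which is what a weighted mobility count needs.

```c
#include <stdint.h>

/*
 * Scalar reference for a bit[64] * byte[64] dot product.
 * Hypothetical helper, not the SSE2 routine discussed in this thread.
 * Returns the sum of weights[sq] over every set bit sq of bb.
 */
unsigned dot_bits_bytes(uint64_t bb, const uint8_t weights[64])
{
    unsigned sum = 0;
    for (int sq = 0; sq < 64; ++sq) {
        /* mask is all ones when bit sq is set, zero otherwise (no branch) */
        unsigned mask = 0u - (unsigned)((bb >> sq) & 1u);
        sum += weights[sq] & mask;
    }
    return sum;
}
```

With weights limited to 0..63 as in the quoted post, the largest possible result is 64 * 63 = 4032, so the sum fits comfortably in 16 bits.  A loop like this is mainly useful as a correctness check against a faster SIMD version.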



Last modified: Thu, 07 Jul 11 08:48:38 -0700
