Author: Vincent Diepeveen
Date: 05:44:15 04/24/01
Go up one level in this thread
On April 23, 2001 at 22:19:33, Landon Rabern wrote:
>On April 23, 2001 at 19:30:20, Alex Boby wrote:
>
>>
>>I used to have this:
>>
>>------------
>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>> {
>> int i;
>>
>> for (i=0; i<64; i++)
>> {
>> if (attack&mask[i])
>> [add move to list]
>> }
>> }
>>------------
>>and got this in the profile:
>>7301.351 3.9 37127.739 19.6 538488 _parseBitboard (pierre.obj)
>>
>>and then, figuring I would get a significant speed increase, I switched to this:
>>
>>-----------------
>>int findBitIndex(bitboard data)
>> {
>> int index;
>>
>> __asm
>> {
>> bsr edx, dword ptr data+4
>> mov eax, 32
>> jnz s1
>> bsr edx, dword ptr data
>> mov eax, 0
>> jnz s1
>> mov edx, -1
>> s1: add edx, eax
>> mov index, edx
>> }
>>
>> return index;
>> }
>>
>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>> {
>> int index;
>>
>> while ((index = findBitIndex(attack))!=-1)
>> {
>> [add move to list]
>> attack -= mask[index];
>> }
>> }
>>-------------
>>and then got this in the profile:
>> 6763.331 4.4 32424.707 21.1 530420 _parseBitboard (pierre.obj)
>> 1313.554 0.9 1313.554 0.9 3523746 _findBitIndex (pierre.obj)
>>
>>with about a 10% drop in nodes/sec.
>>
>>I thought that BSF & BSR were supposed to be fast! What am I doing wrong?
>>This is on an Intel P3/500 w/ win2k.
>
>This is what I do and I get about a 10% speed increase over my table lookup
>version. Im also running win2k on a P3/500.
>
>__forceinline int LSB(bitBoard n){
> __asm {
> bsf edx, dword ptr n
> mov eax, 0
> jnz l1
> bsf edx, dword ptr n+4
> mov eax, 32
> jnz l1
> mov edx, -33
> l1: add eax, edx
> }
>}
If you get rid of those 2 jnz instructions
then you can add a load of instructions to replace those
2 possible branches and get a lot faster.
If misprediction occurs it will be at least 10 clocks penalty on P3,
even more on K7 and 20 clocks at least at P4.
Means that you can use up to 20 instructions easily as long as they
do not use branches to replace it.
> while (toMap){
> toSquare=LSB(toMap);
> toMap&=notMask[toSquare];
> [add move]
> }
>
>A couple things, is that subtraction attack -= mask[index]; slower or faster
>than anding with a notmask?
>
>Also,
>
>while ((index = findBitIndex(attack))!=-1)
>{
> [add move to list]
> attack -= mask[index];
>}
>
>you keep going until findBitIndex outputs a -1, but isn't this also when
>attack==0? So, you are doing extra instructions on the last time.
>
>Try this:
>
>while (attack)
>{
> index = findBitIndex(attack);
> [add move to list]
> attack -= mask[index];//maybe a notMask and is faster like I did?
>}
>
>
>
>Regards,
>
>Landon W. Rabern
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.