Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: BSF/R not working well for me...

Author: Landon Rabern

Date: 18:14:32 04/24/01

Go up one level in this thread


On April 24, 2001 at 08:44:15, Vincent Diepeveen wrote:

>On April 23, 2001 at 22:19:33, Landon Rabern wrote:
>
>>On April 23, 2001 at 19:30:20, Alex Boby wrote:
>>
>>>
>>>I used to have this:
>>>
>>>------------
>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>  {
>>>  int i;
>>>
>>>  for (i=0; i<64; i++)
>>>    {
>>>    if (attack&mask[i])
>>>      [add move to list]
>>>    }
>>>  }
>>>------------
>>>and got this in the profile:
>>>7301.351   3.9    37127.739  19.6   538488 _parseBitboard (pierre.obj)
>>>
>>>and then, figuring I would get a significant speed increase, I switched to this:
>>>
>>>-----------------
>>>int findBitIndex(bitboard data)
>>>  {
>>>  int index;
>>>
>>>  __asm
>>>    {
>>>        bsr edx, dword ptr data+4
>>>        mov eax, 32
>>>        jnz s1
>>>        bsr edx, dword ptr data
>>>        mov eax, 0
>>>        jnz s1
>>>        mov edx, -1
>>>    s1:	add edx, eax
>>>        mov index, edx
>>>    }
>>>
>>>  return index;
>>>  }
>>>
>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>  {
>>>  int index;
>>>
>>>  while ((index = findBitIndex(attack))!=-1)
>>>    {
>>>      [add move to list]
>>>      attack -= mask[index];
>>>    }
>>>  }
>>>-------------
>>>and then got this in the profile:
>>>    6763.331   4.4    32424.707  21.1   530420 _parseBitboard (pierre.obj)
>>>    1313.554   0.9     1313.554   0.9  3523746 _findBitIndex (pierre.obj)
>>>
>>>with about a 10% drop in nodes/sec.
>>>
>>>I thought that BSF & BSR were supposed to be fast! What am I doing wrong?
>>>This is on an Intel P3/500 w/ win2k.
>>
>>This is what I do and I get about a 10% speed increase over my table lookup
>>version.  Im also running win2k on a P3/500.
>>
>>__forceinline int LSB(bitBoard n){
>>	__asm {
>>	      bsf     edx, dword ptr n
>>        mov     eax, 0
>>        jnz     l1
>>        bsf     edx, dword ptr n+4
>>        mov     eax, 32
>>        jnz     l1
>>        mov     edx, -33
>>  l1:   add     eax, edx
>>  }
>>}
>
>If you get rid of those 2 jnz instructions
>then you can add a load of instructions to replace those
>2 possible branches and get a lot faster.
>
>If misprediction occurs it will be at least 10 clocks penalty on P3,
>even more on K7 and 20 clocks at least at P4.
>
>Means that you can use up to 20 instructions easily as long as they
>do not use branches to replace it.


You are probably right, but I can see no way to get the same functionality
without branching.

>
>>  while (toMap){
>>    toSquare=LSB(toMap);
>>    toMap&=notMask[toSquare];
>>    [add move]
>>  }
>>
>>A couple things, is that subtraction attack -= mask[index]; slower or faster
>>than anding with a notmask?
>>
>>Also,
>>
>>while ((index = findBitIndex(attack))!=-1)
>>{
>>  [add move to list]
>>  attack -= mask[index];
>>}
>>
>>you keep going until findBitIndex outputs a -1, but isn't this also when
>>attack==0?  So, you are doing extra instructions on the last time.
>>
>>Try this:
>>
>>while (attack)
>>{
>>  index = findBitIndex(attack);
>>  [add move to list]
>>  attack -= mask[index];//maybe a notMask and is faster like I did?
>>}
>>
>>
>>
>>Regards,
>>
>>Landon W. Rabern



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.