Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: BSF/R not working well for me...

Author: Landon Rabern
Date: 12:18:31 04/25/01
On April 25, 2001 at 14:36:55, Bas Hamstra wrote:

>On April 24, 2001 at 21:14:32, Landon Rabern wrote:
>
>>On April 24, 2001 at 08:44:15, Vincent Diepeveen wrote:
>>
>>>On April 23, 2001 at 22:19:33, Landon Rabern wrote:
>>>
>>>>On April 23, 2001 at 19:30:20, Alex Boby wrote:
>>>>
>>>>>
>>>>>I used to have this:
>>>>>
>>>>>------------
>>>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>>>  {
>>>>>  int i;
>>>>>
>>>>>  for (i=0; i<64; i++)
>>>>>    {
>>>>>    if (attack&mask[i])
>>>>>      [add move to list]
>>>>>    }
>>>>>  }
>>>>>------------
>>>>>and got this in the profile:
>>>>>7301.351   3.9    37127.739  19.6   538488 _parseBitboard (pierre.obj)
>>>>>
>>>>>and then, figuring I would get a significant speed increase, I switched to this:
>>>>>
>>>>>-----------------
>>>>>int findBitIndex(bitboard data)
>>>>>  {
>>>>>  int index;
>>>>>
>>>>>  __asm
>>>>>    {
>>>>>        bsr edx, dword ptr data+4
>>>>>        mov eax, 32
>>>>>        jnz s1
>>>>>        bsr edx, dword ptr data
>>>>>        mov eax, 0
>>>>>        jnz s1
>>>>>        mov edx, -1
>>>>>    s1:	add edx, eax
>>>>>        mov index, edx
>>>>>    }
>>>>>
>>>>>  return index;
>>>>>  }
>>>>>
>>>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>>>  {
>>>>>  int index;
>>>>>
>>>>>  while ((index = findBitIndex(attack))!=-1)
>>>>>    {
>>>>>      [add move to list]
>>>>>      attack -= mask[index];
>>>>>    }
>>>>>  }
>>>>>-------------
>>>>>and then got this in the profile:
>>>>>    6763.331   4.4    32424.707  21.1   530420 _parseBitboard (pierre.obj)
>>>>>    1313.554   0.9     1313.554   0.9  3523746 _findBitIndex (pierre.obj)
>>>>>
>>>>>with about a 10% drop in nodes/sec.
>>>>>
>>>>>I thought that BSF & BSR were supposed to be fast! What am I doing wrong?
>>>>>This is on an Intel P3/500 w/ win2k.
>>>>
>>>>This is what I do and I get about a 10% speed increase over my table lookup
>>>>version.  Im also running win2k on a P3/500.
>>>>
>>>>__forceinline int LSB(bitBoard n){
>>>>	__asm {
>>>>	      bsf     edx, dword ptr n
>>>>        mov     eax, 0
>>>>        jnz     l1
>>>>        bsf     edx, dword ptr n+4
>>>>        mov     eax, 32
>>>>        jnz     l1
>>>>        mov     edx, -33
>>>>  l1:   add     eax, edx
>>>>  }
>>>>}
>>>
>>>If you get rid of those 2 jnz instructions
>>>then you can add a load of instructions to replace those
>>>2 possible branches and get a lot faster.
>>>
>>>If misprediction occurs it will be at least 10 clocks penalty on P3,
>>>even more on K7 and 20 clocks at least at P4.
>>>
>>>Means that you can use up to 20 instructions easily as long as they
>>>do not use branches to replace it.
>>
>>
>>You are probably right, but I can see no way to get the same functionality
>>without branching.
>
>Yes, you can save 1 branch totally for free. I post mine again.
>
>int LastOne(BB M)
>{   __asm
>    {   mov EAX, dword ptr [M]
>        bsf EAX, EAX
>        jnz Done
>	mov EAX, dword ptr [M+4]
>        bsf EAX, EAX
>        add EAX, 32
>        Done:
>    }
>}
>
>Bas.

Ok, I see this.  Mine returns a -1 on failure, but there really is no reason to
do this since it never gets called on a zero BB.  Doesn't that also mean that
the last branch would be perfectly predictable anyway?

Regards,

Landon W. Rabern

>
>
>>
>>>
>>>>  while (toMap){
>>>>    toSquare=LSB(toMap);
>>>>    toMap&=notMask[toSquare];
>>>>    [add move]
>>>>  }
>>>>
>>>>A couple things, is that subtraction attack -= mask[index]; slower or faster
>>>>than anding with a notmask?
>>>>
>>>>Also,
>>>>
>>>>while ((index = findBitIndex(attack))!=-1)
>>>>{
>>>>  [add move to list]
>>>>  attack -= mask[index];
>>>>}
>>>>
>>>>you keep going until findBitIndex outputs a -1, but isn't this also when
>>>>attack==0?  So, you are doing extra instructions on the last time.
>>>>
>>>>Try this:
>>>>
>>>>while (attack)
>>>>{
>>>>  index = findBitIndex(attack);
>>>>  [add move to list]
>>>>  attack -= mask[index];//maybe a notMask and is faster like I did?
>>>>}
>>>>
>>>>
>>>>
>>>>Regards,
>>>>
>>>>Landon W. Rabern
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.