Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: BSF/R not working well for me...

Author: Bas Hamstra

Date: 11:36:55 04/25/01

Go up one level in this thread


On April 24, 2001 at 21:14:32, Landon Rabern wrote:

>On April 24, 2001 at 08:44:15, Vincent Diepeveen wrote:
>
>>On April 23, 2001 at 22:19:33, Landon Rabern wrote:
>>
>>>On April 23, 2001 at 19:30:20, Alex Boby wrote:
>>>
>>>>
>>>>I used to have this:
>>>>
>>>>------------
>>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>>  {
>>>>  int i;
>>>>
>>>>  for (i=0; i<64; i++)
>>>>    {
>>>>    if (attack&mask[i])
>>>>      [add move to list]
>>>>    }
>>>>  }
>>>>------------
>>>>and got this in the profile:
>>>>7301.351   3.9    37127.739  19.6   538488 _parseBitboard (pierre.obj)
>>>>
>>>>and then, figuring I would get a significant speed increase, I switched to this:
>>>>
>>>>-----------------
>>>>int findBitIndex(bitboard data)
>>>>  {
>>>>  int index;
>>>>
>>>>  __asm
>>>>    {
>>>>        bsr edx, dword ptr data+4
>>>>        mov eax, 32
>>>>        jnz s1
>>>>        bsr edx, dword ptr data
>>>>        mov eax, 0
>>>>        jnz s1
>>>>        mov edx, -1
>>>>    s1:	add edx, eax
>>>>        mov index, edx
>>>>    }
>>>>
>>>>  return index;
>>>>  }
>>>>
>>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>>  {
>>>>  int index;
>>>>
>>>>  while ((index = findBitIndex(attack))!=-1)
>>>>    {
>>>>      [add move to list]
>>>>      attack -= mask[index];
>>>>    }
>>>>  }
>>>>-------------
>>>>and then got this in the profile:
>>>>    6763.331   4.4    32424.707  21.1   530420 _parseBitboard (pierre.obj)
>>>>    1313.554   0.9     1313.554   0.9  3523746 _findBitIndex (pierre.obj)
>>>>
>>>>with about a 10% drop in nodes/sec.
>>>>
>>>>I thought that BSF & BSR were supposed to be fast! What am I doing wrong?
>>>>This is on an Intel P3/500 w/ win2k.
>>>
>>>This is what I do and I get about a 10% speed increase over my table lookup
>>>version.  Im also running win2k on a P3/500.
>>>
>>>__forceinline int LSB(bitBoard n){
>>>	__asm {
>>>	      bsf     edx, dword ptr n
>>>        mov     eax, 0
>>>        jnz     l1
>>>        bsf     edx, dword ptr n+4
>>>        mov     eax, 32
>>>        jnz     l1
>>>        mov     edx, -33
>>>  l1:   add     eax, edx
>>>  }
>>>}
>>
>>If you get rid of those 2 jnz instructions
>>then you can add a load of instructions to replace those
>>2 possible branches and get a lot faster.
>>
>>If misprediction occurs it will be at least 10 clocks penalty on P3,
>>even more on K7 and 20 clocks at least at P4.
>>
>>Means that you can use up to 20 instructions easily as long as they
>>do not use branches to replace it.
>
>
>You are probably right, but I can see no way to get the same functionality
>without branching.

Yes, you can save 1 branch totally for free. I post mine again.

int LastOne(BB M)
{   __asm
    {   mov EAX, dword ptr [M]
        bsf EAX, EAX
        jnz Done
	mov EAX, dword ptr [M+4]
        bsf EAX, EAX
        add EAX, 32
        Done:
    }
}

Bas.


>
>>
>>>  while (toMap){
>>>    toSquare=LSB(toMap);
>>>    toMap&=notMask[toSquare];
>>>    [add move]
>>>  }
>>>
>>>A couple things, is that subtraction attack -= mask[index]; slower or faster
>>>than anding with a notmask?
>>>
>>>Also,
>>>
>>>while ((index = findBitIndex(attack))!=-1)
>>>{
>>>  [add move to list]
>>>  attack -= mask[index];
>>>}
>>>
>>>you keep going until findBitIndex outputs a -1, but isn't this also when
>>>attack==0?  So, you are doing extra instructions on the last time.
>>>
>>>Try this:
>>>
>>>while (attack)
>>>{
>>>  index = findBitIndex(attack);
>>>  [add move to list]
>>>  attack -= mask[index];//maybe a notMask and is faster like I did?
>>>}
>>>
>>>
>>>
>>>Regards,
>>>
>>>Landon W. Rabern



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.