Computer Chess Club Archives


Subject: Re: BSF/R not working well for me...

Author: Bas Hamstra

Date: 12:41:19 04/24/01


On April 24, 2001 at 14:33:54, Eugene Nalimov wrote:

>On April 24, 2001 at 14:26:09, Victor Zakharov wrote:
>
>>On April 23, 2001 at 19:30:20, Alex Boby wrote:
>>
>>>
>>>I used to have this:
>>>
>>>------------
>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>  {
>>>  int i;
>>>
>>>  for (i=0; i<64; i++)
>>>    {
>>>    if (attack&mask[i])
>>>      [add move to list]
>>>    }
>>>  }
>>>------------
>>>and got this in the profile:
>>>7301.351   3.9    37127.739  19.6   538488 _parseBitboard (pierre.obj)
>>>
>>>and then, figuring I would get a significant speed increase, I switched to this:
>>>
>>>-----------------
>>>int findBitIndex(bitboard data)
>>>  {
>>>  int index;
>>>
>>>  __asm
>>>    {
>>>        bsr edx, dword ptr data+4
>>>        mov eax, 32
>>>        jnz s1
>>>        bsr edx, dword ptr data
>>>        mov eax, 0
>>>        jnz s1
>>>        mov edx, -1
>>>    s1: add edx, eax
>>>        mov index, edx
>>>    }
>>>
>>>  return index;
>>>  }
>>
>>Maybe the two conditional jumps (jnz) hurt the speed. I tried to write the same
>>code without jumps.
>>
>>       xor  ecx,ecx
>>       cmp  dword ptr data+4,0
>>       setnz cl
>>       mov  edx,dword ptr [data+ecx*4]
>>       shl  ecx,5
>>       mov  eax,-1
>>       bsr  eax,edx
>>       add  eax,ecx
>>       mov  index,eax
>>
>>I am not sure about it: because of an Address Generation Interlock (AGI), the
>>following pair of instructions may not be fast.
>>
>>       setnz cl
>>       mov  edx,dword ptr [data+ecx*4]
>>
>>Also, I am not sure how the compiler will process dword ptr [data+ecx*4].
>>
>>
>>The second procedure could look like
>>
>>       xor  ecx,ecx
>>       cmp dword ptr data,0
>>       setz cl
>>       mov  edx,dword ptr [data+ecx*4]
>>       shl  ecx,5
>>       mov  eax,-1
>>       bsf  eax,edx
>>       add  eax,ecx
>>       mov  index,eax
>>
>>Victor
>
>I'd recommend using sbb instead of xor/setz, i.e. the beginning of the 2nd
>procedure should look something like
>
>       cmp  dword ptr data, 1
>       sbb  ecx, ecx
>       mov  edx,dword ptr [data+4+ecx*4]
>
>Eugene
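
For comparison, the quoted routines can be sketched in portable C. This is only an illustration of the idea (test the high 32-bit half first, then the low half, and visit only the set bits rather than all 64 squares), not anyone's actual engine code. The names bsr32, find_bit_index and collect_squares are hypothetical, and the shift loop is a slow stand-in for the BSR instruction. It returns -1 for a zero word explicitly, because BSR leaves its destination undefined when the source is zero.

```c
#include <stdint.h>

/* Portable stand-in for BSR on one 32-bit word: index of the highest set
   bit, or -1 when the word is zero. (Hypothetical helper, not a real API.) */
int bsr32(uint32_t w)
{
    int index = -1;
    while (w != 0) {
        w >>= 1;
        index++;
    }
    return index;
}

/* Same structure as the quoted asm: test the high half first, fall back
   to the low half, return -1 for an empty bitboard. */
int find_bit_index(uint64_t data)
{
    uint32_t high = (uint32_t)(data >> 32);
    uint32_t low  = (uint32_t)data;

    if (high != 0)
        return 32 + bsr32(high);
    return bsr32(low);              /* -1 when both halves are zero */
}

/* Visit only the set bits instead of testing all 64 squares.
   Stores the square indices, highest first, and returns how many were found. */
int collect_squares(uint64_t attack, int squares[64])
{
    int count = 0;
    while (attack != 0) {
        int sq = find_bit_index(attack);
        squares[count++] = sq;
        attack &= ~((uint64_t)1 << sq);   /* clear the bit just reported */
    }
    return count;
}
```

With this shape the loop runs once per attacked square, so a bitboard with three bits set costs three iterations instead of 64 mask tests.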

Eugene, since you are obviously the expert: what would your version look like?
Also, I tried to do PopCnt in asm, but so far a table lookup over the 8 chars
(bytes) of the bitboard seems faster by far.

Bas.
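
The table-lookup PopCnt mentioned above could look roughly like the following. This is a minimal sketch under assumptions: the names bits_in_byte, init_popcount_table and popcount_table are made up for illustration, and the table is filled at startup rather than written out as a constant.

```c
#include <stdint.h>

/* bits_in_byte[i] holds the number of set bits in the byte value i. */
static unsigned char bits_in_byte[256];

/* Fill the table once at startup. Each entry reuses the already computed
   entry for i >> 1 and adds the lowest bit of i. */
void init_popcount_table(void)
{
    bits_in_byte[0] = 0;
    for (int i = 1; i < 256; i++)
        bits_in_byte[i] = (unsigned char)((i & 1) + bits_in_byte[i >> 1]);
}

/* Count the set bits of a 64-bit bitboard with 8 byte-sized lookups. */
int popcount_table(uint64_t b)
{
    int count = 0;
    for (int i = 0; i < 8; i++) {
        count += bits_in_byte[b & 0xFF];
        b >>= 8;
    }
    return count;
}
```

init_popcount_table() has to be called once before the first popcount_table() call. Whether this beats an asm version depends on the CPU and on how well the 256-byte table stays in cache.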










Last modified: Thu, 15 Apr 21 08:11:13 -0700
