Computer Chess Club Archives


Subject: Re: BSF/R not working well for me...

Author: Eugene Nalimov

Date: 12:48:03 04/24/01


On April 24, 2001 at 15:41:19, Bas Hamstra wrote:

>On April 24, 2001 at 14:33:54, Eugene Nalimov wrote:
>
>>On April 24, 2001 at 14:26:09, Victor Zakharov wrote:
>>
>>>On April 23, 2001 at 19:30:20, Alex Boby wrote:
>>>
>>>>
>>>>I used to have this:
>>>>
>>>>------------
>>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>>  {
>>>>  int i;
>>>>
>>>>  for (i=0; i<64; i++)
>>>>    {
>>>>    if (attack&mask[i])
>>>>      [add move to list]
>>>>    }
>>>>  }
>>>>------------
>>>>and got this in the profile:
>>>>7301.351   3.9    37127.739  19.6   538488 _parseBitboard (pierre.obj)
>>>>
>>>>and then, figuring I would get a significant speed increase, I switched to this:
>>>>
>>>>-----------------
>>>>int findBitIndex(bitboard data)
>>>>  {
>>>>  int index;
>>>>
>>>>  __asm
>>>>    {
>>>>        bsr edx, dword ptr data+4
>>>>        mov eax, 32
>>>>        jnz s1
>>>>        bsr edx, dword ptr data
>>>>        mov eax, 0
>>>>        jnz s1
>>>>        mov edx, -1
>>>>    s1: add edx, eax
>>>>        mov index, edx
>>>>    }
>>>>
>>>>  return index;
>>>>  }
>>>
>>>Maybe the 2 conditional jumps hurt the speed. I tried to reproduce the same code
>>>without jumps.
>>>
>>>       xor  ecx,ecx
>>>       cmp  dword ptr data+4,0
>>>       setnz cl
>>>       mov  edx,dword ptr [data+ecx*4]
>>>       shl  ecx,5
>>>       mov  eax,-1
>>>       bsr  eax,edx
>>>       add  eax,ecx
>>>       mov  index,eax
>>>
>>>I am not sure about it: because of an Address Generation Interlock, the
>>>following pair of instructions might not be fast.
>>>
>>>       setnz cl
>>>       mov  edx,dword ptr [data+ecx*4]
>>>
>>>Also, I am not sure how dword ptr [data+ecx*4] will be processed by the compiler.
>>>
>>>
>>>The second procedure could look like
>>>
>>>       xor  ecx,ecx
>>>       cmp dword ptr data,0
>>>       setz cl
>>>       mov  edx,dword ptr [data+ecx*4]
>>>       shl  ecx,5
>>>       mov  eax,-1
>>>       bsf  eax,edx
>>>       add  eax,ecx
>>>       mov  index,eax
>>>
>>>Victor
>>
>>I'd recommend using sbb instead of xor/setz, i.e. the beginning of the 2nd
>>procedure should look something like
>>
>>       cmp  dword ptr data, 1
>>       sbb  ecx, ecx
>>       mov  edx,dword ptr [data+4+ecx*4]
>>
>>Eugene
>
>Eugene, since you are obviously the expert: what would your version look like?
>Also, I tried to do PopCnt in asm, but so far table lookup of 8 chars seems
>faster by far.
>
>Bas.

My versions are part of Crafty :-)

Eugene
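
For readers following the thread, here is a minimal portable C sketch of the idea under discussion: instead of testing all 64 squares, find the lowest set bit, record it, clear it, and repeat. It is not code from any poster and not Crafty's version. The names find_first_bit, lowest_bit32 and parse_bitboard_sketch are hypothetical, the bitboard typedef is assumed to be a 64-bit unsigned integer, and the shift loop only stands in for what a single BSF instruction would do on each 32-bit half.

```c
#include <stdint.h>

typedef uint64_t bitboard;   /* assumption: bitboard is a 64-bit unsigned type */

/* Index of the lowest set bit of a non-zero 32-bit word.
   Stand-in for BSF; the caller must not pass 0. */
static int lowest_bit32(uint32_t w)
{
    int n = 0;
    while ((w & 1u) == 0) {
        w >>= 1;
        n++;
    }
    return n;
}

/* Index (0..63) of the lowest set bit of a board, or -1 if it is empty.
   The low half is scanned first, then the high half, like the BSF variant. */
static int find_first_bit(bitboard b)
{
    uint32_t lo = (uint32_t)b;
    uint32_t hi = (uint32_t)(b >> 32);

    if (lo != 0)
        return lowest_bit32(lo);
    if (hi != 0)
        return 32 + lowest_bit32(hi);
    return -1;
}

/* Visit only the set bits of attack. Target squares are written to out[]
   (room for 64 entries assumed); the number of squares found is returned. */
static int parse_bitboard_sketch(bitboard attack, int *out)
{
    int n = 0;
    while (attack != 0) {
        out[n++] = find_first_bit(attack);
        attack &= attack - 1;   /* clear the lowest set bit */
    }
    return n;
}
```

With this shape the loop runs once per attacked square rather than 64 times, which is where any gain over the original mask[i] loop would come from; whether the bit scan itself is fast depends on the CPU.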


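The "table lookup of 8 chars" population count that Bas mentions can be sketched as follows. This is an illustration only, not his code: the names bits_in_byte, init_bits_in_byte and pop_count_table are hypothetical, and the table must be initialised once before the first count.

```c
#include <stdint.h>

typedef uint64_t bitboard;   /* assumption: bitboard is a 64-bit unsigned type */

/* bits_in_byte[v] holds the number of set bits in the byte value v. */
static unsigned char bits_in_byte[256];

/* Fill the table once at startup: the count for i is its lowest bit
   plus the (already computed) count for i >> 1. */
static void init_bits_in_byte(void)
{
    int i;
    bits_in_byte[0] = 0;
    for (i = 1; i < 256; i++)
        bits_in_byte[i] = (unsigned char)((i & 1) + bits_in_byte[i >> 1]);
}

/* Count the set bits of a board with 8 byte-wide table lookups. */
static int pop_count_table(bitboard b)
{
    int n = 0;
    int i;
    for (i = 0; i < 8; i++) {
        n += bits_in_byte[b & 0xFF];
        b >>= 8;
    }
    return n;
}
```

The cost is fixed at 8 lookups regardless of how many bits are set, and the 256-byte table is small enough that it will usually stay in cache.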

Last modified: Thu, 15 Apr 21 08:11:13 -0700
