Author: Eugene Nalimov
Date: 12:48:03 04/24/01
On April 24, 2001 at 15:41:19, Bas Hamstra wrote:
>On April 24, 2001 at 14:33:54, Eugene Nalimov wrote:
>
>>On April 24, 2001 at 14:26:09, Victor Zakharov wrote:
>>
>>>On April 23, 2001 at 19:30:20, Alex Boby wrote:
>>>
>>>>
>>>>I used to have this:
>>>>
>>>>------------
>>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>>{
>>>>    int i;
>>>>
>>>>    for (i=0; i<64; i++)
>>>>    {
>>>>        if (attack&mask[i])
>>>>            [add move to list]
>>>>    }
>>>>}
>>>>------------
>>>>and got this in the profile:
>>>>7301.351 3.9 37127.739 19.6 538488 _parseBitboard (pierre.obj)
>>>>
>>>>and then, figuring I would get a significant speed increase, I switched to this:
>>>>
>>>>-----------------
>>>>int findBitIndex(bitboard data)
>>>>{
>>>>    int index;
>>>>
>>>>    __asm
>>>>    {
>>>>        bsr edx, dword ptr data+4
>>>>        mov eax, 32
>>>>        jnz s1
>>>>        bsr edx, dword ptr data
>>>>        mov eax, 0
>>>>        jnz s1
>>>>        mov edx, -1
>>>>    s1: add edx, eax
>>>>        mov index, edx
>>>>    }
>>>>
>>>>    return index;
>>>>}
>>>
>>>Maybe the two conditional jumps (jnz) hurt the speed. I tried to write the same
>>>code without jumps.
>>>
>>> xor ecx,ecx
>>> cmp dword ptr data+4,0
>>> setnz cl
>>> mov edx,dword ptr [data+ecx*4]
>>> shl ecx,5
>>> mov eax,-1
>>> bsr eax,edx
>>> add eax,ecx
>>> mov index,eax
>>>
>>>I am not sure about it: because of an Address Generation Interlock, the
>>>following pair of instructions might be slow.
>>>
>>> setnz cl
>>> mov edx,dword ptr [data+ecx*4]
>>>
>>>Also, I am not sure how the compiler will process dword ptr [data+ecx*4].
>>>
>>>
>>>The second procedure (the bsf version) could look like:
>>>
>>> xor ecx,ecx
>>> cmp dword ptr data,0
>>> setz cl
>>> mov edx,dword ptr [data+ecx*4]
>>> shl ecx,5
>>> mov eax,-1
>>> bsf eax,edx
>>> add eax,ecx
>>> mov index,eax
>>>
>>>Victor
>>
>>I'd recommend using sbb instead of xor/setz, i.e. the beginning of the 2nd
>>procedure should look something like
>>
>> cmp dword ptr data, 1
>> sbb ecx, ecx
>> mov edx,dword ptr [data+4+ecx*4]
>>
>>Eugene
>
>Eugene, since you are obviously the expert: what would your version look like?
>Also, I tried to do PopCnt in asm, but so far a table lookup of 8 chars seems
>faster by far.
>
>Bas.
My versions are part of Crafty :-)
Eugene
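
For readers without an inline assembler, the branch-free scan discussed in the quoted posts can be sketched in portable C. This is only an illustration under stated assumptions, not the Crafty code: bsf32, first_bit and collect_targets are hypothetical names, and uint64_t stands in for the thread's bitboard type. use_high plays the role of the setz cl result in Victor's second procedure. Read literally, the cmp/sbb fragment leaves ecx = -1 when the low dword is zero, so [data+4+ecx*4] would address the low dword in exactly that case; the sketch picks the high dword there, as the xor/setz version does.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit of a non-zero 32-bit word.
   Software stand-in for the BSF instruction (hypothetical helper). */
static int bsf32(uint32_t x)
{
    int n = 0;
    assert(x != 0);             /* BSF leaves its result undefined for 0 */
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Index (0..63) of the lowest set bit of a bitboard, or -1 if it is empty.
   The half to scan is chosen by indexing, not by a branch on each half. */
static int first_bit(uint64_t bb)
{
    uint32_t half[2];
    unsigned use_high;

    if (bb == 0)
        return -1;

    half[0] = (uint32_t)bb;           /* low dword,  "data"   */
    half[1] = (uint32_t)(bb >> 32);   /* high dword, "data+4" */

    use_high = (half[0] == 0);        /* 1 only when the low dword is empty */
    return bsf32(half[use_high]) + (int)(use_high << 5);   /* +32 for high */
}

/* Write the square index of every set bit of `attack` into `out`
   and return how many were found. `out` must hold 64 entries. */
static int collect_targets(uint64_t attack, int out[64])
{
    int n = 0;
    while (attack != 0) {
        out[n++] = first_bit(attack);
        attack &= attack - 1;         /* clear the lowest set bit */
    }
    return n;
}
```

Clearing the lowest bit with attack &= attack - 1 makes the loop run once per set bit instead of 64 times, which is the saving the original poster was hoping for when he replaced the mask[i] loop.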
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.