Author: Bas Hamstra
Date: 12:41:19 04/24/01
On April 24, 2001 at 14:33:54, Eugene Nalimov wrote:
>On April 24, 2001 at 14:26:09, Victor Zakharov wrote:
>
>>On April 23, 2001 at 19:30:20, Alex Boby wrote:
>>
>>>
>>>I used to have this:
>>>
>>>------------
>>>void parseBitboard (int from, struct MoveList *ml, bitboard attack)
>>>{
>>>    int i;
>>>
>>>    for (i=0; i<64; i++)
>>>    {
>>>        if (attack&mask[i])
>>>            [add move to list]
>>>    }
>>>}
>>>------------
>>>and got this in the profile:
>>>7301.351 3.9 37127.739 19.6 538488 _parseBitboard (pierre.obj)
>>>
>>>and then, figuring I would get a significant speed increase, I switched to this:
>>>
>>>-----------------
>>>int findBitIndex(bitboard data)
>>>{
>>>    int index;
>>>
>>>    __asm
>>>    {
>>>        bsr edx, dword ptr data+4
>>>        mov eax, 32
>>>        jnz s1
>>>        bsr edx, dword ptr data
>>>        mov eax, 0
>>>        jnz s1
>>>        mov edx, -1
>>>    s1: add edx, eax
>>>        mov index, edx
>>>    }
>>>
>>>    return index;
>>>}
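
For reference, a rough portable C sketch of what the asm routine above computes (index of the highest set bit, or -1 for an empty board) and of how such a routine could replace the 64-iteration loop. The names find_bit_index_portable and collect_squares are made up for illustration, bitboard is assumed to be a 64-bit unsigned integer, and the real move-list code is not shown in the thread, so the squares are just written to an array here.

```c
#include <stdint.h>

typedef uint64_t bitboard;  /* assumption: the original typedef is 64-bit unsigned */

/* Index of the highest set bit (0..63), or -1 if no bit is set.
   Checks the high 32-bit half first, like the asm version. */
static int find_bit_index_portable(bitboard data)
{
    uint32_t half = (uint32_t)(data >> 32);
    int base = 32;
    int index = -1;

    if (half == 0) {
        half = (uint32_t)data;
        base = 0;
        if (half == 0)
            return -1;
    }
    /* Plain shift loop standing in for bsr. */
    while (half) {
        half >>= 1;
        index++;
    }
    return base + index;
}

/* Visit only the set bits instead of testing all 64 masks.
   Returns the number of squares written (highest square first). */
static int collect_squares(bitboard attack, int squares[64])
{
    int n = 0;

    while (attack) {
        int sq = find_bit_index_portable(attack);
        squares[n++] = sq;
        attack &= ~((bitboard)1 << sq);  /* clear the bit just reported */
    }
    return n;
}
```

The loop body runs once per set bit, so a sparse attack board costs a few iterations rather than 64. Whether that wins overall depends on how expensive each bit scan is on the target CPU.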
>>
>>Maybe the two jumps hurt the speed. I tried to write the same code without
>>jumps.
>>
>> xor ecx,ecx
>> cmp dword ptr data+4,0
>> setnz cl
>> mov edx,dword ptr [data+ecx*4]
>> shl ecx,5
>> mov eax,-1
>> bsr eax,edx
>> add eax,ecx
>> mov index,eax
>>
>>I am not sure about it: because of an Address Generation Interlock, the
>>following pair of instructions might not be fast.
>>
>> setnz cl
>> mov edx,dword ptr [data+ecx*4]
>>
>>Also, I am not sure how the compiler will process dword ptr [data+ecx*4].
>>
>>
>>The second procedure (the bsf version) could look like:
>>
>> xor ecx,ecx
>> cmp dword ptr data,0
>> setz cl
>> mov edx,dword ptr [data+ecx*4]
>> shl ecx,5
>> mov eax,-1
>> bsf eax,edx
>> add eax,ecx
>> mov index,eax
>>
>>Victor
>
>I'd recommend using sbb instead of xor/setz, i.e. the beginning of the 2nd
>procedure should look something like
>
> cmp dword ptr data, 1
> sbb ecx, ecx
> mov edx,dword ptr [data+4+ecx*4]
>
>Eugene
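
To make the half-selection idea in the quoted snippets easier to follow, here is a small C sketch of the same logic for the lowest set bit. It uses the 0/1 selector of the setz version; with sbb the selector is 0 or -1 instead, so the address offset and the final add would have to be adjusted to match. The function names are invented for this sketch, and the shift loop is only a stand-in for bsf.

```c
#include <stdint.h>

/* Index of the lowest set bit of a nonzero 32-bit word (stand-in for bsf). */
static int lowest_bit32(uint32_t w)
{
    int i = 0;

    while ((w & 1u) == 0) {
        w >>= 1;
        i++;
    }
    return i;
}

/* Lowest set bit of a 64-bit board (0..63), or -1 if the board is empty.
   The half to scan is picked by indexing, not by a conditional jump. */
static int first_bit_sketch(uint64_t data)
{
    uint32_t half[2];
    unsigned use_high;
    uint32_t w;

    half[0] = (uint32_t)data;          /* low dword:  data   */
    half[1] = (uint32_t)(data >> 32);  /* high dword: data+4 */

    use_high = (half[0] == 0);         /* role of: cmp data,0 / setz cl   */
    w = half[use_high];                /* role of: mov edx,[data+ecx*4]   */

    if (w == 0)
        return -1;                     /* empty board; the asm relies on eax = -1 */

    return lowest_bit32(w) + (int)(use_high << 5);  /* shl ecx,5 / add eax,ecx */
}
```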
Eugene, since you are obviously the expert: what would your version look like?
Also, I tried to do PopCnt in asm, but so far a table lookup of 8 chars seems
faster by far.
Bas.
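
For readers unfamiliar with the table approach mentioned above, a minimal sketch of a popcount done as eight byte lookups. This is an assumption about the general technique, not the code from any particular engine; bits_in_byte, init_bits_in_byte and popcnt_table are hypothetical names.

```c
#include <stdint.h>

/* bits_in_byte[i] = number of set bits in the byte value i.
   Statically zero-initialised, so entry 0 is already correct. */
static unsigned char bits_in_byte[256];

/* Fill the table once at startup: count(i) = count(i / 2) + lowest bit of i. */
static void init_bits_in_byte(void)
{
    int i;

    for (i = 1; i < 256; i++)
        bits_in_byte[i] = (unsigned char)(bits_in_byte[i >> 1] + (i & 1));
}

/* Population count of a 64-bit board as the sum of eight byte lookups. */
static int popcnt_table(uint64_t b)
{
    return bits_in_byte[b & 0xFF]
         + bits_in_byte[(b >> 8) & 0xFF]
         + bits_in_byte[(b >> 16) & 0xFF]
         + bits_in_byte[(b >> 24) & 0xFF]
         + bits_in_byte[(b >> 32) & 0xFF]
         + bits_in_byte[(b >> 40) & 0xFF]
         + bits_in_byte[(b >> 48) & 0xFF]
         + bits_in_byte[(b >> 56) & 0xFF];
}
```

init_bits_in_byte() has to be called once before the first popcnt_table() call. The table is only 256 bytes, so it tends to stay in cache.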
Last modified: Thu, 15 Apr 21 08:11:13 -0700