Author: Vincent Diepeveen
Date: 10:05:26 08/09/02
On August 08, 2002 at 11:47:35, Gerd Isenberg wrote:
This is really amazing to hear!
However, we must take into account that in SPECint, inline assembly
is forbidden AFAIK. The K7 is from before the Crafty era.
We will see what the future brings!
What I don't understand is why BSF is listed under the 'vector' instructions.
In my assembly manuals it's under something else.
What makes this a vector instruction?
>On August 08, 2002 at 08:37:48, Gian-Carlo Pascutto wrote:
>
>>On August 08, 2002 at 08:11:13, Vincent Diepeveen wrote:
>>
>>
>>>>This is very short, but in tests with my Athlon XP, it is no faster than the
>>>>version with branches and checks for zeroed bitmaps.
>>>
>>>That's of course a dumb test to do. You should try a randomly chosen
>>>bitmap each time.
>>
>>I think you misread what he wrote. He didn't test with zero bitmaps.
>>
>>The problem of the short routine is that bsf is very slow on the Athlon,
>>and it can only be handled by the vector decoder.
>>
>>In a bitboard program, the typical case is to have an empty half-bitboard, and
>>avoiding the bsf is faster even if the branch is occasionally mispredicted.
>>
>>--
>>GCP
>
>I can confirm this and was very surprised that this (introduced by Tim and
>slightly modified) was significantly faster on average ...
>
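>// precondition: bb not null
>__forceinline unsigned int BitSearchAndReset(BitBoard &bb)
>{
>  __asm
>  {
>    xor  edx, edx
>    mov  ebx, [bb]       ; ebx = address of the bitboard
>    mov  eax, edx        ; eax = 0 (bit offset of the low half)
>    inc  edx             ; edx = 1 (mask seed)
>
>    bsf  ecx, [ebx]      ; scan the low 32 bits
>    jnz  found           ; ZF clear: a bit was found
>
>    bsf  ecx, [ebx + 4]  ; low half empty: scan the high 32 bits
>    lea  ebx, [ebx + 4]  ; point at the high half
>    xor  eax, 32         ; eax = 32 (bit offset of the high half)
>  found:
>    shl  edx, cl         ; edx = 1 << bit index within the half
>    xor  eax, ecx        ; eax = bit index 0..63 (return value)
>    xor  [ebx], edx      ; clear the bit that was found
>  }
>}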
>
>... than this short one, ...
>
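>// precondition: bb not null
>__forceinline unsigned int BitSearchAndReset(BitBoard &bb)
>{
>  __asm
>  {
>    mov  edx, [bb]       ; edx = address of the bitboard
>    bsf  eax, [edx+4]    ; scan the high 32 bits
>    xor  eax, 32         ; add the high-half offset
>    bsf  eax, [edx]      ; scan the low 32 bits; relies on eax being kept if they are zero
>    btr  [edx],eax       ; clear bit eax (0..63) in memory
>  }
>}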
>
>... even though the first one uses four registers instead of two. Three
>vector-path instructions in a row seem to be real performance killers on the
>Athlon.
>
>Gerd
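
For readers without an MSVC/x86 toolchain, the control flow of the two quoted routines can be sketched in portable C++. This is only an illustration of the logic, not a performance reproduction: `bsfModel`, `searchAndResetBranch`, `searchAndResetShort` and `bothAgree` are hypothetical names, `BitBoard` is assumed to be a 64-bit unsigned integer, and the BSF model assumes the destination is left untouched when the source is zero, which is what the short routine appears to rely on.

```cpp
#include <cstdint>

typedef std::uint64_t BitBoard;  // assumption: a bitboard is 64 bits wide

// Model of BSF (hypothetical helper): writes the index of the lowest set
// bit to dest and returns true; leaves dest unchanged when src is zero.
static bool bsfModel(unsigned &dest, std::uint32_t src)
{
    if (src == 0) return false;
    unsigned i = 0;
    while ((src & 1u) == 0) { src >>= 1; ++i; }
    dest = i;
    return true;
}

// Branching variant: the high half is scanned only if the low half is empty.
// precondition: bb not null
static unsigned searchAndResetBranch(BitBoard &bb)
{
    unsigned idx = 0, base = 0;
    if (!bsfModel(idx, static_cast<std::uint32_t>(bb))) {
        bsfModel(idx, static_cast<std::uint32_t>(bb >> 32));
        base = 32;
    }
    idx ^= base;                 // idx < 32 here, so xor acts as add
    bb ^= BitBoard(1) << idx;    // the bit is known to be set: xor clears it
    return idx;
}

// Short variant: both halves are always scanned, low half last so it wins.
// precondition: bb not null
static unsigned searchAndResetShort(BitBoard &bb)
{
    unsigned idx = 0;
    bsfModel(idx, static_cast<std::uint32_t>(bb >> 32));
    idx ^= 32;
    bsfModel(idx, static_cast<std::uint32_t>(bb));  // overrides if low half non-empty
    bb &= ~(BitBoard(1) << idx);                    // models btr
    return idx;
}

// Checks that both variants return the same index and leave the same board.
static bool bothAgree(BitBoard bb)
{
    BitBoard a = bb, b = bb;
    unsigned ia = searchAndResetBranch(a);
    unsigned ib = searchAndResetShort(b);
    return ia == ib && a == b;
}
```

Calling `bothAgree` on a set of non-zero bitboards is a quick sanity check that the two variants are functionally equivalent, so any timing difference between the assembly versions comes from how the instructions decode and execute rather than from what they compute.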