Author: Tim Foden
Date: 04:32:12 08/08/02
Go up one level in this thread
On August 08, 2002 at 04:49:43, Russell Reagan wrote:
>First question...
>
>In Crafty, there is a function FirstOne() which will return the least
>significant bit set in a bitboard.
<ramble>
If you look carefully you will see that the code contains bsr's and checks the
high dword first, so Crafty's FirstOne() will return the index of the most
significant set bit. Useful to know if you are trying to use it in your
program! :)
</ramble>
>FORCEINLINE int FirstOne(BITBOARD a) {
> __asm {
> bsr edx, dword ptr a+4
> mov eax, 31
> jnz l1
> bsr edx, dword ptr a
> mov eax, 63
> jnz l1
> mov edx, -1
> l1: sub eax, edx
> }
>}
>
>Now I've always heard that you should pass by pointer or reference if you are
>dealing with a non-standard type, which I assume a 64-bit int would be on a
>32-bit machine.
I would say that this is in general a good rule (using const ref), but most of
the (32bit) compilers used can deal with 64 bit values in registers (by using
pairs of registers), so passing by value here (and inlining if possible) is
probably OK. (IMO anyway)
>Is there any speed hit from passing by value in this function?
I wouldn't be much, if any.
>Or is the reason it works due to inlining the function?
Inlining may remove any speed penalty, if the values end up being passed in
registers.
>
>Second question...
>
>I have seen different ASM used in different engines to accomplish this same task
>of determining the least significant set bit. This seems to be the shortest, but
>I don't see it in any engines.
>
>bsf eax, dword ptr a+4
>add eax, 32
>bsf eax, dword ptr a
This is very short, but in tests with my Athlon XP, it is no faster than the
version with branches and checks for zeroed bitmaps.
>The closest I see is Beowulf with:
>
>mov ecx,dword ptr [bits+4]
>mov ebx,dword ptr [bits]
>bsf eax,ecx
>add eax,32
>bsf eax,ebx
>
>Is the "problem" with this that you have to assume that the bitboard isn't equal
>to zero?
Not really. It's just that bsf can be quite slow, and here it is always called
twice. So you can save some time on an Athlon by branching, and only calling it
once.
In Green Light I have tested a number of routines. I generally use a
FindBitAndRemoveIt type of routine, as in the vast majority of cases you not
only want to find a bit, you also want to remove it. Here are the best two
routines I have found for this so far:
inline Square BitScanForwardRemove( UINT64& bitmap )
{
__asm
{
mov ebx, [bitmap]
mov edx, 1
bsf ecx, [ebx]
mov eax, 0
jnz found
bsf ecx, [ebx + 4]
mov eax, 32
lea ebx, [ebx + 4]
jnz found
mov eax, XX
jmp done
found:
shl edx, cl
add eax, ecx
xor [ebx], edx
done:
}
// value returned in eax
}
or (if you prefer one with no branches, and no test for a zero bitmap):
inline Square BitScanForwardRemove( UINT64& bitmap )
{
ASSERT( bitmap );
__asm
{
mov ebx, [bitmap]
mov eax, 0
cmp dword ptr [ebx], eax
mov edx, 1
setz al
bsf ecx, [ebx + eax * 4]
lea ebx, [ebx + eax * 4]
shl eax, 5
shl edx, cl
add eax, ecx
xor [ebx], edx
}
// value returned in eax
}
>If so, why is that a problem? Where do you need to use FirstOne() where you
>don't know if there are any bits set or not? It seems logical to me that the
>since you don't need to verify that the bitboard is not null most of the time,
>it is redundant and only wastes time. For example, if your FirstOne() function
>tests to make sure that the bitboard isn't null, you are doing extra work in the
>case of generating legal moves:
Except that the branch that runs if the bitmap is zero will almost always be
predicted correctly on modern processors. Again, in tests on my Athlon, I found
my first routine above to result in the same speed whether it checked for the
zero case or not. So I don't think it is a time issue. It uses some more
bytes, but you could of course have 2 versions of the routine, one that checks
and one that doesn't.
Cheers, Tim.
>while (whiteKnights) { // Make sure whiteKnights isn't null
> int index = FirstOne(whiteKnights); // Make sure _AGAIN_
> DoKnightStuff(index);
> Clear(whiteKnights,index);
>}
>
>Thanks,
>Russell
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.