Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Countbits() Function

Author: Eugene Nalimov

Date: 22:33:57 01/03/99

Go up one level in this thread


On January 04, 1999 at 01:25:20, James Robertson wrote:

>On January 03, 1999 at 22:23:32, Eugene Nalimov wrote:
>
>>On January 03, 1999 at 19:10:48, James Robertson wrote:
>>
>>>On January 03, 1999 at 19:06:03, Eugene Nalimov wrote:
>>>
>>>>On January 03, 1999 at 18:40:08, James Robertson wrote:
>>>>
>>>>>[snip]
>>>>>
>>>>>>Actually, you don't need BSF/BSR instructions here. You can
>>>>>>use exactly the same trick that Robert Hyatt used in Crafty
>>>>>>source - use the fact that A&(A-1) clears least significant
>>>>>>bit. In Crafty 16.3 source there is in-line assembly stuff
>>>>>>for VC++ as well as C function. Function that counts bits is
>>>>>>just 2 loops, first handles lower half of 64-bit bitboard,
>>>>>>and second handles upper half:
>>>>>>
>>>>>>        mov     ecx, low_part_of_the_bitboard
>>>>>>        xor     eax, eax
>>>>>>        test    ecx, ecx
>>>>>>        jz      end_of_loop1
>>>>>>loop1:
>>>>>>        lea     edx, [ecx-1]
>>>>>>        inc     eax
>>>>>>        and     ecx, edx
>>>>>>        jnz     loop1
>>>>>>end_of_loop1:
>>>>>>        mov     ecx,high_part_of_the_bitboard
>>>>>>        test    ecx, ecx
>>>>>>        jz      end_of_loop2
>>>>>>loop2:
>>>>>>        lea     edx, [ecx-1]
>>>>>>        inc     eax
>>>>>>        and     ecx, edx
>>>>>>        jnz     loop2
>>>>>>end_of_loop:
>>>>>>        ret
>>>>>>
>>>>>>I will not be surpised if that function will be faster even
>>>>>>on x86 processors that have fast BSF/BSR. Even if it's
>>>>>>slightly slower, it has the benefit that it's portable -
>>>>>>it will work reasonable well on *any* x86 compatible, be it
>>>>>>AMD, Cyrix, IDT, etc.
>>>>>>
>>>>>>Eugene
>>>>>
>>>>>I have several questions about x86 processors:
>>>>>Is
>>>>>    xor    eax,eax
>>>>>faster than
>>>>>    mov    eax,0
>>>>>
>>>>>Also, is
>>>>>    inc    eax
>>>>>faster than
>>>>>    add    eax,1
>>>>>?
>>>>>
>>>>>Thanks!
>>>>>
>>>>>James
>>>>
>>>>No and yes. If you just count # of ticks (or of ROPs),
>>>>they will be exactly the same. But those instructions
>>>>are shorter - xor uses 2 bytes, move uses 5 bytes; inc
>>>>is 1 byte, add is 3 bytes. Smaller programs usually
>>>>run faster - they occupy less space in cache, so
>>>>traffic to slow main memory is smaller.
>>>>
>>>>Eugene
>>>
>>>Thanks! Also, I just noticed a command I am unfamiliar with:
>>>lea
>>>What does it do?
>>>
>>>Thanks!
>>>James
>>
>>Load effective address. It evaluates address of its'
>>second operand (exactly as 'move' does), and loads that
>>address into destination register (move loads *contents*
>>of memory at that address). You often can use that
>>instruction to perform several operations at once - e.g.
>>        lea    eax, 8[ebx+ecx*8]
>>roughly equivalent (it doesn't set flags) to
>>        mov    eax, ecx
>>        shl    eax, 3
>>        add    eax, ebx
>>        add    eax, 8
>>
>>Eugene
>
>I am still confused. In this piece of code from the bit counting function:
>
>        mov     ecx, low_part_of_the_bitboard
>        xor     eax, eax
>        test    ecx, ecx
>        jz      end_of_loop1
>     loop1:
>        lea     edx, [ecx-1]
>        inc     eax
>        and     ecx, edx
>        jnz     loop1
>
>lea is used to avoid having to write:
>        mov     edx,ecx
>        sub     edx,1
>?
>(What is the opposite of inc?)
>
>James

You are right. And the opposite of inc is, of course, dec.

You can download entire set of Intel manuals from
www.intel.com. Instruction set description is not the best
I ever saw, but more or less adequate.

Eugene



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.