Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Countbits() Function

Author: James Robertson

Date: 22:25:20 01/03/99

Go up one level in this thread


On January 03, 1999 at 22:23:32, Eugene Nalimov wrote:

>On January 03, 1999 at 19:10:48, James Robertson wrote:
>
>>On January 03, 1999 at 19:06:03, Eugene Nalimov wrote:
>>
>>>On January 03, 1999 at 18:40:08, James Robertson wrote:
>>>
>>>>[snip]
>>>>
>>>>>Actually, you don't need BSF/BSR instructions here. You can
>>>>>use exactly the same trick that Robert Hyatt used in Crafty
>>>>>source - use the fact that A&(A-1) clears least significant
>>>>>bit. In Crafty 16.3 source there is in-line assembly stuff
>>>>>for VC++ as well as C function. Function that counts bits is
>>>>>just 2 loops, first handles lower half of 64-bit bitboard,
>>>>>and second handles upper half:
>>>>>
>>>>>        mov     ecx, low_part_of_the_bitboard
>>>>>        xor     eax, eax
>>>>>        test    ecx, ecx
>>>>>        jz      end_of_loop1
>>>>>loop1:
>>>>>        lea     edx, [ecx-1]
>>>>>        inc     eax
>>>>>        and     ecx, edx
>>>>>        jnz     loop1
>>>>>end_of_loop1:
>>>>>        mov     ecx,high_part_of_the_bitboard
>>>>>        test    ecx, ecx
>>>>>        jz      end_of_loop2
>>>>>loop2:
>>>>>        lea     edx, [ecx-1]
>>>>>        inc     eax
>>>>>        and     ecx, edx
>>>>>        jnz     loop2
>>>>>end_of_loop:
>>>>>        ret
>>>>>
>>>>>I will not be surpised if that function will be faster even
>>>>>on x86 processors that have fast BSF/BSR. Even if it's
>>>>>slightly slower, it has the benefit that it's portable -
>>>>>it will work reasonable well on *any* x86 compatible, be it
>>>>>AMD, Cyrix, IDT, etc.
>>>>>
>>>>>Eugene
>>>>
>>>>I have several questions about x86 processors:
>>>>Is
>>>>    xor    eax,eax
>>>>faster than
>>>>    mov    eax,0
>>>>
>>>>Also, is
>>>>    inc    eax
>>>>faster than
>>>>    add    eax,1
>>>>?
>>>>
>>>>Thanks!
>>>>
>>>>James
>>>
>>>No and yes. If you just count # of ticks (or of ROPs),
>>>they will be exactly the same. But those instructions
>>>are shorter - xor uses 2 bytes, move uses 5 bytes; inc
>>>is 1 byte, add is 3 bytes. Smaller programs usually
>>>run faster - they occupy less space in cache, so
>>>traffic to slow main memory is smaller.
>>>
>>>Eugene
>>
>>Thanks! Also, I just noticed a command I am unfamiliar with:
>>lea
>>What does it do?
>>
>>Thanks!
>>James
>
>Load effective address. It evaluates address of its'
>second operand (exactly as 'move' does), and loads that
>address into destination register (move loads *contents*
>of memory at that address). You often can use that
>instruction to perform several operations at once - e.g.
>        lea    eax, 8[ebx+ecx*8]
>roughly equivalent (it doesn't set flags) to
>        mov    eax, ecx
>        shl    eax, 3
>        add    eax, ebx
>        add    eax, 8
>
>Eugene

I am still confused. In this piece of code from the bit counting function:

        mov     ecx, low_part_of_the_bitboard
        xor     eax, eax
        test    ecx, ecx
        jz      end_of_loop1
     loop1:
        lea     edx, [ecx-1]
        inc     eax
        and     ecx, edx
        jnz     loop1

lea is used to avoid having to write:
        mov     edx,ecx
        sub     edx,1
?
(What is the opposite of inc?)

James




This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.