Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Saved Another Cycle -- Woohoo!

Author: Robert Hyatt

Date: 19:52:06 01/05/03

Go up one level in this thread


On January 05, 2003 at 17:06:56, Matt Taylor wrote:

>>I was chugging away on the older binary search routine yesterday. I've modified
>>it to use a 5th register. This allows me to stagger the accumulation and make
>>use of extra cycles that were burning waiting on instructions to retire. It is
>>amazingly efficient -- considering the maximum of 3 instructions/cycle, there
>>are only 7 slots wasted across the entire routine (38 instructions). This means
>>optimum execution time for this sequence is 13 cycles, and I'm doing it in 15.
>
>Oops, did I say 15? I should correct myself. I am now doing it in 14. Only 4
>slots not utilized. I fixed a bug and gained performance. :-)
>
>-Matt
\

If you like cute code, here is Eugene's FirstOne() function for
Crafty.  It is cute, but has one drawback in that it can not be inlined
due to the addressing computations he does which means that the value being
checked for FirstOne() has to be in memory...

        cmpl    $1, 8(%esp)
        sbbl    %eax, %eax
        movl    8(%esp,%eax,4), %edx
        bsr     %edx, %ecx
        jz      l4
        andl    $32, %eax
        subl    $31, %ecx
        subl    %ecx, %eax
        ret
l4:     movl    $64, %eax
        ret


I think you will be amused if you work thru it.  I should add that Crafty's
"bit numbering" has the MSB (leftmost) as bit 0, LSB (rightmost) as bit 63,
which is how the Cray does this due to the fact that they have a "leadz"
instruction that counts leading (leftmost) zeroes rather than finding the
first one bit.  The Cray also returns 64 if there are no bits set.  Eugene's
code emulates this perfectly.

I set about trying to turn this into something inlineable, and I also wanted
to do it with _zero_ jumps, just for fun.  Here is what I came up with:

        movl    $-1, %ecx         ;magic number, becomes 64 if no bits on
        bsr     4(%esp), %eax     ;check right end first
        cmovz   %ecx, %eax        ;keep bit number or magic number
        subl    $32, %eax         ;adjust to right hand range
        bsr     8(%esp), %ecx     ;check left end next
        cmovz   %eax, %ecx        ;keep count if non-zero, otherwise use other
        addl    $32,%ecx          ;adjust so subtract below works
        movl    $63, %eax         ;reverse numbers to 0-64 (0-63 + nobit N)
        subl    %ecx, %eax        ;by subtracting from 63
        ret

If you walk thru that, you will see that there are three possible things
to do.  If both words are zero, I need 64 as the answer.  If the leftmost
32 bits are non-zero, I need the index to the first one bit there, otherwise
I need the index to the first non-zero bit n the right 32 bits.

My approach is to test the rightmost 32 bits first, and either get the bit
number that is set, or a special constant (more in a second).  I then test
the leftmost 32 bits, and if one of those is set, I use that number and
adjust it to be 63-N to match my numbering.

The above code tests the right 32 bits and produces a constant that when
subtracted from 63 either returns 64 (the -1 at the top) or a number between
32 and 63.  It then tests the left end and if nothing is set, it uses the
previous result and adjusts is.  If one is set in the left end, I use that
value instead and still adjust it with 63-N.

Goes to show that cmov solves a lot of jmp-able issues with some thought.

Note that I have not tried to make this "fast".  I just wanted to make it
work in a way that is pretty understandable...  And it is inlineable because
it doesn't depend on the address subtraction trick Eugene does.

The main problem (as always) with X86 is that there are 4 GP registers, where
other processors have a bunch (31 on the SPARC, for instance).

I did the same for LastOne() as well, but am not posting it unless someone
wants to see it.  Same basic idea, just scanned from the opposite end.



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.