Computer Chess Club Archives

Search

Terms

Messages

Subject: Re: Possible small improvement to hacky method

Author: Matt Taylor

Date: 12:57:13 12/10/02

On December 06, 2002 at 00:49:11, Walter Faxon wrote:

>On December 02, 2002 at 20:27:39, Gerd Isenberg wrote:
>
>>Ahh.. i see,
>>
>>a special kind of popcount for consecutive trailing "ones".
>>I love this interactive programming lessions.
>>
>>Gerd
>>
>>Btw.: Impressive assembler listing:
>>
>>00401003 8B 4C 24 0C          mov         ecx,dword ptr [esp+0Ch]
>>00401007 8B 31                mov         esi,dword ptr [ecx]
>>00401009 8B 79 04             mov         edi,dword ptr [ecx+4]
>>0040100C 8B D6                mov         edx,esi
>>0040100E 8B C7                mov         eax,edi
>>00401010 83 C2 FF             add         edx,0FFFFFFFFh
>>00401013 83 D0 FF             adc         eax,0FFFFFFFFh
>>00401016 23 F2                and         esi,edx
>>00401018 23 F8                and         edi,eax
>>0040101A 89 31                mov         dword ptr [ecx],esi
>>0040101C 89 79 04             mov         dword ptr [ecx+4],edi
>>0040101F 8B CF                mov         ecx,edi
>>00401021 33 D6                xor         edx,esi
>>00401023 33 C1                xor         eax,ecx
>>00401025 33 C2                xor         eax,edx
>>00401027 5F                   pop         edi
>>00401028 35 81 FC C5 01       xor         eax,1C5FC81h
>>0040102D 5E                   pop         esi
>>0040102E 8B D0                mov         edx,eax
>>00401030 C1 EA 10             shr         edx,10h
>>00401033 03 C2                add         eax,edx
>>00401035 33 D2                xor         edx,edx
>>00401037 8B C8                mov         ecx,eax
>>00401039 C1 E9 08             shr         ecx,8
>>0040103C 2A C1                sub         al,cl
>>0040103E 25 FF 00 00 00       and         eax,0FFh
>>00401043 8A 90 D5 62 40 00    mov         dl,byte ptr [eax+4062D5h]
>>00401049 8B C2                mov         eax,edx
>>0040104B C3                   ret
>
>
>Hi, Gerd.
>
>I have noticed that most compilers aren't too smart where it comes to bitwise
>dataflow analysis.  After the "add  eax, edx" instruction above, better (well,
>shorter!) code would be:
>
>    sub    al, ah
>    and    eax, 0FFh
>    mov    al, byte ptr [eax+4062D5h]
>    ret
>
>Perhaps if we added a little "hint" in the last 2 lines of source:
>
>inline                  // inline declaration may differ by compiler
>u8 LSB_64( u64* bb )
>    {
>    u64 t64;
>    u32 t32;
>    u8 t8;
>    t64 = *bb - 1;
>    *bb &= t64;         // omit this line to retain current LSB
>    t64 ^= *bb;
>    t32 = (u32)t64 ^ (u32)(t64 >> 32);
>    t32 ^= LSB_64_magic;
>    t32 += t32 >> 16;
>    t8 = (u8)t32 - (u8)(t32 >> 8);
>    return LSB_64_table [LSB_64_adj + t8];
>    }
>
>Of course, this really only helps because we already know that the x86 class can
>independently address both the first and 2nd bytes of a register.  And it's
>wishful thinking that much better asm code will actually be produced!  I'd be
>interested to see.
>
>If/when you run your test suite again, please try this version.  Thanks.
>
>-- Walter

I wanted to procrastinate studying for my finals, and I was writing this routine
using MMX. MMX has an 8-byte word size, and some of these operations (like the
xor) become atomic. The only problem is that I can't do 64-bit subtraction.
Elsewhere the CPU makes emulation of large arithmetic quite easy by allowing one
to add in (or subtract out) the carry bit, but MMX has no easy way of doing
that.

(As a side-note, the latest extensions to SSE include the ability to do atomic
64-bit arithmetic in MMX registers, too, but since I do not have a P4, I'm not
interested in exploring that route at present.)

However, it dawned on me that perhaps I don't -need- to generate a carry. I was
looking at the first 3 lines:

>    t64 = *bb - 1;
>    *bb &= t64;         // omit this line to retain current LSB
>    t64 ^= *bb;

It seems to me that this would create a mask in t64 of 2^LSB - 1, right?

-Matt

Re: Possible small improvement to hacky method Gerd Isenberg 16:14:18 12/10/02
- Re: Possible small improvement to hacky method Matt Taylor 18:27:15 12/10/02

This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.