Computer Chess Club Archives


Subject: Re: Implementation of the abs() function [o.t.]

Author: Dieter Buerssner

Date: 09:57:59 07/06/03


On July 06, 2003 at 05:02:50, Gerd Isenberg wrote:


>With mvc using math.h abs is fastest. With gcc cdq inline assembly abs or omids
>c-abs is much faster than the branching lib abs (maybe a macro from some header
>file?).

Hi Gerd, as far as I can see, abs is not a macro in my gcc environment. It
wouldn't be possible with Standard C methods, would it? A macro would not be
allowed to evaluate its argument twice. Of course, they could use
compiler-specific extensions and/or inlining. I checked by running only the
preprocessor over the source. I think gcc recognizes abs() just like other
functions (memcpy, for example) and can inline it directly. Indeed, I see the
branch of the "simple_abs" method in the assembly.

The strange thing is that omid_abs was significantly faster than nothing with
MSVC and rand(). Do you have any idea why? Here is the assembly of
tfunc_omid_abs:

PUBLIC  @tfunc_omid_abs@0
;       COMDAT @tfunc_omid_abs@0
_TEXT   SEGMENT
@tfunc_omid_abs@0 PROC NEAR                             ; COMDAT
; Line 61
        push    esi
        push    edi
        xor     esi, esi
        mov     edi, 1000000000                         ; 3b9aca00H
$L877:
        call    _rand
        sub     eax, 16384                              ; 00004000H
        mov     ecx, eax
        sar     ecx, 31                                 ; 0000001fH
        mov     edx, ecx
        xor     edx, eax
        sub     edx, ecx
        add     esi, edx
        dec     edi
        jne     SHORT $L877
        pop     edi
        mov     eax, esi
        pop     esi
        ret     0
@tfunc_omid_abs@0 ENDP

Now for tfunc_nothing

;       COMDAT @tfunc_nothing@0
_TEXT   SEGMENT
@tfunc_nothing@0 PROC NEAR                              ; COMDAT
; Line 228
        push    esi
        push    edi
        xor     esi, esi
        mov     edi, 1000000000                         ; 3b9aca00H
$L969:
        call    _rand
        dec     edi
        lea     esi, DWORD PTR [esi+eax-16384]
        jne     SHORT $L969
        pop     edi
        mov     eax, esi
        pop     esi
        ret     0
@tfunc_nothing@0 ENDP

This looks about as tight as possible: a += rand()-16384 is done with a single
lea. But it also shows that, with this method and clever inlining by the
compiler, things are not 100% comparable.

And tfunc_abs (library):

PUBLIC  @tfunc_abs@0
;       COMDAT @tfunc_abs@0
_TEXT   SEGMENT
@tfunc_abs@0 PROC NEAR                                  ; COMDAT
; Line 229
        push    esi
        push    edi
        xor     esi, esi
        mov     edi, 1000000000                         ; 3b9aca00H
$L978:
        call    _rand
        sub     eax, 16384                              ; 00004000H
        cdq
        xor     eax, edx
        sub     eax, edx
        add     esi, eax
        dec     edi
        jne     SHORT $L978
        pop     edi
        mov     eax, esi
        pop     esi
        ret     0
@tfunc_abs@0 ENDP

All very similar, and all should take comparable time (dominated by rand()),
but tfunc_omid_abs is twice as fast!

Does the P4 like aligned jump labels? Can they have such extreme effects? Hard
to believe.

BTW, when I use

#define RAND_VAL() ((int)n)

to get rid of the rand() overhead (which of course also gives the branching
versions an advantage), I get normal results:

      function   checksum time (s)
       nothing 4051657984 0.811
           abs 4051657984 1.702
    simple_abs 4051657984 1.923
      omid_abs 4051657984 1.702
       sbb_abs 4051657984 4.156
       cdq_abs 4051657984 4.457
      fish_abs 4051657984 2.063
       sar_abs 4051657984 3.324
     cmovl_abs 4051657984 2.604
     cmovs_abs 4051657984 2.644

4051657984 = ((1e9 * (1e9+1))/2) % 2^32, as expected for N_ITERATIONS=1e9.

The 0.8 s for nothing is about 2 cycles per iteration, which seems reasonable
for the loop

$L977:
        add     eax, ecx
        dec     ecx
        jne     SHORT $L977

Regards,
Dieter




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.