Author: Gerd Isenberg
Date: 00:23:46 11/22/05
On November 21, 2005 at 19:07:59, Dieter Buerssner wrote:
>On November 21, 2005 at 18:10:54, Dieter Buerssner wrote:
>
>>On November 20, 2005 at 16:38:38, Gerd Isenberg wrote:
>>
>>> [...] Here some branchless substitution may pay off:
>>>
>>>fm = (depth < 0) ? fm1 : fm2;
>>
>>I guess, you mean this as a substitution for
>>  if (depth < 0)
>>    fm = fm1;
>>  else
>>    fm = fm2;
>>
>>I am surprised, that compilers are not able to do this themselves.
>
>C:\src>cat s2.c
>/* #define fm1 1
>#define fm2 2 */
>
>int fm1=1, fm2=2;
>
>int gerd(int depth)
>{
>  int fm;
>  fm = (depth < 0) ? fm1 : fm2;
>  return fm;
>}
>
>int ifelse(int depth)
>{
>  int fm;
>  if (depth < 0)
>    fm = fm1;
>  else
>    fm = fm2;
>  return fm;
>}
>
>C:\src>gcc -O2 -S -fomit-frame-pointer -march=pentium4 s2.c
>
>C:\src>cat s2.s
> .file "s2.c"
>.globl _fm2
> .data
> .align 4
>_fm2:
> .long 2
>.globl _fm1
> .align 4
>_fm1:
> .long 1
> .text
>.globl _gerd
> .def _gerd; .scl 2; .type 32; .endef
>_gerd:
> movl 4(%esp), %edx
> movl _fm2, %eax
> testl %edx, %edx
> cmovs _fm1, %eax
> ret
>.globl _ifelse
> .def _ifelse; .scl 2; .type 32; .endef
>_ifelse:
> movl 4(%esp), %ecx
> movl _fm2, %eax
> testl %ecx, %ecx
> cmovs _fm1, %eax
> ret
>
>Compiled with the #defines, I get:
>
>C:\src>cat s2.s
> .file "s2.c"
> .text
>.globl _gerd
> .def _gerd; .scl 2; .type 32; .endef
>_gerd:
> movl 4(%esp), %eax
> sarl $31, %eax
> addl $2, %eax
> ret
>.globl _ifelse
> .def _ifelse; .scl 2; .type 32; .endef
>_ifelse:
> movl 4(%esp), %eax
> sarl $31, %eax
> addl $2, %eax
> ret
>
>Which does look excellent at first sight (no idea, how "dead slow" the shifts
>really are).
>
>Cheers,
>Dieter
Wow, gcc - that's it. Shifts are relatively expensive on the P4, something like
four cycles (the shift ALU of the MMX unit is used, with "long" data paths).
Shifts are cheap on AMD CPUs.
Gerd
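
For reference, the sarl/addl sequence gcc emits for the #define case can be written out by hand in C. This is only a minimal sketch: the function names select_ternary, select_shift and select_mask are made up for illustration, and it assumes a 32-bit int and an arithmetic right shift for negative values (implementation-defined in C, but what gcc does on x86). The last function is one possible branchless form for the case where fm1 and fm2 are variables rather than constants.

```c
/* Reference version: the ternary from the thread with fm1 = 1, fm2 = 2. */
static int select_ternary(int depth)
{
    return (depth < 0) ? 1 : 2;
}

/* Hand-written equivalent of "sarl $31, %eax; addl $2, %eax".
   Assumption: int is 32 bits and >> on a negative int is an
   arithmetic shift (implementation-defined in C). */
static int select_shift(int depth)
{
    int mask = depth >> 31;   /* -1 if depth < 0, otherwise 0 */
    return mask + 2;          /* 1 if depth < 0, otherwise 2 */
}

/* Hypothetical general form for arbitrary fm1/fm2, under the same
   assumption: the sign mask picks fm1 (mask = -1) or fm2 (mask = 0). */
static int select_mask(int depth, int fm1, int fm2)
{
    int mask = depth >> 31;
    return fm2 ^ ((fm1 ^ fm2) & mask);
}
```

With these, select_shift(depth) should agree with select_ternary(depth) for every int, and select_mask(depth, fm1, fm2) should return fm1 for negative depth and fm2 otherwise. Whether either beats the cmovs version gcc already produces depends on the shift cost of the target CPU, as discussed above.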
Last modified: Thu, 15 Apr 21 08:11:13 -0700