Computer Chess Club Archives


Subject: Re: GCC annihilating VISUAL C++ ==> branchless code in 2003?

Author: Andrzej Nagorko

Date: 06:25:07 02/28/03


On February 28, 2003 at 08:59:08, Vincent Diepeveen wrote:

<snip>

>
>Next is GCC output of this file:
>
>	.file	"tryx.c"
>	.text
>	.p2align 4,,15
>        .globl branchless
>	.type	branchless, @function
>branchless:
>	pushl	%ebp
>	movl	%esp, %ebp
>	.p2align 4,,7
>.L2:
>        movl	board(,%edx,4), %eax
>	addl	$8, %edx
>	subl	$5, %eax
>	testl	%eax, %eax
>	movl	$64, %eax
>	cmovne	%eax, %edx
>	cmpl	$63, %edx
>	jle	.L2
>	leave
>	ret
>	.size	branchless, .-branchless
>	.comm	board,256,32
>	.ident	"GCC: (GNU) 3.3 20021230 (prerelease)"
>

<snip>

>So in 1 small example we see both the strength of the new generations of
>processors released after 1996 (pentiumpro/klamath and newer) and the
>weakness of the software (visual c++ 6.0 despite pentiumpro released
>in 1996 already still with service packs not using P6 instructions) and the
>general inefficiency of the GNU world who isn't using "640KB should be enough
>RAM", but instead still is using the lemma "2 registers will do".
>

  My gcc produces better code:

        .file   "tryx.c"
        .text
        .p2align 4,,15
.globl branchless
        .type   branchless,@function
branchless:
        pushl   %ebx
        movl    $64, %ecx
        movl    $board, %ebx
        .p2align 4,,7
.L2:
        movl    (%ebx,%edx,4), %eax
        addl    $8, %edx
        subl    $5, %eax
        testl   %eax, %eax
        cmovne  %ecx, %edx
        cmpl    $63, %edx
        jle     .L2
        popl    %ebx
        ret
.Lfe1:
        .size   branchless,.Lfe1-branchless
        .comm   board,256,32
        .ident  "GCC: (GNU) 3.2.3 20030221 (Debian prerelease)"

  As you can see, it uses three registers (and doesn't do movl $64, %eax inside
the loop). Either this is a difference between gcc 3.2.3 and 3.3, or you didn't
use the proper optimization switches. I compiled it with:

gcc -Wall -O3 -fomit-frame-pointer -march=athlon -mcpu=athlon -funroll-loops \
    -fstrict-aliasing -S tryx.c
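
For anyone reading without the snipped tryx.c, here is a rough C sketch of what the loop in both listings appears to do, reconstructed from the assembly only. It is not the original source: the name branchless_from, the sq parameter, the return value and the assert are assumptions added so the sketch can be compiled and checked (the listings show no initialisation of %edx). The constant 5 and the step of 8 are read off the subl and addl instructions, and the size of board off the .comm line.

```c
#include <assert.h>

/* .comm board,256,32 in the listings: 256 bytes, i.e. 64 four-byte ints. */
int board[64];

/*
 * Hypothetical reconstruction of the loop body, NOT the original tryx.c.
 * Starting at sq, step by 8 for as long as each visited square holds 5.
 * Returns how many squares were read (an assumption, so the result is
 * observable; the function in the listings returns nothing useful).
 */
int branchless_from(int sq)
{
    int visited = 0;

    assert(sq >= 0 && sq < 64);   /* assumed precondition */

    do {
        int piece = board[sq];    /* movl board(,%edx,4), %eax */
        sq += 8;                  /* addl $8, %edx             */

        /* A select like this is what a compiler may turn into cmovne:
         * if piece != 5, force sq to 64 so the loop ends, no jump needed.
         * The 64 is loop-invariant, so it can live in a register that is
         * loaded once before the loop instead of on every iteration. */
        sq = (piece - 5 != 0) ? 64 : sq;

        visited++;
    } while (sq <= 63);           /* cmpl $63, %edx ; jle .L2  */

    return visited;
}
```

With board all zero, branchless_from(0) reads one square and stops. With board[0] and board[8] set to 5 it reads three squares: the two matching ones and the first that does not match.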

Andrzej



Last modified: Thu, 15 Apr 21 08:11:13 -0700
