Computer Chess Club Archives


Subject: Re: Question for the Crafty/Compiler experts

Author: Robert Hyatt

Date: 13:01:33 02/19/04


On February 19, 2004 at 15:49:14, Dieter Buerssner wrote:

>On February 19, 2004 at 11:14:30, Robert Hyatt wrote:
>
>>Where did that come from?
>
>Yesterday I downloaded the tarball of 19.10, which was the newest then. I just
>downloaded 19.11. Here the C code seems faster again (with icc and gcc; this
>time I used make profile for icc).
>
>There seems to be an issue in PopCnt similar to the one I mentioned
>previously:
>
>int static __inline__ PopCnt(BITBOARD word)
>{
>/*  r0=result, %1=tmp, %2=first input, %3=second input */
>  long      dummy, dummy2;
>
>asm("        xorl    %0, %0"                    "\n\t"
>    "        testl   %2, %2"                    "\n\t"
>    "        jz      2f"                        "\n\t"
>    "1:      leal    -1(%2), %1"                "\n\t"
>    "        incl    %0"                        "\n\t"
>    "        andl    %1, %2"                    "\n\t"
>                         ^^^
>    "        jnz     1b"                        "\n\t"
>    "2:      testl   %3, %3"                    "\n\t"
>    "        jz      4f"                        "\n\t"
>    "3:      leal    -1(%3), %1"                "\n\t"
>    "        incl    %0"                        "\n\t"
>    "        andl    %1, %3"                    "\n\t"
>                         ^^^
>    "        jnz     3b"                        "\n\t"
>    "4:"                                        "\n\t"
>  : "=&q" (dummy), "=&q" (dummy2)
>  : "q" ((int) (word>>32)), "q" ((int) word)
>  : "cc");
>  return (dummy);
>}
>
>At the indicated points, you modify the input registers. The compiler will not
>notice this (I think), because %2 and %3 are declared as inputs only. At least
>it seems to go against what I read some time ago in the gcc manual.
>
>This is what I suggested in the WB forum some time ago. Possibly, you have
>to change the multi-line string (it worked earlier with gcc). The WB-forum
>software swallowed the indentation ...
>
>http://f11.parsimony.net/forum16635/messages/31324.htm
>
>
>int static __inline__ PopCnt(BITBOARD word)
>{
> int tmp, tmp2, n;
> __asm__ __volatile__(
> "movl %3, %1
> xorl %0, %0
> testl %1, %1
> je 1f
> 0: incl %0
> leal -1(%1), %2
> andl %2, %1
> jne 0b
> 1: movl %4, %1
> testl %1, %1
> je 3f
> 2: incl %0
> leal -1(%1), %2
> andl %2, %1
> jne 2b
> 3:"
> : "=r&" (n), "=r&" (tmp), "=r&" (tmp2)
> : "g" (*(unsigned long *)&word), "g" (*(((unsigned long *)&word)+1))
> : "cc" /* Flags "condition code" changed */);
> return n;
> }
>
>Note that I used somewhat less restrictive register constraints, which should
>give the compiler a bit more liberty for optimization. Any register will do for
>%0, %1 and %2 (not only a/b/c/dx, which would be selected by "q"). For the
>inputs, no register is needed at all (for example, addressing via esp is OK).
>
>But of course, I think my suggested C version should be preferred. If you use
>a cast and a shift instead of the horrible address-taking casts here, it would
>even be totally portable (although not as efficient as possible on 64-bit
>platforms).
>
>Regards,
>Dieter
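
The "cast and shift" idea mentioned above can be sketched roughly as follows. This is not the C version Dieter refers to (it is not shown in this post), only a minimal illustration. It assumes BITBOARD is a 64-bit unsigned integer, and the names popcnt32 and PopCnt_portable are hypothetical.

```c
#include <stdint.h>

/* Assumption: BITBOARD is a 64-bit unsigned integer, as in Crafty. */
typedef uint64_t BITBOARD;

/* Count the set bits of one 32-bit half. Each pass clears the lowest
   set bit, so the loop runs once per set bit. */
static inline int popcnt32(uint32_t half)
{
  int n = 0;
  while (half) {
    half &= half - 1;   /* clear the lowest set bit */
    n++;
  }
  return n;
}

/* Hypothetical name. The 64-bit word is split with a shift and two
   casts, so no pointer casts or byte-order assumptions are needed. */
static inline int PopCnt_portable(BITBOARD word)
{
  return popcnt32((uint32_t)(word >> 32)) + popcnt32((uint32_t)word);
}
```

Splitting into two 32-bit halves mirrors the structure of the inline asm; on a 64-bit target a single loop over the whole word would do the same job with fewer instructions.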


My reason for doing the inline in the first place was simplicity.  I.e., the
AMD64 inline is tiny, but efficient since it uses 64-bit instructions.  It was
a bit of a hassle to have external asm called for one option, inlined for
another option, etc.  I'm trying to simplify the conditional compilation stuff
so that it will be easier to use automake and get rid of the Makefile
completely...

You are right about the changed input values.  I had that right at one point,
but somewhere along the way the = got lost in the shuffle...
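
One way to tell gcc about the modified registers is to declare them as read-write operands ("+r") instead of plain inputs. The sketch below is only an illustration of that technique, not the code that went into Crafty. The name PopCnt_fixed, the local copies hi and lo, and the plain C fallback for non-x86 targets are all assumptions.

```c
#include <stdint.h>

/* Assumption: BITBOARD is a 64-bit unsigned integer, as in Crafty. */
typedef uint64_t BITBOARD;

/* Hypothetical name. Same loop as the quoted asm, but the two halves
   are "+r" (read-write) operands, so the compiler knows they change. */
static inline int PopCnt_fixed(BITBOARD word)
{
  uint32_t hi = (uint32_t)(word >> 32);
  uint32_t lo = (uint32_t)word;
  int n;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  int tmp;
  /* %0=result, %1=tmp, %2=high half, %3=low half */
  __asm__("        xorl    %0, %0"        "\n\t"
          "        testl   %2, %2"        "\n\t"
          "        jz      2f"            "\n\t"
          "1:      leal    -1(%2), %1"    "\n\t"
          "        incl    %0"            "\n\t"
          "        andl    %1, %2"        "\n\t"
          "        jnz     1b"            "\n\t"
          "2:      testl   %3, %3"        "\n\t"
          "        jz      4f"            "\n\t"
          "3:      leal    -1(%3), %1"    "\n\t"
          "        incl    %0"            "\n\t"
          "        andl    %1, %3"        "\n\t"
          "        jnz     3b"            "\n\t"
          "4:"                            "\n\t"
          : "=&r" (n), "=&r" (tmp), "+r" (hi), "+r" (lo)
          :
          : "cc");
#else
  /* Fallback for other compilers and targets: the same loop in C. */
  n = 0;
  while (hi) { hi &= hi - 1; n++; }
  while (lo) { lo &= lo - 1; n++; }
#endif
  return n;
}
```

The early-clobber "&" on n and tmp keeps them out of the registers holding hi and lo. Dieter's variant above reaches the same goal differently, by copying each input into a scratch output register before modifying it.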

Bob



Last modified: Thu, 15 Apr 21 08:11:13 -0700
