Computer Chess Club Archives


Subject: Re: Crafty profits little from Itanium and Opteron versus Commercials

Author: Gerd Isenberg

Date: 11:32:59 08/07/03


On August 07, 2003 at 08:37:47, Vincent Diepeveen wrote:

>On August 07, 2003 at 08:08:57, Sune Fischer wrote:
>
>If you have a lot of inline assembly Sune, then you really
>can't expect the compiler to speed you up a lot. It's like saying:
>"optimize my path to get to amsterdam but be sure that you travel over
>arnhem and utrecht".
>
>then you penalize it first.


Hi Vincent,

Yes, inline assembly with MSVC on x86-32 targets only makes sense for
instructions that the compiler does not support, like bsf, bsr, btr, bswap, cmov
(MSVC6), and for using MMX or XMM registers.

You are right: inline assembly hurts the MSVC optimizer when small __asm
functions are inlined, because of the fixed register usage. But for me the
fastest way to Amsterdam is via Arnhem and Utrecht ;-)

With MSVC there is not even a way to inline such a function with fastcall-style
register passing. If you pass a single value, pointer or reference, you often
get unnecessary stores to and loads from the stack, most often without the
register changing in between and without any real need to put it on the stack.

But don't confuse "inline" assembly with inlining (__asm) functions. Those are
two different issues. If you have small to medium-sized functions with a pure
__asm body, you may call them the fastcall way, passing an argument via ecx.
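
A minimal sketch of such a function, assuming 32-bit MSVC. The name
bsf32_fastcall is hypothetical, and the #else branch is only a portable
stand-in so the sketch also builds with other compilers. The argument must be
nonzero, since bsf leaves its result undefined for zero.

```c
/* Sketch only: bsf32_fastcall is a hypothetical name for illustration. */
#if defined(_MSC_VER) && defined(_M_IX86)
/* 32-bit MSVC: __fastcall passes the first argument in ecx, and
   __declspec(naked) suppresses the prologue/epilogue, so nothing is
   stored to or reloaded from the stack. */
__declspec(naked) unsigned int __fastcall bsf32_fastcall(unsigned int x)
{
    __asm {
        bsf eax, ecx   ; index of the lowest set bit, returned in eax
        ret            ; single register argument, no stack cleanup needed
    }
}
#else
/* Portable stand-in for other compilers and targets (x must be nonzero). */
unsigned int bsf32_fastcall(unsigned int x)
{
    unsigned int index = 0;
    while ((x & 1u) == 0) {   /* shift right until the lowest bit is set */
        x >>= 1;
        ++index;
    }
    return index;
}
#endif
```

The call itself is a real call, not an inlined one, which is exactly the
trade-off: no fixed-register penalty inside the caller, but call overhead.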

I guess GCC's inline assembler is more flexible from the compiler's point of
view, thanks to "symbolic" registers, which the compiler may choose depending
on the context.
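
To illustrate what I mean by "symbolic" registers, here is a small sketch in
GCC extended asm syntax. The name bsf32_gcc is hypothetical, and the fallback
branch is only there so the sketch builds on non-x86 targets. The operands %0
and %1 are placeholders; the constraints let the compiler pick the registers
(or a memory operand for the input). The argument must be nonzero.

```c
/* Sketch only: bsf32_gcc is a hypothetical name for illustration. */
static inline unsigned int bsf32_gcc(unsigned int x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int index;
    /* "=r": any general-purpose register for the result,
       "rm": input may live in a register or in memory,
       "cc": the instruction modifies the flags. */
    __asm__("bsfl %1, %0" : "=r"(index) : "rm"(x) : "cc");
    return index;
#else
    /* Portable stand-in for other targets (x must be nonzero). */
    unsigned int index = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++index;
    }
    return index;
#endif
}
```

Because no register is hard-coded, the function can be inlined without
forcing the caller to free up eax or ecx first.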

As I learned from Eugene Nalimov, the 64-bit MSVC for AMD64 no longer has an
inline assembler, but it has intrinsic functions for most, if not all,
MMX/3DNow!/SSE2 instructions and for general-purpose instructions like bsf,
bsr, bswap and the 64*64=128-bit (i)mul, which C/C++ does not support.

For C programmers that is a good compromise that avoids MASM.
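
A rough sketch of how such intrinsics might be wrapped. The wrapper names
(first_one, last_one, swap_bytes, mul_64x64) are hypothetical; the MSVC branch
assumes the x64 intrinsics from <intrin.h>, and the other branch assumes the
GCC/Clang builtins on a 64-bit target. The bit scans require a nonzero
argument.

```c
#include <stdint.h>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#include <stdlib.h>

/* MSVC x64: intrinsics replace the missing inline assembler. */
static unsigned first_one(uint64_t bb)   /* bsf */
{
    unsigned long index;
    _BitScanForward64(&index, bb);
    return (unsigned)index;
}
static unsigned last_one(uint64_t bb)    /* bsr */
{
    unsigned long index;
    _BitScanReverse64(&index, bb);
    return (unsigned)index;
}
static uint64_t swap_bytes(uint64_t x)   /* bswap */
{
    return _byteswap_uint64(x);
}
static uint64_t mul_64x64(uint64_t a, uint64_t b, uint64_t *high)
{
    return _umul128(a, b, high);         /* low half returned, high half stored */
}
#elif defined(__SIZEOF_INT128__)
/* GCC/Clang on a 64-bit target: builtins play the same role. */
static unsigned first_one(uint64_t bb)
{
    return (unsigned)__builtin_ctzll(bb);
}
static unsigned last_one(uint64_t bb)
{
    return 63u - (unsigned)__builtin_clzll(bb);
}
static uint64_t swap_bytes(uint64_t x)
{
    return __builtin_bswap64(x);
}
static uint64_t mul_64x64(uint64_t a, uint64_t b, uint64_t *high)
{
    unsigned __int128 product = (unsigned __int128)a * b;
    *high = (uint64_t)(product >> 64);
    return (uint64_t)product;
}
#else
#error "This sketch assumes MSVC x64 or a GCC/Clang 64-bit target."
#endif
```

The calling code then looks like plain C, and the compiler stays free to
allocate registers around each intrinsic.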


>
>Inline assembly is showing in general a wrong approach to the problem of chess
>IMHO. Try inline assembly at an itanium :)

I think these intrinsics are easier for the compiler to optimize, since it
knows their register usage. Intrinsics are also a slightly more portable way
(Itanium?) to gain performance from processor resources that C otherwise does
not expose.

BitBoards forever!

Regards,
Gerd


Btw. another exciting day in Dortmund.
They really play spectacular chess.


<snip>




Last modified: Thu, 15 Apr 21 08:11:13 -0700
