Computer Chess Club Archives



Subject: Re: Assembler handtuning benefit

Author: Robert Hyatt

Date: 06:42:55 11/11/97



On November 11, 1997 at 08:43:43, Bas Hamstra wrote:

>On November 11, 1997 at 08:11:37, Jouni Uski wrote:
>
>>Rebel, Genius and Fritz have been written in assembler to get everything
>>from processor speed. But why are Hiarcs (and possibly Junior,
>>Shredder etc.) better even if they are pure C programs? Is this
>>handtuning a total waste of time, or is Mark superior in comparison
>>to the assembler guys, or what?
>
>Today's C (and C++) compilers optimize very well. So well in fact that
>the executables run nearly as fast as the 100% assembler versions. To
>quote a few reasonably experienced assembler programmers:
>
>- Ed Schroder says highly optimized assembler gives at most 40% speed
>increase in comparison to C
>
>- Bruce Moreland said he got 0% (zero, BTW my experience also)
>
>So let's say only 20% speed.
>
>Then the same code can produce a special Pentium-optimized 32-bit
>executable simply by setting a compiler switch or using a different
>compiler. If you coded your program in 16-bit assembler (Fritz/Genius),
>it is not that easy.
>
>So in C you are much more flexible at only very minor costs.
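The compiler-switch point above can be sketched with gcc (the flag names are modern and the source file `engine.c` is hypothetical; the 1997 toolchains differed, but the idea is the same):

```shell
# Same C source, two builds: one tuned for the Pentium, one generic.
gcc -O2 -m32 -march=i586 -o engine_p5 engine.c   # Pentium-scheduled 32-bit build
gcc -O2 -o engine engine.c                       # generic build, unchanged source
```

A 16-bit assembler program, by contrast, has to be substantially rewritten to get the same retargeting.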


This is often not done well, however. That is, if the intent is to take a
piece of C code and convert it into assembly, modest speed improvements
are possible if the code is big enough, and if the compiler is forced to
make assumptions that might not be necessary. For example, when copying a
"short" int to a long (32-bit) integer: do you need to sign-extend or
not? The compiler assumes yes. If you know the short is always positive,
you don't need to.

This sort of thing happens often enough that you can use knowledge the
compiler doesn't have to eliminate some of the tests it generates...

However, there is *much* more to the topic. *If* you are willing to
"write" assembly language code, rather than just re-writing C code into
assembly code, the gains can be much bigger. One example: Cray Research
has what is considered to be the best FORTRAN compiler on the market, for
obvious reasons. In Cray Blitz, we spent a couple of years writing
assembly code, and rather than a paltry 20-40%, we got 500% faster. Why?
Because we didn't "convert" FORTRAN to assembly; we completely
re-designed the program with the underlying architecture firmly in mind.
For example, we were able to keep lots of scalar values in vector
registers. We didn't have to assume that a called procedure would destroy
the contents of those registers and store them away; we *knew* they were
not touched. So we looked at the entire architecture to decide what we
could use, what we could modify in our algorithms to better fit the
hardware architecture, and so forth.

I've been doing optimizations for many years, and I consider 20-40% way
too low *if* you start from scratch and make the algorithm fit the
architecture as well as possible. Few people do this, and they get
sub-optimal results because of it. I'd suspect Frans got *far* more than
20%, for example, because Fritz seems to be 3x faster than the next
closest commercial program.


