Computer Chess Club Archives


Subject: Re: intel c++ not giving deterministic results?

Author: Eugene Nalimov

Date: 13:33:44 07/05/01


On July 05, 2001 at 08:23:47, Robert Hyatt wrote:

>On July 05, 2001 at 08:07:58, Vincent Diepeveen wrote:
>
>>Hi, at the risk of being off topic,
>>here is a small attempt to address the Intel C++ compiler 5.01 build 15.
>>
>>I tried several options with it for DIEP and found that
>>it is 1.5% slower for me than MSVC 6.0; nevertheless that's very
>>good considering that 5.00 completely crashed.
>>
>>this version does *not* crash. it produces an exe which runs and
>>which doesn't crash.
>>
>>I measured however in
>>another program of mine some weird things in floating point
>>unit calculations.
>>
>>An AI program of mine,
>>which fiddles with parameters (neural based) is regrettably
>>no longer deterministic. When i use optimizations
>>  -O3 and some optimizations that allow Pentiumpro instructions and
>>   MMX and SSE and similar instructions, then
>>   i get completely different results.
>>
>>Especially the function
>>
>>  y = ax^2 + bx + c
>>
>>which i use everywhere in the network, is giving different results
>>when using different optimizations. This is a bad thing!
>>
>>I cannot explain it but i feel that it has to do with the
>>fact that all these values are 'float'.
>>
>>Then when casting them back to parameters in 1/1000 pawn, i
>>get differences in testruns of up to 1 millipawn for the average
>>calculation!
>
>Some IEEE FP hardware has the ability to round or truncate on the LSB.
>Perhaps Intel is doing that opposite to MSVC?
>
>Another explanation is a program bug that has an uninitialized variable.
>High levels of optimization often cause such bugs to produce non-deterministic
>results.

Actually, I believe the reason is one of the following:

(1) Different order of evaluation -- IEEE FP addition and multiplication are not
associative, so (a+b)+c is not necessarily equal to a+(b+c). That's a common
problem -- a tester (or customer) says "my program produces different results
when compiled with and without optimizations", and it turns out that with
optimizations the compiler chose a different order of evaluation.
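
A minimal sketch of point (1) in C. The function names sum_left and sum_right
are made up for illustration, and the 'volatile' temporaries are only there to
pin down the order and the rounding of each step so the compiler cannot
rearrange them:

```c
/* Adds three floats as (a + b) + c. */
float sum_left(float a, float b, float c)
{
    volatile float t = a + b;   /* stored to memory, so rounded to float here */
    return t + c;
}

/* Adds the same three floats as a + (b + c). */
float sum_right(float a, float b, float c)
{
    volatile float t = b + c;   /* stored to memory, so rounded to float here */
    return a + t;
}
```

With a = 1e20f, b = -1e20f, c = 1.0f, sum_left returns 1.0f (the two large
terms cancel first), while sum_right returns 0.0f (the 1.0f is absorbed into
-1e20f before the cancellation). Same inputs, same operations, different
grouping.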

(2) The most likely reason in this particular case -- different precision of
intermediate results. ANSI/ISO C/C++ does not specify the precision of FP
intermediate results; it just says "it should be not less than the source
precision", so when you calculate (a*x*x + b*x + c) where all the variables are
'float', on x86 the intermediates can be 'float' (32-bit), 'double' (64-bit), or
'extended' (80-bit). I suspect that when using SSE/SSE2 the compiler chooses a
different precision than when using x87.
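
A rough sketch of point (2) in C, emulating the two cases by hand rather than
relying on any particular compiler switch. The names poly_float and
poly_double are hypothetical, and the inputs below are picked to make the
effect visible; they are not taken from Vincent's program:

```c
/* a*x*x + b*x + c with every intermediate rounded to 32-bit float.
   The volatile temporaries force a store, and thus a rounding, per step. */
float poly_float(float a, float b, float c, float x)
{
    volatile float xx  = x * x;
    volatile float ax2 = a * xx;
    volatile float bx  = b * x;
    volatile float s   = ax2 + bx;
    return s + c;
}

/* The same polynomial with intermediates kept in 64-bit double and a
   single rounding to float at the very end. */
float poly_double(float a, float b, float c, float x)
{
    double xd = x;
    double r  = (double)a * xd * xd + (double)b * xd + (double)c;
    return (float)r;
}
```

For a = 1, b = -2, c = 1 and x = 1.000244140625f (that is, 1 + 2^-12) the
exact answer is (x-1)^2 = 2^-24. poly_double returns exactly that, about
5.96e-8, while poly_float returns 0.0f, because x*x loses its lowest bit when
rounded to float and the subtraction then cancels everything. Both results are
"correct" as far as the language standard is concerned.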

>>
>>This is WEIRD!!!!!!!!!!!!!!!!!!!!!!!!
>>
>>I cannot explain it!!!!!!!!!!!!!!
>>
>>So I cannot use this compiler. Note that I find it pretty bad that
>>a compiler which was already very good some time ago still can't beat
>>the Visual C++ compiler on a P3-800, which is an Intel chip.
>>
>>Some years ago Intel C++ was 0.5% slower than Visual C++. The latest Visual
>>C++ compiler is 1.5% faster, even though Intel C++ nowadays has all kinds
>>of instructions available!
>
>
>why is this "pretty bad"?  MS has a good compiler group and they are selling
>MSVC to make money.  That is an incentive to produce the best compiler they
>can.  Intel isn't competing with that at all...

Actually, I believe MS and Intel's compiler groups have different goals.

MS goal #1 -- produce good code for MS's own applications, i.e. Windows,
Office, SQL Server, etc. Code must run on *any* x86 PC. That, among other
things, means that (a) FP performance is not very important; (b) instructions
found only on *some* CPUs (SSE, SSE2, etc.) should be used only if they result
in *much* better integer performance; (c) the compiler is relatively well
tested -- it has compiled tens of millions of lines of code, and that code was
carefully tested, too; (d) compilation time is also important, as the compiler
is used daily to compile tens of millions of lines of code at MS.

Intel's goal #1 -- produce the fastest possible code for the SPEC benchmarks,
including FP benchmarks. That means that (a) the compiler can use features found
only in Intel's newest processors; (b) FP performance is very important; (c)
performance of programs whose behavior differs from that of the SPEC programs
(e.g. server applications, operating systems) is *absolutely* not important; (d)
testing is much worse than in MS's case, just because the code base is much
smaller.

Eugene

>>
>>Using all those instructions it's NOT giving deterministic results.
>>If I turn off those P3 instructions, then it is of course going to
>>be a hell of a lot slower than Visual C++.
>>
>>Nevertheless it's a good thing that it no longer crashes like version 5.00
>>
>>Best regards,
>>Vincent



Last modified: Thu, 15 Apr 21 08:11:13 -0700
