Computer Chess Club Archives


Subject: Re: Intel C++ 7.0 compiler questions...

Author: Vincent Diepeveen

Date: 07:55:17 12/24/02


On December 24, 2002 at 04:47:21, Frank Phillips wrote:

>On December 23, 2002 at 12:12:29, Vincent Diepeveen wrote:
>
>>On December 23, 2002 at 12:01:10, Robert Hyatt wrote:
>>
>>You forget the crucial data point: you have
>>no AMD K7s out there.
>
>Intel is about 30% faster than gcc 3.2 (and gcc 2.95 and gcc 2.96) with profile
>guided optimisation for me on my AMD Athlons (Palomino and Thoroughbred).  Not
>all of this can be due to incompetence, I suggest.
>
>Frank

You must be using bitboards then; I see no other explanation.
Profile-guided optimization gives me a 20% speedup on the K7 with gcc 3.x.

>
>>
>>>On December 23, 2002 at 09:45:25, Vincent Diepeveen wrote:
>>>
>>>>On December 22, 2002 at 08:42:23, Joel wrote:
>>>>
>>>>>Hey all,
>>>>>
>>>>>Was reading some of the previous threads where the general consensus seemed to
>>>>>be that the Intel C++ 7.0 compiler did a much better job at optimising than the
>>>>>VC 6.0 SP4 compiler did.
>>>>
>>>>On Intel hardware I do not doubt it.
>>>>
>>>>But it is at your own risk, of course, whether it produces correctly
>>>>working code for all of your users who have K7s and expect
>>>>the program not to crash.
>>>>
>>>>On my K7 the Intel compiler crashes time and time again. It is also slower
>>>>than gcc when gcc uses branch profile info (-fbranch-probabilities),
>>>>after first generating that info.
>>>>
>>>>Intel without that branch profile info is, just like gcc without it,
>>>>slower on the K7 than MSVC 6 SP4 with the processor pack.
>>>>
>>>>The processor pack is crucial for SP4 because it adds 2% in speed,
>>>>and the speed differences of a default gcc compile and an Intel C++ compile
>>>>versus MSVC SP4 with the processor pack are measured at 1% and 0.5%.
>>>>
>>>>But then the profile info for gcc (generating it is a time-consuming
>>>>thing, also for the Intel compiler of course) gives
>>>>an additional 20% speedup, blowing away the other compilers.
>>>>
>>>>Now let's touch on correctness. For a long period of time GCC was a very bad
>>>>compiler. Many 2.96 versions especially were very broken and very buggy.
>>>>
>>>>Before the 2.95.x versions there were also numerous bugs in gcc with regard
>>>>to parallel behaviour (I use 'volatile' variables a lot because DIEP is
>>>>SMP). They were also dead slow. The 2.95.x versions are dead slow for me
>>>>compared to a default MSVC 6 compile: a 12.5% difference on a K7 is
>>>>no coincidence, and it is 10% on a P3.
>>>>
>>>>But the 3.xx versions are great. If I understand correctly, AMD contributed
>>>>money to some Linux distributions in order to improve the gcc compiler for
>>>>their processors. Of course I have no exact info here; I just read around
>>>>on the internet for this.
>>>>
>>>>But the sad thing is that an old 586 compiler, MSVC 6 with a processor pack
>>>>that speeds it up just 2%, is faster on AMD hardware than the most recent
>>>>compilers without that reordering pass.
>>>>
>>>>Of course this is for DIEP.
>>>>
>>>>Crafty uses weird 64-bit structures called bitboards; it is obvious that
>>>>older compilers didn't know how to emulate those very well on 32-bit
>>>>processors. It's only here that Bob can claim the Intel compiler is
>>>>fast for him.
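
To make the point about emulation concrete: a bitboard is a single 64-bit word with one bit per square, so on a 32-bit processor the compiler has to split every 64-bit operation into work on two 32-bit halves. The C below is only a minimal sketch of that idea, not code from Crafty or DIEP. The names `Bitboard`, `bb_set`, `bb_popcount`, `popcount32` and `bb_popcount_halves` are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type name: one bit per square, bit 0 .. bit 63. */
typedef uint64_t Bitboard;

/* Set the bit for a square index in the range 0..63. */
Bitboard bb_set(Bitboard b, int square)
{
    assert(square >= 0 && square < 64);
    return b | ((Bitboard)1 << square);
}

/* Count set bits in a 32-bit half by clearing the lowest set bit each pass. */
int popcount32(uint32_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Count set bits on the full 64-bit word, as the source code expresses it. */
int bb_popcount(Bitboard b)
{
    int n = 0;
    while (b) {
        b &= b - 1;
        n++;
    }
    return n;
}

/* The same count written out by hand as two 32-bit halves, which is
   roughly the shape of work a 32-bit target ends up doing. */
int bb_popcount_halves(Bitboard b)
{
    uint32_t lo = (uint32_t)(b & 0xFFFFFFFFu);
    uint32_t hi = (uint32_t)(b >> 32);
    return popcount32(lo) + popcount32(hi);
}
```

Both counting functions return the same result for any board. The difference is only in how much of the splitting is left to the compiler, which is where compilers of that era could differ on 32-bit hardware.
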
>>>
>>>
>>>I have absolutely no idea what you are talking about.  Every gcc compiler
>>>after 2.5 worked perfectly with long longs.  As does every compiler I have
>>>tried in the past 5 years porting crafty to every unix machine made.
>>>
>>>The intel compiler _is_ faster than gcc 3 for me.  And for everyone here at
>>>UAB that has tested the two.
>>>
>>>Too many data points from others, only one from you. I tend to believe the
>>>majority.
>>>
>>>
>>>
>>>>
>>>>For GCC I use the following format to compile:
>>>>
>>>>CFLAGS   = -pg -fprofile-arcs -O2 -march=athlon -mcpu=athlon -frename-registers \
>>>>           -DUNIXPII -fno-gcse -foptimize-register-move
>>>>
>>>>Then I run DIEP for half an hour.
>>>>
>>>>Then I recompile it using:
>>>>
>>>>CFLAGS    = -O2 -march=athlon -fbranch-probabilities -frename-registers \
>>>>            -DUNIXPII -fomit-frame-pointer -fno-gcse -foptimize-register-move
>>>>
>>>>In case of bounds checking:
>>>>
>>>>#CFLAGS   = -g -DUNIXPII -O2 -fbounds-checking -Wall
>>>>
>>>># intel c++ now
>>>>#CC	= icc
>>>>#CPP     = icc
>>>>#CFLAGS  = -g -DUNIXPII
>>>>#CFLAGS  = -O3 -tpp6 -axi -xi -rcd -prof_genx -DUNIXPII
>>>>#CFLAGS  = -O3 -tpp6 -axi -xi -rcd -prof_use -DUNIXPII
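
The two-pass gcc build described above can be tried on a toy program before committing half an hour to a real engine. What follows is a minimal sketch under assumptions: the file name `hot.c` and the functions `classify` and `count_rare` are hypothetical, and the commands in the comment reuse only the gcc flags quoted in the post (`-fprofile-arcs` for the instrumented build, `-fbranch-probabilities` for the rebuild). The branch is deliberately lopsided, which is the kind of fact the profile run records for the compiler.

```c
/*
 * Build steps, using gcc flags taken from the post. "hot.c" is a
 * hypothetical file: this code plus a small main() of your own that
 * calls count_rare() with a large n.
 *
 *   1. gcc -O2 -fprofile-arcs hot.c -o hot            (instrumented build)
 *   2. ./hot                                          (training run, writes profile data)
 *   3. gcc -O2 -fbranch-probabilities hot.c -o hot    (rebuild using the profile)
 */
#include <stdint.h>

/* Returns 1 for the rare case (v divisible by 1024), 0 otherwise. */
int classify(uint32_t v)
{
    if ((v & 1023u) == 0)   /* true for one value in every 1024 */
        return 1;
    return 0;
}

/* Counts how many values in [0, n) hit the rare case. */
uint32_t count_rare(uint32_t n)
{
    uint32_t i, hits = 0;
    for (i = 0; i < n; i++)
        hits += (uint32_t)classify(i);
    return hits;
}
```

The training run should resemble real use, as with the half hour of DIEP above, because the rebuild trusts whatever branch frequencies it recorded. Current gcc releases also provide `-fprofile-generate` and `-fprofile-use` for the same workflow.
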
>>>>
>>>>Best regards,
>>>>Vincent
>>>>
>>>>>My compiler knowledge is very limited - I have written a C compiler before (uni
>>>>>assignment), but optimisation wasn't an issue. I have no real idea how an
>>>>>optimising compiler goes about its work.
>>>>>
>>>>>For the record I have an Athlon XP 2100+, and my engine is bitboard based.
>>>>>
>>>>>Having said that, I installed the Intel compiler, and tried compiling my latest
>>>>>version of Bodo, and then ran my dodgy little speed benchmark on it. It was
>>>>>actually slower than the VC 6.0 compiler, though I have reason to suspect my
>>>>>incompetence is the issue, largely due to statements like:
>>>>>
>>>>>"Did you use the intel C++ 7.0? Of course not.  Did you do the profile-feedback
>>>>>optimizations?  Probably not."
>>>>
>>>>>What I am asking is how do I do these profile-feedback optimisations, and/or any
>>>>>other optimisations which you guys do?
>>>>
>>>>>What would be particularly helpful is if other people could give me the compiler
>>>>>command line parameters they use to generate fast code.
>>>>
>>>>>I really need to buy a book on optimising compilers so I understand what the
>>>>>hell is happening here. :|
>>>>
>>>>>Any help greatly appreciated,
>>>>>Joel Veness



Last modified: Thu, 15 Apr 21 08:11:13 -0700
