Author: Robert Hyatt
Date: 13:27:06 02/17/05
On February 17, 2005 at 14:37:50, Vincent Diepeveen wrote:

>On February 17, 2005 at 13:56:51, Matthew Hull wrote:
>
>>On February 17, 2005 at 12:17:41, Vincent Diepeveen wrote:
>>
>>>On February 16, 2005 at 18:43:20, Frank Phillips wrote:
>>>
>>>here is what i do with diep at x86-64 which speeds up quite some :
>>>
>>>i take care that not a single file is there of the profile info and remove all
>>>objects.
>>>
>>>first Run :
>>>CFLAGS = -fprofile-generate -O3 -march=k8 -mtune=k8 -DUNIXPII
>>>
>>>then i start diep and run it for an hour analysing at openings position.
>>>then i quit diep. do this a single time. don't run it 2 times. just a SINGLE
>>>time otherwise the thing might get confused again. GPL programmers :)
>>
>>
>>I guess none of that endgame code will ever get "profiled".
>>
>>Should you not run a variety of positions for the best profile?
>
>I want to be faster in middlegame don't you?
>
>All games get decided there. Diep won't blunder just like that in far endgame.
>It gets a shitload more nodes anyway there than in middlegame and a far bigger
>depth.
>
>So frankly i don't care for endgame.
>
>All hardware sites just test openings position too. Nothing else.
>
>Vincent

Wrong way of testing by them if that is true.

If all you care about is opening position search speed, this is OK. But if you are going to play a full game, you need to run everything. For example, none of the EGTB code will be profiled if you don't use a couple of endgames. None of the endgame evaluation will be used/profiled if you don't test it. That gives the compiler much less useful data for moving code around and helping with branch prediction.

>
>>
>>>
>>>then secondly delete all object files and recompile with:
>>>CFLAGS = -fprofile-use -O3 -march=k8 -mtune=k8 -DUNIXPII
>>>
>>>The above works for diep.
>>>
>>># -fif-conversion -frerun-loop-opt <== slows down diep a bit
>>>
>>>
>>>>On February 16, 2005 at 17:46:23, Vincent Diepeveen wrote:
>>>>
>>>>>On February 16, 2005 at 13:09:14, Frank Phillips wrote:
>>>>>
>>>>>>Has anybody got any experience with g++ 3.4 for amd64 (x86_64) - for Linux?
>>>>>>
>>>>>>I have been using the profile generated optimisation option, but the code it
>>>>>>produces is no faster then with simple -O3.
>>>>>>
>>>>>>I simply compile with -fprofile-generate
>>>>>>then run,
>>>>>>then recompile with -fprofile-use.
>>>>>>
>>>>>>The relevant *.gcno, *.gcda files are produced. Must be doing something
>>>>>>wrong.....
>>>>>>
>>>>>>Frank
>>>>>
>>>>>First of all get the LATEST version of gcc. thats 3.4.3 now. and if when i post
>>>>>3.4.4 is released get that one. like bob my experience is that the PGO in gcc is
>>>>>pretty buggy.
>>>>>
>>>>>icc is however such a bad optimizing compiler that gcc is far faster for diep. i
>>>>>guess icc is better bugfixed for 64 bits code as that mattered for specint2000,
>>>>>guess why :)
>>>>>
>>>>>Anyway gcc isn't that great in 64 bits perhaps, but it's scheduling better for
>>>>>opteron than icc is, which for diep is more important. icc of course is only
>>>>>good for intel hardware when your program hasn't been in specint yet.
>>>>>
>>>>>main idea is. delete all your files except source files.
>>>>>
>>>>>THEN run the fprofile generate single cpu.
>>>>>
>>>>>then delete all object files
>>>>>
>>>>>then run the profile use.
>>>>>
>>>>>never use intel c++. they will do anything to slow you down at AMD hardware.
>>>>>
>>>>>Vincent
>>>>
>>>>Thanks, I do remove object files before recompiling after the profile run.
>>>>
>>>>What is confusing, is that I get no speed up at all over plain -O3, which made
>>>>me suspect I must be doing something wrong. (Although all I have done is change
>>>>fprofile-arcs / fprofile-branch-probabilites to fprofile-generate / fprofile-use
>>>>in the makefile and the latter pair did have an effect in gcc3.3 - I think.).
>>>>
>>>>Frank
>>>>
>>>>BTW icc gave me 20-30% speed up in 32 bit mode over gcc. My program is an unholy
>>>>mixture of bitboards and array look-up for move generation. The 64 bit amd and
>>>>slower gcc slightly overcompensates for the loss of the 32bit Intel compiler.
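
For anyone who wants to try the two-pass build with training runs that cover more than the opening, the procedure discussed in this thread could be sketched roughly as below. This is only an illustration under stated assumptions: the `./engine` binary, its `bench` argument, and the `positions/*.epd` files are hypothetical names, not anything from diep or any other real program. The gcc flags are the ones quoted in the thread, minus the program-specific `-DUNIXPII`. The script only prints the steps rather than running them, so the plan can be checked first.

```shell
#!/bin/sh
# Sketch of a two-pass gcc PGO build that trains on several game phases.
# Hypothetical names (assumptions): ./engine, its "bench" argument,
# and the positions/*.epd files.

BASE_FLAGS="-O3 -march=k8 -mtune=k8"   # base flags quoted in the thread

# Print the CFLAGS for a given pass: "generate" or "use".
pgo_flags() {
    case "$1" in
        generate) echo "-fprofile-generate $BASE_FLAGS" ;;
        use)      echo "-fprofile-use $BASE_FLAGS" ;;
        *)        echo "unknown pass: $1" >&2; return 1 ;;
    esac
}

# Print (do not execute) the build and training steps in order.
pgo_plan() {
    # Start clean: no stale objects and no old profile data.
    echo "rm -f *.o *.gcda *.gcno"
    echo "make CFLAGS=\"$(pgo_flags generate)\""
    # Train on every phase so endgame eval and EGTB code get profiled too.
    for phase in opening middlegame endgame egtb; do
        echo "./engine bench positions/$phase.epd"
    done
    # Remove objects only; the .gcda profile data must survive for pass two.
    echo "rm -f *.o"
    echo "make CFLAGS=\"$(pgo_flags use)\""
}
```

Calling `pgo_plan` prints the eight steps in order. gcc normally merges the counts from repeated instrumented runs into the same .gcda files, but since Vincent reports trouble with more than one run on 3.4, a single session that loads all four position sets may be the safer variant.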
Last modified: Thu, 15 Apr 21 08:11:13 -0700