Author: Robert Hyatt
Date: 08:18:00 09/03/02
On September 03, 2002 at 11:01:51, Aaron Gordon wrote:

>On September 03, 2002 at 10:29:00, Robert Hyatt wrote:
>
>>I already do that. I have about 30 positions that I use during the profile
>>stage of "make profile"...
>>
>>I'm not sure what the -openmp is supposed to do since I am not using any
>>of the openMP pre-processing directives for parallel programming...
>>
>>By the way, I also use -fno-alias since I don't write that kind of sloppy
>>code...
>>
>>I will try the -O3. Last time I did it slowed down just a bit so I stuck
>>with -O2...
>
>Ah ok, just wondering. I only see:
>CFLAGS='$(CFLAGS) -D_REENTRANT -O2 \
> -prof_use -prof_dir ./profile -fno-alias -tpp6' \
>in the crafty makefile. I just assumed you used the same on your quad. -O3 may
>just depend on which version you're compiling. I had a similar experience.. I
>found -O3 was a hair faster so I just stuck with it. I haven't compared it to
>-O2 lately. If you find -O3 is slightly slower then I have a lot of recompiling
>to do. :)
>
>Also something else I was curious about.. perhaps you'd be able to shed some
>insight on. When I was doing the profiling for Slate's dual box I found the
>fastest way to make the binaries was by using those settings *BUT* run the SMP
>tests on my machine (single CPU). I just profiled with normal settings, exited.
>Then I profiled again with smpmt 2 and exited (making two dyn's). After
>recompiling it ended up being a good deal faster than when Slate profiled it on
>his box using his dual cpus.
>

First, you should not profile using mt=2, for several reasons.

1. It will on occasion corrupt the profile files. Apparently the way the profiling code writes them is not completely thread-safe.

2. It can produce misleading results. I do _all_ profiling using one cpu because parallel search is so non-deterministic: different profile runs on parallel code will produce different results, and hence slightly different optimizations. I want repeatability whenever possible.
Second, you should profile a variety of positions. I use Fine 70 as a simple endgame example, and the starting position to be sure that I profile the EvaluateDevelopment() code. I have several opening, middlegame and endgame positions just to be sure that all the significant stuff gets hit during the profile run...

>I have a guess as to why it could do this but I do not know enough about it to
>explain it properly. I'm thinking maybe the way I did it provided a "synthetic"
>dual cpu box which operated properly throughout the entire test with two
>threads, providing better profiling information. Think that is possible? I think
>Slate mentioned something about SMP doing random things, cutting off stuff and
>whatnot. I'm not sure if it did that when I profiled but thats why I'm guessing
>at this. Otherwise I haven't a clue. :)

The SMP search is highly non-deterministic in its behavior. The serial search will repeat the same search every time, which is why I prefer to profile that behavior...
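To make the two points above concrete, here is a rough sketch of how the input for a single-CPU profile run over mixed positions might be generated. It is only an illustration, not the actual "make profile" stage: the helper name build_profile_script, the command strings ("mt=1", "sd", "setboard", "go", "quit") and the choice of a fixed search depth are all assumptions, so check them against your engine's own command list. Only two positions are listed, and a real list would add middlegame and further endgame positions.

```python
# Hypothetical helper: build the command script fed to the engine during the
# profile-generation run. All command names below are assumptions.

PROFILE_POSITIONS = {
    # The starting position, so opening/development evaluation gets exercised.
    "opening: start position":
        "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    # Fine 70, a simple king-and-pawn endgame.
    "endgame: Fine 70":
        "8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1",
    # Add middlegame and further endgame positions here.
}


def build_profile_script(positions, depth=10, single_cpu_cmd="mt=1"):
    """Return the text to pipe into the engine for one profile run."""
    lines = [
        single_cpu_cmd,   # assumed command: keep the search on one CPU
        f"sd {depth}",    # assumed command: fixed depth instead of fixed time
    ]
    for fen in positions.values():
        lines.append(f"setboard {fen}")  # load the next position
        lines.append("go")               # search it
    lines.append("quit")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_profile_script(PROFILE_POSITIONS), end="")
```

The output would be piped into the instrumented binary once, before rebuilding with the profile data. The fixed depth is my own choice here, on the assumption that a timed search could stop at different points from run to run and so work against repeatability.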
Last modified: Thu, 15 Apr 21 08:11:13 -0700