Computer Chess Club Archives


Subject: Re: Differences in speedup

Author: Andreas Guettinger

Date: 14:17:50 05/07/04


On May 07, 2004 at 16:53:57, Robert Hyatt wrote:

>On May 07, 2004 at 16:28:24, Andreas Guettinger wrote:
>
>>On May 07, 2004 at 12:02:28, Robert Hyatt wrote:
>>
>>>On May 07, 2004 at 11:53:29, Andreas Guettinger wrote:
>>>
>>>>On May 07, 2004 at 04:38:00, Vincent Diepeveen wrote:
>>>>
>>>>>On May 06, 2004 at 19:03:48, martin fierz wrote:
>>>>>
>>>>>>aloha!
>>>>>>
>>>>>>bob posted some crafty logfiles running a 24-position test set on his ftp site
>>>>>>(for anyone else crazy enough to repeat what i did:
>>>>>>ftp.cis.uab.edu/pub/hyatt/smpdata)
>>>>>>
>>>>>>these are logfiles of crafty running as single CPU, dual, or quad; on opterons.
>>>>>>i took the last completed ply on the single CPU set for each position (marked by
>>>>>>-> in the logfile, i hope...), wrote down the time to complete this ply, and did
>>>>>>this for all logfiles. there are 9 of these, 4 repeats for 2 and 4 CPUs. i
>>>>>>computed the speedup for time-to-finish-ply-X for each of the multi-CPU runs
>>>>>>with the following results:
>>>>>>
>>>>>>2 CPUs:
>>>>>>1.961 +- 0.093
>>>>>>1.888 +- 0.074
>>>>>>1.846 +- 0.078
>>>>>>1.763 +- 0.084
>>>>>>
>>>>>>4 CPUs:
>>>>>>3.15 +- 0.15
>>>>>>3.29 +- 0.20
>>>>>>3.06 +- 0.12
>>>>>>3.19 +- 0.13
>>>>>>
>>>>>>now, is there any meaning to this, and if yes, what?
>>>>>>
>>>>>>point #1 to make is that the numbers here are mutually consistent with each
>>>>>>other, given the error margins quoted. which should show those skeptical of this
>>>>>>statistical approach that it makes sense to do it this way, rather than to just
>>>>>>write "i measured speedup 3.1".
>>>>>>
>>>>>>point #2 is that the speedup on 4 CPUs on average is 3.17 in this test, which
>>>>>>might be one point for bob in the duel with vincent; although i suspect that the
>>>>>>speedup depends on the hardware architecture - i will leave this question to the
>>>>>>parallel computing experts though...
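
The consistency claim in point #1 and the average in point #2 can be checked with a few lines of C. This is only a sketch under stated assumptions: the quoted "+-" values are treated as one-sigma standard errors, and "mutually consistent" is taken to mean that every pair of runs differs by no more than k combined sigmas (k = 2 is my choice, not something the post specifies). The names Measurement, mean and mutually_consistent are made up for illustration.

```c
#include <stddef.h>

/* One speedup measurement: value and its quoted error margin. */
typedef struct {
    double value;
    double err;   /* assumed to be a one-sigma standard error */
} Measurement;

/* The four 2-CPU and four 4-CPU results quoted in the post. */
const Measurement two_cpu[4] = {
    {1.961, 0.093}, {1.888, 0.074}, {1.846, 0.078}, {1.763, 0.084}
};
const Measurement four_cpu[4] = {
    {3.15, 0.15}, {3.29, 0.20}, {3.06, 0.12}, {3.19, 0.13}
};

/* Plain arithmetic mean of the measured values. */
double mean(const Measurement *m, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += m[i].value;
    return sum / (double)n;
}

/* Returns 1 if every pair of measurements agrees within k combined
 * standard errors, 0 otherwise. Squares are compared so that no
 * square root (and no math library) is needed:
 *   |a - b| <= k * sqrt(ea^2 + eb^2)  <=>  (a - b)^2 <= k^2 * (ea^2 + eb^2)
 */
int mutually_consistent(const Measurement *m, size_t n, double k)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            double d   = m[i].value - m[j].value;
            double var = m[i].err * m[i].err + m[j].err * m[j].err;
            if (d * d > k * k * var)
                return 0;
        }
    }
    return 1;
}
```

With these inputs, mean(four_cpu, 4) comes out at about 3.17, the figure quoted in point #2, and both sets pass mutually_consistent(..., 2.0).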
>>>>>
>>>>>Bob has tested the SMP version 1 cpu versus SMP version 2 or 4 cpus. The single
>>>>>cpu version of crafty is just hardly existing because of a stupid thread pointer
>>>>>which is a constant. Optimizing that crafty is 5% faster for sure in time single
>>>>>cpu at opteron.
>>>>
>>>>I don't understand that. What does that mean?
>>>>
>>>>regards
>>>>Andy
>>>
>>>Ever heard of "the fog of war"?  This is "the fog of vincent".
>>>
>>>In crafty, I pass a pointer to a "TREE struct" around so that each thread can
>>>use a different struct for their local tree state.  This is done even with mt=0
>>>or when Crafty is compiled with no SMP support.  Vincent claims it would speed
>>>Crafty up by 5% if the pointer were removed.  That would be neat as it didn't
>>>slow me down 5% when I added the pointer.
>>>
>>>But that's irrelevant because Vincent has said so...
>>>
>>>IE everywhere that I now say tree->something such as:
>>>
>>>tree->node_count++;
>>>
>>>could be replaced by a non-pointer:
>>>
>>>node_count++;
>>>
>>>It doesn't cost 5%...
>>
>>For me this seems faster than if (SMP== 0) everywhere...
>>
>>regards
>>Andy
>
>
>I'm not sure what you mean.. There is only _one_ test in Search, done once per
>node.  Comment it out and you can't measure the speed change..  If you compile
>without -DSMP it is removed and the speed difference is < .1%.  But anyone can
>confirm this easily enough without Vincent's speculation...
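
To make the two variants under discussion concrete, here is a minimal sketch in C. It is not Crafty's code: the TREE struct is reduced to a single field, visit_ptr and visit_global are hypothetical stand-ins for a search routine, and idle_helpers and split_attempts are invented names for whatever the once-per-node SMP test actually checks. Only tree->node_count++, node_count++ and the -DSMP switch come from the posts above.

```c
/* Heavily reduced stand-in for a per-thread tree state. */
typedef struct {
    unsigned long node_count;
} TREE;

/* The "no pointer" alternative: a single global counter. */
unsigned long node_count;

#if defined(SMP)
/* Hypothetical SMP bookkeeping, compiled in only with -DSMP. */
int idle_helpers = 0;
unsigned long split_attempts = 0;
#endif

/* Pointer version: every call receives the TREE it should update,
 * so each thread can work on its own struct. */
void visit_ptr(TREE *tree, int depth)
{
    tree->node_count++;
#if defined(SMP)
    /* The single test done once per node; it disappears entirely
     * when the file is compiled without -DSMP. */
    if (idle_helpers > 0)
        split_attempts++;
#endif
    if (depth > 0) {
        visit_ptr(tree, depth - 1);
        visit_ptr(tree, depth - 1);
    }
}

/* Global version: same traversal, no pointer argument. */
void visit_global(int depth)
{
    node_count++;
    if (depth > 0) {
        visit_global(depth - 1);
        visit_global(depth - 1);
    }
}
```

Both functions walk the same toy binary tree and count the same number of nodes, so timing one against the other (with and without -DSMP) is one way anyone could measure the cost of the pointer for themselves rather than argue about it.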

Yes, forget the _everywhere_.

Do your search threads run persistently and wait when there's nothing to do, or
do you create the threads at the beginning of the search (or pondering) and
destroy them when the search is done?

regards
Andy
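
For anyone unsure what the two alternatives in the question look like, here is a rough sketch using POSIX threads. It says nothing about how Crafty actually does it. All names (pool_start, pool_search, pool_stop, search_with_fresh_threads, search_unit) are invented, the "search" is just a counter increment, and error checking on the pthread calls is left out for brevity.

```c
#include <pthread.h>

#define NWORKERS 4

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER; /* new search or shutdown */
static pthread_cond_t  idle = PTHREAD_COND_INITIALIZER; /* all workers finished   */
static pthread_t workers[NWORKERS];
static int  generation;  /* bumped once per search                   */
static int  pending;     /* workers still busy in the current search */
static int  quit;        /* set when the pool should shut down       */
static long units_done;  /* total placeholder "search" units so far  */

/* Placeholder for the real per-thread search work. */
static void search_unit(void)
{
    pthread_mutex_lock(&lock);
    units_done++;
    pthread_mutex_unlock(&lock);
}

/* Variant 1: persistent worker that sleeps when there is nothing to do. */
static void *pool_worker(void *arg)
{
    int seen = 0;                      /* last generation this worker handled */
    (void)arg;
    pthread_mutex_lock(&lock);
    for (;;) {
        while (!quit && generation == seen)
            pthread_cond_wait(&wake, &lock);   /* idle: block, do not spin */
        if (quit)
            break;
        seen = generation;
        pthread_mutex_unlock(&lock);
        search_unit();
        pthread_mutex_lock(&lock);
        if (--pending == 0)
            pthread_cond_signal(&idle);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

void pool_start(void)
{
    generation = 0;
    quit = 0;
    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&workers[i], NULL, pool_worker, NULL);
}

/* Wake all workers for one search and wait until they are done. */
long pool_search(void)
{
    long total;
    pthread_mutex_lock(&lock);
    pending = NWORKERS;
    generation++;
    pthread_cond_broadcast(&wake);
    while (pending > 0)
        pthread_cond_wait(&idle, &lock);
    total = units_done;
    pthread_mutex_unlock(&lock);
    return total;
}

void pool_stop(void)
{
    pthread_mutex_lock(&lock);
    quit = 1;
    pthread_cond_broadcast(&wake);
    pthread_mutex_unlock(&lock);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(workers[i], NULL);
}

/* Variant 2: create the threads for one search, join them afterwards. */
static void *one_shot(void *arg)
{
    (void)arg;
    search_unit();
    return NULL;
}

long search_with_fresh_threads(void)
{
    pthread_t t[NWORKERS];
    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&t[i], NULL, one_shot, NULL);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(t[i], NULL);
    return units_done;   /* safe to read: every thread has been joined */
}
```

A caller would use pool_start() once, pool_search() for each search, and pool_stop() at exit; the second variant needs only search_with_fresh_threads() per search. The first pays the thread creation cost once but needs the condition-variable handshake; the second is simpler but creates and joins NWORKERS threads on every search.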



Last modified: Thu, 15 Apr 21 08:11:13 -0700
