Computer Chess Club Archives


Subject: Re: Differences in speedup

Author: Vincent Diepeveen

Date: 16:27:00 05/07/04


On May 07, 2004 at 11:53:29, Andreas Guettinger wrote:

>On May 07, 2004 at 04:38:00, Vincent Diepeveen wrote:
>
>>On May 06, 2004 at 19:03:48, martin fierz wrote:
>>
>>>aloha!
>>>
>>>bob posted some crafty logfiles running a 24-position test set on his ftp site
>>>(for anyone else crazy enough to repeat what i did:
>>>ftp.cis.uab.edu/pub/hyatt/smpdata)
>>>
>>>these are logfiles of crafty running as single CPU, dual, or quad; on opterons.
>>>i took the last completed ply on the single CPU set for each position (marked by
>>>-> in the logfile, i hope...), wrote down the time to complete this ply, and did
>>>this for all logfiles. there are 9 of these, 4 repeats for 2 and 4 CPUs. i
>>>computed the speedup for time-to-finish-ply-X for each of the multi-CPU runs
>>>with the following results:
>>>
>>>2 CPUs:
>>>1.961 +- 0.093
>>>1.888 +- 0.074
>>>1.846 +- 0.078
>>>1.763 +- 0.084
>>>
>>>4 CPUs:
>>>3.15 +- 0.15
>>>3.29 +- 0.20
>>>3.06 +- 0.12
>>>3.19 +- 0.13
>>>
>>>now, is there any meaning to this, and if yes, what?
>>>
>>>point #1 to make is that the numbers here are mutually consistent with each
>>>other, given the error margins quoted. which should show those skeptical of this
>>>statistical approach that it makes sense to do it this way, rather than to just
>>>write "i measured speedup 3.1".
>>>
>>>point #2 is that the speedup on 4 CPUs on average is 3.17 in this test, which
>>>might be one point for bob in the duel with vincent; although i suspect that the
>>>speedup depends on the hardware architecture - i will leave this question to the
>>>parallel computing experts though...
>>
>>Bob has tested the SMP version 1 cpu versus SMP version 2 or 4 cpus. The single
>>cpu version of crafty is just hardly existing because of a stupid thread pointer
>>which is a constant. Optimizing that crafty is 5% faster for sure in time single
>>cpu at opteron.
>
>I don't understand that. What does that mean?
>
>regards
>Andy

In very simple words: to run in parallel you first slow down your program.
The parallel runs are then compared against that slowed-down program, and that
comparison gives the speedups that Bob reports.

However, this is not fair.
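
To put a number on it, here is a minimal sketch. It assumes the 5% figure quoted above means the SMP build on 1 CPU needs 1.05 times as long as an optimized single-CPU build, and it takes the 3.17 average for 4 CPUs from martin's numbers. The function name and parameters are made up for illustration and are not from crafty.

```python
def effective_speedup(reported_speedup, serial_overhead):
    """Speedup measured against the fastest single-CPU build.

    reported_speedup: time(SMP build, 1 CPU) / time(SMP build, n CPUs)
    serial_overhead:  time(SMP build, 1 CPU) / time(optimized 1-CPU build)
                      (assumed to be 1.05 here, from the 5% claim)
    """
    if serial_overhead <= 0:
        raise ValueError("serial_overhead must be positive")
    return reported_speedup / serial_overhead


# 3.17 is the reported 4-CPU average; 1.05 is the assumed overhead.
quad = effective_speedup(3.17, 1.05)
print(round(quad, 2))  # 3.02
```

Under that assumption the 4-CPU speedup against the best single-CPU build would be about 3.02 rather than 3.17.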

In Diep I simply compare the single-CPU version against the parallel version of
Diep.

Another good example of an unfair comparison is what Chrilly Donninger is
posting about Hydra.

Hydra does not use hashtables in the last 6 plies: 3 plies without them in
hardware and 3 plies without them in software.

He compares 1 CPU running without hashtables in those last 6 plies against
16 CPUs running without hashtables in those last 6 plies.

That is not fair, however: the *only* reason not to use the hashtable in the
last 3 plies in software is that it would not run well in parallel.

On a single CPU, however, it does run well with the hashtable there.

This is a very common trick in computer chess, and some are very bad at it.
Cilkchess, for example, was slowed down by a factor of 40: reduced from
something like 200k nps to 5k nps in order to run better in parallel.

Then it shows up with 500 processors somewhere, or even, in 1995, with
something like 1800 processors.

But it is losing a factor of 40 somewhere to start with.
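
As a rough illustration of what that factor of 40 costs, the sketch below takes the approximate nps figures above and assumes perfect linear scaling, which is an upper bound that no real parallel search reaches. It also treats nps as a stand-in for speed, which is itself a simplification. The function name is hypothetical.

```python
def best_case_gain(processors, slowdown):
    """Upper bound on the gain over the original fast program.

    Assumes perfect linear scaling of the slowed-down program,
    so any real result would be lower than this.
    """
    return processors / slowdown


# Approximate figures from the post: about 200k nps before, 5k nps after.
slowdown = 200_000 / 5_000  # 40.0

for n in (500, 1800):
    print(n, best_case_gain(n, slowdown))
```

Even under that ideal assumption, 500 processors give at most 12.5 times the original program and 1800 processors at most 45 times.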

Is it fair to compare a slowed-down program against the same program on n
processors?

I do not think so. I find it a very bad comparison.

I can also get a much better speedup with Diep by slowing it down first.





