Computer Chess Club Archives


Subject: Re: Differences in speedup

Author: Robert Hyatt

Date: 05:22:02 05/09/04


On May 09, 2004 at 01:45:57, Eugene Nalimov wrote:

>Bob, please relax. Just ignore Vincent as I do.
>
>Thanks,
>Eugene

I really should, at this point.  His credibility is at zero.  But I hate to let
absolute nonsense slip by, because there are enough new people that would
believe most of his nonsense with no contradicting data...

He would have a promising career in counter-intelligence.  He is quite good at
producing disinformation.




>
>On May 09, 2004 at 00:05:47, Robert Hyatt wrote:
>
>>On May 07, 2004 at 19:27:00, Vincent Diepeveen wrote:
>>
>>>On May 07, 2004 at 11:53:29, Andreas Guettinger wrote:
>>>
>>>>On May 07, 2004 at 04:38:00, Vincent Diepeveen wrote:
>>>>
>>>>>On May 06, 2004 at 19:03:48, martin fierz wrote:
>>>>>
>>>>>>aloha!
>>>>>>
>>>>>>bob posted some crafty logfiles running a 24-position test set on his ftp site
>>>>>>(for anyone else crazy enough to repeat what i did:
>>>>>>ftp.cis.uab.edu/pub/hyatt/smpdata)
>>>>>>
>>>>>>these are logfiles of crafty running as single CPU, dual, or quad; on opterons.
>>>>>>i took the last completed ply on the single CPU set for each position (marked by
>>>>>>-> in the logfile, i hope...), wrote down the time to complete this ply, and did
>>>>>>this for all logfiles. there are 9 of these, 4 repeats for 2 and 4 CPUs. i
>>>>>>computed the speedup for time-to-finish-ply-X for each of the multi-CPU runs
>>>>>>with the following results:
>>>>>>
>>>>>>2 CPUs:
>>>>>>1.961 +- 0.093
>>>>>>1.888 +- 0.074
>>>>>>1.846 +- 0.078
>>>>>>1.763 +- 0.084
>>>>>>
>>>>>>4 CPUs:
>>>>>>3.15 +- 0.15
>>>>>>3.29 +- 0.20
>>>>>>3.06 +- 0.12
>>>>>>3.19 +- 0.13
>>>>>>
>>>>>>now, is there any meaning to this, and if yes, what?
>>>>>>
>>>>>>point #1 to make is that the numbers here are mutually consistent with each
>>>>>>other, given the error margins quoted. which should show those skeptical of this
>>>>>>statistical approach that it makes sense to do it this way, rather than to just
>>>>>>write "i measured speedup 3.1".
>>>>>>
>>>>>>point #2 is that the speedup on 4 CPUs on average is 3.17 in this test, which
>>>>>>might be one point for bob in the duel with vincent; although i suspect that the
>>>>>>speedup depends on the hardware architecture - i will leave this question to the
>>>>>>parallel computing experts though...
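The statistics in the post above can be sketched in a few lines of Python. This is only an illustration: it assumes each "+-" figure is a standard error of the mean over the per-position speedups (the post does not say how the margins were computed), and the helper names `mean_and_sem` and `sigma_apart` are hypothetical. The only numbers used are the run-level results quoted above.

```python
import math
import statistics

def mean_and_sem(speedups):
    """Mean and standard error of the mean for per-position speedups."""
    mean = statistics.mean(speedups)
    sem = statistics.stdev(speedups) / math.sqrt(len(speedups))
    return mean, sem

def sigma_apart(a, err_a, b, err_b):
    """Difference between two results in units of their combined error."""
    return abs(a - b) / math.hypot(err_a, err_b)

# Run-level results as quoted in the post: (speedup, error margin).
two_cpu = [(1.961, 0.093), (1.888, 0.074), (1.846, 0.078), (1.763, 0.084)]
four_cpu = [(3.15, 0.15), (3.29, 0.20), (3.06, 0.12), (3.19, 0.13)]

# Average of the four 4-CPU runs (the post quotes 3.17).
four_cpu_mean = statistics.mean([s for s, _ in four_cpu])

# Gap between the highest and lowest 2-CPU runs, in combined error margins.
worst_gap = sigma_apart(*two_cpu[0], *two_cpu[-1])

if __name__ == "__main__":
    print(f"4-CPU mean speedup: {four_cpu_mean:.2f}")
    print(f"largest 2-CPU gap: {worst_gap:.2f} combined error margins")
```

With these inputs the most distant pair of 2-CPU runs is roughly 1.6 combined error margins apart, which fits the post's point that the runs agree within their quoted errors.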
>>>>>
>>>>>Bob has tested the SMP version on 1 cpu versus the SMP version on 2 or 4 cpus.
>>>>>A true single-cpu version of crafty hardly exists, because of a stupid thread
>>>>>pointer which is a constant. With that optimized away, crafty is for sure 5%
>>>>>faster in time on a single cpu on the opteron.
>>>>
>>>>I don't understand that. What does that mean?
>>>>
>>>>regards
>>>>Andy
>>>
>>>In very simple words, to run in parallel you first slow down your program.
>>>Then the slowed-down program running in parallel, when compared to the
>>>slowed-down program on one cpu, gets the speedups that Bob reports.
>>
>>That is crap, but that is yet another bit of disinformation that we can work
>>around.  Assume that I could speed Crafty up by 5% without the tree pointer,
>>something that is not possible.  But it is an assumption.  Here is the BK data I
>>posted last night for two processors.  First as I computed last night, then
>>computed after reducing the 1cpu times by 5%, just to see what would happen, not
>>because I believe 5% is a reasonable number on the opteron.
>>
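>>2cpu normal data (columns: position, 1cpu time, then time/speedup for each of
>>four 2cpu runs, where speedup = 1cpu time / 2cpu time):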
>>
>>  2   34     51/0.67   25/1.36   28/1.21   35/0.97
>>  3  139     51/2.73   45/3.09   58/2.40   74/1.88
>>  4  154    106/1.45   84/1.83   83/1.86   84/1.83
>>  5  175    112/1.56  105/1.67  114/1.54  109/1.61
>>  6  145     69/2.10   70/2.07   74/1.96   70/2.07
>>  7  110     65/1.69   71/1.55  112/0.98  111/0.99
>>  8  115     60/1.92   66/1.74   58/1.98   60/1.92
>>  9  171    101/1.69  104/1.64  101/1.69   94/1.82
>> 10   95     45/2.11   43/2.21   38/2.50   41/2.32
>> 11   97     35/2.77   55/1.76   52/1.87   56/1.73
>> 12  147    100/1.47  113/1.30  107/1.37  114/1.29
>> 13  153    108/1.42   98/1.56   79/1.94   83/1.84
>> 14  137     75/1.83   88/1.56   81/1.69   87/1.57
>> 15   86     42/2.05   42/2.05   41/2.10   42/2.05
>> 16  141     78/1.81   78/1.81   78/1.81   77/1.83
>> 17   38     25/1.52   21/1.81   23/1.65   21/1.81
>> 18  154     95/1.62   60/2.57   91/1.69   72/2.14
>> 19  128     67/1.91   57/2.25   65/1.97   58/2.21
>> 20   96     66/1.45   63/1.52   66/1.45   66/1.45
>> 21  123     70/1.76   70/1.76   67/1.84   74/1.66
>> 22   98     46/2.13   48/2.04   45/2.18   45/2.18
>> 23  137     62/2.21   61/2.25  106/1.29  106/1.29
>> 24   87     45/1.93   43/2.02   39/2.23   44/1.98
>>average SU      1.82      1.89      1.79      1.76
>>
>>
>>OK.  Now after fudging the 1cpu time for your bogus 5% number:
>>
>>  2   32     51/0.63   25/1.28   28/1.14   35/0.91
>>  3  132     51/2.59   45/2.93   58/2.28   74/1.78
>>  4  146    106/1.38   84/1.74   83/1.76   84/1.74
>>  5  166    112/1.48  105/1.58  114/1.46  109/1.52
>>  6  137     69/1.99   70/1.96   74/1.85   70/1.96
>>  7  104     65/1.60   71/1.46  112/0.93  111/0.94
>>  8  109     60/1.82   66/1.65   58/1.88   60/1.82
>>  9  162    101/1.60  104/1.56  101/1.60   94/1.72
>> 10   90     45/2.00   43/2.09   38/2.37   41/2.20
>> 11   92     35/2.63   55/1.67   52/1.77   56/1.64
>> 12  139    100/1.39  113/1.23  107/1.30  114/1.22
>> 13  145    108/1.34   98/1.48   79/1.84   83/1.75
>> 14  130     75/1.73   88/1.48   81/1.60   87/1.49
>> 15   81     42/1.93   42/1.93   41/1.98   42/1.93
>> 16  133     78/1.71   78/1.71   78/1.71   77/1.73
>> 17   36     25/1.44   21/1.71   23/1.57   21/1.71
>> 18  146     95/1.54   60/2.43   91/1.60   72/2.03
>> 19  121     67/1.81   57/2.12   65/1.86   58/2.09
>> 20   91     66/1.38   63/1.44   66/1.38   66/1.38
>> 21  116     70/1.66   70/1.66   67/1.73   74/1.57
>> 22   93     46/2.02   48/1.94   45/2.07   45/2.07
>> 23  130     62/2.10   61/2.13  106/1.23  106/1.23
>> 24   82     45/1.82   43/1.91   39/2.10   44/1.86
>>average SU      1.72      1.79      1.70      1.66
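For readers who want to recompute the tables, here is a minimal Python sketch of the arithmetic as it can be read off the rows: speedup is the 1cpu time divided by the multi-cpu time, and the "5%" table appears to use the 1cpu time scaled by 0.95 and truncated to a whole number (an inference from the printed values, not something stated in the post). Only the first three positions are included, and the function names are hypothetical.

```python
# First three rows of the 2cpu table: position -> (1cpu time, four 2cpu times).
ROWS = {
    2: (34, [51, 25, 28, 35]),
    3: (139, [51, 45, 58, 74]),
    4: (154, [106, 84, 83, 84]),
}

def speedups(t1, times):
    """Speedup of each parallel run: 1cpu time / parallel time, 2 decimals."""
    return [round(t1 / t, 2) for t in times]

def reduce_by_5_percent(t1):
    """Scale a 1cpu time by 0.95, truncating (assumed from the printed table)."""
    return t1 * 95 // 100

def column_average(rows, run, adjust=False):
    """Arithmetic mean of one run's speedups over the given rows."""
    values = []
    for t1, times in rows.values():
        base = reduce_by_5_percent(t1) if adjust else t1
        values.append(base / times[run])
    return sum(values) / len(values)

if __name__ == "__main__":
    for pos, (t1, times) in ROWS.items():
        print(pos, t1, speedups(t1, times))
        adj = reduce_by_5_percent(t1)
        print(pos, adj, speedups(adj, times))
```

The "average SU" lines look consistent with a plain arithmetic mean of the per-position speedups in each column, which is what `column_average` computes for whatever rows it is given.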
>>
>>
>>Do you like those numbers better??
>>
>>They are still right in line with my formula, and this is for the BK test data.
>>I don't have the raw times in a useful form for the CB positions, where the
>>speedup was actually a fair bit better.
>>
>>Here is 4cpu with normal data (columns: position, 1cpu time, then
>>time/speedup for each of four 4cpu runs):
>>
>>  2   34     26/1.31   27/1.26   18/1.89   18/1.89
>>  3  139     54/2.57   29/4.79   75/1.85   75/1.85
>>  4  154     49/3.14   46/3.35   52/2.96   52/2.96
>>  5  175     71/2.46   53/3.30   58/3.02   58/3.02
>>  6  145     34/4.26   33/4.39   51/2.84   51/2.84
>>  7  110     61/1.80   73/1.51   43/2.56   43/2.56
>>  8  115     37/3.11   39/2.95   35/3.29   35/3.29
>>  9  171     67/2.55   37/4.62   41/4.17   41/4.17
>> 10   95     42/2.26   28/3.39   40/2.38   40/2.38
>> 11   97     30/3.23   27/3.59   32/3.03   32/3.03
>> 12  147     77/1.91   55/2.67   63/2.33   63/2.33
>> 13  153     55/2.78   56/2.73   40/3.83   40/3.83
>> 14  137     47/2.91   42/3.26   39/3.51   39/3.51
>> 15   86     26/3.31   26/3.31   25/3.44   25/3.44
>> 16  141     51/2.76   50/2.82   47/3.00   47/3.00
>> 17   38     12/3.17   13/2.92   13/2.92   13/2.92
>> 18  154     50/3.08   50/3.08   79/1.95   79/1.95
>> 19  128     38/3.37   38/3.37   30/4.27   30/4.27
>> 20   96     30/3.20   36/2.67   25/3.84   25/3.84
>> 21  123     42/2.93   44/2.80   43/2.86   43/2.86
>> 22   98     24/4.08   24/4.08   25/3.92   25/3.92
>> 23  137     76/1.80   61/2.25   46/2.98   46/2.98
>> 24   87     31/2.81   32/2.72   33/2.64   33/2.64
>>average SU      2.82      3.12      3.02      3.02
>>
>>
>>Here is the same table with the 1cpu time reduced by your mythical 5%.
>>
>>  2   32     26/1.23   27/1.19   18/1.78   18/1.78
>>  3  132     54/2.44   29/4.55   75/1.76   75/1.76
>>  4  146     49/2.98   46/3.17   52/2.81   52/2.81
>>  5  166     71/2.34   53/3.13   58/2.86   58/2.86
>>  6  137     34/4.03   33/4.15   51/2.69   51/2.69
>>  7  104     61/1.70   73/1.42   43/2.42   43/2.42
>>  8  109     37/2.95   39/2.79   35/3.11   35/3.11
>>  9  162     67/2.42   37/4.38   41/3.95   41/3.95
>> 10   90     42/2.14   28/3.21   40/2.25   40/2.25
>> 11   92     30/3.07   27/3.41   32/2.88   32/2.88
>> 12  139     77/1.81   55/2.53   63/2.21   63/2.21
>> 13  145     55/2.64   56/2.59   40/3.62   40/3.62
>> 14  130     47/2.77   42/3.10   39/3.33   39/3.33
>> 15   81     26/3.12   26/3.12   25/3.24   25/3.24
>> 16  133     51/2.61   50/2.66   47/2.83   47/2.83
>> 17   36     12/3.00   13/2.77   13/2.77   13/2.77
>> 18  146     50/2.92   50/2.92   79/1.85   79/1.85
>> 19  121     38/3.18   38/3.18   30/4.03   30/4.03
>> 20   91     30/3.03   36/2.53   25/3.64   25/3.64
>> 21  116     42/2.76   44/2.64   43/2.70   43/2.70
>> 22   93     24/3.88   24/3.88   25/3.72   25/3.72
>> 23  130     76/1.71   61/2.13   46/2.83   46/2.83
>> 24   82     31/2.65   32/2.56   33/2.48   33/2.48
>>average SU      2.67      2.96      2.86      2.86
>>
>>2.7, 3.0, 2.9, 2.9.  Do you like those numbers better?  Do they prove your
>>"point" whatever that might be?  The numbers drop by just over .1 on each test.
>>The CB positions would average 3.0 rather than 3.1 as calculated by Martin.
>>
>>Or, perhaps, this is all nonsense and I should reduce them by 20%.  I.e., pick a
>>number to get the speedup down to where you want it.  Or use the real number
>>which used to be about 3% on Intel, probably less on opteron with more
>>registers.  And just deal with a number you want to pretend I can't possibly
>>reach...
>>
>>Your choice.
>>
>>Real data?  Or your imaginary stuff?
>>
>>I prefer "real".
>>
>>
>>
>>>
>>>However this is not fair.
>>>
>>>In diep i just compare the single cpu version versus the parallel version of
>>>diep.
>>
>>What are your speedup numbers?  Where is the data?
>>
>>
>>>
>>>Other good examples of unfair comparisons are what Chrilly Donninger is posting
>>>about Hydra.
>>>
>>>Hydra does not use hashtables in the last 6 plies: 3 plies in hardware and
>>>3 plies in software.
>>>
>>>He compares 1 cpu not doing the last 6 plies in hardware versus 16 cpus not
>>>doing the last 6 plies in hardware.
>>>
>>>That is not fair, however; the *only* reason not to use the hashtable in the
>>>last 3 plies in software is that it would not run well in parallel.
>>>
>>>However, on a single cpu it does run well using the hashtable there.
>>>
>>
>>
>>So?  Suppose there was something that could be done in the parallel version but
>>not in the serial version.  Is it fair to compare?  Or should the serial version
>>be modified the same way to look worse?
>>
>>You are wasting time worrying about some mythical speedup number.  The rest of
>>us just care "how much does another processor help?"
>>
>>
>>
>>
>>
>>>This is a very common trick in computer chess, and some are very bad about it.
>>>For example, cilkchess was slowed down by a factor of 40, from about 200k nps
>>>to 5k nps, in order to run better in parallel.
>>>
>>>Then it shows up with 500 processors somewhere, or even, in 1995, with
>>>about 1800 processors.
>>>
>>>But it is losing a factor of 40 somewhere to start with.
>>>
>>>Is it fair to compare a slowed-down program versus n processors?
>>>
>>>I do not think so. I find it a very bad comparison.
>>
>>At that extreme, perhaps.  But you always suppose fraud.  With no evidence.  Do
>>you _know_ what they did in slowing it down?  Do you really know if they slowed
>>it down that much?  It doesn't sound reasonable.  It smells of speculation and
>>guessing.
>>
>>
>>
>>>
>>>I also can get a much better speedup with diep when slowing it down first.
>>
>>Just show _any_ numbers.  Anything is better than what you have shown so far...
>>Even if it is bogus...



Last modified: Thu, 15 Apr 21 08:11:13 -0700
