Computer Chess Club Archives

Subject: Re: Quad scaling ?

Author: Robert Hyatt
Date: 08:27:15 08/25/05
On August 25, 2005 at 10:56:45, Vincent Diepeveen wrote:

>On August 24, 2005 at 23:27:39, Robert Hyatt wrote:
>
>>On August 24, 2005 at 21:52:53, Eelco de Groot wrote:
>>
>>>On August 24, 2005 at 14:50:46, Thomas Logan wrote:
>>>
>>>>On August 24, 2005 at 12:05:09, Vincent Diepeveen wrote:
>>>>
>>>>>On August 24, 2005 at 08:59:28, Thomas Logan wrote:
>>>>>
>>>>>>On August 24, 2005 at 08:47:33, Vincent Diepeveen wrote:
>>>>>>
>>>>>>>On August 24, 2005 at 08:20:25, Thomas Logan wrote:
>>>>>>>
>>>>>>>>Does anyone have scaling figures for various deep programs
>>>>>>>>
>>>>>>>>and systems with 2 dual core processors
>>>>>>>>
>>>>>>>>Tom
>>>>>>>
>>>>>>>hi, i just started a test at my k7 single cpu machine
>>>>>>>to compare an output created at a quad dual core 1.8Ghz.
>>>>>>>
>>>>>>>The test is over 213 positions and statistically significant.
>>>>>>>
>>>>>>>I expect results within 2 weeks.
>>>>>>>
>>>>>>>You can calculate what time it takes 70 minutes * 213 positions.
>>>>>>>
>>>>>>>one thing already seems sure:
>>>>>>>
>>>>>>>x86-64 has no scaling problems with big hashtables, x86 has.
>>>>>>>
>>>>>>>Vincent
>>>>>>
>>>>>>Hello Vincent
>>>>>>
>>>>>>Thank you
>>>>>>
>>>>>>Are you using Diep ?
>>>>>>
>>>>>>Any knowledge concerning Fritz, Junior or Shredder
>>>>>>Please post your results when obtained
>>>>>
>>>>>Shredder is scaling 3.3 at quad single core, so that'll be like scaling of 4 at
>>>>>dual core quad or so?
>>>>>
>>>>>junior was single core and fritz will not be scaling well either (deepfritz8).
>>>>>
>>>>>We know all this already from 8 cpu Xeon machines in fact. See results donninger
>>>>>posted once.
>>>>>
>>>>>If you don't run well at 8 cpu xeon then forget dual core.
>>>>>
>>>>So you believe Shredder scales fairly well on a quad
>>>>
>>>>But regarding Junior and Fritz I of course meant the deep versions of Junior and
>>>>Fritz
>>>>
>>>>Any knowledge on these ?
>>>>
>>>>Thanks
>>>>
>>>>Tom
>>>
>>>Hello Thomas,
>>>
>>>I think what Vincent meant was that Junior played on a single core 4 processor
>>>machine at the WCCC. That must have been Deep Junior, or it could not have used
>>>more than one processor, but actually I have not read anywhere what kind of
>>>processor Amir and Shay used, AMD or Intel?
>>>
>>>If I understand correctly, scaling involves the NPS numbers, which is not the
>>>same as ply depths per time unit, the actual speed-up?
>>
>>Actually this is a poor practice, but it is becoming common.
>>
>>Scaling actually refers to how well an application performs as the number of
>>processors increases.  With chess, NPS is one type of scaling, but it isn't the
>>accurate number.  The actual time-to-solution is the right way to measure
>>scaling, because most programs can come pretty close to perfect NPS scaling,
>>unless they run into NUMA issues they don't handle, but getting a reasonable
>>speedup is another issue altogether...
>
>Scaling is very important when you have a big number of cpu's.
>Of course you have little experience above 16 cpu's, so we will
>forgive you the naivety here.

I would be willing to bet I have run on larger clusters than you have.  Not with
Crafty, since it doesn't work on clusters, but there _are_ other interesting
applications to work with.

The only one naive here is yourself.


>
>However at a big machine the first and most important thing is to get
>your program scale well at this big iron.
>
>If you do not even scale at a big machine, forget a good speedup
>in that case too.

You are simply misusing the terminology.

From "Parallel Programming" by Wilkinson and Allen

"Scalability.  Scalability is used to indicate a hardware design that allows the
system to be increased in size and in doing so obtaining increased system
performance.  This is called "architecture or hardware scalability".  It is also
used to indicate that a parallel algorithm can accommodate increased data size by
using increased system size."

So please stop trying to tell me, once again, that your incorrect definition is
the one that should be used.  NPS scalability is one measure, but it is not
_the_ measure used.  Clearly if you run more processors, you do more work in a
given unit of time.  But does that additional work translate into higher
performance?  That is what everyone calls scalability.
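
To make the distinction concrete, here is a rough sketch that computes both figures for the same run. The measurements are made up purely for illustration (they are not Crafty, Diep or Shredder results), and the function names are hypothetical. The point is only that NPS scaling and time-to-solution speedup are different ratios and need not agree.

```python
def nps_scaling(nps_one_cpu, nps_n_cpus):
    """Raw throughput ratio: how many times more nodes per second with N CPUs."""
    return nps_n_cpus / nps_one_cpu


def speedup(time_one_cpu, time_n_cpus):
    """Time-to-solution ratio: how many times sooner the same depth is reached."""
    return time_one_cpu / time_n_cpus


def efficiency(factor, n_cpus):
    """Fraction of the ideal N-fold gain actually obtained."""
    return factor / n_cpus


# Hypothetical measurements for one position searched to the same depth.
n_cpus = 4
nps_1, nps_4 = 1_000_000, 3_900_000   # nodes per second (assumed)
t_1, t_4 = 120.0, 40.0                # seconds to finish the depth (assumed)

print("NPS scaling:", nps_scaling(nps_1, nps_4))   # work done per unit time
print("Speedup    :", speedup(t_1, t_4))           # what the user actually gains
print("NPS efficiency    :", efficiency(nps_scaling(nps_1, nps_4), n_cpus))
print("Speedup efficiency:", efficiency(speedup(t_1, t_4), n_cpus))
```

With these assumed numbers the program searches almost four times as many nodes per second, yet reaches the answer only three times sooner, because part of the extra work is search overhead.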




>
>Assuming the YBW algorithm gets used, scaling is the most difficult problem to
>solve at big iron.
>
>The reason for this is that the speedup is usually not better than the scaling.

What a nonsensical statement.  It can't possibly be, using your incorrect
definition of scaling...



>
>Now diep of course is a slow program, so it's probably easier to let it scale
>well than a fast bitboard engine like cilkchess.
>


The speed of the engine has nothing to do with scaling.  Unless you only want to
measure NPS, which is not what scaling is really all about.





>I remember how at old single cpu 300Mhz cpu's cilkchess searched with the cilk
>frame 5000 nps, whereas without cilk it reached 200k nps.
>
>Diep single cpu searched 20k nps at those cpu's and at 460 cpu's it searched
>around 5.5 mln - 9.99 mln nps (that last in far endgame against Brutus).
>
>Scaling is very hard when a program, such as crafty, has a global lock.


You keep saying that, yet I keep getting 8x on 8-way boxes (using your NPS
scaling term here of course) and 16X on 16-way boxes.  I think the best we ever
got on an Itanium was 48X or so out of 64, but we didn't spend much time on that
particular architecture trying to tune...

The global lock is used typically 5000-6000 times in a 3 minute search.  It is
typically held for less than a millisecond.  More like a few microseconds today.
 That is exactly what percentage of that 180 seconds search time?  It isn't a
problem.  If it were, I'd change the approach.  But there's no need to change it
until it becomes a problem.
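
The arithmetic can be sketched as follows, using the figures above: up to 6000 acquisitions in a 180 second search, with a hold time of 1 millisecond as the upper bound. The 5 microsecond value is only an assumed stand-in for "a few microseconds", and the function name is hypothetical. This counts only the time the lock is held; it does not model time other threads might spend waiting for it.

```python
def lock_hold_fraction(acquisitions, hold_seconds, search_seconds):
    """Fraction of the search's wall-clock time during which the lock is held."""
    return acquisitions * hold_seconds / search_seconds


SEARCH_SECONDS = 180.0   # the 3 minute search from the post
ACQUISITIONS = 6000      # upper end of the 5000-6000 range from the post

# Upper bound: every acquisition holds the lock for a full millisecond.
worst = lock_hold_fraction(ACQUISITIONS, 1e-3, SEARCH_SECONDS)

# Assumed value for "a few microseconds": 5 us (not a measured number).
typical = lock_hold_fraction(ACQUISITIONS, 5e-6, SEARCH_SECONDS)

print(f"worst case  : {worst:.2%}")     # prints "worst case  : 3.33%"
print(f"microseconds: {typical:.4%}")   # prints "microseconds: 0.0167%"
```

Even at the millisecond upper bound the lock is held for about 6 seconds out of 180; at microsecond hold times it is a few hundredths of a second.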




>
>Actually no chance to get something with a global lock to work at a big super.
>
>So scaling matters really a lot.
>
>Only after you fixed scaling, the speedup is the most important thing.
>

Except they are the _same_ thing in everyone's vocabulary, except for yours...




>>
>>>
>>>Kevin Burcham posted some interesting numbers for Deep Fritz above
>>>http://www.talkchess.com/forums/1/message.html?444985
>>>
>>>If you convert that back to nodes per core per MHz that would roughly get the
>>>Fritz scaling numbers.
>>>
>>>Groeten,
>>>Eelco
Last modified: Thu, 15 Apr 21 08:11:13 -0700