Computer Chess Club Archives


Subject: Re: Quad scaling ?

Author: Robert Hyatt

Date: 11:44:53 08/25/05


On August 25, 2005 at 11:21:51, Uri Blass wrote:

>On August 25, 2005 at 10:56:45, Vincent Diepeveen wrote:
>
>>On August 24, 2005 at 23:27:39, Robert Hyatt wrote:
>>
>>>On August 24, 2005 at 21:52:53, Eelco de Groot wrote:
>>>
>>>>On August 24, 2005 at 14:50:46, Thomas Logan wrote:
>>>>
>>>>>On August 24, 2005 at 12:05:09, Vincent Diepeveen wrote:
>>>>>
>>>>>>On August 24, 2005 at 08:59:28, Thomas Logan wrote:
>>>>>>
>>>>>>>On August 24, 2005 at 08:47:33, Vincent Diepeveen wrote:
>>>>>>>
>>>>>>>>On August 24, 2005 at 08:20:25, Thomas Logan wrote:
>>>>>>>>
>>>>>>>>>Does anyone have scaling figures for various deep programs
>>>>>>>>>
>>>>>>>>>and systems with 2 dual core processors
>>>>>>>>>
>>>>>>>>>Tom
>>>>>>>>
>>>>>>>>Hi, I just started a test on my K7 single-CPU machine
>>>>>>>>to compare against output created on a quad dual-core 1.8GHz.
>>>>>>>>
>>>>>>>>The test is over 213 positions and is statistically significant.
>>>>>>>>
>>>>>>>>I expect results within 2 weeks.
>>>>>>>>
>>>>>>>>You can calculate how long it takes: 70 minutes * 213 positions.
>>>>>>>>
>>>>>>>>one thing already seems sure:
>>>>>>>>
>>>>>>>>x86-64 has no scaling problems with big hashtables; x86 does.
>>>>>>>>
>>>>>>>>Vincent
>>>>>>>
>>>>>>>Hello Vincent
>>>>>>>
>>>>>>>Thank you
>>>>>>>
>>>>>>>Are you using Diep ?
>>>>>>>
>>>>>>>Any knowledge concerning Fritz, Junior or Shredder
>>>>>>>Please post your results when obtained
>>>>>>
>>>>>>Shredder is scaling 3.3 at quad single core, so that'll be like scaling of 4 at
>>>>>>dual core quad or so?
>>>>>>
>>>>>>junior was single core and fritz will not be scaling well either (deepfritz8).
>>>>>>
>>>>>>We know all this already from 8 cpu Xeon machines in fact. See results donninger
>>>>>>posted once.
>>>>>>
>>>>>>If you don't run well at 8 cpu xeon then forget dual core.
>>>>>>
>>>>>So you believe Shredder scales fairly well on a quad?
>>>>>
>>>>>But regarding Junior and Fritz I of course meant the deep versions of Junior and
>>>>>Fritz
>>>>>
>>>>>Any knowledge on these ?
>>>>>
>>>>>Thanks
>>>>>
>>>>>Tom
>>>>
>>>>Hello Thomas,
>>>>
>>>>I think what Vincent meant was that Junior played on a single core 4 processor
>>>>machine on the WCCC. Deep Junior, or it could not have used more than one
>>>>processor, but actually I have not read anywhere what kind of processor Amir and
>>>>Shay used, AMD or Intel?
>>>>
>>>>If I understand correctly, scaling involves the NPS numbers, which is not the
>>>>same as ply depth per unit of time, the actual speed-up?
>>>
>>>Actually this is a poor practice, but it is becoming common.
>>>
>>>Scaling actually refers to how well an application performs as the number of
>>>processors increases.  With chess, NPS is one type of scaling, but it isn't the
>>>accurate number.  The actual time-to-solution is the right way to measure
>>>scaling, because most programs can come pretty close to perfect NPS scaling,
>>>unless they run into NUMA issues they don't handle, but getting a reasonable
>>>speedup is another issue altogether...
>>
>>Scaling is very important when you have a big number of cpu's.
>>Of course you have little experience above 16 cpu's, so we will
>>forgive you the naivety here.
>>
>>However, on a big machine the first and most important thing is to get
>>your program to scale well on this big iron.
>>
>>If you do not even scale on a big machine, forget a good speedup
>>in that case too.
>>
>>Assuming the YBW algorithm gets used, scaling is the most difficult problem to
>>solve at big iron.
>>
>>The reason for this is that the speedup is usually not better than the scaling.
>>
>>Now diep of course is a slow program, so it's probably easier to let it scale
>>well than a fast bitboard engine like cilkchess.
>>
>>I remember how at old single cpu 300Mhz cpu's cilkchess searched 5000 nps with
>>the cilk frame, whereas without cilk it reached 200k nps.
>>
>>Diep single cpu searched 20k nps at those cpu's and at 460 cpu's it searched
>>around 5.5 mln - 9.99 mln nps (that last in far endgame against Brutus).
>>
>>Scaling is very hard when a program, such as crafty, has a global lock.
>>
>>Actually no chance to get something with a global lock to work at a big super.
>>
>>So scaling matters really a lot.
>>
>>Only after you have fixed scaling does the speedup become the most important thing.
>
>I do not think that scaling or speedup is the most important thing.
>The most important thing is to have a good program.
>
>Fruit with a single processor did clearly better than most programs with many
>processors, so I think it may be better if programmers do not even think
>about parallel search while their program is weaker than the buggy WCCC Fruit.
>It is no secret that Fruit has search bugs: for example, it cannot solve fine70,
>and the fact that it extends every check and single reply by a full ply is also
>not a smart thing to do, and Fabien still had not fixed it for the WCCC.
>
>Uri


Of course that is true.  A crappy search, twice as fast, is just twice as
crappy.  So "decent program" is a given before considering a parallel search
implementation.  Although an experienced programmer would probably want to
design in the capability for parallel search from the ground up, as it is far
easier than stuffing it in way later in the development...
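
For anyone following the thread, the difference between NPS scaling and real time-to-solution speedup discussed above can be put into a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and the numbers in the example are invented, not measurements from any program mentioned here.

```python
# Sketch: NPS scaling vs. time-to-solution speedup.
# All names and numbers are hypothetical, for illustration only.

def nps_scaling(nps_one_cpu, nps_n_cpus):
    """Ratio of nodes per second with N CPUs to nodes per second with 1 CPU."""
    return nps_n_cpus / nps_one_cpu

def speedup(time_one_cpu, time_n_cpus):
    """Ratio of time-to-solution with 1 CPU to time-to-solution with N CPUs."""
    return time_one_cpu / time_n_cpus

def efficiency(speedup_value, n_cpus):
    """Fraction of the ideal N-fold speedup that was actually obtained."""
    return speedup_value / n_cpus

# Invented example: a 4-CPU run that searches 3.9x the nodes per second
# but only reaches the same depth 3x sooner.
example_scaling = nps_scaling(1_000_000, 3_900_000)
example_speedup = speedup(120.0, 40.0)
example_efficiency = efficiency(example_speedup, 4)

print("NPS scaling:", example_scaling)
print("Speedup:    ", example_speedup)
print("Efficiency: ", example_efficiency)
```

In this made-up case the NPS figure looks nearly perfect while the time-to-solution speedup is noticeably lower, which is why the two numbers should not be quoted interchangeably.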



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.