Computer Chess Club Archives


Subject: Re: Crafty and NUMA

Author: Vincent Diepeveen

Date: 09:17:45 09/03/03


On September 03, 2003 at 12:04:01, Robert Hyatt wrote:

>On September 03, 2003 at 05:13:07, Mridul Muralidharan wrote:
>
>>Hi,
>>
>>
>>On September 02, 2003 at 18:37:05, Jeremiah Penery wrote:
>>
>>>On September 02, 2003 at 07:15:55, Mridul Muralidharan wrote:
>>>
>><snip>
>>>
>>>If I didn't have some idea what I was talking about, I wouldn't be talking,
>>>unlike a lot of people in these discussions.
>>>
>>
>>This should give you some direction to think about :
>>http://www.talkchess.com/forums/1/message.html?313791
>>
>>Actually , more references could be given - but considering your set mindset, no
>>thanks :)
>>
>>>> Refer to Cray architecture , an Opteron 8-way box architecture , and some
>>>>IBM supercomputer cc-NUMA based system architecture docs for more info. I'm not
>>>
>>>Those machines are designed and built for *completely* different purposes.  You
>>>might as well compare the documentation for a P4 to that of an UltraSPARC, for
>>>all the good it would do you.
>>>
>>
>>If you say a cc-NUMA is built for an entirely different purpose - definitely , I
>>agree with you ! Like Bob Hyatt mentions in the above-mentioned post -
>>performance / price / scalability matrix works quite well for NUMA at higher
>>number of processors.
>>Which is exactly what I said - no point in saying Crafty (or any other program
>>for that matter) will scale well on a 16 or 64 proc NUMA box just 'cos it scales
>>well on a 4 proc SMP box.
>>NUMA machines are a slightly different breed.
>>
>>>>referring to just theoretical differences , or _only_ architecture differences -
>>>>but as a programmer - what details need to be taken care of while writing
>>>>apps for such a system.
>>>
>>>And those details would be what, other than the aforementioned theoretical or
>>>architectural differences?
>>>
>>
>>Quite simple - on an SMP or Cray box , typically you do not care much about
>>latency for accessing memory being different for different processors , etc. As
>>a programmer , you have to be aware of all these.
>>Why do you think Linux on NUMA sucks ass ?!!!
>>Also , depending on how the box is configured , number of procs per node , etc -
>>your memory management , thread/process splitting , etc (for a chess program that
>>is) will have to be modified.
>>Just because you might know what the architecture of the box is , does not imply
>>that you will come up with a program which scales well on NUMA !
>>
>>
>>>>>But in reality, almost nobody uses a machine that big, especially for chess.
>>>>
>>>>The question was - can it be done , is it just a bunch of tweaks - not do you
>>>>have a system.
>>>>Answer : Yes it can be done , needs a lot of rewriting - not just "tweaks".
>>>
>>>Not really.  Bob said he already completed the changes, and it didn't really
>>>involve much.  Only instead of forking processes he had to manually start
>>>processes on each processor.  That really doesn't take much work.
>>>
>>
>>
>>If it was just a bunch of tweaks that you mention here - I would love to see how
>>much performance it will give on a 64/128 proc NUMA box :)
>>I can make a guess - it will suck a**. (No offence to anyone here)
>
>You are mixing apples and oranges.  How will it do on a 128 node SMP
>box?  How will it do on a 128 node NUMA box?  _both_ will not do very well
>since things are not tuned for that many processors.  However, the original
>NUMA port did pretty well on a 32 CPU box.  Not as well as it would have done
>on a 32 CPU SMP box however.  But then NUMA won't _ever_ produce the same
>level of performance as pure SMP boxes will.  They are just much more
>affordable.

Bob, show that 32 CPU output and the Crafty version number with it.

Thanks,
Vincent

>
>>
>>
>>>>>For any but the most extremely scalable architectures, there is significant
>>>>>diminishing returns when adding processors for chess playing.  I'd say that a
>>>>>very scalable 8-way SMP or NUMA (Opteron) machine will not be very much slower
>>>>>than even a 64-way Alpha/Itanium/xxx machine for chess.
>>>>
>>>>If badly programmed , then yes not much difference between an 8 proc box and a 64
>>>>proc box (actually it can be lower performing!).
>>>>Which is exactly my point , you need to design a program specifically to run on
>>>>such a system - not take something that works on a 2 or 4 proc system and
>>>>expect it to work for a 64 proc system !
>>>
>>>The Alpha-Beta algorithm used for chess is a serial algorithm.  There's no
>>>getting around that.  The more processors you use, the less efficiency you will
>>>get, unless you use something other than Alpha-Beta.
>>>
>>>No matter how much you want to rewrite and "tweak" for a NUMA machine (or any
>>>kind of machine, for that matter), adding more and more processors is simply
>>>going to stop being beneficial at some point.
>>
>>
>>Just because alpha-beta is serial does not imply that it need not scale well
>>beyond the 4 or 8 or 16 proc boxes that it is shown to scale well to.
>>I _have_ seen results of how well it scales :)
>>sadly I'm not at liberty to reveal them - but in a few months/next year or so ,
>>you will also see how well it scales when results are published.
>>I'm not denying the limitations of the alpha-beta algorithm - definitely they exist - but
>>not to the extent to which it is believed to exist.
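
For readers following the alpha-beta point quoted above, here is a minimal sketch of why splitting the search costs efficiency. It is not code from Crafty or from any program mentioned in this thread. It assumes a synthetic 4-ary tree scored by a made-up `leaf_value` function, and every name in it is hypothetical. The serial search lets each root move use the bound raised by the moves searched before it, while `root_split_search` gives every root move the full window, as if the moves ran on separate processors that never exchange bounds. Both return the same score, but the split version can never visit fewer leaves.

```python
INF = 10**9


def leaf_value(node_id):
    # Hypothetical evaluation: a fixed pseudo-random score in [-100, 100] per leaf.
    return (node_id * 2654435761 >> 7) % 201 - 100


def search(node_id, depth, alpha, beta, stats, branching=4):
    """Serial fail-soft negamax alpha-beta over the synthetic tree."""
    if depth == 0:
        stats["leaves"] += 1
        return leaf_value(node_id)
    best = -INF
    for i in range(branching):
        child = node_id * branching + i + 1  # unique id for each tree node
        # Each child sees the bound raised by the siblings searched before it.
        score = -search(child, depth - 1, -beta, -max(alpha, best), stats, branching)
        if score > best:
            best = score
        if best >= beta:
            break  # cutoff: the remaining siblings cannot change the result
    return best


def root_split_search(depth, stats, branching=4):
    """Search every root move with the full window (depth must be >= 1).

    This models root moves handed to separate processors that do not
    share bounds; the loop here still runs them one after another.
    """
    best = -INF
    for i in range(branching):
        score = -search(i + 1, depth - 1, -INF, INF, stats, branching)
        best = max(best, score)
    return best


def compare(depth=6, branching=4):
    """Return (serial_score, serial_leaves, split_score, split_leaves)."""
    serial_stats = {"leaves": 0}
    split_stats = {"leaves": 0}
    serial_score = search(0, depth, -INF, INF, serial_stats, branching)
    split_score = root_split_search(depth, split_stats, branching)
    return serial_score, serial_stats["leaves"], split_score, split_stats["leaves"]


if __name__ == "__main__":
    print(compare(6))
```

Calling `compare(6)` returns both scores and both leaf counts. The gap between the two leaf counts is the extra work done because bounds were not shared, which is the overhead a parallel search has to keep small as processors are added.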



Last modified: Thu, 15 Apr 21 08:11:13 -0700
