Computer Chess Club Archives


Subject: Re: parallel scaling

Author: Robert Hyatt

Date: 20:21:55 10/28/03



On October 28, 2003 at 18:12:16, Vincent Diepeveen wrote:

>On October 28, 2003 at 09:48:52, Robert Hyatt wrote:
>
>>On October 27, 2003 at 21:23:13, Vincent Diepeveen wrote:
>>
>>>On October 27, 2003 at 20:09:55, Eugene Nalimov wrote:
>>>
>>>>On October 27, 2003 at 20:00:54, Robert Hyatt wrote:
>>>>
>>>>>On October 27, 2003 at 19:57:12, Eugene Nalimov wrote:
>>>>>
>>>>>>On October 27, 2003 at 19:24:10, Peter Skinner wrote:
>>>>>>
>>>>>>>On October 27, 2003 at 19:06:51, Eugene Nalimov wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>I don't think you should be afraid. 500 CPUs is not enough -- you need a
>>>>>>>>reasonably good program to run on them.
>>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Eugene
>>>>>>>
>>>>>>>I would bet on Crafty with 500 processors. That is for sure. I know it is quite
>>>>>>>a capable program :)
>>>>>>>
>>>>>>>Peter.
>>>>>>
>>>>>>Efficiently utilizing 500 CPUs is a *very* non-trivial task. I believe Bob
>>>>>>can do it, but it will be neither quick nor easy.
>>>>>>
>>>>>>Thanks,
>>>>>>Eugene
>>>>>
>>>>>
>>>>>If the NUMA stuff doesn't swamp me.  And if your continual updates to the
>>>>>endgame tables don't swamp me.  We _might_ see some progress here.  :)
>>>>>
>>>>>If I can just figure out how to malloc() the hash tables reasonably on your
>>>>>NUMA platform, without wrecking everything, that will be a step...
>>>>
>>>>Ok, just call the memory allocation function exactly where you are calling it
>>>>now, and then let the user issue the "mt" command before "hash" and "hashp" if
>>>>(s)he wants good scaling.
>>>>
>>>>Thanks,
>>>>Eugene
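(For anyone wondering why "mt" has to come before "hash" here: on most NUMA
operating systems, a physical page is placed on the memory node of the CPU that
first writes it -- "first touch."  So if each search thread zeroes its own slice
of the freshly malloc'ed hash table, those pages land in that thread's local
memory.  A minimal sketch of the idea, with hypothetical names -- this is not
Crafty's actual code:)

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NTHREADS 4

typedef struct {
    char *table;   /* shared hash table */
    size_t size;   /* total size in bytes */
    int id;        /* this thread's index */
} slice_args;

/* The [begin, end) byte range thread 'id' should first-touch. */
size_t slice_begin(size_t size, int nthreads, int id) {
    return size / nthreads * id;
}
size_t slice_end(size_t size, int nthreads, int id) {
    return (id == nthreads - 1) ? size : size / nthreads * (id + 1);
}

/* Each thread writes its own slice so those pages are placed on its node. */
void *first_touch(void *p) {
    slice_args *a = (slice_args *)p;
    size_t b = slice_begin(a->size, NTHREADS, a->id);
    size_t e = slice_end(a->size, NTHREADS, a->id);
    memset(a->table + b, 0, e - b);
    return NULL;
}

char *alloc_numa_hash(size_t size) {
    char *table = malloc(size);        /* no physical pages placed yet */
    pthread_t tid[NTHREADS];
    slice_args args[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        args[i] = (slice_args){ table, size, i };
        pthread_create(&tid[i], NULL, first_touch, &args[i]);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return table;
}
```

(This is also why the thread count has to be fixed before the table is
allocated: the slicing depends on it.)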
>>>
>>>That's why I'm multiprocessing. All problems solved at once :)
>>
>>
>>And several added.  Duplicate code.  Duplicate LRU egtb buffers.  Threads
>
>Duplicate code is good. Duplicate egtb index tables are good too (note the
>DIEP ones do not require 200MB for 6 men, but only a few hundred KB).
>

Wanna compare access speeds for decompression on the fly?  If you make
the indices smaller, you take a big speed hit.  It is a trade-off.
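A toy run-length scheme shows the shape of the trade-off: a full-size table
answers a probe with one array access, while a compressed one pays a scan over
runs on every probe.  This is an illustration only -- not how Crafty's or
DIEP's tablebase indexing actually works:

```c
#include <stddef.h>

/* Full table: one O(1) probe, but large in memory. */
int probe_direct(const unsigned char *table, size_t i) {
    return table[i];
}

/* Run-length compressed table: pairs of (value, count).  Far less
 * memory, but each probe scans runs until the position is reached --
 * more work per access. */
typedef struct { unsigned char value; size_t count; } run;

int probe_rle(const run *runs, size_t nruns, size_t i) {
    for (size_t r = 0; r < nruns; r++) {
        if (i < runs[r].count)
            return runs[r].value;
        i -= runs[r].count;
    }
    return -1;   /* position out of range */
}
```

Both probes return the same values; the compressed one just trades memory
footprint for time per lookup.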


>Everything that's done locally is better of course. By starting the entire
>process locally on a CPU you have that guarantee. With multithreading you never
>know what surprise hits you :)
>
>>are not necessarily bad here.  We're hitting 6.75M+ nodes per second on a quad
>
>That's *very* good.
>
>>Opteron at 1.8GHz.  That's not bad.  I'll post some output when everything is
>>cleaned up and finalized, particularly allocating the hash tables.
>
>Please do so. It would be cool to know some speedups as well, using, say, a
>400MB hash table.
>
>Also which kernel do you use?

Eugene ran the tests, so you _know_ which kernel he used.  Windows.




>
>The default Linux kernel on a quad Opteron performed terribly when I applied
>latency tests to it. There are a few specially patched kernels, patched not by
>default but by certain manufacturers.
>
>Like SGI.
>
>If you find a kernel that's really good at NUMA, keep us up to date here in
>CCC about which one.


