Computer Chess Club Archives


Subject: Re: parallel scaling

Author: Robert Hyatt

Date: 14:37:08 10/29/03


On October 29, 2003 at 14:20:01, Vincent Diepeveen wrote:

>On October 28, 2003 at 23:21:55, Robert Hyatt wrote:
>
>>On October 28, 2003 at 18:12:16, Vincent Diepeveen wrote:
>>
>>>On October 28, 2003 at 09:48:52, Robert Hyatt wrote:
>>>
>>>>On October 27, 2003 at 21:23:13, Vincent Diepeveen wrote:
>>>>
>>>>>On October 27, 2003 at 20:09:55, Eugene Nalimov wrote:
>>>>>
>>>>>>On October 27, 2003 at 20:00:54, Robert Hyatt wrote:
>>>>>>
>>>>>>>On October 27, 2003 at 19:57:12, Eugene Nalimov wrote:
>>>>>>>
>>>>>>>>On October 27, 2003 at 19:24:10, Peter Skinner wrote:
>>>>>>>>
>>>>>>>>>On October 27, 2003 at 19:06:51, Eugene Nalimov wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>I don't think you should be afraid. 500 CPUs is not enough -- you need
>>>>>>>>>>a reasonably good program to run on them.
>>>>>>>>>>
>>>>>>>>>>Thanks,
>>>>>>>>>>Eugene
>>>>>>>>>
>>>>>>>>>I would bet on Crafty with 500 processors. That is for sure. I know it is quite
>>>>>>>>>a capable program :)
>>>>>>>>>
>>>>>>>>>Peter.
>>>>>>>>
>>>>>>>>Efficiently utilizing 500 CPUs is a *very* non-trivial task. I believe Bob can do
>>>>>>>>it, but it will be neither quick nor easy.
>>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Eugene
>>>>>>>
>>>>>>>
>>>>>>>If the NUMA stuff doesn't swamp me.  And if your continual updates to the
>>>>>>>endgame tables don't swamp me.  We _might_ see some progress here.  :)
>>>>>>>
>>>>>>>If I can just figure out how to malloc() the hash tables reasonably on your
>>>>>>>NUMA platform, without wrecking everything, that will be a step...
>>>>>>
>>>>>>Ok, just call the memory allocation function exactly where you are calling it
>>>>>>now, and then let the user issue the "mt" command before "hash" and "hashp" if (s)he
>>>>>>wants good scaling.
>>>>>>
>>>>>>Thanks,
>>>>>>Eugene
>>>>>
>>>>>That's why I'm multiprocessing. All problems solved at once :)
>>>>
>>>>
>>>>And several added.  Duplicate code.  Duplicate LRU egtb buffers.  Threads
>>>
>>>Duplicate code is good. Duplicate EGTB index tables are good too (note the
>>>DIEP ones do not require 200MB for 6 men, but only a few hundred KB).
>>>
>>
>>wanna compare access speeds for decompression on the fly?  If you make
>>the indices smaller, you take a big speed hit.  It is a trade-off.
>
>Not really. Compressed, I need around 500MB for all 5 men. Nalimov needs 7.5GB.
>
>What's more compact?

Let's compare apples to apples.  You are storing DTM for all 3-4-5 piece
files in 500MB?  You just set a new world record for size.

Aha.  You aren't storing DTM, you are storing W/L/draw?  Then the comparison
is not equal.

Either way, my statement stands...
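
To put rough numbers on why the two formats are not directly comparable, here is a minimal sketch of the raw information per position before any compression. It assumes win/loss/draw needs three distinct values and that a DTM table uses one byte per entry; both constants and the function name are illustrative assumptions, not the actual layout of either tablebase format.

```python
import math

def raw_bits_per_entry(distinct_values: int) -> float:
    """Bits needed per position before compression, given the number of distinct values."""
    return math.log2(distinct_values)

# Assumption: W/L/draw needs only three values per position.
WDL_VALUES = 3
# Assumption: a distance-to-mate entry is stored in one byte (256 values).
DTM_ONE_BYTE_VALUES = 256

wdl_bits = raw_bits_per_entry(WDL_VALUES)
dtm_bits = raw_bits_per_entry(DTM_ONE_BYTE_VALUES)
ratio = dtm_bits / wdl_bits

print(f"W/L/draw: {wdl_bits:.3f} bits per entry")
print(f"DTM:      {dtm_bits:.3f} bits per entry")
print(f"DTM carries about {ratio:.1f}x more raw bits per entry")
```

Under those assumptions the raw gap is roughly a factor of five before the compressor even starts, so a size comparison only means something when both sides store the same kind of value.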

>
>Same for 6 men. I might need more entries (5.22T in total) where Nalimov needs a
>bit fewer, but I would never store each entry in 2 bytes.

Neither does he...

>
>So the direct savings is bigger already.
>
>I proposed to Nalimov a scheme where you just store 'mate', 'mate in 15', 'mate in
>16', etc.
>
>So all mates in 0..12 and -1..-12 are stored simply as 'mate' or '-mate'.
>
>The EGTBs compress a lot better then, and any engine can calculate a
>mate in 12 to 15 without problems (DIEP can at least).

Most can, depending on the time control.  But that causes a problem...
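
For readers following the proposal quoted above, here is a minimal sketch of how such a collapsing scheme might look. The function names, the use of `None` for a draw, and the sign convention (positive means the side to move mates, negative means it gets mated) are assumptions made for illustration; the post does not specify an encoding.

```python
from typing import Iterable, Optional

def collapse_dtm(dtm: Optional[int], cutoff: int = 12) -> str:
    """Map a signed distance-to-mate to the symbol that would be stored.

    Assumed convention: None is a draw, dtm >= 0 means the side to move
    mates in dtm, dtm < 0 means the side to move is mated in -dtm.
    """
    if dtm is None:
        return "draw"
    if 0 <= dtm <= cutoff:
        return "mate"       # short wins collapse into one symbol
    if -cutoff <= dtm < 0:
        return "-mate"      # short losses collapse into one symbol
    # Longer mates keep their exact distance.
    return f"mate in {dtm}" if dtm > 0 else f"-mate in {-dtm}"

def distinct_symbols(values: Iterable[Optional[int]], cutoff: int = 12) -> int:
    """Count how many different symbols remain after collapsing."""
    return len({collapse_dtm(v, cutoff) for v in values})

sample = [None, 0, 3, 12, 13, -1, -12, -13]
print([collapse_dtm(v) for v in sample])
print(distinct_symbols(sample))
```

Fewer distinct symbols generally means longer runs of identical values, which is where the claimed compression gain would come from. The engine then has to find the exact short mate by search, which is the part that depends on the time control.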


>
>>
>>>Everything that's done local is better of course. By starting the entire process
>>>local to a CPU you have that guarantee for sure. With multithreading you never
>>>know what surprise hits you :)
>>>
>>>>are not necessarily bad here.  We're hitting 6.75M+ nodes per second on a quad
>>>
>>>That's *very* good.
>>>
>>>>Opteron at 1.8GHz.  That's not bad.  I'll post some output when everything is
>>>>cleaned up and finalized, particularly allocating the hash tables.
>>>
>>>Please do so. Would be cool to know some speedups as well using say a 400MB
>>>hashtable.
>>>
>>>Also which kernel do you use?
>>
>>Eugene ran the tests, so you _know_ which kernel he used.  Windows.
>
>Ah and a better compiler. That explains your speed.
>
>
>
>
>>>The default Linux kernel on a quad Opteron sucked ass when I applied latency tests to
>>>it. There are a few specially patched kernels. Not patched by default, but by certain
>>>manufacturers.
>>>
>>>Like SGI.
>>>
>>>If you find a kernel that handles NUMA really well, keep us up to date here in CCC
>>>on which one.


