Author: Vincent Diepeveen
Date: 11:20:01 10/29/03
On October 28, 2003 at 23:21:55, Robert Hyatt wrote:

>On October 28, 2003 at 18:12:16, Vincent Diepeveen wrote:
>
>>On October 28, 2003 at 09:48:52, Robert Hyatt wrote:
>>
>>>On October 27, 2003 at 21:23:13, Vincent Diepeveen wrote:
>>>
>>>>On October 27, 2003 at 20:09:55, Eugene Nalimov wrote:
>>>>
>>>>>On October 27, 2003 at 20:00:54, Robert Hyatt wrote:
>>>>>
>>>>>>On October 27, 2003 at 19:57:12, Eugene Nalimov wrote:
>>>>>>
>>>>>>>On October 27, 2003 at 19:24:10, Peter Skinner wrote:
>>>>>>>
>>>>>>>>On October 27, 2003 at 19:06:51, Eugene Nalimov wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>I don't think you should be afraid. 500 CPUs is not enough -- you need
>>>>>>>>>reasonable good program to run on them.
>>>>>>>>>
>>>>>>>>>Thanks,
>>>>>>>>>Eugene
>>>>>>>>
>>>>>>>>I would bet on Crafty with 500 processors. That is for sure. I know it is quite
>>>>>>>>a capable program :)
>>>>>>>>
>>>>>>>>Peter.
>>>>>>>
>>>>>>>Efficiently utilizing 500 CPUs is *very* non-trivial task. I believe Bob can do
>>>>>>>it, but it will be nor quick nor easy.
>>>>>>>
>>>>>>>Thanks,
>>>>>>>Eugene
>>>>>>
>>>>>>
>>>>>>If the NUMA stuff doesn't swamp me. And if your continual updates to the
>>>>>>endgame tables doesn't swamp me. We _might_ see some progress here. :)
>>>>>>
>>>>>>If I can just figure out how to malloc() the hash tables reasonably on your
>>>>>>NUMA platform, without wrecking everything, that will be a step...
>>>>>
>>>>>Ok, just call the memory allocation function exactly where you are calling it
>>>>>now, and then let the user issue "mt" command before "hash" and "hashp" if (s)he
>>>>>want good scaling.
>>>>>
>>>>>Thanks,
>>>>>Eugene
>>>>
>>>>That's why i'm multiprocessing. All problems solved at once :)
>>>
>>>
>>>And several added. Duplicate code. Duplicate LRU egtb buffers. Threads
>>
>>Duplicate code is good. Duplicate indexation egtb tables is good too (note the
>>DIEP ones do not require 200MB for 6 men, but a few hundreds of KB only).
>>
>
>wanna compare access speeds for decompression on the fly? If you make
>the indices smaller, you take a big speed hit. It is a trade-off.

Not really. Compressed, I need around 500MB for all 5 men. Nalimov needs 7.5GB.
Which is more compact? The same holds for 6 men. I might need more entries
(5.22T in total) where Nalimov needs a bit fewer, but I would never spend 2 bytes
on each entry, so the direct saving is already bigger.

I proposed to Nalimov a scheme where you store just 'mate', mate in 15, mate in
16, etc. That is, all mates in 0..12 and -1..-12 are stored as plain 'mate' or
'-mate'. The EGTBs compress a lot better then. This works because any engine can
calculate a mate in 12 to 15 without problems (DIEP can, at least).

>
>>Everything that's done local is better of course. By starting the entire process
>>local at a cpu you have that garantuee very sure. With multithreading you never
>>know what surprise hits you :)
>>
>>>are not necessarily bad here. We're hitting 6.75M+ nodes per second on a quad
>>
>>That's *very* good.
>>
>>>opteron at 1.8ghz. That's not bad. I'll post some output when everything is
>>>cleaned up and finalized, particularly allocating the hash tables.
>>
>>Please do so. Would be cool to know some speedups as well using say a 400MB
>>hashtable.
>>
>>Also which kernel do you use?
>
>Eugene ran the tests, so you _know_ which kernel he used.

Windows. Ah, and a better compiler. That explains your speed.

>>Default linux kernel at quad opteron sucked ass when i applied latency tests to
>>it. There is a few special patched kernels. Not default patched but by certain
>>manufacturers.
>>
>>Like SGI.
>>
>>If you find a kernel that's real good NUMA keep us up to date here in CCC which
>>one.
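
To illustrate the 'mate' / '-mate' idea from my reply above, here is a rough sketch in Python. It is not DIEP's or Nalimov's code or file format. The names `collapse`, `to_byte` and `demo_sizes`, the one-byte encoding, and the random test table are all made up for the example, the cutoff of 12 is simply taken from the 0..12 / -1..-12 range mentioned above, and draws are ignored. It only shows that a table with fewer distinct values tends to compress to fewer bytes.

```python
import random
import zlib

CUTOFF = 12  # assumed threshold, taken from the 0..12 / -1..-12 range above


def collapse(dtm, cutoff=CUTOFF):
    """Replace short mates by a generic marker; keep longer ones exact."""
    if 0 <= dtm <= cutoff:
        return "mate"
    if -cutoff <= dtm < 0:
        return "-mate"
    return dtm


def to_byte(symbol):
    """Toy one-byte encoding (hypothetical, not any real EGTB format)."""
    if symbol == "mate":
        return 0x7F
    if symbol == "-mate":
        return 0x80
    return symbol & 0xFF  # small signed distance as two's complement


def demo_sizes(n=10000, seed=0):
    """Return (compressed size of exact table, compressed size of collapsed table)."""
    rng = random.Random(seed)
    # Synthetic distances in -30..30; real tables are far from random.
    table = [rng.randint(-30, 30) for _ in range(n)]
    exact = bytes(to_byte(v) for v in table)
    collapsed = bytes(to_byte(collapse(v)) for v in table)
    return len(zlib.compress(exact)), len(zlib.compress(collapsed))
```

Calling `demo_sizes()` returns the two compressed sizes in bytes. Real tablebases have much more structure than random numbers, so the actual gain will differ, and the engine still has to search out the exact distance for every position stored only as 'mate' or '-mate'.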
Last modified: Thu, 15 Apr 21 08:11:13 -0700