Author: Robert Hyatt
Date: 10:16:48 01/30/03
On January 29, 2003 at 23:31:19, Matthew Hull wrote:

>On January 29, 2003 at 23:20:11, Vincent Diepeveen wrote:
>
>>On January 29, 2003 at 12:06:50, Robert Hyatt wrote:
>>
>>Bob let me explain to you. DIEP is written for machines which have a bit slower
>>latency for global memory accesses. whereas the world champs 2002 version wasn't
>>like that and would probably act like crafty on that 8 processor, the end of
>>august 2002 versions and further are using a new type of parallellism which
>>doesn't need much locking. Each processor takes care of itself without hurting
>>bandwidth while searching too much.
>>
>>There is no dead slow global locks which is killing the 8 processor thing of
>>course.
>>
>>therefore it works great for example at cc-NUMA machines and all types of Xeon
>>machines.
>
>Wow dude. Impressive. Could you supply some time-to-ply benchmarks for Diep on
>8-way Xeon vis-a-vis 4-way Xeon. That would refute the proffessor like nothing
>else.
>
>Sincerely,
>Matt
>

Stand by for a hurricane caused by wild hand-waving...

>
>>
>>Now you have some examples of software written for fast latency shared memory
>>machines and then claim the thing is slower because the software isn't written
>>for such types of machines?
>>
>>That already should give you the answer. Writing parallel programs is 1 thing.
>>Writing something that works well without inventing numbers yourself is another
>>thing.
>>
>>>On January 29, 2003 at 11:38:37, Vincent Diepeveen wrote:
>>>
>>>>On January 28, 2003 at 10:33:15, Robert Hyatt wrote:
>>>>
>>>>>On January 28, 2003 at 09:07:35, Vincent Diepeveen wrote:
>>>>>
>>>>>>On January 28, 2003 at 03:33:44, Mig Greengard wrote:
>>>>>>
>>>>>>>According to the tech I talked with, Amir and Shay were testing both machines
>>>>>>>before the match to see which one they would use. To my knowledge it wasn't
>>>>>>>decided until a day or two before the match. Obviously there isn't a big
>>>>>>>difference in performance.
>>>>>>>
>>>>>>>Saludos, Mig
>>>>>>>http://www.chessninja.com
>>>>>>
>>>>>>thanks.
>>>>>>
>>>>>>DIEP onto the 8 processor 1.6 would be running 16 processes and speed would
>>>>>>be about expressed in K7:
>>>>>> 8 x 1.6 Ghz / 1.4 = 9 Ghz
>>>>>
>>>>>No it wouldn't. You haven't tried an 8-way intel box yet. It doesn't scale
>>>>>nearly as well as the 2-way and 4-way intel boxes do. The chipset for
>>>>>supporting 8 cpus is simply not very good...
>>>>
>>>>DIEP isn't demanding much bandwidth Bob in case you missed it, it works
>>>>great on a cc-NUMA machine too.
>>>
>>>It demands _enough_ bandwidth. My comment wasn't only about "crafty" It was
>>>about the 8-way boxes in general. I ran on a dell 8450, with 8 700mhz xeon
>>>processors, and it was about 1.5X faster than my box. And again, _not_ with
>>>Crafty. I ran 8 copies of the same thing on the 8450, 4 copies on the quad,
>>>and compared the total run times. The 8450 was only about 50% faster when it
>>>should be 100% based on clock...
>>>
>>>>
>>>>>The 8-way box using the same clock speed for the processors will only be about
>>>>>1.5X faster than the 4-way box, and that doesn't count parallel search overhead
>>>>>at all.
>>>>
>>>>That's not true. It's 8 times faster for good software. Of course there is
>>>>algorithmic loss but there is no sequential loss unless the software sucks,
>>>>to say it rude.
>>>
>>>Have you ever run on one? Of course not. I have. So your "that's not true"
>>>is simply nonsense... There are _plenty_ of good benchmarks that can be used
>>>to draw conclusions about the 8-way memory bottleneck problem.
>>>
>>>It _might_ be 8x faster if you can fit in the L2 cache (this machine had
>>>2mb of L2 per processor compared to my 1mb on my quad 700). But if you have
>>>any memory bandwidth at all, it has a problem. And a 8-probe hash table is
>>>more than enough to highlight the problem.
>>>
>>>>
>>>>Doesn't say that it is easy to make software that can handle the latencies.
>>>>
>>>>It sure isn't easy to make a chessprogram that is having a good speedup
>>>>(without a too big sequential loss first like Zugzwang which was slowed down
>>>>first like 100 times or so in order to then have a decent speedup at like
>>>>256 processors; 50% speedup even incredible much i would be *very* happy with
>>>>around 15% already).
>>>>
>>>>But it is possible to make.
>>>>
>>>>DIEP is such a program that shows it can. DIEP runs like the sun on 8 cpu's
>>>>(2 nodes quad SGI), even at the slowest partitions (slowest latency speeds
>>>>are of course at the biggest partitions: 512 cpu partition).
>>>>
>>>>A 8 processor Xeon is hell for pc software like Fritz, Junior, Crafty, but it
>>>>is very good for DIEP.
>>>>
>>>>Best regards,
>>>>Vincent
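
For anyone who wants to check the arithmetic being argued over, here is a rough sketch in Python using only the figures quoted in this thread. The function and variable names are made up for illustration, and the last estimate rests on an assumption nobody in the thread stated: that the 4-way box scales perfectly and that the 1.5X throughput ratio measured at 700 MHz would carry over unchanged to a 1.6 GHz 8-way machine.

```python
def scaling_efficiency(measured_speedup, ideal_speedup):
    """Fraction of the ideal speedup that was actually observed."""
    return measured_speedup / ideal_speedup


# Throughput test quoted above: 8 copies on the 8-way box versus
# 4 copies on the quad, same 700 MHz clock on both machines.
ideal_8_vs_4 = 8 / 4          # 2.0X expected "based on clock"
measured_8_vs_4 = 1.5         # "about 1.5X faster"
efficiency = scaling_efficiency(measured_8_vs_4, ideal_8_vs_4)

# The quoted estimate "8 x 1.6 Ghz / 1.4", which assumes all 8
# processors contribute fully. The 1.4 divisor is the poster's own
# Xeon-to-K7 conversion factor, taken as given here.
k7_ghz_perfect_scaling = 8 * 1.6 / 1.4

# ASSUMPTION (not from the thread): treat the 8-way box as 1.5X a
# perfectly scaling 4-way box, then apply the same conversion.
k7_ghz_with_measured_ratio = 4 * measured_8_vs_4 * 1.6 / 1.4

print(f"8-way vs 4-way efficiency: {efficiency:.0%}")
print(f"K7-equivalent, perfect scaling: {k7_ghz_perfect_scaling:.1f} GHz")
print(f"K7-equivalent, 1.5X ratio:      {k7_ghz_with_measured_ratio:.1f} GHz")
```

None of this settles whether a program with low memory traffic escapes the bottleneck. It only shows how far apart the two sets of numbers are: roughly 9.1 GHz under perfect scaling versus roughly 6.9 GHz if the measured 1.5X ratio applies.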
Last modified: Thu, 15 Apr 21 08:11:13 -0700