Author: Robert Hyatt
Date: 06:34:05 09/04/00
On September 04, 2000 at 02:37:13, Bernhard Bauer wrote:

>On September 03, 2000 at 23:02:51, Robert Hyatt wrote:
>
>>On September 03, 2000 at 20:13:57, Aaron Gordon wrote:
>>
>>>What about with channel bonding in linux? For example you could get 6 100mbit
>>>nic's per pc to have 0.6Gbit (or if you've got some serious cash then 6 1gbit
>>>NICs). As soon as I get some money I'm going to try to experiment with the small
>>>linux cluster I've got here, maybe go with three 100mbit nics per pc.
>>>Anyway, it's a thought.. should help the bandwidth problem a lil'..
>>
>>
>>No, sorry. Totally wrong idea. The PCI bus _is_ the problem. It can
>>sustain about 100mbytes/second. If you put 6 NICS in the machine, you
>>are talking about roughly 10mbytes/sec per nic and you really can't drive
>>the things at 100mbits/sec... I have been able to get roughly 70 mbits/sec
>>as an upper bound. You could go to something faster (giganet) but then you
>>run into the PCI bus limit, and trying to bond more than one of those will
>>result in bus saturation... at the 100mbytes/second limit...
>>
>>We are talking about the max bandwidth between the CPU and memory, which is
>>the bottleneck in the PC, and that is also where the C90 totally cooks the
>>PC. IE the C90 has 16 cpus at a 2ns clock cycle (500mhz roughly). In one
>>clock cycle, the machine can do 4 64 bit memory reads, and two 64 bit memory
>>write. If you multiply that out, that is 48 bytes per cycle, times 16 cpus,
>>which is 500,000,000 * 48 * 16. Compare that memory bandwidth to the PC
>>bandwidth and you see why the C90 came out at 30 million dollars when they
>>were first delivered. That is roughly 4 x 10^11 bytes per second...
>>
>>400,000,000,000 bytes per second. Think about that number for a second. 400
>>gigabytes per second... compared to 100 megabytes per second...
>>
>>:)
>>
>>Then you realize just how far the PCs have to go...
>
>A look at Jack J. Dongarra's benchmark gives the following line for the C90:
>Cray C90 (16 proc. 4.2 ns) 479 mflops 10780 mflops in TPP 15238 mflops
>theoretical peak.
>
>So it looks like that old machine was running at 4.2 nanoseconds.
>My Pc running at 450 Mhz gives 218 mflops when solving a system of linear
>equations with a size of 1000.
>
>However, mflops are pretty useless for chess programming.
>16 procs will not make a chessprogram 16 times faster, but maybe 10 times
>faster.
>
>Kind regards
>Bernhard

That is correct, now that I think about it. The XMP had a fastest version at 8.6ns, the YMP was 6 ns, the C90 was 4ns and the T90 was 2ns. I wrote the above from home and couldn't check the clock speed. Divide what I wrote by 2.0... Sorry...

As far as speeding things up, Cray Blitz ran almost exactly 12 times faster on the C90 (published in the JICCA a few years ago, the article on DTS tree splitting). It lost about 1/4 of the machine, but it was also a flattening curve, in that 32 would not be 24 times faster...
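
For anyone who wants to redo the arithmetic, here is a rough Python sketch. It uses only numbers quoted in this thread (4 reads plus 2 writes of 64 bits per cycle, 16 cpus, a PCI bus that sustains about 100 mbytes/sec, 12x speedup on 16 cpus). The function and constant names are made up for illustration, and the 4 ns clock is the rounded figure; the benchmark line quoted above says 4.2 ns.

```python
# Back-of-the-envelope check of the figures in this thread.
# All names here are illustrative; the inputs are the numbers quoted in the posts.

BYTES_PER_WORD = 8               # one 64-bit memory word
PCI_BYTES_PER_SEC = 100e6        # "about 100mbytes/second" sustained, per the post


def peak_bandwidth(clock_ns, reads_per_cycle=4, writes_per_cycle=2, cpus=16):
    """Peak memory bandwidth in bytes/second for the quoted C90 configuration."""
    cycles_per_sec = 1e9 / clock_ns
    bytes_per_cycle = (reads_per_cycle + writes_per_cycle) * BYTES_PER_WORD  # 48
    return cycles_per_sec * bytes_per_cycle * cpus


def parallel_efficiency(speedup, cpus):
    """Fraction of the machine doing useful work at a given speedup."""
    return speedup / cpus


original = peak_bandwidth(2.0)   # the figure as first posted (2 ns clock)
corrected = peak_bandwidth(4.0)  # after the "divide by 2.0" correction (4 ns clock)

print(f"2 ns clock: {original:.3e} bytes/sec")
print(f"4 ns clock: {corrected:.3e} bytes/sec")
print(f"corrected figure vs. PCI bus: {corrected / PCI_BYTES_PER_SEC:.0f}x")
print(f"efficiency at 12x on 16 cpus: {parallel_efficiency(12, 16):.0%}")
```

Even after the correction the gap is still more than three orders of magnitude, and 12x on 16 cpus is the "lost about 1/4 of the machine" figure.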
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.