Author: Vincent Diepeveen
Date: 05:27:03 07/09/03
On July 09, 2003 at 08:25:39, Vincent Diepeveen wrote:

>On July 09, 2003 at 01:19:00, Jeremiah Penery wrote:
>
>>On July 09, 2003 at 00:09:03, Vincent Diepeveen wrote:
>>
>>>On July 08, 2003 at 19:37:48, Jeremiah Penery wrote:
>>>
>>>>On July 08, 2003 at 08:37:49, Vincent Diepeveen wrote:
>>>>
>>>>>On July 08, 2003 at 00:33:09, Jeremiah Penery wrote:
>>>>>
>>>>>>Each chip consumes only about 140W, rather than Vincent's assertion of 150KW.
>>>>>
>>>>>the 125KW is for Cray 'processors' not fujitsu processors that are in the NEC
>>>>>machine.
>>>>>
>>>>>Ask bob i remember he quoted 500 kilowatt for a 4 processor Cray. So i divided
>>>>>that by 4.
>>>>
>>>>That 500KW was probably for the entire machine. Each processor probably
>>>
>>>Yes a 4 processor Cray.
>>>
>>>Just for your own understanding of what a cray is. it is NOT a processor.
>>>It is a big block of electronics put together. So no wonder it eats quite a bit
>>>more than the average cpu.
>>
>>Your own words: "the 125KW is for Cray 'processors'". But that is not the
>>truth.
>>
>>>Another major difference with Cray machines (using cray processor blocks) is
>>>typically not using too many processors, because all processors are cross
>>>connected with very fast connections. No clever routing system at all. Brute
>>>force.
>>
>>Earth Simulator:
>>
>>Each node of 8 processors is connected to 128 IN (Interconnected Network)
>>cabinets. Each of those cabinets is connected to each other processing nodes
>>(all 639 other nodes). Each of these connections is 12.3GB/s bi-directional.
>>Each IN cabinet has 2 640x640 crossbar switches to handle this. "Several
>>data-transfer modes, including access to three-dimensional (3D) sub-arrays and
>>indirect access modes, are realized in hardware. In an operation that involves
>>access to the data of a sub-array, the data is moved from one PN [processor
>>node] to another in a single hardware operation..." So, basically, every
>>processor has 1-hop access to every other processor's memory.
>>
>>I guess that's how the machine sustained over 85% of theoretical peak performance
>>on LINPACK, and 66% of theoretical peak on a real-world atmospheric simulation.
>
>All these testsets are not very random latency hungry. In contradiction they
>just need big bandwidth and this Earth machine has just that.
>
>Altix3000 for example has 6.4GB bidirectional bandwidth to 4 processors. So
>that's 12.8. Seems to me that's about the maximum you can get at old hardware.

New is the hypertransport of course.

>Of course a vector processor is kicking butt for those testsets but that's what
>it was designed for.
>
>This design is superb to simulate nuclear stuff, but i bet they'll be bragging
>about other things more as we can see everywhere.
>
>Nevertheless this machine is record breaking and always will be remembered for
>that. Assuming it is designed for big vectors it's quite a bit slower in latency
>then because if you optimize for huge transfers at once then a single transfer
>is probably very pricey.
>
>So let's ignore the latency question, it wasn't designed for it simply.
>
>You don't put 8 processors in a node if you do.
>
>Best regards,
>Vincent
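
For anyone who wants to check the arithmetic being thrown around in this thread, here is a rough sketch in Python. It only uses numbers quoted above (500 kW for a 4 processor Cray, 639 other nodes of 8 processors, 12.3 GB/s per node connection, 6.4 GB/s for 4 Altix processors). The names and the way the figures are combined are my own assumptions, not anything from a vendor: in particular the Altix 6.4 is treated as a per-direction figure and doubled, as the post does, while the Earth Simulator 12.3 is taken once as quoted, so the two per-processor values are not strictly comparable.

```python
# Back-of-the-envelope figures from the thread. All inputs are the numbers
# quoted in the posts; how they are interpreted here is an assumption.

CRAY_MACHINE_KW = 500.0   # quoted for a 4-processor Cray (possibly the whole machine)
CRAY_PROCESSORS = 4

ES_OTHER_NODES = 639      # "all 639 other nodes"
ES_PROCS_PER_NODE = 8
ES_LINK_GBS = 12.3        # per node connection, taken once as quoted

ALTIX_LINK_GBS = 6.4      # assumed per direction, shared by 4 processors
ALTIX_PROCS = 4


def thread_figures():
    """Return the derived numbers discussed in the thread as a dict."""
    es_nodes = ES_OTHER_NODES + 1  # the 639 other nodes plus the node itself
    altix_total = 2 * ALTIX_LINK_GBS  # both directions added up ("So that's 12.8")
    return {
        "cray_kw_per_processor": CRAY_MACHINE_KW / CRAY_PROCESSORS,
        "es_nodes": es_nodes,
        "es_processors": es_nodes * ES_PROCS_PER_NODE,
        "es_gbs_per_processor": ES_LINK_GBS / ES_PROCS_PER_NODE,
        "altix_total_gbs": altix_total,
        "altix_gbs_per_processor": altix_total / ALTIX_PROCS,
    }


if __name__ == "__main__":
    for name, value in thread_figures().items():
        print(f"{name}: {value:g}")
```

The 125 kW per processor only holds if the 500 kW really is split evenly over 4 processors, which is exactly the point under dispute above.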
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.