Author: Robert Hyatt
Date: 17:35:07 07/11/02
On July 11, 2002 at 17:47:11, Joshua Lee wrote:

>On July 11, 2002 at 12:30:50, Matthew Hull wrote:
>
>>On July 11, 2002 at 00:18:36, Robert Hyatt wrote:
>>
>>>On July 10, 2002 at 12:53:11, Joshua Lee wrote:
>>>
>>>>On July 09, 2002 at 13:28:54, Robert Hyatt wrote:
>>>>
>>>>>It doesn't... For example, the C90 had a 2 nanosecond clock. Each cpu
>>>>>could read two 64 bit words and write one 64 bit word per clock cycle.
>>>>>With 16 cpus, that is 16 * 24 * 500,000,000 bytes per second, and that
>>>>>can be _sustained_ forever.
>>>>>
>>>>>Compare that to any PC you want and you see why (a) the supercomputers
>>>>>are so expensive and (b) the micros have absolutely no chance of
>>>>>catching them in terms of speed.
>>>>
>>>>192,000,000,000 bytes per second... that's over 178 gigabytes a second.
>>>>Is bandwidth referred to in this way, or as how fast the memory can
>>>>communicate with the cpu, or both? The Athlon has a 2.1 GB/sec bus, and
>>>>it can execute a multiply and an add on every clock cycle, which gives
>>>>it a peak throughput of 3.2 gigaflops.
>>>
>>>I don't see any way a 2.1 gigabyte per second memory bandwidth can
>>>translate into 3.2 gigaflops. A flop requires accessing two operands,
>>>doing something to them, and putting the result back... IE a flop == 12
>>>bytes of memory traffic (cache doesn't count because big applications
>>>and arrays don't fit into cache). That translates into maybe 175
>>>megaflops as a more realistic number... And I don't believe any PC has a
>>>prayer of coming within a factor of 10 of that 3.2 gigaflop peak in
>>>reality.
>>
>>Right! Even Apple claims only a little more than 1 gigaflop for a 500
>>MHz G4, which has its own vector processor (AltiVec). And that chip
>>flogs any x86 chip as far as FLOPS are concerned. It's the classic
>>memory bottleneck of the micro: bus speed versus processor speed equals
>>wait states, or something like that.
>
>The memory bandwidth I quoted from Microway's site was how fast the
>memory communicates with the cpu. I didn't think this translated into
>gigaflops, but for all I knew it might be affecting it. My question,
>then, is how does a chess program use memory bandwidth, and how does it
>use the floating point capabilities of the cpu? I thought most chess
>engines use integer arithmetic anyway.

A chess program will use the fastest math possible. Were floating point
faster than integer, using it would make sense. IE on a Cray there are
good reasons to use _both_ at the same time...

Bandwidth to memory is another issue. Hashing is a high-bandwidth
operation, but there are others as well, such as move generation, where
you stuff data into a large move array, or look up things in large
tables as I do with the rotated-bitboard stuff in Crafty.

You will also likely design a program differently depending on the
bandwidth available. For example, in Cray Blitz we did a "copy-make"
operation so that there was no unmake required, and the Cray made that
"copy" operation very inexpensive due to its bandwidth. When I tried
that approach on the PC, it died badly because of low bandwidth...
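
To put rough numbers on the bandwidth arithmetic above, here is a small C
sketch that works through the bandwidth-bound flop estimates. It only
replays the figures quoted in this thread (and assumes 12 bytes of traffic
per single-precision flop: two loads plus one store); it is not a benchmark
of any real machine.

    /* Rough bandwidth-limited flop arithmetic using the numbers quoted
       above.  Illustrative only; assumes every operand misses cache. */
    #include <stdio.h>

    int main(void) {
        /* C90: each of 16 cpus moves 24 bytes (two 64-bit reads plus one
           64-bit write) every 2 ns clock, i.e. one flop's worth of traffic
           per clock per cpu. */
        double c90_bytes_per_sec = 16.0 * 24.0 * 500e6;   /* 192 GB/s      */
        double c90_flops         = 16.0 * 500e6;          /* 8 Gflops      */

        /* PC: 2.1 GB/s quoted bus, 12 bytes of traffic per single-precision
           flop (two 4-byte loads plus one 4-byte store). */
        double pc_flops = 2.1e9 / 12.0;                   /* ~175 Mflops   */

        printf("C90 sustained bandwidth:   %.0f GB/s\n", c90_bytes_per_sec / 1e9);
        printf("C90 bandwidth-bound flops: %.0f Gflops\n", c90_flops / 1e9);
        printf("PC  bandwidth-bound flops: %.0f Mflops\n", pc_flops / 1e6);
        return 0;
    }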
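
As an illustration of why hashing is a high-bandwidth operation, here is a
minimal sketch of a transposition-table probe. The entry layout and names
are hypothetical, not Crafty's actual table; the point is just that every
probe is a random read into a table far too large to cache, so it goes all
the way to memory.

    /* Hypothetical transposition-table probe.  With a table of hundreds of
       megabytes, the indexed entry is almost never in cache, so each probe
       costs a full trip to main memory. */
    #include <stdint.h>

    typedef struct {
        uint64_t key;     /* Zobrist signature of the position        */
        uint64_t data;    /* packed score / depth / best move         */
    } HashEntry;          /* 16 bytes per entry                       */

    static HashEntry *table;       /* allocated elsewhere; entry count is a
                                      power of two                    */
    static uint64_t   table_mask;  /* entry count minus one           */

    int probe_hash(uint64_t zobrist, uint64_t *data) {
        HashEntry *e = &table[zobrist & table_mask];  /* random index  */
        if (e->key == zobrist) {
            *data = e->data;
            return 1;   /* hit  */
        }
        return 0;       /* miss */
    }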
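
And to make the copy-make point concrete, a bare-bones sketch of the two
approaches follows. The Position layout and the commented-out helpers are
hypothetical, not the actual Cray Blitz or Crafty code. Copy-make writes an
entire position at every node, which is cheap only when bandwidth is
enormous; make/unmake touches just the fields a move changes, which suits a
low-bandwidth PC.

    /* Hypothetical position structure, for illustration only. */
    typedef struct {
        unsigned long long occupied[2];   /* one bitboard per side    */
        int board[64];                    /* piece on each square     */
        int castle, ep, side;
    } Position;

    /* Copy-make: duplicate the parent, then apply the move to the copy.
       No unmake is ever needed, but every node pays for a full struct
       copy, which is pure memory bandwidth. */
    void make_copy(const Position *parent, Position *child, int move) {
        *child = *parent;
        /* apply_move(child, move);  -- hypothetical helper */
    }

    /* Make/unmake: mutate one position in place, saving only what is
       needed to restore it.  Far less memory traffic per node. */
    typedef struct { int captured, castle, ep; } Undo;

    void make(Position *pos, int move, Undo *u) {
        u->castle = pos->castle;
        u->ep     = pos->ep;
        /* u->captured = apply_move(pos, move);  -- hypothetical helper */
    }

    void unmake(Position *pos, int move, const Undo *u) {
        pos->castle = u->castle;
        pos->ep     = u->ep;
        /* retract_move(pos, move, u->captured);  -- hypothetical helper */
    }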