Author: Robert Hyatt
Date: 07:51:29 10/15/03
I blew one bit of the previous calculation. The C90 is a "super-scalar" sort
of vector machine. Where I said "one floating add per cycle", change that to
two. A single vector instruction produces _two_ results per cycle, not one, and
I had simply failed to note that. That was the main change the C90 introduced
over the older X-MP and Y-MP. Obviously it makes vector performance 2x faster
even without the clock speed improvement. I.e., for my example:
v0 = v1+v2
v3 = v4+v5
v6 = v0*v3
That code will produce _six_ results per cycle once the chained vector
pipeline is filled, not the _three_ I had given.
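
To spell out that count, here is a minimal sketch of the arithmetic. The function and parameter names are made up for illustration and come from no Cray documentation. It assumes the three vector instructions above are all chained and active in the same cycle, and that each one delivers the same number of results per cycle.

```python
# Hypothetical model of the steady-state count described above.
# Assumption: all chained vector instructions are active in the same cycle,
# and each one delivers the same number of results per cycle.

def steady_state_results_per_cycle(chained_instructions: int,
                                   results_per_instruction: int) -> int:
    """Results produced per clock once every chained pipeline is full."""
    return chained_instructions * results_per_instruction

# Two adds and one multiply, as in the example above.
CHAINED_INSTRUCTIONS = 3

# Earlier (wrong) figure: one result per instruction per cycle.
original_estimate = steady_state_results_per_cycle(CHAINED_INSTRUCTIONS, 1)

# Corrected figure: two results per instruction per cycle.
corrected_estimate = steady_state_results_per_cycle(CHAINED_INSTRUCTIONS, 2)

print(original_estimate, corrected_estimate)
```

This only covers the steady state. It ignores the cycles spent filling the pipeline, which the text above also sets aside.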
_That_ is why the Cray buries the PC in _any_ program that can use vectors,
even though the C90 only runs at 250 MHz. The T90 runs that up to 500 MHz,
and the Cray-3 doubled it again to 1 GHz. But all MHz/GHz are _not_ created
"equal", for those who understand vector operations.
The C90 is a 250 MHz machine, not the 100 Vincent pulls from you-know-where.
But no 2500 MHz 80x86 can produce 6 64-bit IEEE floating point operations
every 4 nanoseconds.
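
For anyone who wants to check the "every 4 nanoseconds" figure, here is a rough sketch of the clock arithmetic. It uses only the numbers quoted in this post (250 MHz, six results per cycle, a 2500 MHz PC). The helper names are hypothetical, and the rate it prints is a peak for the three-instruction example, not a measured benchmark.

```python
# Hypothetical helpers; the only inputs are figures quoted in the post.

def cycle_time_ns(clock_mhz: float) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz

def peak_results_per_second(clock_mhz: int, results_per_cycle: int) -> int:
    """Peak results per second if every cycle delivers results_per_cycle."""
    return clock_mhz * 1_000_000 * results_per_cycle

C90_CLOCK_MHZ = 250        # clock rate stated in the post
PC_CLOCK_MHZ = 2500        # the 80x86 clock rate used for comparison
RESULTS_PER_CYCLE = 6      # steady-state figure for the chained example

c90_cycle_ns = cycle_time_ns(C90_CLOCK_MHZ)
c90_peak = peak_results_per_second(C90_CLOCK_MHZ, RESULTS_PER_CYCLE)

# Number of PC clock cycles that fit into one C90 clock cycle.
pc_cycles_per_c90_cycle = PC_CLOCK_MHZ / C90_CLOCK_MHZ

print(c90_cycle_ns, c90_peak, pc_cycles_per_c90_cycle)
```

So one C90 cycle is 4 ns, and the PC gets ten of its own cycles in that same window.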
I don't know how to explain it better to someone who simply doesn't have a
single scintilla of background in the concept of "a vector machine."