Author: h.g.muller
Date: 05:41:37 02/09/06
Go up one level in this thread
On February 08, 2006 at 16:17:37, Robert Hyatt wrote: >Makes the code look ugly and awkward as hell, but it runs on "cachey" systems... I agree, it makes for ugly code. As a first step you have to 'stratify' all natural loops into one that runs over a cache-size block and one that does the courser steps, and then move all the former to the inside of nested loop constructs. And be very aware when defining multi-dimensional array boundaries to avoid unnecessary cache collisions, by inserting spacer elements. (But that was important on the Cray too, although any stride was possible, memory bandwith dropped like a stone if you took a power-of-two stride.) But it really pays off, on a PC it can easily make your code run 10 times faster. For solving partial differential equations on multi-dimensional grids I could achieve speeds of 60 MFLOPS even in the days of the 200MHz Pentium 1, despite the fact that half the time was taken up by the 1% or so divisions among the FLOPS. That was very competative, even with respect to collegues who had their own private Cray that they could run 24/7. Because to beat those, I simply bought 4 more PCs, something they could never hope to do with Crays. ;-)
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.