Author: Robert Hyatt
Date: 12:32:20 06/08/01
On June 08, 2001 at 14:13:10, martin fierz wrote:
>On June 08, 2001 at 14:02:53, Robert Hyatt wrote:
>
>>On June 08, 2001 at 12:52:55, martin fierz wrote:
>>
>>>hi bob,
>>>
>>>>>Small, local variables in a function do not have cache problems.
>>>>>The problem is when you have large tables with data. Then you need to worry
>>>>>about how the data is organized.
>>>>>
>>>>>/Johan Melin
>>>>
>>>>But they _do_. If you put 8 "small local variables" together in memory,
>>>>whenever one is referenced the other 7 will be brought into cache at the same
>>>>time. When you use them, you use a cache cycle rather than a memory cycle.
>>>>That basically means that the first variable costs you a memory cycle, the
>>>>other 7 cost you nothing whatsoever to use.
>>>>
>>>>Small savings add up over millions of repetitions...
>>>
>>>if i give you some sample code:
>>>
>>>int somefunction(int p1)
>>> {
>>> int x1,x2,x3...x20;
>>>
>>> x1=something;
>>> /* can i assume that now x2..x8 are in my cache?*/
>>
>>Probably, yes. Although as I mentioned, the ANSI standard does not require
>>that the variables be laid out in memory as they are declared. I can't imagine
>>it not happening here, of course. But if you mix chars, ints, doubles, etc,
>>the compiler would probably shuffle things around unless you put everything in
>>a struct... and even then you have slack bytes added for alignment on some
>>architectures.
>>
>>
>>
>>
>>> x5=something;
>>> /* and now still x1..x8 because probably x5 was in cache?*/
>>
>>Yes... but note that a cache line fill comes from a distinct 32-byte block
>>of memory. It is difficult to be sure x1 is on a 32-byte address boundary (to
>>make sure it is the first word of that 8-word block) without using some sort
>>of align directive in assembly.
>>
>>
>>> x12=something else;
>>> /* and now x12..x19 are in cache?*/
>>> }
>>
>>
>>You got it almost right. Probably x9-x16 will be in cache (32-byte line size,
>>remember).
>>
>>
>>>
>>>are these comments reasonable? i know you wrote that there is no guarantee
>>>that the variables will be allocated this way, but probably this is the best
>>>guess i have!?
>>>
>>>cheers
>>> martin
>>
>>
>>That is probably very close, except that you missed that x1-x8 might end up
>>together in one line, followed by x9-x16 in another line...
>>
>>In practice, x1-x8 will probably be in adjacent memory words, but might not be
>>in the same cache line if x1 isn't on a 32-byte addressing boundary.
>
>thanks for the explanation - i will try this on sunday...
>
>cheers
> martin
Here is a brief cache lecture:
Q: How can cache improve the performance of my program?
A: In the following ways:
(1) data re-use. Once you bring something in from memory and suffer a long
memory read-cycle to get it, the next time you use it you get it after a quick
cache-cycle, lowering the effective memory latency. The more you reuse it, the
closer the effective memory latency gets to the actual cache latency.
(2) prefetching data. When you reference any byte within a 32-byte block of
memory that starts on a 32-byte boundary, the entire block is burst-loaded into
cache. When you refer to any of these words, they are fetched far quicker than
normal because they are already in cache and ready for use.
(3) avoiding memory writes. When you write to memory, you really write to
cache, but the data in cache isn't written back to memory instantly. It is held
so that additional memory writes will access cache only and avoid the memory
write cycle. Just before a line of cache is replaced by another line, the
cache controller checks to see if the cache line was modified, and if it was, it
gets written back then. This is called either write-back or copy-back depending
on the book you use.
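As a sketch of the arithmetic behind point (2), and of the x1..x20 example quoted above: assuming 32-byte lines, 4-byte words, and variables laid out consecutively (none of which the C standard guarantees), the line a word lands in is just its address divided by 32. The names LINE_SIZE, line_of, same_line and lines_spanned are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SIZE 32u   /* assumed cache line size in bytes, as in the post */
#define WORD_SIZE 4u    /* assumed size of one int in bytes */

/* Index of the cache line that contains byte address addr. */
static uintptr_t line_of(uintptr_t addr)
{
    return addr / LINE_SIZE;
}

/* Nonzero if words i and j (0-based) of a run of consecutive words
   starting at address base fall in the same cache line. */
static int same_line(uintptr_t base, unsigned i, unsigned j)
{
    return line_of(base + WORD_SIZE * i) == line_of(base + WORD_SIZE * j);
}

/* Number of distinct cache lines touched by count consecutive words
   starting at address base. */
static unsigned lines_spanned(uintptr_t base, unsigned count)
{
    assert(count > 0);
    uintptr_t first = line_of(base);
    uintptr_t last  = line_of(base + (uintptr_t)count * WORD_SIZE - 1u);
    return (unsigned)(last - first + 1u);
}
```

With a base address on a 32-byte boundary, words 0-7 share one line and words 8-15 the next, so 20 words touch 3 lines. Shift the base by 8 bytes and the first eight words straddle two lines, which is the alignment caveat from the quoted discussion.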
A programmer can directly affect (1) and (2) by being careful about how he uses
memory. That is what this "temporal access" stuff is all about. Things that
are accessed close together in time (temporal reference pattern) should
be close together in memory so they are brought into cache in a chunk.
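One way to act on this in C, sketched under assumptions: put the values that are used together into one struct and ask the compiler to start it on a line boundary. This assumes a C11 compiler (the _Alignas specifier was standardized after this post was written), a 32-byte line, and a made-up struct search_state whose fields are purely illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 32   /* assumed cache line size in bytes, as in the post */

/* Hypothetical example: eight 4-byte values that are used together are
   grouped first, and the struct is aligned to a line boundary, so a
   reference to any one of them brings all eight in with one line fill. */
struct search_state {
    _Alignas(LINE_SIZE) int32_t hot[8];  /* accessed together, often */
    int32_t cold[8];                     /* rarely touched values */
};

/* Compile-time checks of the layout this sketch relies on. */
_Static_assert(8 * sizeof(int32_t) == LINE_SIZE,
               "hot[] should fill exactly one assumed cache line");
_Static_assert(offsetof(struct search_state, hot) == 0,
               "hot[] should start the struct");
_Static_assert(offsetof(struct search_state, cold) == LINE_SIZE,
               "cold[] should start on the next line");

/* Nonzero if p sits on an assumed cache line boundary. */
static int starts_on_line_boundary(const void *p)
{
    return (uintptr_t)p % LINE_SIZE == 0;
}
```

If the real line size differs from 32 bytes, the static checks fail at compile time instead of silently splitting the hot values across two lines.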