Computer Chess Club Archives


Subject: Re: cache optimization

Author: Robert Hyatt

Date: 07:57:36 06/07/01


On June 07, 2001 at 05:10:26, martin fierz wrote:

>hi,
>
>this is from an earlier post of bob hyatt:
>
>>That is a basic optimization strategy...  variables used close together in
>>time should be close together in memory to take advantage of 32-byte line
>>fills in cache.
>
>is this something which everybody is doing? is it worth the trouble to move
>around variable declarations? if this is optimized, how much performance
>gain can you expect compared to random variable placing?

It is definitely worth doing.  The problem is that the ANSI C standard does not
guarantee that two variables declared in adjacent declarations will be adjacent
in memory, unless they are both members of the same structure.
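
A minimal sketch of the structure approach, assuming a hypothetical group of
four ints that a search routine uses together (the names search_state, ply,
depth, alpha and beta are illustrative only, not taken from any real program).
C keeps structure members in declaration order, although the compiler may
still insert padding between members:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hot-path variables, grouped so they sit together in memory. */
struct search_state {
    int ply;
    int depth;
    int alpha;
    int beta;
};

/* Checks the ordering that C provides for structure members:
   the first member is at offset 0 and later members follow it. */
void check_layout(void)
{
    assert(offsetof(struct search_state, ply) == 0);
    assert(offsetof(struct search_state, ply)   < offsetof(struct search_state, depth));
    assert(offsetof(struct search_state, depth) < offsetof(struct search_state, alpha));
    assert(offsetof(struct search_state, alpha) < offsetof(struct search_state, beta));
}

/* Size of the whole group in bytes. With 4-byte ints and no padding this
   is 16, i.e. half of one 32-byte cache line. */
size_t hot_group_bytes(void)
{
    return sizeof(struct search_state);
}
```

Separate global or local declarations carry no such ordering promise, which is
why the grouping has to be done with a structure (or an array).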

Any good book on optimizing teaches this as an early principle.  For math
problems, there are the so-called "blocked" algorithms that work with their
data in smaller chunks so it will all fit into cache, etc.
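
As one small illustration of a "blocked" algorithm, here is a sketch of a
matrix transpose done tile by tile.  The tile edge BLOCK and the function names
are assumptions made for this example; a good tile size depends on the cache
of the target machine and has to be found by measurement.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 8  /* assumed tile edge, not a tuned value */

/* Plain transpose of an n x n row-major matrix: dst[j][i] = src[i][j].
   The writes to dst walk memory with a stride of n elements. */
void transpose_naive(const int *src, int *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Blocked transpose: same result, but handled in BLOCK x BLOCK tiles so
   the lines touched in src and dst can stay in cache while a tile is
   being processed. src and dst must not overlap. */
void transpose_blocked(const int *src, int *dst, size_t n)
{
    assert(src != dst);
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        for (size_t jj = 0; jj < n; jj += BLOCK) {
            size_t imax = (ii + BLOCK < n) ? ii + BLOCK : n;
            size_t jmax = (jj + BLOCK < n) ? jj + BLOCK : n;
            for (size_t i = ii; i < imax; i++)
                for (size_t j = jj; j < jmax; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Both functions produce identical output; only the order in which memory is
visited changes.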

Performance is harder to predict.  If you use 32-bit values, it has the
potential to eliminate 7 unnecessary memory reads out of 8, because eight such
values fit in one 32-byte line and cache fills an entire line in one burst.
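
The line-fill arithmetic can be sketched as a small helper that counts how many
32-byte lines a contiguous run of values covers.  The function name
lines_touched and the byte addresses used with it are assumptions for
illustration; the 32-byte line size is the figure quoted in this post.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BYTES 32u  /* cache line size quoted in the post */

/* Number of distinct cache lines covered by `count` contiguous elements
   of `elem_bytes` bytes each, starting at byte address `start`. */
size_t lines_touched(size_t start, size_t elem_bytes, size_t count)
{
    assert(elem_bytes > 0);
    if (count == 0)
        return 0;
    size_t first = start / LINE_BYTES;
    size_t last  = (start + elem_bytes * count - 1) / LINE_BYTES;
    return last - first + 1;
}
```

For eight 4-byte values, lines_touched(0, 4, 8) is 1 when the group starts on a
line boundary and lines_touched(16, 4, 8) is 2 when it straddles one.  Eight
values scattered so that each lands in its own line would need eight fills.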


>are there any other strategies to optimize a program for good cache performance?
>and how do you measure this, if ordering the variables in one function causes
>that function to be 1% faster & the overall program 0.01% or something? just
>so i can try it in one function for starters...

1% is simply 1%.  Pick up a few of those and it becomes 10%.  Etc.




>
>here's another observation i made: i have a P4 [i know... no need to tell me
>that it's a bad choice :-)] desktop and a P3 laptop. i tried bob's recommendation
>to use char and short arrays for small variables (like my
>"lastbit" array), and on the P3 this was indeed faster - on the P4 it was
>slower though. is this to be expected?

I haven't studied the P4 pipeline at all.  It is longer.  Why it would be worse
for bytes than for words is hard to say.  Accessing bytes has always been harder
than accessing words, but the memory bandwidth saved by the smaller data has made
using bytes pay off.  Perhaps the RDRAM in the P4 is the issue here.
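
To make the bandwidth side of that trade-off concrete, here is a sketch that
compares the cache footprint of a small lookup table stored as bytes and as
ints.  The table contents (index of the lowest set bit of a byte) and the names
lastbit_u8, lastbit_int and lines_needed are assumptions for this example; the
actual "lastbit" array in the question may be defined differently.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BYTES    32u   /* cache line size quoted in the post */
#define TABLE_ENTRIES 256u

/* Hypothetical tables: every entry is in 0..7, so a byte is wide enough. */
static unsigned char lastbit_u8[TABLE_ENTRIES];
static int           lastbit_int[TABLE_ENTRIES];

/* Fill both tables with the index of the lowest set bit of i.
   Entry 0 has no set bit and is left as 0. */
void init_lastbit(void)
{
    for (unsigned i = 1; i < TABLE_ENTRIES; i++) {
        unsigned bit = 0;
        while (((i >> bit) & 1u) == 0)
            bit++;
        assert(bit < 8);
        lastbit_u8[i]  = (unsigned char)bit;
        lastbit_int[i] = (int)bit;
    }
}

/* Number of 32-byte lines needed to hold a table of the given size. */
size_t lines_needed(size_t table_bytes)
{
    return (table_bytes + LINE_BYTES - 1) / LINE_BYTES;
}
```

The byte table occupies 256 bytes, or 8 lines.  With 4-byte ints the same table
occupies 1024 bytes, or 32 lines, so it competes for four times as much cache.
Whether that saving outweighs the cost of byte accesses on a given processor is
something only a timing run on that processor can show.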




>
>cheers
>  martin




Last modified: Thu, 15 Apr 21 08:11:13 -0700
