Computer Chess Club Archives



Subject: Re: Full circle

Author: Johan de Koning

Date: 23:34:42 08/28/03


On August 28, 2003 at 11:45:35, Robert Hyatt wrote:

>On August 28, 2003 at 01:50:46, Johan de Koning wrote:
>
>>On August 27, 2003 at 12:25:51, Robert Hyatt wrote:
>>
>>>On August 26, 2003 at 21:12:45, Johan de Koning wrote:
>>
>>[snip]
>>
>>It seems we're finally back to where you and I started off.
>>
>>>>You keep saying that copy/make causes problems with cache to memory traffic.
>>>>Here I was just saying it doesn't, if cache is plenty.
>>>
>>>Here is the problem:
>>>
>>>When you write to a line of cache, you _guarantee_ that the entire line of
>>>cache is going to be written back to memory.  There are absolutely no
>>>exceptions to that.  So copying from one cache line to another means that
>>>"another line" is going to generate memory traffic.
>>
>>Here is the solution: write-through caches were abandoned a long time ago.
>
>I'm not talking about write-through.

I'm glad you aren't. :-)

>  I am talking about write-back.  Once
>you modify a line of cache, that line of cache _is_ going to be written back
>to memory.  When is hard to predict, but before it is replaced by another cache
>line, it _will_ be written back.  So you write one byte to cache on a PIV, you
>are going to dump 128 bytes back to memory at some point.  With only 4096 lines
>of cache, it won't be long before that happens...  And there is no way to
>prevent it.

Sure, every dirty cache line will be written back at *some* point. But you're
allowed to use or update it a million times before it is flushed, just once.
The number of cache lines has nothing to do with it. On a lean and empty system
some lines might even survive until after program termination.
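
A minimal sketch of that point (the variable and loop count are made up, not
taken from any engine): the same line gets dirtied a million times, but a
write-back cache sends it to RAM only when the line is eventually evicted.

    static long counter;                 /* lives in one cache line */

    void hammer(void)
    {
        long i;

        for (i = 0; i < 1000000; i++)
            counter++;                   /* hits L1 every time */
    }                                    /* one eventual write-back on eviction */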

>>And for good reason, think of the frequency at which data is written (e.g. just
>>the stack frame). Once CPU speed / RAM speed hits 10 or so, write-through cache
>>will cause almost any program to run RAM bound.
>
>Sure, but that wasn't what I was talking about.  Once a line is "dirty" it is
>going back to memory when it is time to replace it.  With just 4K lines of
>cache, they get recycled very quickly.
>
>>
>>>>>  I claimed that for _my_ program,
>>>>>copy/make burned the bus up and getting rid of it made me go 25% faster.
>>>>
>>>>And I suspect this was because of a tiny cache that couldn't even hold the
>>>>heavily used stuff.
>>>
>>>This was on a (originally) pentium pro, with (I believe) 256K of L2 cache.
>>
>>L2 is not a good place to keep your heavily used data.
>
>There's no other choice.  L1 is not big enough for anything.

It's big enough to hold your position and top of stack. It's even big enough to
hold *my* position of 22000 bytes, except for the rarely addressed parts.

The less heavily used data will live briefly in the LRU lines but is typically
not dirty, though it is certainly possible to get unlucky and flush hot data,
depending on memory layout and program flow.
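
For what it's worth, a rough copy/make sketch (the Position type, its size and
MAX_PLY are made up, not The King's or Crafty's actual layout): the search keeps
one position per ply in a small pre-allocated stack, so the top few entries stay
cache resident and most copies just rewrite lines that are already cached and
already dirty.

    #define MAX_PLY 64

    typedef struct {
        char bytes[512];                  /* board, hash key, castling rights, ...
                                             size is made up                    */
    } Position;

    static Position stack[MAX_PLY + 1];   /* one position per ply */

    void search(int ply)
    {
        if (ply >= MAX_PLY)
            return;
        stack[ply + 1] = stack[ply];      /* copy/make: usually cache to cache */
        /* ... apply the move to stack[ply + 1], generate moves, evaluate ... */
        search(ply + 1);                  /* recurse; no unmake on the way back */
    }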

>  IE the pentium
>pro had 16K of L1, 8K data, 8K instruction.  Newer pentiums are not much
>better although the 8K instruction has been replaced by the new trace cache
>that holds more than 8KB.  And the data cache is up to 16K.  However, I have
>run personally on xeons with 512K L2, 1024K L2 and 2048K L2 and I didn't see
>any significant difference in performance for my program...  Bigger is slightly
>better in each case, but it was never "big enough".

I guess most of your tables are pretty sparse in terms of access frequency, so
you might get away with 2048 lines of L2. In fact I'm pretty sure you do get
away with it, since a few RAM accesses per node would badly hurt any 1+ MN/s
rate.
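
Back of the envelope, assuming roughly 100 ns for an uncached RAM access (a
made-up but plausible figure):

    1 MN/s               ->  about 1000 ns per node
    5 RAM accesses/node  ->  5 * 100 ns = 500 ns, half the node budget gone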

But regarding L1 size: Intel's policy simply sucks. :-)

... Johan


