Computer Chess Club Archives


Subject: Re: Full circle

Author: Johan de Koning

Date: 23:58:53 08/29/03


On August 29, 2003 at 08:53:50, Robert Hyatt wrote:

>On August 29, 2003 at 02:34:42, Johan de Koning wrote:
>
>>On August 28, 2003 at 11:45:35, Robert Hyatt wrote:
>>
>>>On August 28, 2003 at 01:50:46, Johan de Koning wrote:
>>>
>>>>On August 27, 2003 at 12:25:51, Robert Hyatt wrote:
>>>>
>>>>>On August 26, 2003 at 21:12:45, Johan de Koning wrote:
>>>>
>>>>[snip]
>>>>
>>>>It seems we're finally back to where you and I started off.
>>>>
>>>>>>You keep saying that copy/make causes problems with cache-to-memory traffic.
>>>>>>Here I was just saying it doesn't, if there is plenty of cache.
>>>>>
>>>>>Here is the problem:
>>>>>
>>>>>When you write to a line of cache, you _guarantee_ that entire line of cache
>>>>>is going to be written back to memory.  There are absolutely no exceptions to
>>>>>that.  So copying from one cache line to another means that "another line" is
>>>>>going to generate memory traffic.
>>>>
>>>>Here is the solution: write-through caches were abandoned a long time ago.
>>>
>>>I'm not talking about write-through.
>>
>>I'm glad you aren't. :-)
>>
>>>  I am talking about write-back.  Once
>>>you modify a line of cache, that line of cache _is_ going to be written back
>>>to memory.  Exactly when is hard to predict, but before it is replaced by another cache
>>>line, it _will_ be written back.  So you write one byte to cache on a PIV, you
>>>are going to dump 128 bytes back to memory at some point.  With only 4096 lines
>>>of cache, it won't be long before that happens...  And there is no way to
>>>prevent it.
>>
>>Sure, every dirty cache line will be written back at *some* point. But you're
>>allowed to use or update it a million times before it is flushed only once.
>>Number of cache lines has nothing to do with it. On a lean and empty system
>>some lines might even survive until after program termination.
>
>Number of cache lines has everything to do with it.  If you can keep 4K
>chunks of a program in memory, and the program is _way_ beyond 4K chunks
>in size of the "working set", then cache is going to thrash pretty badly.

Working set, by whatever definition, is not relevant.
Frequency distribution is.
Since *that* is the basis of caching.
(And analogously of compression.)

>I've already reported that I've tested on 512K, 1024K and 2048K processors,
>and that I have seen an improvement every time L2 gets bigger.

Yes, and you reported the improvement as not significant. But that's off-topic,
since your large tables are addressed irregularly and hence never threaten the
hot data in L1.

>As I said initially, my comments were _directly_ related to Crafty.  Not to
>other mythical programs nor mythical processor architectures.  But for Crafty,
>copy/make was slower on an architecture that is _very_ close to the PIV of
>today, albiet with 1/2 the L2 cache, and a much shorter pipeline.

As I said initially, writing to cache (i.e. just writing) does not relate to
memory traffic. That was the issue for the last 10 days.

I'm not challenging your results with Crafty at all; I'm only doubting them.
And I'd still like to see a copy simulation, preferably on different machines,
to put things in at least some perspective.



[...]
>Of course X86 is crippled for many more reasons than that.  8 registers for
>starters.  :)

Well, here is *something* we agree on.

... Johan




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.