Author: Robert Hyatt
Date: 11:01:42 08/25/04
On August 25, 2004 at 13:43:52, Tom Kerrigan wrote:

>On August 25, 2004 at 10:52:23, Robert Hyatt wrote:
>
>>My issue was _never_ "cache misses". It was _always_ "how big does cache have
>>to be to contain enough of Crafty to not have many misses?"
>
>Sure, that makes sense. Your issue was cache misses, not cache misses.

Real conceptual problems with you, it seems. To make it _real_ simple: the issue is _not_ the _number_ of cache misses. The issue is "how big does cache have to be to eliminate the cache misses completely?" Because _that_ represents the size of cache necessary to contain the "working set" of the program, _by definition_. So whether there are one million misses per second or 50 per second doesn't matter. As long as there are misses, there are things being used that are not in cache, hence part of the program's working set is not there. Is that _really_ so hard to understand???

>>The answer is obvious from _either_ set of data. There are definite "points"
>>where cache size increases for either instructions or data stop reducing the
>>misses. That is the cache size that contains nearly everything, and it gives a
>>pretty good benchmark for the instruction or data working set sizes.
>
>Now we're getting into a semantic argument. If that's how you want to define
>working set, fine. I'm defining it as the point where enough of Crafty is in
>cache that main memory performance doesn't make a significant (measurable?)
>difference.

I define working set the way every textbook I have defines it. It is usually used in reference to demand paging, but it fits here as well:

"The working set of a program is defined as the set of pages that have to be in main memory to avoid any paging activity."

"The resident set of a program is defined as the set of pages that are actually in memory at any time."

The optimal resident set size for a program is its working set size. If you want to use another definition, then fine. Define it and we can go from there.
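To make the definition concrete, here is a toy sketch of the measurement idea being argued about: sweep a simulated fully-associative LRU cache over an address trace and find the smallest capacity at which misses stop falling. The trace and sizes below are synthetic illustrations, not Crafty data.

```python
# Hypothetical sketch: estimate working-set size from a miss-vs-capacity
# sweep. The trace is synthetic; a real test would replay a recorded
# address stream from the program being measured.
from collections import OrderedDict

def miss_count(trace, capacity):
    """Count misses of a fully-associative LRU cache of `capacity` lines."""
    cache = OrderedDict()
    misses = 0
    for addr in trace:
        if addr in cache:
            cache.move_to_end(addr)       # refresh LRU position on a hit
        else:
            misses += 1
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least-recently-used line
    return misses

# Synthetic program that repeatedly touches 300 distinct lines,
# i.e. a working set of exactly 300 lines.
trace = [i % 300 for i in range(10_000)]

for cap in (100, 200, 300, 400):
    print(cap, miss_count(trace, cap))
# 100 and 200 lines: 10000 misses (cache smaller than the working set,
# LRU thrashes on the cyclic pattern). 300 and 400 lines: 300 cold
# misses only -- the curve flattens once the cache holds the working set.
```

The knee of that curve is the point Hyatt is describing: growing the cache past it buys nothing, because the whole working set is already resident.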
But without an * behind the name, I tend to use the normal definition of a term. I.e., Vincent's incorrect "memory latency" number. Latency is the time to read/write a word of memory, independent of any TLB/cache issues. Access time factors in TLB misses/hits and so forth. But he scrambles the terms and then argues when someone disagrees with his unique definition that is unrecognized anywhere else.

If you want to try to define "what is the optimal cache size for Crafty?" that is an interesting question. If you want to define "what is the cache footprint for Crafty?" that is also interesting. And different. I am talking about the cache footprint at the moment. Clearly, once you have a cache large enough to hold everything the program uses, making it bigger will _not_ make the program run any faster. Once it is all in L2, memory is out of the picture and more L2 will have no effect. I showed that effect for instructions as being 256K. For data, it is bigger, although there are marked drops at certain points, and I would not be willing to pay the extra cost for L2 beyond that which provides acceptable performance gains. A total of 1024K looks pretty good. A total of 2048K looks a bit better, although it would not be worth doubling the processor price IMHO.

>>I _knew_ that you would be unable to see the forest for the trees. And that
>>posting the numbers would only serve to divert the conversation to a _different_
>>topic, which it has.
>
>Yeah, the fact that you have a new, contradictory set of data throws a wrench
>into the whole "drawing conclusions from data" thing, doesn't it? Seriously,
>your new data shows miss rates that are different by factors of > 4 and a
>completely different trend between cache sizes. Sorry if this kinda thing
>"cramps your style."

I don't have any "new and contradictory data." I gave old cache data. Eugene gave more. I just finished producing even more. None of it contradicts anything, except for your 512/1024 test, which is different.
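The latency-versus-access-time distinction above can be shown with a toy calculation. All the numbers here are assumed round figures for illustration, not measurements of any particular machine: raw latency is a property of the memory itself, while average access time folds in the cache and TLB hit rates.

```python
# Illustrative numbers only (assumptions, not measured values).
T_L2 = 10        # ns, assumed L2 hit latency
T_MEM = 100      # ns, assumed raw main-memory latency
T_TLB_MISS = 50  # ns, assumed page-walk penalty on a TLB miss

def avg_access_ns(l2_hit_rate, tlb_hit_rate):
    """Average access time: blend of cache/memory latency plus TLB penalty."""
    mem_time = T_L2 * l2_hit_rate + T_MEM * (1 - l2_hit_rate)
    return mem_time + T_TLB_MISS * (1 - tlb_hit_rate)

print(round(avg_access_ns(0.98, 0.99), 2))  # -> 12.3
```

With a 98% L2 hit rate and 99% TLB hit rate, the average access comes out near 12 ns even though the raw memory latency is 100 ns, which is exactly why conflating the two terms produces nonsense numbers.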
You originally claimed my working set was < 512K. I don't see how you can continue to claim that after the data I have provided from testing it two different ways. If the working set were less than 512K, then any cache size larger than 512K would not help performance at all, yet it clearly does. As it did a couple of years back, when I did the financial analysis to decide which CPU version to buy for my quad Xeon when I did the upgrade.

In the absence of more data or information, it would seem that this particular argument has reached the end of the road. If you think that the WS size is < 512K, even when 512K of data cache still produces cache misses, fine by me... And that ignores the approximately 128K-256K of instructions that are also needed in cache.

>
>-Tom
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.