Computer Chess Club Archives


Subject: Re: Sempron vs. Athlon 64: Proof that Crafty's working set is < 256k

Author: Robert Hyatt

Date: 15:15:55 08/22/04


On August 22, 2004 at 17:12:05, Tom Kerrigan wrote:

>On August 22, 2004 at 11:10:33, Robert Hyatt wrote:
>
>>Simple.  I access a batch of random attack entries.  I then access a _lot_ of
>>other stuff before I come back to the attack entries.  Your 256K cache has 4K
>>lines.  I don't know what the set associativity is, as AMD has lots of options
>>there in recent history.  But that further reduces the number of "buckets" to
>>stuff things in.  My Xeon claims 16-way set associativity, with 128-byte lines.
>>That turns into a paltry 256 sets.  It is very easy to get a bad physical
>>memory layout where you don't even use all of those sets, and where some sets
>>get badly overloaded.
>
>This is a bunch of nonsense. You make it sound like associativity somehow
>decreases the amount of cache you have. Really, associativity has no place in
>this discussion, except maybe to note that it reduces the behavior that you're
>complaining about, namely random accesses evicting important data from the
>cache.

Then you don't understand set associativity.  It _does_ influence what goes
where.  Each physical address maps to exactly one set.  A program doesn't
necessarily use _every_ set in cache, due to poor physical memory layout
decisions by the O/S.
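
To put numbers on the mapping, here is a minimal sketch.  It is an illustration only, not anything from Crafty or from a vendor manual.  The geometry is assumed: a 512 KB, 16-way, physically indexed cache with 128-byte lines (which gives the 256 sets quoted above) and 4 KB pages.  The helper names and the two page-frame layouts are hypothetical.

```python
# Sketch only: the geometry below (512 KB, 16-way, 128-byte lines, 4 KB pages)
# is an assumption chosen to reproduce the "256 sets" figure quoted above.
CACHE_BYTES = 512 * 1024
LINE_BYTES = 128
WAYS = 16
PAGE_BYTES = 4096


def num_sets(cache_bytes, line_bytes, ways):
    """Number of sets = total lines in the cache divided by the ways per set."""
    return cache_bytes // line_bytes // ways


def set_index(phys_addr, line_bytes, sets):
    """Set that a physical address maps to in a physically indexed cache."""
    return (phys_addr // line_bytes) % sets


def sets_touched(page_frames, line_bytes, sets, page_bytes=PAGE_BYTES):
    """Distinct sets reachable from the given physical page frame numbers."""
    touched = set()
    for frame in page_frames:
        base = frame * page_bytes
        for offset in range(0, page_bytes, line_bytes):
            touched.add(set_index(base + offset, line_bytes, sets))
    return touched


SETS = num_sets(CACHE_BYTES, LINE_BYTES, WAYS)

# Hypothetical layouts, 64 pages (256 KB of data) each.
good_layout = range(64)            # contiguous frames: every set gets used
bad_layout = range(0, 64 * 8, 8)   # frames 8 pages apart: all share 32 sets

print(SETS)                                              # 256
print(len(sets_touched(good_layout, LINE_BYTES, SETS)))  # 256
print(len(sets_touched(bad_layout, LINE_BYTES, SETS)))   # 32
```

Under these assumptions, the bad layout squeezes 256 KB of data into 32 sets x 16 ways = 512 lines, or 64 KB of usable cache, even though the cache is nominally 512 KB.  Real allocators rarely produce a layout this extreme; the point is only that the set count and the page placement together decide how much of the cache a program can actually reach.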




>
>But let's say you do randomly access your working set. How about you explain how
>performance isn't increased going from 256k to 512k cache?


I believe I already answered.  I _can't_ explain it because I _can't_ reproduce
it.  All of my data was for 512K/1024K/2048K, with everything else being
identical except for unpublished things Intel _could_ have done, and the larger
caches produced faster results.  Eugene did the same.  I had some old Alpha data
with some larger L3 cache sizes, but I haven't included those as I don't know
exactly how the hardware configurations might have differed.

So I can _not_ explain why you got no speedup.  That is contrary to everything I
have ever seen.

I'll ask the AMD folks if they have some machines I can try that are identical
except for cache size.  Ditto for an Intel box, although I am not sure that Intel
still does the 2048K Xeons...




> You're randomly
>accessing a bigger random subset of your working set. (And really, 512k is a
>large percentage of your maximum possible working set... Windows reports that it
>only allocates 5MB for Crafty, including hash tables, tables that never get
>used, code that isn't part of the search, memory for the stack, etc.)
>
>-Tom


Strange number.  Linux reports 20M.

How can it use 5MB when the default hash sizes total 4 megs?

I.e., start Crafty and type "hash" and "hashp".  You get almost 4 megs for those.

There are 128 TREE blocks also, but they get malloc()'ed.  That is way over a
meg total...

5MB has to be wrong...



