Computer Chess Club Archives

Subject: Re: Sempron vs. Athlon 64: Proof that Crafty's working set is < 256k

Author: Robert Hyatt

Date: 14:48:08 08/21/04

On August 21, 2004 at 12:18:15, Sune Fischer wrote:

>On August 21, 2004 at 10:46:13, Robert Hyatt wrote:
>
>>>>extern BITBOARD set_mask_rr45[65];
>>>>
>>>>
>>>>Those are just a few random tables,
>>>
>>>They aren't random at all, that's my point.
>>
>>That has been _my_ point.  I run through a _lot_ of such tables.  And that tends
>>to flush cache before something gets reused.  A random probe into anything
>>replaces N bytes (a cache line).  I.e., on my PIV a 1-byte access replaces
>>128 bytes of cache in one chunk.  On my dual Xeons, the cache holds 4096 such
>>lines, so 4096 random accesses can completely flush cache.
>
>I think 4096 _random_ accesses is a lot.
>It's going to take you quite a few nodes to hit that number of accesses and most
>of those accesses will not be random.

I can only speak for my program.  All of the table accesses I gave you are
random.
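
The cache-line figures quoted above can be checked with a few lines of C. This is only a sketch of the arithmetic: the 512KB cache size is an assumption inferred from 4096 lines of 128 bytes (the post does not state it), and the function names are made up for illustration. It also assumes the best case for flushing, where every probe lands on a different line and associativity plays no role.

```c
#include <assert.h>
#include <stddef.h>

/* Number of lines in a cache of cache_bytes with line_bytes per line. */
size_t cache_lines(size_t cache_bytes, size_t line_bytes)
{
    assert(line_bytes > 0);
    return cache_bytes / line_bytes;
}

/* Bytes of cache displaced if each of `accesses` probes hits a distinct
   line (an idealized upper bound; real random probes will collide). */
size_t bytes_displaced(size_t accesses, size_t line_bytes)
{
    assert(line_bytes > 0);
    return accesses * line_bytes;
}
```

With these, cache_lines(512 * 1024, 128) gives 4096, and bytes_displaced(4096, 128) gives 524288 bytes, i.e. the whole assumed 512KB.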

>
>>Who is ignoring that fact?  I reported that when I first looked into buying my
>>first quad xeon, I benchmarked Crafty on the PII xeon with 512kb, 1024kb, and
>>2048kb of L2 cache.  1024K was 10% faster.  2048 was another 7% faster.  Eugene
>>ran crafty on IA64 with 1.5mb L2 and 3.0mb L2 and found 3.0mb was 10% faster.
>>
>>Who is ignoring what?
>>
>>Tom ran on 256 and 512K and concluded the working set for crafty was < 256K.  I
>>simply said the data doesn't support the conclusion.  The conclusion _may_ be
>>right.  But two of us ran on larger L2 boxes and got better performance.  One
>>did not.  Two of us ran up to 2048K, one only tried 256 and 512k.
>
>What you're saying here is basically that Crafty has a working set that is much
>larger than 3 MB, for sure it is so much larger that no improvement can even be
>measured when going from 256 to 512 kB.

No, I am not saying _anything_ about the working set of Crafty.  You _totally_
miss the point.  I am only saying that if you run a program with X L2 and then
with 2X L2, and the speed is the _same_, you did _not_ just prove that the
working set of the program is <= X.

That is _all_ I have said.  I don't know what my working set is.  I don't care
what it is.  I do know that for two different testers, bigger cache was faster,
for Tom it wasn't.  Why that is I have no idea, I really don't care, and I don't
see any point in investigating further.

I only pointed out that the _original_ conclusion was badly flawed.  Nothing
more, nothing less...
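
The logical point above can be illustrated with a toy model. This is a sketch under loud assumptions, not a measurement and not a claim about Crafty: it assumes uniformly random accesses over the working set, and toy_hit_rate is a hypothetical name. Under that model the hit rate is simply the fraction of the working set that fits in cache.

```c
#include <assert.h>

/* Toy model only: uniform random accesses over a working set of ws_kb,
   with a cache of cache_kb holding part (or all) of it. */
double toy_hit_rate(double cache_kb, double ws_kb)
{
    assert(cache_kb > 0 && ws_kb > 0);
    return cache_kb >= ws_kb ? 1.0 : cache_kb / ws_kb;
}
```

In this model a 200KB working set gives a hit rate of 1.0 at both 256KB and 512KB, but a 64MB working set also shows almost no change (the hit rate moves by less than half a percentage point).  Equal speed at X and 2X is consistent with both cases, so by itself it does not establish a working set <= X.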

>
>I don't believe that.
>
>What I do believe however, is that to measure the influence of cache size you
>should run on identical machines (Eugene did not, I don't know about you).

As I said, I did.  It was a Dell PowerEdge 6000.  Dell did nothing but swap
processors for me after each test.  Everything else was identical; the only
question left is whether Intel changed anything between the three different
processors.  They had the same clock speed and bus speed, all 3 were Pentium II
400MHz Xeons, and the only thing Intel _claimed_ was different was L2 cache size.
Whether there was another non-published difference (different set size, etc.), I
don't know.

>
>Secondly, consider we have other factors that would benefit from larger caches,
>things like the hash.
>Even if the cache could only store the last 1000 hash entries, those 1000 would
>also be the most interesting and most likely to be used next.

Fine. But then how would you explain Tom's test showing _no_ improvement with 2X
L2?  We have to stick to what the data shows.  His suggests something strange is
going on.

>
>How big is this effect? You don't know, so how do you know this is not what
>you're seeing when going from 1.5 to 3 MB cache?

Then again, why did Tom get _zero_ going from 256 to 512K when I got a bonus
going from 512 to 1024 and 1024 to 2048, and Eugene got a bonus for going from
1.5 to 3.0M?  Hopefully you get my point.  What Tom's zero proved I don't see.

>
>
>>What "problem" did I not isolate?  I simply ran exactly the same program, on
>>exactly the same processors, but with three different cache sizes.  I measured
>>the difference in speed and since the only difference was cache size, the speed
>>difference had to be attributed to that.

>Correct, but how big a hash did you use?
>Suppose the cache was able to store half your hash.
>

4 megs.  If going from 512 to 1024 helps, I'd expect going from 256 to 512 to
help as well.  Again, the working set is not the point.  The suggestion that no
performance improvement going from 256 to 512 proves it to be less than 256K is
simply wrong.  That was the _only_ point...

>
>>Give me a break.  There is but _one_ interpretation of the data I presented.
>>Bigger cache improves performance measurably for Crafty.  What other
>>interpretation is possible from the data either I or Eugene have observed?
>
>There is a discrepancy with what others have seen, this must be explained
>somehow or your theory is not viable.

Again, what on earth are you talking about?  What "theory" have I proposed?

>
>I've aired a hypothesis that fits the data, further experiments are required to
>confirm it however.

Absolutely _no_ hypothesis fits the data presented.

>
>Two experiments I could think of:
>1) do the same tests again, only this time with 0 kB main+pawn hash
>2) add a "blow-out" table to be looped at every node and let it grow to see
>where the barrier is.
>
>-S.
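
The second experiment proposed above, a "blow-out" table looped at every node, might look roughly like this.  It is a minimal sketch, not Crafty code: blow_out and LINE_BYTES are hypothetical names, and the 64-byte line size is an assumption that would need adjusting per CPU (the PIV mentioned earlier uses 128).

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BYTES 64  /* assumed cache-line size; adjust for the CPU under test */

/* Touch one byte in every cache line of `table` and return the number of
   lines touched.  Called once per node, this can evict up to `bytes` of
   the program's own data from cache. */
size_t blow_out(volatile unsigned char *table, size_t bytes)
{
    size_t touched = 0;
    assert(table != NULL || bytes == 0);
    for (size_t i = 0; i < bytes; i += LINE_BYTES) {
        table[i]++;   /* volatile write so the loop is not optimized away */
        touched++;
    }
    return touched;
}
```

The idea would be to rerun the same benchmark with `bytes` growing from run to run (64K, 128K, 256K, ...) and record nodes per second.  The table size at which speed starts to fall off sharply gives a rough idea of how much cache the program was actually using, keeping in mind that the loop itself also costs time at every node.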


Last modified: Thu, 15 Apr 21 08:11:13 -0700
