Computer Chess Club Archives

Subject: Re: Importance of L2 cache speed/size for diff programs (was:..Genius sp

Author: fca
Date: 15:55:12 08/12/98

On August 12, 1998 at 14:08:59, Robert Hyatt wrote:

>On August 12, 1998 at 14:04:19, Robert Hyatt wrote:
>
>>On August 12, 1998 at 10:58:17, fca wrote:
>>
>>>On August 12, 1998 at 07:29:11, Robert Hyatt wrote:
>>>
>>>>my P5/233/mmx clocks in at about 70% of the speed of my
>>>>P6/200 machine.  As I've said before, the P6/PII core logic is simply better
>>>>with the instruction pool and speculative execution and register renaming and
>>>>you-name-it.
>>>
>>>Sure, all of us agree here.  But to view cache speed (especially) and size
>>>differences as having an additive (i.e. multiplicative!) effect together with
>>>core speed differences is IMO misleading.  If you have something that executes
>>>instructions twice as fast, you need a cache that works twice as fast, simply to
>>>keep pace (i.e. to prevent a bottleneck).  For lower frequency (as % of CPU
>>>time) activity like hash accessing, L2 cache speed/size is therefore not so
>>>important.
>>>
>>>Which was my point to Tom, who I believe(d) was attributing to L2 hit
>>>differences part of the reasons for the 2.5x reported by blass uri between a
>>>P200MMX and a P2/300 running Junior.  I assume L2 only gets stressed by hash
>>>activity.
>>>
>>>>But hashing has little to do with how a program performs on either, due to the
>>>>relative infrequency of hash probes compared to all the other stuff like
>>>>move generation and positional evaluation...
>>>
>>>and which activities do not stress L2 cache, but live within L1. :-)
>>>
>>>SUMMARISED VIEW OF FCA:  As long as L2 size/speed keep up with "core MHz ratios
>>>times allowance for P-->P2 ", it is not relevant when trying to explain
>>>differences in performance of (say) Junior on a P200MMX and a P2/300.
>>>
>>>Q1.  Tom, are you disagreeing with me in any of this?
>>>Q2.  Bob, ditto?

>>nope...

Good. I was pretty sure you would agree.  Tom _is_ disagreeing, though, and I await
clarification from him.

>>L2 cache speed and core cpu speed are tightly coupled.  If you
>>only improve the core cpu cycle time, you get much less bang for the mhz
>>than you would expect.

Exactly so.  Which was among my reasons for _disagreeing_ with Tom's
cause-attribution that part of the 2.5x ratio between Junior on P2/300 : P200MMX
could arise from heavy L2 cache hitting.

>> If you increase both proportionally, as all the PII
>>processors do (except xeon which is 1:1) then you find other bottlenecks,
>>like bus bandwidth or memory bandwidth...

Absolutely.


>>>SUMMARISED VIEW OF FCA:  As long as L2 size/speed keep up with "core MHz ratios
>>>times allowance for P-->P2 ", it is not relevant when trying to explain
>>>differences in performance of (say) Junior on a P200MMX and a P2/300.
>>
>>yes...  except I add that hash table accesses don't even count in this equation
>>at all and have little measurable effect on performance of the cache system,
>>because they happen so infrequently when compared to all the other memory
>>references needed to process a single node...

You are not adding this, as you already said it "above" :-)

>>>>But hashing has little to do with how a program performs on either
>>>>due to the relative infrequency of hash probes compared to all
>>>>the other stuff like move generation and positional evaluation...

And I already agreed with it above, so I need not agree with it again. ;-)

>I should have added that there are lots of things that will make programs
>scale differently on identical machines.  IE take branch prediction for one
>issue.  If a program has a *lot* of if..then..else.. tests, it is going to
>do poorly when you compare it to a program that doesn't.  An example for
>clarity might be evaluating to see if a pawn is passed.  I do one test to
>answer this (AND the state of enemy pawns with a bit mask that must not
>have any opponent pawns on those squares for this pawn to be passed.)  I do
>one AND and one compare/branch.  Another program might have to look down the
>three files with a loop to be sure the pawn is passed, which means several
>conditional tests and branches.
>Since there are so many ways to do these things, and since branch prediction
>and prediction failure penalties are so variable on different processors, such
>simple algorithmic differences can make big differences in the way two
>different programs "scale"...

I agree.  And these are typically not things where L2 size has much relevance
provided some L2 exists, so nothing here supports any change of opinion by me.

Over to TK!
Last modified: Thu, 15 Apr 21 08:11:13 -0700