Computer Chess Club Archives



Subject: Re: working set is < 384k

Author: Robert Hyatt

Date: 14:29:32 08/21/04



On August 21, 2004 at 08:41:27, Vincent Diepeveen wrote:

>On August 20, 2004 at 11:51:27, Tom Kerrigan wrote:
>
>>On August 20, 2004 at 10:51:50, Robert Hyatt wrote:
>>
>>>On August 20, 2004 at 04:33:07, Tom Kerrigan wrote:
>>>
>>>>Now that AMD is selling two processors that are identical other than L2 cache
>>>>size (Sempron has 256k, Athlon 64 has 512k) we have proof of Crafty's working
>>>>set size:
>>>>
>>>>Sempron:    1,080,020 NPS
>>>>Athlon 64:  1,080,230 NPS
>
>Did you test in 32 bits mode or so?
>
>In 64-bit mode the instructions are slightly bigger, but you need fewer of
>them, so there is less stress on the processor and more on the L2 / main
>memory.
>
>>>>http://www.anandtech.com/linux/showdoc.aspx?i=2170&p=3
>>>>
>>>>This should prove once and for all that Crafty's working set is < 256k and
>>>>therefore that size of L2 cache has no effect on its performance (as long as
>>>>it's >= 256k) and that main memory speed likely plays a trivial role
>>>>performance-wise.
>>>>
>>>>I bring this up because of all of the long debates that have occurred in the
>>>>past about the value of L2 cache, the speed of memory, and the working set size
>>>>of chess programs.
>>>>
>>>>I have no doubt that Crafty uses a bunch of memory, but obviously not with
>>>>enough temporal locality for it to matter one iota.
>>>>
>>>>-Tom
>>>
>>>
>>>Your interpretation _could_ be seriously flawed.  IE suppose its working set is
>>>2mb?  You can't conclude anything if that is true as both the 256K and 512K
>>>would be thrashing equally.
>>
>>How is it they would be thrashing equally? Let's say a cache access takes 5ns
>>and main memory takes 50ns. Average access times for 2MB working set:
>>256k cache: (256k/2MB)*5ns + ((2MB-256k)/2MB)*50ns = 44.37ns
>
>A number of things Tom.
>
>a1) First of all you need to prove that the L2 caches of both CPUs deliver
>data in the same number of cycles
>
>a2) what type of memory is used with the processors? If the 256KB processor has
>400MHz CL2 memory and the other one has 266MHz CL3 memory then the comparison
>would look odd.
>
>b) You forgot to add the L1 cache
>  128 + 256 = 384KB cache
>
>I assume your claim is working set size < 384 KB
>
>c) how big did you set the hashtable size? Some specint2000 comparison where
>only 2 MB or something similar gets used is not really interesting. We want
>400MB of memory or so.
>
>d) which version of crafty did you use? Current versions do only sequential
>lookups in 1 table, while the old specint one does 2 lookups in 2 different
>tables.
>
>e) when using a big hashtable, fetching a hash entry takes around 91 ns at
>best, and that assumes 400MHz CL2 memory. On dual Opterons, for example, it is
>hard to get under 133 ns.
>
>f) you really should do 64 bits tests, we've had enough 32 bits tests already.
>
>None of this changes the fact that I agree with the conclusion that for a
>chess program running on a SINGLE cpu it doesn't matter whether the L2 cache is
>256 or 512 or 2048 KB. The really important things are the L1 cache size, the
>L2 cache random lookup SPEED and how fast the main memory can randomly access
>hashtables.
>
>>512k cache: (512k/2MB)*5ns + ((2MB-512k)/2MB)*50ns = 38.75ns
>
>>That's 15% faster. You'd think a difference that big would show up in the
>>benchmark score but it doesn't. Or are you going to claim that Crafty always
>>uses memory that it hasn't used for the last ~512k?
>
>I agree with the math when the L2 cache speed is the same. However you
>must *prove* that first. There are big differences even from processor line to
>processor line.
>
>I remember that the Northwood P4 was a lot faster for DIEP than the P4
>generation before it, and yet everyone acts as if it is the same processor.
>For DIEP there was a 20% difference between P3 and P2 speed, while for Fritz it
>didn't matter at all. And so on.
>
>>>Only _real_ way is to test with larger sizes as well.  I've done that up to 2mb
>>>and saw improvement from 512 to 1024K and from 1024K to 2048K, on older xeons.
>>
>>Sure. And if other people run those tests and don't see a difference, you'd say
>>a chip needed a 4MB/8MB cache for anybody to be sure.
>
>You are correct here. That's how Bob works.

Nah.  That is how _you_ work.  Give one example and claim it 'proofs' your
point.  One example doesn't prove something.  It can disprove a claim easily
enough however.  You've been burned enough times that way to understand...




>
>>I have easy access to 2GHz Athlon 64s with 512k and 1MB cache... if somebody can
>>point me to a Windows Crafty executable and tell me what to type, I'll happily
>>run the test.
>
>>-Tom
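
For reference, the average-access-time estimate quoted above can be reproduced
with a few lines of Python. This is only a sketch of that simple model: it
assumes accesses are spread uniformly over the working set, so the hit rate is
just cache size divided by working set size, and it uses the 5ns / 50ns
latencies from the quoted example. The function name and parameter names are
made up for illustration; nothing here is measured on real hardware.

```python
def average_access_ns(cache_kb, working_set_kb, cache_ns=5.0, memory_ns=50.0):
    """Average access time under a uniform-access model (an assumption).

    The hit fraction is cache size / working set size, capped at 1.0 when
    the whole working set fits in the cache.
    """
    hit_fraction = min(cache_kb / working_set_kb, 1.0)
    return hit_fraction * cache_ns + (1.0 - hit_fraction) * memory_ns


WORKING_SET_KB = 2048  # the hypothetical 2MB working set from the thread

t256 = average_access_ns(256, WORKING_SET_KB)
t512 = average_access_ns(512, WORKING_SET_KB)

print(f"256k cache: {t256:.3f} ns")
print(f"512k cache: {t512:.3f} ns")
print(f"512k is {100.0 * (t256 / t512 - 1.0):.1f}% faster under this model")
```

The model says nothing about whether the two L2 caches really have the same
latency, or about access patterns with little temporal locality, which are the
objections raised in the quoted replies.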




Last modified: Thu, 15 Apr 21 08:11:13 -0700
