Author: Robert Hyatt
Date: 15:15:55 08/22/04
On August 22, 2004 at 17:12:05, Tom Kerrigan wrote:

>On August 22, 2004 at 11:10:33, Robert Hyatt wrote:
>
>>Simple. I access a batch of random attack entries. I then access a _lot_ of
>>other stuff before I come back to the attack entries. Your 256K cache has 4K
>>lines. I don't know what the set associativity is, as AMD has lots of options
>>there in recent history. But that further reduces the number of "buckets" to
>>stuff stuff in. My xeon claims 16-way set associativity, with 128 byte lines.
>>That turns into a paltry 256 sets. It is very easy to get a bad physical
>>memory layout where you don't even use all of those sets, and where some sets
>>get badly overloaded.
>
>This is a bunch of nonsense. You make it sound like associativity somehow
>decreases the amount of cache you have. Really, associativity has no place in
>this discussion, except maybe to note that it reduces the behavior that you're
>complaining about, namely random accesses evicting important data from the
>cache.

Then you don't understand set associativity. It _does_ influence what goes where. Each physical memory address maps directly to one set. A program doesn't necessarily use _every_ set in cache, due to poor physical memory layout decisions by the O/S.

>
>But let's say you do randomly access your working set. How about you explain how
>performance isn't increased going from 256k to 512k cache?

I believe I already answered. I _can't_ explain it because I _can't_ reproduce it. All of my data was for 512K/1024K/2048K, with everything else being identical except for unpublished things Intel _could_ have done, and the larger caches produced faster results. Eugene did the same. I had some old alpha data with some larger L3 cache sizes, but I haven't included those as I don't know exactly how the hardware configurations might have differed. So I can _not_ explain why you got no speedup. That is contrary to everything I have ever seen.
I'll ask the AMD folks if they have some machines I can try that are identical except for cache size. Ditto for an Intel box, although I am not sure that Intel still does the 2048K xeons...

>You're randomly
>accessing a bigger random subset of your working set. (And really, 512k is a
>large percentage of your maximum possible working set... Windows reports that it
>only allocates 5MB for Crafty, including hash tables, tables that never get
>used, code that isn't part of the search, memory for the stack, etc.)
>
>-Tom

Strange number. Linux reports 20M. How can it use 5MB when the default hash sizes total 4 megs? I.e., start crafty and type "hash" and "hashp". You get almost 4 megs for those. There are 128 TREE blocks also, but they get malloc()'ed. That is way over a meg total... 5MB has to be wrong...
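
The set arithmetic in the discussion above can be checked with a small sketch. It is only an illustration, not anything taken from Crafty or from a vendor manual. The figures that are not in the post are assumptions: a 512K cache for the xeon (which is what 16 ways, 128 byte lines and 256 sets imply), a 64 byte line for the 256K cache (implied by the "4K lines" figure), a 4K page size, and a cache indexed by simple modulo on the physical line number. The function names are made up for the example.

```python
# Sketch of how physical addresses map to cache sets, and how a bad
# physical page layout can leave most sets unused.
# Assumed (not stated in the post): 512K xeon cache, 4K pages,
# simple modulo indexing on the physical line number.

def num_sets(cache_bytes, line_bytes, ways):
    """Number of sets = total lines / lines per set (the associativity)."""
    return cache_bytes // (line_bytes * ways)

def set_index(phys_addr, line_bytes, sets):
    """Set that a physical address falls into under modulo indexing."""
    return (phys_addr // line_bytes) % sets

def sets_touched(page_numbers, page_bytes, line_bytes, sets):
    """All sets that the lines of the given physical pages map to."""
    touched = set()
    for page in page_numbers:
        base = page * page_bytes
        for offset in range(0, page_bytes, line_bytes):
            touched.add(set_index(base + offset, line_bytes, sets))
    return touched

CACHE, LINE, WAYS, PAGE = 512 * 1024, 128, 16, 4096

SETS = num_sets(CACHE, LINE, WAYS)          # 16-way, 128 byte lines
AMD_LINES = (256 * 1024) // 64              # 256K cache, 64 byte lines

# Eight consecutive physical pages cover every set once.
good_layout = sets_touched(range(0, 8), PAGE, LINE, SETS)

# Eight pages whose page numbers are all multiples of 8 land on the
# same 32 sets, so the rest of the cache is never used by this data.
bad_layout = sets_touched(range(0, 64, 8), PAGE, LINE, SETS)

print("sets:", SETS)
print("lines in 256K cache:", AMD_LINES)
print("sets used, consecutive pages:", len(good_layout))
print("sets used, unlucky pages:", len(bad_layout))
```

Both layouts hold the same 32K of data, but under these assumptions the unlucky one squeezes it into 32 of the 256 sets. With 16 ways per set that still fits, yet any further pages of the same "color" start evicting each other while the other 224 sets sit idle.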
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.