Author: Robert Hyatt
Date: 14:48:08 08/21/04
On August 21, 2004 at 12:18:15, Sune Fischer wrote:

>On August 21, 2004 at 10:46:13, Robert Hyatt wrote:
>
>>>>extern BITBOARD set_mask_rr45[65];
>>>>
>>>>
>>>>Those are just a few random tables,
>>>
>>>They aren't random at all, that's my point.
>>
>>That has been _my_ point. I run through a _lot_ of such tables. And that tends
>>to flush cache before something gets reused. A random probe into anything
>>replaces N bytes (a cache line). IE on my PIV that is a 1 byte access replaces
>>128 bytes of cache in one chunk. On my dual xeons, that is 4096 cache lines.
>>IE 4096 random accesses can completely flush cache.
>
>I think 4096 _random_ accesses is a lot.
>It's going to take you quite a few nodes to hit that number of accesses and most
>those accesses will not be random.

I can only speak for my program. All of the table accesses I gave you are random.

>
>>Who is ignoring that fact? I reported that when I first looked into buying my
>>first quad xeon, I benchmarked Crafty on the PII xeon with 512kb, 1024kb, and
>>2048kb of L2 cache. 1024K was 10% faster. 2048 was another 7% faster. Eugene
>>ran crafty on IA64 with 1.5mb L2 and 3.0mb L2 and found 3.0mb was 10% faster.
>>
>>Who is ignoring what?
>>
>>Tom ran on 256 and 512K and concluded the working set for crafty was < 256K. I
>>simply said the data doesn't support the conclusion. The conclusion _may_ be
>>right. But two of us ran on larger L2 boxes and got better performance. One
>>did not. Two of us ran up to 2048K, one only tried 256 and 512k.
>
>What you're saying here is basically that Crafty has a working set that is much
>larger than 3 MB, for sure it is so much larger that no improvement can even be
>measured when going from 256 to 512 kB.

No, I am not saying _anything_ about the working set of Crafty. You _totally_ miss the point. I am only saying that if you run a program with X L2 and then with 2X L2, and the speed is the _same_, you did _not_ just prove that the working set of the program is <= X.

That is _all_ I have said. I don't know what my working set is. I don't care what it is. I do know that for two different testers, bigger cache was faster; for Tom it wasn't. Why that is I have no idea, I really don't care, and I don't see any point in investigating further. I only pointed out that the _original_ conclusion was badly flawed. Nothing more, nothing less...

>
>I don't believe that.
>
>What I do believe however, is that to measure the influence of cache size you
>should run on identical machines (Eugene did not, I don't know about you).

As I said, I did. It was a Dell PowerEdge 6000. Dell did nothing but swap processors for me after each test. Everything else was identical. The only question left is whether Intel changed anything between the three different processors. They had the same clock speed and bus speed, all 3 were Pentium II 400MHz Xeons, and the only thing Intel _claimed_ was different was L2 cache size. Whether there was another non-published difference (different set size, etc.), I don't know.

>
>Secondly, consider we have other factors that would benefit from larger caches,
>things like the hash.
>Even if the cache could only store the last 1000 hash entries, those 1000 would
>also be the most interesting and most likely to be used next.

Fine. But then how would you explain Tom's test showing _no_ improvement with 2X L2? We have to stick to what the data shows. His suggests something strange is going on.

>
>How big is this effect? You don't know, so how do you know this is not what
>you're seeing when going from 1.5 to 3 MB cache?

Then again, why did Tom get _zero_ going from 256 to 512K when I got a bonus going from 512 to 1024 and 1024 to 2048, and Eugene got a bonus for going from 1.5 to 3.0M? Hopefully you get my point. What Tom's zero proved I don't see.

>
>
>>What "problem" did I not isolate? I simply ran exactly the same program, on
>>exactly the same processors, but with three different cache sizes. I measured
>>the difference in speed and since the only difference was cache size, the speed
>>difference had to be attributed to that..
>Correct, but how big a hash did you use?
>Suppose the cache was able to store half your hash.
>

4 megs. If going from 512 to 1024 helps, I'd expect going from 256 to 512 to help as well. Again, the working set is not the point. The suggestion that no performance improvement going from 256 to 512 proves it to be less than 256K is simply wrong. That was the _only_ point...

>
>>Give me a break. There is but _one_ interpretation of the data I presented.
>>Bigger cache improves performance measurably for Crafty. What other
>>interpretation is possible from the data either I or Eugene have observed?
>
>There is a discrepancy with what others have seen, this must be explained
>somehow or your theory is not viable.

Again, what on earth are you talking about? What "theory" have I proposed?

>
>I've aired a hypothesis that fits the data, further experiments are required to
>confirm it however.

Absolutely _no_ hypothesis fits the data presented.

>
>Two experiments I could think of:
>1) do the same tests again, only this time with 0 kB main+pawn hash
>2) add a "blow-out" table to be looped at every node and let it grow to see
>where the barrier is.
>
>-S.
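
For readers who want to try the second experiment quoted above (a "blow-out" table looped at every node), here is a minimal sketch in C. It is not Crafty code. The names cache_lines, BlowOut, blowout_init, blowout_touch and blowout_free are hypothetical, and the 512 KB cache size used in the usage note is an assumption chosen only because 4096 lines of 128 bytes each add up to that figure. The idea is to read one byte per cache line of a table whose size you grow between runs, and watch where nodes per second starts to drop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of cache lines in a cache of cache_bytes, given line_bytes per line. */
size_t cache_lines(size_t cache_bytes, size_t line_bytes) {
    assert(line_bytes > 0);
    return cache_bytes / line_bytes;
}

/* Hypothetical "blow-out" table: a buffer walked once per node so that it
   competes with the engine's own data for cache space. */
typedef struct {
    unsigned char *data;
    size_t bytes;       /* total table size in bytes */
    size_t line_bytes;  /* assumed cache line size in bytes */
} BlowOut;

/* Allocate a zeroed table. Returns 0 on success, -1 on failure. */
int blowout_init(BlowOut *b, size_t bytes, size_t line_bytes) {
    assert(b != NULL);
    if (bytes == 0 || line_bytes == 0) return -1;
    b->data = calloc(bytes, 1);
    if (b->data == NULL) return -1;
    b->bytes = bytes;
    b->line_bytes = line_bytes;
    return 0;
}

/* Read one byte from each cache line of the table. Each read pulls a whole
   line into cache, which is the effect described in the post. Returns the
   number of lines touched. The volatile sink keeps the compiler from
   optimizing the reads away. */
size_t blowout_touch(const BlowOut *b) {
    volatile unsigned char sink = 0;
    size_t touched = 0;
    for (size_t i = 0; i < b->bytes; i += b->line_bytes) {
        sink ^= b->data[i];
        touched++;
    }
    (void)sink;
    return touched;
}

void blowout_free(BlowOut *b) {
    free(b->data);
    b->data = NULL;
    b->bytes = 0;
}
```

With these definitions, cache_lines(512 * 1024, 128) gives 4096, matching the figure in the quoted text under the assumed sizes. To run the experiment you would call blowout_touch once per node in the search and repeat the benchmark with a larger table each time. Note that this walk is sequential, so hardware prefetching may hide some of the cost that truly random probes would show.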