Author: Robert Hyatt
Date: 20:35:51 09/01/03
On August 30, 2003 at 10:36:53, Vincent Diepeveen wrote:

>On August 30, 2003 at 02:58:53, Johan de Koning wrote:
>
>>On August 29, 2003 at 08:53:50, Robert Hyatt wrote:
>>
>>>On August 29, 2003 at 02:34:42, Johan de Koning wrote:
>>>
>>>>On August 28, 2003 at 11:45:35, Robert Hyatt wrote:
>>>>
>>>>>On August 28, 2003 at 01:50:46, Johan de Koning wrote:
>>>>>
>>>>>>On August 27, 2003 at 12:25:51, Robert Hyatt wrote:
>>>>>>
>>>>>>>On August 26, 2003 at 21:12:45, Johan de Koning wrote:
>>>>>>
>>>>>>[snip]
>>>>>>
>>>>>>It seems we're finally back to where you and me started off.
>>>>>>
>>>>>>>>You keep saying that copy/make causes problems with cache to memory traffic.
>>>>>>>>Here I was just saying it doesn't, if cache is plenty.
>>>>>>>
>>>>>>>Here is the problem:
>>>>>>>
>>>>>>>When you write to a line of cache, you _guarantee_ that entire line of cache
>>>>>>>is going to be written back to memory. There are absolutely no exceptions to
>>>>>>>that. So copying from one cache line to another means that "another line" is
>>>>>>>going to generate memory traffic.
>>>>>>
>>>>>>Here is the solution: write-through caches were abandoned a long time ago.
>>>>>
>>>>>I'm not talking about write-through.
>>>>
>>>>I'm glad you aren't. :-)
>>>>
>>>>> I am talking about write-back. Once
>>>>>you modify a line of cache, that line of cache _is_ going to be written back
>>>>>to memory. When is hard to predict, but before it is replaced by another cache
>>>>>line, it _will_ be written back. So you write one byte to cache on a PIV, you
>>>>>are going to dump 128 bytes back to memory at some point. With only 4096 lines
>>>>>of cache, it won't be long before that happens... And there is no way to
>>>>>prevent it.
>>>>
>>>>Sure, every dirty cache line will be written back at *some* point. But you're
>>>>allowed to use or update it a million times before it is flushed only once.
>>>>Number of cache lines has nothing to do with it. On a lean and empty system
>>>>some lines might even survive until after program termination.
>>>
>>>Number of cache lines has everything to do with it. If you can keep 4K
>>>chunks of a program in memory, and the program is _way_ beyond 4K chunks
>>>in size of the "working set", then cache is going to thrash pretty badly.
>>
>>Working set, by whatever definition, is not relevant.
>>Frequency distribution is.
>>Since *that* is the basis of caching.
>>(And analogously of compression.)
>>
>>>I've already reported that I've tested on 512K, 1024K and 2048K processors,
>>>and that I have seen an improvement every time L2 gets bigger.
>>
>>Yes, reported as not significant. But off-topic since your large tables are
>>addressed irregularly, hence never threaten hot data in L1.
>>
>>>As I said initially, my comments were _directly_ related to Crafty. Not to
>>>other mythical programs nor mythical processor architectures. But for Crafty,
>>>copy/make was slower on an architecture that is _very_ close to the PIV of
>>>today, albeit with 1/2 the L2 cache, and a much shorter pipeline.
>>
>>As I said initially, writing to cache (ie just writing) does not relate to
>>memory traffic. That was the issue for the last 10 days.
>>
>>I'm not challenging your results with Crafty at all, I'm only doubting them.
>>And I'd still like to see a copy simulation, preferably on different machines,
>>to put things in at least some perspective.
>
>A few years ago when Hyatt was asked why he chose for Xeons PII/PIII with less
>L2 cache than possible in his quad xeon, he answered there was 0% difference in
>performance for crafty :)

I said "there was very little improvement". I haven't changed that one bit.
I have 512K, 1M and 2M Xeons here. A couple of percent faster for each jump.
Compare the prices to see why I said "don't buy the 1M/2M processors." It's
pretty clear.

It's also pretty clear that Crafty is very L2-unfriendly. It's not hard to
run two programs, and use the MSR counters to measure cache line misses for
each program.

>
>>
>>
>>[...]
>>>Of course X86 is crippled for many more reasons than that. 8 registers for
>>>starters. :)
>>
>>Well, here is *something* we agree on.
>>
>>... Johan
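
The two positions argued above can be made concrete with a toy write-back cache model. This is only a minimal sketch under loud assumptions: the cache is fully associative with LRU replacement (a real PIV L2 is set-associative), it uses the 4096 lines of 128 bytes quoted in the thread, and the class and function names (WriteBackCache, hot_line_writebacks, streaming_writebacks) are made up for illustration. It is not a measurement of Crafty or of any real processor, and it says nothing about timing.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy fully associative, LRU, write-back cache (an assumption, not a PIV model)."""

    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = OrderedDict()  # line number -> dirty flag, least recent first
        self.writebacks = 0         # dirty lines written back to memory so far

    def access(self, addr, write):
        line = addr // self.line_size
        if line in self.lines:
            # Hit: keep the dirty state and move the line to most-recent position.
            dirty = self.lines.pop(line) or write
        else:
            # Miss: evict the least recently used line if the cache is full.
            if len(self.lines) >= self.num_lines:
                _, victim_dirty = self.lines.popitem(last=False)
                if victim_dirty:
                    self.writebacks += 1
            dirty = write
        self.lines[line] = dirty

    def flush(self):
        # Write back every line that is still dirty, then empty the cache.
        self.writebacks += sum(1 for dirty in self.lines.values() if dirty)
        self.lines.clear()


def hot_line_writebacks(writes):
    """Write the same address repeatedly, flush once, return total write-backs."""
    cache = WriteBackCache(num_lines=4096, line_size=128)
    for _ in range(writes):
        cache.access(0, write=True)
    cache.flush()
    return cache.writebacks


def streaming_writebacks(passes, working_set_lines):
    """Write one byte per line across a working set, cyclically, then flush."""
    cache = WriteBackCache(num_lines=4096, line_size=128)
    for _ in range(passes):
        for line in range(working_set_lines):
            cache.access(line * cache.line_size, write=True)
    cache.flush()
    return cache.writebacks


if __name__ == "__main__":
    print(hot_line_writebacks(100_000))    # one hot line, rewritten many times
    print(streaming_writebacks(2, 4096))   # working set exactly fits the cache
    print(streaming_writebacks(2, 8192))   # working set is twice the cache
```

In this model a single hot line rewritten 100,000 times is written back once, at the final flush, which is Johan's point about frequency of use. Two passes over 4096 lines cost 4096 write-backs for 8192 writes, while two passes over 8192 lines cost 16384 write-backs for 16384 writes, which is Hyatt's point that the number of lines matters once the cyclically touched data exceeds the cache. Which regime a real program falls into is exactly what the thread disputes, and only a measurement on the real hardware could settle it.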
Last modified: Thu, 15 Apr 21 08:11:13 -0700