Author: Robert Hyatt
Date: 21:18:24 04/01/03
On April 01, 2003 at 21:32:35, Vincent Diepeveen wrote:

>On April 01, 2003 at 13:58:49, Robert Hyatt wrote:
>
>>On April 01, 2003 at 13:10:20, Vincent Diepeveen wrote:
>>>
>>>Remember that the slow thing for hashtables is to get the bytes out of the RAM,
>>>not from L2 or L1 cache. Those caches are very fast on the P4.
>>
>>Have I _ever_ said otherwise? That is why _latency_ is more important than
>>_bandwidth_ to a chess engine.
>>
>>>Actually i do assume 64 to 128 bytes with DIEP. 128 i use for hashtable.
>>>
>>>Even then betting on 64 bytes is safe as that's what DDR ram gives at K7.
>>
>>DDR isn't giving 64 bytes. The cache is _outside_ the memory system and doesn't
>>change depending on what kind of memory you have. If the K7 has a 64-byte line
>>length, then it simply has 64-byte lines regardless of memory. Cache is on-chip.
>>Memory is _way_ off-chip.
>>
>>>Currently you use 32 bytes.
>>
>>Nope. Currently a hash entry is a triplet, with 48 bytes. I've already told
>>you this once.
>>But the hash entries are not on 32- or 64-byte boundaries, so it is likely that
>>hash entries span two cache lines as a result. If you don't do your first probe
>>on a multiple of 128 bytes, you are going to access two lines if you need 128
>>bytes.
>
>If you code well you can prevent that of course.
>
>Even in C this is possible. As you might know a pointer is an address in memory.
>So that also tells you where the cache line starts. Set the pointer such that you
>always start at a cache line and you can prevent all horrors easily.

I've done this forever, you know. I force my hash table to a 16-byte boundary,
because malloc() only guarantees an 8-byte boundary. But I don't access a
power-of-two number of bytes. I access 48. So there is no way to make that lie in
one cache line unless I change the basic idea to 4 or 8 entries, rather than 3.
I kept 3 so the node counts would match the old 2-table approach exactly.
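For readers who want to see the pointer trick being discussed, here is a minimal sketch of forcing a malloc()'d table onto a chosen boundary. It is an illustration only, not code from either engine: the helper name `alloc_aligned` and the `raw` out-parameter are made up for this example, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from any engine): over-allocate by align-1 bytes,
   then round the address up to the next multiple of `align`.
   `align` must be a power of two. The pointer stored in *raw is the one
   that must later be passed to free(), not the aligned pointer. */
static void *alloc_aligned(size_t bytes, size_t align, void **raw)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    *raw = malloc(bytes + align - 1);
    if (*raw == NULL)
        return NULL;

    /* Round up: add align-1, then clear the low bits. */
    uintptr_t addr = ((uintptr_t)*raw + (align - 1)) & ~(uintptr_t)(align - 1);
    return (void *)addr;
}
```

Calling `alloc_aligned(size, 16, &raw)` gives the 16-byte boundary mentioned above, and passing 64 or 128 instead would start the table on a cache-line boundary. As the reply points out, aligning the start of the table does not by itself keep every 48-byte triplet inside one line.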
>
>>Cache doesn't load the first byte you access at the beginning of a line, so it
>>is possible that the first byte you access is anywhere in the line unless you
>>force alignment to 128 (or 64) byte boundaries. I don't do that as I want the
>>1:2 ratio of depth-preferred to always-store entries. And at 16 bytes per entry,
>>this puts three entries in 48 consecutive bytes, but only forced to 16 byte
>>alignment.
>>
>>>Within a few years trivially all manufacturers will move up to 128 bytes at
>>>least, simply because that gives a bigger bandwidth and though big bandwidth for
>>>99% of all applications is not the biggest issue (latency more important of
>>>course) the market only knows the word 'bandwidth'. So they demand big
>>>bandwidth.
>>
>>Maybe or maybe not. Beyond a programmer's control however. But internally L1
>>and L2 are probably not going to match as they did in the P5 days. PIV is
>>certainly an example of two different line lengths. And newer processors with
>>three levels of cache may take this to three different line lengths. I try to
>>not worry about it.
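The arithmetic behind the "spans two cache lines" point can be checked with a small sketch. The function name `count_straddling` and its parameters are hypothetical, chosen for this illustration; the sizes (16-byte entries, 48-byte triplets, 64- or 128-byte lines) are the ones quoted in the thread.

```c
#include <stddef.h>

/* Hypothetical helper: count how many of `n` consecutive probes touch two
   cache lines. Probe i covers bytes [base + i*stride, base + i*stride + span).
   `line` is the cache line length in bytes. */
static size_t count_straddling(size_t base, size_t stride, size_t span,
                               size_t line, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t first = base + i * stride;   /* first byte of the probe */
        size_t last  = first + span - 1;    /* last byte of the probe  */
        if (first / line != last / line)    /* different line indices  */
            count++;
    }
    return count;
}
```

With the table starting on a line boundary, `count_straddling(0, 48, 48, 64, 4)` counts 2 of every 4 triplets crossing a 64-byte line, and `count_straddling(0, 48, 48, 128, 8)` counts 2 of every 8 crossing a 128-byte line. Padding the probe to four entries, `count_straddling(0, 64, 64, 64, 4)`, gives 0, which is the 3-versus-4 trade-off described in the reply.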
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.