Author: Roberto Waldteufel
Date: 23:43:41 10/22/98
Go up one level in this thread
On October 22, 1998 at 16:34:11, Robert Hyatt wrote: >On October 22, 1998 at 14:03:01, Roberto Waldteufel wrote: > >> >>I know this may seem like a dumb question, but could you explain exactly how the >>cache works? I know the basic idea is to speed up memory acces by holding >>frequently accessed data in a special place, the "cache", from where it may be >>accessed more quickly, but I don't know much more than that. Is the cache part >>of RAM, or is it held on the CPU chip, or on a separate chip? What is the >>significance of L1 and L2 cache? I have heard that sometimes the cache is >>accessed at different speeds depending on the machine. Is it possible to add >>more cache, and if so would this be likely to improve performance? The most >>important thing I wold like to understand is how I can organise my programs so >>as to extract maximum benefit from the cache available (I use a P II 333MHz). >> >>In the Spanish Championships I heard that some programs (including mine), >>especially the bigger ones, were at a disadvantage due to the absence of L2 >>cache on the Celeron machines. I don't know how big an effect this may have had. >>If I better understood how caching works, maybe I could improve the way I code >>things. I'm afraid I'm rather a novice regarding hardware details like this. >> >>Thanks in advance, >>Roberto > > > >The idea is simple. On the pentium/pentiumII/etc machines, cache is broken >up into "lines" of 32 bytes. When the CPU fetches a word, instruction, >a byte, etc, the entire block of 32 bytes (one line) is copied to the cache >in a very fast burst. Then, from this point forward, rather than waiting >for 20+ cpu cycles to fetch data, it is fetched from the L2 cache in 2 >cycles on the Pentium II, or 1 cycle on the pentium pro or xeons. It can >make the program run much faster as you might guess, in that when you >modify memory, you really only modify the cache, and memory is updated way >later if possible. So a loop might increment a value 100 times, but memory >is only fetched once and written to once, which saves substantial amounts >of time. > >The plain celeron 300 doesn't have any L2 cache, while the newer celeronA >or any that are 333mhz and faster have 128KB of L2 cache that is as fast >as the pentium pro/xeon (core cpu speed rather than core cpu speed / 2) > >But it is nothing more than a high-speed buffer between the CPU and >memory, but it is obviously much smaller than memory... Hi Bob, Thanks for the explanation. Would I be right in thinking that data structures of exactly 32 bytes lengthe would be the most efficient on PII machines, so that the structure fits exactly into one "line" of cache? Best wishes, Roberto
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.