Computer Chess Club Archives


Subject: Re: Move ordering ?

Author: James Robertson

Date: 07:39:40 10/23/98


On October 23, 1998 at 02:43:41, Roberto Waldteufel wrote:

>
>On October 22, 1998 at 16:34:11, Robert Hyatt wrote:
>
>>On October 22, 1998 at 14:03:01, Roberto Waldteufel wrote:
>>
>>>
>>>I know this may seem like a dumb question, but could you explain exactly how the
>>>cache works? I know the basic idea is to speed up memory access by holding
>>>frequently accessed data in a special place, the "cache", from where it may be
>>>accessed more quickly, but I don't know much more than that. Is the cache part
>>>of RAM, or is it held on the CPU chip, or on a separate chip? What is the
>>>significance of L1 and L2 cache? I have heard that sometimes the cache is
>>>accessed at different speeds depending on the machine. Is it possible to add
>>>more cache, and if so would this be likely to improve performance? The most
>>>important thing I would like to understand is how I can organise my programs so
>>>as to extract maximum benefit from the cache available (I use a P II 333MHz).
>>>
>>>In the Spanish Championships I heard that some programs (including mine),
>>>especially the bigger ones, were at a disadvantage due to the absence of L2
>>>cache on the Celeron machines. I don't know how big an effect this may have had.
>>>If I better understood how caching works, maybe I could improve the way I code
>>>things. I'm afraid I'm rather a novice regarding hardware details like this.
>>>
>>>Thanks in advance,
>>>Roberto
>>
>>
>>
>>The idea is simple.  On the Pentium/Pentium II/etc. machines, cache is broken
>>up into "lines" of 32 bytes.  When the CPU fetches a word, an instruction,
>>a byte, etc., the entire block of 32 bytes (one line) is copied to the cache
>>in a very fast burst.  From that point on, rather than waiting 20+ CPU
>>cycles to fetch data from memory, the CPU fetches it from the L2 cache in 2
>>cycles on the Pentium II, or 1 cycle on the Pentium Pro or Xeon.  As you
>>might guess, this can make a program run much faster.  Writes benefit too:
>>when you modify memory, you really only modify the cache, and memory is
>>updated much later if possible.  So a loop might increment a value 100
>>times, but memory is only read once and written once, which saves a
>>substantial amount of time.
>>
>>The plain Celeron 300 doesn't have any L2 cache, while the newer Celeron A,
>>or any Celeron at 333 MHz and faster, has 128KB of L2 cache that is as fast
>>as the Pentium Pro/Xeon's (core CPU speed rather than core CPU speed / 2).
>>
>>The cache is nothing more than a high-speed buffer between the CPU and
>>memory, though it is obviously much smaller than memory...
>
>Hi Bob,
>
>Thanks for the explanation. Would I be right in thinking that data structures of
>exactly 32 bytes length would be the most efficient on PII machines, so that
>the structure fits exactly into one "line" of cache?

I think the computer uses the same method it does with the registers; i.e., it
reads and writes in 32-bit chunks, regardless of the size of the data being
transferred. So a 64-bit structure will take 2 machine cycles, but an 8-bit and
a 32-bit structure will both take 1 machine cycle. I think this is how it works;
if I am wrong, please correct me!

James

>
>Best wishes,
>Roberto
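
To make the question about 32-byte structures concrete, here is a minimal sketch in Python. It is not from anyone in this thread, and it only does the address arithmetic: given the 32-byte line size Bob quotes, it counts how many cache lines a block of data overlaps. The names LINE_SIZE, lines_touched and align_up are made up for illustration, and the sketch says nothing about actual cycle counts on any particular CPU.

```python
# Assumption: 32-byte cache lines, the figure quoted in this thread for
# Pentium-class machines. All names here are hypothetical, for illustration.
LINE_SIZE = 32


def lines_touched(address: int, size: int, line_size: int = LINE_SIZE) -> int:
    """Count the cache lines overlapped by `size` bytes starting at `address`."""
    if size <= 0:
        return 0
    first_line = address // line_size               # line holding the first byte
    last_line = (address + size - 1) // line_size   # line holding the last byte
    return last_line - first_line + 1


def align_up(address: int, line_size: int = LINE_SIZE) -> int:
    """Round `address` up to the next multiple of `line_size`."""
    return (address + line_size - 1) // line_size * line_size


if __name__ == "__main__":
    # A 32-byte record fills exactly one line only if it starts on a line boundary.
    for start in (0, 8, 24, 32):
        print(f"32-byte record at offset {start:2d}: "
              f"{lines_touched(start, 32)} line(s)")
```

Under these assumptions, the size of a structure alone does not decide whether it fits in one line: a 32-byte record that starts 8 bytes past a line boundary overlaps two lines, so where the record starts matters as much as how long it is.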



Last modified: Thu, 15 Apr 21 08:11:13 -0700
