Computer Chess Club Archives




Subject: Re: Source code to measure it - there is something wrong

Author: Keith Evans

Date: 14:37:33 07/17/03


On July 17, 2003 at 17:26:50, Robert Hyatt wrote:

>On July 17, 2003 at 02:17:29, Gerd Isenberg wrote:
>>>>And, after all, we use virtual memory nowadays. Doesn't this include one more
>>>>indirection (done by hardware)? Without knowing much about it, I wouldn't be
>>>>surprised if hardware time for those indirections is needed more often with
>>>>the random access style pattern.
>>>You are talking about the TLB.
>>>The memory mapping hardware needs two memory references to compute a real
>>>address before it can be accessed.  The TLB keeps the most recent N of these
>>>things around.  If you go wild with random accessing, you will _certainly_
>>>make memory latency 3x what it should be, because the TLB entries are 100%
>>>useless.  Of course that is not sensible because 90+ percent of the memory
>>>references in a chess program are _not_ scattered all over memory.
>>Aha, that's interesting. So memory latency is really the time between switching
>>the physical address to the bus and getting the data _and_ does not consider
>>translation from virtual to physical addresses via TLB (Translation Look-aside
>>Buffer).
>>So Vincent's benchmark seems not that bad to get a feeling for "worst case"
>>virtual address latency - which is likely for hashtable reads.
>Sure.  But that simply isn't "memory latency".  And, as I mentioned in another
>post, the PC supports 4K or 4M pages.  4M pages means a 62 entry TLB is good
>for about 1/4 gig of RAM, accessed randomly, with _no_ TLB penalty.
>The X86 also supports a three-level map, which would add even another cycle
>to the virtual-to-real translation, should a system use it.  I'd think a saner
>approach would be to step up to 4M pagesize before going to that huge map.
>BTW, lm-bench says my xeon has 62 TLB entries.  I've not verified that from
>Intel however.
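
The arithmetic in the quoted post can be sketched in a few lines of C. The function names are made up for illustration. The 62-entry figure is the lm-bench number quoted above, which was not verified against Intel's documentation. The sketch also assumes every TLB entry can hold a page of either size, which may not be true of real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of memory a TLB can map without a miss: entries * page size.
   Assumes every entry can hold a page of the given size. */
uint64_t tlb_reach_bytes(unsigned entries, uint64_t page_bytes)
{
    assert(entries > 0 && page_bytes > 0);
    return (uint64_t)entries * page_bytes;
}

/* Worst-case memory references per access when every TLB lookup misses:
   one reference per page-table level, plus the data access itself. */
unsigned refs_per_access_on_miss(unsigned map_levels)
{
    return map_levels + 1;
}
```

With 62 entries, tlb_reach_bytes(62, 4096) gives 248 KB and tlb_reach_bytes(62, 4u << 20) gives 248 MB. refs_per_access_on_miss(2) gives the 3x figure mentioned earlier in the thread, and a three-level map would make it 4.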

So I guess that you can make your hash tables too big ;-)

If this is the cause of the discrepancy, can't those other benchmarks be rerun
with, say, a 250 MB array, to see whether the latency drops?
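
One way to try this is a pointer-chasing loop over an array whose size is a parameter. What follows is a minimal sketch and not the benchmark discussed in this thread. All names are hypothetical, and it assumes clock() is precise enough for a rough per-access average. Note that the number it reports mixes cache misses and TLB misses, so only the change between array sizes is informative.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Small xorshift64 generator; rand() may not cover large index ranges. */
static uint64_t rng_state = 88172645463325252ULL;
static uint64_t next_random(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

/* Fill next[0..n-1] with a random permutation forming one single cycle
   (Sattolo's algorithm), so a chase visits every slot before repeating. */
void build_cycle(size_t *next, size_t n)
{
    assert(n >= 2);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)(next_random() % i);   /* 0 <= j < i */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` dependent loads and return the final index. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    while (steps--)
        p = next[p];
    return p;
}

/* Average nanoseconds per dependent load over an array of `bytes` bytes.
   Returns a negative value if allocation fails. */
double ns_per_access(size_t bytes, size_t steps)
{
    size_t n = bytes / sizeof(size_t);
    size_t *next = malloc(n * sizeof *next);
    if (next == NULL)
        return -1.0;
    build_cycle(next, n);
    clock_t t0 = clock();
    volatile size_t sink = chase(next, 0, steps);  /* keep the loop alive */
    clock_t t1 = clock();
    (void)sink;
    free(next);
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)steps;
}
```

Calling ns_per_access() with sizes on either side of 248 MB, for example 200 MB and 500 MB, and comparing the two results would show whether latency changes once the array outgrows what 62 entries of 4M pages can map. That comparison only makes sense if the operating system actually backs the array with 4M pages, which this sketch does not request.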


Last modified: Thu, 07 Jul 11 08:48:38 -0700
