Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Programming question.

Author: Eugene Nalimov

Date: 21:00:21 03/19/99

Go up one level in this thread


On March 19, 1999 at 21:06:08, Bruce Moreland wrote:

>
>On March 19, 1999 at 20:58:05, Robert Hyatt wrote:
>
>>On March 19, 1999 at 16:06:10, Inmann Werner wrote:
>>
>>>Hello.
>>>
>>>Today I changed my hashtables from fixed one ta allocated one.
>>>
>>>Example
>>>Old
>>>
>>>long hindex[65536]
>>>signed char hcolor[65536]
>>>signed short int value[65536]
>>>.....
>>>
>>>New
>>>
>>>long *hindex
>>>signed char *hcolor
>>>signed short int *value
>>>....
>>>
>>>hindex=malloc(4*65536)
>>>hcolor=malloc(65536)
>>>value=malloc(2*65536)
>>>....
>>>
>>>Works fine but:
>>>
>>>On my P90, the NPS decreases about 30%  !!!!!
>>>On my Cyrix 233 everything is fine (same speed)
>>>
>>>Do I anything extremly wrong or what happens here. Has this something to do
>>>with Prozessor cache or what?
>>>
>>>Any suggestions??
>>>
>>>Werner
>>>
>>>P.S.: Excuse my C, I come from Cobol....
>>>I use Watcom C Version 11 Compiler.
>>
>>
>>here is one useful trick:  The PC L2 cache is fetched in blocks of 32 bytes
>>from memory in a 'burst'.  You want to make sure that a hash entry doesn't
>>'span' two cache lines if you can help it.  Here is a way:
>>
>>TABLE *hash_table
>>
>>hash_table=malloc(tablesize+15);
>>hash_table=(hash_table+15)&~15;
>>
>>what that does (and you will probably have to do some casting to void * to make
>>the compiler accept some of the above) is do the malloc for the normal size +
>>15 bytes, then we add 15 to that pointer and then scrub the low order 4 bits to
>>make the pointer align on 16 byte (and also 32 byte) boundaries.  malloc() will
>>align on the boundary required by the hardware, which generally tops out at 8
>>bytes.  This aligns it better.
>>
>>Although I am not saying that is your problem...
>>
>>Your problem is that a single hash table entry is scattered all over memory,
>>so that if you have draft, value, move, wtm, signature, you need _five_ cache
>>line fills to get one entry.
>>
>>Put the whole entry into a structure like this:
>>
>>struct hash_entry {
>>  long long signature
>>  short value;
>>  short move;
>>  etc.
>>}
>>
>>then use htable->signature to get to the signature, htable->value to get to
>>the value.  Now with everything in adjacent bytes, one memory 'burst' will suck
>>in the whole hash entry and each value can be accessed in cache speed, rather
>>than memory speed.
>>
>>it will make a huge difference...
>
>He had essentially the same code before, he just turned three static arrays into
>three arrays initialized at runtime.
>
>This made his code slow down by 30% and that's insane.  I can imagine something
>slowing down by 30% if it spends all of its time in one loop, but a chess
>program is more complicated than this, I think it would be hard to kill yourself
>*that* badly with a cache issue.
>
>You can talk about how to make it faster in general, but the issue here isn't
>poor performance, the issue is a huge change in performance due to something
>that shouldn't have caused perfmance to change hugely.
>
>bruce

My guess there are several factors involved:

1. Different memory layout can cause wild performance variations,
   especially on "clasic" Pentium. P6/PII/PII are much better
   here.
2. The new code needs one more register per indexation. There are
   only 7 registers available on x86, and you just added 5 values
   (or more - depend on number of the parallel arrays you use
   instead of the array of structures) that had to be keeped in
   registers. You greatly increased register pressure, and my
   guess is that register allocator in the compiler was unable to
   do the good job.
3. You could expose bug that always was in the program, especially
   if you use hard-coded constants (2 instead of sizeof (short)).
   Also, please note that static arrays are initialized by zeroes
   prior to program start. Dynamicly allocated memory initially
   contains garbage.

Eugene



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.