Computer Chess Club Archives



Subject: Re: copy/make vs make/unmake some data

Author: Robert Hyatt

Date: 13:52:42 09/03/03



On September 03, 2003 at 16:37:03, Gerd Isenberg wrote:

>On September 03, 2003 at 14:53:23, Robert Hyatt wrote:
>
>>I finally found time to run a few tests.
>>
>>First, the set-up:  I took all my bitmap stuff, added it up, and put it in
>>a structure that was then set up as an array.  I.e., tdata[128] was an array
>>of these structures, each element being just enough bytes to hold all of what
>>I call a "position" (bitmap boards, hash signature, 50-move counter,
>>etc.).
>>
>>At the top of MakeMove() I added tdata[ply+1]=tdata[ply]; to do the copy
>>stuff.  That's all.
>>
>>Some results:
>>
>>fine 70 searched to 39 plies deep:
>>
>>copy/make : 15.5 seconds, make/unmake 14.0 seconds  (11% overhead)
>>
>>kopec 22 searched to 12 plies deep:
>>
>>copy/make : 32.4 seconds, make/unmake 28.3 seconds. (14.5% overhead)
>>
>>mate position searched 9 plies deep (mate in 10)
>>
>>copy/make : 8.9 seconds, make/unmake 7.5 seconds.  (18.7% overhead)
>>
>>That's not the whole story, however.  The copy/make approach requires an
>>extra register everywhere since the data structures have to be accessed
>>through a pointer (or via an array subscript, same thing).  My test case
>>does not take care of that.  But if you were to mark one register as
>>"unusable" for the compiler, the result would be worse, for certain.  Since
>>these data values are accessed all over the place, a register has to be used
>>everywhere, which is going to add to the above, significantly.  If it only adds
>>10% then the above numbers are back to what I originally saw when I speeded
>>things up by 25% by getting rid of copy/make.
>>
>>That's data for Crafty.  YMMV of course...
>
>Interesting results. Not sure about the additional register, depends on the
>implementation.

I assume that _any_ pointer needs a register, whereas the global board stuff
does not.  I.e., I know how to get a global w_pawns into a register, but if I
have w_pawns[i] I have to deal with i, which eats a register.  That was my
point.  I am copying data in this test, but I am not using it anywhere so there
is no register loss.  If this were my old copy/make program, after copying all
that stuff, I would continually reference it with a subscript or via pointer to
get to the right instance of it.
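
To make the register point concrete, here is a minimal sketch in C of the two access patterns.  The names w_pawns_global, w_pawns_by_ply, read_global and read_indexed are hypothetical, chosen only for illustration; they are not Crafty's actual identifiers.  The global has a fixed address, while the per-ply copy needs the ply value (or a pointer derived from it) live in a register at every access.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t BITBOARD;

#define MAX_PLY 128

/* make/unmake style: a single global instance at a fixed address. */
BITBOARD w_pawns_global;

/* copy/make style: one instance per ply, selected by a subscript. */
BITBOARD w_pawns_by_ply[MAX_PLY];

/* The address is a constant, so no index or base register is needed. */
BITBOARD read_global(void)
{
    return w_pawns_global;
}

/* The ply must be held in a register to form the address. */
BITBOARD read_indexed(int ply)
{
    assert(ply >= 0 && ply < MAX_PLY);
    return w_pawns_by_ply[ply];
}
```

How much this costs in practice depends on the compiler and on how many registers the target has, so treat it as an illustration of the argument rather than a measurement.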



> If you address an array with n and n-1, one register for n
>should be enough. I guess it's really the latency of the additional read cycles
>- plus the penalty for additional cache pollution.

I think cache is a real issue.


>
>How many bytes do you copy?



The actual number is, I believe, 232 bytes.  But that turns into 256 obviously,
as it takes two cache lines.

>Are source and target adjacent and properly aligned?

Yes.  I rounded the struct to 256 bytes, so that each element is exactly the
right size and properly aligned.
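
A rough sketch of what such a padded element and the copy might look like in C11.  The field layout (27 boards, a hash signature, two counters, adding up to 232 bytes), the names POSITION and copy_position, and the 64-byte alignment are all assumptions made for illustration; only tdata[128], the 232/256 byte sizes and the tdata[ply+1]=tdata[ply] assignment come from the thread.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PLY        128
#define POSITION_BYTES 256

/* Hypothetical layout: 232 bytes of data, rounded up to 256 by the union. */
typedef union {
    struct {
        uint64_t boards[27];       /* 216 bytes of bitmap boards (assumed count) */
        uint64_t hash_signature;   /*   8 bytes */
        int32_t  fifty_move_count; /*   4 bytes */
        int32_t  other_state;      /*   4 bytes, total 232 */
    } pos;
    unsigned char pad[POSITION_BYTES];
} POSITION;

_Static_assert(sizeof(POSITION) == POSITION_BYTES, "element must be 256 bytes");

/* Alignment of 64 is an assumed cache-line size; adjust for the target CPU. */
static _Alignas(64) POSITION tdata[MAX_PLY];

/* The copy step done at the top of MakeMove() in the test described above. */
void copy_position(int ply)
{
    assert(ply >= 0 && ply + 1 < MAX_PLY);
    tdata[ply + 1] = tdata[ply];   /* struct assignment copies all 256 bytes */
}
```

Because the element size is a multiple of the alignment, every tdata[ply] starts on an aligned boundary once the array itself does.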

>Have you ever tried MMX-copy instead of memcpy?

Nope.  Any performance numbers to compare them???


>
>movq mm0, [source]
>movq mm1, [source+8]
>...
>movq [target  ], mm0
>movq [target+8], mm1
>...
>
>Another improvement in copy/make is to combine the separate copy and make
>read-writes.
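
For readers who want to experiment with the load-several-then-store-several pattern in the movq sequence quoted above, here is a portable C analogue.  It is not MMX code: it uses plain 64-bit integers, the function name copy_qwords is hypothetical, and whether it beats memcpy or a struct assignment would have to be measured on the actual machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy nqwords 8-byte words, grouping four loads before four stores,
   in the same spirit as the quoted movq sequence.  The source and
   target regions must not overlap. */
void copy_qwords(uint64_t *target, const uint64_t *source, size_t nqwords)
{
    size_t i = 0;

    /* Main loop: four loads first, then four stores. */
    for (; i + 4 <= nqwords; i += 4) {
        uint64_t q0 = source[i];
        uint64_t q1 = source[i + 1];
        uint64_t q2 = source[i + 2];
        uint64_t q3 = source[i + 3];
        target[i]     = q0;
        target[i + 1] = q1;
        target[i + 2] = q2;
        target[i + 3] = q3;
    }

    /* Tail: copy any remaining words one at a time. */
    for (; i < nqwords; i++)
        target[i] = source[i];
}
```

A 256-byte position corresponds to nqwords = 32, so the tail loop would not run in that case.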




Last modified: Thu, 15 Apr 21 08:11:13 -0700
