Computer Chess Club Archives


Subject: Re: copy/make vs make/unmake some data

Author: Gerd Isenberg

Date: 13:37:03 09/03/03


On September 03, 2003 at 14:53:23, Robert Hyatt wrote:

>I finally found time to run a few tests.
>
>First, the set-up:  I took all my bitmap stuff, added it up, and put it in
>a structure that was then set up as an array.  I.e., tdata[128] was an array
>of these structures, each element being just enough bytes to hold all of what
>I call a "position" (including bitmap boards, hash signature, 50 move counter,
>etc.).
>
>At the top of MakeMove() I added tdata[ply+1]=tdata[ply]; to do the copy
>stuff.  That's all.
>
>Some results:
>
>fine 70 searched to 39 plies deep:
>
>copy/make : 15.5 seconds, make/unmake 14.0 seconds  (11% overhead)
>
>kopec 22 searched to 12 plies deep:
>
>copy/make : 32.4 seconds, make/unmake 28.3 seconds. (14.5% overhead)
>
>mate position searched 9 plies deep (mate in 10)
>
>copy/make : 8.9 seconds, make/unmake 7.5 seconds.  (18.7% overhead)
>
>That's not all the story, however.  The copy/make approach requires an
>extra register everywhere since the data structures have to be accessed
>through a pointer (or via an array subscript, same thing).  My test case
>does not take care of that.  But if you were to mark one register as
>"unusable" for the compiler, the result would be worse, for certain.  Since
>these data values are accessed all over the place, a register has to be used
>everywhere, which is going to add to the above, significantly.  If it only adds
>10% then the above numbers are back to what I originally saw when I speeded
>things up by 25% by getting rid of copy/make.
>
>That's data for Crafty.  YMMV of course...

Interesting results. I'm not sure about the additional register; that depends
on the implementation. If you address the array with n and n-1, one register
for n should be enough. I guess it's really the latency of the additional read
cycles, plus the penalty for additional cache pollution.
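
To illustrate the n / n-1 point, here is a minimal C sketch, not Crafty's code. The Position fields, MAX_PLY, move_key and copy_make() are all made-up names for illustration, and the record is far smaller than a real one. The only thing it is meant to show is that parent and child are reached from a single base pointer, since the child is always parent + 1.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PLY 128

/* Hypothetical, much-reduced position record (assumption, not Crafty's). */
typedef struct {
    uint64_t occupied;   /* bitboard of all occupied squares */
    uint64_t hash;       /* hash signature */
    int      fifty;      /* 50-move counter */
} Position;

static Position tdata[MAX_PLY];

/* Copy/make for a quiet, non-pawn move. Only `parent` is needed as a
   base address: the child lives at the fixed offset parent + 1. */
static Position *copy_make(Position *parent, int from, int to,
                           uint64_t move_key)
{
    assert(parent >= tdata && parent + 1 < tdata + MAX_PLY);
    Position *child = parent + 1;
    *child = *parent;                                  /* the copy */
    child->occupied ^= (1ULL << from) | (1ULL << to);  /* the make */
    child->hash     ^= move_key;
    child->fifty    += 1;
    return child;
}
```

Whether the compiler really gets by with one register depends on the surrounding code and the target, so this is something to check in the generated assembly rather than assume.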

How many bytes do you copy?
Are source and target adjacent and properly aligned?
Have you ever tried MMX-copy instead of memcpy?

movq mm0, [source]
movq mm1, [source+8]
...
movq [target  ], mm0
movq [target+8], mm1
...
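
For comparison, a rough C analogue of the load/store pattern above, using plain 64-bit integers rather than MMX registers. copy_qwords() is a hypothetical helper, and the compiler is free to pick its own instructions for it, so treat it only as a sketch of "issue the loads first, then the stores". It assumes 8-byte-aligned, non-overlapping buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `nwords` 64-bit words, two per iteration. Both loads are done
   before the two stores, mirroring the movq pairs in the snippet above.
   Assumes source and target do not overlap. */
static void copy_qwords(uint64_t *target, const uint64_t *source,
                        size_t nwords)
{
    assert((uintptr_t)target % 8 == 0 && (uintptr_t)source % 8 == 0);
    size_t i = 0;
    for (; i + 2 <= nwords; i += 2) {
        uint64_t a = source[i];
        uint64_t b = source[i + 1];
        target[i]     = a;
        target[i + 1] = b;
    }
    if (i < nwords)              /* odd word left over */
        target[i] = source[i];
}
```

Whether this beats memcpy or a plain structure assignment would have to be measured for the actual structure size.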

Another improvement for copy/make is to combine the separate copy and make
read-writes.
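
One way to read that, as a minimal C sketch with made-up names (Position, from_to, move_key and both function names are assumptions): instead of copying a field and then doing a second read-modify-write on the copy, read the parent's field, apply the move, and write the result to the child once.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, much-reduced position record (assumption). */
typedef struct {
    uint64_t occupied;   /* bitboard of all occupied squares */
    uint64_t hash;       /* hash signature */
    int      fifty;      /* 50-move counter */
} Position;

/* Separate steps: full copy, then read-modify-write on the child. */
static void copy_then_make(Position *child, const Position *parent,
                           uint64_t from_to, uint64_t move_key)
{
    *child = *parent;
    child->occupied ^= from_to;
    child->hash     ^= move_key;
    child->fifty    += 1;
}

/* Combined: each changed field is read from the parent once and
   written to the child once, already updated. */
static void fused_copy_make(Position *child, const Position *parent,
                            uint64_t from_to, uint64_t move_key)
{
    child->occupied = parent->occupied ^ from_to;
    child->hash     = parent->hash ^ move_key;
    child->fifty    = parent->fifty + 1;
}
```

Both functions leave the child in the same state. In a real structure, the fields a move does not touch would still need a plain copy.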







