Author: Gerd Isenberg
Date: 13:37:03 09/03/03
On September 03, 2003 at 14:53:23, Robert Hyatt wrote:

>I finally found time to run a few tests.
>
>First, the set-up: I took all my bitmap stuff, added it up, and put it in
>a structure that was then set up as an array. IE tdata[128] was an array
>of these structures, each element being just enough bytes to hold all of what
>I call a "position" including bitmap boards, hash signature, 50 move counter,
>etc...)
>
>At the top of MakeMove() I added tdata[ply+1]=tdata[ply]; to do the copy
>stuff. That's all.
>
>Some results:
>
>fine 70 searched to 39 plies deep:
>
>copy/make : 15.5 seconds, make/unmake 14.0 seconds (11% overhead)
>
>kopec 22 searched to 12 plies deep:
>
>copy/make : 32.4 seconds, make/unmake 28.3 seconds. (14.5% overhead)
>
>mate position searched 9 plies deep (mate in 10)
>
>copy/make : 8.9 seconds, make/unmake 7.5 seconds. (18.7% overhead)
>
>That's not all the story. however. The copy/make approach requires an
>extra register everywhere since the data structures have to be accessed
>through a pointer (or via an array subscript, same thing). My test case
>does not take care of that. But if you were to mark one register as
>"unusable" for the compiler, the result would be worse, for certain. Since
>these data values are accessed all over the place, a register has to be used
>everywhere, which is going to add to the above, significantly. If it only adds
>10% then the above numbers are back to what I originally saw when I speeded
>things up by 25% by getting rid of copy/make.
>
>That's data for Crafty. YMMV of course...

Interesting results.

Not sure about the additional register, depends on the implementation. If you address an array with n and n-1, one register for n should be enough. I guess it's really the latency of the additional read cycles - plus the penalty for additional cache pollution.

How many bytes do you copy? Are source and target adjacent and properly aligned? Have you ever tried MMX-copy instead of memcpy?

  movq mm0, [source]
  movq mm1, [source+8]
  ...
  movq [target  ], mm0
  movq [target+8], mm1
  ...

Another improvement in copy/make is to combine the separate copy and make read-writes.
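
To make that last point concrete, here is a rough C sketch of how I read it; it is not taken from Crafty or any other engine. The `Position` struct is a hypothetical, heavily reduced stand-in for the real per-ply data, and the function names, the `key_delta` parameter and the restriction to a quiet (non-capturing, non-pawn) move are assumptions made only for illustration. The first variant copies the whole struct and then modifies the copy, so each changed field in the target is written twice and read once more. The second variant reads each field from `tdata[ply]`, applies the change, and writes it to `tdata[ply+1]` exactly once.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PLY 128

/* Hypothetical, much reduced "position"; a real engine keeps far more here. */
typedef struct {
    uint64_t occupied;  /* bitboard of all occupied squares */
    uint64_t hash;      /* hash signature */
    int      fifty;     /* 50 move counter */
} Position;

static Position tdata[MAX_PLY];

/* Variant 1: copy first, then read-modify-write the copy.
 * Handles only a quiet move of one piece from 'from' to 'to'. */
void make_copy_then_update(int ply, int from, int to, uint64_t key_delta)
{
    assert(ply >= 0 && ply + 1 < MAX_PLY);
    tdata[ply + 1] = tdata[ply];                    /* whole-struct copy */
    Position *p = &tdata[ply + 1];
    p->occupied ^= (1ULL << from) | (1ULL << to);   /* clear from, set to */
    p->hash     ^= key_delta;                       /* assumed hash update */
    p->fifty    += 1;
}

/* Variant 2: combined copy and make.
 * Each field is read from ply, changed, and written once to ply+1. */
void make_combined(int ply, int from, int to, uint64_t key_delta)
{
    assert(ply >= 0 && ply + 1 < MAX_PLY);
    const Position *src = &tdata[ply];
    Position *dst = &tdata[ply + 1];
    dst->occupied = src->occupied ^ ((1ULL << from) | (1ULL << to));
    dst->hash     = src->hash ^ key_delta;
    dst->fifty    = src->fifty + 1;
}
```

Both variants index the array with `ply` and `ply + 1` only, so a single index is enough to address source and target. Whether the combined form is actually faster depends on how many fields a move leaves untouched, since those still have to be copied, so it would need measuring in the real program.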