Computer Chess Club Archives


Subject: Re: To: Bob Hyatt, Fritz SMP Optimization

Author: Robert Hyatt

Date: 15:12:02 02/20/03


On February 20, 2003 at 17:19:20, Charles Worthington wrote:

>On February 20, 2003 at 16:45:35, Robert Hyatt wrote:
>
>>On February 20, 2003 at 14:56:01, Matt Taylor wrote:
>>
>>>On February 20, 2003 at 11:31:04, Robert Hyatt wrote:
>>>
>>>>On February 20, 2003 at 02:49:04, Matt Taylor wrote:
>>>>
>>>>>On February 19, 2003 at 20:05:30, Robert Hyatt wrote:
>>>>>
>>>>>>On February 19, 2003 at 18:28:12, Charles Worthington wrote:
>>>>>>
>>>>>>>Bob I am not a programmer. Can you direct me to a source that will enable me to
>>>>>>>make the tweaks on the Fritz Spinlocks and Spinwaits myself? Getting chessbase
>>>>>>>to do anything would be near impossible.
>>>>>>
>>>>>>
>>>>>>No.  It is a source code modification, which is the problem.  You _might_ be
>>>>>>able to use a debugger and find the spinlocks, but inserting a pause instruction
>>>>>>might not be possible as it will mean "stretching" the code which would break
>>>>>>things without some cute tricks, such as a jump out to the lock and a jump
>>>>>>back...
>>>>>>
>>>>>>I wouldn't want to think about it not having any idea where the locks really
>>>>>>are.  :)
>>>>>
>>>>>It wouldn't be too difficult with a good knowledge of assembly and some good
>>>>>tools. Profile for highly localized hotspots (there are your spinlocks) and then
>>>>>use code stretching to insert code. I wrote a tool at work that does this for
>>>>>Windows programs, but it requires PDB information. I know of no tool on the
>>>>>market that is capable of code stretching without such information.
>>>>>
>>>>>Realistically he might as well ask nicely. :-)
>>>>>
>>>>>-Matt
>>>>
>>>>
>>>>That might work.  However, the spinlocks (in crafty) don't get burned a lot,
>>>>due to the way it is designed in attempting to minimize locking.  It might
>>>>well be possible to miss a few that don't get beat on in the particular run.
>>>
>>>I guess I shouldn't have called them hotspots. I was thinking of any short loop.
>>>Most do some sort of computation, so most wouldn't be seen as spinlocks.
>>>Alternatively you could search for pause (if it uses it) or the lock prefix.
>>
>>Both would fail for older crafty versions.  pause is what needs to be added for
>>hyper-threading.  And I use the xchg instruction which doesn't need the LOCK
>>prefix.  But (for crafty) looking for xchg might have been good enough.  But I
>>would hate to try to tell someone how to patch the executable as they would
>>have to know where it was safe to add the new instructions plus the jump to
>>them....
>>
>>
>>>Perhaps look for the bts/btr or xchg instructions inside the loop. You -might-
>>>incur false positives, but I doubt there would be many.
>>>
>>>>As I said, it is doable.  I used to have to patch operating systems like that
>>>>all the time, where you need to insert code, which means you replace a good
>>>>instruction with a jump to a patch area, stick the replaced instruction plus
>>>>the ones you need to add there, followed by a jump back to where you came
>>>>from.
>>>>
>>>>But it is not for the non-ASM person, obviously.
>>>
>>>Yes, you can do it like that. I was talking about recompilation, which is a much
>>>slicker trick. You have to move code around to insert your own, and that's what
>>>makes it a pain to do. My own tool still breaks in some cases because it can't
>>>guarantee that it will find all function pointers. If you miss a function
>>>pointer, the code obviously is going to break.
>>>
>>>Your method is definitely easier without nice tools. In performance-optimized
>>>applications, functions are going to be 16-byte aligned, too. That will leave a
>>>fair amount of CCh (int3) or 90h (nop) space to use. If you can find small
>>>functions in a relatively small area, you can use the 2-byte jump form, too.
>>>
>>>-Matt
>
>
>So essentially what you guys are telling me is that this will be a very tough
>task and I may as well wait until deep fritz 8 is released in the fall?
>I did talk to chessbase about this but the engine programmer was not there at
>the time so I could not get his input. Perhaps I can find a private programmer
>somewhere who can pull it off. I'm certainly going to try :-)) I would really
>like to see that extra 20% in performance!


You will get performance from hyper-threading.  But without proper
spinlock/spinwait modifications, you will get less than what is possible.  That
is the issue.  How big a problem this is depends on how Fritz relies on locks.
I'm sure everyone is using the lockless hash table trick from Crafty, which
eliminates a lot of lock overhead.  If the SMP code is designed reasonably, then
the locks aren't a big problem and the overhead of not using pause in the spins
is not really much of a problem.
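
For readers who have not seen one, here is a rough sketch of an xchg-style
spinlock with a pause in the spin loop, which is the kind of modification being
discussed.  It is not Crafty's or Fritz's actual code.  The names SpinLock,
spin_lock, spin_trylock, spin_unlock and CPU_RELAX are made up for this
illustration, and it assumes a C11 compiler with <stdatomic.h>.

```c
#include <stdatomic.h>

#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>
#define CPU_RELAX() _mm_pause()   /* emits the x86 PAUSE instruction */
#else
#define CPU_RELAX() ((void)0)     /* assumption: plain no-op on other CPUs */
#endif

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } SpinLock;

/* One attempt to take the lock. Returns 1 on success, 0 if already held.
   On x86 an atomic exchange normally compiles to xchg, which is locked
   implicitly and needs no LOCK prefix. */
static int spin_trylock(SpinLock *l) {
    return atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

static void spin_lock(SpinLock *l) {
    while (!spin_trylock(l)) {
        /* Spin on plain loads until the lock looks free, pausing each
           time round so a hyper-threaded sibling gets the execution
           resources instead of this busy loop. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
            CPU_RELAX();
    }
}

static void spin_unlock(SpinLock *l) {
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

Usage would be a SpinLock initialized to { 0 }, with spin_lock and spin_unlock
around the shared data.  Without the CPU_RELAX() call the lock still works; the
waiting thread just competes harder with the other logical processor on the
same core.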

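The lockless hash table trick mentioned above is usually described as an XOR
check: store the signature XORed with the data word, and on a probe accept the
entry only if XORing the two words gives back the signature.  A minimal sketch
of that idea follows.  It is an illustration under my own assumptions, not
Crafty's source: HashEntry, TABLE_SIZE, hash_store and hash_probe are invented
names, and the 64-bit entry layout is made up.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TABLE_SIZE 1024u   /* hypothetical size, must be a power of two */

/* Two 64-bit words per entry. Relaxed atomics keep each word's load and
   store indivisible without any lock; the pair is NOT written atomically. */
typedef struct {
    _Atomic uint64_t key_xor_data;  /* signature XOR data */
    _Atomic uint64_t data;          /* move, score, depth, etc. packed in */
} HashEntry;

static HashEntry table[TABLE_SIZE];

static void hash_store(uint64_t key, uint64_t data) {
    HashEntry *e = &table[key & (TABLE_SIZE - 1)];
    atomic_store_explicit(&e->key_xor_data, key ^ data, memory_order_relaxed);
    atomic_store_explicit(&e->data, data, memory_order_relaxed);
}

/* Returns 1 and fills *data_out on a hit, 0 on a miss. If another thread
   overwrote only one of the two words, the XOR no longer reproduces the
   key, so the torn entry is rejected as an ordinary miss.
   Caveat of this sketch: an all-zero empty slot matches key 0. */
static int hash_probe(uint64_t key, uint64_t *data_out) {
    HashEntry *e = &table[key & (TABLE_SIZE - 1)];
    uint64_t d = atomic_load_explicit(&e->data, memory_order_relaxed);
    uint64_t k = atomic_load_explicit(&e->key_xor_data, memory_order_relaxed);
    if ((k ^ d) != key)
        return 0;
    *data_out = d;
    return 1;
}
```

The point is that no lock is taken on either path, so there is nothing to spin
on for hash accesses; a torn entry costs one missed hash hit rather than a
wrong score.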



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.