Computer Chess Club Archives



Subject: Re: Crafty MPC ( Is it Hyper Threading?)

Author: Robert Hyatt

Date: 10:33:19 09/14/03


On September 13, 2003 at 17:58:02, Sune Fischer wrote:

>On September 13, 2003 at 17:07:47, Robert Hyatt wrote:
>
>>On September 13, 2003 at 11:26:37, Nolan Denson wrote:
>>
>>>Hey Dan
>>>
>>>I decided to compile your Crafty MPC and give it a try.  I am getting very good
>>>results but have noticed something strange.  I am not sure if this version is
>>>supposed to be able to take advantage of Hyperthreading, but when I run the
>>>bench test my nps is better as a dual processor than as a quad.  But a
>>>strange thing happens when actual game play starts.  Using it as a Dual Xeon
>>>(I compiled for multi-thread) I get 1536 nps avg; as a Quad the bench test is
>>>1200 nps.  When actual game play starts the Dual stays about the same, but the
>>>Quad via Hyperthreading gets about 2100 nps.  Do you think it is doing this on
>>>its own, or is this MPC Crafty Hyperthreading enabled?  If anyone knows the
>>>files that are involved, I can make this MPC version truly Hyperthreading.
>>>Based on the results I am getting now, if this is not Hyperthreading already,
>>>I think once it is I should get around 2500-2600 nps.
>>
>>
>>Something is wrong.  Do you really have SMT turned on?  IE I have some dual
>>xeons here and hyper-threading definitely kicks the NPS up.  Somewhere around
>>30% or so, for raw NPS...  My dual 2.8 gets around 2.1M nodes per second on
>>the default setting bench command with mt=4
>>
>>But be sure that on a dual with hyperthreading on, you do mt=4, _not_ mt=2.
>
>I have a question:
>the way Crafty computes nps in an SMP run is by summing up the nodes searched by
>the helper threads, right?
>
>As I understand it this should be more or less equivalent to the cpu-load
>figure, so it won't account for some of those nodes being lost to search
>inefficiency.

The issue is "search overhead".  A parallel search examines more nodes than a
pure serial (one-thread) search due to alpha/beta issues.  I really don't
know how to estimate this number from within a parallel search.  And the
really strange part is that sometimes a parallel search will search a
_smaller_ tree than the serial search, due to move ordering issues.
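
A minimal Python sketch of the two quantities being discussed (this is not
Crafty's code, which is written in C). It assumes each thread keeps its own
node counter; the function names and all numbers are made up for illustration.
Raw NPS is simply the summed counters over wall-clock time, while search
overhead needs the serial node count for the same position and depth, which
the parallel search itself does not know.

```python
def raw_nps(per_thread_nodes, elapsed_seconds):
    """Raw NPS: every node any thread visited, divided by wall-clock time."""
    return sum(per_thread_nodes) / elapsed_seconds


def search_overhead(parallel_nodes, serial_nodes):
    """Fractional extra nodes versus a serial search to the same depth.

    Positive: the parallel tree is larger. Negative: it is smaller.
    serial_nodes has to come from a separate one-thread run.
    """
    return (parallel_nodes - serial_nodes) / serial_nodes


# Hypothetical figures, for illustration only.
threads = [600_000, 550_000, 500_000, 450_000]   # nodes counted by 4 threads
print(raw_nps(threads, 1.0))                      # 2100000.0
print(search_overhead(sum(threads), 1_500_000))   # about 0.4, i.e. 40% extra
```

The raw figure says how busy the processors are, not how fast the search is
progressing; only the second function relates the two, and it depends on a
number that is not available during the parallel run.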

>
>For me, I'd like to look at the nps and know that this is "the speed I'm going".
>Unfortunately, I see no way of computing the exact nps, but it should be
>possible to compute an upper and lower bound on the effective nodes I think.

You can pick wild guesses.  IE guess the upper bound is 2x the size of
the parallel search, while the lower is 1/2.  But those are just wild guesses
that will be wrong most of the time.





>
>When on an "allnode" we can add nodes to a counter when threads merge, let's
>call it lower_nodes, and when a search is terminated prematurely or bounds get
>adjusted we add to upper_nodes.

That won't work for a recursive implementation like mine.  IE I can
split _within_ a split.  And those two processors (or more) that do the
second split might complete normally, but their entire sub-tree is pointless
at the previous split point...

It isn't easy to calculate.
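
The nested-split problem can be illustrated with a toy model in Python. This
is a sketch under simplifying assumptions, not how Crafty represents split
points: the `SplitPoint` class, the `aborted` flag (standing in for any reason
the work at a split point is thrown away), and the node counts are all
hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SplitPoint:
    nodes: int                 # nodes searched directly at this split point
    aborted: bool = False      # True if the work at this split was discarded
    children: List["SplitPoint"] = field(default_factory=list)


def local_useful(sp):
    """Naive count: trust each split point's own 'completed normally' flag."""
    own = 0 if sp.aborted else sp.nodes
    return own + sum(local_useful(c) for c in sp.children)


def true_useful(sp, ancestor_aborted=False):
    """Count work as useful only if no enclosing split point was aborted."""
    dead = ancestor_aborted or sp.aborted
    own = 0 if dead else sp.nodes
    return own + sum(true_useful(c, dead) for c in sp.children)


# The inner split finishes normally, but the outer split containing it is aborted.
inner = SplitPoint(nodes=40_000)
outer = SplitPoint(nodes=10_000, aborted=True, children=[inner])
root = SplitPoint(nodes=5_000, children=[outer])

print(local_useful(root))  # 45000: the inner split's nodes look useful
print(true_useful(root))   # 5000: they were wasted along with the outer split
```

In a real search the inner threads have already merged their counters by the
time the outer split turns out to be pointless, so a counter updated at merge
time would have to be corrected retroactively.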




>
>Somewhere in between lies the effective node count, the count that would have
>been produced in a serial search, and some average can be used to calculate
>effective nps (enps?:).


It is worse than that.  What about the case where the parallel search searches
a smaller tree than the serial search?
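
A small Python sketch of why that case breaks the bracket. It assumes, as one
reading of the proposal, that upper_nodes can be at most everything the
parallel search visited; the helper names and the figures are hypothetical.

```python
def midpoint_estimate(lower_nodes, upper_nodes):
    """The 'some average' of the proposal, taken here as the plain midpoint."""
    return (lower_nodes + upper_nodes) / 2


def bracket_holds(lower_nodes, upper_nodes, serial_nodes):
    """True if the serial node count really lies inside the bracket."""
    return lower_nodes <= serial_nodes <= upper_nodes


# Hypothetical figures. Assumption: upper_nodes cannot exceed the total the
# parallel search visited, so a serial tree bigger than that escapes the bracket.
lower, upper = 1_000_000, 1_200_000
serial = 1_500_000                          # serial search needed a larger tree

print(midpoint_estimate(lower, upper))      # 1100000.0
print(bracket_holds(lower, upper, serial))  # False
```

When the serial tree is the larger one, no average of the two counters can
land on the serial count, because both counters sit below it.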




>
>Let me know if I have overlooked something.
>
>-S.



Last modified: Thu, 15 Apr 21 08:11:13 -0700
