Author: Robert Hyatt
Date: 10:33:19 09/14/03
On September 13, 2003 at 17:58:02, Sune Fischer wrote:

>On September 13, 2003 at 17:07:47, Robert Hyatt wrote:
>
>>On September 13, 2003 at 11:26:37, Nolan Denson wrote:
>>
>>>Hey Dan
>>>
>>>I decided to compile and give your crafty MPC a try. I am getting very good
>>>results but have notice something strange. I am not sure if this version is
>>>suppose to be able to take advantage of hyper threading ... but when i run the
>>>bench test my nps are better for a dual processor instead of a quad, But a
>>>strange things happens when actual game play starts. Using it as a Dual Xeon my
>>>nps (I compiled for multi-thread) 1536 nps avg, as a Quad the bench test is 1200
>>>nps. But when actual game play starts the Dual stays about the same. But the
>>>Quad via Hyperthreading gets like 2100 nps ... Do you think is doing this on its
>>>own, or is this MPC Crafty Hyperthreading enable. If an one know the files that
>>>are involve ... i can make this MPC version truly Hyperthreading ...base on the
>>>results I am getting now .. if this is not Hyperthreading now ... i think once
>>>it is i should get around 2500-2600 nps.
>>
>>
>>Something is wrong. Do you really have SMT turned on? IE I have some dual
>>xeons here and hyper-threading definitely kicks the NPS up. Somewhere around
>>30% or so, for raw NPS... My dual 2.8 gets around 2.1M nodes per second on
>>the default setting bench command with mt=4
>>
>>But be sure that on a dual with hyperthreading on, you do mt=4, _not_ mt=2.
>
>I have question,
>the way Crafty computes nps in a SMP run, is by summing up the nodes searched by
>the helper threads, right?
>
>As I understand it this should be more or less equivalent to the cpu-load
>figure, so it won't account for some of those nodes being lost to search
>inefficiency.

The issue is "search overhead". A parallel search examines more nodes than a
pure serial (one thread) search due to alpha/beta issues. I really don't know
how to estimate this number from within a parallel search. And the really
strange part is that sometimes a parallel search will search a _smaller_ tree
than the serial search, due to move ordering issues.

>
>For me, I'd like to look at the nps and know that this is "the speed I'm going".
>Unfortunately, I see no way of computing the exact nps, but it should be
>possible to compute an upper and lower bound on the effective nodes I think.

You can pick wild guesses. I.e., guess that the upper bound is 2x the size of
the parallel search, while the lower bound is 1/2 of it. But those are just
wild guesses that will be wrong most of the time.

>
>When on an "allnode" we can add nodes to a counter when threads merge, let's
>call it lower_nodes, and when search is terminated prematurely or bounds gets
>adjusted we add to upper_nodes.

That won't work for a recursive implementation like mine. I.e., I can split
_within_ a split. And those two processors (or more) that do the second split
might complete normally, but their entire sub-tree is pointless at the previous
split point... It isn't easy to calculate.

>
>Somewhere inbetween lies the effective node count, the count that would have
>been produced in a serial search, and some average can be used to calculate
>effective nps (enps?:).

It is worse than that. What about the case where the parallel search searches
a smaller tree than the serial search?

>
>Let me know if I have I overlooked something.
>
>-S.
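
To make the terms above concrete, here is a minimal sketch of how search overhead and an "effective nps" could be measured after the fact, assuming you can search the same position to the same depth once with one thread and once in parallel. It does not estimate anything from within the parallel search, which is the hard part discussed above. The names (SearchRun, raw_nps, search_overhead, effective_nps) and all the numbers are made up for illustration; none of this is taken from Crafty.

```python
from dataclasses import dataclass


@dataclass
class SearchRun:
    """One finished search of a fixed position to a fixed depth (hypothetical record)."""
    nodes: int      # total nodes searched, summed over all threads
    seconds: float  # wall-clock time the search took


def raw_nps(run: SearchRun) -> float:
    """Raw speed: every node counts, whether it was useful or not."""
    return run.nodes / run.seconds


def search_overhead(serial: SearchRun, parallel: SearchRun) -> float:
    """Extra nodes in the parallel tree as a fraction of the serial tree.

    The result is negative when the parallel search happened to search
    a smaller tree than the serial search.
    """
    return (parallel.nodes - serial.nodes) / serial.nodes


def effective_nps(serial: SearchRun, parallel: SearchRun) -> float:
    """Serial-tree nodes per second of parallel wall-clock time."""
    return serial.nodes / parallel.seconds


# Illustrative, made-up measurements of the same position and depth.
serial = SearchRun(nodes=10_000_000, seconds=10.0)
parallel = SearchRun(nodes=13_000_000, seconds=4.0)  # larger tree than serial
lucky = SearchRun(nodes=9_000_000, seconds=3.0)      # smaller tree than serial

for name, run in (("parallel", parallel), ("lucky", lucky)):
    print(name,
          "raw nps:", raw_nps(run),
          "overhead:", search_overhead(serial, run),
          "effective nps:", effective_nps(serial, run))
```

With the "lucky" run the overhead is negative and the effective nps comes out higher than the raw nps, so a scheme that assumes the effective count always lies below the raw parallel count breaks on exactly that case.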
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.