Author: Robert Hyatt
Date: 10:31:14 05/03/04
On May 03, 2004 at 13:03:16, Sune Fischer wrote:
>On May 03, 2004 at 11:13:11, Vincent Diepeveen wrote:
>
>>On May 03, 2004 at 10:53:19, Robert Hyatt wrote:
>>
>>>On May 03, 2004 at 10:19:46, Sune Fischer wrote:
>>>
>>>>
>>>>>I don't see any at 8. I don't personally have access to a 16-way box yet so I
>>>>>can't say anything there. But there is nothing that really makes null-move hurt
>>>>>parallel search...
>>>>
>>>>Couldn't it be that nullmove hurts scalability in much the same way
>>>>alpha-beta "hurts" parallel search compared to minimax?
>>>
>>>I don't see how. It might make _some_ positions more unstable, and unstable
>>>positions hurt parallel search. But it also makes other positions more stable.
>>
>>What nonsense: "I don't see how."
>>
>>It is very clearly proven. For everyone doing parallel research it is *trivial*
>>that more unstable trees are harder to split.
>
>Yes, I think that is trivial too, but what is not so trivial is whether nullmove
>causes these unstable search trees.
>
>>Further, you must wait *longer* now to split, because you first try the null
>>move and must then search another move before splitting.
>
>Yes, but wouldn't this result in some processors running idle?
>
>>>I don't see why, unless you form the hypothesis that "forward pruning makes
>>>move ordering _worse_." That's the only way this could happen...
>>>
>>>There are obviously "issues". Forward pruning tosses moves out, so at any node
>>>you will have fewer branches to search than in a normal (non-pruning) program.
>>>But if you don't require that all processors always work at the same node, this
>>>should not be a problem. I.e., Crafty searches endgames just as efficiently as
>>>it searches complex middlegames, from an SMP perspective...
>>
>>In endgames you generally search to greater depths, so on average the subtrees
>>that remain after you split are bigger. More remaining depth means less overhead
>>and a more efficient parallel search.
>
>That sounds logical, but I think splitting overhead << parallel search overhead,
>so there must be another explanation too.
>
>I seem to recall some numbers from Crafty: a split is basically a copy of a
>few kilobytes, and a split happens a few thousand times a minute. Would this
>accumulate to about 1 second of CPU time?
>
>-S.
Here are some split numbers for a quad search in a game played on ICC. The
"elap" field is the actual wall-clock time taken per move; most were a little
over a minute. I don't think those splits account for much time. Why? Just
compare the 1-CPU NPS to the 2-CPU NPS, for example: with 1 CPU, _no_ splits
are ever done.
SMP-> split=7437 stop=1129 data=14/128 cpu=4:25 elap=1:06
SMP-> split=9961 stop=1435 data=10/128 cpu=8:40 elap=2:10
SMP-> split=2454 stop=348 data=9/128 cpu=4:13 elap=1:03
SMP-> split=8381 stop=1400 data=12/128 cpu=4:06 elap=1:02
SMP-> split=6453 stop=1111 data=11/128 cpu=4:01 elap=1:00
SMP-> split=8694 stop=1451 data=10/128 cpu=5:28 elap=1:22
SMP-> split=7986 stop=1103 data=10/128 cpu=3:53 elap=58.52
Here is the NPS data from my dual Xeon:
time=21.85 cpu=99% mat=0 n=23757358 fh=92% nps=1.09M
SMP-> split=0 stop=0 data=0/128 cpu=21.75 elap=21.85
time=11.86 cpu=198% mat=0 n=23740599 fh=92% nps=2.00M
SMP-> split=222 stop=21 data=6/128 cpu=23.58 elap=11.86
NPS for the dual should be 2.18M. How much of the shortfall is caused by
parallel splits and how much by hardware contention for memory accesses and
cache invalidation is hard to say. Crafty claims 2% of total time (out of 200%
available) was lost outside the search, which should cover idle time and time
spent doing splits. The rest of the missing NPS is most likely
hardware-related, as some machines have produced perfect NPS doublings...
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.