Author: Robert Hyatt
Date: 11:10:53 09/03/03
On September 03, 2003 at 13:33:19, Gian-Carlo Pascutto wrote:
>On September 03, 2003 at 13:22:50, Robert Hyatt wrote:
>
>>It is a _TINY_ part of the total time spent. So tiny, it can be ignored.
>
>Que?
>
>Maybe so on an SMP quad (as I stated), but surely not on a large NUMA system.
>
>If this isn't the issue, I'd expect my thing to run like the blazes
>on a NUMA box, but I doubt I'm that lucky.
>
>--
>GCP
There are three things that have to be done by a thread:
1. Copy local data somewhere else for another thread to use ("splitting" in
Crafty terminology). That happens once per "split". How many splits are done?
Here is the data I provided in another thread here:
SMP-> split=6266 stop=875 data=19/64 cpu=10:00 elap=2:39
SMP-> split=3511 stop=440 data=16/64 cpu=5:20 elap=1:27
SMP-> split=3768 stop=524 data=17/64 cpu=5:45 elap=1:33
SMP-> split=1724 stop=275 data=13/64 cpu=3:59 elap=1:04
SMP-> split=4894 stop=671 data=15/64 cpu=3:55 elap=1:03
SMP-> split=2666 stop=420 data=15/64 cpu=3:51 elap=1:02
SMP-> split=3412 stop=683 data=17/64 cpu=3:46 elap=1:00
SMP-> split=3447 stop=476 data=15/64 cpu=3:55 elap=1:03
SMP-> split=2985 stop=345 data=19/64 cpu=1:13 elap=19.53
SMP-> split=11657 stop=1620 data=23/64 cpu=3:32 elap=58.12
SMP-> split=1928 stop=292 data=17/64 cpu=3:24 elap=57.08
SMP-> split=53912 stop=6999 data=30/64 cpu=32:06 elap=8:42
SMP-> split=9997 stop=1209 data=23/64 cpu=3:31 elap=56.69
SMP-> split=2966 stop=527 data=19/64 cpu=3:28 elap=55.49
Worst case was about 54000 splits for a nearly 9 minute long search, using 4 processors.
More typical seems to be about 500 splits per minute of search. That is
not much time.
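
For anyone who wants to check split rates against the log lines above, here is a rough sketch in Python. The helper names (to_seconds, split_rates) are made up for illustration and are not part of Crafty. It assumes "m:ss" means minutes and seconds and that a bare number such as 19.53 is seconds. Only three of the lines are copied in; the rest can be pasted into LOG the same way.

```python
import re

# Three of the SMP-> lines quoted above; paste in the others as needed.
LOG = """\
SMP-> split=6266 stop=875 data=19/64 cpu=10:00 elap=2:39
SMP-> split=2985 stop=345 data=19/64 cpu=1:13 elap=19.53
SMP-> split=53912 stop=6999 data=30/64 cpu=32:06 elap=8:42
"""

PATTERN = re.compile(r"split=(\d+) stop=(\d+) data=\S+ cpu=(\S+) elap=(\S+)")


def to_seconds(text):
    """Convert 'm:ss' or a bare seconds value to seconds (assumed format)."""
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(text)


def split_rates(log):
    """Return (splits, splits per CPU minute, splits per elapsed minute) per line."""
    rows = []
    for match in PATTERN.finditer(log):
        splits = int(match.group(1))
        cpu = to_seconds(match.group(3))
        elapsed = to_seconds(match.group(4))
        rows.append((splits, splits * 60 / cpu, splits * 60 / elapsed))
    return rows


if __name__ == "__main__":
    for splits, per_cpu_min, per_elapsed_min in split_rates(LOG):
        print(f"{splits:6d} splits  {per_cpu_min:8.1f}/cpu-min  {per_elapsed_min:8.1f}/elapsed-min")
```

The two rates differ by roughly the number of processors in use, so it matters which kind of "minute" a per-minute figure refers to.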
2. Search. Here I only do local memory accesses, so there is just normal
tree search overhead, nothing related to NUMA.
3. Completion. Here I have to either copy a score/PV (or just a score) back to
the parent thread's data, or set a "stop" flag to say my result is good enough
and no others are needed. Either of these is a trivial amount of non-local memory
traffic.
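
The three steps above can be sketched roughly as follows. This is a toy illustration in Python, not Crafty's code: SplitPoint, search_move, worker and split_and_search are hypothetical names, the "search" is a placeholder scoring function, and the moves are simply dealt out to threads up front. The point is only to show where shared data is touched: one copy per thread at the split, one write-back at completion, and the stop flag.

```python
import copy
import threading


class SplitPoint:
    """Shared parent data at a split (hypothetical layout, not Crafty's)."""

    def __init__(self, alpha, beta):
        self.lock = threading.Lock()
        self.best_score = alpha
        self.best_move = None
        self.beta = beta
        self.stop = threading.Event()  # "my result is good enough" flag


def search_move(position, move):
    """Toy stand-in for a subtree search; touches only the local copy."""
    position.append(move)            # make the move on local data
    score = sum(position) % 100      # placeholder evaluation
    position.pop()                   # unmake the move
    return score


def worker(split, local_position, moves):
    best_score, best_move = None, None
    for move in moves:
        if split.stop.is_set():      # another thread already has a cutoff
            break
        # Step 2: search, using only thread-local data.
        score = search_move(local_position, move)
        if best_score is None or score > best_score:
            best_score, best_move = score, move
        if score >= split.beta:
            split.stop.set()         # tell the others no more work is needed
            break
    # Step 3: completion, one small write back to the parent's data.
    if best_score is not None:
        with split.lock:
            if best_score > split.best_score:
                split.best_score, split.best_move = best_score, best_move


def split_and_search(position, moves, alpha, beta, n_threads=4):
    split = SplitPoint(alpha, beta)
    threads = []
    for i in range(n_threads):
        # Step 1: copy the data once per thread, at the split.
        local = copy.deepcopy(position)
        threads.append(threading.Thread(
            target=worker, args=(split, local, moves[i::n_threads])))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return split.best_score, split.best_move


if __name__ == "__main__":
    print(split_and_search([1, 2, 3], list(range(10, 90, 10)), 0, 1000))
```

In a real engine the copy in step 1 would go into memory local to the thread that will search it, which is what keeps step 2 free of remote accesses.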
If you do that right, NUMA should not hurt. The issue is going to become
how to use a large number of processors, which is much harder to do than
to use a small number as we are today.