Computer Chess Club Archives


Subject: Re: Crafty Cluster Feasible in Future?

Author: Robert Hyatt

Date: 11:31:14 02/21/04

On February 21, 2004 at 10:54:52, Bob Durrett wrote:

>On February 21, 2004 at 09:49:03, Robert Hyatt wrote:
>
>>On February 21, 2004 at 09:02:43, Bob Durrett wrote:
>>
>>>On February 21, 2004 at 01:24:29, Robert Hyatt wrote:
>>>
>>>>On February 20, 2004 at 21:25:33, Bob Durrett wrote:
>>>>
>>>>>On February 20, 2004 at 21:22:12, Robert Hyatt wrote:
>>>>>
>>>>>>On February 20, 2004 at 13:29:51, Dann Corbit wrote:
>>>>>>
>>>>>>>On February 20, 2004 at 10:43:11, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On February 19, 2004 at 15:09:25, Dann Corbit wrote:
>>>>>>>>
>>>>>>>>>On February 19, 2004 at 14:55:40, Robert Hyatt wrote:
>>>>>>>>>
>>>>>>>>>>On February 19, 2004 at 14:25:07, Dann Corbit wrote:
>>>>>>>>>>
>>>>>>>>>>>On February 19, 2004 at 14:14:11, Robert Hyatt wrote:
>>>>>>>>>>>[snip]
>>>>>>>>>>>>Clusters can work.  1000 nodes is a totally different problem from using 1-64,
>>>>>>>>>>>>but nothing I see says it is an impossible task.  Perhaps it is impossible to
>>>>>>>>>>>>"some".  Lots of things are impossible to "some".  Fortunately, for the rest of
>>>>>>>>>>>>us, what is "impossible" is something that just takes a bit longer to do... :)
>>>>>>>>>>>
>>>>>>>>>>>A cluster notion (e.g. 64000 machines maybe using the network):
>>>>>>>>>>>
>>>>>>>>>>>Divide the machines into clumps that work on forward nodes, and multiple
>>>>>>>>>>>possible best moves.
>>>>>>>>>>
>>>>>>>>>>I already do this on large SMP platforms.  I.e., Crafty has a "group" concept,
>>>>>>>>>>so that with (say) 64 nodes, you don't get 64 nodes working on one split point.
>>>>>>>>>>That is bad for multiple reasons.  Usually there are not 64 moves to search, so
>>>>>>>>>>the split overhead is just pure overhead.  Also, on NUMA boxes, you don't want
>>>>>>>>>>64 processors banging on one "split block", even though they don't do it that
>>>>>>>>>>much unless the split is too deep in the tree (this is also controllable in
>>>>>>>>>>current Crafty).
>>>>>>>>>>
>>>>>>>>>>I suspect the same idea will be the right way for clusters.  Divide it into
>>>>>>>>>>groups, sub-groups, and so forth.  The "groups" have a "head" and the initial
>>>>>>>>>>split is between the "group heads".  Then each group head can further split its
>>>>>>>>>>tree farther down the path, with sub-group members.  Of course each of these
>>>>>>>>>>might be a "sub-group-head" that could later further split the tree among
>>>>>>>>>>sub-sub-group members.
>>>>>>>>>>
>>>>>>>>>>I think some such hierarchy is necessary for truly large numbers of processors,
>>>>>>>>>>and the potential for search overhead is going to be significant.
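
To make that group/sub-group idea concrete, here is a rough sketch in C of how a
fixed cap on processors per split point can be applied recursively.  MAX_GROUP,
the GROUP struct, and build_hierarchy() are made-up names for illustration, not
Crafty's actual code:

    #include <stdlib.h>

    #define MAX_GROUP 4     /* assumed cap on processors sharing one split point */

    typedef struct group {
      int head;                             /* processor id of the group head    */
      int members[MAX_GROUP];               /* ids of the sub-group heads below  */
      struct group *subgroups[MAX_GROUP];   /* the sub-groups they lead          */
      int nmembers;
    } GROUP;

    /* Recursively carve nprocs processors (ids first..first+nprocs-1) into a
       hierarchy so that no head ever shares a split point with more than
       MAX_GROUP helpers, each helper being the head of a smaller group. */
    GROUP *build_hierarchy(int first, int nprocs) {
      if (nprocs <= 0) return NULL;
      GROUP *g = malloc(sizeof(GROUP));
      g->head = first;
      g->nmembers = 0;
      int per_child = (nprocs - 1 + MAX_GROUP - 1) / MAX_GROUP;   /* ceiling */
      int next = first + 1;
      for (int i = 0; i < MAX_GROUP && next < first + nprocs; i++) {
        int chunk = per_child;
        if (next + chunk > first + nprocs) chunk = first + nprocs - next;
        g->members[g->nmembers] = next;
        g->subgroups[g->nmembers] = build_hierarchy(next, chunk);
        g->nmembers++;
        next += chunk;
      }
      return g;
    }

With 256 processors, for example, build_hierarchy(0, 256) gives the top head at
most MAX_GROUP helpers, each heading a sub-group of roughly 64 processors, and so
on down, so no single split point ever collects more than a handful of workers.
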
>>>>>>>>>
>>>>>>>>>Have one CPU that looks at the whole tree and makes recommendations (a sort of
>>>>>>>>>loadmaster).  In that way, the CPU pool does not get divided evenly.  So (in the
>>>>>>>>>first move example) moves like 1. e4, 1. d4, 1. Nc3 get 20,000 CPUs each and
>>>>>>>>>things like 1. a3 might only get 10 CPUs.
>>>>>>>>>
>>>>>>>>>This concept can also carry forward as far as you like.
>>>>>>>>
>>>>>>>>I don't think that will work.  The reason is that there _are_ positions where a
>>>>>>>>move like a3 is the key move.
>>>>>>>
>>>>>>>But there are still ten CPUs working on that one.  About like the system that
>>>>>>>recently pounded the opposition into oblivion at CCT-6.
>>>>>>
>>>>>>Yes, but if there are thousands on _other_ moves, and you run out of time before
>>>>>>the "10" have an answer, you never know the move is good.  This is a problem I
>>>>>>have worked on in Cray Blitz and Crafty for many years.  My current approach has
>>>>>>some specific stuff for this case, in fact...
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> I don't think that some sort of static decision
>>>>>>>>can be made as to which move deserves the most (or least) processors.  It seems
>>>>>>>>to make more sense for this decision to be dynamic and based on what is actually
>>>>>>>>happening within the tree, not on some static characteristic of the individual
>>>>>>>>moves, which might be misleading.
>>>>>>>
>>>>>>>The way that I suggest is how humans play chess.  We look at the bad lines, but
>>>>>>>we focus much more on the stuff that looks good.
>>>>>>
>>>>>>Aha.  But your task now is to write the function that makes this statement work
>>>>>>correctly:
>>>>>>
>>>>>>    if ( FeelsGood(move) ) ...
>>>>>>
>>>>>>:)
>>>>>>
>>>>>>It isn't so easy...  I've been trying to do that for years; GMs have told me
>>>>>>many times that this is the "key".  :)
>>>>>>
>>>>>>>
>>>>>>>>My main goal was to avoid having too many processors work together, which causes
>>>>>>>>unnecessary overhead.  I.e., on the Cray C90/T90, with 16/32 processors, we had
>>>>>>>>to do the same sort of thing; otherwise, in endgames, there are plenty of positions
>>>>>>>>with fewer than 32 or even fewer than 16 moves, and we didn't want the split
>>>>>>>>overhead there to go out of control.  However, on the Cray it was a one-level
>>>>>>>>hierarchy, whereas it might be much more complex on a large cluster-type or
>>>>>>>>NUMA-type box that could have way more than 32 processors.
>>>>>>>>
>>>>>>>>My feeling is that for any architecture, there is an "optimal" number of
>>>>>>>>processors that divide up a chunk of work and search the same tree in parallel.
>>>>>>>>You apply this "number" recursively in a divide-and-conquer approach to increase
>>>>>>>>the total number of processors "in the loop" without ever having more than some
>>>>>>>>finite number working on a particular node together.
>>>>>>>>
>>>>>>>>I also have a depth limit on splits in Crafty, to control the split overhead,
>>>>>>>>which is not "free".  Every time I change hardware, I run tests to see if the
>>>>>>>>number should be changed, but it has not changed since I left the quad
>>>>>>>>Pentium Pro 200 and adjusted it then.  But again, the intent is to adjust
>>>>>>>>the algorithm to the specific hardware, limiting overhead to some reasonable
>>>>>>>>level while doing so.
>>>>>>>>
>>>>>>>>I suspect over time, other constraints will be found and allowed for in the
>>>>>>>>split algorithm.
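
A minimal sketch of the kind of test being described, with the limits pulled out
as tunables.  MIN_SPLIT_DEPTH, MIN_SPLIT_MOVES, and ok_to_split() are illustrative
names, not the actual Crafty code:

    #define MIN_SPLIT_DEPTH 4   /* assumed: don't split within 4 plies of the leaves */
    #define MIN_SPLIT_MOVES 2   /* assumed: need at least this many moves to share   */

    /* Decide whether splitting the tree at this node is worth the overhead. */
    int ok_to_split(int depth_remaining, int moves_remaining,
                    int busy_here, int max_group) {
      if (depth_remaining < MIN_SPLIT_DEPTH)  /* too near the leaves: the split      */
        return 0;                             /* bookkeeping costs more than it buys */
      if (moves_remaining < MIN_SPLIT_MOVES)  /* nothing left to hand out            */
        return 0;
      if (busy_here >= max_group)             /* cap the processors per split point  */
        return 0;
      return 1;
    }

Retuning on new hardware amounts to re-measuring where those limits should sit,
since the cost of a split (copying state, lock traffic) shifts with memory and
interconnect speed.
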
>>>>>>>
>>>>>>>It is a very interesting area for study.  The ChessBrain project has some
>>>>>>>interesting stuff going on in that regard, and they use the Internet as the
>>>>>>>medium for the CPU farm.  The fact that they are able to succeed against programs
>>>>>>>much stronger than the individual leaf processors shows that even a distributed
>>>>>>>system with very slow links can be feasible.
>>>>>>
>>>>>>
>>>>>>Never said it couldn't.  That was Vincent.  I've been a "cluster believer" for
>>>>>>many years...
>>>>>>
>>>>>>:)
>>>>>>
>>>>>>I will have a cluster Crafty one day also...  Once we get our Opteron cluster, I
>>>>>>plan on taking a serious look...  Right now we are targeting a 128-node cluster,
>>>>>>with each node being a dual Opteron box...
>>>>>>
>>>>>>In theory that could max out at 4M nps per node, or Deep Blue-like speeds of
>>>>>>512M nodes per second.  Of course it won't be as efficient as my current SMP
>>>>>>search, but it will be interesting to see how close it can get...
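
(The arithmetic behind that figure: 4M nps per dual-Opteron node times 128 nodes
is 512M nps in the aggregate, assuming the cluster scaled perfectly, which, as
noted, it won't.)
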
>>>>>
>>>>>Like many words in the English language, "cluster" could mean many things, even
>>>>>to an electrical engineer.
>>>>>
>>>>>I would like to know what "Cluster" means in the context of this thread, so I
>>>>>can understand this thread better.
>>>>>
>>>>>Help!!!!
>>>>>
>>>>>Bob D.
>>>>
>>>>
>>>>Buy a dual-processor Opteron box.  Then buy a second one and connect them with
>>>>something decent, say at least gigabit Ethernet.  Now you have a two-node
>>>>cluster.
>>>>
>>>>Clusters imply no shared memory, just some sort of message-passing facility that
>>>>lets the nodes communicate in some reasonably efficient manner.  Nothing is as
>>>>good as a real shared-memory machine (SMP).  Next come NUMA machines, which are
>>>>similar to SMP but with memory latencies that vary depending on how far it is
>>>>from one processor to the memory it wants to access.  Then come clusters that
>>>>are based on various connection facilities, from plain 100 Mbit Ethernet, to
>>>>gigabit Ethernet, to more exotic (and way faster) interconnects like our old
>>>>cLAN cluster that has very low latency and very high bandwidth, but at a much
>>>>higher cost...
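
For concreteness, clusters like these are usually programmed with a
message-passing library such as MPI.  A minimal sketch of that programming model
(not Crafty code), where the only way one node learns anything from another is an
explicit send/receive pair:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
      int rank, nprocs;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which node am I?          */
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* how many nodes in total?  */
      if (nprocs >= 2) {
        if (rank == 0) {
          int work = 42;                       /* e.g. a split-point id     */
          MPI_Send(&work, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
          int work;
          MPI_Recv(&work, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          printf("node %d received work item %d\n", rank, work);
        }
      }
      MPI_Finalize();
      return 0;
    }

There is no shared hash table or shared split block here unless you build one out
of messages, which is exactly why the same search is harder to make efficient on a
cluster than on an SMP or NUMA box.
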
>>>
>>>OK, so my concept of a cluster was not so far off after all.  I had been
>>>thinking that it didn't matter how the different cluster elements were arranged
>>>with respect to each other, and the above seems to agree with that.
>>>
>>>By way of contrast, neural networks seem different.  What would be the minimum
>>>requirements to be met for something to be called a neural network, and are
>>>neural networks considered to be a special type of the clusters you are
>>>discussing in this thread?
>>>
>>>Bob D.
>>
>>
>>A neural network is a specific approach to solving a problem by using a
>>human-like (in some twisted sense) approximation to basic brain building blocks.
>>It really doesn't matter how the "neurons" communicate.  There, the issue is
>>only speed.  I.e., the various layers of "neurons" communicate in some fashion,
>>where an SMP box provides the fastest communication, a cluster the slowest.  But
>>either way it would still be a neural network.  Just as either would allow us to
>>build a parallel chess search engine, but one would definitely have a
>>performance advantage over the other, node for node.
>
>It sounds like there may yet be room for advancement here.
>
>Bob D.

The problem with neural nets is "training".  I.e., for some tasks, the training is
quick and the proficiency is very high after that training.  But chess is a game
of exceptions to rules as well as a game of rules.  Neural nets don't really
extract the same "information" from a game (or a position) that we do.  We
simply tell them "here is a position, here is the score" and we repeat that N
times.  The larger N, the better (in general) the final product.  I strongly
suspect, however, that for chess, N is beyond ridiculous, and the internal
geometry of the "network" will be huge as well.

I simply think that in our chess domain, neural nets won't work, even though they
are quite good in some other domains...
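
To make that "here is a position, here is the score, repeat N times" regime
concrete, here is a toy sketch of the training loop: one linear unit fitted to
labeled examples by gradient descent.  The features and targets are placeholders,
and none of this is from a real chess trainer; the point is only the shape of the
loop, where for chess the argument above is that N and the network size blow up:

    #include <stdio.h>

    #define NFEATURES 2   /* toy feature count; a real chess net needs vastly more */
    #define NEXAMPLES 4   /* "N": the number of labeled positions                  */

    int main(void) {
      /* placeholder "positions" (feature vectors) and their target scores */
      double x[NEXAMPLES][NFEATURES] = { {1, 0}, {0, 1}, {1, 1}, {0, 0} };
      double target[NEXAMPLES]       = { 0.3, -0.2, 0.1, 0.0 };
      double w[NFEATURES] = {0}, bias = 0.0, lr = 0.05;

      for (int epoch = 0; epoch < 2000; epoch++) {      /* many passes over N    */
        for (int n = 0; n < NEXAMPLES; n++) {           /* the N labeled examples */
          double out = bias;                            /* predict a score       */
          for (int f = 0; f < NFEATURES; f++) out += w[f] * x[n][f];
          double err = out - target[n];                 /* compare to the label  */
          bias -= lr * err;                             /* nudge the weights     */
          for (int f = 0; f < NFEATURES; f++)           /* toward that label     */
            w[f] -= lr * err * x[n][f];
        }
      }
      printf("learned weights: %.3f %.3f  bias: %.3f\n", w[0], w[1], bias);
      return 0;
    }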



