Author: Bob Durrett
Date: 07:54:52 02/21/04
On February 21, 2004 at 09:49:03, Robert Hyatt wrote:

>On February 21, 2004 at 09:02:43, Bob Durrett wrote:
>
>>On February 21, 2004 at 01:24:29, Robert Hyatt wrote:
>>
>>>On February 20, 2004 at 21:25:33, Bob Durrett wrote:
>>>
>>>>On February 20, 2004 at 21:22:12, Robert Hyatt wrote:
>>>>
>>>>>On February 20, 2004 at 13:29:51, Dann Corbit wrote:
>>>>>
>>>>>>On February 20, 2004 at 10:43:11, Robert Hyatt wrote:
>>>>>>
>>>>>>>On February 19, 2004 at 15:09:25, Dann Corbit wrote:
>>>>>>>
>>>>>>>>On February 19, 2004 at 14:55:40, Robert Hyatt wrote:
>>>>>>>>
>>>>>>>>>On February 19, 2004 at 14:25:07, Dann Corbit wrote:
>>>>>>>>>
>>>>>>>>>>On February 19, 2004 at 14:14:11, Robert Hyatt wrote:
>>>>>>>>>>[snip]
>>>>>>>>>>>Clusters can work. 1000 nodes is a totally different problem from using 1-64, but nothing I see says it is an impossible task. Impossible to "some"? Perhaps so. Lots of things are impossible to "some". Fortunately, for the rest of us, what is "impossible" is something that just takes a bit longer to do... :)
>>>>>>>>>>
>>>>>>>>>>A cluster notion (e.g. 64,000 machines, maybe using the network):
>>>>>>>>>>
>>>>>>>>>>Divide the machines into clumps that work on forward nodes, and multiple possible best moves.
>>>>>>>>>
>>>>>>>>>I already do this on large SMP platforms. I.e., Crafty has a "group" concept, so that with (say) 64 nodes, you don't get 64 nodes working on one split point. That is bad for multiple reasons. Usually there are not 64 moves to search, so the split overhead is just pure overhead. Also, on NUMA boxes, you don't want 64 processors banging on one "split block", even though they don't do that much unless the split is too deep in the tree (this is also controllable in current Crafty).
>>>>>>>>>
>>>>>>>>>I suspect the same idea will be the right way for clusters. Divide it into groups, sub-groups, and so forth.
>>>>>>>>>The "groups" have a "head", and the initial split is between the "group heads". Then each group head can further split its tree farther down the path, with sub-group members. Of course, each of these might be a "sub-group head" that could later further split the tree among sub-sub-group members.
>>>>>>>>>
>>>>>>>>>I think some such hierarchy is necessary for truly large numbers of processors, and the potential for search overhead is going to be significant.
>>>>>>>>
>>>>>>>>Have one CPU that looks at the whole tree and makes recommendations (a sort of loadmaster). In that way, the CPU pool does not get divided evenly. So (in the first-move example) moves like 1. e4, 1. d4, and 1. Nc3 get 20,000 CPUs each, and things like 1. a3 might only get 10 CPUs.
>>>>>>>>
>>>>>>>>This concept can also carry forward as far as you like.
>>>>>>>
>>>>>>>I don't think that will work. The reason is that there _are_ positions where a move like a3 is the key move.
>>>>>>
>>>>>>But there are still ten CPUs working on that one. About like the system that recently pounded the opposition into oblivion at CCT-6.
>>>>>
>>>>>Yes, but if there are thousands on _other_ moves, and you run out of time before the "10" have an answer, you never know the move is good. This is a problem I have worked on in Cray Blitz and Crafty for many years. My current approach has some specific stuff for this case, in fact...
>>>>>
>>>>>>
>>>>>>>I don't think that some sort of static decision can be made as to which move deserves the most (or least) processors. It seems to make more sense for this decision to be dynamic and based on what is actually happening within the tree, not on some static characteristic of the individual moves, which might be misleading.
>>>>>>
>>>>>>The way that I suggest is how humans play chess.
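Corbit's "loadmaster" idea quoted above — a CPU pool split unevenly across the root moves, with even bad-looking moves keeping a few searchers — can be sketched as proportional allocation with a floor. This is an illustrative sketch only: the function name, scores, and numbers are invented for the example, and as Hyatt points out, a static score like this can be misleading in exactly the positions that matter.

```python
def allocate_cpus(move_scores, total_cpus, floor=10):
    """Split a CPU pool across root moves in proportion to a static
    score per move, but never give any move fewer than `floor` CPUs,
    so even a bad-looking move keeps some searchers."""
    total_score = sum(move_scores.values())
    return {move: max(int(total_cpus * score / total_score), floor)
            for move, score in move_scores.items()}

# Purely illustrative scores; a real loadmaster would have to derive
# them from the search itself, which is the hard part.
alloc = allocate_cpus({"e4": 30, "d4": 30, "Nc3": 30, "a3": 1}, 64000)
```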
>>>>>>We look at the bad lines, but we focus much more on the stuff that looks good.
>>>>>
>>>>>Aha. But your task now is to write the function that makes this statement work correctly:
>>>>>
>>>>>    if ( FeelsGood(move) ) ...
>>>>>
>>>>>:)
>>>>>
>>>>>It isn't so easy... I've been trying to do that for years; GMs have told me many times that is the "key". :)
>>>>>
>>>>>>
>>>>>>>My main goal was to avoid having too many processors work together, which causes unnecessary overhead. I.e., on the Cray C90/T90, with 16/32 processors, we had to do the same sort of thing; otherwise, in endgames there are plenty of positions with fewer than 32 or even fewer than 16 moves, and we didn't want the split overhead there to go out of control. However, on the Cray it was a one-level hierarchy, where it might be much more complex on a large cluster-type or NUMA-type box that could have way more than 32 processors.
>>>>>>>
>>>>>>>My feeling is that for any architecture, there is an "optimal" number of processors to divide up a chunk of work and search the same tree in parallel. You apply this "number" recursively, in a divide-and-conquer approach, to increase the total number of processors "in the loop" without ever having more than some finite number working on a particular node together.
>>>>>>>
>>>>>>>I also have a depth limit on splits in Crafty, to control the split overhead, which is not "free". Every time I change hardware, I run tests to see if the number should be changed, but since I left the quad Pentium Pro 200 and changed it then, it has not been changed since. But again, it is an intent to adjust the algorithm to the specific hardware, limiting overhead to some reasonable level while doing so.
>>>>>>>
>>>>>>>I suspect over time, other constraints will be found and allowed for in the split algorithm.
>>>>>>
>>>>>>It is a very interesting area for study.
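The recursive divide-and-conquer hierarchy Hyatt describes — group heads splitting at the top, sub-group members deeper, never more than some "optimal" number of processors at one split point — can be sketched roughly as follows. The group size of 8 and all names here are assumptions for illustration, not values from Crafty or Cray Blitz.

```python
def split_hierarchy(n_procs, group_size=8):
    """Divide processors into groups of at most `group_size`; one
    "head" per group then splits again at the next level up, so no
    split point ever has more than `group_size` searchers on it.
    Returns the group count at each level, bottom to top."""
    levels = []
    while n_procs > 1:
        n_groups = (n_procs + group_size - 1) // group_size  # ceiling division
        levels.append(n_groups)
        n_procs = n_groups  # only the group heads split further up
    return levels

# 64 processors: 8 groups of 8, whose heads form one top-level group.
print(split_hierarchy(64))   # [8, 1]
print(split_hierarchy(512))  # [64, 8, 1]
```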
>>>>>>The ChessBrain project has some interesting stuff going on in that regard, and they use the internet as the medium for the CPU farm. The fact that they are able to succeed against much stronger programs than the leaf processors shows that even a distributed system with very slow links can be feasible.
>>>>>
>>>>>Never said it couldn't. That was Vincent. I've been a "cluster believer" for many years...
>>>>>
>>>>>:)
>>>>>
>>>>>I will have a cluster Crafty one day also... once we get our Opteron cluster, I plan on a serious look... Right now we are targeting a 128-node cluster, with each node being a dual Opteron box...
>>>>>
>>>>>In theory that could max out at 4M nps per node, or Deep Blue-like speeds of 512M nodes per second. Of course it won't be as efficient as my current SMP search, but it will be interesting to see how close it can get...
>>>>
>>>>Like many words in the English language, "cluster" could mean many things, even to an electrical engineer.
>>>>
>>>>I would like to know what "cluster" means in the context of this thread, so I can understand this thread better.
>>>>
>>>>Help!!!!
>>>>
>>>>Bob D.
>>>
>>>Buy a dual-processor Opteron box. Then buy a second one and connect them with something decent, say at least gigabit Ethernet. Now you have a two-node cluster.
>>>
>>>Clusters imply no shared memory, just some sort of message-passing facility that lets the nodes communicate in some reasonably efficient manner. Nothing is as good as a real shared-memory machine (SMP). Next come NUMA machines, which are similar to SMP but with memory latencies that vary depending on how far it is from one processor to the memory it wants to access.
>>>Then come clusters that are based on various connection facilities, from plain 100Mbit Ethernet, to gigabit Ethernet, to more exotic (and way faster) interconnects like our old cLAN cluster, which has very low latency and very high bandwidth, but at way expensive cost...
>>
>>OK, so my concept of a cluster was not so far off after all. I had been thinking that it didn't matter how the different cluster elements were arranged with respect to each other, and the above seems to agree with that.
>>
>>By way of contrast, neural networks seem different. What would be the minimum requirements for something to be called a neural network, and are neural networks considered to be a special type of the clusters you are discussing in this thread?
>>
>>Bob D.
>
>A neural network is a specific approach to solving a problem by using a human-like (in some twisted sense) approximation to basic brain building blocks. It really doesn't matter how the "neurons" communicate. There, the issue is only speed. I.e., the various layers of "neurons" communicate in some fashion, where an SMP box provides the fastest communication and a cluster the slowest. But either way it would still be a neural network. Just as either would allow us to build a parallel chess search engine, but one would definitely have a performance advantage over the other, node for node.

It sounds like there may yet be room for advancement here.

Bob D.
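As a minimal illustration of Hyatt's point that only the layers' communication matters — the values must reach the next layer, whether over shared memory or a slow network link, and only the speed changes — here is a tiny two-layer feed-forward network in plain Python. The weights and sizes are arbitrary numbers chosen just to make the example run.

```python
import math

def layer(inputs, weights, biases):
    """One fully connected layer: each neuron takes a weighted sum of
    its inputs plus a bias, then applies a sigmoid activation. How the
    inputs arrived (shared memory, message passing) is irrelevant to
    the result; it affects only how fast the layer can run."""
    return [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
            for row, b in zip(weights, biases)]

# Two layers: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden = layer([1.0, 0.0], [[0.5, -0.5], [0.25, 0.75]], [0.0, 0.1])
output = layer(hidden, [[1.0, -1.0]], [0.0])
```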
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.