Author: Vincent Diepeveen
Date: 03:44:33 09/17/01
On September 17, 2001 at 00:01:47, Robert Hyatt wrote:

>On September 16, 2001 at 20:47:25, Vincent Diepeveen wrote:
>
>>On September 16, 2001 at 09:44:37, Robert Hyatt wrote:
>>
>>>On September 15, 2001 at 22:48:41, Vincent Diepeveen wrote:
>>>
>>>>On September 15, 2001 at 22:34:27, Robert Hyatt wrote:
>>>>
>>>>>On September 15, 2001 at 03:28:18, Tony Werten wrote:
>>>>>
>>>>>>On September 14, 2001 at 22:56:06, Pham Minh Tri wrote:
>>>>>>
>>>>>>>I see that dual computers are expensive, not easy to own, and still limited
>>>>>>>in computing power.
>>>>>>>
>>>>>>>I wonder how good / possible it would be to use all the computers on a LAN
>>>>>>>for chess computing. LANs are very popular, and the number of computers
>>>>>>>could be in the hundreds. Even though a LAN is not as effective as a dual
>>>>>>>machine, the larger number of processors could help and break the limit.
>>>>>>>
>>>>>>>What do you think?
>>>>>>
>>>>>>When you search a chess tree, you often reach parts of the tree that you
>>>>>>have searched before. You either don't want to search that part again (you
>>>>>>have searched it deeply enough before) or you want the best move from the
>>>>>>previous search. Hash tables do exactly this.
>>>>>>
>>>>>>On a LAN (or a cluster) you don't share this hash table and therefore keep
>>>>>>searching the same tree (or parts of it) time and time again. If you count
>>>>>>the number of nodes searched per second it's a linear speedup, but
>>>>>>effectively it's useless. You have to add a lot of computers before you get
>>>>>>any real speedup, especially in the endgame.
>>>>>>
>>>>>>cheers,
>>>>>>
>>>>>>Tony
>>>>>
>>>>>
>>>>>This is not necessarily true. Several programs have distributed the hash
>>>>>table across network nodes. It requires small changes to the basic search
>>>>>algorithm, but a distributed hash table is not only doable, it has been done
>>>>>more than once.
>>>>>
>>>>>I will probably do this in the distributed Crafty when I do it...
>>>>
>>>>On a 100 Mbit network I tried to ship 16-byte packets as fast as possible
>>>>from one node to another.
>>>>
>>>>With a crossover UTP cable I managed to do that about 3000 times a second.
>>>>
>>>>In short that means that, without counting the 60 ms receive delay in Linux
>>>>(30 ms in Windows or something), you can only ship and get back a hash-table
>>>>entry 1500 times a second.
>>>
>>>
>>>There is no 60 ms delay in Linux. I run this test all the time, and I can
>>>sustain 80 megabytes per second with no trouble at all using a C program and
>>>a TCP/IP stream protocol. You should also check your math. 50 ms would mean
>>>20 packets per second, which is ridiculous. 1 ms means 1000 packets per
>>>second, which is always doable.
>>
>>As I said, I am not counting the thread-wakeup delay here, because there are
>>no doubt technical solutions for that.
>>
>>Note that what I put through was using DATAGRAMs on 100 Mbit Ethernet cards.
>>That's 3000 16-byte packets a second.
>>
>>Your cluster has 1.25 gigabit per second on paper. Getting 80 megabytes a
>>second out of it means 0.64 gigabit a second. That isn't much, but plenty to
>>work with when you only talk about 8 nodes.
>>
>>However, take into account that each node searches something like a million
>>nodes a second.
>>
>>That's 8 million nodes a second in total. That produces (if each node were
>>stored) a total of 8 million hash requests a second. That's quite a lot.
>
>
>Not at all. The Giganet hardware will do 1.25 gigabits per second between any
>two nodes. On my 8-node cluster, any 4 nodes may talk to any other 4 nodes at
>that speed simultaneously, for a total potential bandwidth of 16 * 1.25
>gigabits per second, since each node can send and receive at 1.25 gigabits
>per second, full-duplex.
>
>If you distribute the hash table, then you also tend to distribute the hash
>probes, which means you spend time trying to prevent multiple nodes from
>probing a single node.
>And this is doable, from experience and from talking with guys like Schaeffer
>and others who have done it.
>
>
>>
>>Let's look at each node: 1 million nodes a second. With a crossover UTP
>>cable (which I assume is the fastest way to connect 2 nodes) I could deliver
>>and answer 1500 requests a second.
>>
>>Now your network is a factor of 10 faster on paper, so for simplicity let's
>>assume you can handle 15000 requests a second. Note that this would then
>>take 100% of system time.
>
>My network is _not_ 10x faster. With a latency of .5 microseconds, it is
>_way_ faster. Perhaps 1000x faster for short packets.

I forgot you have a cluster at the university. Which network cards must I buy
to get that latency, and what is their price? I'm interested in buying 2 of
them myself if they're not $$$$.

The most important thing to realize is that the big majority of dudes here
are talking about running a program on their home network, so their question
is always about 100 Mbit networks at most. To my amazement most companies
still build 10 Mbit networks.

>
>>
>>That means, depending upon the branching factor, you will lose loads of
>>system time waiting for hash entries.
>
>
>Not if I do the algorithm right. I'm not going to be "waiting" for hash
>entries and doing nothing.

Yes, you already made this delayed-transposition suggestion some years ago,
and I have since thought long about it, but the main problem is that the
branching factor gets worse. My move ordering gets 1% worse just for <= alpha
nodes, and several % worse for cutoff nodes. I do not know the numbers for
Crafty here, but if your move-ordering statistics are the same, then imagine
what 1% means at a depth of 16 ply for Crafty. You search plies less! The
effect is somewhat limited, because at depths near the root you can of course
still see it as linear overhead. The basic slowdown is near the leaves!

>>It's impossible to already carry on with your search, because then the
>>branching factor gets bigger.
>
>
>It is interesting to see you label as "impossible" something that others have
>_already_ done. You do the probe, you continue searching as though there was
>no "hit" (which is correct in 80-90% of the positions). If you get a probe
>"hit" back later, you can then react to it as necessary. The overall
>slowdown is then not very significant.
>
>Schaeffer did this. The WayCool guys did this. And Hsu did this.
>
>
>>
>>Also, I'm always amazed that people never hash the qsearch; well, with a
>>million nodes a second that's perhaps harder than with 120k nps.
>
>
>Perhaps we don't hash it because we tested it and found no advantage to doing
>so???
>
>
>>
>>I hash *everything*.
>
>
>Nothing wrong with that. But saying it is the _only_ way is not so hot...
>
>
>>
>>If I did not do it, I would suffer big-time, especially in the qsearch,
>>because huge trees would get re-searched there!
>
>
>That is because you fail to appreciate the fact that a very small q-search
>doesn't have this problem.
>
>
>>>>
>>>>So unless you want to create a Deep Blue Crafty where you only hash the
>>>>first so many plies, you will surely slow Crafty down by a factor of, say,
>>>>450?
>>>
>>>Hashing the first N plies may well be an idea. I already don't hash the
>>>q-search. I believe Junior doesn't hash the last non-qsearch ply. It seems
>>>to work OK for us...
>>>
>>>
>>>>How big does your cluster have to be then to get a speedup of over 1.0?
>
>
>I believe 2 nodes will do just fine to get a speedup of > 1.0...