Computer Chess Club Archives



Subject: Re: How good to use a LAN for chess computing?

Author: Vincent Diepeveen

Date: 03:44:33 09/17/01



On September 17, 2001 at 00:01:47, Robert Hyatt wrote:

>On September 16, 2001 at 20:47:25, Vincent Diepeveen wrote:
>
>>On September 16, 2001 at 09:44:37, Robert Hyatt wrote:
>>
>>>On September 15, 2001 at 22:48:41, Vincent Diepeveen wrote:
>>>
>>>>On September 15, 2001 at 22:34:27, Robert Hyatt wrote:
>>>>
>>>>>On September 15, 2001 at 03:28:18, Tony Werten wrote:
>>>>>
>>>>>>On September 14, 2001 at 22:56:06, Pham Minh Tri wrote:
>>>>>>
>>>>>>>I see that dual computers are expensive, not easy to own, and still limited in
>>>>>>>computing power.
>>>>>>>
>>>>>>>I wonder how good / possible it would be to use all the computers in a LAN for
>>>>>>>chess computing. LANs are very popular and the number of computers could be in
>>>>>>>the hundreds. Even though a LAN is not as effective as a dual machine, the
>>>>>>>bigger number of processors could help and break the limit.
>>>>>>>
>>>>>>>What do you think?
>>>>>>
>>>>>>When you search a chess tree, you often come into parts of the tree that you
>>>>>>have searched before. You either don't want to search this part again (you have
>>>>>>searched it deeply enough before) or you want the best move from the
>>>>>>previous search. Hashtables do exactly this.
>>>>>>
>>>>>>In a LAN (or a cluster) you don't share this hashtable and therefore are
>>>>>>searching the same tree (or parts of it) time and time again. If you count the
>>>>>>number of nodes searched per second it's a linear speedup, but effectively it's
>>>>>>useless. You have to add a lot of computers before you get any real speedup,
>>>>>>especially in the endgame.
>>>>>>
>>>>>>cheers,
>>>>>>
>>>>>>Tony
>>>>>
>>>>>
>>>>>This is not necessarily true.  Several programs have distributed the hash table
>>>>>across network nodes.  It requires small changes to the basic search algorithm,
>>>>>but a distributed hash table is not only doable, it has been done more than
>>>>>once.
>>>>>
>>>>>I will probably do this in the distributed Crafty when I do it...
>>>>
>>>>On a 100 Mbit network I tried to ship 16-byte packets as fast as possible
>>>>from one node to another.
>>>>
>>>>I managed to do that over a crossover UTP cable about 3000 times a second.
>>>>
>>>>In short that means, not counting the ~60 ms receive delay in Linux
>>>>(about 30 ms in Windows), that you can ship and get back a hashtable
>>>>entry only 1500 times a second.
>>>
>>>
>>>There is no 60 ms delay in Linux.  I run this test all the time, and I can
>>>sustain 80 megabytes per second with no trouble at all using a C program and
>>>a TCP/IP stream protocol.  You should also check your math.  50 ms would mean
>>>20 packets per second, which is ridiculous.  1 ms means 1000 packets per second,
>>>which is always doable.
>>
>>As I said, I am not counting the thread wakeup delay here, because there
>>are no doubt technical solutions for that.
>>
>>Note that what I put through
>>used DATAGRAMs on 100 Mbit ethernet cards. That's 3000 times a second
>>for 16-byte packets.
>>
>>Your cluster has 1.25 gigabit per second on paper. Getting 80 megabytes
>>a second on it means 0.64 gigabit a second. That isn't much, but it is plenty
>>to work with when you are only talking about 8 nodes.
>>
>>However, take into account that each node searches about a million nodes
>>a second.
>>
>>That's 8 million nodes a second in total. If every node got stored, that
>>would produce a total of 8 million hash requests
>>a second. That's quite a lot.
>
>
>Not at all.  The giganet stuff will do 1.25 gigabits per second between
>any two nodes.  On my 8-node cluster, any 4 nodes may talk to any other 4
>nodes at that speed, simultaneously, for a total potential bandwidth of
>16 * 1.25 gigabits per second, since each node can send and receive at 1.25
>gigabits per second full-duplex.
>
>If you distribute the hash table, then you also tend to distribute the hash
>probes.  Which means you spend time trying to prevent multiple nodes from
>probing a single node.  And this is doable, from experience and talking with
>guys like Schaeffer and others that have done it.
>
>
>
>
>>
>>Let's look at each node:
>>  1 million nodes a second. I could deliver and answer
>>  with a crossover UTP cable (which I assume is the fastest way to
>>  connect 2 nodes) 1500 requests a second.
>>
>>Now your network is a factor of 10 faster on paper, so for simplicity let's
>>assume you can handle 15000 requests a second. Note that this
>>would then take 100% of system time.
>
>My network is _not_ 10x faster.  With a latency of .5 microseconds,
>it is _way_ faster.  Perhaps 1000x faster for short packets.

I forgot you have a cluster at the university.

Which network cards must I buy to get that latency, and what is their
price? I'm interested in buying 2 of those myself if they're
not $$$$.

The most important thing to realize is that the big majority of dudes here
are talking about running a program on their home network, so their question
is always about 100 Mbit networks at most. To my amazement,
most companies still build 10 Mbit networks.
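For anyone who wants to repeat the 16-byte datagram experiment, here is a minimal sketch (my own toy code, not the original test): it bounces small UDP packets off an echo thread and counts round trips per second. Over loopback this mostly measures stack overhead; on a real 100 Mbit LAN the wire and card latency dominate.

```python
import socket
import threading
import time

def echo_server(sock, n):
    # Echo each datagram straight back to whoever sent it.
    for _ in range(n):
        data, addr = sock.recvfrom(64)
        sock.sendto(data, addr)

def measure_roundtrips(n=1000):
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind(("127.0.0.1", 0))              # ephemeral port on loopback
    cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    t = threading.Thread(target=echo_server, args=(srv, n))
    t.start()
    payload = b"0123456789abcdef"           # 16 bytes, like a hash entry
    start = time.perf_counter()
    for _ in range(n):
        cli.sendto(payload, srv.getsockname())
        cli.recvfrom(64)                    # wait for the echo
    elapsed = time.perf_counter() - start
    t.join()
    srv.close()
    cli.close()
    return n / elapsed                      # round trips per second

print(round(measure_roundtrips()))
```

Each round trip here corresponds to one remote hash probe plus its answer, which is why the usable probe rate is half the one-way packet rate.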



>
>>
>>That means, depending upon the branching factor, you will lose loads of system
>>time waiting for hash entries.
>
>
>Not if I do the algorithm right.  I'm not going to be "waiting" for hash
>entries and doing nothing.

Yes, you already mentioned this delayed-transposition suggestion some
years ago, and I have since then thought long about it, but the
main problem is that the branching factor gets worse.

My move ordering already gets 1% worse for <= alpha nodes *alone*, and
several percent worse for cutoff nodes.

I do not know the numbers for Crafty here, but if your move ordering
statistics are the same, then imagine what 1% means at a depth
of 16 ply for Crafty. You search plies less!

Now the effect is a bit more limited here, because near the root you can
of course still see it as linear overhead. The basic
slowdown is near the leaves!
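To make the arithmetic concrete, here is a quick back-of-envelope calculation (the branching factor of 3.0 is an assumed number, not measured from any engine): a 1% worse effective branching factor compounds over every ply of the search.

```python
# Assumed effective branching factor; real engines vary.
base_bf = 3.0
worse_bf = base_bf * 1.01   # move ordering 1% worse
depth = 16

# Node count at fixed depth scales roughly like bf**depth,
# so the overhead compounds ply by ply.
overhead = (worse_bf / base_bf) ** depth
print(f"{(overhead - 1) * 100:.0f}% more nodes at depth {depth}")
# prints: 17% more nodes at depth 16
```

So even a 1% hit to move ordering costs on the order of 17% more nodes at depth 16, which is roughly the same as giving up a noticeable slice of search speed.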

>>It's impossible to simply carry on with your search, because then the
>>branching factor gets bigger.
>
>
>It is interesting to see you label as "impossible" something that others
>have _already_ done.  You do the probe, you continue searching as though
>there was no "hit" (which is correct in 80-90% of the positions).  If you
>get a probe "hit" back later, you can then react to it as necessary.  The
>overall slow-down is then not very significant.
>
>Schaeffer did this.  The WayCool guys did this.  And Hsu did this.
>
>
>
>
>
>>
>>Also, I'm always amazed that people never hash the qsearch; granted, with a
>>million nodes a second that's perhaps harder than with 120k nps.
>
>
>Perhaps we don't hash it because we tested it and found no advantage to
>doing so???
>
>
>
>>
>>I hash *everything*.
>
>
>Nothing wrong with that.  But saying it is the _only_ way is not so
>hot...
>
>
>
>
>
>>
>>If I did not do it I would suffer big time, especially in the qsearch,
>>because huge trees would get re-searched there!
>>
>
>
>
>That is because you fail to appreciate the fact that a very small q-search
>doesn't have this problem.
>
>
>
>>
>>
>>>>
>>>>So unless you want to create a Deep Blue-style Crafty where you only hash the
>>>>first so many plies, you will surely slow Crafty down by a factor of, say,
>>>>450?
>>>>
>>>
>>>Hashing the first N plies may well be an idea.  I already don't hash the
>>>q-search.  I believe Junior doesn't hash the last non-qsearch ply.  It seems
>>>to work ok for us...
>>>
>>>
>>>
>>>
>>>
>>>>How big is your cluster then to get a speedup of over 1.0 ?
>
>
>I believe 2 nodes will do just fine to get a speedup of > 1.0...
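The distributed-hashtable idea argued over in this thread can be sketched in a few lines. This is a toy illustration of the general scheme (not Crafty's, Schaeffer's, or anyone's actual code): each position's Zobrist key determines one owning node, every store and probe goes to that node's shard, and in a real cluster the probe would be an asynchronous network request while the search carries on as if it had missed.

```python
NUM_NODES = 8   # assumed cluster size, matching the 8-node example above

def owner(zobrist_key: int) -> int:
    # Low bits of the key select which cluster node holds the entry.
    return zobrist_key % NUM_NODES

class LocalShard:
    """The slice of the transposition table that one node holds."""
    def __init__(self):
        self.table = {}
    def store(self, key, depth, score):
        self.table[key] = (depth, score)
    def probe(self, key):
        return self.table.get(key)      # None models a miss

# One shard per node; here they live in one process for illustration.
shards = [LocalShard() for _ in range(NUM_NODES)]

def store(key, depth, score):
    shards[owner(key)].store(key, depth, score)

def probe(key):
    # In a real cluster this would be a fire-and-forget network request,
    # with the search continuing until the reply arrives; here it is a
    # direct lookup on the owning shard.
    return shards[owner(key)].probe(key)

store(0x9D3F, 12, 35)
print(owner(0x9D3F), probe(0x9D3F))
# prints: 7 (12, 35)
```

The whole dispute above is about the cost of that network round trip: with microsecond-latency cards the probe can be overlapped cheaply, while on a commodity 100 Mbit LAN the search would either stall or pay the move-ordering penalty of continuing without the entry.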




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.