Computer Chess Club Archives

Subject: Re: How good to use a LAN for chess computing?

Author: Robert Hyatt
Date: 20:38:19 09/16/01

On September 16, 2001 at 21:10:46, Vincent Diepeveen wrote:

>On September 16, 2001 at 09:37:50, Robert Hyatt wrote:
>
>>On September 15, 2001 at 22:44:51, Vincent Diepeveen wrote:
>>
>>>On September 15, 2001 at 22:29:57, Robert Hyatt wrote:
>>>
>>>>On September 15, 2001 at 16:31:07, Vincent Diepeveen wrote:
>>>>
>>>>>On September 14, 2001 at 22:56:06, Pham Minh Tri wrote:
>>>>>
>>>>>>I see that dual computers are expensive, not easy to own and still limited in
>>>>>>power of computing.
>>>>>>
>>>>>>I wonder how good / possible if we use all computers in a LAN for chess
>>>>>>computing. LANs are very popular and the numbers of computers could be hundreds.
>>>>>
>>>>>LAN 1Gigabit /s or a slow 100mbit LAN?
>>>>>
>>>>>>Even though a LAN is not effective as a dual circuit, but the bigger number of
>>>>>>processors could help and break the limit.
>>>>>>What do you think?
>>>>>
>>>>>the problem is the hard work to make it. I had done some tests and have
>>>>>a version of diep that nearly worked over the lan, but then i was confronted
>>>>>with some huge slowdowns. Then i talked to Bob and i knew why.
>>>>>
>>>>>note that 100mbit networks aren't really 100mbit networks. Even with the fastest
>>>>>cards i could not get more than 60mbit a second through.
>>>>>
>>>>>a major problem is that if you try to read info from it in a multithreaded
>>>>>way, you get huge delays. With multiple processors the problem is exactly as
>>>>>big.
>>>>>
>>>>>Before you receive info over the network you are already hundreds of
>>>>>milliseconds further. This is a major problem.
>>>>>
>>>>
>>>>
>>>>I don't see that kind of speed on 100mbit switched networks.  I don't even see
>>>>10ms delays there.  And I have actually seen real speeds in the 1-5ms range to
>>>>send a single packet from any two non-conflicting nodes (using a switch, ie).
>>>
>>>but you are sending a byte or 2?
>>
>>
>>
>>No...  try writing some code.  One of the things I have kids do in my
>>network programming course is to answer just that question.  And it turns
>>out that the size of the packet is _far_ less important than the number
>>of packets sent.  IE sending one 1K packet is almost 10x faster than sending
>>10 100-byte packets.
>
>
>>>
>>>How about a chessprogram that's communicating with all bandwidth used up,
>>>try that and start horrorring!
>>
>>
>>When you are out of bandwidth, you are just out of bandwidth.  But in Linux,
>>I can sustain about 80 mbits/second on 100mbit ethernet using a switch.  You
>>simply have to write the algorithm with the bandwidth limit in mind.
>
>I did 2 tests
>  a) trying to get the bandwidth filled as much as possible with huge
>     packets. Note that 1kb packets isn't the optimal size. For me it was
>     several tens of kbs optimal size


This doesn't make any sense to me.  First, you can't _send_ any packet
larger than 1518 bytes.  If you do, it gets fragmented on one end, and
re-assembled on the other.  The only exception is "jumbo packets" made
popular by gigabit switches.
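
As a rough illustration of the fragmentation point, the sketch below counts how many Ethernet frames a large message turns into. It assumes a standard 1500-byte MTU and TCP over IPv4 with no header options (20 bytes each), which leaves 1460 bytes of application data per frame. `frames_needed` is a hypothetical helper name, not part of any library.

```python
import math

ETH_HEADER = 14      # destination MAC + source MAC + EtherType
ETH_FCS = 4          # frame check sequence (CRC)
MTU = 1500           # maximum Ethernet payload without jumbo frames
IP_HEADER = 20       # IPv4 header, assuming no options
TCP_HEADER = 20      # TCP header, assuming no options

MAX_FRAME = ETH_HEADER + MTU + ETH_FCS   # 1518 bytes, excluding preamble
MSS = MTU - IP_HEADER - TCP_HEADER       # 1460 bytes of application data

def frames_needed(message_bytes, mss=MSS):
    """Number of frames needed to carry one message of the given size."""
    return max(1, math.ceil(message_bytes / mss))

if __name__ == "__main__":
    for size in (16, 1024, 32 * 1024):
        print(size, "bytes ->", frames_needed(size), "frame(s)")
```

Under these assumptions a message of several tens of kilobytes is never one packet on the wire; a 32 KB message becomes 23 separate frames.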

If you are sending lots of big packets, you simply increase the probability
of collisions since a single big packet turns into lots of 1500 byte packets.

The first thing in a distributed algorithm is to control packet size.  The
second is to control packet count.
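
To make the packet-count point concrete, here is a back-of-the-envelope cost model, not a measurement. It assumes each packet pays a fixed 1 ms cost (the low end of the 1-5ms range mentioned earlier in the thread) plus its wire time on a 100mbit link, and that packets are sent strictly one after another. The function name and the default values are illustrative assumptions.

```python
def transfer_time_ms(n_packets, bytes_per_packet,
                     per_packet_ms=1.0, link_mbit=100.0):
    """Estimated time in milliseconds to send n_packets one after another.

    per_packet_ms is an assumed fixed cost per packet (system call,
    protocol stack, switch latency); link_mbit is the raw link speed.
    """
    bits_per_ms = link_mbit * 1000.0             # 100 Mbit/s = 100,000 bits/ms
    wire_ms = bytes_per_packet * 8 / bits_per_ms
    return n_packets * (per_packet_ms + wire_ms)

one_big = transfer_time_ms(1, 1000)      # one 1000-byte packet
ten_small = transfer_time_ms(10, 100)    # ten 100-byte packets

if __name__ == "__main__":
    print(f"one 1000-byte packet : {one_big:.2f} ms")
    print(f"ten 100-byte packets : {ten_small:.2f} ms")
    print(f"ratio                : {ten_small / one_big:.1f}x")
```

With these assumed numbers the fixed per-packet cost dominates: the payload is the same 1000 bytes in both cases, but the ten small packets take a little over nine times as long.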




>  b) trying to measure how fast i could ship a question and get answer
>     back (so ship packet from 16 bytes and get that packet back).
>
>for a) i never could fill my bandwidth more than 60mbit, i'm amazed you get
>       to 80mbit. Perhaps linux is faster or your student did his math bad :)



This wasn't a student.  This is my own simple TCP/IP program.
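
For the round-trip test (b) described above, a minimal ping-pong sketch along these lines measures how many 16-byte request/reply pairs complete per second. To stay runnable on a single machine it uses `socket.socketpair()` rather than a real TCP connection between two hosts, so the number it prints reflects local kernel overhead only, not Ethernet latency. `recv_exact`, `echo_peer` and `measure_round_trips` are illustrative names, not code from either poster.

```python
import socket
import threading
import time

MSG_SIZE = 16  # bytes per message, matching the 16-byte test described above

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket (recv may return fewer)."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)

def echo_peer(sock, rounds):
    """Send every received message straight back to the sender."""
    for _ in range(rounds):
        sock.sendall(recv_exact(sock, MSG_SIZE))

def measure_round_trips(rounds=1000):
    """Return completed round trips per second over a local socket pair."""
    a, b = socket.socketpair()
    worker = threading.Thread(target=echo_peer, args=(b, rounds))
    worker.start()
    payload = b"x" * MSG_SIZE
    start = time.perf_counter()
    for _ in range(rounds):
        a.sendall(payload)
        if recv_exact(a, MSG_SIZE) != payload:
            raise ValueError("echoed message does not match")
    elapsed = time.perf_counter() - start
    worker.join()
    a.close()
    b.close()
    return rounds / elapsed

if __name__ == "__main__":
    print(f"{measure_round_trips():.0f} round trips per second")
```

To measure an actual LAN, the same loop would run over a TCP connection between two machines, probably with `TCP_NODELAY` set so that small messages are not held back by Nagle's algorithm.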



>    b) 3000 messages could be sent and shipped back per second, counting both
>       directions. That's 1500 round trips a second.
>
>Of course i measured under windows. Linux might be slightly faster.
>60 versus 80mbit isn't going to get me awake :)
>
>But it's clear that one needs more bandwidth than say around 10MB/s
>to do something useful at say a 256 processor cluster.



Not really.  You need reasonable bandwidth between two nodes.  If you do
things right, you don't have 255 nodes talking to _one_.  That will cause
a problem.  But if you split things up correctly, you don't have a lot
of many-to-one traffic, which means you can use the switched bandwidth
capability of the network to get _really_ high overall throughput.  But you
can't slap some piece of crap algorithm on a distributed cluster and expect
it to work well.  Part of the "fun" is to find ways to do things so that the
benefit outweighs the cost of doing whatever it is...
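
One way to picture the many-to-one problem is to count how many peers report to the busiest node. The sketch below compares a star layout (everyone talks to node 0) with a tree layout in which each node only hears from its own children. It only illustrates the traffic pattern; it is not how Crafty or any other program actually splits its search, and all function names are hypothetical.

```python
def star_fan_in(n_nodes):
    """Star layout: every other node reports directly to node 0."""
    return n_nodes - 1

def tree_children(node, n_nodes, arity=2):
    """Children of `node` in a k-ary tree stored in array order."""
    first = arity * node + 1
    return [c for c in range(first, first + arity) if c < n_nodes]

def tree_max_fan_in(n_nodes, arity=2):
    """Largest number of children reporting to any single node."""
    return max(len(tree_children(i, n_nodes, arity)) for i in range(n_nodes))

def tree_depth(n_nodes, arity=2):
    """Number of hops from the deepest node up to the root."""
    depth, node = 0, n_nodes - 1
    while node > 0:
        node = (node - 1) // arity
        depth += 1
    return depth

if __name__ == "__main__":
    n = 256
    print("star: busiest node hears from", star_fan_in(n), "peers")
    print("tree: busiest node hears from", tree_max_fan_in(n), "peers,",
          "depth", tree_depth(n), "hops")
```

The trade-off is that the tree replaces one congested link with several hops of latency, so the right shape depends on how much latency the algorithm can tolerate.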

>
>>
>>
>>
>>>
>>>>Of course there are faster ways to do this, by reducing the latency.  Clan is
>>>>one answer there.  The latency can be dropped to the sub-microsecond range with
>>>>no problems.
>>>
>>>Clan?
>>>
>>>$$$$ for each network card and $$$$$ for each switch?
>>
>>
>>At the moment.  However, the price is coming down.
>
>Yeah 1 gigabit network cards are a couple of hundreds of dollars here now.
>
>But that would mean 2 nodes can get connected. Switches for them are
>very expensive.
>
>A major problem is that it takes 10 years for networks to get 10
>times faster.
>
>It's 2001 now. In 1991 i had an XT at 10Mhz.
>Now i have a dual AMD 1.2Ghz.

10 years ago there were 33mhz machines, and 10mbit ethernet.  5
years ago we were 10x faster in _both_.  today we are another 10x
faster in _both_.

Why do you insist on spending 1000 for the fastest cpu, but then use
a piece of crap network hardware?  gigabit is not expensive today.  There
are dozens of gigabit switches and NICs available.  At non-horrendous
costs.



>
>Just the clockrate is already a factor 240. XT did most likely not
>execute 3 instructions a clock. So a factor 500 is definitely less than
>it most likely is.
>
>In 1991 i didn't have a network but i assume that there were already 10
>mbits networks then for not too huge prices.
>
>In 2001 i can buy for not too huge prices a 100mbit network. a 1gigabit
>network i can't afford simply because the switch is too expensive.
>


the gigabit switch is no more expensive than your fast computer.




>Not even a hub i can afford at 1 gigabit (if hubs at 1 gigabit
>exist anyway).
>
>So network speed went up a factor 10. Processor speed went up factor 500.
>
>Now processors will go faster and faster and faster, but it's unlikely
>that networks will get much above a gigabit a second soon.


gigabit is common. Our _entire_ campus is now gigabit ethernet.  It is
very common.  10gigabit ethernet will be available in another year or
two...





>
>
>
>>
>>
>>>
>>>>
>>>>>So a) you have huge overhead
>>>>>   b) you cannot communicate much
>>>>>   c) you will not be able to get systemtime on a big 100mbit network anyway.
>>>>>   d) the bigger the network, the smaller your chance of getting a speedup on a
>>>>>      100mbit network.
>>>>
>>>>
>>>>"big networks" are pretty common now.  If by "big" you mean "switched"
>>>>rather than a "hub" network.  We don't have any non-switched networks in
>>>>our department now, since switches are cheap.
>>>
>>>I doubt the 'pretty common'.
>>
>>
>>Not sure what you mean, but I haven't seen a "dumb hub" in a couple of years.
>>Here at UAB.  At public K-12 schools.  Where my wife works.  Etc.  Why would
>>anyone buy a dumb hub when a switched 8-port device is so cheap?
>>
>>
>>
>>
>>
>>
>>>
>>>I never got any system time on such a network so far and the 'big networks'
>>>still have huge latencies and it is very uncommon that they have over 100mbit
>>>network cards. I'm not speaking for the US here of course, can't judge
>>>things over there.
>>
>>
>>1-2ms latency is reachable on 100mbit networks.  That isn't horrible and
>>can be lived with.
>>
>>
>>
>>
>>>
>>>>
>>>>
>>>>>   e) where on networks with dual or quad nodes getting a speedup is
>>>>>      already hard, on networks with single-cpu nodes getting a positive
>>>>>      speedup is nearly impossible.
>>>>
>>>>I wouldn't go that far. Jonathan did pretty well several years ago using
>>>>10mbit non-switched (thickwire) ethernet.  It obviously is not as fast as
>>>>SMP machines, but it is better than nothing.
>>>
>>>Please don't compare a $0.001 program with nowadays strong chess programs.
>>
>>
>>Then please don't assume that "If I can't do it it can't be done".  It was done
>>15 years ago on non-switched 10mbit ethernet.  It can be done _better_ today on
>>100mbit or gigabit switched ethernet.
>>
>>
>>
>>
>>
>>>
>>>Get *any* speedup with crafty over a 10mbit network at 256 nodes
>>>and i'll believe you!
>>
>>I don't have any 10mbit networks.  But I will get a speedup on a 100mbit
>>network before too long.
>>
>>
>>
>>
>>
>>
>>>
>>>If you get over the square root speedup for crafty
>>>out of 256 node 10mbit network you'll earn a nobel
>>>prize for sure!
>>
>>
>>This has already been done.
>>
>>
>>
>>>
>>>Of course crafty compared with the normal crafty that's running on a single
>>>cpu K7. Not the special network crafty at 1 processor compared to the
>>>speed of the 256 node crafty.
>>>
>>>Because this is exactly the problem.
>>
>>
>>The "special network" crafty will be exactly the same when only using one
>>node.  Just like I don't lose a thing with the SMP crafty if it uses just one
>>processor.
>>
>>
>>
>>
>>>
>>>Jonathan's search depths, and the program that he uses to
>>>get them, are anything but impressive.
>>
>>
>>
>>So?  He didn't use null-move with R=2 or R=3, he didn't use it recursively.
>>That would put him right back in line with today's programs.
>>
>>
>>>
>>>>>I asked here some time ago for some volunteers and only got a few responses.
>>>>>Regrettably the mailing list didn't work anymore so i lost most email
>>>>>addresses, also not a single one has dual or quad machines. Getting a speedup
>>>>>from a network 100mbit with single cpu nodes is nearly impossible for
>>>>>an efficient program.
>>>>>
>>>>>Of course for the nodes a second it might look great, but that's not my
>>>>>goal.
>>>>>
>>>>>So in short you CAN get a huge nps but if you measure speedup in the depth
>>>>>you get at a dual versus a 8 node single cpu, then you will be hugely
>>>>>disappointed. The dual will outgun the 8 node anywhere if it's a 100mbit
>>>>>network.
>>>>
>>>>
>>>>I wouldn't bet on that myself, if the dual cpus are the same speed as the 8
>>>>networked cpus.  It will take some work, but getting 4x faster would not be
>>>>anywhere near impossible.
>>>
>>>With a 100 mbit network with crafty you'll not even get close to 1.7
>>
>>
>>
>>Care to make a wager?  I'll guarantee you you will lose.  But it is your
>>money to throw away..
>>
>>
>>
>>
>>>
>>>Best regards,
>>>Vincent