Author: Vincent Diepeveen
Date: 05:49:48 07/08/03
On July 07, 2003 at 10:48:02, Robert Hyatt wrote:

>On July 05, 2003 at 23:37:47, Jay Urbanski wrote:
>
>>On July 04, 2003 at 23:33:46, Robert Hyatt wrote:
>>
>><snip>
>>>"way better than MPI". Both use TCP/IP, just like PVM. Except that MPI/OpenMP
>>>is designed for homogeneous clusters while PVM works with heterogeneous mixes.
>>>But for any of the above, the latency is caused by TCP/IP, _not_ the particular
>>>library being used.
>>
>>With latency a concern I don't know why you'd use TCP/IP as the transport for
>>MPI when there are much faster ones available.
>>
>>Even VIA over Ethernet would be an improvement.
>
>I use VIA over ethernet, and VIA over a cLAN giganet switch as well. The
>cLAN hardware produces .5usec latency which is about 1000X better than any

Bob, the latencies that I quote are RASML: Random Average Shared Memory Latencies.

The latencies that you quote here are sequential latencies. Bandwidth divided by the number of seconds = latency (according to the manufacturers).

For computer chess that can't be used, however.

You can get a more accurate indication by using the well-known ping pong program. It ships messages over MPI and then WAITS for them to come back. Then it divides that time by 2. The result is called the one-way ping pong latency.

If you multiply that by 2, you already get closer to the latency that it takes to get a single bitboard out of memory.

Even better is using the RASML test I wrote. That one uses OpenMP, though; conversion to MPI is trivial (yet it slows things down so much that it is less accurate than OpenMP).

So the best indication you can get is by doing a simple ping pong latency test.

The best network cards are Myrinet cards (about $1300). I do not know which chipset they have. On 64-bit PCI-X at 133 MHz (Jay might know more about the specifications here) they can achieve about 5 usec one-way ping pong latency, so that's a minimum of way more than 10 usec to get a bitboard from the other side of the machine.

In your cluster you probably do not have such PCI stuff, Bob. Most likely it is around 10 usec one-way latency at your cluster, so it takes at minimum 20 usec to get a message.

Note that getting a cache line out of the local memory of your quad Xeons already takes about 0.5 usec.

You can hopefully imagine that the usecs quoted by the manufacturer for cLAN are based upon bandwidth / time needed, and NOT on RASM latencies.

Best regards,
Vincent

>TCP/IP-ethernet implementation. However, ethernet will never touch good
>hardware like the cLAN stuff.
>
>MPI/PVM use ethernet - tcp/ip for one obvious reason: "portability" and
>"availability". :)
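
For readers who want to try the ping pong measurement described in the post, here is a minimal sketch in Python. It is not the MPI ping pong program or the RASML test mentioned above: as an assumption for illustration, it bounces messages over a local `socket.socketpair()` between two threads, so the numbers it prints say nothing about Myrinet, cLAN, or any cluster. The function names (`one_way_latency_usec`, `min_remote_fetch_usec`) are hypothetical. It only shows the method: send a message, WAIT for it to come back, divide the round-trip time by 2, then double the one-way figure to get a lower bound for fetching data from the other side.

```python
import socket
import threading
import time


def recv_exact(conn, size):
    """Read exactly `size` bytes from the socket."""
    buf = bytearray()
    while len(buf) < size:
        chunk = conn.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def echo_peer(conn, rounds, size):
    """The 'pong' side: receive each message and send it straight back."""
    for _ in range(rounds):
        conn.sendall(recv_exact(conn, size))


def one_way_latency_usec(rounds=1000, size=8):
    """Average round-trip time divided by 2, in microseconds.

    size=8 matches one 64-bit bitboard. The transport is a local
    socket pair (an assumption of this sketch), not MPI.
    """
    ping, pong = socket.socketpair()
    peer = threading.Thread(target=echo_peer, args=(pong, rounds, size))
    peer.start()
    payload = b"\x00" * size
    start = time.perf_counter()
    for _ in range(rounds):
        ping.sendall(payload)      # ship the message
        recv_exact(ping, size)     # wait for it to come back
    elapsed = time.perf_counter() - start
    peer.join()
    ping.close()
    pong.close()
    return elapsed / rounds / 2 * 1e6


def min_remote_fetch_usec(one_way_usec):
    """Lower bound for a remote fetch: request one way, data back the other."""
    return 2.0 * one_way_usec


if __name__ == "__main__":
    one_way = one_way_latency_usec()
    print(f"one-way ping pong latency: {one_way:.2f} usec")
    print(f"minimum remote fetch:      {min_remote_fetch_usec(one_way):.2f} usec")
```

With the figures from the post, `min_remote_fetch_usec(5.0)` gives the 10 usec minimum quoted for the Myrinet case and `min_remote_fetch_usec(10.0)` gives the 20 usec estimated for the cluster.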
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.