Computer Chess Club Archives


Subject: Re: MP system info

Author: Robert Hyatt

Date: 09:53:44 05/28/02


On May 28, 2002 at 10:06:42, Vincent Diepeveen wrote:

>On May 28, 2002 at 09:06:36, K. Burcham wrote:
>
>For computer chess that is way too optimistic, Kim.
>
>Programs like Cray Blitz or DIEP might do pretty well on
>8 processors, but Crafty, Fritz, SOS, Shredder, Patzer,
>Junior and similar programs
>scale pretty badly on 8 processors.


What on earth are you talking about when you mention Crafty?  I have
run Crafty on 16-CPU machines and it works just as well as it does on
4...  That is from actual testing, not "speculation".


>
>Bandwidth is not the issue here. Speedup is the issue here.
>


For the 8-way machines, bandwidth _is_ the issue.  4-way boxes use
4-way interleaving to provide enough memory bandwidth for 4 CPUs.
8-way boxes lose in two ways:  (1) they still use 4-way memory
interleaving;  (2) the cache coherency hardware treats the machine as
two "clusters" of 4 CPUs, making "inter-cluster" cache coherency less
efficient than coherency within a single 4-way cluster...



>If you split at random like most of these programs do, then
>you soon have major speedup problems.
>
>In the case of Patzer a big issue is that it uses tactical extensions
>a lot, so the search space is not identical (I don't even
>know whether it runs on 8 processors).
>
>So where Crafty gets a 1.7 speedup on 2 processors and about a 2.5 speedup
>on 4 processors at crucial moments in the game (when the score drops a
>little), the speedup on 8 processors for these random-splitting
>programs is horrible;

Crafty consistently runs at > 3.0 speedup on 4 processors.  I have posted
the data for several positional/tactical test positions that clearly prove
this.

Creating numbers out of the clear blue simply is not productive.
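
For comparison, the speedup figures quoted in this exchange can be put on
a common scale as parallel efficiency, i.e. speedup divided by processor
count.  A minimal sketch in Python; the function name and the labels are
illustrative only, and the numbers are just the claims made in the posts
above, not new measurements (the "> 3.0" claim is entered as its lower
bound, 3.0).

```python
def parallel_efficiency(speedup, cpus):
    """Fraction of ideal linear speedup actually achieved."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return speedup / cpus


# (label, claimed speedup, processor count) as stated in the thread.
claims = [
    ("quoted claim, 2 CPUs", 1.7, 2),
    ("quoted claim, 4 CPUs", 2.5, 4),
    ("reply's claim, 4 CPUs (lower bound)", 3.0, 4),
]

for label, speedup, cpus in claims:
    eff = parallel_efficiency(speedup, cpus)
    print(f"{label}: speedup {speedup} -> efficiency {eff:.1%}")
```

On these numbers the disagreement at 4 processors is between roughly
62% and at least 75% efficiency.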


>
>in some positions you get 10 times speedup, in other positions a
>2 times speedup. When you need the speedup you don't get it.

Perhaps not all programs behave this badly?


>
>Anyway, this is all a theoretical discussion. I am pretty sure ChessBase
>doesn't want to buy an 8-way Xeon system, even though they can afford
>the $100k easily.
>
>With regard to memory, I need to mention that memory is faster on
>these systems than on our dual systems, which are slow with respect
>to memory: memory access goes in parallel on the big machines; it
>doesn't on dual machines.

However, the 4-way and 8-way boxes share the _same_ memory system.


>
>Best regards,
>Vincent
>
>>In absolute terms, the 8-way Pentium 3 Xeon systems are only 44% faster than the
>>4-way ones, which means that with the 4 extra CPUs, the system only gets 1.76
>>CPUs worth of extra performance, which is poor value for money. This level of
>>scalability is not that surprising since each group of 4 CPUs share 0.8GByte/s
>>of memory bandwidth. As a side note, it seems likely though that 252.eon fits
>>almost perfectly into the 2MByte cache the Pentium 3 Xeons have as it gets
>>nearly linear scalability - the higher the cache hit rate, the less main memory
>>is needed, which leaves more for the other CPUs.
>>
>>Even worse, in some tests, the 8-way system actually does worse than the 4-way
>>system, and this could possibly be due to differences in the chipsets or because
>>the extra contention itself on the shared Pentium system bus causes efficiency
>>to drop. It's unlikely that the compilers/OS would have made much difference as
>>for each CPU type the tests were done at similar times with the same compilers.
>>
>>
>>http://www.aceshardware.com/read.jsp?id=45000338
>>
>>kburcham
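
The arithmetic behind the "1.76 CPUs" figure in the quoted article can be
reproduced as follows.  This is only a sketch of the calculation as it
appears to be done there: the 4-way system is treated as 4 CPUs' worth of
performance and the 44% gain is applied to that baseline.  The function
names are illustrative, not taken from the article.

```python
def extra_cpu_equivalents(base_cpus, relative_gain):
    """Extra performance, in units of one baseline CPU, gained by the
    larger system (relative_gain = 0.44 means 44% faster)."""
    return base_cpus * relative_gain


def marginal_efficiency(base_cpus, added_cpus, relative_gain):
    """Fraction of each added CPU that shows up as extra performance."""
    return extra_cpu_equivalents(base_cpus, relative_gain) / added_cpus


base_cpus = 4         # the 4-way Xeon system
added_cpus = 4        # CPUs added to reach the 8-way system
relative_gain = 0.44  # 8-way is 44% faster, per the quoted article

extra = extra_cpu_equivalents(base_cpus, relative_gain)
eff = marginal_efficiency(base_cpus, added_cpus, relative_gain)
print(f"extra performance: {extra:.2f} CPU equivalents")
print(f"efficiency of the added CPUs: {eff:.0%}")
```

In other words, under that assumption each of the 4 added CPUs
contributes a bit under half a CPU of useful work.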


