Computer Chess Club Archives



Subject: Superdome, montecito and opteron

Author: Vincent Diepeveen

Date: 04:48:16 07/03/05



On July 01, 2005 at 22:27:02, Eugene Nalimov wrote:

>On July 01, 2005 at 21:38:28, Robert Hyatt wrote:
>
>>...
>>
>>My point was that Hydra is most _certainly_ not some new level of computer chess
>>as stated by Adams.  I wouldn't argue against it being the best computer chess
>>entity at the moment.  But it is absolutely _not_ head and shoulders above
>>others.  The advantage I have is that I have a lot of experience with parallel
>>and distributed search, and know the losses that a distributed search entails
>>compared to a pure SMP approach.  And even if they are currently reaching 200M
>>nodes per second, which I somehow doubt given the FPGA numbers they have
>>published in the past, that is not _that_ much faster than other readily
>>available hardware.  I've seen numbers well beyond 20M for Crafty on a quad
>>dual-core opteron, for example.  I've seen numbers more than double that on
>>other machines I can't really mention at the moment.  So they are not _that_ far
>>beyond today's programs.  Clearly Adam's comments are based on some other
>>reality or understanding that is not based on factual analysis.
>
>Today you can buy Itanium2 64-CPU system at
>http://www.hp.com/products1/servers/integrity/superdome_high_end/

Superdome sucks for chess. Just show up with crafty on a superdome. I'm 100%
convinced you can get one for crafty at the 2005 world champs if crafty can WORK
on one.

Effectively an 8 processor machine equipped with 2.2Ghz dual core processors is
cheaper (around 25k euro) and far faster in terms of time per ply.

Linear scaling would give 2.2 mln * 16 = 35.2 mln nps; after parallel losses
crafty will get around 25 mln nps on that 8 processor dual core, whereas it will
NEVER work well on the superdome when using 64 cpu's.
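
A quick check of that throughput arithmetic, as a minimal sketch: the per-core figure (2.2 mln nps), the core count (16) and the expected total (25 mln nps) are taken from this post, while the function name `scaling_efficiency` is only made up for illustration.

```python
# Figures quoted in the post; the helper name is hypothetical.
def scaling_efficiency(nps_per_core, cores, observed_nps):
    """Return (linear_nps, efficiency) for a multi-core search."""
    linear_nps = nps_per_core * cores          # what perfect scaling would give
    return linear_nps, observed_nps / linear_nps

linear_nps, efficiency = scaling_efficiency(2.2e6, 16, 25e6)
print(f"linear: {linear_nps / 1e6:.1f} mln nps, efficiency: {efficiency:.0%}")
```

So an estimate of 25 mln nps on 16 cores means keeping roughly 70% of the linear figure.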

What does the superdome cost, half a million dollars or so? The Altix3000 series
(64 processor itanium2, similar to superdome) is $1 mln a machine.

It's SLOWER and more EXPENSIVE than an 8 processor opteron dual core.

If you look at the supercomputer/cluster/beowulf mailing lists NOW, you will
realize that the die hard intel fans there are busy stepping over to dual core
opterons.

Some will step over to the CELL processor, because by the time montecito is
released, CELL will be there too.

Although i fail to see how CELL will deliver them 256 gflop (the caches can't
feed 256 gflop to the cpu's), it is still obvious that it will manage a minimum
of 32+ gflop even for 'average' optimized code.

That will of course blow away any itanium machine.

>Last time I measured Crafty run at ~1.5Mnps on one Itanium2 CPU. So with some
>additional work (avoid cache conflicts, maybe introduce smaller local hash to be
>probed at the last ply or two) Crafty can hit ~100Mnps on such beast.

"additional work"

The only program that has a working NUMA algorithm that works at 128 cores is
DIEP.

Starting from a superior SMP algorithm, it took 2 years of hard debugging work
to get that to work.

Please note that superdome has the same latencies from processor 0 to 64 as a
cluster has. Any program that works fine on a superdome also works fine on a
cluster with good network cards.

I'm sure you can do it yourself Eugene, but it is 'some' work to get crafty
working on a single image cluster like superdome/altix3000 are.

Unless of course you are happy with a 4.0 speedup out of 128 cores on a
superdome.
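
To put that speedup in perspective, here is a small sketch of the usual definition of parallel efficiency (speedup divided by number of cores). The 4.0 and 128 come from the post; the function name is just an assumption for the example.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of the ideal (linear) speedup that is actually achieved."""
    return speedup / cores

naive_port = parallel_efficiency(4.0, 128)
print(f"efficiency at 4.0x on 128 cores: {naive_port:.3f}")
```

In other words, at a 4.0 speedup about 97% of the machine is wasted.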

Please note that HP sold all their staff to intel. Sold is probably a big word.
Probably intel got 'em for free, as it just costs money.

Montecito hasn't been released yet, and right now even in floating point the
dual core opteron is FASTER than an itanium2.

itanium2 delivers about 7 gflop. A dual core opteron 8.8 gflop.

That's pure MATRIX calculation stuff, which opteron wasn't designed for.

On a dual core opteron 2.2Ghz, crafty gets > 4 mln nps. That's far superior to
the 1.5 mln nps of an itanium2.

Montecito will be faster than that, no doubt, but for a dying processor line,
it's too little, too late, too expensive. Most importantly too late.

The previous itanium2 cost around $8000 apiece when introduced, if you bought
1000 of them.

If they do that with montecito too, that means that even the CHEAPEST possible
price they can sell a 64 processor superdome for *starts* at:

$8000 * 64 = $512,000, so about half a million dollars.

Add another 250k dollars for highend equipment, compilers, software, 24 hour
support and electricity bills, and you're past three quarters of a million,
heading toward a million, for a 64 processor superdome.
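
The price estimate can be reproduced in a few lines. This is only a sketch of the arithmetic using figures quoted in this post ($8000 per cpu, 64 cpu's, 250k of extras, and the $18000 opteron box mentioned further down); the constant names are made up.

```python
# All figures are the ones quoted in the post, in US dollars.
CPU_PRICE_USD = 8_000      # itanium2 price per cpu in lots of 1000
CPU_COUNT = 64
EXTRAS_USD = 250_000       # equipment, compilers, software, support, power
OPTERON_BOX_USD = 18_000   # 8 processor dual core box from the beowulf list

cpu_total = CPU_PRICE_USD * CPU_COUNT
system_total = cpu_total + EXTRAS_USD
price_ratio = system_total / OPTERON_BOX_USD

print(f"cpu's only: ${cpu_total:,}")
print(f"with extras: ${system_total:,}")
print(f"ratio to the opteron box: {price_ratio:.0f}x")
```

The cpu's alone come to $512,000 and the total to $762,000, which is some 40 times the price of the opteron box.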

Now that 64 processor superdome is EFFECTIVELY slower (time to plydepth) for
crafty than an 8 processor dual core board is.

For diep the superdome is obviously faster. Opteron is relatively faster for
crafty than it is for DIEP. Diep profits more from the itanium2 in that respect.

So let's not keep posting too optimistically here about superdomes, unless you
want to post it in the context of Diep.

Diep runs fine on them, if you can afford one.

>For less than $40k you can buy reasonable configured 8 sockets / 16 cores

A company posted on the beowulf list that they recently bought a completely
configured 8 processor dual core for 18000 dollars. I guess that's with 1.8Ghz
dual cores.

>Opteron system. For example take a look at
>http://www.pcsforeveryone.com/product_info.php?cPath=1967&products_id=14101&customize=true
>
>Crafty should run at ~30-35Mnps on such system.

It'll lose a lot. Let's say 25 mln.

Also the hashtable usage must be scaled back, say by no longer storing entries
with 1 ply depthleft.
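
As a rough illustration of that idea, here is a minimal sketch of a hashtable that refuses to store shallow entries. This is NOT crafty's actual hashing code: the class, its method names and the `min_store_depth` parameter are all hypothetical, and a real engine would use a fixed-size array with packed entries rather than a dict.

```python
class TranspositionTable:
    """Toy hashtable that skips entries with too little depthleft."""

    def __init__(self, min_store_depth=2):
        # Entries with depth_left below this threshold are never stored,
        # which saves memory traffic on machines with slow remote memory.
        self.min_store_depth = min_store_depth
        self.entries = {}  # key -> (depth_left, score)

    def store(self, key, depth_left, score):
        """Store a result; return True only if the table was written."""
        if depth_left < self.min_store_depth:
            return False
        old = self.entries.get(key)
        if old is not None and old[0] > depth_left:
            return False  # keep the deeper, more valuable entry
        self.entries[key] = (depth_left, score)
        return True

    def probe(self, key, depth_left):
        """Return a stored score if it was searched deep enough, else None."""
        entry = self.entries.get(key)
        if entry is not None and entry[0] >= depth_left:
            return entry[1]
        return None


tt = TranspositionTable(min_store_depth=2)
tt.store(0xABCDEF, 1, 10)   # skipped: only 1 ply depthleft
tt.store(0xABCDEF, 3, 25)   # stored
print(tt.probe(0xABCDEF, 2))
```

With `min_store_depth=2` the 1 ply result never touches the table, so only the deeper searches pay the cost of a shared memory write.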

The memory subsystem of the itanium2 is really weak.

>Both those systems are NUMA, not clusters, so search should be more efficient.

>Thanks,
>Eugene




Last modified: Thu, 15 Apr 21 08:11:13 -0700
