Author: Vincent Diepeveen
Date: 18:21:37 07/10/03
On July 10, 2003 at 00:09:48, Jay Urbanski wrote:

>On July 09, 2003 at 00:18:25, Vincent Diepeveen wrote:
>
>>On July 08, 2003 at 23:10:03, Jay Urbanski wrote:
>>
>>>On July 08, 2003 at 23:03:01, Jay Urbanski wrote:
>>>
>>>>On July 07, 2003 at 23:35:45, Robert Hyatt wrote:
>>>>
>>>>>I have PVM running on our giganet switch, which is faster than myrinet. But,
>>>>>as I said, such clusters are _rare_. TCP/IP is the common cluster connection,
>>>>>for obvious reasons. And that's where the interest in clusters lies, not
>>>>>in how exotic a combination you can put together, but in what kind of
>>>>>performance you can extract from a common combination.
>>>>
>>>>Giganet is not faster than Myrinet - it's 1.25Gb/s compared to Myrinet's 2Gb/s
>>>>and it has higher latency. Giganet is also no longer being sold - it's a dead
>>>>technology. But such clusters aren't *that* rare - I count 57 Linux clusters
>>>>with fast interconnects (better than GigE) on the TOP500 list.
>>>>
>>>>Heck - if we had a decent MPI chess program available I bet any number of those
>>>>"exotic" clusters would sign up for an exhibition match with one of the
>>>>super-GMs. One thing they all have in common is that they *love* publicity.
>>>
>>>Assuming, of course, that such a program / hardware combination warranted such
>>>a match. :)
>>
>>You mean: that I need to pay for such a match?
>>
>>That's the opposite of what you posted in the message just before.
>>
>>Hell, the only guys that gave me a logon to their supercomputer was the Dutch
>>government, for which I thank them. And it is the world's fastest machine
>>(expressed in latency) that is giving away system time to such projects.
>>
>>Note that IBM only gave away system time on a poor 32 node cluster, with sick
>>high latencies. Each node was 100MHz (2 nodes 120MHz). Even in 1997 that
>>wasn't considered fast.
>>
>>Zugzwang uses MPI by the way. I remember Feldmann telling how hard it was for
>>him to get system time, and I can assure you: that's *definitely* the case.
>>
>>I'm glad I just had to write 1 page for each processor that I get. Otherwise
>>I would not have a life.
>
>No, I don't mean pay for a match. I mean it would have to be demonstrated that
>Diep (for example) running on a large cluster was significantly stronger than
>any other combination of chess-playing hardware/software out there. Then you

So after a software program proves it can convincingly beat Kasparov, and such a match can be organized again without any effort, you finally want to organize a DIEP - Yusupov match? To mention one of the GMs that didn't show up to play me (I had prepared in the masterclass to face Yusupov, but instead got some unknown GM from Germany, whom I then drew in 19 moves).

You are running in vague circles here. What I will tell you, as someone who has asked just about every company and organisation in the world, over and over again, for system time for DIEP, is that you're simply not telling the truth.

The problem with these machines is that a cpu hour is so sick insanely expensive. Let's be clear here and distinguish 2 groups. First of all there are the real supercomputers; let's skip the math, they are very expensive per cpu hour. If you take 500 cpus for a week, that's millions. Easy math.
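(To put a rough number on that "easy math", a minimal sketch in Python; the dollar rate per cpu hour is purely my own illustrative assumption, since no real rate is given here:)

  cpus = 500
  hours = 7 * 24                 # one week of wall clock time
  assumed_rate = 20.0            # $/cpu-hour -- hypothetical figure, not from the text
  print(f"{cpus * hours:,} cpu-hours at ${assumed_rate}/hour = ${cpus * hours * assumed_rate:,.0f}")
  # -> 84,000 cpu-hours = $1,680,000, and that is before any preparation time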
You will argue that there are cheap clusters available. So let's take the so-called cheap clusters: up to a factor 20 cheaper, sure. But let's do some practical math, done by someone who doesn't run them himself, so I might be forgetting some major costs here.

Let's assume some university department has a 4096 processor cluster. Let's do 3 things now: a) look how many cpu hours a year you effectively have, b) calculate a rough price for everything, c) calculate how expensive it is to let chess software run on such a cluster.

The slower the latency and the bigger the system, the more cpus you usually need as kernel servers or for harddisks etc. So effectively there is 10% of that machine you will *never* be able to use, right from the start. 4096 - 10% = 3686 processors left. Note that on small machines, up to 32 processors or so, such stuff never happens, but it is common on large scale machines, especially if they run cheapo Linux.

Secondly, how efficiently can a cluster be used, practically spoken? We must not pretend we are in a perfect world, of course. The average supercomputer, not to mention cluster, is poorly loaded, especially if we skip the time that gets used to benchmark the machine. Some cheapo machines get benchmarked for half a year...

How effectively does the average cluster get used? I know this is not a fair question. Some clusters are there for a company just to be there, in case they need some calculations done. So the machine idles 99% of the time, and then suddenly it must produce, within 20 hours on 500 cpus, the answer the dudes need. That's why they buy such a 1024 processor cluster... So if we take the *real* average, people will be laughing for a short period of time. 10%? Of course, until you calculate how effectively they use their own machines...

Now this 10% figure is not nice to use. We can better take the scientific clusters. Those are better loaded. Way better. In fact I have a big paper here which gets produced every year. One of the things it reports is how much effective computing power every nation has. Clusters are great for such reports, as they add a huge number of flops for a low price.

Still, how do you fill such a computer? There's in fact not much software that runs on it very well. The average 'cluster' is hardly loaded when compared to supercomputers. Of course the reason is trivial: supercomputers have very fast latencies for jobs up to 32 processors or so. An extra router is usually needed to get to 64 processors, and when using as many cpus as DIEP does, even the TERAS machine won't avoid latencies of 5 microseconds or more. 6-7, you guessed it.

There are not many applications that can use that many cpus. I know a few, because I talked to biologists who run on the computer, and I talked to oceanographic rendering dudes and a lot of visualisation guys (one of them a professor of 3d art on the computer: "I push the button and then the computer has to create wonderful graphics for me on a shitload of processors"). The vast majority is chemistry. The 1024 processor TERAS supercomputer in 2002 was used 54.9% for chemistry purposes. So not 54.9% of the total system time, but 54.9% of the system time actually used on the machine.

In 2002 the TERAS system was used effectively for 4,801,515 PNUs (processor node hours, where a node = 1 processor in this case). Note that this is probably allocated hours. I for example allocated 2 hours for a 130 processor job even when I needed it for only 10 minutes, because I get charged for 1/6 hour * 130, but the OS sees it as 260 hours. So let's simply use that number, because they calculated here that this is 53% of its 'theoretic capacity' and 70% of its economic capacity. My experience in supercomputing is that you must redo every math figure you see, as usually there's something wrong with it...

4,801,515 PNU / (365 days * 1024 processors * 24 hours a day) = 53.5%
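(Redoing their math in a couple of lines of Python, using only the report's own numbers:)

  pnu_2002 = 4_801_515            # allocated processor node hours on TERAS in 2002
  capacity = 365 * 1024 * 24      # theoretic capacity: 8,970,240 processor hours a year
  print(f"utilisation = {pnu_2002 / capacity:.1%}")
  # -> utilisation = 53.5%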
You see, those guys are great. I love them. They can do real math, in contradiction to manufacturers... Let's use this 53.5% figure further. Yet in the first year of its usage this was 33%. They claim in this report that 30% is normal for the first year of a new supercomputer, and that in the years after that this 53.5% is considered good. So let's take 45% as a rough estimate. You see, we lose a shitload of processors again:

45% * 4096 = 1843.2 virtually used processors.

Now a supercomputer just works for 3 years; after that it is 'outdated'. Also the tax office says you must write it off within 3 years, and not 2 or 1. So we assume 3 years.

How much power does this 4096 processor cluster use? Usually the processor itself is only 10% of the usage. Let's take a dual Xeon machine as the average: 100 watts a processor. That's the 10%. Then routers and network cards and harddisks etc. add the other 90%: 1000 watts per cpu. That's still cheap compared to the Earth machine by the way; that thing eats like 1400 watts per cpu effectively, and those cpus are just clocked at 500MHz, with just 1 big central router.

1000 watts a cpu * 4096 processors = 4 MW.

Now I happen to know quite a bit about high voltage power lines, as I was in a political committee about a 150KV line that powers the city of Utrecht (1 million inhabitants). If you use up 4 MW, then the power alone will already cost a lot:

365 days * 4000 KW * 24 hours * $0.10 / kilowatt hour = $3.5 million a year

Note that power in Europe is more than twice as expensive as that, especially between 6 AM and 12 AM. Companies here that start their machines after 6 AM get fined incredibly.

Now the cost of the machine: say 20 million dollar for 4096 processors. Then the personnel that watches the machine: you need at least 15 guys for that. They need offices and work places and a good salary, as they have to be skilled personnel, otherwise your cluster might not work correctly. For medium skilled personnel they count 50000 euro a year here, and these dudes already earn more than that. Let's say they'll cost 2 million a year, as you also need security 24 hours a day, because perhaps some thieves will be very strong and capable of lifting your stuff.

Then, to transport 4 MW you can't use underground cables of 10KV; those 'cheapo' cables won't work very well at 700A. You need something like 100KV or higher. That's like 40A then, which is a very economic way to transport 4 MW. Note that such big 100KV lines can also handle a couple of hundred amps without problems, so you can put more machines there.

There are however small problems with these lines. First of all they cost 1 million euro per 1000 meters (or 1/1.6 mile), and you pay a multiple of that to buy off the land and other stuff. Then, to scale down from 100KV to the voltage the machines need, you have to do it in at least 2 steps; the first step is from 100KV to something like 10KV. A problem of 100+KV, as you might know, is that within 5 meters of the cables you already get fried like hell. In the case of Australia there is also the law that within the 0.4 microtesla area no person may live or even exist. So assuming others use that 100KV cable too, and something like 200A runs over it, then you talk about 50 meters of distance and in some bad cases up to 100 meters. The more water in the air, the worse it is. So you can't have that power station near the supercomputer either.

Then, electrical equipment gets disturbed a lot when you get above 2 microtesla, so it's not just a matter of connecting a cable to the power station that steps it down to 10KV. Such a power station is very expensive: they needed something like 30 million euro here to move 1 power station from the inner city to outside the city. All that cost gets paid indirectly by the government here, just to give that cluster power. If we add all those costs, the machine suddenly looks very expensive of course. The big luck is that you can use that power line for another 50 years; say 1 million euro added per year. Then we have the location costs for wherever the cluster sits, busy doing its exercises.

So we have 33% of 20 million + 1 million + 3.5 million + 2 million = 13.1 million dollar a year.

So for that record breaking attempt, where you filled a sports hall full of machines and called it a cluster, the PNU cost is something like:

13,100,000 / (1843.2 * 24 * 365) = $0.81

And *that* is very cheap compared to what real supercomputers cost per PNU (processor node hour). But the bad news is that those machines aren't created to serve big jobs like DIEP. If I would allocate 500 cpus of such a machine for a match that takes a week, then of course I also need at least 2 weeks of preparation time at such a machine, as I want to test, play some test matches, optimize for the hardware, and produce test results to show how good the machine is, etcetera. So what I need is 3 * 1 week, or roughly 90000 hours (that's what I got now too). Then we speak about a present of 72000 dollar.
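(For whoever wants to redo the whole estimate: the same rough numbers as above in a few lines of Python; the 45% load, the 3 year write-off and the 90000 hours are the estimates from the text, nothing new is assumed:)

  cpus          = 4096
  eff_cpus      = cpus * 0.45             # ~45% average load -> 1843.2 virtually used cpus
  machine_share = 0.33 * 20_000_000       # $20M machine written off over 3 years
  power_cost    = 365 * 24 * 4000 * 0.10  # 4 MW at $0.10/kWh -> ~$3.5M a year
  personnel     = 2_000_000               # skilled staff plus 24 hour security
  power_line    = 1_000_000               # yearly share of the 100KV line and station
  yearly_cost   = machine_share + power_line + power_cost + personnel   # ~$13.1M
  pnu_cost      = yearly_cost / (eff_cpus * 24 * 365)
  print(f"${pnu_cost:.2f} a PNU, ${pnu_cost * 90_000:,.0f} for a 90000 hour match")
  # -> $0.81 a PNU, i.e. roughly the 72000 dollar of system time mentioned above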
So you won't ever see such machines toying at the ICC. And most likely even for prestigious matches they won't get friendly either. No way, sir. For a match versus GM Jeroen Piket (46th of the world with FIDE=2646 when I played him in September 2002; he has quit chess now, regrettably, but has a good reason to do so) I am sure that I won't get 500 cpus at any of your clusters, Mr Urbanski. Only when I show up with a golden coin, like a match versus Kasparov, will they consider giving away 72000 dollar of system time.

Note that for the readers who managed to follow this posting so far, the truth is even more sad. The important tournaments and matches are *always* at times when such clusters have their peak usage. Take the world champs: end of November, the busiest month of the year for a supercomputer. End of August is no fun either. The best time of the year for a supercomputer/cluster is end of July / start of August, or the middle of April (one of the rain months in Europe). On average, at the end of November the machine is loaded 3 times more than in the middle of April.

>might have a chance to convince Braingames or whoever that the next Man/Machine
>contest should use a cluster for the Machine side.

I will not consider doing business with Braingames *ever*, unless I get forced to. Voluntarily? NO. Such swindlers I do not want to do business with. Read extensively at Eric Schiller's homepage about the corruption within Braingames, especially the corruption that went on around 'qualifying' for Kramnik-Fritz. Of course Chessbase tried that; Chessbase would have been stupid not to, and I do not blame Chessbase for it. But the money grabbing within Braingames is really sick. It gets documented there with quoted emails, so for the Jeremiah dudes there is a lot of proof on paper. However, the worst emails are not even posted there. I have a few of them here, of course. No, I won't post them. Someone gave them to me in big trust.
Now, I do not know who within Braingames is corrupt or not; as not all the relevant emails are quoted, the emails posted at Eric's homepage sure are protecting a few persons (yes, more than 1) who are completely sick corrupt. I do not say everyone within Braingames is corrupt. Nevertheless, because of how things work within the organisation, the organisation as a whole is completely corrupt. Anyway, it doesn't matter anymore. Aren't they already bankrupt?

I will never, ever in my life consider doing business with someone who was in the Braingames organisation around that time, with the exception of the persons who stepped out of that organisation because of the scandals within it. This is a matter of principle. History has proven that Chessbase has other principles. That's their choice. If they clearly post that they will never again do business with Braingames and/or the persons within that organisation who were clearly corrupt, then I would take my words back. Matter of fact is that Chessbase is very good at doing business with organisations that fall apart, or get caught up in some big scandal, directly after they did business with Chessbase. Off the top of my head I can name at least 10 different cases, but I leave it to the reader who follows RGCC and CCC and the several court cases everywhere to figure that out.

The main point is that Braingames can try to sell me any match they like. I'll go to the GM in question and let him sign himself. That way you avoid useless organisations like Braingames directly. Being a titled player myself, I have no problem approaching them. In fact a number of GMs have already approached me, sometimes with their face crying out loud: "please give me a match against Diep". Being a grandmaster and getting over 40 years old is not always fun. Fighting every tournament for a few coins at the different locations on the globe is only fun when you belong to the top 40 of the world. Or the top 10, when you have (RUS) behind your name on the FIDE list. So doing business with GMs is very easy. However, most GMs do not care shit about computerchess. They prefer some money in advance, and if you pay well, then they will give a show where your program looks fine. Are you looking for such a type of match?

>Now I'll readily admit that I'm not aware of all the chess politics that goes
>into organizing these matches so maybe I'm wrong - but I think part of the
>appeal of the Deep Blue / Kasparov match was that Deep Blue was such a monster
>on paper at least. (32 CPUs and several hundred dedicated chess chips)

First of all, it played Kasparov. Win or lose, that means people guess you are at the same level; otherwise you would not play Kasparov, because only the best in the world get a chance to play Kasparov, right?

Secondly, we should not underestimate the huge marketing department.

Third, they focused upon 1 thing (nodes per second, though without proof; as you say, just on paper), which means the marketing department could do the rest.

Fourth point, and this is *real* important: the marketing department gets a lot of support from the scientific world. For them Deep Blue is the proof they indirectly use that if you get access to big resources, it is good for the company, or especially the government, to give them access to big resources.

Fifth point: because of points 2 and 4, there were zero negative comments on the thing, which means the effect gets quadrupled.
Sixth point: what most chessplayers had already realized by 1990, because Kasparov lost some games to Genius back in 1989 (if I recall the date accurately), is that mankind is beatable. However, by Kasparov losing this match, Deep Blue showed very clearly that mankind was beatable. Very clearly, without a single refutation possible. No matter how bad the games were, Deep Blue *did* very clearly show that. And it is that realization, that technology was advancing, which indirectly speaks to the imagination of mankind. A good marketing department does the rest then. And IBM's sure did.