Author: Vincent Diepeveen
Date: 18:21:37 07/10/03
On July 10, 2003 at 00:09:48, Jay Urbanski wrote:

>On July 09, 2003 at 00:18:25, Vincent Diepeveen wrote:
>
>>On July 08, 2003 at 23:10:03, Jay Urbanski wrote:
>>
>>>On July 08, 2003 at 23:03:01, Jay Urbanski wrote:
>>>
>>>>On July 07, 2003 at 23:35:45, Robert Hyatt wrote:
>>>>
>>>>>I have PVM running on our giganet switch, which is faster than myrinet. But,
>>>>>as I said, such clusters are _rare_. TCP/IP is the common cluster connection,
>>>>>for obvious reasons. And that's where the interest in clusters lies, not
>>>>>in how exotic a combination you can put together, but in what kind of
>>>>>performance you can extract from a common combination.
>>>>
>>>>Giganet is not faster than Myrinet - it's 1.25Gb/s compared to Myrinet's 2Gb/s
>>>>and it has higher latency. Giganet is also no longer being sold - it's a dead
>>>>technology. But such clusters aren't *that* rare - I count 57 Linux clusters
>>>>with fast interconnects (better than GigE) on the TOP500 list.
>>>>
>>>>Heck - if we had a decent MPI chess program available I bet any number of those
>>>>"exotic" clusters would sign up for an exhibition match with one of the
>>>>super-GMs. One thing they all have in common is that they *love* publicity.
>>>
>>>Assuming, of course, that such a program / hardware combination warranted such
>>>a match. :)
>>
>>You mean: that I need to pay for such a match?
>>
>>That's the opposite of what you posted in the message just before.
>>
>>Hell, the only guys that gave me a logon to their supercomputer was the Dutch
>>government, for which I thank them. And it is the world's fastest machine
>>(expressed in latency) that is giving away system time to such projects.
>>
>>Note that IBM only gave away system time on a poor 32 node cluster, with sick
>>high latencies. Each node was 100MHz (2 nodes 120MHz). Even in 1997 that
>>wasn't considered fast.
>>
>>Zugzwang uses MPI by the way. I remember Feldmann telling how hard it was for
>>him to get system time, and I can assure you: that's *definitely* the case.
>>
>>I'm glad I just had to write 1 page for each processor that I get. Otherwise
>>I would not have a life.
>
>No, I don't mean pay for a match. I mean it would have to be demonstrated that
>Diep (for example) running on a large cluster was significantly stronger than
>any other combination of chess-playing hardware/software out there. Then you

So after a software program proves it can convincingly beat Kasparov, and such a match can be organized again without any effort, you finally want to organize a DIEP - Yusupov match? To mention one of the GMs that didn't show up to play me (I had prepared in the masterclass to face Yusupov, but instead got some unknown GM from Germany, whom I then drew in 19 moves).

You are running in vague circles here. What I will tell you, as someone who has asked just about every company and organisation in the world, over and over again, for system time for DIEP, is that you're simply not telling the truth.

The problem with these machines is that a cpu hour is so sick insanely expensive. Let's be clear here and distinguish 2 groups. First of all there are the real supercomputers; let's skip the math, they are very expensive per cpu hour. If you take 500 cpus for a week, that's millions. Easy math.
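(To put a rough number on that "easy math", a minimal sketch in Python; the dollar rate per cpu hour is purely my own illustrative assumption, since no real rate is given here:)

  cpus = 500
  hours = 7 * 24                 # one week of wall clock time
  assumed_rate = 20.0            # $/cpu-hour -- hypothetical figure, not from the text
  print(f"{cpus * hours:,} cpu-hours at ${assumed_rate}/hour = ${cpus * hours * assumed_rate:,.0f}")
  # -> 84,000 cpu-hours = $1,680,000, and that is before any preparation time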
You will argue that there are cheap clusters available. So let's take the so-called cheap clusters: up to a factor 20 cheaper, sure. But let's do some practical math, done by someone who doesn't run them himself, so I might be forgetting some major costs here.

Let's assume some university department has a 4096 processor cluster. Let's do 3 things now: a) look how many cpu hours a year you effectively have, b) calculate a rough price for everything, c) calculate how expensive it is to let chess software run on such a cluster.

The slower the latency and the bigger the system, the more cpus you usually need as kernel servers or for harddisks etc. So effectively there is 10% of that machine you will *never* be able to use, right from the start. 4096 - 10% = 3686 processors left. Note that on small machines, up to 32 processors or so, such stuff never happens, but it is common on large scale machines, especially if they run cheapo Linux.

Secondly, how efficiently can a cluster be used, practically spoken? We must not pretend we are in a perfect world, of course. The average supercomputer, not to mention cluster, is poorly loaded, especially if we skip the time that gets used to benchmark the machine. Some cheapo machines get benchmarked for half a year...

How effectively does the average cluster get used? I know this is not a fair question. Some clusters are there for a company just to be there, in case they need some calculations done. So the machine idles 99% of the time, and then suddenly it must produce, within 20 hours on 500 cpus, the answer the dudes need. That's why they buy such a 1024 processor cluster... So if we take the *real* average, people will be laughing for a short period of time. 10%? Of course, until you calculate how effectively they use their own machines...

Now this 10% figure is not nice to use. We can better take the scientific clusters. Those are better loaded. Way better. In fact I have a big paper here which gets produced every year. One of the things it reports is how much effective computing power every nation has. Clusters are great for such reports, as they add a huge number of flops for a low price.

Still, how do you fill such a computer? There's in fact not much software that runs on it very well. The average 'cluster' is hardly loaded when compared to supercomputers. Of course the reason is trivial: supercomputers have very fast latencies for jobs up to 32 processors or so. An extra router is usually needed to get to 64 processors, and when using as many cpus as DIEP does, even the TERAS machine won't avoid latencies of 5 microseconds or more. 6-7, you guessed it.

There are not many applications that can use that many cpus. I know a few, because I talked to biologists who run on the computer, and I talked to oceanographic rendering dudes and a lot of visualisation guys (one of them a professor of 3d art on the computer: "I push the button and then the computer has to create wonderful graphics for me on a shitload of processors"). The vast majority is chemistry. The 1024 processor TERAS supercomputer in 2002 was used 54.9% for chemistry purposes. So not 54.9% of the total system time, but 54.9% of the system time actually used on the machine.

In 2002 the TERAS system was used effectively for 4,801,515 PNUs (processor node hours, where a node = 1 processor in this case). Note that this is probably allocated hours. I for example allocated 2 hours for a 130 processor job even when I needed it for only 10 minutes, because I get charged for 1/6 hour * 130, but the OS sees it as 260 hours. So let's simply use that number, because they calculated here that this is 53% of its 'theoretic capacity' and 70% of its economic capacity. My experience in supercomputing is that you must redo every math figure you see, as usually there's something wrong with it...

4,801,515 PNU / (365 days * 1024 processors * 24 hours a day) = 53.5%
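(Redoing their math in a couple of lines of Python, using only the report's own numbers:)

  pnu_2002 = 4_801_515            # allocated processor node hours on TERAS in 2002
  capacity = 365 * 1024 * 24      # theoretic capacity: 8,970,240 processor hours a year
  print(f"utilisation = {pnu_2002 / capacity:.1%}")
  # -> utilisation = 53.5%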
You see, those guys are great. I love them. They can do real math, in contradiction to manufacturers... Let's use this 53.5% figure further. Yet in the first year of its usage this was 33%. They claim in this report that 30% is normal for the first year of a new supercomputer, and that in the years after that this 53.5% is considered good. So let's take 45% as a rough estimate. You see, we lose a shitload of processors again:

45% * 4096 = 1843.2 virtually used processors.

Now a supercomputer just works for 3 years; after that it is 'outdated'. Also the tax office says you must write it off within 3 years, and not 2 or 1. So we assume 3 years.

How much power does this 4096 processor cluster use? Usually the processor itself is only 10% of the usage. Let's take a dual Xeon machine as the average: 100 watts a processor. That's the 10%. Then routers and network cards and harddisks etc. add the other 90%: 1000 watts per cpu. That's still cheap compared to the Earth machine by the way; that thing eats like 1400 watts per cpu effectively, and those cpus are just clocked at 500MHz, with just 1 big central router.

1000 watts a cpu * 4096 processors = 4 MW.

Now I happen to know quite a bit about high voltage power lines, as I was in a political committee about a 150KV line that powers the city of Utrecht (1 million inhabitants). If you use up 4 MW, then the power alone will already cost a lot:

365 days * 4000 KW * 24 hours * $0.10 / kilowatt hour = $3.5 million a year

Note that power in Europe is more than twice as expensive as that, especially between 6 AM and 12 AM. Companies here that start their machines after 6 AM get fined incredibly.

Now the cost of the machine: say 20 million dollar for 4096 processors. Then the personnel that watches the machine: you need at least 15 guys for that. They need offices and work places and a good salary, as they have to be skilled personnel, otherwise your cluster might not work correctly. For medium skilled personnel they count 50000 euro a year here, and these dudes already earn more than that. Let's say they'll cost 2 million a year, as you also need security 24 hours a day, because perhaps some thieves will be very strong and capable of lifting your stuff.

Then, to transport 4 MW you can't use underground cables of 10KV; those 'cheapo' cables won't work very well at 700A. You need something like 100KV or higher. That's like 40A then, which is a very economic way to transport 4 MW. Note that such big 100KV lines can also handle a couple of hundred amps without problems, so you can put more machines there.

There are however small problems with these lines. First of all they cost 1 million euro per 1000 meters (or 1/1.6 mile), and you pay a multiple of that to buy off the land and other stuff. Then, to scale down from 100KV to the voltage the machines need, you have to do it in at least 2 steps; the first step is from 100KV to something like 10KV. A problem of 100+KV, as you might know, is that within 5 meters of the cables you already get fried like hell. In the case of Australia there is also the law that within the 0.4 microtesla area no person may live or even exist. So assuming others use that 100KV cable too, and something like 200A runs over it, then you talk about 50 meters of distance and in some bad cases up to 100 meters. The more water in the air, the worse it is. So you can't have that power station near the supercomputer either.

Then, electrical equipment gets disturbed a lot when you get above 2 microtesla, so it's not just a matter of connecting a cable to the power station that steps it down to 10KV. Such a power station is very expensive: they needed something like 30 million euro here to move 1 power station from the inner city to outside the city. All that cost gets paid indirectly by the government here, just to give that cluster power. If we add all those costs, the machine suddenly looks very expensive of course. The big luck is that you can use that power line for another 50 years; say 1 million euro added per year. Then we have the location costs for wherever the cluster sits, busy doing its exercises.

So we have 33% of 20 million + 1 million + 3.5 million + 2 million = 13.1 million dollar a year.

So for that record breaking attempt, where you filled a sports hall full of machines and called it a cluster, the PNU cost is something like:

13,100,000 / (1843.2 * 24 * 365) = $0.81

And *that* is very cheap compared to what real supercomputers cost per PNU (processor node hour). But the bad news is that those machines aren't created to serve big jobs like DIEP. If I would allocate 500 cpus of such a machine for a match that takes a week, then of course I also need at least 2 weeks of preparation time at such a machine, as I want to test, play some test matches, optimize for the hardware, and produce test results to show how good the machine is, etcetera. So what I need is 3 * 1 week, or roughly 90000 hours (that's what I got now too). Then we speak about a present of 72000 dollar.
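(For whoever wants to redo the whole estimate: the same rough numbers as above in a few lines of Python; the 45% load, the 3 year write-off and the 90000 hours are the estimates from the text, nothing new is assumed:)

  cpus          = 4096
  eff_cpus      = cpus * 0.45             # ~45% average load -> 1843.2 virtually used cpus
  machine_share = 0.33 * 20_000_000       # $20M machine written off over 3 years
  power_cost    = 365 * 24 * 4000 * 0.10  # 4 MW at $0.10/kWh -> ~$3.5M a year
  personnel     = 2_000_000               # skilled staff plus 24 hour security
  power_line    = 1_000_000               # yearly share of the 100KV line and station
  yearly_cost   = machine_share + power_line + power_cost + personnel   # ~$13.1M
  pnu_cost      = yearly_cost / (eff_cpus * 24 * 365)
  print(f"${pnu_cost:.2f} a PNU, ${pnu_cost * 90_000:,.0f} for a 90000 hour match")
  # -> $0.81 a PNU, i.e. roughly the 72000 dollar of system time mentioned above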
So you won't ever see such machines toying at the ICC. And most likely even for prestigious matches they won't get friendly either. No way, sir. For a match versus GM Jeroen Piket (46th of the world with FIDE=2646 when I played him in September 2002; he has quit chess now, regrettably, but has a good reason to do so) I am sure that I won't get 500 cpus at any of your clusters, Mr Urbanski. Only when I show up with a golden coin, like a match versus Kasparov, will they consider giving away 72000 dollar of system time.

Note that for the readers who managed to follow this posting so far, the truth is even more sad. The important tournaments and matches are *always* at times when such clusters have their peak usage. Take the world champs: end of November, the busiest month of the year for a supercomputer. End of August is no fun either. The best time of the year for a supercomputer/cluster is end of July / start of August, or the middle of April (one of the rain months in Europe). On average, at the end of November the machine is loaded 3 times more than in the middle of April.

>might have a chance to convince Braingames or whoever that the next Man/Machine
>contest should use a cluster for the Machine side.

I will not consider doing business with Braingames *ever*, unless I get forced to. Voluntarily? NO. Such swindlers I do not want to do business with. Read extensively at Eric Schiller's homepage about the corruption within Braingames, especially the corruption that went on around 'qualifying' for Kramnik-Fritz. Of course Chessbase tried that; Chessbase would have been stupid not to, and I do not blame Chessbase for it. But the money grabbing within Braingames is really sick. It gets documented there with quoted emails, so for the Jeremiah dudes there is a lot of proof on paper. However, the worst emails are not even posted there. I have a few of them here, of course. No, I won't post them. Someone gave them to me in big trust.
Now, I do not know who within Braingames is corrupt or not; as not all the relevant emails are quoted, the emails posted at Eric's homepage sure are protecting a few persons (yes, more than 1) who are completely sick corrupt. I do not say everyone within Braingames is corrupt. Nevertheless, because of how things work within the organisation, the organisation as a whole is completely corrupt. Anyway, it doesn't matter anymore. Aren't they already bankrupt?

I will never, ever in my life consider doing business with someone who was in the Braingames organisation around that time, with the exception of the persons who stepped out of that organisation because of the scandals within it. This is a matter of principle. History has proven that Chessbase has other principles. That's their choice. If they clearly post that they will never again do business with Braingames and/or the persons within that organisation who were clearly corrupt, then I would take my words back. Matter of fact is that Chessbase is very good at doing business with organisations that fall apart, or get caught up in some big scandal, directly after they did business with Chessbase. Off the top of my head I can name at least 10 different cases, but I leave it to the reader who follows RGCC and CCC and the several court cases everywhere to figure that out.

The main point is that Braingames can try to sell me any match they like. I'll go to the GM in question and let him sign himself. That way you avoid useless organisations like Braingames directly. Being a titled player myself, I have no problem approaching them. In fact a number of GMs have already approached me, sometimes with their face crying out loud: "please give me a match against Diep". Being a grandmaster and getting over 40 years old is not always fun. Fighting every tournament for a few coins at the different locations on the globe is only fun when you belong to the top 40 of the world. Or the top 10, when you have (RUS) behind your name on the FIDE list. So doing business with GMs is very easy. However, most GMs do not care shit about computerchess. They prefer some money in advance, and if you pay well, then they will give a show where your program looks fine. Are you looking for such a type of match?

>Now I'll readily admit that I'm not aware of all the chess politics that goes
>into organizing these matches so maybe I'm wrong - but I think part of the
>appeal of the Deep Blue / Kasparov match was that Deep Blue was such a monster
>on paper at least. (32 CPUs and several hundred dedicated chess chips)

First of all, it played Kasparov. Win or lose, that means people guess you are at the same level; otherwise you would not play Kasparov, because only the best in the world get a chance to play Kasparov, right?

Secondly, we should not underestimate the huge marketing department.

Third, they focused upon 1 thing (nodes per second, though without proof; as you say, just on paper), which means the marketing department could do the rest.

Fourth point, and this is *real* important: the marketing department gets a lot of support from the scientific world. For them Deep Blue is the proof they indirectly use that if you get access to big resources, it is good for the company, or especially the government, to give them access to big resources.

Fifth point: because of points 2 and 4, there were zero negative comments on the thing, which means the effect gets quadrupled.
Sixth point: what most chessplayers had already realized by 1990, because Kasparov lost some games to Genius back in 1989 (if I recall the date accurately), is that mankind is beatable. However, by Kasparov losing this match, Deep Blue showed very clearly that mankind was beatable. Very clearly, without a single refutation possible. No matter how bad the games were, Deep Blue *did* very clearly show that. And it is that realization, that technology was advancing, which indirectly speaks to the imagination of mankind. A good marketing department does the rest then. And IBM's sure did.