Author: Vincent Diepeveen
Date: 09:58:23 09/03/02
On September 03, 2002 at 12:54:05, Matthew Hull wrote:

>On September 03, 2002 at 12:33:12, Vincent Diepeveen wrote:
>
>>On September 03, 2002 at 12:28:55, Matthew Hull wrote:
>>
>>>On September 03, 2002 at 11:56:48, Vincent Diepeveen wrote:
>>>
>>>>We all know how many failures the past years parallel programs have been
>>>>when developed by scientists. This years diep show at the teras was no
>>>>exception to that. The 3 days preparation time i had to get
>>>>to the machine (and up to 5 days before tournament
>>>>i wasn't sure whether i would get system time *anyway*).
>>>>
>>>>However sponsors want to hear how well your thing did. At a 1024
>>>>processor machine (maximum allocation 512 processors within 1 partition
>>>>of shared memory) from which you get 60 with bandwidth of the memory
>>>>2 times slower than local ram, and let's not even *start* to discuss
>>>>the latency otherwise you will never start to fear diep using that
>>>>machine. All i can say about it is that the 20 times slowed down
>>>>Zugzwang was at 1999 at a machine with faster latency...
>>>>
>>>>I'm working hard now to get a DIEP DTS NUMA version ready.
>>>>
>>>>DTS it is because it is dynamic splitting wherever it wants to.
>>>>
>>>>Work for over a month fulltime has been done now. Tests at a dual K7
>>>>as well as dual supercomputer processors have been very positive.
>>>>
>>>>Nevertheless i worried about how to report about it. So i checked out the
>>>>article from Robert Hyatt again. Already in 1999 when i had implemented
>>>>a pc-DTS version i wondered why i never got near the speeds of bob
>>>>when i was not forward pruning other than nullmove. The 1999 world champs
>>>>version i had great speedups, but i could all explain them by forward
>>>>pruning which i was using at the time.
>>>>
>>>>Never i got close even dual xeon or quad xeon to speeds reported by Bob
>>>>in his DTS version described 1997. I concluded that it had to do with
>>>>a number of things, encouraged by Bob's statements.
>>>>In 99 bob explained
>>>>that splitting was very cheap at the cray. He copied a block with all
>>>>data of 64KB from processor 0 to P1 within 1 clock at the cray.
>>>>
>>>>I didn't know much of crays or supercomputers at the time, except that
>>>>they were out of my budget so i believed it. However i have a good memory
>>>>for certain numbers, so i have remembered his statement very well.
>>>>
>>>>In 2002 Bob explained the cray could copy 16 bytes each clock. A
>>>>BIG contradiction to his 1999 statement. No one here will wonder
>>>>about that, because regarding deep blue we have already seen hundreds
>>>>of contradicting statements from bob. Anyway, that makes
>>>>splitting at the cray of course very expensive, considering bob copied
>>>>64KB data for each split. Crafty is no exception here.
>>>>
>>>>I never believed the 2.0 speedup in his table at page 16 for 2 processors,
>>>>because if i do a similar test i sometimes get also > 2.0, usually less.
>>>>
>>>>Singular extensions hurt diep's speedup incredibly, but even today
>>>>i cannot get within a few minutes get to the speedup bob achieved in
>>>>his 1997 article.
>>>>
>>>>In 1999 i wondered about why his speedup was so good.
>>>>So Bob concluded he splitted in a smarter way when i asked.
>>>>Then i asked obviously how he splitted in cray blitz, because
>>>>what bob is doing in crafty is too horrible for DIEP to get a speedup
>>>>much above 1.5 anyway. I asked obviously how he splitted in cray blitz.
>>>>
>>>>The answer was: "do some statistical analysis yourself on game trees
>>>>to find a way to split well it can't be hard, i could do it too in
>>>>cray blitz but my source code is gone. No one has it anymore".
>>>>
>>>>So you can feel my surprise when he suddenly had data of crafty versus
>>>>cray blitz after 1999, which bob quotes till today into CCC to proof how
>>>>well his thing was.
>>>>
>>>>Anyway, i can analyze games as FM, so i already knew a bit about how well
>>>>this cray blitz was.
>>>>I never paid much attention to the lies of bob here.
>>>>
>>>>I thought he was doing this in order to save himself time digging up old
>>>>source code.
>>>>
>>>>Now after a month of fulltime work at DIEP at the supercomputer and having
>>>>it working great at a dual (and very little overhead) but still a bad
>>>>speedup i started worrying about my speedup and future article to write
>>>>about it.
>>>>
>>>>So a possible explanation for the bad speedup of todays software when compared
>>>>to bob's thing in 1993 and writing about it in 1997 is perhaps explained
>>>>by nullmove. Bob still denies this despite a lot of statistical data
>>>>at loads of positions (150 positions in total tried) with CRAFTY even.
>>>>
>>>>Bob doesn't find that significant results. Also he says that not a
>>>>single of MY tests is valid because i have a stupid PC with 2 processors
>>>>and bad RAM. a dual would hurt crafties performance too much.
>>>>
>>>>This because i concluded also that the speedup crafty gets here
>>>>is between 1.01 and 1.6 and not 1.7.
>>>>
>>>>Data suggests that crafties speedup at his own quad is about 2.8,
>>>>where he claims 3.1.
>>>>
>>>>Then bob referred back to his 1997 thesis that the testmethod wasn't good.
>>>>Because to get that 2.8 we used cleared hashtables and in his thesis he
>>>>cheats a little by not clearing the tables at all. to simulate a game
>>>>playing environment that's ok of course.
>>>>
>>>>However there is a small problem with his article. The search times and
>>>>speedup numbers are complete fraud. If i divide the times of 1 cpu by
>>>>the speedup bob claims he has, i get perfect numbers nearly.
>>>>
>>>>Here is the result for the first 10 positions based upon bob's article
>>>>march 1997 in icca issue #1 that year, the tables with the results
>>>>are on page 16:
>>>>
>>>>When diep searches at a position it is always a weird number.
>>>>If i claim a speedup of 1.8 then it is usually 1.7653 or 1.7920 or 1.8402
>>>>and so on. Not with bob.
>>>>Bob knows nothing from statistical analysis
>>>>of data (i must claim innocent here too but i am at least not STUPID
>>>>like bob here):
>>>>
>>>>pos  2       4       8      16
>>>>1    2.0000  3.40    6.50   9.09
>>>>2    2.00    3.60    6.50   10.39
>>>>3    2.0000  3.70    7.01   13.69
>>>>4    2.0000  3.90    6.61   11.09
>>>>5    2.0000  3.6000  6.51   8.98876
>>>>6    2.0000  3.70    6.40   9.50000
>>>>7    1.90    3.60    6.91   10.096
>>>>8    2.000   3.700   7.00   10.6985
>>>>9    2.0000  3.60    6.20   9.8994975 = 9.90
>>>>10   2.000   3.80    7.300  13.000000000000000
>>>>
>>>>This clearly PROOFS that he has cheated completely about all
>>>>search times from 1 processor to 8 processors. Of course
>>>>now that i am running myself at supercomputers i know what is
>>>>the problem. I only needed a 30 minute look a month ago
>>>>to see what is in crafty the problem and most likely that was
>>>>in cray blitz also the problem. The problem is that crafty
>>>>copies 44KB data or so (cray blitz 64KB) and while doing that
>>>>it is using smp_lock. That's too costly with more than 2 cpu's.
>>>>
>>>>This shows he completely lied about his speedups. All times
>>>>from 1-8 cpu's are complete fraud.
>>>>
>>>>There is however also evidence he didn't compare the same
>>>>versions. Cray Blitz node counts are also weird.
>>>>
>>>>The more processors you use the more overhead you have obviously.
>>>>Please don't get mad at me for calculating it in the next simple
>>>>but very convincing way. I will do it only for his first node
>>>>counts at 1..16 cpu's, the formula is:
>>>>  (nodes / speedup_i-cpu's ) * speedup_i+1_cpu's
>>>>
>>>>1 to 2 cpu's we don't need the math.
>>>>If you need exactly 2 times shorter to get to it but
>>>>thereby you need more nodes at more cpu's (where you need
>>>>expensive splits) then that's already weird of course, though
>>>>not impossible.
>>>>
>>>>2 to 4 cpu's:
>>>>  3.4 * (89052012 / 2.0) = 151388420.4 nodes.
>>>>  bob needed: 105.025.123 which in itself is possible.
>>>>  Simply like 40% overhead extra for 4 processors which 2 do
>>>>  not have.
>>>>  This is very well possible.
>>>>
>>>>4 to 8 cpu's:
>>>>  6.5 * 105025123 nodes / 3.4 = 200.783.323
>>>>  bob needed: 109MLN nodes
>>>>  That means at 8 cpu's the overhead is already approaching
>>>>  100% rapidly. This is very well possible. The more cpu's
>>>>  the bigger the overhead.
>>>>
>>>>8 to 16 cpu's:
>>>>  9.1 * (109467495 / 6.5) = 153254493
>>>>  bob needed: 155.514.410
>>>>
>>>>My dear fellow programmers. This is impossible.
>>>>
>>>>Where is the overhead?
>>>>
>>>>The factor 100% at least overhead?
>>>>
>>>>More likely factor 3 overhead.
>>>>
>>>>The only explanation i can come up with is that the node counts
>>>>from 2..8 processors are created by a different version from
>>>>Cray Blitz than the 16 processor version.
>>>>
>>>>From the single cpu version we already know the number of nodes gotta
>>>>be weird because it is using a smaller hashtable (see page 4.1 in the
>>>>article second line there after 'testing methodology').
>>>>
>>>>We talk about mass fraud here.
>>>>
>>>>Of course it is 5 years ago this article and i do not know whether
>>>>he created the table in 1993.
>>>>
>>>>How am i going to tell my sponsor that my speedup won't be the same
>>>>as that from the 1997 article? To whom do i compare, zugzwang?
>>>>'only' had on paper 50% speedup out of 512 processors. Of course also
>>>>something which is not realistic. However Feldmann documented most of
>>>>the things he did in order to cripple zugzwang to get a better speedup.
>>>>
>>>>A well known trick is to kick out nullmove and only use normal alfabeta
>>>>instead of PVS or other forms of search. Even deep blue did that :)
>>>>
>>>>But what do you guys think from this alternative book keeping from Bob?
>>>>
>>>>Best regards,
>>>>Vincent
>>>
>>>
>>>It sounds like you are saying in effect, "If I cannot duplicate Bob's
>>>performance numbers with DIEP, then Bob's claims are false".
>>
>>No. please look at the data.
>>
>>There is a 1 / 10^30 chance you get such data.
>>
>>In short he has made up the data. The search times he has 'invented'
>>himself.

I am not talking about my machine here. i am talking about the fraud committed by bob.

pos  2       4       8      16
1    2.0000  3.40    6.50   9.09
2    2.00    3.60    6.50   10.39
3    2.0000  3.70    7.01   13.69
4    2.0000  3.90    6.61   11.09
5    2.0000  3.6000  6.51   8.98876
6    2.0000  3.70    6.40   9.50000
7    1.90    3.60    6.91   10.096
8    2.000   3.700   7.00   10.6985
9    2.0000  3.60    6.20   9.8994975 = 9.90
10   2.000   3.80    7.300  13.000000000000000

There is a chance smaller than 1/10^30 that 'by accident' such numbers happen.

that's 0.0000000000000000000000000000001 with about 30 zero's before the 1 happens.

In short statistical analysis very clearly shows his fraud. I hope you realize in court statistical analysis is a legal method to proof you are right. It proofs clearly here his numbers are a big fraud and setup.

>Perhaps if you had a good understanding and experience of Cray architecture,
>your statement would have more weight. But, the supercomputer you are using is
>really very different from a Cray. That much I do know. You can't expect to
>get the same performance with a fundamentally different architecture.
>
>It's the same with the AMD versus XEON memory architecture. They're not the
>same. XEON with interleaved memory has an advantage here. Everyone acknowledges
>that an AMD is not going to get as good a speed up as a XEON with interleaved
>memory, as has been explained countless times already.
>
>The same is true for supercomputers. The designs and special hardware
>advantages differ significantly.
>
>You can't prove a lie by comparing apples to oranges.
>
>>
>>I hope you realize that.
>>
>>It shows very hard he cheated. There is no way to escape statistical
>>analysis, even though in computer chess most dudes do not know what it is.
>>
>>They do not know you can catch fraud with statistical analysis.
>>
>>Bob sure didn't.
>>
>>>To an outside observer, this would not necessarily follow.
>>>It remains to the
>>>reader to wonder if a person making such a statement is necessarily up to the
>>>task. You might be a great programmer. You might be journeyman programmer.
>>>You might be a sub-par programmer. How are we to know?
>>>
>>>I for one cannot simply take your word for it.
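
For readers who want to re-check the node-count arithmetic in the quoted post, here is a minimal Python sketch. It only reproduces the post's own formula, (nodes / speedup at i cpu's) * speedup at the next cpu count, using the position 1 figures quoted there. The names `extrapolated_nodes` and `compare` are made up for this illustration, and the sketch takes no position on whether the formula is a sound measure of parallel search overhead.

```python
# Position 1 figures as quoted in the post (speedup 9.1 at 16 cpus is the
# rounded value the post uses in its calculation; the table lists 9.09).
speedup = {2: 2.0, 4: 3.4, 8: 6.5, 16: 9.1}
nodes = {2: 89_052_012, 4: 105_025_123, 8: 109_467_495, 16: 155_514_410}


def extrapolated_nodes(nodes_prev, speedup_prev, speedup_next):
    """The post's formula: (nodes / speedup_i) * speedup_(i+1)."""
    return nodes_prev / speedup_prev * speedup_next


def compare(cpus_prev, cpus_next):
    """Return (extrapolated, reported, reported / extrapolated) for one step."""
    est = extrapolated_nodes(
        nodes[cpus_prev], speedup[cpus_prev], speedup[cpus_next]
    )
    reported = nodes[cpus_next]
    return est, reported, reported / est


if __name__ == "__main__":
    for a, b in [(2, 4), (4, 8), (8, 16)]:
        est, reported, ratio = compare(a, b)
        print(
            f"{a:>2} -> {b:<2} cpus: extrapolated {est:,.0f}, "
            f"reported {reported:,}, ratio {ratio:.2f}"
        )
```

Run as a script, it prints one line per step (2 to 4, 4 to 8, 8 to 16 cpu's) with the extrapolated count, the reported count, and their ratio, so the three cases the post discusses can be compared side by side.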
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.