Author: Robert Hyatt
Date: 09:08:03 09/05/02
On September 05, 2002 at 10:43:44, Vincent Diepeveen wrote:

>On September 05, 2002 at 00:27:21, Uri Blass wrote:
>
>look at all the speedup numbers now posted on crafty.
>not a single one ends at 1.90; it's all different numbers.
>
>There are other things. Look, i never believed in bob's
>math, but i'm not just posting it just like that.
>
>When i saw it in 1997 i already knew something bad
>was going on, because 2.0 is getting claimed as the speedup
>in a load of positions.
>
>Nothing about permanent brain mentioned, and there are
>other flaws in the test method.

Flaws in the test method are open to debate. I set up the test method to
answer a question _I_ wanted to answer, not one _you_ wanted to answer.
But discussing it is not a problem for me. Calling the data a "fraud"
_is_ a problem, however.

>
>Nothing of it i call good science. The only thing he did
>better back then than Bob is doing now is that we talk about
>24 positions instead of 1 weak test position (a test position
>is weak if any parallelism which is proven to be bad
>has a good speedup at it).
>
>I have had, the last 2 months, a number of bad DIEP versions,
>because i was rewriting my parallelism from focussing upon
>locking in shared memory to using as few locks as possible,
>which therefore also allowed me to rewrite it to use mainly
>local memory.

Hand-waving excuses??? You are full of 'em... Why you are going to win
the next WCCC. Then afterward, why you didn't win it... Excuses don't
replace results...

>
>The first few versions were just random splitting and losing
>all kinds of 'possible' split points. The search was bug-free in
>the sense that there were no bugs then (such a basic search is
>easier to get bug-free than a complex search).
>
>All these versions score a 1.9 speedup (overhead is very little)
>at the BK 22 test position.
>
>However, at other test positions i tried, the average speedup
>was real bad. Some versions came to only 1.4 speedup.
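That last point, that one favorable position can mask a weak average, is easy to make concrete. The times below are invented purely for the sketch (they are not from any actual DIEP or Crafty run): one position divides out to a near-ideal 1.90 while the average over the set is far lower.

```python
# Hypothetical (1-cpu time, 2-cpu time) in seconds for four test
# positions; all numbers are made up to illustrate the point that a
# single "weak" test position can hide a poor average speedup.
times = {
    "BK22": (180.0,  94.7),   # the one favorable position
    "pos2": (180.0, 150.0),
    "pos3": (180.0, 128.6),
    "pos4": (180.0, 138.5),
}

speedups = {name: t1 / tn for name, (t1, tn) in times.items()}
average = sum(speedups.values()) / len(speedups)

for name, s in speedups.items():
    print(f"{name}: {s:.2f}")
print(f"average: {average:.2f}")
```

Judging the parallel search by BK22 alone suggests a 1.90 speedup; averaging over all four positions gives roughly 1.45, which is the kind of gap being argued about here.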
>
>I need to add one important thing for people who wonder why
>speedups of diep are always better than those of other programs.
>
>DIEP is a very slow searching program. My evaluation is big.

A classic "justification" that should convince _everybody_ of its
accuracy...

>
>Getting just 1 cache line in shared memory of a K7 is like 400 clocks
>or so. If a program is searching at like 40000 clocks a node, then
>you do not feel that as much as a program which is like 4000 clocks
>a node does. 400 clocks is 1/10 of 4000.
>
>To copy 3KB of data for a split point, first of all both processors
>are not searching (i call that the idle state, if you don't mind). So
>for the number of clocks it costs the first processor, which was searching,
>to split, you lose that directly.

Crap, crap and more crap. First, I posted some stat output for you
yesterday from a 3 minute search. It showed 700+ splits done in
three-plus minutes. How long does it take to copy 3KB roughly 700
times, out of a total time of 3 minutes? Not even a fraction of a
second. So crap, and more crap. Furthermore, Crafty sees one cpu spend
about 2-3% of its time "spinning" waiting for work. This is given in
the search statistics as well. More crap.

>
>No big losses in DIEP here. The only cost is that they need to
>ship the move list if requested by another processor. We talk about
>16 moves or so. That's 16 x 4 = 64 bytes. So usually within 1 cache line.

Usually within 2. Most machines have 32-byte cache lines.

>
>Basically a searching processor just searches on in DIEP.
>Processors take care themselves that they get a split point.
>The overhead for the searching processors is near zero.
>
>In crafty, which is getting a single-cpu 1 million nodes a second on this
>1.6GHz machine, that means it is 1600 million clocks / 1 million nodes
>= 1600 clocks a node on average.
>
>A penalty of 3 KB of data to copy to the other processor, that's
>*really* hurting it.
>It would require a more in-depth study to see how
>many cache lines it is losing on average to it here.

Practically none, is the answer. Just do the math. You can have it do a
search and see how many "splits" it does. It's not a big number, so the
cost is not a big number either. Hand-waving...

>
>And in the meantime the searching process is idling too.

For the time taken to copy 3KB, which is probably a few microseconds at
worst, 700 times in 3 minutes. A +big+ loss...

>
>So it is logical that crafty loses loads of system time.

Yes. Almost a millisecond, or maybe in bad cases a whole second, of one
cpu's time, in three minutes of total searching. A 1/720th loss. Very
big. Needs lots of work.

>
>>On September 04, 2002 at 19:06:33, martin fierz wrote:
>>
>>>On September 04, 2002 at 17:57:06, Robert Hyatt wrote:
>>>
>>>>On September 04, 2002 at 17:16:09, martin fierz wrote:
>>>>
>>>>>On September 04, 2002 at 13:06:37, Robert Hyatt wrote:
>>>>>
>>>>>>On September 04, 2002 at 11:56:29, Uri Blass wrote:
>>>>>>
>>>>>>>On September 04, 2002 at 10:25:38, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On September 04, 2002 at 02:47:20, Uri Blass wrote:
>>>>>>>>
>>>>>>>>>I here agree with GCP.
>>>>>>>>>If Vincent's target was to convince the sponsor
>>>>>>>>>not to look at the speedup of crayblitz as real, he probably
>>>>>>>>>succeeded.
>>>>>>>>>
>>>>>>>>>He does not need to prove that the results of the
>>>>>>>>>speedup are a lie, but only to convince them
>>>>>>>>>not to trust the results.
>>>>>>>>>
>>>>>>>>>The times, and not the speedup, are the important information.
>>>>>>>>>
>>>>>>>>>Times are calculated first, and speedup is calculated only
>>>>>>>>>later, after knowing the times.
>>>>>>>>
>>>>>>>>I've said it several times, but once more won't hurt, I guess.
>>>>>>>>
>>>>>>>>The original speedup numbers came _directly_ from the log files, which
>>>>>>>>had _real_ times in them. The nodes and times were added _way_ later.
>>>>>>>>Once you have a speedup for 2, 4, 8 and 16 processors, you can _clearly_
>>>>>>>>(and _correctly_) reconstruct either the time, or the nodes searched,
>>>>>>>>or both. We _had_ to calculate the nodes searched for reasons already given.
>>>>>>>>It is possible that the times were calculated in the same way. I didn't do
>>>>>>>>that personally, and without the "log eater" I can't confirm whether it was
>>>>>>>>done or not.
>>>>>>>>
>>>>>>>>If you don't trust the speedups, that's something you have to decide, and it
>>>>>>>>really doesn't matter to me, since that program is no longer playing anyway. In
>>>>>>>>fact, I don't have any source code for the thing, as that was one of the many
>>>>>>>>things lost when I lost the logs and everything else.
>>>>>>>>
>>>>>>>>But, as I said, the paper was about the _performance_. And the speedup
>>>>>>>>numbers were direct computations from raw data. I consider _that_ to be
>>>>>>>>the important data presented in the paper, along with the description of how
>>>>>>>>the algorithm worked.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>Usually we tend to trust scientists, but if the information
>>>>>>>>>about times is wrong, then it means that
>>>>>>>>>we cannot trust the other details in the article.
>>>>>>>>
>>>>>>>>So if the _main_ data is correct, and is then used to calculate something
>>>>>>>>else, the something-else can't be trusted, and therefore neither can the
>>>>>>>>main data???
>>>>>>>>
>>>>>>>>Perhaps I am missing something...
>>>>>>>
>>>>>>>If the something else (the times) was originally used to calculate the main
>>>>>>>data, then there is a problem.
>>>>>>>
>>>>>>>The information that was used to calculate the main data is no less important
>>>>>>>than the main data itself, and if we do not have correct information about it,
>>>>>>>there is a problem trusting the main data (it is clear that we had wrong
>>>>>>>information about the times).
>>>>>>>
>>>>>>>Uri
>>>>>>
>>>>>>Uri, follow closely:
>>>>>>
>>>>>>1. I computed the speedups by using a log eater that ate the raw search logs
>>>>>>and grabbed the times, then computed those and wrote the results out in a
>>>>>>simple table, exactly as it appears in the article. The speedups came right
>>>>>>from the raw data.
>>>>>>
>>>>>>2. We needed (much later) to make a similar table with node counts. We could
>>>>>>not directly obtain this because it wasn't in the logs, as I have explained
>>>>>>previously, because the tests were not run to a fixed depth, but came from a
>>>>>>real game where iterations were rarely finished before time ran out. We
>>>>>>computed the node counts by using the one-processor node counts, which we
>>>>>>_could_ get, and then using some internal performance measures gathered
>>>>>>during the 2, 4, 8 and 16 cpu runs.
>>>>>>
>>>>>>3. The time table is something I simply don't recall. It is certainly possible
>>>>>>that we computed that the same way we computed the node counts, but note that
>>>>>>I am talking about doing steps 2 and 3 several years _after_ the original test
>>>>>>was run and the raw speedup table was computed.
>>>>>
>>>>>bob, follow closely :-)
>>>>>
>>>>>even though you do not remember, the data in the table is *obviously* not really
>>>>>measured time. if you just divide the time for 1 processor by the time for n
>>>>>processors you see that immediately - all numbers come out as 1.7 or 1.9 or 7.3,
>>>>>or something very close like 1.703. all 2nd digits after the "." come out as 0.
>>>>>the probability of this happening for random data is 10 to the -24...
>>>>>therefore, you certainly did it for the times too.
>>>>
>>>>Note I am not disagreeing. I simply remember having to do it for the nodes,
>>>>because of the problem in measuring them.
>>>>I do not remember doing it (or not
>>>>doing it) for the times, so as I said, it was likely done that way, but I am
>>>>not going to say "it absolutely was" without being sure... Which I am not...
>>>
>>>but do you understand the argument? even if you do not remember, and even if you
>>>are not sure, the probability that you did not measure these numbers is about
>>>0.999999999999999999999999 = 1-(10^-24). now if that is not enough for you to
>>>say "it absolutely was" then i don't know ;-)
>>>
>>>aloha
>>> martin
>>
>>I agree that the data is enough to be sure that
>>the times are not measured times, but I have one correction
>>and some comments.
>>
>>10^-24 is not the probability that he measured the numbers,
>>but the probability of always getting 0 in the second
>>digit of the division when we assume that he measured
>>the numbers.
>>
>>I do not know how to calculate the probability that he measured the
>>numbers, because I have no a priori distribution
>>of belief, but even if I believed with 99.999% confidence that
>>he did measure the numbers before reading the information,
>>the results should be enough to convince me to change my mind.
>>
>>More than 99.999% is too much trust for anybody.
>>
>>I also have a problem with using the data to calculate a probability,
>>even with an a priori distribution,
>>because I do not have a defined test with H0 and H1.
>>
>>I found something strange with probability 10^-24, but
>>the probability of finding something strange with
>>probability 10^-24 may be more than 10^-24, because
>>there may be other
>>strange data that I did not think about.
>>
>>On the other hand, the strange thing is not only the 0 in the second digit;
>>there is a 0 in the third digit in most of the cases.
>>
>>Another point is that
>>10^-24 is the probability only if we assume
>>a uniform distribution.
>>
>>This is a good estimate, but
>>I guess that it is not exactly correct.
>>
>>Uri
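The digit pattern being argued about can be checked mechanically. The sketch below uses invented (time_1cpu, time_ncpu) pairs, not values from the published table: when the n-cpu times were back-computed from speedups already rounded to two decimals, every ratio shows a 0 in its second decimal place, whereas genuinely measured times almost never do. (Under the uniform-digit assumption, 24 such zeros in a row has probability 10^-24, as discussed above.)

```python
# Hypothetical (time_1cpu, time_ncpu) pairs; the values are invented for
# illustration and are NOT taken from the Cray Blitz paper's table.
measured_like = [(300.0, 171.3), (300.0, 158.9), (300.0, 41.7)]

# n-cpu times back-computed from speedups that were already rounded to
# two decimals (1.70, 1.90, 7.30) -- the pattern martin describes.
derived_like = [(300.0, 300.0 / 1.7), (300.0, 300.0 / 1.9), (300.0, 300.0 / 7.3)]

def second_decimal_digit(x: float) -> int:
    """Second digit after the decimal point of x, rounded to two places:
    1.703 -> 0, 1.75 -> 5."""
    return int(round(x * 100)) % 10

def all_second_digits_zero(pairs) -> bool:
    """True if every speedup t1/tn in the table ends in a zero digit."""
    return all(second_decimal_digit(t1 / tn) == 0 for t1, tn in pairs)

print(all_second_digits_zero(measured_like))  # ratios 1.75, 1.89, 7.19 -> False
print(all_second_digits_zero(derived_like))   # ratios 1.70, 1.90, 7.30 -> True
```

With only three positions the "all zeros" outcome already has probability 10^-3 under the uniform assumption; over the 24 positions of the original table it becomes the 10^-24 figure quoted in the thread.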
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.