Author: Uri Blass
Date: 21:27:21 09/04/02
On September 04, 2002 at 19:06:33, martin fierz wrote:

>On September 04, 2002 at 17:57:06, Robert Hyatt wrote:
>
>>On September 04, 2002 at 17:16:09, martin fierz wrote:
>>
>>>On September 04, 2002 at 13:06:37, Robert Hyatt wrote:
>>>
>>>>On September 04, 2002 at 11:56:29, Uri Blass wrote:
>>>>
>>>>>On September 04, 2002 at 10:25:38, Robert Hyatt wrote:
>>>>>
>>>>>>On September 04, 2002 at 02:47:20, Uri Blass wrote:
>>>>>>
>>>>>>>
>>>>>>>I here agree with GCP
>>>>>>>If Vincent's target was to convince the sponsor
>>>>>>>not to look at the speedup of crayblitz as real he probably
>>>>>>>suceeded.
>>>>>>>
>>>>>>>He does not need to prove that the results of the
>>>>>>>speed up are a lie but only to convince them
>>>>>>>not to trust the results.
>>>>>>>
>>>>>>>The times and not the speedup are the important information.
>>>>>>>
>>>>>>>Times are calculated first and speedup is calculated only
>>>>>>>later after knowing the times.
>>>>>>
>>>>>>I've said it several times, but once more won't hurt, I guess.
>>>>>>
>>>>>>The original speedup numbers came _directly_ from the log files. Which
>>>>>>had _real_ times in them. The nodes and times were added _way_ later.
>>>>>>Once you have a speedup for 2,4,8 and 16 processors, you can _clearly_
>>>>>>(and _correctly_) reconstruct either the time, or the nodes searched,
>>>>>>or both. We _had_ to calculate the nodes searched for reasons already given.
>>>>>>It is possible that the times were calculated in the same way. I didn't do
>>>>>>that personally, and without the "log eater" I can't confirm whether it was
>>>>>>done or not.
>>>>>>
>>>>>>If you don't trust the speedups, that's something you have to decide, and it
>>>>>>really doesn't matter to me since that program is no longer playing anyway. In
>>>>>>fact, I don't have any source code for the thing as that was one of the many
>>>>>>things lost when I lost the logs and everything else.
>>>>>>
>>>>>>But, as I said, the paper was about the _performance_. And the speedup
>>>>>>numbers were direct computations from raw data. I consider _that_ to be
>>>>>>the important data presented in the paper, along with the description of how
>>>>>>the algorithm worked.
>>>>>>
>>>>>>>
>>>>>>>Usually we tend to trust scientists but if the information
>>>>>>>about times is wrong then it means that
>>>>>>>we cannot trust the other details in the article.
>>>>>>
>>>>>>So if the _main_ data is correct, and is then used to calculate something
>>>>>>else, the something-else can't be trusted, and therefore neither can the
>>>>>>main data???
>>>>>>
>>>>>>Perhaps I am missing something...
>>>>>
>>>>>If the something else(times) was originally used to calculate the main data then
>>>>>there is a problem.
>>>>>
>>>>>The information that was used to calculate the main data is not less important
>>>>>than the main data and if we have not correct information about the information
>>>>>there is a problem to trust the main data(it is clear that we had wrong
>>>>>information about times).
>>>>>
>>>>>Uri
>>>>
>>>>Uri, follow closely:
>>>>
>>>>1. I computed the speedups by using a log eater that ate the raw search logs
>>>>and grabbed the times, and then computed those and wrote the results out in a
>>>>simple table, exactly as it appears in the article. The speedups came right
>>>>from the raw data.
>>>>
>>>>2. We needed (much later) to make a similar table with node counts. We could
>>>>not directly obtain this because it wasn't in the logs, as I have explained
>>>>previously, because the tests were not run to a fixed depth, but came from a
>>>>real game where iterations were rarely finished before time ran out. We
>>>>computed the node counts by using the one-processor node counts which we _could_
>>>>get, and then using some internal performance measures gathered during the
>>>>2,4,8 and 16 cpu runs.
>>>>
>>>>3. the time table is something I simply don't recall. It is certainly possible
>>>>that we computed that the same way we computed the node counts, but note that
>>>>I am talking about doing step 2 and 3 several years _after_ the original test
>>>>was run and the raw speedup table was computed.
>>>
>>>bob, follow closely :-)
>>>
>>>even though you do not remember, the data in the table is *obviously* not really
>>>measured time. if you just divide the time for 1 processor by the time for n
>>>processors you see that immediately - all numbers come out as 1.7 or 1.9 or 7.3
>>>or something very close like 1.703. all 2nd digits after the . come out as 0.
>>>the probability for this happening for random data is 10 to the -24...
>>>therefore, you certainly did it for the times too.
>>
>>Note I am not disagreeing. I simply remember having to do it for the nodes,
>>because of the problem in measuring them. I do not remember doing it (or not
>>doing it) for the times, so as I said, it was likely done that way, but I am
>>not going to say "it absolutely was" without being sure... Which I am not...
>
>but do you understand the argument? even if you do not remember, and even if you
>are not sure, the probablity that you did not measure these numbers is about
>0.999999999999999999999999 = 1-(10^-24). now if that is not enough for you to
>say "it absolutely was" then i don't know ;-)
>
>aloha
> martin

I agree that the data is enough to be sure that the times are not measured times, but I have one correction and some comments.

10^-24 is not the probability that he measured the numbers. It is the probability of always getting 0 in the second digit of the division, assuming that he did measure the numbers.

I do not know how to calculate the probability that he measured the numbers because I have no a priori distribution of belief, but even if I had believed with 99.999% confidence that he did measure the numbers before reading the information, the results should be enough to convince me to change my mind. More than 99.999% is too much trust for anybody.

I also have a problem with using the data to calculate a probability even with an a priori distribution, because I do not have a defined test with H0 and H1.

I found something strange that has probability 10^-24, but the probability of finding something strange with probability 10^-24 may be more than 10^-24, because there may be other strange data that I did not think about.

On the other hand, the strange thing is not only the 0 in the second digit: there is also a 0 in the third digit in most of the cases.

Another point is that 10^-24 is the probability only if we assume a uniform distribution. This is a good estimate, but I guess that it is not exactly correct.

Uri
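
To make the distinction between the two probabilities concrete, here is a minimal Python sketch of the update described above. It rests on assumptions that are not in the thread: that the 10^-24 figure comes from 24 independent ratios whose second decimal digit is uniform over 0 to 9, and that reconstructed (not measured) times would show the all-zero pattern with probability 1. The function name `posterior_measured` is only illustrative.

```python
from fractions import Fraction


def posterior_measured(prior_measured, p_pattern_if_measured, p_pattern_if_reconstructed):
    """Bayes' rule: P(times were measured | all-zero second digits observed)."""
    numerator = prior_measured * p_pattern_if_measured
    denominator = numerator + (1 - prior_measured) * p_pattern_if_reconstructed
    return numerator / denominator


# Assumption: 24 independent ratios, each second digit uniform over 0..9.
p_pattern = Fraction(1, 10) ** 24

# A very trusting prior: 99.999% belief that the times were really measured.
prior = Fraction(99999, 100000)

# Assumption: times rebuilt from rounded speedups always show the pattern.
post = posterior_measured(prior, p_pattern, Fraction(1))

print(float(p_pattern))  # likelihood of the pattern if the times were measured
print(float(post))       # posterior belief that the times were measured
```

Under these assumptions the posterior comes out at roughly 10^-19, not 10^-24: the likelihood and the posterior are different numbers, and even a 99.999% prior does not survive the evidence. A different prior or a non-uniform digit distribution would change the exact value.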
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.