Author: martin fierz
Date: 14:16:09 09/04/02
On September 04, 2002 at 13:06:37, Robert Hyatt wrote:
>On September 04, 2002 at 11:56:29, Uri Blass wrote:
>
>>On September 04, 2002 at 10:25:38, Robert Hyatt wrote:
>>
>>>On September 04, 2002 at 02:47:20, Uri Blass wrote:
>>>
>>>>
>>>>I here agree with GCP
>>>>If Vincent's target was to convince the sponsor
>>>>not to look at the speedup of crayblitz as real he probably
>>>>succeeded.
>>>>
>>>>He does not need to prove that the results of the
>>>>speed up are a lie but only to convince them
>>>>not to trust the results.
>>>>
>>>>The times and not the speedup are the important information.
>>>>
>>>>Times are calculated first and speedup is calculated only
>>>>later after knowing the times.
>>>
>>>I've said it several times, but once more won't hurt, I guess.
>>>
>>>The original speedup numbers came _directly_ from the log files. Which
>>>had _real_ times in them. The nodes and times were added _way_ later.
>>>Once you have a speedup for 2,4,8 and 16 processors, you can _clearly_
>>>(and _correctly_) reconstruct either the time, or the nodes searched,
>>>or both. We _had_ to calculate the nodes searched for reasons already given.
>>>It is possible that the times were calculated in the same way. I didn't do
>>>that personally, and without the "log eater" I can't confirm whether it was
>>>done or not.
>>>
>>>If you don't trust the speedups, that's something you have to decide, and it
>>>really doesn't matter to me since that program is no longer playing anyway. In
>>>fact, I don't have any source code for the thing as that was one of the many
>>>things lost when I lost the logs and everything else.
>>>
>>>But, as I said, the paper was about the _performance_. And the speedup
>>>numbers were direct computations from raw data. I consider _that_ to be
>>>the important data presented in the paper, along with the description of how
>>>the algorithm worked.
>>>
>>>>
>>>>Usually we tend to trust scientists but if the information
>>>>about times is wrong then it means that
>>>>we cannot trust the other details in the article.
>>>
>>>So if the _main_ data is correct, and is then used to calculate something
>>>else, the something-else can't be trusted, and therefore neither can the
>>>main data???
>>>
>>>Perhaps I am missing something...
>>
>>If the something else(times) was originally used to calculate the main data then
>>there is a problem.
>>
>>The information that was used to calculate the main data is not less important
>>than the main data and if we have not correct information about the information
>>there is a problem to trust the main data(it is clear that we had wrong
>>information about times).
>>
>>Uri
>
>Uri, follow closely:
>
>1. I computed the speedups by using a log eater that ate the raw search logs
>and grabbed the times, and then computed those and wrote the results out in a
>simple table, exactly as it appears in the article. The speedups came right
>from the raw data.
>
>2. We needed (much later) to make a similar table with node counts. We could
>not directly obtain this because it wasn't in the logs, as I have explained
>previously, because the tests were not run to a fixed depth, but came from a
>real game where iterations were rarely finished before time ran out. We
>computed the node counts by using the one-processor node counts which we _could_
>get, and then using some internal performance measures gathered during the
>2,4,8 and 16 cpu runs.
>
>3. the time table is something I simply don't recall. It is certainly possible
>that we computed that the same way we computed the node counts, but note that
>I am talking about doing step 2 and 3 several years _after_ the original test
>was run and the raw speedup table was computed.

bob, follow closely :-)

even though you do not remember, the data in the table is *obviously* not really
measured time.
if you just divide the time for 1 processor by the time for n processors you see
that immediately - all numbers come out as 1.7 or 1.9 or 7.3 or something very
close like 1.703. all 2nd digits after the . come out as 0. the probability for
this happening for random data is 10 to the -24... therefore, you certainly did
it for the times too.

the real point is that there is *no way* you could have measured those search
times, and that if you were to claim you really did measure them, you would be a
*proven* fraud. but if, as you say, you measured the speedup to 1 digit, and not
the real time, then it all makes sense - except that you did something you
shouldn't really do...

aloha
martin

>Conclusions:
>
>1. the speedup data came directly from five large log files, run thru a
>program that matched up depths and moves and grabbed the time for the one
>that was of interest (the last move displayed in the real game). This data
>I have 100% confidence in as representing actual raw data.
>
>2. The times/nodes I am not sure about. They were produced either in 1996
>or early 1997. According to annual faculty activity reports here, I started
>working on this paper in 1993, and submitted it late in 1994. We haggled over
>various things for about two years, back and forth. It was actually published
>in the March 1997 JICCA. The key is that the speedup data was produced in
>late 1993 and early 1994, right after the 1993 ACM event where the game in
>question was played. The paper was finished a couple of years later. It is
>certainly possible that this happened after I lost all files here so that we
>had to extrapolate times based on the rather simple data I have in my paper
>files.
>
>The actual data I have here, is the 24 positions, the time taken on the 16
>processor test, and then the printout of the raw speedup table that is in
>the JICCA. So it is certainly possible that the times _and_ nodes were
>extrapolated.
>It is possible that it was done because it was easier than trying
>to round up the old logs and produce them by eating those. It is possible it
>was done because the logs were lost. I have tried to remember when the disk
>crash happened here... and I will probably probe DejaNews as when it happened,
>I immediately sent out an appeal for any old crafty versions since they were all
>lost, excepting the ones on my ftp machine (a different box). That might help
>in remembering more. But doing this paper was mainly an effort from 1993-1994.
>The later additions, such as the two additional tables, more explanations about
>some parts, less info about others, was done over the next 2-3 years, and it
>was done _very_ sporadically. That's why I don't remember a lot of specific
>details, it was spread over a long time, with a lot of other things going on,
>and didn't seem very important when we were doing it. Had I thought "Hey I
>am really cheating the world here." I would have at least remembered that. But
>the extrapolation seemed quite accurate as we ran a few positions and
>extrapolated and compared that to real positions, just to be sure the
>extrapolations were reasonable. They were, and we never gave the node issue
>another moment's thought. Until now, of course..
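
for anyone who wants to try the ratio check described above (divide the 1-processor time by the n-processor time and look at the 2nd digit after the decimal point), here is a minimal sketch in Python. everything in it is an assumption made for illustration: the times are invented and are not taken from the paper, the names `looks_reconstructed` and `chance_all_pass` are hypothetical, the 0.005 tolerance is an arbitrary choice, and the 10 to the -24 figure is reproduced only under the guess that it means 24 independent ratios, each with a 1-in-10 chance of having a 0 as its 2nd digit.

```python
# Illustrative sketch only: the times below are made up, not from the paper.

def looks_reconstructed(t1, tn, tol=0.005):
    """Return True if t1/tn lies within tol of a one-decimal value such as 1.7."""
    ratio = t1 / tn
    return abs(ratio - round(ratio, 1)) < tol


def chance_all_pass(n_ratios, p_single=0.1):
    """Chance that n_ratios independent measured ratios all pass the check,
    assuming (hypothetically) each one passes by luck with probability p_single."""
    return p_single ** n_ratios


# A time rebuilt from a speedup rounded to one digit:
# 2830 s on one processor, speedup 1.7, stored as whole seconds.
rebuilt = round(2830 / 1.7)                    # 1665 s

print(looks_reconstructed(2830, rebuilt))      # ratio is about 1.6997
print(looks_reconstructed(2830, 1610))         # ratio is about 1.7578
print(chance_all_pass(24))
```

the first call stands in for a time that was computed back from a rounded speedup, the second for a time that was measured on its own. a single pass proves nothing, since roughly one measured ratio in ten would pass by luck; the argument only has force when every ratio in a table passes.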
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.