Computer Chess Club Archives


Subject: Re: I can't believe this bashing is being allowed on here: "Bad Math To

Author: Matthew Hull

Date: 15:45:30 09/05/02


On September 05, 2002 at 18:09:44, martin fierz wrote:

>On September 05, 2002 at 17:09:44, Matthew Hull wrote:
>
>>On September 05, 2002 at 16:43:16, martin fierz wrote:
>>
>>>On September 05, 2002 at 13:43:10, Matthew Hull wrote:
>>>
>>>>On September 05, 2002 at 13:28:20, Miguel A. Ballicora wrote:
>>>>
>>>>>On September 05, 2002 at 10:05:05, Robert Hyatt wrote:
>>>>>
>>>>>>On September 05, 2002 at 00:25:58, Miguel A. Ballicora wrote:
>>>>>>
>>>>>>>On September 04, 2002 at 18:38:17, Dann Corbit wrote:
>>>>>>>
>>>>>>>>My take on the matter (in one paragraph):
>>>>>>>>Robert wrote a paper on parallel speedup, showing a speedup of 1.7 for 2 CPUs
>>>>>>>>(as derived from his more general formula).  Vincent was unable to reproduce this
>>>>>>>>sort of speedup, and thought the research was faulty.  Robert agreed that the
>>>>>>>>test set was limited and you won't always get that sort of speedup, but as an
>>>>>>>>average (over a broad set of positions) that's about what he got.  There has
>>>>>>>>been some acrimony over whether superlinear speedups are possible.  I think that
>>>>>>>>the jury is still out on that one.
>>>>>>>>
>>>>>>>>At any rate, that's my take on the whole thing.
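
[A minimal sketch, in C, of the speedup arithmetic under discussion; the times
below are hypothetical values chosen only to produce the 1.7 figure, not data
from the paper:

    /* speedup = T(1 cpu) / T(n cpus); hypothetical times, not
       measurements from the DTS paper */
    #include <stdio.h>

    int main(void) {
        double t1 = 340.0;  /* hypothetical 1-cpu search time (s)     */
        double t2 = 200.0;  /* hypothetical 2-cpu time, same position */
        printf("2-cpu speedup = %.2f\n", t1 / t2);  /* prints 1.70 */
        return 0;
    }
]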
>>>>>>>>
>>>>>>>>Vincent always sees things in pure jet black or gleaming powder white.  If
>>>>>>>>something isn't terrific, then it is pure junk.  While I think his mode of
>>>>>>>>interacting is a bit odd, it's one of the things that make Vincent interesting.
>>>>>>>
>>>>>>>He crossed the line when he used the words "fraud" and "lie"
>>>>>>>to describe a scientific paper without any solid proof (he only proved a flaw in
>>>>>>>the presentation). Too serious.
>>>>>>>
>>>>>>>To be honest, I am embarrassed to be reading this thread. One side does not
>>>>>>>recognize a flaw (it could be an honest mistake, and I believe it is; that
>>>>>>>happens many times, big deal) and the other makes pathetic accusations of
>>>>>>>fraud, mixing it up with old issues (Deep Blue etc.). To top it all off, ad
>>>>>>>hominem attacks.
>>>>>>>
>>>>>>>Under these conditions it is impossible to discuss anything.
>>>>>>
>>>>>>While I understand what you mean, I don't see any major "flaw".
>>>>>
>>>>>No, it is not major, it is a very minor flaw in the presentation. Not a big
>>>>>deal, but you cannot stand there and say that it is just ok. You cannot say that
>>>>>the way you rounded it is ok and that everything is justified by the big
>>>>>variability. The only thing that the big variability shows is that the flaw is
>>>>>minor, but it does not show that there is no flaw in the presentation. In those
>>>>>cases standard deviations should be shown using numbers that were rounded
>>>>>properly.
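
[A minimal sketch, in C, of the presentation Miguel is asking for: a mean
speedup reported together with its sample standard deviation, both rounded to
the same precision.  The speedup values are hypothetical:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double s[] = {1.62, 1.74, 1.68, 1.81, 1.65};  /* hypothetical speedups */
        int n = sizeof s / sizeof s[0];
        double sum = 0.0, sq = 0.0;
        for (int i = 0; i < n; i++) sum += s[i];
        double mean = sum / n;
        for (int i = 0; i < n; i++) sq += (s[i] - mean) * (s[i] - mean);
        double sd = sqrt(sq / (n - 1));               /* sample std deviation */
        printf("speedup = %.2f +/- %.2f (n = %d)\n", mean, sd, n);
        return 0;
    }
]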
>>>>>
>>>>>Don't get me wrong, I understand and accept completely everything you say, if I
>>>>>were accused of fraud I would go overboard myself. But please, do not try to
>>>>>convince us that those tables are the proper way to present something.
>>>>
>>>>Under the circumstances, I don't think he had a choice.  It was the only way to
>>>>add the data so long after the fact at the request of the referees.  Was he
>>>>supposed to say in the paper, "the referees wanted the data, so I was forced to
>>>>extrapolate it"?  What would you have done in that situation?
>>>
>>>of course he had a choice! when a referee asks you for something more than what
>>>you have in your paper, you can either comply, and produce that data, or you can
>>>try to convince him that it's not important. even if you fail to convince the
>>>referee, the editor of the journal has the last word about publication, and if
>>>your paper is really good (it is), then you can also survive a hostile referee.
>>>bob should either have rerun his tests and saved his logs to produce real data,
>>>or he should have convinced the referee that this data is unnecessary (it is!),
>>>and if that failed, he should have admitted to the journal editor that he did
>>>not have the raw data any more, and asked him to publish it all the same.
>>>making up data (to please a referee or for other reasons) and passing it on as
>>>actual data is probably the biggest no-no of all in science.
>>
>>"Making up data" is too strong here.  The times and nodes are functions of the
>>speedup.  Deriving them from this function is not "making up" data; it's
>>calculating the data from a known function.
>>
>>Is this not standard practice, especially under the circumstances described
>>above?
>
>making up data is not too strong.

"Making up" means pure invention, ex nihilo.  Don't exaggerate.  He did a legitimate
calculation.  Calling that "made up" is baloney.
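
[The calculation in question is straightforward.  A minimal sketch with
hypothetical numbers (the real ones are not in this thread): given the speedup
computed from the original raw logs and a measured 16-cpu time, the 1-cpu time
falls out directly.

    #include <stdio.h>

    int main(void) {
        double t16     = 180.0;  /* hypothetical observed 16-cpu time (s)    */
        double speedup = 11.1;   /* speedup preserved from the original runs */
        /* T1 = speedup * T16, since speedup is defined as T1 / T16 */
        printf("reconstructed 1-cpu time = %.0f s\n", speedup * t16);
        return 0;
    }
]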

>of course the numbers in the table are very
>close to what bob actually measured. but the table says: "this is the time in
>seconds that the program needed on this position" - and the numbers in the table
>are not what it says. even if it is very close to the truth, it is still made
>up. this is definitely NOT standard practice - under no circumstances do you
>EVER make up data. even if it is very close to the true data....
>
>let me give you an example: suppose i claim i made an experiment measuring the
>time it takes for a stone to drop from a height of x meters. thanks to physics,
>i already know how long it will take. so i give you a table with numbers in it
>which, if you actually repeat the experiment, will turn out to be true. am i a
>fraud if i do this? certainly. i claim to have measured something which i
>haven't in fact.
>it doesn't matter whether my numbers are correct or not, it only matters that i
>pretend to have measured something i didn't really measure.
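
[martin's analogy, made concrete in a short C sketch: free-fall time from
height h is t = sqrt(2h/g), so a table "predicted" this way would match a real
measurement closely even though nothing was measured.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        const double g = 9.81;                 /* m/s^2 */
        for (double h = 1.0; h <= 5.0; h += 1.0)
            printf("h = %.0f m  ->  t = %.2f s\n", h, sqrt(2.0 * h / g));
        return 0;
    }
]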

But he _did_ measure it.  He just no longer had access to the numbers when, at a
late date, he was asked to include them.  So he derived them.  The times were
not the pertinent issue.  The speedups _were_.  Get real.

>
>uri puts it very well: science is about trusting another person's results. if i
>see that bob publishes data which he didn't really measure, i lose some of my
>faith in his results. so you say: "well, this was just a minor flaw which was
>not his fault, because a referee made him do it." the problem is this: bob
>was willing to cheat a little bit just to satisfy the referee. so how do you
>know that the next time a referee asks him to add something he will not cheat a
>little bit more? and if he gets away with that too? what will he do next?
>i'm not suggesting that he did anything else wrong, or that he will do anything
>else wrong in the future (in fact, i trust bob's results very much, and will
>continue to do so), i just want to explain to you why it is important in science
>that you never publish anything you didn't really measure.
>
>aloha
>  martin
>
>>
>>>people routinely
>>>get fired for doing that in academia. even if it is totally irrelevant data, as
>>>in this case, you just don't do that.
>>>
>>>it is no coincidence that miguel and i, who say bob did something wrong, are
>>>both in research...
>>>
>>>bob's argument that it's irrelevant, which convinces both of us, should have
>>>been used to counter the referee's request.
>>>
>>>aloha
>>>  martin
>>>
>>>
>>>
>>>>>What I accept is that it does not change a bit the relevance of the ideas.
>>>>>
>>>>>Regards,
>>>>>Miguel
>>>>>
>>>>>
>>>>>>As in the
>>>>>>question I posed to vincent, if I ask you to measure the radius, diameter
>>>>>>and circumference of a piece of pipe, once you have one, you can derive the
>>>>>>others, but there will be a tiny accuracy issue, because pi factors into
>>>>>>the relationship among the three values.  And what value do you use for pi?
>>>>>>So even in something as simple as diameter vs circumference, there is a
>>>>>>_set_ of possible diameters that will yield the same circumference, to some
>>>>>>particular number of decimal places of accuracy.  Yet in reality, only _one_
>>>>>>of those diameters can be the _real_ diameter.
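
[A minimal C sketch of the pipe example: because a published circumference is
rounded, a whole interval of diameters is consistent with it; any d whose pi*d
rounds to the printed C reproduces the table entry.  The numbers are
hypothetical.

    #include <stdio.h>

    int main(void) {
        const double PI = 3.14159265358979;  /* one choice among many         */
        double C  = 31.42;                   /* circumference printed to 0.01 */
        double lo = (C - 0.005) / PI;        /* smallest consistent diameter  */
        double hi = (C + 0.005) / PI;        /* largest (exclusive)           */
        printf("any diameter in [%.6f, %.6f) gives C = %.2f\n", lo, hi, C);
        return 0;
    }
]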
>>>>>>
>>>>>>As I showed Martin, smp variability can be so "bad" that it equates to
>>>>>>a piece of pipe made out of some _very_ temp-sensitive material, so that the
>>>>>>very act of touching it to measure it causes a change in the
>>>>>>diameter/circumference.  And that is what I see in SMP results _all_ the time.
>>>>>>
>>>>>>I.e., for a given actual time on a 16-processor test, and a computed speedup
>>>>>>made by dividing the raw 1-cpu time by the raw 16-cpu time, you can now go back
>>>>>>and re-compute the raw 1-cpu time.  And there will be a _set_ of answers to
>>>>>>that computation that still produce the same 16-cpu speedup.  Is one of those
>>>>>>re-computed 1-cpu times (which can vary in the low-order digits when the
>>>>>>value is in the thousands) any better than another?  Is one better because you
>>>>>>actually observed it in the game in question, rather than computed it?  What if
>>>>>>you run it again and you get _another_ one of the computed 1-cpu times?  It
>>>>>>is easy to see the one-cpu time vary by a second or so over a long period,
>>>>>>due to network traffic, interrupts, and so forth, maybe even a daemon process
>>>>>>waking up for a moment to do something.
>>>>>>
>>>>>>My point?
>>>>>>
>>>>>>If you take a 16-cpu speedup, and a 16-cpu time, and use that to derive
>>>>>>the set of 1-cpu times (measured only in seconds) you will get a set of
>>>>>>N times that could produce that same speedup given the observed 16-cpu time
>>>>>>and computed 16-cpu speedup.  The inverse is the better way of thinking about
>>>>>>this of course, because that 16 cpu time is just a "snapshot" that could vary
>>>>>>dynamically, and as the 16-cpu time varies, so would the set of 1-cpu times
>>>>>>that would still produce that same speedup.
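
[A minimal sketch of that point, with hypothetical numbers: given a 16-cpu time
of 180 seconds and a speedup published to one decimal place, every whole-second
1-cpu time in a roughly 18-second window reproduces exactly the same published
figure.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double t16 = 180.0;          /* hypothetical 16-cpu time (s)    */
        double published = 11.1;     /* speedup as printed, one decimal */
        for (int t1 = 1980; t1 <= 2010; t1++) {
            double s = floor(t1 / t16 * 10.0 + 0.5) / 10.0;  /* round to 0.1 */
            if (s == published)                /* same printed speedup */
                printf("t1 = %4d s  ->  speedup %.1f\n", t1, s);
        }
        return 0;
    }
]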
>>>>>>
>>>>>>In other words, there is a _lot_ of inaccuracy already.  So the question
>>>>>>becomes, "why compute some of the times, why not observe them all?"  That is
>>>>>>"the" question here.  And it is one I am not sure I can answer yet.  I have
>>>>>>gotten "closer" to the disk failure date.  From info in my files at the office.
>>>>>>
>>>>>>I bought a Gateway G6/200 (pentium pro 200) when they first came out.  I think
>>>>>>Bruce got his a month or so before me.  In early 1996, I had a disk crash,
>>>>>>and called gateway for a replacement (this is the part I have docs for as you
>>>>>>will see shortly).  They sent one, but when it arrived it was a 68-pin wide
>>>>>>scsi drive, while the machine they had sold me 2 months earlier used 50-pin
>>>>>>narrow scsi drives (this machine had 4 4.5-gig scsi drives).  I called them to
>>>>>>tell them about the "error" (I couldn't use that drive with my scsi controller)
>>>>>>and they responded "we no longer have any narrow scsi stuff, we have migrated
>>>>>>to wide scsi."  I responded "great, so I have a machine 2 months old, that
>>>>>>cost me $5,000, and you can't get me a replacement drive?  Mark me off the
>>>>>>gateway customer list (I used to order 20-30 machines _per year_ for our
>>>>>>department labs)."  That got their attention, they sent me three more wide
>>>>>>scsi drives and a new controller.  So I have at least homed in on when I lost
>>>>>>"the world".  And that suggests that we _had_ to reconstruct the data as best
>>>>>>we could, although neither of us can say "yes that was the reason" with 100%
>>>>>>reliability.  I mainly remember trying to get Crafty back up to speed.  I was
>>>>>>getting ready for the WMCCC event, and was not releasing new versions.  As a
>>>>>>result, when the disk went down, I lost everything, and had to back up to the
>>>>>>last released version and start over, trying to remember the changes I had
>>>>>>made.
>>>>>>
>>>>>>I don't know when I started back to work on the dts paper, because the panic
>>>>>>about the WMCCC and losing things is the thing that I _really_ recall (not
>>>>>>to mention class assignments, handouts, etc.)  But it is certainly within
>>>>>>reason to think that led to the time computations.  As I have said all along,
>>>>>>the node computation was unavoidable from the start...
>>>>>>
>>>>>>So, as far as I am concerned, there is _nothing_ "false" in the DTS paper.  I
>>>>>>computed the speedups in the normal way, using raw data, back in early 1994
>>>>>>after the tests were run.  That was the "critical data" presented in the paper.
>>>>>>Later, we extrapolated the nodes when they were asked for, and it would seem
>>>>>>that we extrapolated the times at the same time.
>>>>>>
>>>>>>That is about all I can remember, so far.  I don't consider the "node" data to
>>>>>>be very interesting at all, and the "time" data is dynamic enough due to the
>>>>>>variability in speedup that any value I published, whether it be extrapolated or
>>>>>>observed, could likely _never_ be repeated again.  A new test run would
>>>>>>certainly vary by seconds if not much more (another thread in this discussion
>>>>>>shows some wild behavior on kopec 2, and some rock-solid behavior on kopec 3,
>>>>>>illustrating this point).
>>>>>>
>>>>>>I don't see it as being a "fraud" at all, but then I'm not Vincent.  I think
>>>>>>the concept of "if you can't make yours look better, then make someone else's
>>>>>>look worse" is extremely sad, of course.  But he will continue to do that on
>>>>>>all fronts.  "crafty's eval is simple".  "fritz knows shit about chess".
>>>>>>"tiger is stupid in endgames."  We've all seen those comments over and over.
>>>>>>As well as "diep has the most sophisticated eval of any chess program" or
>>>>>>"diep is the strongest program in the world at correspondence time controls".
>>>>>>
>>>>>>What more could anyone say?
>>>>>>
>>>>>>Will Crafty ever approach CB's DTS speedup?  I doubt it, because I am not
>>>>>>willing to give up the recursive search, which would be the first thing to
>>>>>>go.  If I had high expectations of running on a 32 cpu machine all the
>>>>>>time, then I certainly might consider it.  But I suspect that I will be
>>>>>>using a quad for the foreseeable future, with maybe an 8-way or 16-way
>>>>>>box on rare tournament occasions.  I'm old enough to try to avoid
>>>>>>headaches, which you get with non-recursion when you try to read it later.
>>>>>>
>>>>>>Will someone else beat the DTS result?  Most certainly.  Even as I wrote the
>>>>>>paper I noticed a couple of things that ought to be changed.  Will it bother
>>>>>>me when they do?  Not at all.  My EPVS algorithm beat the old PVS by a small
>>>>>>amount.  Schaeffer tried a new approach that was better than EPVS.  That led
>>>>>>me to try something even better; since I had a lot of time that I would be
>>>>>>spending on my dissertation anyway, it seemed like a natural project to
>>>>>>undertake.  Will I do that again?  I don't think so.  Too much development
>>>>>>time: multiple years of debugging, including one full-time year as I finished
>>>>>>the dissertation work and started the writing.  The current SMP approach in
>>>>>>Crafty is not half-bad.  And it was not hard to write at all.  Those two points
>>>>>>say a lot...
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>Regards,
>>>>>>>Miguel
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>Robert has always been a man of strong convictions, and if you call him a
>>>>>>>>'noo-noo head' he'll call you one back.  He isn't one to back down when he
>>>>>>>>thinks he is right.  That's one of the things I like about Dr. Hyatt.
>>>>>>>>
>>>>>>>>When these two styles happen to ram into one another, the sparks are sure to
>>>>>>>>fly.  A philosophical question is often asked:
>>>>>>>>"What happens when an immovable object meets an irresistible force?"
>>>>>>>>
>>>>>>>>The 'debate' is an answer to that question.
>>>>>>>>;-)


