Computer Chess Club Archives


Subject: Re: I can't believe this bashing is being allowed on here: "Bad Math To

Author: Matthew Hull

Date: 17:54:16 09/05/02


On September 05, 2002 at 20:06:36, martin fierz wrote:

>On September 05, 2002 at 19:08:54, Matthew Hull wrote:
>
>>On September 05, 2002 at 18:09:44, martin fierz wrote:
>>
>>>On September 05, 2002 at 17:09:44, Matthew Hull wrote:
>>>
>>>>On September 05, 2002 at 16:43:16, martin fierz wrote:
>>>>
>>>>>On September 05, 2002 at 13:43:10, Matthew Hull wrote:
>>>>>
>>>>>>On September 05, 2002 at 13:28:20, Miguel A. Ballicora wrote:
>>>>>>
>>>>>>>On September 05, 2002 at 10:05:05, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On September 05, 2002 at 00:25:58, Miguel A. Ballicora wrote:
>>>>>>>>
>>>>>>>>>On September 04, 2002 at 18:38:17, Dann Corbit wrote:
>>>>>>>>>
>>>>>>>>>>My take on the matter (in one paragraph):
>>>>>>>>>>Robert wrote a paper on parallel speedup, showing a 1.7x increase for 2 CPUs (as
>>>>>>>>>>derived from his more general formula).  Vincent was unable to reproduce this
>>>>>>>>>>sort of speedup, and thought the research was faulty.  Robert agreed that the
>>>>>>>>>>test set was limited and you won't always get that sort of speedup, but as an
>>>>>>>>>>average (over a broad set of positions) that's about what he got.  There has
>>>>>>>>>>been some acrimony over whether superlinear speedups are possible.  I think that
>>>>>>>>>>the jury is still out on that one.
>>>>>>>>>>
>>>>>>>>>>At any rate, that's my take on the whole thing.
>>>>>>>>>>
>>>>>>>>>>Vincent always sees things in pure, jet black or gleaming, powder white.  If
>>>>>>>>>>something isn't terrific, then it is pure junk.  While I think his mode of
>>>>>>>>>>interesting is a bit odd, it's one of the things that make Vincent interesting.
>>>>>>>>>
>>>>>>>>>He crossed the line when he used the word "fraud" and "lie"
>>>>>>>>>to describe a scientific paper without any solid proof (he only proved a flaw in
>>>>>>>>>the presentation). Too serious.
>>>>>>>>>
>>>>>>>>>To be honest, I am embarrassed to be reading this thread. One side does not
>>>>>>>>>recognize a flaw (it could be honest and I believe it, happens many times, big
>>>>>>>>>deal) and the other makes pathetic accusations of fraud mixing it up with old
>>>>>>>>>issues (Deep blue etc.). To top it all, ad hominem attacks.
>>>>>>>>>
>>>>>>>>>Under these conditions it is impossible to discuss anything.
>>>>>>>>
>>>>>>>>While I understand what you mean, I don't see any major "flaw".
>>>>>>>
>>>>>>>No, it is not major, it is a very minor flaw in the presentation. Not a big
>>>>>>>deal, but you cannot stand there and say that it is just ok. You cannot say that
>>>>>>>it is ok the way you rounded it and everything is justified by the big
>>>>>>>variability. The only thing that the big variability shows is that the flaw is
>>>>>>>minor, but it does not show that there is no flaw in the presentation. In those
>>>>>>>cases standard deviations should be shown using numbers that were rounded
>>>>>>>properly.
>>>>>>>
>>>>>>>Don't get me wrong, I understand and accept completely everything you say, if I
>>>>>>>were accused of fraud I would go overboard myself. But please, do not try to
>>>>>>>convince us that those tables are the proper way to present something.
>>>>>>
>>>>>>Under the circumstances, I don't think he had a choice.  It was the only way to
>>>>>>add the data so long after the fact at the request of the referees.  Was he
>>>>>>supposed to say in the paper that the refs wanted the data so I was forced to
>>>>>>extrapolate it?  What would you have done in that situation?
>>>>>
>>>>>of course he had a choice! when a referee asks you for something more than what
>>>>>you have in your paper, you can either comply, and produce that data, or you can
>>>>>try to convince him that it's not important. even if you fail to convince the
>>>>>referee, the editor of the journal has the last word about publication, and if
>>>>>your paper is really good (it is), then you can also survive a hostile referee.
>>>>>bob should either have rerun his tests and saved his logs to produce real data,
>>>>>or he should have convinced the referee that this data is unnecessary (it is!),
>>>>>and if that failed, he should have admitted to the journal editor that he did
>>>>>not have the raw data any more, and asked him to publish it all the same.
>>>>>making up data (to please a referee or for other reasons) and passing it on as
>>>>>actual data is probably the biggest no-no of all in science.
>>>>
>>>>"Making up data" is too strong here.  The times and nodes are functions of the
>>>>speedup.  Deriving them from this function is not "making up" data, it's
>>>>calculating the data from a known function.
>>>>
>>>>Is this not standard practice, especially under the circumstances described
>>>>above?
>>>
>>>making up data is not too strong. of course the numbers in the table are very
>>>close to what bob actually measured. but the table says: "this is the time in
>>>seconds that the program needed on this position" - and the numbers in the table
>>>are not what it says. even if it is very close to the truth, it is still made
>>>up. this is definitely NOT standard practice - under no circumstances do you
>>>EVER make up data. even if it is very close to the true data....
>>
>>To say that the times were "made up" is to say that the speedups were "made up",
>>since the times were derived directly from the speedups.
>>
>>You are saying the speedups are "made up".
>
>you don't understand...

I understand perfectly.  I'm just saying that polemically, you are contradicting
yourself when you say it.  If B was derived via repeatable function from A, then
to say B is made up is really saying A was made up.  Because B cannot be made up
if it was calculated.  B can only be made up if A is made up.

The ancient Greeks would be all over you on this.

I think you really wanted to say, "It's not right to imply by silence that B are
the observed values from which A was derived."  That's fine.

I know that's what you are saying.  But saying B is "made up" is not accurate.
B was backed into from A and is an approximation of the observed facts which are
lost.  That is more accurate.
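
To put rough numbers on the A-versus-B point, here is a minimal sketch in Python. The timings are hypothetical (they are not taken from the DTS paper), and rounding the speedup to one decimal place is my assumption about how the rounding was done.

```python
# Hypothetical timings in seconds; NOT values from the DTS paper.
t1_measured = 2830.0    # observed 1-cpu time
t16_measured = 251.0    # observed 16-cpu time

# A: speedup computed from the raw observations.
speedup = t1_measured / t16_measured

# A as it would be published, rounded to one decimal (an assumption).
speedup_rounded = round(speedup, 1)

# B: the 1-cpu time "backed into" from the rounded speedup.
t1_backed_in = speedup_rounded * t16_measured

# How far B lands from the original observation, in percent.
error_pct = 100.0 * abs(t1_backed_in - t1_measured) / t1_measured

print(f"raw speedup      : {speedup:.4f}")
print(f"rounded speedup  : {speedup_rounded}")
print(f"backed-in 1-cpu  : {t1_backed_in:.1f} s (observed {t1_measured:.1f} s)")
print(f"difference       : {error_pct:.2f} %")
```

With these made-up inputs B is a repeatable function of A, yet it does not equal the observed time, which is the distinction both sides are arguing about.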

>read carefully: i never said that anything bob did
>invalidates any conclusion he made in the article. i said the opposite. i am
>definitely not saying the speedups are made up. vincent says that. i don't. if
>you read my posts, you will see that i never say the speedups are made up...
>
>the point is that the data in the table is not real, measured data. bob had real
>data, calculated the speedups, rounded them, and later recalculated the times in
>that table with the rounded numbers (if the speedup was not rounded, you would
>be right, then his numbers would be real - but they're not). yes, the numbers in
>this table are off only by a few % from the real numbers, but that doesn't
>change the fact that they are not what he claims them to be.
>if the table says "i measured these numbers", and it turns out that they are not
>measured numbers, then it is wrong. in science, you just don't do that... for
>the reasons i gave to you. OF COURSE this is only a very small fudging of
>reality. but if you are willing to fudge very little, you might also be willing
>to fudge a little bit more. and where do you draw the line?
>
>obviously, if you are not in science, this may all be a bit too much nitpicking
>for you. big business doesn't work this way (enron, worldcom, you name it).
>science does.

Yes, in the presence of a true scientist (if you listen carefully) one can hear
the flutter of angels' wings.

>
>aloha
>  martin
>
>PS:
>>Making up is pure invention "ex nihilo".  Don't exaggerate.
>that is your definition of "making up". i say: putting any number there that you
>did not actually measure (even if it is close to the true number) is "making
>up".


When you say "Those numbers are made up," most people do not get the impression
that the numbers were derived by a function that gave a close approximation of
the original measurement.  "Made up" implies complete prevarication.  That's why
I still say that it's too strong a term for this situation.

Regards,

>you write something down which you didn't measure. i don't care how close
>to the true number it is. again, that is the way it works in science.
>if you want to publish such results on a webpage, fine. if you publish that kind
>of stuff in a scientific journal, not fine.
>
>>>
>>>let me give you an example: i'll claim i made an experiment measuring the time
>>>it takes for a stone to drop from x meters height. thanks to physics, i already
>>>know how long it will take. so i give you a table, with numbers in, which, if
>>>you actually repeat the experiment, will turn out to be true. am i a fraud if i
>>>do this? certainly. i claim to have measured something, which i haven't in fact.
>>>it doesn't matter whether my numbers are correct or not, it only matters that i
>>>pretend to have measured something i didn't really measure.
>>>
>>>uri puts it very well: science is about trusting another person's results. if i
>>>see that bob publishes data which he didn't really measure, i lose some of my
>>>faith in his results. so you say: "well, this was just a minor flaw which was
>>>none of his fault, because a referee made him do it." the problem is this: bob
>>>was willing to cheat a little bit just to satisfy the referee. so how do you
>>>know that the next time a referee asks him to add something he will not cheat a
>>>little bit more? and if he gets away with that too? what will he do next?
>>>i'm not suggesting that he did anything else wrong, or that he will do anything
>>>else wrong in the future (in fact, i trust bob's results very much, and will
>>>continue to do so), i just want to explain to you why it is important in science
>>>that you never publish anything you didn't really measure.
>>>
>>>aloha
>>>  martin
>>>
>>>>
>>>>>people routinely
>>>>>get fired for doing that in academia. even if it is totally irrelevant data, as
>>>>>in this case, you just don't do that.
>>>>>
>>>>>it is no coincidence that miguel and i who say bob did something wrong are both
>>>>>in research...
>>>>>
>>>>>bob's argument that it's irrelevant, which convinces both of us, should have
>>>>>been used to counter the referee's request.
>>>>>
>>>>>aloha
>>>>>  martin
>>>>>
>>>>>
>>>>>
>>>>>>>What I accept is that it does not change a bit the relevance of the ideas.
>>>>>>>
>>>>>>>Regards,
>>>>>>>Miguel
>>>>>>>
>>>>>>>
>>>>>>>>As in the
>>>>>>>>question I posed to vincent, if I ask you to measure the radius, diameter
>>>>>>>>and circumference of a piece of pipe, once you have one, you can derive the
>>>>>>>>others, but there will be a tiny accuracy issue.  Because pi factors into
>>>>>>>>the relationship among the three values, and what value do you use for pi?
>>>>>>>>So even in something so simple as diameter vs circumference, there are a
>>>>>>>>_set_ of possible diameters that will yield the same circumference, to some
>>>>>>>>particular number of decimal places of accuracy.  Yet in reality, only _one_
>>>>>>>>of those diameters can be the _real_ diameter.
>>>>>>>>
>>>>>>>>After showing Martin just how "bad" smp variability can be, that equates to
>>>>>>>>a piece of pipe made out of some _very_ temp-sensitive material, so that the
>>>>>>>>very act of touching it to measure it causes a change in the
>>>>>>>>diameter/circumference.  And that is what I see in SMP results _all_ the time.
>>>>>>>>
>>>>>>>>IE for a given actual time for a 16 processor test, and a computed speedup
>>>>>>>>made by dividing the raw 1-cpu time by the raw 16-cpu time, you can now go back
>>>>>>>>and re-compute the 1-cpu raw time.  And there will be a _set_ of answers to
>>>>>>>>that computation that still produce the same 16-cpu speedup.  Is one of those
>>>>>>>>pre-computed 1-cpu times (which can vary in the low order digits when the
>>>>>>>>value is in the thousands) any better than another?  Is one better because you
>>>>>>>>computed it, and you actually observed it in the game under question?  What if
>>>>>>>>you run it again and you get _another_ one of the computed 1-cpu times, as it
>>>>>>>>is easy to see the one-cpu time vary by a second or so over a long period,
>>>>>>>>due to network traffic, interrupts, and so forth, maybe even a daemon process
>>>>>>>>waking up for a moment to do something?
>>>>>>>>
>>>>>>>>My point?
>>>>>>>>
>>>>>>>>If you take a 16-cpu speedup, and a 16-cpu time, and use that to derive
>>>>>>>>the set of 1-cpu times (measured only in seconds) you will get a set of
>>>>>>>>N times that could produce that same speedup given the observed 16-cpu time
>>>>>>>>and computed 16-cpu speedup.  The inverse is the better way of thinking about
>>>>>>>>this of course, because that 16 cpu time is just a "snapshot" that could vary
>>>>>>>>dynamically, and as the 16-cpu time varies, so would the set of 1-cpu times
>>>>>>>>that would still produce that same speedup.
>>>>>>>>
>>>>>>>>In other words, there is a _lot_ of inaccuracy already.  So the question
>>>>>>>>becomes, "why compute some of the times, why not observe them all?"  That is
>>>>>>>>"the" question here.  And it is one I am not sure I can answer yet.  I have
>>>>>>>>gotten "closer" to the disk failure date, from info in my files at the office.
>>>>>>>>
>>>>>>>>I bought a Gateway G6/200 (pentium pro 200) when they first came out.  I think
>>>>>>>>Bruce got his a month or so before me.  In early 1996, I had a disk crash,
>>>>>>>>and called gateway for a replacement (this is the part I have docs for as you
>>>>>>>>will see shortly).  They sent one but when it arrived, it was a 68-pin wide
>>>>>>>>scsi drive, while the machine they had sold me 2 months earlier was a 50-pin
>>>>>>>>narrow scsi drive (this machine had 4 4.5 gig scsi drives).  I called them to
>>>>>>>>tell them about the "error" (I couldn't use that drive with my scsi controller)
>>>>>>>>and they responded "we no longer have any narrow scsi stuff, we have migrated
>>>>>>>>to wide scsi."  I responded "great, so I have a machine 2 months old, that
>>>>>>>>cost me $5,000, and you can't get me a replacement drive?  Mark me off the
>>>>>>>>gateway customer list (I used to order 20-30 machines _per year_ for our
>>>>>>>>department labs)."  That got their attention, they sent me three more wide
>>>>>>>>scsi drives and a new controller.  So I have at least homed in on when I lost
>>>>>>>>"the world".  And that suggests that we _had_ to reconstruct the data as best
>>>>>>>>we could, although neither of us can say "yes that was the reason" with 100%
>>>>>>>>reliability.  I mainly remember trying to get Crafty back up to speed.  I was
>>>>>>>>getting ready for the WMCCC event, and was not releasing new versions.  As a
>>>>>>>>result, when the disk went down, I lost everything, and had to back up to the
>>>>>>>>last released version and start over, trying to remember the changes I had
>>>>>>>>made.
>>>>>>>>
>>>>>>>>I don't know when I started back to work on the dts paper, because the panic
>>>>>>>>about the WMCCC and losing things is the thing that I _really_ recall (not
>>>>>>>>to mention class assignments, handouts, etc.)  But it is certainly within
>>>>>>>>reason to think that led to the time computations.  As I have said all along,
>>>>>>>>the node computation was unavoidable from the start...
>>>>>>>>
>>>>>>>>So, as far as I am concerned, there is _nothing_ "false" in the DTS paper.  I
>>>>>>>>computed the speedups in the normal way, using raw data, back in early 1994
>>>>>>>>after the tests were run.  That was the "critical data" presented in the paper.
>>>>>>>>Later, we extrapolated the nodes when they were asked for, and it would seem
>>>>>>>>that we extrapolated the times at the same time.
>>>>>>>>
>>>>>>>>That is about all I can remember, so far.  I don't consider the "node" data to
>>>>>>>>be very interesting at all, and the "time" data is dynamic enough due to the
>>>>>>>>variability in speedup that any value I published, whether it be extrapolated or
>>>>>>>>observed, could likely _never_ be repeated again, as a new test run would
>>>>>>>>certainly vary by seconds if not much more (another thread in this discussion
>>>>>>>>shows some wild behavior on kopec 2, and some rock-solid behavior on kopec 3,
>>>>>>>>to illustrate this point.)
>>>>>>>>
>>>>>>>>I don't see it as being a "fraud" at all, but then I'm not Vincent.  I think
>>>>>>>>the concept of "if you can't make yours look better, then make someone else's
>>>>>>>>look worse" is extremely sad, of course.  But he will continue to do that on
>>>>>>>>all fronts.  "crafty's eval is simple".  "fritz knows shit about chess".
>>>>>>>>"tiger is stupid in endgames."  We've all seen those comments over and over.
>>>>>>>>As well as "diep has the most sophisticated eval of any chess program" or
>>>>>>>>"diep is the strongest program in the world at correspondence time controls".
>>>>>>>>
>>>>>>>>What more could anyone say?
>>>>>>>>
>>>>>>>>Will Crafty ever approach CB's DTS speedup?  I doubt it, because I am not
>>>>>>>>willing to give up the recursive search, which would be the first thing to
>>>>>>>>go.  If I had high expectations of running on a 32 cpu machine all the
>>>>>>>>time, then I certainly might consider it.  But I suspect that I will be
>>>>>>>>using a quad for the foreseeable future, with maybe an 8-way or 16-way
>>>>>>>>box on rare tournament occasions.  I'm old enough to try to avoid
>>>>>>>>headaches, which you get with non-recursion when you try to read it later.
>>>>>>>>
>>>>>>>>Will someone else beat the DTS result?  Most certainly.  Even as I wrote the
>>>>>>>>paper I noticed a couple of things that ought to be changed.  Will it bother
>>>>>>>>me when they do?  Not at all.  My EPVS algorithm beat the old PVS by a small
>>>>>>>>amount.  Schaeffer tried a new approach that was better than EPVS.  Which led
>>>>>>>>me to try something even better since I had a lot of time that I would be
>>>>>>>>spending on my dissertation anyway, so that seemed like a natural project to
>>>>>>>>undertake.  Will I do that again?  I don't think so.  Too much development
>>>>>>>>time.  Multiple years of debugging, including one full-time year as I finished
>>>>>>>>the dissertation work and started the writing.  The current SMP approach in
>>>>>>>>Crafty is not half-bad.  And it was not hard to write at all.  Those two points
>>>>>>>>say a lot...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>Regards,
>>>>>>>>>Miguel
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Robert has always been a man of strong convictions, and if you call him a
>>>>>>>>>>'noo-noo head' he'll call you one back.  He isn't one to back down when he
>>>>>>>>>>thinks he is right.  That's one of the things I like about Dr. Hyatt.
>>>>>>>>>>
>>>>>>>>>>When these two styles happen to ram into one another, the sparks are sure to
>>>>>>>>>>fly.  A philosophical question is often asked:
>>>>>>>>>>"What happens when an immovable object meets an irresistible force?"
>>>>>>>>>>
>>>>>>>>>>The 'debate' is an answer to that question.
>>>>>>>>>>;-)
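
Hyatt's quoted point that a _set_ of 1-cpu times produces the same published speedup can be checked with a short sketch. The 16-cpu time and the rounded speedup below are hypothetical, the function name is mine, and rounding to one decimal place with times in whole seconds is an assumption.

```python
def consistent_one_cpu_times(tn, rounded_speedup, decimals=1):
    """Whole-second 1-cpu times whose speedup over tn rounds to rounded_speedup."""
    half_step = 0.5 * 10 ** (-decimals)
    # Scan a slightly widened window around the rounding interval.
    lo = int((rounded_speedup - half_step) * tn) - 1
    hi = int((rounded_speedup + half_step) * tn) + 1
    return [
        t1 for t1 in range(lo, hi + 1)
        if round(t1 / tn, decimals) == rounded_speedup
    ]

# Hypothetical inputs; NOT values from the DTS paper.
candidates = consistent_one_cpu_times(tn=251, rounded_speedup=11.3)
print(len(candidates), candidates[0], candidates[-1])
```

For these inputs the list holds 25 whole-second values, from 2824 to 2848. Only one of them can be the time that was actually observed, which is why a back-computed figure is an approximation of the measurement and not the measurement itself.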


