Computer Chess Club Archives



Subject: Re: A more meaningful test?

Author: Matthew Hull

Date: 09:25:36 05/05/04


On May 05, 2004 at 12:21:31, Robert Hyatt wrote:

>On May 05, 2004 at 12:12:59, Matthew Hull wrote:
>
>>On May 05, 2004 at 11:37:50, Robert Hyatt wrote:
>>
>>>On May 05, 2004 at 10:25:23, Matthew Hull wrote:
>>>
>>>>On May 05, 2004 at 04:44:19, martin fierz wrote:
>>>>
>>>>>On May 04, 2004 at 21:06:30, Robert Hyatt wrote:
>>>>>
>>>>>>On May 04, 2004 at 18:21:07, martin fierz wrote:
>>>>>>
>>>>>>>On May 04, 2004 at 13:44:21, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On May 04, 2004 at 10:49:42, martin fierz wrote:
>>>>>>>>
>>>>>>>>>On May 04, 2004 at 07:32:01, Rolf Tueschen wrote:
>>>>>>>>>
>>>>>>>>>>On May 04, 2004 at 07:11:15, martin fierz wrote:
>>>>>>>>>>
>>>>>>>>>>>On May 03, 2004 at 22:50:58, Robert Hyatt wrote:
>>>>>>>>>>>
>>>>>>>>>>>>If you recall, I _have_ given some error estimates in the past.   Remember the
>>>>>>>>>>>>wildly varying speedup numbers I showed you the first time this issue came up?
>>>>>>>>>>>
>>>>>>>>>>>i recall that you gave wildly varying speedup numbers, and an explanation for
>>>>>>>>>>>why this happens. i don't recall a real error estimate, but that can be either
>>>>>>>>>>>because
>>>>>>>>>>>-> you gave one and i didn't see it
>>>>>>>>>>>-> you gave one, i saw it and forgot
>>>>>>>>>>>-> you didn't give one at all
>>>>>>>>>>>
>>>>>>>>>>>so... what kind of numbers would you give if you were pressed?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Isn't it impolite to imply the third option if Bob JUST said that he did give
>>>>>>>>>>some?
>>>>>>>>>
>>>>>>>>>no - asking questions always has to be allowed among scientists. forbidding people
>>>>>>>>>to ask questions is the hallmark of religious fanatics and fascists... but i
>>>>>>>>>digress :-)
>>>>>>>>>bob says he gave numbers, which he did. but IIRC, he never gave an error
>>>>>>>>>estimate. so i am allowed to ask for it, and it is not at all impolite to do so.
>>>>>>>>>what he did show is the speedup in about 30 different positions, which could
>>>>>>>>>vary wildly depending on the position.
>>>>>>>>>
>>>>>>>>>i don't know why you think you have to stand up and defend bob every time
>>>>>>>>>somebody says something about him you don't like. just leave that up to him. he
>>>>>>>>>can take it :-)
>>>>>>>>>
>>>>>>>>>cheers
>>>>>>>>>  martin
>>>>>>>>
>>>>>>>>
>>>>>>>>I wasn't offended.  I hope my answer was ok.
>>>>>>>
>>>>>>>i didn't think you'd be offended, and your answer was ok, but...why don't you
>>>>>>>take N (preferably N>>30...) positions and compute the standard deviation of
>>>>>>>your speedup numbers, and the standard deviation of the average speedup? you can
>>>>>>>still discuss the meaning of this, but at least you have an error margin you can
>>>>>>>attach to your speedup. i don't see anything wrong with that!? even if the
>>>>>>>probability distribution is obviously not a normal distribution, you can
>>>>>>>probably approximate it as such, and get an idea of its width from these
>>>>>>>numbers.
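
For what it's worth, the calculation martin is asking for is only a few lines of
code.  A rough C sketch (the speedup values are made-up placeholders; the point is
the arithmetic: sample standard deviation of the individual speedups, and the
standard deviation of the mean):

  #include <math.h>
  #include <stdio.h>

  int main(void) {
    /* made-up per-position speedup measurements */
    double speedup[] = { 3.1, 2.4, 3.9, 1.8, 3.5, 2.9, 3.3, 2.2 };
    int n = sizeof(speedup) / sizeof(speedup[0]);
    double sum = 0.0, sumsq = 0.0, mean, sd, se;
    int i;

    for (i = 0; i < n; i++) {
      sum += speedup[i];
      sumsq += speedup[i] * speedup[i];
    }
    mean = sum / n;
    sd = sqrt((sumsq - n * mean * mean) / (n - 1));  /* sd of individual speedups */
    se = sd / sqrt((double) n);                      /* sd of the average speedup */
    printf("mean %.2f  sd %.2f  sd of the mean %.2f\n", mean, sd, se);
    return 0;
  }
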
>>>>>>>
>>>>>>>>This is not an easy question to deal with.
>>>>>>>
>>>>>>>>IE if you take the standard deviation of a set of random numbers between
>>>>>>>>0 and N what do you get?  That is what the speedup numbers look like for some
>>>>>>>>positions.  For others the speedup is a near-perfect constant value.  Add some
>>>>>>>>perfect constants plus some randomly distributed values and exactly what does
>>>>>>>>the SD show?  :)
>>>>>>>
>>>>>>>i don't quite understand your question. if you take enough positions, then you
>>>>>>>will get something sensible, i would think. if you doubt this, you can take e.g.
>>>>>>>10'000 sequential positions from crafty's ICC log, and bunch them together in
>>>>>>>groups of 1000, and compute average speedup + stdev-of-average-speedup for each
>>>>>>>of the bunches. i can't imagine that you get 10 wildly differing values, as your
>>>>>>>statement above suggests.
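
And the bunching test is just the same computation repeated per group.  A sketch,
assuming the per-position speedups have already been measured and stored in game
order (the function and its arguments are hypothetical, only to show the idea):

  #include <stdio.h>

  /* average speedup over consecutive groups of positions, to see whether
     the average is stable from one bunch to the next */
  void batch_means(const double *speedup, int n, int batch) {
    int b, i;
    for (b = 0; b + batch <= n; b += batch) {
      double sum = 0.0;
      for (i = 0; i < batch; i++)
        sum += speedup[b + i];
      printf("positions %6d-%6d: average speedup %.2f\n",
             b + 1, b + batch, sum / batch);
    }
  }

e.g. batch_means(speedup, 10000, 1000) for martin's ten bunches of 1000.
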
>>>>>>>
>>>>>>>cheers
>>>>>>>  martin
>>>>>>
>>>>>>
>>>>>>It isn't so easy to get speedup.  IE how would I take a position that took X
>>>>>>seconds with 2 or 4 cpus and compute the 1 cpu time?  Think about it carefully
>>>>>>and you will see the problem.  How to get the 1-cpu test case to have a properly
>>>>>>loaded hash table, killer move table, history table, etc, before starting the
>>>>>>search???
>>>>>
>>>>>i don't understand this part at all. run the exact same test on a 1 CPU machine,
>>>>>and then on an N-CPU machine.
>>>>>the reason i don't understand this at all is that all the details you are
>>>>>talking about are irrelevant. i want to know what happens if i run the same test
>>>>>positions on a 1 CPU box or on an N CPU box. this is easy to answer, because it
>>>>>can be determined experimentally. whether or not there are philosophical issues
>>>>>such as those you raise above can be discussed, but it doesn't stop you from getting
>>>>>your number...
>>>>>
>>>>>
>>>>>>The best bet is to take N positions where N is large.  But then that is not the
>>>>>>same as what happens in real games where the positions are connected via info
>>>>>>passed from search to search in the hash table.
>>>>>
>>>>>not true if you use 1000 positions from crafty's ICC log file, where the exact
>>>>>same thing happens, because these are also positions from real games, connected
>>>>>to each other.
>>>>
>>>>
>>>>The only way I can see how your idea would work is if crafty played a series of
>>>>games searching to a fixed depth with a single processor.  Then take the
>>>>positions from those games and set them up in the order they occurred and time
>>>>crafty's response to those positions to the same depth, but with 2 processors,
>>>>then do it all over again with 3 processors, etc.  That way for each run, the
>>>>effects of cache relevancy are not lost, as they would be with disconnected
>>>>test set positions.
>>>>
>>>>Then you might have an idea of an expected speedup in actual games, rather than
>>>>from disconnected test positions.
>>>>
>>>>That is a much more involved and time consuming test.  I don't think he could
>>>>have afforded that kind of CPU time on a CRAY.
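
In outline, the test I have in mind would look something like this.  The function
names are hypothetical, not Crafty's real interface; the point is only the order of
operations (clear the tables once per run, then keep them warm from move to move):

  /* sketch only -- set_cpus, clear_tables and search_to_depth are made up */
  extern void set_cpus(int ncpus);
  extern void clear_tables(void);                      /* hash, killers, history */
  extern double search_to_depth(const char *fen, int depth); /* elapsed seconds  */

  double replay_game(const char **fens, int npos, int depth, int ncpus) {
    double total = 0.0;
    int i;

    set_cpus(ncpus);
    clear_tables();               /* start the run cold...                  */
    for (i = 0; i < npos; i++)    /* ...then carry the tables from position */
      total += search_to_depth(fens[i], depth);   /* to position, in order  */
    return total;
  }

The speedup for N processors is then replay_game(game, npos, depth, 1) divided by
replay_game(game, npos, depth, N).
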
>>>>
>>>>
>>>
>>>
>>>That is actually what I did.  I used a Cray for about a year.  Multi-cpu tests
>>>could get about 5 minutes of "dedicated whole-machine" time every hour, followed by
>>>a checkpoint and a 55-minute wait for the next "slot".
>>>
>>>That's why I said it took way over a year to run all the tests to get the
>>>results.  The only exception was the 1-cpu tests, which ran for long times, but with
>>>1 cpu I could rely on CPU time rather than elapsed time.
>>
>>
>>Even so, we still can't see the effect of pondering on the hash tables, since we
>>are searching to a fixed depth.  So the resemblance to real game conditions is
>>still somewhat wanting.
>
>I agree.  I fiddled with this during the CB testing.  IE it should be possible
>to mimic the effect by doing a ponder search for a fixed period of time, then
>reading in the move and continuing.  I did this.  But on occasion the wrong move
>gets played and the _rest_ of the test is ruined...  I would have to re-read the
>DTS article to remember exactly how I did this stuff since we are talking 10
>years ago.
>
>
>>
>>I have a tangential question.  Assuming you have scads of memory, could you make
>>a copy of the hash table(s) and ponder using the copy, then if the predicted
>>move was not made, simply "context switch" to the before-ponder hash table?
>>Would there be any benefit in that?  Or would such a copy operation be too slow
>>for short time controls?
>>
>>
>
>
>I've never had that much memory, even on a Cray.  :)


Is there any merit to the idea in principle?
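
Roughly what I have in mind, just to make the question concrete (the types and
names below are made up for illustration, and both tables are assumed to be
allocated already):

  #include <string.h>

  typedef struct {
    unsigned long long key;
    unsigned long long data;
  } HashEntry;

  HashEntry *hash;          /* live table used by the search   */
  HashEntry *hash_backup;   /* snapshot taken before pondering */
  unsigned long hash_entries;

  /* before starting the ponder search: snapshot the current table */
  void save_hash_before_ponder(void) {
    memcpy(hash_backup, hash, hash_entries * sizeof(HashEntry));
  }

  /* predicted move was not played: "context switch" back by swapping
     pointers, so nothing has to be copied a second time */
  void restore_hash_after_wrong_prediction(void) {
    HashEntry *tmp = hash;
    hash = hash_backup;
    hash_backup = tmp;
  }

The cost is one big memcpy before every ponder search, which is where my worry
about short time controls comes in.
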


>>>>>
>>>>>>It is just a very hard question to answer.  And change the positions and you can
>>>>>>change the answer significantly...
>>>>>
>>>>>perhaps, perhaps not. you didn't measure it, and so you can't say :-)
>>>>>
>>>>>cheers
>>>>>  martin


