Computer Chess Club Archives



Subject: Re: A more meaningful test?

Author: Matthew Hull

Date: 09:12:59 05/05/04



On May 05, 2004 at 11:37:50, Robert Hyatt wrote:

>On May 05, 2004 at 10:25:23, Matthew Hull wrote:
>
>>On May 05, 2004 at 04:44:19, martin fierz wrote:
>>
>>>On May 04, 2004 at 21:06:30, Robert Hyatt wrote:
>>>
>>>>On May 04, 2004 at 18:21:07, martin fierz wrote:
>>>>
>>>>>On May 04, 2004 at 13:44:21, Robert Hyatt wrote:
>>>>>
>>>>>>On May 04, 2004 at 10:49:42, martin fierz wrote:
>>>>>>
>>>>>>>On May 04, 2004 at 07:32:01, Rolf Tueschen wrote:
>>>>>>>
>>>>>>>>On May 04, 2004 at 07:11:15, martin fierz wrote:
>>>>>>>>
>>>>>>>>>On May 03, 2004 at 22:50:58, Robert Hyatt wrote:
>>>>>>>>>
>>>>>>>>>>If you recall, I _have_ given some error estimates in the past.   Remember the
>>>>>>>>>>wildly varying speedup numbers I showed you the first time this issue came up?
>>>>>>>>>
>>>>>>>>>i recall that you gave wildly varying speedup numbers, and an explanation for
>>>>>>>>>why this happens. i  don't recall a real error estimate, but that can be either
>>>>>>>>>because
>>>>>>>>>-> you gave one and i didn't see it
>>>>>>>>>-> you gave one, i saw it and forgot
>>>>>>>>>-> you didn't give one at all
>>>>>>>>>
>>>>>>>>>so... what kind of numbers would you give if you were pressed?
>>>>>>>>
>>>>>>>>
>>>>>>>>Isn't it impolite to imply the third option if Bob JUST said that he did give
>>>>>>>>some?
>>>>>>>
>>>>>>>no - asking questions always has to be allowed among scientists. forbidding to
>>>>>>>ask questions is the hallmark of religious fanatics and fascists... but i
>>>>>>>digress :-)
>>>>>>>bob says he gave numbers, which he did. but IIRC, he never gave an error
>>>>>>>estimate. so i am allowed to ask for it, and it is not at all impolite to do so.
>>>>>>>what he did show is the speedup in about 30 different positions, which could
>>>>>>>vary wildly depending on the position.
>>>>>>>
>>>>>>>i don't know why you think you have to stand up and defend bob every time
>>>>>>>somebody says something about him you don't like. just leave that up to him. he
>>>>>>>can take it :-)
>>>>>>>
>>>>>>>cheers
>>>>>>>  martin
>>>>>>
>>>>>>
>>>>>>I wasn't offended.  I hope my answer was ok.
>>>>>
>>>>>i didn't think you'd be offended, and your answer was ok, but...why don't you
>>>>>take N (preferably N>>30...) positions and compute the standard deviation of
>>>>>your speedup numbers, and the standard deviation of the average speedup? you can
>>>>>still discuss the meaning of this, but at least you have an error margin you can
>>>>>attach to your speedup. i don't see anything wrong with that!? even if the
>>>>>probability distribution is obviously not a normal distribution, you can
>>>>>probably approximate it as such, and get an idea of its width from these
>>>>>numbers.
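
(Interjecting here: the error margin martin is asking for is only a few lines to
compute.  A rough sketch in C, with made-up speedup numbers standing in for real
measurements:

  #include <math.h>
  #include <stdio.h>

  int main(void) {
    /* hypothetical per-position speedups -- made-up numbers, not real data */
    double s[] = {3.1, 4.0, 2.2, 3.9, 1.4, 3.8, 3.7, 2.9, 4.4, 3.6};
    int i, n = sizeof(s) / sizeof(s[0]);
    double sum = 0.0, var = 0.0, mean, sd, se;

    for (i = 0; i < n; i++)
      sum += s[i];
    mean = sum / n;
    for (i = 0; i < n; i++)
      var += (s[i] - mean) * (s[i] - mean);
    var /= (n - 1);                /* sample variance                      */
    sd = sqrt(var);                /* spread of the individual speedups    */
    se = sd / sqrt((double)n);     /* standard deviation of the average    */
    printf("average speedup %.2f +/- %.2f (per-position sd %.2f, n = %d)\n",
           mean, se, sd, n);
    return 0;
  }

With only 30 or so positions the +/- will of course be wide, but at least it is
a number you can attach to the speedup.)
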
>>>>>
>>>>>>This is not an easy question to deal with.
>>>>>
>>>>>>IE if you take the standard deviation of a set of random numbers between
>>>>>>0 and N what do you get?  That is what the speedup numbers look like for some
>>>>>>positions.  For others the speedup is a near-perfect constant value.  Add some
>>>>>>perfect constants plus some randomly distributed values and exactly what does
>>>>>>the SD show?  :)
>>>>>
>>>>>i don't quite understand your question. if you take enough positions, then you
>>>>>will get something sensible, i would think. if you doubt this, you can take e.g.
>>>>>10'000 sequential positions from crafty's ICC log, and bunch them together in
>>>>>groups of 1000, and compute average speedup + stdev-of-average-speedup for each
>>>>>of the bunches. i can't imagine that you get 10 wildly differing values, as your
>>>>>statement above suggests.
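
(And the bunching check martin describes is just as short.  This assumes the
10,000 per-position speedups have already been pulled out of the log into a flat
array -- a layout I am making up for illustration, not anything Crafty actually
writes:

  #include <stdio.h>

  #define NPOS  10000
  #define BUNCH 1000

  /* speedup[] holds NPOS per-position speedups, in game order */
  void bunch_averages(const double *speedup) {
    int b, i;
    for (b = 0; b < NPOS / BUNCH; b++) {
      double sum = 0.0;
      for (i = 0; i < BUNCH; i++)
        sum += speedup[b * BUNCH + i];
      printf("bunch %2d: average speedup %.2f\n", b, sum / BUNCH);
    }
    /* if the 10 averages cluster tightly, the overall average is stable;
       if they scatter wildly, that supports Bob's objection */
  }

)
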
>>>>>
>>>>>cheers
>>>>>  martin
>>>>
>>>>
>>>>It isn't so easy to get speedup.  IE how would I take a position that took X
>>>>seconds with 2 or 4 cpus and compute the 1 cpu time?  Think about it carefully
>>>>and you will see the problem.  How to get the 1-cpu test case to have a properly
>>>>loaded hash table, killer move table, history table, etc, before starting the
>>>>search???
>>>
>>>i don't understand this part at all. run the exact same test on a 1 CPU machine,
>>>and then on an N-CPU machine.
>>>the reason i don't understand this at all is that all the details you are
>>>talking about are irrelevant. i want to know what happens if i run the same test
>>>positions on a 1 CPU box or on a N CPU box. this is easy to answer, because it
>>>can be determined experimentally. whether or not there are philosophical issues
>>>as those you raise above can be discussed, but it doesn't stop you from getting
>>>your number...
>>>
>>>
>>>>The best bet is to take N positions where N is large.  But then that is not the
>>>>same as what happens in real games where the positions are connected via info
>>>>passed from search to search in the hash table.
>>>
>>>not true if you use 1000 positions from crafty's ICC log file, where the exact
>>>same thing happens, because these are also positions from real games, connected
>>>to each other.
>>
>>
>>The only way I can see how your idea would work is if crafty played a series of
>>games searching to a fixed depth with a single processor.  Then take the
>>positions from those games and set them up in the order they occurred and time
>>crafty's response to those positions to the same depth, but with 2 processors,
>>then do it all over again with 3 processors, etc.  That way for each run, the
>>effects of cache relevancy are not lost as would be the case in disconnected
>>test set positions.
>>
>>Then you might have an idea of an expected speedup in actual games, rather than
>>from disconnected test positions.
>>
>>That is a much more involved and time consuming test.  I don't think he could
>>have afforded that kind of CPU time on a CRAY.
>>
>>
>
>
>That is actually what I did.  I used a Cray for about a year.  Multi-cpu tests
>could get about 5 minutes of "dedicated whole-machine" every hour followed by a
>checkpoint and 55 minute wait for the next "slot".
>
>That's why I said it took way over a year to run all the tests to get the
>results.  Only exception was the 1 cpu tests which ran for long times, but with
>1 cpu I could rely on CPU time rather than elapsed time.


Even so, we still can't see the effect of pondering on the hash tables, since we
are searching to a fixed depth.  So the resemblance to real game conditions is
still somewhat wanting.
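
To make the procedure I described above a bit more concrete, the harness would
look roughly like this.  The function names below are placeholders I made up,
not Crafty's actual interface:

  /* placeholder interface -- not Crafty's real API */
  void   new_game(int ncpus);                 /* clear hash/killer/history, set cpu count */
  double search_to_depth(int pos, int depth); /* search one position, return seconds      */

  /* replay one game's positions, in order, with a given number of cpus */
  void run_trial(const int *positions, int npos, int depth,
                 int ncpus, double *seconds) {
    int i;
    new_game(ncpus);             /* tables start empty, as at the start of a real game */
    for (i = 0; i < npos; i++)
      /* feeding the positions in game order lets the hash, killer and
         history state carry over from move to move, as in actual play */
      seconds[i] = search_to_depth(positions[i], depth);
  }

  /* speedup for position i with c cpus = seconds_1cpu[i] / seconds_ccpu[i] */
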

I have a tangential question.  Assuming you have scads of memory, could you make
a copy of the hash table(s) and ponder using the copy?  Then, if the predicted
move was not made, you would simply "context switch" back to the before-ponder
hash table.  Would there be any benefit in that, or would the cost of the copy
operation make it impractical at short time controls?
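
Something like the following is what I have in mind.  This is only a rough
sketch with a made-up entry layout, ignoring all the SMP locking details:

  #include <string.h>
  #include <stddef.h>

  typedef struct {
    unsigned long long key;
    int depth, score, best_move;
  } HashEntry;                    /* made-up entry layout, not Crafty's      */

  HashEntry *hash;                /* working table, allocated elsewhere      */
  HashEntry *snapshot;            /* pre-ponder copy, same size              */
  size_t     hash_entries;

  void begin_ponder(void) {
    /* one big memcpy of the whole table; with a very large table this
       alone may already be too expensive at short time controls */
    memcpy(snapshot, hash, hash_entries * sizeof(HashEntry));
  }

  void opponent_moved(int prediction_correct) {
    if (!prediction_correct) {
      /* wrong prediction: "context switch" back by swapping pointers,
         which costs nothing -- the real price was paid in begin_ponder() */
      HashEntry *tmp = hash;
      hash = snapshot;
      snapshot = tmp;
    }
    /* right prediction: keep the pondered table and keep searching */
  }
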




>
>
>
>
>
>>>
>>>>It is just a very hard question to answer.  And change the positions and you can
>>>>change the answer significantly...
>>>
>>>perhaps, perhaps not. you didn't measure it, and so you can't say :-)
>>>
>>>cheers
>>>  martin


