Computer Chess Club Archives


Subject: Re: A more meaningful test?

Author: Robert Hyatt

Date: 10:02:26 05/05/04


On May 05, 2004 at 12:25:36, Matthew Hull wrote:

>On May 05, 2004 at 12:21:31, Robert Hyatt wrote:
>
>>On May 05, 2004 at 12:12:59, Matthew Hull wrote:
>>
>>>On May 05, 2004 at 11:37:50, Robert Hyatt wrote:
>>>
>>>>On May 05, 2004 at 10:25:23, Matthew Hull wrote:
>>>>
>>>>>On May 05, 2004 at 04:44:19, martin fierz wrote:
>>>>>
>>>>>>On May 04, 2004 at 21:06:30, Robert Hyatt wrote:
>>>>>>
>>>>>>>On May 04, 2004 at 18:21:07, martin fierz wrote:
>>>>>>>
>>>>>>>>On May 04, 2004 at 13:44:21, Robert Hyatt wrote:
>>>>>>>>
>>>>>>>>>On May 04, 2004 at 10:49:42, martin fierz wrote:
>>>>>>>>>
>>>>>>>>>>On May 04, 2004 at 07:32:01, Rolf Tueschen wrote:
>>>>>>>>>>
>>>>>>>>>>>On May 04, 2004 at 07:11:15, martin fierz wrote:
>>>>>>>>>>>
>>>>>>>>>>>>On May 03, 2004 at 22:50:58, Robert Hyatt wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>If you recall, I _have_ given some error estimates in the past.   Remember the
>>>>>>>>>>>>>wildly varying speedup numbers I showed you the first time this issue came up?
>>>>>>>>>>>>
>>>>>>>>>>>>i recall that you gave wildly varying speedup numbers, and an explanation for
>>>>>>>>>>>>why this happens. i  don't recall a real error estimate, but that can be either
>>>>>>>>>>>>because
>>>>>>>>>>>>-> you gave one and i didn't see it
>>>>>>>>>>>>-> you gave one, i saw it and forgot
>>>>>>>>>>>>-> you didn't give one at all
>>>>>>>>>>>>
>>>>>>>>>>>>so... what kind of numbers would you give if you were pressed?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Isn't it impolite to imply the third option if Bob JUST said that he did give
>>>>>>>>>>>some?
>>>>>>>>>>
>>>>>>>>>>no - asking questions always has to be allowed among scientists. forbidding to
>>>>>>>>>>ask questions is the hallmark of religious fanatics and fascists... but i
>>>>>>>>>>digress :-)
>>>>>>>>>>bob says he gave numbers, which he did. but IIRC, he never gave an error
>>>>>>>>>>estimate. so i am allowed to ask for it, and it is not at all impolite to do so.
>>>>>>>>>>what he did show is the speedup in about 30 different positions, which could
>>>>>>>>>>vary wildly depending on the position.
>>>>>>>>>>
>>>>>>>>>>i don't know why you think you have to stand up and defend bob every time
>>>>>>>>>>somebody says something about him you don't like. just leave that up to him. he
>>>>>>>>>>can take it :-)
>>>>>>>>>>
>>>>>>>>>>cheers
>>>>>>>>>>  martin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>I wasn't offended.  I hope my answer was ok.
>>>>>>>>
>>>>>>>>i didn't think you'd be offended, and your answer was ok, but...why don't you
>>>>>>>>take N (preferably N>>30...) positions and compute the standard deviation of
>>>>>>>>your speedup numbers, and the standard deviation of the average speedup? you can
>>>>>>>>still discuss the meaning of this, but at least you have an error margin you can
>>>>>>>>attach to your speedup. i don't see anything wrong with that!? even if the
>>>>>>>>probability distribution is obviously not a normal distribution, you can
>>>>>>>>probably approximate it as such, and get an idea of its width from these
>>>>>>>>numbers.
>>>>>>>>
>>>>>>>>>This is not an easy question to deal with.
>>>>>>>>
>>>>>>>>>IE if you take the standard deviation of a set of random numbers between
>>>>>>>>>0 and N what do you get?  That is what the speedup numbers look like for some
>>>>>>>>>positions.  For others the speedup is a near-perfect constant value.  Add some
>>>>>>>>>perfect constants plus some randomly distributed values and exactly what does
>>>>>>>>>the SD show?  :)
>>>>>>>>
>>>>>>>>i don't quite understand your question. if you take enough positions, then you
>>>>>>>>will get something sensible, i would think. if you doubt this, you can take e.g.
>>>>>>>>10'000 sequential positions from crafty's ICC log, and bunch them together in
>>>>>>>>groups of 1000, and compute average speedup + stdev-of-average-speedup for each
>>>>>>>>of the bunches. i can't imagine that you get 10 wildly differing values, as your
>>>>>>>>statement above suggests.
>>>>>>>>
>>>>>>>>cheers
>>>>>>>>  martin
>>>>>>>
>>>>>>>
>>>>>>>It isn't so easy to get speedup.  IE how would I take a position that took X
>>>>>>>seconds with 2 or 4 cpus and compute the 1 cpu time?  Think about it carefully
>>>>>>>and you will see the problem.  How to get the 1-cpu test case to have a properly
>>>>>>>loaded hash table, killer move table, history table, etc, before starting the
>>>>>>>search???
>>>>>>
>>>>>>i don't understand this part at all. run the exact same test on a 1 CPU machine,
>>>>>>and then on an N-CPU machine.
>>>>>>the reason i don't understand this at all is that all the details you are
>>>>>>talking about are irrelevant. i want to know what happens if i run the same test
>>>>>>positions on a 1 CPU box or on a N CPU box. this is easy to answer, because it
>>>>>>can be determined experimentally. whether or not there are philosophical issues
>>>>>>as those you raise above can be discussed, but it doesn't stop you from getting
>>>>>>your number...
>>>>>>
>>>>>>
>>>>>>>The best bet is to take N positions where N is large.  But then that is not the
>>>>>>>same as what happens in real games where the positions are connected via info
>>>>>>>passed from search to search in the hash table.
>>>>>>
>>>>>>not true if you use 1000 positions from crafty's ICC log file, where the exact
>>>>>>same thing happens, because these are also positions from real games, connected
>>>>>>to each other.
>>>>>
>>>>>
>>>>>The only way I can see how your idea would work is if crafty played a series of
>>>>>games searching to a fixed depth with a single processor.  Then take the
>>>>>positions from those games and set them up in the order they occurred and time
>>>>>crafty's response to those positions to the same depth, but with 2 processors,
>>>>>then do it all over again with 3 processors, etc.  That way for each run, the
>>>>>effects of cache relevancy are not lost as would be the case in disconnected
>>>>>test set positions.
>>>>>
>>>>>Then you might have an idea of an expected speedup in actual games, rather than
>>>>>from disconnected test positions.
>>>>>
>>>>>That is a much more involved and time-consuming test.  I don't think he could
>>>>>have afforded that kind of CPU time on a CRAY.
>>>>>
>>>>>
>>>>
>>>>
>>>>That is actually what I did.  I used a Cray for about a year.  Multi-cpu tests
>>>>could get about 5 minutes of "dedicated whole-machine" every hour followed by a
>>>>checkpoint and 55 minute wait for the next "slot".
>>>>
>>>>That's why I said it took way over a year to run all the tests to get the
>>>>results.  Only exception was the 1 cpu tests which ran for long times, but with
>>>>1 cpu I could rely on CPU time rather than elapsed time.
>>>
>>>
>>>Even so, we still can't see the effect of pondering on the hash tables, since we
>>>are searching to a fixed depth.  So the resemblance to real game conditions is
>>>still somewhat wanting.
>>
>>I agree.  I fiddled with this during the CB testing.  IE it should be possible
>>to mimic the effect by doing a ponder search for a fixed period of time, then
>>reading in the move and continuing.  I did this.  But on occasion the wrong move
>>gets played and the _rest_ of the test is ruined...  I would have to re-read the
>>DTS article to remember exactly how I did this stuff since we are talking 10
>>years ago.
>>
>>
>>>
>>>I have a tangential question.  Assuming you have scads of memory, could you make
>>>a copy of the hash table(s) and ponder using the copy, then if the predicted
>>>move was not made, simply "context switch" to the before-ponder hash table?
>>>Would there be any benefit in that?  Would such a copy operation not do for
>>>short time controls?
>>>
>>>
>>
>>
>>I've never had that much memory, even on a Cray.  :)
>
>
>Is there any merit to the idea in principle?

My first guess is no.  The reasoning is that the hash table before the pondering
is produced from a depth D search starting at move N.  When pondering, we will
probably do a depth D search starting at move N+1.  Many times, even if the
predicted move is not played, we are dealing with nothing but a transposition of
move order so that the pondered hash stuff still helps.

It would certainly do a lot for repeatability however, if you want to play back
over the game and get the same moves/scores/times...
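
The "copy the hash table, ponder on the copy, switch back on a miss" idea can be
sketched roughly as below.  This is only a toy illustration in Python, not how
Crafty or any real engine stores its table: the class name PonderTable, its
methods, and the (depth, score) entry layout are all hypothetical, and a real
table would be a large fixed-size array where the copy itself costs time and
memory.

```python
class PonderTable:
    """Toy transposition table that can be rolled back after a ponder miss.

    Hypothetical sketch: entries are (depth, score) tuples keyed by a
    position hash. Real engines use fixed-size arrays, not dicts.
    """

    def __init__(self):
        self.table = {}
        self._snapshot = None

    def store(self, key, depth, score):
        # Depth-preferred replacement: keep the deeper of the two entries.
        old = self.table.get(key)
        if old is None or depth >= old[0]:
            self.table[key] = (depth, score)

    def begin_ponder(self):
        # Save the pre-ponder contents. Entries are immutable tuples,
        # so a shallow copy of the dict is enough.
        self._snapshot = dict(self.table)

    def end_ponder(self, predicted_move_played):
        # On a ponder miss, "context switch" back to the saved table.
        # On a ponder hit, keep everything the ponder search stored.
        if not predicted_move_played and self._snapshot is not None:
            self.table = self._snapshot
        self._snapshot = None
```

Rolling back discards whatever the ponder search stored, including entries
that would still be useful after a transposition, which is the cost described
above.  What it buys is that a replay of the game starts each search from the
same table contents.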


>>>>>>
>>>>>>>It is just a very hard question to answer.  And change the positions and you can
>>>>>>>change the answer significantly...
>>>>>>
>>>>>>perhaps, perhaps not. you didn't measure it, and so you can't say :-)
>>>>>>
>>>>>>cheers
>>>>>>  martin
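
The error estimate martin asks for (standard deviation of the per-position
speedups, standard deviation of the average, and the same thing per bunch of
consecutive positions) could be computed along these lines.  This is a minimal
sketch: the function names are made up for illustration, and the demo data is
synthetic (a mix of near-constant and uniformly random speedups, as described
in the thread), not measured Crafty or Cray Blitz results.

```python
import math
import random
import statistics


def speedup_stats(speedups):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    mean = statistics.mean(speedups)
    sd = statistics.stdev(speedups)
    sem = sd / math.sqrt(len(speedups))
    return mean, sd, sem


def batch_means(speedups, batch_size):
    """Average speedup for each full bunch of consecutive positions.

    A trailing partial bunch is dropped.
    """
    last_start = len(speedups) - batch_size + 1
    return [
        statistics.mean(speedups[i:i + batch_size])
        for i in range(0, last_start, batch_size)
    ]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-in for 10,000 positions on a 4-cpu machine:
    # half near-constant speedups, half uniformly random between 0 and 4.
    data = [3.1] * 5000 + [random.uniform(0.0, 4.0) for _ in range(5000)]
    random.shuffle(data)

    mean, sd, sem = speedup_stats(data)
    print(f"mean={mean:.3f} sd={sd:.3f} sem={sem:.4f}")
    for i, m in enumerate(batch_means(data, 1000)):
        print(f"bunch {i}: mean speedup {m:.3f}")
```

With real data, the list would hold one value per position (1-cpu time divided
by N-cpu time), in game order so that bunches stay connected.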



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.