Computer Chess Club Archives




Subject: Re: Proving something is better

Author: Robert Hyatt

Date: 19:43:45 12/18/02

Go up one level in this thread

On December 18, 2002 at 17:18:56, Bruce Moreland wrote:

>On December 18, 2002 at 16:20:16, Robert Hyatt wrote:
>>That's not what I was talking about.  There are two ways you can compare
>>two chess algorithms, logically.
>>1.  time until it finds the solution move with the right score.
>>2.  time until it completes a particular search depth.
>>If you are playing with search extensions (or de-extensions in the case of
>>null-move) 2 has an obvious problem.  Because search depths don't mean much
>>when you are trying to find a move.
>>If you are playing with search extensions or whatever, it is possible to
>>greatly change the shape of the search tree, so that time to solution looks
>>good, but time to depth looks bad, or vice-versa.
>I don't consider this case.  To make sure that I understand what you're talking
>about, it sounds like this case is when the thing finds the answer, but then
>goes to hell and locks up.

Basically, yes.  If a program does that, time to solution looks good, but
time to depth looks horrible.  The point was that it could have a severe bug
and still look good by one metric...

>You can take a less pathological case and demonstrate that this can be normal.
>If you find an answer in ply 9, you will have resolved the PV twice, which takes
>time.  If you don't find the answer in ply 9, the solution move will have failed
>low, which typically takes less time.
>Personally, I'd rather have ply 9 take longer to complete, if it gets me the
>best score.

I don't disagree.  And this is something the "verification search" tends to
influence.  IE the remainder of the ply-1 moves don't whiz by as fast, but
they tend to be searched a bit more accurately under some conditions...

>So I don't look at time taken to finish the ply.  I'm interested in time taken
>to find the move and hold it, regardles of anything else that happens, because
>that's what happens in a game.

There I agree.  But then you run into the problem I had when I did the DTS
paper.  Displaying the node count in the middle of a search is non-trivial
when things are not synchronized, so that reporting nodes for a partial search
(for me) is impossible.  Reporting nodes when an iteration ends is easy since
at that point every processor is idle and not searching anything.

>I try to avoid the problem of finding a few solutions for bad reasons by doing a
>lot of tests.  If version A gets 30 more of them than version B, that's pretty
>strongly indicative.
>But this doesn't address the problem I pointed out with Omid's thing:
>A takes 30 seconds, produces 50 answers.
>B takes 40 seconds, produces 55 answers.
>It makes absolutely no sense to say that B is better than A, and if the time
>differential is large enough, and the difference in number of solutions is small
>enough, it may make sense to say the reverse.

I don't disagree there...

>>I'm personally not convinced there is _any_ way to _really_ compare two
>>algorithms except to play them in a long match.  Tactically stronger !=
>>stronger overall, in all cases.  Tactically weaker != weaker overall, in
>>all cases.
>There are lots of problems with this, too.
>>I think that for times, _both_ approaches are needed.  measuring the time
>>until the correct PV pops out has problems, as does searching to a fixed depth.
>>Unless you simply search to the depth required to find the solution, with each
>>algorithm.  And even that is not so easy to compare.
>Doing a search to fixed depth only make sense if you are trying to test pure
>performance modifications, which at most obvious are algorithm improvements that
>do not change the shape of the tree, but perhaps you can extend this to include
>changes that affect the order in which the tree is searched (move ordering
>issues).  To test tactical changes this way is crazy.
>>You can compare node counts.  Search times to find a solution, or search times
>>to search to a specific depth.  All three reveal different things about the
>>program...  relying on one is not something I do.  I try to look at them all
>>to get a better overall picture of what is happening...

This page took 0.08 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.