Computer Chess Club Archives




Subject: Re: Proving something is better

Author: Ricardo Gibert

Date: 00:32:05 12/18/02

Go up one level in this thread

On December 18, 2002 at 02:28:29, Tony Werten wrote:

>On December 17, 2002 at 17:30:36, Bruce Moreland wrote:
>>Omid wrote an article that's in the ICGA that I received today, and there are
>>two tables:
>>D        R=1               R=2            R=3           VR=3
>>9     1,652,668,804    603,549,661   267,208,422    449,744,588
>>10   11,040,766,367  1,892,829,685   862,153,828  1,449,589,289
>>D        R=1               R=2            R=3           VR=3
>>9        64                62             53            60
>>10       71                66             65            71
>>The first table is nodes taken to search to depth D with four techniques, which
>>are standard null-move with various R values, and a "verified" null move search
>>with R=3.
>>The second table is the number of problem solutions in a test suite, given
>>particular depths of search and using these techniques.
>>The conclusion is that VR=3 is better.
>>Does anyone else see the two problems here?
>Yes, but 2 different ones then yours :)
>1)R2-3 is expected to search a number of nodes between r=2 and r=3, just as v3.
>Yet R2-3 is not in the table.
>2) Time to solution is missing (maybe the same as your point 2). I think that
>that table would be a good addition. Number of solutions at depth x would make
>the algoritm "extend every move at the root with 3 ply" look very good.
>I don't understand why time to solution is not used. The only argument I saw was
>"it's too hardware dependend". But I don't think that a good argument since a a
>percentile speedup ( ie 10% faster ) will still be the same no matter how much
>faster the hardware is.

If "better" means the TTS (=Time To Solution) for one algorithm is less than
anothers, then consider 2 different hardware architectures X and Y with respect
to algorithms A and B:

On Hardware X, the TTS for algorithm A may be less than the TTS for algorithm B.
Therefore algorithm A would be "better" than algorithm B.

Unfortunately, it is possible that on hardware Y, the TTS for algorithm A may be
greater than the TTS for algorithm B. In this case algorithm B would be "better"
than algorithm A. With this example, it is seen that this standard for "better"
leads to inconsistencies.

On the other hand if "better" means the NNS (=Number of Nodes Searched) for one
algorithm is less than anothers:

On hardware X, if the NNS of algorithm A is less than the NNS for algorithm B,
then algorithm A is "better" than algorithm B.

If we now switch to hardware B, the NNS of each algorithm will remain the same
of course, so the 2nd standard of "better" remains consistent, since it is
hardware independent. This is why NNS is preferred to TTS.

As you and others have pointed out, NNS isn't the whole story, but usually it is

If you want to do a really good job, I would suggest that both TTS and NNS be
used to produced a more informative comparison, but you should expect that the
TTS part may become completely obsolete after enough time. Even though the TTS
measurements may become irrelevant for future computer architectures, at least
it will be helpful over the short term.

For a sufficiently large problem, "big-oh" notation does not suffer any
exceptions. This is why it used in computer science. Unfortunately it is not
always useful for comparing computer chess algorithms.

>Furthermore, time to solution is the only thing I care about. My program has to
>produce a good move within a certain time, not within a certain amount of nodes
>or depth.
>>1) The amount of time taken to process a leaf node may very well be less than
>>the amount of time taken to process an interior node (not including its
>>children, of course).  So if the tree shape changes, it is possible that it
>>could take longer to search a smaller tree.  Unless that is ruled out, all that
>>has been proven is that VR=3 works in fewer nodes.  That is pretty interesting,
>>but I think it makes a stronger case if *time* is used as well, so we will know
>>that there is at least one real case where this technique makes the program
>>2) The amount of nodes traversed to get through ply 10, with R=3, is about 60%
>>of the number taken to get through ply 10 with VR=3.  It can be assumed (perhaps
>>wrongly, due to my previous point) that this search takes 60% as long.  The
>>number of solutions found by R=3 is fewer than with VR=3, granted.  But what is
>>the R=3 version doing while the other version is trying to finish up ply 10?  It
>>is going on to ply 11.  How many solutions has it found in ply 11 before VR=3
>>has finished ply 10?  In this case, 65 is sufficiently less than 71 that the
>>answer is probably less then 71.  But maybe not!
>>It seems likely that VR=3 is tactically faster than R=3, but I cannot know for
>>sure, since the results have not been reported!  We do not know if VR=3 is
>>tactically *faster* than R=3.  Isn't that an important point, since all of us
>>who read this article are wondering if we can stick this in our own programs and
>>obtain benefit?
>>I have seen people report results like this forever.  I wish that they would use
>>a the more sensible method of reporting number of solutions found correct in a
>>certain amount of *time*, since that is the true measure of tactical speed.
>>Nothing personal, Omid.
>>I'm trying to verify your results by using ECM with Gerbil.  First I have to get
>>good numbers for R=2 and R=3.
>>By the way, if anyone wants to take Gerbil and use it as a second example when
>>writing articles like this, I'm all for it.  It's simple so it should be pretty
>>easy to modify.

This page took 0.02 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.