Computer Chess Club Archives




Subject: Re: Proving something is better

Author: Peter McKenzie

Date: 01:07:23 12/18/02

Go up one level in this thread

On December 18, 2002 at 03:32:05, Ricardo Gibert wrote:

>On December 18, 2002 at 02:28:29, Tony Werten wrote:
>>On December 17, 2002 at 17:30:36, Bruce Moreland wrote:
>>>Omid wrote an article that's in the ICGA that I received today, and there are
>>>two tables:
>>>D        R=1               R=2            R=3           VR=3
>>>9     1,652,668,804    603,549,661   267,208,422    449,744,588
>>>10   11,040,766,367  1,892,829,685   862,153,828  1,449,589,289
>>>D        R=1               R=2            R=3           VR=3
>>>9        64                62             53            60
>>>10       71                66             65            71
>>>The first table is nodes taken to search to depth D with four techniques, which
>>>are standard null-move with various R values, and a "verified" null move search
>>>with R=3.
>>>The second table is the number of problem solutions in a test suite, given
>>>particular depths of search and using these techniques.
>>>The conclusion is that VR=3 is better.
>>>Does anyone else see the two problems here?
>>Yes, but 2 different ones then yours :)
>>1)R2-3 is expected to search a number of nodes between r=2 and r=3, just as v3.
>>Yet R2-3 is not in the table.
>>2) Time to solution is missing (maybe the same as your point 2). I think that
>>that table would be a good addition. Number of solutions at depth x would make
>>the algoritm "extend every move at the root with 3 ply" look very good.
>>I don't understand why time to solution is not used. The only argument I saw was
>>"it's too hardware dependend". But I don't think that a good argument since a a
>>percentile speedup ( ie 10% faster ) will still be the same no matter how much
>>faster the hardware is.
>If "better" means the TTS (=Time To Solution) for one algorithm is less than
>anothers, then consider 2 different hardware architectures X and Y with respect
>to algorithms A and B:
>On Hardware X, the TTS for algorithm A may be less than the TTS for algorithm B.
>Therefore algorithm A would be "better" than algorithm B.
>Unfortunately, it is possible that on hardware Y, the TTS for algorithm A may be
>greater than the TTS for algorithm B. In this case algorithm B would be "better"
>than algorithm A. With this example, it is seen that this standard for "better"
>leads to inconsistencies.
>On the other hand if "better" means the NNS (=Number of Nodes Searched) for one
>algorithm is less than anothers:
>On hardware X, if the NNS of algorithm A is less than the NNS for algorithm B,
>then algorithm A is "better" than algorithm B.
>If we now switch to hardware B, the NNS of each algorithm will remain the same
>of course, so the 2nd standard of "better" remains consistent, since it is
>hardware independent. This is why NNS is preferred to TTS.

NNS is preferred only by people who run their programs on theoretical computers.
The rest of us must use real hardware so we prefer TTS :-)

To put it another way, I think thing your argument is invalid and here is why:
We have 2 definitions of 'better'.

One of them, based on TTS means that the algorithm is definitely better for some
real computer system.  Yes, there actually exists a computer where we are 100%
sure that this algorithm is superior for a real world applicaton.  Is it
superior on all computers?  Well, we don't know for sure although it may be
possible to draw some inference.

Now, take the definition of 'better' based on NNS.  We don't actually know if
this algorithm will perform better for a real world application on any computer.
 All we know is that it will be consistently use less nodes on all computers
than the competing algorithm.  But it could also consistently take 20% more time
on all computers.  Of course we suspect better NNS performance will result in
better real world application performance, but this is BY NO MEANS CERTAIN.

TTS *demonstrates* that the thing really works, on at least 1 platform.
NNS suggests that the thing might work, on at least 1 platform.

>As you and others have pointed out, NNS isn't the whole story, but usually it is
>If you want to do a really good job, I would suggest that both TTS and NNS be
>used to produced a more informative comparison, but you should expect that the
>TTS part may become completely obsolete after enough time. Even though the TTS
>measurements may become irrelevant for future computer architectures, at least
>it will be helpful over the short term.
>For a sufficiently large problem, "big-oh" notation does not suffer any
>exceptions. This is why it used in computer science. Unfortunately it is not
>always useful for comparing computer chess algorithms.
>>Furthermore, time to solution is the only thing I care about. My program has to
>>produce a good move within a certain time, not within a certain amount of nodes
>>or depth.
>>>1) The amount of time taken to process a leaf node may very well be less than
>>>the amount of time taken to process an interior node (not including its
>>>children, of course).  So if the tree shape changes, it is possible that it
>>>could take longer to search a smaller tree.  Unless that is ruled out, all that
>>>has been proven is that VR=3 works in fewer nodes.  That is pretty interesting,
>>>but I think it makes a stronger case if *time* is used as well, so we will know
>>>that there is at least one real case where this technique makes the program
>>>2) The amount of nodes traversed to get through ply 10, with R=3, is about 60%
>>>of the number taken to get through ply 10 with VR=3.  It can be assumed (perhaps
>>>wrongly, due to my previous point) that this search takes 60% as long.  The
>>>number of solutions found by R=3 is fewer than with VR=3, granted.  But what is
>>>the R=3 version doing while the other version is trying to finish up ply 10?  It
>>>is going on to ply 11.  How many solutions has it found in ply 11 before VR=3
>>>has finished ply 10?  In this case, 65 is sufficiently less than 71 that the
>>>answer is probably less then 71.  But maybe not!
>>>It seems likely that VR=3 is tactically faster than R=3, but I cannot know for
>>>sure, since the results have not been reported!  We do not know if VR=3 is
>>>tactically *faster* than R=3.  Isn't that an important point, since all of us
>>>who read this article are wondering if we can stick this in our own programs and
>>>obtain benefit?
>>>I have seen people report results like this forever.  I wish that they would use
>>>a the more sensible method of reporting number of solutions found correct in a
>>>certain amount of *time*, since that is the true measure of tactical speed.
>>>Nothing personal, Omid.
>>>I'm trying to verify your results by using ECM with Gerbil.  First I have to get
>>>good numbers for R=2 and R=3.
>>>By the way, if anyone wants to take Gerbil and use it as a second example when
>>>writing articles like this, I'm all for it.  It's simple so it should be pretty
>>>easy to modify.

This page took 0.01 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.