Computer Chess Club Archives

Subject: Re: Proving something is better

Author: Dann Corbit

Date: 16:10:42 12/17/02

On December 17, 2002 at 18:55:35, Bruce Moreland wrote:

>On December 17, 2002 at 18:06:09, Dann Corbit wrote:
>
>>Time to solution and number of solutions is the key for test problems (to my way
>>of thinking).  If we count nodes, the slow searchers will look very bad.
>>
>>I think a bigger problem with test suites is that there is a plethora of
>>excellent tactical test suites and a dearth of excellent positional test suites.
>>So we measure tactical prowess effectively, but this does not necessarily
>>translate into excellence in play.
>
>Counting nodes means nothing if you are going to compare them between programs,
>which Omid is not doing, so no knock on Omid there.

Neither by me.  I think his paper was genuinely interesting.

>With this kind of thing, what you are trying to do is get (presumably) the same
>result in less time.
>
>Fewer nodes is not *quite* the same as less time, even with the same program.
>It's like measuring someone's height by measuring their shadow.  It is more
>direct to measure their height, even though measuring their shadow will get you
>close.
>
>Getting to the same depth in shorter time is also not quite the same, if you
>don't get at least as many problems correct.
>
>Neither is getting to the same depth in more time and then saying this is better
>because you find more solutions.
>
>You have to compare apples with apples or you've *proven* nothing, all you've
>done is *implied* something.  That's not science.
>
>This has nothing to do with "excellence" in play.  The whole idea is to take
>what excellence is there and make it faster.
>
>Is this technique faster?  The data doesn't say.
>
>And this is a chronic problem when people write articles on search techniques.

I think perhaps a good measure of ability would be to take a set such as WAC and
normalize it with a good engine on a platform of known strength.  The time limit
would be (perhaps) 5 seconds per position, and the square root of the sum of the
squared solution times would be used as a measure.

Let's suppose that on a 1GHz machine, Crafty solves 297/300 and that the square
root of the sum of the time squared was 300.  If two programs solve an equal
number of problems, then we use the time for a measure of goodness.  If not,
then the number of solutions will be more important.
Now, we will have a test that should be fairly reproducible.  Repeat this test
procedure for a dozen or so test sets.
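
A minimal sketch of the scoring described above, in Python.  The names
(rss_time, Result, rank) and the sample timings are made up for illustration.
It also assumes an unsolved position is charged the full 5-second limit, which
is not specified above.

```python
import math
from dataclasses import dataclass

TIME_LIMIT = 5.0  # seconds per position, as suggested above


def rss_time(times):
    """Square root of the sum of the squared solution times."""
    return math.sqrt(sum(t * t for t in times))


@dataclass
class Result:
    name: str
    solved: int   # number of positions solved
    times: list   # one time per position; assumption: unsolved = TIME_LIMIT

    def key(self):
        # More solutions ranks first; with equal solutions, the smaller
        # time measure ranks first.
        return (-self.solved, rss_time(self.times))


def rank(results):
    """Return the results ordered from best to worst."""
    return sorted(results, key=Result.key)


if __name__ == "__main__":
    # Hypothetical three-position suite, for illustration only.
    results = [
        Result("C", 2, [0.1, 0.1, TIME_LIMIT]),
        Result("B", 3, [1.0, 1.0, 1.0]),
        Result("A", 3, [0.1, 0.2, 0.3]),
    ]
    for r in rank(results):
        print(r.name, r.solved, round(rss_time(r.times), 3))
```

Sorting on a (solutions, time) pair keeps the two criteria separate, so a
fast program never outranks one that solves more positions.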

After all, when playing chess, two things are important:
1.  Getting the right answer.
2.  Getting it fast.

If other programs were tested under a similar setup, we might find some
interesting results.  For instance, if one program averages 1/10 of a second to
solve problems, even though it solves the same number, it would probably
dominate over a program that takes 1 second on average to solve them.  Of
course, it might not scale cleanly to longer time controls, but it seems nobody
has the patience to test them like that.

I suggest taking the square root of the sum of the squares to reduce the effect
of positions that are abnormally quick or slow to solve.  Then the general
ability will be more clearly seen.  A straight arithmetic average could easily
be bent by outliers.
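
One way to check how each measure reacts to an abnormal position is to compute
both on the same timings, with and without a slow outlier.  This is only a
sketch.  The timings are made up and the helper names (rss, mean, sensitivity)
are illustrative.

```python
import math


def rss(times):
    """Square root of the sum of the squared times."""
    return math.sqrt(sum(t * t for t in times))


def mean(times):
    """Plain arithmetic average of the times."""
    return sum(times) / len(times)


def sensitivity(measure, base, perturbed):
    """How many times larger the measure becomes on the perturbed data."""
    return measure(perturbed) / measure(base)


# Hypothetical timings in seconds: ten quick solutions, then the same
# set with one position taking the full 5 seconds.
typical = [0.1] * 10
with_outlier = [0.1] * 9 + [5.0]

if __name__ == "__main__":
    for label, measure in (("mean", mean), ("rss", rss)):
        print(label, sensitivity(measure, typical, with_outlier))
```

Because each time is squared before summing, a single slow position adds more
to the root of the sum of squares than to the arithmetic average, so it is
worth running a comparison like this on real data before settling on a measure.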

