Computer Chess Club Archives

Subject: Testing Chess Programs

Author: Tom Likens

Date: 16:51:29 04/12/04

On April 12, 2004 at 19:15:01, Christophe Theron wrote:

>On April 12, 2004 at 16:44:04, Tord Romstad wrote:
>
>>On April 12, 2004 at 14:45:28, Christophe Theron wrote:
>>
>>>On April 12, 2004 at 07:50:47, Tord Romstad wrote:
>>>>
>>>>Assume that you make a change to your engine which improves the playing strength
>>>>by about 10 Elo points.  How many hours of CPU time do you need before you are
>>>>sure that the change was an improvement?
>>>>
>>>
>>>I would say approximately one week, and I would not even be really sure it is an
>>>improvement. We are talking about a 1.5% improvement in winning percentage here,
>>>which is below the statistical noise of a match of several hundred games if you
>>>want 95% reliability!
>>
>>Thanks, Christophe!
>>
>>Reading this is actually a great relief to me.  I wondered if you had invented
>>some kind of magic which enabled you to find tiny improvements in much shorter
>>time.
>>
>>>And unfortunately a 10 elo points improvement is becoming rare for me. Most of
>>>the changes I try make the program weaker, and many changes do not provide any
>>>measurable improvement!
>>
>>I have no difficulties believing this.  My engine is still at least 200 points
>>weaker than yours, and I have exactly the same experience.
>>
>>>That's why not having a strong test methodology is totally out of the question
>>>if you are serious about chess programming.
>>
>>Yes.  It is extremely difficult for me, because I am a very impatient person.
>>When I make a small change to my engine, I rarely have enough time to play
>>enough games to determine whether it is an improvement, because I have a dozen
>>new ideas I want to try before my first test matches are finished.
>
>
>
>That's the real motivation killer. I also have many ideas, and when I want to try
>them I realize I'm currently testing another idea and that the running test will
>not be over until next week. So I have to wait one week before I can start
>testing, and another week to know the result.
>
>Two weeks from now, my interests will clearly have switched to another idea.
>
>That makes computer chess programming more and more boring.
>
>
>
>    Christophe

I was thinking about this *exact* problem on the way home from work today.
The only solution I could come up with was to add more computers and thus
attack the problem in parallel.  I currently have three computers I can
dedicate to running various test matches.  If I could validate an idea in
roughly two days, then this problem wouldn't be so bad.  A week, as both you
and Tord point out, is difficult.  Adding more CPUs to the problem would
make this possible, but it might also turn me into a bachelor again!!

As I mentioned previously, testing is my primary focus for the next few weeks.
If I come up with anything interesting I'll share it (not being commercial
does have a few advantages).  Also, don't hesitate to *not* share anything:
since you make your living at this, I can appreciate your position.  If
someone asked me to design an integrated circuit for free, I might be
reluctant to do so (especially if it resulted in my not being able to design
one for a paycheck in the future).
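
To put rough numbers on the figures quoted above (10 Elo, about 1.5% in
winning percentage, 95% reliability), here is a minimal sketch.  It assumes
the standard logistic Elo formula, independent games, a normal approximation
and a plain fixed-length match; it is not anybody's actual test procedure.
The function names, the draw_ratio parameter and the minutes-per-game figure
are all made up for illustration.

```python
import math

def expected_score(elo_diff):
    """Expected score (0..1) for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def per_game_std(score, draw_ratio):
    """Standard deviation of one game's result (win=1, draw=0.5, loss=0)."""
    win = score - draw_ratio / 2.0            # probability of a win
    return math.sqrt(win + 0.25 * draw_ratio - score ** 2)

def margin(n_games, draw_ratio=0.0, z=1.96):
    """Half-width of the ~95% interval on the score of an evenly matched pair."""
    return z * per_game_std(0.5, draw_ratio) / math.sqrt(n_games)

def games_needed(elo_diff, draw_ratio=0.0, z=1.96):
    """Games until the ~95% margin shrinks below the edge given by elo_diff."""
    score = expected_score(elo_diff)
    edge = score - 0.5
    return math.ceil((z * per_game_std(score, draw_ratio) / edge) ** 2)

def wall_clock_days(games, minutes_per_game, machines):
    """Days of wall-clock time if the games are split evenly over the machines."""
    return games * minutes_per_game / machines / (60 * 24)

if __name__ == "__main__":
    print("edge for +10 Elo:", expected_score(10) - 0.5)
    print("95% margin after 400 games:", margin(400))
    print("games needed, no draws:", games_needed(10))
    print("games needed, 30% draws (assumed):", games_needed(10, draw_ratio=0.3))
    # 10 minutes per game is an assumed figure, not taken from this thread.
    print("days on 3 machines:", wall_clock_days(games_needed(10), 10, 3))
```

Under these assumptions the noise after a few hundred games is several times
larger than the edge being measured, and the number of games needed runs into
the thousands.  Extra machines divide the wall-clock time but not the number
of games.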

regards,
--tom

>
>
>>>Even with a good test methodology chess programming is still an art: in many
>>>cases you have to decide with your feelings, because the raw data does not give
>>>you a definite answer.
>>>
>>>Now of course there are small improvements that I do not even need to test for a
>>>long time: if I find a way to make my program 10% faster without changing the
>>>shape of the tree, then all I need to do is run some safety tests that will only
>>>look at the number of nodes searched on a large set of positions and compare it
>>>to the last stable version.
>>
>>Yes.  That is the only good reason I see to do any low-level optimization (for
>>weak amateur engines like mine, that is -- the situation is of course entirely
>>different for authors of top engines).  If I manage to make my engine a few
>>percent faster without changing the eval or the shape of the tree, I can be 100%
>>sure that the change was an improvement, without doing lots of time-consuming
>>tests.
>>
>>Tord
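
The "safety test" described above (compare the number of nodes searched on a
large set of positions against the last stable version) could be sketched
roughly as follows.  This is only an illustration: it assumes both builds
were run to the same fixed depth with the same hash size and a single thread,
so that node counts are reproducible, and that the counts have already been
collected into dictionaries keyed by position.  The function name and the
data layout are invented here, not taken from any engine.

```python
def tree_shape_mismatches(stable_nodes, candidate_nodes):
    """Return {position: (stable_count, candidate_count)} wherever counts differ.

    A position missing from one of the two runs is reported with None
    on that side.
    """
    mismatches = {}
    for position in sorted(stable_nodes.keys() | candidate_nodes.keys()):
        old = stable_nodes.get(position)
        new = candidate_nodes.get(position)
        if old != new:
            mismatches[position] = (old, new)
    return mismatches

if __name__ == "__main__":
    # Made-up node counts, for illustration only.
    stable = {"pos1": 120345, "pos2": 98012, "pos3": 455001}
    candidate = {"pos1": 120345, "pos2": 98013, "pos3": 455001}
    diffs = tree_shape_mismatches(stable, candidate)
    if diffs:
        print("tree shape changed:", diffs)
    else:
        print("node counts identical; only speed needs comparing")
```

If the result is empty, the speed-up did not change the tree, and comparing
time-to-depth is enough.  Any mismatch means the change is not a pure
optimization and needs real test games.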



Last modified: Thu, 15 Apr 21 08:11:13 -0700
