Computer Chess Club Archives



Subject: Testing Chess Programs

Author: Tom Likens

Date: 16:51:29 04/12/04



On April 12, 2004 at 19:15:01, Christophe Theron wrote:

>On April 12, 2004 at 16:44:04, Tord Romstad wrote:
>
>>On April 12, 2004 at 14:45:28, Christophe Theron wrote:
>>
>>>On April 12, 2004 at 07:50:47, Tord Romstad wrote:
>>>>
>>>>Assume that you make a change to your engine which improves the playing strength
>>>>by about 10 Elo points.  How many hours of CPU time do you need before you are sure
>>>>that the change was an improvement?
>>>>
>>>
>>>I would say approximately one week, and I would not even be really sure it is an
>>>improvement. We are talking about a 1.5% improvement in winning percentage here;
>>>that is below the statistical noise of a match of several hundred games if you
>>>want 95% reliability!
>>
>>Thanks, Christophe!
>>
>>Reading this is actually a great relief to me.  I wondered if you had invented
>>some kind of magic which enabled you to find tiny improvements in much shorter time.
>>
>>>And unfortunately a 10 elo points improvement is becoming rare for me. Most of
>>>the changes I try make the program weaker, and many changes do not provide any
>>>measurable improvement!
>>
>>I have no difficulties believing this.  My engine is still at least 200 points
>>weaker than yours, and I have exactly the same experience.
>>
>>>That's why not having a strong test methodology is totally out of the question if
>>>you are serious about chess programming.
>>
>>Yes.  It is extremely difficult for me, because I am a very impatient person.
>>When I make a small change to my engine, I rarely have enough time to play enough
>>games to determine whether it is an improvement, because I have a dozen new ideas
>>I want to try before my first test matches are finished.
>
>
>
>That's the real motivation killer. I also have many ideas and when I want to try
>them I realize I'm currently testing another idea and that the test running will
>not be over until next week. So I have to wait for one week before I can start
>testing, and another week to know the result.
>
>Two weeks from now, my interests will clearly have switched to another idea.
>
>That makes computer chess programming more and more boring.
>
>
>
>    Christophe

I was thinking about this *exact* problem on the way home from work today.
The only solution I could come up with was to add more computers and thus
attack the problem in parallel.  I currently have three computers I can
dedicate to running various test matches; if I could validate an idea in
roughly two days, then this problem wouldn't be so bad.  A week, as both you
and Tord point out, is difficult.  Adding more CPUs to the problem would
make this possible, but it might also turn me into a bachelor again!!
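
For a rough sense of the numbers behind the "10 Elo" and "95%" figures quoted
above, here is a minimal Python sketch. It is only an illustration, not
something anyone in this thread ran. It assumes the standard logistic Elo
model and a normal approximation; the draw rate and the games-per-day
throughput are made-up inputs, and the function names are invented for this
example.

```python
import math


def expected_score(elo_diff):
    """Expected score (0..1) of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff, draw_rate=0.0, z=1.96):
    """Approximate games needed before the z-sigma interval excludes 50%.

    elo_diff must be positive. draw_rate is an assumed fraction of drawn
    games; z=1.96 corresponds to a two-sided 95% normal interval.
    """
    score = expected_score(elo_diff)
    win_rate = score - draw_rate / 2.0
    # Per-game variance of a result drawn from {0, 0.5, 1}.
    variance = win_rate + draw_rate / 4.0 - score ** 2
    margin = score - 0.5
    return math.ceil((z * math.sqrt(variance) / margin) ** 2)


def days_needed(games, machines, games_per_machine_per_day):
    """Wall-clock days if games are spread evenly over identical machines."""
    return games / (machines * games_per_machine_per_day)


if __name__ == "__main__":
    n = games_needed(10, draw_rate=0.3)  # 30% draws is a guess
    print(f"expected score at +10 Elo: {expected_score(10):.4f}")
    print(f"games needed: {n}")
    # 200 games per machine per day is a hypothetical throughput.
    for machines in (1, 3, 10):
        days = days_needed(n, machines, 200)
        print(f"{machines} machine(s): {days:.1f} days")
```

Under these assumptions the answer comes out in the thousands of games, which
fits the point that a match of a few hundred games cannot resolve a 10 Elo
change. The wall-clock time shrinks in proportion to the number of machines,
so the real figure depends entirely on the time control and hardware.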

As I mentioned previously, testing is my primary focus for the next few weeks.
If I come up with anything interesting, I'll share it (not being commercial
does have a few advantages).  Also, don't hesitate to *not* share anything;
since you make your living at this, I can appreciate your position.  If
someone asked me to design an integrated circuit for free, I might be
reluctant to do so (especially if it resulted in my not being able to design
one for a paycheck in the future).

regards,
--tom

>
>
>>>Even with a good test methodology chess programming is still an art: in many
>>>cases you have to decide with your feelings, because the raw data does not give
>>>you a definite answer.
>>>
>>>Now of course there are small improvements that I do not even need to test for a
>>>long time: if I find a way to make my program 10% faster without changing the
>>>shape of the tree, then all I need to do is run some safety tests that will only
>>>look at the number of nodes searched on a large set of positions and compare it
>>>to the last stable version.
>>
>>Yes.  That is the only good reason I see to do any low-level optimization (for
>>weak amateur engines like mine, that is -- the situation is of course entirely
>>different for authors of top engines).  If I manage to make my engine a few percent
>>faster without changing the eval or the shape of the tree, I can be 100% sure that
>>the change was an improvement, without doing lots of time-consuming tests.
>>
>>Tord
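
The node-count safety test described in the quoted text could be sketched
roughly as follows. This is an assumption about how such a check might look,
not Christophe's actual tooling: `compare_node_counts` is a hypothetical
helper, and the position labels and node counts below are invented. It
presumes each version has already searched the same positions to the same
fixed depth (not a fixed time, since a faster build searches more nodes in
the same time) and that the search is deterministic.

```python
def compare_node_counts(baseline, candidate):
    """Return the positions whose node counts differ between two versions.

    baseline, candidate: dicts mapping a position label to the number of
    nodes searched at a fixed depth. A position missing from candidate is
    reported with None as its count.
    """
    mismatches = {}
    for position, base_nodes in baseline.items():
        cand_nodes = candidate.get(position)
        if cand_nodes != base_nodes:
            mismatches[position] = (base_nodes, cand_nodes)
    return mismatches


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    stable = {"pos1": 152340, "pos2": 98211, "pos3": 401877}
    faster = {"pos1": 152340, "pos2": 98211, "pos3": 401902}

    diffs = compare_node_counts(stable, faster)
    if diffs:
        for position, (old, new) in diffs.items():
            print(f"{position}: tree changed ({old} -> {new} nodes)")
    else:
        print("node counts identical: the shape of the tree is unchanged")
```

If every count matches, the new build searched the same tree and only the
speed differs; any mismatch means the "optimization" changed behavior and
needs real test games after all.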




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.