Computer Chess Club Archives

Subject: Re: Testing Chess Programs

Author: Christophe Theron
Date: 19:18:00 04/13/04

On April 13, 2004 at 13:36:08, Robert Hyatt wrote:

>On April 12, 2004 at 23:07:46, Christophe Theron wrote:
>
>>On April 12, 2004 at 19:51:29, Tom Likens wrote:
>>
>>>On April 12, 2004 at 19:15:01, Christophe Theron wrote:
>>>
>>>>On April 12, 2004 at 16:44:04, Tord Romstad wrote:
>>>>
>>>>>On April 12, 2004 at 14:45:28, Christophe Theron wrote:
>>>>>
>>>>>>On April 12, 2004 at 07:50:47, Tord Romstad wrote:
>>>>>>>
>>>>>>>Assume that you make a change to your engine which improves the playing strength
>>>>>>>by about 10 Elo points.  How many hours of CPU time do you need before you are sure
>>>>>>>that the change was an improvement?
>>>>>>>
>>>>>>
>>>>>>I would say approximately one week, and even then I would not be really sure it is an
>>>>>>improvement. We are talking about a 1.5% improvement in winning percentage here;
>>>>>>that is below the statistical noise of a match of several hundred games if you want
>>>>>>95% reliability!
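
To put rough numbers on the claim above, here is a minimal sketch. It assumes the standard logistic Elo model, a normal approximation, and a worst-case per-game standard deviation of 0.5 (no draws; draws would make it smaller). The function names are illustrative only and do not come from any engine or tool mentioned in this thread.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def margin_95(n_games: int, per_game_sd: float = 0.5) -> float:
    """Approximate 95% margin of error on the match score after n_games.

    per_game_sd = 0.5 is the worst case (every game is a win or a loss).
    """
    return 1.96 * per_game_sd / math.sqrt(n_games)


def games_needed(elo_diff: float, per_game_sd: float = 0.5) -> int:
    """Games needed before the 95% margin shrinks to the expected edge."""
    edge = abs(expected_score(elo_diff) - 0.5)
    if edge == 0.0:
        raise ValueError("an Elo difference of zero can never be resolved")
    return math.ceil((1.96 * per_game_sd / edge) ** 2)


if __name__ == "__main__":
    edge = expected_score(10) - 0.5
    print(f"edge from +10 Elo:      {edge:.4f}")
    print(f"95% margin, 500 games:  {margin_95(500):.4f}")
    print(f"games needed (approx.): {games_needed(10)}")
```

Under these assumptions a 10 Elo edge is worth a little under 1.5 percentage points, while the 95% margin after 500 games is around three times that, so several thousand games are needed before the edge stands out from the noise.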
>>>>>
>>>>>Thanks, Christophe!
>>>>>
>>>>>Reading this is actually a great relief to me.  I wondered if you had invented
>>>>>some kind of magic which enabled you to find tiny improvements in much shorter time.
>>>>>
>>>>>>And unfortunately a 10 Elo point improvement is becoming rare for me. Most of
>>>>>>the changes I try make the program weaker, and many changes do not provide any
>>>>>>measurable improvement!
>>>>>
>>>>>I have no difficulties believing this.  My engine is still at least 200 points
>>>>>weaker than yours, and I have exactly the same experience.
>>>>>
>>>>>>That's why not having a strong test methodology is totally out of the question if
>>>>>>you are serious about chess programming.
>>>>>
>>>>>Yes.  It is extremely difficult for me, because I am a very impatient person.
>>>>>When I make a small change to my engine, I rarely have enough time to play enough
>>>>>games to determine whether it is an improvement, because I have a dozen new ideas
>>>>>I want to try before my first test matches are finished.
>>>>
>>>>
>>>>
>>>>That's the real motivation killer. I also have many ideas, and when I want to try
>>>>them I realize I'm currently testing another idea and that the running test will
>>>>not be over until next week. So I have to wait one week before I can start
>>>>testing, and another week to know the result.
>>>>
>>>>Two weeks from now, my interests will clearly have switched to another idea.
>>>>
>>>>That makes computer chess programming more and more boring.
>>>>
>>>>
>>>>
>>>>    Christophe
>>>
>>>I was thinking about this *exact* problem on the way home from work today.
>>>The only solution I could come up with was to add more computers and thus
>>>attack the problem in parallel.  I currently have three computers I can
>>>dedicate to running various test matches.  If I could validate an idea in
>>>roughly two days then this problem wouldn't be so bad.  A week, as both you
>>>and Tord point out, is difficult.  Adding more CPUs to the problem would
>>>make this possible, but it might also turn me into a bachelor again!!
>>>
>>>As I mentioned previously, testing is my primary focus for the next few weeks.
>>>If I come up with anything interesting I'll share it (not being commercial
>>>does have a few advantages).  Also, don't hesitate to *not* share anything;
>>>since you make your living at this, I can appreciate your position.  If
>>>someone asked me to design an integrated circuit for free, I might be
>>>reluctant to do so (especially, if it resulted in my not being able to design
>>>one for a paycheck in the future).
>>>
>>>regards,
>>>--tom
>>
>>
>>
>>I consider that I *do* contribute to computer chess programming. Not by
>>providing code, but by providing advice.
>>
>>I'm not doing what Bob does. Bob provides excellent advice on code, or code
>>structure.
>
>I'm not even sure that "what Bob does" is correct.  I.e., based on all the "clone"
>problems it has caused, sometimes it seems that the cure has been worse than the
>disease..  :(



This particular problem is caused by bad behaviour, but there is no cure for
it... Education maybe, but there will always be people with bad education.

Top-level open source chess programs have forced the professional chess
programmers to make real progress or to die...

I cannot say I'm particularly happy to know that Crafty code is out there, but so
far I have not really been a victim of it. I do not know if you will agree, but
I would have expected more dramatically new ideas contributed to Crafty, like
revolutionary forward pruning techniques and such. I have not seen this
happening and I regret it. Well actually I don't regret it that much! :)

On the other hand, I do believe in open source and I'm trying to support it.
These lines are written on a Fedora Core 1 Linux box for example.

So I feel that I am in a very ambiguous and difficult position: I support open
source as being a really efficient way of building good and reliable software,
and at the same time I make a living from selling closed-source software.

Maybe my willingness to provide general advice comes from this paradox. :)



    Christophe






>>I provide more general, or philosophical, advice. It does not cover the same
>>areas as advice provided by other people. Some of it took me years to come up
>>with, so from my point of view it is valuable, maybe more than code.
>>
>>On the other hand I'm still learning myself, so sometimes my advice is not
>>that clever. :)
>>
>>Further, wouldn't you just *hate* if I took the fun out of chess programming by
>>telling you everything? :)
>>
>>
>>
>>    Christophe
>>
>>
>>
>>
>>
>>>>>>Even with a good test methodology chess programming is still an art: in many
>>>>>>cases you have to decide with your feelings, because the raw data does not give
>>>>>>you a definite answer.
>>>>>>
>>>>>>Now of course there are small improvements that I do not even need to test for a
>>>>>>long time: if I find a way to make my program 10% faster without changing the
>>>>>>shape of the tree, then all I need to do is run some safety tests that will only
>>>>>>look at the number of nodes searched on a large set of positions and compare it
>>>>>>to the last stable version.
>>>>>
>>>>>Yes.  That is the only good reason I see to do any low-level optimization (for
>>>>>weak amateur engines like mine, that is -- the situation is of course entirely
>>>>>different for authors of top engines).  If I manage to make my engine a few percent
>>>>>faster without changing the eval or the shape of the tree, I can be 100% sure that
>>>>>the change was an improvement, without doing lots of time-consuming tests.
>>>>>
>>>>>Tord
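
The node-count safety test described above could look roughly like the sketch below. It assumes each engine version has already been run to a fixed depth on the same set of positions and that its node counts were collected into a dictionary keyed by a position identifier. The function name and data layout are hypothetical and are not taken from any engine in this thread.

```python
def node_count_mismatches(baseline: dict, candidate: dict) -> list:
    """Return the positions whose node counts differ between two versions.

    An empty result suggests the candidate searched the same tree as the
    baseline, so a pure speed-up did not change the shape of the search.
    Positions present in only one of the two runs are reported as well.
    """
    mismatches = []
    for position in sorted(set(baseline) | set(candidate)):
        if baseline.get(position) != candidate.get(position):
            mismatches.append(position)
    return mismatches


if __name__ == "__main__":
    # Made-up node counts, purely for illustration.
    stable = {"pos1": 120_000, "pos2": 87_500, "pos3": 301_200}
    faster = {"pos1": 120_000, "pos2": 87_501, "pos3": 301_200}
    print(node_count_mismatches(stable, faster))
```

Any position reported here means the "optimization" altered the search, so it would need the long match testing after all.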