Computer Chess Club Archives

Subject: Re: Updating engines during tournaments? (Odyssee Tournament)

Author: Uri Blass
Date: 14:34:59 03/06/01
On March 06, 2001 at 16:54:17, Dann Corbit wrote:

>On March 06, 2001 at 16:16:01, Robert Hyatt wrote:
>
>>On March 06, 2001 at 14:58:24, Dann Corbit wrote:
>>
>>>On March 05, 2001 at 22:51:19, Robert Hyatt wrote:
>>>
>>>>On March 05, 2001 at 20:59:05, Dann Corbit wrote:
>>>>
>>>>>On March 05, 2001 at 20:41:17, Robert Hyatt wrote:
>>>[snip]
>>>>>If you are trying to produce a reproducible experiment, then you don't change
>>>>>the parameters as you go.  If you just want a fun contest, then do whatever you
>>>>>want.
>>>>
>>>>Tournaments are definitely _not_ reproducible.  In any shape, form or
>>>>fashion...  neither human events, nor computer events where the authors
>>>>are present.
>>>
>>>By the same token, a sequence of 1000 coin flips won't be reproducible either.
>>>Any measurement with a degree of randomness will suffer from this problem (which
>>>is *truthfully* -- ALL of them).  At any rate, if you are trying to *win* a
>>>contest, then you will try anything at your disposal.  Certainly you can get
>>>some gains by being unpredictable (e.g. changing the openings or whatever).  But
>>>if the experiment is planned to measure something and produce a number, then you
>>>should eliminate as many variables as possible.
>>
>>
>>With software this is simply impossible to eliminate.  IE if you use crafty
>>version X to play in a tournament, even after version X+1 is out, then you
>>are already introducing a random variable to the tournament.  Because if you
>>hold another one immediately after this one and use the then-current versions
>>of everything you will get different results.  You will probably get different
>>results if you use _exactly_ the same engines, so worrying about a few bug
>>fixes is really about like trying to optimize a piece of code that takes less
>>than .0000001% of the total search time.  Any changes won't make a difference
>>there.
>
>Precisely because there are so many variables is why a scientific experiment
>should try to hold things fixed.  EXACTLY the same conditions occur in
>biological experiments because there are so many contributing factors and
>hard-to-control inputs.  Sometimes, a seemingly innocuous change can cause an enormous
>difference in outcome (Famous e.g.: the cleaning lady who hugged the baby at the
>end of the line...)
>
>Fixing a bug is important if you want to win a contest.  And all programs of any
>appreciable size will have bugs.  Programs under constant development, even more
>so.  So fixing bugs to win a contest is a great idea if you just want to win
>contests.  But if you want to find reliable information, you simply have to "get
>the experiment to hold as still as is humanly possible" while you are taking the
>picture.

The problem is that the result of the tournament may not be interesting.

People want to know which program is best today, not which program was best
some time ago.

You cannot get a correct answer to this question, but if you do not allow
bug fixes during the tournament, you have a higher probability of getting
misleading information.


>I have very little faith in a large fraction of the scientific
>research that is conducted precisely because adequate control is not maintained
>and new variables are injected.  I know of some cases that would be actually
>funny if it did not at the same time represent a tragic waste of millions of
>dollars.
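
The claim in the quoted exchange that a rematch between exactly the same engines will probably give a different result can be illustrated with a rough sketch. It is only a toy model under loud assumptions: each game is an independent coin flip with a fixed win probability, draws are ignored, and the names `play_match` and `score_spread` are made up for this illustration, not taken from any engine or tournament tool.

```python
import random


def play_match(num_games, win_prob, seed):
    """Return the number of wins for engine A in a simulated match.

    Assumption: every game is an independent trial with the same
    win probability and no draws, which real chess games are not.
    """
    rng = random.Random(seed)  # private generator, so a seed fully fixes the match
    return sum(1 for _ in range(num_games) if rng.random() < win_prob)


def score_spread(num_games, win_prob, num_matches):
    """Replay the same match with different seeds; return (lowest, highest) score."""
    scores = [play_match(num_games, win_prob, seed) for seed in range(num_matches)]
    return min(scores), max(scores)


if __name__ == "__main__":
    # Two equally strong engines, 1000 games per match, 200 repeated matches.
    low, high = score_spread(num_games=1000, win_prob=0.5, num_matches=200)
    print("lowest score:", low, "highest score:", high)
```

Even with nothing changed between matches, the scores scatter around 500; for 1000 such games the standard deviation is sqrt(1000 * 0.5 * 0.5), about 16 wins. That scatter is the background noise against which any effect of a mid-tournament bug fix would have to be measured.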