Computer Chess Club Archives



Subject: Re: Opinions? A Crafty experiment...

Author: Robert Hyatt

Date: 09:23:08 05/26/04


On May 26, 2004 at 05:16:05, José Carlos wrote:

>On May 25, 2004 at 20:15:58, Dann Corbit wrote:
>
>>On May 25, 2004 at 15:12:01, Russell Reagan wrote:
>>
>>>On May 25, 2004 at 14:33:31, Dann Corbit wrote:
>>>
>>>>I doubt that very much.  There are some engines that vary in strength with time
>>>>control, but it is generally at the blitz level where these transitions take
>>>>place.  An engine that scores 30% at G/40 will probably score 30% at G/120 and
>>>>at 40/2 against the same opponent.
>>>
>>>
>>>I'll test it. What engines would you like me to use?
>>>
>>>
>>>>I suspect that you saw it happen once or twice and are now extrapolating the
>>>>result in your mind.
>>>
>>>
>>>Yes, maybe. I need to test the idea some more.
>>>
>>>
>>>>If the effect were profound, wouldn't Crafty score 50% against Shredder in the
>>>>SSDF?
>>>
>>>
>>>I don't understand the reasoning here. The effect may only be subtle. I don't
>>>even know if it is testable in practical time.
>>>
>>>
>>>>The reason an engine might pick up strength at longer time controls is that it
>>>>has a better fundamental algorithm, but it is poorly microoptimized.
>>>
>>>
>>>What about diminishing returns? If we plotted the results of matches with
>>>respect to time (e.g. 30%, 35%, 38%, etc.), what do the curves look like? At the
>>>beginning of the curve, the slow program with a superior algorithm won't fit the
>>>overall pattern, but I'm after the overall shape of the curve, where it levels
>>>off (or if it levels off), and things like that.
>>
>>Why will one program have diminishing returns and not the other?
>>There is no conclusive evidence that diminishing returns occur.  Citations:
>>"Dark Thought Goes Deep"
>>"Crafty Goes Deep"
>>
>>>>A great painter paints a picture in a month.  The same painter paints a picture
>>>>in ten minutes.  I am guessing that the slower time of painting made a much
>>>>better picture.
>>>>
>>>>When I play a chess engine contest, I want the result to be art, not comedy.
>>>>For me (though not for the majority) high speed blitz games are a crime against
>>>>humanity.
>>>>
>>>>It is not the end point (who won?) that is interesting to me.  It is the journey
>>>>along the way.
>>>
>>>
>>>This is where we differ somewhat. I am not uninterested in the quality of the
>>>games, but I am more interested in the outcome of the match and finding out who
>>>is better. A G/30 match might be of lower quality, but in general it will
>>>probably produce the same winner as a G/120 match, don't you think?
>>
>>What you will see is how strong the program is on that hardware at G/30.
>>Chances are good that there is a correlation to how the program does on that
>>hardware at G/120.
>>
>>>I am thinking about this from the point of view of an engine developer. If I can
>>>reliably tell which engine is stronger in 1/10th of the time, without having to
>>>play G/120 matches for weeks, then that will benefit me greatly in finding out
>>>whether changes to the engine are improvements, and the engine will improve more
>>>quickly.
>>
>>The faster the games, the greater the amount of randomness in the results.  At
>>some point, I think it levels out.
>
>
>  This is an interesting point. I had never thought of it that way. So basically
>you say "faster implies more data and more randomness, and that probably levels
>out at some point". So an interesting experiment would be: try 1000 games at G1,
>100 games at G30 and 10 games at G120. The % of w/d/l should somehow be similar.
>Of course the numbers should be calculated in a more elaborate way, I just made
>them up, but that's the idea. Do you know how to do the calculations (my
>mathematical background is not enough)?
>  Or we could do it the other way, that is, run 1000 games at G1. Then start a
>match at G30 (with at least n games) until results are similar in % to the
>first match. Then do the same with G120.
>  What do you think?
>
>  José C.

I think the idea is flawed.

Suppose you play two programs and limit them so they can only search to a depth
of 1 ply.  It becomes "static evaluation vs static evaluation".  If A has a
better evaluation, A wins.

Suppose you now search for a long time, but A uses minimax (just as a gross,
impractical example) and B uses alpha/beta.  B will probably win on tactics.
Short games favor good evaluation over tactics.  Longer games can give a program
a tactical edge over a smarter program...
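
To make the minimax versus alpha/beta comparison concrete, here is a minimal
Python sketch (not Crafty code).  It assumes a synthetic uniform tree with
random leaf scores standing in for static evaluations; make_tree, compare, and
the depth and branching values are hypothetical names and numbers chosen only
for illustration.  Both searches return the same value, but alpha/beta visits
fewer nodes.

```python
import random

INF = 10**9


def make_tree(depth, branching, rng):
    """Build a uniform tree; each leaf is a random static-evaluation score."""
    if depth == 0:
        return rng.randint(-100, 100)
    return [make_tree(depth - 1, branching, rng) for _ in range(branching)]


def minimax(node, maximizing, counter):
    """Plain minimax: visits every node in the tree."""
    counter[0] += 1
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, alpha, beta, maximizing, counter):
    """Alpha/beta: same result as minimax, but skips refuted branches."""
    counter[0] += 1
    if isinstance(node, int):
        return node
    if maximizing:
        best = -INF
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent already has a better alternative
        return best
    best = INF
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best


def compare(depth=6, branching=5, seed=1):
    """Search one random tree both ways; return values and node counts."""
    tree = make_tree(depth, branching, random.Random(seed))
    mm_nodes, ab_nodes = [0], [0]
    mm_value = minimax(tree, True, mm_nodes)
    ab_value = alphabeta(tree, -INF, INF, True, ab_nodes)
    return mm_value, ab_value, mm_nodes[0], ab_nodes[0]


if __name__ == "__main__":
    mm_value, ab_value, mm_n, ab_n = compare()
    print("same value:", mm_value == ab_value)
    print("minimax nodes:", mm_n, "alpha/beta nodes:", ab_n)
```

Given the same time budget, the program that visits fewer nodes per ply can
search deeper, which is where the tactical edge in longer games would come
from.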

I am _certain_ that Crafty plays worse against the same program at blitz than it
does at standard time controls.  That comes from looking at literally thousands
of logs from ICC...
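
On the question quoted above about how to do the calculations, here is a rough
Python sketch.  It assumes the games are independent and that each match is
summarized only by its win/draw/loss counts.  The function name score_interval
and the example counts are made up for illustration, and the 1.96 factor is the
usual normal approximation for a 95% interval.  It measures sampling noise
only; it says nothing about whether the true score shifts with the time
control.

```python
import math


def score_interval(wins, draws, losses, z=1.96):
    """Return (score, low, high) for a match, counting a draw as half a point.

    Uses the observed per-game variance and a normal approximation,
    so it is only a rough guide for small numbers of games.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("need at least one game")
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5, or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(variance / n)
    return score, max(0.0, score - z * stderr), min(1.0, score + z * stderr)


if __name__ == "__main__":
    # Hypothetical results with the same 30% score at two match lengths.
    for w, d, l in [(200, 200, 600), (2, 2, 6)]:
        score, low, high = score_interval(w, d, l)
        print(f"{w + d + l:5d} games: score {score:.3f}, "
              f"95% interval {low:.3f} to {high:.3f}")
```

Under these assumptions the interval shrinks with the square root of the number
of games, so a 10-game match is roughly ten times noisier than a 1000-game
match with the same score.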


>
>
>
>>In a contest, I will spend a lot of time generating data.  I would like the data
>>to be valuable to me.
>>
>>>In that respect, I think longer games tell us less about which engine is better,
>>>and about whether a change was really an improvement. I may be wrong though. It
>>>is just an idea.
>>
>>I think that there is probably some happy medium for experimental quality (IOW,
>>to collect the most reliable data in the least amount of time).  But it probably
>>varies quite a bit from program to program and from machine to machine, etc.
>>
>>When I generate a chess contest, I want the data to be interesting enough for me
>>to read.  Who wins the contest is purely an afterthought for me.



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.