Computer Chess Club Archives


Subject: Re: Opinions? A Crafty experiment...

Author: Uri Blass

Date: 11:30:30 05/26/04



On May 26, 2004 at 13:46:51, Robert Hyatt wrote:

>On May 26, 2004 at 12:46:09, Uri Blass wrote:
>
>>On May 26, 2004 at 12:23:08, Robert Hyatt wrote:
>>
>>>On May 26, 2004 at 05:16:05, José Carlos wrote:
>>>
>>>>On May 25, 2004 at 20:15:58, Dann Corbit wrote:
>>>>
>>>>>On May 25, 2004 at 15:12:01, Russell Reagan wrote:
>>>>>
>>>>>>On May 25, 2004 at 14:33:31, Dann Corbit wrote:
>>>>>>
>>>>>>>I doubt that very much.  There are some engines that vary in strength with time
>>>>>>>control, but it is generally at the blitz level where these transitions take
>>>>>>>place.  An engine that scores 30% at G/40 will probably score 30% at G/120 and
>>>>>>>at 40/2 against the same opponent.
>>>>>>
>>>>>>
>>>>>>I'll test it. What engines would you like me to use?
>>>>>>
>>>>>>
>>>>>>>I suspect that you saw it happen once or twice and are now extrapolating the
>>>>>>>result in your mind.
>>>>>>
>>>>>>
>>>>>>Yes, maybe. I need to test the idea some more.
>>>>>>
>>>>>>
>>>>>>>If the effect were profound, wouldn't Crafty score 50% against Shredder in the
>>>>>>>SSDF?
>>>>>>
>>>>>>
>>>>>>I don't understand the reasoning here. The effect may only be subtle. I don't
>>>>>>even know if it is testable in practical time.
>>>>>>
>>>>>>
>>>>>>>The reason an engine might pick up strength at longer time controls is that it
>>>>>>>has a better fundamental algorithm, but it is poorly microoptimized.
>>>>>>
>>>>>>
>>>>>>What about diminishing returns? If we plotted the results of matches with
>>>>>>respect to time (ex. 30%, 35%, 38%, etc.), what do the curves look like? At the
>>>>>>beginning of the curve, the slow program with a superior algorithm won't fit the
>>>>>>overall pattern, but I'm after the overall shape of the curve, where it levels
>>>>>>off (or if it levels off), and things like that.
>>>>>
>>>>>Why will one program have diminishing returns and not the other?
>>>>>There is no conclusive evidence that diminishing returns occur.  Citations:
>>>>>"Dark Thought Goes Deep"
>>>>>"Crafty Goes Deep"
>>>>>
>>>>>>>A great painter paints a picture in a month.  The same painter paints a picture
>>>>>>>in ten minutes.  I am guessing that the slower time of painting made a much
>>>>>>>better picture.
>>>>>>>
>>>>>>>When I play a chess engine contest, I want the result to be art, not comedy.
>>>>>>>For me (though not for the majority) high speed blitz games are a crime against
>>>>>>>humanity.
>>>>>>>
>>>>>>>It is not the end point (who won?) that is interesting to me.  It is the journey
>>>>>>>along the way.
>>>>>>
>>>>>>
>>>>>>This is where we differ somewhat. I am not uninterested in the quality of the
>>>>>>games, but I am more interested in the outcome of the match and finding out who
>>>>>>is better. A G/30 match might be of lower quality, but in general it will
>>>>>>probably produce the same winner as a G/120 match, don't you think?
>>>>>
>>>>>What you will see is how strong the program is on that hardware at G/30.
>>>>>Chances are good that there is a correlation to how the program does on that
>>>>>hardware at G/120.
>>>>>
>>>>>>I am thinking about this from the point of view of an engine developer. If I can
>>>>>>reliably tell which engine is stronger in 1/10th of the time, without having to
>>>>>>play G/120 matches for weeks, then that will benefit me greatly in finding out
>>>>>>whether changes to the engine are improvements, and the engine will improve more
>>>>>>quickly.
>>>>>
>>>>>The faster the games, the greater the amount of randomness.  At some point,
>>>>>I think it levels out.
>>>>
>>>>
>>>>  This is an interesting point. I had never thought of it that way. So basically
>>>>you say "faster implies more data and more randomness, and that probably levels
>>>>out at some point". So an interesting experiment would be: try 1000 games at G1,
>>>>100 games at G30 and 10 games at G120. The % of w/d/l should somehow be similar.
>>>>Of course the numbers should be calculated in a more elaborate way, I just made
>>>>them up, but that's the idea. Do you know how to do the calculations (my
>>>>mathematical background is not enough)?
>>>>  Or we could do it the other way, that is, run 1000 games at G1. Then start a
>>>>match at G30 (with at least n games) until results are similar in % to the
>>>>first match. Then do the same with G120.
>>>>  What do you think?
>>>>
>>>>  José C.
>>>
>>>I think the idea is flawed.
>>>
>>>Suppose you play two programs and limit them so they can only search to a depth
>>>of 1 ply.  It becomes "static evaluation vs static evaluation".  If A has a
>>>better evaluation, A wins.
>>>
>>>Suppose you now search for a long time, but A uses minimax (Just for a gross but
>>>impractical example) and B uses alpha/beta.  B will probably win on tactics.
>>>Short games favor good evaluation over tactics.  Longer games can give a program
>>>a tactical edge over a smarter program...
>>>
>>>I am _certain_ that Crafty plays worse against the same program at blitz, as
>>>opposed to playing the program in standard time controls.  From looking at
>>>literally thousands of logs from ICC...
>>
>>Crafty against which program?
>>Is not the answer dependent on the name of the opponent program?
>>
>>Uri
>
>
>Not particularly.  But in general I am talking about commercial programs.  You
>could look at some stats on ICC for example.  It simply seems to play better at
>longer time controls...

I cannot use the search command because I am not a member of ICC, so I only
looked at the ratings.

I see that you have 3241 at bullet, 2929 at blitz, and 2637 at standard.


          rating [need] win  loss  draw total   best
Bullet      3241  [8]  6675  1468  1088  9231   3286 (27-Dec-2002)
Blitz       2929      59626 17733 14094 91453   3388 (09-Jun-2000)
Standard    2637       5281  2747  2442 10470   2792 (25-Oct-2000)


For comparison


DeepFritz

Bullet      2958  [8]    40     2     5    47   3003 (02-Jan-2002)
Blitz       3038  [8]    68    11    10    89   3038 (25-Aug-2002)
Standard    2746  [6]    74    15    20   109   2801 (27-Jan-2001)


Rebel12

Bullet      2190  [8]     1     0     1     2
Blitz       2774       3467  2416  1849  7732   3018 (26-Apr-2004)
Standard    2551        169   190   133   492   2677 (27-Aug-2003)


Based on that data, I do not see a tendency for Crafty to do better at longer
time controls relative to the commercial programs.

Uri
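
The comparison above can be made explicit by subtracting ratings per time control. The following is only a rough sketch: the numbers are copied from the listings in this post, while the `ratings` dictionary and the `rating_gaps` helper are names made up for illustration. It assumes that ratings in the same ICC category are comparable between accounts, which is doubtful when an account has very few games (Rebel12 has only 2 bullet games, DeepFritz fewer than 110 in each category).

```python
# Current ratings copied from the ICC listings quoted in this post.
ratings = {
    "Crafty":    {"bullet": 3241, "blitz": 2929, "standard": 2637},
    "DeepFritz": {"bullet": 2958, "blitz": 3038, "standard": 2746},
    "Rebel12":   {"bullet": 2190, "blitz": 2774, "standard": 2551},
}

TIME_CONTROLS = ("bullet", "blitz", "standard")


def rating_gaps(engine, opponent, table=ratings):
    """Return engine's rating minus opponent's rating for each time control.

    Hypothetical helper: a positive gap means `engine` is rated higher.
    """
    return {tc: table[engine][tc] - table[opponent][tc] for tc in TIME_CONTROLS}


if __name__ == "__main__":
    for opponent in ("DeepFritz", "Rebel12"):
        print(opponent, rating_gaps("Crafty", opponent))
```

If Crafty gained on the commercial programs at longer time controls, the gap would grow from bullet to blitz to standard. With these numbers it shrinks or stays flat in both comparisons, which is consistent with the reading above, subject to the small-sample caveat.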




Last modified: Thu, 15 Apr 21 08:11:13 -0700
