Author: Robert Hyatt
Date: 10:43:57 05/26/04
On May 26, 2004 at 13:01:16, José Carlos wrote:

>On May 26, 2004 at 12:23:08, Robert Hyatt wrote:
>
>>On May 26, 2004 at 05:16:05, José Carlos wrote:
>>
>>>On May 25, 2004 at 20:15:58, Dann Corbit wrote:
>>>
>>>>On May 25, 2004 at 15:12:01, Russell Reagan wrote:
>>>>
>>>>>On May 25, 2004 at 14:33:31, Dann Corbit wrote:
>>>>>
>>>>>>I doubt that very much. There are some engines that vary in strength with time
>>>>>>control, but it is generally at the blitz level where these transitions take
>>>>>>place. An engine that scores 30% at G/40 will probably score 30% at G/120 and
>>>>>>at 40/2 against the same opponent.
>>>>>
>>>>>
>>>>>I'll test it. What engines would you like me to use?
>>>>>
>>>>>
>>>>>>I suspect that you saw it happen once or twice and are now extrapolating the
>>>>>>result in your mind.
>>>>>
>>>>>
>>>>>Yes, maybe. I need to test the idea some more.
>>>>>
>>>>>
>>>>>>If the effect were profound, wouldn't Crafty score 50% against Shredder in the
>>>>>>SSDF?
>>>>>
>>>>>
>>>>>I don't understand the reasoning here. The effect may only be subtle. I don't
>>>>>even know if it is testable in practical time.
>>>>>
>>>>>
>>>>>>The reason an engine might pick up strength at longer time controls is that it
>>>>>>has a better fundamental algorithm, but it is poorly microoptimized.
>>>>>
>>>>>
>>>>>What about diminishing returns? If we plotted the results of matches with
>>>>>respect to time (ex. 30%, 35%, 38%, etc.), what do the curves look like? At the
>>>>>beginning of the curve, the slow program with a superior algorithm won't fit the
>>>>>overall pattern, but I'm after the overall shape of the curve, where it levels
>>>>>off (or if it levels off), and things like that.
>>>>
>>>>Why will one program have diminishing returns and not the other?
>>>>There is no conclusive evidence that diminishing returns occur. Citations:
>>>>"Dark Thought Goes Deep"
>>>>"Crafty Goes Deep"
>>>>
>>>>>>A great painter paints a picture in a month. The same painter paints a picture
>>>>>>in ten minutes. I am guessing that the slower time of painting made a much
>>>>>>better picture.
>>>>>>
>>>>>>When I play a chess engine contest, I want the result to be art, not comedy.
>>>>>>For me (though not for the majority) high speed blitz games are a crime against
>>>>>>humanity.
>>>>>>
>>>>>>It is not the end point (who won?) that is interesting to me. It is the journey
>>>>>>along the way.
>>>>>
>>>>>
>>>>>This is where we differ somewhat. I am not uninterested in the quality of the
>>>>>games, but I am more interested in the outcome of the match and finding out who
>>>>>is better. A G/30 match might be of lower quality, but in general it will
>>>>>probably produce the same winner as a G/120 match, don't you think?
>>>>
>>>>What you will see is how strong the program is on that hardware at G/30.
>>>>Chances are good that there is a correlation to how the program does on that
>>>>hardware at G/120.
>>>>
>>>>>I am thinking about this from the point of view of an engine developer. If I can
>>>>>reliably tell which engine is stronger in 1/10th of the time, without having to
>>>>>play G/120 matches for weeks, then that will benefit me greatly in finding out
>>>>>whether changes to the engine are improvements, and the engine will improve more
>>>>>quickly.
>>>>
>>>>The higher the speed of the games, the greater the amount of randomness if the
>>>>pace is very fast. At some point, I think it levels out.
>>>
>>>
>>> This is an interesting point. I had never thought of it that way. So basically
>>>you say "faster implies more data and more randomness, and that probably levels
>>>out at some point". So an interesting experiment would be: try 1000 games at G1,
>>>100 games at G30 and 10 games at G120. The % of w/d/l should somehow be similar.
>>>Of course the numbers should be calculated in a more elaborate way, I just made
>>>them up, but that's the idea. Do you know how to do the calculations (my
>>>mathematical background is not enough)?
>>> Or we could do it the other way, that is, run 1000 games at G1. Then start a
>>>match at G30 (with at least n games) until results are similar in % to the
>>>first match. Then do the same with G120.
>>> What do you think?
>>>
>>> José C.
>>
>>I think the idea is flawed.
>>
>>Suppose you play two programs and limit them so they can only search to a depth
>>of 1 ply. It becomes "static evaluation vs static evaluation". If A has a
>>better evaluation, A wins.
>>
>>Suppose you now search for a long time, but A uses minimax (Just for a gross but
>>impractical example) and B uses alpha/beta. B will probably win on tactics.
>>Short games favor good evaluation over tactics. Longer games can give a program
>>a tactical edge over a smarter program...
>>
>>I am _certain_ that Crafty plays worse against the same program at blitz, as
>>opposed to playing the program in standard time controls. From looking at
>>literally thousands of logs from ICC...
>
>
> Yes, that's probably true for some programs (though I don't think in short
>games eval is more important than search.

Take this to a boundary condition where you do a 1 ply search and that is _all_.
All that will select moves is evaluation...

Another example. A _very_ slow program that at blitz can only search exactly 1
move, while at longer time controls it can search them all. There will be a big
difference between blitz and non-blitz results.

I think that writing off the possibility of a program being better at blitz vs
standard (or vice-versa) is the wrong thing to do without a _lot_ of testing to
verify it...

> The opposite seems more logical to
>me). But Dann's point about noise still makes sense to me. Maybe the experiment
>could be done with Tiger (Christophe always claims it plays the same at all time
>controls) versus other program...
>
> José C.
>
>
>
>
>>>>In a contest, I will spend a lot of time generating data. I would like the data
>>>>to be valuable to me.
>>>>
>>>>>In that respect, I think longer games tell us less about which engine is better,
>>>>>and about whether a change was really an improvement. I may be wrong though. It
>>>>>is just an idea.
>>>>
>>>>I think that there is probably some happy medium for experimental quality (IOW,
>>>>to collect the most reliable data in the least amount of time). But it probably
>>>>varies quite a bit from program to program and from machine to machine, etc.
>>>>
>>>>When I generate a chess contest, I want the data to be interesting enough for me
>>>>to read. Who wins the contest is purely an afterthought for me.
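
On the quoted question of how to do the calculations for comparing 1000, 100 and 10 games: a rough sketch, not anything the posters actually ran. It assumes games are independent and uses a normal approximation for the mean score. The function name `score_and_margin` and the W/D/L counts are invented for illustration only.

```python
import math

def score_and_margin(wins, draws, losses, z=1.96):
    """Return (score, margin) for a match from one engine's point of view.

    score  : mean result per game (win = 1, draw = 0.5, loss = 0)
    margin : z standard errors of that mean (z = 1.96 is roughly 95%)
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("need at least one game")
    score = (wins + 0.5 * draws) / n
    # Variance of a single game result around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    margin = z * math.sqrt(var / n)
    return score, margin

# Hypothetical counts, made up so that every match has the same 30% score.
matches = {
    "G/1, 1000 games": (200, 200, 600),
    "G/30, 100 games": (20, 20, 60),
    "G/120, 10 games": (2, 2, 6),
}
for label, (w, d, l) in matches.items():
    s, m = score_and_margin(w, d, l)
    print(f"{label}: {s:.1%} +/- {m:.1%}")
```

Under these assumptions the margin shrinks only with the square root of the number of games, so the 10-game match has a margin ten times wider than the 1000-game match at the same score. This says nothing about whether the true score itself changes with time control, which is the point in dispute above.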
Last modified: Thu, 15 Apr 21 08:11:13 -0700