Author: Vincent Diepeveen
Date: 16:10:48 12/27/99
On December 27, 1999 at 18:20:27, Robert Hyatt wrote:

>On December 27, 1999 at 16:01:24, Vincent Diepeveen wrote:
>
>>On December 27, 1999 at 15:38:11, Dann Corbit wrote:
>>
>>>On December 27, 1999 at 14:57:09, Ed Schröder wrote:
>>>[snip]
>>>How to know which is best?
>>>
>>>I think Dr. Hyatt's approach is a good one -- play a bazillion games on the net
>>>against quality opponents. I see that Chris W. and Vincent D. have also
>>>followed this strategy. Since the new improvements to Rebel also allow this
>>
>>Partly true and partly wrong.
>>
>>A dead wrong conclusion would be that I improve my program in order to blitz
>>better.
>>A few blitz games show bugs in evaluation. However, I feel blitz is
>>not relevant for measuring my engine's strength these days.
>>The general problem of blitz games is that they go too fast.
>>I can't examine 100 games each day!
>>
>>I feel standard rated is a lot more important. However, Bob's success is so
>>massive that hundreds of Crafties play out there. So the rating
>>depends so much on how DIEP scores against the current Crafty version
>>that it is sometimes hard to draw conclusions. For exactly this reason,
>>some people running DIEP at ICC set !computer.
>>
>>Secret has !computer but plays every computer unrated. Moron has
>>!computer only at blitz; otherwise, at the interesting hours, it is
>>busy playing nothing but a thousand 3 0 games
>>against a dual Crafty.
>>
>>Although it allows all computers at all levels
>>unrated (and rated at standard), to my big surprise not many, apart
>>from a few programmers/bookmakers, match Moron with their program. The
>>vast majority of operators seemingly care only about bragging rights,
>>as they usually find the quickest level at which they can match DIEP
>>running under judgeturpin (which allows rated games against everyone, with
>>no rating limits; 1100-rated players sometimes fanatically play it a couple of games).
>>
>>>kind of competition directly, I expect that you can gather a massive amount of
>>>data with free testers at will. You can see how a change in Rebel performs
>>>against top computers. You can see how a change in Rebel performs against top
>>
>>Don't expect a single 40 in 2 game though, Ed, in case you're interested...
>>...ICC doesn't allow "m moves in t time" levels.
>>
>>>humans. I suggest you may write a parameter driven version of Rebel (or an
>>>engine that can write personalities to disk based upon a set of criteria) and
>>>then run one hundred games with the parameter at one setting, change the setting
>>>and run another hundred. Using this sort of technique, you can find out what
>>>settings work best against various types of competition. I think that will work
>>>very much better than your contest, since the attempts at producing good
>>>settings by others will be redundant and unscientific, for the most part.
>>
>>I completely disagree here. 100 blitz games are not gonna show much.
>>Apart from that, you're dependent on who you play.
>>
>

Bob, you're saying exactly what I wrote above...
...in case of gross eval blunders you see them of course, and if the program has a bug which causes it to crash, then it directly loses bunches of games...
...but other changes are pretty hard to judge. For example, if I add some stupid and completely insane pruning, then it will for sure gain 100 points at blitz at judgeturpin.
If you search 6 ply at blitz then you can abuse the search and still do better...

>I disagree with your disagreement. :) Blitz games _are_ useful. Because
>they can, with a lot of work, highlight holes that have to be fixed. IE the
>most recent change to my eval, reported here a few weeks ago. Roman watched
>it play against several different GM players, and he noticed that once it got
>to king and pawn endings, it greatly over-valued connected passers vs
>non-connected passers.
>And that 'hole' was quickly repaired, so that it hasn't
>lost to that particular glitch again. But this was found from blitz games.
>
>I am careful about using blitz games, of course, as at the Paris WMCCC event
>I had allowed the tuning to get grossly out of line. It was holy hell at blitz
>on ICC, but it played badly at longer time controls vs computers. I tend to
>watch all the standard games it plays carefully (Varguz generally plays at least
>2 one hour + games every day, others are doing the same thing with commercial
>programs). But blitz games _can_ reveal weaknesses. I have found passed pawn
>problems, distant passed pawn problems, majority problems, and so forth. By
>going over lots of games quickly looking for that "pattern/trend" that is giving
>it problems...
>
>>
>>>By using the net as a resource, you double your compute power. By selling
>>>copies of Rebel that can use the net as a resource you multiply your compute
>>>power by the number of sales (e.g. you can gather a huge number of games from
>>>the net and calculate strengths and weaknesses against rated opponents and you
>>>don't even have to run them).
>>>
>>>Suggestion:
>>>Have Rebel automatically annotate the network games with settings information so
>>>that you can glean the effectiveness of various settings.
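
To put a rough number on the "run one hundred games per setting" idea discussed above, here is a minimal sketch that turns a match result into an Elo estimate with an approximate 95% interval. It is only an illustration: the function name `elo_diff_with_margin` is hypothetical, the 40/20/40 result is made up, and it assumes independent games and a normal approximation, which ignores the "depends on who you play" problem entirely.

```python
import math


def elo_diff_with_margin(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) Elo difference for one match result.

    Hypothetical helper for illustration. Assumes independent games and a
    normal approximation to the mean score; z=1.96 gives roughly 95%.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("need at least one game")

    # Mean score per game: win = 1, draw = 0.5, loss = 0.
    score = (wins + 0.5 * draws) / n

    # Per-game variance of the score, then standard error of the mean.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    std_err = math.sqrt(variance / n)

    def to_elo(p):
        # Logistic Elo model; clamp to avoid division by zero at 0% or 100%.
        p = min(max(p, 1e-6), 1.0 - 1e-6)
        return -400.0 * math.log10(1.0 / p - 1.0)

    return to_elo(score), to_elo(score - z * std_err), to_elo(score + z * std_err)


if __name__ == "__main__":
    # Made-up example: a dead-even 100-game match.
    est, low, high = elo_diff_with_margin(40, 20, 40)
    print(f"estimate {est:+.0f} Elo, approx. 95% interval {low:+.0f} to {high:+.0f}")
```

Under these assumptions, even a dead-even 100-game match leaves an interval of roughly 60 Elo either side of zero, so a setting change worth 20 or 30 points would be hard to tell apart from noise at that sample size.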
Last modified: Thu, 15 Apr 21 08:11:13 -0700