Author: Robert Hyatt
Date: 10:40:28 06/02/04
On June 02, 2004 at 12:16:16, Sune Fischer wrote:

>On June 02, 2004 at 11:42:24, Robert Hyatt wrote:
>
>>
>>>
>>>> But even
>>>>then, you have a _real_ problem because there is some randomness built into my
>>>>move selection logic to provide variety.
>>>
>>>That's annoying yes, but as long as it averages the same strength it might not
>>>be totally damaging.
>>
>>
>>Playing the Sicilian in one match as black, the Latvian in the next match will
>>not "average the same strength"...
>>
>
>Doesn't matter, what matters is playing the Sicilian and Latvian at constant
>levels.

Do you play 100,000 game matches? If not they will _not_ be at "constant levels". My book selection has some randomness built in that will take it into oddball (and unsound) lines with some frequency...

>
>E.g. suppose that in one of our basement tournaments Crafty gets the same
>Sicilian twice against two engines of equal strength.
>Crafty falls into a trap in the first game, "learns" and manages to avoid it for
>the next match. This will only punish the latter of the two opponents.

Isn't that _exactly_ how a human works? Invite me to a "basement tournament" but tell me I can't learn from one game to the next and see what happens. :)

>
>I think that is unacceptable in a testing environment.

You are seriously mixing terms. Are you talking about a basement tournament run by a third party (not the programmer of any engine), or a basement match (same idea), or are you talking about _you_ testing your engine (as a programmer) against mine? If the latter, then why are you in this conversation? I _specifically_ addressed a "basement tournament by a non-programmer with learning disabled." I have been _specifically_ addressing that concept from the beginning. So let's fix the setting first, and stay on the same page together. I'm talking about "third-party events" _only_...

>
>>>>If you play a 20 game match, make
>>>>changes, and play another 20 game match, comparing the results is less than
>>>>worthless...
>>>
>>>So maybe Crafty is just worthless for testing, that is possible.
>>
>>Or perhaps your testing methodology is worthless... Crafty is not that
>>different from any other program. You have to be sure to play enough games to
>>hide the random factor.
>
>I can't do that against Crafty, if it learns it's a moving target.
>

And if it doesn't learn it is a random target.

>Testing is already hard enough, you don't need to throw additional random
>factors into the equation.

Have you tried Crafty? It varies significantly.

>
>>BTW, you have greatly changed the original point of my post... I _clearly_
>>asked "why learn=off in a tournament that was being played." I didn't ask "why
>>learn=off in a test match for a single program?"
>>
>>If a _programmer_ wants to test his program against some crippled version of
>>Crafty, that's one issue. It is _not_ the issue that was being discussed until
>>you twisted the conversation in that direction... I was _specifically_ talking
>>about someone playing a basement tournament or basement match, not someone
>>trying to develop an engine...
>
>Why should there be a difference?
>

If you don't see the difference, I certainly can't further explain it to you. But there is a _big_ one.

>The poster just had an interest in seeing how the engines did under those
>conditions, nothing more nothing less.

I doubt the poster understood exactly what he was doing, or how it could affect some engines more than others. He probably thought it was "fair". It certainly wasn't, for reasons already given. He didn't disable hand-tuned books, but he prevented programs that don't have one from adjusting on the fly to opponents that do...

>
>If Crafty is greatly handicapped by disabling learning then I think it's still
>interesting to find out how much weaker it is.

What on earth for? To see how good Kure, or someone else, is at preparing good books? There's a better way to test that.
Use their program with a random book, then with their book, and see which does better, even against a "learner".

>
>>>>My philosophy has _always_ been one of "don't whine about a problem, fix it."
>>>
>>>Nothing wrong with that of course, but why complain if someone decides to disable
>>>the cause of all the problems and thus remove the problem itself?
>>
>>
>>Because there _is_ no "problem" to remove. It actually _adds_ a problem, rather
>>than removing one.
>
>If there is no problem, why did you go through all that trouble?

You miss the point. Learning is not a problem. Turning it _off_ is a problem.

>
>
>>>I think I see where you are coming from though.
>>>Because you've fixed it the problem should always hang around, so that everyone
>>>else is doomed to spend an equal amount of time fixing it too, or else it's not
>>>"fair"?
>>
>>That is one view, yes. If you don't ponder, do you turn it off in my engine?
>
>If it's a ponder on tournament I play with the handicap, if it's ponder off it
>gets disabled for everyone.
>
>Couldn't be simpler.

If you don't see the flaw in that logic, I won't attempt to point it out again...

>
>>If you don't program endgame knowledge, do you adjudicate all games once they
>>reach the endgame?
>>You _must_ fill in the holes you have, because in real events you can't hide
>>them by pretending everybody has them and bypassing the issue in some artificial
>>way...
>
>I don't know what you mean by real events.
>
>Anything is real or unreal depending on your perspective.

Real events are events where _you_ don't get to twiddle with _my_ engine settings, and vice-versa. I.e., strongest-settings vs strongest-settings, period. E.g., WCCC, WMCCC, CCT, ICC games, etc...

>
>I'm mostly interested in events that are well controlled and statistically
>significant, anything else is just a bit of fun.
If you lop off pieces of a program, here and there, and call that "statistically significant" then we _do_ have a serious problem with understanding the term "statistically significant"...

>
>Well okay perhaps great fun, but still nothing that can be taken too seriously.
>
>>
>>
>>My point _exactly_... "their way" == "only way".
>
>Says who?
>AFAIK they don't test in smp mode, so there are definitely other ways.
>
>-S.

Says them. They use a book. Learning on. Pondering on. Two machines. Biggest hash possible. Both have tables available. No strange personality settings. No adjusted search parameters. Etc.
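
As a rough illustration of the "play enough games to hide the random factor" point raised earlier in the thread, here is a minimal Python sketch of how wide the uncertainty on a match score is for different match lengths. It is not taken from the post: the win and draw rates are made-up numbers, the games are assumed to be independent (which opening-book randomness and learning would both violate to some degree), the margin uses a plain normal approximation, and the function names `score_margin` and `elo_from_score` are hypothetical.

```python
import math

def score_margin(games, win_rate, draw_rate, z=1.96):
    """Return (mean score per game, approximate 95% margin) for a match.

    Assumes independent games with fixed, hypothetical win/draw rates and
    uses a normal approximation to the average score.
    """
    loss_rate = 1.0 - win_rate - draw_rate
    if min(win_rate, draw_rate) < 0 or loss_rate < -1e-12:
        raise ValueError("rates must be non-negative and sum to at most 1")
    # Per-game score is 1 for a win, 0.5 for a draw, 0 for a loss.
    mean = win_rate + 0.5 * draw_rate
    variance = win_rate * 1.0 + draw_rate * 0.25 - mean * mean
    return mean, z * math.sqrt(variance / games)

def elo_from_score(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Hypothetical, evenly matched engines: 35% wins, 30% draws, 35% losses.
for n in (20, 200, 2000):
    mean, margin = score_margin(n, win_rate=0.35, draw_rate=0.30)
    low, high = mean - margin, mean + margin
    print(f"{n:5d} games: score {mean:.3f} +/- {margin:.3f} "
          f"(about {elo_from_score(low):+.0f} to {elo_from_score(high):+.0f} Elo)")
```

With these assumed rates, the margin for a 20-game match is roughly 0.18 of a point per game on either side of an even score, so two 20-game matches played before and after a change can differ widely through chance alone. The margin shrinks only with the square root of the number of games.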
Last modified: Thu, 15 Apr 21 08:11:13 -0700