Computer Chess Club Archives



Subject: Re: Thinker 4.6b third after 1st round!

Author: Robert Hyatt

Date: 08:42:24 06/02/04



On June 02, 2004 at 04:13:20, Sune Fischer wrote:

>On June 01, 2004 at 20:08:37, Robert Hyatt wrote:
>
>>On June 01, 2004 at 19:25:41, Sune Fischer wrote:
>>
>>>On June 01, 2004 at 18:39:46, Robert Hyatt wrote:
>>>
>>>>>We wanted to know what the strength relations were with learning off, now we
>>>>>know.
>>>>
>>>>Why?
>>>
>>>Mainly because you want reproducibility; it's no good to have an engine that's
>>>not performing at a constant level.
>>>
>>>I.e., suppose I play a match against Crafty, then I change something in my
>>>engine and want to see if it got better.
>>>If Crafty learns, my engine will probably do worse even if it is an improvement.
>>>
>>>Surely you can see why that is a nonsense experiment to do; learning _must_ be
>>>switched off or I simply cannot test against Crafty.
>>
>>There is a solution.  Clear the learning before you start a test.
>
>This won't work if I run tests of different lengths; I can't compare a 200 game
>match to a 100 game match.

Nor can you statistically do that under _any_ circumstances, unfortunately...


>
>> But even
>>then, you have a _real_ problem because there is some randomness built into my
>>move selection logic to provide variety.
>
>That's annoying, yes, but as long as it averages the same strength it might not
>be totally damaging.


Playing the Sicilian as black in one match and the Latvian in the next match
will not "average the same strength"...





>
>>If you play a 20 game match, make
>>changes, and play another 20 game match, comparing the results is less than
>>worthless...
>
>So maybe Crafty is just worthless for testing, that is possible.

Or perhaps your testing methodology is worthless...  Crafty is not that
different from any other program.  You have to be sure to play enough games to
hide the random factor.
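
To put a rough number on "enough games", here is a minimal sketch that estimates
the 95% margin on a match score from random variation alone. It assumes two
equally strong engines, independent games, and a 30% draw rate; the draw rate
and the function names are illustrative assumptions, not measurements from any
real match.

```python
import math


def score_margin(games, draw_rate=0.3, z=1.96):
    """Approximate 95% margin on the score fraction for equal opponents.

    Each game scores 1, 0.5 or 0. With equal strength the mean is 0.5 and
    the per-game variance is (1 - draw_rate) * 0.25.
    """
    variance = (1.0 - draw_rate) * 0.25
    return z * math.sqrt(variance / games)


def elo_from_score(score):
    """Convert a score fraction (strictly between 0 and 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    for n in (20, 100, 200, 1000):
        margin = score_margin(n)
        print(f"{n:5d} games: +/- {margin:.3f} score, "
              f"about +/- {elo_from_score(0.5 + margin):.0f} Elo")
```

Under these assumptions a 20 game match cannot separate engines that differ by
less than well over 100 Elo, and the margin only halves each time the number of
games is quadrupled.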


>
>>You can easily answer that.  But you also wouldn't publish a result of such a
>>match without clearly identifying that Crafty was badly handicapped.  That was
>>the point.  If someone reads "book learning disabled" and they don't know the
>>whys and whatfors about book learning, they might say "so what, no big deal"
>>when it really is.
>
>Crafty has been around so long, people (that matter) know pretty well the
>level it's playing at.
>Give us some credit here :)
>


I've tried, but it doesn't seem to be working.

BTW, you have greatly changed the original point of my post...  I _clearly_
asked "why learn=off in a tournament that was being played."  I didn't ask "why
learn=off in a test match for a single program?"

If a _programmer_ wants to test his program against some crippled version of
Crafty, that's one issue.  It is _not_ the issue that was being discussed until
you twisted the conversation in that direction...  I was _specifically_ talking
about someone playing a basement tournament or basement match, not someone
trying to develop an engine...





>>My philosophy has _always_ been one of "don't whine about a problem, fix it."
>
>Nothing wrong with that of course, but why complain if someone decides to disable
>the cause of all the problems and thus remove the problem itself?


Because there _is_ no "problem" to remove.  It actually _adds_ a problem, rather
than removing one.



>
>I think I see where you are coming from though.
>Because you've fixed it the problem should always hang around, so that everyone
>else is doomed to spend an equal amount of time fixing it too, or else it's not
>"fair"?

That is one view, yes.  If you don't ponder, do you turn it off in my engine?
If you don't program endgame knowledge, do you adjudicate all games once they
reach the endgame?

You _must_ fill in the holes you have, because in real events you can't hide
them by pretending everybody has them and bypassing the issue in some artificial
way...


>
>>Several years ago Ed was complaining about "duplicate" games in the SSDF testing
>>that was being done.  I thought about that and decided "rather than complaining
>>about duplicate losses, I'm going to simply avoid them by having Crafty notice
>>that it got into trouble in an opening and not play it again for a while."  That
>>is where my "book learning idea" was founded.  A problem that you could either
>>complain about (does it make sense to let a program lose the same opening over
>>and over and count that against it and for its opponent?) or solve.  I chose
>>"solve" and have not had the problem happen to me, at all...
>>
>>Of course if you turn it off, the problem comes right back, bigger than life,
>>and sticks around.
>
>Of course Ed has no control over how SSDF does their testing, if he cares about
>their result he must "do it their way".
>


My point _exactly_...  "their way" == "only way".
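
The book learning idea quoted above (notice that an opening led to trouble and
do not play it again for a while) can be illustrated with a minimal sketch. This
is a hypothetical illustration only, not Crafty's actual code: the class name,
the `cooldown_games` parameter, and the "got into trouble" flag are all
assumptions made for the example.

```python
class BookLearner:
    """Hypothetical sketch: temporarily avoid openings that went badly."""

    def __init__(self, cooldown_games=10):
        self.cooldown_games = cooldown_games  # assumed length of "a while"
        self.games_played = 0
        self.avoid_until = {}  # opening -> game count at which it is allowed again

    def record_game(self, opening, got_into_trouble):
        """Call after each game; mark the opening as avoided if it went badly."""
        self.games_played += 1
        if got_into_trouble:
            self.avoid_until[opening] = self.games_played + self.cooldown_games

    def playable(self, book_lines):
        """Return the book lines not currently avoided (all of them if none remain)."""
        allowed = [line for line in book_lines
                   if self.avoid_until.get(line, 0) <= self.games_played]
        return allowed or list(book_lines)


if __name__ == "__main__":
    learner = BookLearner(cooldown_games=2)
    book = ["Sicilian", "Latvian"]
    learner.record_game("Latvian", got_into_trouble=True)
    print(learner.playable(book))  # the Latvian is skipped for the next two games
```

With learning switched off, nothing ever enters `avoid_until`, so the same losing
line can be repeated in every game, which is the duplicate-loss problem described
in the quoted text.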


>-S.




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.