Computer Chess Club Archives


Subject: Re: All this talk of cooking is making me hungry. Let's bake a program

Author: Bruce Moreland

Date: 13:47:21 02/24/99



On February 24, 1999 at 16:04:41, Dann Corbit wrote:

>I have an email from Frederic Friedel, and I am fully convinced that Fritz did
>not 'cook' anything to get the answers right.  I am writing an interface that
>will cause all programs that use it to get all test problems right instantly
>once asked (except those for which the answers are clearly wrong -- it will
>stubbornly provide "its" answer in such cases).  Will a program that uses this
>add-on be guilty of 'cooking'?  Who cares if it makes the game play better
>chess?  Did Fritz also cook its way to the top of the SSDF?  If so then we had
>all better learn how to bake.
>
>I think that to accuse someone of doing something underhanded, we should have
>*very* strong evidence.  Is the evidence against Fritz "Well, my program could
>not do it!"?  Hardly damning evidence.  If you should find some epd strings
>buried in the executable with 'bm=foo' right after them, you might have some
>evidence that a solution to a problem was bolted into the source.  Even this
>might be innocuous.  I have some test positions that occur literally hundreds of
>times in real games.  Not only that, but they are *very* difficult to solve.  In
>such a case, I see nothing wrong with a program putting special case code in to
>deal with it.  Or consider the dreaded D00 opening "stonewall attack."  There
>are programs which *admittedly* have special case programming to deal with it.
>Are those programs doing something underhanded?
>
>The proof of the pudding is in the eating.  If it plays well then it plays well.
>If you are not really convinced, why not try some very similar positions and
>see if it is eval tuning.  If an eval function is tuned to defeat Nunn type
>positions, then perhaps we can all learn something from studying it.
>
>In any case, claiming that program "x" is cheating to solve problem "y" without
>evidence would be (in my view) worse than making program "x" solve problem "y"
>in a sneaky way.

With the Nunn test we have a test whose questions are known for years in
advance.  Imagine what would happen if the SAT questions were given to teachers
at the start of the school year.  This would put the teachers in a bad position.
They can make their kids score better on the test by teaching the questions and
answers directly, but this would clearly defeat the purpose of the test, which
is to evaluate knowledge by sampling the student.  On the other hand, some of
the questions might have demonstrable utility, and it may make a lot of sense to
teach on topics related to those questions, even though this could be construed
as cooking the test.

Bob has been talking about Wac 2.  That's a problem that involves a rook
sacrifice in exchange for unstoppable pawns.  Bob added knowledge that solves
this problem quickly now.  But the reason he did it wasn't to solve Wac 2.
Anyone who has watched a program play a lot on ICC knows that you'll lose a few
games to advanced passed pawns, and that's what he's trying to correct.  I've
added the same sorts of terms to my program, but mine are more conservative, and
coincidentally don't solve Wac 2.

Another clear case is the Stonewall.  If the Stonewall is on a test (maybe it
is one of the Nunn positions, I don't know), then you'd be cooking the test by
writing special stonewall code.  But once again, if you play on ICC you know
that you'll face this opening a lot, and it is effective against computers if
they don't know how to handle it *specifically*.  If this makes the program play
better in some Stonewall test someone has devised, that's fine, because the
improvement in play extends beyond the test.

The same is true of some endings, as well.  Humans will steer into
opposite-colored bishop endings to get a draw, so you try to avoid them.  If
someone devises a test that measures this, and you have code to avoid drawing
against humans this way, you should score well on the test.

A lot of terms in a program are attempts to drive the game into positions that
it understands, and away from ones that it doesn't.  There is nothing wrong with
this as long as it is done in order to improve performance for end-users.

I think that where it becomes wrong is when you try to solve specific positions
and situations only because your concern is a specific test.  So I think that building
books for the Nunn positions, and repeatedly doing machine learning, etc. for
those positions would be wrong.  The person who is damaged is the end-user, who
gets something based upon a good performance that they think extends to other
areas of the program's play, when in fact the good performance may not extend
beyond the test.

I don't know if anyone is doing this, and I don't particularly care.

I think that this will be a problem with tests of all sorts, as long as they are
published and used for years.

bruce


