Computer Chess Club Archives



Subject: Re: Measuring program improvement

Author: Dann Corbit

Date: 16:16:13 11/16/01



On November 16, 2001 at 18:46:35, Brian Richardson wrote:

>I am somewhat frustrated.  I made some changes to Tinker, adding lazy eval and
>some other terms, and then tested several things.
>
>It improved nps speed by about 30%.
>
>It improved WAC results.
>
>It won or drew every self play game (blitz and standard).
>
>Then I put the new version up to play at ICC.
>It promptly lost 100 points in blitz and standard !?
>
>This is not the first time that my "enhancements" seemed to improve things but
>turned out to play worse in actual games.
>
>Does this sort of thing happen to others too?

Sure.  A good example is piece values.  You can give your program an instant
boost in tactical play by lowering the piece values relative to the positional
terms: the search becomes more willing to sacrifice material.  It's pretty easy
to pick up quite a few more correct answers in a tactical suite that way and
make the actual play a lot worse.

You can actually make the program much stronger and still lose a pile of points.
Suppose (for instance) that there is some loophole in your eval, or a
Trojan-horse attack or something like that.  Someone figures it out, beats your
program 15 times in a row, and you take a big pounding in the ratings.  That
hole might have been there all along, but nobody had gotten around to
exploiting it yet.

Getting clubbed is the best thing that can happen.  Look at all the losses and
figure out if there is any pattern.  Then plug the leak and try again.

Every program has problems that can be exploited.  By getting knocked down, you
find out where they are.  Get hit in the head with a bat a few times, and you go
and find yourself a helmet.





Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.