Computer Chess Club Archives



Subject: Re: WCCC vs auto232

Author: Robert Hyatt

Date: 17:07:27 09/14/00



On September 14, 2000 at 12:51:50, James T. Walker wrote:

>At the risk of being on the wrong side of programmers, I have to agree with
>Enrique.  I think programmers, especially commercial programmers, put a lot of
>emphasis on the World Championship because of the extra sales it might bring.
>In my opinion it is just another tournament with a lot at stake.  Testing
>programs like the SSDF does is a more "real world" situation.  Still the best
>test of all would be hundreds of games vs humans.  But even better than that is
>that you test it yourself and decide if it does what YOU want.
>Jim Walker


It really isn't "real world".  I.e., when you buy a new engine to play against,
how often do you play hundreds of games?  The SSDF test method tests one
particular aspect of an engine quite well:  how will it adapt after it wins
or loses a game?  But if you are playing just one game against Kasparov, that
aspect is totally unimportant.  If you are playing in a WMCCC event, you might
well spend a lot of time preparing book lines to kill your opponents.  I don't
have time for this so I usually try to prepare lines to take the opponent out
of book before his book can kill me (I didn't have time for this in the recent
WMCCC event).  So the WMCCC event tests programmer preparation more than any
other factor.  I think ICC is the _hardest_ test to pass.  You will play
humans at IM and GM strength, and they will play hundreds of games, trying to
crack your book, or find a positional weakness they can exploit, and then they
will do so over and over until you fix it.  Opening preparation is no good
there if you play 100% automatically.  You _must_ have some randomness or you
get killed no matter how good your "good lines" are.
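
As a rough illustration of the "randomness" and "learning" ideas above, here is a
minimal Python sketch of weighted random book-move selection with a simple
result-based weight update.  Everything in it is an assumption made for
illustration: the book layout, the names pick_book_move and learn_from_result,
and the step and floor parameters are hypothetical and are not taken from any
particular engine.

```python
import random

# Hypothetical opening book: position key -> {move: weight}.
# The weights are made-up numbers, not real book statistics.
book = {
    "startpos": {"e2e4": 50, "d2d4": 40, "c2c4": 10},
}


def pick_book_move(book, position, rng=random):
    """Pick a book move at random in proportion to its weight, or None."""
    candidates = {m: w for m, w in book.get(position, {}).items() if w > 0}
    if not candidates:
        return None  # out of book; a real engine would search here
    moves = list(candidates)
    weights = [candidates[m] for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]


def learn_from_result(book, position, move, score, step=0.25, floor=1):
    """Scale a move's weight after a game (score: 1 win, 0.5 draw, 0 loss)."""
    weights = book.get(position, {})
    if move not in weights:
        return
    factor = 1 + step * (2 * score - 1)  # win: 1.25, draw: 1.0, loss: 0.75
    # The floor keeps a losing line playable, so some variety always remains.
    weights[move] = max(floor, weights[move] * factor)


move = pick_book_move(book, "startpos")
learn_from_result(book, "startpos", move, score=0)  # pretend the game was lost
```

Because the choice is random, an opponent who finds one winning line cannot
count on getting it again every game, and the weight update makes a line that
keeps losing come up less often.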

I think it is a question of 'benchmarking'.  When someone asks _me_ which
processor to buy, I don't randomly say "buy Intel" or "buy AMD".  I _always_
say "benchmark the software you want to run and use the result to choose."
Because different programs respond differently to different processors.  And
the processor you get the best results on won't necessarily be the one I get
the best results on.  Similarly, you should test an engine in the environment
you expect to use it in.  If you want to annotate/analyze your games for errors,
who cares how its "learning" works?  If you want to run automatically on the
servers using one of the new auto-interfaces that are available, you had
_better_ have learning facilities or you get cooked, but good.  If your goal is
to troll around ICC trying to produce the most inflated rating you can, you
should probably choose the program at the top of the SSDF list (for the record,
I dislike ICC computer accounts that run multiple programs...  it is not a
reasonable thing to do under one account, any more than it would be reasonable
to have a GM, an IM and a patzer all playing using one handle.)  If your
intent is to find the strongest opponent for yourself or to use against a human,
then the SSDF list is not the place to look.

That's about as simply as I can explain what ought to be going on when someone
looks at the various results.  Just because a program is on top of the SSDF does
_not_ mean it will do the best against GM players.  Just because a program does
better than others against GM players does not mean it will do better on the
SSDF list.  And you can add any sub-combinations of the above that you like
to the discussion.  Some SSDF games are played in long matches where learning
is critical.  Some are played as single games where learning is not used at
all.  Learning doesn't flow across two different testers either.  Ditto on the
chess servers and at WMCCC events...

All very cloudy, IMHO.  Hard to see the facts through the heavy "fog"...




Last modified: Thu, 15 Apr 21 08:11:13 -0700
