Computer Chess Club Archives



Subject: Re: Dummy Cadaques Tournament (Long)

Author: Michael Neish

Date: 09:44:38 01/28/00




>I think the difference is the gap in playing strength between Kasparov and the
>others! If you take Fritz 5.32 for example and let it play several blitz
>tournaments with 10 or 12 rounds against public domain winboard engines, you
>will see it wins almost every tournament (like Kasparov).

Exactly.  Now who is the better player, Morozevich or Kramnik?  Have any
tournaments resolved this matter?  No, and that's why their rating difference
is 10 points.  If they were computers, and always performed consistently and
always stayed at the same strength, I wouldn't be surprised if more than a
century of tournament play were needed (at the human rate of competition)
to resolve the question.
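
To put a rough number on the "century" guess, here is a minimal sketch.  It
assumes the standard Elo logistic curve, independent games, and no draws
(draws would lower the variance and so the game count), and it uses a 95%
threshold (z = 1.96).  The function names are mine, purely for illustration.

```python
import math

def expected_score(elo_diff):
    """Expected score fraction of the higher-rated side (standard Elo logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def games_needed(elo_diff, z=1.96):
    """Rough number of games before the expected lead exceeds z standard errors.

    Assumptions: independent games, no draws, so the per-game standard
    deviation of the score is about 0.5.
    """
    edge = expected_score(elo_diff) - 0.5   # expected surplus per game
    return math.ceil((z * 0.5 / edge) ** 2)

if __name__ == "__main__":
    print(f"Expected score at +10 Elo: {expected_score(10):.4f}")
    print(f"Games needed to resolve a 10-point gap: {games_needed(10)}")
```

Under these assumptions the answer comes out in the thousands of games, which
is a very long time at the rate two humans actually meet over the board.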

If you're curious, why don't you get your pet program and play it against
itself over a 20-game match?  Of course you'd have to call one "Player A"
and the other "Player B" so you can add up the scores as you go along.
Ignoring draws, you have only about a one-in-six chance of getting a 10-10
score.  Try it.  Let's say you get a 12-8 score (a 12-8 result one way or the
other is more likely than 10-10, believe it or not), and then calculate their
Elo ratings based on these results.  Then lo and behold, Player A, say, gets a
higher rating than B.  But they are the same program!  Well, this is what I
think is going on when computers play each other (if someone could describe
how this whole computer rating business works I'd be very grateful).  But
never mind this.  What are people thinking?  Fritz plays Junior over 24 games
and beats it 15-9.  A big win: Fritz must be the better program.  Then later
on Fritz stumbles against Crafty.  What's going on here?  Nothing is going on:
these variations are precisely what you expect.
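
For anyone who wants to check the self-play arithmetic, here is a small
sketch.  It treats each game as a fair coin flip with no draws (an assumption;
real engine matches have plenty of draws), and it converts a match score to an
Elo gap with the usual logistic formula.  The helper names are illustrative.

```python
from math import comb, log10

def prob_score(wins, games=20):
    """Probability that Player A wins exactly `wins` of `games` fair coin-flip games."""
    return comb(games, wins) / 2 ** games

def elo_diff_from_score(points, games):
    """Elo gap implied by a match score, using the standard logistic Elo curve."""
    p = points / games
    return 400 * log10(p / (1 - p))

# 10-10 can only happen one way; 12-8 can go to either player.
p_10_10 = prob_score(10)
p_12_8_either = prob_score(12) + prob_score(8)

if __name__ == "__main__":
    print(f"P(10-10)             = {p_10_10:.4f}")
    print(f"P(12-8 either way)   = {p_12_8_either:.4f}")
    print(f"Elo gap from 12-8    = {elo_diff_from_score(12, 20):.1f}")
```

With these assumptions 10-10 comes up about 18% of the time, 12-8 in one
direction or the other about 24% of the time, and a 12-8 result reads as a gap
of roughly 70 Elo between two copies of the same program.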

So by considering results alone you are not getting the full picture.  You
can get more information by observing how the program plays, if you
are a good enough player to be able to do that.  Someone said in an
earlier post that someone else commented that he only needed one game
to know who is better.  There is some truth in this because you can
analyse his moves to see how deeply he thinks and how much he
understands.  But on the other hand, if there isn't much difference
between the players you can easily get the wrong impression.  Maybe
A handles Q-Pawn openings better than B, or whatever.  So there is
only so far you can go.  Maybe Computer A is tactically stronger than
B, but B handles closed positions better.  It sounds impressive when
someone says "I only need one game to know who's best", worthy of
a quotation, but if you think about it a little bit you realise it's
completely wrong.

I set off this thread in the first place.  My mission was this: to remind
people that the fluctuations are there.  Because of fluctuations you
cannot decide from a 20-game match which program is better, unless
the difference between them is pretty large (I don't know the exact
value.  Maybe Christophe can unleash his program on it).  A 420-game
tournament is obviously better, although it also has its limits.  I will
try to work them out over the weekend if I have time and report
back.
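
In the meantime, a back-of-the-envelope sketch of how large the gap has to be
before a match of a given length can show it.  This is only an estimate under
stated assumptions: independent games, no draws, a normal approximation to the
score, and a 95% threshold (z = 1.96).  The function name is made up for the
example.

```python
import math

def detectable_elo_gap(games, z=1.96):
    """Smallest Elo gap whose expected score sits z standard errors above 50%.

    Assumptions: independent games, no draws (per-game standard deviation
    of about 0.5), normal approximation to the match score.
    """
    std_err = 0.5 / math.sqrt(games)   # standard error of the score fraction
    p = 0.5 + z * std_err              # score fraction at the threshold
    if p >= 1.0:
        return math.inf                # match too short to detect any gap
    return 400 * math.log10(p / (1 - p))

if __name__ == "__main__":
    for n in (20, 420):
        print(f"{n:4d} games: gap of about {detectable_elo_gap(n):.0f} Elo")
```

On these assumptions a 20-game match only separates programs that are well
over 100 Elo apart, while 420 games brings the threshold down to a few dozen
points.  Draws would tighten both figures somewhat.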

Cheers,

Mike.








Last modified: Thu, 15 Apr 21 08:11:13 -0700
