Computer Chess Club Archives



Subject: Re: Dummy Cadaques Tournament (Long)

Author: Michael Neish

Date: 10:01:05 01/28/00



On January 28, 2000 at 07:27:54, Enrique Irazoqui wrote:

>There is a degree of uncertainty, but I don't think you need 1000 matches of 200
>games each to have an idea of who is best.

I have to agree with what Christophe says insofar as you need to play a
certain number of games before you can determine, to a certain (known)
degree of accuracy, what the rating difference is between two programs.
You will never know exactly of course, hence the standard deviation
figures given next to the Elo ratings of human Chess players, which are
sometimes overlooked.  I haven't read Elo's book, but from what I know
of the Elo system he must have taken all this probability stuff into
account when he formulated it, so meaningless it is not.  In fact, it is
the core of the entire system.

If the rating difference between two programs is quite small, say less than
35 points, then I'm afraid you will definitely need a lot of games to sort it
out from the results alone.  A 20-game match solves nothing.  Christophe,
if you're reading this, could you tell us what is the minimum Elo difference
that a 20-game match can estimate to a good degree of confidence?
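A rough way to put a number on that question, offered as a sketch rather than a definitive answer: assume decisive games only (no draws), two programs that are equal under the null hypothesis, a normal approximation to the match score, and the usual logistic Elo curve. The function names and the 95% threshold (z = 1.96) are my own choices for illustration, not anything taken from Elo's book or from Christophe's program.

```python
from math import log10, sqrt


def elo_from_expected_score(expected: float) -> float:
    """Invert the logistic Elo curve: expected score (0..1) -> rating difference."""
    return 400 * log10(expected / (1 - expected))


def min_detectable_elo(games: int, z: float = 1.96) -> float:
    """Smallest Elo gap whose expected score clears the noise band of a match.

    Assumption: each game is a win or a loss, so the per-game standard
    deviation for equal opponents is 0.5 and the standard error of the
    average score over `games` games is 0.5 / sqrt(games).
    """
    margin = z * 0.5 / sqrt(games)
    threshold = 0.5 + margin
    if threshold >= 1.0:
        # The match is too short to separate any finite rating gap.
        return float("inf")
    return elo_from_expected_score(threshold)


if __name__ == "__main__":
    for n in (20, 100, 400, 2000):
        print(n, "games:", round(min_detectable_elo(n)), "Elo")
```

Under these assumptions a 20-game match only separates programs that are well over 100 Elo apart. Draws reduce the per-game variance, so real matches do somewhat better than this, but not by enough to rescue a 35-point difference.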

The assertions that I made in my original post, which Christophe commented
on, were that even with programs of equal strength you can expect the sort of
fluctuations that I showed.  This is an inescapable fact.  You could try
playing Fritz 6a against Fritz 6a twenty times and see what happens.  You
have about a chance in six of getting a 10-10 score, even though it's the
same program. In fact, you're much more likely to get an 11-9 score one way
or the other, or even 12-8.
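These figures can be checked with a few lines of binomial arithmetic. The sketch below assumes every game is decisive (no draws) and that each side wins any given game with probability 0.5; the function names are mine, for illustration only.

```python
from math import comb


def exact_score_probability(wins: int, games: int = 20, p_win: float = 0.5) -> float:
    """Probability that one named side wins exactly `wins` of `games` decisive games."""
    return comb(games, wins) * p_win**wins * (1 - p_win) ** (games - wins)


def either_way_probability(wins: int, games: int = 20) -> float:
    """Probability of a `wins`-to-(games - wins) result for equal sides, whoever is ahead."""
    if 2 * wins == games:
        return exact_score_probability(wins, games)
    return exact_score_probability(wins, games) + exact_score_probability(games - wins, games)


if __name__ == "__main__":
    print("10-10:", round(either_way_probability(10), 4))  # 0.1762
    print("11-9 :", round(either_way_probability(11), 4))  # 0.3204
    print("12-8 :", round(either_way_probability(12), 4))  # 0.2403
```

So a level score turns up roughly one time in six, while 11-9 and 12-8 (counting either program as the winner) are each more likely than 10-10.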

>Kasparov has been the undisputed best for many years. From 1984 until now, he
>played a total of 772 rated games. He needed less than half these games to
>convince everyone about who is the best chess player.
>
>This makes more sense to me than the probability stuff of your Qbasic program.
>Otherwise we would reach the absurd of believing that all the rankings in the
>history of chess are meaningless, and Capablanca, Fischer and Kasparov had long
>streaks of luck.

By the way Enrique (just in case you thought so), I hope you haven't taken these
posts as an attack on the Cadaques tournament, which I, and many other people,
are very interested in, and which must be hard work for those responsible.  I
wrote "Cadaques" in the title as an eye-catcher, as it was topical, and I wished to
express my views once again on the fact that you cannot ignore the natural
fluctuations that occur when small numbers of games are played.

Anyway, humans are one thing and computers are another.

Humans are subject to moods, good spells, bad spells, psychological warfare, you
name it.  Draw a lost game and your confidence soars.  Lose a won game and it
plummets.  Computers, as you know very well, are not susceptible to this sort of
thing.  They play at the same level day or night, win or lose.  But I'm afraid
they are still subject to the basic laws of probability, whatever one says, and
however passionately one may say it.  In fact, their invariability means that my
Christophe's "probability" program applies to them far better than it does to
humans.

I guess people who take the time to examine Kasparov's games can appreciate
that he is a fine player.  If you look at his results, of course they are good,
but you don't get the full picture.  The same with computer games I suppose.
Kasparov has played fewer than 200,000 games of course, but then again
there is quite a gap between himself and the next man.  The Elo rating reflects
this.  Now who's a better player, Morozevich or Leko?  If they play a 20-game
match and, say, Leko wins 15-5, is it conclusively proved that Leko is
better?  Maybe he's just good at psyching out his opponent and inducing him
to play worse than usual.  Maybe this should be considered as one aspect of
Chess skill.

There are a certain number of games you need to play to be reasonably sure
of who is best.  The closer the players are in strength, naturally the more
games they will need to play, right? It shouldn't be difficult to work it out
-- I've never had reason to do so.  In fact, the number of games can be
worked out exactly.  Maths may be drab, boring, a drag, a conversation killer,
call it what you want, but it's there whether one likes it or not, and its
effects pervade the whole business of Chess and match playing.  You sit
at the board to play Your Game; you think you are in control of the situation,
but your performance at the board is subject to the same laws regardless.
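As a minimal sketch of how that number of games might be worked out: assume decisive games only (no draws), the standard logistic Elo expectation, and a normal approximation, and ask how many games are needed before the stronger side's expected score sits z standard errors above 50%. The names and the default z = 1.96 (roughly 95% confidence) are assumptions of mine, not part of the Elo system itself.

```python
from math import ceil, sqrt


def expected_score(elo_diff: float) -> float:
    """Expected score per game for the stronger side under the logistic Elo curve."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


def games_needed(elo_diff: float, z: float = 1.96) -> int:
    """Games required for the expected edge to exceed z standard errors.

    Assumption: win/loss games only, so the per-game standard deviation
    is sqrt(e * (1 - e)) where e is the expected score.
    """
    if elo_diff == 0:
        raise ValueError("equal programs can never be separated by results alone")
    e = expected_score(abs(elo_diff))
    sigma = sqrt(e * (1 - e))
    return ceil((z * sigma / (e - 0.5)) ** 2)


if __name__ == "__main__":
    for diff in (10, 35, 100, 200):
        print(diff, "Elo:", games_needed(diff), "games")
```

On these assumptions a 35-point gap needs several hundred games, while a 200-point gap shows up in a dozen or so. The required number grows roughly with the inverse square of the rating difference, which is why close programs need so many more games.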

Cheers,

Mike.





Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.