Computer Chess Club Archives


Subject: Re: Dummy Cadaques Tournament (Long)

Author: Christophe Theron

Date: 11:44:48 01/28/00


On January 28, 2000 at 13:01:05, Michael Neish wrote:

>On January 28, 2000 at 07:27:54, Enrique Irazoqui wrote:
>
>>There is a degree of uncertainty, but I don't think you need 1000 matches of 200
>>games each to have an idea of who is best.
>
>I have to agree with what Christophe says insofar as you need to play a
>certain number of games before you can determine, to a certain (known)
>degree of accuracy, what the rating difference is between two programs.
>You will never know exactly of course, hence the standard deviation
>figures given next to the Elo ratings of human Chess players, which are
>sometimes overlooked.  I haven't read Elo's book, but from what I know
>of the Elo system he must have taken all this probability stuff into
>account when he formulated it, so meaningless it is not.  In fact, it is
>the core of the entire system.
>
>If the rating difference between two programs is quite small, say less than
>35 points, then I'm afraid you will definitely need a lot of games to sort it
>out from the results alone.  A 20-game match solves nothing.  Christophe,
>if you're reading this, could you tell us what is the minimum Elo difference
>that a 20-game match can estimate to a good degree of confidence?


With a 20-game match, you can determine whether program A is 77 Elo points above
program B, with 80% confidence.

If the programs are closer in Elo, a 20-game match is not enough.

I have replied to Enrique and given a complete table in my answer; you might
find it very interesting.

The key point is that as the Elo difference gets smaller, the number of games
you need to play increases tremendously.
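
To illustrate how quickly the number of games grows, here is a minimal Python
sketch. It is not the program that produced the table mentioned above, and its
figures will not match that table. It assumes the standard logistic Elo curve,
no draws, and a one-sided normal approximation; the function names
expected_score and games_needed are made up for this example.

```python
import math
from statistics import NormalDist


def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff, confidence=0.80):
    """Rough number of games for the stronger side's expected lead over 50%
    to equal z standard errors (one-sided normal approximation).

    Assumption: every game is decisive (no draws), so the per-game
    standard deviation is sqrt(p * (1 - p)).
    """
    if elo_diff <= 0:
        raise ValueError("elo_diff must be positive")
    p = expected_score(elo_diff)
    z = NormalDist().inv_cdf(confidence)
    sigma = math.sqrt(p * (1.0 - p))
    return math.ceil((z * sigma / (p - 0.5)) ** 2)


if __name__ == "__main__":
    for diff in (77, 40, 20, 10, 5):
        print(f"{diff:3d} Elo: about {games_needed(diff)} games")
```

Under these assumptions the required number of games grows roughly with the
inverse square of the Elo difference, so halving the difference roughly
quadruples the number of games. Draws reduce the per-game variance, so real
engine matches will give somewhat different figures.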




>The assertions that I made in my original post, which Christophe commented
>on, were that even with programs of equal strength you can expect the sort of
>fluctuations that I showed.  This is an inescapable fact.  You could try
>playing Fritz 6a against Fritz 6a twenty times and see what happens.  You
>have about a chance in six of getting a 10-10 score, even though it's the
>same program. In fact, you're much more likely to get an 11-9 score, or
>even 12-8.
>
>>Kasparov has been the undisputed best for many years. From 1984 until now, he
>>played a total of 772 rated games. He needed less than half these games to
>>convince everyone about who is the best chess player.
>>
>>This makes more sense to me than the probability stuff of your Qbasic program.
>>Otherwise we would reach the absurd of believing that all the rankings in the
>>history of chess are meaningless, and Capablanca, Fischer and Kasparov had long
>>streaks of luck.
>
>By the way Enrique (just in case you thought so), I hope you haven't taken these
>posts as an attack on the Cadaques tournament, which I, and many other people,
>are very interested in, and which must be hard work for those responsible.  I
>wrote "Cadaques" in the title as an eye-catcher, as it was topical, and I wished
>to express my views once again on the fact that you cannot ignore the natural
>fluctuations that occur when small numbers of games are played.
>
>Anyway, humans are one thing and computers are another.
>
>Humans are subject to moods, good spells, bad spells, psychological warfare, you
>name it.  Draw a lost game and your confidence soars.  Lose a won game and it
>plummets.  Computers, as you know very well, are not susceptible to this sort of
>thing.  They play at the same level day or night, win or lose.  But I'm afraid
>they are still subject to the basic laws of probability, whatever one says, and
>however passionately one may say it.  In fact, their invariability means that
>Christophe's "probability" program applies to them far better than it does to
>humans.
>
>I guess people who take the time to examine Kasparov's games can appreciate
>that he is a fine player.  If you look at his results, of course they are good,
>but you don't get the full picture.  The same with computer games I suppose.
>Kasparov has played fewer than 200,000 games of course, but then again
>there is quite a gap between him and the next man.  The Elo rating reflects
>this.  Now who's a better player, Morozevich or Leko?  If they play a 20-game
>match and, say, Leko wins 15-5, is it conclusively proved that Leko is
>better?  Maybe he's just good at psyching out his opponent and inducing him
>to play worse than usual.  Maybe this should be considered as one aspect of
>Chess skill.
>
>There are a certain number of games you need to play to be reasonably sure
>of who is best.  The closer the players are in strength, naturally the more
>games they will need to play, right? It shouldn't be difficult to work it out
>-- I've never had reason to do so.  In fact, the number of games can be
>worked out exactly.  Maths may be drab, boring, a drag, a conversation killer,
>call it what you want, but it's there whether one likes it or not, and its
>effects pervade the whole business of Chess and match playing.  You sit
>at the board to play Your Game; you think you are in control of the situation,
>but your performance at the board is subject to the same laws regardless.
>
>Cheers,
>
>Mike.
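
The figures quoted above for a 20-game match between two copies of the same
program can be checked with an exact binomial calculation. This is only a
sketch: it assumes every game is decisive (no draws) and that each copy wins
with probability 0.5, which is a simplification for engine games. The function
name score_distribution is made up for this example.

```python
from math import comb


def score_distribution(games=20, p_win=0.5):
    """Return P(first player wins exactly k games) for k = 0..games.

    Assumption: no draws, independent games, fixed win probability.
    """
    return [
        comb(games, k) * p_win ** k * (1.0 - p_win) ** (games - k)
        for k in range(games + 1)
    ]


dist = score_distribution()
p_10_10 = dist[10]            # exactly level
p_11_9 = dist[11] + dist[9]   # 11-9 with either copy ahead
p_12_8 = dist[12] + dist[8]   # 12-8 with either copy ahead

print(f"10-10: {p_10_10:.3f}")  # 0.176
print(f"11-9:  {p_11_9:.3f}")   # 0.320
print(f"12-8:  {p_12_8:.3f}")   # 0.240
```

Under this model a 10-10 result has probability of about 0.176, close to the
quoted one chance in six. An 11-9 or 12-8 result is more likely than 10-10 only
when either copy may be the one ahead; for one particular copy to lead 11-9
the probability is about 0.160.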



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.