Computer Chess Club Archives



Subject: Re: Logistical questions

Author: Bruce Moreland

Date: 23:30:18 12/25/00



On December 25, 2000 at 23:41:38, Roger D Davis wrote:

>Actually, statistical significance is a function of sample size and effect size.
>Sample size is number of persons in a study, for example, or the number of times
>a particular value in an experiment was recorded. The effect size is regarded as
>the true difference between the means of the samples on the relevant
>variable(s). Statistical significance is always a function of sample size AND
>effect size. If the effect size is large, as with a large difference of ability
>between two players, then fewer games will be needed to determine who is the
>significant winner (as opposed to just the winner, which may be statistically
>meaningless).
>
>Conversely, it is certainly possible that the match could stretch out forever
>between evenly matched opponents. Obviously, this statement is tautological,
>since that's what it means to be evenly matched in the first place...it is
>trivially true. One way of dealing with this possibility is to set a minimum and
>maximum number of games. If players reach the maximum number of games without a
>statistically significant result, then the tie could go to the old champion, or
>there could be no champion.
>
>Alternately, a draw could be agreed unless both players wanted to play on, in
>which case the match could continue another ten games or whatever.
>
>There are many ways of constructing it that I would consider interesting.
>Obviously, Bruce has another opinion.

I think it could stretch out forever between *unevenly* matched opponents.

We currently have a gap of 75 Elo points between #1 and #2.  The gap between #2
and #3 is 2 points, and #4 is another 18 points back.  In fact, 75 points can
get you all the way from #2 to #11: those two guys are 75 points apart, with 8
players in between.

But let's assume to start with that you get players this amazing 75 Elo points
apart.  How many games would it take, on average, to get statistical
significance?

I wrote a program that figures some of this out.  I chose to look at the case of
50 games, since that's about twice as many as have been played in the past, and
I take this as an impractical number.  I don't think you'd find participants who
would agree to a 50-game match.  It would be even less likely if you told them
they would have to demonstrate clear superiority or they'd share the prize.

Ok, let's look at 50 games.  I believe my program shows that, with a decent draw
percentage (I chose 45%), if you have two equal players, and you contend that
one is better than the other, and your player scores 30.0 points, you'll be
wrong less than 5% of the time.

So if you say that your guy is better, and he scores 30 points, you are probably
right.
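
The 5% figure can be checked with a short exact calculation.  The following is
a minimal sketch, not the program mentioned above: it assumes every game is
independent, that the draw rate is a fixed 45% regardless of the rating gap,
that the expected score follows the usual logistic Elo formula, and that
"scores 30.0 points" means "scores at least 30.0 points".  All function names
are made up for illustration.

```python
def expected_score(elo_delta):
    """Logistic Elo expectation for a player rated elo_delta above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_delta / 400.0))


def outcome_probs(elo_delta, draw_rate):
    """Split the expected score into (win, draw, loss) for a fixed draw rate.

    Assumption: the draw rate does not depend on the rating gap.
    """
    win = expected_score(elo_delta) - draw_rate / 2.0
    loss = 1.0 - draw_rate - win
    if win < 0.0 or loss < 0.0:
        raise ValueError("draw rate is too high for this Elo difference")
    return win, draw_rate, loss


def score_distribution(games, win, draw, loss):
    """Exact distribution of the match score, indexed by half-points."""
    dist = [1.0]  # before any game: 0 half-points with probability 1
    for _ in range(games):
        nxt = [0.0] * (len(dist) + 2)
        for half_points, p in enumerate(dist):
            nxt[half_points] += p * loss      # loss adds 0
            nxt[half_points + 1] += p * draw  # draw adds 1/2
            nxt[half_points + 2] += p * win   # win adds 1
        dist = nxt
    return dist


def prob_score_at_least(games, points, elo_delta, draw_rate=0.45):
    """Probability that the player scores at least `points` in `games` games."""
    win, draw, loss = outcome_probs(elo_delta, draw_rate)
    dist = score_distribution(games, win, draw, loss)
    return sum(dist[int(round(points * 2)):])


if __name__ == "__main__":
    # Equal players: how often does one of them reach 30/50 by luck alone?
    print("equal players, 30/50:", prob_score_at_least(50, 30.0, 0.0))
    # A player who really is 75 Elo better: how often does he reach 30/50?
    print("75 Elo better, 30/50:", prob_score_at_least(50, 30.0, 75.0))
```

Under these assumptions the first number comes out below 5%.  The second number
depends on how draws are modelled, so it need not match the figures quoted in
this post exactly.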

First of all, no GM is going to consent to a match where he has to win by 30-20
in order to be the winner of the match.  But let's assume that the two best,
separated by 75 Elo points, are willing to go for it.

I think that someone could get 30 out of 50, if they really are 75 points
better, about half the time (51%).  The second best guy would have to be a
complete idiot to agree to this, by the way.  All he can hope for, essentially,
is a draw; he has a vanishingly small chance of scoring 30 out of 50 if he's
really 75 Elo points worse.  "There's Anand, trying to hang onto
statistical insignificance with three games to go.  Can he do it?"

A 50-game match is only a waste half the time.  If you are willing to go 100
games, I think you could pick a champion 85% of the time, given this 75 Elo
point delta.

Of course, this assumes that the Elo delta is 75, which is a very fortunate
condition for you.  If you assume a 25 point Elo delta between the best player
and the second best player, your chances of finding a winner in 50 games are
like 9%, and in 100 games are something like 20%.
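
The "chance of finding a winner" for a given match length and Elo delta can be
sketched the same way: find the lowest score an equal player would reach less
than 5% of the time, then ask how often the genuinely stronger player reaches
it.  This is a rough illustration under assumed conditions (independent games,
a fixed 45% draw rate, logistic Elo expectation, one-sided 5% threshold), with
hypothetical function names.  The original program may have used a different
model, so the percentages it prints can differ from the ones quoted here.

```python
def score_distribution(games, elo_delta, draw_rate):
    """Exact match-score distribution, indexed by half-points.

    Assumptions: independent games, logistic Elo expectation, and a draw
    rate that does not depend on the rating gap.
    """
    expected = 1.0 / (1.0 + 10.0 ** (-elo_delta / 400.0))
    win = expected - draw_rate / 2.0
    loss = 1.0 - draw_rate - win
    if win < 0.0 or loss < 0.0:
        raise ValueError("draw rate is too high for this Elo difference")
    dist = [1.0]
    for _ in range(games):
        nxt = [0.0] * (len(dist) + 2)
        for half_points, p in enumerate(dist):
            nxt[half_points] += p * loss
            nxt[half_points + 1] += p * draw_rate
            nxt[half_points + 2] += p * win
        dist = nxt
    return dist


def critical_half_points(games, draw_rate=0.45, alpha=0.05):
    """Smallest half-point total that an equal player reaches with
    probability strictly below alpha."""
    dist = score_distribution(games, 0.0, draw_rate)
    tail = 0.0  # probability of scoring strictly more than the current total
    for half_points in range(len(dist) - 1, -1, -1):
        if tail + dist[half_points] >= alpha:
            return half_points + 1
        tail += dist[half_points]
    return 0


def power(games, elo_delta, draw_rate=0.45, alpha=0.05):
    """Chance that a player who is elo_delta stronger reaches the threshold."""
    crit = critical_half_points(games, draw_rate, alpha)
    dist = score_distribution(games, elo_delta, draw_rate)
    return sum(dist[crit:])


if __name__ == "__main__":
    for games in (50, 100):
        needed = critical_half_points(games) / 2.0
        for delta in (75.0, 25.0):
            print(games, "games,", delta, "Elo: need", needed,
                  "points, reached with probability", power(games, delta))
```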

So every few years you get a clear champion, the rest of the time you have
co-champions.  But why stop there?  Why not have a huge multiple round-robin
tournament?  You could spend many entertaining years trying to prove that one of
the top ten was actually best.

Of course by that time, someone else would be best, so you'd have to start over.

bruce


