Author: Christophe Theron
Date: 14:11:17 01/28/00
On January 28, 2000 at 16:36:57, Amir Ban wrote:
>On January 28, 2000 at 04:07:44, Christophe Theron wrote:
>
><snip>
>
>>I have just run it. My sample is 1000 matches. Each match is made of 200 games.
>>My program tells me that with 200 games I can only be sure that one program is
>>stronger if the elo difference of the two is above 35 elo points, and this is
>>sure with a 93.5% confidence.
>>
>>If the programs are closer than 35 elo points, 200 games are not enough to be
>>sure which is best.
>>
>>Number of matches: 1000
>>Number of games in each match: 200
>>Compute probability of error greater than: 5
>>
>>
>>
>> Christophe
>>
>>
>
>Something wrong with the numbers here: 200x1000 games are good enough to
>establish a rating with a 95% confidence margin of 1.5 points.
But this is not what I computed.
I computed the average error margin of a 200-game match by simulating 1000
such matches.
The experimental result I get with this simple program is that, with 80%
confidence, the error margin is below or equal to 3.5% in 200-game matches.
Does it fit with your own numbers? I'm interested in this.
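A minimal sketch of this kind of simulation, not the original program: it plays many matches between two programs of exactly equal strength and reports the score deviation from 50% that a given fraction of matches stays within. The function names, the `draw_rate` parameter, and the seed are assumptions made for illustration; the thread does not say how draws were modelled.

```python
import random

def simulate_match(n_games, draw_rate, rng):
    """Score fraction of one side in a match between two equal programs."""
    win_rate = (1.0 - draw_rate) / 2.0  # wins and losses are equally likely
    score = 0.0
    for _ in range(n_games):
        r = rng.random()
        if r < win_rate:
            score += 1.0          # win
        elif r < win_rate + draw_rate:
            score += 0.5          # draw
        # otherwise a loss, which adds nothing
    return score / n_games

def error_margin(n_matches=1000, n_games=200, draw_rate=0.0,
                 confidence=0.80, seed=1):
    """Deviation from 50% that `confidence` of the simulated matches stay within."""
    rng = random.Random(seed)
    errors = sorted(
        abs(simulate_match(n_games, draw_rate, rng) - 0.5)
        for _ in range(n_matches)
    )
    index = min(n_matches - 1, int(confidence * n_matches))
    return errors[index]

if __name__ == "__main__":
    # draw_rate values are illustrative assumptions, not taken from the thread
    for draw_rate in (0.0, 0.4):
        margin = error_margin(draw_rate=draw_rate)
        print(f"draw rate {draw_rate:.0%}: 80% of matches within {margin:.1%}")
```

The measured margin depends noticeably on the assumed draw rate, since draws reduce the variance of a match score, so any comparison with the 3.5% figure should state which draw rate was used.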
> If two programs are
>35 points apart, you would need only about 400 games to tell with 95%
>confidence which is better.
I have not tried to establish the table for 95% confidence, but your numbers
sound OK to me.
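The two figures quoted above (a margin of about 1.5 points for 200x1000 games, and about 400 games to separate programs 35 points apart) can be checked with a normal approximation. This is a sketch under stated assumptions: no draws, the standard logistic Elo curve, and a two-sided 95% level (z = 1.96). The function names are hypothetical, and the thread does not say which formula Amir used.

```python
import math

def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_margin(n_games, z=1.96):
    """Approximate +/- Elo margin near equal strength after n_games (no draws)."""
    # Standard deviation of a single game result is 0.5 when p = 0.5.
    score_margin = z * 0.5 / math.sqrt(n_games)
    # Slope of the Elo curve at a 50% score: 400 / (ln(10) * p * (1 - p)).
    elo_per_score = 400.0 / (math.log(10.0) * 0.25)
    return score_margin * elo_per_score

def games_needed(elo_diff, z=1.96):
    """Games needed before the expected lead exceeds z standard errors (no draws)."""
    p = expected_score(elo_diff)
    sigma = math.sqrt(p * (1.0 - p))   # per-game standard deviation
    return math.ceil((z * sigma / (p - 0.5)) ** 2)

if __name__ == "__main__":
    print(f"margin after 200 x 1000 games: +/- {elo_margin(200 * 1000):.2f} Elo")
    print(f"games needed for a 35 Elo gap: {games_needed(35)}")
```

Under these assumptions the margin comes out near 1.5 Elo and the game count a little under 400, which is consistent with the quoted numbers. A realistic draw rate would lower both.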
>This also fails to say something important: The greater the difference in
>strength, the fewer games needed to prove who is better. If players are 100
>points apart, only about 50 games are needed. A 200 point difference would show
>up almost immediately.
If you run my program for a while, it quickly becomes obvious.
Christophe
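The effect of the rating gap on match length can also be seen by simulation. The sketch below estimates how often the stronger program simply finishes ahead in a fixed-length match, which is a cruder measure than a confidence test. It assumes no draws and the logistic Elo curve; the function names and the 50-game match length are chosen only for illustration.

```python
import random

def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def stronger_ahead_rate(elo_diff, n_games, n_matches=2000, seed=1):
    """Fraction of simulated matches in which the stronger side scores over 50%."""
    rng = random.Random(seed)
    p = expected_score(elo_diff)
    ahead = 0
    for _ in range(n_matches):
        # No draws assumed: each game is a win for the stronger side with probability p.
        wins = sum(1 for _ in range(n_games) if rng.random() < p)
        if 2 * wins > n_games:
            ahead += 1
    return ahead / n_matches

if __name__ == "__main__":
    for diff in (35, 100, 200):
        rate = stronger_ahead_rate(diff, n_games=50)
        print(f"{diff:3d} Elo gap, 50 games: stronger side ahead in {rate:.1%} of matches")
```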
>I think there's also a logical trap that even the smartest fall into. When
>people see for example the SSDF list, and see their 95% confidence intervals,
>they jump to the conclusion that if the point spread is within this interval, it
>has NO significance, which is not true. I can very well make statements based on
>only 80% (gasp!) probability. I expect to be right 80% of the time, and in most
>cases I will pass for a very smart person.
>
>Amir