Author: Uri Blass
Date: 15:42:47 02/03/01
On February 03, 2001 at 17:49:02, Bruce Moreland wrote:

>On February 01, 2001 at 17:08:36, Amir Ban wrote:
>
>>On January 31, 2001 at 20:17:17, Bruce Moreland wrote:
>>
>>>I expressed very forcefully that a 10-0 result was more valid than a 60-40
>>>result.
>>>
>>>I've done some experimental tests and it appears that I'm wrong.
>>>
>>
>>No, you were right the first time. Check again.
>>
>>10-0 gets better than 99.9% confidence for the winner to be better.
>>
>>60-40 has about 95% confidence.
>>
>>To calculate confidence, you assume the null hypothesis, which is that the
>>result is NOT significant and is a random occurrence between equals. You
>>calculate the probability for that, and subtract from 1 to get confidence.
>>
>>Amir
>
>I've been dealing with a fever for the past two days so I haven't come back to
>this.
>
>I think this stuff is all very important. I have seen endless conclusions about
>computer chess strength, which are based upon intuition and common sense, which
>I think means that they are often wrong.
>
>We're clearly working in the realm of statistics here, but I think that most
>people aren't interested in doing proper statistical analysis.
>
>I want to try to change this, but I admit that I am not qualified. I have some
>math ability, but I haven't taken a statistics course, and I don't know any
>experts on the subject.

I took some statistics courses at university, and I think that statistics courses can be misleading because they do not address the question you are raising.

I started to think about that question only after reading your posts about the test of stopping the match when there is a difference of 7.

<snipped>

>If you know the two programs are 16 Elo points apart, and you do a match and
>happen to get a 10-0 score, and you declare the winner of the match to be the
>stronger one, you will be correct about 75% of the time.
>
>If you do a match and get a 60-40 score, and you declare the winner of the match
>to be the stronger one, you will be correct about 85% of the time.
>
>If the Elo delta is higher, 60-40 is even more likely to indicate which is
>really stronger.

If you know the Elo delta, only the difference in the score is important. You can check and find that 55-45 and 10-0 give the same result.

The practical case is when you do not know that the difference is 16 Elo, but you do know that 16 is an upper bound.

When the rating difference is unknown, 10-0 is a better indication than 55-45 of which player is better. A 10-0 score suggests that the rating difference is bigger, probably close to 16 Elo, and a bigger difference means a bigger chance that the winner really is the better player.

You can do an experiment and assume that the rating difference is not constant but a random number that can take any value between 0 and 16.

In this case you will find that the probability that the winner is better is bigger for 10-0 than for 55-45.

On the other hand, the mistake that you make when you stop at 55-45 may be smaller, because the rating difference is then probably smaller. So my intuition tells me that stopping after a constant difference is a good rule, because it is logical to take more risks when the change is smaller.

Uri
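
The two claims in this post (that only the score difference matters when the Elo delta is known, and that 10-0 says more than 55-45 when the delta is only bounded by 16) can be checked with a short calculation. What follows is a rough sketch, not anything taken from the thread: it assumes no draws, the standard logistic Elo formula, an equal prior chance for either program to be the stronger one, and a uniform spread of the delta between 0 and 16 for the second case. The function names are made up for the illustration.

```python
def win_prob(elo_delta):
    """Expected score of a player rated elo_delta points above the opponent.
    Assumption: standard logistic Elo curve, draws ignored."""
    return 1.0 / (1.0 + 10.0 ** (-elo_delta / 400.0))


def likelihood(wins, losses, elo_delta):
    """Chance of this win/loss count for a player who is elo_delta stronger.
    The binomial coefficient is left out because it cancels in every ratio below."""
    p = win_prob(elo_delta)
    return p ** wins * (1.0 - p) ** losses


def winner_better_known_delta(wins, losses, elo_delta):
    """P(match winner is the stronger program) when the size of the delta is known
    and either program is equally likely, a priori, to be the stronger one."""
    right = likelihood(wins, losses, elo_delta)    # winner really is stronger
    wrong = likelihood(wins, losses, -elo_delta)   # winner is actually weaker
    return right / (right + wrong)


def winner_better_uniform_delta(wins, losses, max_delta, steps=10000):
    """Same question when the delta is unknown and assumed uniform on [0, max_delta].
    Integrated numerically with the midpoint rule."""
    right = 0.0
    wrong = 0.0
    for i in range(steps):
        d = (i + 0.5) * max_delta / steps
        right += likelihood(wins, losses, d)
        wrong += likelihood(wins, losses, -d)
    return right / (right + wrong)


if __name__ == "__main__":
    for wins, losses in [(10, 0), (55, 45), (60, 40)]:
        known = winner_better_known_delta(wins, losses, 16)
        unknown = winner_better_uniform_delta(wins, losses, 16)
        print(f"{wins}-{losses}: known delta {known:.4f}, uniform delta {unknown:.4f}")
```

Under these assumptions the known-delta probability reduces to 1 / (1 + 10^(-delta * (wins - losses) / 400)), so 10-0 and 55-45 come out identical, while with a uniform delta the 10-0 result comes out slightly ahead of 55-45. The exact percentages differ a little from the quoted ones, which presumably came from experiments that included draws.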
Last modified: Thu, 15 Apr 21 08:11:13 -0700