Author: Dann Corbit
Date: 17:16:39 12/27/01
On December 27, 2001 at 19:56:59, Mark Young wrote:

>Dann don't go nuts, many of us understand the stats. We are also smart enough to
>know that the range given could be greater and smaller then what is shown. It
>depends on what degree of confidence you are looking for in the stats.
>
>It could also be Crafty 18.12 is the strongest program on the ssdf list or many
>other programs with lower ratings. Thats the thing with stats, you can never be
>100% sure with 100% confidence. You can only go with what is most likely.

Crafty 18.12 is well outside of the specified range for the error bars. So (for instance) under the conditions of the test, if the hypothesis were:
"Crafty 18.12 run on Athlon 1200 under Autoplayer is as strong as Chess Tiger 14.0 CB on Athlon 1200"
then the hypothesis would be rejected.

In fact, why don't we look at the numbers:

 1 Chess Tiger 14.0 CB 256MB Athlon 1200     2715  38  -36  378  66%  2600
[snip]
14 Crafty 18.12/CB 256MB Athlon 1200 MHz     2601  44  -43  261  53%  2577

2601 + 44 = 2645 (the upper range of Crafty's strength under the conditions of the experiment)
2715 - 36 = 2679 (the lower range of CT 14's strength).

Hence, we can say with confidence that within the precision of the error bars, Crafty is not as strong as CT. Let's suppose that it is only one standard deviation. Then that means we have a probability of 2/3 or better that CT is stronger.

With the figures for CT versus DF, the difference in strength is statistically totally insignificant. We cannot say (based on the available data) that CT is stronger than DF. It might be (or vice-versa) but it has not been shown. This is the most difficult case -- two sets of measurements that are almost identical. It would take far more data than anyone is willing to generate to settle this issue.

You are (apparently) still insisting that the SSDF results show Chess Tiger to be stronger, which clearly shows that you do not understand what the table means.
The table definitely does *not* show that CT is stronger than DF. To state otherwise is a clear demonstration that you do not know what the numbers mean. Of course CT *might* be stronger. It's just that the numbers do not show that. What they show is parity. Your interpretation of their meaning is just plain wrong.
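
For anyone who wants to repeat the error-bar check on other rows of the list, here is a minimal sketch of the arithmetic used for Crafty and Chess Tiger. The function names are made up for illustration, and the normal-distribution figure is an assumption about how the "one standard deviation" case would be read, not something the SSDF list states. Only the two rows quoted in this post are used; the DF row is not quoted here, so it is not included.

```python
import math

def rating_interval(rating, plus, minus):
    """Return (low, high) from a rating row; `minus` is negative, as in the list."""
    return rating + minus, rating + plus

def intervals_overlap(a, b):
    """True if the two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# The two rows quoted in the post.
tiger = rating_interval(2715, 38, -36)
crafty = rating_interval(2601, 44, -43)

# Distance between Tiger's lower bound and Crafty's upper bound.
gap = tiger[0] - crafty[1]

# Assumption: errors are roughly normal. Then the fraction of outcomes
# within one standard deviation of the mean is erf(1/sqrt(2)), about 2/3.
one_sigma_coverage = math.erf(1 / math.sqrt(2))

print("Chess Tiger range:", tiger)
print("Crafty range:     ", crafty)
print("Ranges overlap:   ", intervals_overlap(tiger, crafty))
print("Gap between bars: ", gap)
print("1-sigma coverage: ", round(one_sigma_coverage, 3))
```

If the two ranges did overlap, as is claimed above for CT and DF, the same check would return True, and the table alone could not separate the two programs.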
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.