Author: Rolf Tueschen
Date: 15:12:19 01/23/04
On January 23, 2004 at 16:14:57, Kolss wrote:

>On January 23, 2004 at 14:43:10, Bruce Cleaver wrote:
>
>>The real weakness here is accepting the 95% certainty claim. It's traditional,
>>and makes some kind of sense, but when you get down to it there is also an
>>element of arbitrariness. If you are satisfied with a 90% certainty, or perhaps
>>+/- 1.5 sigma, or whatever, the scores that will impress you (out of 300 games)
>>change.
>>
>>There are also times when 95% isn't good enough, and the claim has to be much
>>tighter.
>
>I completely agree.
>
>95% is arbitrary; take 68% or 82.73% if that makes you happy. Only you should
>know what confidence interval you employ and what it tells you. Keep in mind
>that you can always only say something like "A is better than B" with a certain
>(error) probability / confidence - it is entirely up to you what probability is
>good enough for you (nothing magical about the 95%).
>
>Just to repeat the previous case (S8 vs. S7.04 162.5-137.5, +90 -65 =145): the
>probability that S8 is the better of the two programs, based on this single
>match and therefore only valid for a direct match between the two in this
>particular setup etc., is about 97.5% (it will in fact even be slightly higher,
>maybe 98%). If you say you want 99% certainty before you believe it, that is
>fine - you may have to run a 1000-game match then after all. If you "believe" in
>the 95% and want to show that one program is better than the other, then you can
>stop. If you want 100% certainty, you will have to play an infinite number of
>games.
>
>Best regards - Munjong.

Munjong,

you can repeat it as much as you want but it remains wrong! With your data you can NOT say that Shredder8 is now better with 97,5% or whatever "probability". That is nonsense. Period.

Rolf
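
For reference, the "about 97.5%, maybe 98%" figure in the quoted text can be reproduced with a rough calculation. The sketch below is only one plausible reading of how such a number is obtained: it assumes a one-sided normal approximation on the per-game score (win = 1, draw = 0.5, loss = 0), and the function name `match_confidence` is made up for illustration. It shows where the number comes from; whether that number may be read as "the probability that S8 is better" is exactly what is disputed in the reply above.

```python
import math

def match_confidence(wins, losses, draws):
    """Return (score fraction, z, one-sided confidence) for a match result.

    Assumption: games are independent and the mean score is roughly
    normally distributed; the null hypothesis is an expected score of 0.5.
    """
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n            # average score per game
    second_moment = (wins + 0.25 * draws) / n  # average of squared per-game scores
    variance = second_moment - mean ** 2       # per-game variance
    std_error = math.sqrt(variance / n)        # standard error of the mean score
    z = (mean - 0.5) / std_error               # distance from an even score in sigmas
    confidence = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # normal CDF at z
    return mean, z, confidence

# The match quoted above: S8 vs. S7.04, +90 -65 =145 (162.5-137.5 over 300 games)
mean, z, confidence = match_confidence(90, 65, 145)
print(f"score = {mean:.4f}, z = {z:.2f}, one-sided confidence = {confidence:.3f}")
# z comes out at about 2.02 and the confidence at about 0.978
```

Strictly, the value computed here is one minus a one-sided p-value under the assumption that both programs are equally strong, which is a statement about the data given that assumption rather than about the programs themselves.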
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.