Author: Uri Blass
Date: 08:11:08 09/28/05
On September 28, 2005 at 10:20:49, George Tsavdaris wrote:

>On September 28, 2005 at 10:01:18, Uri Blass wrote:
>
>>On September 28, 2005 at 07:04:44, Heinz van Kempen wrote:
>>
>>>On September 28, 2005 at 06:54:09, Uri Blass wrote:
>>>
>>>>On September 28, 2005 at 06:05:25, Heinz van Kempen wrote:
>>>>
>>>>>Hi all ,
>>>>>
>>>>>the need for many games is again shown in CEGT. After a rocket-like start by
>>>>>Fritz 9 a catastrophical first series in the match by Michael against Toga was
>>>>>already sufficient to let it drop like a stone from highest level to below Fruit
>>>>>currently. So it happened like we expected.
>>>>
>>>>Dropping like a stone?
>>>>
>>>>I think that a stone should be able to drop faster than that
>>>>
>>>>Fritz9 is still second place and first place in 4/40 list when more games were
>>>>played by Fritz(478 games and not 288 games).
>>>>
>>>>http://www.husvankempen.de/nunn/eloblitz.html
>>>>
>>>>
>>>>Here are the numbers:
>>>>40/4
>>>>1 Fritz 9          2796 30 29  478 71.8 % 2634 21.8 %
>>>>2 Fruit WCCC'05    2786 24 24  758 73.9 % 2606 22.7 %
>>>>
>>>>40/40
>>>>1 Fruit WCCC'05    2778 12 12 2219 68.6 % 2642 32.6 %
>>>>2 Fritz 9          2769 35 35  288 61.5 % 2688 27.8 %
>>>>
>>>>Nothing significant was changed and the error is still too high to decide which
>>>>version is better at 40/4 or 40/40
>>>>
>>>>Uri
>>>
>>>Hi Uri,
>>>
>>>the rating lists will be updated again in the evening with around 700 games for
>>>Fritz 9 in Blitz and more results 40/40 from the Leagues and World Trophy.
>>>
>>>We are again seeing that only 250 games are just a joke and people should stop
>>>to draw conclusions from this.
>>>
>>>Best Regards
>>>Heinz
>>>
>>>http://www.husvankempen.de/nunn/
>>
>
>Some additions assuming that the ratings-list has been created with 0.95
>certaincy:
>
>>Maybe I am stupid when I think that I can draw conclusions but here is my
>>conclusion from results of 288 games of Fritz9
>>
>>1)Fritz9's rating is at least 2734
>1)With 95% probability Fritz9's rating is at least 2734
>
>>2)Fritz8 Bilbao's rating is at most 2724
>2)With 95% probability Fritz8 Bilbao's rating is at most 2724
>
>>
>>Conclusion
>>Fritz9 is better than Fritz8 Bilbao.
>There are 95% chances that Fritz9 is better than Fritz8 Bilbao.
>But still 5% that Fritz8 Bilbao is better than Fritz9..........

Not exactly.

For short I say that A is better than B, and in practice we are almost sure about it, but there are problems with stating it as a probability.

Problems:

1)The model that is used to give all the errors is wrong because it does not consider the choice of the opponents.
For example, it is possible that A beats B 600-400, B beats C 600-400 and C beats A 600-400. Then you can show that any one of them is the strongest, with whatever probability you want, by choosing the right opponents and playing enough games.

2)Even if the model is right, the number 95% is wrong, because the error of the difference at 95% certainty is smaller than the sum of the two errors.
If A~N(0,1) and B~N(0,1) are independent, then the standard deviation of A-B is the square root of 2 and not 2.

3)The model does not give the probability that A is better than B.
Whether A is better than B is a fact, so it has probability 1 or 0, in spite of the fact that we do not know for sure which of the two it is.
The model only gives the probability that we are wrong in our prediction.

The method that I used to decide that Fritz9 is better than Fritz8 may give wrong results in 5% of the cases, but the probability that Fritz9 is better than Fritz8 is 1 or 0 and not something else.

Note also that we cannot even say that we believe that A is better than B with 95% confidence without knowing the prior opinion that we had before testing.

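Point 2 can be checked with a few lines of Python. This is only a sketch: it assumes the two rating estimates are independent and normally distributed, which is not exactly true for engines in the same rating list. The margin of 35 is the one quoted for Fritz 9 in the 40/40 list; the second margin is a made-up value for illustration, because the margin of Fritz8 Bilbao is not given in this thread.

```python
import math

def margin_of_difference(margin_a, margin_b):
    """95% margin of A-B, given the 95% margins of two independent,
    normally distributed estimates A and B (margins add in quadrature)."""
    return math.hypot(margin_a, margin_b)

# Standard deviation of A-B when A~N(0,1) and B~N(0,1) are independent.
sd_diff = math.sqrt(1.0**2 + 1.0**2)

fritz9_margin = 35.0   # from the 40/40 list quoted above
other_margin = 35.0    # hypothetical value, not taken from the list

combined = margin_of_difference(fritz9_margin, other_margin)
naive = fritz9_margin + other_margin   # simply adding the two errors

print(f"sd of A-B: {sd_diff:.3f} (not 2)")
print(f"margin of the difference: {combined:.1f} (sum of errors: {naive:.1f})")
```

Requiring the two intervals not to overlap is therefore a stricter test than 95%.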
There are cases when we have a strong prior opinion against something and we are right to have it.

Suppose that you get 100,000 coins and you know that one of them is not fair (probabilities of 60% and 40%) and the rest of them are fair.

If you want to test the coins and find the one that is not fair, you need a confidence of clearly more than 95%. A coin that fails the test at 95% confidence has probably failed because of bad luck: about 1 out of 20 fair coins will look unfair, and a coin that looks unfair will probably not be the right coin.

Uri
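The coin example can be written out with Bayes' rule. The sketch below makes one assumption that is not in the post: the test always flags the unfair coin (power = 1.0), which is the best case for the tester. The function name and the stricter threshold of 1e-7 are only for illustration.

```python
def prob_flagged_coin_is_unfair(n_coins, false_positive_rate, power=1.0):
    """Probability that a coin flagged by the test is the single unfair one.

    n_coins: total number of coins, exactly one of them unfair.
    false_positive_rate: chance that a fair coin fails the test
        (0.05 for a 95% confidence test).
    power: chance that the unfair coin fails the test (assumed 1.0 here).
    """
    prior_unfair = 1.0 / n_coins
    flagged_unfair = prior_unfair * power
    flagged_fair = (1.0 - prior_unfair) * false_positive_rate
    return flagged_unfair / (flagged_unfair + flagged_fair)

# 95% confidence: roughly 5,000 fair coins fail the test by bad luck.
p95 = prob_flagged_coin_is_unfair(100_000, 0.05)

# A much stricter test (hypothetical threshold) is needed before a
# flagged coin is probably the right one.
p_strict = prob_flagged_coin_is_unfair(100_000, 1e-7)

print(f"95% test: {p95:.5f}")
print(f"strict test: {p_strict:.3f}")
```

With a 95% test the flagged coin is the right one in only about 1 case out of 5,000, even under this best-case assumption.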