Author: Uri Blass
Date: 13:17:09 10/20/05
On October 20, 2005 at 10:05:09, Heinz van Kempen wrote:

>Hi David,
>
>yes, some people like you indeed understand weird statistics and also that the
>output from EloStat for versions with few games is not the best, but the
>majority does not.
>
>An engine starting like the new superstar and then dropping quickly like a stone
>just afterwards, this just gives the impression to most that testers either told
>lies or did something wrong.
>
>Best Regards
>Heinz

Some comments:

1) I never claimed that testers told lies.

2) Problems happen with hardware, and I did not claim that only the CEGT may have problems that are not the result of statistical error.

3) If an engine starts like a superstar and then drops like a stone, or the opposite, it increases the probability that something is wrong with the results. It does not mean that something is wrong in the last games; it is also possible that something was wrong in the first games.

The question is only what result justifies checking whether there is a problem.

Let us imagine some extreme cases that never happened:

a) If an engine scores 100% in the first 50 games against an average rating of 2650 and 0% in the next 50 games against an average rating of 2650, then you can be practically sure that there is a problem in testing and the result is wrong.

b) If an engine scores 99% in the first 50 games against an average rating of 2650 and 1% in the next 50 games against an average rating of 2650, then you can also be practically sure that there is a problem and the result is wrong.

What happened was of course less extreme than a) and b), but please understand that it was enough to increase the suspicion that something is wrong. The question is how much suspicion justifies checking the games to find out whether something is wrong.

I suspect that something may be wrong not only in other people's testing but also in my own testing, at the time when I ran some tests with Movei versions.
I remember that in the past I got some strange results when testing 2 versions of Movei at a very fast time control. I suspected that something was wrong, and I was right. It turned out that, by accident, the two versions did not use the same hash size, and clearing the hash tables between moves caused the version with more hash to lose most of the games.

Suspecting that something is wrong is normal behaviour and is not a personal attack. I can only be sorry that people see it as personal against them.

Uri
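
A rough way to put numbers on the extreme cases a) and b) above is a two-proportion z-test on the scores of the two halves. This is only a sketch under stated assumptions, not something that EloStat or the CEGT is known to do: the games are treated as independent, the opposition is taken to be equally strong in both halves (2650 on average, as in the examples), and every game is treated as a win or a loss. Draws only lower the real variance, so this simplification tends to understate how unusual a gap is. The function names `half_split_z` and `two_sided_p` are made up for illustration.

```python
import math


def half_split_z(score_first, games_first, score_second, games_second):
    """z statistic for the gap between two score fractions (0.0 to 1.0).

    Assumption: each game is an independent win/loss trial with the same
    underlying success rate in both halves (the null hypothesis).
    """
    total_games = games_first + games_second
    pooled = (score_first * games_first + score_second * games_second) / total_games
    variance = pooled * (1.0 - pooled) * (1.0 / games_first + 1.0 / games_second)
    if variance == 0.0:
        # Both halves scored 0% or both scored 100%: there is no gap to test.
        return 0.0
    return (score_first - score_second) / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided tail probability of a standard normal at |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Cases a) and b): 50 games in each half.
for label, first, second in [("a", 1.00, 0.00), ("b", 0.99, 0.01)]:
    z = half_split_z(first, 50, second, 50)
    print(f"case {label}: z = {z:.1f}, p = {two_sided_p(z):.1e}")
```

Both cases give a z value near 10, far beyond anything that chance would produce. A real result would give a smaller value, and the threshold at which it becomes worth checking the games is a judgment call. The test also says nothing about which half is the wrong one.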
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.