Author: Uri Blass
Date: 06:54:33 06/03/02
On June 03, 2002 at 08:54:50, Rolf Tueschen wrote:

>On June 03, 2002 at 01:42:07, Uri Blass wrote:
>
>>On June 02, 2002 at 22:06:58, Vincent Diepeveen wrote:
>>
>>>On June 01, 2002 at 13:14:58, Andrew Dados wrote:
>>>
>>>Hello,
>>>
>>>of course there are problems with the rating truths.
>>>
>>>For example here it says someone 1000 points lower
>>>rated has 0.003 chance against me.
>>>
>>>Or 3 out of 1000 games.
>>>
>>>However OTB, it is not 3 out of 1000 games. Not even 1 out of 1000.
>>>It is 0 out of 1000 and nothing else.
>>
>>I do not know about a lot of cases when the difference was 1000 points so I
>>doubt if we have enough statistic but I agree that the expected result is
>>probably less than 1 out of 1000.
>
>Uri, why are you so fixed on this? As Vincent told you, and he comes from both
>directions, human chess and computer chess, the assumption of a normal
>distribution in "real" strength as far as computer chess is concerned - is
>simply false.

The assumptions are also false for human chess, but we use them to calculate ratings for people. The fact that we use wrong assumptions means that there are some errors in the results, but it does not mean that the errors are big. The error exists both for the SSDF list and for every other rating list.

>It is a fata morgana. Pure nonsense. And this is the main reason
>why all the pseudo-stats of SSDF is the same. Your expectancy is for human
>beings but not for machines. The reason is the determinism in machines!

I do not know of an investigation into the difference between the expected results of humans and those of machines, but it is clear that the expected result is not only a function of the rating difference, and that humans and machines have a lot in common.

The following may be true both for humans and for machines: it is possible that the expected result of A against B is more than 50%, and the same for B against C and for C against A, respectively.
In that case the ratings may depend on the number of games that are played between every two of the three.

There are other things that are common to humans and machines. There are humans who have more draws and humans who have fewer draws; the same is true of machines, and Junior7 is known to have fewer draws than other programs.

Machines can be unstable. They can play well and beat stronger programs if they are lucky enough to reach positions that their evaluation understands better than the opponent's does, and they can also lose against weaker players if they are unlucky enough to reach positions that the opponent evaluates better. The same is true for humans.

>(Just to make a few conclusins for you, but we don't need them, because with the
>absence of normal distribution it's already all over:
>
>The main error margins they have in SSDF result from chess coincidences but not
>from differences in strength. The strength is predefined and only the learning
>could be better or worsen the performance. The actual versions on better
>hardware will always be "better", but this isn't a test result, it was known
>before. That is why we needed thousands of games between programs not just 40 to
>get rid of the big margin. Then you are at Vincent's certainties. It's either
>50%, 100% or 0% each for the three possible results in chess. The expectancy
>itself is always 1 for one of the three priorily known and defined results,
>either equal or weaker or better - period. So, over thousands and millions of
>games, the few new versions of a season will be equal, what is already clear for
>someone who knows about stats when he's looking at the SSDF ranking list. The
>new progs are always better than the older ones, no matter if with a result of
>75% or 87%.

Gandalf5.1 and Gandalf5 were not better than Gandalf4.32h. The same is true of Nimzo8 relative to Nimzo7.32.

Uri
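P.S. For reference, the 0.003 figure Vincent quotes is exactly what the standard Elo expectancy formula gives for a 1000-point rating gap, and the same formula also shows why a cycle A over B, B over C, C over A can never be reproduced by any single rating list. A sketch in Python (the specific ratings and the 60% cycle are hypothetical numbers of my own, chosen only for illustration):

```python
import itertools

def elo_expected(r_a: float, r_b: float) -> float:
    """Standard Elo expectancy: predicted score for the player rated r_a
    against an opponent rated r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

# A 1000-point underdog is expected to score about 0.003 per game,
# i.e. roughly 3 points out of 1000 games.
print(round(elo_expected(1600, 2600), 4))  # 0.0032

# Hypothetical non-transitive trio: A beats B 60% of the time,
# B beats C 60%, and C beats A 60% (draws ignored for simplicity).
# The Elo model predicts an expectancy above 50% only for the
# higher-rated side, so its predictions are always transitive:
# no assignment of ratings can reproduce such a cycle.
for r_a, r_b, r_c in itertools.product(range(2000, 2401, 100), repeat=3):
    reproduces_cycle = (elo_expected(r_a, r_b) > 0.5 and
                        elo_expected(r_b, r_c) > 0.5 and
                        elo_expected(r_c, r_a) > 0.5)
    assert not reproduces_cycle
```

Whatever ratings a list assigns, its predicted expectancies form a strict ordering of the players, so a cyclic trio like the one above can only show up as measurement error, never as something the model itself expresses.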