Author: Martin Schubert
Date: 03:34:06 05/26/02
On May 26, 2002 at 05:09:33, Torstein Hall wrote:

>On May 26, 2002 at 04:27:46, Martin Schubert wrote:
>
>>On May 25, 2002 at 23:17:05, Dann Corbit wrote:
>>
>>>On May 25, 2002 at 22:01:24, Rolf Tueschen wrote:
>>>[snip]
>>>
>>>>I do not say this. What I mean is, that they could even invest the same time in
>>>>a better testing. With no big changes.
>>>[snip]
>>>
>>>>Why not change a little bit of SSDF itself?
>>>
>>>What (exactly) are the changes you would have them make so that the list would
>>>be better?
>>
>>I don't understand why matches last sometimes 40 games, sometimes 43. Why not
>>say: a match lasts exactly 40 games. A small change without any effort.
>
>I guess they have a standard for 40 games, but if you let it run on autoplayer
>during the night and it reaches 43 games I think it reasonable to keep the extra
>3 games. It just gives more information.
>
>>Another point: if you took a look at the list where Shredder was leading you
>>could see that the leading programs had played their games against totally
>>different opponents. So you can't compare the ratings at all.
>
>If you can not do that then I think you can forget about rating. I'm playing
>different players based on rating and of course often we have not played the
>same persons. That is one of the reasons we have rating!
>
>>My suggestion: the top programms should play the same opponents to make it
>>possible to compare their results.
>>If I remember right it happens quite often that a program is very strong in the
>>first rating list it appears in (where it plays against weak opponents). In the
>>next rating list where it has to fight the tough ones it falls back in the
>>rating list.
>
>That is what the error margins are for. I think the rating normally stays within
>this limits. So for a given program that has got a SSDF rating of say 2600 +/-
>43 You can say with 95% (if I remember right) confidence that the program has a
>rating within the range 2557 - 2643
>

That's not what the error margins are for. There is a systematic error in the
assumptions being made, namely the assumption that a rating exists independently
of the opponents it was earned against.

By the way, 95% confidence doesn't mean what you're saying. I tried to explain
the real meaning of a confidence interval in my answer to Uri Blass.

>Torstein
>>

Regards, Martin
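
A minimal sketch of the usual frequentist reading of a 95% interval, under a deliberately simplified model: every game is a win or a loss with a fixed probability (no draws), and the interval is the plain normal approximation. The names `TRUE_SCORE`, `GAMES`, `interval_covers` and `coverage` are made up for illustration and are not anything SSDF uses. The point it shows is that the 95% describes how often intervals built this way cover the true value over many repeated matches; it is not the probability that one particular published range contains the true rating. It also says nothing about the systematic error, since the simulation assumes a single fixed opponent strength.

```python
import math
import random

TRUE_SCORE = 0.60   # assumed true expected score per game (hypothetical value)
GAMES = 400         # assumed number of games per simulated match (hypothetical)
Z95 = 1.96          # normal quantile for a two-sided 95% interval


def interval_covers(rng):
    """Simulate one match and report whether its 95% interval covers TRUE_SCORE."""
    wins = sum(rng.random() < TRUE_SCORE for _ in range(GAMES))
    p_hat = wins / GAMES
    # Normal-approximation half width of the interval around the observed score.
    half_width = Z95 * math.sqrt(p_hat * (1 - p_hat) / GAMES)
    return p_hat - half_width <= TRUE_SCORE <= p_hat + half_width


def coverage(trials=2000, seed=1):
    """Fraction of simulated matches whose interval covers the true score."""
    rng = random.Random(seed)
    hits = sum(interval_covers(rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(f"fraction of intervals covering the true score: {coverage():.3f}")
```

Run over many simulated matches, the fraction should come out close to 0.95, while any single interval either contains the true score or does not.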
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.