Author: Torstein Hall
Date: 07:47:52 05/26/02
On May 26, 2002 at 06:34:06, Martin Schubert wrote:

>On May 26, 2002 at 05:09:33, Torstein Hall wrote:
>
>>On May 26, 2002 at 04:27:46, Martin Schubert wrote:
>>
>>>On May 25, 2002 at 23:17:05, Dann Corbit wrote:
>>>
>>>>On May 25, 2002 at 22:01:24, Rolf Tueschen wrote:
>>>>[snip]
>>>>
>>>>>I do not say this. What I mean is, that they could even invest the same time in
>>>>>a better testing. With no big changes.
>>>>[snip]
>>>>
>>>>>Why not change a little bit of SSDF itself?
>>>>
>>>>What (exactly) are the changes you would have them make so that the list would
>>>>be better?
>>>
>>>I don't understand why matches last sometimes 40 games, sometimes 43. Why not
>>>say: a match lasts exactly 40 games. A small change without any effort.
>>
>>I guess they have a standard for 40 games, but if you let it run on autoplayer
>>during the night and it reaches 43 games I think it reasonable to keep the extra
>>3 games. It just gives more information.
>>
>>>Another point: if you took a look at the list where Shredder was leading you
>>>could see that the leading programs had played their games against totally
>>>different opponents. So you can't compare the ratings at all.
>>
>>If you can not do that then I think you can forget about rating. I'm playing
>>different players based on rating and of course often we have not played the
>>same persons. That is one of the reasons we have rating!
>>
>>>My suggestion: the top programms should play the same opponents to make it
>>>possible to compare their results.
>>>If I remember right it happens quite often that a program is very strong in the
>>>first rating list it appears in (where it plays against weak opponents). In the
>>>next rating list where it has to fight the tough ones it falls back in the
>>>rating list.
>>
>>That is what the error margins are for. I think the rating normally stays within
>>this limits. So for a given program that has got a SSDF rating of say 2600 +/-
>>43 You can say with 95% (if I remember right) confidence that the program has a
>>rating within the range 2557 - 2643
>>
>That's not what the error margins are for. There is an systematic error in the
>assumptions which are made. The assumption of an existence of an independent
>rating (independent from the opponents).
>By the way: 95% confidence doesn't mean what you're saying. I tried to explain
>the real meaning of confidence intervall in my answer to Uri Blass.

I did read your statement to Uri and am quite sure you are wrong. :-)

Torstein

PS: And you are silent on the other statement you made about rating. Shall I
take it that you agree with the rest of my arguments?

>
>>Torstein
>>>
>
>Regards, Martin
Last modified: Thu, 15 Apr 21 08:11:13 -0700