Author: Tina Long
Date: 18:22:25 05/26/02
On May 26, 2002 at 17:30:59, Rolf Tueschen wrote:

>On May 26, 2002 at 08:47:38, Tina Long wrote:
>
>>On May 26, 2002 at 08:13:08, Rolf Tueschen wrote:
>>
>>>I would not support this. Many aspects are flawed. What is large enough?
>>
>>At least 12 opponents at 40 games/match to give a +-40ish deviation is large
>>enough to provide the information I derive from the SSDF list.
>>
>>>You
>>>won't think that 40 is large enough?!
>>
>>40,000 is better, but 40 per match will do, as that is 1000 times quicker.
>>
>
>I have some strange findings out of the recent SSDF list. I quote:
>
>11 Gandalf 5.1 256MB Athlon 1200 MHz, 2646
>GT2.0 A1200 13.5-26.5 DpFritz A1200 13.5-21.5 Shredd6 A1200 1.5-5.5
>Shre532 A1200 15-23 DpFritz K6450 22-22 CT14 CB K6450 19-14
>Craf18. A1200 22-18 Junior6 K6450 30-13 Shred5 K6-450 52-28
>Frit532 K6450 27-17 Junior5 K6450 31.5-12.5 Hiar732 K6450 29-19
>SOS K6-2 450 3.5-1.5 Goliath K6450 32-22 Nimzo99 K6450 29.5-10.5
>
>Tina, would you still be pleased with such 4 (four!) or 6 (six!) "matches" in
>the SSDF?

Um, that's 5 (five!) and 7 (seven!). (now lecture me on statistics)

Yes, I'm pleased those results are included. Those matches will be finished by the next list. The effect of the 5 games played so far in Gan-SOS on their total ratings will be small.

>What is the reason for such strange matches? Do you still feel that
>you should be thankful that SSDF gives you the results

Yes, I see no reason not to be.

> and how would you make
>your own estimation on the basis of such short matches?

Individual match results mean little, and of course the 5-game Gan-SOS match has only just started. But that doesn't mean they should be left out of the calculations. The accumulation of ALL the games against MANY opponents gives a rating. It is infeasible to test only against similar strength opponents, as the SAMPLE SIZE of opponents is too small.
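As a rough check on the figures in this exchange, the sketch below does two things with the Gandalf 5.1 results quoted above: it estimates a 95% margin of error for a given number of games, and it compares Gandalf's pooled score with and without the 5 games against SOS. It is not the SSDF's rating method. The function names are hypothetical, the per-game standard deviation of 0.5 is a worst-case assumption (no draws), and pooling all opponents into one score ignores their individual ratings, which the quoted list does not give.

```python
import math

# Gandalf 5.1 results as quoted above: (opponent, Gandalf's points, opponent's points).
RESULTS = [
    ("GT2.0 A1200", 13.5, 26.5), ("DpFritz A1200", 13.5, 21.5),
    ("Shredd6 A1200", 1.5, 5.5), ("Shre532 A1200", 15.0, 23.0),
    ("DpFritz K6450", 22.0, 22.0), ("CT14 CB K6450", 19.0, 14.0),
    ("Craf18. A1200", 22.0, 18.0), ("Junior6 K6450", 30.0, 13.0),
    ("Shred5 K6-450", 52.0, 28.0), ("Frit532 K6450", 27.0, 17.0),
    ("Junior5 K6450", 31.5, 12.5), ("Hiar732 K6450", 29.0, 19.0),
    ("SOS K6-2 450", 3.5, 1.5), ("Goliath K6450", 32.0, 22.0),
    ("Nimzo99 K6450", 29.5, 10.5),
]


def elo_diff(score_fraction):
    """Elo difference implied by a score fraction (standard logistic Elo curve)."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


def elo_margin(games, score_fraction=0.5, per_game_sd=0.5, z=1.96):
    """Approximate 95% margin in Elo via a normal approximation of the score.

    per_game_sd=0.5 is an assumed worst case (no draws). The approximation
    breaks down for very small samples, where the interval leaves (0, 1).
    """
    se = per_game_sd / math.sqrt(games)
    upper = elo_diff(score_fraction + z * se)
    lower = elo_diff(score_fraction - z * se)
    return (upper - lower) / 2.0


def pooled_totals(results):
    """Total points scored by Gandalf and total games played."""
    points = sum(own for _, own, _ in results)
    games = sum(own + opp for _, own, opp in results)
    return points, games


def pooled_elo(results):
    """Elo difference versus the pooled opposition (ignores opponent ratings)."""
    points, games = pooled_totals(results)
    return elo_diff(points / games)


if __name__ == "__main__":
    without_sos = [r for r in RESULTS if not r[0].startswith("SOS")]
    print("totals (points, games):", pooled_totals(RESULTS))
    print("margin for one 40-game match:", round(elo_margin(40), 1))
    print("margin for 12 x 40 = 480 games:", round(elo_margin(480), 1))
    print("pooled Elo shift from the 5 SOS games:",
          round(pooled_elo(RESULTS) - pooled_elo(without_sos), 2))
```

Under these assumptions a single 40-game match carries a much wider margin than the accumulated total, and dropping or keeping the 5 SOS games moves the pooled figure only slightly. A real rating calculation would weight each match by the opponent's rating, so the numbers here only indicate the order of magnitude.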
>
>Please note, that this here is just what I found by chance in Thoralf Karlsson's
>own posting someone later quoted into this thread.
>
>Someone here asked if I wanted to imply cheating and I answered "No!", but

"No!", but

From here on you are getting very biased and emotional, Rolf, and I know better than to argue against you in that state.

Tina Long

> could
>you explain why Gandalf had 54 games against Goliath? BTW Goliath on weaker
>hardware! Oops, Gandalf had 80 games against Shredder 5, also on weaker
>hardware. In short: Do you agree that _not_ the later 5% bogus is so important
>but much more such deliberate differences, say the quantity of the games in a
>match and the different hardware?
>
>I would still reject the possibility of cheating but I know for sure, if _I_
>wanted to cheat, it would be easy to succeed if I were allowed to play matches
>between 5 (!) and 80 games, I can guarantee you this for sure. No matter the
>size of the margin of error...
>
>This practice is happening in SSDF since at least 1996 when I asked the same
>questions and Peter answered me the following, I recall by heart:
>
>...such differences are completely uninteresting, simply because we have many
>games with a program, so that such differences have no influence...
>
>At the time I criticized such a practice and opposed the logic too that because
>of some hundred of games overall such little extremes had no meaning. Exactly
>here, I can say now, we have the basic fallacy in the whole SSDF practice of
>testing. What they deal with are mere numbers, no matter how they got them.
>Whether on different hardware, different quantity of games and many more
>uncontrolled and statistically unallowed behaviour.
>
>And it should be clear to the reader that the SSDF should change the wrong
>tradition. Numbers are mathematically the same, but already in stats there are
>numbers with a better status and a worse. In the end, you'd never know how your
>Elo for a specific program was summed up. With blanks or good data.
>
>Rolf Tueschen