Computer Chess Club Archives


Subject: Re: Comments of latest SSDF list 2

Author: Bertil Eklund

Date: 14:53:43 05/26/02


On May 26, 2002 at 17:30:59, Rolf Tueschen wrote:

>On May 26, 2002 at 08:47:38, Tina Long wrote:
>
>>On May 26, 2002 at 08:13:08, Rolf Tueschen wrote:
>>
>>>I would not support this. Many aspects are flawed. What is large enough?
>>
>>At least 12 opponents at 40 games/match to give a +-40ish deviation is large
>>enough to provide the information I derive from the SSDF list.
>>
>>>You
>>>won't think that 40 is large enough?!
>>
>>40,000 is better, but 40 per match will do, as that is 1000 times quicker.
>>
>
>I have some strange findings out of the recent SSDF list. I quote:
>
>11 Gandalf 5.1  256MB Athlon 1200 MHz, 2646
>GT2.0 A1200     13.5-26.5  DpFritz A1200   13.5-21.5  Shredd6 A1200    1.5-5.5
>Shre532 A1200     15-23    DpFritz K6450     22-22    CT14 CB K6450     19-14
>Craf18. A1200     22-18    Junior6 K6450     30-13    Shred5 K6-450     52-28
>Frit532 K6450     27-17    Junior5 K6450   31.5-12.5  Hiar732 K6450     29-19
>SOS  K6-2 450    3.5-1.5   Goliath K6450     32-22    Nimzo99 K6450   29.5-10.5
>
>Tina, would you still be pleased with such 4 (four!) or 6 (six!) "matches" in
>the SSDF? What is the reason for such strange matches? Do you still feel that
>you should be thankful that SSDF gives you the results and how would you make
>your own estimation on the basis of such short matches?

Never heard of a deadline?! You can see the results from the matches in the next
list. In the case of S5, the reason could be that two testers played the same
opposition by mistake. The "odd" match lengths, for example against H7.32, are
caused by autoplayer trouble: for example, you sum up the games and find 23
white and 19 black games, and then the match is balanced with, e.g., 4 more
black games.

I think I have heard some nuts here (i.e. someone named Rolf or Martin) saying
that new programs should only play against programs on the same new hardware,
with the result that we would have to wait until 4-5 new programs show up
before the next list. Of course, you never play against players 150 Elo stronger
or weaker than yourself either.

>Please note that this is just what I found by chance in Thoralf Karlsson's
>own posting, which someone later quoted in this thread.
>
>Someone here asked if I wanted to imply cheating and I answered "No!", but could
>you explain why Gandalf had 54 games against Goliath? BTW Goliath on weaker
>hardware! Oops, Gandalf had 80 games against Shredder 5, also on weaker
>hardware. In short: Do you agree that _not_ the later 5% bogus is so important
>but much more such deliberate differences, say the quantity of the games in a
>match and the different hardware?

Everyone with a brain or something like that could understand the reasons for
the above, except you and a handful of others. Everyone over seven years old
understands that you can and must play against stronger and weaker players too.
Not always against yourself.

Maybe you should talk to your hidden source in the SSDF so he could give you
some other interesting information about all the cheating that's going on.
>
>I would still reject the possibility of cheating but I know for sure, if _I_
>wanted to cheat, it would be easy to succeed if I were allowed to play matches
>between 5 (!) and 80 games, I can guarantee you this for sure. No matter the
>size of the margin of error...
>
>This practice has been going on in the SSDF since at least 1996, when I asked
>the same questions and Peter answered me the following, I quote from memory:
>
>...such differences are completely uninteresting, simply because we have many
>games with a program, so that such differences have no influence...

He didn't say "no influence" but "very small influence". Of course you could do
like other people and play a dozen games on the same hardware against a few
other programs and decide which program is the best.
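
For a rough sense of the scale involved, here is a back-of-envelope sketch. It
is not the SSDF's actual rating calculation: it assumes independent games,
treats every game as a win or a loss (draws would shrink the margin), assumes
roughly equal opponents, and uses the standard logistic Elo curve. The function
name elo_margin and the 95% z-value are illustrative choices. The game counts
are the ones mentioned in this thread: a dozen games, one 40-game match, and
12 opponents at 40 games each.

```python
import math

def elo_margin(games, score=0.5, z=1.96):
    """Approximate +/- Elo margin for a result over `games` games.

    Assumption: independent win/loss games (no draws), which makes the
    margin somewhat conservative for real chess results.
    """
    # Standard error of the mean score over `games` binomial trials.
    se_score = math.sqrt(score * (1.0 - score) / games)
    # Slope of the logistic Elo curve D(p) = -400*log10(1/p - 1) at p = score.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    # First-order (delta method) conversion from score error to Elo error.
    return z * se_score * slope

for n in (12, 40, 12 * 40):
    print(n, "games: +/-", round(elo_margin(n)), "Elo")
```

Under these assumptions the margin falls with the square root of the number of
games: roughly +-200 Elo for a dozen games against roughly +-30 Elo for 480.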
>
>At the time I criticized such a practice and opposed the logic too that because
>of some hundred of games overall such little extremes had no meaning. Exactly
>here, I can say now, we have the basic fallacy in the whole SSDF practice of
>testing. What they deal with are mere numbers, no matter how they got them.
>Whether on different hardware, different quantity of games and many more
>uncontrolled and statistically unallowed behaviour.

As far as I remember, you have always criticized each and everyone, and I still
can't remember that you have ever been right.
>
>And it should be clear to the reader that the SSDF should change the wrong
>tradition. Numbers are mathematically the same, but already in stats there are
>numbers with a better status and a worse. In the end, you'd never know how your
>Elo for a specific program  was summed up. With blanks or good data.
>
I believe your statements are always backed up by blanks, insinuations and lies.

Bertil

>Rolf Tueschen



Last modified: Thu, 15 Apr 21 08:11:13 -0700
