Computer Chess Club Archives


Subject: Re: Comments on SSDF by Mr.Diepeveen

Author: Vincent Diepeveen

Date: 15:23:44 03/05/04


On March 05, 2004 at 15:51:47, Thoralf Karlsson wrote:

>On March 05, 2004 at 03:54:57, Afzal Siddique wrote:
>
>>Hello All,
>>
>>http://www.aceshardware.com/forum?read=105063596
>>
>>Afzal
>
>
>I have never asked Vincent Diepeveen for money in order to test his program.

That is correct. You told me that you were so short of hardware that, without
another machine or two, you would not be able to guarantee me that Diep would
be on the list soon.

I won't start a court case or anything like that; everyone knows you are on the
legally correct side of the matter.

Your statement at the time just led me to draw my own conclusions, just as you
draw yours.

SPEC, for example, is a paid organisation but is nevertheless accepted as a
legal testing body.

>Neither SSDF nor its testers have ever received any PC-hardware or money for
>buying it, from programmers or program-selling companies. A vast majority of the
>PCs which we have used for testing have been bought by the individual testers,
>and the rest SSDF has bought with its membership fees.
>What we have received is only the program itself on diskettes or CDROMs.

There is a difference between 'we' and 'I'.

"I never received money"
"what we have received"

This is just a legal discussion, of course, and I'm not really interested in it
other than to point out to you that it might cause confusion.

'we' stands for SSDF
'I' stands for Karlsson, I now assume.

>All games with PC-programs included in the SSDF rating list are played on two
>separate PCs. We have not accepted dual machines and we have certainly not
>played games on one single CPU PC only.

That is very good. It avoids a lot of the system-time problems that many of my
testers had with certain software.

>Match results presented by the program are not accepted. The end position in
>each game is checked on both programs.

How do you handle aborted games?

Although today, of course, that problem is a lot smaller than in the old 'DOS'
period.

>I admit that we haven't published all 96 451 games which presently is the base
>for the list. There are limits for how much time and effort we want to spend on
>our hobby. If someone wants to believe that we cheat with the unpublished games,

I never accused you of doing that, and I definitely didn't accuse you of doing
it deliberately. From my own testers, however, I know that the biggest problem
is human error. Basically, this is never done intentionally.

But the error margin from this is between 50 and 150 rating points, and on the
SSDF list differences within that range are significant.

Take a single match of 40 games in which something goes wrong, with learning or
whatever, giving a deviation of 20 points. According to Professor Elo, that
already means:
  20 * k = 20 * 10 = 200 rating points

*by definition*.
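
The arithmetic above can be sketched in a few lines of Python. This is only an
illustration of the linear K-factor rule as it is used in this post (rating
change = K times the deviation between actual and expected score). K = 10 is
the value assumed above, and the function name rating_shift is made up for the
example.

```python
def rating_shift(score_deviation, k=10):
    """Rating change under the linear K-factor rule.

    score_deviation: actual score minus expected score, in game points.
    k: K-factor; 10 is the value assumed in this post, not an SSDF setting.
    """
    return k * score_deviation


# A 40-game match whose result is off by 20 points.
print(rating_shift(20))
```

With the numbers from the post this prints 200, matching 20 * k = 200.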

So counting only games that you can put online is pretty trivial IMHO since,
assuming no fraud or cheating, that makes everything transparent.

Note that certain people have entire databases of up to 2 million games online,
so there are no real physical limits there. As good as it is that you produce
100k games, putting the played games online is what makes it transparent.

>feel free to do it. Even if we published all games included in the list, we
>could probably still cheat, by playing even more games and then discarding

Nowhere am I accusing you or your organisation of cheating. Assuming no bad
intentions on the part of the SSDF and all games published, it is easy for
everyone to figure out whether errors have been made by accident.

1 error = 50% of K = 0.5 * K = 0.5 * 10 = 5 rating points.

So that can mean certain programs flipping from #1 to #2, or from #2 to #3,
which is pretty important. I have seen many SSDF lists where 5 rating points at
the top mattered a lot. Nowadays, with the latest Shredder Classic feature, most
likely something unintentional on Stefan's part (it just happens by accident
with the first game), pausing a match for a few hours can cause 2 games to
become 1.

Of course, on paper that could have a negative impact on Shredder too. Just on
paper...
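
To illustrate how a 5-point error can reorder the top of a list, here is a small
Python sketch. The program names and ratings are invented for the example and
are not SSDF figures. K = 10 is again the value assumed in this post, and the
helper names apply_error and ranking are hypothetical.

```python
K = 10  # K-factor assumed in this post


def apply_error(ratings, program, score_error, k=K):
    """Return a copy of ratings with one program shifted by k * score_error."""
    adjusted = dict(ratings)
    adjusted[program] += k * score_error
    return adjusted


def ranking(ratings):
    """Program names sorted from highest to lowest rating."""
    return sorted(ratings, key=ratings.get, reverse=True)


# Hypothetical ratings, 3 points apart at the top.
ratings = {"Program A": 2803, "Program B": 2800, "Program C": 2790}

print(ranking(ratings))
# One half-point bookkeeping error against Program A costs it 5 points.
print(ranking(apply_error(ratings, "Program A", -0.5)))
```

In this made-up list, the single half-point error is enough to move Program A
from first to second place.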

>losses for our "favourites" and change dates and game numbers in the pgn-file.
>
>We don't claim that the SSDF ratings correctly predict which rating the
>individual program would get if they played hundreds of games against humans. If

Ed Schroder saw that differently a few years ago, it seems. I wonder why.

>games were not played in long series between two opponents, but against a
>different program in each game so that the book learning didn't work, the list
>would probably look somewhat different. We test games in the way we have found
>appropriate or practical and present the results. If someone prefers other ways
>of testing, then he can consult other available rating lists or organize his own
>in the way he wants.
>Thoralf Karlsson
>SSDF

Thanks for your reaction.
As usual it is a correct reaction.




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.