Computer Chess Club Archives


Subject: Re: Comments on SSDF by Mr.Diepeveen * The Two Computer Quest

Author: Mridul Muralidharan

Date: 11:38:38 03/06/04


On March 06, 2004 at 12:42:24, Uri Blass wrote:

>On March 06, 2004 at 12:21:11, Mridul Muralidharan wrote:
>
>>On March 06, 2004 at 06:08:31, Rolf Tueschen wrote:
>>
>>>On March 05, 2004 at 19:44:56, Ed Schröder wrote:
>>>
>>>>On March 05, 2004 at 18:23:44, Vincent Diepeveen wrote:
>>>>
>>>>>On March 05, 2004 at 15:51:47, Thoralf Karlsson wrote:
>>>>>
>>>>>>On March 05, 2004 at 03:54:57, Afzal Siddique wrote:
>>>>>>
>>>>>>>Hello All,
>>>>>>>
>>>>>>>http://www.aceshardware.com/forum?read=105063596
>>>>>>>
>>>>>>>Afzal
>>>>>>
>>>>
>>>>>>I have never asked Vincent Diepeveen for money in order to test his program.
>>>>
>>>>>That is correct. You told me that you were lacking hardware that much that
>>>>>without another machine or 2 you would not be able to guarantee me that diep
>>>>>would be soon at the list.
>>>>
>>>>So from that statement (we don't have enough PC's to test Diep) you concluded
>>>>Thoralf was asking you to send him 2 PC's?
>>>>
>>>>Ed
>>>
>>>
>>>Dear Ed,
>>>
>>>please stop playing these games and let me answer for VD. Yes, of course, a
>>>young programmer MUST understand the words of TK this way! Period.
>>>
>>>VD wanted to reach the goal that his program was tested as soon as possible.
>>>Understandable wish. Now the responsible man from SSDF says that he could only
>>>do this if he had more resources - BUT of course he hasn't yet. OF COURSE this
>>>does NOT say word for word that Vincent should send hardware as soon as
>>>possible. But the implication is absolutely clear. Taken that VD WOULD have done
>>>this, the SSDF certainly wouldn't have rejected the kind present.
>>>
>>>But all this is only part of the overall general problem!!!
>>>
>>>And we should thank VD that he has published this here in CCC. The other aspect
>>>of the problem is that a company like ChessBase has more resources than just a
>>>young programmer. THEY make an invoice with an autoplayer and whoopieee, the
>>>SSDF is accepting it. Later it was found out that this autoplayer gave FRITZ an
>>>edge. At that moment also Ed Schröder began to jump up and down. A kind of war
>>>began.
>>>
>>>So here we come to the final aspect of this problem. Speaking in terms of
>>>history. Overall, these parts of the "SSDF problem" could be defined as follows:
>>>
>>>   *** the SSDF is held by amateurs, by certainly sympathetic hobby freaks
>>>
>>>   *** due to a lack of resources SSDF had to test by hand in the early days
>>>
>>>   *** due to that same aspect SSDF became open for manipulative tools
>>>
>>>   *** in consequence gifts of hardware alone _could_ influence the results
>>>
>>>This is all, what Vincent is saying and this is correct. If the SSDF were really
>>>independent, they would test completely without contacts with the programmers.
>>>They would buy the progs in shops and they would test them. They would test in
>>>the spirit of the potential clients, the end-users. The whole communicating with
>>>the programmers and their companies makes the SSDF object of almost invisible
>>>manipulations.
>>>
>>>Also herefore Vincent gave perfect examples. The invoice of special "books" and
>>>"learning files" is obviously a tool to manipulate the results of the tests
>>>because the programmers want to react themselves on the reactions of the other
>>>colleagues with newer program versions on _their_ progs. Obviously this has no
>>>longer something to do with independent and reliable testing standards.
>>>
>>>To make this absolutely clear: a test, once begun, does NOT allow a tester to
>>>later make all kind of replacements or manipulative novelties because that
>>>simply and perfectly destroys the validity of the tests! (Just to tell the truth
>>>to many testers here around: you should NOT update your progs in a test
>>>"tournament" because that makes the whole tournament invalid.)
>>
>><snip>
>>
>>
>>Hi Rolf,
>>
>>  Very nicely put.
>>I would like to point out another thing - Vincent mentioned that he was asked
>>for Hardware : some 6 years or so back - not yesterday or last year !
>
>No
>
>Vincent admitted that the ssdf did not ask him for hardware but only said that
>they have not enough hardware to test it immediately.
>
>>
>>Ed also keeps mentioning - "not anymore" - does this mean that there
>>may/definitely(!?) have been problems earlier on in the past ?
>>Could this h/w request also be a thing of that same past period ? :)
>
>It was not about hardware request but about the Fritz autoplayer that was not
>public at that time.
>
>Things were changed since that time.
>
>Uri

I don't know why I am bothering to reply to you, but since everyone should be
given a fair chance to understand, here goes a last-ditch effort.

(All references to "you" below are to a third person/neutral observer.)

1) You come to me and say, "Can you test my program?" I say, "No, because I
don't have any machines free now. If only I had ...."

Yes, this is not a straightforward request for machines - but in the real
world, especially among businessmen (though not necessarily only among them!),
it could only mean: "I need more machines. If you want me to start testing now,
please help me with machines so that I can help you!"

Note: this could still be considered help / a loan / whatever.

Nobody is corrupt or wants to be in this case - there are no machines free for
testing, so it is physically impossible to test.

Now if I have a lot of resources and money, and say I am the CEO of a
hypothetical company chessX, then I could say, "Hey, my new program unnamedX
needs to be on the list immediately since it will be released soon/was just
released. So here are 10 dual boxes for your testing" - this can be considered
pure help/a loan.

But in the real world, when I do this, I earn a "favour", as Mario Puzo would
have put it :) which I may use in the future at an appropriate time.

(At some future date) I could say:
"I helped you in the past when you had no machines, so:
(the above would be an implicit statement - I need not mention it explicitly!)

a) Please insert this program which just got released - a lot of revenue
depends on it - I just want your honest test opinion. (It is another matter that
my program will have a book to kill all opponent books in the testing forum.)

b) Please release the list a few days/weeks before the normal date", etc., etc.

Now if you are going to say that I sound like a conspiracy theorist, please
look at SSDF history and things will fall into place; people who know it will
understand what I am saying here ...

So the independence of the testing entity is already lost! - without them even
attempting anything wrong or having any evil/corrupt intentions.

2) I introduce a dirty bug (the intentional/unintentional debate could last
forever!). Now, surprisingly, this hurts me very, very little while it hurts the
opponent big time (already suspicious). I don't publish the possibility of this
bug or proactively fix it (suspicion++).
When people find this, suppose there is a major hue and cry - "stupid bug
wrecking games", "results are skewed 'cos of this", "automated internet based
tournaments are the way to go" ( ;-) ), etc., etc.

What do I do?

If I am the evil CEO, then:
Fix this obvious bug, but introduce more subtle bugs which are not so easy to
spot. (uci 1mb ?!)

Now, you will say again that this is just another subtle bug which was not
intentional.
Then we have the process priority issue - it can be debated to death that this
does not exist (on a dual box with pondering on in a two-engine match, I have
personally seen it happen).

Or you could say that in the latest version this is fixed and that it was a bad
bug in the previous release.

But the history of this product will make me suspect more "bugs" in this new
version too!
Ones which will take more time and testing to debug, and ones which will be
tougher to identify and track.

It is like the usual problem with chess program clones - initially the cloners
were stupid enough to change only the program/version strings in the code; now
they are more sophisticated - people lift bulk code snippets, and it becomes
tougher and tougher to identify and track a clone.

Similarly, it will become tougher and tougher to find these "bugs" in the
interface.
GCP was kind enough to make the 1 MB bug publicly known, though some people
had known about it privately for some time - including me (from when I wanted to
add uci support to earlier mess versions), and I had already informed some of my
friends about it!
So just 'cos you do not hear of the new "bugs" need not mean that others don't
know of them or that they don't exist!

This means that the product is already suspect!

All this can be said to be old history, etc., etc., but as time progresses,
more and more new ways will come up - and maybe 5 years later Vincent or
someone else will again post on how some interfaces cheat using some xxxxx
inconsistency in the yyyy, and how because of it some programs zzz1, zzz2, etc.
from company qX had been benefiting for the past 8 years.
Again we will debate, and then in the end conclude - all this is history, so
forget and forgive!

The fact does not change that throughout the history of the list you have only
controversies, and that at no point in time can you take the results to mean
anything substantial.

Even if we are willing to overlook the fact that games are seldom or never
posted, and assume that everything is properly tested and there are no evil
intentions among the SSDF maintainers/testers/managers/etc., the list already
becomes suspect!

SSDF is a great list and no doubt gives a good idea of the approximate
strength of programs - but among the top 25, I would not say that any one of
them is better than another based on SSDF results, even if you have something
like a 120 Elo difference, due to the way things are going.
When the relative position of the commercials in the SSDF list gets used for
their commercial purposes, then we have commercial interests coming in directly
anyway!
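
For reference, here is a minimal sketch of what a 120 Elo gap nominally implies,
assuming the standard logistic Elo expectation formula. The function name
expected_score is only an illustration, not something SSDF publishes. It says
nothing about whether the games behind a rating were played under fair
conditions, which is the point being made above.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) for the higher-rated side.

    Uses the standard logistic Elo formula. The helper name is hypothetical.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # Print the nominal expectation for a few rating gaps.
    for diff in (0, 30, 120):
        print(f"{diff:>4} Elo -> expected score {expected_score(diff):.3f}")
```

On paper a 120 point gap corresponds to an expected score of roughly two thirds
for the higher-rated program; whether a listed gap reflects real strength still
depends on the test conditions.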

Time to end my rambling! :)

Mridul


