Computer Chess Club Archives



Subject: Re: Comments on SSDF by Mr.Diepeveen * The Two Computer Quest

Author: Ed Schröder

Date: 02:09:09 03/08/04



On March 07, 2004 at 15:56:07, Rolf Tueschen wrote:

>On March 06, 2004 at 08:43:15, Ed Schröder wrote:
>
>>On March 06, 2004 at 06:08:31, Rolf Tueschen wrote:
>>
>>>On March 05, 2004 at 19:44:56, Ed Schröder wrote:
>>>
>>>>On March 05, 2004 at 18:23:44, Vincent Diepeveen wrote:
>>>>
>>>>>On March 05, 2004 at 15:51:47, Thoralf Karlsson wrote:
>>>>>
>>>>>>On March 05, 2004 at 03:54:57, Afzal Siddique wrote:
>>>>>>
>>>>>>>Hello All,
>>>>>>>
>>>>>>>http://www.aceshardware.com/forum?read=105063596
>>>>>>>
>>>>>>>Afzal
>>>>>>
>>>>
>>>>>>I have never asked Vincent Diepeveen for money in order to test his program.
>>>>
>>>>>That is correct. You told me that you were lacking hardware that much that
>>>>>without another machine or 2 you would not be able to guarantee me that diep
>>>>>would be soon at the list.
>>>>
>>>>So from that statement (we don't have enough PC's to test Diep) you concluded
>>>>Thoralf was asking you to send him 2 PC's?
>>>>
>>>>Ed
>>
>>
>>>Dear Ed,
>>>
>>>please stop playing these games and let me answer for VD. Yes, of course, a
>>>young programmer MUST understand the words of TK this way! Period.

>>No, it is just an explanation why a program can't be tested. Anything other than
>>that is paranoia.

>I must strongly object because this is a dirty insult against VD. I repeat it:
>if a tester answers this way to a young programmer, the latter must understand
>that if he could supply them with hardware THEN they would test his program. No
>Wch title can change this meaning, Ed, I'm so sorry.

Suppose I am having a friendly email exchange with a fellow programmer.
While writing my stuff I (in my innocence) state, "... but lately I have not
been making much progress". Then the programmer goes into paranoia mode and
writes in CCC, "Ed Schroder has tried to diddle my source code". So much for
ambiguous sentences: paranoia decides how they are interpreted.



>>>VD wanted to reach the goal that his program was tested as soon as possible.
>>>Understandable wish. Now the responsible man from SSDF says that he could only
>>>do this if he had more resources - BUT of course he hasn't yet. OF COURSE this
>>>does NOT say word for word that Vincent should send hardware as soon as
>>>possible.
>>
>>>But the implication is absolutely clear.
>>
>>No, it is paranoia.

>The same dirty insult again. Are you a psychiatrist? How can you say that?

Common sense?

:)



>>>Taken that VD WOULD have done
>>>this, the SSDF certainly wouldn't have rejected the kind present.
>>
>>I once made a careful allusion to test them; they passed the test successfully.

>Please more details for this experiment.

Details forgotten, the act itself and its outcome not.




>>>But all this is only part of the overall general problem!!!
>>>
>>>And we should thank VD that he has published this here in CCC. The other aspect
>>>of the problem is that a company like ChessBase has more resources than just a
>>>young programmer.
>>
>>>THEY make an invoice with an autoplayer and whoopieee, the
>>
>>What invoice are you talking about? I assume you meant to express something
>>else.

>I busted the English. What I meant with invoice is simply something someone SENT
>in to the SSDF. I didn't know the correct meaning of invoice.

I agree with you that there is room for improvement: all talks between
programmers and the SSDF guys regarding testing procedures (including updates)
should be public for reasons of transparency.

I will set an example and hereby post the Rebel 12 instructions; as you will
see, there is nothing fishy.

=========================================================

Hi Thoralf and company,

Here are the instructions to test Rebel 12 for your rating list:

Interface : ChessPartner 5.3
Engine : Rebel 12.00.01 (rebel.eng) (November 2003)

All learning settings are listed in the default setting "rebel.eng" so you don't
have to do anything special provided "rebel.eng" is loaded.

There is a little trick to speed up Rebel by 3-4%; it is listed at:

http://members.home.nl/matador/rebel12.htm#UPDATE

[ begin quote ]

For Connoisseurs Only: You can speed-up Rebel 12 under ChessPartner (or ERT)
with 3-4% by modifying the REBEL.INI file. Turn off all the [Display Support
Chesspartner] parameters in the following way:
  [Display Support Chesspartner]
  Support EOC1 = OFF                 * OFF | ON
  Support EOC2 = OFF                 * OFF | ON
  Support EOC3 = OFF                 * OFF | ON
  Support SEARCH = OFF               * OFF | ON
  Support MOVES = OFF                * OFF | ON
  Support STYLE = OFF                * OFF | ON
  Support LEARN = OFF                * OFF | ON
  Support BRAIN = OFF                * OFF | ON
  Support DEBUG = OFF                * OFF | ON
Doing so you will notice the engine window is no longer updated, exactly where
the 3-4% speed gain comes from.

[ end quote ]

I have attached the "rebel.ini" file for your convenience, just copy it over the
old one.

There is one thing you should know: when starting auto232 with ChessPartner,
make sure you set "Maximum moves in game" no higher than 160. From move 161
onward Rebel might start to produce rubbish moves or crash. This was also the
case in all Rebel (DOS) versions, but the DOS version refused to play on at
move 161; the Windows version (probably) does not.

If needed, games longer than 160 moves can be continued by pasting the EPD
position.

I think this is it.

My best and happy testing,

Ed

===================================================
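
For anyone who would rather script the REBEL.INI change than edit it by hand, the following is a minimal sketch in Python. It only assumes the line layout shown in the quoted section above ("Support NAME = VALUE", optionally followed by a "* OFF | ON" remark); the rest of the file format, the function name and the sample text are assumptions made for illustration, not anything taken from Rebel or ChessPartner.

```python
import re

# Section name as quoted in the instructions above.
SECTION = "[Display Support Chesspartner]"

# Matches lines such as "  Support EOC1 = ON      * OFF | ON".
# Group 1: everything up to the value, group 2: the value, group 3: the rest.
PARAM = re.compile(r"^(\s*Support\s+\w+\s*=\s*)(\w+)(.*)$")


def disable_display_support(ini_text: str) -> str:
    """Return ini_text with every 'Support X' entry in SECTION set to OFF.

    Hypothetical helper: lines outside the section are left untouched.
    """
    out = []
    in_section = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A new section header starts; track whether it is the one we want.
            in_section = stripped.lower() == SECTION.lower()
        elif in_section:
            match = PARAM.match(line)
            if match:
                line = f"{match.group(1)}OFF{match.group(3)}"
        out.append(line)
    result = "\n".join(out)
    # Preserve a trailing newline if the input had one.
    return result + "\n" if ini_text.endswith("\n") else result


if __name__ == "__main__":
    sample = (
        "[Display Support Chesspartner]\n"
        "  Support EOC1 = ON                  * OFF | ON\n"
        "  Support SEARCH = ON                * OFF | ON\n"
    )
    print(disable_display_support(sample))
```

To apply it to a real file, read REBEL.INI into a string, pass it through the function and write the result back, keeping a backup copy of the original first.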




>>>SSDF is accepting it. Later it was found out that this autoplayer gave FRITZ an
>>>edge. At that moment also Ed Schröder began to jump up and down. A kind of war
>>>began.

>>The "Fritz-5 secret autoplayer" issue is a separate subject, there is no
>>relationship with the current topic -> sending hardware to Sweden.

>Objection! Of course it's in connection. True testers wouldn't and shouldn't
>stay in contact with the business world. Period. That means IF they don't have
>strict rules for their testing procedures which they don't have.

See my above comment about transparency.




>>>So here we come to the final aspect of this problem. Speaking in terms of
>>>history. Overall, these parts of the "SSDF problem" could be defined as follows:
>>>
>>>   *** the SSDF is held by amateurs, by certainly sympathetic hobby freaks
>>>
>>>   *** due to a lack of resources SSDF had to test by hand in the early days
>>
>>>   *** due to that same aspect SSDF became open for manipulative tools

>>There are no manipulative tools, you forget that the autoplayers are accepted by
>>the programmers who participate, the SSDF can't function without the
>>programmers approval. The "Fritz-5 secret autoplayer" issue was such an example.

>It is interesting how you change your opinion. You were among those who were
>certain that something was not kosher with that tool. VD said nothing else. But
>because you changed your opinion, Vincent must be parano???? Strange logic!

I have not changed my position: the complaint was that the autoplayer should
have been public. It was not; it was hidden and available only to 2-3 special
testers and the SSDF people to test the new Fritz. Naturally it created a big
fuss, and rightly so.



>>>   *** in consequence gifts of hardware alone _could_ influence the results
>>
>>No.
>>
>>There have been irregularities in the past, today everything is okay.


>Except that the validity isn't there! Period.

You miss the most obvious of all possible objections. Even if all games are
published, even if all email exchanges are public, or whatever other
improvements are included, nothing will prevent a corrupt SSDF tester from
manipulating the results by skipping lost games of his favorite engine. To
solve this an arbiter would have to watch every SSDF game, which is insanity as
there are not enough resources; computer chess competition isn't a
multi-billion dollar industry like soccer.

Meaning, under the given circumstances there has to be some kind of basic trust,
always, no way around it.
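
To put a rough number on how much skipped losses could matter, here is a small sketch in Python. It uses the standard Elo logistic formula to turn a score percentage into a rating difference. That the SSDF list is computed exactly this way is an assumption, and the win/draw/loss counts are invented purely for illustration.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Rating difference implied by a score fraction (standard Elo logistic curve)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_fraction(wins: int, draws: int, losses: int) -> float:
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


if __name__ == "__main__":
    # Hypothetical 40-game match, reported honestly.
    honest = score_fraction(wins=20, draws=10, losses=10)
    # The same match with 4 of the 10 losses quietly left out.
    doctored = score_fraction(wins=20, draws=10, losses=6)

    print(f"honest:   {honest:.3f} -> {elo_diff_from_score(honest):+.0f} Elo")
    print(f"doctored: {doctored:.3f} -> {elo_diff_from_score(doctored):+.0f} Elo")
```

Dropping only a handful of losses from a short match already shifts the implied rating difference noticeably, which is why published game scores alone cannot rule this kind of manipulation out.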



>>>This is all, what Vincent is saying and this is correct. If the SSDF were really
>>>independent, they would test completely without contacts with the programmers.
>>
>>And then you will find programmers complaining in public. They will publicly
>>criticize the SSDF for not using the optimal settings.
>
>
>And what could that change? They could jump up and down and the SSDF should
>continue as usual --- IF they were real testers. But they don't.
>
>
>>
>>
>>>They would buy the progs in shops and they would test them. They would test in
>>>the spirit of the potential clients, the end-users. The whole communicating with
>>>the programmers and their companies makes the SSDF object of almost invisible
>>>manipulations.
>>

>>Nonsense, name one such manipulation that would cause a program to perform
>>better.

>Ed, what's going on with you? The time alone when something is being sent IS of
>importance for the results! You know that!

I understood the question; you, on the other hand, should try to think how your
alleged objection could possibly be realistic. You speak of "invisible
manipulations" that make the program perform better. Just name one such
possibility.

HINT, programmers release their babies with the strongest settings possible.



>>>Also herefore Vincent gave perfect examples. The invoice of special "books" and
>>>"learning files" is obviously a tool to manipulate the results of the tests
>>>because the programmers want to react themselves on the reactions of the other
>>>collegues with newer program versions on _their_ progs. Obviously this has no
>>>longer something to do with independent and reliable testing standards.
>>
>>The word "invoice" again, what do you mean by that?
>>
>>The general agreed rule is that the version in test is publicly available to the
>>user. This means the engine, the interface, the books, the learning files.


>>>To make this absolutely clear: a test, once begun, does NOT allow a tester to
>>>later make all kind of replacements or manipulative novelties because that
>>>simply and perfectly destroys the validity of the tests!
>>
>>Meaning a programmer can't replace a bugged engine?

>NOT during a certain match. This is a crucial and trivial standard in testings.
>Period.

Debatable but reasonable.




>>>(Just to tell the truth
>>>to many testers here around: you should NOT update your progs in a test
>>>"tournament" because that makes the whole tournament invalid. For that same
>>>reason the disqualifying of LIST in Graz three rounds before the end faked the
>>>whole results.)
>>>
>>>[ I must beg the reader in a general question. Please do not read something into
>>>this message, just because I may have used the improper wordings. Since I am a
>>>foreign English speaker, I have no sense for the objective and concrete meaning
>>>of certain words. Please take my words in its meant context of the whole
>>>message. To make this clear, I do NOT bash the SSDF and its amateur testers.
>>>Nobody does that if he wants to be taken for serious. And if I accept their
>>>amateur status in the beginning of this message, then it is impossible that my
>>>critic later could be taken as insulting abuse. Because mistakes in the test
>>>proceedings must be called mistakes ALSO if amateurs make the tests. If the
>>>amateurs couldn't know or couldn't avoid the mistakes then bad luck, but the
>>>fact remains that these mistakes happened. - All this is a bit difficult to
>>>differentiate but for one time I wanted to add this in case of the usual protest
>>>on my messages about the SSDF. ]
>>
>>Okay.
>>
>>Ed
>
>Thanks Ed.

You are welcome.

Ed





Last modified: Thu, 15 Apr 21 08:11:13 -0700
