Computer Chess Club Archives


Subject: Re: Chessfun and Nunn1 Tests

Author: blass uri

Date: 15:46:08 05/06/00


On May 06, 2000 at 17:28:13, Mogens Larsen wrote:
<snipped>
>1) First it was an attempt to discredit the Jouni test. Failed because the
>parameters Jouni used were unknown at the time.

It was not an attempt to discredit Jouni's test.

Chessfun was surprised by Jouni's results and wanted to see whether she could
reproduce them.

She found that she could not reproduce the results (there was only one case where
Fritz lost 11:9, and she found that Fritz had been slowed down in that case).

It was not something against Jouni and she did not say that
Jouni cheated.


>
>2) Ponder off vs. on. Comparing on one machine is uncertain, which she
>acknowledged so it's okay (she ran the games anyway(?)). Comparing ponder off on
>one machine with ponder on on two machines didn't work either due to the
>uncertainty introduced by autoplayer.

The uncertainty introduced by the autoplayer is also present in the SSDF games.

>
>3) Blitz vs. standard. I haven't seen any publication of the complete set of
>standard games, so there might be something interesting there. But it would depend
>on the comparable data.
>
>These problems and "minor" things like learning, Nunn, computer usage during
>test (thanks Tony), GUI and sample size make any prediction of strength futile.

Computer usage during the test also occurred in some SSDF games.
I found that Junior was slowed down in some SSDF games,
and the tester admitted that the reason was that he had been using another program
at the same time and had to repeat 4 games.

I checked only a minority of the games, so I believe that a similar problem also
exists in games that I did not check.

>
>The only data usable is X beats Y at Game/1 as far as I'm concerned. That isn't
>a lot compared to the effort. Since Chessfun refuses to discuss the issues at
>hand, I wouldn't mind hearing your thoughts.

I think that there is no proof of a significant difference in the results:
Fritz6a is about 200 Elo better than Crafty17.10 in the Nunn match games at all
time controls, and pondering on or off does not change much.

I am not saying that both sides gain the same from extra time or from
pondering, but the difference is too small to detect with a few hundred games,
and I guess that we would need about 10000 games to know.

I think that the situation is different at tournament time controls, where Fritz
is only about 100 Elo better.
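
To give a rough feel for why a few hundred games cannot separate small effects, here is a minimal sketch that estimates the approximate 95% error margin (in Elo) of a measured rating difference as a function of the number of games. It assumes the standard logistic Elo formula, independent games, and a draw ratio of 30%. The draw ratio and the function names are my own assumptions for illustration and are not taken from the test data.

```python
import math

def expected_score(elo_diff):
    """Expected score (between 0 and 1) under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_score(score):
    """Inverse of expected_score; score must be strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_margin(elo_diff, games, draw_ratio=0.3, z=1.96):
    """Approximate half-width (in Elo) of a 95% interval after `games` games.

    draw_ratio is an assumed fraction of drawn games (hypothetical value).
    """
    p = expected_score(elo_diff)
    # Win and loss fractions implied by p and the draw ratio must be non-negative.
    if min(p, 1.0 - p) < draw_ratio / 2.0:
        raise ValueError("draw_ratio is too large for this Elo difference")
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    variance = p * (1.0 - p) - draw_ratio / 4.0
    std_err = math.sqrt(variance / games)
    low, high = p - z * std_err, p + z * std_err
    if low <= 0.0 or high >= 1.0:
        raise ValueError("too few games for a meaningful interval")
    return (elo_from_score(high) - elo_from_score(low)) / 2.0

if __name__ == "__main__":
    for games in (200, 1000, 10000):
        margin = elo_margin(200, games)
        print(f"{games:6d} games: about +/- {margin:.1f} Elo")
```

Under these assumptions, a match of a few hundred games leaves a margin of tens of Elo around a 200 Elo difference, while 10000 games narrows it to single digits. That is the scale at which a small effect of time control or pondering might become visible.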

>
>>There were some errors in the games when one computer was slowed down but the
>>same thing happened in the past also in the public ssdf games.
>
>I thought you said that you didn't have Fritz, so I assume that the errors you've
>found only cover a small portion of all the possible errors. You don't have to
>check games to evaluate a test. If the test is okay _then_ you can check the
>games.

I admit that I checked only a small portion of all the possible errors, but I guess
that I am not the only one who checked games.

I also guess that people are more responsible when they know that the games are
public, and I trust them more to check for errors than when the games are not
public.

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700
